Should I split input data into subgroups or merge runs after the filtering step with DADA2?


I have several datasets that were sequenced in different runs. Can I split them into smaller subgroups for separate analyses, or do I have to merge them after the filtering step across all runs? I don't like the second option because each of my datasets combines different experiments. I find it difficult to generate correct statistical results from the large mixed dataset, since QIIME makes unexpected groups for its analysis.

Please let me know if my question is clear,

Thank you,


If you intend to analyze and publish the datasets separately, then go ahead and process them separately. If you intend to compare these data, it would be wise to keep them together so that they are analyzed in a uniform way.

Thanks a lot @Nicholas_Bokulich,

I am just not sure how much the runs affect the results.



There are lots of general discussions on this forum about batch effects. Search the archive for those discussions and judge from there whether your data are appropriate to compare against each other or whether batch effects may be an issue (e.g., if treatment groups are separated by runs).
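
As a quick first check on whether treatment groups are separated by runs, you could cross-tabulate group against run from your sample metadata. The snippet below is only a rough sketch in plain Python, not a QIIME 2 feature: the sample IDs, run names, group names, and the helper functions `crosstab` and `single_run_groups` are all made up for illustration, so swap in your own metadata.

```python
from collections import Counter, defaultdict

# Hypothetical sample metadata: (sample_id, sequencing_run, treatment_group).
# In practice these values would come from your own sample metadata file.
samples = [
    ("S1", "run1", "control"),
    ("S2", "run1", "control"),
    ("S3", "run2", "treated"),
    ("S4", "run2", "treated"),
    ("S5", "run1", "diet"),
    ("S6", "run2", "diet"),
]


def crosstab(records):
    """Count samples for every (treatment_group, sequencing_run) pair."""
    return Counter((group, run) for _, run, group in records)


def single_run_groups(records):
    """Return the treatment groups whose samples all come from one run."""
    runs_by_group = defaultdict(set)
    for _, run, group in records:
        runs_by_group[group].add(run)
    return sorted(g for g, runs in runs_by_group.items() if len(runs) == 1)


# Print the group-by-run counts, one line per observed combination.
for (group, run), n in sorted(crosstab(samples).items()):
    print(f"{group:8s} {run:5s} {n}")

# Groups listed here cannot be separated from a run effect by design alone.
print("Groups confined to a single run:", single_run_groups(samples))
```

If a group shows up in only one run, any difference between that group and the others cannot be distinguished from a run effect on the basis of the design alone, which is the situation to watch out for before comparing across runs.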

