I would appreciate some advice on the best workflow for my analysis.
I have two sets of samples generated from two different sequencing runs. Both runs are 16S rRNA gene amplicons (V3–V4 region, 2×300 bp). One run contains 96 samples, while the second run contains only 2 samples.
I need to perform two separate analyses:
- An analysis including only 64 samples from the first run.
- An analysis including all samples from both runs.
My current plan is as follows:
- Run DADA2 denoising separately for each sequencing run, using identical parameters.
- Merge the resulting feature tables (and representative sequences).
- Continue the downstream analysis on the merged data to obtain analysis #2 (all samples).
- For analysis #1, filter the merged feature table to retain only the 64 selected samples and then perform downstream analyses based on this filtered table.
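
To make the merge and filter steps concrete, here is a minimal pandas sketch of what I have in mind. It is only an illustration on made-up toy tables, not the QIIME 2 implementation: the sample IDs, ASV labels, counts, and the helper names `merge_tables` and `filter_samples` are all hypothetical. In QIIME 2 itself I would expect to use `qiime feature-table merge`, `qiime feature-table merge-seqs`, and `qiime feature-table filter-samples` for these steps. The sketch assumes that identical sequences get the same feature ID in both runs, which as I understand it only holds if the trimming and truncation parameters are identical.

```python
import pandas as pd

# Hypothetical toy feature tables (rows = ASVs, columns = samples).
# All IDs and counts are invented for illustration only.
run1 = pd.DataFrame(
    {"r1_s1": [10, 0, 5], "r1_s2": [0, 7, 3], "r1_s3": [4, 0, 0]},
    index=["ASV_a", "ASV_b", "ASV_c"],
)
run2 = pd.DataFrame(
    {"r2_s1": [2, 9], "r2_s2": [0, 6]},
    index=["ASV_a", "ASV_d"],
)


def merge_tables(*tables):
    """Combine per-run tables; ASVs absent from a run get a count of 0."""
    all_samples = [s for t in tables for s in t.columns]
    if len(all_samples) != len(set(all_samples)):
        raise ValueError("sample IDs must be unique across runs")
    # Column-wise concat aligns on the ASV index (outer join by default).
    return pd.concat(tables, axis=1).fillna(0).astype(int)


def filter_samples(table, keep):
    """Keep only the selected samples, then drop ASVs left with no reads."""
    subset = table.loc[:, list(keep)]
    return subset.loc[subset.sum(axis=1) > 0]


# Analysis #2: all samples from both runs.
merged = merge_tables(run1, run2)

# Analysis #1: a subset of run-1 samples (two here, 64 in my real data).
selected = ["r1_s1", "r1_s2"]
analysis1 = filter_samples(merged, selected)
```

In this toy case `ASV_d` only occurs in the second run, so it is dropped from the subset table once the run-2 samples are removed. I assume I would need to filter the representative sequences in the same way so that they still match the filtered table.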
Does this workflow make sense from a QIIME 2 / DADA2 best-practice perspective?
Are there any potential issues or better approaches I should consider when dealing with multiple sequencing runs and subset analyses?
Thank you in advance for your help.
PM