Separate denoised data based on projects


I am new to QIIME and am using it to analyze my 16S results from mouse samples. I received 3 files from the sequencing core: forward reads, reverse reads and a barcodes file. I was able to import the data into QIIME, demultiplex it and denoise it. I also built a phylogenetic tree for all my samples. My problem is that the data from 4 different projects are all together in the same fastq files. I searched the forum for how to separate the data by project and analyze each one separately. I found several posts on merging the data from different reads, but not much on separating the data by project. I found one post, "separating data imported together", where the taxonomy classification is also done on all samples before separating them with the filter-samples command.
So what I have done so far:
My data is demultiplexed and denoised, and I then built a phylogenetic tree for all the data (all 4 projects combined). So the feature table (feature-table.qza), the rep-set (rep-set.qza) and the stats file that I have are for all the data combined.
I went ahead and separated the feature table based on projects, using the following command:
qiime feature-table filter-samples
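For reference, a fuller form of that command could look like the sketch below. It assumes the sample metadata file is called sample-metadata.tsv and has a column named "project"; both names, the output file naming, and the wrapper function are hypothetical and only for illustration.

```shell
# Sketch: keep only the samples of one project from the combined feature table.
# Assumes sample-metadata.tsv has a column named "project" (hypothetical name).
filter_table_by_project() {
  # $1 = project label as it appears in the metadata column, e.g. "A"
  qiime feature-table filter-samples \
    --i-table feature-table.qza \
    --m-metadata-file sample-metadata.tsv \
    --p-where "[project]='$1'" \
    --o-filtered-table "table-$1.qza"
}
```

Calling `filter_table_by_project A` would then write table-A.qza containing only the samples whose "project" value is A.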
I was able to separate the 4 projects. For the next step, diversity core metrics, I used the individual project feature table, but the rooted tree (rooted-tree.qza) that I used was the one created for all the samples combined. I then did the alpha diversity analysis from there.
My first question is whether this is the right way to do it, or should I have started the whole analysis with separate projects?
From what I understand, the denoising step won't change whether I do it for all the samples combined or separate them by project first, but I'm very new so I'd really appreciate your help. Thanks!


Looks like you already figured it out on your own!

No, I do not think you need to start over.

It is probably a better idea to rebuild the tree for each project and recalculate the diversity metrics.
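Recalculating the metrics per project could be sketched roughly as follows. The file names (table-<project>.qza, rooted-tree-<project>.qza, sample-metadata.tsv) and the wrapper function are hypothetical, and the sampling depth has to be chosen separately for each project.

```shell
# Sketch: run core diversity metrics for one project with its own table and tree.
# File naming is an assumption; adjust to whatever your per-project files are called.
run_core_metrics() {
  # $1 = project label, $2 = rarefaction (sampling) depth chosen for that project
  qiime diversity core-metrics-phylogenetic \
    --i-phylogeny "rooted-tree-$1.qza" \
    --i-table "table-$1.qza" \
    --p-sampling-depth "$2" \
    --m-metadata-file sample-metadata.tsv \
    --output-dir "core-metrics-$1"
}
```

For example, `run_core_metrics A 5000` would write the results for project A into a core-metrics-A directory; the depth of 5000 is only a placeholder.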


Thanks for getting back to me so quickly. I understand what you're saying, but the rep-seqs.qza used in qiime phylogeny align-to-tree-mafft-fasttree comes from the DADA2 analysis, which was done for all samples combined. Do you think I should go back and do that again for each project?
At what step is it usually advisable to separate the data into projects?
I'd really appreciate the help, thanks!

No, you did it the right way. DADA2 results are more trustworthy when all the reads from one run/lane are processed together, and subdividing before this step may affect the denoising outputs.

Now I understand your question. You can use this plugin to filter rep-seqs.qza. Just provide your rep-seqs.qza as input, together with the table already filtered by project, to get a new rep-seqs file that is specific to that project.
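The link to the plugin is not preserved here; the description matches qiime feature-table filter-seqs, so the sketch below assumes that command. It filters the combined representative sequences down to the features present in one project's table, then builds a tree from them. The per-project file names and the wrapper function are hypothetical.

```shell
# Sketch: per-project representative sequences and tree.
# Assumes table-<project>.qza was already produced with filter-samples.
build_project_tree() {
  # $1 = project label used in the per-project file names, e.g. "A"
  qiime feature-table filter-seqs \
    --i-data rep-seqs.qza \
    --i-table "table-$1.qza" \
    --o-filtered-data "rep-seqs-$1.qza" &&
  qiime phylogeny align-to-tree-mafft-fasttree \
    --i-sequences "rep-seqs-$1.qza" \
    --o-alignment "aligned-rep-seqs-$1.qza" \
    --o-masked-alignment "masked-aligned-rep-seqs-$1.qza" \
    --o-tree "unrooted-tree-$1.qza" \
    --o-rooted-tree "rooted-tree-$1.qza"
}
```

Running `build_project_tree A` would leave DADA2 untouched and only subset its combined output, which is consistent with denoising the whole run together first.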

Thank you for the advice. I was able to separate the rep-seqs.qza by project and create individual phylogenetic trees for each.
