Hi,
I am new to QIIME 2 analysis, so sorry if this is a very simple question.
I am trying to get the number of sequences for each phylum (e.g. the number of sequences for Firmicutes and Bacteroidetes in each sample). I get the number of OTUs from the output files but not the number of sequences. Is there any file that can be generated from QIIME 2 that gives me that information?
Many thanks, Thilini
Hi @Thilini,
Welcome to QIIME2!
Could you please clarify what you are trying to do? Are you trying to get the number of sequences in each sample for a specific phylum? E.g., the number of Firmicutes sequences observed in each sample? Or the total number of Firmicutes sequences observed across all samples?
Furthermore, are you trying to count the number of unique sequences (e.g., OTUs) that represent that phylum? Or are you trying to count the total number of sequences for all OTUs in that phylum?
How do you plan to use this information? With a little more clarity I may be able to tell you whether such a method already exists in QIIME 2 or give some advice for obtaining the values you need. Thanks!
It sounds like what you are probably trying to do is get the count of all sequences classified to a specific phylum in each sample. To do that, you would use taxa collapse to collapse at phylum level (level 2 in a typical seven-rank taxonomy such as Greengenes, where level 1 is kingdom). The resulting feature table would contain the frequency (number of sequences observed) for each phylum in each sample. You could then use heatmap to view this frequency information as a heatmap (though, depending on what you are trying to do, you may want to either convert to relative abundance and show it as a barplot, or evenly sample your feature table before visualizing to control for uneven read coverage between samples).
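To make the idea of collapsing concrete, here is a minimal pure-Python sketch of what summing a feature table by taxonomic level amounts to. It is not the QIIME 2 implementation: the table, the taxonomy strings, and the `collapse` helper are all hypothetical, and the toy taxonomy is assumed to be semicolon-separated with phylum as the second rank.

```python
from collections import defaultdict

# Hypothetical feature table: feature ID -> {sample ID: sequence count}.
feature_table = {
    "otu1": {"S1": 10, "S2": 0},
    "otu2": {"S1": 5, "S2": 7},
    "otu3": {"S1": 0, "S2": 3},
}

# Hypothetical taxonomy strings (semicolon-separated ranks).
taxonomy = {
    "otu1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "otu2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "otu3": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
}


def collapse(table, taxonomy, level):
    """Sum sequence counts of all features sharing the first `level` ranks."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature_id, counts in table.items():
        ranks = [rank.strip() for rank in taxonomy[feature_id].split(";")]
        key = ";".join(ranks[:level])
        for sample, count in counts.items():
            collapsed[key][sample] += count
    return {taxon: dict(counts) for taxon, counts in collapsed.items()}


# In this toy taxonomy the phylum is the second rank.
phylum_table = collapse(feature_table, taxonomy, level=2)
for taxon, counts in phylum_table.items():
    print(taxon, counts)
```

The two Firmicutes OTUs are merged into one row whose values are summed sequence counts, which is why the collapsed table reports sequences per phylum rather than OTUs per phylum.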
Hi Nicolas,
Thank you very much for your reply.
I first thought that what I get from the QIIME 2 analysis is an OTU table with the number of OTUs at each level (e.g. the number of Bacteroidetes or the number of E. coli, etc.). But that seems to be wrong, because table.qza has the counts of sequences classified to each level in each sample (e.g. the number of sequences observed for Bacteroidetes in each sample).
Please correct me if I am wrong.
"It sounds like what you are probably trying to do is get the count of all sequences classified to a specific phylum in each sample. To do that, you would use taxa collapse to collapse at phylum level."
Yes, that's what I needed, and thank you very much for your prompt reply. I ran the following:
qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 2 \
  --o-collapsed-table table-2.qza
qiime tools export table-2.qza --output-dir taxonomy-table
biom convert --to-tsv -i taxonomy-table/feature-table.biom -o taxonomy-table/feature-table.tsv
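For anyone who wants to inspect the exported TSV programmatically, here is a minimal sketch using only the Python standard library. The file contents below are invented for illustration, and the `read_feature_table` helper is hypothetical. It assumes the usual layout written by biom convert: a comment line, then a header line starting with `#OTU ID`, then one row per taxon.

```python
import csv
import io

# Hypothetical contents of taxonomy-table/feature-table.tsv (values invented).
tsv_text = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "k__Bacteria;p__Firmicutes\t15.0\t7.0\n"
    "k__Bacteria;p__Bacteroidetes\t0.0\t3.0\n"
)


def read_feature_table(handle):
    """Parse a biom-style TSV into {taxon: {sample: count}}."""
    header = None
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#"):
            # The header line carries the sample IDs; other '#' lines are comments.
            if row[0] == "#OTU ID":
                header = row[1:]
            continue
        table[row[0]] = {s: float(v) for s, v in zip(header, row[1:])}
    return table


table = read_feature_table(io.StringIO(tsv_text))

# Total sequences per phylum across all samples.
totals = {taxon: sum(counts.values()) for taxon, counts in table.items()}
print(totals)
```

To read a real export instead of the in-memory string, pass an open file handle, for example `open("taxonomy-table/feature-table.tsv", newline="")`.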
When I looked at feature-table.tsv, it had the same numbers of OTUs that I got from taxonomy.qzv.
So from that, I understand that the final OTU table contains sequence counts, not the number of OTUs. Right? Please correct me if I am wrong.
Many thanks, Thilini
Hi @Thilini,
Correct; it is the number of times each feature is observed in each sample. These features can be OTUs, sequence variants, or taxa (if you use taxa collapse), or even other types of data.
Correct.
If you want to count the number of unique OTUs observed in each sample that belong to a specific phylum (instead of the number of times sequences belonging to phylum X are observed), you can do the following:
- Use taxa filter-table to only include OTUs that belong to the phylum you are interested in.
- Use diversity alpha with --p-metric observed_otus to count the number of unique OTUs belonging to phylum X in each sample.
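The two steps above can be sketched in plain Python to show how counting unique OTUs differs from counting sequences. This is only an illustration under stated assumptions, not what QIIME 2 runs internally: the table, the taxonomy, and the `observed_otus_in_phylum` helper are hypothetical, and the phylum match is a simple substring test.

```python
# Hypothetical feature table: feature ID -> {sample ID: sequence count}.
feature_table = {
    "otu1": {"S1": 10, "S2": 0},
    "otu2": {"S1": 5, "S2": 7},
    "otu3": {"S1": 0, "S2": 3},
}

# Hypothetical taxonomy strings.
taxonomy = {
    "otu1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "otu2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "otu3": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
}


def observed_otus_in_phylum(table, taxonomy, phylum):
    """Count, per sample, the OTUs of one phylum that have a nonzero count."""
    # Step 1: keep only features whose taxonomy string mentions the phylum.
    kept = {f: counts for f, counts in table.items() if phylum in taxonomy[f]}
    # Step 2: per sample, count how many kept features were observed at all.
    samples = sorted({s for counts in kept.values() for s in counts})
    return {
        s: sum(1 for counts in kept.values() if counts.get(s, 0) > 0)
        for s in samples
    }


observed = observed_otus_in_phylum(feature_table, taxonomy, "p__Firmicutes")
print(observed)
```

In this toy data, sample S1 contains 15 Firmicutes sequences but only 2 Firmicutes OTUs, which is the distinction between sequence frequency and number of unique OTUs.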
I hope that helps clarify!
Thanks a lot, Nicholas. Really appreciate your help in this regard.