# of sequencing reads for each phyla in each sample

Hi,
I am new to QIIME 2 analysis, so sorry if this is a very simple question.
I am trying to get the number of sequences for each phylum (e.g. the number of sequences for Firmicutes and Bacteroidetes in each sample). I get the number of OTUs from the output files but not the number of sequences. Is there any file that can be generated from QIIME 2 that gives me that information?
Many thanks, Thilini

Hi @Thilini,

Welcome to QIIME2! :sun_with_face:

Could you please clarify what you are trying to do? Are you trying to get the number of sequences in each sample for a specific phylum? E.g., the number of Firmicutes sequences observed in each sample? Or the total number of Firmicutes sequences observed across all samples?

Furthermore, are you trying to count the number of unique sequences (e.g., OTUs) that represent that phylum? Or are you trying to count the total number of sequences for all OTUs in that phylum?

How do you plan to use this information? With a little more clarity I may be able to tell you whether such a method already exists in QIIME 2 or give some advice for obtaining the values you need. Thanks!

It sounds like what you are probably trying to do is get the count of all sequences classified to a specific phylum in each sample. To do that, you would use taxa collapse to collapse at phylum level (level 2 in the usual taxonomy strings, where level 1 is kingdom). The resulting feature table would contain the frequency (number of sequences observed) for each phylum in each sample. You could then use heatmap to view this frequency information as a heatmap. Depending on what you are trying to do, though, you may want to either convert to relative abundance and show it as a barplot, or evenly sample your feature table before visualizing, to control for uneven read coverage between samples.
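To make the idea of collapsing concrete, here is a minimal Python sketch of what it does conceptually. This is not QIIME 2's implementation: the toy table, the taxonomy strings, and the function names `collapse_to_level` and `relative_abundance` are all made up for illustration, and it assumes semicolon-delimited taxonomy strings in which the second rank is the phylum.

```python
from collections import defaultdict

# Hypothetical toy data: feature ID -> {sample ID: read count}.
table = {
    "otu1": {"S1": 10, "S2": 0},
    "otu2": {"S1": 5, "S2": 7},
    "otu3": {"S1": 2, "S2": 3},
}
# Hypothetical taxonomy assignments for the same features.
taxonomy = {
    "otu1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "otu2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "otu3": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
}


def collapse_to_level(table, taxonomy, level):
    """Sum read counts of all features that share the first `level` ranks."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for feature, counts in table.items():
        ranks = [rank.strip() for rank in taxonomy[feature].split(";")]
        key = ";".join(ranks[:level])
        for sample, count in counts.items():
            collapsed[key][sample] += count
    return {taxon: dict(counts) for taxon, counts in collapsed.items()}


def relative_abundance(collapsed):
    """Divide each count by its sample total (samples with no reads get 0.0)."""
    totals = defaultdict(int)
    for counts in collapsed.values():
        for sample, count in counts.items():
            totals[sample] += count
    return {
        taxon: {
            sample: (count / totals[sample] if totals[sample] else 0.0)
            for sample, count in counts.items()
        }
        for taxon, counts in collapsed.items()
    }


# Level 2 is the phylum rank in these toy "k__; p__; ..." strings.
phylum_counts = collapse_to_level(table, taxonomy, level=2)
phylum_fractions = relative_abundance(phylum_counts)
```

Each value in `phylum_counts` is a number of reads, not a number of OTUs: the two Firmicutes OTUs contribute their summed reads to a single Firmicutes row.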

Hi Nicholas,
Thank you very much for your reply.
I first thought that what I get from the QIIME 2 analysis is an OTU table which has numbers at each level (e.g. number of Bacteroidetes or number of E. coli, etc.). But that seems not to be right, because table.qza has the counts of sequences classified to each level in each sample (e.g. the number of sequences observed for Bacteroidetes in each sample).

Please correct me if I am wrong.

"It sounds like what you are probably trying to do is get the count of all sequences classified to a specific phylum in each sample. To do that, you would use taxa collapse to collapse at phylum level."

Yes, that's what I needed, and thank you very much for your prompt reply. I ran the following:

qiime taxa collapse \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-level 2 \
  --o-collapsed-table table-2.qza

qiime tools export table-2.qza --output-dir taxonomy-table

biom convert --to-tsv -i taxonomy-table/feature-table.biom -o taxonomy-table/feature-table.tsv

When I looked at feature-table.tsv, it had the same numbers of OTUs that I got from taxonomy.qzv.

So from that, I understand that the final OTU table has sequence counts, not the number of OTUs. Right? Please correct me if I am wrong.

Many thanks, Thilini

Hi @Thilini,

Correct; it is the number of times each feature is observed in each sample. These features can be OTUs, sequence variants, or taxa (if you use taxa collapse), or even other types of data.

And correct again: the final table holds sequence counts, not the number of OTUs.
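If you want to check this for yourself, a minimal Python sketch like the one below reads a table in the TSV layout that biom convert writes (a comment line, then a header row starting with `#OTU ID`) and sums each sample's column. The function name `read_biom_tsv` and the example values are made up for illustration; on your own data you would open taxonomy-table/feature-table.tsv instead of the inline example.

```python
import csv
import io


def read_biom_tsv(handle):
    """Parse a TSV exported by `biom convert --to-tsv` into nested dicts."""
    samples = None
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0].startswith("#"):
            # The header row names the samples; other '#' lines are comments.
            if row[0] == "#OTU ID":
                samples = row[1:]
            continue
        if samples is None:
            raise ValueError("no '#OTU ID' header row found before the data")
        table[row[0]] = {s: float(v) for s, v in zip(samples, row[1:])}
    return samples, table


# Hypothetical example in the same layout as an exported, collapsed table.
example = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "k__Bacteria;p__Firmicutes\t15.0\t7.0\n"
    "k__Bacteria;p__Bacteroidetes\t2.0\t3.0\n"
)

samples, table = read_biom_tsv(io.StringIO(example))

# Column sums: the total number of reads retained in each sample.
totals = {s: sum(row[s] for row in table.values()) for s in samples}
```

If the column sums match the per-sample sequence counts reported for your feature table, the values are read counts rather than OTU counts.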

If you want to count the number of unique OTUs observed in each sample that belong to a specific phylum (instead of the number of times sequences belonging to phylum X are observed), you can do the following:

  1. Use taxa filter-table to only include OTUs that belong to the phylum you are interested in.
  2. Use diversity alpha with --p-metric observed_otus to count the number of unique OTUs belonging to phylum X in each sample.
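The difference between the two kinds of count can be sketched in plain Python. This only illustrates the logic of the two steps above and is not how QIIME 2 implements them: the toy data and the function name `observed_features_in_taxon` are made up, and a simple substring match on the taxonomy string stands in for the filtering step.

```python
# Hypothetical toy data: feature ID -> {sample ID: read count}.
table = {
    "otu1": {"S1": 10, "S2": 0},
    "otu2": {"S1": 5, "S2": 7},
    "otu3": {"S1": 2, "S2": 3},
}
# Hypothetical taxonomy assignments for the same features.
taxonomy = {
    "otu1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "otu2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "otu3": "k__Bacteria; p__Bacteroidetes; c__Bacteroidia",
}


def observed_features_in_taxon(table, taxonomy, taxon):
    """Count, per sample, the features of `taxon` seen at least once."""
    observed = {}
    for feature, counts in table.items():
        # Step 1: keep only features whose taxonomy mentions the taxon.
        if taxon not in taxonomy[feature]:
            continue
        # Step 2: a feature counts once per sample if it has any reads there.
        for sample, count in counts.items():
            observed[sample] = observed.get(sample, 0) + (1 if count > 0 else 0)
    return observed


firmicutes_otus = observed_features_in_taxon(table, taxonomy, "p__Firmicutes")
```

In this toy table, sample S1 contains two Firmicutes OTUs while S2 contains one, even though S2 still has 7 Firmicutes reads. Note that, as far as I know, newer QIIME 2 releases name this metric observed_features rather than observed_otus.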

I hope that helps clarify!

Thanks a lot, Nicholas. Really appreciate your help in this regard.
