I ran a microbiome analysis of cancer vs. normal tissue, first with QIIME2 and then with R, and I have a question about rarefaction.
First: after denoising I got the table `stats.tsv`, which contains the number of non-chimeric sequences per sample.
Here is an example:
sample-id | input | filtered | percentage of input passed filter | denoised | merged | percentage of input merged | non-chimeric |
---|---|---|---|---|---|---|---|
NT01 | 210543 | 28876 | 13.72 | 22526 | 15749 | 7.48 | 9172 |
NT02 | 212269 | 38087 | 17.94 | 33803 | 28788 | 13.56 | 8286 |
NT03 | 218688 | 63771 | 29.16 | 58613 | 50601 | 23.14 | 12628 |
In this experiment the minimum number of non-chimeric sequences was 2319, so I generated my rarefaction curve with `qiime diversity alpha-rarefaction`, using this command:
qiime diversity alpha-rarefaction \
--i-table table.qza \
--i-phylogeny rooted-tree.qza \
--p-max-depth 2319 \
--m-metadata-file metadata.tsv \
--o-visualization alpha-rarefaction.qzv
I got my rarefaction curve and I was happy.
Then I imported my data into R in order to use `phyloseq`, so I built my phyloseq object from my metadata, the taxonomy from `taxonomy.qza`, the feature table from `table.qza`, and my phylogenetic tree. But when I sum the columns of the feature table, I don't get the same number of sequences as in the denoising stats; in fact, the numbers are very different. Obviously, when I performed the rarefaction with `rarefy_even_depth`, the result is also very different.
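To make the comparison concrete, this is roughly the check I am describing, written as a minimal Python sketch. It assumes the feature table has been exported to a tab-separated file with features as rows and samples as columns. The feature IDs and counts are made-up placeholders (chosen so the sums match, which is what I expected); only the non-chimeric values come from the stats table above. The helper names `sample_totals` and `compare` are my own, not part of QIIME2 or phyloseq.

```python
import csv
import io

# Hypothetical feature table exported to TSV (features as rows, samples as columns).
# The feature IDs and counts are made-up placeholders, not my real data.
FEATURE_TABLE_TSV = (
    "feature-id\tNT01\tNT02\tNT03\n"
    "asv_a\t5000\t4000\t7000\n"
    "asv_b\t3000\t4286\t5000\n"
    "asv_c\t1172\t0\t628\n"
)

# Non-chimeric counts for the same samples, copied from the stats table above.
NON_CHIMERIC = {"NT01": 9172, "NT02": 8286, "NT03": 12628}


def sample_totals(tsv_text):
    """Sum each sample column of a feature table given as TSV text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the feature IDs
    totals = dict.fromkeys(samples, 0)
    for row in reader:
        if not row:
            continue  # skip blank lines
        for sample, count in zip(samples, row[1:]):
            # Exported tables may write counts as "5000.0", so go through float.
            totals[sample] += int(float(count))
    return totals


def compare(totals, expected):
    """Return {sample: (table_total, stats_count, difference)} for shared samples."""
    return {
        sample: (totals[sample], expected[sample], totals[sample] - expected[sample])
        for sample in sorted(totals)
        if sample in expected
    }


totals = sample_totals(FEATURE_TABLE_TSV)
for sample, (total, stats, diff) in compare(totals, NON_CHIMERIC).items():
    print(f"{sample}\ttable={total}\tstats={stats}\tdiff={diff}")
```

With these placeholder counts every difference is 0. With my real table the differences are far from 0, and that is what I don't understand.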
So the question is: why do I get a different number of sequences per sample? Shouldn't the sums of the `feature-table` columns (the samples) match the values in the `non-chimeric` column of the `denoising-stats` table?
Thanks in advance.