Is it necessary to rarefy FeatureData[Sequence] if I rarefy FeatureTable[Frequency]?

Anisur_Rahman · October 14, 2021, 7:46am

Hi,

I am trying to do UPGMA clustering for some data from different countries. I collected these data from different databases.

Finally, I will use the following command:

qiime diversity beta-rarefaction

However, samples from some countries contain a very high number of reads (as many as 2.7 million), while samples from other countries have very few (only about 9k).

As a result, after denoising the data with Deblur, some samples have a very high number of features while others contain very few.

So, I am planning to rarefy the FeatureTable[Frequency] files to even out the feature counts. In the qiime feature-table rarefy command, I will use:

--p-no-with-replacement for samples with more features
--p-with-replacement for samples with fewer features
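
To make the difference between the two options concrete, here is a minimal Python sketch of rarefying a single sample's feature counts with and without replacement. It uses only the standard library and is not QIIME 2's implementation; the `rarefy` function, the feature names, and the depth of 100 are hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter


def rarefy(counts, depth, with_replacement=False, seed=None):
    """Subsample one sample's feature counts to exactly `depth` reads.

    `counts` maps a feature ID to its observed read count (illustrative data,
    not a real FeatureTable[Frequency] artifact).
    """
    rng = random.Random(seed)
    features = list(counts)
    if with_replacement:
        # Each draw is independent and weighted by the observed counts,
        # so `depth` may exceed the sample's total number of reads.
        weights = [counts[f] for f in features]
        drawn = rng.choices(features, weights=weights, k=depth)
    else:
        total = sum(counts.values())
        if depth > total:
            raise ValueError(f"depth {depth} exceeds sample total {total}")
        # Expand to one entry per read, then draw without putting reads back,
        # so no feature can end up with more reads than it started with.
        pool = [f for f in features for _ in range(counts[f])]
        drawn = rng.sample(pool, depth)
    return dict(Counter(drawn))


deep = {"featA": 500, "featB": 300, "featC": 200}  # 1000 reads in total
shallow = {"featA": 5, "featB": 3, "featC": 1}     # 9 reads in total

print(rarefy(deep, 100, seed=1))
print(rarefy(shallow, 100, with_replacement=True, seed=1))
```

In this sketch, sampling without replacement can only reduce a sample to a depth at or below its own total, whereas sampling with replacement can reach any depth by re-drawing the same reads, which is why the two options treat shallow samples so differently.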

Then I will merge the FeatureTable[Frequency] and FeatureData[Sequence] files.

In the next step, I will use these two files to generate a phylogenetic tree artifact with qiime fragment-insertion sepp.

And then, finally, I will create the UPGMA cluster.

Now my questions are:

Is rarefying the FeatureTable[Frequency] files the right decision?
Is it okay to use different replacement options for samples with fewer or more features?
I can't find any plugin or command in qiime to rarefy FeatureData[Sequence] files. Do I need to rarefy FeatureData[Sequence] if I rarefy the corresponding FeatureTable[Frequency] files?

thermokarst · October 15, 2021, 2:51pm

Hi @Anisur_Rahman!

I suggest you watch these videos for some discussion of rarefaction - this should help clear up some of your confusion here:
