Does the diversity (alpha/beta) of one sample/group affect others?

khemlalnirmalkar · June 9, 2021, 9:43pm

Hello,
I came across an unusual observation. I have two different datasets, but the controls are the same in both. When I analyzed the two datasets separately and checked the mean/median values for the controls in each, they were very different. Why? I used the same control fastqs for both datasets and the same parameters/criteria. Does the diversity (alpha/beta) of one sample/group affect others?

Thanks,

vheidrich · June 9, 2021, 10:20pm

Hi,

I am sure other members can give you a more complete response, but the first thing that comes to mind is that, as far as I know, denoising is dataset dependent. Your control samples may therefore end up with different ASVs depending on the denoising run they were part of, which would explain the different diversities you reported. So using the same control fastqs and parameters in both pipelines may not be enough; you may need to denoise all of your samples together and separate the datasets afterwards. Alternatively, if each dataset comes from a different sequencing run, you could denoise each sequencing run separately and then merge or split the samples as you like after denoising.

Hope this helps

khemlalnirmalkar · June 9, 2021, 11:11pm

@vheidrich Thanks. Now I think this could be one of the reasons. Do you know how to separate the datasets after denoising? I am going to dig a little more into the tutorial.
Thanks

ChrisKeefe · June 10, 2021, 12:42am

Just a couple of things I'd like to add here:

As @vheidrich suggests above, if you're denoising multiple sequencing runs with DADA2, you should not merge your data prior to denoising; it can mess with the error model. Deblur is not similarly affected, AFAIK.
If you are using core-metrics, core-metrics-phylogenetic, or any other workflow in which you rarefy your data, you are randomly subsampling without replacement prior to calculating diversity. Depending on a few factors (e.g. sampling depth), variations in outcome based on this random subsampling can be significant. The deeper you sample, the more representative your outcomes will be.
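To illustrate the point about random subsampling, here is a minimal sketch in plain Python. It is not QIIME 2's implementation; the `rarefy` and `observed_features` helpers and the `control` sample are hypothetical and only mimic subsampling without replacement. Observed features typically vary between random draws, and the variation tends to shrink as the sampling depth increases.

```python
import random

def rarefy(counts, depth, seed):
    """Subsample `depth` reads without replacement from a feature -> read count dict."""
    # Expand the counts into one entry per read, then draw without replacement.
    reads = [feature for feature, n in sorted(counts.items()) for _ in range(n)]
    if depth > len(reads):
        raise ValueError("sampling depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    rarefied = {}
    for feature in rng.sample(reads, depth):
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied

def observed_features(counts):
    """Count the features with at least one read."""
    return sum(1 for n in counts.values() if n > 0)

# Hypothetical control sample: a few abundant features plus many rare ones.
control = {f"abundant_{i}": 500 for i in range(5)}
control.update({f"rare_{i}": 2 for i in range(40)})

# Each seed stands in for one independent rarefaction of the same sample.
for depth in (200, 2000):
    richness = [observed_features(rarefy(control, depth, seed)) for seed in range(5)]
    print(f"depth={depth}: observed features across 5 random draws = {richness}")
```

Fixing the seed makes a single draw reproducible in this sketch, but two separate analyses that each subsample independently can still report different values for the same control.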

khemlalnirmalkar · June 10, 2021, 6:11am

Thanks for your suggestion. I don't have multiple sequencing runs. I am now thinking of denoising all my fastqs together, then separating my samples into two datasets while keeping the controls in both. Do you know how to separate the denoised output?

Thanks,

ChrisKeefe · June 10, 2021, 5:29pm

QIIME 2 offers tons of tools for filtering data. Check out the tutorial! I suspect you'll want either demux filter-samples or feature-table filter-samples, depending on your use case.
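As a rough picture of what splitting a denoised table looks like, the sketch below filters a table by sample metadata in plain Python. It does not call QIIME 2; the `filter_samples` function here, the sample IDs, and the group names are all hypothetical. In QIIME 2 itself the same idea would go through feature-table filter-samples with a sample metadata file.

```python
def filter_samples(table, metadata, keep_groups):
    """Keep only the samples whose metadata group is in `keep_groups`."""
    return {
        sample: counts
        for sample, counts in table.items()
        if metadata[sample] in keep_groups
    }

# Hypothetical denoised table (sample -> feature counts) and sample metadata.
table = {
    "ctrl_1": {"ASV_1": 10, "ASV_2": 5},
    "ctrl_2": {"ASV_1": 8, "ASV_3": 2},
    "a_1": {"ASV_2": 7},
    "b_1": {"ASV_3": 9, "ASV_4": 1},
}
metadata = {
    "ctrl_1": "control",
    "ctrl_2": "control",
    "a_1": "study_a",
    "b_1": "study_b",
}

# Denoise once, then split: the controls are kept in both datasets.
dataset_a = filter_samples(table, metadata, {"control", "study_a"})
dataset_b = filter_samples(table, metadata, {"control", "study_b"})

print(sorted(dataset_a))
print(sorted(dataset_b))
```

Because both datasets are cut from the same denoised table, the controls start out with identical feature counts in each; any remaining difference has to come from a later step.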

khemlalnirmalkar · June 12, 2021, 1:53am

Thanks for the suggestion. I filtered the table.qza and ran the diversity analyses. I still get a minor difference between the controls from the two datasets.
Now I don't know what the reason for the different values could be. For example, the observed features for the same controls differ between the two datasets.

timanix · June 14, 2021, 9:00am

Hi! Sorry for the delayed reply.
If you rarefied to a certain sequencing depth before calculating the diversity metrics, minor differences in the metrics are to be expected, since rarefaction is a random subsampling of reads from each sample.
Even repeating the analysis with the same dataset and rarefaction depth may produce minor differences in diversity metrics.
