The number of features is normal but the number of observed_otus is too low?

Hi @sandro.goforit,

A couple of things… first, your number of OTUs looks within a reasonable range, maybe a little bit low for fecal samples, but plausible. Keep in mind that you only have 100 samples, and the total number of features (OTUs, ASVs, etc.) often grows with the total number of samples. So more samples -> more observed ASVs because of noise, sequencing, real data, etc.

I’m a little bit more concerned that your data is plateauing at 100 ASVs. It seems like a very shallow depth, and I’d expect your observed ASVs to look like @timanix’s bottom curve. I’m not saying this is wrong - it’s just not the behaviour you might expect, like your title says. My takeaway might be that you just have low-complexity samples… how do they look when you use other methods of examination?

Best,
Justine

Can you show me your DADA2 de-noise code? Ben

You have, for example, 4000 features in 100 samples, and it’s OK for the curves to plateau at 50-150 OTUs, since the OTUs may differ from sample to sample. 4000 is the total number of OTUs across all your samples together. Some OTUs are shared by every sample, others are not. Or am I misunderstanding what you are asking?
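To make the difference between the table-wide feature count and the per-sample count concrete, here is a minimal sketch. The table and its counts are made up for illustration, and `observed_features` is a hypothetical helper, not a QIIME 2 function.

```python
# Toy feature table (made-up counts): sample -> {feature ID: read count}.
table = {
    "sample_A": {"otu1": 10, "otu2": 5, "otu3": 0, "otu4": 0, "otu5": 0},
    "sample_B": {"otu1": 3, "otu2": 0, "otu3": 7, "otu4": 1, "otu5": 0},
    "sample_C": {"otu1": 8, "otu2": 0, "otu3": 0, "otu4": 0, "otu5": 2},
}


def observed_features(counts):
    """Number of features with at least one read in a single sample."""
    return sum(1 for n in counts.values() if n > 0)


# Per-sample richness: what an observed_otus curve levels off at.
per_sample = {sample: observed_features(counts) for sample, counts in table.items()}

# Table-wide richness: every feature seen in at least one sample.
total_features = len(
    {feature for counts in table.values() for feature, n in counts.items() if n > 0}
)

print(per_sample)
print(total_features)
```

In this toy table no sample holds more than 3 features, yet the table as a whole holds 5, because only `otu1` is shared by all samples.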

Thank you so much!!! Now I understand :blush:

Hi Ben,
This is what I've done👇

Best,
Sandro

Hi Justine,
Thank you. Your explanation helped me a lot. My samples came from cancer patients. When I checked the number of observed_otus in other papers that used the same kind of samples, I found numbers of around 150-250. :thinking::worried:
You mentioned other methods of examination, and this is what I've done, in two ways. :point_down:
Do you have any suggestions about methods of examination?

Best,
Sandro

Hi @sandro.goforit,

First, that is a beautiful graphic and/or your handwriting is lovely.

If the curves (if not the numbers) are consistent between the two methods, then I would stick with that. It’s a bit odd to me because I would expect your curves to continue to grow, but if this is your data, this is your data.

The actual alpha diversity number presented in papers is a function of their rarefaction depth, sequencing protocol, metric… and is not as externally valid as you’d like. Depending on their depth, those published numbers seem low to me as well, so your low numbers may be in line with theirs. How do these numbers compare to your healthy controls?
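As a rough illustration of why the observed-features number depends on rarefaction depth, here is a minimal sketch that subsamples one made-up sample at several depths. The community composition and the `rarefy_observed` helper are assumptions for illustration only; this is not how QIIME 2 implements rarefaction internally.

```python
import random


def rarefy_observed(counts, depth, seed=0):
    """Subsample `depth` reads without replacement and count distinct features."""
    # Expand the count table into one entry per read.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    return len(set(rng.sample(reads, depth)))


# Hypothetical sample: 5 abundant features plus 50 rare ones (2 reads each).
counts = {f"asv{i}": 1000 for i in range(5)}
counts.update({f"rare{i}": 2 for i in range(50)})

total_reads = sum(counts.values())
for depth in (100, 1000, total_reads):
    print(depth, rarefy_observed(counts, depth))
```

At shallow depths most of the rare features are missed, so the same sample reports a lower richness. Only at the full depth are all 55 features guaranteed to appear.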

My recommended next step would be to look at things like beta diversity to see what your overall patterns look like. I’d (personally) stick to the DADA2 table without clustering because you’ve done all the work to get there and it seems wasteful to then collapse it back again.
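For readers new to beta diversity, here is a minimal sketch of one common between-sample metric, Bray-Curtis dissimilarity, computed on made-up counts. The `bray_curtis` helper is hypothetical and only meant to show the arithmetic. In practice you would use the metrics provided by your analysis pipeline.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples given as {feature: count}."""
    features = set(a) | set(b)
    diff = sum(abs(a.get(f, 0) - b.get(f, 0)) for f in features)
    total = sum(a.get(f, 0) + b.get(f, 0) for f in features)
    if total == 0:
        raise ValueError("both samples are empty")
    return diff / total


# Made-up samples for illustration.
sample_1 = {"asv1": 6, "asv2": 4}
sample_2 = {"asv1": 4, "asv2": 6}
sample_3 = {"asv3": 10}

print(bray_curtis(sample_1, sample_2))  # similar composition, small value
print(bray_curtis(sample_1, sample_3))  # no shared features, maximal value
```

Values range from 0 (identical composition) to 1 (no shared features).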

Best,
Justine

Can we see the denoising stats and the cutadapt stats? Thank you. Ben

Hi Justine,
Thank you.:grinning:
Actually, I haven't sequenced the samples from healthy people. This is a great suggestion.
And the numbers of observed_otus are consistent between the two methods. I think I will accept this result. Here is a result of beta_diversity. The samples are divided into 6 groups according to different treatment phases. I think there is no obvious difference between these groups👇.

Hi @sandro.goforit,

I think at this point, it looks like you’re okay to move forward. I would recommend looking at some of the tutorials to see what the pipelines for statistics, etc. look like. For base QIIME 2, I think the Moving Pictures tutorial is a good start, and the Parkinson’s Mouse tutorial is a bit more comprehensive. (It’s new for this release, and covers some more methods and hopefully a bit of interpretation.) You also might want to look into q2-longitudinal if your data is a time series. If you’re working in R, there are probably a lot of really good tutorials there (I’m just less familiar). But, of course, these are all just starting places, and there are a lot more good options for exploring and analysing your data.

Best,
Justine

Hi Ben~
This is my stats.dada2.qzv👇



I got a trimmed-sequences.qza from cutadapt.
And this is the "qiime demux summarize" of it.
trimmed_sequences.qzv👇


Is there something wrong? Thank you. :blush:

Best,
Sandro

Hi Justine~
Thank you for your suggestions. I will read these tutorials one by one:grinning:

Best,
Sandro

Thanks. I noticed you lose a lot of reads between the merged and non-chimeric steps.

I think just reviewing this would be interesting (you're losing up to 75% of your reads between merged and non-chimeric).

This may explain why you lose resolution. I sometimes lose up to 50% of the reads, but I think that's a lot of reads to lose on the last step.
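A quick way to put a number on that loss is to compare the `merged` and `non-chimeric` columns of the DADA2 denoising stats for each sample. The sketch below uses made-up read counts, and `percent_lost` is a hypothetical helper rather than part of QIIME 2.

```python
def percent_lost(before, after):
    """Percentage of reads lost between two consecutive pipeline steps."""
    if before <= 0:
        raise ValueError("the earlier step must have at least one read")
    return 100.0 * (before - after) / before


# Made-up counts for one sample, using the column names from the denoising stats.
stats = {"merged": 20000, "non-chimeric": 5000}

loss = percent_lost(stats["merged"], stats["non-chimeric"])
print(f"{loss:.1f}% of merged reads were removed as chimeric")
```

If most samples show a loss of this size at the chimera step, it is worth checking whether primers or adapters are still present in the reads before denoising.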

As you can see, I don't lose a lot of sequences between merged and non-chimeric. See this:

Also try here:

Ben

Actually, I just troubleshot this for another lab. Here is my code for the V3V4 region:

qiime dada2 denoise-paired \
--i-demultiplexed-seqs ~/QIIME2_2_import/paired-end-demux.qza \
--p-trim-left-f 13 \
--p-trim-left-r 13 \
--p-trunc-len-f 270 \
--p-trunc-len-r 270 \
--p-n-threads 0 \
--o-table ~/QIIME2_3_demux/table_540.qza \
--o-representative-sequences ~/QIIME2_3_demux/rep-seqs_540.qza \
--o-denoising-stats ~/QIIME2_3_demux/denoising-stats_540.qza
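When adapting the truncation lengths to your own data, it helps to check that the truncated reads can still overlap enough to merge. The sketch below is simple arithmetic under stated assumptions: the reads start at the two ends of the amplicon, the amplicon length is known (460 bp, the value reported further down this thread, is used as an example), and the minimum overlap is DADA2's default of 12 nt. The helper names are hypothetical.

```python
DEFAULT_MIN_OVERLAP = 12  # DADA2's default minimum overlap when merging read pairs


def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Bases shared by the forward and reverse reads after truncation."""
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=DEFAULT_MIN_OVERLAP):
    """True if the truncated reads overlap by at least `min_overlap` bases."""
    return expected_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


# Truncation lengths from the command above, with an example amplicon length.
print(expected_overlap(270, 270, 460))
print(can_merge(270, 270, 460))
print(can_merge(230, 230, 460))
```

Truncating too aggressively leaves no overlap, and the pairs then fail at the merging step rather than at chimera removal, so the two kinds of loss show up in different columns of the denoising stats.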

Hi Ben,
Super thanks for your suggestion :smiley:! Now I’m trying to figure this out, and I will read everything you recommended. :muscle:

Best,
Sandro

Hi Ben,
Your suggestion helped me sososo much!!! :blush::blush: Thank you for finding the problem in my qzv-result. And I finished reading the website you recommended, which helped me to know what I can do for the next step.
Super thanks.:green_heart::green_heart:
I will update my result.

Best,
Sandro

Yes, let me tell you, I have run into this same problem. After I fixed it, I got better beta-diversity distributions. Could you confirm that your PCoAs look the same or better? Thank you. Ben

Hi Yue,

I found something interesting in your primers for the V3-V4 region. Could you please tell me the length of your amplicons or the location of the primer pair? What kind of platform did you use for NGS (MiSeq, HiSeq, or NovaSeq 6000)? Thanks

Best,

Decen

Hi Decen,
Thank you for your help.:grinning: Sorry for my late reply.
The length of my amplicons is 460 bp and the primers are CCTACGGRRBGCASCAGKVRVGAAT (upstream) and GGACTACNVGGGTWTCTAATCC (downstream). The platform is Illumina MiSeq. Is there something wrong? :face_with_monocle:

Best,
Yue

Hi Yue,

Nothing serious.
It looks like they are not commonly used degenerate primers. When I tested your primer pair with SILVA TestPrime (https://www.arb-silva.de/search/testprime/), it did not pass. I am using the common primers, with an amplicon length of ~470 bp. If you don't mind, could you please share where your primer pair comes from? Thanks
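One way to see how degenerate a primer is: count how many distinct sequences its IUPAC ambiguity codes expand to. The sketch below does that for the two primers quoted in this thread. The `degeneracy` helper is hypothetical, and a high count is only a rough indicator, not a verdict on whether the primers are suitable.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_OPTIONS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Number of distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_OPTIONS[base]  # raises KeyError on a non-IUPAC character
    return total


forward = "CCTACGGRRBGCASCAGKVRVGAAT"  # upstream primer from this thread
reverse = "GGACTACNVGGGTWTCTAATCC"     # downstream primer from this thread

print(len(forward), degeneracy(forward))
print(len(reverse), degeneracy(reverse))
```

A primer with many ambiguous positions represents a large pool of sequences, which is one possible reason a pair behaves differently from the commonly used V3-V4 primers in an in silico test.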
Best,