A couple of things… first, your number of OTUs looks within a reasonable range, maybe a little low for fecal samples, but plausible. Keep in mind that you only have 100 samples, and the total number of features (OTUs, ASVs, etc.) often grows with the total number of samples. So more samples -> more observed ASVs because of noise, sequencing, real data, etc.
I’m a little more concerned that your data is plateauing at 100 ASVs. It seems like a very shallow depth, and I’d expect your observed ASVs to look like @timanix’s bottom curve. I’m not saying this is wrong - it’s just not behaviour you might expect, like your title says. My takeaway might be that you just have low-complexity samples… how do they look when you go to other methods of examination?
You have, for example, 4000 features in 100 samples, and it’s OK for each sample’s curve to level off at 50-150 OTUs, since the OTUs may differ from sample to sample. 4000 is the total number of OTUs across all your samples together. Some OTUs are shared by every sample, others are not. Or am I misunderstanding what you are asking?
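To make that concrete, here is a minimal sketch with a made-up three-sample table (the sample names, OTU names, and counts are all hypothetical). It only illustrates how the table-wide feature total can be larger than any single sample's observed count; it is not how QIIME 2 computes anything.

```python
# Toy feature table: one dict of feature counts per sample.
# All names and numbers are invented for illustration.
table = {
    "sample_A": {"otu1": 10, "otu2": 5, "otu3": 0, "otu4": 0, "otu5": 2},
    "sample_B": {"otu1": 7, "otu2": 0, "otu3": 3, "otu4": 0, "otu5": 0},
    "sample_C": {"otu1": 4, "otu2": 0, "otu3": 0, "otu4": 9, "otu5": 0},
}


def present(counts):
    """Set of features with a non-zero count in one sample."""
    return {feature for feature, n in counts.items() if n > 0}


# Observed features per sample (what a per-sample curve levels off at).
per_sample = {name: len(present(counts)) for name, counts in table.items()}

# Features seen in at least one sample (the table-wide total).
total_features = len(set.union(*(present(c) for c in table.values())))

# Features seen in every sample.
shared = set.intersection(*(present(c) for c in table.values()))

print(per_sample)      # {'sample_A': 3, 'sample_B': 2, 'sample_C': 2}
print(total_features)  # 5
print(shared)          # {'otu1'}
```

Here no sample has more than 3 observed features, yet the table holds 5, because only otu1 is shared by all three samples.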
Hi Justine,
Thank you. Your explanation helped me a lot. My samples came from cancer patients. When I checked the number of observed_otus in other papers that used the same kind of samples, I found that the numbers of observed_otus were around 150-250.
You mentioned other methods of examination, and this is what I've done, in two ways.
Do you have some suggestions about methods of examination?
First, that is a beautiful graphic and/or your handwriting is lovely.
If the curves (if not the numbers) are consistent between the two methods, then I would stick with that. It’s a bit odd to me because I would expect your curves to continue to grow, but if this is your data, this is your data.
The actual alpha diversity number presented in papers is a function of their rarefaction depth, sequencing protocol, metric… and not as externally valid as you’d like. Depending on their depth, it seems low to me, and so your low depth may correlate. How do these numbers compare to your healthy controls?
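As a rough illustration of why the reported number depends on rarefaction depth, here is a small sketch that subsamples one made-up sample without replacement and counts the distinct features. The function name `rarefy_observed`, the counts, and the depths are all assumptions for the example, and this is not QIIME 2's implementation.

```python
import random


def rarefy_observed(counts, depth, seed=0):
    """Subsample `depth` reads without replacement; return distinct features seen."""
    # Expand the counts into one entry per read.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return len(set(rng.sample(reads, depth)))


# Made-up sample: 5 dominant features plus 50 features seen once each.
sample = {f"asv{i}": 200 for i in range(5)}
sample.update({f"rare{i}": 1 for i in range(50)})

total_reads = sum(sample.values())
richness = sum(1 for n in sample.values() if n > 0)

# Observed features at a few depths; the last depth uses every read.
curve = {d: rarefy_observed(sample, d) for d in (10, 100, 500, total_reads)}
print(curve)
```

At the full depth every read is drawn, so the observed count equals the sample's richness; at shallow depths the rare features are mostly missed, which is why two studies with different depths can report quite different observed_otus for similar samples.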
My recommended next step would be to look at things like beta diversity to see what your overall patterns look like. I’d (personally) stick to the dada2 table without clustering because you’ve done all the work to get there and it seems wasteful to then collapse it back again.
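For anyone unsure what a beta diversity value actually measures, here is a minimal sketch of Bray-Curtis dissimilarity, one commonly used metric, written in plain Python. The function name and the count vectors are made up for illustration; in practice you would use the distance matrices your pipeline produces rather than this.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical counts for the same features in different samples.
print(bray_curtis([10, 0, 5], [10, 0, 5]))  # identical samples -> 0.0
print(bray_curtis([10, 0], [0, 10]))        # no shared features -> 1.0
print(bray_curtis([6, 4], [4, 6]))          # partial overlap -> 0.2
```

Values run from 0 (identical composition) to 1 (nothing shared), and a PCoA plot is an ordination of the full matrix of such pairwise values.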
Hi Justine,
Thank you.
Actually, I haven't sequenced the samples from healthy people. This is a great suggestion.
And the numbers of observed_otus are consistent between the two methods. I think I will accept this result. Here is a result of beta_diversity. The samples are divided into 6 groups according to different treatment phases. I think there is no obvious difference between these groups👇.
I think at this point, it looks like you’re okay to move forward. I would recommend looking at some of the tutorials to see what the pipelines for statistics, etc. look like. For base QIIME 2, I think the Moving Pictures tutorial is a good start, and the Parkinson’s Mouse tutorial is a bit more comprehensive. (It’s new for this release, and covers some more methods and hopefully a bit of interpretation.) You also might want to look into q2-longitudinal, if your data is a time series. If you’re working in R, there are probably a lot of really good tutorials there (I’m just less familiar). But, of course, these are all just starting places and there are a lot more good options for exploring and analysing your data.
Hi Ben,
Your suggestion helped me sososo much!!! Thank you for finding the problem in my qzv-result. And I finished reading the website you recommended, which helped me to know what I can do for the next step.
Super thanks.
I will update my result.
Yes, I have run into this same problem; after I fixed it, I got better beta-diversity distributions. Could you confirm that your PCoAs look the same or better? Thank you. Ben
I found something interesting in your primers for the V3-V4 region. Could you please tell me the length of your amplicons or the location of the primer pair? What kind of platform did you use for NGS (MiSeq, HiSeq, or NovaSeq 6000)? Thanks
Hi Decen,
Thank you for your help. Sorry for my late reply.
The length of my amplicons is 460 bp, and the primers are CCTACGGRRBGCASCAGKVRVGAAT (upstream) and GGACTACNVGGGTWTCTAATCC (downstream). The platform is Illumina MiSeq. Is there something wrong?
Nothing serious.
They don’t look like “common degenerate primers”. When I tried to test your primer pair in SILVA “https://www.arb-silva.de/search/testprime/”, it did not pass. I am using the common primers, with an amplicon length of ~470 bp. If you like, could you please share where your primer pair comes from? Thanks
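For context on what "degenerate" means here, this is a small sketch that expands the standard IUPAC nucleotide codes in the two primers quoted above, counts how many concrete sequences each one stands for, and builds a regular expression for matching. The helper names `degeneracy` and `primer_to_regex` are made up for this example, and it does not reproduce what SILVA's test does (no database lookup, no mismatch handling).

```python
import re
from math import prod

# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of concrete sequences a degenerate primer represents."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def primer_to_regex(primer):
    """Regular expression matching every sequence the primer represents."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


forward = "CCTACGGRRBGCASCAGKVRVGAAT"  # upstream primer from the post above
reverse = "GGACTACNVGGGTWTCTAATCC"     # downstream primer from the post above

print(degeneracy(forward))  # 864
print(degeneracy(reverse))  # 24
print(primer_to_regex(reverse))
# One concrete variant of the downstream primer matches its pattern.
print(bool(re.fullmatch(primer_to_regex(reverse), "GGACTACAAGGGTATCTAATCC")))  # True
```

The upstream primer has eight degenerate positions and expands to 864 sequences, while the downstream one has three and expands to 24.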
Best,