Only a few samples appear in the alpha diversity results

Hi everyone,

This is my very first post! I am a BI student carrying out a mouse analysis, and I should warn you that I am not an expert at all.

My data are 77 samples (paired-end): 33 were sequenced when the animals were 8 months old and the rest at 12 months. One of my aims is to study the differences between ages.

The sequencing quality is really poor, especially in the reverse reads. The best explanation I have found is that the fecal matter was not well preserved before sequencing and that the protocol used for the reverse reads is not optimized. When I checked with FastQC, I didn't find any particularly bad sample; they were all generally bad, with a few quality warnings.

Then I used DADA2 for paired-end reads to denoise, trimming at 242 (forward) and 180 (reverse). Analyzing the output denoising_stats.qzv, I noticed that the samples with the highest values for merged and non-chimeric reads happen to be my 12-month samples.
I performed the alignment with MAFFT, then built the unrooted tree, then the rooted tree, and then ran core-metrics-phylogenetic with a sampling depth of 25000. I have read several posts to understand the criteria for this value, and all I can safely say is that it must be lower than my minimum sequencing count (mine is 33000).

To my surprise, when I checked the .qzv files produced by the core diversity metrics, they only contained samples from the 12-month sequencing.

I am sure I am doing something wrong, but I can't find exactly where the problem is. I guess my denoising table is not very good.

Looking forward to your help!

Thank you

metadata(3).tsv (4.2 KB)

Hi @VerheulJulia,
Maybe your 8-month samples were lower quality than your 12-month samples for some reason. This might have left you with fewer reads per sample for the 8-month group (because DADA2 filtered them out due to low quality or inability to merge).

So I think what is happening is that your sampling depth is too high and is filtering out all your 8-month samples. I would open your table.qzv, go to the Interactive Sample Detail tab, and change the metadata category drop-down menu to whichever metadata column records whether the sample is 8-month or 12-month. Then enter 25000 as the sampling depth to check whether it filters out all your 8-month samples. If the 8-month bar greys out, that means all the 8-month samples are getting filtered out.
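
As a rough illustration of the same check outside the viewer, here is a minimal Python sketch. The per-sample counts and the "8M"/"12M" labels are made up for the example (they are not taken from the attached table), and `retained_by_group` is a hypothetical helper, not a QIIME 2 function. It assumes that samples with fewer reads than the sampling depth are dropped.

```python
from collections import Counter

def retained_by_group(counts, groups, depth):
    """Return {group: (samples kept, samples total)} at a given sampling depth."""
    total = Counter(groups[s] for s in counts)
    # A sample survives only if it has at least `depth` reads.
    kept = Counter(groups[s] for s, n in counts.items() if n >= depth)
    return {g: (kept[g], total[g]) for g in total}

# Hypothetical read counts per sample after denoising (illustrative only).
counts = {"m01": 3100, "m02": 2400, "m03": 8900,
          "m04": 41000, "m05": 36500, "m06": 27800}
groups = {"m01": "8M", "m02": "8M", "m03": "8M",
          "m04": "12M", "m05": "12M", "m06": "12M"}

for depth in (25000, 2000):
    print(depth, retained_by_group(counts, groups, depth))
```

With these invented numbers, a depth of 25000 keeps none of the "8M" samples, while a depth of 2000 keeps all of them, which is the pattern to look for in the real table.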

Let me know if that works. If not, can you send me your table.qzv so that I can look at it?


Hi @cherman2,

Thank you for answering!

Yes, the 8-month samples are probably worse than the 12-month ones. I tried a sampling depth of 2000 and I get all the samples. I read a few posts beforehand to understand the criteria for this value: some argued that the value for microbiome analysis is 2000, and others that it should be close to the minimum sequencing count. If I use 2000, am I skewing the results in any way?

table.qza (114.9 KB)
table.qzv (477.0 KB)

Hello @VerheulJulia
Have you checked out the QIIME 2 YouTube channel? We have a video about selecting sampling depth: PD Mice: Even Sampling Depth - YouTube

I think this might help. It discusses how to choose a sampling depth and make sure that there are not any issues surrounding that sampling depth.

I think something important to note is that there is no single perfect sampling depth. You have to decide where the cutoff is for your data based on how many samples you are willing to lose versus how many features you want per sample. I think Matt's video covers this pretty well, so I really recommend it!
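
To make that trade-off a bit more concrete, here is a small Python sketch. For each candidate depth it reports how many samples survive and how many sequences are kept in total (depth times surviving samples). The counts are invented for illustration and `depth_tradeoff` is a hypothetical helper, not part of QIIME 2.

```python
def depth_tradeoff(counts, candidate_depths):
    """For each depth, return (depth, samples retained, total sequences retained)."""
    rows = []
    for depth in sorted(candidate_depths):
        # Samples below the depth are dropped; the rest are subsampled to `depth`.
        kept = sum(1 for n in counts if n >= depth)
        rows.append((depth, kept, depth * kept))
    return rows

# Hypothetical per-sample read counts (illustrative only).
sample_counts = [2400, 3100, 8900, 27800, 36500, 41000]

for depth, kept, seqs in depth_tradeoff(sample_counts, [2000, 8000, 25000]):
    print(f"depth={depth:>6}  samples kept={kept}  sequences kept={seqs}")
```

A low depth keeps every sample but throws away most reads from the deeper samples; a high depth keeps more reads per sample but loses the shallow samples entirely.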


I will check it out, thanks for your help!!

