Good morning QIIME2 team!
My previous work has focused on metagenomic and 16S sequencing of the rumen microbiome. However, we have begun to work with some more challenging samples in terms of abundance, etc. Recently, we had a set of sheep milk samples sequenced and I wanted to get a few opinions from this group.
This is 16S data that was put on a Nano run just to evaluate how the sequences looked, since we had high cluster density but only 55% of the clusters passed filter. I didn't expect things to go smoothly with milk samples and expected large variation, but attached is the table-paired-end qzv and, as you can see, we have HUGE variation in read counts. Samples #52 and #69 are our controls (water and zymogen community); all others are milk samples. Any advice as to where you would set the sampling depth for this type of data? Would running on a full lane improve this output?
Thank you for any and all advice on this matter!
table-paired-end2.qzv (559.7 KB)
Those read counts do not look good! I recommend going back to dada2 to see if changing the truncation settings will allow you to milk out any more reads.
If you just can’t get out any more reads by adjusting truncation parameters, I’d recommend resequencing, since it looks like you are losing too many samples right now.
I have not worked with ewe milk, but I have analyzed other milks (including human milk, which is probably as clean as it comes). Milk can be challenging, but it is not exactly a low-biomass sample type, so these low read counts look more like an issue with the dada2 truncation settings or with the low-quality sequencing run.
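To put a number on "losing too many samples", here is a minimal Python sketch that counts how many samples would survive at a few candidate sampling depths. The sample IDs and read counts are made up for illustration (they are not taken from the attached qzv), and `samples_retained` and `retention_table` are hypothetical helper names, not QIIME 2 functions.

```python
def samples_retained(read_counts, depth):
    """Return the IDs of samples with at least `depth` reads."""
    return [sample for sample, n in read_counts.items() if n >= depth]


def retention_table(read_counts, depths):
    """Map each candidate depth to the number of samples that would be kept."""
    return {d: len(samples_retained(read_counts, d)) for d in depths}


# Hypothetical per-sample read counts after denoising (illustrative only).
counts = {
    "milk_01": 12040,
    "milk_02": 310,
    "milk_03": 4875,
    "milk_04": 58,
    "milk_05": 2210,
}

for depth, kept in retention_table(counts, [500, 1000, 5000]).items():
    print(f"depth {depth}: {kept} of {len(counts)} samples retained")
```

If even a modest depth drops a large share of the milk samples, that points toward revisiting truncation or resequencing rather than picking a very low sampling depth.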
Thank you so much for the response! I think our dada2 truncation was fair, and I do think the low read counts are a product of the sequencing run. We are going to re-sequence these samples. May I ask what sequencing depth you have experience with in the human milk samples? We are trying to decide on an appropriate depth, and with this being a new sample type we are looking for all recommendations.
No, sorry, I cannot recall the precise amount! It’s been a while and I no longer have access to the data.
Lots of food samples are lower diversity than, say, soil, so you can often get sufficient coverage with ~1k QC’ed reads/sample. Aim high, though, to account for loss during sequencing and QC… maybe 10k/sample?
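The "aim high" arithmetic can be sketched roughly as below: divide the QC'ed reads you want by the fraction of raw reads you expect to survive filtering and denoising. The retention fractions here are assumptions for illustration only (nobody in this thread reported one), and `raw_reads_needed` is a hypothetical helper, not part of QIIME 2.

```python
import math


def raw_reads_needed(target_qc_reads, retention_fraction, safety_factor=1.0):
    """Estimate raw reads per sample to request from the sequencing run.

    target_qc_reads:    reads per sample wanted after QC/denoising
    retention_fraction: assumed share of raw reads that survive QC (0 < f <= 1)
    safety_factor:      optional extra margin for uneven pooling across samples
    """
    if not 0 < retention_fraction <= 1:
        raise ValueError("retention_fraction must be in (0, 1]")
    return math.ceil(target_qc_reads * safety_factor / retention_fraction)


# Assumed retention values; substitute the fraction from your own denoising stats.
for retention in (0.5, 0.25, 0.1):
    needed = raw_reads_needed(1000, retention)
    print(f"retention {retention:.0%}: request about {needed} raw reads/sample")
```

The denoising stats from your existing run are the best place to get a realistic retention fraction for your samples before settling on a target.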