Meta analysis with different 16S region and get too low features after DADA2

Hi @Jonathan,

I am so sorry for disappearing on this.

Your pipeline mostly makes sense, but I think there are a few checks you may want to revisit.

Are you discarding untrimmed reads when you do primer trimming? Have you checked how many reads you're losing at this step? I would recommend making sure you're not discarding everything if the primers have already been trimmed.
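As a sketch of what I mean (primer sequences here are the standard V3-V4 341F/805R pair as an example, and the file names are placeholders for your own artifacts):

```shell
# Example only: trim V3-V4 primers and DROP reads where no primer was found.
# If the primers were already removed upstream, --p-discard-untrimmed will
# throw away nearly everything, which is exactly what you want to catch here.
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demux.qza \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --p-discard-untrimmed \
  --o-trimmed-sequences trimmed.qza \
  --verbose

# Compare the read counts before and after trimming.
qiime demux summarize --i-data trimmed.qza --o-visualization trimmed.qzv
```

If the summary shows most reads disappearing at this step for a given study, that's a strong hint the primers were already gone from those reads.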

I would recommend this for a primer pair, but not for all the studies overall.

You should not do quality filtering before you run DADA2. DADA2 assumes that your reads have not been quality filtered, so I would recommend either using Deblur (which does expect a quality-filtering step) or skipping that step entirely. I would recommend using the same parameters within the same hypervariable region, but parameters can and should differ between regions.
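For the DADA2 route, a minimal sketch looks like this; the truncation lengths below are placeholders you would choose from the quality plots for each region, keeping them consistent across studies within that region:

```shell
# Example only: denoise raw (NOT quality-filtered) paired-end reads with DADA2.
# --p-trunc-len-f/-r are placeholders; pick them per region from the quality
# profiles, and reuse the same values for every study in that region.
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs trimmed.qza \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats dada2-stats.qza
```

The denoising stats artifact is the place to look when you're diagnosing low feature counts, since it shows exactly where reads were lost (filtering, denoising, merging, or chimera removal).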

I think the fact that you're tracing back to the ASV table is smart, and would recommend checking your read statistics in each step.
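Tracking retention can be as simple as tabulating the read count after each step and flagging big drops. A small sketch, with made-up placeholder counts (in practice you'd pull these from `qiime demux summarize` and the DADA2 denoising stats):

```python
# Sketch: track the fraction of reads retained at each pipeline step.
# The counts are hypothetical placeholders for one study's totals.
steps = [
    ("raw", 100_000),
    ("primer-trimmed", 92_000),
    ("denoised (DADA2)", 61_000),
    ("closed-ref clustered", 58_000),
]

def retention_report(steps):
    """Return (step, count, fraction-of-raw) tuples; warn on >50% drops."""
    raw = steps[0][1]
    report = []
    prev = raw
    for name, count in steps:
        report.append((name, count, count / raw))
        # A step that loses more than half its input deserves a closer look.
        if count < 0.5 * prev:
            print(f"WARNING: large read loss at step '{name}'")
        prev = count
    return report

for name, count, frac in retention_report(steps):
    print(f"{name:22s} {count:>8,d} {frac:6.1%}")
```

Doing this per study makes it obvious which study (and which step) is responsible when the merged feature table comes out too sparse.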

I again think this is smart; it lets you scaffold all the sequences using consistent identifiers.

I don't understand this step. Why are you doing taxonomic assignment after you did closed reference OTU clustering? One of the beauties of closed reference clustering is that the taxonomy is already assigned and the tree is already built for you. You just need to import them into QIIME 2 and you're ready to go!
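For example, if you clustered against Greengenes 13_8, the reference taxonomy and tree ship with the release and can be imported directly (the paths below are placeholders for wherever you unpacked the reference):

```shell
# Example only: import the pre-built Greengenes 13_8 taxonomy and tree
# that correspond to your closed-reference OTU IDs.
qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path gg_13_8_otus/taxonomy/97_otu_taxonomy.txt \
  --output-path taxonomy.qza

qiime tools import \
  --type 'Phylogeny[Rooted]' \
  --input-path gg_13_8_otus/trees/97_otus.tree \
  --output-path tree.qza
```

Because closed-reference OTU IDs are reference IDs, these artifacts line up with your feature table with no classifier run at all.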

So, I think I'd suggest the following modifications:

  • Check your sequences after primer trimming for each study to make sure that reads have not been discarded
  • Use paired end reads for regions where you have paired end reads (except V13, which needs to be single end if it's Illumina)
  • Either perform quality filtering and denoise with deblur or use DADA2 alone
  • Skip the separate taxonomic classification step; closed reference clustering already gives you the taxonomy
  • Keep checking the number of reads retained at each step
  • Keep using consistent parameters within a single region
  • Keep asking questions.

Best,
Justine
