Hi all, I'm running into similar issues when running uBiome data through DADA2. Here's what my quality plots looked like after import (.fastq files were already demultiplexed, and to construct each F and R file I concatenated .fastq files from four separate lanes):
I'm new to this, but the sequence quality looks good, no? I read that uBiome trims barcodes, primers and linkers, but I couldn't find whether they do any additional quality filtering before packaging the .fastq files for download.
Hi Charlie,
Your reads are clearly not long enough to successfully join.
I am guessing you may be targeting the V4 region? Two 150 nt reads cover at most 300 nt, so on this ~290 nt amplicon they would overlap by only ~10 nt, short of the 20 nt of overlap that dada2 requires by default to join the reads.
Your only option now will be to use the forward reads and proceed as if you have single-end data.
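To make the overlap arithmetic concrete, here is a minimal Python sketch. The function names are hypothetical, the 290 nt amplicon length and 20 nt minimum overlap are simply the figures quoted in this reply, and the calculation ignores any truncation of the reads and natural variation in amplicon length.

```python
def expected_overlap(fwd_len: int, rev_len: int, amplicon_len: int) -> int:
    """Number of bases covered by both reads of a pair spanning one amplicon.

    A negative value means the reads do not meet at all.
    """
    return fwd_len + rev_len - amplicon_len


def can_merge(fwd_len: int, rev_len: int, amplicon_len: int,
              min_overlap: int = 20) -> bool:
    """True if the expected overlap meets the minimum required for joining.

    min_overlap=20 is the value quoted in this thread (an assumption here),
    not something read from any tool's configuration.
    """
    return expected_overlap(fwd_len, rev_len, amplicon_len) >= min_overlap


if __name__ == "__main__":
    # 2 x 150 nt reads on a ~290 nt amplicon, as discussed above
    print(expected_overlap(150, 150, 290))  # 10
    print(can_merge(150, 150, 290))         # False
```

With these numbers the expected overlap is 10 nt, so the pair falls short of the 20 nt requirement; longer reads (for example 2 x 250 nt) on the same amplicon would clear it comfortably.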
Thanks for the quick reply. Yes, it’s the V4 region (515F/806R). Out of curiosity, where are you getting the 290bp number from? ~250bp is the number I’ve seen before, but browsing the literature a bit, I’m seeing both ~250bp and 300+bp. Why the discrepancy - what am I missing?
It makes sense that dada2 wouldn’t be able to join the reads if the overlap is so small, but if that’s the case, any idea how uBiome joins reads for its in-house analysis?
Please note though, if you go this route, you will not be able to use q2-dada2 for denoising, since the quality scores of joined reads invalidate the error model.