We noticed that for the reverse reads, the quality plot looks abnormal from 50–110 bp.
Does this mean the quality is actually poor?
Before importing into QIIME 2, we performed primer and quality trimming on the raw data using bbduk. Our previous analysis used the same workflow and did not have a problem like this, so I doubt the bbduk trimming is the cause.
Thanks for mentioning your preprocessing steps. I don't think bbduk would change the quality scores like that, but you could import the raw data just to double-check that the strange quality scores are already present in read 2 and not introduced by bbduk.
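A minimal sketch of that check, assuming your raw reads are demultiplexed paired-end FASTQ files with Casava 1.8 filenames (the directory and artifact names here are placeholders):

```shell
# Import the raw (pre-bbduk) reads into QIIME 2
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path raw-fastq-dir/ \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path raw-demux.qza

# Interactive quality plots for both reads; inspect read 2 around 50-110 bp
qiime demux summarize \
  --i-data raw-demux.qza \
  --o-visualization raw-demux.qzv
```

If the dip already appears in `raw-demux.qzv`, bbduk is off the hook.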
If bbduk is causing problems, don't worry! QIIME 2 plugins like q2-cutadapt can remove primers and do quality trimming entirely within the QIIME 2 ecosystem.
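For example, a primer-removal sketch with q2-cutadapt; the 341F/805R sequences below are the commonly used V3–V4 primers and the artifact names are placeholders, so substitute your actual primers and files:

```shell
# Remove V3-V4 primers inside QIIME 2 (example primers: 341F / 805R)
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences raw-demux.qza \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --p-discard-untrimmed \
  --o-trimmed-sequences trimmed-demux.qza
```

`--p-discard-untrimmed` drops read pairs where the primer was not found, which helps keep non-target sequences out of denoising.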
If it’s in the raw data, then it must be from the sequencing run itself.
You could try processing this with DADA2 to see whether the denoising process can correct for these low-quality regions, but if that does not work, it might be best to just use the forward reads, since their quality is higher.
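If you do fall back to forward reads only, the sketch would look like this; the truncation length and file names are placeholders, and you should pick `--p-trunc-len` from your own quality plot, where scores start to drop:

```shell
# Denoise the forward reads only with the DADA2 plugin
qiime dada2 denoise-single \
  --i-demultiplexed-seqs trimmed-demux.qza \
  --p-trunc-len 230 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza
```

Check `stats.qza` afterwards to see how many reads survive filtering and denoising.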
We are still deciding whether we should proceed with only the forward reads.
But may I ask what the significance/implications of this would be? We know the V3–V4 region is around 460 bp, but with only the forward read, we would cover only half of it: about 230 bp?
I'd appreciate it if you could share your thoughts on this.
Yeah, only 2 samples made it through processing, and in those, most reads failed to pass filter or join.
That’s right. And it could be shorter if you trim off the low quality end of read 1.
The biggest one I can think of is taxonomic classification. Longer reads carry more information and can resolve similar taxa; shorter reads might not be able to classify down to the family, genus, or species level.
On the other hand, taxonomy would also suffer with low-quality long reads, and a larger number of high-quality reads will give you deeper coverage of your samples!
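When you get to that step, classification itself looks the same either way; the classifier artifact below is a placeholder (ideally one trained on the region your reads actually cover, e.g. extracted from SILVA with your primers):

```shell
# Assign taxonomy to the representative sequences from denoising
qiime feature-classifier classify-sklearn \
  --i-classifier silva-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza
```

A classifier trained on just the amplified region tends to outperform a full-length one, which matters even more when the reads are shorter.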
Thank you so much @colinbrislawn for this insight! I will take this into consideration. It would be better to proceed with only the forward reads if that can provide better taxonomic classification for our samples.
Btw, I just noticed I made a mistake here.
We only did primer trimming, because we read on the forum that DADA2 performs error correction during its process and already includes quality trimming by default (--p-trunc-q 2). Is this correct?
In this case, should we do quality trimming prior to DADA2, using either bbduk or cutadapt?
Could you advise?
Yes… and q2-dada2 does other things too as part of its pipeline! Take a look at all the options for the plugin.
It depends. I would suggest doing the quality trimming within the DADA2 plugin, but that might not work for all data sets. The goal is to remove primers and barcodes that could interfere with denoising, and there are many ways to do that. In some sequencing methods, the reads don't contain adapters at all, so this step isn't needed.
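A sketch of quality trimming handled inside the DADA2 plugin; the truncation lengths and file names below are placeholders you'd choose from your demux quality plots, keeping enough length that the two reads still overlap across the ~460 bp amplicon:

```shell
# Paired-end denoising with truncation done by the DADA2 plugin itself
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs trimmed-demux.qza \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 220 \
  --p-trunc-q 2 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats stats.qza
```

Here 280 + 220 = 500 bp leaves roughly 40 bp of overlap on a 460 bp amplicon, comfortably above DADA2's minimum for merging; truncating too aggressively is a common reason reads fail to join.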