Hi, I have a 16S rRNA amplicon data set (EMP standard primers) that was sequenced at a facility and the returned reads aren't great quality, especially the reverse reads. See screenshot and uploaded .qzv. My questions are the following:
Would you recommend merging these reads (as opposed to just using the forward reads)?
It seems like --p-max-ee would kick out most really bad sequences, is that correct?
With --p-trunc-q having a default of 2, will the default actually do anything, given that none of my reads have quality scores that dip below 2 (which seems really low)? Am I understanding this correctly?
Alternatively, would you recommend processing these sequences outside of qiime2 because the quality is poor?
Finally, do you think I should ask the sequencing facility about this quality? Does it seem unacceptable?
I would either just use the forward reads, or attempt to merge the first 220 nt of forward with 100 nt of reverse. Definitely trim left on each of these to remove the low-quality bases at the start of each read.
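As a rough sanity check on whether 220 nt of forward plus 100 nt of reverse can still merge, the overlap arithmetic can be sketched as below. The amplicon length of about 253 nt (typical for the EMP V4 region once primers are excluded) and the 12 nt minimum overlap (the usual DADA2 merging default) are assumptions here, and the function name is made up for illustration.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Bases shared by the truncated forward and reverse reads."""
    return trunc_len_f + trunc_len_r - amplicon_len


# Assumptions, not values taken from this data set:
ASSUMED_AMPLICON_LEN = 253  # approximate EMP V4 amplicon length, primers excluded
ASSUMED_MIN_OVERLAP = 12    # minimum overlap DADA2 typically requires to merge

overlap = expected_overlap(220, 100, ASSUMED_AMPLICON_LEN)
print(f"expected overlap: {overlap} nt")
print("enough to merge:", overlap >= ASSUMED_MIN_OVERLAP)
```

Trimming on the left removes bases from the outer ends of the amplicon, so it shortens the merged sequence but should not reduce this overlap.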
dada2 is going to take care of denoising these reads. The greater issue is ensuring that these reads are trimmed so that low-quality bases at the 3' end of each read aren't causing those reads to be tossed. So yes, --p-trunc-len and --p-trim-left will be the main parameters to use. You could also try playing around with --p-max-ee to squeeze out some more reads (since it looks like errors are interspersed through your reads).
Correct. It pre-filters the reads prior to denoising.
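To make the pre-filter concrete, here is a minimal sketch of how "expected errors" are computed from Phred scores: each base contributes an error probability of 10^(-Q/10), and the read's expected errors are the sum. The example read and the helper names are hypothetical; the threshold of 2.0 mirrors the usual default, and the sketch assumes the filter is applied after truncation, as it is in DADA2.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, each 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def passes_max_ee(phred_scores, max_ee=2.0):
    """A read is kept only if its expected errors do not exceed max_ee."""
    return expected_errors(phred_scores) <= max_ee


# Hypothetical read: 150 good bases (Q35) followed by 100 poor bases (Q12).
read = [35] * 150 + [12] * 100

print("full read passes:", passes_max_ee(read))
print("truncated to 150 nt passes:", passes_max_ee(read[:150]))
```

In this made-up case the full-length read is discarded while the truncated one is kept, which is why truncation length tends to matter more than the max-ee setting itself.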
Correct. By default, effectively no trimming is performed. That's the goal. (Personally, I like trimming at specific sites rather than using --p-trunc-q, which will result in reads/ASVs of different lengths.)
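A small sketch of quality-based truncation may help show why the default does nothing here and why a higher value gives variable lengths. It assumes the DADA2 behaviour of cutting a read at the first base whose quality is less than or equal to the threshold; the reads and the function name are invented for illustration.

```python
def truncate_at_quality(phred_scores, trunc_q=2):
    """Keep bases up to (not including) the first score <= trunc_q."""
    for i, q in enumerate(phred_scores):
        if q <= trunc_q:
            return phred_scores[:i]
    return phred_scores


# Two hypothetical reads whose quality never drops to 2.
read_a = [34, 33, 30, 25, 18, 12, 9, 14]
read_b = [34, 32, 11, 28, 27, 20, 15, 13]

for threshold in (2, 12):
    lengths = [len(truncate_at_quality(r, threshold)) for r in (read_a, read_b)]
    print(f"trunc_q={threshold}: lengths {lengths}")
```

With the threshold at 2 both reads keep their full length, while at 12 they end up with different lengths, which is the drawback mentioned above.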
No. This is not a qiime2 issue; it is a problem with your reads, so going outside of qiime2 will not necessarily help. If you want to do qiime1-style quality filtering and OTU picking (much more permissive than dada2/deblur), this can be done with q2-quality-filter and q2-vsearch.
The quality is disappointing but not the worst I've seen. You could chat with them to see if they would be willing to re-run, but I am not sure that this is necessarily "unacceptable". I'd give it a shot and see how much data you can squeeze out...