EDIT: While waiting for my post to be approved, I set the parameters much lower (110 & 110), and now 99.99% of my reads are passing the filter. This feels like I may have swung too far in the other direction. I'd appreciate any advice on setting these truncation lengths.
Hi all, I'm super new to QIIME and testing out merging reads on a single sample (18S V9 sequences, paired-end 2x300 bp, averaging 167 bp in length after adapter trimming in fastp). Here is my code:
Thanks for passing these on. First off, it looks like you're in good shape now with your current parameter settings so I recommend using those to process all of the samples in that run (rather than just the single sample you used for testing). A couple of other thoughts:
I think your trim parameters might be unnecessary, as it looks like you have very high quality base calls at the beginning of your sequences. You mentioned trimming adapters prior to running DADA2, so I'm guessing the trimming isn't an attempt to remove adapters/primers? If you are doing it to remove primers, I recommend using qiime cutadapt trim-paired instead, as it's better targeted to adapter/primer removal.
I'm surprised that you were losing so many reads to the filter in your first attempt. The quality seems a little low when you get out to 190 bases, but nothing that I would be very concerned about yet. I generally look for where the median quality crosses some threshold, such as Q30 or Q25, but there are no hard-and-fast rules for where trimming / truncating should be performed. Have you tried intermediate values for those parameters as well, to see how long you can go without seeing a big drop in the number of reads that are retained?
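In case it helps, here is a minimal Python sketch of that rule of thumb. It assumes you have the quality scores for each read position available as lists, however you obtain them; the function name suggest_trunc_len, the data layout, and the example numbers are all made up for illustration and are not part of QIIME 2 or DADA2.

```python
from statistics import median


def suggest_trunc_len(quality_by_position, threshold=25):
    """Return how many leading positions have a median quality >= threshold.

    quality_by_position is a hypothetical layout: one list of Phred scores
    per read position, ordered from the first base to the last.
    """
    for position, scores in enumerate(quality_by_position):
        if median(scores) < threshold:
            # First position whose median falls below the threshold:
            # keep everything before it.
            return position
    # The median never dropped below the threshold, so keep the full length.
    return len(quality_by_position)


# Toy data for four positions (illustrative values only).
toy_qualities = [
    [38, 37, 36],  # median 37
    [35, 34, 36],  # median 35
    [30, 24, 22],  # median 24
    [20, 21, 19],  # median 20
]
print(suggest_trunc_len(toy_qualities, threshold=25))
```

Treat the result as a starting point only, since the threshold itself is a judgment call; comparing how many reads are retained at a few candidate values is still the real test.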
Since most of your reads are now passing the filter and are being merged, it probably doesn't matter a whole lot whether you set a higher value than 110 for the truncation length. When joining paired-end reads, the truncated reads mostly just need to be long enough to overlap so that the pairs can be joined. If you can avoid the trimming, however, that might give you slightly longer sequences, which may be slightly more informative when you do taxonomic assignment.
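As a rough back-of-the-envelope check on the overlap, here is a small Python sketch. It assumes the amplicon is about 167 bases (the average length reported above), ignores any trimming from the start of the reads, and uses a minimum overlap of 12 bases, which I believe is DADA2's default for merging but is worth confirming for your version. The function names are hypothetical.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Estimate how many bases the forward and reverse reads share.

    Simplification: the two truncated reads together cover
    trunc_len_f + trunc_len_r bases of an amplicon_len-base amplicon,
    so whatever exceeds the amplicon length is overlap.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=12):
    """True if the estimated overlap meets the assumed minimum overlap."""
    return expected_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


# Truncating both reads at 110 with a ~167-base amplicon.
print(expected_overlap(110, 110, 167))  # 53
print(can_merge(110, 110, 167))         # True
```

By this estimate, truncating at 110 and 110 already leaves far more overlap than merging needs, which fits with nearly all of your reads being joined.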
Thanks Greg, your insight is appreciated! I'll try running all my samples and testing some intermediate values.
Also, yes, the trimming parameters were for primers; I believe fastp only removed the adapters. I'll use the command you suggested instead.