Hi everyone.
I'm learning to analyze metagenome data in QIIME 2.
Currently I have sequence data generated on the Ion S5 platform for practice. I'm confused about something in the previous post [Can Ion Torrent sequencing data be analyzed in QIIME2 - #7 by ebolyen]: why, when denoising Ion Torrent data, are we recommended to set the trim-left parameter to 15? Is there a special reason for this?
I'm also unsure whether chimeras, adapters, and primers should be removed in a particular order. For example, should we remove the adapters and chimeras first, before removing the primers?
You should be able to use Ion Torrent data. You will want to remove any non-biological sequence (primers and adapters) before running q2-dada2, since DADA2 can mistake it for chimeras. See here for more info. If you do end up using q2-dada2 to do it, you will need to add the 15 to any other left trimming you are doing. Also, double-check in the DADA2 docs that left trimming is performed after adapter removal, if that is how you end up doing it.
DADA2 will remove chimeras for you as part of the de-noising process, so no worries there.
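To make the point about ordering and "adding the 15" concrete, here is a minimal toy sketch in plain Python. It does not call QIIME 2 or DADA2; the primer, the read, and the helper names `remove_primer` and `trim_left` are all made up for illustration, and the primer is matched exactly, which real tools do not require.

```python
# Toy illustration only: hypothetical primer and read, exact-match primer removal.
PRIMER = "ACGTACGTAC"                             # made-up 10 nt primer
BIOLOGICAL = "T" * 15 + "GATTACAGATTACA"          # made-up amplicon; first 15 nt are the part to discard
READ = PRIMER + BIOLOGICAL

EXTRA_LEFT_TRIM = 15  # the Ion Torrent recommendation discussed in this thread


def remove_primer(read: str, primer: str) -> str:
    """Strip an exact-match 5' primer; return the read unchanged if it is absent."""
    return read[len(primer):] if read.startswith(primer) else read


def trim_left(read: str, n: int) -> str:
    """Drop the first n bases of the read."""
    return read[n:]


# Route A: remove the primer first, then trim 15 bases of biological sequence.
route_a = trim_left(remove_primer(READ, PRIMER), EXTRA_LEFT_TRIM)

# Route B: do everything with one left trim, so the 15 is added to the primer length.
route_b = trim_left(READ, len(PRIMER) + EXTRA_LEFT_TRIM)

# Route C: trim only 15 from the untouched read; most of that is spent on the primer.
route_c = trim_left(READ, EXTRA_LEFT_TRIM)

print(route_a == route_b)  # the two routes agree
print(route_c)             # still carries bases that routes A and B removed
```

Routes A and B give the same result, which is why a single left trim has to be the primer length plus 15. Route C shows what happens if the 15 is applied to a read that still has its primer: the trim is used up on non-biological sequence and the low-quality leading bases remain.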
This page by @benjjneb (the developer of DADA2) gives the recommendation to trim the first 15 bp when using Ion Torrent sequence data. He might be able to answer any other questions in more depth.
This was based on our analysis of a single Ion Torrent dataset, about 4-5 years ago, which showed particularly poor denoising performance over the first few nucleotides.
However, at this point I would consider this recommendation deprecated, because there is no new supporting evidence from more recent Ion Torrent data.