However, this time my colleagues used two commercial primers for the amplification, since the samples for this run were obtained in a different way. Therefore, I was considering using cutadapt before the dada2 step to remove those primer sequences.
The problem is that cutadapt only has trim-paired and trim-single options, but Ion Torrent does not work like Illumina and its data contains reads in both directions, so which option should I use?
If it is possible to use cutadapt with Ion Torrent sequences and I use it to remove my primers, it is not necessary to trim again in the dada2 step, is it?
If it is not possible, what other option could I use? Should I remove the primer sequences with the dada2 trimming option? (In that case, if my primers are 20 nucleotides long, is it enough to trim the first 20 bases?)
I think this is the first time the team has come across this, so we had to discuss it. I'm not totally sure this will work, but it's worth trying.
I'm assuming that you already know the primer sequences.
Your idea of using cutadapt is exactly what I would do. I would use trim-single, with the --discard-untrimmed flag, although I'm struggling with a good way to check this.
No, Cutadapt will trim your primers for you, so you do not need to trim them again.
If you use Sidle (use the memory refactor branch as of right now), you should be able to just scaffold the forward and reverse reads as separate regions and combine them.
No, DADA2 and Cutadapt trim differently. Cutadapt trims by matching the primer sequence, while DADA2 just trims the number of base pairs you specify; it can't trim partial sequences or deal with reads where the primers have already been trimmed. Since you have multiple regions and need to separate them, Cutadapt will let you do that. Once you've run the trimming with Cutadapt, you don't need to trim with DADA2.
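To make the difference concrete, here is a minimal Python sketch of the two trimming styles. It is only an illustration: the primer and amplicon are made-up sequences, the function names are hypothetical, and the exact prefix match is a simplification (real Cutadapt also tolerates mismatches and partial primer matches).

```python
# Hypothetical primer and amplicon, NOT real sequences from this thread.
PRIMER = "ACGTACGTAC"
AMPLICON = "TTTTGGGGCCCCAAAA"


def trim_fixed(read, n):
    """Position-based trimming: always drop the first n bases."""
    return read[n:]


def trim_primer(read, primer):
    """Sequence-based trimming: drop the primer only if the read starts with it.

    Returns None when the primer is absent, loosely mimicking the
    --discard-untrimmed behaviour (simplified to exact matching here).
    """
    if read.startswith(primer):
        return read[len(primer):]
    return None


reads = {
    "with_primer": PRIMER + AMPLICON,
    "already_trimmed": AMPLICON,
}

for name, read in reads.items():
    fixed = trim_fixed(read, len(PRIMER))
    matched = trim_primer(read, PRIMER)
    print(name, "| fixed-length:", fixed, "| primer-matched:", matched)
```

For the read that still carries the primer, both approaches give the same result. For the read with no primer, fixed-length trimming silently removes 10 bases of real amplicon, while primer-matched trimming flags the read so it can be discarded.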
Even though everything seems to work perfectly, I have come across some strange things when checking the dada2 output.
Many sequences in the rep-seqs-dada2.qza file seem to be really similar, see figure (I think this is the first time I have seen such high similarity between reads in this file). Moreover, when reviewing the stats-dada2.qza file, the percentage of input that passed the filter is really small.
I probably would have trimmed the forward adapter, but it looks like this is mostly working.
Please check your sequence quality and trim length, because that's where you're losing data (likely due to a quality drop at the end). There are a lot of posts on the forum about how to judge dada2 quality; it might be good to look at those to better understand how to use your dada2 summary to guide trimming.
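As a rough illustration of picking a truncation point from a quality profile, the Python sketch below finds the first position where the median quality drops under a threshold. Everything in it is an assumption for the example: the scores are made up, the function name is hypothetical, and the threshold of 20 is not a QIIME 2 or DADA2 rule, so treat the result as a starting point to test rather than a recommendation.

```python
def suggest_trunc_len(median_quality, min_quality=20):
    """Count the leading positions whose median quality stays >= min_quality.

    min_quality=20 is an arbitrary example threshold, not an official cutoff.
    """
    for position, quality in enumerate(median_quality):
        if quality < min_quality:
            return position
    return len(median_quality)


# Made-up per-position median quality scores with a drop near the end.
median_quality = [32, 32, 31, 30, 30, 28, 27, 25, 22, 19, 17, 15]

trunc_len = suggest_trunc_len(median_quality)
print("Suggested truncation length:", trunc_len)
```

Keep in mind that DADA2 discards reads shorter than the truncation length, so a longer truncation keeps more low-quality bases and can also drop more reads, which matters for variable-length Ion Torrent data.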
I have rerun the command, removing only the forward adapter.
When it comes to the quality truncation, I had followed the commonly used criteria for cutting sequences. However, after trying an earlier cut, I obtained better results.