Cutadapt doesn't recognize all adapter sequences

elephas · November 26, 2020, 5:59pm

Hello everybody!

I am experiencing a problem with qiime cutadapt trim-paired.
My data is from a 2x300 MiSeq run for 16S analysis. According to the raw data sheet, the data is demultiplexed, but the adapters have not been trimmed beforehand.

The program is actually running fine, using the following code:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences paired-end-demux-16S.qza \
  --p-anywhere-f 'CTGTCTCTTATACACATCTCCGAGCCCACGAGAC' \
  --p-anywhere-r 'CTGTCTCTTATACACATCTGACGCTGCCGACGA' \
  --p-cores 2 \
  --o-trimmed-sequences paired-end-trim-16S.qza \
  --verbose

Unfortunately, when I check the verbose output, it shows very low trimming rates:

I got the adapter sequences directly from Macrogen, the company which did the sequencing service.
To be sure that the adapter sequences were actually there and were specific to the forward and reverse reads (as stated by technical support), I checked the fastq sequences for the presence of the adapters, and almost (!!!) all reads contain the corresponding sequence in either R1 or R2, respectively:

Although I find it weird that not all reads contain the adapters as I expected, the fraction of reads that finally got trimmed (1.6%) is far lower than the fraction of reads that carry the sequence according to the fastq files.
Note: In earlier runs I used "front" and "adapter" in place of "anywhere" because I wasn't sure whether the adapter sequences were located at the 5' or the 3' end (actually I think they are on both ends). The --verbose outputs were very similar.
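
For anyone wanting to reproduce this kind of check, a minimal Python sketch for counting reads that contain an exact copy of a sequence could look like the following. It assumes standard four-line FASTQ records; the function names, the demo reads, and the file name in the comment are made up for illustration, and only the adapter string is taken from the command above. It counts exact matches only, whereas cutadapt also tolerates mismatches and partial adapter occurrences, so the two numbers need not agree.

```python
import gzip
import io


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_reads_with(handle, query):
    """Return (reads containing query, total reads) for a 4-line-per-record FASTQ."""
    hits = total = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record is the sequence
            total += 1
            if query in line.strip().upper():
                hits += 1
    return hits, total


# Forward adapter as given in the trim-paired command.
ADAPTER_F = "CTGTCTCTTATACACATCTCCGAGCCCACGAGAC"

# Tiny in-memory demo with two invented reads; only the first carries the adapter.
records = [
    ("read1", "ACGT" + ADAPTER_F),
    ("read2", "ACGTACGTACGT"),
]
demo = io.StringIO("".join(f"@{n}\n{s}\n+\n{'I' * len(s)}\n" for n, s in records))
print(count_reads_with(demo, ADAPTER_F))  # (1, 2)

# On a real file (hypothetical name):
# with open_fastq("sample_R1.fastq.gz") as fh:
#     print(count_reads_with(fh, ADAPTER_F))
```

Running the same count with only the first 15 to 20 bases of the adapter can show whether truncated copies at the read ends are being missed by the full-length search.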

Many thanks in advance for any hint that might help!

llenzi · December 2, 2020, 12:10pm

Hi @elephas,

Welcome to the forum!
If you have not fixed the issue yet, I would suggest trimming the PCR primers instead of the sequencing adapters; the primers should follow the adapter sequences in your reads (although it may depend on the library prep used).

Also, I suggest looking at the following thread:

and adding '--p-match-adapter-wildcards' and '--p-discard-untrimmed' as options!
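
As a rough illustration of why wildcard matching matters for primers, here is a small Python sketch that expands IUPAC ambiguity codes into a regular expression. The primer and reads are invented for the example and are not the primers used in this run; the sketch does exact wildcard matching only and is not meant to mirror how cutadapt works internally, since cutadapt additionally allows a configurable error rate.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(primer):
    """Compile a primer containing IUPAC codes into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


# Hypothetical degenerate primer, for illustration only.
HYPOTHETICAL_PRIMER = "ACGNGGW"
pattern = iupac_to_regex(HYPOTHETICAL_PRIMER)

print(bool(pattern.match("ACGTGGAGGG")))    # True: N matches T, W matches A
print(bool(pattern.match("ACGTGGCGGG")))    # False: C is not allowed at the W position
print(HYPOTHETICAL_PRIMER in "ACGTGGAGGG")  # False: a literal search misses the primer
```

The last line shows the failure mode: if the degenerate letters are treated as literal characters, reads that do start with the primer are reported as untrimmed.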

Hope it helps

elephas · December 4, 2020, 6:11pm

Hi llenzi!
Thanks for the reply and the warm welcome!
I followed your suggestion and searched for the PCR primers, but unfortunately they are located at the extremes of each read and apparently the adapters follow afterwards. I checked this in the fastq sequences and also in Macrogen's description of the procedure, which explains that adapters are first added to the fragmented gDNA, which is then PCR amplified. Therefore cutting the primers doesn’t help to get rid of the adapters.
Anyway, I ran the command to see whether primer cutting is more efficient, and indeed I got 99.8% trimmed sequences.
Update: After actually counting how many times the adapter sequences appear in my fastq files, I see that there are in fact very few adapters. This is something I completely don't understand, because supposedly, as almost all reads have the primer sequence, the adapter should automatically be amplified together with the 16S fragment.
I could simply cut the primers and the few adapters that are there and go on, but I want to make sure that I am really doing the correct thing! Is there any further suggestion? Many thanks in advance!

llenzi · December 7, 2020, 9:15am

Hi @elephas,

sorry for not replying earlier, but I was not notified of your reply; please be sure to tag people (as in @elephas) so that they get notified!

On your question: "Orazio, there are more library prep protocols than stars in the sky" (I may be confused, please do not take it as a literal quotation), and I am not a lab person, so I don't know many of them!

If (as I assume) by 'adapters' we mean the sequencing adapters, these are the priming sites for the sequencing process and, given its directionality, the bases included in the reads are located on the 3' side of the adapters. Hence, to be present in the sequences, the PCR primers have to 'follow' the sequencing adapters.
In this context, the percentages of your sequences containing the adapters and the PCR primers are pretty normal; they may even have given you sequences that were already adapter-trimmed! By any chance, did you run fastqc on your sequences? It should highlight the presence of adapters (as well as duplicated sequences such as the primers ...)
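
To make the directionality argument concrete, here is a toy Python model of a read. It assumes the usual layout in which sequencing starts immediately downstream of the sequencing adapter, so the read begins with the PCR primer, and the adapter from the other end of the fragment can only appear at the 3' end of the read when the amplicon is shorter than the read length. The primers, inserts, and read length are invented for illustration; only the adapter prefix comes from the first post.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def simulate_read(fwd_primer, insert, rev_primer, readthrough_adapter, read_len):
    """Return the first read_len bases downstream of the sequencing primer site."""
    template = fwd_primer + insert + reverse_complement(rev_primer) + readthrough_adapter
    return template[:read_len]


# Made-up primers and inserts; ADAPTER is the start of the sequence quoted in the first post.
FWD_PRIMER = "ACGTACGTAC"
REV_PRIMER = "TTGGCCAATT"
ADAPTER = "CTGTCTCTTATACACATCT"
READ_LEN = 50

# Amplicon longer than the read: the read ends inside the insert.
long_read = simulate_read(FWD_PRIMER, "G" * 100, REV_PRIMER, ADAPTER, READ_LEN)
# Amplicon shorter than the read: the read runs through into the adapter.
short_read = simulate_read(FWD_PRIMER, "G" * 10, REV_PRIMER, ADAPTER, READ_LEN)

print(long_read.startswith(FWD_PRIMER), ADAPTER in long_read)    # True False
print(short_read.startswith(FWD_PRIMER), ADAPTER in short_read)  # True True
```

Under this model, nearly every read starts with a primer, while adapters show up only in the minority of reads whose fragment is shorter than the read, which would be consistent with the counts reported above.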

If you remove the PCR primers, could you still find the adapters?

I think you will be fine just removing the primers.

Cheers