cutadapt - Cannot trim primers in reverse ITS reads

I am working with paired-end ITS data and while I have no problem removing forward adapters, no matter what I put for my reverse primer after the -A flag, I always get primers left over.

And here is my general pipeline where I first just look at how the primers are distributed in the reads, followed by running cutadapt, and then running my sanity check again to check that all primers are removed. While I do see that some primers are removed for the reverse reads in the RevComp column, it is still not all 0…

FWD <- "CTTGGTCATTTAGAGGAAGTAA"
REV <- "GCTGCGTTCTTCATCGATGC"

> FWD.orients <- allOrients(FWD); FWD.orients
                 Forward               Complement                  Reverse                  RevComp 
"CTTGGTCATTTAGAGGAAGTAA" "GAACCAGTAAATCTCCTTCATT" "AATGAAGGAGATTTACTGGTTC" "TTACTTCCTCTAAATGACCAAG" 

> REV.orients <- allOrients(REV); REV.orients
               Forward             Complement                Reverse                RevComp 
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC" 

# Check primers in original uncut/unfiltered files; not filtering because I lose way too many reads
rbind(FWD.ForwardReads = sapply(FWD.orients, primerHits, fn = fnFs[[1]]),
      FWD.ReverseReads = sapply(FWD.orients,primerHits, fn = fnRs[[1]]),
      REV.ForwardReads = sapply(REV.orients, primerHits,fn = fnFs[[1]]),
      REV.ReverseReads = sapply(REV.orients, primerHits, fn = fnRs[[1]]))

# Output before any trimming is done:
                 Forward Complement Reverse RevComp
FWD.ForwardReads   27211          0       0       0
FWD.ReverseReads    1720       1715    1707   18415
REV.ForwardReads       0          0       0   26660
REV.ReverseReads   26766       1826    1728    1744


FWD <- FWD.orients[["Forward"]]; FWD  # keep the same as is
REV <- REV.orients[["RevComp"]]; REV  # because it matches to the revcomp

> REV.orients <- allOrients(REV); REV.orients
               Forward             Complement                Reverse                RevComp 
"GCTGCGTTCTTCATCGATGC" "CGACGCAAGAAGTAGCTACG" "CGTAGCTACTTCTTGCGTCG" "GCATCGATGAAGAACGCAGC"

Run cutadapt (in terminal). I have noticed that removing the anchors makes this work better for forward reads, because if I use ^ or &, there is always a few primers left in forward reads in the forward orientation. Linking without anchoring has given me the best results thus far but I need them to all show 0 primers present at the end.

Cutadapt code:

for r1 in *_R1.fastq; do r2=${r1/_R1.fastq/_R2.fastq}; sample=${r1%%_R1.fastq}; /home/chauk/.local/bin/cutadapt \
-a CTTGGTCATTTAGAGGAAGTAA...GCATCGATGAAGAACGCAGC -A GCTGCGTTCTTCATCGATGC...TACTTCCTCTAAATGACCAAG \
-n 2 -o original/${sample}_R1_cut1.fastq -p original/${sample}_R2_cut1.fastq $r1 $r2; done

Sanity check these cut files and here’s the output:

                 Forward Complement Reverse RevComp
FWD.ForwardReads       0          0       0       0
FWD.ReverseReads    1720       1715    1707    1518
REV.ForwardReads       0          0       0       0
REV.ReverseReads    1834       1826    1728    1744

As you can see, forward reads are all good and reverse reads have most of the primers removed in the RevComp orientation…but they still exist at every orientation throughout. I don’t get it.


One more thing to add in case this matters…the sequencing place did use 4 length variations of adaptor + primer sequence as shown below, and I don’t know if this matters. I did try running cutadapt on just the adaptor sequence as well and that made virtually no difference. Also, does not explain why the forward reads are completing fine after trimming but the reverse reads are essentially contaminated with primers after trimming,

Forward1 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTTGGTCATTTAGAGGAAGTAA
Forward2 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTTGGTCATTTAGAGGAAGTAA
Forward3 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCTTGGTCATTTAGAGGAAGTAA
Forward4 = ACACTCTTTCCCTACACGACGCTCTTCCGATCTCAACTTGGTCATTTAGAGGAAGTAA

Reverse1 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGCTGCGTTCTTCATCGATGC
Reverse2 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTGCTGCGTTCTTCATCGATGC
Reverse3 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACGCTGCGTTCTTCATCGATGC
Reverse4 = GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCATGCTGCGTTCTTCATCGATGC

1 Like

One more update to this.

Update. I ran a second round of cutadapt on these trimmed reads using the following command to see if I can just remove these reads with adapters still present - it made zero difference. I don't understand where these adaptor sequences are being detected in my reverse reads.

cd original
for r1 in *_R1_cut1.fastq; do r2=${r1/_R1_cut1.fastq/_R2_cut1.fastq}; sample=${r1%%_R1_cut1.fastq}; /home/chauk/.local/bin/cutadapt -a CTTGGTCATTTAGAGGAAGTAA...GCATCGATGAAGAACGCAGC -A GCTGCGTTCTTCATCGATGC...TACTTCCTCTAAATGACCAAG --pair-filter=both --discard-trimmed -o round2/${sample}_R1_cut2.fastq -p round2/${sample}_R2_cut2.fastq $r1 $r2; done


# Sanity check in R shows:

# go to path where the first set of trimmed reads go
path.cut2 <- "~/../Desktop/.../original/round2"
myfiles.For.cut2 <- file.path(path.cut, paste0(strsplit(basename(fnFs),".fastq"),"_cut2.fastq"))
myfiles.Rev.cut2 <- file.path(path.cut, paste0(strsplit(basename(fnRs),".fastq"),"_cut2.fastq"))


# Sanity check; call the original primers and check if anymore are removed
rbind(FWD.ForwardReads = sapply(FWD.orients, primerHits, fn = myfiles.For.cut2[[1]]), 
      FWD.ReverseReads = sapply(FWD.orients, primerHits, fn = myfiles.Rev.cut2[[1]]), 
      REV.ForwardReads = sapply(REV.orients, primerHits, fn = myfiles.For.cut2[[1]]), 
      REV.ReverseReads = sapply(REV.orients, primerHits, fn = myfiles.Rev.cut2[[1]]))

# Output; NO DIFFERENCE
                 Forward Complement Reverse RevComp
FWD.ForwardReads       0          0       0       0
FWD.ReverseReads    1720       1715    1707    1518
REV.ForwardReads       0          0       0       0
REV.ReverseReads    1744       1728    1826    1834

I did try with --discard-untrimmed but I lost all my reads.

1 Like

Hi @Katherine_Chau, It sounds like you're using cutadapt directly, not through QIIME 2. Please correct me if I'm wrong about that. Note that we are not the developers of cutadapt, so don't have a lot of expertise to help you out. Someone may jump in to help, but I would recommend reaching out to the cutadapt developers.

If you're working with ITS though, here are a couple of resources on the forum that you might find helpful:

Both are a little dated and may need some updates for more recent versions of QIIME 2, but hopefully this gives you somewhere to start if you want to work with QIIME 2.

2 Likes