ITS cutadapt trimming of primer and reverse complement of the reverse primer

Hello

I'm running qiime2-2021.8 (Conda install) on some ITS data and following along with the Fungal ITS analysis tutorial.

I ran "grep" for the primer sequences (and their reverse complements) in my raw sequences and can see that I have read-through. I then ran cutadapt to remove these and am a little confused by the results.
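
For anyone wanting to repeat that check, here is a minimal sketch of how the reverse complements and the grep counts could be produced. `revcomp` and `count_hits` are hypothetical helper names, the FASTQ path in the usage comment is a placeholder, and the helpers assume plain A/C/G/T primers with no IUPAC ambiguity codes.

```shell
# Reverse-complement a DNA sequence (plain A/C/G/T only, no IUPAC codes).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Count the lines of a gzipped FASTQ that contain the given sequence.
# grep -c exits non-zero on zero matches, hence the "|| true".
count_hits() {
  gzip -dc < "$2" | grep -c "$1" || true
}

fwd=CTTGGTCATTTAGAGGAAGTAA
rev_primer=GCTGCGTTCTTCATCGATGC

revcomp "$fwd"         # prints TTACTTCCTCTAAATGACCAAG
revcomp "$rev_primer"  # prints GCATCGATGAAGAACGCAGC

# Usage (placeholder file name): read-through in a forward read file shows up
# as hits for the reverse complement of the reverse primer.
# count_hits "$(revcomp "$rev_primer")" raw_data/sample_R1.fastq.gz
```

A plain grep only finds exact matches, so the counts are a lower bound compared with what cutadapt finds with its default error tolerance.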

I imported the data and can see that the reads are 300bps as they should be (pre-trimming):

qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path raw_data \
--input-format CasavaOneEightSingleLanePerSampleDirFmt \
--output-path analysis/seqs/combined_seqs.qza

qiime demux summarize \
 --i-data analysis/seqs/combined_seqs.qza \
 --o-visualization analysis/visualisations/combined_seq.qzv

combined_seq.qzv (318.8 KB)

I then ran cutadapt.

# Forward primer: CTTGGTCATTTAGAGGAAGTAA
# Reverse complement of the forward primer: TTACTTCCTCTAAATGACCAAG
# Reverse primer: GCTGCGTTCTTCATCGATGC
# Reverse complement of the reverse primer: GCATCGATGAAGAACGCAGC

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences analysis/seqs/combined_seqs.qza \
  --p-adapter-f GCATCGATGAAGAACGCAGC \
  --p-front-f CTTGGTCATTTAGAGGAAGTAA \
  --p-adapter-r TTACTTCCTCTAAATGACCAAG \
  --p-front-r GCTGCGTTCTTCATCGATGC \
  --output-dir analysis/seqs_trimmed

qiime demux summarize \
--i-data analysis/seqs_trimmed/trimmed_sequences.qza \
--o-visualization analysis/visualisations/trimmed_sequences.qzv

trimmed_sequences.qzv (325.5 KB)

After running cutadapt as above, I was surprised that my forward reads had not been trimmed more, judging by the Demultiplexed sequence length summary and by what I'd seen with my grep command.

I then ran cutadapt with each parameter on its own to see what the results would be, and each one seems to trim the data as expected (again looking at the Demultiplexed sequence length summary):

qiime cutadapt trim-paired   --i-demultiplexed-sequences analysis/seqs/combined_seqs.qza   --p-adapter-f GCATCGATGAAGAACGCAGC  --output-dir analysis/seqs_trimmed_GCATCGATGAAGAACGCAGC

qiime demux summarize --i-data analysis/seqs_trimmed_GCATCGATGAAGAACGCAGC/trimmed_sequences.qza --o-visualization analysis/visualisations/trimmed_sequences_GCATCGATGAAGAACGCAGC.qza

trimmed_sequences_GCATCGATGAAGAACGCAGC.qza.qzv (324.6 KB)

qiime cutadapt trim-paired   --i-demultiplexed-sequences analysis/seqs/combined_seqs.qza --p-front-f CTTGGTCATTTAGAGGAAGTAA  --output-dir analysis/seqs_trimmed_CTTGGTCATTTAGAGGAAGTAA


qiime demux summarize --i-data analysis/seqs_trimmed_CTTGGTCATTTAGAGGAAGTAA/trimmed_sequences.qza --o-visualization analysis/visualisations/trimmed_sequences_CTTGGTCATTTAGAGGAAGTAA.qza

trimmed_sequences_CTTGGTCATTTAGAGGAAGTAA.qza.qzv (324.2 KB)

However, when I run the two parameters at the same time, it seems not to trim the reverse complement (see the Demultiplexed sequence length summary).

qiime cutadapt trim-paired --i-demultiplexed-sequences analysis/seqs/combined_seqs.qza --p-adapter-f GCATCGATGAAGAACGCAGC --p-front-f CTTGGTCATTTAGAGGAAGTAA --output-dir analysis/seqs_trimmed_both-f

qiime demux summarize --i-data analysis/seqs_trimmed_both-f/trimmed_sequences.qza --o-visualization analysis/visualisations/trimmed_sequences_both-f.qza

trimmed_sequences_both-f.qza.qzv (324.7 KB)

Am I missing something? I thought that it should trim both the forward primer and the reverse complement of the reverse primer.

Thanks
Gayle

@gkphilip,

I think you need to anchor some of your primers to indicate where cutadapt should be searching for them. For example, to tell it to look for your forward primer only at the beginning of the sequence you hand it, use: --p-front-f ^CTTGGTCATTTAGAGGAAGTAA. Check out the docs for additional, detailed instructions on this.

While this may seem like it should be unnecessary, it is unfortunately a limitation of the cutadapt API, which we are not the developers of :disappointed:

Thanks Keegan. After looking through the docs, I'm still confused though. If the primer has to be anchored, how come cutadapt finds and trims the unanchored 5' primer when I run the command with just the --p-front-f parameter (and no other parameters)? Likewise, it finds and trims the 3' reverse-complement primer when I run just --p-adapter-f (and no other parameters). Yet when I run those parameters in combination (both --p-front-f and --p-adapter-f together), which is how the ITS tutorial does it, it doesn't do both sets of trimming (5' and 3').

I'll give it a go with anchoring the forward primer when I use it in combination and see if that works.

I actually ran it in two steps like here: Cut-adapt trim paired - different results when primers separate vs linked and it gave the expected trimming. Thanks for your help.
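
For reference, the two-step approach could look roughly like the sketch below. This is not the exact command set that was run in this thread: `two_step_trim` is a hypothetical wrapper, and the intermediate and final file names are made up for illustration. The likely reason two passes help is that Cutadapt, by default, removes at most one adapter from each read (its `--times` option defaults to 1), so a single pass that is given both a 5' and a 3' sequence trims only the best match. q2-cutadapt exposes the same setting as `--p-times`, so a single pass with `--p-times 2` may be an alternative, although that was not tested in this thread.

```shell
# Sketch only: two_step_trim is a hypothetical helper, not part of QIIME 2.
# Pass 1 removes the 5' primers; pass 2 removes 3' read-through
# (the reverse complement of the opposite primer).
two_step_trim() {
  in_qza=$1
  out_dir=$2
  mkdir -p "$out_dir"

  # Pass 1: 5' primers on forward and reverse reads.
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$in_qza" \
    --p-front-f CTTGGTCATTTAGAGGAAGTAA \
    --p-front-r GCTGCGTTCTTCATCGATGC \
    --o-trimmed-sequences "$out_dir/trimmed_5prime.qza" || return 1

  # Pass 2: 3' read-through, run on the output of pass 1.
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$out_dir/trimmed_5prime.qza" \
    --p-adapter-f GCATCGATGAAGAACGCAGC \
    --p-adapter-r TTACTTCCTCTAAATGACCAAG \
    --o-trimmed-sequences "$out_dir/trimmed_both.qza"
}
```

It would be called as, for example, `two_step_trim analysis/seqs/combined_seqs.qza analysis/seqs_trimmed_two_step`, after which `qiime demux summarize` on `trimmed_both.qza` should show whether both ends were trimmed.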

Hi there @gkphilip - @Keegan-Evans is out of the office for the rest of the week, he'll get back to you some time next week. Thanks!

@gkphilip,

I am glad you got it figured out! Cutadapt can be a bit picky and likes to be told exactly what to do. I wish it was not so picky, but we are not the developers of Cutadapt itself :woozy_face:

Thanks for hopping back on and documenting that this fixed it for you :ok_hand: Between your post, the post you linked, and another post from this past week, I think we can say that this is now a known issue and warn other users about it!

It sure is picky! Do you think it is worthwhile for me to comment on the ITS tutorial on here (Fungal ITS analysis tutorial) to point out that this may be an issue when running the cutadapt step with your own data? In the tutorial, the forward primers have already been trimmed in the raw reads that they use, so those steps outlined in the tutorial work fine for that data.

@gkphilip,

That would be great, thanks!
