Primer sequences still present in representative-sequences after running ITSxpress

Pablo_V · May 6, 2025, 2:07pm

Hi,

I received demultiplexed sequences from the sequencing facility and trimmed the ITS1 primers I used with the following cutadapt command:

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences fun-rz-rt-demux.qza \
  --p-anywhere-f CTTGGTCATTTAGAGGAAGTAA \
  --p-anywhere-r GCTGCGTTCTTCATCGATGC \
  --o-trimmed-sequences fun-rz-rt-demux-trimmed2.qza

Then I used these trimmed sequences as input for ITSxpress in QIIME 2 2023.5 with the following command:

qiime itsxpress trim-pair-output-unmerged \
  --i-per-sample-sequences demux-sequences-trimmedprimers.qza \
  --p-region ITS2 \
  --p-taxa F \
  --p-cluster-id 1.0 \
  --p-threads 16 \
  --o-trimmed trimmed_exact.qza

Then I ran DADA2 on the resulting output and performed taxonomic assignment. The resulting table had 21.9K features, about half the number obtained without trimming the primers from the input sequences.

However, now when I check the rep-seqs file, I see a lot of my forward and reverse primer sequences in the rep-seqs sequences.
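For anyone wanting to quantify this rather than eyeball it, here is a minimal sketch of one way to count how many representative sequences contain a primer in either orientation. It assumes the rep-seqs artifact has already been exported to a FASTA file (for example with `qiime tools export`) and that the primers contain only A/C/G/T. The function name `count_primer_hits` is made up for illustration.

```shell
# Count FASTA records that contain a primer or its reverse complement.
# Usage: count_primer_hits PRIMER FASTA
# Assumption: PRIMER uses only A/C/G/T (no IUPAC ambiguity codes).
count_primer_hits() {
  primer=$1
  fasta=$2
  # Reverse complement of the primer.
  rc=$(printf '%s\n' "$primer" | rev | tr 'ACGTacgt' 'TGCAtgca')
  # Join wrapped sequence lines per record, then count records
  # that match the primer in either orientation (case-insensitive).
  awk -v p="$primer" -v r="$rc" '
    BEGIN { p = toupper(p); r = toupper(r) }
    /^>/ { if (seq != "" && (index(seq, p) || index(seq, r))) n++; seq = ""; next }
    { seq = seq toupper($0) }
    END { if (seq != "" && (index(seq, p) || index(seq, r))) n++; print n + 0 }
  ' "$fasta"
}
```

Calling `count_primer_hits CTTGGTCATTTAGAGGAAGTAA dna-sequences.fasta` would print the number of sequences containing the forward primer or its reverse complement. Only exact matches are counted, so primers with sequencing errors are missed.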

My main concern is whether it is acceptable to still find my primer sequences in the representative sequences.

I suspect there is a mistake in my pipeline, because when I test for differentially abundant features with LEfSe, none remain after p-value adjustment with FDR.

Thank you for your support.

SoilRotifer · May 7, 2025, 7:03pm

Hi @Pablo_V,

I suggest reading these posts:

In brief, the primer sequences make it easier for tools like ITSxpress, ITSx, etc. to find and extract the region of interest. This will potentially result in keeping more ITS reads. So, I'd skip the cutadapt step and jump right into ITSxpress. This is basically what I do whenever I need to process ITS data.
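As a rough sketch of what that looks like, the ITSxpress command from the original post can be pointed at the untrimmed demultiplexed artifact instead of the cutadapt output. The wrapper name `itsxpress_untrimmed` and the output filename in the usage note are placeholders. All flags are copied unchanged from the post above, so they are assumptions about this dataset rather than recommendations.

```shell
# Run ITSxpress directly on untrimmed paired-end reads (no cutadapt step first).
# Usage: itsxpress_untrimmed INPUT.qza OUTPUT.qza
itsxpress_untrimmed() {
  # --p-region is copied from the original post; set it to the
  # region your primers actually target.
  qiime itsxpress trim-pair-output-unmerged \
    --i-per-sample-sequences "$1" \
    --p-region ITS2 \
    --p-taxa F \
    --p-cluster-id 1.0 \
    --p-threads 16 \
    --o-trimmed "$2"
}
```

For the files named in this thread, that would be something like `itsxpress_untrimmed fun-rz-rt-demux.qza itsxpress-untrimmed.qza`, followed by DADA2 as before.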

-Mike