Trimming Paired-End Sequences with cutadapt While Maintaining Pairing Integrity

Problem:

I have a set of paired-end sequencing data (_1.fastq for forward reads and _2.fastq for reverse reads) that I need to trim using cutadapt. Specifically, I want to apply different trimming conditions for the forward and reverse sequences, but I also need to ensure that the forward (_1) and reverse (_2) reads remain paired correctly after trimming.

The current plan:

  • Forward sequences (_1.fastq): Trimming with -m 260 (discard reads shorter than 260 bases) and -l 260 (truncate reads to 260 bases).
  • Reverse sequences (_2.fastq): Trimming with -m 230 (discard reads shorter than 230 bases) and -l 230 (truncate reads to 230 bases).
  • The objective is to maintain the pairing between the forward and reverse reads after trimming.

Here’s the current implementation, where I run cutadapt on the forward and reverse files separately:

# Set up input and output directories
# (INPUT_DIR is made absolute so it still resolves after the cd below)
INPUT_DIR="$(pwd)/data"
OUTPUT_DIR=04_trimming
mkdir -p "${OUTPUT_DIR}"
cd "${OUTPUT_DIR}"

# Trimming forward sequences (_1.fastq)
parallel --jobs 4 \
'cutadapt \
-m 260 \
-l 260 \
-o {1/}_trimmed_1.fastq \
{1} \
> {1/}_cutadapt_log.txt' \
::: "${INPUT_DIR}"/*_1.fastq

# Trimming reverse sequences (_2.fastq)
parallel --jobs 4 \
'cutadapt \
-m 230 \
-l 230 \
-o {1/}_trimmed_2.fastq \
{1} \
> {1/}_cutadapt_log.txt' \
::: "${INPUT_DIR}"/*_2.fastq

The issue:

The trimming process works for each file individually, but I need to ensure that the trimmed forward and reverse sequences still correspond correctly (i.e., they should remain paired). Because cutadapt is applied separately to the forward and reverse files, the pairing might break: -m can discard a read from one file while its mate is kept in the other.
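One way to see whether the pairing survived is to compare the read IDs of the two output files. The following is a minimal sketch, not part of the original post. It assumes uncompressed FASTQ with exactly four lines per record, and that mates share the same ID apart from an optional /1 or /2 suffix. The function names read_ids and pairs_in_sync are made up for illustration.

```bash
# Sketch: check that two FASTQ files list the same read IDs in the same order.
# Assumes plain 4-line FASTQ records (no wrapped sequence lines).

# Print one read ID per record: the first whitespace-separated field of the
# header line, without the leading "@" and without a trailing /1 or /2.
read_ids() {
    awk 'NR % 4 == 1 { id = substr($1, 2); sub(/\/[12]$/, "", id); print id }' "$1"
}

# Return 0 if both files contain the same IDs in the same order, 1 otherwise.
pairs_in_sync() {
    ids1=$(mktemp)
    ids2=$(mktemp)
    read_ids "$1" > "$ids1"
    read_ids "$2" > "$ids2"
    cmp -s "$ids1" "$ids2"
    status=$?
    rm -f "$ids1" "$ids2"
    return "$status"
}

# Usage (file names are hypothetical):
#   pairs_in_sync sample_trimmed_1.fastq sample_trimmed_2.fastq && echo "in sync"
```

If the two files were trimmed independently with different -m thresholds, this check would be expected to fail as soon as one mate of a pair is discarded and the other is kept.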

Hello @valengirardi,

You should take a look at qiime cutadapt trim-paired. Does it let you do what you want? This action will ensure that your paired reads stay in sync.
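For plain FASTQ files outside of QIIME 2, cutadapt's own paired-end mode gives the same guarantee: both files go through a single call, and a pair is kept or discarded as a unit. The sketch below is an illustration under a few assumptions, not a tested pipeline. It assumes files are named <sample>_1.fastq and <sample>_2.fastq, and that the installed cutadapt version supports -L (shorten R2 to a given length); `cutadapt --help` should confirm this. The helper names mate_of and trim_pair are hypothetical.

```bash
# Sketch: trim each R1/R2 pair in one cutadapt call so the mates stay in sync.
# Assumption: the installed cutadapt supports -L (check `cutadapt --help`).

INPUT_DIR=data
OUTPUT_DIR=04_trimming

# Derive the reverse-read path from a forward-read path
# (sample_1.fastq -> sample_2.fastq).
mate_of() {
    printf '%s\n' "${1%_1.fastq}_2.fastq"
}

# Run cutadapt in paired-end mode on one sample.
#   -m 260:230  minimum length for R1:R2; with the default --pair-filter=any,
#               the whole pair is dropped if either read is too short
#   -l 260      shorten R1 to 260 bases
#   -L 230      shorten R2 to 230 bases
#   -o / -p     output files for R1 / R2
trim_pair() {
    r1=$1
    r2=$(mate_of "$r1")
    sample=$(basename "$r1" _1.fastq)
    cutadapt \
        -m 260:230 \
        -l 260 -L 230 \
        -o "${OUTPUT_DIR}/${sample}_trimmed_1.fastq" \
        -p "${OUTPUT_DIR}/${sample}_trimmed_2.fastq" \
        "$r1" "$r2" \
        > "${OUTPUT_DIR}/${sample}_cutadapt_log.txt"
}

mkdir -p "$OUTPUT_DIR"
for r1 in "$INPUT_DIR"/*_1.fastq; do
    [ -e "$r1" ] || continue    # glob matched nothing
    [ -e "$(mate_of "$r1")" ] || { echo "no mate for $r1" >&2; continue; }
    trim_pair "$r1"
done
```

The loop runs one sample at a time to keep the sketch short. The same trim_pair call could be handed to GNU parallel, as in the original script, as long as each job receives the forward and reverse file of one sample together.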