cutadapt: specify length of read to be checked for matches during demux-paired?

Peer_Starke · February 17, 2022, 6:15pm

Hello Everyone,

Is it possible to specify how much of each read is processed by cutadapt?
I am asking because all my reads start directly with my barcode sequence, which is followed by a primer sequence.
e.g. forward read
TCAT (Barcode) CTTGGTCATTTAGAGGAAGTAA (Primer)

If I now run demux-paired with TCAT specified in my metadata as the barcode sequence, the resulting demultiplexed file contains every read that has a TCAT anywhere within the read, which is a bit annoying.

My current workaround was to combine my barcode and primer sequences during demux-paired. This, however, rules out any error tolerance, because the actual goal is zero error tolerance in the barcode and the error rate applies to the combined sequence as a whole.
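
A note on why one error rate cannot protect the barcode and relax the primer at the same time: cutadapt's documentation describes the number of allowed errors as the length of the matched sequence multiplied by the maximum error rate, rounded down. Below is a minimal sketch of that arithmetic for the 4 nt barcode and 22 nt primer above. `allowed_errors` is a hypothetical helper for illustration, not a cutadapt function.

```python
def allowed_errors(match_length: int, error_rate: float) -> int:
    """Errors tolerated for a match of the given length (length * rate, rounded down)."""
    return int(match_length * error_rate)


BARCODE = "TCAT"
PRIMER = "CTTGGTCATTTAGAGGAAGTAA"

# Barcode alone: no errors are tolerated at a rate of 0.1.
print(allowed_errors(len(BARCODE), 0.1))           # 0
# Primer alone: two errors are tolerated at a rate of 0.1.
print(allowed_errors(len(PRIMER), 0.1))            # 2
# Barcode and primer combined: two errors, which may also fall in the barcode.
print(allowed_errors(len(BARCODE + PRIMER), 0.1))  # 2
# Combined with a rate of 0: nothing is tolerated, not even in the primer.
print(allowed_errors(len(BARCODE + PRIMER), 0.0))  # 0
```

With the combined 26 nt sequence, any rate that tolerates primer errors also tolerates errors inside the barcode, because the matcher does not know where the barcode ends.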

What I want to achieve is that cutadapt allows no errors in the barcode but tolerates a defined number of errors in the primer sequence.
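e.g. all of the following should be accepted:
TCAT CTTGGTCATTTAGAGGAAGTAA (100% correct sequence)
TCAT CTGGGTCATTTAGAGGAAGTAA (one mismatch in the primer)
TCAT GTTGGTCATTTAGAGGAAGTAA (one mismatch in the primer)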
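
The desired rule can be written down independently of cutadapt. The following is a rough Python sketch, not cutadapt's algorithm: it assumes the barcode must match exactly at the very start of the read, the primer follows immediately, and primer errors are counted as plain mismatches (no insertions or deletions). The function names and the `max_primer_mismatches` parameter are made up for this illustration.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def accept_read(read: str, barcode: str, primer: str,
                max_primer_mismatches: int) -> bool:
    """True if the read starts with the exact barcode, directly followed by
    the primer with at most `max_primer_mismatches` mismatches."""
    # Zero tolerance in the barcode, and only at the 5' end of the read.
    if not read.startswith(barcode):
        return False
    start = len(barcode)
    candidate = read[start:start + len(primer)]
    # Reads too short to contain the full primer are rejected.
    if len(candidate) < len(primer):
        return False
    return count_mismatches(candidate, primer) <= max_primer_mismatches


BARCODE = "TCAT"
PRIMER = "CTTGGTCATTTAGAGGAAGTAA"

examples = [
    "TCATCTTGGTCATTTAGAGGAAGTAAACGT",  # exact barcode, exact primer
    "TCATCTGGGTCATTTAGAGGAAGTAAACGT",  # exact barcode, one primer mismatch
    "TCTTCTTGGTCATTTAGAGGAAGTAAACGT",  # one error in the barcode
    "ACGTTCATCTTGGTCATTTAGAGGAAGTAA",  # TCAT present, but not at the start
]
for read in examples:
    print(read, accept_read(read, BARCODE, PRIMER, max_primer_mismatches=2))
```

Only the first two example reads are accepted. The last one is the case described above, where TCAT occurs somewhere inside the read rather than at its start.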

kindergarten · February 17, 2022, 7:31pm

I use cutadapt to demultiplex based only on the barcodes, with -e 0. The demultiplexed paired-end files are then imported into QIIME 2, and I trim the primers with DADA2, which trims a fixed length irrespective of sequence.

Peer_Starke · February 21, 2022, 1:57pm

Thank you for your reply!
I was able to fix my problem by simply accepting a lot of errors in my demux-paired output and then getting rid of them with the minimum length parameter (--p-minimum-length) during trim-paired.

qiime cutadapt demux-paired \
  --i-seqs multiplexed-seqs.qza \
  --m-forward-barcodes-file metadata.tsv \
  --m-forward-barcodes-column barcode-forward \
  --m-reverse-barcodes-file metadata.tsv \
  --m-reverse-barcodes-column barcode-reverse \
  --p-error-rate 0 \
  --o-per-sample-sequences demultiplexed-seqs.qza \
  --o-untrimmed-sequences untrimmed.qza \
  --verbose

qiime cutadapt trim-paired \
  --i-demultiplexed-sequences demultiplexed-seqs.qza \
  --p-front-f CTTGGTCATTTAGAGGAAGTAA \
  --p-front-r TCCTCCGCTTATTGATATGC \
  --p-error-rate 0.1 \
  --p-minimum-length 275 \
  --o-trimmed-sequences trimmed-seqs.qza \
  --verbose
