Q2-cutadapt add "--discard-untrimmed" option

michoug · February 21, 2018, 6:11pm

Hi
When I used the q2-cutadapt plugin, I noticed that it removed the majority of the primers as expected; however, by default it writes all the reads, even those that didn't contain the primers. I saw that the standalone software has this option.

michoug · February 21, 2018, 6:12pm

Hi
Is it possible to add the "--discard-untrimmed" option of the standalone cutadapt software to q2-cutadapt, in order to remove reads that don't contain the adapters?
Best
Greg

thermokarst · February 22, 2018, 2:29am

Hi @michoug! We have an open issue to implement this feature (--discard-untrimmed). Stay tuned! We will update you here when it becomes available in a future release of QIIME 2. Thanks!

DaS · September 13, 2018, 4:45pm

I would like to express my interest in this topic. Currently I am not using qiime2 cutadapt but the original version, just because of that feature.

I can't see a reason not to discard untrimmed reads. I usually end up discarding ~3% of read pairs (up to 20% for occasional low-quality samples), but I feel confident about the remaining sequences.

Why do you think primer-untrimmed sequences are worth keeping (aside from library preparation that immediately provides primer-free reads)? What do you think reads without a primer sequence could be?

Are you going to expose the "discard untrimmed" argument for cutadapt eventually?

thermokarst · September 13, 2018, 10:38pm

Hey there @DaS!

It is a technical reason: QIIME 2 currently doesn't allow for "empty" SampleData[SequencesWithQuality] and SampleData[PairedEndSequencesWithQuality], which could happen in the case where no reads are discarded. This is entirely an issue within QIIME 2. There are a few ways to solve it, including:

- error if no reads are discarded / discarded reads file is empty
- update the existing format definitions to allow for emptiness
- change the framework to support the notion of a "nullable" type

We currently have no ETA for this, but contributions are always welcome!

SarahH · November 5, 2018, 1:32pm

Hi, I was just wondering if there had been any progress on the discard untrimmed reads option? This would be so useful to me as I'm using different amplicons with the same barcodes. At the moment I need to use other pipelines, and it would be great to stick with this one.

Thanks!

Nicholas_Bokulich · November 5, 2018, 1:38pm

No, the issue above is still open:

You can track that issue to stay updated.

As a workaround for the time being, you could use an alignment search tool external to QIIME 2 to filter out sequences that contain a specific primer, and then import the result into QIIME 2. I am not sure if VSEARCH can perform alignment on fastq data, but that would be a good place to start.
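
To illustrate the kind of external pre-filter this workaround describes, here is a minimal Python sketch that keeps only the reads that begin with a given primer and drops the rest. It is not how cutadapt or QIIME 2 do it. The function names (`primer_regex`, `read_fastq`, `keep_reads_with_primer`) are made up for this example, and it assumes an uncompressed, single-end, four-lines-per-record FASTQ, a primer anchored at the 5' end of the read, and exact matching (IUPAC degenerate bases in the primer are expanded, but no mismatches or indels are tolerated).

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a regex that matches the primer, expanding degenerate bases."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def keep_reads_with_primer(records, primer):
    """Yield only the records whose sequence starts with the primer."""
    pattern = primer_regex(primer)
    for header, seq, qual in records:
        # re.match anchors at the start of the read (5' end).
        if pattern.match(seq.upper()):
            yield header, seq, qual


def write_fastq(records, handle):
    """Write records back out in 4-line FASTQ format."""
    for header, seq, qual in records:
        handle.write(f"{header}\n{seq}\n+\n{qual}\n")
```

Usage would look something like the following, where the file names and the primer string are placeholders. For paired-end data the forward and reverse files would have to be filtered together so that the pairs stay in sync, which this sketch does not attempt.

```python
with open("reads.fastq") as fin, open("reads.with_primer.fastq", "w") as fout:
    write_fastq(keep_reads_with_primer(read_fastq(fin), "GTGCCAGCMGCCGCGGTAA"), fout)
```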