ITS + 16S sequences in the same fastq file, how to separate them before DADA2?

lisacarraro1982 · February 11, 2019, 5:06pm

Hi,
I have plant ITS and bacterial 16S sequences in the same fastq library (I gave the same barcode to the two different amplicons).

My data are paired-end reads with quality scores (PairedEndSequencesWithQuality), already demultiplexed by the sequencing service.

I would like to separate the 16S sequences from the ITS sequences before running DADA2 by searching for the primer sequences inside the reads. The 16S forward primer is TCCTACGGGAGGCAGCAGT and the 16S reverse primer is GGACTACCAGGGTATCTAATCCTGTT.

I created a mini dataset with 3 of my fastq.gz samples, named 100, 101, and 102.

I imported them with this command:

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path raw --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-paired-end.qza

I have created a metadata file like this:

Sample Barcode

100 TCCTACGGGAGGCAGCAGT
100 GGACTACCAGGGTATCTAATCCTGTT
101 TCCTACGGGAGGCAGCAGT
101 GGACTACCAGGGTATCTAATCCTGTT
102 TCCTACGGGAGGCAGCAGT
102 GGACTACCAGGGTATCTAATCCTGTT

Can I use cutadapt?

This command didn't work:

qiime cutadapt demux-paired --i-seqs demux-paired-end.qza --m-forward-barcodes-file metadata.txt --m-forward-barcodes-column Barcode --output-dir demultiplexed --o-untrimmed-sequences missed.qza

Thank you very much!!

Lisa

ebolyen · February 11, 2019, 5:57pm

Hi @lisacarraro1982,

I think I have another way to approach this.

DADA2 is actually perfectly fine with mixed amplicons; they just make the downstream analysis more complicated. I think there's a relatively straightforward way to handle this:

1. Run DADA2 on your mixed amplicons.
2. Use qiime feature-classifier extract-reads to pull out the ITS data from your resulting representative sequences (FeatureData[Sequence]), and then again to pull out the 16S data. Keep track of these new reads; they are your rep-seqs for anything else downstream.
3. Filter your feature table using the newly created rep-seqs with qiime feature-table filter-features. The trick is that you treat your rep-seqs as metadata, so you can pass them as --m-metadata-file rep-seqs-16s.qza (or whatever you named it). If you do that once for each amplicon, you will have a feature table and representative sequences for each amplicon.
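
As a rough sketch, the three steps could look something like the script below. Several things here are assumptions and not from the thread: the output file names (table.qza, rep-seqs.qza, rep-seqs-16s.qza and so on), the truncation lengths of 0 (meaning no truncation, a placeholder to replace after looking at your quality plots), the ITS primers (not given above, so they are read from the hypothetical variables ITS_F_PRIMER and ITS_R_PRIMER), and the helper functions run, denoise_mixed and split_amplicon. It also assumes the primer sequences are still present in the representative sequences, since extract-reads needs to find them. By default the script only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch only: prints the QIIME 2 commands unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# ITS primers are NOT given in the thread; supply your own (placeholders here).
ITS_F_PRIMER="${ITS_F_PRIMER:-YOUR_ITS_FORWARD_PRIMER}"
ITS_R_PRIMER="${ITS_R_PRIMER:-YOUR_ITS_REVERSE_PRIMER}"

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: denoise the mixed 16S + ITS reads together.
# Truncation lengths of 0 (no truncation) are placeholders, not recommendations.
denoise_mixed() {
  run qiime dada2 denoise-paired \
    --i-demultiplexed-seqs demux-paired-end.qza \
    --p-trunc-len-f 0 \
    --p-trunc-len-r 0 \
    --o-table table.qza \
    --o-representative-sequences rep-seqs.qza \
    --o-denoising-stats denoising-stats.qza
}

# Steps 2 and 3 for one amplicon: pull out the rep-seqs matching a primer
# pair, then keep only those features in the table.
split_amplicon() {
  local name="$1" fwd="$2" rev="$3"
  run qiime feature-classifier extract-reads \
    --i-sequences rep-seqs.qza \
    --p-f-primer "$fwd" \
    --p-r-primer "$rev" \
    --o-reads "rep-seqs-${name}.qza"
  run qiime feature-table filter-features \
    --i-table table.qza \
    --m-metadata-file "rep-seqs-${name}.qza" \
    --o-filtered-table "table-${name}.qza"
}

denoise_mixed
split_amplicon 16s TCCTACGGGAGGCAGCAGT GGACTACCAGGGTATCTAATCCTGTT
split_amplicon its "$ITS_F_PRIMER" "$ITS_R_PRIMER"
```

Wrapping steps 2 and 3 in one function keeps the two amplicons symmetric: each call should leave you with a rep-seqs-NAME.qza and a matching table-NAME.qza to carry into downstream analysis.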

lisacarraro1982 · February 12, 2019, 4:33pm

Thank you very much!
