extract merged reads containing primer sites

I would like to extract the merged reads (from paired-end Illumina shotgun sequencing) that contain the priming sites for the two 16S primers 341F and 805R.

I thought of using the command that produces a reference subset for classifier creation, but the data type of my reads poses a problem.

Here is what I tried:

  • merge paired reads using bbmap to get longer sequences encompassing the two 16S primer sites (V3-4)

  • import the reads as merged reads

    qiime tools import \
      --input-path reads/merged_manifest.csv \
      --output-path merged_demux.qza \
      --type 'SampleData[JoinedSequencesWithQuality]' \
      --input-format SingleEndFastqManifestPhred33

  • feed the qza to extraction

    qiime feature-classifier extract-reads \
      --i-sequences merged_demux.qza \
      --p-f-primer "CCTACGGGNGGCWGCAG" \
      --p-r-primer "GACTACHVGGGTATCTAATCC" \
      --p-n-jobs 24 \
      --p-read-orientation 'forward' \
      --o-reads 314f-805r-merged_demux-seq.qza

I get:

There was a problem with the command:
(1/1) Invalid value for '--i-sequences': Expected an artifact of at least
type FeatureData[Sequence]. An artifact of type
SampleData[JoinedSequencesWithQuality] was provided.

Any idea how I could achieve my goal using QIIME?
Should I import my reads using another format? (Can I even keep the quality scores here?)

Hi @splaisan,

The feature-classifier plugin expects reads without quality scores (FeatureData[Sequence]). You might try trimming the paired-end reads with cutadapt before joining them. I'm not sure this will work, but it's worth a try on at least a subset.
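
If the cutadapt route turns out to be awkward for shotgun fragments, the selection itself can also be sketched outside QIIME 2, directly on the merged FASTQ file. What follows is only a minimal sketch, not a QIIME 2 or cutadapt feature. The function names (iupac_to_regex, revcomp, filter_merged_fastq) are made up for illustration. It assumes uppercase, four-line FASTQ records, exact primer matches (no mismatches tolerated), and reads in the forward orientation, so reads merged on the opposite strand would be missed. It also relies on the `rev` utility being installed.

```shell
#!/usr/bin/env bash
# Sketch only: keep merged FASTQ records that contain both primer sites.
# All function names below are hypothetical helpers, not part of any tool.

# Expand IUPAC degenerate bases into bracket expressions, e.g. W -> [AT].
iupac_to_regex() {
  printf '%s' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Reverse-complement a primer, keeping IUPAC codes consistent (H <-> D, V <-> B).
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVSWN' 'TGCAYRMKVHDBSWN'
}

# Read FASTQ on stdin and print only records whose sequence contains the
# forward primer followed, further downstream, by the reverse complement
# of the reverse primer.
filter_merged_fastq() {
  local fwd_re rev_re
  fwd_re=$(iupac_to_regex "$1")
  rev_re=$(iupac_to_regex "$(revcomp "$2")")
  awk -v pat="${fwd_re}.*${rev_re}" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 && seq ~ pat { print header; print seq; print plus; print $0 }
  '
}
```

A hypothetical call, with placeholder file names, would be `filter_merged_fastq CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC < merged.fastq > merged_with_primers.fastq`. Unlike extract-reads, this keeps the whole read, including the primers and the quality scores, so the result can still be imported as SampleData[JoinedSequencesWithQuality].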

Best,
Justine
