I would like to extract the merged reads (from paired-end Illumina shotgun sequencing) that contain the priming sites of two 16S primers, 314F and 805R.
I thought of using the command normally used to produce a reference subset for classifier creation, but the data type of my reads poses a problem.
Here is what I tried:
- merge paired reads using bbmap to get longer sequences encompassing the two 16S primer sites (V3-4)
- import the reads as merged reads
  qiime tools import \
    --input-path reads/merged_manifest.csv \
    --output-path merged_demux.qza \
    --type SampleData[JoinedSequencesWithQuality] \
    --input-format SingleEndFastqManifestPhred33
- feed the qza to extraction
  qiime feature-classifier extract-reads \
    --i-sequences merged_demux.qza \
    --p-f-primer "CCTACGGGNGGCWGCAG" \
    --p-r-primer "GACTACHVGGGTATCTAATCC" \
    --p-n-jobs 24 \
    --p-read-orientation 'forward' \
    --o-reads 314f-805r-merged_demux-seq.qza
I get:
There was a problem with the command:
(1/1) Invalid value for '--i-sequences': Expected an artifact of at least
type FeatureData[Sequence]. An artifact of type
SampleData[JoinedSequencesWithQuality] was provided.
Any idea how I could achieve my goal using QIIME?
Should I import my reads using another format? (Can I even keep the quality scores in that case?)
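To make the goal concrete, here is a rough sketch (outside QIIME) of the filtering I have in mind. It is only an illustration under several assumptions: exact matching with no mismatches allowed, reads in forward orientation only, plain four-line FASTQ records, and the helper names (iupac_to_regex, revcomp, filter_fastq) and file names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: keep merged FASTQ reads that contain the forward primer followed,
# further downstream, by the reverse complement of the reverse primer.
# Assumptions: exact matches only, forward-oriented reads, 4-line FASTQ records.

# Expand IUPAC ambiguity codes into regex character classes.
iupac_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g'  -e 's/Y/[CT]/g'  -e 's/S/[CG]/g'  -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g'  -e 's/M/[AC]/g'  -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Reverse-complement a primer, including IUPAC ambiguity codes.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYSWKMBDHVN' 'TGCAYRSWMKVHDBN'
}

# Read FASTQ on stdin, write matching records to stdout.
# $1 = forward primer, $2 = reverse primer (both written 5'->3').
filter_fastq() {
  fwd_re=$(iupac_to_regex "$1")
  rev_re=$(iupac_to_regex "$(revcomp "$2")")
  awk -v pat="${fwd_re}.*${rev_re}" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { bases = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 { if (bases ~ pat) print header "\n" bases "\n" plus "\n" $0 }
  '
}

# Hypothetical usage (file names are placeholders):
#   filter_fastq CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC \
#     < merged.fastq > merged_with_primers.fastq
```

This keeps the quality lines, which is what I would like to preserve, but I would prefer a QIIME-native way that also tolerates a few mismatches in the primer sites.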