Find Sequence of Feature Identified

Hi there,

I am using QIIME to identify sequences found in eDNA. Occasionally I get features that are identified as species that are highly unlikely to be found in the eDNA sample (e.g. a Japanese saltwater fish found in an Alabama freshwater pond). I'd like to check the raw sequence data of these identified features using BLAST to see if they might be complete matches with other, more plausible species.

Is there an easy way to do that? I notice that the FeatureTable and Taxonomy Table outputs only include feature IDs or OTU IDs rather than the raw sequence.

Here's an example of my pipeline:

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path manifest-file.txt --input-format SingleEndFastqManifestPhred33V2 --output-path demux-co1-r1.qza

qiime dada2 denoise-single \
  --i-demultiplexed-seqs demux-co1-r1.qza \
  --p-trim-left 20 \
  --p-trunc-len 115 \
  --o-representative-sequences rep-seqs-r1-115.qza \
  --o-table table-r2-115.qza \
  --o-denoising-stats denoise-stats-r2-115.qza

qiime rescript evaluate-fit-classifier --i-sequences coi-mussels-filtered-seqs.qza --i-taxonomy coi-mussels-taxonomy-unfiltered.qza --o-classifier coi-mussels-classifier.qza --o-evaluation coi-mussels-classifier-evaluation.qzv --o-observed-taxonomy coi-mussels-classifier-predicted-taxonomy.qza --verbose

qiime feature-classifier classify-sklearn \
  --i-classifier ../coi-mussels-classifier.qza \
  --i-reads rep-seqs-r1-115.qza \
  --o-classification coi-r1-mussels-classified-taxonomy.qza

qiime metadata tabulate \
  --m-input-file coi-r1-mussels-classified-taxonomy.qza \
  --o-visualization coi-r1-mussels-classified-taxonomy.qzv

Thanks for your help.


Hi @alexkrohn,
There isn't a way to check the raw sequence from before your denoise-single step, but you can check the denoised sequence that DADA2 produced for each feature. That should give you the information you need to explore further.

In your example code, you created a file called rep-seqs-r1-115.qza. If you run qiime feature-table tabulate-seqs on that file, you'll generate a .qzv that maps the feature identifiers to their sequences. (Each sequence in that visualization is a link that will BLAST it for you on the NCBI BLAST server, a bit of convenience added for this exact situation.) In that .qzv, you can find the feature identifier you're interested in with your browser's search functionality (usually Command-F or Control-F).

You can see this illustrated in the Moving Pictures tutorial here.
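
For reference, here is a minimal sketch of those steps as two small shell helpers. The function names (tabulate_rep_seqs, lookup_feature_seq), the exported-rep-seqs directory, and the <feature-id> placeholder are made up for illustration. The second helper is an optional extra for when you'd rather pull a sequence out on the command line than search the .qzv: it assumes you have exported the representative sequences to FASTA first, and that the export writes a file named dna-sequences.fasta.

```shell
# Build the .qzv that maps feature IDs to sequences.
# Needs an active QIIME 2 environment. $1 = rep-seqs .qza, $2 = output .qzv
tabulate_rep_seqs() {
  qiime feature-table tabulate-seqs \
    --i-data "$1" \
    --o-visualization "$2"
}

# Print the sequence for one feature ID from a FASTA file.
# $1 = feature ID (hypothetical placeholder in the usage below), $2 = FASTA file
lookup_feature_seq() {
  awk -v id="$1" '
    /^>/ { hit = (substr($1, 2) == id); next }  # header line: does the ID match?
    hit  { printf "%s", $0 }                    # sequence line(s) of the matching record
    END  { print "" }                           # finish with a newline
  ' "$2"
}

# Usage (not run here):
#   tabulate_rep_seqs rep-seqs-r1-115.qza rep-seqs-r1-115.qzv
#   qiime tools view rep-seqs-r1-115.qzv
#   qiime tools export --input-path rep-seqs-r1-115.qza --output-path exported-rep-seqs
#   lookup_feature_seq <feature-id> exported-rep-seqs/dna-sequences.fasta
```

The feature ID to look up is the one shown next to the suspicious species in your taxonomy .qzv; the printed sequence can then be pasted into BLAST by hand.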

Good luck!
