‘reference_sequences’ is not a subtype of FeatureData[Sequence]

Dear Greg,

Thank you for this tutorial.
I have performed this tutorial with my own data (demultiplexed, quality-controlled sequence data), so I followed option 1 and went on to perform closed-reference clustering. However, I proceeded with a different reference database, i.e. the Naive Bayes classifier trained on Greengenes 13_8 99% OTUs full-length sequences.

I used the following commands:
qiime vsearch cluster-features-closed-reference \
--i-table table.qza \
--i-sequences rep-seqs.qza \
--i-reference-sequences gg-13-8-99-nb-classifier.qza \
--p-perc-identity 0.99 \
--o-clustered-table closedOTU/table-cr-99.qza \
--o-unmatched-sequences closedOTU/unmatched.qza \
--output-dir closedOTU

When I ran these commands, I got the following error:
Plugin error from vsearch:
Argument to parameter 'reference_sequences' is not a subtype of FeatureData[Sequence].
Debug info has been saved to /tmp/qiime2-q2cli-err-g8g61yiv.log
Attached: qiime2-q2cli-err-g8g61yiv.txt (723 Bytes)

Do you have a suggestion for correcting my error?

Kind regards,
Guido

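Hi @Guido_Lopes_dos_Sant! The file you passed, gg-13-8-99-nb-classifier.qza, is a pre-trained taxonomic classifier rather than a set of sequences, which is why it fails the FeatureData[Sequence] type check. You’ll need to import your reference sequences (assuming they are unaligned sequences in FASTA format) using the FeatureData[Sequence] semantic type. See this section of the importing tutorial for details.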
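
As a rough sketch of what that could look like (not a definitive recipe): the file names 99_otus.fasta and 99_otus.qza below are assumptions, standing in for wherever your unaligned Greengenes 13_8 99% OTU sequences live and whatever you want to call the imported artifact. The guard simply skips the commands if qiime or the FASTA file is not available.

```shell
# Hypothetical file names; adjust to your own paths.
REF_FASTA="99_otus.fasta"   # unaligned reference sequences in FASTA format (assumed name)
REF_QZA="99_otus.qza"       # imported FeatureData[Sequence] artifact (assumed name)

if command -v qiime >/dev/null 2>&1 && [ -f "$REF_FASTA" ]; then
  # Import the reference FASTA as a FeatureData[Sequence] artifact.
  qiime tools import \
    --type 'FeatureData[Sequence]' \
    --input-path "$REF_FASTA" \
    --output-path "$REF_QZA"

  # Re-run closed-reference clustering against the imported sequences
  # instead of the classifier artifact.
  qiime vsearch cluster-features-closed-reference \
    --i-table table.qza \
    --i-sequences rep-seqs.qza \
    --i-reference-sequences "$REF_QZA" \
    --p-perc-identity 0.99 \
    --output-dir closedOTU
else
  echo "Skipping: qiime or $REF_FASTA not found" >&2
fi
```

Here only --output-dir is given, so all outputs of the clustering step are written into closedOTU under their default names.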

qiime vsearch cluster-features-closed-reference performs closed-reference OTU picking using the reference sequences, but doesn’t do any sort of taxonomic classification. To learn more about taxonomic classification in QIIME 2, see this tutorial.

Note: since you’re performing closed-reference OTU picking, each feature ID in the resulting feature table will correspond to a reference sequence ID. If your reference database already has a TSV file mapping each reference sequence ID to its taxonomic annotation, you don’t necessarily need to perform taxonomic classification since you already have an annotation for each reference sequence. You can simply import your reference taxonomy TSV file following this section of the feature classification tutorial I linked to above. If your TSV file doesn’t have a header line (i.e. indicating column names in the file), you can use --source-format HeaderlessTSVTaxonomyFormat (that’s the format used in the tutorial’s data set). If your TSV file has a header line, you can use --source-format TSVTaxonomyFormat in the import command. Either way, once you have a closed-reference feature table and the imported reference taxonomy, you can skip taxonomic classification and use those two .qza files in downstream analyses.
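
A minimal sketch of the taxonomy import described above, under a few assumptions: the file names 99_otu_taxonomy.txt and ref-taxonomy.qza are placeholders, taxonomy_format is a hypothetical helper (not part of QIIME 2), and a header line, if present, is assumed to start with "Feature ID". The --source-format flag is the one named in this thread; more recent QIIME 2 releases appear to call it --input-format, so check qiime tools import --help for your version.

```shell
# Hypothetical helper: print the taxonomy format name matching a TSV file.
# Assumes a header line, if present, starts with "Feature ID".
taxonomy_format() {
  if head -n 1 "$1" | grep -q '^Feature ID'; then
    echo TSVTaxonomyFormat
  else
    echo HeaderlessTSVTaxonomyFormat
  fi
}

REF_TAXONOMY_TSV="99_otu_taxonomy.txt"   # reference taxonomy TSV (assumed name)
REF_TAXONOMY_QZA="ref-taxonomy.qza"      # imported FeatureData[Taxonomy] artifact (assumed name)

if command -v qiime >/dev/null 2>&1 && [ -f "$REF_TAXONOMY_TSV" ]; then
  # Import the reference taxonomy, choosing the format from the first line.
  qiime tools import \
    --type 'FeatureData[Taxonomy]' \
    --source-format "$(taxonomy_format "$REF_TAXONOMY_TSV")" \
    --input-path "$REF_TAXONOMY_TSV" \
    --output-path "$REF_TAXONOMY_QZA"
else
  echo "Skipping import: qiime or $REF_TAXONOMY_TSV not found" >&2
fi
```

The resulting artifact can then be supplied, together with the closed-reference feature table, wherever a downstream command expects a FeatureData[Taxonomy] input.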

Hi @jairideout Thank you for the tips! I’ll implement your suggestions.
