Hello-
First, I am not sure where this post belongs, so I apologize if it is in the wrong spot! I have seen related posts on this subject, and I am hoping that a solution may be of use to the QIIME2 community.
I am trying to format the RDP database so that it is compatible with QIIME2, but I consistently get an error when running vsearch to classify taxonomy (qiime feature-classifier classify-consensus-vsearch). The same qiime commands run successfully against the Silva and Greengenes reference databases, so I don’t think the problem is the vsearch command itself. I have tried reformatting the taxonomy and fasta files and have hit this error a couple of times, though with different identifiers listed each time. This is the error:
“Plugin error from feature-classifier:
'Identifier 135 was reported in taxonomic search results, but was not present in the reference taxonomy.'
Debug info has been saved to /tmp/qiime2-q2cli-err-usbrditl.log”
Below is my workflow from the beginning:
I went to the RDP resources website: RDPresources
I downloaded the fasta file of unaligned 16S data that has the taxonomy included in the file (current_Bacteria_unaligned.fa.gz). After unzipping it, I used grep to extract the taxonomy headers and edited the resulting file in python to match the silva/Greengenes taxonomy format from the QIIME2 resources.
Everything looks like it matches the taxonomy format of the other databases in the QIIME2 resources, and I was able to import the taxonomy as a .qza artifact without issue using the command below.
qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path RDP/rdp_qiime_taxonomy.txt \
--output-path rdp_16S_v16_ref-taxonomy.qza
Next I formatted the fasta file to match the other reference databases, and I was also able to import it as a qiime artifact with the command below with no issues:
qiime tools import \
--type 'FeatureData[Sequence]' \
--input-path RDP/rep_set_99_rdp.fa \
--output-path rdp_16S_v16_otus.qza
I then used DADA2 for denoising and ran the command below to classify features:
qiime feature-classifier classify-consensus-vsearch \
--i-query merged_v67_rep_seqs_pyro_trun150.qza \
--i-reference-reads rdp_16S_v16_otus.qza \
--i-reference-taxonomy rdp_16S_v16_ref-taxonomy.qza \
--o-classification v67_rdp_vs_dada2trun150_rdp_complete.qza
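Since the error mentions an identifier that is missing from the reference taxonomy, one check that might narrow this down is whether every sequence ID in the fasta file also has a row in the taxonomy file. Below is a minimal sketch in python, not something I have confirmed explains the error. It assumes the fasta ID is the text after ">" up to the first whitespace and that the taxonomy file is a headerless, tab-separated file with the ID in the first column. The function names are only illustrative.

```python
from typing import Iterable, List, Set


def fasta_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sequence IDs: the text after '>' up to the first whitespace (assumed)."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def taxonomy_ids(lines: Iterable[str]) -> Set[str]:
    """Collect IDs from the first tab-separated column of a headerless TSV (assumed)."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():  # skip blank lines
            ids.add(line.split("\t", 1)[0].strip())
    return ids


def ids_missing_from_taxonomy(
    fasta_lines: Iterable[str], taxonomy_lines: Iterable[str]
) -> List[str]:
    """Return the sorted fasta IDs that have no matching taxonomy row."""
    return sorted(fasta_ids(fasta_lines) - taxonomy_ids(taxonomy_lines))


def report(fasta_path: str, taxonomy_path: str) -> List[str]:
    """Read both files from disk and print a short summary of unmatched IDs."""
    with open(fasta_path) as fa, open(taxonomy_path) as tax:
        missing = ids_missing_from_taxonomy(fa, tax)
    print(f"{len(missing)} fasta IDs have no taxonomy row")
    print(missing[:10])  # show only the first few
    return missing
```

With the files from this post, the call would be report("RDP/rep_set_99_rdp.fa", "RDP/rdp_qiime_taxonomy.txt"). An empty result would suggest the IDs agree in these two files and the problem lies elsewhere.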
The files are too large to attach here, so figshare links are below. Please let me know if there's an issue opening them.
taxonomy
ref_otus
I am still getting the hang of editing in unix, but my taxonomy file looked okay when I inspected it in python. Any assistance would be appreciated, and if we can get it formatted correctly, I would be happy to share this as a resource if RDP is a database of interest for QIIME2.
Thanks-
Katherine