Feature classifier using UNITE developer sequences

Hi there,
I am trying to extract reads and train a classifier using the developer sequences from UNITE. I can import the developer sequences using:
qiime tools import \
  --type FeatureData[Sequence] \
  --input-path developer/sh_refs_qiime_ver7_97_01.12.2017_dev.fasta \
  --output-path developer/unite-ver7-97-seqs-01.12.2017.qza

However, when I run:
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads developer/unite-ver7-97-seqs-01.12.2017.qza \
  --i-reference-taxonomy unite-ver7-97-tax-01.12.2017-developer.qza \
  --o-classifier 97-classifier-deblur-developer.qza

I get the error:
Plugin error from feature-classifier:

Invalid character in sequence: b't'.
Valid characters: ['-', 'G', 'R', 'A', 'H', '.', 'D', 'B', 'N', 'M', 'K', 'V', 'C', 'T', 'Y', 'W', 'S']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.

Do the developer sequences need to be reformatted to be compatible with QIIME? Thanks in advance!

Hi @rboutin,

Yes, we have seen this with the developer seqs before.

Ideally we will add lowercase character support in the near future. For the time being, lowercase characters should be quite rare (depending on the source dataset), and there is an easy workaround: convert the lowercase sequence characters to uppercase before importing.
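
For anyone landing here later, here is a minimal sketch of one way to uppercase the sequences, assuming a standard awk is available and that every header line starts with ">". The function names and the tiny example file are made up for illustration and are not part of QIIME or UNITE; point the functions at your own developer FASTA and then re-run the import on the converted file.

```shell
# Count sequence (non-header) lines that contain at least one lowercase character.
# Hypothetical helper name, for illustration only.
count_lowercase_lines() {
  awk '!/^>/ && $0 != toupper($0) { n++ } END { print n + 0 }' "$1"
}

# Write a copy of the FASTA with sequence lines uppercased.
# Header lines (starting with ">") are passed through unchanged so IDs still match the taxonomy file.
uppercase_fasta() {
  awk '/^>/ { print; next } { print toupper($0) }' "$1" > "$2"
}

# Made-up two-line example file, not real UNITE data.
printf '>SH_example_1\nACGTtacgN\n' > example_dev.fasta

count_lowercase_lines example_dev.fasta
uppercase_fasta example_dev.fasta example_dev_uppercase.fasta
cat example_dev_uppercase.fasta
```

Running the count first is optional, but it gives a quick sense of how many sequences are affected in a given release.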

I hope that helps!

Hi Nicholas,
Thanks for the quick reply! Yes, I just tried it again after using some code I found on a previous forum topic to convert the lowercase characters to uppercase, and things seem to be working now. Thanks again!

