The version is qiime2-2018.6.
- When I run sequence A01 by itself, feature_classifier returns a g__Herbaspirillum classification.
- When I run the exact same sequence together with sequence A02, feature_classifier returns a p__OD1 classification.
- feature_classifier on A02 always returns p__OD1, even when A02 is run by itself.
The commands are:
qiime tools import --type 'FeatureData[Sequence]' --input-path A01A02.fasta --output-path A01A02.qza
qiime feature-classifier classify-sklearn --i-classifier /databases/qiime2/gg-13-8-99-nb-classifier.qza --i-reads A01A02.qza --o-classification gg.qza
qiime tools export gg.qza --output-dir gg
When I run QIIME 1 with the RDP classifier against the Greengenes 13_8 99% database, I get g__Herbaspirillum for both A01 and A02.
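For anyone trying to reproduce the comparison, here is a minimal Python sketch that lines up the classifications from several runs (A01 alone, A02 alone, both together) and reports the features whose deepest assigned rank differs between runs. It assumes each run was exported to a tab-separated taxonomy file with the columns `Feature ID`, `Taxon` and `Confidence`; the function names and the run labels are hypothetical and chosen only for illustration.

```python
import csv
import io


def read_taxonomy(tsv_text):
    """Parse exported taxonomy TSV text into {feature_id: (taxon, confidence)}.

    Assumes a header row with 'Feature ID', 'Taxon' and 'Confidence' columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["Feature ID"]: (row["Taxon"], float(row["Confidence"]))
        for row in reader
    }


def deepest_rank(taxon):
    """Return the last rank that carries a name, e.g. 'g__Herbaspirillum'.

    Empty placeholders such as 's__' are skipped.
    """
    ranks = [rank.strip() for rank in taxon.split(";")]
    named = [rank for rank in ranks if rank and not rank.endswith("__")]
    return named[-1] if named else "Unassigned"


def differing_features(runs):
    """Compare runs given as {run_label: tsv_text}.

    Returns {feature_id: {run_label: deepest_rank}} for every feature that
    received different deepest ranks in at least two runs.
    """
    parsed = {label: read_taxonomy(text) for label, text in runs.items()}
    feature_ids = set().union(*(table.keys() for table in parsed.values()))
    report = {}
    for feature_id in sorted(feature_ids):
        calls = {
            label: deepest_rank(table[feature_id][0])
            for label, table in parsed.items()
            if feature_id in table
        }
        if len(set(calls.values())) > 1:
            report[feature_id] = calls
    return report
```

Usage would be along the lines of `differing_features({"A01_alone": open(path1).read(), "A01_with_A02": open(path2).read()})`, where `path1` and `path2` are placeholders for the exported taxonomy files of the respective runs.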
The FASTA:
>plate_999-A01 plate_184-A01 Contig - A01
TCGACGGCAGCATGGGAGCTTGCTCCTGATGGCGAGTGGCGAACGGGTGAGTAATATATC
GGAACGTGCCCTAGAGTGGGGGATAACTAGTCGAAAGACTAGCTAATACCGCATACGATC
TACGGATGAAAGTGGGGGATCTCAAGACCTCATGCTCCTGGAGCGGCCGATATCTGATTA
GCTAGTTGGTGGGGTAAAAGCCTACCAAGGCAACGATCAGTAGCTGGTCTGAGAGGACGA
CCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATT
TTGGACAATGGGGGCAACCCTGATCCAGCAATGCCGCGTGAGTGAAGAAGGCCTTCGGGT
TGTAAAGCTCTTTTGTCAGGGAAGAAACGGTTGTCTCTAATAATATTACTAATGACGGTA
CCTGAAGAATAAGCACCGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGGTGCAA
GCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTGTGTAAGTCAGATGTGA
AATCCCCGGGCTCAACCTGGGAATTGCATTTGAGACTGCACGGCTAGAGTGTGTCAGAGG
GGGGGTAGAATTCCACGTGTAGCAGTGAAATGCGTAGATATGTGGAGGAATACCGATGGC
GAAGGCAGCCCCCTGGGATAACACTGACGCTCATGCACGAAAGCGTGGGGAGCAAACAGG
ATTAGATACCCTGGTAGTCCACGCCCTAAACGATGTCTACTAGTTGTCGGGTCTTAATTG
CCTTGGTAACGCAGCTAACGCGTGAAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAA
AACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGATGATGTGGATTAATTCGATGC
AACGCGAAAAACCTTACCTACCCTTGACATGGATGGAATCCCGAAGAGATTTGGGAGTGC
TCGAAAGAGAACCATCACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGT
TGGGTTAAGTCCCGCAACGAGCGCAACCCTTGTCATTAGTTGCTACGAAAGGGCACTCTA
ATGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAGTCCTCATGGCCCT
TATGGGTAGGGCTTCACACGTCATACAATGGTACATACAGAGGGCCGCCAACCCGCGAGG
GGGAGCTAATCCCAGAAAGTGTATCGTAGTCCGGATTGCAGTCTGCAACTCGACTGCATG
AAGTTGGAATCGCTAGTAATCGCGGATCAGCATGTCGCGGTGAATACGTTCCCGGGTCTT
GTACACACCGCCCGTCACACCATGGGAGCGGGTTTACCAGAAGTG
>plate_999-A02 plate_184-A02 Contig - A02
CTGATGGCGAGTGGCGAACGGGTGAGTAATATATCGGAACGTGCCCTAGTAGTGGGGGAT
AACTAGTCGAAAGACTAGCTAATACCGCATACGATCTACGGATGAAAGCGGGGGATCTCA
CGACCTCATGCTATTGGAGCGGCCGATATCTGATTAGCTAGTTGGTGGGGTAAAAGCCTA
CCAAGGCGACGATCAGTAGCTGGTCTGAGAGGACGACCAGCCACACTGGGACTGAGACAC
GGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATTTTGGACAATGGGGGCAACCCTGAT
CCAGCAATGCCGCGTGAGTGAAGAAGGCCTTCGGGTTGTAAAGCTCTTTTGTCAGGGAAG
AAACGGTAGTATCTAATACATATTGGTAATGACGGTACCTGAAGAATAAGCACCGGCTAA
CTACGTGCCAGCAGCCGCGGTAATACGTAGGGTGCAAGCGTTAATCGGAATTACTGGGCG
TAAAGCGTGCGCAGGCGGTTTTGTAAGTCAGATGTGAAATCCCCGGGCTCAACCTGGGAA
CTGCATTTGAGACTGCCCGGCTAGAGTGTGTCAGAGGGGGGTAGAATTCCACGTGTAGCA
GTGAAATGCGTAGATATGTGGAGGAATACCGATGGCGAAGGCAGCCCCCTGGGACAACAC
TGACGCTCATGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGC
CCTAAACGATGTCAACTAGTTGTCGGGTCTTAATTGACTTGGTAACGCAGCTAACGCGTG
AAGTAGACCGCCTGGGGAGTACGGTCGCAAGATTAGAACTCAAAGGAATTGACGGGGACC
CGCACAAGCGGTGGATGATGTGGATTAATTCTTGCTGGCGAAAAACCTTACCTACCCTTG
ACATGGTGGAATCCCGAAGAGATAGTGAGTGCTCCCTTAGAACCGCACCAGGTGCTGCAT
GGTGTCGTCAGCTCGTGTCTGAATGTTGGGTTAATCCCGCAACGAGCGCAACCCTTG
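A quick sanity check on the two records, as a rough sketch rather than a diagnosis: the Python below parses the FASTA, and counts how many k-mers of one sequence occur in the other on the same strand versus on the reverse-complement strand. That gives a crude indication of whether A01 and A02 are in the same orientation and how much they overlap. It assumes standard FASTA with `>` header lines; the helper names and the default `k=12` are arbitrary choices for illustration.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA lines into {record_id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    """
    records = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


# Translation table for complementing unambiguous DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k=12):
    """Return the set of distinct k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_counts(a, b, k=12):
    """Count distinct k-mers of `a` found in `b` and in reverse_complement(b).

    Returns (same_strand, opposite_strand). A much larger first number
    suggests the two sequences are in the same orientation.
    """
    kmers_a = kmers(a, k)
    same_strand = len(kmers_a & kmers(b, k))
    opposite_strand = len(kmers_a & kmers(reverse_complement(b), k))
    return same_strand, opposite_strand
```

For the file above, this could be used as `seqs = read_fasta(open("A01A02.fasta"))` followed by `shared_kmer_counts(seqs["plate_999-A01"], seqs["plate_999-A02"])`, with `len(seqs[...])` giving the length of each record.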
Any input appreciated! I am running sets of 96 samples of full-length 16S rRNA sequences and getting tons of p__OD1 classifications that look suspect based on the above. Thanks!