Hi @Todd_Testerman,
Thanks for sharing your data. A few things:
- VSEARCH ERROR: Your FASTA file has Windows-style line endings. Other QIIME 2 plugins can handle these, but VSEARCH fails when it encounters them and raises this cryptic error. See here for some other examples: Dereplicated problem - #2 by Nicholas_Bokulich
- SILVA BAD CLASSIFICATIONS: This is probably a quirk of SILVA, or possibly of the specific classifier that you were using. We have seen similar issues with SILVA classifiers, in particular unexplained classifications to unknown archaea. The usual cause is unusually short or long sequences in the reference sequences. Using `extract-reads` usually fixes this (since it filters out unusually long/short seqs after in silico PCR), but this is not really an option for you. I'd recommend filtering out sequences that are shorter than expected for full-length 16S and training a fresh classifier.
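If it helps, here is a rough sketch of that length filter in plain Python, run on the exported reference FASTA. The function names and the idea of passing a `min_len` cutoff are my own illustration, not part of QIIME 2, and the right cutoff for "full-length 16S" is something you would need to pick for your data.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")  # tolerate Windows or Unix line endings
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len, max_len=None):
    """Keep only records whose sequence length is within [min_len, max_len].

    min_len and max_len are hypothetical cutoffs chosen by the user.
    """
    for header, seq in records:
        if len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        yield header, seq


def write_fasta(records, handle):
    """Write (header, sequence) pairs with Unix line endings."""
    for header, seq in records:
        handle.write(f">{header}\n{seq}\n")
```

You would open the exported reference file, pass the file handle to `read_fasta`, filter, and write the result to a new file before re-importing it and training the classifier.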
So to fix your problems:
- vsearch: export your sequences, convert to Unix-style line endings, then re-import before proceeding with vsearch classification
- SILVA: clean up the database and train your own classifier
- OR you could use the pre-trained Greengenes full-length 16S classifier. I tested this first as a troubleshooting step and the classifications look pretty good. Greengenes has its issues (mainly being 7 years old) but if you can look past those this would be the fastest way to proceed with an out-of-the-box solution.
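For the line-ending conversion in the vsearch step, something like the following minimal Python sketch should work on the exported FASTA file. The function name and file paths are placeholders I made up for illustration; the export and re-import themselves are still done with the usual QIIME 2 tools.

```python
def to_unix_line_endings(src_path, dst_path):
    """Copy a file, converting Windows (CRLF) and old Mac (CR) line endings to LF."""
    # Read and write as bytes so Python does no newline translation of its own.
    with open(src_path, "rb") as src:
        data = src.read()
    # Replace CRLF first, then any stray CR that is left over.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    with open(dst_path, "wb") as dst:
        dst.write(data)
```

For example, `to_unix_line_endings("sequences.fasta", "sequences-unix.fasta")` would write a converted copy next to the original (both file names are hypothetical), which you can then re-import.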
Good luck!