Redoing analysis from QIIME 1, Archaea has mostly disappeared

SoilRotifer · October 14, 2022, 7:44pm

This is interesting. Perhaps we can do some sanity checking here. Can you classify your reads against the full-length classifier, i.e. skip the feature-classifier extract-reads step? If the results make more sense than those from your amplicon-specific classifier, that would tell me this is a PCR primer search and extraction issue.
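To make the comparison concrete, here is a minimal sketch of how you could tally the share of features assigned to Archaea in each classification result. It assumes you have exported each taxonomy artifact to a tab-separated file with `Feature ID`, `Taxon`, and `Confidence` columns, and that archaeal labels start with `d__Archaea` (as in recent SILVA-based taxonomies). The function name and the example tables are made up for illustration. Note that it counts features, not read abundances.

```python
import csv
import io


def archaea_fraction(taxonomy_tsv_text, prefix="d__Archaea"):
    """Return the fraction of features whose Taxon string starts with `prefix`.

    `taxonomy_tsv_text` is the text of a tab-separated taxonomy table with a
    header row containing at least a 'Taxon' column (an assumption about how
    the classification results were exported).
    """
    reader = csv.DictReader(io.StringIO(taxonomy_tsv_text), delimiter="\t")
    rows = list(reader)
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row["Taxon"].strip().startswith(prefix))
    return hits / len(rows)


# Made-up tables standing in for the two classification results.
full_length_result = (
    "Feature ID\tTaxon\tConfidence\n"
    "f1\td__Archaea; p__Crenarchaeota\t0.98\n"
    "f2\td__Bacteria; p__Proteobacteria\t0.99\n"
    "f3\td__Archaea; p__Euryarchaeota\t0.91\n"
    "f4\td__Bacteria; p__Firmicutes\t0.97\n"
)
amplicon_specific_result = (
    "Feature ID\tTaxon\tConfidence\n"
    "f1\tUnassigned\t0.40\n"
    "f2\td__Bacteria; p__Proteobacteria\t0.99\n"
    "f3\tUnassigned\t0.35\n"
    "f4\td__Bacteria; p__Firmicutes\t0.97\n"
)

full_fraction = archaea_fraction(full_length_result)
amplicon_fraction = archaea_fraction(amplicon_specific_result)
print(f"full-length classifier: {full_fraction:.2f}")
print(f"amplicon-specific classifier: {amplicon_fraction:.2f}")
```

A large gap between the two numbers on your real data would point toward the extraction step rather than the reads themselves.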

Sometimes, when using PCR-primer-based extraction of an amplicon region, you can lose some data, likely because some of the reference sequences you need lack the portion of sequence required for a primer match. You can try our new extract-seq-segments function in RESCRIPt to improve the extraction of amplicon segments from reference sequences that may be missing one or both of your primer regions.
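As a toy illustration of why this happens (not how extract-reads is actually implemented, which tolerates mismatches and degenerate bases), the sketch below uses exact string matching with made-up primers and sequences. A reference that is truncated before the reverse primer site yields nothing, even though it still contains most of the amplicon region.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def extract_amplicon(reference, fwd_primer, rev_primer):
    """Return the region between the two primer sites, or None if either is missing.

    Exact matching only; a simplification for illustration.
    """
    start = reference.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so look for its
    # reverse complement downstream of the forward primer.
    rev_site = reverse_complement(rev_primer)
    end = reference.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return reference[start + len(fwd_primer):end]


# Hypothetical primers and reference sequences.
fwd = "ACGTAC"
rev = "CCTGAA"  # its reverse complement, TTCAGG, is the site on the reference
full_ref = "GG" + fwd + "TTTTGGGGCCCC" + reverse_complement(rev) + "AA"
truncated_ref = full_ref[:22]  # ends partway through the reverse primer site

full_hit = extract_amplicon(full_ref, fwd, rev)
truncated_hit = extract_amplicon(truncated_ref, fwd, rev)
print(full_hit)
print(truncated_hit)
```

The truncated reference is dropped entirely here, which is the kind of loss a similarity-based approach such as extract-seq-segments is meant to reduce.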

If neither approach works, perhaps try curating your own SILVA reference database, as outlined here, to optimize for the retention and classification of archaeal sequences.

Let us know what you find!