Sure! I preprocessed the SILVA database with RESCRIPt. I pretty much followed this. More specifically, after getting SILVA data (NR99, version 138) with `qiime rescript get-silva-data`:
- Remove low-quality seqs with `qiime rescript cull-seqs`
- Dereplicate identical seqs with `qiime rescript dereplicate`
- Extract the specific region from the ref-seqs for each amplicon, based on the primer sequences, with `qiime feature-classifier extract-reads`
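As a rough sketch, the three steps above might look like the following. The file names, the region label, and the 515F/806R primer pair are placeholder assumptions rather than the actual values used here, and all optional parameters are left at their defaults. The `run` helper is hypothetical and only prints each command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch of the preprocessing steps; every file name and primer is a placeholder.
set -e

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Steps shared by all amplicons: quality culling, then dereplication.
clean_and_dereplicate() {
  run qiime rescript cull-seqs \
    --i-sequences silva-138-nr99-seqs.qza \
    --o-clean-sequences silva-138-nr99-seqs-cleaned.qza

  run qiime rescript dereplicate \
    --i-sequences silva-138-nr99-seqs-cleaned.qza \
    --i-taxa silva-138-nr99-tax.qza \
    --o-dereplicated-sequences silva-138-nr99-seqs-derep.qza \
    --o-dereplicated-taxa silva-138-nr99-tax-derep.qza
}

# Per-amplicon step. Arguments: region label, forward primer, reverse primer.
extract_region() {
  run qiime feature-classifier extract-reads \
    --i-sequences silva-138-nr99-seqs-derep.qza \
    --p-f-primer "$2" \
    --p-r-primer "$3" \
    --o-reads "silva-138-nr99-$1.qza"
}

clean_and_dereplicate
# Assumed example amplicon: V4 with the 515F/806R primers.
extract_region v4 GTGYCAGCMGCCGCGGTAA GGACTACNVGGGTWTCTAAT
```

Calling `extract_region` once per primer pair would give one extracted reference per amplicon.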
After these steps I have a high-quality dereplicated SILVA database for each amplicon*. Running each of these preprocessed databases through `qiime sidle prepare-extracted-region` in parallel gives me the `--i-kmer-map` artifacts referenced in the `qiime sidle reconstruct-counts` command.
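For illustration, the per-amplicon Sidle step could be sketched as below, with one background job per region. The region labels, primers, trim length, and file names are all placeholder assumptions, and the `run` helper is hypothetical: it only prints each command unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch of running prepare-extracted-region once per amplicon, in parallel.
set -e

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Arguments: region label, forward primer, reverse primer, trim length.
prepare_region() {
  run qiime sidle prepare-extracted-region \
    --i-sequences "silva-138-nr99-$1.qza" \
    --p-region "$1" \
    --p-fwd-primer "$2" \
    --p-rev-primer "$3" \
    --p-trim-length "$4" \
    --o-collapsed-kmers "silva-138-nr99-$1-kmers.qza" \
    --o-kmer-map "silva-138-nr99-$1-map.qza"
}

# Start one job per amplicon (placeholder regions and primers), then wait
# on each job so that a failure in any region stops the script.
pids=""
prepare_region v4 GTGYCAGCMGCCGCGGTAA GGACTACNVGGGTWTCTAAT 100 &
pids="$pids $!"
prepare_region v3v4 CCTACGGGNGGCWGCAG GACTACHVGGGTATCTAATCC 100 &
pids="$pids $!"
for pid in $pids; do
  wait "$pid"
done
```

Each resulting `*-map.qza` file is the kind of artifact that would then be passed as an `--i-kmer-map` input.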
*Because I am running other analyses using these very same preprocessed databases, I am confident they are fine.
I hope this clarifies the situation. Thank you for your help!