Performance and running time of classify-consensus-blast+

steff1088 · December 2, 2017, 5:01am

Thank you very much for your support @Nicholas_Bokulich @wasade. The samples are 16S Illumina reads, and I am running subsets first to test the workflow.

I used qiime feature-table filter-features to remove doubletons and singletons, and my sequence counts shrank drastically. Before I apply this command to all my samples, I want to make sure my overall workflow is right:

1. I imported already merged, paired-end, quality-filtered data in .fna format.
2. Dereplication with qiime vsearch dereplicate-sequences. This produced a feature table and dereplicated sequences.
3. I would then run qiime vsearch cluster-features-open-reference from q2-2017.11, which generates a clustered table and a clustered sequence file. A question here: could I, in theory, already use taxa barplot at this point, with SILVA's otu.qza as the taxonomy file?
4. To remove the singletons and doubletons, I applied qiime feature-table filter-features and then downloaded the .csv file of per-feature frequencies from the filtered table. I reformatted this file as .tsv and used it as input for qiime feature-table filter-seqs, which gave me a filtered sequence file. For both commands I used the clustered table and sequence file from the open-reference clustering; is that right? I did this instead of running deblur.
5. Classification with qiime feature-classifier classify-consensus-vsearch against 16Sonly_consensus_taxonomy_7_levels.qza from SILVA. As reference reads I used the same otu.qza file as in the clustering step. My query was the dereplicated, open-reference-clustered, filtered sequence file, which by this point contains far fewer sequences.
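
To make the filtering step concrete, here is a minimal Python sketch of what I understand "removing singletons and doubletons" to mean, and why the sequence file has to be filtered to the same feature IDs afterwards. It is not how QIIME 2 implements this: the feature IDs, counts, sequences, and function names below are all made up for illustration, and a minimum total frequency of 3 is assumed as the cutoff.

```python
# Toy feature table: feature ID -> total frequency across all samples.
# All IDs and counts are hypothetical.
feature_counts = {"otu_a": 120, "otu_b": 2, "otu_c": 1, "otu_d": 3}

# Toy representative sequences keyed by the same (hypothetical) feature IDs.
rep_seqs = {"otu_a": "ACGT", "otu_b": "ACGA", "otu_c": "TTGA", "otu_d": "GGCA"}


def drop_rare_features(counts, min_frequency=3):
    """Keep only features whose total frequency is at least min_frequency.

    With min_frequency=3, singletons (1) and doubletons (2) are removed.
    """
    return {fid: n for fid, n in counts.items() if n >= min_frequency}


def keep_matching_seqs(seqs, table):
    """Keep only sequences whose feature ID is still present in the table."""
    return {fid: seq for fid, seq in seqs.items() if fid in table}


filtered_table = drop_rare_features(feature_counts, min_frequency=3)
filtered_seqs = keep_matching_seqs(rep_seqs, filtered_table)

print(sorted(filtered_table))
print(sorted(filtered_seqs))
```

The point of the second function is that the filtered table and the filtered sequences should end up with exactly the same set of feature IDs before classification.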

To move on, I would use the output taxonomy and the filtered feature table to make the taxa barplot, correct? I can run this pipeline and report some sequence/feature counts to check; I just wanted to verify the workflow beforehand to avoid making a naive mistake somewhere.
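
As a rough sanity check for that last step, the sketch below shows the two things I assume a barplot needs: every feature ID in the filtered table must have a taxonomy assignment, and counts are then summed per taxon at a chosen level. This is plain Python on toy data, not the QIIME 2 implementation; the sample names, feature IDs, taxonomy strings, and function names are hypothetical.

```python
from collections import defaultdict

# Toy filtered feature table: sample -> {feature ID -> count}. Hypothetical values.
table = {
    "sample1": {"otu_a": 10, "otu_d": 5},
    "sample2": {"otu_a": 3, "otu_d": 7},
}

# Toy taxonomy: feature ID -> semicolon-separated ranks. Hypothetical values.
taxonomy = {
    "otu_a": "D_0__Bacteria;D_1__Firmicutes",
    "otu_d": "D_0__Bacteria;D_1__Proteobacteria",
}


def missing_taxonomy(table, taxonomy):
    """Return feature IDs that occur in the table but have no taxonomy entry."""
    table_ids = {fid for counts in table.values() for fid in counts}
    return sorted(table_ids - set(taxonomy))


def collapse_by_level(table, taxonomy, level):
    """Sum counts per sample over features sharing the first `level` ranks."""
    collapsed = {}
    for sample, counts in table.items():
        totals = defaultdict(int)
        for fid, n in counts.items():
            ranks = taxonomy[fid].split(";")
            totals[";".join(ranks[:level])] += n
        collapsed[sample] = dict(totals)
    return collapsed


unassigned = missing_taxonomy(table, taxonomy)
by_domain = collapse_by_level(table, taxonomy, level=1)
by_phylum = collapse_by_level(table, taxonomy, level=2)

print(unassigned)
print(by_phylum)
```

If `unassigned` is not empty, the table and the taxonomy were probably built from different sequence sets, which is the kind of mismatch I want to rule out before plotting.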