Hi there,
I recently migrated from mothur to QIIME 2. I am processing the same ITS sequence data that I processed with mothur, and I am using the same UNITE database for ITS classification. With the mothur pipeline I get good family- and genus-level resolution, but with QIIME 2 more than 90% of the sequences are not classified beyond "kingdom; fungi". I am a bit puzzled and wonder whether I have done something incorrectly along the way. I have posted my command log below. Can anyone help me see where I have gone wrong?
I am using qiime2-2019.10, installed through conda on Ubuntu.
qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path //path_to_file/samples.txt \
  --input-format PairedEndFastqManifestPhred33V2 \
  --output-path /path_to_file_output/samples_demux-paired-end.qza
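For context, the file passed to --input-path is a tab-separated manifest. Below is a minimal sketch of the layout that PairedEndFastqManifestPhred33V2 expects. The sample IDs and FASTQ paths are hypothetical placeholders, not my real files, and the column check is just a quick sanity test rather than anything QIIME 2 provides.

```shell
# Write a small example manifest (sample IDs and paths are made up for illustration).
manifest=$(mktemp)
printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "$manifest"
printf 'sampleA\t/hypothetical/reads/sampleA_R1.fastq.gz\t/hypothetical/reads/sampleA_R2.fastq.gz\n' >> "$manifest"
printf 'sampleB\t/hypothetical/reads/sampleB_R1.fastq.gz\t/hypothetical/reads/sampleB_R2.fastq.gz\n' >> "$manifest"

# Every row, including the header, should have exactly three tab-separated columns.
awk -F'\t' '
  NF != 3 { bad++ }
  END { if (bad) print "malformed rows: " bad; else print "manifest looks well-formed" }
' "$manifest"
```

The paths must be absolute, one forward and one reverse file per sample.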
qiime demux summarize \
  --i-data samples_demux-paired-end.qza \
  --o-visualization samples_demux.qzv
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences samples_demux-paired-end.qza \
  --p-cores 2 \
  --p-front-f GAACGCAGCRAANNGYGA \
  --p-front-r TCCTCCGCTTATTGATATGC \
  --p-adapter-r TCRCNNTTYGCTGCGTTC \
  --p-adapter-f GCATATCAATAAGCGGAGGA \
  --o-trimmed-sequences samples_demux-paired-end_PT.qza \
  --verbose
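The --p-adapter-f and --p-adapter-r sequences above are meant to be the reverse complements of the reverse and forward primers, respectively, so that read-through into the opposite primer is trimmed. A minimal sketch for double-checking them follows. The revcomp function is a hypothetical helper of my own, not part of QIIME 2 or cutadapt, and it assumes the rev utility is available.

```shell
# Reverse-complement a primer, including IUPAC ambiguity codes.
# S and W are their own complements, so they are left untouched.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBDHVN' 'TGCAYRMKVHDBN'
}

revcomp GAACGCAGCRAANNGYGA     # forward primer; compare with --p-adapter-r
revcomp TCCTCCGCTTATTGATATGC   # reverse primer; compare with --p-adapter-f
```

The two printed sequences should match the values given to --p-adapter-r and --p-adapter-f in the command above.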
qiime demux summarize \
  --i-data samples_demux-paired-end_PT.qza \
  --o-visualization samples_demux-paired-end_PT.qzv
DENOISE WITH DADA2 - no length truncation, as this is ITS
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs samples_demux-paired-end_PT.qza \
  --p-trunc-len-f 0 \
  --p-trunc-len-r 0 \
  --p-max-ee-f 2 \
  --p-max-ee-r 2 \
  --p-n-threads 2 \
  --o-table samples_demux-paired-end_PT_table-dada2.qza \
  --o-representative-sequences samples_demux-paired-end_PT_rep-seqs-dada2.qza \
  --o-denoising-stats samples_demux-paired-end_PT_denoising-stats.qza
qiime metadata tabulate \
  --m-input-file samples_demux-paired-end_PT_denoising-stats.qza \
  --o-visualization samples_demux-paired-end_PT_denoising-stats.qzv
REMOVE SINGLETONS FROM TABLE
qiime feature-table filter-features \
  --i-table samples_demux-paired-end_PT_table-dada2.qza \
  --p-min-frequency 2 \
  --o-filtered-table samples_demux-paired-end_PT_table-dada2_single.qza
REMOVE SINGLETONS FROM SEQUENCE FILE
qiime feature-table filter-seqs \
  --i-data samples_demux-paired-end_PT_rep-seqs-dada2.qza \
  --i-table samples_demux-paired-end_PT_table-dada2_single.qza \
  --o-filtered-data samples_demux-paired-end_PT_rep-seqs-dada2_single.qza
qiime feature-table summarize \
  --i-table samples_demux-paired-end_PT_table-dada2_single.qza \
  --o-visualization samples_demux-paired-end_PT_table-dada2_single.qzv
qiime feature-table tabulate-seqs \
  --i-data samples_demux-paired-end_PT_rep-seqs-dada2_single.qza \
  --o-visualization samples_demux-paired-end_PT_rep-seqs-dada2_single.qzv
MAKE THE UNITE DATABASE, following https://github.com/gregcaporaso/2017.06.23-q2-fungal-tutorial:
https://doi.org/10.15156/BIO/786334
Includes singletons set as RefS (in dynamic files).
sh_refs_qiime_ver8_dynamic_s_02.02.2019.fasta
sh_taxonomy_qiime_ver8_dynamic_s_02.02.2019.txt
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path sh_refs_qiime_ver8_dynamic_s_02.02.2019.fasta \
  --output-path UNITE_seqs_v8_dynamic_02022019.qza
Imported sh_refs_qiime_ver8_dynamic_s_02.02.2019.fasta as DNASequencesDirectoryFormat to UNITE_seqs_v8_dynamic_02022019.qza
qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path sh_taxonomy_qiime_ver8_dynamic_s_02.02.2019.txt \
  --output-path UNITE_tax_v8_dynamic_02022019.qza
Imported sh_taxonomy_qiime_ver8_dynamic_s_02.02.2019.txt as HeaderlessTSVTaxonomyFormat to UNITE_tax_v8_dynamic_02022019.qza
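Both imports completed without errors, but the classifier also needs every reference sequence ID to have a matching row in the taxonomy file. A minimal sketch for checking that on the raw files before import is below. The ids_missing_taxonomy function is a hypothetical helper of my own, and it assumes the FASTA header starts with the sequence ID and that the taxonomy file has the ID in its first tab-separated column.

```shell
# Print reference sequence IDs that have no entry in the taxonomy file.
# $1 = reference FASTA, $2 = headerless taxonomy TSV (ID <tab> taxonomy string).
ids_missing_taxonomy() {
  fasta_ids=$(mktemp)
  tax_ids=$(mktemp)
  # Keep only the ID part of each FASTA header (text up to the first whitespace).
  grep '^>' "$1" | sed 's/^>//; s/[[:space:]].*//' | sort -u > "$fasta_ids"
  cut -f1 "$2" | sort -u > "$tax_ids"
  # comm -23 keeps lines that appear only in the first file.
  comm -23 "$fasta_ids" "$tax_ids"
  rm -f "$fasta_ids" "$tax_ids"
}

# Tiny made-up example: SH2 has a sequence but no taxonomy row.
demo_dir=$(mktemp -d)
printf '>SH1\nACGT\n>SH2\nTTGA\n' > "$demo_dir/refs.fasta"
printf 'SH1\tk__Fungi;p__Ascomycota\n' > "$demo_dir/tax.txt"
ids_missing_taxonomy "$demo_dir/refs.fasta" "$demo_dir/tax.txt"
```

On the real files this would be run as ids_missing_taxonomy sh_refs_qiime_ver8_dynamic_s_02.02.2019.fasta sh_taxonomy_qiime_ver8_dynamic_s_02.02.2019.txt, and empty output would mean every sequence has a taxonomy entry.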
Train the classifier on this region
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads UNITE_seqs_v8_dynamic_02022019.qza \
  --i-reference-taxonomy UNITE_tax_v8_dynamic_02022019.qza \
  --o-classifier classifier_UNITE_v8.qza \
  --verbose
UserWarning: The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.21.2. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)
warnings.warn(warning, UserWarning)
Saved TaxonomicClassifier to: classifier_UNITE_v8.qza
Classify reads by taxon using a fitted classifier
qiime feature-classifier classify-sklearn \
  --i-classifier /path_to_database/qimme2_databases/UNITE/UNITEv8_DynamicClassifier/dynamic/classifier_UNITE_v8.qza \
  --i-reads samples_demux-paired-end_PT_rep-seqs-dada2_single.qza \
  --o-classification samples_demux-paired-end_PT_rep-seqs-dada2_single_classification.qza
qiime metadata tabulate \
  --m-input-file samples_demux-paired-end_PT_rep-seqs-dada2_single_classification.qza \
  --o-visualization samples_demux-paired-end_PT_rep-seqs-dada2_single_classification_taxonomy.qzv
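To put a number on how many features stop at kingdom level, the classification can be exported with qiime tools export and the resulting taxonomy TSV summarized. The sketch below assumes the exported file has a header row followed by Feature ID, Taxon and Confidence columns, with ranks separated by semicolons. The kingdom_only_summary function and the three example rows are hypothetical and are not my real results.

```shell
# Read a taxonomy TSV on stdin and count features whose Taxon string
# contains a single rank only (for example "k__Fungi" or "Unassigned").
kingdom_only_summary() {
  awk -F'\t' '
    NR == 1 { next }                        # skip the header row
    {
      total++
      n = split($2, ranks, ";")             # number of ranks in the Taxon column
      if (n <= 1) kingdom_only++
    }
    END { printf "%d of %d features have no rank below kingdom\n", kingdom_only, total }
  '
}

# Made-up example input with the assumed column layout.
printf 'Feature ID\tTaxon\tConfidence\nfeat1\tk__Fungi\t0.99\nfeat2\tk__Fungi;p__Ascomycota;c__Sordariomycetes\t0.85\nfeat3\tUnassigned\t0.60\n' |
  kingdom_only_summary
```

On real data the function would read the exported TSV instead, for example kingdom_only_summary < exported/taxonomy.tsv, where the exported directory name is a placeholder.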
I also tried:
- classifying with qiime feature-classifier classify-consensus-vsearch, as suggested on the forum.
- using the developer version of UNITE, as suggested on the forum.
In both cases the sequences were still unclassified.
Thanks for any help.