Hi @hzh0005! The poor taxonomic resolution you’re seeing depends on many factors. Can you please provide the following info so that I can better assist you?
What QIIME 2 release are you using? You can find that out by running qiime info.
Is your sequencing data 16S, ITS, or something else?
Which variable region(s) did you amplify to produce the sequencing data? Which primers did you use?
Which reference database are you using to perform taxonomic classification? Did you use one of the pre-trained classifiers or did you train one yourself?
What are the exact commands you ran to produce taxonomy.qza?
Citing QIIME 2
If you use QIIME 2 in any published work, you should cite QIIME 2 and the plugins that you used. To display the citations for QIIME 2 and all installed plugins, run:
qiime info --citations
The database recommended in the “Moving Pictures” tutorial was used:
gg-13-8-99-515-806-nb-classifier.qza
The V3/V4 regions of 16S were sequenced, and the primer sequences are listed in the attached file “map.txt”.
Thanks for the info @hzh0005! The pre-trained classifier you’re using was trained on the V4 region (515F/806R), and your data are V3/V4. I recommend training a classifier that will fit your data set. Check out the q2-feature-classifier tutorial for instructions on how to train a classifier using your primer pairs, as well as trimming the reference reads to match the length of your sequences.
You might also try out the pre-trained full-length Greengenes classifier available on the data resources page, though taxonomic classification accuracy generally improves when training a classifier on only the region that was sequenced.
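If you do train your own classifier, the workflow from the q2-feature-classifier tutorial looks roughly like the sketch below. The primer sequences shown are the common V3/V4 pair (341F/805R) used here only as placeholders — substitute the exact primers from your map.txt — and the input/output filenames are assumptions, not fixed names:

```shell
# Extract the V3/V4 region from the full-length Greengenes reference reads.
# Primers below are the common 341F/805R pair as placeholders -- replace
# them with the primers listed in your map.txt.
qiime feature-classifier extract-reads \
  --i-sequences 99_otus.qza \
  --p-f-primer CCTACGGGNGGCWGCAG \
  --p-r-primer GACTACHVGGGTATCTAATCC \
  --o-reads ref-seqs-v3v4.qza

# Train a naive Bayes classifier on the extracted region.
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs-v3v4.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --o-classifier v3v4-classifier.qza
```

Trimming the reference reads to your amplified region this way is what usually recovers the lost taxonomic depth.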
I used the database "silva-119-99-nb-classifier.qza" to run classification again, but got the following errors:
Traceback (most recent call last):
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/commands.py", line 224, in __call__
    results = action(**arguments)
  File "<string>", line 2, in classify_sklearn
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 222, in bound_callable
    spec.view_type, recorder)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/result.py", line 257, in _view
    result = transformation(self._archiver.data_dir)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/core/transform.py", line 59, in transformation
    new_view = transformer(view)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_feature_classifier/_taxonomic_classifier.py", line 72, in _1
    pipeline = joblib.load(os.path.join(dirname, 'sklearn_pipeline.pkl'))
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/pickle.py", line 1043, in load
    dispatch[key[0]](self)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 341, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 184, in read
    array = self.read_array(unpickler)
  File "/home/hzh0005/.conda/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 130, in read_array
    array = unpickler.np.empty(count, dtype=self.dtype)
MemoryError
It looks like you are running out of memory (that's what the MemoryError above means). How much available memory (RAM) do you have? The Silva classifiers need substantially more RAM than the Greengenes classifiers, which is likely why this run failed where the Greengenes one succeeded. If you don't have access to a machine with more memory, you could experiment with the classify-sklearn parameters that reduce memory usage, such as lowering the number of reads processed per batch (run qiime feature-classifier classify-sklearn --help to see the exact parameter names in your release).
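One way to try this is sketched below. The flag names shown are from recent QIIME 2 releases (older releases such as 2017.12 may name the batch-size parameter differently, so check --help), and the reads filename is a placeholder for your representative sequences artifact:

```shell
# Classify with fewer reads held in memory at a time to reduce peak RAM.
# --p-reads-per-batch is the flag name in recent QIIME 2 releases; older
# releases used a different name for this parameter -- check --help.
qiime feature-classifier classify-sklearn \
  --i-classifier silva-119-99-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --p-reads-per-batch 1000 \
  --o-classification taxonomy.qza
```

Smaller batches trade speed for memory; the classifier itself must still fit in RAM, so this helps most when the overflow happens during classification rather than while loading the model.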
Are you still seeing low classification depth after switching to an appropriate classifier? If so, please post the exact commands you used to classify (and to train the classifier, if you trained your own), along with the output of qiime metadata tabulate run on your taxonomy artifact.
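That tabulate step looks like this, assuming your taxonomy artifact is named taxonomy.qza:

```shell
# Render the taxonomy assignments as a sortable table for inspection.
qiime metadata tabulate \
  --m-input-file taxonomy.qza \
  --o-visualization taxonomy.qzv
```

You can then view taxonomy.qzv at view.qiime2.org and attach it to your reply.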