Diversity core metrics of Sanger sequencing data:

Masanobu_Hiraoka · May 20, 2020, 10:47pm

Hi,

I am a beginner with QIIME 2 and would appreciate your advice.

I want to analyze my Sanger sequencing data with the QIIME 2 pipeline.

After importing a combined FASTA file, which I created with the "add_qiime_labels.py" command in a QIIME 1 environment, I ran "vsearch" in QIIME 2 for dereplicating/clustering.
Using rep-seqs.qza and table.qza, I generated a tree for phylogenetic analyses (aligned-rep-seqs.qza, masked-aligned-rep-seqs.qza, unrooted-tree.qza, rooted-tree.qza).
When I proceeded to the next step, the core diversity metrics, I got stuck.
Could you give me some advice?

[Alpha and beta diversity analysis]
(“Moving Pictures” tutorial — QIIME 2 2020.2.0 documentation)

command:

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table.qza \
  --p-sampling-depth 47 \
  --m-metadata-file example_mapping4.txt \
  --output-dir core-metrics-results

error message:
...
Plugin error from diversity:

All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=903):

I looked through similar topics about this error message (filtering?), but I could not identify the solution.

Thanks,

--
env;

VirtualBox 6.1.6, QIIME 2 Core - 2020.2 on Windows 10 Pro.

Miniconda 4.6.14 (downgraded to set up QIIME 1 alongside QIIME 2)

qiime1 works:
Qiime 1 Forum › post split fasta file import

The mapping file was validated with no errors.

Nicholas_Bokulich · May 28, 2020, 2:15am

Hi @Masanobu_Hiraoka,
I apologize for the delay in responding to your question.

It sounds like probably none of the feature IDs in your table match the tree. I am not sure why this would be, but you could inspect the feature IDs to get an idea of what is going wrong. This post shows how to inspect these IDs using python:

Give that a try and let me know what you see. Another step to troubleshoot is to examine the provenance in your tree and table artifacts to make sure that these are the right files, generated from the same source data (it is easy to mix up files with names like table.qza!)
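
A minimal sketch of that comparison, assuming the tree tip names and table feature IDs have already been extracted in a QIIME 2 environment (the commented lines show one way, using rooted-tree.qza and table.qza from this thread). The helper name `compare_ids` and the toy IDs are purely illustrative and not part of QIIME 2:

```python
def compare_ids(table_ids, tree_ids):
    """Summarize how table feature IDs overlap with tree tip names."""
    table_ids, tree_ids = set(table_ids), set(tree_ids)
    return {
        "shared": table_ids & tree_ids,
        # Table features with no matching tip are what the error complains about.
        "only_in_table": table_ids - tree_ids,
        "only_in_tree": tree_ids - table_ids,
    }


# In a QIIME 2 environment the inputs could be obtained like this:
#   import qiime2, skbio, biom
#   tree = qiime2.Artifact.load('rooted-tree.qza').view(skbio.TreeNode)
#   table = qiime2.Artifact.load('table.qza').view(biom.Table)
#   tree_ids = {t.name for t in tree.tips()}
#   table_ids = set(table.ids(axis='observation'))

# Toy IDs (made up) standing in for real feature IDs and tip names.
report = compare_ids(["f1", "f2", "f3"], ["f2", "f3", "f4"])
for name, ids in report.items():
    print(name, len(ids), sorted(ids))
```

If "shared" comes back empty, the table and tree most likely do not use the same IDs at all, which is worth knowing before trying any filtering.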

Let me know what you find. Please share those files here if you are still having trouble.

Good luck!

Masanobu_Hiraoka · May 28, 2020, 5:21am

Thank you @Nicholas_Bokulich ,

Is this a Python command?
As far as I understand, these commands import three modules (qiime2, skbio, and biom) and assign the results of the expressions to variables (tree and table, tree_ids and table_ids), right?

I entered those commands, adapted to my files, into the Ubuntu console in VirtualBox on Windows 10, and these error messages came back.

import qiime2
import-im6.q16: not authorized `qiime2' @ error/constitute.c/WriteImage/1037.
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ import skbio
import-im6.q16: not authorized `skbio' @ error/constitute.c/WriteImage/1037.
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ import biom
import-im6.q16: not authorized `biom' @ error/constitute.c/WriteImage/1037.
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ tree = qiime2.Artifact.load('rooted-tree.qza').view(skbio.TreeNode)
bash: syntax error near unexpected token `('
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ table = qiime2.Artifact.load('table.qza').view(biom.Table)
bash: syntax error near unexpected token `('
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ tree_ids = {t.name for t in tree.tips()}
bash: syntax error near unexpected token `('
(qiime2-2020.2) qiime2@qiime2core2020-2:~/qiime2-wk08$ table_ids = set(table.ids(axis='observation'))
bash: syntax error near unexpected token `('

Is this the wrong way to do it?

Thanks,

Nicholas_Bokulich · May 28, 2020, 5:23am

Hi @Masanobu_Hiraoka,

Correct. Type "python" into your terminal and hit enter to open the python interpreter. Enter those lines, and use control + d to exit and return to the bash terminal.

Masanobu_Hiraoka · May 28, 2020, 2:02pm

Hi @Nicholas_Bokulich,

Thank you for your kind instruction.
I saved tree_ids and table_ids as .csv files and inspected them.

Each tree_ids string consists of 40 seemingly random alphanumeric characters + sample ID + _ followed by 3-4 digits.

Each table_ids string also has 40 characters.

I searched for those 40-character strings in a single Excel sheet into which I had pasted table_ids and tree_ids in separate rows.

I could find the same 40-character strings in both the tree_ids and table_ids rows, but they were not in the same order.

For example, the string "c900bd8ac2abff864c8237fb1c6451b05541607a" was located in column A of the table_ids row, while the string "3d0060441a525003579ecba52a641ee8137c5f80 wk07_3383" was found in column AHP of the tree_ids row.

What do you think? Is this the reason for the error?

Masanobu_Hiraoka · May 28, 2020, 2:02pm

Let me add a comment to my previous post.

When I made rooted-tree.qza, I used the command "qiime phylogeny align-to-tree-mafft-fasttree", according to the Moving Pictures tutorial.

In the list of available plugins in the docs, I could only find the command "qiime alignment mafft".
https://docs.qiime2.org/2020.2/plugins/available/alignment/mafft/

Was it wrong to use the command from the "Moving Pictures" tutorial?

Thanks,

Nicholas_Bokulich · May 28, 2020, 2:35pm

Hi @Masanobu_Hiraoka,

The order does not matter, but the number does. Maybe you filtered some sequences out before building the tree?
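
To illustrate why order is irrelevant while the exact strings are not, here is a small sketch using Python sets. It assumes, based only on the example strings reported above, that tip names look like "<40-character hash> <label>" while table IDs are the bare hash; whether that holds for every ID in your files is something to verify. The helper names and the choice of table ID are hypothetical:

```python
def first_token(name):
    """Return the part of an ID before the first whitespace (whole ID if none)."""
    parts = name.split(None, 1)
    return parts[0] if parts else name


def overlap(table_ids, tip_names):
    """Compare IDs as exact strings and by their first token only."""
    table_ids = set(table_ids)
    exact = table_ids & set(tip_names)
    by_first_token = table_ids & {first_token(n) for n in tip_names}
    return exact, by_first_token


# Tip name copied from the example above.
tip_names = ["3d0060441a525003579ecba52a641ee8137c5f80 wk07_3383"]
# Assumption for illustration: the same hash also appears as a table ID.
table_ids = ["3d0060441a525003579ecba52a641ee8137c5f80"]

exact, by_hash = overlap(table_ids, tip_names)
print("exact matches:", len(exact))
print("matches on hash only:", len(by_hash))
```

Sets ignore ordering, so the position of an ID in a spreadsheet row has no effect. A trailing label, however, makes two strings different, so a tip name with a label would not count as a match for the bare hash.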

You are looking in the wrong plugin. The pipeline that you used is here:
https://docs.qiime2.org/2020.2/plugins/available/phylogeny/align-to-tree-mafft-fasttree/

It's also possible that that pipeline is filtering out features that are present in the table. You can use this method to remove those features from the table:
https://docs.qiime2.org/2020.2/plugins/available/fragment-insertion/filter-features/
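
Conceptually, that step keeps only the table features that are also tips in the tree. A rough sketch of the idea in plain Python follows; it is not the plugin's actual implementation, and `filter_table_to_tree` and the dict-based table are hypothetical stand-ins for the real artifacts:

```python
def filter_table_to_tree(table, tip_names):
    """Split a {feature_id: {sample_id: count}} table by tree membership."""
    tips = set(tip_names)
    kept = {fid: counts for fid, counts in table.items() if fid in tips}
    removed = {fid: counts for fid, counts in table.items() if fid not in tips}
    if not kept:
        # With no shared IDs at all there is nothing left to keep.
        raise ValueError("no table feature is a tip in the tree; "
                         "the filtered table would be empty")
    return kept, removed


# Toy table (made-up IDs and counts).
table = {"f1": {"s1": 3, "s2": 0}, "f2": {"s1": 1, "s2": 5}}

kept, removed = filter_table_to_tree(table, ["f2", "f9"])
print("kept:", sorted(kept), "removed:", sorted(removed))

# When the table and tree share no IDs, filtering cannot help.
try:
    filter_table_to_tree(table, ["x1", "x2"])
    empty_case_failed = False
except ValueError as err:
    empty_case_failed = True
    print("error:", err)
```

Filtering only helps when some IDs are shared. If none are, the mismatch lies in how the IDs were generated rather than in missing features.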

Good luck!

Masanobu_Hiraoka · May 28, 2020, 7:41pm

Hi, @Nicholas_Bokulich,

I applied this command to table and rooted tree files made by "phylogeny align-to-tree-mafft-fasttree".

qiime fragment-insertion filter-features \
  --i-table table.qza \
  --i-tree rooted-tree.qza \
  --o-filtered-table filtered-table.qza \
  --o-removed-table removed-table.qza \
  --output-dir fragment-insertion-filter-results

I assume the output arguments (filtered-table.qza, removed-table.qza) should have the .qza extension?
And this error message came back:

Plugin error from fragment-insertion:

Not a single fragment of your table is part of your tree. The resulting table would be empty.

Debug info has been saved to /tmp/qiime2-q2cli-err-iid85fb7.log

The log file showed:

Traceback (most recent call last):
  File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/decorator.py:decorator-gen-299>", line 2, in filter_features
  File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
    output_types, provenance)
  File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 390, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/q2_fragment_insertion/_insertion.py", line 238, in filter_features
    raise ValueError(('Not a single fragment of your table is part of your'
ValueError: Not a single fragment of your table is part of your tree. The resulting table would be empty.

How can I fix it?

Thanks,

Nicholas_Bokulich · May 28, 2020, 7:46pm

Could you share your tree and table here so that we can inspect them?

Nicholas_Bokulich · May 30, 2020, 2:23am

Hi @Masanobu_Hiraoka,
Thanks for sharing the data and commands — this allowed me to track down the error.

It looks like the issue is not due to the tree-building pipeline, but rather a known bug related to the use of dereplicate-sequences without further clustering.

This issue — and possible workarounds — is also described here:

Using cluster-features-de-novo should get you moving, even if you cluster at 100% similarity.
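
As a sketch of what that could look like when driven from Python, the snippet below only assembles the command line for `qiime vsearch cluster-features-de-novo` and prints it. The input names rep-seqs.qza and table.qza come from this thread; the output file names and the helper functions are hypothetical, and nothing is executed unless `run` is called explicitly:

```python
import shutil
import subprocess


def build_cluster_command(sequences, table, perc_identity,
                          out_table, out_sequences):
    """Assemble the argument list for de novo clustering with q2-vsearch."""
    if not 0.0 < perc_identity <= 1.0:
        raise ValueError("perc_identity must be in (0, 1]")
    return [
        "qiime", "vsearch", "cluster-features-de-novo",
        "--i-sequences", sequences,
        "--i-table", table,
        "--p-perc-identity", str(perc_identity),
        "--o-clustered-table", out_table,
        "--o-clustered-sequences", out_sequences,
    ]


def run(cmd):
    """Run the command only if the qiime executable is on PATH."""
    if shutil.which(cmd[0]) is None:
        print("qiime not found on PATH; command not run")
        return None
    return subprocess.run(cmd, check=True)


# Output names below are placeholders, not names used in this thread.
cmd = build_cluster_command("rep-seqs.qza", "table.qza", 1.0,
                            "table-dn-100.qza", "rep-seqs-dn-100.qza")
print(" ".join(cmd))
```

The tree and the core metrics would then presumably need to be regenerated from the clustered table and sequences rather than from the original dereplicated files.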

This is an unrelated issue. I do not see anything in the QIIME 2 part of your pipeline that would cause this issue (since you are just dereplicating the input sequences, no filtering involved), so I think you are right to look to the qiime1 forum to solve this issue... if you rule out qiime1 as the issue, please open a new topic to get help for this issue. Thanks!

Good luck! I hope my workaround proves suitable to you.

Masanobu_Hiraoka · May 31, 2020, 2:58am

Thank you @Nicholas_Bokulich,

As you suggested, after adding de novo clustering (at 97% identity, following my previous analysis), I was able to run both the alpha and beta diversity analyses with the new table and sequence files.

I started these analyses on April 26, and it took one month to get through the QIIME 2 pipeline.
I appreciate the help of my QIIME 2 forum colleagues.

If the problem with the sample read numbers remains, I will open another topic and ask!

Thanks,
