Dear qiime2 community!
I still have not figured out how to solve the problem of mismatched tree tips and sequence tables. This was also discussed here.
The user with the same problem did not post the solution.
The error I get is the same as in that post:
/usr/appli/freeware/miniconda/3.6/envs/qiime2-2018.2/lib/python3.5/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype int64 was converted to bool by check_pairwise_arrays.
Traceback (most recent call last):
  File "/usr/appli/freeware/miniconda/3.6/envs/qiime2-2018.2/lib/python3.5/site-packages/q2_diversity/_alpha/_method.py", line 46, in alpha_phylogenetic
  File "/usr/appli/freeware/miniconda/3.6/envs/qiime2-2018.2/lib/python3.5/site-packages/skbio/diversity/_driver.py", line 170, in alpha_diversity
    counts, otu_ids, tree, validate, single_sample=False)
  File "/usr/appli/freeware/miniconda/3.6/envs/qiime2-2018.2/lib/python3.5/site-packages/skbio/diversity/alpha/_faith_pd.py", line 136, in _setup_faith_pd
    _validate_otu_ids_and_tree(counts, otu_ids, tree)
  File "/usr/appli/freeware/miniconda/3.6/envs/qiime2-2018.2/lib/python3.5/site-packages/skbio/diversity/_util.py", line 106, in _validate_otu_ids_and_tree
otu_ids must be present as tip names in
otu_ids not corresponding to tip names (n=23324): 4b0f96635e87ecb3e7903c0b4ab0bfb2abe2856a 5a299a483a1212f879ee358b2ec00495c256ef1f [omitting feature_ids]
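The comparison that fails in this traceback can be reproduced outside of QIIME 2 by comparing the two sets of names directly. Below is a minimal Python sketch, not the scikit-bio implementation. The helper names `newick_tip_names` and `ids_missing_from_tree` are hypothetical, the Newick parsing is a simple regular expression that only handles plain or single-quoted labels, and the example tree is made up around the two IDs quoted in the error message.

```python
import re

# Tip labels follow "(" or "," in Newick; internal node labels follow ")".
_TIP_PATTERN = re.compile(r"[(,]\s*('[^']*'|[^():,;']+)")


def newick_tip_names(newick):
    """Return the set of tip labels found in a Newick string.

    Simplified parser: handles unquoted and single-quoted labels only.
    """
    return {label.strip().strip("'") for label in _TIP_PATTERN.findall(newick)}


def ids_missing_from_tree(feature_ids, tip_names):
    """Return the feature IDs that do not appear as tip names, sorted."""
    return sorted(set(feature_ids) - set(tip_names))


# The two IDs quoted in the error message.
feature_ids = [
    "4b0f96635e87ecb3e7903c0b4ab0bfb2abe2856a",
    "5a299a483a1212f879ee358b2ec00495c256ef1f",
]
# Made-up tree: the first tip matches, the second carries extra text.
example_tree = (
    "(4b0f96635e87ecb3e7903c0b4ab0bfb2abe2856a:0.1,"
    "'5a299a483a1212f879ee358b2ec00495c256ef1f extra':0.2);"
)

tips = newick_tip_names(example_tree)
missing = ids_missing_from_tree(feature_ids, tips)
print("n=%d missing: %s" % (len(missing), missing))
```

Run against the exported feature IDs and the exported tree file, a check like this should list the same IDs that the error message reports.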
After this command, the feature_ids in the table and in the sequences still have the same names:
qiime vsearch dereplicate-sequences
[I checked by exporting the files and manually checking some of the feature_ids.]
Hence, somewhere during these commands the feature_ids get changed:
qiime alignment mafft
qiime alignment mask
qiime phylogeny fasttree
qiime phylogeny midpoint-root
qiime tools export
I played around with my data and used the tools
feature-table summarize (the third tab, ‘Feature Detail’) and
feature-table tabulate-seqs to visualize the names of my feature_ids. I then exported my tree and viewed it in MEGA. There I found why the tips do not match.
The tips of the tree are all changed in the same way:
UU3micro-18S-12_S14_L001 is the name of one of the FASTA files I used, and _132201 is also added. It is not the number of reads.
The problem is that I do not know where the FASTA filename is added, or why. When I used mafft and fasttree outside of the pipeline, they never added names to the sequences. I think this is the only reason that I cannot get the qiime diversity core-metrics-phylogenetic command to run.
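To see exactly what has been added to each tip, the unmatched tip labels can be searched for an embedded feature ID. The following Python sketch is only an illustration under two assumptions: feature IDs are 40-character hexadecimal strings like the ones in the error message, and the extra text sits next to the ID inside the tip label. The function name `explain_mismatched_tips` and the label layout in the example are hypothetical.

```python
import re

# Assumption: feature IDs are 40-character hexadecimal strings, like the
# ones quoted in the error message.
_HASH_PATTERN = re.compile(r"[0-9a-f]{40}")


def explain_mismatched_tips(feature_ids, tip_names):
    """Map each unmatched tip label to (feature_id, extra_text).

    Only tips that contain a known feature ID as a substring are reported.
    """
    known = set(feature_ids)
    report = {}
    for tip in tip_names:
        if tip in known:
            continue  # exact match, nothing to explain
        for candidate in _HASH_PATTERN.findall(tip):
            if candidate in known:
                # Whatever remains after removing the ID is the added text.
                extra = tip.replace(candidate, "", 1).strip(" _")
                report[tip] = (candidate, extra)
                break
    return report


# Hypothetical tip label layout: ID, a space, then the added text.
example_ids = ["4b0f96635e87ecb3e7903c0b4ab0bfb2abe2856a"]
example_tips = [
    "4b0f96635e87ecb3e7903c0b4ab0bfb2abe2856a UU3micro-18S-12_S14_L001_132201"
]

for tip, (fid, extra) in explain_mismatched_tips(example_ids, example_tips).items():
    print(fid, "->", repr(extra))
```

If every unmatched tip turns out to be a feature ID plus a filename-like suffix, that would suggest the extra text comes from the sequence headers rather than from the tree-building step itself.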
I also tried the command recommended in the moving pictures tutorial:
qiime phylogeny align-to-tree-mafft-fasttree
but I get this error (I should update my qiime version…):
Error: QIIME 2 plugin ‘phylogeny’ has no action ‘align-to-tree-mafft-fasttree’.
So thanks for your help so far! I am struggling to export the mafft data, so until I figure that out I cannot tell where the names are changed, or why. It is also possible that the export itself changes the tip names and the error lies somewhere else.
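One way to look inside an intermediate artifact without going through the export command: a .qza file is a zip archive, with the payload stored in a data/ directory beneath a top-level UUID directory. The Python sketch below lists and reads those files with the standard zipfile module. The function names are hypothetical, and the file names inside data/ depend on the artifact type, so they are listed rather than assumed.

```python
import zipfile


def list_artifact_data(qza_path):
    """Return the archive members stored under <uuid>/data/ in a .qza file."""
    with zipfile.ZipFile(qza_path) as archive:
        members = []
        for name in archive.namelist():
            parts = name.split("/")
            # Keep files whose second path component is "data".
            if len(parts) > 2 and parts[1] == "data" and not name.endswith("/"):
                members.append(name)
        return members


def read_artifact_member(qza_path, member):
    """Return the text content of one archive member."""
    with zipfile.ZipFile(qza_path) as archive:
        return archive.read(member).decode("utf-8")
```

Calling `list_artifact_data` on the alignment artifact, then `read_artifact_member` on the returned path, shows the sequence headers exactly as they are stored, which would separate a naming change in mafft from one introduced during export.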
I will also post the solution if I can find it, and I will update my Qiime2 version (it is from February)!