Errors when importing biomtable and tree files created by Ucluster into QIIME2

Both my feature table and my tree have 923 features. I found no mismatched features when I compared the feature table against the tree.
Feature IDs that look similar represent different sequences. For example, Jul27.50v_1972 is one of the sequences from the sample taken on July 27th at 50 m depth, and Jul27.50v_1414 is another sequence from the same sample.

The last piece of my command looks like this:

qiime feature-table summarize \
  --i-table /path/to/feature_table.qza \
  --o-visualization /path/to/feature_table.qzv \
  --m-sample-metadata-file /path/to/metadata

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny  /path/to/rooted_tree.qza \
  --i-table /path/to/feature_table.qza \
  --p-sampling-depth 380 \
  --m-metadata-file /path/to/metadata \
  --output-dir /path/to/core-metrics-results

Sorry for the unclear paste.
Do you mean that QIIME2 is confused by the feature IDs?

:+1:

Hmm, well, that isn't what QIIME 2 is seeing --- the error you posted above indicates that the feature IDs are mismatched between the two Artifacts.

In QIIME 2, ID matching is performed by directly comparing the ID string, character for character --- the IDs must match 100% in order to be collated.
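To make "character for character" concrete, here is a minimal sketch that compares two ID lists with plain Python sets. The function name `find_mismatched_ids` and the example IDs are made up for illustration; this is not QIIME 2's internal code, only the same idea of exact string comparison.

```python
def find_mismatched_ids(table_ids, tip_names):
    """Compare IDs as exact strings; no trimming or case folding is applied."""
    table_set, tip_set = set(table_ids), set(tip_names)
    only_in_table = sorted(table_set - tip_set)
    only_in_tree = sorted(tip_set - table_set)
    return only_in_table, only_in_tree


# Hypothetical IDs: the second pair differs only by a trailing space.
table_ids = ["feature_1", "feature_2"]
tip_names = ["feature_1", "feature_2 "]

only_in_table, only_in_tree = find_mismatched_ids(table_ids, tip_names)
print(only_in_table)  # feature IDs with no matching tip in the tree
print(only_in_tree)   # tips with no matching feature in the table
```

Even a single invisible character is enough for an ID to land in both lists.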

When we see mismatched ID issues here it is usually because of a bookkeeping issue when importing data.

If you can't find any problems with your file bookkeeping, please send me a DM with download links to 3 files necessary to re-run your qiime diversity core-metrics-phylogenetic command above. Thanks! :qiime2:

Hi Matthew,

I still can't find any problems with my file bookkeeping. Could you please have a look at my files?
The three files needed to run my qiime diversity core-metrics-phylogenetic command are attached below.

Thank you!

feature-table.qza (30.9 KB)
rooted-tree.qza (17.9 KB)
anotop_rtpr_qiime2_matadata_period.txt (6.5 KB)

The IDs for your features in the table have underscores:

 'Sep8.5v_9015',
 'Sep8.5v_9043',
 'Sep8.5v_9109',
 'Sep8.5v_9804',

While the tree has spaces:

 'Sep8.5v 9015'
 'Sep8.5v 9043'
 'Sep8.5v 9109'
 'Sep8.5v 9804'
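A quick way to confirm that underscores versus spaces is the only difference is to pair up IDs that become identical once spaces are replaced by underscores. This is a rough sketch: the helper name `underscore_space_mismatches` is hypothetical, and the two lists are typed in by hand from the IDs shown above rather than read from the artifacts.

```python
def underscore_space_mismatches(table_ids, tip_names):
    """Map each table ID to a tree tip that differs only by '_' versus ' '."""
    # Index the tips by their underscore-normalized form.
    tips_by_key = {tip.replace(" ", "_"): tip for tip in tip_names}
    pairs = {}
    for feature_id in table_ids:
        tip = tips_by_key.get(feature_id.replace(" ", "_"))
        # Keep only pairs that match after normalization but not exactly.
        if tip is not None and tip != feature_id:
            pairs[feature_id] = tip
    return pairs


table_ids = ["Sep8.5v_9015", "Sep8.5v_9043"]
tip_names = ["Sep8.5v 9015", "Sep8.5v 9043"]
pairs = underscore_space_mismatches(table_ids, tip_names)
print(pairs)
```

If every mismatched feature ID shows up in the result, the two files agree apart from the underscore handling.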

That’s weird, because I can see that the IDs have underscores in both the unrooted tree file I imported into QIIME2 and the rooted tree file I exported from QIIME2.

Can you send the unrooted tree, too?

By the way, if your tree is already rooted, import it as a rooted tree and skip midpoint rooting.

Here it is.
new_tree.tre.zip (11.7 KB)

My tree was not rooted.

Thanks @karren_owl --- I found the culprit. q2-phylogeny is using scikit-bio behind the scenes to do the midpoint rooting --- I was reading up on the Newick file format (used to represent the tree here), check this tidbit out:

In this format, underscores in unquoted labels are treated as spaces --- in order to keep a literal underscore, you must surround the entire ID with single quotes.

You can learn more about the format here: Newick format (skbio.io.format.newick) — scikit-bio 0.5.5 documentation
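If the tree has many tips, quoting them by hand is tedious. Below is a minimal sketch that wraps every unquoted tip label containing an underscore in single quotes. The function `quote_tip_labels` is a hypothetical helper, not part of scikit-bio or QIIME 2, and the toy tree uses IDs from this thread with made-up topology and branch lengths. It assumes tip labels directly follow `(` or `,` and that no already-quoted label contains commas or parentheses.

```python
import re

# An unquoted label directly after "(" or ",", which is where tip labels sit.
# Internal node labels and support values follow ")" and are left untouched.
_TIP_LABEL = re.compile(r"(?<=[(,])([^\s(),:;'\[\]]+)")


def quote_tip_labels(newick):
    """Single-quote unquoted tip labels that contain an underscore."""
    def _quote(match):
        label = match.group(1)
        return f"'{label}'" if "_" in label else label

    return _TIP_LABEL.sub(_quote, newick)


# Toy tree for illustration only.
tree = "((Sep8.5v_9015:0.1,Sep8.5v_9043:0.2)0.9:0.05,Sep8.5v_9109:0.3);"
quoted = quote_tip_labels(tree)
print(quoted)
```

Labels that are already quoted are skipped, so running the helper twice should not add a second layer of quotes. Only the tree needs this treatment; the IDs in the feature table stay unquoted.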

Hi Matthew,

I tried the QIIME 2 import again using a new biomtable and a new tree file in which the IDs are wrapped in single quotes. However, I ran into the same error again:
All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=515): ‘May12.50v_1549’ ‘Jul11.25v_4600’ ‘Jun22.0v_5442’ ‘Aug5.50v_20401’ ‘May12.75v_1961’ ‘Jul11.50v_9708’ ‘Jul27.25v_4983’ ‘Jul27.50v_1972’

What does your updated tree look like? The quotes shouldn’t be showing up in skbio at all…

It looks like:
(((((('May12.50v_9315':0.03482,(('Jun22.75v_1205':0.07334,('Jul11.0v_7885':0.00053,('Jun22.0v_1022':0.04590,'Jul27.5v_6359':0.01931)0.442:0.00054)0.872:0.00053)0.982:0.03354,((('Jul11.50v_3904':0.07224,('Jul11.5v_1691':0.05089,'Aug5.75v_18546':0.03200)0.932:0.04150)0.885:0.03220,(((((('Jul27.50v_1972':0.00055,('Jun22.0v_2832':0.03786,'Jul11.50v_2380':0.04936)0.745:0.00379)0.999:0.08336,((((('May12.75v_1550':0.05003,'Jun22.75v_18
0.95otu_with_single_quotes_aligh_tree.tre.zip (11.7 KB)

Are you sure you have your files in order? I just successfully ran the core-metrics-phylo command from above using the feature table (from above) (UUID e43d8331-06d5-4444-8835-70bec0bacf56) and the tree you just posted — worked as expected. Make sure you have everything in order and give it a shot again!

I got it! I had also added quotes to the IDs in the biomtable, and that is what caused the error. Apparently I only need to add quotes to the IDs in the tree file. Thanks!
