`vegan` error in diversity adonis QIIME2 API

Hi, I am trying to run adonis through the QIIME 2 API and it fails with a vegan error.

diversity.visualizers.adonis(distance_matrix, 
                             metadata=metdata,
                             formula="host_body_site * host_subject_id",
                             permutations=9999) 

Error log:

This is vegan 2.5-7
Error 'read.table(file = args[[1]], sep = "\t", header = TRUE, fill = TRUE, ':
  duplicated 'row.names' are not allowed

I don't have duplicates in rows or columns, though. I also tried removing all trailing whitespace and NA values.

Hi!
Could you try renaming your columns to exclude all underscores?
AFAIK R's read.table will mess up columns with underscores.


Hi!
Thank you for the suggestion, I am not so good with R.
Unfortunately, I replaced the underscores with dots, but got the same error :frowning:

Hi!
Here are some more things to check:

  1. Check your metadata again for duplicated sample IDs or duplicated rows/columns.
  2. Check whether you have trailing empty rows.
  3. Check that your tables are tab-separated (.tsv), not comma-separated (.csv). Just renaming the extension will not convert a .csv to a .tsv; you can check by opening the file in a text editor and seeing whether the values are separated by tabs or commas.
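If it helps, the three checks above can be scripted. This is only a rough sketch using the Python standard library; `check_metadata` is a made-up helper name, and it assumes the first column holds the sample IDs. It only looks at the raw text, so it will not catch IDs that become duplicated after R re-parses them.

```python
import csv
import io
from collections import Counter


def check_metadata(text):
    """Return (looks_tab_separated, duplicated_ids, duplicated_columns).

    Hypothetical helper: assumes the first column contains the sample IDs.
    """
    # Drop blank lines so trailing empty rows are not counted as samples.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # A real .tsv has at least one tab in its header line.
    looks_tab_separated = "\t" in lines[0]
    rows = list(csv.reader(io.StringIO("\n".join(lines)), delimiter="\t"))
    columns = [c.strip() for c in rows[0]]
    ids = [r[0].strip() for r in rows[1:]]
    dup_ids = sorted(k for k, n in Counter(ids).items() if n > 1)
    dup_cols = sorted(k for k, n in Counter(columns).items() if n > 1)
    return looks_tab_separated, dup_ids, dup_cols


# Toy example with one repeated sample ID and a trailing empty row.
sample = "sample-id\thost_body_site\nS1\tgut\nS2\tskin\nS1\tgut\n\n"
print(check_metadata(sample))
```

For a real file, pass in the file contents, e.g. `check_metadata(open(path).read())`.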

Here is a link to a Stack Overflow topic about the same issue in R (maybe it will help to figure out what is happening there).


From the source code of q2-diversity I can see that the error actually comes from the DistanceMatrix artifact, which is weird, as it was produced upstream in QIIME 2.

Hi again, @crusher083
Did you manage to resolve this issue?
It turned out that some special characters in sample IDs and column names can be processed by QIIME 2 upstream of adonis without any errors and still cause a failure in the R packages.
So, if you are still getting an error, could you try to get rid of all special characters in the metadata and sample IDs?
If that does not help, you can share your files with us so we can take a closer look and help you get past this error.

Can't figure it out, attaching my input.

_unweighted_unifrac_dis_matrix.qza (188.3 KB) adonis_metadata.tsv (91.6 KB)

Cheers

@crusher083 ,
I am pretty sure that your sample IDs are being interpreted as integers, most likely by R, and hence it is transforming your unique, float-like IDs into duplicated integer IDs.

>>> import pandas as pd, qiime2 as q2, skbio
>>> dm = q2.Artifact.load('_unweighted_unifrac_dis_matrix.qza').view(skbio.DistanceMatrix).to_data_frame()
>>> dm.index
Index(['13850.1', '13850.10', '13850.100', '13850.101', '13850.102',
       '13850.103', '13850.105', '13850.106', '13850.107', '13850.108',
       ...
       '13850.89', '13850.9', '13850.90', '13850.91', '13850.94', '13850.95',
       '13850.96', '13850.97', '13850.98', '13850.99'],
      dtype='object', length=135)
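To illustrate how unique text IDs can turn into duplicates, here is a small sketch in plain Python. It assumes the reader on the R side parses the IDs as numbers; I have not confirmed the exact R behaviour, and `numeric_collisions` is just an illustrative helper. The IDs are a few of those shown in the index above.

```python
from collections import defaultdict


def numeric_collisions(ids):
    """Group IDs that become identical once parsed as numbers."""
    groups = defaultdict(list)
    for sid in ids:
        try:
            key = float(sid)  # '13850.1' and '13850.10' give the same float
        except ValueError:
            key = sid  # non-numeric IDs are kept as text
        groups[key].append(sid)
    # Keep only the keys shared by more than one original ID.
    return {k: v for k, v in groups.items() if len(v) > 1}


ids = ['13850.1', '13850.10', '13850.100', '13850.9', '13850.90']
collisions = numeric_collisions(ids)
print(collisions)
```

All five IDs are distinct as strings, but they fall into only two groups once they are read as numbers, which would be enough to trigger a "duplicated 'row.names'" complaint.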

I recommend converting those sample IDs to strings (replace the decimal with an underscore?) and trying again. This could be done in R to save time, or you could load the matrix as a pandas dataframe (as I have done above), fix the sample IDs, and then re-import it as a QIIME 2 artifact.
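A rough sketch of the Python route, in case it is useful. The helper names and the output path are made up for illustration, and I have not run this against the attached files. It sets the IDs on the `skbio.DistanceMatrix` directly rather than going through a dataframe, and the qiime2/skbio imports sit inside the second function so the renaming logic can be tried without a QIIME 2 environment.

```python
def make_string_ids(ids, old=".", new="_"):
    """Replace the decimal point so the IDs can no longer be read as numbers."""
    renamed = [str(i).replace(old, new) for i in ids]
    # Guard against the rename itself creating duplicates.
    if len(set(renamed)) != len(renamed):
        raise ValueError("renaming produced duplicate sample IDs")
    return renamed


def rename_distance_matrix(in_path, out_path):
    """Hypothetical helper: load a DistanceMatrix artifact, rename IDs, save."""
    import qiime2  # needs an active QIIME 2 environment
    import skbio

    dm = qiime2.Artifact.load(in_path).view(skbio.DistanceMatrix)
    dm.ids = make_string_ids(dm.ids)
    qiime2.Artifact.import_data("DistanceMatrix", dm).save(out_path)


print(make_string_ids(['13850.1', '13850.10', '13850.100']))
# In a QIIME 2 environment (the output file name is an assumption):
# rename_distance_matrix('_unweighted_unifrac_dis_matrix.qza', 'renamed_dis_matrix.qza')
```

The same replacement has to be applied to the sample ID column of the metadata file, otherwise the IDs in the distance matrix and the metadata will no longer match.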

Good luck!


Thank you, Nicholas!

It did indeed solve my problem. I would not have figured it out on my own.
