Qiime2r - invalid “phyloseq” object: Component sample names do not match

(Adriana Paula Bernardo Cravo) #1

Hi, everyone,
My question is not 100% related to qiime2, but to qiime2R. I’m following the tutorial made by @jbisanz (Tutorial: Integrating QIIME2 and R for data visualization and analysis using qiime2R), which is helping me a lot (thanks for that)!!! However, I’m stuck on a problem that I cannot fix.
I uploaded my files:

metadata<-read_tsv("metadata_16s_run1.tsv")
cons_taxonomy<-read_qza("classifier_Run1_tax.qza")
cons_taxtable<-cons_taxonomy$data %>% as.tibble() %>% separate(Taxon, sep=";", c("Kingdom","Phylum","Class","Order","Family","Genus","Species"))
table<-read_qza("table.qza")

constax=as.data.frame(cons_taxtable)%>%select(-Confidence) %>% column_to_rownames("Feature.ID")%>%as.matrix()
constax[,"Kingdom"]<-gsub("D_0__", "", constax[,"Kingdom"])
constax[,"Phylum"]<-gsub("D_1__", "", constax[,"Phylum"])
constax[,"Class"]<-gsub("D_2__", "", constax[,"Class"])
constax[,"Order"]<-gsub("D_3__", "", constax[,"Order"])
constax[,"Family"]<-gsub("D_4__", "", constax[,"Family"])
constax[,"Genus"]<-gsub("D_5__", "", constax[,"Genus"])
constax[,"Species"]<-gsub("D_6__", "", constax[,"Species"])

and tried to create a phyloseq object merging them:

physeq<-phyloseq(
otu_table(table$data, taxa_are_rows = T),
tax_table(constax),
sample_data(metadata %>% as.data.frame() %>% column_to_rownames("sample_name"))
)

but I keep on getting the error message:

Error in validObject(.Object) : invalid class “phyloseq” object:
Component sample names do not match.
Try sample_names()

and when I run sample_names(), I get:

sample_names(table)
NULL
sample_names(cons_taxonomy)
NULL
sample_names(metadata)
NULL

I don’t really know what to do. I already searched online for help, but none of the solutions I found fit my case. Also, when I view the files in R, they all look fine.
Can someone help me with this, please?

Thanks!
Adriana


(Jordan Bisanz) #2

Hi Adriana,

To troubleshoot this, you need to look at the objects you are trying to build the phyloseq object with. If you are using RStudio, you could use the command View(table$data) to open it in a new tab. As an FYI, I think the sample_names function only works on a phyloseq object, which is why NULL is returned.

Particularly, double check that the sample names in the metadata match what is in the table. Something like this would help diagnose the issue assuming you have the package gplots installed:

gplots::venn(list(metadata=rownames(metadata), featuretable=colnames(table)))

You could also print the list of samples as such if you have a mismatch where a sample in the table does not have metadata:

print(gplots::venn(list(metadata=rownames(metadata), featuretable=colnames(table)), show.plot = F))
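If gplots is not available, a similar check can be done with base R set operations. This is only a minimal sketch on toy data: it assumes the sample IDs sit in a `sample_name` column of the metadata and in the column names of the feature table matrix (`table$data` in the code above). The objects `metadata_toy` and `feature_toy` are made up for illustration.

```r
# Toy stand-ins for the real objects (hypothetical values).
metadata_toy <- data.frame(
  sample_name = c("s1", "s2", "s3"),
  group = c("a", "b", "a"),
  stringsAsFactors = FALSE
)
feature_toy <- matrix(
  c(10, 0, 5, 3, 7, 1),
  nrow = 2,
  dimnames = list(c("feat1", "feat2"), c("s1", "s2", "s4"))
)

meta_ids  <- metadata_toy$sample_name
table_ids <- colnames(feature_toy)

shared        <- intersect(meta_ids, table_ids)  # present in both
only_in_meta  <- setdiff(meta_ids, table_ids)    # metadata rows without counts
only_in_table <- setdiff(table_ids, meta_ids)    # samples lacking metadata

print(shared)
print(only_in_meta)
print(only_in_table)
```

With the real data, the two vectors to compare would presumably be `metadata$sample_name` and `colnames(table$data)`.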

Also for efficiency of coding, before splitting the taxonomy, you could remove all of the D_#__ in one swoop by doing:
... %>% mutate(Taxon=gsub("D_[0-9]__", "", Taxon)) %>% separate(Taxon, sep=";"...
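To see what that regular expression does in isolation, here is a small base R sketch. The taxonomy string is made up, in the same `D_0__` format as above, and `strsplit()` stands in for `separate()` only so the example runs without extra packages.

```r
# Made-up taxonomy string using the D_<level>__ prefix format.
taxon <- "D_0__Bacteria;D_1__Firmicutes;D_2__Bacilli"

# One regular expression removes every rank prefix at once.
cleaned <- gsub("D_[0-9]__", "", taxon)
print(cleaned)

# Split on ";" to get one element per rank.
ranks <- strsplit(cleaned, ";", fixed = TRUE)[[1]]
print(ranks)
```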


(Adriana Paula Bernardo Cravo) #3

Hi, Jordan, thanks for the quick feedback!
I did what you suggested (pictures attached). I ran the command gplots::venn(list(metadata=rownames(metadata), featuretable=colnames(table))) for both the constax_table and the feature table (the constax_table is the plot with the number 9 inside the featuretable circle). I thought this could be the problem. However, I ran the same test on samples from another project I worked on before and got the same output (zero in the intersection of the table and the metadata), yet I could create the phyloseq object for them without any issues. So my impression is that this is not the problem.


I also ran the command print(gplots::venn(list(metadata=rownames(metadata), featuretable=colnames(table)), show.plot = F)) and got the results:
num metadata featuretable
00 0 0 0
01 0 0 1
10 79 1 0
11 0 1 1
attr(,"intersections")
attr(,"intersections")$metadata
 [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26"
[27] "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44" "45" "46" "47" "48" "49" "50" "51" "52"
[53] "53" "54" "55" "56" "57" "58" "59" "60" "61" "62" "63" "64" "65" "66" "67" "68" "69" "70" "71" "72" "73" "74" "75" "76" "77" "78"
[79] "79"

attr(,"class")
[1] "venn"
I re-checked the metadata file and the constax_table file but still couldn’t identify the error…

Thanks again for your feedback!!


(Jordan Bisanz) #4

Sorry, if I am understanding it correctly, the issue you have has nothing to do with your taxonomy table, but is a mismatch between your metadata and feature table (SVs or your favourite term here for denoised sequences). I have a sneaking suspicion that you may have used numbers as your sample names. This is not generally advisable, and it would help to append a character in front of the numbers. Let me know what happens when you follow up.
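As a rough illustration of the renaming idea, here is a base R sketch with made-up IDs. The prefix "s" is an arbitrary choice, and the same renaming would need to be applied consistently to both the metadata and the feature table.

```r
# Hypothetical numeric sample IDs, as they might be read from a file.
ids <- c(1, 2, 10)

# Prepend a letter so every ID is an unambiguous character string.
new_ids <- paste0("s", ids)
print(new_ids)

# For comparison, make.names() shows how R itself rewrites purely
# numeric names when it needs syntactically valid ones.
mangled <- make.names(as.character(ids))
print(mangled)
```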


(Adriana Paula Bernardo Cravo) #5

Hi, Jordan, thanks again for getting back to me.
I changed the sample names to s1, s2, etc. and tried again.
Now I got this:

physeq<-phyloseq(
   otu_table(table$data, taxa_are_rows = T), 
   tax_table(cons_taxtable), 
   sample_data(metadata %>% as.data.frame() %>% column_to_rownames("sample_name"))
 )

Error in validObject(.Object) : invalid class “phyloseq” object: 
 Component taxa/OTU names do not match.
 Taxa indices are critical to analysis.
 Try taxa_names()
In addition: Warning messages:
1: In .registerS3method(fin[i, 1], fin[i, 2], fin[i, 3], fin[i, 4],  :
  restarting interrupted promise evaluation
2: In get(method, envir = home) :
  restarting interrupted promise evaluation
3: In get(method, envir = home) : internal error -3 in R_decompress1
4: In .local(object) : Coercing from data.frame class to character matrix 
prior to building taxonomyTable. 
This could introduce artifacts. 
Check your taxonomyTable, or coerce to matrix manually.

Sorry to insist on this. I wonder if I should classify and train my taxonomy again, to see if the error comes from the final file. Could this be an option, considering the new error messages?


(Devon O'rourke) #6

Not sure if this was done in the code above and I missed it, but whenever I get that error in phyloseq, what I’m usually failing to do is ensure that the row.names are the sample names of the OTU table.

The fact that your sample_names(table) function is throwing a NULL result might indicate that this is what’s going on.
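A quick way to check the names before calling phyloseq() is sketched below in base R on toy data. All objects here are made up. It assumes features are in rows and samples in columns of the count matrix, as with taxa_are_rows = T above. Note that a data frame or tibble without explicit row names just reports "1", "2", ..., which will not match feature IDs. That might be related to the taxa error in post #5, where cons_taxtable rather than the constax matrix was passed to tax_table().

```r
# Toy taxonomy as a data frame with feature IDs in a column (hypothetical values).
tax_df <- data.frame(
  Feature.ID = c("feat1", "feat2"),
  Kingdom = c("Bacteria", "Bacteria"),
  Phylum = c("Firmicutes", "Proteobacteria"),
  stringsAsFactors = FALSE
)

# Without explicit row names, R falls back to "1", "2", ...
print(rownames(tax_df))

# Move the IDs into the row names and convert to a character matrix.
tax_mat <- as.matrix(tax_df[, c("Kingdom", "Phylum")])
rownames(tax_mat) <- tax_df$Feature.ID

# Toy feature table: features in rows, samples in columns.
counts <- matrix(
  c(10, 0, 5, 3),
  nrow = 2,
  dimnames = list(c("feat1", "feat2"), c("s1", "s2"))
)

# Toy metadata with sample IDs as row names.
meta_df <- data.frame(group = c("a", "b"), row.names = c("s1", "s2"))

# Both checks should be TRUE before the pieces are combined.
taxa_ok    <- setequal(rownames(tax_mat), rownames(counts))
samples_ok <- setequal(rownames(meta_df), colnames(counts))
print(c(taxa = taxa_ok, samples = samples_ok))
```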
