Hello @Gabriel_Pires!
Great question. I have a couple of suggestions, but I am not sure either of them is the answer. The problem is that your table includes sequences that are not in your phylogeny.
It looks like you already made sure that you are not filtering out sequences, but I would first make sure that you are not discarding too many samples with your sampling depth. Mike Robson does a great job explaining why here: "The table does not appear to be completely represented by the phylogeny".
My next suggestion would be to make sure the names are the same in your rooted tree IDs and your table IDs. Matt gives a great explanation in this post about a similar problem: "feature ids must be present as tip names in phylogeny".
I hope that helps. Let me know if it doesn’t and we will brainstorm some more!
Chloe
> It looks like you already made sure that you are not filtering out sequences but I would first make sure that you are not discarding too many samples with your sampling depth. Mike Robson does a great job explaining why here: The table does not appear to be completely represented by the phylogeny.
Indeed, I made sure to retain as many samples as possible. I chose a sampling depth of 24985, and the summary reported: "Retained 124,925 (72.37%) features in 5 (100.00%) samples at the specifed sampling depth." So I don’t think I am losing too much information, right?
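As a quick sanity check on those numbers, the retained count is simply the sampling depth multiplied by the number of retained samples. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the estimated total is only approximate because the reported percentage is rounded.

```python
# Values copied from the summary quoted above.
sampling_depth = 24985
samples_retained = 5
retained_fraction = 0.7237  # 72.37%, rounded in the report

# Each retained sample is subsampled to exactly `sampling_depth` counts.
retained = sampling_depth * samples_retained

# Rough estimate of the total count before rarefying (approximate only).
approx_total = retained / retained_fraction

print(retained)
print(round(approx_total))
```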
> My next suggestion would be to make sure the names are the same in your rooted tree IDs and your table IDs. Matt gives a great explanation in this post with a similar problem. feature ids must be present as tip names in phylogeny
As for the IDs, I checked the tree.nwk, the table.qza and the sample-metadata.tsv, and all the sample IDs matched. Perhaps there is another file that I haven’t checked yet?
Hello @Gabriel_Pires,
I am sorry for the slow response on my part.
Your sampling depth seems perfectly reasonable and is most likely not the cause of the issue!
There isn't another file to check. However, I am pretty sure that the issue lies in your feature IDs and not your sample IDs, so you might have checked the wrong column. I recommend following Matt's steps again, but checking the feature IDs this time.
If that does not work, could you post your code and results (for Matt's steps) so that I have a better idea of what is happening? Also, if you want to post the rooted tree and table that you are using, I could try to debug more effectively.
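To make the feature ID check concrete, here is a minimal sketch that compares a set of feature IDs against the tip names in a Newick tree. It assumes the tree has been exported to plain Newick text (like the tree.nwk mentioned above) and that tip labels are unquoted; the tree string and the IDs are made-up examples, and `newick_tip_names` is a hypothetical helper, not part of QIIME 2.

```python
import re

def newick_tip_names(newick):
    """Return the set of tip labels in a simple Newick string.

    Assumes unquoted labels. A tip label directly follows "(" or ",";
    internal node labels follow ")" and are therefore skipped.
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))

# Hypothetical tree and feature IDs, for illustration only.
tree_text = "((ATB_1_1502:0.1,ATB_1_1503:0.2)0.9:0.3,ATB_2_7:0.4);"
feature_ids = {"ATB_1_1502", "feature_x"}

tips = newick_tip_names(tree_text)

# Any ID listed here is in the table but absent from the tree,
# which is the situation the error message describes.
missing = feature_ids - tips
print(sorted(missing))
```

If `missing` is empty, every feature ID in the table has a matching tip, and the mismatch has to come from somewhere else.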
Hi @cherman2,
Thank you for your answer. I think I figured out what’s going on. I ran my QIIME 2 pipeline on several .fna files. The one I looked at contains 5 samples (ATB_1, ATB_2, ATB_3, ATB_4 and ATB_5). The problem is that each sample has multiple reads with identifiers such as ATB_1_1502, ATB_1_1503, etc. In my table.qza, the samples only go from ATB_1 to ATB_5, whereas in the rooted tree each sample has multiple features, such as one tip for ATB_1_1502 and another for ATB_1_1503. So I think the rooted tree treated every read identifier as unique and didn’t merge them per sample the way the table.qza did. I hope that is clear.
Do you know a trick to force the rooted tree to merge all the ATB_1 reads with each other, and so on?
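The relationship between the read identifiers and the sample IDs described above can be illustrated with a small sketch. It assumes that every read ID is the sample ID followed by an underscore and a read number; `sample_of` is a hypothetical helper and "ATB_2_7" is a made-up identifier.

```python
from collections import defaultdict

def sample_of(read_id):
    """Drop the trailing "_<read number>" part: "ATB_1_1502" -> "ATB_1"."""
    return read_id.rsplit("_", 1)[0]

# Tip names as they might appear in the rooted tree (illustrative only).
tree_tips = ["ATB_1_1502", "ATB_1_1503", "ATB_2_7"]

# Group the tips by the sample they came from.
by_sample = defaultdict(list)
for tip in tree_tips:
    by_sample[sample_of(tip)].append(tip)

for sample, reads in sorted(by_sample.items()):
    print(sample, reads)
```

This only shows why the tree has many tips per sample; it does not change the tree or the table.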
Hello @Gabriel_Pires,
Awesome! I am glad we were able to find the source of the problem.
The next step is making sure that rep-seqs.qza (what you use to make the rooted tree) has the same feature IDs as your table. I believe that you will need to filter the rep-seqs.qza file so that it matches your current table using this command:
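The command itself is not shown here. One likely candidate is `qiime feature-table filter-seqs`, which keeps only the sequences whose IDs appear in a given table. The sketch below builds that invocation and prints it rather than running it; the input file names are assumptions based on the files mentioned in this thread, and `filter_seqs_command` is a hypothetical helper.

```python
import shlex

def filter_seqs_command(rep_seqs, table, output):
    """Build a `qiime feature-table filter-seqs` call as an argument list."""
    return [
        "qiime", "feature-table", "filter-seqs",
        "--i-data", rep_seqs,          # representative sequences to filter
        "--i-table", table,            # table whose feature IDs are kept
        "--o-filtered-data", output,   # filtered sequences are written here
    ]

# File names are assumptions; adjust them to your own artifacts.
cmd = filter_seqs_command("rep-seqs.qza", "table.qza", "merged-rep-seqs.qza")

# Print the command so it can be reviewed and pasted into a terminal.
print(shlex.join(cmd))
```

To execute it from Python instead of copying it, `subprocess.run(cmd, check=True)` would work in an environment where QIIME 2 is installed.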
Then recreate your rooted tree using the same command as before, but with your new merged-rep-seqs.qza. (You don't have to name it merged-rep-seqs.qza if you don't want to; that is just my suggestion.)
Then you should be able to run the core-metrics-phylogenetic function without this error!
I hope this fixes it.
Chloe