ID Mismatch when running core-metrics

I imported my data using the method from this post: Mafft 'returned non-zero exit status 1' ERROR - #5 by Purrsia_Felidae

Namely, I constructed an unrooted tree in QIIME 1 with the command
pick_de_novo_otus.py -i split_library_output/seqs.fna -o otus

then I imported the unrooted-tree in qiime2
#import unrooted tree (phylogenetic tree)
qiime tools import \
  --input-path unrooted-tree.tre \
  --output-path unrooted-tree.qza \
  --type 'Phylogeny[Unrooted]'

#unrooted to rooted
qiime phylogeny midpoint-root \
  --i-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

But when I tried to generate the core metrics files, it reported an error (screenshot).

While I was working in QIIME 1, my computer ran out of battery once. Could that be the reason for this error? Otherwise, I have no idea what is going on or what to do next. Do you have any idea? Thank you in advance!

sincerely,
nan

I think I had better describe my situation in full.

My data are PGM data. Detailed information is in this post: Demultiplex joined reads sequences

My workflow is as follows:

  • download and set up QIIME 1 following the tutorial

  • validate the mapping file “sample-mapping.txt”

  • demultiplex and quality-filter the “full” files (not the pr-file, because few sequences passed the filter):
    split_libraries.py -m sample-mapping.txt -f full.fasta -q full.qual -b 8 -o split_library_output

  • construct an unrooted tree with the command:
    pick_de_novo_otus.py -i split_library_output/seqs.fna -o otus

  • switch to QIIME 2

  • import the data

#import sequences
qiime tools import \
  --input-path seqs.fna \
  --output-path seqs.qza \
  --type 'SampleData[Sequences]'

#import FeatureTable
qiime vsearch dereplicate-sequences \
  --i-sequences seqs.qza \
  --o-dereplicated-table table.qza \
  --o-dereplicated-sequences rep-seqs.qza

#import unrooted tree (phylogenetic tree)
qiime tools import \
  --input-path unrooted-tree.tre \
  --output-path unrooted-tree.qza \
  --type 'Phylogeny[Unrooted]'

#unrooted to rooted
qiime phylogeny midpoint-root \
  --i-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

#generate core metrics: the sampling depth can vary. Here 50000 was chosen since the lowest sample counts are 6XXX, 3XXXX, 6XXXX, and 7xxxx, and the Chinese-language tutorial indicates that 30000 or 50000 is usually a suitable depth.

#generate core metrics

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table table.qza --p-sampling-depth 50000 --m-metadata-file sample-metadata.txt --output-dir core-metrics-results

Did I do anything wrong?

Hello @nan_wang!

The error you presented above in the screenshot (when running diversity core-metrics-phylogenetic) indicates that the features in your phylogenetic tree do not match the features in your feature table. Please run feature-table summarize on your table, and feature-table tabulate-seqs on the FeatureData[Sequence] that was used to create your tree (this would be the file used to create unrooted-tree.tre; if you don’t have it, then take a look at the unrooted-tree.tre file directly). Your feature IDs need to be consistent across those two bits of data: QIIME 2 uses the feature ID to associate feature frequencies with the tips of the tree.
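
If it helps to see the comparison concretely, here is a minimal sketch in plain Python (standard library only, not a QIIME 2 API). It assumes the tree is available as a Newick string with simple tip labels and that the table's feature IDs have been copied into a list. The function names and the toy data are hypothetical, and the regular expression will not handle quoted labels that contain punctuation.

```python
import re


def tip_names(newick):
    """Return the set of leaf labels in a Newick string (simple labels only)."""
    # A leaf label directly follows "(" or ","; internal-node labels follow ")".
    labels = re.findall(r"[(,]\s*([^\s(),:;]+)", newick)
    # Drop surrounding quotes, if any.
    return {label.strip("'\"") for label in labels}


def compare_ids(table_ids, newick):
    """Return (IDs in the table but not the tree, IDs in the tree but not the table)."""
    tips = tip_names(newick)
    table_ids = set(table_ids)
    return sorted(table_ids - tips), sorted(tips - table_ids)


# Toy data standing in for the real tree file and feature table (assumed values).
newick = "((denovo0:0.1,denovo1:0.2):0.3,denovo2:0.4);"
table_ids = ["denovo0", "denovo1", "409c6bf"]

missing_from_tree, missing_from_table = compare_ids(table_ids, newick)
print("In table but not in tree:", missing_from_tree)
print("In tree but not in table:", missing_from_table)
```

Any ID listed as "in table but not in tree" is the kind of feature the error is complaining about.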

Keep us posted!

Thank you for your reply. But viewing the tree is a problem: I imported the rep_set.tre file directly from QIIME 1, and it cannot be viewed. So I imported the files that were used to generate the .tre file into QIIME 2, but their type is FeatureData[AlignedSequence], so they cannot be viewed via feature-table tabulate-seqs. Is there any other way to view the unrooted tree?

Thank you very much.
Sincerely,
NAN

Hi @nan_wang,

Try using iTOL to view your phylogenetic tree. Also, as I mentioned above, you can just open the text file that you imported into QIIME 2 directly and read the IDs there, too.

Thank you for your quick reply.
I read the two files but I didn't find anything wrong. I attached the sequence data and the sequence-feature.txt to this post. Could you please help me find out why they don't work together?
Thanks again.
feature-frequency-detail.txt (1014.8 KB)
seqs_rep_set_aligned_pfiltered.txt.zip (2.8 MB)

Hey hey @nan_wang! Thanks for sending those files, but those don’t match up with the error message you posted above. One of the artifacts implicated in the error above has hashes for feature IDs, and I don’t see those in either file. For example:

$ ag 409c6bf feature-frequency-detail.txt

$ ag 409c6bf seqs_rep_set_aligned_pfiltered.txt

No hits, but according to your error above, at least one of those files has a feature with that ID.

Oh sorry! I tried several more times after the initial post, so some things differ from what I originally posted, but I guess it’s the same problem.

After my post on 9.29, I tried several approaches.

First, I thought that maybe the table and the tree didn’t match because they were generated in different software, so I imported the table generated by QIIME 1 (the table I sent to you) and used it. It is not significantly different from the one generated in QIIME 2, except that it has far fewer features than the Q2-generated table.

Then, I imported every file generated in Q1 into Q2 and continued the pipeline from there:
1) I imported the aligned sequences from Q1 into Q2, then masked them, constructed an unrooted tree, and then got a rooted tree.
2) I imported the filtered sequences (the seqs_rep_set_aligned_pfiltered file I sent to you), which are equivalent to the masked sequences, into Q2, then constructed the unrooted tree…
3) I also imported the unrooted-tree file…

But all four approaches returned the same error: “…not corresponding to tip names (n= numbers_of_features): (all feature IDs)”

I really have no idea why it happens…

Please send the artifacts and one of us will take a look. You can send to me in a DM if you want.

This is not quite correct: it is not all of the feature IDs, only some of them (2307 out of 65,028).

My first question is: why are these features present in your table, but absent from your tree? It looks like you prepared these data outside of QIIME 2, so you will need to double-check why this is the case.

So, assuming that it is okay to have missing features, how about we just drop those features from the feature table?

You can provide your rep seqs as metadata to feature-table filter-features, or you can create a new feature metadata file with a list of the features to remove from your feature table. Once the table is filtered, try core-metrics-phylogenetic again.
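
For the second option, a minimal sketch of building that list in plain Python (standard library only) might look like the following. It assumes you already have the table's feature IDs and the tree's tip names as Python collections; the function name, the toy IDs, and the output file name are hypothetical, and the "feature-id" header is assumed to be an ID column name your QIIME 2 version accepts.

```python
import csv
import io


def write_ids_to_remove(table_ids, tree_tips, handle):
    """Write a one-column TSV of feature IDs that are in the table but not in the tree."""
    to_remove = sorted(set(table_ids) - set(tree_tips))
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["feature-id"])  # assumed ID column header
    for feature_id in to_remove:
        writer.writerow([feature_id])
    return to_remove


# Toy IDs standing in for the real table and tree (assumed values).
table_ids = ["denovo0", "denovo1", "409c6bf"]
tree_tips = ["denovo0", "denovo1", "denovo2"]

buffer = io.StringIO()
removed = write_ids_to_remove(table_ids, tree_tips, buffer)
print(buffer.getvalue())

# To write a real file instead (hypothetical file name):
# with open("features-to-remove.tsv", "w", newline="") as fh:
#     write_ids_to_remove(table_ids, tree_tips, fh)
```

The resulting file would then be passed to feature-table filter-features as the metadata file, with the option that excludes the listed IDs rather than keeping them (check the plugin's help text for the exact flag in your version).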

That works! Thank you soooooo much!
