fragment insertion sepp error

Dear Forum:

I am running q2cli version 2019.10.0, installed via conda.

I am trying to build an ITS phylogenetic tree with

$ qiime fragment-insertion sepp --i-representative-sequences ./training-feature-classifiers_ITS/rep-seqs-dada2-ITS-without-recaps.qza --i-reference-database sh_refs_qiime_ver8_99_s_04.02.2020.qza --o-tree tree_ITS_without_recaps.qza --o-placements tree_placements_ITS_without_recaps.qza --p-threads 24

but I get the error message:

(1/1) Invalid value for "--i-reference-database": Expected an artifact of at least type SeppReferenceDatabase. An artifact of type FeatureData[Sequence] was provided.

which is fair enough, since I used qiime tools import --type FeatureData[Sequence]… to import the .fasta file. I'm sure the answer is obvious, but after two days of searching I still can't find it: how, then, do I turn the UNITE files into a SeppReferenceDatabase .qza?

I also see that the Fungal ITS analysis tutorial warns that ITS should be used only for species ID, not for phylogenetic diversity methods. It suggests the q2-ghost-tree plugin as an alternative, but that tutorial also cautions against its use for trees, so which is it?

Thanks,
Steve

Hi @skimble!

The short answer is: you don’t. fragment-insertion has only been validated with some very specific versions of GG and SILVA. I suggest pinging the developers at https://github.com/smirarab/sepp-refs to see about getting new references added. Keep us posted and let us know if you need a hand!
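As an aside, if you want to confirm what type an artifact has before passing it to sepp, `qiime tools peek` will report it. The same check can be sketched in plain Python, assuming the usual .qza layout (a zip archive with a top-level UUID directory containing a `metadata.yaml` that has a `type:` line). The function name `peek_artifact_type` is just an illustrative helper, not part of QIIME 2:

```python
import zipfile


def peek_artifact_type(qza_path):
    """Return the semantic type recorded in a .qza archive.

    Assumes the archive layout <uuid>/metadata.yaml with a 'type:' line.
    This is a hypothetical helper, not a QIIME 2 API.
    """
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # Only look at the metadata file directly under the UUID directory.
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("type:"):
                        # Keep everything after the first colon, minus quotes.
                        return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError(f"no metadata.yaml with a type entry found in {qza_path}")
```

Calling it on the UNITE artifact from the command above should return `FeatureData[Sequence]`, which is why sepp rejects it: the plugin wants a `SeppReferenceDatabase`, and importing a FASTA file alone cannot produce one.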

What tutorial are you referring to here (looks like you forgot to include the second link)?

Keep us posted!

Thanks, thermokarst. I was looking at the q2-ghost-tree plugin tutorial, Q2-ghost-tree Plugin: Community Tutorial for Creating Hybrid-Gene Phylogenetic Trees, which notes that “The most popular application of this method is for fungal microbiome analysis using ITS sequences which provide great species identification, but make poor quality multiple sequence alignments (MSAs) and subsequently poor phylogenetic trees.” I take that to mean that the trees should only be used for SH but not for, say, UniFrac distances?

Thanks for sharing, @skimble! Perhaps pinging the original author with your question might lead to a more satisfying response? Feel free to ask on the linked tutorial above. Thanks!
