Greengenes2 taxonomy from features error

Hello,
I have installed q2-greengenes2 and tested it with V4 sequences obtained after cutadapt / dada2 steps.
These sequences can be classified with qiime feature-classifier classify-sklearn using the pretrained 2022.10.backbone.v4.nb.qza classifier.

To get used to this new plugin, I ran the qiime greengenes2 taxonomy-from-features and taxonomy-from-table commands, but both gave a plugin error from greengenes2: No requested tips found

The debug info showed this:
Traceback (most recent call last):
File "/home/bt140047/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2cli/commands.py", line 352, in call
results = action(**arguments)
File "", line 2, in taxonomy_from_features
File "/home/bt140047/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
outputs = self.callable_executor(scope, callable_args,
File "/home/bt140047/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in callable_executor
output_views = self._callable(**view_args)
File "/home/bt140047/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2_gg2/_methods.py", line 771, in taxonomy_from_features
tree = _load_tree_and_cache(open(str(reference_taxonomy)), features)
File "/home/bt140047/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2_gg2/_methods.py", line 543, in _load_tree_and_cache
tree = tree.shear(names & features)
File "bp/_bp.pyx", line 758, in bp._bp.BP.shear
File "bp/_bp.pyx", line 800, in bp._bp.BP.shear
ValueError: No requested tips found

The greengenes2 tutorial on GitHub does not include the qiime greengenes2 filter-features step that appears in the tutorial on this forum. I also tried running the filter step first, but got the same error.

Is this due to the md5 hashes that dada2 creates from the representative sequences, which will not match if the query and reference sequences are not exactly identical (length, sequence)?
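
To illustrate the concern, here is a minimal sketch of how an md5-based feature ID depends on the exact sequence. It assumes the ID is simply the md5 hex digest of the sequence string; the helper name md5_feature_id and both sequences are hypothetical and are not taken from Greengenes2.

```python
import hashlib


def md5_feature_id(sequence: str) -> str:
    """Return the md5 hex digest of a sequence string (assumed ID scheme)."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# Hypothetical sequences, for illustration only.
full_length = "TACGTAGGGTGCGAGCGTTAATCGGAATAACTGGGCGTAAAGGGCACGCA"
truncated = full_length[:30]  # same start, shorter read

print(md5_feature_id(full_length))
print(md5_feature_id(truncated))
# The digests differ even though one sequence is a prefix of the other.
print(md5_feature_id(full_length) == md5_feature_id(truncated))
```

Under that assumption, two reads covering the same region only share an ID when they are identical in both length and content.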

Does anybody have advice on how to solve this issue?
I would appreciate any suggestions.
Best regards,

Hi @arwqiime,

Sorry for the delay in replying. I'm monitoring this thread now, so I should be able to reply much faster.

Could you provide the filename of the reference file used with the taxonomy-from-features command? And, is it correct that the FeatureData[Sequence] provided were the representative sequences from classify-sklearn relative to 2022.10.backbone.v4.nb.qza?

Best,
Daniel

Hi @wasade
I used two reference files, both with the same error message:

name:"2022.10.taxonomy.asv.nwk.qza"
uuid:"4632f4ee-3fff-4072-a552-6e2397490ab3"
type:"Phylogeny[Rooted]"
format:"NewickDirectoryFormat"

as well as (assuming that the classification can be done via the md5 hash, too):

name:"2022.10.taxonomy.md5.nwk.qza"
uuid:"b835694c-c927-4f07-8216-696336bc1e45"
type:"Phylogeny[Rooted]"
format:"NewickDirectoryFormat"

The command was (shown here for asv; the md5 command was analogous):

qiime greengenes2 taxonomy-from-features \
--i-reference-taxonomy 2022.10.taxonomy.asv.nwk.qza \
--i-reads od03_dada2-cutadapt_WS2021-V1/representative_sequences-filtered-30.qza \
--output-dir od04_dada2_gg2_rep-seqs

The FeatureData[Sequence] artifact was the output from dada2, filtered for a minimum frequency of 3 (but this should not matter), and was successfully classified with 2022.10.backbone.v4.nb.qza.

name:"2022.10.backbone.v4.nb.qza"
uuid:"32489596-075f-44ff-a0ad-0a5c43a80b2c"
type:"TaxonomicClassifier"
format:"TaxonomicClassiferTemporaryPickleDirFmt"

Best regards,

Thank you, @arwqiime. This is puzzling. Is there any chance you could share od03_dada2-cutadapt_WS2021-V1/representative_sequences-filtered-30.qza either on here or directly with me ([email protected])?

Best,
Daniel

Thank you, @arwqiime, for sharing the file.

On inspection, the representative sequences are all 272 nt in length. The fragment placement portion of Greengenes2 consists predominantly of 90, 100, and 150 nt ASVs relative to EMP 16S 515F, stemming from the default processing in Qiita. As a result, there won't be an exact match for any of the 272 nt fragments.

Looking back over the thread, I think the correct course of action here is to use classify-sklearn as you already did. That output will be FeatureData[Taxonomy] and is qualitatively comparable to the output of taxonomy-from-features (or -table) if the ASVs were represented in the phylogeny. Alternatively, as I think these are relative to 515F, it may be feasible to trim the representatives at denoising to 90, 100 or 150nt to obtain the phylogenetic taxonomy.
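
As a rough illustration of why trimming could help, the sketch below checks for exact overlap between query sequences and reference tip names before and after truncation. It assumes the tips of the asv tree are labeled with the ASV sequences themselves; the helpers trim_to_length and shared_tips and the toy data are hypothetical, not part of q2-greengenes2. Truncating after denoising is also not equivalent to setting the truncation length at the denoising step, so treat this only as a quick diagnostic.

```python
def trim_to_length(sequences, length):
    """Truncate each sequence to `length` nt, dropping any that are shorter."""
    return {
        feature_id: seq[:length]
        for feature_id, seq in sequences.items()
        if len(seq) >= length
    }


def shared_tips(reference_tips, sequences):
    """Return the query sequences that exactly match a reference tip name."""
    return set(reference_tips) & set(sequences.values())


# Hypothetical toy data: a 10 nt reference tip and a 14 nt query.
reference_tips = {"ACGTACGTAC"}
queries = {"asv1": "ACGTACGTACGGTT"}

print(shared_tips(reference_tips, queries))                      # no overlap
print(shared_tips(reference_tips, trim_to_length(queries, 10)))  # one match
```

An empty overlap of this kind corresponds to the situation in which the tree shear step reports "No requested tips found".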

I've opened an issue to improve the error information that arises from taxonomy-from-features (and -table), which should help in this situation.

All the best,
Daniel

Hi @wasade

That makes sense! It was not clear to me that the placed fragments are dominated by the shorter ASVs resulting from the Qiita workflow.

Best regards,
