q2_composition da_plot label limit

Hello,
I am using differential abundance plots to look at the difference between features with taxonomic assignment. The taxonomy labels in these plots are very long and get cut off from view, i.e., only a portion is shown, up to b__bacteria:p_Firmicutes;c__Bacilli... I found that raising the bar chart's axis label limit lets you view the entire label. Could this be implemented? It is done by adding .configure_axis(labelLimit=1000) to the end of alt.Chart(df).mark_bar().encode(...).

Thanks for posting @bvan-tassel! Would you be able to post an example of the issue that you're running into, along with the command that generated it? And are you running QIIME 2 2023.5? I'm asking because I thought we had this handled by the label summarization here.

I believe that solved a different problem. Here, we specifically want to view the entire taxonomic assignment in the y-labels. The hover tooltip also cuts off the full label, but I found that a more difficult problem to solve.

If I run the following commands -
qiime composition ancombc --i-table collapsed_table.qza --m-metadata-file merged_metadata_update.txt --p-formula Timepoint --p-reference-levels Timepoint::BASELINE --o-differentials differentials.qza

qiime composition da-barplot --i-data differentials.qza --o-visualization differentials
differentials.qzv (391.7 KB)

The y-labels will cut off the complete taxonomic classification. I modified the _diff_abundance_plots.py script manually -
bars = alt.Chart(df).mark_bar().encode(
    x=alt.X('lfc', title="Log Fold Change (LFC)"),
    y=shared_y,
    tooltip=alt.Tooltip(["feature", effect_size_label,
                         significance_label, error_label,
                         "error-lower", "error-upper"]),
    color=alt.Color('enriched', title="Relative to reference",
                    sort="descending")
).configure_axis(labelLimit=1000)
in order to get my desired result.
Based on some googling, I don't think it is possible to fix the label limit of the hover tooltip. But perhaps it would make sense to use the function that shortens the taxonomic assignment to its "most specific" value for the tooltip, and to keep the full assignment for the y-label. I could modify that function as well and send the code if we think that is a valid solution.

Hey @bvan-tassel,
Try running your command again with the --p-level-delimiter ';' option. That tells the visualizer that the feature identifiers contain hierarchical information that is delimited with a ;. That should improve the presentation of your results (both the y-labels and the tool-tip) so it looks like the attached file.

viz.qzv (280.9 KB)

This has the full label in the tooltip, and the "most specific label" as the y-label. Let me know if that addresses your needs, or if you still would like to see this addressed differently (in which case let's talk about your contribution idea).
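To illustrate what "most specific label" means here, the following is a rough sketch and not the actual q2-composition code. The function name most_specific_label and its handling of empty levels are assumptions made for this example. It only shows the idea of splitting a feature ID on a user-supplied delimiter and keeping the last non-empty level, while leaving the ID untouched when no delimiter is given.

```python
def most_specific_label(feature_id, delimiter=None):
    """Return the last non-empty level of a delimited feature ID.

    Hypothetical helper for illustration only. With no delimiter the ID is
    returned unchanged, because feature IDs are schema-less and no splitting
    rule can be assumed.
    """
    if delimiter is None:
        return feature_id
    # Split into levels and drop empty pieces (e.g. from a trailing delimiter).
    levels = [level.strip() for level in feature_id.split(delimiter)]
    levels = [level for level in levels if level]
    # Fall back to the original ID if nothing usable remains.
    return levels[-1] if levels else feature_id


full_id = "d__Bacteria;p__Firmicutes;c__Bacilli"
print(most_specific_label(full_id, delimiter=";"))  # short label for the y-axis
print(most_specific_label(full_id))                 # unchanged without a delimiter
```

In this sketch the full ID would still be what goes into the tooltip, and only the y-label uses the shortened form.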

Oh, this does stop the label from being cut off. I think we'd still like the option to show the full taxonomic annotation on the y-axis.

And to address your second question, I don't know that it would be worth it to have the hover tool tip label show the most specific taxonomic assignment.

Could the hover tooltip always replace the ";" character with spaces, so that the full name is visible without requiring the --p-level-delimiter ';' parameter? I see that this was addressed in the 2023.5 notes. Is this because some taxonomic assignments have a ";" in them? Is it unsafe to always substitute ";" with a space?

Basically, if a user does not provide the --p-level-delimiter option, I would want it to output the entire taxonomic name in the y-label, and it would be nice to still replace the ';' delimiter with spaces to prevent the labels from being cut off. Thanks @gregcaporaso!

@bvan-tassel, we recently had a lot of discussion about automatically handling the ;, and we decided against it. Ultimately, these are schema-less identifiers, not annotations, that are being presented. Since it's common that these identifiers are useful as annotations we added the support for splitting on semi-colons (rather than having to pass a separate FeatureData artifact) as a convenience for users. However, since the identifiers are schema-less, we can't assume that semicolons should be handled in any specific way, and we would want to support other delimiters that may show up. Hence the current implementation.

I agree that not cutting off the y-labels would improve the functionality in cases where users don't want to split (or it doesn't make sense to split) and have long feature ids. If you'd like to contribute functionality to support this, we'd be happy to have it. Rather than hard-coding a maximum length, I suggest making this value a parameter that defaults to the current behavior (e.g., users could provide it as an int, with a default value of None). If you want to submit this fix, you can tag me to review the pull request. Please include an example .qzv that lets us review how the output looks with the PR. We're prepping for a release now, and our pull request submission deadline just passed. If you submit this within the next day or two, it's possible that we could get it merged in time for the 2023.7 release since it's a pretty simple one, but I can't promise that.
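A minimal sketch of what I mean by an optional parameter, under some assumptions: the names label_limit, axis_config_kwargs, and build_bars are made up for this example, and the chart is reduced to a standalone bar chart using the 'lfc' and 'feature' columns rather than the real plotting code. The point is only that None leaves the existing behavior alone, and an int is passed through to configure_axis(labelLimit=...).

```python
def axis_config_kwargs(label_limit=None):
    """Translate a hypothetical label_limit parameter into axis config kwargs.

    None means "keep the current behavior", so nothing is configured.
    """
    if label_limit is None:
        return {}
    if label_limit < 0:
        raise ValueError("label_limit must be a non-negative number of pixels")
    return {"labelLimit": label_limit}


def build_bars(df, label_limit=None):
    """Build a simplified bar chart; not the actual da-barplot implementation."""
    # Imported here so the helper above can be used without Altair installed.
    import altair as alt

    chart = alt.Chart(df).mark_bar().encode(
        x=alt.X("lfc:Q", title="Log Fold Change (LFC)"),
        y=alt.Y("feature:N"),
    )
    kwargs = axis_config_kwargs(label_limit)
    # Only touch the axis configuration when the user asked for a limit.
    return chart.configure_axis(**kwargs) if kwargs else chart
```

With this shape, build_bars(df) would render as it does today, and build_bars(df, label_limit=1000) would reproduce the manual edit posted above.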

Let me know if you need any input.
