Colormap and Collapsed Data

Hi team!

I recently had an awesome interaction with Lisa Karstens, the developer of microshades. It's a really beautiful, colorblind-friendly palette with up to 30 (!!!!) shades. The individual color palettes are also all kinds of pretty and I'm lowkey delighted.

Lisa and I agreed that we'd love to bring microshades into QIIME 2, but that it would work best if there was a way to collapse low-abundance data before it goes into the barplot so the colormap doesn't repeat. (Have I given my barplot rant here :bar_chart:? I have a barplot rant.) Is there a way that the collapse could be built into the current functionality? Or is it better to try and write a new plugin and hope to avoid the automation pitfall?

Thanks,
Justine

Hi @jwdebelius !

These palettes look great!

The repeating-colormap issue is the same with all the existing palettes, so I would be in favor of adding the palettes first, then attacking the collapsing issue second.

I'd recommend sticking with the existing action instead of writing a new plugin, to leverage the interactivity and other features. I don't think that starting fresh will necessarily save time here...

In theory, yes, building the collapse into the current functionality should be possible, and there are a few related open issues in q2-taxa (which is to say that if you want to tackle this, our children's children will sing your praises).

An "easy" way to approach this (compared to the hard way of getting vega to do the collapsing interactively inside the visualization) would be to expose an option in the action to collapse low-abundance taxa before visualizing. The action already collapses the input feature table at each taxonomic level; such an option could further collapse these tables before saving them as CSV and passing them to the visualization.

An enhancement on that would be to save separate CSVs (collapsed and uncollapsed tables at each taxonomic level), then add an option in the visualization to choose which table is displayed. The alternative, dynamically collapsing the table inside the visualization, would be really awesome, since you could do things like interactively select the abundance threshold for collapsing, but it is probably a lot of work to implement.

Just exposing the option would be a low-tech way to accomplish this, and much less work than writing a new visualization. Curious to hear what you and others think!

Hi @Nicholas_Bokulich,

Thanks! When I spoke with Lisa, the collapsing/repeating issue was a large part of her hesitance to suggest we incorporate this into QIIME 2. I think it's important to impose collapsing (probably not interactive for now) and either link the collapsing to the colormap (i.e., you can't show more colors than exist in your collapsed table) or cap the number of colors based on the limitations of colormaps.

Best,
Justine

As the colormap can be selected interactively in barplot, capping/collapsing would need to be dynamic and require a bit of work with vega, but could be done.

A lower-tech way, consistent with what I proposed, would be for the threshold parameter to accept either an int or a float:

  • if int: collapse the table so that only the top N most abundant features are preserved, and the rest are summed as "Other".
  • if float: collapse the table so that all features with mean abundance >= threshold are preserved, and the rest are summed as "Other".
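
The two cases above could be sketched roughly like this in plain Python. This is only an illustration, not q2-taxa code: the function name `collapse_low_abundance`, the dict-of-dicts table layout, and the choice to rank the "top N" by mean abundance across samples are all assumptions made for the example.

```python
from typing import Dict, Union

# Hypothetical table layout: sample ID -> {feature ID -> relative abundance}
Table = Dict[str, Dict[str, float]]


def collapse_low_abundance(table: Table, threshold: Union[int, float],
                           other_label: str = "Other") -> Table:
    """Keep the abundant features and sum everything else into `other_label`."""
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        raise TypeError("threshold must be an int or a float")
    if threshold < 0:
        raise ValueError("threshold must be non-negative")

    features = sorted({f for counts in table.values() for f in counts})
    n_samples = len(table)
    # Mean abundance across samples; a feature absent from a sample counts as 0.
    mean_abundance = {
        f: sum(counts.get(f, 0.0) for counts in table.values()) / n_samples
        for f in features
    }

    if isinstance(threshold, int):
        # int: keep the top N features (ranked by mean abundance, an assumption)
        ranked = sorted(features, key=lambda f: (-mean_abundance[f], f))
        keep = set(ranked[:threshold])
    else:
        # float: keep every feature whose mean abundance is >= threshold
        keep = {f for f in features if mean_abundance[f] >= threshold}

    collapsed: Table = {}
    for sample, counts in table.items():
        row = {f: counts.get(f, 0.0) for f in features if f in keep}
        # Only add the "Other" column when something was actually collapsed.
        if len(keep) < len(features):
            row[other_label] = sum(v for f, v in counts.items() if f not in keep)
        collapsed[sample] = row
    return collapsed


example = {
    "s1": {"A": 0.6, "B": 0.3, "C": 0.1},
    "s2": {"A": 0.5, "B": 0.25, "C": 0.25},
}
print(collapse_low_abundance(example, 2))    # top 2 features + "Other"
print(collapse_low_abundance(example, 0.2))  # mean abundance >= 0.2 + "Other"
```

In the real action this would run on each taxonomic-level table just before it is written to CSV, so the visualization itself would not need to change.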

Then a user could just select based on the number of colors in the planned colormap (and this could even be listed in the documentation, e.g., that we recommend capping at N=12 or 20 or whatever is typical for most colormaps).
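
For picking N from the colormap, here is a tiny sketch of the arithmetic. The helper names are hypothetical, and reserving one color for "Other" is an assumption about how the collapsed category would be drawn.

```python
def top_n_for_palette(n_colors: int, reserve_other: bool = True) -> int:
    """Largest top-N threshold that fits the palette without repeating colors."""
    if n_colors < 1:
        raise ValueError("a palette needs at least one color")
    # Assumption: the collapsed "Other" category takes one color of its own.
    return n_colors - 1 if reserve_other else n_colors


def colormap_repeats(n_categories: int, n_colors: int) -> bool:
    """True if the plot has more categories than the palette has colors."""
    return n_categories > n_colors


n = top_n_for_palette(12)
print(n, colormap_repeats(n + 1, 12))
```

So for a 12-color colormap, a user would pass an int threshold of 11 and the twelfth color would go to "Other".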

Low-tech but maybe that satisfies what you and Lisa are after :man_shrugging:

A ggplot2 wrapper as a plugin would be an even easier, though static, way to integrate microshades...