Add FIGARO tool to optimize trimming/truncating parameters in DADA2

SEBASTIAN_VERA_SANDO · October 26, 2020, 8:58pm

Hello everyone.

I am a new QIIME 2 user, and first of all I want to say that I am in awe of this super-powerful tool for microbiota analysis.
As I am new, I am still exploring and trying to internalize the concepts and parameters associated with the DADA2 plugin. While looking through a DADA2 tutorial in the R environment, I found that for the sequence trimming/truncating stage the author suggests a tool called FIGARO (Weinstein et al., 2019). It optimizes these parameters by suggesting a trimming site for the forward and reverse reads that minimizes the expected error in both directions while still preserving the expected percentage of reads (Figure 1).

Figure 1. Panel A: Fitting an exponential regression to the 83rd percentile for cumulative expected error values across multiple samples from a single sequencing experiment on a MiSeq. The high (>0.99) r² value in both directions is representative of what was often observed with this model. Panel B: A plot showing the percent read retention, trimming site scores, and forward and reverse expected error allowances for a set of 16S rRNA gene sequences covering the V3 and V4 regions generated on a MiSeq. The vertical dashed line represents the trimming site recommended by FIGARO, providing minimal expected error allowances in both directions while still preserving the expected percentage of reads (figure and description from Weinstein et al., 2019).
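
To make the idea in Panel A concrete, here is a minimal sketch of the general approach: compute each read's cumulative expected error from its Phred scores (p = 10^(-Q/10)), take the 83rd percentile at each position, and fit an exponential curve to those values. This is not FIGARO's code. The function names, the nearest-rank percentile method, the log-linear least-squares fit, and the toy quality scores are all assumptions made for illustration.

```python
import math
from itertools import accumulate


def cumulative_expected_error(phred_scores):
    """Running sum of per-base error probabilities, p = 10^(-Q/10)."""
    return list(accumulate(10 ** (-q / 10) for q in phred_scores))


def percentile(values, pct):
    """Nearest-rank percentile (an illustrative choice of method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by ordinary least squares on log(y)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


# Toy Phred scores for three short reads (made-up values, not real data).
reads = [
    [38, 38, 37, 35, 30, 25, 20, 15],
    [38, 37, 36, 34, 28, 22, 18, 12],
    [39, 38, 38, 36, 32, 27, 21, 16],
]

curves = [cumulative_expected_error(r) for r in reads]
positions = list(range(1, len(reads[0]) + 1))

# 83rd percentile of cumulative expected error at each read position.
p83 = [percentile([c[i] for c in curves], 83) for i in range(len(positions))]

a, b = fit_exponential(positions, p83)
print(f"fitted curve: EE(x) = {a:.4g} * exp({b:.4g} * x)")
```

With a fitted curve for each direction, one can estimate the expected error allowance that a candidate truncation position would need in order to retain a given share of reads, which is the quantity Panel B plots against trimming site.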

I suggest implementing this tool in :qiime2:, since it could help in two ways: 1. it would make parameter selection more objective, and 2. it would save the time currently spent adjusting these parameters by trial and error.
Greetings to all.

Nicholas_Bokulich · October 28, 2020, 8:14am

Welcome to the forum and :qiime2: universe @SEBASTIAN_VERA_SANDO! And thank you for the kind words, enthusiasm, and feedback.

Figaro has been on our radar for some time... it has been mentioned a few times on this forum, and I agree it could make a useful plugin. In the near term, the best way to see it wrapped in a QIIME 2 plugin would be to convince the developers of Figaro to create such a plugin! (As far as I know this idea has already been recommended to them, so it may be in the works.) The next best way would be for others in the community to create such a plugin. Either way, there is ample support on this forum for new plugin developers.

Mehrbod_Estaki · October 28, 2020, 8:59am

Hi @SEBASTIAN_VERA_SANDO,

Welcome to the forum! Agreed that the FIGARO tool would be cool to wrap into QIIME 2, but as @Nicholas_Bokulich mentioned, that is entirely up to the developer of that tool. I have had a few conversations with the main developer, Michael Weinstein, who has mentioned that a QIIME 2 plugin is on his radar; however, I'm not sure of the progress. I'll shoot him an email and update here if I hear anything new.
In the meantime you can use their stand-alone tool. Any feedback on it would be much appreciated by both the community and the developer.

SEBASTIAN_VERA_SANDO · October 28, 2020, 7:33pm

Thank you so much for your answers, @Nicholas_Bokulich & @Mehrbod_Estaki.

I am glad to hear that there are indications that FIGARO will join :qiime2: at some point. For the moment, as @Mehrbod_Estaki pointed out, I will try to use it independently and compare the denoising results with those from “subjectively” chosen sites.

michael-weinstein · November 3, 2020, 6:54pm

Apologies for the delay on this. I do have two major improvements in FIGARO in the works right now. One of them will be to make it able to take reads of slightly varying length. This is due to the popularity of library prep methods that cause staggering in the read starts. I am also working on understanding how to roll this into a QIIME2 wrapper, as you mention. I have put in some development towards both of those goals recently, but I have had a shift in my focus for the last couple months with both UCLA and Zymo Research to helping support the COVID-19 response (my two biggest projects have been helping the state of CA fix some of their reporting interface problems and monitoring of wastewater to detect and monitor the virus).
These improvements will be coming in the form of FIGARO 2.0, and as I get more of these COVID-19 response tasks resolved, I'll have more time for its development. And I do apologize for the delay in this development.

jolespin · August 19, 2024, 6:40pm

Just checking in to see if there is any progress on development here?

lizgehret · August 26, 2024, 6:25pm

Hi @jolespin,

To the best of my knowledge (based on the FIGARO GitHub repository), it doesn't seem like there has been any development since 2020. @michael-weinstein, can you speak to any updates you may have made elsewhere, or any additional development on your end?