Minimum read for denoising process

devonorourke · February 6, 2019, 4:26pm

Hopefully a quick question:
From what I can gather from posts @wasade has commented on (e.g., here), Deblur has parameters to define how many reads are required to be considered going into filtering (--p-min-size) and how many reads are required following filtering to be included in the resulting frequency table (--p-min-reads).
I can't find any such information on DADA2; I'm specifically interested in the paired-end approach, if it makes a difference. I'm wondering (1) whether DADA2 considers reads of any abundance when building its error models, and (2) whether, following the paired-end joining, it removes any reads that pass its filtering but have fewer than N total merged read pairs. Perhaps @benjjneb can help.
Thanks very much!

benjjneb · February 6, 2019, 9:50pm

(1) Yes.

(2) No. But note that there is an effective threshold that removes all singletons, due to the division criteria in the core algorithm. There is no additional threshold enforced later; that is left up to the user if they so desire.
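For readers who do want to apply their own threshold after denoising, the following is a minimal sketch in plain Python of what such a filter does. It is only an illustration, not part of DADA2 or q2-dada2: the function name `filter_min_total_count`, the dictionary layout of the table, and the toy counts are all assumptions made for this example. In QIIME 2 itself, this kind of filtering is normally done on the feature table with `qiime feature-table filter-features` and its `--p-min-frequency` option.

```python
def filter_min_total_count(table, min_count):
    """Keep features whose summed count across all samples is >= min_count.

    `table` maps feature ID -> {sample ID -> read count}.
    This layout is a hypothetical stand-in for a real feature table.
    """
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(counts.values()) >= min_count
    }


# Toy data, made up for illustration only.
table = {
    "ASV_1": {"sampleA": 120, "sampleB": 98},
    "ASV_2": {"sampleA": 2, "sampleB": 0},   # total of 2 (a doubleton)
    "ASV_3": {"sampleA": 1, "sampleB": 3},   # total of 4
}

# Drop every feature with fewer than 3 reads in total.
filtered = filter_min_total_count(table, min_count=3)
print(sorted(filtered))  # ['ASV_1', 'ASV_3']
```

The threshold here is applied to the total count across all samples; a per-sample threshold would be a different design choice and would need a different test inside the comprehension.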

devonorourke · February 6, 2019, 10:18pm

Thanks @benjjneb,

(1) I should have been more specific. Deblur has a parameter that makes it ignore reads below a user-defined (or default) value; I was wondering whether DADA2 has a specific minimum value, or whether all read abundances are considered.

(2) This threshold... it removes singletons? But it wouldn't remove doubletons then?

Cheers!

benjjneb · February 7, 2019, 2:53pm

(1) The q2-dada2 plugin does not have such a parameter.

(2) Yes, although doubletons will still probably be diminished somewhat in paired-end data, as they must be detected in both the forward and reverse reads independently.

devonorourke · February 7, 2019, 3:16pm

Thanks @benjjneb - very helpful. Appreciate the quick response!