Subsample - option to subsample samples in a feature-table by a metadata category

pending-development
feature-table

(Justin Shaffer) #1

Hello,

When making group comparisons, I have often found it useful to randomly subsample a feature-table so that each group of interest (e.g., each state of a metadata variable) has a similar number of samples.

It would be useful to have an argument to ‘feature-table subsample’ that would subsample within states of a desired metadata variable, or intersections between states of multiple metadata variables, to a desired number of samples (x).

A useful default would be to subsample to the number of samples in the state (or intersection among states) with the fewest samples (y). If x exceeds y, another useful option would be an argument to include or exclude the samples from states whose sample size is less than x.
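To make the request concrete, here is a minimal sketch of the behavior in pandas. It is not an existing QIIME 2 option: the function name `stratified_sample_ids`, its parameters, and the assumption that the metadata is a DataFrame indexed by sample ID are all hypothetical.

```python
import pandas as pd


def stratified_sample_ids(metadata, columns, n=None, drop_small=True, seed=0):
    """Pick sample IDs so that each group contributes n samples.

    Hypothetical helper, not part of QIIME 2.
    metadata:   DataFrame indexed by sample ID (assumed layout).
    columns:    metadata column names; several columns give intersections.
    n:          samples per group (x); defaults to the smallest group size (y).
    drop_small: if True, exclude groups with fewer than n samples;
                if False, keep all samples from those groups.
    """
    # Samples with a missing value in any grouping column are dropped by groupby.
    groups = metadata.groupby(list(columns), observed=True)
    if n is None:
        n = int(groups.size().min())

    kept = []
    for _, group in groups:
        if len(group) >= n:
            # Random draw without replacement within this group.
            kept.extend(group.sample(n=n, random_state=seed).index)
        elif not drop_small:
            kept.extend(group.index)
    return sorted(kept)


# Toy metadata: 5 gut, 3 skin, and 2 oral samples.
metadata = pd.DataFrame(
    {"site": ["gut"] * 5 + ["skin"] * 3 + ["oral"] * 2},
    index=[f"s{i}" for i in range(10)],
)

# Default: every site is subsampled to the smallest group size.
balanced_ids = stratified_sample_ids(metadata, ["site"])
```

The returned IDs could then be used to filter the feature-table down to the selected samples, for example by writing them to a metadata file for feature-table filter-samples.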

Best wishes,

Justin


(Nicholas Bokulich) #2

Thanks @Lichen! You can use feature-table subsample to subsample, though you cannot stratify on a specific metadata variable. I have opened this feature request to add that eventually — contributions are always welcome to QIIME 2 and its plugins!