We are proud to (belatedly) announce the release of the q2-SCRuB plugin!
SCRuB is a tool designed to help researchers address the common issue of contamination in microbial studies.
Principal concept
SCRuB is a probabilistic in silico decontamination method that incorporates shared information across multiple samples and controls to precisely identify and remove contamination. It models each sample of interest as a mixture of contamination and non-contamination (“biological”/“real”) sources, and each control sample as a noisy realization of a latent contamination source. It further uses the spatial location of a sample during processing (for example, location on a 96-well plate) to account for leakage of non-control samples into controls. A detailed description can be found here and here.
Usage
This package provides an easy-to-use framework for applying SCRuB to your projects. All you need to get started are count matrices (n samples x m taxa) for both your samples and your controls. The locations of samples and controls during processing are optional but recommended. To begin, we recommend working through SCRuB's documentation pages. The documentation includes installation steps on the homepage and examples of qiime commands with SCRuB. We also provide the key plugin details below.
Tutorial
Installation
conda activate qiime2-2023.5
conda install -c conda-forge r-devtools
Rscript -e 'devtools::install_github("shenhav-and-korem-labs/SCRuB"); torch::install_torch()'
pip install git+https://github.com/Shenhav-and-Korem-labs/q2-SCRuB.git
Example data
In this tutorial we use SCRuB to decontaminate a plasma dataset from Poore et al. First, we make a tutorial directory and download the feature table and metadata to the plasma-data directory:
mkdir SCRuB-example
mkdir SCRuB-example/plasma-data
mkdir SCRuB-example/results
cd SCRuB-example/plasma-data
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/table.qza
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/metadata.tsv
cd ..
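Before moving on, it can be worth confirming that the downloaded metadata contains the columns the tutorial relies on (is_control, sample_type, and well_id). The snippet below is a minimal sketch, not part of q2-SCRuB: check_metadata_columns is a hypothetical helper, and it assumes the metadata is tab-separated with the header on its first line.

```shell
# Hypothetical helper (not part of q2-SCRuB): report any requested column
# names that are absent from the header row of a tab-separated file.
check_metadata_columns() {
    local file header missing col
    file="$1"
    shift
    missing=0
    if [ ! -r "$file" ]; then
        echo "cannot read metadata file: $file" >&2
        return 2
    fi
    # Take the first line as the header and drop any carriage return.
    header="$(head -n 1 "$file" | tr -d '\r')"
    for col in "$@"; do
        # Put each header field on its own line, then match the whole line exactly.
        if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qxF -- "$col"; then
            echo "missing column: $col" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run from the SCRuB-example directory.
check_metadata_columns plasma-data/metadata.tsv is_control sample_type well_id \
    && echo "all expected columns found"
```

If a column is reported missing, the corresponding --p-...-column value passed to the plugin would need to match whatever name the metadata actually uses.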
Decontaminating
To run SCRuB we only need a single command. In this tutorial our control_idx_column parameter is is_control, our sample_type_column is sample_type, and our well_location_column is well_id. Now we are ready to SCRuB away the contamination.
qiime SCRuB SCRuB \
--i-table plasma-data/table.qza \
--m-metadata-file plasma-data/metadata.tsv \
--p-control-idx-column is_control \
--p-sample-type-column sample_type \
--p-well-location-column well_id \
--p-control-order "control blank library prep,control blank DNA extraction" \
--o-scrubbed results/scrubbed.qza
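Once the command above finishes, the resulting artifact can be inspected with standard QIIME 2 tooling. The following is a sketch under stated assumptions: inspect_artifact is a hypothetical wrapper, the export directory name results/scrubbed-export is arbitrary, and the semantic type of the output is whatever qiime tools peek reports (it is not assumed here).

```shell
# Hypothetical wrapper (not part of q2-SCRuB) around two standard QIIME 2 commands.
inspect_artifact() {
    local artifact outdir
    artifact="$1"
    outdir="$2"
    if [ ! -f "$artifact" ]; then
        echo "artifact not found: $artifact" >&2
        return 1
    fi
    if ! command -v qiime >/dev/null 2>&1; then
        echo "qiime not on PATH; activate the QIIME 2 environment first" >&2
        return 1
    fi
    # Print the artifact's UUID, semantic type, and data format.
    qiime tools peek "$artifact"
    # Unpack the underlying data files into a plain directory.
    qiime tools export --input-path "$artifact" --output-path "$outdir"
}

# Run from the SCRuB-example directory with the QIIME 2 environment active.
inspect_artifact results/scrubbed.qza results/scrubbed-export \
    || echo "inspection skipped" >&2
```

Checking the type with peek first is a cheap way to confirm what kind of data the export step will produce before passing the table to downstream analyses.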
Outputs of the tutorial can be found here.
Extended tutorial
An extended version of this tutorial can be found in our documentation pages.
Issue reporting
Please share any issues or feature requests in our GitHub repo's issues page.
Citation
If you use this tool, please cite:
Austin, G.I., Park, H., Meydan, Y. et al. Contamination source modeling with SCRuB improves cancer phenotype prediction from microbiome data. Nat Biotechnol (2023). https://doi.org/10.1038/s41587-023-01696-w