We are proud to (belatedly) announce the release of the q2-SCRuB plugin!
SCRuB is a tool designed to help researchers address the common issue of contamination in microbial studies.
Principal concept
SCRuB is a probabilistic in silico decontamination method that incorporates shared information across multiple samples and controls to precisely identify and remove contamination. It models each sample of interest as a mixture of contamination and non-contamination (“biological”/“real”) sources, and each control sample as a noisy realization of a latent contamination source. It further uses the spatial location of a sample during processing (for example, location on a 96-well plate) to account for leakage of non-control samples into controls. A detailed description can be found here and here.
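The mixture idea described above can be written down compactly. The following is a simplified sketch in notation introduced here purely for illustration (it is not copied from the SCRuB paper). It assumes multinomial sampling noise and leaves out the well-to-well leakage term:

```latex
% Illustrative notation (assumed for this sketch, not the authors' own):
%   x_i    : observed taxa counts of sample i, with N_i total reads
%   y_k    : observed taxa counts of control k, with M_k total reads
%   b_i    : biological ("real") taxa composition of sample i
%   \gamma : latent contamination source shared by samples and controls
%   c_i    : fraction of sample i that is contamination, 0 <= c_i <= 1
\begin{align}
x_i &\sim \mathrm{Multinomial}\bigl(N_i,\; (1 - c_i)\, b_i + c_i\, \gamma\bigr), \\
y_k &\sim \mathrm{Multinomial}\bigl(M_k,\; \gamma\bigr).
\end{align}
```

Under this view, decontamination amounts to estimating b_i and c_i for each sample, with the controls informing the estimate of the shared contamination source. When well locations are supplied, the control model additionally allows for a contribution from non-control samples leaking into the control.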
Usage
This package provides an easy-to-use framework for applying SCRuB to your projects. All you need to get started are count matrices (n samples x m taxa) for both your samples and controls. The locations of samples and controls during processing are optional but recommended. To begin, we recommend working through SCRuB's documentation pages. The documentation includes installation steps on the homepage and examples of QIIME 2 commands with SCRuB. We also provide the key plugin details below.
conda activate qiime2-2023.5
conda install -c conda-forge r-devtools
Rscript -e 'devtools::install_github("shenhav-and-korem-labs/SCRuB"); torch::install_torch()'
pip install git+https://github.com/Shenhav-and-Korem-labs/q2-SCRuB.git
In this tutorial we use SCRuB to decontaminate a plasma dataset from Poore et al. The two input files, the table (table.qza) and the metadata (metadata.tsv), are fetched by the download commands below.
First, we make a tutorial directory and download the data specified above to the plasma-data directory.
mkdir SCRuB-example
mkdir SCRuB-example/plasma-data
mkdir SCRuB-example/results
cd SCRuB-example/plasma-data
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/table.qza
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/metadata.tsv
cd ..
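Before running SCRuB, it can be worth confirming that the metadata file actually contains the columns that will be passed to the plugin. The sketch below is one way to do that with standard shell tools. The helper name check_metadata_columns is hypothetical (it is not part of q2-SCRuB), and the toy file with its values exists only so the example runs without the download. The column names is_control, sample_type and well_id are the ones this tutorial uses.

```shell
# Hypothetical helper (not part of q2-SCRuB): succeeds only if every
# requested column name appears in the header row of a tab-separated file.
check_metadata_columns() {
    file=$1
    shift
    status=0
    for col in "$@"; do
        # Split the header on tabs, drop any carriage return, and look for
        # an exact, whole-field match of the column name.
        if ! head -n 1 "$file" | tr -d '\r' | tr '\t' '\n' | grep -Fxq -- "$col"; then
            echo "missing column: $col" >&2
            status=1
        fi
    done
    return "$status"
}

# Toy metadata file with made-up values, standing in for the real download.
demo=$(mktemp)
printf 'sample-id\tis_control\tsample_type\twell_id\n' > "$demo"
printf 'example-sample\tplaceholder\tplaceholder\tplaceholder\n' >> "$demo"

check_metadata_columns "$demo" is_control sample_type well_id \
    && echo "all columns present"
rm -f "$demo"
```

With the tutorial data in place, the same check would be run from the SCRuB-example directory as check_metadata_columns plasma-data/metadata.tsv is_control sample_type well_id.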
To run SCRuB we only need a single command. In this tutorial our control_idx_column parameter is is_control, our sample_type_column is sample_type, and our well_location_column is well_id. Now we are ready to SCRuB away the contamination.
qiime SCRuB SCRuB \
  --i-table plasma-data/table.qza \
  --m-metadata-file plasma-data/metadata.tsv \
  --p-control-idx-column is_control \
  --p-sample-type-column sample_type \
  --p-well-location-column well_id \
  --p-control-order "control blank library prep,control blank DNA extraction" \
  --o-scrubbed results/scrubbed.qza
Outputs of the tutorial can be found here.
Extended tutorial
An extended version of this tutorial can be found in our documentation pages.
Issue reporting
Please share any issues or feature requests on our GitHub repo's issues page.
Citation
If you use this tool, please cite:
Austin, G.I., Park, H., Meydan, Y. et al. Contamination source modeling with SCRuB improves cancer phenotype prediction from microbiome data. Nat Biotechnol (2023). https://doi.org/10.1038/s41587-023-01696-w