Hello all,
With the release of Qiime2-shotgun and the shotgun metagenomics tutorial, I'm giving a quick writeup/tutorial on how to import precompiled kraken2 and bracken databases.
The first step is to download the appropriate database. My metagenomes are from environmental biofilms and have representatives from across the tree of life. However, the full nucleotide database was too large for my computer (and I have tried!). For this tutorial I'll use the 'PlusPFP with DB capped at 16 GB' database as a compromise.
wget -c https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_16gb_20231009.tar.gz
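One way to confirm the download, as a minimal sketch: test the gzip stream and list the tar contents without extracting anything. The function name `check_archive` is hypothetical, not part of any tool. This only detects a truncated or corrupted archive; it does not prove the file matches what the server published.

```shell
# Hypothetical helper: returns 0 only if the gzip stream and the tar listing both read cleanly.
check_archive() {
    archive="$1"
    # gzip -t tests the integrity of the compressed data without extracting it
    gzip -t "$archive" || return 1
    # tar -tzf lists the archive members; a truncated archive makes this fail
    tar -tzf "$archive" > /dev/null || return 1
}

# Usage (file name taken from the wget command above):
# check_archive k2_pluspfp_16gb_20231009.tar.gz && echo "archive looks intact"
```

Because `wget -c` resumes partial downloads, rerunning the same wget command is a reasonable next step if the check fails.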
Once you've confirmed that it has downloaded correctly, it's time to extract!
tar -xf k2_pluspfp_16gb_20231009.tar.gz
So that everyone can copy and paste at home, even if they have used a different database, I'll rename the folder and move into it, then separate out the components needed for the kraken and bracken databases.
mv k2_pluspfp_16gb_20231009 k2
cd k2
touch k2_pluspfp_16gb_20231009.txt # set this to whatever your database is so there is some note somewhere of where the data came from!
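If you would rather have the note file record something than sit empty, a small sketch like the following could write the source URL and today's date into it. The function name `write_db_note` and the two field labels are my own assumptions, and the date is simply the day the command is run.

```shell
# Hypothetical helper: write a two-line provenance note instead of leaving the file empty.
write_db_note() {
    note_file="$1"
    source_url="$2"
    {
        printf 'source: %s\n' "$source_url"
        # date of running this command, not a date reported by the server
        printf 'noted on: %s\n' "$(date +%Y-%m-%d)"
    } > "$note_file"
}

# Usage:
# write_db_note k2_pluspfp_16gb_20231009.txt \
#   https://genome-idx.s3.amazonaws.com/kraken/k2_pluspfp_16gb_20231009.tar.gz
```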
For the kraken2 database we will need only the 3 *.k2d files, and for the bracken database only the *.kmer_distrib files.
mkdir krakendb
mv *.k2d krakendb/
mkdir brackendb
mv *.kmer_distrib brackendb/
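Before importing, it may be worth checking that the files landed where expected. This is a rough sketch that counts matching files in a folder; `count_files` is a hypothetical name, and it assumes a `find` that supports `-maxdepth` (GNU and BSD versions do).

```shell
# Hypothetical helper: count regular files directly inside a folder that match a glob pattern.
count_files() {
    dir="$1"
    pattern="$2"
    # tr strips the padding some wc implementations add
    find "$dir" -maxdepth 1 -type f -name "$pattern" | wc -l | tr -d ' '
}

# Usage, run from inside the k2 folder:
# count_files krakendb '*.k2d'             # the post expects 3 of these
# count_files brackendb '*.kmer_distrib'   # should not be 0
```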
Finally we can activate the shotgun qiime environment and import the data ready to be used in our analysis.
conda activate qiime2-shotgun-2023.9
qiime tools import \
--type 'Kraken2DB' \
--input-path krakendb \
--output-path kraken2.qza #I would recommend changing this to whatever your database is
qiime tools import \
--type 'BrackenDB' \
--input-path brackendb \
--output-path bracken.qza #I would recommend changing this to whatever your database is
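On naming the outputs after your database: one possible approach, sketched here, is to derive the artifact name from the archive name with shell parameter expansion. The function `artifact_name` and the `<database>_<kind>.qza` pattern are just my suggestion, not a QIIME 2 convention.

```shell
# Hypothetical helper: build an artifact file name from the downloaded archive name.
artifact_name() {
    archive="$1"   # e.g. k2_pluspfp_16gb_20231009.tar.gz
    kind="$2"      # e.g. kraken2 or bracken
    base="${archive%.tar.gz}"   # strip the .tar.gz suffix
    printf '%s_%s.qza\n' "$base" "$kind"
}

# Usage with the import command from this post:
# qiime tools import \
#   --type 'Kraken2DB' \
#   --input-path krakendb \
#   --output-path "$(artifact_name k2_pluspfp_16gb_20231009.tar.gz kraken2)"
```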
Once completed, you should be able to delete the krakendb and brackendb folders and their contents, as they have been zipped into the qza files.
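If you want a small safety net for the cleanup, a sketch like this removes a folder only when the matching artifact exists and is not empty. `remove_if_imported` is a hypothetical helper, and a non-empty file is only a weak check, not proof that the import was valid.

```shell
# Hypothetical helper: delete a source folder only if its artifact exists and is non-empty.
remove_if_imported() {
    artifact="$1"
    folder="$2"
    if [ -s "$artifact" ]; then
        rm -r -- "$folder"
    else
        echo "skipping $folder: $artifact is missing or empty" >&2
        return 1
    fi
}

# Usage, with the file and folder names from this post:
# remove_if_imported kraken2.qza krakendb
# remove_if_imported bracken.qza brackendb
```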
Hope this is helpful!
Jono
P.S. I am just testing this for typos, so give me 30 minutes to get it checked!