Hi @Jibda! Here are the general steps you should take to create a closed-reference feature table in QIIME 2 for use with PICRUSt:
1. PICRUSt expects you to perform closed-reference OTU picking using the exact version of Greengenes that PICRUSt was trained against. There are ways to retrain PICRUSt using different reference databases, but that’s outside the scope of what I can help you with here; see the PICRUSt retraining docs for details.

   Since PICRUSt was trained against Greengenes version `13_5` (instead of `13_8`, which is used in many of the QIIME 2 tutorials), you’ll need to download that reference database from the PICRUSt docs. Once you have the reference database FASTA files, import the reference sequences into a QIIME 2 artifact with type `FeatureData[Sequence]`. Use this artifact when performing closed-reference OTU picking with q2-vsearch (see Step 2 below). When choosing which reference sequences to import, you’ll probably want to use the 90% or 97% identity Greengenes reference sequences.

2. Follow the steps in the q2-vsearch OTU picking Community Tutorial to create a closed-reference feature table. The feature table’s feature IDs will correspond to Greengenes OTU IDs.

   Note: the tutorial uses Greengenes version `13_8` 85% identity reference sequences. Don’t use these reference sequences with your own closed-reference OTU picking analyses. Use the reference sequences you imported in Step 1 instead. Also make sure that the value you pass to the `--p-perc-identity` option of `qiime vsearch cluster-features-closed-reference` matches the percent identity of your reference sequences.

3. Once you have a closed-reference feature table, export the feature table to obtain a `.biom` file.

4. Now that you have a `.biom` file containing PICRUSt-compatible Greengenes IDs, you can add the corresponding Greengenes taxonomic annotations using `biom add-metadata`. There is no need to import the Greengenes taxonomic annotations into QIIME 2 to perform taxonomy assignment; you can use the taxonomic annotations that are distributed with the Greengenes reference database you downloaded in Step 1.
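
For reference, here is a rough sketch of what the four steps above might look like on the command line. Treat it as a starting point rather than a tested pipeline: every file name and path in it (`gg_13_5_otus/...`, `table.qza`, `rep-seqs.qza`, and the output names) is a placeholder I picked for illustration, and it assumes you chose the 97% reference sequences. The `qiime tools export` syntax shown is the one used by recent QIIME 2 releases; older releases used a different form, so check `qiime tools export --help` for your version. The `run` helper is just a convenience I added: by default it only prints each command so you can check the paths, and it executes them when you set `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# Sketch of Steps 1-4. All paths and file names are placeholders (assumptions);
# point them at your own data and at the Greengenes 13_5 files you downloaded.
set -euo pipefail

GG_SEQS="gg_13_5_otus/rep_set/97_otus.fasta"             # assumed location
GG_TAXONOMY="gg_13_5_otus/taxonomy/97_otu_taxonomy.txt"  # assumed location
PERC_IDENTITY="0.97"     # must match the reference set above (97% here)
TABLE="table.qza"        # your FeatureTable[Frequency] artifact (assumed name)
REP_SEQS="rep-seqs.qza"  # your FeatureData[Sequence] artifact (assumed name)

# Print each command; only execute it when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Step 1: import the Greengenes 13_5 reference sequences.
run qiime tools import \
  --input-path "$GG_SEQS" \
  --output-path gg-13-5-97-otus.qza \
  --type 'FeatureData[Sequence]'

# Step 2: closed-reference OTU picking against the imported reference.
run qiime vsearch cluster-features-closed-reference \
  --i-table "$TABLE" \
  --i-sequences "$REP_SEQS" \
  --i-reference-sequences gg-13-5-97-otus.qza \
  --p-perc-identity "$PERC_IDENTITY" \
  --o-clustered-table table-cr-97.qza \
  --o-clustered-sequences rep-seqs-cr-97.qza \
  --o-unmatched-sequences unmatched-cr-97.qza

# Step 3: export the feature table; the .biom file is written to
# exported-table/feature-table.biom.
run qiime tools export \
  --input-path table-cr-97.qza \
  --output-path exported-table

# Step 4: attach the Greengenes taxonomy as observation metadata.
# The Greengenes taxonomy file has no header row, so column names are given here.
run biom add-metadata \
  -i exported-table/feature-table.biom \
  -o table-cr-97-with-taxonomy.biom \
  --observation-metadata-fp "$GG_TAXONOMY" \
  --observation-header OTUID,taxonomy \
  --sc-separated taxonomy
```

If you go with a different identity level, change both the reference files and `PERC_IDENTITY` together so they stay in sync.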
Let us know how it goes!