How to create a feature table with QIIME 2 for PICRUSt with the taxonomic assignment?

Hi @Jibda! Here are the general steps you should take to create a closed-reference feature table in QIIME 2 for use with PICRUSt:

  1. PICRUSt expects you to perform closed-reference OTU picking using the exact version of Greengenes that PICRUSt was trained against. There are ways to retrain PICRUSt using different reference databases, but that’s outside the scope of what I can help you with here; see the PICRUSt retraining docs for details.

    Since PICRUSt was trained against Greengenes version 13_5 (instead of 13_8, which is used in many of the QIIME 2 tutorials), you’ll need to download that reference database from the PICRUSt docs. Once you have the reference database FASTA files, import the reference sequences into a QIIME 2 artifact with type FeatureData[Sequence]. Use this artifact when performing closed-reference OTU picking with q2-vsearch (see Step 2 below). When choosing which reference sequences to import, you’ll probably want to use the 97% or 99% identity Greengenes reference sequences.

  2. Follow the steps in the q2-vsearch OTU picking Community Tutorial to create a closed-reference feature table. The feature table’s feature IDs will correspond to Greengenes OTU IDs.

    Note: the tutorial uses Greengenes version 13_8 85% identity reference sequences. Don’t use these reference sequences with your own closed-reference OTU picking analyses. Use the reference sequences you imported in Step 1 instead. Also make sure that the value you pass to the --p-perc-identity option of qiime vsearch cluster-features-closed-reference matches the percent identity of your reference sequences (for example, 0.97 for the 97% identity reference sequences).

  3. Once you have a closed-reference feature table, export the feature table to obtain a .biom file.

  4. Now that you have a .biom file containing PICRUSt-compatible Greengenes IDs, you can add the corresponding Greengenes taxonomic annotations using biom add-metadata. There is no need to import the Greengenes taxonomic annotations into QIIME 2 to perform taxonomy assignment; you can use the taxonomic annotations that are distributed with the Greengenes reference database you downloaded in Step 1.
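
The four steps above could look roughly like the following at the command line. This is a minimal sketch, not a tested recipe for your data. The layout of the unpacked Greengenes archive (gg_13_5_otus/rep_set/ and gg_13_5_otus/taxonomy/), the input artifacts table.qza and rep-seqs.qza (your dereplicated feature table and sequences, as in the tutorial), and every output name are assumptions. The `run` helper is hypothetical and only lets you preview the commands: by default the script prints them, and DRY_RUN=0 executes them. The `qiime tools export` flags shown are the ones used by recent QIIME 2 releases; check `qiime tools export --help` if your version rejects them.

```shell
#!/bin/sh
# Sketch of Steps 1-4. All paths and file names are assumptions; adjust them.
set -eu

# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 runs it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

PERC=97          # which Greengenes identity level to import (97 or 99)
IDENTITY=0.97    # must agree with PERC
GG_DIR=gg_13_5_otus   # assumed: directory of the unpacked Greengenes 13_5 archive
REF_FASTA="$GG_DIR/rep_set/${PERC}_otus.fasta"
REF_TAX="$GG_DIR/taxonomy/${PERC}_otu_taxonomy.txt"
REF_QZA="gg_13_5_${PERC}_otus.qza"

# Step 1: import the reference sequences as FeatureData[Sequence].
run qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path "$REF_FASTA" \
  --output-path "$REF_QZA"

# Step 2: closed-reference OTU picking against the imported reference.
run qiime vsearch cluster-features-closed-reference \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --i-reference-sequences "$REF_QZA" \
  --p-perc-identity "$IDENTITY" \
  --o-clustered-table "table-cr-${PERC}.qza" \
  --o-clustered-sequences "rep-seqs-cr-${PERC}.qza" \
  --o-unmatched-sequences "unmatched-cr-${PERC}.qza"

# Step 3: export the table; this writes exported/feature-table.biom.
run qiime tools export \
  --input-path "table-cr-${PERC}.qza" \
  --output-path exported

# Step 4: attach the Greengenes taxonomy as observation metadata.
run biom add-metadata \
  -i exported/feature-table.biom \
  -o table-with-taxonomy.biom \
  --observation-metadata-fp "$REF_TAX" \
  --observation-header OTUID,taxonomy \
  --sc-separated taxonomy
```

The --observation-header option is there because the Greengenes taxonomy file has no header line, and --sc-separated taxonomy splits each semicolon-delimited lineage into a list of ranks.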

Let us know how it goes!
