Doing taxonomy analysis and getting abundances with manifests

ebolyen · September 11, 2017, 8:50pm

I'm a little uncertain what you mean. Are you just trying to get taxonomic assignments (to be visualized with qiime taxa barplot, or collapsed into a feature table with qiime taxa collapse), or are you going for something else?

I think there's a misunderstanding here. There are multiple steps to get to your taxonomy and abundance information:

You start with your SampleData[SequencesWithQuality], which is basically your demultiplexed sequences (no quality control has happened yet, and we haven't worked out which sequences look interesting or counted them). This is usually what you want to import.

From there we perform a denoising step (qiime dada2 denoise-single or qiime deblur denoise-16S), which handles sequencing error by correcting or discarding any sequence that looks like a mistake from the sequencing instrument. That step results in a set of "correct" sequences and counts them. This is basically our "feature selection" step. After denoising and counting the original sequences, we have what are called Amplicon Sequence Variants (ASVs), which we'll use as our features (these are analogous to 100% OTUs).

The "correct" sequences are the FeatureData[Sequence]. These no longer have sequencing abundance information, and instead are focused on just what the sequence is for each feature ID (it's basically a fasta file without any replicated sequences or IDs).

The counts are in the FeatureTable[Frequency] which has the number of times each feature ID is observed in each sample ID.
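To make the split between the two artifacts concrete, here is a toy sketch in plain Python. It is not how DADA2 or Deblur work: there is no error correction, it only groups identical reads. The read data, the variable names, and the choice to derive feature IDs by hashing the sequence are all assumptions made for illustration.

```python
import hashlib
from collections import Counter, defaultdict

# Toy demultiplexed reads: sample ID -> list of reads (made-up data).
reads_by_sample = {
    "sample-1": ["ACGT", "ACGT", "TTGA"],
    "sample-2": ["TTGA", "TTGA", "GGCC"],
}


def feature_id(sequence):
    """Derive a stable ID from the sequence itself (illustrative choice)."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


# Analogue of FeatureData[Sequence]: one entry per unique sequence, no counts.
feature_sequences = {}
# Analogue of FeatureTable[Frequency]: sample ID -> feature ID -> count.
feature_table = defaultdict(Counter)

for sample_id, reads in reads_by_sample.items():
    for read in reads:
        fid = feature_id(read)
        feature_sequences[fid] = read
        feature_table[sample_id][fid] += 1

# Print one line per (sample, sequence, count) combination.
for sample_id, counts in feature_table.items():
    for fid, count in counts.items():
        print(sample_id, feature_sequences[fid], count)
```

Note that the sequence lookup holds each sequence exactly once, while the table holds only IDs and counts. The shared feature ID is what ties the two together.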

Using the FeatureData[Sequence] we can do taxonomic classification using the feature-classifier plugin, giving us a new FeatureData[Taxonomy] artifact. We can use that new artifact with the FeatureTable[Frequency] one because they still have the same feature IDs.
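As a rough picture of why the shared feature IDs matter, the sketch below joins a toy taxonomy lookup to a toy frequency table and sums counts per taxon, which is conceptually what a collapse step does. This is plain Python, not the QIIME 2 API; the data, the taxon labels, the "Unassigned" fallback, and the helper name collapse_by_taxon are all hypothetical.

```python
from collections import Counter, defaultdict

# Analogue of FeatureTable[Frequency]: sample ID -> feature ID -> count (made-up).
feature_table = {
    "sample-1": {"f1": 10, "f2": 5, "f3": 1},
    "sample-2": {"f1": 0, "f2": 7, "f3": 4},
}

# Analogue of FeatureData[Taxonomy]: feature ID -> taxon label (made-up).
taxonomy = {
    "f1": "k__Bacteria; p__Firmicutes",
    "f2": "k__Bacteria; p__Firmicutes",
    "f3": "k__Bacteria; p__Bacteroidetes",
}


def collapse_by_taxon(table, taxonomy):
    """Sum feature counts per taxon within each sample (hypothetical helper)."""
    collapsed = defaultdict(Counter)
    for sample_id, counts in table.items():
        for fid, count in counts.items():
            # The join works only because both inputs use the same feature IDs.
            taxon = taxonomy.get(fid, "Unassigned")
            collapsed[sample_id][taxon] += count
    return {sample_id: dict(c) for sample_id, c in collapsed.items()}


taxon_table = collapse_by_taxon(feature_table, taxonomy)
for sample_id, counts in taxon_table.items():
    print(sample_id, counts)
```

If the taxonomy had been built from a different set of sequences, the IDs would not match and every feature would fall into the fallback label, which is why the classification is run on the same FeatureData[Sequence] that came out of denoising.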

Let me know if that clarifies things, or if I've misunderstood what you are going for!