Help with QIIME2 quality scores and workflow

Hi @microbiotaphyto,

Welcome to microbiome analysis! These are good questions! If you haven't gone through them already, you might find our tutorials or the online workshops useful.

It's not a tidy answer, but I would go through the forum, read posts, and see how people are making choices. There are a lot of quality score plots here, and a lot of conversations around the plots. Honestly, this is a big piece of how I've been getting a feel for them.


But, let's talk about the real reason I'm here. (The other mods are rolling their eyes, and I've just poured myself a new cup of tea :tea:).

Some of this is terminology...

Metadata (sometimes also called sample information, patient data, survey data, etc.) is the non-omics information about your microbiome sample.

The feature table tells you which things are in a specific sample, and how many of each there are. (We use "feature" because we might represent the data as an ASV, an OTU, a genus, a gene, etc.)

Let's talk about the feature table first!

  1. How do you make it?
    Your feature table comes out of the denoising or OTU clustering pipeline. (Go back to the PD mouse tutorial, or one of the other tutorials for specific instructions.) Typically, this process is going to involve importing your data, doing QC, and then denoising and/or clustering.

  2. How necessary is it to the analyses?
    I'd argue a feature table is kind of the first point in a microbiome analysis. You've gotten to the point where you're mapping the members of the community to the samples. The feature table is what you use to build other analyses, like diversity, and it goes directly into differential abundance.
    If you're working at a sequencing company, this is one of the main bioinformatic products you will produce.
    If you're handing the work off to an analyst, this is the point the statistician will wander over and think you might have data.

  3. At what point in the workflow should you use it?
    This is kind of the mid-point in the workflow. Like, you get your data, you process it to a feature table, and then you get to actually do analysis. So, once you have it, it's the basis for pretty much everything you'll do down the line. (Well, the feature table, and the accompanying representative sequences, which get you to a tree and taxonomy.)
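
If it helps to see it concretely, here's a toy sketch of what a feature table holds and how later steps build on it. This is plain Python, not QIIME 2 code; the sample IDs, feature IDs, counts, and function names are all made up for illustration.

```python
# Toy feature table: one entry per sample, mapping each feature to its count.
# Every ID and count here is invented for illustration.
feature_table = {
    "sample-1": {"ASV_a": 120, "ASV_b": 0, "ASV_c": 30},
    "sample-2": {"ASV_a": 5, "ASV_b": 80, "ASV_c": 15},
}


def observed_features(counts):
    """Count the features with a non-zero count in one sample."""
    return sum(1 for n in counts.values() if n > 0)


def relative_abundance(counts):
    """Convert raw counts to proportions within one sample."""
    total = sum(counts.values())
    if total == 0:
        # An empty sample has no meaningful proportions.
        return {feature: 0.0 for feature in counts}
    return {feature: n / total for feature, n in counts.items()}


for sample_id, counts in feature_table.items():
    print(sample_id, observed_features(counts), relative_abundance(counts))
```

The point is only that "which features, and how many of each, per sample" is the whole structure; diversity metrics and differential abundance start from exactly this kind of table.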

And, okay... metadata - the non-microbiome sample information.

  1. How do you make it?
    This should have been planned during the experimental design phase. Usually, it's collected around when the sample was collected (although this partly depends on the study design). What you need will depend on your research question. I study humans :mask:, so I want information like age, sex, diet, drug use, and medical history. Colleagues who work with mice :mouse2: often collect things like cage, genotype, diet, and/or treatment. Environmental studies :evergreen_tree: might look at temperature, pH, rainfall... it will vary based on your question.
    Your metadata will probably also contain information about how the sample was processed: what extraction kit did you use? Where did it sit in the extraction plate? Who did the work?
    There are some community standards for how metadata should be formatted, as well as a new US government effort to make it more accessible. I'd highly recommend looking at the NMDC page as well as MIxS.

  2. How necessary is it to the analysis?
    If you want to answer a biological hypothesis, it's essential. Without metadata, microbiome analysis becomes augury a few steps removed from the bird :owl:, super expensive palmistry :raised_hand:, or very creative storytelling :open_book:.
    If you're shipping the data off to someone else, they probably have the metadata already, and then you don't personally need it.

  3. At what point in the workflow should it be used?
    This comes in when you start to visualize and analyze the statistical aspects of the data. You need it to answer your biological question. So, once you have your feature table, you also need your metadata to do science!
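
As a rough sketch of what "feature table plus metadata" looks like in practice, here's a toy example that reads a small tab-separated metadata table and uses it to group a per-sample result. Again, this is plain Python rather than QIIME 2 code, and the column names, sample IDs, values, and helper functions are all made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up metadata: one row per sample, keyed by a sample ID column.
metadata_tsv = (
    "sample-id\tgenotype\tcage\n"
    "sample-1\twild-type\tC1\n"
    "sample-2\tmutant\tC2\n"
    "sample-3\tmutant\tC2\n"
)

# Made-up per-sample result derived from a feature table,
# for example the number of observed features.
observed = {"sample-1": 42, "sample-2": 17, "sample-3": 23}


def load_metadata(text):
    """Parse tab-separated metadata into {sample_id: row_dict}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["sample-id"]: row for row in reader}


def group_values(metadata, values, column):
    """Group per-sample values by one metadata column."""
    groups = defaultdict(list)
    for sample_id, value in values.items():
        if sample_id in metadata:  # samples without metadata cannot be grouped
            groups[metadata[sample_id][column]].append(value)
    return dict(groups)


metadata = load_metadata(metadata_tsv)
by_genotype = group_values(metadata, observed, "genotype")
for group, vals in sorted(by_genotype.items()):
    print(group, sum(vals) / len(vals))
```

Without the metadata there is nothing to group by, which is why the biological question can't be answered from the feature table alone.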

Hopefully this helps; there are a lot more videos about these topics in the workshop I linked.

Best,
Justine
