Relative abundances of unique sequences per sample

Hello!

I’m trying to summarize my data after running it through the DADA2 pipeline so that I can get:

  • relative abundances of unique sequences per stool sample. Is there a way to do that?
    In other words, I know that rep-sequences.qzv lists all the different feature IDs, but is there a way to find out which stool sample contains which feature ID?

Hope my question makes sense.

Thank you!
Rima

Hi @rnasrah,
It sounds like what you need is qiime feature-table heatmap, which displays your feature table as a heatmap showing the abundance of each feature in each sample.

The alternative would be to convert your feature table to a text file, e.g., to plot the abundance of a specific feature in R or another external program. You can use qiime tools export to export your feature table into biom format, then use biom convert --to-tsv to convert to a text file.
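
Once you have the text file, the per-sample relative abundances can be computed in a few lines. Here is a minimal Python sketch, assuming the TSV has the usual layout written by biom convert --to-tsv (a comment line, then a header row starting with "#OTU ID", then one row per feature) and no extra columns such as taxonomy. The sample IDs, feature IDs, and counts below are made up for illustration, and relative_abundances is a hypothetical helper name, not part of QIIME 2 or biom.

```python
import csv
import io

# Hypothetical stand-in for the file written by `biom convert --to-tsv`;
# the feature IDs, sample IDs, and counts are invented for illustration.
EXAMPLE_TSV = (
    "# Constructed from biom file\n"
    "#OTU ID\tstool-A1\tstool-B1\n"
    "feat1\t30\t0\n"
    "feat2\t10\t5\n"
    "feat3\t0\t15\n"
)


def relative_abundances(handle):
    """Return {sample_id: {feature_id: fraction}} from a biom-style TSV."""
    header = None
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if header is None:
            # Skip comment lines until the header row that starts with "#OTU ID".
            if row[0].startswith("#OTU ID"):
                header = row[1:]
                counts = {sample: {} for sample in header}
            continue
        feature_id, values = row[0], row[1:]
        for sample, value in zip(header, values):
            counts[sample][feature_id] = float(value)

    result = {}
    for sample, per_feature in counts.items():
        total = sum(per_feature.values())
        # Keep only the features actually observed in this sample.
        result[sample] = {
            fid: (n / total if total else 0.0)
            for fid, n in per_feature.items()
            if n > 0
        }
    return result


rel = relative_abundances(io.StringIO(EXAMPLE_TSV))
for sample, features in rel.items():
    # One line per sample/feature pair, most abundant feature first.
    for fid, frac in sorted(features.items(), key=lambda kv: -kv[1]):
        print(f"{sample}\t{fid}\t{frac:.3f}")
```

To run this on real data, replace io.StringIO(EXAMPLE_TSV) with an open file handle for your exported table. The result lists, for each stool sample, only the feature IDs present in it together with their fraction of that sample's total reads.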

I hope that helps!

Thanks so much @Nicholas_Bokulich.

I tried the feature-table heatmap, but I'm not sure how to redo it so that my clinical groups appear in order (A being the patients who gain weight, B being the patients who lose weight). As you can see on the y-axis in the attached photo, the samples are not in order; for now, the A group and the B group are all mixed together. Is there a way I can redo this?

Thanks so much!
Rima

Hi @rnasrah,

Clustering of axes is controlled by the --p-cluster parameter, which by default clusters both axes. If you add --p-cluster features to your command, it will only cluster the x-axis (features), so the samples should be shown in the same order as they are in the metadata file (I think; I am not certain, but it’s worth a try).

Note that the mixing of A and B indicates that these groups have similar feature compositions and hence cluster together.

I hope that helps!

Thanks so much Nicholas!
Adding --p-cluster features made this work!

One more question: what do the lines on top of the heatmap (where I put the purple arrow) correspond to?

Thanks once again,
Rima

Hi @rnasrah,
That is a dendrogram (UPGMA clustering by default) showing the similarity of features according to their co-occurrence among samples.

This is toggled on/off with the --p-cluster parameter, and you can switch clustering methods with the --p-method parameter.

I hope that helps!
