Hi everyone,
I would like to show diversity results in a boxplot for only part of my samples (those that share a sampling time).
I have samples taken in the 1st, 3rd and 5th week, for several groups: A, B, C, D and control.
I want to show Faith's PD only for the results from the 5th week.
I did it by filtering based on metadata:
qiime feature-table filter-samples --i-table table_merged.qza --m-metadata-file metadata.txt --p-where '[week]="5"' --o-filtered-table table_merged_5week.qza
But this happens before performing core-metrics-phylogenetic and rarefaction. I also want plots for the whole dataset, but for the whole dataset I had to perform a separate core-metrics-phylogenetic analysis and rarefaction.
Is there another way of doing it, after core-metrics analysis?
And second: is there a way to put the control first in the boxplot? By default the groups are in alphabetical order, so A comes first.
I can think of two ways to do this. You have already outlined both methods!
Filter first, then rerun:
The next step in this method is to remake the downstream graphs by rerunning functions like core-metrics-phylogenetic. Then only the samples you filtered will be included.
"but for whole dataset I had to perform separate core-metrics-phylogenetic analysis and rarefaction."
In this method, we would do that for each cohort of filtered samples.
Take diversity values, then filter ("after core-metrics analysis"):
Using R and the Tidyverse, I import the diversity values from within the .qza files into a dataframe, then graph it with ggplot2, rstatix, etc. These packages let you do stuff like "put the control at the first place."
But this requires using R, which is its own challenge! You can also use Python or even Excel/Google Sheets for custom graphs like this.
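If you go the Python route, here is a minimal sketch of the "take diversity values, then filter" idea. It assumes you have already exported the Faith's PD vector to a TSV (for example with qiime tools export, which should give you an alpha-diversity.tsv) and that your metadata has a "week" column and a "group" column. The "group" column name, the sample IDs and the values below are made up for illustration, and the inline strings stand in for your real files. The data handling uses only the standard library; the plotting function needs matplotlib installed.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the real files. In practice, open your exported
# alpha-diversity TSV and your metadata.txt instead of these strings.
ALPHA_TSV = (
    "sample-id\tfaith_pd\n"
    "s1\t10.0\n"
    "s2\t12.5\n"
    "s3\t11.0\n"
    "s4\t9.5\n"
    "s5\t8.0\n"
    "s6\t7.5\n"
)
METADATA_TSV = (
    "sample-id\tweek\tgroup\n"
    "s1\t5\tcontrol\n"
    "s2\t5\tA\n"
    "s3\t5\tB\n"
    "s4\t5\tcontrol\n"
    "s5\t1\tA\n"
    "s6\t1\tcontrol\n"
)


def read_alpha(handle):
    """Map sample ID -> diversity value (first column = ID, second = value)."""
    rows = csv.reader(handle, delimiter="\t")
    next(rows)  # skip the header line, whatever the columns are called
    return {row[0]: float(row[1]) for row in rows if row}


def read_metadata(handle):
    """Map sample ID -> metadata row, skipping comment rows such as #q2:types."""
    reader = csv.DictReader(handle, delimiter="\t")
    id_column = reader.fieldnames[0]
    return {
        row[id_column]: row
        for row in reader
        if not row[id_column].startswith("#")
    }


def values_by_group(alpha, metadata, week, first="control"):
    """Keep samples from one week, grouped, with the `first` group leading."""
    grouped = defaultdict(list)
    for sample_id, value in alpha.items():
        row = metadata.get(sample_id)
        if row is not None and row["week"] == week:
            grouped[row["group"]].append(value)
    # False sorts before True, so `first` leads; the rest stay alphabetical.
    order = sorted(grouped, key=lambda g: (g != first, g))
    return {g: grouped[g] for g in order}


def plot_boxplot(grouped, path="faith_pd_week5.png"):
    """Draw one box per group, in the order given by the dict."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(list(grouped.values()))
    ax.set_xticks(range(1, len(grouped) + 1))
    ax.set_xticklabels(list(grouped.keys()))
    ax.set_ylabel("Faith's PD")
    fig.savefig(path)


alpha = read_alpha(io.StringIO(ALPHA_TSV))
metadata = read_metadata(io.StringIO(METADATA_TSV))
week5 = values_by_group(alpha, metadata, week="5")
print(list(week5))
```

Calling plot_boxplot(week5) would then write the figure. Because the values come from the core-metrics run on the whole dataset, you only rarefy once; changing week="5" to another week gives the other subsets without rerunning anything.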