I am looking for separate plots with ±SE, like Figure B. I combined all the samples in my figure, but I am not sure how to separate them by phylum and calculate ±SE. I attempted some R code to calculate the standard error, but my relative abundance goes up to 3000%.
As far as I understand, this CSV file contains the relative abundance of features annotated at the phylum level. Each row is a sample and each column is a phylum. I understand you want one plot per phylum, where the X-axis shows your "source" variable (i.e., "Normal", "LDLrKO" and "LDLr HD 0").
Figure B (like your figure) is a boxplot, which shows the distribution of the data. You don't need to calculate SE in order to generate it.
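If you still want the mean ± SE per group (for example, to report it next to the boxplots), here is a minimal sketch. It assumes one row per sample, a `source` column, and one numeric column per phylum; the toy values and the `Firmicutes`/`Bacteroidetes` column names are made up for illustration, so swap in your own data.

```r
library(dplyr)
library(tidyr)

# Toy data standing in for the real CSV (hypothetical values and column names)
taxa_data <- data.frame(
  source = c("Normal", "Normal", "Normal", "LDLrKO", "LDLrKO", "LDLrKO"),
  Firmicutes = c(60, 55, 65, 40, 45, 35),
  Bacteroidetes = c(30, 35, 25, 50, 45, 55)
)

se_summary <- taxa_data %>%
  # Long format: one row per sample and phylum
  pivot_longer(-source, names_to = "phylum", values_to = "abundance") %>%
  group_by(phylum, source) %>%
  summarise(
    mean_abundance = mean(abundance),       # average over samples, not a sum
    se = sd(abundance) / sqrt(n()),         # standard error of the mean
    .groups = "drop"
  )

print(se_summary)
```

Note that the summary averages over the samples in each group, so the means stay on the same scale as the per-sample relative abundances.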
What I would do is:
Convert your source column to a factor and assign it back, otherwise the conversion is not saved: taxa_data$source <- as.factor(taxa_data$source)
Then, for each phylum you want to plot:
2.1. Keep only the column of that phylum and the source column. I see you use tidyverse packages, so some tidy code could be:
library(tidyverse)  # loads dplyr (select, %>%) and ggplot2
plot_data <- taxa_data %>%
  select(matches("Firmicutes"), source)  # I don't know the exact column names, so adjust the pattern as needed
2.2. Plot it. This is a basic backbone:
# Again, I don't know the exact column name, so I assume the phylum is the first column of plot_data
plot_firmicutes <- ggplot(plot_data, aes(x = source, y = .data[[names(plot_data)[1]]])) +
  geom_boxplot() +
  theme_minimal()
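To avoid repeating the two sub-steps above by hand for every phylum, one option is to loop over the phylum columns and collect the plots in a list. This is only a sketch: the toy data, the column names, and the assumption that every column other than `source` is a phylum are mine, not taken from your file.

```r
library(ggplot2)

# Toy data standing in for the real CSV (hypothetical values and column names)
taxa_data <- data.frame(
  source = factor(c("Normal", "Normal", "LDLrKO", "LDLrKO", "LDLr HD 0", "LDLr HD 0")),
  Firmicutes = c(60, 55, 40, 45, 50, 52),
  Bacteroidetes = c(30, 35, 50, 45, 40, 38)
)

# Assumption: every column except `source` holds one phylum
phyla <- setdiff(names(taxa_data), "source")

# One boxplot per phylum; .data[[phylum]] looks the column up by its name
plots <- lapply(phyla, function(phylum) {
  ggplot(taxa_data, aes(x = source, y = .data[[phylum]])) +
    geom_boxplot() +
    labs(title = phylum, y = "Relative abundance (%)") +
    theme_minimal()
})
names(plots) <- phyla

# Show a single plot, e.g. the one for Firmicutes
print(plots[["Firmicutes"]])
```

Because the result is a plain named list, it can be passed on to a layout function such as `wrap_plots()` from patchwork.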
Once you generate all the plots you want, use patchwork: