Facing challenges specifying color and shape aesthetics in https://forum.qiime2.org/t/tutorial-integrating-qiime2-and-r-for-data-visualization-and-analysis-using-qiime2r/4121#plotting-pcoa-12

I have been having trouble running the following command, ggplot(aes(x=PC1, y=PC2, color=depth, shape=season, size=shannon_entropy)) +, so that color and shape map to my own metadata columns in the same way the tutorial uses [color=body-site, shape=reported-antibiotic-usage]. My metadata file is attached:
sample-metadata.txt (2.8 KB).

Here is the code I'm trying to run:
library(tidyverse)
library(qiime2R)

metadata<-read_q2metadata("sample-metadata.txt")
uwunifrac<-read_qza("unweighted_unifrac_pcoa_results.qza")
shannon<-read_qza("shannon_vector.qza")$data %>% rownames_to_column("SampleID")

uwunifrac$data$Vectors %>%
select(SampleID, PC1, PC2) %>%
left_join(metadata) %>%
left_join(shannon) %>%
ggplot(aes(x=PC1, y=PC2, color=depth , shape=season , size=shannon_entropy)) +
geom_point(alpha=0.5) +
theme_q2r() +
scale_shape_manual(values=c(16,1), name="") +
scale_size_continuous(name="Shannon Diversity") +
scale_color_discrete(name="depth")
ggsave("PCoA.pdf", height=4, width=5, device="pdf")

The error:
Error in FUN(X[[i]], ...) : object 'season' not found

Hey @tafara_bute,

Just passing by, but I wanted to thank you for including your metadata. I took a peek at it because I didn't see a good reason for your ggplot calls to fail, and I noticed that each line is wrapped in double quotes. That will make most TSV/CSV parsers treat each row of data as a single, very long entry/variable/factor.

If you delete those double quotes I expect things should work. You might also use

str(metadata)

after loading via read_q2metadata so that you can visually verify that R has the right number of columns with the names you expect.
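
For example, a check along these lines lists which of the columns used in aes() are actually present. This is only a minimal sketch: the toy data frame stands in for whatever read_q2metadata returns, and the capitalised Season column is a made-up illustration of a name mismatch, not something taken from the attached file.

```r
# Hypothetical toy table standing in for the output of read_q2metadata()
metadata <- data.frame(
  SampleID = c("S1", "S2", "S3"),
  depth    = c("surface", "deep", "surface"),
  Season   = c("wet", "dry", "wet"),  # assumed mismatch: capital "S"
  stringsAsFactors = FALSE
)

# Columns the ggplot call expects to find
needed <- c("depth", "season")

# Any name listed here would trigger "object ... not found" in aes()
missing_cols <- setdiff(needed, colnames(metadata))

print(colnames(metadata))
print(missing_cols)
```

If missing_cols comes back empty, the column names are fine and the problem lies elsewhere.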

Thank you @ebolyen. I removed the double quotes and ran the str(metadata) command, but there was no change. Here is my console output:

I believe ggplot is missing the data parameter, so it should be something like:
ggplot(data=metadata, aes(...))

I think your x=PC1 and y=PC2 will just work on their own, but if not, you'll need to cbind them into your metadata as well.

ggplot is definitely missing the parameter. I don't know why, because it is picking up depth, which is in the same metadata file. How can I perform the cbind you referred to?

Thank you

So aes uses nonstandard evaluation, which means that even though color comes first, it isn't actually executed at that point; instead, everything is "captured" for later. ggplot probably evaluates shape before color, which is why season is reported as missing rather than depth, as you expected.
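
To make the "captured for later" idea concrete, here is a rough base-R analogy. It is not ggplot2's actual internals, and the toy data frame is made up. It only shows that capturing a name never fails, while evaluating it against a table that lacks the column does.

```r
# Toy data with PC1 and depth, but no "season" column (made-up values)
df <- data.frame(PC1 = c(0.1, 0.2), depth = c("surface", "deep"))

# Capturing an expression does not look anything up, so this cannot fail
captured <- quote(season)

# The lookup only happens on evaluation; here it fails because df has no
# "season" column and the enclosing environment is empty
result <- tryCatch(
  eval(captured, envir = df, enclos = emptyenv()),
  error = function(e) conditionMessage(e)
)

print(result)
```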

As to how you provide the data, adapt the sample I gave in my last reply. You need to provide a data argument at the start of your ggplot call.

Regarding cbind, I would google around a bit, as cbind, rbind, c() and the various applies are all important tools for the toolbelt, and how you use them really depends on your situation. I would recommend just stuffing different dataframes and vectors into those functions to see what happens and go from there.
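
As a starting point, here is a minimal cbind sketch with two made-up data frames (the names coords and meta, and all of the values, are hypothetical). Note that cbind glues columns together by row position and does not match on SampleID, so it is only safe when both tables are already in the same sample order. A keyed join such as left_join does not need that assumption.

```r
# Hypothetical PCoA coordinates, one row per sample
coords <- data.frame(
  SampleID = c("S1", "S2"),
  PC1 = c(0.1, -0.2),
  PC2 = c(0.3, 0.0)
)

# Hypothetical metadata, assumed to be in the same row order as coords
meta <- data.frame(
  depth  = c("surface", "deep"),
  season = c("wet", "dry")
)

# Bind the columns side by side; rows are paired purely by position
combined <- cbind(coords, meta)

print(combined)
```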

Thank you for the help. I tried renaming the column names using the colnames function and it did the magic. I guess the problem was with the metadata column names.
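
For anyone landing here later, a rename of this kind could look roughly like the sketch below. The toy table and the capitalised Season column are assumptions for illustration only, since the exact mismatch in the original file isn't shown above.

```r
# Hypothetical metadata whose column name does not match the aes() call
metadata <- data.frame(
  SampleID = c("S1", "S2"),
  depth    = c("surface", "deep"),
  Season   = c("wet", "dry")  # assumed mismatch with shape=season
)

# Rename only the offending column, leaving the others untouched
colnames(metadata)[colnames(metadata) == "Season"] <- "season"

print(colnames(metadata))
```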