Hey guys,
I'm a newbie in R, and I'm having trouble plotting a PCoA with stat_ellipse. I don't know if this topic really fits here, but I really need some help, so...
I spent the entire day on this, and the furthest I got was importing my artifacts from qiime2 into R with qiime2R:
library(qiime2R)
library(phyloseq)
library(tidyverse)
# Importing ASVs abundance file
ASVs <- read_qza("table.qza")
# Importing metadata
metadata <- readr::read_tsv('sample-metadata.tsv')
# Importing tree
tree <- read_qza("rooted-tree.qza")
# Importing taxonomy
taxonomy <- read_qza("taxonomy.qza")
tax_table <- do.call(rbind, strsplit(as.character(taxonomy$data$Taxon), "; "))
colnames(tax_table) <- c("Kingdom","Phylum","Class","Order","Family","Genus","Species")
rownames(tax_table) <- taxonomy$data$Feature.ID
# Creating phyloseq object
physeq <- phyloseq(
  otu_table(ASVs$data, taxa_are_rows = TRUE),
  phy_tree(tree$data),
  tax_table(tax_table),
  sample_data(metadata)
)
Then I found this code in the post "ggplot: coloring pcoa plot" (reply #3 by pyghost).
Instead of antibiotic usage and body site, I used SampleID, because my current dataset only contains patients with anorexia nervosa and controls (I just want to plot these two groups), all from gut samples:
uwunifrac <- read_qza("unweighted_unifrac_pcoa_results.qza")$data
shannon <- read_qza("shannon_vector.qza")$data %>% rownames_to_column("SampleID")
df_join <- uwunifrac$Vectors %>%
  dplyr::select(SampleID, PC1, PC2) %>%
  dplyr::left_join(metadata, by = "SampleID") %>%
  dplyr::left_join(shannon, by = "SampleID")
df_join %>%
  ggplot(aes(x = PC1, y = PC2,
             color = SampleID,
             shape = SampleID,
             size = shannon_entropy)) +
  geom_point(alpha = 0.5) +
  xlab(paste("PC1 (", round(100 * uwunifrac$ProportionExplained[1], 2), "% variance explained)")) + # changed
  ylab(paste("PC2 (", round(100 * uwunifrac$ProportionExplained[2], 2), "% variance explained)")) + # changed
  theme_bw() +
  scale_shape_manual(values = c(17, 1), name = "SampleName") +
  scale_size_continuous(name = "Shannon Diversity") + # keep as is
  scale_color_discrete(name = "SampleName") +
  ggtitle("Unweighted UniFrac") +
  stat_ellipse()
But the output was this error:
Too few points to calculate an ellipse
Error in `palette()`:
! Insufficient values in manual scale. 29 needed but only 2 provided.
Run `rlang::last_trace()` to see where the error occurred.
In total there are 15 samples from sick patients and 15 from controls, so I don't understand why this is happening.
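One check that might help narrow this down is counting how many distinct values the column mapped to color and shape has. A discrete scale needs one value per distinct level, and stat_ellipse fits one ellipse per group. The sketch below uses made-up data only: the `toy` data frame, the `Group` column, and its "AN"/"Control" labels are hypothetical and not taken from my metadata file.

```r
# Toy data only: 30 made-up samples, 15 per hypothetical group.
# `Group` is an assumed column name, not one from the real metadata.
toy <- data.frame(
  SampleID = sprintf("S%02d", 1:30),
  Group    = rep(c("AN", "Control"), each = 15),
  stringsAsFactors = FALSE
)

# Number of distinct values a discrete aesthetic would have to cover
n_levels_sampleid <- length(unique(toy$SampleID))  # one level per sample
n_levels_group    <- length(unique(toy$Group))     # one level per group

# Points available per level; an ellipse cannot be fit to a single point
points_per_sampleid <- table(toy$SampleID)
points_per_group    <- table(toy$Group)

cat("Levels if mapped to SampleID:", n_levels_sampleid, "\n")
cat("Levels if mapped to Group:   ", n_levels_group, "\n")
cat("Max points per SampleID level:", max(points_per_sampleid), "\n")
cat("Min points per Group level:   ", min(points_per_group), "\n")
```

If the real df_join looks like the SampleID case here (one point per level), that would be consistent with both messages above, since scale_shape_manual(values = c(17, 1)) only supplies two shapes. Running length(unique(df_join$SampleID)) on the real data would show the actual count.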