So, essentially very similar, except I changed the axes. However, I am getting the following warnings and about half my data isn’t getting plotted,
1: Column `SampleID` joining factors with different levels, coercing to character vector
2: Removed 48 rows containing missing values
I checked to confirm that the order of my SampleIDs match up in both the metadata and the Unweighted UniFrac PCoA files. When I visualize them on QIIME2, they look fine to me. I was hoping you could help me figure this out. Thanks for your help!
Thanks for your response! I don't have any NA values in my data and that's the reason this is confusing. I checked what's happening with the combined file and this is how it looks. It seems like between the two files, although the #SampleID seems to be the same name, it isn't matching up. Is there something I am missing?
EDIT: Oh gosh! I figured it out. Some of the metadata sample names had an invisible "space" after the names, so R was picking them up as different samples. I had to manually go through and remove all the spaces. Gotta be careful while making that metadata file... eek! Thanks for your input through!
I have been using this tutorial to draw heatmaps for my data. It works well, thanks so much @jbisanz !
Now, I am trying to figure out few small things,
(1) Is there a way I can collapse the taxonomy data to the phylum level?
(2) Similarly, if there is a way to plot, abundances within specific phyla separately? (I tried to sort it in alphabetical order instead of mean abundance but couldn’t figure out the secondary sorting within that).
(3) I have 12 replicate data from each experimental condition. Would it be possible to collapse across these replicates and draw the heatmap for the ‘average’ relative abundance across the different phyla? (the bigger question also being if this is an advisable/correct thing to do?)
Also, can you please explain what the line summarize(Abundance=sum(Abundance))
I have been trying to do this on my own but am fairly new to R. So, I would really appreciate your help with this or if you could point be to other sources as I couldn’t find any. Thank you for your help!
Yes, that's right! Like the top 10 Firmicutes, and so on...
I will try these out. In the example above, I don't think phyloseq library was used to draw the heatmaps (I could be wrong) and also, the taxonomy.qza file is in combined form (see picture). I guess I will need to use something like parse_taxonomy(taxonomy, tax_sep, trim_extra)
to first separate the taxa and then use the code you recommended?
But if I did this, it would impact the Y-axis of the original heatmap I've already developed using the tutorial. I don't know if I am explaining it correctly... basically the way that the data needs to be arranged for phyloseq library is different from what's on this tutorial (or maybe I am wrong?). And so, I am trying to figure out if I can use one or the other to create these additional heatmaps. I would really appreciate your advice on this! Thanks!
That’s true, I really like the tutorial as it was a very intuitive way for me to understand how the heatmaps were drawn (going through it one step at a time). I will try to figure out how to tweak it for my requirements, but if someone else gets to it before I do, I would love to see it! (Enthusiastic learner )
I understand. I don’t actually have it as a phyloseq object but I will check it out and try to learn it. Thanks again!
All of my examples start with data in a phyloseq object, which I called ps. Once you do have your data in a phyloseq object, all of the plot_heatmap() plot_ord() plot_alpha() function that come with phyloseq are quite easy to use.
Thank you! I think I am getting the hang of it. I have just one more question… I see that the abundances in the charts are plotted as is. Is there any way that I can go in and change the abundances to relative abundance (on a percentage basis) and convert it into a log plot similar to what’s in the tutorial?
I am looking to do two things (circled in red),
(1) Report the data is %abundance (and on a log scale... like in the tutorial)
(2) Figure out how to change the label from being just phylum to Kingdom_Phylum. Someone in another forum recommended this
Yes, it does. I was thinking that perhaps some of the phyloseq functions are designed to recognize only specific elements, in this case, 7 taxa? I am guessing this because it also didn’t let me play around with the color scheme too much (I was hoping to use scale_fill_viridis_c() but I couldn’t disable the preexisting fill parameters). It’s possible I went about it wrongly though.