General discussion of the qiime2R tutorial

Hello,
Your tutorial has been very helpful. We were able to make volcano plots and CLR bar plots for our differential taxa using your script. However, when I re-ran the commands for the differential abundance volcano plot, this time trying to strip the leading higher taxonomy levels (K, P, O) so that only the lowest unambiguous taxonomic name is shown (usually family and genus, but sometimes class), it seems that our taxonomy file is not formatted the way your script adjustment mutate(Feature=paste(Phylum, Genus)) expects. Similarly, when we try to parse our taxonomy file, we get this error:

taxonomy<-parse_taxonomy(taxonomy)
Error in parse_taxonomy(taxonomy) :
Table does not match expected format (colnames(obj) are Feature.ID, Taxon, (Confidence OR Consensus))

So maybe the data type in the second column (Taxon) is incorrect, or maybe it doesn’t like the spaces between the taxon names.

head(taxonomy)

# A tibble: 6 x 2
  Feature.ID                                  Taxon
1 TACGAAGGGGGCTAGCGTTGTTCGGAATCACTGGGCGTAAAG… k__Bacteria; p__Proteobacteria; c__Alph…
2 TACGTAGGGGGCAAGCGTTGTCCGGAATCATTGGGCGTAAAG… k__Bacteria; p__Actinobacteria; c__Ther…
3 TCCTGTTTGCTCCCCACGCTTTCGCGCCTCAGCGTCAGTAAT… k__Bacteria; p__Proteobacteria; c__Delt…
4 TACGTAGGGGGCGAGCGTTGTCCGGAATGACTGGGCGTAAAG… k__Bacteria; p__Firmicutes; c__Clostrid…
5 TACGGAGGGGGCTAGCGTTGTTCGGAATTACTGGGCGTAAAG… k__Bacteria; p__Proteobacteria; c__Alph…
6 TCCTGTTTGCTACCCACACTTTCGTGCCTGAGCGTCAGTTAC… k__Bacteria; p__Bacteroidetes; c__Ignav…

There is no third column (Confidence/Consensus) because the original taxonomy file was exported out of R by a third-party service, and I then imported it into QIIME 2 to create the .qza file used here. Below is the script we used. Could this error be circumvented with some creative R scripting using the current format? If the third column is the problem, is there a way to disregard the missing third column, or would we have to create a bogus third column to use this script as is? Or is the issue with the first or second column?

Thanks,
Christina

library(tidyverse)
library(qiime2R)
library(ggrepel)

metadata<-readr::read_tsv("Modified-Macaque-skinmicrobiome-metadata-LB-July2020-R.tsv")
SVs<-read_qza("asv_feature-table_2013_2015TS_75.qza")$data
results<-read_qza("ALDEx2_TS_75animals/differentials_75TS.qza")$data
taxonomy<-read_qza("taxonomy_for_qiime.qza")$data
tree<-read_qza("rooted-tree_17May2020.qza")$data
#Find significantly different taxa (less than .01 p-value) and volcano plot
results %>%
  left_join(taxonomy) %>%
  mutate(Significant=if_else(we.eBH<0.01, TRUE, FALSE)) %>%
  mutate(Taxon=as.character(Taxon)) %>%
  mutate(Taxon=gsub("[[kpco]__*;]", "", Taxon)) %>%
  mutate(TaxonToPrint=if_else(we.eBH<0.01, Taxon, "")) %>% #only provide a label to significant results
  ggplot(aes(x=diff.btw, y=-log10(we.ep), color=Significant, label=TaxonToPrint)) +
  geom_text_repel(size=3, nudge_y=0.1) +
  geom_point(alpha=0.6, shape=16) +
  theme_q2r() +
  xlab("log2(fold change); Differences Between") +
  ylab("-log10(P-value)") +
  theme(legend.position="none") +
  scale_color_manual(values=c("black","red"))
ggsave("volcano_75TS_0.01_shorter_taxa_d.pdf", height=9, width=9, device="pdf")

Hi Christina,

Easiest fix would be to just add a dummy column with something like:

taxonomy<-read_qza("taxonomy_for_qiime.qza")$data %>% mutate(Confidence=1)
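
For anyone following along, here is a minimal base R sketch of that fix on made-up data. The feature IDs, taxonomy strings, and the helper lowest_rank() are hypothetical and only for illustration; the sketch assumes Greengenes-style prefixes (k__, p__, ...) separated by semicolons. It adds the dummy Confidence column and then derives a short label from the lowest rank that is actually filled in, which is roughly what the original question was after.

```r
# Toy taxonomy table (hypothetical IDs and strings, not the poster's data)
taxonomy <- data.frame(
  Feature.ID = c("asv1", "asv2", "asv3"),
  Taxon = c(
    "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__; f__; g__; s__",
    "k__Bacteria; p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Blautia; s__",
    "k__Bacteria; p__Actinobacteria; c__Thermoleophilia"
  ),
  stringsAsFactors = FALSE
)

# The dummy third column suggested above
taxonomy$Confidence <- 1

# With Feature.ID, Taxon and Confidence present, the column check quoted in
# the error message should pass (requires qiime2R, so left commented here):
# parsed <- qiime2R::parse_taxonomy(taxonomy)

# Hypothetical helper: return the lowest rank that is not empty
lowest_rank <- function(taxon) {
  ranks <- strsplit(taxon, ";\\s*")[[1]]    # split on ";" plus optional spaces
  ranks <- sub("^[kpcofgs]__", "", ranks)   # drop the rank prefix
  ranks <- ranks[nzchar(ranks)]             # drop ranks left empty
  if (length(ranks) == 0) NA_character_ else ranks[length(ranks)]
}

taxonomy$Label <- vapply(taxonomy$Taxon, lowest_rank, character(1), USE.NAMES = FALSE)
print(taxonomy[, c("Feature.ID", "Label")])
```

The Label column could then be used in place of Taxon when building TaxonToPrint for the volcano plot.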

Thanks @jbisanz, a quick question: in the taxa_barplot function, the example command given is:

taxa_barplot(taxasums, metadata, "body-site")

Is there an easy way to sort the order of the samples by another metadata column? For example, if we have a time variable and want to show the samples faceted by body-site and ordered by increasing time within the facets?

If you sort your table by time and then convert the sample ID to a factor with that sorted order, it should print out as you are looking for!
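
A rough sketch of that idea in base R, using a made-up metadata table (the sample IDs, body sites, and time values are hypothetical). Whether taxa_barplot keeps the factor order is taken from the reply above rather than verified here.

```r
# Toy metadata (hypothetical values); check.names = FALSE keeps "body-site" as is
metadata <- data.frame(
  SampleID = c("S1", "S2", "S3", "S4"),
  `body-site` = c("gut", "gut", "tongue", "tongue"),
  time = c(3, 1, 2, 1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# 1. Sort the table by increasing time (ties keep their original order)
metadata <- metadata[order(metadata$time), ]

# 2. Make SampleID a factor whose levels follow the sorted order
metadata$SampleID <- factor(metadata$SampleID, levels = metadata$SampleID)

print(levels(metadata$SampleID))

# Then plot as in the tutorial (requires qiime2R and a taxasums table):
# taxa_barplot(taxasums, metadata, "body-site")
```

Because the facets split samples by body-site, the time ordering should then apply within each facet.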

Hi! Right now I’m using this package (qiime2R) and following one of your tutorials, but the function taxa_barplot is not available. Is there any other function I can use to build my taxonomy bar plot?
Thank you!!!

Hello @LiiBotero,

Welcome to the QIIME 2 forum! If you don’t mind a Python solution, you can take a look at the taxa_abundance_bar_plot method I wrote.

Thank you so much for such a useful tutorial. I was able to generate all the graphs, but I am more interested in a graph based on the Family level (features grouped by family name) rather than on feature ID. Is there a way to do that?

Is it also possible to get a figure like this one (please see Figure 4 at the end of this page: https://github.com/hollylutz/BatMP/blob/master/Figure4_Scatterplots.ipynb) with qiime2R?

Yes, just use the summarize_taxa() command first and pass the Family-summarized abundance table to the heatmap function.
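
To illustrate what summarizing to Family means, here is a small base R sketch on a made-up count table (the ASV names, family assignments, and counts are hypothetical). It only mimics the idea of summing features that share a family; it is not the internals of summarize_taxa(). The qiime2R calls are left as comments and follow the pattern used in the tutorial.

```r
# Toy feature table: rows are ASVs, columns are samples (hypothetical counts)
counts <- matrix(
  c(10, 0, 5,
     3, 7, 2,
     1, 1, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("asv1", "asv2", "asv3"), c("S1", "S2", "S3"))
)

# Hypothetical family assignment for each ASV
family <- c(asv1 = "Lachnospiraceae", asv2 = "Lachnospiraceae", asv3 = "Bacteroidaceae")

# Sum the counts of all ASVs that belong to the same family
family_counts <- rowsum(counts, group = family[rownames(counts)])
print(family_counts)

# With qiime2R loaded and a parsed taxonomy table, the equivalent would be roughly:
# taxasums <- summarize_taxa(SVs, taxonomy)$Family
# taxa_heatmap(taxasums, metadata, "body-site")
```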

It should be relatively straightforward to reproduce the plot using ggplot2 with data imported using qiime2R.

Hello,

I have gotten to the stage where I have a heat map!! But, it has no colour on it? I presume I have done something wrong in the code at some point?

No colour eh? Could you maybe post what you get and the code as you wrote it? I imagine it could be a scaling issue, or somehow NAs are getting introduced.


Hello, I have attached an image of the code.

Hmmm, I see that you are trying to facet on SampleID. If you have many SampleIDs, the facets may be getting squished together so that the rendered figure covers up the plot space. You could try removing the “SampleID” component to see if you get the desired plot.

Hello, I changed SampleID to something else this morning and got a great heat map. Thank you!

Hi @jbisanz, I tried creating the PCoA plot from your tutorial and got this error:

Error in select(., SampleID, PC1, PC2) : 
  unused arguments (SampleID, PC1, PC2)

Are you familiar with what is causing this?

Thank you,
Zach

This could be an issue with a second package that is exporting a function called select. Try explicitly using dplyr::select() instead.
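
A small sketch of the situation, assuming dplyr is installed. The vectors table and the stand-in select() below are made up; the stand-in imitates a function from another package (MASS is one that exports a select) that has been loaded after dplyr and masks it.

```r
library(dplyr)

# Toy PCoA coordinates (hypothetical values)
vectors <- data.frame(
  SampleID = c("S1", "S2", "S3"),
  PC1 = c(0.10, -0.20, 0.05),
  PC2 = c(0.30, 0.00, -0.15),
  PC3 = c(0.50, 0.40, 0.20)
)

# Stand-in for another package's select() masking dplyr's version
select <- function(obj) obj

# This would now fail with "unused arguments (SampleID, PC1, PC2)":
# vectors %>% select(SampleID, PC1, PC2)

# Naming the package explicitly avoids the clash
pcs <- vectors %>% dplyr::select(SampleID, PC1, PC2)
print(pcs)

# find("select") lists every attached location that defines select()
print(find("select"))
```

In the tutorial code, this means replacing select(SampleID, PC1, PC2) with dplyr::select(SampleID, PC1, PC2) in the PCoA pipeline.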

@jbisanz Sir, I don’t know at which step I should use this command.
