general discussion of qiime2r tutorial

Thank you. This is amazing for getting my qiime2 output into phyloseq.

After a lot of struggles with other suggestions this worked perfectly. I have gotten my data imported before but I never got the taxonomy and the tree to both work at the same time!!

Just a note to others, I could not use the first suggested method for creating a phyloseq object, "
phy<-qza_to_phyloseq(“table.qza”, “rooted-tree.qza”, “taxonomy.qza”,“sample-metadata.tsv”)
phy"

However, I had no trouble using the second suggested commands for “if you want to have more control” after the data import steps.

physeq<-phyloseq(
  otu_table(SVs$data, taxa_are_rows = T), 
  phy_tree(tree$data), 
  tax_table(as.data.frame(taxtable) %>% select(-Confidence) %>% column_to_rownames("Feature.ID") %>% as.matrix()), #moving the taxonomy to the way phyloseq wants it
  sample_data(metadata %>% as.data.frame() %>% column_to_rownames("#SampleID"))
  )

physeq
5 Likes

Thanks so much @jbisanz, this was super helpful. Not sure this is the right place to ask but, how should I cite this tutorial??

3 Likes

Hi, I was hoping someone could help me with an error I’m getting when trying to parse the taxonomy table.

taxonomy<-parse_taxonomy(taxonomy$data)
Error in if_else(x == “”, NA_character_, x) :
could not find function “if_else”

Thanks in advance!

Oops, if you load the dplyr or tidyverse libraries it should work. I will make this fix now.

Thank you so much! I figured it out just as you responded. Stay safe.

1 Like

Hello !!

How do I cite this package ??

Greetings !!

1 Like

Citing the QIIME2 manuscript is great and you could mention the version of qiime2R in the methods. Happy you found it helpful, Jordan

5 Likes

Hello,

I am getting the following error when running the taxonomic heatmap tutorial:

Error: by required, because the data sources have no common variables

Additionally, do you have a tutorial for producing taxonomic barplots through qiime2r?

1 Like

Thank you so much for the updated contribution to this tutorial!

2 Likes

I added a function that you could use directly for this purpose called taxa_barplot().

2 Likes

qiime2R (could not find “summarize_taxa” function)

I am trying to use the following code:

taxasums<-summarize_taxa(svtable, taxonomy); taxasums$Genus

However, I get the error saying ‘could not find “summarize_taxa” function’. My libraries are up to date. I am using R 3.6.3 and RStudio Version 1.3.959.

Please let me know how to get around this.
Thank you

2 Likes

Hi! thanks, very interesting!

I) Is it possible to collapse samples by a category of the mapfile using the new function taxa_barplot()? (i.e. control vs antibiotic treatment), then showing mean relative abundances instead per_sample abundances?
II) can we collapse "minor phyla" (say all phyla<1% total abundance), to avoid >10 colours in the same figure?
Otherwise, a good R code to get I) an II)? It is not so straightforward, apparently: Creating 100% stacked bar plots with less abundant taxa in a sub-category · Issue #901 · joey711/phyloseq · GitHub
I guess these two functionalities are very demanded. Thanks!!

2 Likes

In answer to 1- I am not a huge fan of collapsing the samples into a single bar as it masks the error and can lead to misleading conclusions. If you want to do this, you can find the average of each feature abundance within a group and then make a barplot from this aggregated sample. 2- I usually limit the plot to the 10 most abundant features (SVs/OTUs/Genera/Phyla etc) and group everything else into a remainder. The taxa_barplot function will do this by default. You could make a custom version of this function at line 39 to instead of pulling out the 10 most abundant, pulling out everything over 1% in average abundance.

1 Like

It should be there if you have the current version of qiime2R.

1 Like

I) I agree. II) Great, very useful. Thanks!!

Hi also wanted to add, how do we customize the feature side of the heatmap? I am having a hard time to figuring out the regex code, or am I missing some handy functions all along?

Hi Mr Bisanz,
Thank you for your great tutorial.
I am wondering if it is possible to diagnose a change step-by-step, or line-by-line, to our SVs data above. I can do it manually by declare SVs1 <- SVs instead of SVs above, and then delete pipe operator one at a time, but this is difficult. I realise that maybe there is a better way. Maybe debugging or something? Pardon me asking this basic R programming, because the context are already set I think it will be easier here.
I am using Rstudio.

I got no particular reason why choosing heatmap part for this example. It just happened that this part is what I am currently working on.

Thank you.

What I typically do in R studio is highlight down to the row I want to troubleshoot excluding the last pipe character and hit command enter to run. Ex:

SVs %>%
  as.data.frame() %>%
  rownames_to_column("Feature.ID") 

You could also just add head() or View() after the pipe on the line of interesting to send the result to the console or a new tab respectively.

If you want to customize the y axis, you slice and dice the text however you want. For example, you could paste together just the phylum and genus with something like mutate(Feature=paste(Phylum, Genus)). In my line above with the gsub command, I am just removing the extra k__Bacteria;p__Bacteroides… to get Bacteria;Bacteroides… to save on space when plotting.

Ah yes this is what I wanted all this time! Thank you! Also good tips, I didn't know that we can run selected code. I ended up assigning new SVs upto SVs15 lol. Thank you!

Also, my heatmap did not show similar y label as yours. I think it is because of this part of the code?

....
mutate(Feature=paste(Feature.ID, Taxon)) %>%
mutate(Feature=gsub("[kpcofgs]__", "", Feature)) %>% # trim out leading text from taxonomy string
ggplot(aes(x=SampleID, y=Feature, fill=NormAbundance)) +
....

The y from aes() asking for Feature column, but other than k__Bacteria;p__Bacteroides…etc line, there are also some random character of Feature.ID from mutate(paste()) function above it. That, or is it just me making mistake? What do you think, sir?
Also, what is that Feature.ID? They are from QIIME2 table.qza but, what algorithms do they use to generate the Feature.ID?

Thank you.

I believe the Feature.ID you would be seeing is the sequence’s hash which is a unique identifier for the SV (see here). You could just as easily replace them with arbitrary names like “SV_1” if you wanted. To do that, add an extra column to your taxonomy object with something like:

taxonomy<-taxonomy %>% mutate(SV_ID=paste0("SV_", 1:nrow(.)))

Then paste together SV_ID and Taxon to make Feature.

1 Like