Expanding upon the exportation tutorial

Sirtaj-Singh · November 21, 2017, 3:52am

Hello all!

I recently needed to export some QIIME 2 artifacts into R, and realized that the process was not very intuitive, as it required some Python code in conjunction with the outputs from qiime tools export. Expanding upon the exportation tutorial in QIIME 2 could be extremely useful for new users, and I would be happy to share the code I used if that helps at all!

Sirtaj Singh

thermokarst · November 21, 2017, 1:59pm

Hi @Sirtaj-Singh!

Can you provide some more detail on why you needed this? Generally speaking, exporting basically just unzips the data that is tucked inside an Artifact. Once it is free it is in your court in terms of what kind of manipulation these data need, and how they will be loaded into other tools.

Thanks!

Sirtaj-Singh · November 21, 2017, 5:38pm

Yes of course!

After exporting the table.qza file, unrooted tree file, and the taxonomy file, there were a few more steps before I could import the biom file into R:

Convert the biom file into a JSON file.
Edit the metadata file so that the first column name is #SampleID, and then add it to the JSON .biom file.
Add the taxonomy file after making sure it has the correct column names.
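
The header edits in the steps above can be sketched in Python roughly as follows. This is only a minimal sketch: the helper name set_header is made up for illustration, the sample rows are invented, and the taxonomy column names (#OTUID, taxonomy, confidence) are an assumption about what the biom tooling expects, so check them against the biom-format documentation. The JSON conversion and the actual merging into the .biom file would still be done with the biom-format package and are not shown here.

```python
def set_header(tsv_text, new_names):
    """Replace the leading header cells of a TSV string with new_names.

    Hypothetical helper for illustration; cells beyond len(new_names)
    and all data rows are left untouched.
    """
    header, newline, body = tsv_text.partition("\n")
    cells = header.split("\t")
    if len(new_names) > len(cells):
        raise ValueError("more new names than header columns")
    cells[:len(new_names)] = new_names
    return "\t".join(cells) + newline + body


# Invented example rows, only to show the shape of the files.
metadata = "sample-id\tbody-site\nL1S8\tgut\n"
taxonomy = "Feature ID\tTaxon\tConfidence\nabc123\tk__Bacteria; p__Firmicutes\t0.99\n"

# Step 2: the first metadata column must be named #SampleID.
metadata_fixed = set_header(metadata, ["#SampleID"])

# Step 3: assumed column names for the taxonomy file.
taxonomy_fixed = set_header(taxonomy, ["#OTUID", "taxonomy", "confidence"])

print(metadata_fixed)
print(taxonomy_fixed)
```

Working on the header line only keeps the data rows byte-for-byte identical, which avoids accidental changes to feature IDs or sample IDs.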

These steps were extremely easy to do, yet it took some perusing to figure out that these were the necessary steps for getting the .biom file that I wanted.

Perhaps just adding a link to the biom-format Python package would be enough to make this easier on the user, as I do agree that QIIME 2 should not be forced to provide code for every possible change a user may need to make to the exported files for their various needs.

Hope that makes sense!

thermokarst · November 22, 2017, 2:22pm

Ah, thanks for the details @Sirtaj-Singh! I think the key take-away here is this line:

I think there is a bit of a boundary between getting data out of QIIME 2 and getting it into another piece of software (in your case, R). At this point, we probably won't include these kinds of details in the official docs. What would be awesome is to see this kind of workflow outlined in a Community Tutorial! If this is something you want to share with others, that category on the forum would be the perfect place! Thanks @Sirtaj-Singh!!

Sirtaj-Singh · November 29, 2017, 6:43pm

That sounds like a great idea @thermokarst! I'll try to get something going on that this week.

Micro_Biologist · December 2, 2017, 12:34pm

I already have a script (albeit not the best coded) to get from QIIME 2 taxonomy tables and OTU tables into something R can use.

I was unaware I could post this sort of thing on this forum, as the tutorial says it's for plugins. Would you like me to share my code (just in .txt format) to get you started, and maybe you could tidy it up?

I would be intrigued to see if it works for other data sets, as I have only been able to trial it on the 'moving pictures' dataset so far.

EDIT: I am by no means an R expert (or even advanced); I have just been scraping bits of code together to do what I want it to do.

thermokarst · December 4, 2017, 4:04pm

Hi @Micro_Biologist! A community tutorial would be great!

Hmm, not sure where you read that, but here is the current text for the Community Tutorials section:

The Community Tutorials category is for sharing QIIME 2 tutorials that are not part of the official QIIME 2 documentation. These may be contributed by members of the QIIME 2 user community, or by plugin developers who are drafting new tutorials that may ultimately be included in the QIIME 2 documentation.

Sorry for any confusion there, we would love to see user-submitted tutorials!

Micro_Biologist · December 5, 2017, 2:59pm

I'm sure it's wholly on my end!

Here are my 2 usable scripts. The first is for transposing an OTU/ESV table:

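library(vegan)
library(tidyverse)

# Clear leftovers from earlier runs (rm() only warns if an object is missing)
suppressWarnings(rm(Ftable, OtuOb, FT_colnames, Ftable2, Ftable3))
# skip = 1 skips the first line of the file so the "#OTU ID" row is read as the header
Ftable <- read_tsv('table.feature-table_biom.tsv', col_names = TRUE, skip = 1)
names(Ftable)[names(Ftable) == "#OTU ID"] <- "Feature.ID"
# Add an indexing column 'observation' and move it to the front
Ftable <- Ftable %>% mutate(observation = row_number()) %>% select(observation, everything())
# Save the full table as an observation-to-Feature.ID reference
OtuOb <- as_tibble(Ftable)
write_tsv(OtuOb, "OtuToObservationReference.tsv")
# Drop the Feature.ID column (column 2); 'observation' now identifies each row
Ftable <- Ftable[, -2]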

The second alters how taxonomy tables are formatted (e.g. it separates the taxonomy into its taxonomic levels, which makes working in R much easier if you want to display this information):

library(tidyverse)
library(vegan)

# Delete any data with those names (rm() only warns if an object is missing)
suppressWarnings(rm(taxonomy, taxonomyInt, taxonomySep))
taxonomy <- read_tsv("taxonomy.tsv", col_names = TRUE)
# Peel off one rank at a time by splitting at the next rank prefix (p_, c_, ...).
# extra = "merge" keeps everything after the first match instead of dropping it;
# fill = "right" fills missing ranks with NA without a warning.
taxonomyInt <- taxonomy %>% separate(Taxon, into = c("K", "P"), extra = "merge", fill = "right", sep = "p_")
taxonomyInt <- taxonomyInt %>% separate(P, into = c("P", "C"), extra = "merge", fill = "right", sep = "c_")
taxonomyInt <- taxonomyInt %>% separate(C, into = c("C", "O"), extra = "merge", fill = "right", sep = "o_")
taxonomyInt <- taxonomyInt %>% separate(O, into = c("O", "F"), extra = "merge", fill = "right", sep = "f_")
taxonomyInt <- taxonomyInt %>% separate(F, into = c("F", "G"), extra = "merge", fill = "right", sep = "g_")
taxonomySep <- taxonomyInt %>% separate(G, into = c("G", "S"), extra = "merge", fill = "right", sep = "s_")
taxonomySep <- as.data.frame(taxonomySep)
# Strip the leftover kingdom prefix, then mark underscores and semicolons with "Z_"
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("k_", "", x)}))
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("_", "Z_", x)}))
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub(";", "Z_", x)}))
# "Z_Z_" means the rank was empty: gsub() with an NA replacement sets those entries to NA
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("Z_Z_", NA, x)}))
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("Z_", "", x)}))
taxonomySep[taxonomySep == ""] <- NA
# Remove square brackets, "Unassigned" entries and spaces
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("\\[", "", x)}))
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub("]", "", x)}))
taxonomySep[taxonomySep == "Unassigned"] <- NA
taxonomySep <- data.frame(lapply(taxonomySep, function(x) {gsub(" ", "", x)}))
taxonomySep # To check that it has removed all the crap
write_tsv(taxonomySep, "taxonomySep.tsv")
rm(taxonomy, taxonomyInt)

Like I said, I am NOT a competent R user, but hopefully this will be useful to @Sirtaj-Singh. Alternatively, I am happy to upload it in a more tutorial-based format if people want to add to it or improve it where they can.
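
For anyone who would rather do the taxonomy split before loading the data into R, the same idea can be sketched in Python. This is a minimal sketch under stated assumptions, not a drop-in replacement for the R script above: it assumes Greengenes-style strings such as "k__Bacteria; p__Firmicutes" (single-letter rank prefix followed by two underscores), and the names RANKS and split_taxon are made up for illustration.

```python
import re

# Column names mirror the ones used in the R script (K, P, C, O, F, G, S).
RANKS = ["K", "P", "C", "O", "F", "G", "S"]


def split_taxon(taxon):
    """Split a Greengenes-style taxonomy string into a dict of rank -> name.

    Hypothetical helper: empty ranks, missing ranks and "Unassigned"
    all come back as None.
    """
    names = []
    for part in taxon.split(";"):
        name = re.sub(r"^\s*[a-z]__", "", part)  # drop a rank prefix such as 'p__'
        name = re.sub(r"[\[\]\s]", "", name)     # drop square brackets and whitespace
        names.append(name if name and name != "Unassigned" else None)
    # Pad short lineages so every rank is present in the result.
    names += [None] * (len(RANKS) - len(names))
    return dict(zip(RANKS, names))


example = split_taxon("k__Bacteria; p__[Firmicutes]; c__Bacilli; o__; f__")
unassigned = split_taxon("Unassigned")
print(example)
print(unassigned)
```

Splitting on the semicolons first and stripping the prefix afterwards avoids relying on the rank letters as separators, so a name that happens to contain something like "c_" is not cut in the wrong place.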