Qiime2 and R downstream analysis

AYK · December 9, 2022, 9:31pm

Hi,

I am having trouble running the alpha diversity analysis on my data while following this much-appreciated tutorial.

I checked my data, here is the Venn diagram.

Note: the input file that I want to run contains 702 samples. (I am interested in the 702 samples).

Then, I added a column for the Faith phylogenetic diversity to the table.
Then, I ran the filter and plotting code below:

metadata %>%
filter(!is.na(FD)) %>%
ggplot .......... etc.

I got the error message below:

Error in filter():
! Problem while computing ..1 = !is.na(FD).
Input ..1 must be of size 778 or 1, not size 702.
Run rlang::last_error() to see where the error occurred.

I am not sure how I should tackle this problem. So, I tried the following:

I created a new metadata file (where I have 702 samples) that matches the FD2 file. Please see the Venn diagram here.

Then, I added a column for the Faith phylogenetic diversity to the table.
Then, I ran the filter and plotting code below:

metadata %>%
filter(!is.na(FD)) %>%
ggplot .......... etc.

I got a different error message:

Error in filter():
! Problem while computing ..1 = is.na(FD2).
Input ..1 must be a logical vector, not a logical[,3].

I'm having some difficulty resolving this issue and I would really appreciate your help and guidance.

jbisanz · December 12, 2022, 3:24pm

I think this is the same problem as the one below, which has a workaround. I will try to address it in a future update.
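
One plausible reading of both errors, sketched on made-up data: if FD is a separate data frame in the R session rather than a column of metadata, filter() evaluates is.na() on that whole object. The result is a logical matrix with one row per diversity sample instead of one value per metadata row. The object names and the faith_pd column below are hypothetical stand-ins, not the real qiime2R output.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: 5 metadata rows, diversity values for only 3 samples
metadata <- tibble(SampleID = paste0("S", 1:5), Age = c(6, 6, 12, 12, 18))
FD <- data.frame(faith_pd = c(4.2, 5.1, 3.9), row.names = c("S1", "S2", "S4"))

# FD is not a column of metadata, so is.na(FD) works on the whole data frame
# and returns a logical matrix, not one TRUE/FALSE per metadata row
dim(is.na(FD))

# Joining by SampleID first makes faith_pd a real column (NA where a sample is missing)
merged <- metadata %>%
  left_join(rownames_to_column(FD, "SampleID"), by = "SampleID")

# Now the filter refers to a column with one value per row
kept <- merged %>% filter(!is.na(faith_pd))
nrow(kept)
```

Filtering on the joined column name, rather than on the name of the separate object, keeps the condition the same length as the table.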

AYK · December 14, 2022, 3:51pm

Thank you very much for getting back to me. I tried the workaround described there and it worked.

However, I encountered another issue running evenness. I ran the code below for evenness:

metadata %>%
filter(!is.na(evenness)) %>%
ggplot(aes(x=Age, y= evenness, color=Practice)) +
stat_summary(geom="errorbar", fun.data=mean_se, width=0) +
stat_summary(geom="line", fun.data=mean_se) +
stat_summary(geom="point", fun.data=mean_se) +
xlab("Age") +
ylab("evenness") +
theme_q2r() + # try other themes like theme_bw() or theme_classic()
scale_color_viridis_d(name="Practice") # use different color scale which is color blind friendly
ggsave("evenness_by_time.pdf", height=3, width=4, device="pdf") # save a PDF 3 inches by 4 inches

The error message I got is below:

Error in filter():
! Problem while computing ..1 = !is.na(evenness).
Input ..1 must be of size 778 or 1, not size 702.
Run rlang::last_error() to see where the error occurred.

I checked that 702 samples have been merged and Pielou's Evenness column has been added to the metadata.

From other posts, it seems this code ran without issues for others when computing evenness, but it fails for me. Can you point me to what is likely causing the error?

Thank you very much!

colinbrislawn · December 24, 2022, 3:24am

Try replacing this line

filter(!is.na(evenness)) %>%

with the drop_na() function.

That will drop all rows with NAs, but maybe that's ok
(If a sample's Age is also NA you probably want to drop it, and this function will do that)

But wait, why are some of your evenness values missing? Do you have more samples in your metadata than you do in your Qiime2 output files?

AYK · December 29, 2022, 9:00pm

Thank you for getting back to me, Colin. I have more samples in the metadata file than I do in Qiime2 outputs and that's because when I set up the sampling depth, I lost some samples.

I tried running the following:

metadata <- read_q2metadata("metadata.tsv")
evenness <- read_qza("evenness_vector.qza")
evenness <- evenness$data %>% rownames_to_column("SampleID")

metadata <-
  metadata %>%
  left_join(evenness)

metadata %>%
  drop_na(evenness) %>%
  ggplot(aes(x=`Age-in-months`, y=evenness, color=status)) +
  stat_summary(geom="errorbar", fun.data=mean_se, width=0) +
  stat_summary(geom="line", fun.data=mean_se) +
  stat_summary(geom="point", fun.data=mean_se) +
  xlab("Age") +
  ylab("Evenness") +
  theme_q2r() +
  scale_color_viridis_d(name="status")
ggsave("evenness_by_Age.pdf", height=3, width=4, device="pdf")

I got this new error below:

Error in drop_na():
! Can't subset columns with evenness.
evenness must be numeric or character, not a <data.frame> object.

Then, I tried to fix that by converting the evenness column into a numeric or character type using the as.numeric() function. But I kept getting new errors.

I tried using the functions below, but each time I am getting a new error:

unlist() function
as.integer() function
na.omit() function

I would appreciate any help.

colinbrislawn · December 29, 2022, 10:00pm

This is the issue. While the evenness column should be a number, it's being merged in as a dataframe, resulting in a dataframe being nested within your dataframe.
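
A quick way to check this kind of diagnosis is to look at the joined table's column names and test whether any column is itself a data frame. The sketch below uses made-up stand-ins for metadata and the evenness table. The column names are assumptions for illustration only.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for the metadata and the evenness table
metadata <- tibble(SampleID = c("S1", "S2", "S3"), status = c("a", "b", "a"))
evenness <- tibble(SampleID = c("S1", "S3"), pielou_evenness = c(0.81, 0.74))

joined <- left_join(metadata, evenness, by = "SampleID")

# The new column keeps the name it had in the evenness table,
# which may differ from the name of the R object it came from
names(joined)

# TRUE for any column that is itself a data frame (a nested column)
nested <- vapply(joined, is.data.frame, logical(1))
nested
```

If nothing is nested, the next thing to compare is the column name used in drop_na() or filter() against the output of names().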

Here's a temporary workaround:
before you left_join(), pull out one column from your evenness dataframe.

evenness %>% names() # get column names
metadata <- metadata %>%
  left_join(evenness$OneColumnName) # enter in the column name here

AYK · December 30, 2022, 1:41am

Thank you very much, Colin! I appreciate your prompt response.

Is this a bug/flaw, which warrants the temporary workaround? Or was I doing something wrong?

I followed the workaround and I got the following,

evenness %>% names() # get column names
[1] "SampleID" "pielou_evenness" #this is the output of the evenness object

metadata <- metadata %>%
left_join(evenness$pielou_evenness) # enter in the column name here

I got this new error below:

Error in auto_copy():
! x and y must share the same src.
set copy = TRUE (may be slow).
Run rlang::last_error() to see where the error occurred.
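
For reference, this message usually appears when the second argument to left_join() is not a data frame. A column pulled out with $ is a bare vector with no SampleID to match on. The sketch below reproduces that on made-up data and shows one way around it, keeping SampleID next to the chosen column. The toy values are hypothetical. The column names follow the names() output above.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical stand-ins for the objects in this thread
metadata <- tibble(SampleID = c("S1", "S2", "S3"), status = c("a", "b", "a"))
evenness <- tibble(SampleID = c("S1", "S3"), pielou_evenness = c(0.81, 0.74))

# A single column extracted with $ is a plain numeric vector, not a table
is.data.frame(evenness$pielou_evenness)

# left_join() on that vector fails; capture the message instead of stopping
msg <- tryCatch(
  {
    left_join(metadata, evenness$pielou_evenness)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Keeping SampleID alongside the chosen column gives left_join() a key to match on
merged <- metadata %>%
  left_join(select(evenness, SampleID, pielou_evenness), by = "SampleID")

# Drop samples without an evenness value, using the joined column's own name
plot_data <- merged %>% drop_na(pielou_evenness)
```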

colinbrislawn · December 30, 2022, 4:51am

I think this is a bug, so a workaround is fine until it's fixed upstream.

It looks like your new metadata merge went well! Now you can work on graphing.

metadata %>%
  drop_na(pielou_evenness) %>%
  ggplot(aes(x=`Age-in-months`, y=pielou_evenness, color=status)) +
  geom_point()

The function stat_summary() can get fancy, so let's try it with basic ggplot2 functions to make sure the Qiime2R stuff is working.

AYK · December 30, 2022, 7:50pm

This part below went through.
metadata %>%
drop_na(pielou_evenness) %>%
ggplot(aes(x=`Age-in-months`, y=pielou_evenness, color=status)) +
geom_point()

I am getting a graph, but it is not the graph I want.
The stat_summary() and scale_x_discrete() calls and the rest of the code are not working. I get new, different errors when I add the part below:

stat_summary(geom="errorbar", fun.data=mean_se, width=0) +
stat_summary(geom="line", fun.data=mean_se) +
stat_summary(geom="point", fun.data=mean_se) +
scale_x_discrete(limits = c("6 months", "12 months", "18 months", "24 months")) +
xlab("Age") +
ylab("pielou_evenness") +
theme_q2r() +
ggsave("evenness_by_Age.pdf", height=3, width=4, device="pdf")
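
A minimal sketch of this kind of summary plot on made-up data, not the dataset from this thread. It assumes age is stored as text labels, sets group = status so the summary line connects points across the discrete x axis, and calls ggsave() as a separate statement instead of adding it with +. theme_bw() stands in for theme_q2r(), and the toy column names are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data standing in for the merged metadata
set.seed(1)
age_levels <- c("6 months", "12 months", "18 months", "24 months")
toy <- data.frame(
  age = rep(age_levels, each = 6),
  status = rep(c("case", "control"), times = 12),
  pielou_evenness = runif(24, min = 0.5, max = 0.9)
)

# group = status lets the summary line join the per-age means within each status
p <- ggplot(toy, aes(x = age, y = pielou_evenness, color = status, group = status)) +
  stat_summary(geom = "errorbar", fun.data = mean_se, width = 0) +
  stat_summary(geom = "line", fun.data = mean_se) +
  stat_summary(geom = "point", fun.data = mean_se) +
  scale_x_discrete(limits = age_levels) +  # fixes the order of the age labels
  xlab("Age") +
  ylab("Pielou evenness") +
  theme_bw()

# ggsave() is its own call and takes the plot object explicitly
out <- file.path(tempdir(), "evenness_by_age.pdf")
ggsave(out, plot = p, height = 3, width = 4, device = "pdf")
```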

colinbrislawn · December 30, 2022, 10:14pm

That's great to hear! I'm glad we figured it out!

I think we may be moving beyond the qiime2r question, and into questions about building graphs for publication. From the code of conduct:

This type of work is generally associated with authorship, and researchers who share responsibility for these aspects of their work often do so through formal collaborations with co-authors.

AYK · December 31, 2022, 4:12pm

Thank you very much, Colin.
I am grateful for your time, effort, and assistance!