Taxa barplot at genus level, grouping the rarest genera, without using Excel

Hello,
I'm a novice when it comes to using QIIME2 and I'm currently doing metabarcoding. I managed to get a barplot of the bacterial genera making up the flora of my samples after filtering out what doesn't interest me (removing eukaryotes from my samples, etc.), and I wanted to use the .csv results to make a figure. However, the file is unreadable: each taxonomic level of an identification ends up in a separate cell (e.g. Cell 1: d_Bacteria, Cell 2: Bacteroidota...). Given the number of genera identified, grouping and sorting them by hand would be a colossal amount of work and could lead to careless errors.

So my question is this: Is there a way to retrieve a taxa barplot and modify it (group the rarest genera together in the barplot, change the color, etc.) without going through the .csv file but with R for example?

Sorry in advance if I'm not using the right jargon or if a post answering my problem has already been posted, I'm learning a bit on the job.

Thanks in advance.

Hi @Mout,

Don't worry! A fuller explanation would make it easier for me and others to help you.

For example, have you gotten to the point where you have run something like the following (my example is 18s sequencing but the premise is similar):

qiime feature-classifier classify-sklearn \
  --i-classifier /references/silva-138-99-seqs-18s-region-600bp-classifier.qza \
  --i-reads 18s_rep-seqs.qza \
  --o-classification 18s_taxonomy_results.qza

qiime taxa barplot \
  --m-metadata-file my_metadata.tsv \
  --i-table 18s_table.qza \
  --i-taxonomy 18s_taxonomy_results.qza \
  --o-visualization 18s_taxa_bar_plots.qzv

And then you've loaded your version of 18s_taxa_bar_plots.qzv into QIIME 2 View, selected the taxonomic level you'd like from the drop-down (genus, so level 6), and downloaded the .csv file, but you are having an issue importing it into Excel? I think the issue could just be the delimiter that Excel is using by default: the taxonomy strings contain semicolons, so if Excel splits on ";" each rank will likely land in its own cell.

I'm not sure if you are aware, but you can sort by abundance and by your metadata in QIIME 2 View using the other drop-down menus. But really, the only two ways to examine that output are interactively in QIIME 2 View or via the csv file in R or Python.

Are you used to coding in R or Python? Have you tried to import your csv using either of these, and if so, what was the error message? Think about commands like groupby or unique or set to explore the data.
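To make that concrete, here is a minimal Python sketch of this kind of exploration, using only the standard library. It assumes the exported file has an "index" column of sample IDs, one count column per taxon whose header is the full semicolon-separated taxonomy starting with "d__", and metadata columns after that. The table, the sample names and the read_counts helper are all made up for illustration; with a real file you would open level-6.csv instead of the toy string.

```python
import csv
import io

# Made-up stand-in for level-6.csv. Assumed layout: an "index" column of
# sample IDs, one read-count column per taxon, then metadata columns.
TOY_CSV = (
    "index,"
    "d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides,"
    "d__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus,"
    "site\n"
    "sample_A,120.0,30.0,gut\n"
    "sample_B,80.0,70.0,soil\n"
)


def read_counts(handle, taxon_prefix="d__"):
    """Return (sample_ids, {taxon: [count per sample]}) from an open CSV.

    Columns whose header starts with `taxon_prefix` are treated as taxa;
    everything else (e.g. metadata) is ignored. The prefix is an assumption.
    """
    reader = csv.DictReader(handle)
    taxa = [name for name in reader.fieldnames if name.startswith(taxon_prefix)]
    samples = []
    counts = {taxon: [] for taxon in taxa}
    for row in reader:
        samples.append(row["index"])
        for taxon in taxa:
            counts[taxon].append(float(row[taxon]))
    return samples, counts


samples, counts = read_counts(io.StringIO(TOY_CSV))

# "set"-style exploration: which phyla (second rank) are present?
phyla = {taxon.split(";")[1] for taxon in counts}

# "groupby"-style exploration: total reads per phylum over all samples
reads_per_phylum = {}
for taxon, values in counts.items():
    phylum = taxon.split(";")[1]
    reads_per_phylum[phylum] = reads_per_phylum.get(phylum, 0.0) + sum(values)

print(f"{len(samples)} samples, {len(counts)} taxon columns")
print(sorted(phyla))
print(reads_per_phylum)
```

For a real export, something like `with open("level-6.csv", newline="") as fh: samples, counts = read_counts(fh)` should be the only change needed, provided the column layout matches the assumptions above.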

all the best,

Vic

Hello Victoria,

Thank you very much for your quick reply!

Yes, that's exactly what I did. In my case, it's for 16s.

I've also visualized my data using metadata, but I don't have a drop-down menu to sort by abundance...

I do my QIIME2 processing on the Linux command terminal using Python. But from a practical point of view, I'd like to manipulate my data and do my figures in R (which I'm also learning to master).

I've imported my results (level-6.csv) into R using the "read.csv" command, and my taxonomies seem to have grouped together nicely! However, is there any way to convert "full-taxonomies names" to "only-genus names"?
Also, the genera I'd like to group into an "other" category must be the least common across all my samples and not just in one. Is there a way in R to determine which ones they are?

Once I've determined these, I'll be able to group them using the functions you've mentioned, I suppose!

Thanks again for your valuable help and have a great day!

Hi again @Mout,

From this:

However, is there any way to convert "full-taxonomies names" to "only-genus names"?
Also, the genera I'd like to group into an "other" category must be the least common across all my samples and not just in one. Is there a way in R to determine which ones they are?

It appears that the root of your issue is your familiarity with R rather than a QIIME 2 issue, and this forum is mainly for QIIME 2 issues.

So I'd recommend you invest some time in your R training, as it will stand you in good stead for future projects and work. Bioinformatics and data analysis almost always require data wrangling and editing! Search for how to work with data frames in R, how to edit headers, and how to split strings, as these are the kinds of basic operations you'll need.
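As an illustration of the string-splitting step, here is a minimal sketch in Python; the same idea carries over to R. It assumes the column headers are semicolon-separated taxonomy strings with rank prefixes such as "g__", and that missing ranks appear as "__" or an empty prefix. The genus_label helper and the "Unclassified ..." fallback naming are my own choices for the example, not something QIIME 2 produces.

```python
def genus_label(taxonomy):
    """Shorten 'd__Bacteria;...;g__Bacteroides' to 'Bacteroides'.

    If no genus name is present, fall back to the deepest rank that does
    have a name, e.g. 'Unclassified Lachnospiraceae'.
    """
    ranks = [part.strip() for part in taxonomy.split(";")]
    # Keep only ranks that carry an actual name after the 'x__' prefix
    named = [rank for rank in ranks if rank.split("__", 1)[-1]]
    for rank in named:
        if rank.startswith("g__"):
            return rank[len("g__"):]
    if named:
        return "Unclassified " + named[-1].split("__", 1)[-1]
    return "Unclassified"


# Made-up headers in the assumed format
headers = [
    "d__Bacteria;p__Bacteroidota;c__Bacteroidia;o__Bacteroidales;f__Bacteroidaceae;g__Bacteroides",
    "d__Bacteria;p__Firmicutes;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;__",
    "d__Bacteria;__;__;__;__;__",
]
short_names = [genus_label(header) for header in headers]
print(short_names)
```

One thing to watch for: different full taxonomies can end up with the same short label (several families may each contain an "uncultured" genus, for instance), so it is safer to sum columns that share a label than to assume the short names are unique.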

Unfortunately, this is where I have to hold my hands up :raised_hands: and admit that I only speak Parseltongue :snake: (I do nearly all of my work in Python).
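In that spirit, a rough Python sketch of the "Other" grouping might look like the following. It is one possible approach rather than a standard method: counts are converted to relative abundances per sample, genera are ranked by their mean relative abundance across all samples (so a genus that is rare in only one sample is not penalised), and everything outside the top few is summed into "Other". The collapse_rare function, the keep parameter and the toy counts are all invented for the example.

```python
def collapse_rare(counts, keep=10, other_label="Other"):
    """Keep the `keep` most abundant genera and sum the rest into one group.

    counts: {genus: [count in sample 1, count in sample 2, ...]}
    Returns {genus: [relative abundance per sample]}, ordered from most to
    least abundant, with `other_label` last if anything was collapsed.
    """
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    # Total reads per sample, used to turn counts into proportions
    totals = [sum(values[i] for values in counts.values()) for i in range(n_samples)]
    rel = {
        genus: [v / t if t else 0.0 for v, t in zip(values, totals)]
        for genus, values in counts.items()
    }
    # Rank by mean relative abundance across ALL samples
    ranked = sorted(rel, key=lambda genus: sum(rel[genus]) / n_samples, reverse=True)
    kept = {genus: rel[genus] for genus in ranked[:keep]}
    rare = ranked[keep:]
    if rare:
        kept[other_label] = [sum(rel[genus][i] for genus in rare) for i in range(n_samples)]
    return kept


# Made-up counts for two samples
toy_counts = {
    "Bacteroides": [50, 40],
    "Lactobacillus": [30, 50],
    "Prevotella": [15, 5],
    "Akkermansia": [5, 5],
}
collapsed = collapse_rare(toy_counts, keep=2)
for genus, proportions in collapsed.items():
    print(genus, [round(p, 3) for p in proportions])
```

Ranking by the mean is a judgment call; ranking by the maximum across samples would instead protect a genus that dominates a single sample. Either way, each sample's proportions still sum to 1 after collapsing, which is what a stacked barplot needs.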

I know this will seem like an unhelpful answer, but it's honestly going to be so much better for you to spend time learning R and putting the groundwork in.

good luck and all the best, :+1:

Vic
