taxonomy barchart filter

mohsen_ej · February 9, 2021, 11:19am

Dear all,
I see that in the Parkinson's mouse tutorial you filtered the samples before creating a taxonomy bar chart, but in Moving Pictures you didn't do this. What is the difference? What does this depend on?
Thank you

jwdebelius · February 9, 2021, 3:12pm

Hi @mohsen_ej,

If you go back in each tutorial, is there some point where the number of samples changes?

Best,
Justine

mohsen_ej · February 9, 2021, 4:28pm

Sorry, I didn't understand what you mean. Sorry if my question is simple, but I got confused and couldn't find the reason.
Thank you very much

jwdebelius · February 9, 2021, 4:32pm

Hi @mohsen_ej,

This is one of those cases where the best thing I can do is help you find the answer yourself. If you go into the tutorial, are there any steps before you generate your taxonomic barchart where you exclude sequences? Think about diversity analyses, and how you got the number of samples in that step.

Best,
Justine

mohsen_ej · February 9, 2021, 4:40pm

Thank you for your help.
In both cases, we chose a sampling depth, but we didn't take that sampling depth into account for both taxonomy bar charts. Sorry if I am missing something simple.

mohsen_ej · February 9, 2021, 4:58pm

Maybe because we excluded three of our samples in Moving Pictures, we don't want to drop any other samples in further steps. But we didn't lose many samples in the Parkinson's tutorial, so we can filter our samples.
I'm not sure.

jwdebelius · February 9, 2021, 11:44pm

Hi @mohsen_ej,

You should double-check the filtering step in the PD mice tutorial.

You could also check the rarefied tables to see if you lose samples in rarefaction.

Best,
Justine

mohsen_ej · February 10, 2021, 7:16pm

In alpha rarefaction, we lost about 50% of our samples, but in alpha diversity we chose another sampling depth. We chose the alpha rarefaction depth for the taxonomy bar chart. But the question is: we did the same steps in Moving Pictures, didn't we?

mohsen_ej · February 11, 2021, 6:48pm

I understand that in the PD tutorial we excluded some samples in alpha and beta diversity, and maybe we want to make our taxonomy more accurate by investigating the same samples. But in Moving Pictures we didn't filter before the taxonomy bar chart, even though we excluded some samples there too. Sorry, I got confused.

jwdebelius · February 15, 2021, 3:46pm

Hi @mohsen_ej,

Thanks for being patient; I spent some time looking. I'm not sure why samples aren't filtered in Moving Pictures. I think it's because it's based on an old version where samples were not filtered out (corresponding to the minimum depth). We should probably update some part of the tutorial to address this.

Best,
Justine

mohsen_ej · February 15, 2021, 6:10pm

So we conclude that normally we should filter our data before creating a taxonomy bar chart, right? Can we say the filtering value is always equal to the sampling depth used for core-metrics?

jwdebelius · February 15, 2021, 6:23pm

Hi @mohsen_ej,

Yes, I would filter my data so that the two datasets match. You want to describe and analyze the same set of samples.
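
As a rough illustration of what "matching" means here, this is a minimal Python sketch, not the QIIME 2 implementation. The toy table, the sample names, and the `filter_samples_by_depth` helper are all hypothetical, and the depth of 2000 is just an assumed value standing in for whatever sampling depth you passed to the diversity step. In QIIME 2 itself, `qiime feature-table filter-samples` with `--p-min-frequency` should do the equivalent job.

```python
# Hypothetical toy feature table: sample ID -> {taxon: read count}.
feature_table = {
    "sample_a": {"Bacteroides": 1500, "Lactobacillus": 900},
    "sample_b": {"Bacteroides": 300, "Lactobacillus": 150},
    "sample_c": {"Bacteroides": 4000, "Lactobacillus": 2500},
}


def filter_samples_by_depth(table, min_depth):
    """Keep only samples whose total read count is at least min_depth."""
    return {
        sample: counts
        for sample, counts in table.items()
        if sum(counts.values()) >= min_depth
    }


# Assumed value: the same sampling depth that was used for the diversity analyses.
sampling_depth = 2000
filtered_table = filter_samples_by_depth(feature_table, sampling_depth)

# Samples below the depth are dropped; counts in the kept samples are untouched.
print(sorted(filtered_table))
```

The kept samples are not rarefied here, only the low-depth ones are removed, so the bar chart and the diversity results describe the same set of samples.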

Best,
Justine

sbslee · February 15, 2021, 11:53pm

Hi @jwdebelius and @mohsen_ej,

Thanks for this interesting topic! I have been drawing taxa bar plots for a while, but haven't really considered filtering out samples with low depth (i.e. below the rarefaction depth). I guess this was never a big issue for me because my datasets typically have more than 5,000 reads per sample, so even though I do perform rarefaction for, say, beta diversity, I didn't bother with it for taxa bar charts.

However, reading through this post got me thinking: if we are willing to throw out samples below a rarefaction depth, shouldn't we just use a rarefied table for creating a taxa bar chart (i.e. all samples have the same depth)? If I understood the PD tutorial correctly, in the taxa bar chart step we are only filtering out samples with low depth, but the remaining samples could have vastly different total read depths.
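
To make the question concrete, here is a minimal Python sketch, assuming the bar chart plots relative abundances (proportions) rather than raw counts. The two samples and their counts are made up, and `relative_abundance` is a hypothetical helper, not a QIIME 2 function.

```python
def relative_abundance(counts):
    """Convert raw read counts for one sample into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Two hypothetical samples with the same composition but a tenfold depth difference.
shallow = {"Bacteroides": 1500, "Lactobacillus": 500}    # 2,000 reads in total
deep = {"Bacteroides": 15000, "Lactobacillus": 5000}     # 20,000 reads in total

print(relative_abundance(shallow))
print(relative_abundance(deep))
```

Under that assumption, the difference in total depth is scaled away for the abundant taxa, so what remains open is mostly how the rare taxa are affected.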

Interested to hear your (or anyone else's) thoughts on this!

jwdebelius · February 16, 2021, 1:32am

Hi @sbslee,

My view on stacked bar charts is that they're only useful as long as they're readable and you're interested in abundant things. Research (not mine!) suggests that people can distinguish between 8 and 12 colors. So, as soon as my color map has to start recycling colors, I tend to stop displaying additional categories. Depending on your taxonomic level and system, I would assume that these clades would be your most abundant ones, and they should be more resistant to rarefaction.
In real life, I find barcharts are useful for big environmental differences/big differences across systems. You can use them to diagnose if something is majorly weird in your community if you know what you're generally expecting. (I had a collaborator who sent me skin samples instead of fecal. We figured it out from stacked barcharts.) Otherwise, I find they're awesome when you're comparing body sites, humans vs adults, vaginal communities, and environmental samples. They're less useful to me in seeing other effects in the data, which often affect lower-abundance organisms... but a lot of my collaborators feel like they haven't had a proper microbiome analysis unless they've seen some kind of taxonomic composition plot.
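
One way to stay inside that 8 to 12 color limit is to keep only the most abundant categories and lump everything else together. This is a minimal Python sketch of that idea, not something taken from either tutorial; the taxa, the counts, and the `collapse_to_top_n` helper are hypothetical.

```python
def collapse_to_top_n(counts, n):
    """Keep the n most abundant taxa and sum all remaining taxa into 'Other'."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    collapsed = dict(ranked[:n])
    remainder = sum(value for _, value in ranked[n:])
    if remainder > 0:
        collapsed["Other"] = remainder
    return collapsed


# Hypothetical total read counts per genus across a study.
genus_counts = {
    "Bacteroides": 400,
    "Lactobacillus": 250,
    "Akkermansia": 150,
    "Prevotella": 100,
    "Clostridium": 60,
    "Escherichia": 40,
}

# With n=3 the plot needs four colors: three genera plus "Other".
print(collapse_to_top_n(genus_counts, n=3))
```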

The place where the sample-filtered, unrarefied table becomes important is differential abundance. And, since you'll need it later, it makes sense to generate it early and use it for bar charts (if you're going to do them).

Best,
Justine

mohsen_ej · February 16, 2021, 7:46am

Thank you very much for your useful information.

mohsen_ej · February 16, 2021, 8:05am

Sorry, one more question. If we want to determine the effect of lower-abundance taxa or, for example, markers, which method is best?
Thank you

system · March 19, 2021, 2:05pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.