# How to calculate a simple Jaccard similarity coefficient for two populations?

I am new at this, so my apologies if I have overlooked what should be a simple answer and is already in the Forum… I have looked but cannot find whether there’s code to run just a simple Jaccard similarity coefficient between two populations:

S = a / (a + b + c)

where S = Jaccard similarity coefficient,
a = number of species in Sample A and Sample B (joint occurrences)
b = number of species in Sample B but not in Sample A
c = number of species in Sample A but not in Sample B
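For two samples, the formula above can be checked by hand with a few lines of Python. This is only a minimal sketch: the function name `jaccard_similarity` and the species lists are hypothetical, and it assumes each sample is given as a plain list of species names (presence/absence only).

```python
def jaccard_similarity(sample_a, sample_b):
    """Return S = a / (a + b + c) for two collections of species names."""
    set_a, set_b = set(sample_a), set(sample_b)
    a = len(set_a & set_b)  # species present in both samples
    b = len(set_b - set_a)  # species in Sample B but not in Sample A
    c = len(set_a - set_b)  # species in Sample A but not in Sample B
    total = a + b + c
    if total == 0:
        raise ValueError("both samples are empty, so S is undefined")
    return a / total


# Hypothetical species lists, for illustration only.
sample_a = ["sp1", "sp2", "sp3", "sp4"]
sample_b = ["sp3", "sp4", "sp5"]

similarity = jaccard_similarity(sample_a, sample_b)
print(similarity)  # a = 2, b = 1, c = 2, so S = 2 / 5
```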

I know that Jaccard results can be displayed as Emperor plots, but I want something far more basic because I am comparing only two populations (or species, or samples, or whatever).

Also, is it possible to get a simple calculation of Shannon’s H for each sample/population?
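For reference, Shannon’s H for a single sample is H = -Σ p_i · log(p_i), where p_i is the proportion of counts belonging to species i. A rough sketch follows, assuming the sample is a list of per-species counts. The function name `shannon_entropy` and the counts are hypothetical, and the logarithm base is a parameter because tools differ on it (natural log and base 2 are both common), so values are only comparable when the base matches.

```python
import math


def shannon_entropy(counts, base=math.e):
    """Return Shannon's H for a list of per-species counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("the sample has no counts")
    h = 0.0
    for count in counts:
        if count > 0:  # species with zero counts contribute nothing
            p = count / total
            h -= p * math.log(p, base)
    return h


# Hypothetical counts: four equally abundant species.
even_sample = [10, 10, 10, 10]

h_natural = shannon_entropy(even_sample)       # ln(4), about 1.386
h_bits = shannon_entropy(even_sample, base=2)  # log2(4)
print(h_natural, h_bits)
```

A perfectly even sample with k species gives H = log(k), which is a handy sanity check.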

Thank you for your time!

Best regards, Brian

Totally possible! Check out `qiime diversity beta` for jaccard distance and `qiime diversity alpha` for shannon.
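One thing to keep in mind: the Jaccard metric there is a distance, which is 1 minus the similarity coefficient S from the question, and the output is a pairwise matrix over all samples in the table. A small sketch of that relationship, with hypothetical sample names and species sets (this is not the plugin's own code, and treating two empty samples as distance 0 is just an assumption made here):

```python
from itertools import combinations


def jaccard_distance(set_a, set_b):
    """Return 1 - S, where S = |A and B| / |A or B|."""
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty samples are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


# Hypothetical presence/absence data, one set of species per sample.
samples = {
    "S1": {"sp1", "sp2", "sp3", "sp4"},
    "S2": {"sp3", "sp4", "sp5"},
    "S3": {"sp1", "sp2", "sp3", "sp4"},
}

# One distance per unordered pair of samples.
distances = {
    (first, second): jaccard_distance(samples[first], samples[second])
    for first, second in combinations(sorted(samples), 2)
}

for (first, second), distance in distances.items():
    print(first, second, round(distance, 3))
```

To get back to the similarity for a pair, take 1 minus the distance.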

Best,
Justine

Okay, so I tried the following:

qiime diversity alpha --i-table bacteria-table.qza --p-metric shannon --o-alpha-diversity shannon_table

where bacteria-table.qza is my filtered table (with eukaryotes, archaea, and bacteria identified only to the Domain level removed); it should be analogous to the table.qza from the Moving Pictures tutorial.

I did generate shannon_table.qza, but I am unsure of the commands necessary to visualize the results! I know this should be easy, but I just cannot seem to get the correct syntax…

Also, is there a way to integrate my metadata file so that I can see the two groups?

Any help with the visualization is appreciated! Thank you!

Best regards, Brian

Hey there @l.brian.patrick!

Have you seen the `alpha-group-significance` visualizer?

https://docs.qiime2.org/2019.10/plugins/available/diversity/alpha-group-significance/

The visualizer linked above takes a metadata file as input, so you should be all set!
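Conceptually, the visualizer groups the per-sample alpha diversity values by each categorical metadata column and compares the groups (it reports Kruskal-Wallis tests). The sketch below shows only the grouping step, with group medians, and does not run any statistical test. The sample IDs, values, group labels, and the function name `summarize_by_group` are all made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-sample Shannon values and metadata groups.
shannon_by_sample = {"S1": 3.1, "S2": 2.8, "S3": 4.0, "S4": 4.4}
group_by_sample = {"S1": "GroupA", "S2": "GroupA", "S3": "GroupB", "S4": "GroupB"}


def summarize_by_group(values, groups):
    """Return the median value for each metadata group."""
    grouped = defaultdict(list)
    for sample_id, value in values.items():
        if sample_id in groups:  # skip samples missing from the metadata
            grouped[groups[sample_id]].append(value)
    return {group: median(group_values) for group, group_values in grouped.items()}


medians = summarize_by_group(shannon_by_sample, group_by_sample)
for group, value in sorted(medians.items()):
    print(group, value)
```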

That worked brilliantly! Thank you so very much!

Okay, one last question (I think)-- is there a way to integrate sampling depth into these analyses? As I currently have the arguments structured, they are using a table.qza (specifically bacteria-table.qza, as specified in my earlier post) that hasn’t accounted for sampling depth. Thus, it appears that all of my samples are included in the analyses.

For Shannon’s I ran the following:

qiime diversity alpha --i-table bacteria-table.qza --p-metric shannon --o-alpha-diversity shannon_table

qiime diversity alpha-group-significance --i-alpha-diversity shannon_table.qza --m-metadata-file mdat.tsv --o-visualization shannon_table.qzv

For Jaccard’s similarity I ran the following:

qiime diversity beta --i-table bacteria-table.qza --p-metric jaccard --o-distance-matrix jaccard_table

qiime diversity beta-group-significance --i-distance-matrix jaccard_table.qza --m-metadata-file mdat.tsv --m-metadata-column State --o-visualization jaccard_table.qzv

where State is the categorical variable chosen from the metadata file.

I do run the following command to get some core metrics:

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table bacteria-table.qza --p-sampling-depth #### --m-metadata-file mdat.tsv --output-dir core-metrics-results

where #### is determined using rarefaction.

Is there an output from the above core-metrics command that would let me use only the samples at or above the specified sampling depth?

I did find this article in the Forum, so I do have the basics of including the phylogeny into the Shannon’s and Jaccard, if that is the way to go:

Thank you so very much for your help and I appreciate your time helping the noob figure out some of these specifics!

Best regards, Brian

Wait, when I ran qiime diversity core-metrics-phylogenetic and created the output directory, I just noticed that there’s a shannon_vector.qza and a jaccard_distance_matrix.qza within that directory. Are these based on the sampling depth I set in that same command??? If so, then I have answered my own question???

Pretty sure my face should be pretty red with embarrassment at the moment…

Best regards, Brian

Yep! Those are based off the sampling depth you set there.
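In other words, the table is rarefied to that depth first: samples with fewer total counts than the sampling depth are dropped, and every remaining sample is randomly subsampled without replacement down to exactly that depth before the metrics are computed. A rough illustration of that idea follows. It is not QIIME 2's actual implementation, and the function name `rarefy`, the table, and the depth of 50 are hypothetical.

```python
import random


def rarefy(table, depth, seed=0):
    """Subsample each sample to `depth` counts; drop samples below `depth`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rarefied = {}
    for sample_id, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # too shallow: excluded from downstream metrics
        # Expand counts into one entry per observation, then draw without replacement.
        pool = [feature for feature, n in counts.items() for _ in range(n)]
        subsample = {}
        for feature in rng.sample(pool, depth):
            subsample[feature] = subsample.get(feature, 0) + 1
        rarefied[sample_id] = subsample
    return rarefied


# Hypothetical feature table: sample -> {feature: count}.
table = {
    "S1": {"sp1": 50, "sp2": 30, "sp3": 20},
    "S2": {"sp1": 5, "sp2": 3},
    "S3": {"sp2": 60, "sp4": 40},
}

rarefied = rarefy(table, depth=50)
print(sorted(rarefied))  # S2 has only 8 counts, so it is dropped
```

That is why the files in the core-metrics output directory can contain fewer samples than a table passed straight to `qiime diversity alpha` or `qiime diversity beta`.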

Best,
Justine
