How to calculate a simple Jaccard similarity coefficient for two populations?

l.brian.patrick · February 19, 2020, 11:38pm

Okay, one last question (I think): is there a way to integrate sampling depth into these analyses? As my commands are currently structured, they use a table.qza (specifically bacteria-table.qza, as specified in my earlier post) that hasn't been rarefied or filtered to a sampling depth. Thus, it appears that all of my samples are included in the analyses.

For Shannon's I ran the following:

qiime diversity alpha --i-table bacteria-table.qza --p-metric shannon --o-alpha-diversity shannon_table

qiime diversity alpha-group-significance --i-alpha-diversity shannon_table.qza --m-metadata-file mdat.tsv --o-visualization shannon_table.qzv

For Jaccard's similarity I ran the following:

qiime diversity beta --i-table bacteria-table.qza --p-metric jaccard --o-distance-matrix jaccard_table

qiime diversity beta-group-significance --i-distance-matrix jaccard_table.qza --m-metadata-file mdat.tsv --m-metadata-column State --o-visualization jaccard_table.qzv

where State is the categorical variable chosen from the metadata file.
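
For reference, here is a rough plain-Python sketch of what the jaccard metric measures for a pair of samples. It is only an illustration, not QIIME 2's implementation: the `jaccard_similarity` helper and the toy ASV counts are made up for this example. Jaccard ignores abundances and looks only at presence/absence, and the distance matrix holds one minus the similarity.

```python
def jaccard_similarity(counts_a, counts_b):
    """Shared features divided by all features seen in either sample."""
    # Reduce counts to presence/absence: abundance is ignored by Jaccard.
    present_a = {feature for feature, n in counts_a.items() if n > 0}
    present_b = {feature for feature, n in counts_b.items() if n > 0}
    union = present_a | present_b
    if not union:
        raise ValueError("both samples are empty; Jaccard is undefined")
    return len(present_a & present_b) / len(union)


# Toy feature counts for two samples (hypothetical data).
sample_a = {"ASV1": 10, "ASV2": 0, "ASV3": 4, "ASV4": 1}
sample_b = {"ASV1": 3, "ASV2": 7, "ASV3": 0, "ASV4": 2}

similarity = jaccard_similarity(sample_a, sample_b)  # 2 shared / 4 total
distance = 1 - similarity                            # the dissimilarity form
print(similarity, distance)
```

With these toy counts the two samples share 2 of the 4 features observed, so both the similarity and the distance come out to 0.5.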

I also run the following command to get some core metrics:

qiime diversity core-metrics-phylogenetic --i-phylogeny rooted-tree.qza --i-table bacteria-table.qza --p-sampling-depth #### --m-metadata-file mdat.tsv --output-dir core-metrics-results

where #### is determined using rarefaction.
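
As a rough illustration of what a sampling depth does conceptually, here is a minimal plain-Python sketch. The `rarefy` helper, the sample IDs, and the counts are all hypothetical, and this is not QIIME 2's actual code: samples with fewer reads than the depth are dropped, and the remaining samples are subsampled without replacement down to exactly that depth.

```python
import random


def rarefy(table, depth, seed=0):
    """Drop samples below `depth`; subsample the rest to exactly `depth` reads."""
    rng = random.Random(seed)
    rarefied = {}
    for sample_id, counts in table.items():
        # Expand counts into one entry per read so reads can be drawn at random.
        reads = [feature for feature, n in counts.items() for _ in range(n)]
        if len(reads) < depth:
            continue  # too shallow: this sample is excluded entirely
        drawn = rng.sample(reads, depth)  # sampling without replacement
        rarefied[sample_id] = {f: drawn.count(f) for f in set(drawn)}
    return rarefied


# Toy feature table: sample -> feature -> read count (hypothetical data).
table = {
    "S1": {"ASV1": 50, "ASV2": 30, "ASV3": 20},  # 100 reads
    "S2": {"ASV1": 5, "ASV2": 3},                # 8 reads, below the depth
    "S3": {"ASV1": 40, "ASV3": 60},              # 100 reads
}

rarefied = rarefy(table, depth=50)
print(sorted(rarefied))
```

Here S2 falls below the depth of 50 and is removed, while S1 and S3 are each reduced to 50 reads.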

Is there an output to the above core-metrics that I could use that would allow me to only use samples at or above the specified sampling depth?

I did find this article in the Forum, so I do have the basics of including the phylogeny in the Shannon and Jaccard analyses, if that is the way to go:

Thank you so very much for your help and I appreciate your time helping the noob figure out some of these specifics!

Best regards, Brian