Beta Diversity Trees

Sara_Jeanne08 · March 3, 2018, 8:30pm

Hi,

I am hoping to find out how to obtain the beta-diversity tree (.tre) files from several beta diversity tests (e.g. jackknifed_beta_diversity, binary distance jaccard, bray curtis, hamming). In QIIME 1, .tre file outputs were included in the output directory when running several of these non-phylogenetic beta-diversity tests (see the QIIME 1 "beta_diversity_metrics: List of available metrics" page). Could you help me obtain these types of tree files with my QIIME 2 data?

EDIT: I believe the QIIME 1 scripts that build and compare the trees from the beta diversity distance matrices are upgma_cluster.py and tree_compare.py.

Thank you,

Sara

gregcaporaso · March 5, 2018, 3:31pm

Hi @Sara_Jeanne08, I believe the tree you're looking for is generated as part of the qiime diversity beta-rarefaction command. See the Clustering tab in the resulting visualization. Does that get you what you need?

Sara_Jeanne08 · March 6, 2018, 2:37pm

@gregcaporaso Thank you for your help with this. The .tre files produced by the beta-rarefaction command are exactly what I am looking for.

I do need to find a way to collapse my feature/BIOM table file before running this again, as I have 3 replicates for each sample that I need to combine by either sum or mean (not sure which is best), so that I have one node for each sample. QIIME 1 had a collapse table script, collapse_samples.py. What is the QIIME 2 equivalent of this command?

Thank you very much!

thermokarst · March 6, 2018, 10:01pm

Hi @Sara_Jeanne08! You can use the feature-table group method to collapse samples based on sample metadata (you can also collapse along the other axis of the table, if that is of interest to you). This method lets you specify a metadata column to group by, so if you have multiple replicates for each sample, you could do something like this with your metadata:

sample-id   base-sample   replicate
s1-a        s1            a
s1-b        s1            b
s1-c        s1            c
s2-a        s2            a
s2-b        s2            b
s2-c        s2            c
...

then, when running the feature-table group method, you can specify base-sample as the column to group by.

One other thing to consider, and I don't know what is best for you and your study, but you will need to specify a grouping mode (--p-mode [median-ceiling|sum|mean-ceiling]), which means you will choose to sum the counts, or compute the median or the mean of the counts (the ceiling business is for rounding up to the nearest whole number - you can't have a fraction of an observation, right?). I suspect you will want to play with that option to get a sense of how it impacts downstream portions of your analysis.
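To make the three modes concrete, here is a minimal Python sketch of the arithmetic applied to one feature's counts across three replicates. It is an illustration only, not QIIME 2's implementation; the function name collapse_counts and the example counts are made up.

```python
import math
import statistics


def collapse_counts(counts, mode):
    """Collapse the replicate counts of one feature into a single count.

    Hypothetical helper that mirrors the three mode names; the ceiling
    keeps the result a whole number of observations.
    """
    if mode == "sum":
        return sum(counts)
    if mode == "mean-ceiling":
        return math.ceil(statistics.mean(counts))
    if mode == "median-ceiling":
        return math.ceil(statistics.median(counts))
    raise ValueError("unknown mode: " + mode)


# Made-up counts for one feature in replicates s1-a, s1-b, s1-c.
replicates = [10, 3, 4]

for mode in ("sum", "mean-ceiling", "median-ceiling"):
    print(mode, collapse_counts(replicates, mode))
# sum 17
# mean-ceiling 6
# median-ceiling 4
```

Note how sum scales with the number of replicates while the other two modes do not, which matters when choosing a sampling depth afterwards.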

Keep us posted!

Sara_Jeanne08 · March 8, 2018, 11:33pm

Hi @thermokarst,

Thank you for your time and help with this. I really appreciate it. I have been trying to collapse everything down by Base_Sample, but I am getting an error saying that one of the parameters (--m-metadata-column) does not exist:

(qiime2-2017.12) bash-3.2$ qiime feature-table group --i-table 20180306_Cricket_Venom_Analysis_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned.qza --p-axis Base_Sample --m-metadata-file 20180303_Spider_Microbiome_Cricket_Ptep_Venom_Analysis_Mapping_FIle.txt --m-metadata-column Base_Sample --p-mode median-ceiling --o-grouped-table 20180306_Cricket_Venom_Analysis_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned_GROUPED_MEDIAN --verbose
Error: no such option: --m-metadata-column

I am uncertain on how to proceed.

Thank you,
Sara

Sara_Jeanne08 · March 8, 2018, 11:39pm

@thermokarst - I think I found the problem:

It looks like there is a typo in the doc page for the feature-table group command:

--m-metadata-column MetadataColumn[Categorical]
Column from metadata file or artifact
viewable as metadata. A column defining the
groups. Each unique value will become a new
ID for the table on the given axis.
[required]

It should be the following from the options when viewed in the terminal screen:

--m-metadata-category TEXT Category from metadata file or artifact
viewable as metadata. A category defining
the groups. Each unique value will become a
new ID for the table on the given axis.
[required]

Also, I am confused by the --p-axis option. I do not know what value to place here. I only want to collapse by the base sample (group replicates).

Thank you!

Sara

thermokarst · March 8, 2018, 11:50pm

Hi @Sara_Jeanne08!

You are running QIIME 2 2017.12, but you are looking at the docs for QIIME 2 2018.2! A lot has changed in between those releases! You can change which version of a doc page you are looking at by using the dropdown on the left side of the page, although I would recommend upgrading to 2018.2, the latest and greatest.

Feature tables are two-dimensional: samples on one axis and features on the other. If you want to "collapse by the base sample," that means you want to perform the collapsing procedure along the sample axis, preserving the features as-is. Make sense?
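As a rough picture of what collapsing along the sample axis means, the Python sketch below groups a tiny made-up table by base sample using a sum. The table layout, the IDs, and the function name group_samples_by_sum are assumptions for illustration, not how QIIME 2 stores or processes tables.

```python
from collections import defaultdict

# Made-up table: sample ID -> {feature ID: count}.
table = {
    "s1-a": {"otu1": 5, "otu2": 0},
    "s1-b": {"otu1": 7, "otu2": 1},
    "s2-a": {"otu1": 0, "otu2": 9},
    "s2-b": {"otu1": 2, "otu2": 4},
}

# Metadata column to group by: replicate ID -> base sample.
base_sample = {"s1-a": "s1", "s1-b": "s1", "s2-a": "s2", "s2-b": "s2"}


def group_samples_by_sum(table, groups):
    """Merge samples that share a group; feature IDs are left untouched."""
    grouped = defaultdict(lambda: defaultdict(int))
    for sample_id, counts in table.items():
        for feature_id, count in counts.items():
            grouped[groups[sample_id]][feature_id] += count
    return {group: dict(counts) for group, counts in grouped.items()}


collapsed = group_samples_by_sum(table, base_sample)
print(collapsed)
# {'s1': {'otu1': 12, 'otu2': 1}, 's2': {'otu1': 2, 'otu2': 13}}
```

The sample IDs change (s1-a and s1-b become s1) while otu1 and otu2 stay as they were. Collapsing along the feature axis would do the reverse.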

Hope that helps!

Sara_Jeanne08 · March 8, 2018, 11:59pm

@thermokarst
Thank you so much! I didn't realize I was using the wrong doc page. The issue is that I have done more than half of all the analyses for my paper and dissertation with the previous version, so I am hesitant to upgrade.

I appreciate you explaining what to put in for the --p-axis option. I really appreciate it - I have it running now!

Sara

thermokarst · March 9, 2018, 12:04am

Personally (and of course, I am biased), I wouldn't hesitate. QIIME 2 artifacts are backwards compatible, and we are constantly improving and fixing bugs. You shouldn't need to start from scratch when upgrading between versions - plus, our install instructions are tailored to keep your old installations available to you. I think @gregcaporaso has just about every release of QIIME 2 on his work laptop right now! Also, the cool thing about automatic decentralized provenance is that it tracks what versions were used for each individual step, so you will still have all of that information (check out the provenance tab next time you load a viz into view.qiime2.org).

Keep on QIIMEin'!

Sara_Jeanne08 · March 9, 2018, 4:41pm

@thermokarst - Thank you so much for the details on the versions and backwards compatibility. I will keep that in mind.

So - I tried running my newly grouped table in the beta diversity rarefaction script but I keep getting errors. I am using the coverage value from the sample with the lowest number of summed sequences. Here is the latest:

(qiime2-2017.12) bash-3.2$ qiime diversity beta-rarefaction --i-table 20180309_WHOLE_SPIDER2_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned_GROUPED_SUM.qza --p-metric weighted_unifrac --p-clustering-method upgma --m-metadata-file 20180304_Spider_Microbiome_ALLWHOLE2_Mapping_FIle_noSgM.txt --p-sampling-depth 43263 --i-phylogeny ../20180225_PHII_All_VSearch_OpenRef_99per_Clustered_Seqs_10OTUsmin_NoSgM9_NoUnAssigned_MidPoint_Rooted_TREE.qza --output-dir 20180309_Beta_Rarefaction_Weighted_UNIFRAC --verbose

Traceback (most recent call last):
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/commands.py", line 224, in __call__
    results = action(**arguments)
  File "", line 2, in beta_rarefaction
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 228, in bound_callable
    output_types, provenance)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 424, in callable_executor
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_diversity/_beta/_beta_rarefaction.py", line 61, in beta_rarefaction
    emperor = _jackknifed_emperor(primary, support, metadata)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_diversity/_beta/_beta_rarefaction.py", line 166, in _jackknifed_emperor
    return Emperor(primary_pcoa, df, jackknifed=jackknifed_pcoa, remote='.')
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/emperor/core.py", line 204, in __init__
    self.mf = self.mf.loc[ordination.samples.index]
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1373, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1616, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1115, in _getitem_iterable
    self._has_valid_type(key, axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1472, in _has_valid_type
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['L_hes_whole', 'L_geo_whole', 'L_mac_whole', 'P_tep_whole',\n 'S_grossa_whole'],\n dtype='object')] are in the [index]"

Plugin error from diversity:

"None of [Index(['L_hes_whole', 'L_geo_whole', 'L_mac_whole', 'P_tep_whole',\n 'S_grossa_whole'],\n dtype='object')] are in the [index]"

See above for debug info.

EDIT: I also tried lowering my sampling depth to the same value I used with my replicates before grouping. I also tried my collapsed sum table with the qiime diversity core-metrics-phylogenetic script, and the same thing happened.

Thank you for your time and help with this,

Sara

thermokarst · March 9, 2018, 9:47pm

Did you remember to create a new sample metadata file? Since you collapsed your samples, you have all new sample IDs - make sense? By collapsing, you have effectively altered the root aspect of your study, so you will need to create a new sample metadata file to go along with it, for example:

sample-id         foo     bar
L_hes_whole       a       1
L_geo_whole       a       2
L_mac_whole       a       4
P_tep_whole       b       16
S_grossa_whole    b       32
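One way to catch this kind of mismatch before running a command is to compare the sample IDs in the grouped table against the IDs in the metadata file. The Python sketch below does that on in-memory text. The replicate IDs (L_hes_1 and so on), the shortened metadata contents, and the helper names are hypothetical; only the *_whole IDs come from the error message above.

```python
import csv
import io


def metadata_ids(tsv_text, id_column="sample-id"):
    """Return the set of IDs found in the ID column of tab-separated metadata."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_column] for row in reader}


def missing_from_metadata(table_ids, tsv_text):
    """List table sample IDs that have no matching row in the metadata."""
    return sorted(set(table_ids) - metadata_ids(tsv_text))


# Sample IDs in the grouped table (a subset of those in the KeyError).
grouped_table_ids = ["L_hes_whole", "L_geo_whole"]

# Old mapping file: still keyed by replicate IDs (hypothetical names).
old_metadata = (
    "sample-id\tBase_Sample\n"
    "L_hes_1\tL_hes_whole\n"
    "L_hes_2\tL_hes_whole\n"
    "L_geo_1\tL_geo_whole\n"
)

# New metadata file: keyed by the collapsed IDs.
new_metadata = (
    "sample-id\tfoo\n"
    "L_hes_whole\ta\n"
    "L_geo_whole\ta\n"
)

print(missing_from_metadata(grouped_table_ids, old_metadata))
# ['L_geo_whole', 'L_hes_whole']
print(missing_from_metadata(grouped_table_ids, new_metadata))
# []
```

With the old file, every grouped ID is missing because the collapsed names only appear in the Base_Sample column, not in the ID column.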

Sara_Jeanne08 · March 10, 2018, 2:59am

Thank you! I did not realize I needed a new mapping file, as I had the IDs in the Base_Sample column. It is working now!

I'm curious: do I need to remake my tree to match these IDs when I run phylogenetic diversity tests?

Thanks,

Sara