Beta Diversity Trees

Hi @Sara_Jeanne08! You can use the feature-table group method to collapse samples based on sample metadata (you can also collapse along the other axis of the table, if that is of interest to you). This method lets you specify a metadata column to group by, so if you have multiple replicates for each sample, you could do something like this with your metadata:

sample-id   base-sample   replicate
s1-a        s1            a
s1-b        s1            b
s1-c        s1            c
s2-a        s2            a
s2-b        s2            b
s2-c        s2            c
...

then, when running the feature-table group method, you can specify base-sample as the column to group by.

One other thing to consider, and I don’t know what is best for you and your study, but you will need to specify a grouping mode (--p-mode [median-ceiling|sum|mean-ceiling]), which means you will choose to sum the counts, or compute the median or the mean of the counts (the ceiling business is for rounding up to the nearest whole number - you can’t have a fraction of an observation, right?). I suspect you will want to play with that option to get a sense of how it impacts downstream portions of your analysis.
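
To get a feel for how the three modes differ, here is a minimal pure-Python sketch of the idea. It is not how QIIME 2 implements the method; the counts are made up, and `group_counts` is a hypothetical helper name. It collapses the counts of a single feature across the replicates in the example metadata above.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical counts of one feature in each replicate sample.
counts = {"s1-a": 1, "s1-b": 2, "s1-c": 4, "s2-a": 0, "s2-b": 0, "s2-c": 3}

# The base-sample column from the example metadata.
base_sample = {"s1-a": "s1", "s1-b": "s1", "s1-c": "s1",
               "s2-a": "s2", "s2-b": "s2", "s2-c": "s2"}


def group_counts(counts, groups, mode):
    """Collapse per-sample counts into one count per group."""
    collapse = {
        "sum": sum,
        # The "ceiling" modes round up so the result stays a whole count.
        "median-ceiling": lambda values: math.ceil(statistics.median(values)),
        "mean-ceiling": lambda values: math.ceil(statistics.mean(values)),
    }[mode]
    by_group = defaultdict(list)
    for sample_id, count in counts.items():
        by_group[groups[sample_id]].append(count)
    return {group: collapse(values) for group, values in by_group.items()}


for mode in ("sum", "median-ceiling", "mean-ceiling"):
    print(mode, group_counts(counts, base_sample, mode))
```

Note how the mean of 1, 2 and 4 (about 2.33) becomes 3 under mean-ceiling, while the median stays at 2, so the choice of mode can change the collapsed counts noticeably.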

Keep us posted! :t_rex:

Hi @thermokarst,

Thank you for your time and help with this. I really appreciate it. I have been trying to collapse everything down by Base_Sample but I am getting an error saying one of the parameters (--m-metadata-column) does not exist:

(qiime2-2017.12) bash-3.2$ qiime feature-table group --i-table 20180306_Cricket_Venom_Analysis_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned.qza --p-axis Base_Sample --m-metadata-file 20180303_Spider_Microbiome_Cricket_Ptep_Venom_Analysis_Mapping_FIle.txt --m-metadata-column Base_Sample --p-mode median-ceiling --o-grouped-table 20180306_Cricket_Venom_Analysis_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned_GROUPED_MEDIAN --verbose
Error: no such option: --m-metadata-column

I am uncertain on how to proceed.

Thank you,
Sara

@thermokarst - I think I found the problem:

It looks like there is a typo in the doc page for the feature-table group command:

--m-metadata-column MetadataColumn[Categorical]
Column from metadata file or artifact
viewable as metadata. A column defining the
groups. Each unique value will become a new
ID for the table on the given axis.
[required]

It should be the following from the options when viewed in the terminal screen:

--m-metadata-category TEXT Category from metadata file or artifact
viewable as metadata. A category defining
the groups. Each unique value will become a
new ID for the table on the given axis.
[required]

Also, I am confused by the --p-axis option, I do not know what value to place here… I only want to collapse by the base sample (group replicates)

Thank you!

Sara

Hi @Sara_Jeanne08!

You are running QIIME 2 2017.12, but you are looking at the docs for QIIME 2 2018.2! A lot has changed in between those releases! You can change which version of a doc page you are looking at by using the dropdown on the left side of the page, although I would recommend upgrading to 2018.2, the latest and greatest.

Feature tables are two-dimensional: samples on one axis and features on the other. If you want to “collapse by the base sample,” that means you want to perform the collapsing procedure along the sample axis, preserving the features as-is. Make sense?
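
As a rough illustration of what "along the sample axis" means, here is a small pure-Python sketch. The table contents and the helper name `group_along_sample_axis` are made up for this example, and it only sums counts; it is not the plugin's code. Rows for replicate samples are merged, while the set of features is left untouched.

```python
# Hypothetical table: table[sample_id][feature_id] = count.
table = {
    "s1-a": {"f1": 1, "f2": 10},
    "s1-b": {"f1": 2, "f2": 10},
    "s2-a": {"f1": 0, "f2": 5},
    "s2-b": {"f1": 3, "f2": 6},
}

# Which group each sample belongs to (the metadata column to group by).
sample_groups = {"s1-a": "s1", "s1-b": "s1", "s2-a": "s2", "s2-b": "s2"}


def group_along_sample_axis(table, groups):
    """Sum the rows of samples that share a group; features are unchanged."""
    grouped = {}
    for sample_id, feature_counts in table.items():
        row = grouped.setdefault(groups[sample_id], {})
        for feature_id, count in feature_counts.items():
            row[feature_id] = row.get(feature_id, 0) + count
    return grouped


grouped = group_along_sample_axis(table, sample_groups)
print(grouped)
```

Four sample rows become two, but both features are still present in each row. As far as I know, this is what choosing sample for --p-axis selects; the value of --p-axis names an axis, not a metadata column.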

Hope that helps! :t_rex:

@thermokarst
Thank you so much! I didn’t realize I was using the wrong doc page. The issue is I have done more than half of all my analyses for my paper and dissertation with the previous version - so I am hesitant to upgrade.

I appreciate you explaining what to put in for the --p-axis option. I really appreciate it - I have it running now!

Sara

Personally (and of course, I am biased), I wouldn’t hesitate. QIIME 2 artifacts are backwards compatible, and we are constantly improving and fixing bugs. You shouldn’t need to start from scratch when upgrading between versions - plus, our install instructions are tailored to keep your old installations available to you. I think @gregcaporaso has just about every release of QIIME 2 on his work laptop right now! Also, the cool thing about automatic decentralized provenance is that it tracks what versions were used for each individual step, so you will still have all of that information (check out the provenance tab next time you load a viz into view.qiime2.org). :nerd_face:

Keep on QIIMEin’! :balloon:

@thermokarst - Thank you so much for the details on the versions and backwards compatibility. I will keep that in mind.

So - I tried running my newly grouped table in the beta diversity rarefaction script but I keep getting errors. I am using the coverage value from the sample with the lowest amount of summed sequences. Here is the latest:

(qiime2-2017.12) bash-3.2$ qiime diversity beta-rarefaction --i-table 20180309_WHOLE_SPIDER2_ONLY_PHII_All_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned_GROUPED_SUM.qza --p-metric weighted_unifrac --p-clustering-method upgma --m-metadata-file 20180304_Spider_Microbiome_ALLWHOLE2_Mapping_FIle_noSgM.txt --p-sampling-depth 43263 --i-phylogeny ../20180225_PHII_All_VSearch_OpenRef_99per_Clustered_Seqs_10OTUsmin_NoSgM9_NoUnAssigned_MidPoint_Rooted_TREE.qza --output-dir 20180309_Beta_Rarefaction_Weighted_UNIFRAC --verbose

Traceback (most recent call last):
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/commands.py", line 224, in __call__
    results = action(**arguments)
  File "", line 2, in beta_rarefaction
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 228, in bound_callable
    output_types, provenance)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 424, in callable_executor
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_diversity/_beta/_beta_rarefaction.py", line 61, in beta_rarefaction
    emperor = _jackknifed_emperor(primary, support, metadata)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_diversity/_beta/_beta_rarefaction.py", line 166, in _jackknifed_emperor
    return Emperor(primary_pcoa, df, jackknifed=jackknifed_pcoa, remote='.')
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/emperor/core.py", line 204, in __init__
    self.mf = self.mf.loc[ordination.samples.index]
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1373, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1616, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1115, in _getitem_iterable
    self._has_valid_type(key, axis)
  File "/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/pandas/core/indexing.py", line 1472, in _has_valid_type
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['L_hes_whole', 'L_geo_whole', 'L_mac_whole', 'P_tep_whole',\n 'S_grossa_whole'],\n dtype='object')] are in the [index]"

Plugin error from diversity:

"None of [Index(['L_hes_whole', 'L_geo_whole', 'L_mac_whole', 'P_tep_whole',\n 'S_grossa_whole'],\n dtype='object')] are in the [index]"

See above for debug info.

EDIT: I also tried lowering my sampling depth to the same value I used with my replicates before grouping … I also tried my collapsed sum table with the qiime diversity core-metrics-phylogenetic script and the same thing happened…

Thank you for your time and help with this,

Sara

Did you remember to create a new sample metadata file? Since you collapsed your samples, you have all new sample IDs - make sense? By collapsing, you have effectively changed what counts as a sample in your study, so you will need to create a new sample metadata file whose IDs match the collapsed table:

sample-id         foo     bar
L_hes_whole       a       1
L_geo_whole       a       2
L_mac_whole       a       4
P_tep_whole       b       16
S_grossa_whole    b       32
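
A quick way to see why the old mapping file fails is to compare ID sets. The sketch below is pure Python and only illustrative: `missing_from_metadata` is a hypothetical helper, and the replicate-level IDs are invented stand-ins for the original mapping file, whose real IDs are not shown in this thread.

```python
def missing_from_metadata(table_sample_ids, metadata_sample_ids):
    """Return the table sample IDs that have no row in the metadata."""
    known = set(metadata_sample_ids)
    return [sample_id for sample_id in table_sample_ids if sample_id not in known]


# Sample IDs in the grouped table (taken from the KeyError message).
grouped_ids = ["L_hes_whole", "L_geo_whole", "L_mac_whole",
               "P_tep_whole", "S_grossa_whole"]

# Hypothetical replicate-level IDs standing in for the original mapping file.
replicate_metadata_ids = ["L_hes_whole_1", "L_hes_whole_2", "L_geo_whole_1"]

# IDs from the new, collapsed metadata file.
collapsed_metadata_ids = ["L_hes_whole", "L_geo_whole", "L_mac_whole",
                          "P_tep_whole", "S_grossa_whole"]

print(missing_from_metadata(grouped_ids, replicate_metadata_ids))  # all five IDs
print(missing_from_metadata(grouped_ids, collapsed_metadata_ids))  # []
```

Having the new IDs in a Base_Sample column is not enough, because the lookup is done against the sample ID column (the index), not against other columns.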

Thank you! I did not realize I needed a new mapping file - as I had the IDs in the Base_Sample Column. It is working now!

I'm curious: do I need to remake my tree to match up with these IDs when I run phylogenetic-based diversity tests?

Thanks,

Sara

Hi @Sara_Jeanne08.

I would recommend reviewing the Metadata in QIIME 2 tutorial for more information on how metadata works in QIIME 2.

The tree tips are Feature IDs, not Sample IDs (going back to the idea of a feature table, this is the other axis). The phylogenetic metrics are designed to work with trees whose tips are a superset of the Feature IDs in your table - for example, you could use the full Greengenes tree with your feature table. You should be fine to just use your existing tree. Thanks! :t_rex:
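
The superset requirement can be sketched as a simple set comparison. This is a pure-Python illustration with made-up feature IDs and a hypothetical helper name, not a QIIME 2 call; in practice the tip names would come from the tree artifact and the feature IDs from the table.

```python
def features_missing_from_tree(feature_ids, tip_names):
    """Return feature IDs in the table that are not tips in the tree."""
    return sorted(set(feature_ids) - set(tip_names))


# Hypothetical IDs for illustration only.
tree_tips = ["otu1", "otu2", "otu3", "otu4"]
table_features = ["otu1", "otu3"]

# Extra tips in the tree are fine; only features absent from it are a problem.
print(features_missing_from_tree(table_features, tree_tips))            # []
print(features_missing_from_tree(table_features + ["otu9"], tree_tips))  # ['otu9']
```

Grouping samples changes only the sample IDs, so the feature IDs, and therefore this check, are unaffected.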

Hi @thermokarst,

Thank you again for all your help. My analyses are going well but I had a few warnings pop up that I wanted to ask you about:
Command:
(qiime2-2017.12) bash-3.2$ qiime diversity beta-rarefaction --i-table 20180403_PhII_SILK_ONLY_VSearch_OpenRef_99per_Clustered_Table_10OTUsmin_NoSgM9_NoUnAssigned_GROUPED_MEAN.qza --i-phylogeny ../20180225_PHII_All_VSearch_OpenRef_99per_Clustered_Seqs_10OTUsmin_NoSgM9_NoUnAssigned_MidPoint_Rooted_TREE.qza --p-metric jaccard --p-clustering-method upgma --m-metadata-file 20180304_Spider_Microbiome_SILK_ONLY_Mapping_FIle_COLLAPSED.txt --p-sampling-depth 28629 --output-dir 20180404_Beta_Div_Jaccard --verbose

 /Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/sklearn/utils/validation.py:475: DataConversionWarning: Data with input dtype int64 was converted to bool by check_pairwise_arrays.
  warnings.warn(msg, DataConversionWarning)
/Users/Sara_Jeanne/miniconda2/envs/qiime2-2017.12/lib/python3.5/site-packages/emperor/core.py:204: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

I am still getting the output files, but I am wondering what this warning actually means… I have run this test many times on different base sample types, but this is the first time I am seeing this warning.

Thank you,

Sara

Hi @Sara_Jeanne08, that is a deprecation warning, intended to warn users of pandas about a now-deprecated behavior of that tool. In this instance, the warning is coming from code in Emperor (which uses pandas internally). The TL;DR is that this is out of your hands and shouldn’t be a cause for any concern. I will cc @yoshiki, lead developer of Emperor, to see if he wants to add anything else to this. Thanks!
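
For the curious, the behavior the FutureWarning describes can be reproduced with a few lines of pandas. This is only a sketch with a made-up metadata frame, not Emperor's code, and what `.loc` does with a missing label depends on the installed pandas version, so the example records the outcome instead of assuming it.

```python
import pandas as pd

# Hypothetical metadata with three sample IDs as the index.
metadata = pd.DataFrame(
    {"foo": ["a", "a", "b"]},
    index=["L_hes_whole", "L_geo_whole", "P_tep_whole"],
)

# One requested ID is present, the other is not in the index.
wanted = ["L_hes_whole", "S_grossa_whole"]

# reindex always returns one row per requested label, filling absent
# labels with NaN. This is the alternative the warning suggests.
aligned = metadata.reindex(wanted)
print(aligned)

# .loc with a missing label: older pandas warned and returned NaN rows,
# newer pandas raises KeyError (as the warning said it would).
try:
    metadata.loc[wanted]
    loc_raised = False
except KeyError:
    loc_raised = True
print("loc raised KeyError:", loc_raised)
```

Either way, this is a detail of how the library looks up rows internally, not something you control from the command line.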