Procrustes Metadata Problem

Hello,

I am attempting to run a Procrustes analysis on a data set in which multiple samples were each processed with different DNA extraction methods, in order to compare how the extraction method affects the outcome of the profiling. I am running into some trouble whilst trying to generate a Procrustes Emperor plot. So far I have been able to run the standard qiime diversity procrustes-analysis and qiime diversity mantel on my data with no errors. I am now running the command below:

    qiime emperor procrustes-plot --i-reference-pcoa core-metrics-SB-1000/weighted_unifrac_pcoa_results.qza --i-other-pcoa core-metrics-PS-1000/weighted_unifrac_pcoa_results.qza --m-metadata-file 20190429_SBvsPS_xtracts_for_SBvsPS_relabelled_1000_min.txt --o-visualization procrustes-emperor.qzv

I am getting the following error:

There was an issue with loading the metadata file:

Metadata IDs must be unique.

It then lists every metadata ID present in my file. I am slightly confused by this, as the point of the Procrustes plot is to compare variants of the same sample (with the same sampleID), and having identical sampleIDs between the two was the only way that some of the original procrustes commands would work. Why am I getting an error for duplicate IDs when this seems to have been required for other commands? I also referenced another forum topic, “Metadata with Procrustes”. I’m using QIIME 2 2019.1. I’d be happy to share any files, but I’m hoping I’m missing something obvious…

Any help would be appreciated!

TNT

Hey there @Todd_Testerman!

In QIIME 2, every individual “sample” must have a unique ID value. It sounds like what you are describing is basically multiple replicates of the same biological sample. My suggestion is to update your metadata so that the replicate ID is the “primary” ID and the biological sample ID is a new column.

Example

Original MD

sample-id  extraction-method
sample-a   foo
sample-a   bar
sample-a   baz
sample-b   foo
sample-b   bar
sample-b   baz

Revised MD

sample-id  biological-sample  extraction-method
s1         sample-a           foo
s2         sample-a           bar
s3         sample-a           baz
s4         sample-b           foo
s5         sample-b           bar
s6         sample-b           baz
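
The relabeling in the example above can be sketched in a few lines of Python. This is only an illustration, assuming a tab-separated metadata file with a sample-id column. The function name make_ids_unique and the s1, s2, ... numbering scheme are made up for this example and are not part of QIIME 2.

```python
import csv
import io


def make_ids_unique(tsv_text, id_column="sample-id", new_column="biological-sample"):
    """Give every row a unique ID and keep the original ID in a new column.

    Hypothetical helper for illustration only; not part of QIIME 2.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    other_columns = [c for c in reader.fieldnames if c != id_column]

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([id_column, new_column] + other_columns)
    for n, row in enumerate(reader, start=1):
        # s1, s2, ... become the primary IDs; the old ID moves to new_column.
        writer.writerow([f"s{n}", row[id_column]] + [row[c] for c in other_columns])
    return out.getvalue()


# The "Original MD" table from the example, written as tab-separated text.
original = (
    "sample-id\textraction-method\n"
    "sample-a\tfoo\n"
    "sample-a\tbar\n"
    "sample-a\tbaz\n"
    "sample-b\tfoo\n"
    "sample-b\tbar\n"
    "sample-b\tbaz\n"
)
revised = make_ids_unique(original)
print(revised)
```

Any unique naming scheme would do here; sequential numbering is just the simplest choice for a sketch.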

BTW, please check out this tutorial if you haven’t already: https://docs.qiime2.org/2019.1/tutorials/metadata/

Hi Matthew,

Thanks for the quick reply. I sent a private message with some additional info BTW.

I am still running into some problems even with the adjustment you suggested. I now have all unique sampleIDs in my metadata file, and they are tied together by the second column, which denotes the biological sample. See the command and error below:

    qiime emperor procrustes-plot --i-reference-pcoa core-metrics-SB-1000/weighted_unifrac_pcoa_results.qza --i-other-pcoa core-metrics-PS-1000/weighted_unifrac_pcoa_results.qza --m-metadata-file 20190429_SBvsPS_xtracts_for_SBvsPS_relabelled_1000_min.txt --o-visualization procrustes-emperor.qzv

Plugin error from emperor:

  The ordination at index (0) does not represent the exact same samples.

This seems to be the opposite of the error I originally mentioned. (By the way, the core metrics were regenerated after I renamed my sampleIDs to be unique.) I am unsure whether sampleIDs should match exactly (as this error states) or all be unique (as the original error in my post noted).

Thanks for the support!

Todd

Okay, this is where it gets a little weird with the Procrustes plot: the sample IDs need to be the same in the two tables used to generate the PCoA results. The link you provided above has the most “up to date” protocol for relabeling the IDs. In particular, check the last two steps in that link (feature-table group).
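
To illustrate what relabeling to shared IDs means, here is a small Python sketch. It is not a replacement for feature-table group and does not touch any QIIME 2 artifacts. The column names follow the revised metadata example earlier in the thread, and shared_id_map is a hypothetical helper introduced only for this illustration.

```python
def shared_id_map(rows, method):
    """Map each unique replicate ID to its biological-sample ID for one method.

    Hypothetical helper for illustration only; not part of QIIME 2.
    """
    mapping = {
        row["sample-id"]: row["biological-sample"]
        for row in rows
        if row["extraction-method"] == method
    }
    shared = list(mapping.values())
    # Relabeling only works cleanly if each biological sample occurs once per method.
    if len(shared) != len(set(shared)):
        raise ValueError(f"{method}: a biological sample appears more than once")
    return mapping


rows = [
    {"sample-id": "s1", "biological-sample": "sample-a", "extraction-method": "foo"},
    {"sample-id": "s2", "biological-sample": "sample-a", "extraction-method": "bar"},
    {"sample-id": "s4", "biological-sample": "sample-b", "extraction-method": "foo"},
    {"sample-id": "s5", "biological-sample": "sample-b", "extraction-method": "bar"},
]

reference = shared_id_map(rows, "foo")  # relabeling for the reference table
other = shared_id_map(rows, "bar")      # relabeling for the other table

# After relabeling, both tables should carry exactly the same set of IDs.
same_samples = set(reference.values()) == set(other.values())
print(reference)
print(other)
print(same_samples)
```

If same_samples comes out False in a check like this, the two tables do not cover the same biological samples, which is the situation the Emperor error above complains about.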

Okay, so I generated two sets of core metrics from my two separate tables, with the sampleIDs in those tables matching between the two. Should the metadata file supplied to qiime emperor procrustes-plot have those same sampleIDs, or is this where they are switched back? As I understand it right now:

Step 1: Create two feature tables, each with the “biological sample” in the sampleID column.

Step 2: Generate core metrics from these tables, ensuring that the same sampling depth is used.

Step 3: Create a new metadata file with all samples from both tables included. In this file, sampleIDs should now be switched so that there are no duplicates in the sampleID column (or an error will occur during the next step).

Step 4: Run the procrustes-plot command supplying pcoa results from each table’s core metrics as well as the metadata file produced in step 3.

When I try this, the command reports that all the samples are missing from the metadata file (as the sampleIDs used to generate the core metrics are now in a new column). However, if I run the command with the sampleIDs kept consistent with those used to generate the core metrics results, it tells me there are duplicate sampleIDs (which is true). I’m sure there is something simple I am missing, so apologies for how long-winded this is, but I appreciate the assistance!
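
The two failures described above come from two separate requirements on the metadata IDs: they must be unique, and they must include the IDs used in the ordinations. A rough Python sketch of those two checks follows. It is not the code that QIIME 2 or Emperor actually runs; the function name check_metadata_ids and the example IDs are hypothetical.

```python
from collections import Counter


def check_metadata_ids(metadata_ids, ordination_ids):
    """Return (duplicated metadata IDs, ordination IDs absent from the metadata).

    Hypothetical helper for illustration only; not part of QIIME 2 or Emperor.
    """
    counts = Counter(metadata_ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    missing = sorted(set(ordination_ids) - set(counts))
    return duplicates, missing


ordination_ids = ["sample-a", "sample-b"]  # shared IDs used for both PCoA results

# Unique replicate IDs: nothing is duplicated, but the shared IDs are not found.
unique_replicates = check_metadata_ids(["s1", "s2", "s4", "s5"], ordination_ids)

# One row per replicate under the shared IDs: found, but duplicated.
repeated_shared = check_metadata_ids(
    ["sample-a", "sample-a", "sample-b", "sample-b"], ordination_ids
)

# One row per shared ID: passes both checks.
one_per_pair = check_metadata_ids(["sample-a", "sample-b"], ordination_ids)

print(unique_replicates)
print(repeated_shared)
print(one_per_pair)
```

In this sketch, only a metadata file with one row per shared ID passes both checks.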

Todd

Got it to work! I think my confusion was about the final metadata file. I figured every sample still needed to be represented, but I guess each pair should only be represented once? Again, sorry if that was intuitive, but I figured we still needed a metadata entry for each sample represented. Okay, so I just have three final general knowledge questions.

  1. With this metadata requirement of one entry per pair, does metadata labeling of the Emperor plot become impossible? As I now only have one entry per sample, I can only input metadata for one sample type.

  2. Regarding the output generated from qiime diversity procrustes-analysis: is the output from this step (two transformed matrices as .qza files) supposed to be used for other analyses in QIIME 2? Or is it meant to be exported and analyzed using a different package?

  3. Looking at the mantel test results, would someone be able to briefly say what the interpretation is? I figure that once I hear it, I should be good going forward.

Thanks again for the assistance!

Todd

Hey @Todd_Testerman, sorry for the slow turnaround here.

On question 1: not "impossible," but I guess it depends on how you are able to redefine your sample metadata. Anyway, I think the big thing here is being able to compare two similar, but different groups.

On question 2: I'm not sure - @yoshiki? @Nicholas_Bokulich? @jwdebelius?

On question 3: this might be a better question to ask in the "General Discussion" section of the forum.

These two matrices are currently only intended for the Procrustes plot functionality, though there's nothing that prevents you from using them elsewhere if you think that makes sense.

Thanks for the support @thermokarst and @yoshiki!

I’ll post the mantel results in general discussion for help on that!
