Diversity core metrics: metadata issue

Cybele_C · September 27, 2018, 1:12pm

Hello! I'm working on diversity analyses, and got the following error messages, the second after altering my metadata file (from 2 to 3, below)

Metadata2.txt (288 Bytes)
Metadata3.txt (312 Bytes)

Error messages:

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table3.qza \
  --p-sampling-depth 1200 \
  --m-metadata-file Metadata2.txt \
  --output-dir core-metrics-results

Plugin error from diversity:

None of the sample identifiers match between the metadata and the coordinates. Verify that you are using metadata and coordinates corresponding to the same dataset.

qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table3.qza \
  --p-sampling-depth 1200 \
  --m-metadata-file Metadata3.txt \
  --output-dir core-metrics-results

Plugin error from diversity:

All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=282):

(Long list of characters)

I was not able to run this (in 2018.8) without the metadata file, which might not be needed according to this thread: diversity analysis - #8 by kia2094. The files are attached.
However, without that line, I get:
Error: Missing option: --m-metadata-file

Thanks for your help!

Nicholas_Bokulich · September 27, 2018, 2:42pm

Hi @Cybele_C,
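Here is the key error message:

Cybele_C:

None of the sample identifiers match between the metadata and the coordinates. Verify that you are using metadata and coordinates corresponding to the same dataset.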

You changed the sample IDs in your mapping file — that means that they no longer correspond to any of the sample IDs used in any of your data!

Two solutions:

1. Just use the original metadata file with the original IDs.
2. Use qiime feature-table group to relabel your sample IDs in your feature table. See "option 3" in this post: Metadata Index Error - #6 by thermokarst

Good luck!

Cybele_C · October 8, 2018, 1:32am

Hello!

Well, my question is really about the necessity of the metadata file. The mapping file was NOT used before this in the table.qzv generation.

I didn't use it in early analyses because I wanted to look at the overall quality, and I couldn't get it to work unless I took out the file at this stage:
(qiime2-2018.6) Cybeles-MacBook-Pro:qiime2-skate3 cybelecollins$ qiime feature-table summarize \
> --i-table table.qza \
> --o-visualization table2.qzv \
> --m-sample-metadata-file Metadata S1 2.csv

Usage: qiime feature-table summarize [OPTIONS]

Error: Got unexpected extra arguments (S1 2.csv)

Without the --m-sample-metadata-file line, I got this to work and proceeded, until the stage above. I did not have BarcodeSequence and LinkerPrimer options.

The new form does correspond to names on my manifest file, while the previous did not.

Obviously shortcuts are very dangerous when you're a beginner, but in the situation I described above, I wasn't able to skip the metadata file, getting Error: Missing option: --m-metadata-file

So I guess I can go back to the table stage and try that again...but why it's needed is still not so clear to me right now, if I want overall diversity per sample and the metadata is all categorical

Nicholas_Bokulich · October 8, 2018, 5:28pm

Mapping files are essential to the core-metrics pipelines, because these pipelines actually perform a series of commands that do use metadata, e.g., for statistical comparisons.

If you only want measures of alpha diversity (e.g., richness) for each sample, and do not want to produce PCoA plots or any other information, then you can run those commands individually instead of using the core-metrics pipeline. You can use qiime diversity alpha to calculate alpha diversity (and you should really use qiime feature-table rarefy to rarefy samples at an even sequence depth prior to doing this — that's one of the steps performed in the core-metrics pipeline).
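As a rough sketch of running those two steps by hand: the output file names are placeholders, shannon is just one example metric, and the option names are as I recall them for releases around 2018.8, so please confirm them with --help. The guard only makes the snippet a no-op when QIIME 2 or the table is not available.

```shell
# Assumed names: table3.qza is the table from this thread; the output
# file names below are placeholders.
table=table3.qza
depth=1200

# Guarded so the sketch does nothing where qiime or the table is missing.
if command -v qiime >/dev/null 2>&1 && [ -f "$table" ]; then
  # Rarefy every sample to the same sequencing depth first.
  qiime feature-table rarefy \
    --i-table "$table" \
    --p-sampling-depth "$depth" \
    --o-rarefied-table rarefied-table.qza

  # Then compute a single alpha diversity metric per sample.
  # No metadata file is involved in either command.
  qiime diversity alpha \
    --i-table rarefied-table.qza \
    --p-metric shannon \
    --o-alpha-diversity shannon-vector.qza
else
  echo "skipping: qiime or $table not found"
fi
```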

That error appears because you have spaces in your metadata file name. The shell splits a command into arguments at spaces, so it reads the pieces of the file name as separate arguments. You need to either (a) avoid spaces in file names altogether, or (b) put a backslash before each space to "escape" it (or wrap the whole name in quotes), so that the shell passes the name as a single argument.
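To see the difference without involving QIIME 2 at all, here is a small sketch. count_args is a made-up helper that just reports how many arguments the shell handed to it.

```shell
# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits the file name at each space.
unquoted=$(count_args Metadata S1 2.csv)

# Backslash-escaped or quoted: the name arrives as one argument.
escaped=$(count_args Metadata\ S1\ 2.csv)
quoted=$(count_args "Metadata S1 2.csv")

echo "unquoted=$unquoted escaped=$escaped quoted=$quoted"
# prints: unquoted=3 escaped=1 quoted=1
```

The same applies to any option value, so "Metadata S1 2.csv" in quotes would reach the command as a single file name.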

I hope that helps!

Cybele_C · October 13, 2018, 9:07pm

Metadata3b.txt (275 Bytes)

Plugin error from diversity:

All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=277):

Hello - I'm getting this error message, as above and in my original post.

The response above was that the name of my metadata file had spaces - I didn't see that, but some of the text itself did.

Used this code:
qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table table3.qza \
  --p-sampling-depth 1200 \
  --m-metadata-file Metadata3b.txt \
  --output-dir core-metrics-results

Thank you!

Nicholas_Bokulich · October 15, 2018, 1:52pm

The issue is as I reported above.

You used Metadata2.txt in all previous steps, including core-metrics-phylogenetic, but then you switched to Metadata3.txt or Metadata3b.txt, and you get this (and similar) errors because those metadata files use different sample IDs. So all of your feature tables, sequence data, etc., are labeled with the sample IDs in Metadata2.txt, and they do not match the new metadata files.

I also described the solution above:

Nicholas_Bokulich:

1. Just use the original metadata file with the original IDs.
2. Use qiime feature-table group to relabel your sample IDs in your feature table.

Follow those solutions to fix this problem. Note that solution #2 will need to be applied to the feature table, so core-metrics-phylogenetic (and any other commands using that feature table) will need to be re-run.

The response about spaces in the metadata file name was to this error that you reported:

Cybele_C:

Error: Got unexpected extra arguments (S1 2.csv)

Which was due to spaces in the metadata file name here:

Cybele_C:

--m-sample-metadata-file Metadata S1 2.csv

This is distinct from the original issue that you were reporting.

I hope that clarifies! Let us know if that fixes this issue!

Cybele_C · October 15, 2018, 2:41pm

Hi - I'm sorry, I think that I wasn't very clear before. I never used Metadata2.txt. NO metadata file was used. I'm repeating the SAME step with 3b, etc.

I did not use a metadata file at all until I tried this. I have no core-metrics-phylogenetic yet. I'm still on that step.
Metadata2 was renamed to try this step again.

I did show the text where I TRIED the metadata file in an earlier step - which didn't work because there were spaces - but I moved on without the file. I took that line out because it did not seem necessary and the rest worked without it. There was no argument at the time. This is why I asked why the metadata was required.

Again, the only place it was suggested and attempted before was the place I DIDN'T end up using it, without objection from the program. This time, it did not work, for reasons that you explained. Thanks. I also understand now why it didn't work the first time, but again, I took it out. This line of code, now using 3b, is the first metadata file to be both attempted and apparently crucial to the program.

That might be the actual issue. I can go back and try that again.
There is no previous mapping file. So that might be the problem.

The step where I did NOT use the mapping file, after eliminating the --m-sample-metadata-file line:

(qiime2-2018.6) Cybeles-MacBook-Pro:qiime2-skate3 cybelecollins$ qiime feature-table summarize \
> --i-table table.qza \
> --o-visualization table2.qzv \
> --m-sample-metadata-file Metadata S1.csv

Nicholas_Bokulich · October 15, 2018, 3:14pm

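Cybele_C:

There is no previous mapping file. So that might be the problem.

Aha, so I think that explains all of this and also my confusion. I missed that part and assumed that Metadata2.txt was the "original" metadata file based on this line:

Cybele_C:

got the following error messages, the second after altering my metadata file (from 2 to 3, below)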

So I misinterpreted your meaning.

Since you say no metadata file was ever used, I am assuming that you mean that your raw fastq sequences were already demultiplexed — so you imported as manifest format or as casava1.8 format, and the sample IDs will be determined by the file names on the import.

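So you are correct to use qiime feature-table summarize to determine what the actual sample IDs are in your feature table, and it sounds like you were able to get that to run without a metadata file:

Cybele_C:

I moved on without the file. I took that line out because it did not seem necessary and the rest worked without it. There was no argument at the time.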

So use the sample IDs from that summary as sample IDs in your metadata file. Use that metadata file for qiime diversity core-metrics-phylogenetic and everything should work now that you have extracted the sample IDs from your feature table.
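One rough way to double-check that the two sets of sample IDs agree, assuming you have copied the sample IDs from the table summary into a plain text file (one per line). The file names, the #SampleID header, and the IDs below are made up for illustration.

```shell
LC_ALL=C; export LC_ALL   # consistent sort order for sort and comm
workdir=$(mktemp -d)

# Hypothetical stand-ins for the IDs in the table and for the metadata file.
printf 'S1\nS2\nS3\n' > "$workdir/table-ids.txt"
printf '#SampleID\tSite\nS1\tA\nS2\tB\nsample3\tC\n' > "$workdir/metadata.tsv"

# First column of the metadata, skipping the header line.
tail -n +2 "$workdir/metadata.tsv" | cut -f1 | sort > "$workdir/meta-ids.sorted"
sort "$workdir/table-ids.txt" > "$workdir/table-ids.sorted"

# IDs that are in the table but missing from the metadata.
missing_from_metadata=$(comm -23 "$workdir/table-ids.sorted" "$workdir/meta-ids.sorted")
# IDs that are in the metadata but not in the table.
extra_in_metadata=$(comm -13 "$workdir/table-ids.sorted" "$workdir/meta-ids.sorted")

echo "in table only: $missing_from_metadata"
echo "in metadata only: $extra_in_metadata"
```

If both lists come back empty, the sample IDs match exactly, including capitalization.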

Let us know if that works or if you are still receiving an error!

Cybele_C · October 15, 2018, 5:40pm

Yes, exactly! The samples were demultiplexed, and imported with a manifest file. I'm sorry for the confusion.

However, I have the same problem. The samples do have the same names as in the manifest file:

Metadata3b.txt (275 Bytes)
ManifestRun2.csv (1.2 KB)

I'm still getting this error.

Plugin error from diversity:

All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=274):

The manifest file and metadata files are attached. The file address should be accurate. If there are any other errors, please let me know. Thank you very much!

Nicholas_Bokulich · October 15, 2018, 6:38pm

Could you please share the output of qiime feature-table summarize? Thanks!

Cybele_C · October 15, 2018, 7:49pm

Yes - the (most recent) output from the feature-table summarize is attached:
table3.qza (54.5 KB)
table3.qzv (372.3 KB)

Nicholas_Bokulich · October 15, 2018, 8:32pm

Can you pls share the phylogeny also?

(scratch my last few comments about sample IDs... feature IDs are the problem, obviously... I have been reading too fast)

Cybele_C · October 15, 2018, 8:54pm

Sure! Here's the rooted-tree file and others. I just might have noticed the problem, though. To generate the rooted-tree.qza I was trying to use, I used an earlier denoised set, rep-seqs2. So I'll work with this function again using rep-seqs3 and get back to you. Thanks so much for the good question!

Attached:
denoising-stats3.qza (7.5 KB)
masked-aligned-rep-seqs.qza (86.7 KB)
rep-seqs3.qza (106.5 KB)
aligned-rep-seqs.qza (88.6 KB)
rooted-tree.qza (56.5 KB)
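A mismatch like this (a tree built from one set of representative sequences and a table built from another) can be checked outside QIIME 2 once you have the tree as a Newick text file and the table's feature IDs as a plain list. This is only a sketch: the tiny tree and the IDs are made up, and the tip extraction assumes simple, unquoted labels.

```shell
LC_ALL=C; export LC_ALL   # consistent sort order for sort and comm
workdir=$(mktemp -d)

# Hypothetical stand-ins: a tiny Newick tree and the table's feature IDs.
printf '((f1:0.1,f2:0.2)0.9:0.3,f3:0.4);\n' > "$workdir/tree.nwk"
printf 'f1\nf2\nf9\n' > "$workdir/feature-ids.txt"

# Tip names: drop each ")" together with any internal label or branch
# length that follows it, split on "(", "," and ";", then strip the
# ":length" suffix and discard empty lines.
sed 's/)[^,();]*//g' "$workdir/tree.nwk" \
  | tr '(,;' '\n\n\n' \
  | sed 's/:.*//' \
  | grep -v '^$' \
  | sort > "$workdir/tips.sorted"

sort "$workdir/feature-ids.txt" > "$workdir/features.sorted"

# Feature IDs that have no matching tip name in the tree.
not_in_tree=$(comm -23 "$workdir/features.sorted" "$workdir/tips.sorted")
echo "feature IDs missing from the tree: $not_in_tree"
```

A non-empty result means the table and the tree came from different feature sets, which is the situation the "All feature_ids must be present as tip names in phylogeny" error describes.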

Cybele_C · October 15, 2018, 9:11pm

So this worked! It was another small but major error with the file names. In a way, it was similar to the problem that you were suggesting. I found it when you asked about the feature IDs.

Thank you so much!! I really appreciate the help.