Hello again! I’m back with yet another roadblock that I don’t seem to understand. I have imported my dereplicated Sanger sequences from a .fasta file:
>HCl_49
TGGTTNT...
>HCl_50
TGGTTCT...
to a .qza, and I have a metadata file:
#SampleID    Series       Salt         Concentration
#q2:types    categorical  categorical  categorical
HCl_49       HCl          NaCl         High
HCl_50       HCl          NaCl         High
I have also used q2-vsearch, q2-ghost-tree, etc. But the sequence data and the metadata do not seem to jibe when I run something like chimera filtering, diversity analyses, or anything else involving a metadata file. For example, when I try to run:
qiime diversity alpha-group-significance \
  --i-alpha-diversity Hines_observed_otus_vector.qza \
  --m-metadata-file Hines_metadata.txt \
  --o-visualization Hines_observed_otus_vector.qzv \
  --verbose
I get the error:
Traceback (most recent call last):
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-389>", line 2, in alpha_group_significance
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 427, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_diversity/_alpha/_visualizer.py", line 38, in alpha_group_significance
    metadata = metadata.filter_ids(alpha_diversity.index)
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/metadata/metadata.py", line 727, in filter_ids
    ids_to_keep)
  File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/metadata/metadata.py", line 203, in _filter_ids_helper
    % (', '.join(repr(e) for e in sorted(missing_ids))))
ValueError: The following IDs are not present in the metadata: 'HCO3', 'HCl', 'LCO3', 'LCl', 'SW'

Plugin error from diversity:

  The following IDs are not present in the metadata: 'HCO3', 'HCl', 'LCO3', 'LCl', 'SW'
I have tried renaming my sample IDs in both the original .fasta file and the metadata from ‘HCl_49’ to ‘HCl49’ or ‘HCl.49’, but when I do, I get this import error:
There was a problem importing Hines_SeqData_Final.fasta: Hines_SeqData_Final.fasta is not a(n) QIIME1DemuxFormat file
It seems to accept the file as a demux file only when I use the “_” in the headers.
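The IDs in the error above ('HCl', 'SW', and so on) look like the header text before the underscore, while the metadata lists the full 'HCl_49' form. Here is a minimal sketch of how that mismatch could arise, assuming the QIIME1DemuxFormat importer treats everything before the last underscore as the sample ID. That splitting rule is an assumption, and `sample_id_from_header` is a hypothetical helper, not a QIIME 2 function:

```python
# Sketch only: the "split on the last underscore" rule is an assumption
# about how the headers are parsed, not confirmed QIIME 2 behaviour.

def sample_id_from_header(header):
    """Return the part of a FASTA header before its last underscore."""
    name = header.lstrip(">").split()[0]
    sample_id, _sep, _seq_id = name.rpartition("_")
    return sample_id  # empty string when the header has no underscore


# Headers and metadata IDs taken from the examples in this post.
fasta_headers = [">HCl_49", ">HCl_50"]
metadata_ids = {"HCl_49", "HCl_50"}

# IDs the importer would derive under the assumption above.
derived_ids = {sample_id_from_header(h) for h in fasta_headers}

# IDs derived from the sequences that have no matching metadata row.
missing = sorted(derived_ids - metadata_ids)
print(missing)
```

Under that assumption, a header such as ‘HCl49’ yields no sample ID at all, which would be consistent with the import error when the “_” is removed.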
So, I’m honestly not sure how to proceed. I feel like I’m either missing something really simple/obvious, or my data need to be completely reformatted to fit into the q2 workflow.
Any help/insight is GREATLY appreciated!