Hello again! I’m back with yet another roadblock that I don’t seem to understand. I have imported my dereplicated Sanger sequences from a .fasta file:
>HCl_49
TGGTTNT...
>HCl_50
TGGTTCT...
to a .qza, and I have a metadata file:
#SampleID Series Salt Concentration
#q2:types categorical categorical categorical
HCl_49 HCl NaCl High
HCl_50 HCl NaCl High
and have used q2-vsearch, q2-ghost-tree, etc. But the sequences and the metadata do not seem to jibe when I run something like chimera filtering, diversity analyses, or anything else involving a metadata file. For example, when I try to run:
qiime diversity alpha-group-significance \
--i-alpha-diversity Hines_observed_otus_vector.qza \
--m-metadata-file Hines_metadata.txt \
--o-visualization Hines_observed_otus_vector.qzv \
--verbose
I get the error:
Traceback (most recent call last):
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
results = action(**arguments)
File "</Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-389>", line 2, in alpha_group_significance
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
output_types, provenance)
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 427, in _callable_executor_
ret_val = self._callable(output_dir=temp_dir, **view_args)
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_diversity/_alpha/_visualizer.py", line 38, in alpha_group_significance
metadata = metadata.filter_ids(alpha_diversity.index)
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/metadata/metadata.py", line 727, in filter_ids
ids_to_keep)
File "/Users/haselkornlab/anaconda3/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/metadata/metadata.py", line 203, in _filter_ids_helper
% (', '.join(repr(e) for e in sorted(missing_ids))))
ValueError: The following IDs are not present in the metadata: 'HCO3', 'HCl', 'LCO3', 'LCl', 'SW'
Plugin error from diversity:
The following IDs are not present in the metadata: 'HCO3', 'HCl', 'LCO3', 'LCl', 'SW'
I have tried renaming my sample IDs in both the original .fasta file and the metadata from ‘HCl_49’ to ‘HCl49’ or ‘HCl.49’, but when I do, I get this import error:
There was a problem importing Hines_SeqData_Final.fasta:
Hines_SeqData_Final.fasta is not a(n) QIIME1DemuxFormat file
It seems to accept the file as a demux file only when I use the “_” in the headers.
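To compare the IDs on both sides, a small check like the sketch below might help. It rests on an assumption I have not confirmed: that the importer treats everything before the last underscore in a header as the sample ID and the rest as a sequence number. That would be consistent with the error above listing 'HCl' rather than 'HCl_49'. The function names are hypothetical helpers, not part of QIIME 2.

```python
def sample_id_from_header(header):
    """Return the sample ID implied by a FASTA header such as '>HCl_49'.

    Assumption (not confirmed against QIIME 2): the text before the LAST
    underscore is the sample ID; the text after it is a sequence number.
    A header without an underscore is returned unchanged.
    """
    label = header.lstrip(">").split()[0]
    sample_id, _, _seq_number = label.rpartition("_")
    return sample_id or label


def missing_from_metadata(fasta_headers, metadata_ids):
    """List implied sample IDs that do not appear in the metadata IDs."""
    implied = {sample_id_from_header(h) for h in fasta_headers}
    return sorted(implied - set(metadata_ids))


# Headers and metadata IDs copied from the examples earlier in this post.
headers = [">HCl_49", ">HCl_50"]
metadata_ids = ["HCl_49", "HCl_50"]

print(missing_from_metadata(headers, metadata_ids))  # prints ['HCl']
```

If that assumption holds, every sequence in a series would collapse into a single sample named after the series, and the metadata would need one row per series ID (or the headers would need a different layout) for the two files to match.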
So, I’m honestly not sure how to proceed. I feel like I’m either missing something really simple/obvious, or my data need to be completely reformatted to fit into the q2 workflow.
Any help/insight is GREATLY appreciated!