Help Fatal error: Invalid (zero) abundance annotation in FASTA file header

mbnalbright · July 3, 2018, 12:14am

I am trying to re-annotate an OTU table and its representative sequences with Greengenes annotations for use with PICRUSt, using qiime vsearch cluster-features-closed-reference. I keep getting the error:

Fatal error: Invalid (zero) abundance annotation in FASTA file header
What might cause this error?

qiime vsearch cluster-features-closed-reference \
  --i-table feature-table-1.qza \
  --i-sequences fasta_file_new2_NEW.qza \
  --i-reference-sequences 97_otus.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table tableTEST_3to8_BAC.qza \
  --o-clustered-sequences rep-seqs_TEST_3to8_BAC.qza \
  --o-unmatched-sequences unmatched_TEST.qza

Nicholas_Bokulich · July 3, 2018, 1:59pm

Hi @mbnalbright,
Could you tell us a little more about how you got to this stage?

1. Did you use qiime vsearch dereplicate-sequences before OTU picking? (If not, run that command first to dereplicate, then do OTU picking.)
2. Could you share your input files?
3. Could you share the full error traceback? (Either rerun your command with --verbose added, or open the log file path given along with the error you reported.)

With a little more info, we can help troubleshoot!

mbnalbright · July 3, 2018, 2:12pm

fasta_file_new2.qza (133.8 KB)
feature-table_bact_OTUtable_runs3to8_NEW.qza (503.3 KB)

2. I took an existing OTU table and a FASTA file of representative sequences and imported them into QIIME 2 format. I have attached the files.

Below is the full error:
Command: vsearch --usearch_global /tmp/tmpdlo96307 --id 0.3 --db /tmp/qiime2-archive-4rbmg1by/8f0d60d6-00c8-4a2c-ab1a-788746d54125/data/dna-sequences.fasta --uc /tmp/tmp6ld5fb6x --strand plus --qmask none --notmatched /tmp/tmpltx9moo9 --threads 0

vsearch v2.7.0_linux_x86_64, 125.9GB RAM, 32 cores

Reading file /tmp/qiime2-archive-4rbmg1by/8f0d60d6-00c8-4a2c-ab1a-788746d54125/data/dna-sequences.fasta 100%
142290491 nt in 99322 seqs, min 1254, max 2353, avg 1433
Masking 100%
Counting k-mers 100%
Creating k-mer index 100%
Searching

Fatal error: Invalid (zero) abundance annotation in FASTA file header
Traceback (most recent call last):
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/site-packages/q2cli/commands.py", line 246, in __call__
    results = action(**arguments)
  File "", line 2, in cluster_features_closed_reference
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/site-packages/qiime2/sdk/action.py", line 228, in bound_callable
    output_types, provenance)
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/site-packages/qiime2/sdk/action.py", line 363, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/site-packages/q2_vsearch/_cluster_features.py", line 256, in cluster_features_closed_reference
    run_command(cmd)
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/site-packages/q2_vsearch/_cluster_features.py", line 33, in run_command
    subprocess.run(cmd, check=True)
  File "/home/malbright/anaconda3/envs/qiime2-2018.2/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['vsearch', '--usearch_global', '/tmp/tmpdlo96307', '--id', '0.3', '--db', '/tmp/qiime2-archive-4rbmg1by/8f0d60d6-00c8-4a2c-ab1a-788746d54125/data/dna-sequences.fasta', '--uc', '/tmp/tmp6ld5fb6x', '--strand', 'plus', '--qmask', 'none', '--notmatched', '/tmp/tmpltx9moo9', '--threads', '0']' returned non-zero exit status 1

Thanks!

Nicholas_Bokulich · July 3, 2018, 2:29pm

Thanks @mbnalbright!

Could you try using qiime vsearch dereplicate-sequences before OTU picking?

It looks like this error is coming from vsearch, not QIIME 2 — my understanding is that these abundance annotations are a part of the expected format for OTU picking in vsearch. From the manual:

input fasta file should present abundance annotations (i.e. a pattern [;]size=integer[;] in the fasta header).
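
To make the quoted pattern concrete, here is a minimal sketch for spotting headers whose annotation is zero, which is the condition the error message names. It assumes you have a plain FASTA file that already carries size annotations and that each annotation is preceded by a semicolon; zero_size_headers and seqs.fasta are hypothetical names used only for illustration.

```shell
# Sketch: list FASTA headers whose abundance annotation is zero.
# zero_size_headers is a hypothetical helper, not part of vsearch or QIIME 2.
zero_size_headers() {
    # Match header lines (">...") carrying ";size=" followed only by zeros,
    # ending at a ";", whitespace, or the end of the line.
    grep -E '^>.*;size=0+(;|[[:space:]]|$)' "$1"
}

# Example usage (seqs.fasta is a placeholder path):
# zero_size_headers seqs.fasta
```

If this prints any headers, those records are the likely candidates for the fatal error; no output means no zero annotation matched this particular pattern.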

This abundance annotation will be written in during dereplication — so it looks like that will still need to be done here, even though you are re-clustering a pre-clustered table/sequences.
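
A rough sketch of that order of operations, in case it helps. It assumes your sequences are available as a demultiplexed SampleData artifact, which is what dereplicate-sequences takes as input; derep_then_cluster and all file names except 97_otus.qza are hypothetical placeholders, so adjust them to your own data.

```shell
# Sketch only: dereplicate first, then run closed-reference clustering.
# derep_then_cluster is a hypothetical wrapper; output names are placeholders.
derep_then_cluster() {
    seqs="$1"       # demultiplexed sequences artifact (assumed SampleData type)
    reference="$2"  # reference sequences, e.g. 97_otus.qza

    # Step 1: dereplicate, producing a feature table and representative sequences.
    qiime vsearch dereplicate-sequences \
        --i-sequences "$seqs" \
        --o-dereplicated-table table-derep.qza \
        --o-dereplicated-sequences rep-seqs-derep.qza &&
    # Step 2: cluster the dereplicated features against the reference.
    qiime vsearch cluster-features-closed-reference \
        --i-table table-derep.qza \
        --i-sequences rep-seqs-derep.qza \
        --i-reference-sequences "$reference" \
        --p-perc-identity 0.97 \
        --o-clustered-table table-cr-97.qza \
        --o-clustered-sequences rep-seqs-cr-97.qza \
        --o-unmatched-sequences unmatched-cr-97.qza
}

# Example usage (demux-seqs.qza is a placeholder):
# derep_then_cluster demux-seqs.qza 97_otus.qza
```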

Let me know if that fixes it!

Nicholas_Bokulich · July 11, 2018, 8:00pm

An off-topic reply has been split into a new topic: Found blank or whitespace-only line before ‘+’ in FASTQ file

Please keep replies on-topic in the future.

system · August 12, 2018, 2:00am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.