Plugin error from feature-classifier: Found header without sequence data

Hello everyone,

I am working with the PR2 database for 18S sequences. I am trying to extract the reads from the database that include the primers I used for sequencing.
My command is:
qiime feature-classifier extract-reads \
  --i-sequences rdp_seq.qza \
  --p-f-primer ACTCCTACGGGAGGCAGCAG \
  --p-r-primer GGACTACHVGGGTWTCTAAT \
  --o-reads rdp_seq.qza

It runs for quite a while but I only get an error output:

Plugin error from feature-classifier:
Found header without sequence data.
Debug info has been saved to /var/folders/w6/kymftfkd7yv4l6bxnz_105y40000gn/T/qiime2-q2cli-err-r_lxne62.log

I would love to give you the debug log file, but I can't find it. I tried searching for the whole path and for just the file name. I do find the /var folder, but it doesn't lead to the indicated path.
Sorry for providing so little information.
Does this error message mean that there is an Accession number in the database which doesn’t have an assigned sequence?

Thanks a lot! Barbara

Hi @Barbs,

Could you re-run the command with --verbose added to the end of the command? Then the error traceback will be printed to your terminal.

That's what it sounds like. If there are empty lines in the sequence file, you should be able to find them and their accession #s with this command (untested):

grep -B 1 '^$' seqs.fna
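To illustrate what that command does (on a made-up mini FASTA with hypothetical accession names), `grep -B 1 '^$'` prints each blank line plus one line of leading context, so an empty record shows up as its header:

```shell
# Tiny example FASTA (hypothetical accessions) with one empty record.
printf '>ACC1\nACGT\n>EMPTY_ACC\n\n>ACC2\nTTGG\n' > demo.fna

# -B 1: also print the line before each match, i.e. the header above the blank line.
grep -B 1 '^$' demo.fna
# prints: >EMPTY_ACC followed by the blank line
```

Note this only finds records whose "sequence" is a literal empty line; a header immediately followed by another header has no blank line and will not match.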

Let us know if that is in fact what's going on — if not, please share your sequences file and the full error traceback and we'll sort it out.

I hope that helps!

Hi @Nicholas_Bokulich,

thanks for the super quick response. Here is the full error traceback which resulted from the verbose command:
plugin error feature classifier.txt (3.8 KB)

The grep command didn't give me any output, but maybe I am doing something wrong. I just pointed the command at my .fasta file:
grep -B 1 '^$' PR2/pr2_version_4.10.0_mothur.fasta

I am using the original .tax and .fasta file from the PR2 database, in this case the mothur version:

So I am pretty sure there shouldn't be any missing sequences.
Sorry again, I am sure it's some beginner's mistake I am making here...

Thanks!

Just an amendment:
Your command:
grep -B 1 '^$' PR2/pr2_version_4.10.0_mothurleer.fasta
works perfectly fine when I insert an empty row; in that case it gives me the corresponding accession number. There just seems to be no empty row in the fasta file…

Thanks for testing and confirming! I had assumed that this would be FASTA format without line breaks within the sequences, so that command is not adequate for this more complicated case.

That would be my assumption as well, but alas it is all too true:

$ grep -A 2 'CP000499.0.0_U' pr2_version_4.10.0_mothur.fasta
>CP000499.0.0_U
>JQ008851.1.1532_U
TCATTAAATCAGTTATCGTTTATTTGATCGTACCTTTACTACTTGGATAACCGTGGTAATTCTAGAGCTAATACATGCTT

The good news is that it looks like >CP000499.0.0_U is the only empty accession in that file, so you can just remove that line and all should be okay.
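One way to drop that line safely (a sketch; the output file name is a placeholder, and `-x`/`-F` make grep match the exact header as a fixed string rather than a regex, so the dots are not treated as wildcards):

```shell
# Keep every line except the exact offending header, writing to a new file.
# -v: invert match; -x: whole-line match; -F: fixed string, not a regex.
grep -vxF '>CP000499.0.0_U' pr2_version_4.10.0_mothur.fasta > pr2_fixed.fasta
```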

For reference, here's an awk command to find consecutive lines beginning with >:

awk '/^\>/{if(!p)x=$0;p++}!/^\>/{p=0}p==2{print x}p>1' pr2_version_4.10.0_mothur.fasta
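A quick demonstration of the same logic on a made-up mini FASTA (written with an unescaped `>` in the regex, since GNU awk treats `\>` as a word-boundary operator): when a header has no sequence under it, two `>` lines appear back to back, and the script prints both:

```shell
# Hypothetical mini FASTA: >B has no sequence, so >B and >C are consecutive headers.
printf '>A\nACGT\n>B\n>C\nTTGG\n' > demo.fna

# p counts consecutive header lines; x remembers the first header of a run.
# On the second header in a row, print the remembered header, then each further one.
awk '/^>/{if(!p)x=$0;p++} !/^>/{p=0} p==2{print x} p>1' demo.fna
# prints: >B and >C
```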

You may want to raise an issue on the PR2 GitHub site so they can get that fixed in a future release... let me know if you want me to notify them.

I hope that helps!

Hi @Nicholas_Bokulich,

so I deleted >CP000499.0.0_U from the fasta file, and the awk command now tells me that there are no consecutive ">" headers in my dataset... and still I get the same error message. I am wondering if it has something to do with the line break format?!

Thanks anyway, I learned quite a few helpful commands today!!!

No, that's not the problem; I just tested on a subset of these sequences and they could import/extract-reads fine.

There must be another issue with the file. Did you delete the entire line containing >CP000499.0.0_U? Did you re-import to QIIME 2?
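For reference, re-importing the edited FASTA into a QIIME 2 artifact looks roughly like this (a sketch; the file names are placeholders, and `FeatureData[Sequence]` is the semantic type used for reference sequences):

```shell
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path pr2_fixed.fasta \
  --output-path pr2_seqs.qza
```

The extract-reads command then needs to be pointed at the freshly created .qza, not the one built from the old FASTA.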


Indeed, I forgot to create a new .qza after editing the fasta file. Sorry for that, I got too excited! Everything is working now!
Thanks a lot for your great work!


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.