A space in a header line of the UNITE database

Dear QIIME 2 users,
When I used the UNITE database (QIIME release for Fungi, version 18.11.2018) for my analysis, after modifying it with the code below (*), I found a space in the header on line 67868. Although there is only one such space in the header lines of this UNITE release and it can be fixed by hand, manual editing is an imperfect process.
Could someone suggest a way to solve this problem? Thank you very much!

*Code used:

awk '/^>/ {print($0)}; /^[^>]/ {print(toupper($0))}' developer/sh_refs_qiime_ver8_99_s_02.02.2019_dev.fasta | sed -e '/^>/!s/\(.*\)/\U\1/;s/[[:blank:]]*$//' > sh_refs_qiime_ver8_99_s_02.02.2019_dev.fasta

Hi @ZhengQu,
I do not see any spaces in the headers of that file, but is it causing a problem with QIIME 2? Generally, spaces in the header lines of FASTA files are okay; the description after the space is just ignored. I and others have been using this UNITE release for fungal analysis for a year now without issue, so if you are receiving an error, you may have unintentionally altered the source data, e.g., by introducing a space.
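
One way to check whether any header really contains a space is to list the offending lines with their line numbers. The sketch below is only an illustration: `sample_refs.fasta` and `headers_with_blanks.txt` are hypothetical file names, and the two records are made-up stand-ins, not real UNITE entries.

```shell
# Hypothetical two-record sample standing in for the UNITE FASTA (not real data).
printf '>SH001 extra\nacgt\n>SH002\nACGT\n' > sample_refs.fasta

# List every header line (starting with ">") that contains a space or tab,
# prefixed with its line number in the file.
grep -n '^>.*[[:blank:]]' sample_refs.fasta > headers_with_blanks.txt
cat headers_with_blanks.txt
```

Running the same `grep` against the unmodified download and against the processed file would show whether the space was already in the source or was introduced by a later step.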

Sure, anything done by hand is imperfect. You can use sed, tr, or awk to make simple programmatic edits to a text file like this, e.g., as done in the code you posted.
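
As a rough sketch of such a programmatic edit, a single awk pass can clean the headers and uppercase the sequences at the same time. It assumes that anything after the first blank in a header is a description that can safely be dropped; if the space sits inside the identifier itself, a different replacement would be needed. The file names `sample_in.fasta` and `sample_out.fasta` and the records in them are hypothetical.

```shell
# Hypothetical sample input (not real UNITE data): one header with a space,
# one lowercase sequence line with a trailing space.
printf '>SH001 k__Fungi\nacgt \n>SH002\nACGT\n' > sample_in.fasta

awk '
  # Header lines: keep only the first whitespace-delimited field (the ID).
  /^>/ { print $1; next }
  # Sequence lines: strip spaces and tabs, then convert to uppercase.
  { gsub(/[ \t]/, ""); print toupper($0) }
' sample_in.fasta > sample_out.fasta

cat sample_out.fasta
```

This avoids the GNU-specific `\U` in sed and never overwrites the input, so the original download stays available for comparison.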
