Importing lowercase alignment file

I had to use an online MAFFT program because of memory issues. I am now trying to import the alignment file, but for some reason all the nucleotides were converted to lowercase. How do I import lowercase nucleotides? I could write a janky script to do it, but I am trying as much as possible to do this using qiime tools. My command was as follows:

qiime tools import \
  --input-path EXP_type1_rep-seqs-or-99/_out.18072802055553becwu7zMjlmSU9hBaKFHdlsflarge.pir \
  --output-path type1_rep-seqs-or-99_aligned-sequences.qza \
  --type 'FeatureData[AlignedSequence]'

Lowercase nucleotides can cause issues with some downstream tools, so I'd recommend just converting them to uppercase. I don't think a bash one-liner would be too janky, provided it does not cause issues with your seq IDs.

Good luck!

Unfortunately it does interfere with naming, but if there isn’t an easy solution I’ll write some kind of masticated Perl script.

Try this command in bash:

awk '/^>/ {print($0)}; /^[^>]/ {print(toupper($0))}' file.fasta > file_upper.fasta
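
If you would rather do this in Python, here is a minimal sketch of the same idea: header lines (those starting with ">") are passed through untouched so the seq IDs keep their case, and every other line is uppercased. The function names and the example records are hypothetical, and it assumes a FASTA-style file, so check that your alignment output really uses ">" headers before relying on it.

```python
def uppercase_sequences(lines):
    """Yield lines with sequence lines uppercased and '>' header lines unchanged."""
    for line in lines:
        if line.startswith(">"):
            # Header line: leave the seq ID exactly as it is.
            yield line
        else:
            # Sequence line: uppercase the nucleotides; gap characters are unaffected.
            yield line.upper()


def convert_file(in_path, out_path):
    """Write an uppercased copy of in_path to out_path (both paths are placeholders)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(uppercase_sequences(src))


if __name__ == "__main__":
    # Made-up records, only to show the behaviour.
    example = [">seq1 sample_a\n", "acgt-n\n", ">seq2\n", "ttga--\n"]
    print("".join(uppercase_sequences(example)), end="")
```

Calling convert_file("file.fasta", "file_upper.fasta") should then give you a file you can pass to qiime tools import.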

Colin
