Dear Qiime2 White Wizards:
I thought I would follow up on my previous post: Mafft 'returned non-zero exit status 1' ERROR
First of all, THANK you for adding the --parttree option to the MAFFT alignment. I thought I would note that I am still running into MAFFT errors:
- I ran the following command, which produced the following error after 20 hours of run time.
qiime alignment mafft --i-sequences rep-seqs.qza --o-alignment mafft_aligned-rep-seqs.qza --p-parttree --p-n-threads 0
Plugin error from alignment:
Command '['mafft', '--preservecase', '--inputorder', '--thread', '-1', '--parttree', '/var/folders/z4/jfbc76zs7_l8q_4xb8rgtmf00000gn/T/qiime2-archive-m77v99nb/1193edc7-297d-4a84-975c-0fa9b22d246d/data/dna-sequences.fasta']' returned non-zero exit status 1
- Based on previous suggestions, I decided to run MAFFT directly on the same sequences (~1.5 million, as a .fna file):
mafft --parttree --thread 1 seqs.fna > mafft_out
Wall Time Used : 4-00:00:14
State : TIMEOUT (exit code 0)
CPU Efficiency : 0.00%
Memory Requested : 300.00 GB (300.00 GB/node)
Memory Used : 0.00 MB (estimated maximum)
I tried multiple versions of this command with different parameters. Each time I increased the --thread parameter, the job failed for lack of memory (the memory requirement went through the roof!). The problem is that we don’t have more than 300 GB of memory available to our group on our cluster (the costs are very high), so I have to use our cluster’s burst mode, which imposes a time limit on jobs, hence the timeout error.
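One way to gauge how memory grows with input size and thread count, without committing to a multi-day job, would be to time MAFFT on a subsample first. The following is only a rough sketch: `fasta_head` is a hypothetical helper (not part of MAFFT or Qiime2), the file names and the subsample size are placeholders, and the commented-out `/usr/bin/time -v` line assumes GNU time on a Linux node.

```shell
# Hypothetical helper: print the first N FASTA records of a file to stdout.
# Multi-line records are kept intact because only header lines are counted.
fasta_head() {
    n="$1"
    file="$2"
    awk -v n="$n" '/^>/ { count++ } count > n { exit } { print }' "$file"
}

# Example use (placeholders; requires mafft and GNU time, so left commented out):
# fasta_head 10000 seqs.fna > subset.fna
# /usr/bin/time -v mafft --parttree --thread 1 subset.fna > subset_aligned.fna
# The "Maximum resident set size" line in the time report gives peak memory.
```

Repeating the run with a few subsample sizes and thread counts should show roughly how the peak memory scales before requesting resources for the full set.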
It seems that MAFFT requires an enormous amount of memory, whether run directly or through the Qiime2 plugin.
With Qiime 1 we could choose our alignment method (e.g. clustalw, mafft, muscle, pynast). Are there any plans to offer an aligner other than MAFFT?
Many thanks for even reading this far!