Diversity plugin error

Continuing the discussion from Alignment error message:

Hello
I am trying to generate a tree for phylogenetic diversity analyses using the command: qiime alignment mafft. And I am receiving the same error as the user linked above. I am working on a 3TB drive, so I don’t think that memory is the issue, but I could be wrong. A friend suggested that perhaps my sequences are too different in length to align efficiently. Any thoughts? @coralgal did you find a solution?
Thanks
Laura

Hello
I am having a similar problem as above. A friend suggested that I make sure all of my sequences are the same length. I am working out how to do this, but do you think that this may be the issue?
Thanks
Laura

Hello Laura,

If you are using a small amount of memory for your VM, this could be the issue. Does your error mention a memory allocation issue?

I think you may get better answers if you open a new thread. This one is 17 days old and more people will see your question in a new thread!

Colin

Thanks Colin, will do. Also, I’m working with a 3 TB drive, so I don’t know if memory is the problem?

Hi @LauraMason - Memory (RAM) is different from storage space: your 3 TB drive is storage space, but the alignment error you linked to has to do with the available memory (RAM) on your computer. Can you please tell us a bit more about how you are running QIIME 2 (VM? native install?), and some details on the computer (Mac? Linux? institutional cluster?). With that said, when it comes to memory errors, normally there aren’t too many other options besides getting your hands on more memory. One option is to run this on a cluster, if you aren’t already. Another option is to check out AWS, where you could rent an instance big enough for the job. Keep us posted! :t_rex:
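To make the RAM-versus-storage distinction concrete, here is a minimal Python sketch that reports both numbers for the machine it runs on. It assumes a Linux system (the `os.sysconf` names used here are not available on every platform), and the helper names are made up for illustration.

```python
import os
import shutil


def total_ram_bytes():
    """Physical memory (RAM) in bytes; assumes Linux-style sysconf names."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def total_disk_bytes(path="/"):
    """Total capacity of the filesystem that contains `path` (storage, not RAM)."""
    return shutil.disk_usage(path).total


if __name__ == "__main__":
    gib = 1024 ** 3
    print(f"RAM:     {total_ram_bytes() / gib:.1f} GiB")
    print(f"Storage: {total_disk_bytes() / gib:.1f} GiB")
```

On a machine like the one described in this thread, the two numbers would differ by roughly two orders of magnitude, and only the first one matters for a memory error.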

PS - in the future, please send us the exact command you are trying to run, as well as the complete error message (add --verbose to your command, or attach the log file referenced at the end of the error). Right now we are working on the assumption that you are experiencing the exact same error you linked to above, but oftentimes there are subtle differences in the command, error message, etc., that lead us down entirely different routes. Thanks!

Hi @thermokarst
Sorry for the double post earlier. (Am I posting this in the correct place this time?)

I am running QIIME 2 through a native install on a Linux machine with 32 GB of RAM. Here is the code:
qiime alignment mafft --i-sequences uchime-dn-out/rep-seqs-nonchimeric.qza --o-alignment aligned-rep-seqs-nonchimeric.qza --verbose

Here is the output:
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 1 /tmp/qiime2-archive-06k19wvb/93f71b67-0186-47f3-aa50-06d903649ea1/data/dna-sequences.fasta

inputfile = orig
171077 x 404 - 172 d
nthread = 1
stacksize: 8192 kb->33413 kb
generating a scoring matrix for nucleotide (dist=200) … done
Gap Penalty = -1.53, +0.00, +0.00

Making a distance matrix …
40301 / 171077 (thread 0)
/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/bin/mafft: line 2440: 26812 Killed "$prefix/disttbfast" -q $npickup -E $cycledisttbfast -V "-"$gopdist -s $unalignlevel $legacygapopt $mergearg -W $tuplesize $termgapopt $outnum $addarg $add2ndhalfarg -C $numthreadstb $memopt $weightopt $treeinopt $treeoutopt $distoutopt $seqtype $model -f "-"$gop -Q $spfactor -h $aof $param_fft $algopt $treealg $scoreoutarg < infile > pre 2>> "$progressfile"
Traceback (most recent call last):
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2cli/commands.py", line 224, in __call__
    results = action(**arguments)
  File "", line 2, in mafft
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 228, in bound_callable
    output_types, provenance)
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/qiime2/sdk/action.py", line 363, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_alignment/_mafft.py", line 61, in mafft
    run_command(cmd, aligned_fp)
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/site-packages/q2_alignment/_mafft.py", line 27, in run_command
    subprocess.run(cmd, stdout=output_f, check=True)
  File "/home/bobcatgenomics/miniconda3/envs/qiime2-2017.12/lib/python3.5/subprocess.py", line 398, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['mafft', '--preservecase', '--inputorder', '--thread', '1', '/tmp/qiime2-archive-06k19wvb/93f71b67-0186-47f3-aa50-06d903649ea1/data/dna-sequences.fasta']' returned non-zero exit status 1

Plugin error from alignment:

Command '['mafft', '--preservecase', '--inputorder', '--thread', '1', '/tmp/qiime2-archive-06k19wvb/93f71b67-0186-47f3-aa50-06d903649ea1/data/dna-sequences.fasta']' returned non-zero exit status 1

See above for debug info.

Thanks!
Laura


:+1::tada:

Hmm, the error message looks like mafft is working with 32 MB of RAM, not 32 GB. Would you be able to share uchime-dn-out/rep-seqs-nonchimeric.qza, so we can see if we can tease out what is going on here? Feel free to send it to me in a DM if you can't share publicly.

Since it sounds like this is an institutional Linux cluster, do you need to tweak your job submission to request more memory? Just a thought. Keep us posted, and let us know about the data. Thanks! :t_rex:


Thanks for sharing your data in a DM! I was able to recreate the error you reported, and from the looks of it, I think I was running out of memory, too. Will look into this a bit more, so stay tuned. :t_rex:

In the meantime, maybe you should look into filtering some of these data to reduce the footprint of this analysis. In his "Analyzing paired end reads" tutorial, @gregcaporaso recommends considering keeping only OTUs that appear in more than one sample. @Nicholas_Bokulich also recommends quality filtering prior to clustering (--p-min-quality 20 or --p-min-quality 30 when running qiime quality-filter q-score), and then removing singletons with an abundance filter (and/or chimera filtering):

https://www.nature.com/articles/nmeth.2276
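As a rough illustration of the two filters suggested above (drop singletons, and keep only features seen in more than one sample), here is a minimal pure-Python sketch. The table layout, the function name `filter_features`, and the toy counts are all hypothetical; on real data you would use QIIME 2's own feature-table filtering rather than this.

```python
def filter_features(table, min_samples=2, min_total=2):
    """Return only the features that pass both filters.

    `table` maps feature ID -> {sample ID: count} (a hypothetical layout).
    A feature is kept if it is present in at least `min_samples` samples
    and its summed count is at least `min_total` (i.e. not a singleton).
    """
    kept = {}
    for feature_id, counts in table.items():
        n_samples = sum(1 for c in counts.values() if c > 0)
        total = sum(counts.values())
        if n_samples >= min_samples and total >= min_total:
            kept[feature_id] = counts
    return kept


# Toy table for illustration only; these are not real data.
toy = {
    "otu1": {"s1": 10, "s2": 4},
    "otu2": {"s1": 1, "s2": 0},  # singleton
    "otu3": {"s1": 7, "s2": 0},  # abundant, but found in one sample only
}
print(sorted(filter_features(toy)))  # ['otu1']
```

Fewer representative sequences going into the alignment step means less work, and less memory, for MAFFT.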

To be clear, the error you are observing is actually from MAFFT, the underlying alignment tool being used by q2-alignment, so you would see this same error if you ran the same data directly with MAFFT. Maybe you can work around this large amount of memory that MAFFT is trying to use by performing some of the cleanup steps I suggest above. Please let us know how it goes! :t_rex:
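For a sense of scale, here is a back-of-the-envelope sketch of how large a full pairwise distance matrix would be for the 171077 sequences reported in the log. Whether MAFFT actually holds a full matrix in memory for this input, and the 4 bytes per value used below, are assumptions made for illustration and are not taken from MAFFT's documentation.

```python
def pairwise_count(n_seqs):
    """Number of unordered sequence pairs among n_seqs sequences."""
    return n_seqs * (n_seqs - 1) // 2


def matrix_gib(n_seqs, bytes_per_value=4):
    """Approximate size in GiB of storing one distance value per pair.

    bytes_per_value=4 is an assumed storage size, not a MAFFT fact.
    """
    return pairwise_count(n_seqs) * bytes_per_value / 1024 ** 3


n_seqs = 171077  # sequence count from the "171077 x 404 - 172 d" log line
print(f"{pairwise_count(n_seqs):,} pairwise distances")
print(f"about {matrix_gib(n_seqs):.0f} GiB at 4 bytes per distance")
```

Under these assumptions the estimate lands well above 32 GB, and because the number of pairs grows with the square of the number of sequences, halving the input cuts the estimate to roughly a quarter.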


I tried to run the unfiltered (except for chimeras) data on my personal computer - a Dell running the Linux subsystem with 32 GB of RAM. It seems like mafft is taking up too much memory here too, though, even with all 4 cores running it.

qiime alignment mafft --i-sequences rep-seqs-nonchimeric.qza --o-alignment rep-seqs-nonchimeric-aligned.qza --p-n-threads -1 --verbose

Plugin error from alignment:

Command '['mafft', '--preservecase', '--inputorder', '--thread', '-1', '/tmp/qiime2-archive-19astjd9/93f71b67-0186-47f3-aa50-06d903649ea1/data/dna-sequences.fasta']' returned non-zero exit status 254

Hopefully this isn’t too much for one post - let me know if it is

Thank you

Laura

I am going to split these into some new topics, since this is starting to lose focus. Thanks!

Thanks @LauraMason, can you attach the entire error log, either by running with --verbose (copy and paste), or by uploading the log that is referenced at the end of the error when running the command without the --verbose flag?

"Cores" refers to the CPU, which is different from memory (RAM). With that said, running on four cores usually means you are using additional memory. The exit code of 254 is different from what you reported before, so something is failing in MAFFT in a new way. You might have better luck asking the MAFFT developers what this error code means (or at least consulting their documentation). Send that log our way too and we can start inspecting as well, but like I said, this is an error coming from MAFFT, and since we don't develop MAFFT, we aren't the best resource for figuring out why this is failing. Thanks! :t_rex:

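For readers trying to interpret return codes like the ones in this thread, here is a small Python sketch of how `subprocess` reports a child that was killed by a signal versus one that exited with a non-zero status. The helper name `describe_returncode` is made up for illustration, and the child process is a stand-in that kills itself; whether the kernel's out-of-memory killer was responsible for the "Killed" line in the earlier log is an assumption that would need to be confirmed, for example in the system log via `dmesg`.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Explain a subprocess return code in plain language."""
    if returncode == 0:
        return "success"
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return f"killed by signal {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Stand-in child that sends SIGKILL to itself, mimicking a process that is
# killed from outside (as an out-of-memory killer would do).
child = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
result = subprocess.run(child)
print(describe_returncode(result.returncode))
```

Note that in the logs above, `mafft` is a wrapper script around the program that was killed, so Python only saw the wrapper's own exit status (1 or 254) rather than a negative signal number.
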