Is there something unusual about how the TMPDIR is being set in 2023.2?

Hello,

I'm running into an issue related to how TMPDIR is being handled in QIIME 2 2023.2.

Here is the code I am running:

source activate qiime2-2023.2

set -x
set -e

export TMPDIR=/panfs/jpshaffer/tmp2

qiime diversity beta-phylogenetic \
  --i-table /projects/emp/emp-allergy/data/emp_16s_release2_qiita_with_release1_90bp_gg2_asv_rar5k.qza \
  --p-metric 'weighted_unifrac' \
  --p-threads 60 \
  --i-phylogeny /databases/gg/2022.10/2022.10.phylogeny.asv.nwk.qza \
  --o-distance-matrix /projects/emp/emp-allergy/data/emp_16s_release2_qiita_with_release1_90bp_gg2_asv_rar5k_dist_wunifrac.qza \
  --verbose

Here is the error message:

+ set -e
+ export TMPDIR=/panfs/jpshaffer/tmp2
+ TMPDIR=/panfs/jpshaffer/tmp2
+ qiime diversity beta-phylogenetic --i-table /projects/emp/emp-allergy/data/emp_16s_release2_qiita_with_release1_90bp_gg2_asv_rar5k.qza --p-metric weighted_unifrac --p-threads 60 --i-phylogeny /databases/gg/2022.10/2022.10.phylogeny.asv.nwk.qza --o-distance-matrix /projects/emp/emp-allergy/data/emp_16s_release2_qiita_with_release1_90bp_gg2_asv_rar5k_dist_wunifrac.qza --verbose
Traceback (most recent call last):
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2cli/commands.py", line 352, in __call__
    results = action(**arguments)
  File "<decorator-gen-111>", line 2, in beta_phylogenetic
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 475, in _callable_executor_
    outputs = self._callable(scope.ctx, **view_args)
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2_diversity/_beta/_pipeline.py", line 31, in beta_phylogenetic
    dm, = action(table, phylogeny, threads=threads,
  File "<decorator-gen-512>", line 2, in weighted_unifrac
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 405, in _callable_executor_
    prov = provenance.fork(name)
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/core/archive/provenance.py", line 442, in fork
    forked = super().fork()
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/core/archive/provenance.py", line 342, in fork
    forked._build_paths()
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/core/archive/provenance.py", line 142, in _build_paths
    self.path = qiime2.core.path.ProvenancePath()
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/core/path.py", line 146, in __new__
    path = tempfile.mkdtemp(prefix=prefix)
  File "/home/jpshaffer/software/miniconda3/envs/qiime2-2023.2/lib/python3.8/tempfile.py", line 358, in mkdtemp
    _os.mkdir(file, 0o700)
OSError: [Errno 28] No space left on device: '/tmp/qiime2-provenance-pme9mvn2'

Plugin error from diversity:

  [Errno 28] No space left on device: '/tmp/qiime2-provenance-pme9mvn2'

See above for debug info.
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command:

ssu -i /tmp/qiime2/jpshaffer/data/c4d2bb02-951c-4737-9f99-c05c2038d05e/data/feature-table.biom -t /tmp/qiime2/jpshaffer/data/1d6fd745-9191-448c-9066-6b754e53a272/data/tree.nwk -m weighted_unnormalized -o /tmp/q2-LSMatFormat-uxju8cte

It appears QIIME 2 is not using the TMPDIR that was defined in the job script.

Thanks in advance for any insight.


Hi @Lichen!

Good observation; what you did should have worked (and certainly has worked in the past).

Just to check how other programs treat it, could you run:

mktemp -u

in the same context as your command (post env-var setting)?

What's weird about this is that we are just using Python's standard library (tempfile.mkdtemp) to resolve the temp dir, so I really don't have a good suggestion yet.
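
If it helps to narrow things down, here is a quick check you could add to the job script right after the export (just a sketch; the python one-liner asks the standard library where it thinks the temp dir is, which mirrors the tempfile.mkdtemp call QIIME 2 ends up making):

# Run inside the activated qiime2-2023.2 environment, after exporting TMPDIR.
echo "$TMPDIR"
# gettempdir() consults TMPDIR (then TEMP, TMP); mkdtemp() creates a directory under it.
python -c "import tempfile; print(tempfile.gettempdir()); print(tempfile.mkdtemp(prefix='qiime2-tmpdir-check-'))"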

Thank you!

Here is the output from running 'mktemp -u':

+ mktemp -u
/panfs/jpshaffer/tmp2/tmp.yxYnLJPmXa

Thanks again!


Well darn, that looks perfect.

Are you running these commands through a queueing system, and if so, was the mktemp command also run through the same queueing system?

Yes, the error is from running on a queueing system. However, the 'mktemp -u' test I just performed was in an interactive job on the same system. I have just kicked off a new job through the queueing system, identical to the initial job that failed but with the 'mktemp -u' test included, and will follow up once it fails or completes.

Thanks again!


Hey @Lichen,

Just wanted to check if you found the source of the issue?

Thanks!
-Evan

Thank you! It looks like our /tmp/ was cleaned up, and the job was able to complete because that freed enough space. Bummer! I was hoping to reproduce the problem. I will be sure to re-post if it comes up again, but let me know if you'd like to investigate further.

Hi Justin and Evan,
We also had trouble executing our standard SLURM qiime2-2023.2 scripts on our HPC system. Our solution was to empty the temp directory before each run by including something like this in the bash script (before activating the conda qiime2-2023.2 environment, running the commands, and deactivating it):

#!/bin/bash

#SBATCH --job-name=qiime2
#SBATCH --cpus-per-task=2
#SBATCH --output=log/qiime2-%j.out
#SBATCH --error=log/qiime2-%j.err

hostname
rm -r $TEMP/qiime2

Maybe this is of some use to you or other QIIME 2 users...
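
For anyone adapting this, a variant that gives each job its own temp directory and removes it on exit might look roughly like the sketch below (the /panfs path and the placeholder for the qiime commands are illustrative only; substitute whatever filesystem on your cluster has enough free space):

#!/bin/bash
#SBATCH --job-name=qiime2
#SBATCH --cpus-per-task=2
#SBATCH --output=log/qiime2-%j.out
#SBATCH --error=log/qiime2-%j.err

# Per-job scratch directory; /panfs/$USER/qiime2-scratch is a placeholder path.
export TMPDIR=/panfs/$USER/qiime2-scratch/$SLURM_JOB_ID
mkdir -p "$TMPDIR"
# Remove the per-job temp directory when the job ends, even on failure.
trap 'rm -rf "$TMPDIR"' EXIT

source activate qiime2-2023.2
# ... your qiime commands here ...
conda deactivate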


Hi @Mechah,

That's a good workaround, but it's definitely not our goal to need such a thing.
Would you be able to share any details about what specifically wasn't working?

Perhaps there's something we could be doing better here.