Trouble exporting environment variable TMPDIR

QIIME2, Training a Classifier, Plugin error from rescript: [Errno 28] No space left on device, export TMPDIR

Hello all,

My name is Bryan and I'm a master's student studying fish diet using cytochrome oxidase I (COI) universal primers. I am following the "Building a COI database from BOLD references" tutorial to build a database and train a classifier for the target COI region (~180 bases). I have tried both evaluate-fit-classifier (rescript) and q2-feature-classifier (QIIME2); both hit the same error after many hours of running. evaluate-fit-classifier apparently requires more memory and storage because it includes a series of validation steps at the end. The issue appears to be that I run out of storage in the tmp directory, and the solution given on other forum threads is to export TMPDIR so that it points to a location with adequate storage. After doing so, the code still appears to be attempting to write to the default location. I successfully trained and used the classifier with q2-feature-classifier on a 0.2 subset of the dataset, but training failed on a 0.25 subset. Below are the annotated commands and errors.

(this post was edited/updated on 04/26/2022 to reflect the exact code that was used when the problem occurred)
#attach to screen
-bash-4.1$ screen -r 12712.pts-3.khaleesi

#created bash subshell (potential issue)
-bash-4.1$ bash

#put conda's base (root) on PATH (potential issue)
bash-4.1$ conda activate

#activate conda environment
(base) bash-4.1$ conda activate qiime2-2022.2

#navigate to working directory
(qiime2-2022.2) bash-4.1$ cd /projects/pesticide_metagenomics/CLG_analysis_bv/QIIME2

#check current tmp dir storage
(qiime2-2022.2) bash-4.1$ df /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_centos-lv_root
51475068 40293260 8560368 83% /
#export tmp directory to location with more space
(qiime2-2022.2) bash-4.1$ export TMPDIR = '/projects/temp'

#validate
(qiime2-2022.2) bash-4.1$ echo -k $TMPDIR
’/projects/temp’

#check storage in new location, should be plenty
(qiime2-2022.2) bash-4.1$ df /projects/temp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_data-projects
13639837696 11687559604 1952278092 86% /projects
#subset by 0.25
qiime rescript subsample-fasta --i-sequences bold_anml_seqs.qza --p-subsample-size 0.25 --p-random-seed 1 --o-sample-sequences bold_anml_seqs_sub0.25.qza --verbose

#attempt training of classifier
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads bold_anml_seqs_sub0.25.qza --i-reference-taxonomy bold_anml_taxa.qza --o-classifier classifier_q2_0.25.qza

#detach from screen
Ctrl+A, D
[detached]

#evaluate job status
-bash-4.1$ htop
[screenshot: htop output]

#error after hours of running
Plugin error from rescript: [Errno 28] No space left on device
Debug info has been saved to /tmp/qiime2-q2cli-err-_j6y5a0l.log

I'm concerned that the .log file wrote to the default tmp directory location, which makes me think exporting the variable did not do the trick. Here is a screen grab of the log file error.

I know this is a subject that has been discussed before, but the solutions that I could find in the forum do not appear to solve it. Any help is greatly appreciated!

Best,
-Bryan

Quick question:

Did you run the successful 0.2 subset using the alternate $TMPDIR, or the default?
If it used the default setup, maybe try running an even smaller subset (like 0.05 or tinier): export your temporary directory to ‘/projects/temp’, run the same commands you have been running all along, and see if it writes the info where you expect.
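A quick way to run that kind of check before committing to a multi-hour job is sketched below. It assumes GNU coreutils `mktemp`; `alt_tmp` and `ALT_TMP` are hypothetical names, and the directory is only a stand-in for a roomy location such as /projects/temp. The Python line is optional and only reports where Python's `tempfile` module would put its files, which is what Python-based tools usually rely on.

```shell
# Hypothetical stand-in for a large scratch location such as /projects/temp.
# Set ALT_TMP beforehand to test a real directory instead.
alt_tmp="${ALT_TMP:-$(mktemp -d)}"
mkdir -p "$alt_tmp"

# Note: no spaces around '=' in a shell assignment.
export TMPDIR="$alt_tmp"

# With no template argument, GNU mktemp creates its file under $TMPDIR.
probe=$(mktemp)
echo "temp file created at: $probe"
rm -f "$probe"

# Optional: ask Python which temp directory it would use.
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import tempfile; print("python temp dir:", tempfile.gettempdir())'
fi
```

If the printed path does not start with the directory you exported, the variable is probably not set the way you think it is in that shell.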

One thing to verify is that you are running out of disk space because of file size, not because of a limit on the number of files. I doubt that is the issue here, but perhaps the power users in the forum can confirm you don’t create many small files during this process.
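The two cases can be told apart with `df`: the default report shows block (size) usage, and `df -i` shows inode (file count) usage. Below is a minimal sketch; `check_tmp_capacity` is a hypothetical helper name, and some filesystems do not report meaningful inode counts.

```shell
# Hypothetical helper: show block and inode usage for a directory.
# Defaults to $TMPDIR, or /tmp when TMPDIR is unset.
check_tmp_capacity() {
    dir="${1:-${TMPDIR:-/tmp}}"
    echo "Checking: $dir"
    # Space used and available, in 1K blocks.
    df -k "$dir"
    # Inodes used and available (a limit on the number of files).
    df -i "$dir" || echo "inode report not available on this system"
}

check_tmp_capacity /
```

An "[Errno 28] No space left on device" error can be raised in either situation, so it is worth looking at both reports for the filesystem that is actually being written to.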

Hi Devon,

Thank you for the quick response!
I can confirm that I did run out of disk space when training the classifier with the full dataset.

Update:
I ran the 0.2 subset using the default tmp directory.
I opened a new screen, activated the conda environment, and began training with q2-feature-classifier without attempting to set the alternate. However, on other screens where I do attempt to set the alternate $TMPDIR, it still appears to write to the default tmp directory. So the 0.2 subset would probably also complete after attempting to set the alternate $TMPDIR, because it seems like export TMPDIR = '/projects/temp' does not actually change anything.
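One thing that may be worth ruling out here, independent of the conda setup: in Bourne-style shells an assignment cannot have spaces around `=`, so an export command written with spaces does not assign the value. The sketch below uses a hypothetical variable, `DEMO_TMP`, instead of TMPDIR so it is safe to run anywhere. It does not by itself show that this was the cause of the failures in this thread.

```shell
unset DEMO_TMP

# With spaces, the shell passes "=" and the path to export as separate
# arguments, so nothing is assigned. Run in a subshell because some
# shells abort on this error.
( export DEMO_TMP = '/projects/temp' ) 2>/dev/null
bad_status=$?
echo "with spaces: exit status $bad_status"

# Without spaces, the variable is assigned and exported.
export DEMO_TMP='/projects/temp'
good_status=$?
echo "without spaces: exit status $good_status, DEMO_TMP=$DEMO_TMP"
```

Running `echo "$TMPDIR"` right after the export is a simple way to confirm which value the shell actually holds.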

But I may have found the solution now. I was able to complete the 0.25 subset and the 0.5 subset yesterday, and q2-feature-classifier is currently training on the full dataset; I am waiting for it to finish.

I had followed a tutorial for installing miniconda3 and running a conda environment on a Linux system, in which they created a new bash subshell within the current shell (possible problem #1) before typing "conda activate" to put conda's base (root) on PATH (possible problem #2), and then activated the conda environment.

Failed Code:
#attach to screen
-bash-4.1$ screen -r 12712.pts-3.khaleesi
#created bash subshell
-bash-4.1$ bash
#put conda's base (root) on PATH
bash-4.1$ conda activate
#activate conda environment
(base) bash-4.1$ conda activate qiime2-2022.2
#now in activated conda environment
(qiime2-2022.2) bash-4.1$

Successful Code:
#attach to screen
-bash-4.1$ screen -r 12712.pts-3.khaleesi
#activate conda environment
-bash-4.1$ conda activate qiime2-2022.2
#now in activated conda environment
(qiime2-2022.2) bash-4.1$
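
For anyone comparing the two setups, here is a small sketch of how exported variables behave across a parent shell and a child shell. `DEMO_TMP` and `DEMO_CHILD_ONLY` are hypothetical names, and `sh -c` stands in for the nested `bash` call. This only illustrates general shell behavior; it does not confirm why the nested-shell setup failed in this case.

```shell
# A variable exported in the parent is inherited by a child shell.
export DEMO_TMP="/projects/temp"
child_view=$(sh -c 'printf "%s" "$DEMO_TMP"')
echo "child sees: $child_view"

# A variable exported inside a child shell is gone once the child exits.
unset DEMO_CHILD_ONLY
sh -c 'export DEMO_CHILD_ONLY="set-in-child"'
parent_view="${DEMO_CHILD_ONLY:-unset}"
echo "parent sees: $parent_view"
```

In other words, the export has to happen in the same shell that launches the QIIME command, or in one of its ancestors.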
