qiime tools import on cluster

Hi all :slight_smile: I installed qiime2 v2020.8 on my institutional server, through miniconda3.
I have paired-end (PE) sequence input files that have already been demultiplexed, and I created a manifest file to import them.

Unfortunately, when trying to run
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path qiime_manifest.tsv --output-path $(pwd)/demux.qza --input-format PairedEndFastqManifestPhred33V2

first it failed for lack of disk space ([Errno 28] No space left on device). As I read here on the forum, I therefore exported TMPDIR to point to a specific folder within my working space (I am working in a scratch folder). However, now a new (and to me less understandable) error is raised:

qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path qiime_manifest.tsv --output-path $(pwd)/demux.qza --input-format PairedEndFastqManifestPhred33V2
Traceback (most recent call last):
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/q2cli/builtin/tools.py", line 158, in import_data
    view_type=input_format)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 241, in import_data
    validate_level='max')
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/sdk/result.py", line 273, in _from_view
    provenance_capture=provenance_capture)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/archiver.py", line 316, in from_data
    Format.write(rec, type, format, data_initializer, provenance_capture)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v5.py", line 21, in write
    provenance_capture)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v1.py", line 19, in write
    provenance_capture)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/archive/format/v0.py", line 62, in write
    data_initializer(data_dir)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/site-packages/qiime2/core/path.py", line 37, in _move_or_copy
    return _ConcretePath.rename(self, other)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/pathlib.py", line 1309, in rename
    self._accessor.rename(self, target)
  File "/scratch/mastrori/conda_env/qiime2-2020.8/lib/python3.6/pathlib.py", line 393, in wrapped
    return strfunc(str(pathobjA), str(pathobjB), *args)
FileExistsError: [Errno 17] File exists: '/scratch/mastrori/EM007_16S_Blanka/qiime_tmp/q2-SingleLanePerSamplePairedEndFastqDirFmt-4av8lpvk' -> '/scratch/mastrori/EM007_16S_Blanka/qiime_tmp/qiime2-archive-rodl8hyh/92061de0-7764-4e97-a7c5-069cbd504a91/data'

An unexpected error has occurred:

  [Errno 17] File exists: '/scratch/mastrori/EM007_16S_Blanka/qiime_tmp/q2-SingleLanePerSamplePairedEndFastqDirFmt-4av8lpvk' -> '/scratch/mastrori/EM007_16S_Blanka/qiime_tmp/qiime2-archive-rodl8hyh/92061de0-7764-4e97-a7c5-069cbd504a91/data'

See above for debug info.

I consulted my sysadmin, and it looks to them as if two threads are picking up the same sample and trying to write to the same temp folder. The folder designated as TMPDIR was of course freshly created, and I tried rm-oving and re-mk-ing it, but the error stayed the same.

Any suggestion of what I might be doing wrong here?

Best and thanks a lot for the support,
Nora

Thanks for the detailed info!

Can you please provide us with two things:

  1. the output of the command env
  2. your manifest file

It does appear that something is trying to overwrite an existing file, but there aren't multiple threads competing here. I am wondering if you have an issue where your tmpdir is polluting your import directory somehow (this is why I asked for item 1 above), or perhaps there is a problem with the manifest where multiple files are being imported to one sample id (which is why I asked for item 2 above; although the format validation should catch this situation).

A third alternative is that there is an issue with the scratch filesystem itself - we have observed issues in the past with Python (which the QIIME 2 framework is built on) running into problems with networked filesystems.
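The manifest possibility mentioned above (several rows sharing one sample id) can be ruled out quickly before sending the file. Here is a minimal sketch, assuming the V2 paired-end manifest is a plain tab-separated file with a `sample-id` header column and no comment lines; `duplicate_sample_ids` is a made-up helper name, and the example rows and paths are placeholders, not the real data.

```python
import csv
import io
from collections import Counter


def duplicate_sample_ids(manifest_text):
    """Return the sample IDs that occur more than once in a tab-separated manifest."""
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    counts = Counter(row["sample-id"] for row in reader)
    return sorted(sid for sid, n in counts.items() if n > 1)


# Made-up example: S1 is listed twice, S2 once.
example = (
    "sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n"
    "S1\t/data/S1_R1.fastq.gz\t/data/S1_R2.fastq.gz\n"
    "S2\t/data/S2_R1.fastq.gz\t/data/S2_R2.fastq.gz\n"
    "S1\t/data/S1b_R1.fastq.gz\t/data/S1b_R2.fastq.gz\n"
)
print(duplicate_sample_ids(example))
```

To check a real file, pass in its contents, e.g. `duplicate_sample_ids(open("qiime_manifest.tsv").read())`; an empty list means every sample id is unique.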

Dear Matthew,

thanks a lot for the super quick reply.
Please find attached the two requested files.

As a side note (hopefully helpful?), the command worked correctly with a subset of the data (5 out of 110) when TMPDIR was set to my home directory, as suggested in another thread. Unfortunately, the size limit there rules that out for the full dataset, and if I export TMPDIR to point either to /scratch itself or to another folder under my group folder, the command invariably fails as reported.

Thanks again for the help.
Best,
Nora

env_info.txt (10.3 KB) qiime_manifest.txt (41.4 KB)

Thanks @nora!

What command did you run to declare a new temporary directory?

Dear Matthew,

the command I ran was
export TMPDIR='/g/scb/mzimmerm/mastrori/tmp'
which is the same TMPDIR you should find in the previously attached env_info.txt.
However, I also tried setting it to a folder within /scratch, i.e. where the data and the conda env are. Unfortunately, still the same error :frowning:

Best,
Nora

Thanks for clarifying! I noticed that the tmp dir set in your env output was

/g/scb/mzimmerm/mastrori/tmp

but I also noticed that that doesn't match the tmp dir actually reported in your error message above:

/scratch/mastrori/EM007_16S_Blanka/qiime_tmp

Did you try setting a different tmp dir in an earlier iteration of this? I think that is what you're saying above, but I just want to double-check.
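A quick way to double-check which temp dir is actually in effect is to ask Python directly, from the same shell session in which the import is run. This is a minimal sketch, assuming QIIME 2 creates its temporary files through Python's standard `tempfile` module (the traceback paths suggest so, but that is an assumption here); `describe_tmpdir` is a made-up helper name.

```python
import os
import shutil
import tempfile


def describe_tmpdir():
    """Return the temp dir Python will use and the free space there in GiB."""
    # gettempdir() honours TMPDIR when it points to a usable directory,
    # and silently falls back to other locations otherwise.
    tmp = tempfile.gettempdir()
    usage = shutil.disk_usage(tmp)
    return tmp, usage.free / 2**30


tmp, free_gib = describe_tmpdir()
print("TMPDIR env var :", os.environ.get("TMPDIR", "(not set)"))
print("Python temp dir:", tmp)
print("Free space     : {:.1f} GiB".format(free_gib))
```

If the two paths differ, the exported TMPDIR is not being picked up (for example because the directory does not exist or is not writable), which would also explain a mismatch between the env output and the error message.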

Dear Matthew,

so, to clarify, I'm attaching the error I get after running:

export TMPDIR='/scratch/mastrori/EM007_16S_Blanka/qiime_tmp'
mkdir /scratch/mastrori/EM007_16S_Blanka/qiime_tmp
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path qiime_manifest.txt --output-path $(pwd)/demux.qza --input-format PairedEndFastqManifestPhred33V2 2> /g/scb/mzimmerm/mastrori/qiime_err.txt

Hope this clarifies. Thanks again,
Nora
qiime_err.txt (2.2 KB)

Perfect, thanks!

Can you run the following command and paste the results here?

mount

This will let us get some more detailed information about how the system's disks are configured.
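If the full mount output is too long to read through, the entry that matters for a given folder can be picked out with a few lines of Python. This is a rough sketch, assuming a Linux system where `/proc/mounts` lists one mount per line as `device mount_point fs_type ...`; `fs_type_for` is a hypothetical helper, and the sample table below is made up rather than taken from any real cluster.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing path."""
    path = os.path.abspath(path)
    best = ("", "unknown")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with trailing slashes so /scratch does not match /scratch2.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best[0]):
            best = (mount_point, fs_type)
    return best


# Made-up mount table for illustration only.
sample_mounts = (
    "sysfs /sys sysfs rw 0 0\n"
    "/dev/sda1 / ext4 rw 0 0\n"
    "example-volume /scratch fuse.example rw 0 0\n"
)
print(fs_type_for("/scratch/some_user/qiime_tmp", sample_mounts))
```

On the cluster itself, the real table can be passed in with `fs_type_for(os.environ["TMPDIR"], open("/proc/mounts").read())`. A type starting with `fuse` means the folder sits on a FUSE-mounted filesystem. Mount points containing spaces are escaped in `/proc/mounts` and are not handled by this sketch.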

Dear Matthew,

here is the (looong) output of the mount command.

Thanks a lot for the time and dedication!!

Best,
Nora
mount_info.txt (24.6 KB)

Thanks for the info, @nora.

Looks like your HPC is using a tool called "FUSE" to mount your file system at /scratch/mastrori. I suspect that might be the main cause of the issue - somehow the FUSE driver and Python (the language the QIIME 2 framework is implemented in) aren't quite on the same page about what files are where. We have seen a few other reports of this floating around on the forum (cc @ebolyen). I don't have a good workaround for you right now, besides asking your sysadmin if there is a non-FUSE device that you could work on.

Dead Marr,

thank you, I will try to contact them again and see if we can find a workaround. Otherwise I will probably try docker (fingers suuuuper crossed).

Best,
Nora


lol! This is going to be my new undercover alias...

Keep us posted!

:qiime2:


Dear Matt (sorry for the previous typo),

unfortunately no luck on my side. I started again from scratch, installed miniconda3 and qiime2 v2020.8, and set my working folders to be outside any 'FUSE' (in our case, quobyte) space. I am still unable to get the import function working, always with the same [Errno 17] File exists error. This is independent of whether or not I redirect my TMPDIR.

I searched the forum a bit more, and it looks like a similar problem was reported here, but the thread was closed without being solved.

I would gladly take any other suggestion from your side. Right now I don't have many other ideas besides trying again with a different - non-native - installation of qiime.

However, I would really like to leverage the computational power available on my server.

Best,
Nora

P.S. From my reading around, this looks like a problem that is not unique to the import process. Could it be related to how multithreading is handled during these commands? Unfortunately there is no way to manually deactivate multithreading on import, so I cannot check my hypothesis, but I hope this is of some use.

Hello!

My guess is that it's because the default tmpdir is also on a FUSE filesystem.

Yeah, unfortunately we weren't able to reproduce the error on our end, which makes debugging pretty difficult.

Great suggestion! Unfortunately there is no multithreading enabled during import, for any of the available QIIME 2 imports. The issue has to do with reading and writing to the filesystem, not the CPU. Basically, the FUSE driver appears to report back inconsistent information to Python when operating - Python asks if a file has been written to the disk, FUSE tells it NO (when in reality it should've said YES), so then when Python attempts to write the file it's like "whoa what, you just told me nobody was home.... guess I'll just leave..." (this is just speculation on my part).

I'm not sure what other options come to mind at the moment, but I'll let you know if I think of anything. Sorry!
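For anyone who wants to probe the filesystem without QIIME 2 in the picture, the failing step in the traceback is a plain directory rename inside the temp dir, and that can be exercised in isolation. The sketch below is not a QIIME 2 script: `probe_rename` is a made-up helper, and whether the target `data` directory already exists before the rename in QIIME 2 is an assumption that has not been checked, so both cases are tried.

```python
import errno
import os
import tempfile


def probe_rename(base_dir, precreate_target):
    """Rename a populated directory to a target under base_dir and report the result."""
    with tempfile.TemporaryDirectory(dir=base_dir) as work:
        # Source directory with one file, standing in for the imported data.
        src = os.path.join(work, "source")
        os.mkdir(src)
        with open(os.path.join(src, "payload.txt"), "w") as fh:
            fh.write("test\n")
        # Target path nested one level down, like <archive>/data in the traceback.
        dst = os.path.join(work, "archive", "data")
        os.makedirs(os.path.dirname(dst))
        if precreate_target:
            os.mkdir(dst)  # an empty directory already sits at the target
        try:
            os.rename(src, dst)
        except OSError as exc:
            return "failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
        return "ok"


base = tempfile.gettempdir()
print("target absent     :", probe_rename(base, precreate_target=False))
print("target empty dir  :", probe_rename(base, precreate_target=True))
```

Running it once with TMPDIR on the home directory and once with TMPDIR on the scratch space would show whether the two filesystems answer the same rename differently. A `failed: EEXIST` on one of them would match the [Errno 17] in the report; it would not by itself prove where the fault lies.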

Dear Matt,

according to what my sysadmin has been telling me, no, our scratch is not a FUSE filesystem. That is the actual reason for my confusion. Therefore there might be other filesystems not working properly under such circumstances.

Thanks for the explanation, it makes much more sense now. Luckily, for the specific experiment I'm working on right now I managed to compute everything locally, so no problem. However, it will be really important for me that this problem gets solved in the future, because the datasets I will have to handle will often benefit from cluster performance.

Best,
Nora

Hi @nora!

The logs you shared above indicate otherwise - FUSE is a type of filesystem, and it looks like your sysadmin is using a specific vendor's version of the software (quobyte).

Would you be available to run some debugging scripts for us on your cluster? I think I mentioned somewhere above in this thread that we haven't been able to reproduce this (I personally think there is something buggy about the quobyte software, we've seen it before), and we don't have access to quobyte's FUSE system. This would be a huge help for trying to diagnose and debug - just let me know.

:qiime2:

Dear Matt,

sorry to hear that. Unfortunately I had simply relied on the answers I got from my sysadmin.

If none of them requires sudo rights, absolutely. I would be glad to lend a hand debugging it.

Best,
Nora
