Dada2 "No such file or directory" and "Error in open.connection" message

@ebolyen and @Bing, so I ran it again with the complete dataset and it looks like it worked.
The only differences were:

  1. Before, the demux-paired-end.qza file was in a different location/working directory than the fastq files; this time I had all the files in the same folder, which was also my working directory
  2. I tried to free up as much memory as possible

I don’t have a clear answer as to why it worked this time, but for now I will stick with keeping everything in the same directory, including the fastq files, although I thought that once the demux.qza file was ready I would not need the fastq files around…

Thanks

Thanks for the data @apzlo and @Bing! I'm going to run your commands on our local dev cluster today to see if I am able to reproduce.

This shouldn't matter in principle. The .qza/.qzv files are just .zip files. What QIIME 2 does is unzip them to a temporary directory, then it operates on that unzipped data without touching the .qza again.
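
For anyone curious, here is a minimal sketch of that round-trip using only the standard library (the .qza file name here is hypothetical):

```python
import tempfile
import zipfile

qza_path = "demux-paired-end.qza"  # hypothetical file name

# A .qza is an ordinary zip archive: peek inside, then extract it to a
# temporary directory, which is roughly what QIIME 2 does internally.
with zipfile.ZipFile(qza_path) as zf:
    print(zf.namelist()[:5])        # first few members of the archive
    workdir = tempfile.mkdtemp()
    zf.extractall(workdir)          # QIIME 2 operates on a dir like this
    print("extracted to", workdir)
```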

Since your dataset is right at the boundary of available memory, this may be the reason it worked.
@thermokarst is going to run a couple tests on smaller AWS instances to see if we can replicate that way.

I'm kind of guessing I won't be able to reproduce the issues (time will tell), but at this point I'm suspicious of hardware failure and memory. @Bing was able to re-run with --verbose, so whatever the constraint was, it was reliable enough for that. My bet is on memory at the moment, because it is hard to get a hardware failure to happen the same way twice.

@Bing how much memory did you have when you ran your commands?

More generally, could both of you describe what kind of system you are running on?

  • What platform and distro (e.g. Debian Linux or OS X Yosemite)?
  • If on Linux, do you have an HPC queueing system (e.g. Torque or Slurm)?
  • How much RAM is available?
  • Are there any filesystem limits or quotas, especially with respect to /tmp/?
  • How much free disk is available?
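
If it helps, here is a small sketch that answers a few of these (platform, temp dir, free disk, RAM) in one go; the RAM lines use POSIX-only sysconf calls:

```python
import os
import platform
import shutil
import tempfile

print("platform :", platform.platform())
print("tempdir  :", tempfile.gettempdir())

# Free disk under the temp dir, where QIIME 2 unpacks .qza files.
total, used, free = shutil.disk_usage(tempfile.gettempdir())
print("free disk: %.1f GiB" % (free / 2**30))

# Physical RAM (POSIX systems only).
ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print("RAM      : %.1f GiB" % (ram / 2**30))
```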

Thanks!

@ebolyen
Just in case I missed anything, I have attached all the information about my iMac below. I actually store all my fastq.gz, .qza, and .qzv files on a 2TB LaCie external drive. Please see the pics below. If that is the issue, please let me know what I need to do to run the data.

  1. System information
  2. Displays
  3. Storage
  4. Memory

Thanks! Bing

Hey everyone, just giving some updates:

We’ve got the datasets and have been trying to reproduce the issues, but so far very little luck. I’m currently trying to use a bisection method with ulimit -v under the current assumption that memory is somehow playing a role here.
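
For anyone following along, here is roughly the shape of that experiment as a Python sketch; the limit value and the command are stand-ins, not our actual test harness:

```python
import resource
import subprocess

# Cap the child's virtual address space (the same thing `ulimit -v` limits).
# Note that setrlimit takes bytes, whereas `ulimit -v` takes kilobytes.
LIMIT_BYTES = 4 * 1024**3  # 4 GiB, an arbitrary bisection point

def cap_memory():
    resource.setrlimit(resource.RLIMIT_AS, (LIMIT_BYTES, LIMIT_BYTES))

# Stand-in command; substitute the real dada2 invocation here.
subprocess.run(["qiime", "--help"], preexec_fn=cap_memory)
```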

@Bing’s dataset is still running, so no real info on that yet.

This is going to take a while (and it didn’t help that I initially thought that -v took bytes instead of kilobytes). I’ll keep you posted on any new info. @thermokarst was able to trigger an unhandled malloc warning on the Atacama tutorial data with a smallish EC2 instance, but it finished anyway and the results looked sane.

An off-topic reply has been split into a new topic: Incorrect version of dada2 Installed

Please keep replies on-topic in the future.

More updates:

We can’t seem to reproduce these issues. We’ve limited memory in a lot of ways via ulimit -v (which also limits paging) and while we’ve seen several unique errors from R, nothing really matches the issue.

It is also possible there are transient issues with storage causing this, but we don’t have a way to reproduce that either.

I guess the important thing is there doesn’t seem to be anything wrong with the structure of the .qza input data at this time.

Hi @ebolyen,
I reran our data with DADA2 and the same error happened again! Then I tried the method that @apzlo used and moved all the files into the same folder. Finally it worked. I am just curious whether it is because I was running the analysis from an external USB drive rather than the local drive of my computer?

Thanks

Thanks for the update @Bing!

Hmm, that could very well be the problem. Where are you running the analysis? Are you inside the directory that is mounted from your USB drive?

What should be happening is that QIIME 2 reads the .qza file and unzips it to your system's temporary directory. Out of curiosity, what does this command print inside your environment?

python -c "import tempfile; print(tempfile.gettempdir())"

Does that happen to be your USB drive, or is it some crazy path like
/var/folders/w5/t9wf37ss1cg6s6lhtt8s33gw0000gn/T (which is what I would expect)?
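
For reference, tempfile.gettempdir() honors the TMPDIR environment variable, so it is possible to steer where things get unpacked; a minimal sketch (the target directory is hypothetical, and it only helps if set before QIIME 2 starts):

```python
import os
import tempfile

# Point the temp dir somewhere explicit (hypothetical path).
os.environ["TMPDIR"] = os.path.expanduser("~/qiime2-tmp")
os.makedirs(os.environ["TMPDIR"], exist_ok=True)

tempfile.tempdir = None  # clear the cached value so TMPDIR is re-read
print(tempfile.gettempdir())
```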

I suppose another possibility is that when QIIME 2 tries to unzip from your USB drive, some kind of disk failure/fault occurs that Python's zip library isn't catching.

Thanks again for the update!

@ebolyen

Hmm, that could very well be the problem. Where are you running the analysis? Are you inside the directory that is mounted from your USB drive?

I created a folder on my USB drive that is used only for the data analysis. I put all the .fastq files in a folder based on the tutorial, and all the analysis runs in that folder. Does that mean I need to create a directory using mkdir?

Out of curiosity what does this command do inside your environment?
    python -c "import tempfile; print(tempfile.gettempdir())"

Do I need to run this command in the qiime2 environment or in a plain terminal window?

Does that happen to be your USB drive, or is it some crazy path like:
/var/folders/w5/t9wf37ss1cg6s6lhtt8s33gw0000gn/T (which is what I would expect)?

This is exactly the kind of path that showed up in the errors. If that is the case, what do you suggest to avoid this?

Thanks,
Bing

Nope, that should work fine; mkdir doesn't do anything special.

Good point. That answers the question then; you don't need to run anything. QIIME 2 is working off of your system drive, which probably means the issue comes from some kind of semi-corrupted unzip. (Unless your system drive is the unstable one and we're just getting lucky.)

So I think we can probably point this at hardware failure, plus something in either the framework or Python 3's standard library that isn't surfacing the failure as an exception and instead keeps running, resulting in "missing files". We're seeing another kind of error right now related to the zip files, so maybe something non-standard is happening.

Since you don't have this issue when working on your system's hard drive, I would stick with that. (You may also want to back up your current USB drive; it may be failing.)
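
If you ever want to sanity-check an archive yourself, the standard library can verify the CRCs stored in the zip; a minimal sketch (the .qza file name is hypothetical):

```python
import zipfile

qza_path = "demux-paired-end.qza"  # hypothetical file name

with zipfile.ZipFile(qza_path) as zf:
    # testzip() reads every member and checks its CRC; it returns the name
    # of the first corrupt member, or None if everything checks out.
    bad = zf.testzip()
    if bad:
        print("first corrupt member:", bad)
    else:
        print("all CRCs OK")
```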

P.S. Totally unrelated, but on this forum you can just highlight text while you have the reply editor open to quote it (it back-links and everything)!

@Bing, is it possible for you to try out this same analysis on a different USB drive? This would let us rule out hardware failure if it succeeds. If you have time/resources to do this we’d really appreciate it! We’d like to track down the source of the issue since it has affected multiple users.

Also, what is the output from running these commands (with your USB drive plugged in)?

df -h
mount

Thanks!

I will try this on another USB drive soon and report the results! I will also try the commands after I am back at school!

An off-topic reply has been split into a new topic: DADA2 Duplicate Sample IDs

Please keep replies on-topic in the future.

Here are the results:

```
Last login: Sun May 7 23:45:33 on console
Jinbings-iMac:~ jinbingbai$ df -h
Filesystem      Size   Used  Avail Capacity    iused      ifree %iused  Mounted on
/dev/disk0s2   931Gi  287Gi  644Gi    31%    2138066 4292829213    0%   /
devfs          182Ki  182Ki    0Bi   100%        630          0  100%   /dev
map -hosts       0Bi    0Bi    0Bi   100%          0          0  100%   /net
map auto_home    0Bi    0Bi    0Bi   100%          0          0  100%   /home
/dev/disk1s1   1.5Ti   59Gi  1.4Ti     4%        786 4294966493    0%   /Volumes/LaCie
/dev/disk1s2   373Gi  145Mi  372Gi     1%          0          0  100%   /Volumes/LACIE SHARE

Jinbings-iMac:~ jinbingbai$ mount
/dev/disk0s2 on / (hfs, local, journaled)
devfs on /dev (devfs, local, nobrowse)
map -hosts on /net (autofs, nosuid, automounted, nobrowse)
map auto_home on /home (autofs, automounted, nobrowse)
/dev/disk1s1 on /Volumes/LaCie (hfs, local, nodev, nosuid, journaled, noowners)
/dev/disk1s2 on /Volumes/LACIE SHARE (msdos, local, nodev, nosuid, noowners)
Jinbings-iMac:~ jinbingbai$
```

Here are the results:

```
Jinbings-iMac:~ jinbingbai$ source activate qiime2-2017.4
(qiime2-2017.4) Jinbings-iMac:~ jinbingbai$ python -c "import tempfile; print(tempfile.gettempdir())"
/var/folders/sd/vlrr9d916gg5qj3yzzf3srfw0000gn/T
```

So do I need to fix anything?

Thanks

Thanks for that info @Bing! Which volumes were you using (/Volumes/LaCie or /Volumes/LACIE SHARE) in your analyses that were failing?

@jairideout
I used /Volumes/LaCie.

Thanks

Thanks @Bing! Please let us know how things go when you have a chance to try a different USB drive.

(The underlying issue may be related to this bug).

Hello @jairideout

I tried another 500GB USB external drive, and also the internal drive of my iMac with 700GB available. Both attempts failed with the same issue as reported before. The interesting thing is that the missing file is different each time. I finally gave up trying. What I did instead is reinstall my iMac to see how that goes!!

Is there anything I can do to handle this issue if it happens again and again?

Thanks,
Bing
