Hi, I am running Qiime2 version 2019.4 as a Miniconda3 environment on an Ubuntu Desktop app version 18.04 on a Windows 10 system.
I have been trying to make a phylogenetic tree from a merged Feature[Sequence] table I made.
I had Ion Torrent 16S sequencing data from multiple runs with MrDNA. I first input .fna and .qual files into split_libraries.py on Qiime1 and individually imported the resulting seqs.fna files into Qiime2. I dereplicated with vsearch dereplicate-sequences to get individual rep-seqs.qza files. Next I took those and did qiime vsearch cluster-features-open-reference with the Silva 132 release 97% .qza to get rep-seqs-or-97.qza files. Then I used qiime feature-table merge-seqs to merge the rep-seqs-or-97.qza files. Out of this, I got a merged-rep-seqs.qza, and everything was fine up until this point. The merged-rep-seqs.qza is what I have been trying to use for generating the tree.
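For readers following along, the steps described above can be sketched roughly as the commands below. This is a dry-run sketch, not the poster's actual commands: the names `run1`, `run2`, and `silva-132-97.qza` are hypothetical placeholders, the `run` helper is made up for illustration, and the option names should be checked against each command's `--help` in the installed release.

```shell
#!/bin/sh
# Dry run by default: commands are printed, and executed only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Hypothetical names: run1/run2 stand in for sequencing runs, and
# silva-132-97.qza for the imported SILVA 132 97% reference sequences.
REF=silva-132-97.qza
MERGE_ARGS=""
for r in run1 run2; do
    # Import the seqs.fna produced by split_libraries.py in QIIME 1.
    run qiime tools import --type 'SampleData[Sequences]' \
        --input-path "$r/seqs.fna" --output-path "$r-seqs.qza"
    # Dereplicate to get a feature table and representative sequences.
    run qiime vsearch dereplicate-sequences \
        --i-sequences "$r-seqs.qza" \
        --o-dereplicated-table "$r-table.qza" \
        --o-dereplicated-sequences "$r-rep-seqs.qza"
    # Open-reference clustering at 97% identity against the reference.
    run qiime vsearch cluster-features-open-reference \
        --i-table "$r-table.qza" --i-sequences "$r-rep-seqs.qza" \
        --i-reference-sequences "$REF" --p-perc-identity 0.97 \
        --o-clustered-table "$r-table-or-97.qza" \
        --o-clustered-sequences "$r-rep-seqs-or-97.qza" \
        --o-new-reference-sequences "$r-new-ref-seqs-or-97.qza"
    MERGE_ARGS="$MERGE_ARGS --i-data $r-rep-seqs-or-97.qza"
done
# MERGE_ARGS is intentionally unquoted so it splits into separate arguments.
run qiime feature-table merge-seqs $MERGE_ARGS --o-merged-data merged-rep-seqs.qza
```

With `DRY_RUN` unset the script only prints each command prefixed by `+`, which makes it easy to review the file names before running anything for real with `DRY_RUN=0`.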
I enter this:
(qiime2-2019.4) [email protected]:~/qiime2_passaic-comparison$ qiime phylogeny align-to-tree-mafft-fasttree --i-sequences merged-rep-seqs.qza --o-alignment aligned-merged-rep-seqs.qza --o-masked-alignment masked-aligned-rep-seqs.qza --o-tree unrooted-tree.qza --o-rooted-tree rooted-tree.qza
and the app never returns anything. It just blinks for a couple hours like it is still working and then the app window goes black.
When I re-ran the same command in the wrong directory:
(qiime2-2019.4) [email protected]:~$ qiime phylogeny align-to-tree-mafft-fasttree --i-sequences merged-rep-seqs.qza --o-alignment aligned-merged-rep-seqs.qza --o-masked-alignment masked-aligned-rep-seqs.qza --o-tree unrooted-tree.qza --o-rooted-tree rooted-tree.qza
I got this:
(1/1) Invalid value for "--i-sequences": 'merged-rep-seqs.qza' is not a QIIME 2 Artifact (.qza)
I can see that the .qza file I’m using should be the correct QIIME2 artifact:
qiime tools peek merged-rep-seqs.qza
Data format: DNASequencesDirectoryFormat
I’ve looked up the Ubuntu Desktop app and it says that it has 4 GB system memory. Is this not enough? I know for Qiime1, I allocated 4 GB memory and that was enough to go through an entire similar workflow. I’m wondering if this is a memory issue or something wrong with the file types, or if I missed something?
** Sorry, the 4 GB memory was just a recommended system requirement for the Ubuntu Desktop app. So I doubt that is the issue. I have 16 GB RAM.
Still, it could be a memory issue. If the same command runs successfully on a smaller data set but not on your current, larger one, then I think you don’t have enough RAM.
I tried your suggestion and merged only two instead of the four rep-seqs-or-97.qza files I have, then tried to make the tree with that. I’m still having the same problem. (It’s been 30 minutes, but I don’t expect it will finish.) I have data from only six samples on four different runs. One run has three samples. If it weren’t for that one run with multiple samples, I could make a manifest file and import that way. DADA2 doesn’t take the demultiplexed .fasta files I can make in Qiime1 for the multi-sample run, so I had to import everything as .fasta and use clustering. I thought I would be able to process 6 samples on my computer without a problem…?
Thank you for your detailed posts! Tree building can take some time, but let’s make 100% sure your computer is not running out of RAM.
In Ubuntu, search for and open up ‘system monitor’. Let us know the percent of memory and swap being used.
Hopefully, memory should be under 100% and swap should be 0%. If you see something different, we have found our problem!
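On a terminal-only setup, where there is no System Monitor window, the same two numbers can be read from the shell. Here is a minimal sketch, assuming a Linux-style `/proc/meminfo` is available; the function name `mem_swap_percent` is made up for illustration, and `free -h` reports similar information in a different layout.

```shell
#!/bin/sh
# Print the percentage of RAM and swap in use, read from a meminfo-style file.
# mem_swap_percent is a hypothetical helper name; it defaults to /proc/meminfo.
mem_swap_percent() {
    awk '
        /^MemTotal:/     { mt = $2 }
        /^MemFree:/      { mf = $2 }
        /^MemAvailable:/ { ma = $2 }
        /^SwapTotal:/    { st = $2 }
        /^SwapFree:/     { sf = $2 }
        END {
            # Fall back to MemFree when MemAvailable is not reported.
            if (ma == "") { ma = mf }
            if (mt > 0) { printf "memory used: %.0f%%\n", 100 * (mt - ma) / mt }
            if (st > 0) { printf "swap used: %.0f%%\n", 100 * (st - sf) / st }
            else        { print "swap used: 0% (no swap configured)" }
        }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    mem_swap_percent
fi
```

Running it in a second terminal while the tree command is working gives a quick reading of whether memory is close to 100% or swap is in use.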
I'm sorry, I guess it is not Ubuntu Desktop, it is just "Windows Ubuntu". It just looks like a terminal window. I don't know how to see CPU history, but I can see Memory and Swap. I included the Task manager from Windows while it is stuck running this command in the Ubuntu app. Is that helpful?
If I look higher on the Processes list in Task Manager, I see this disttbfast taking up the most memory.
Thanks for that troubleshooting Rachel!
Sounds like you are running Ubuntu directly through windows using WSL, instead of a VirtualBox or other VM. I’m not super familiar with Windows Subsystem for Linux (WSL), but this is a really great place to start!
Both those windows show 82% through 87% usage of RAM. You mentioned that your computer has 16 GB of RAM, but it looks like those GB are not getting seen by your PC.
This is a real mystery! Let’s see what the Qiime devs recommend!
Good! That means it is running! I think things are working fine, you just aren’t letting them run long enough to finish (this command can take a very long time). If you want to see live output, run the command with the --verbose flag. You might also want to run with multiple threads, depending on the capacity of your computation environment.
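As a sketch of that suggestion, the command from the first post could be rerun with `--verbose` and a thread count. This assumes the installed q2-phylogeny accepts `--p-n-threads` with a positive integer (worth confirming with `qiime phylogeny align-to-tree-mafft-fasttree --help`). The `pick_threads` helper and its cap of 4 threads are arbitrary illustrations, not a recommendation from this thread.

```shell
#!/bin/sh
# Pick a thread count: number of available cores, capped at 4 (arbitrary cap).
pick_threads() {
    n=$(nproc 2>/dev/null || echo 1)
    if [ "$n" -gt 4 ]; then n=4; fi
    echo "$n"
}

THREADS=$(pick_threads)

# Run only when qiime is on PATH and the input artifact is in this directory.
if command -v qiime >/dev/null 2>&1 && [ -f merged-rep-seqs.qza ]; then
    qiime phylogeny align-to-tree-mafft-fasttree \
        --i-sequences merged-rep-seqs.qza \
        --o-alignment aligned-merged-rep-seqs.qza \
        --o-masked-alignment masked-aligned-rep-seqs.qza \
        --o-tree unrooted-tree.qza \
        --o-rooted-tree rooted-tree.qza \
        --p-n-threads "$THREADS" \
        --verbose
else
    echo "qiime or merged-rep-seqs.qza not found; nothing run" >&2
fi
```

The guard on the input file also catches the case of launching the command from the wrong directory before any long-running work starts.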
It worked! It took 5 hours to run. I guess I just needed to be patient and adjust my sleep settings. I think I will be switching to a computing cluster soon.
Thank you everyone!