I tried "qiime phylogeny align-to-tree-mafft-fasttree --i-sequences rep-seqs-nonchimeric.qza --o-alignment alignded-rep-seqs.qza --o-masked-alignment masked-aligned-rep-seqs.qza --o-tree unrooted-tree.qza --o-rooted-tree rooted-tree.qza"
and got
"Plugin error from phylogeny:
Command '['mafft', '--preservecase', '--inputorder', '--thread', '1', '/tmp/qiime2-archive-b3ktqfzt/243e8d70-800b-4fb1-8eaa-c32d62c3829a/data/dna-sequences.fasta']' returned non-zero exit status 1
"
after de novo chimera filtering with vsearch.
What can I do now?
Could you post the full text of your error message or any log files you have? The full error text or log file should include more clues about the 'non-zero exit status' that will help us solve this problem.
This error can occur if the job runs out of memory or, on a cluster, out of walltime.
What kind of data are you working with? If a reference tree exists for your amplicon, you could always use q2-fragment-insertion for very large datasets (which are common if you aren't using a denoising algorithm like DADA2 or Deblur).
Hi.
Virtual machine (linux-64) on Windows Server 2012 with 32 GB of memory.
table-nonchimeric.qza - 49 MB
rep-seqs-nonchimeric.qza - 79 MB
Is it too big to build a distance matrix?
No, that's an excellent machine. I don't think I have a good explanation for why this happened.
If you re-run the command, does it still fail? Has the virtual machine been allocated enough of the hardware? (I assume the answer is yes, since you're running a Windows Server install, but it doesn't hurt to check.)
Yikes, alright. Would you be able to send me a DM with your rep-seqs-nonchimeric.qza? I'll see what I can do to reproduce and figure out what's going on (I won't be able to start that until next week).
You have around 1 million input sequences. The sequences look fine, but this is going to have a high memory demand.
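A rough way to see why the number of sequences matters more than the file size: count the records in the FASTA file and estimate how large a dense table of pairwise distances would be. This is only a back-of-the-envelope sketch. It assumes the sequences have been exported to a plain FASTA file (for example with `qiime tools export`), the function names are made up for illustration, and the 4 bytes per distance is an assumption that says nothing about how MAFFT actually stores its data.

```python
import os
import tempfile


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines (those starting with '>')."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def dense_distance_matrix_gib(n_seqs, bytes_per_value=4):
    """Size in GiB of the n*(n-1)/2 pairwise distances.

    bytes_per_value=4 is an illustrative assumption, not a MAFFT setting.
    """
    n_pairs = n_seqs * (n_seqs - 1) // 2
    return n_pairs * bytes_per_value / 2**30


# Toy FASTA file so the sketch runs on its own; replace with your exported file.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1\nACGT\n>seq2\nACGA\n>seq3\nTTGA\n")
    toy_path = tmp.name

n_toy = count_fasta_records(toy_path)
os.remove(toy_path)

print(n_toy)
print(round(dense_distance_matrix_gib(1_000_000)), "GiB for 1 million sequences")
```

Under that assumption, one million sequences would need on the order of 1.8 TiB for all pairwise distances, far beyond 32 GB, whereas cutting the input to a tenth of the sequences shrinks the figure about a hundredfold.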
Since you are doing OTU clustering instead of denoising, I strongly recommend removing low-frequency sequences from both your feature table and your representative sequences before proceeding. This will dramatically reduce memory requirements. That is not the main reason I recommend it, though: it is best practice, since low-frequency OTUs are usually noisy, erroneous sequences, which can negatively impact results.
Do you want to give that a try and see if it also eliminates this memory issue?
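To make the filtering idea concrete, here is a minimal sketch of the logic on toy data: drop every feature whose total count across samples falls below a threshold, then keep only the sequences that are still in the table. The function name, the toy data, and the threshold of 10 are all hypothetical. In QIIME 2 itself, the corresponding steps are `qiime feature-table filter-features` (with `--p-min-frequency`) followed by `qiime feature-table filter-seqs` using the filtered table; the right threshold depends on your data.

```python
def filter_low_frequency(table, sequences, min_frequency):
    """Drop features whose total count across all samples is below min_frequency.

    table:     {feature_id: {sample_id: count}}
    sequences: {feature_id: sequence string}
    Returns the filtered table and the matching subset of sequences.
    """
    kept_table = {
        fid: counts
        for fid, counts in table.items()
        if sum(counts.values()) >= min_frequency
    }
    # Keep only the sequences whose feature survived the table filter.
    kept_seqs = {fid: seq for fid, seq in sequences.items() if fid in kept_table}
    return kept_table, kept_seqs


# Toy data (hypothetical): otu2 is a singleton and should be removed.
toy_table = {
    "otu1": {"sampleA": 120, "sampleB": 80},
    "otu2": {"sampleA": 1, "sampleB": 0},
    "otu3": {"sampleA": 4, "sampleB": 7},
}
toy_seqs = {"otu1": "ACGTACGT", "otu2": "ACGTTCGT", "otu3": "TTGTACGA"}

kept_table, kept_seqs = filter_low_frequency(toy_table, toy_seqs, min_frequency=10)
print(sorted(kept_table))
print(sorted(kept_seqs))
```

Filtering the table first and then subsetting the sequences to match keeps the two in sync, so the alignment and tree are built only from features that remain in the table.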