I'm processing 625 MiSeq amplicon sequences (~450 bp) and ran out of memory during dereplication with QIIME 2 vsearch. The Linux machine I'm using has 30 GB of memory plus 15 GB of virtual memory. It is dedicated to this task, so few other processes are running.
I've seen the two posts from January and February about memory limits and the inability to exclude low-frequency sequences. Is there any way to estimate how much virtual memory is needed for dereplicating with vsearch? I don't want to break up the data set, as it represents multiple years of sampling within a project, but perhaps that is the only choice.
Hey @colinbrislawn --- due to recent toolchain changes on bioconda, there are no longer osx builds --- this prompted us to set a hard version pin on vsearch: