Hi again @jwdebelius,
I am working with a Jupyter Notebook inside VSCode (sorry for not detailing this in my previous message). I have been using the sidle-reconstruction pipeline from q2-sidle v0.1.0-beta, and I would like to share some tests I have performed to avoid the previously mentioned problems.
1. Modify sidle-reconstruction parameters: Based on your experience, my first attempt was to reduce --p-n-workers from >4 to 2 and to use a --p-block-size of 1000. The process was killed after 57 min. I tried the same approach with 3 and 4 workers, and both of those processes were killed too. Below is the code used for the run with 2 workers.
qiime sidle sidle-reconstruction \
--i-kmer-map silva_128_V2_map.qza silva_128_V3_map.qza silva_128_V4_map.qza silva_128_V67_map.qza silva_128_V8_map.qza silva_128_V9_map.qza \
--i-regional-alignment V2_aligment_map.qza V3_aligment_map.qza V4_aligment_map.qza V67_aligment_map.qza V8_aligment_map.qza V9_aligment_map.qza \
--i-regional-table V2_f_table.qza V3_f_table.qza V4_f_table.qza V67_f_table.qza V8_f_table.qza dada2_pyro_V9_table.qza \
--i-reference-taxonomy silva_128_ssu_nr99_tax_derep.qza \
--p-region V2 V3 V4 V67 V8 V9 \
--p-min-counts 0 \
--p-database 'silva' \
--p-block-size 1000 \
--p-n-workers 2 \
--o-database-map ./reconstructed_results/database_recons.qza \
--o-database-summary ./reconstructed_results/database_recons_summ.qza \
--o-reconstructed-table ./reconstructed_results/feature_table_recons.qza \
--o-reconstructed-taxonomy ./reconstructed_results/taxonomy_recons.qza
2. Dask: I am totally new to Dask, so I hope I have done the next steps properly. Below is an example of the general code I used to create the clusters.
from dask.distributed import Client, LocalCluster

# 2 workers, each limited to 7.5 GB of memory
cluster = LocalCluster(n_workers=2, memory_limit="7.5GB")
client = Client(cluster)
I used cluster.scheduler to extract the IP address needed for --p-client-address. I then reran the first chunk of code, replacing --p-n-workers and --p-block-size with --p-client-address.
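For reference, this is a minimal sketch of how the address can be read off the cluster object created above (assuming a default LocalCluster; its scheduler_address attribute holds the full URI):

# Assuming the cluster/client from the snippet above.
# scheduler_address is the full URI, e.g. "tcp://127.0.0.1:8786",
# which is the value I passed to --p-client-address.
print(cluster.scheduler_address)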
I have tried different numbers of workers (2, 3, and 4) and different memory limits adjusted to each number of workers. With these memory limits I tried to address the problems reported in the log file from my previous message. Here is an example of the scheduler created for 2 workers.
All attempts were killed after running for 20-25 min. The vast majority of the reported errors were related to "unmanaged memory" and/or "memory not released back to the OS" or similar. In two of these runs, my laptop screen froze and I had to restart the machine.
3. Trimming memory: The previous errors led me to this section of the Dask website, where they suggest some potential solutions for these problems. In summary, I tried the "Manually trim your memory" and "Automatically trim memory" options, but the same errors remained.
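For anyone who lands here with the same errors, this is essentially the manual-trim recipe from that Dask page, run against the client from step 2 (it assumes Linux workers with glibc, since it loads libc.so.6 directly):

import ctypes

def trim_memory() -> int:
    # Ask glibc to return freed heap memory to the OS
    libc = ctypes.CDLL("libc.so.6")
    return libc.malloc_trim(0)

# Run the trim on every worker in the cluster
client.run(trim_memory)

The automatic variant instead sets the MALLOC_TRIM_THRESHOLD_ environment variable for the workers before they start, so that glibc trims more aggressively on its own.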
For now, I have not tried trimming the database by using 3 degenerates instead of 5. An alternative solution I thought of is to move all the artifacts needed to run sidle-reconstruction (and the tree reconstruction) to a more powerful machine. I think that once I have all the reconstructed files, the next steps will not require as much memory.
Best,
Andrés