Not able to run qiime dada2 denoise-paired on ec2 instance

gregcaporaso · April 4, 2023, 8:48pm

Hi @danielavarelat,
Glad to hear that you were able to get past the DADA2 step!

Memory issues when classifying with Silva are a known problem. Some relatively recent discussion of this, along with tips, is consolidated in this post by @Nicholas_Bokulich.

@Mehrbod_Estaki also suggested that filtering low-abundance features might help at this stage. You could do that with qiime feature-table filter-features --p-min-samples 2 ... (to retain only features/ASVs that are present in at least two samples). You would run that filter on your feature table first, and then remove the dropped features from your repseq.qza file using qiime feature-table filter-seqs --i-table .... This type of filter can sometimes reduce the feature count by as much as half, which can help a bit with memory.
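Spelled out, the two filtering steps might look something like the sketch below. The file names table.qza, filtered-table.qza, and filtered-repseq.qza are placeholders I'm assuming for illustration (only repseq.qza comes from your description), and the small run helper is just a convenience I added so the commands are printed for review by default. Set RUN=1 to actually execute them in an environment where QIIME 2 is installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed (hypothetical) file names; substitute your own artifacts.
TABLE="table.qza"
REPSEQS="repseq.qza"
FILTERED_TABLE="filtered-table.qza"
FILTERED_REPSEQS="filtered-repseq.qza"

# Helper (not part of QIIME 2): print each command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# 1. Keep only features/ASVs observed in at least two samples.
run qiime feature-table filter-features \
  --i-table "$TABLE" \
  --p-min-samples 2 \
  --o-filtered-table "$FILTERED_TABLE"

# 2. Keep only the representative sequences still present in the filtered table.
run qiime feature-table filter-seqs \
  --i-data "$REPSEQS" \
  --i-table "$FILTERED_TABLE" \
  --o-filtered-data "$FILTERED_REPSEQS"
```

You would then pass the filtered sequences, rather than the original repseq.qza, to the classification step.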

Another alternative would be to use a different reference database for classification, such as Greengenes2 (classifiers available here) or GTDB (see here for details on how to train one of those).