How large is MIDORI? My guess is "very large", because this method can take some time on very large databases.
Absolutely! Longer primers and more degeneracy will increase runtime even more.
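To get a rough feel for how degenerate a primer is, you could count how many concrete sequences it expands to under the IUPAC nucleotide codes. This is only a sketch: the primer string below is made up for illustration, and `degeneracy` is a hypothetical helper of my own, not part of any QIIME 2 plugin.

```python
from math import prod

# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Return how many non-degenerate sequences a primer can represent."""
    return prod(len(IUPAC[base]) for base in primer.upper())


# Hypothetical primer: N (4 options) x R (2) x Y (2) = 16 variants.
print(degeneracy("ACGTNRY"))  # → 16
```

Each extra ambiguous position multiplies that count, so a long, highly degenerate primer represents far more variants than a short, specific one.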
You can use the --p-n-jobs parameter to run this in parallel and reduce the wait (though at 48 hr in, I don't know whether it's better to just wait for completion, or how much longer you have left to wait).
Generally, trimming is beneficial: there are a few publications (for 16S) showing improved classification accuracy. It will also reduce runtime downstream, so it is a worthwhile step if this is a trimmed database that you will use again and again (for classification, alignment, filtering, or whatever).
For COI specifically, @devonorourke tested this here:
https://www.biorxiv.org/content/10.1101/2020.10.05.326504v1
and describes steps to reproduce that database here: