Feature-classifier extract-reads

Nicholas_Bokulich · January 23, 2021, 8:52am

How large is midori? My guess is "very large", because this method can take some time on very large databases.

Absolutely! Longer primers and more degeneracy will increase runtime even more.
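To get a rough feel for how much degeneracy a primer carries, you can count how many concrete sequences it stands for. This is only a sketch: it assumes runtime grows with the number of non-degenerate variants a primer expands to, and it says nothing exact about how extract-reads does its search. The function name and both primers are illustrative, not taken from your workflow.

```python
# IUPAC nucleotide codes mapped to the number of bases each one can stand for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "U": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def count_primer_variants(primer: str) -> int:
    """Return how many non-degenerate sequences a primer expands to."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_CHOICES[base]  # KeyError on a non-IUPAC character
    return total


# Illustrative primers only; substitute your own.
plain_primer = "GGTCAACAAATCATAAAGATATTGG"        # no degenerate positions
degenerate_primer = "GGWACWGGWTGAACWGTWTAYCCYCC"  # five W and two Y positions

print(count_primer_variants(plain_primer))       # 1
print(count_primer_variants(degenerate_primer))  # 128
```

Each two-fold position doubles the count, so a handful of degenerate bases adds up quickly.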

You can use the --p-n-jobs parameter to run this in parallel and reduce the wait (though at 48 hr in, I don't know whether it's better to just wait until completion, or how much longer you have left to wait).
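If you do restart with parallel jobs, something like the following would assemble the command. It is a minimal sketch: the file names and primer strings are placeholders I made up, the helper function is hypothetical, and leaving one core free is just my assumption about a sensible default. It only prints the command, it does not run QIIME 2.

```python
import os
import shlex


def build_extract_reads_cmd(sequences, f_primer, r_primer, output, n_jobs):
    """Assemble the extract-reads command line as a list of arguments."""
    return [
        "qiime", "feature-classifier", "extract-reads",
        "--i-sequences", sequences,
        "--p-f-primer", f_primer,
        "--p-r-primer", r_primer,
        "--p-n-jobs", str(n_jobs),
        "--o-reads", output,
    ]


# Hypothetical file names and placeholder primers; replace with your own.
n_jobs = max(1, (os.cpu_count() or 1) - 1)  # leave one core free (assumption)
cmd = build_extract_reads_cmd(
    sequences="midori-seqs.qza",
    f_primer="YOUR_FORWARD_PRIMER",
    r_primer="YOUR_REVERSE_PRIMER",
    output="midori-extracted-reads.qza",
    n_jobs=n_jobs,
)

# Print a shell-safe version to copy into a terminal.
print(shlex.join(cmd))
```

You could also pass `cmd` straight to `subprocess.run(cmd, check=True)` from an environment where QIIME 2 is installed.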

Generally trimming is beneficial: there are a few publications (for 16S) showing improved classification accuracy. It will also reduce runtime downstream, so it is a worthwhile step if this is a trimmed database that you will use again and again (for classification, alignment, filtering, or whatever).

For COI specifically, @devonorourke tested this here:

and describes steps to reproduce that database here: