This means I will use “qiime feature-classifier extract-reads” instead of Steps 4-6 of this tutorial. Is there any difference between “feature-classifier extract-reads” and Steps 4-6 when it comes to trimming sequence outside the primer region?
Any concerns or suggestions?
How long might "extract-reads" take? My MacBook has a 2.6 GHz Intel Core i7 processor and 16 GB of 2133 MHz LPDDR3 memory.
Hi @eDNA,
I think that @devonorourke used Steps 4-6 because the primers were absent from some reference sequences, so multiple sequence alignment and positional trimming were needed as a workaround.
We now have a function for this in RESCRIPt called trim-alignment. The difference from extract-reads is that it trims at a specific site in the alignment, whereas extract-reads only trims reads that contain the primer and discards the rest. So this workflow can now be accomplished more fully within QIIME 2.
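To make the distinction concrete, here is a toy Python sketch. It is not how extract-reads or trim-alignment are implemented. The function names, the made-up sequences, and the exact-match primer search are all assumptions for illustration; the real actions are far more sophisticated. It only shows why a reference that lacks the primer site is lost with primer-based extraction but kept with positional trimming of an alignment.

```python
# Toy illustration only: hypothetical helpers, not the QIIME 2 / RESCRIPt code.

def extract_by_primers(seqs, fwd_site, rev_site):
    """Keep the region between two primer sites; drop sequences missing either."""
    kept = {}
    for name, seq in seqs.items():
        start = seq.find(fwd_site)
        if start == -1:
            continue  # forward primer site absent: sequence is discarded
        end = seq.find(rev_site, start + len(fwd_site))
        if end == -1:
            continue  # reverse primer site absent: sequence is discarded
        kept[name] = seq[start + len(fwd_site):end]
    return kept


def trim_by_position(aligned, start, end):
    """Slice every aligned sequence at the same columns, then remove gaps."""
    return {name: seq[start:end].replace("-", "") for name, seq in aligned.items()}


# Made-up data: ref2 is truncated and has no forward primer site.
unaligned = {"ref1": "ACGTCCCAAATTGG", "ref2": "CCGAAATTGG"}
aligned = {"ref1": "ACGTCCCAAATTGG", "ref2": "----CCGAAATTGG"}

by_primer = extract_by_primers(unaligned, "ACGT", "TTGG")
by_position = trim_by_position(aligned, 4, 10)

print(sorted(by_primer))    # only ref1 survives
print(sorted(by_position))  # both references survive
```

In this sketch the alignment columns (4 and 10) play the role of the trimming site: once the references are aligned, the amplicon region can be cut out of every sequence, including those that never contained the primer.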
I am not sure, but BOLD is very large, so aligning the sequences, trimming them, and training the classifier will take a long time, and 16 GB of RAM is most likely not enough. @devonorourke has shared his trimmed sequences and pre-trained classifiers, which are linked from the tutorial. I recommend using those if you can, to save yourself a month or more of trouble!