Curation recommendations for UNITE database based on sequence length

Hi @Joeee,

That is a great question! To clarify, the tutorial itself says you are free to change the workflow outlined there:

"Feel free to modify any of the steps in this tutorial, or their order, to best suit your needs!"

Also keep in mind this tip from that tutorial:

"Note / Tip: Depending on your goals, it may also be reasonable to use the raw imported sequences, or output from either cull-seqs or filter-seqs-length[-by-taxon] as input into the above extract-reads command."

Remember that the tutorial is just showing what you can do. I rarely run filter-seqs-length-by-taxon when I make my own SILVA classifier, unless I really only want near full-length sequences. That is, if you are making an amplicon-region-specific classifier, you can skip the length-filtering step, as it may drop shorter reference sequences that still contain quality data across your amplicon region of interest.
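To make that point concrete, here is a minimal plain-Python sketch. It does not use RESCRIPt or QIIME 2, and the sequences, primers, and the 40 bp threshold are all made up for illustration. It only shows how a short reference sequence can fail a length filter while still spanning the whole amplicon region.

```python
# Toy illustration only: every sequence, primer, and threshold here is hypothetical.

def contains_amplicon(seq, fwd_primer, rev_primer_rc):
    """Return True if the forward primer site is followed by the
    (reverse-complemented) reverse primer site within seq."""
    start = seq.find(fwd_primer)
    if start == -1:
        return False
    return seq.find(rev_primer_rc, start + len(fwd_primer)) != -1


def passes_length_filter(seq, min_len):
    """Return True if seq is at least min_len bases long."""
    return len(seq) >= min_len


FWD = "ACGTAC"      # hypothetical forward primer
REV_RC = "TTGGCA"   # hypothetical reverse primer, already reverse-complemented
MIN_LEN = 40        # hypothetical "near full-length" cutoff

references = {
    "full_length": "GG" * 10 + FWD + "ATATATAT" + REV_RC + "CC" * 10,
    "short_but_covers_amplicon": "G" + FWD + "ATATATAT" + REV_RC + "C",
    "short_missing_region": "GGGG" + FWD + "ATAT",
}

# For each reference: (survives the length filter, contains the amplicon region)
summary = {
    name: (passes_length_filter(seq, MIN_LEN), contains_amplicon(seq, FWD, REV_RC))
    for name, seq in references.items()
}

for name, (kept, covers) in summary.items():
    print(f"{name}: kept_by_length_filter={kept}, covers_amplicon={covers}")
```

The middle sequence is the case I mean: a length filter would discard it even though it would still be perfectly usable for an amplicon-specific classifier.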

Unless you know why you need to filter by length based on taxonomy, I'd likely skip it. If you simply want to apply a single length cutoff to everything, it will be faster to use rescript filter-seqs-length. Also, since you are using UNITE, if you still want to filter by taxonomy, you'll need to change the taxonomy labels to what you need.
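Here is a rough plain-Python sketch of why the taxonomy labels matter. It is not RESCRIPt's implementation: the function name `filter_by_length`, the substring matching of labels, the example taxonomy strings, and the thresholds are all assumptions made for illustration. The idea is that a per-label minimum only applies to sequences whose taxonomy string contains that label, so labels that never occur in your taxonomy filter nothing.

```python
# Sketch under stated assumptions; not the actual RESCRIPt logic.

def filter_by_length(seqs, taxonomy, global_min=None, labels=(), min_lens=()):
    """Return IDs of sequences that meet the global minimum (if any) and the
    minimum length of every label found in their taxonomy string.
    Label matching by substring is an assumption of this sketch."""
    if len(labels) != len(min_lens):
        raise ValueError("labels and min_lens must have the same length")
    kept = []
    for seq_id, seq in seqs.items():
        thresholds = [] if global_min is None else [global_min]
        thresholds += [
            min_len
            for label, min_len in zip(labels, min_lens)
            if label in taxonomy[seq_id]
        ]
        if all(len(seq) >= t for t in thresholds):
            kept.append(seq_id)
    return kept


# Hypothetical UNITE-style records (lengths and lineages are invented).
seqs = {"ref1": "A" * 450, "ref2": "A" * 250, "ref3": "A" * 300}
taxonomy = {
    "ref1": "k__Fungi;p__Ascomycota",
    "ref2": "k__Fungi;p__Basidiomycota",
    "ref3": "k__Viridiplantae;p__Anthophyta",
}

# SILVA-style labels never match these taxonomy strings, so nothing is removed.
silva_style = filter_by_length(
    seqs, taxonomy,
    labels=("Archaea", "Bacteria", "Eukaryota"), min_lens=(900, 1200, 1400),
)

# A label that does occur in the taxonomy applies its threshold to matching sequences.
unite_style = filter_by_length(seqs, taxonomy, labels=("k__Fungi",), min_lens=(400,))

# One cutoff for everything, no taxonomy lookup needed.
global_only = filter_by_length(seqs, taxonomy, global_min=350)

print(silva_style, unite_style, global_only)
```

In this toy setup, the SILVA-style labels leave all three sequences in place, which is the silent failure to watch for if you reuse the tutorial's labels with UNITE.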

Hopefully this helps. :slight_smile:
