Aligned sequences and RESCRIPt

Hello @jwdebelius!

In general, the degap-seqs method is in there to handle aligned sequences, so you are correct that the workflow would be something like: import aligned seqs --> degap --> do whatever.
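In case it helps to see what "degap" means in practice, here is a minimal pure-Python sketch of the idea. To be clear, this is not RESCRIPt's implementation: the function names, the `min_length` parameter, and the assumption that gaps are written as `-` or `.` are all just mine for illustration.

```python
import io
from typing import Dict, Iterable, Iterator, TextIO, Tuple

# Assumption: gaps in the alignment are written as '-' or '.'.
GAP_CHARS = "-."


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (id, sequence) pairs from a FASTA-formatted text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def degap(aligned: Iterable[Tuple[str, str]], min_length: int = 1) -> Dict[str, str]:
    """Strip gap characters; drop sequences shorter than min_length afterwards."""
    table = str.maketrans("", "", GAP_CHARS)
    degapped = {}
    for seq_id, seq in aligned:
        stripped = seq.translate(table)
        if len(stripped) >= min_length:
            degapped[seq_id] = stripped
    return degapped


# Toy alignment: seq2 is all gaps, so it is dropped after degapping.
example = io.StringIO(">seq1\nAC--GT.A\n>seq2\n----\n")
degapped = degap(read_fasta(example))
```

The point is just that degapping throws away the alignment columns and leaves plain unaligned sequences, which is why "do whatever" afterwards works with anything that expects ordinary sequences.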

Unfortunately, that method only degaps aligned DNA sequences... although RESCRIPt defines an RNASequence type, we still need to add an AlignedRNASequence format.

In the case of SILVA, you can just use the unaligned sequences, right? You can also use the get-silva-data pipeline to reduce your blood pressure: it downloads and imports/formats the SILVA sequences and taxonomy.

See "getting SILVA data the easy way", by @SoilRotifer:

As far as I know, there are aligned and unaligned versions for the various SILVA releases. It would be neat to get more support for aligned sequences in there for other datasets and to support streamlined integration with plugins that use aligned reference sequences. Any interest in contributing a "good first issue" to RESCRIPt? :wink:

Not at all... maybe what you'd call a documentation hole. :hole:
Besides, I'm just glad RESCRIPt is getting the attention! :nerd_face:
