Randomly subsampling a FASTA file

Hello folks,

My name is Fernando and I have recently moved to QIIME 2. I need to subsample a .fasta file with thousands of sequences. In the first QIIME pipeline I found a script for that: subsample_fasta.py. I would like to know if there is a similar script in QIIME 2. I can't find one yet…

Thank you very much for your support
F

Hello Fernando,

I think this QIIME 2 plugin will subsample a single FASTA file.
EDIT: That plugin expects quality score information, so it won't work with an imported FASTA file.

Keep in mind, this is meant to be used on a FASTA file you have already imported into QIIME 2. If you just want to subsample a FASTA file, using
vsearch --fastx_subsample FILENAME --fastaout FILENAME --sample_pct 1
is probably much simpler (--sample_pct 1 keeps a random 1% of the sequences).
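
If vsearch is not at hand, the same idea can be sketched in plain Python with only the standard library. This is a minimal illustration, not the QIIME 1 script or any QIIME 2 plugin: the function names `read_fasta` and `subsample_fasta` and the `fraction` and `seed` parameters are made up for this example, and it assumes the whole file fits in memory.

```python
import random


def read_fasta(path):
    """Yield (header, sequence) tuples from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None and line:
                # Sequence lines may be wrapped, so collect and join them.
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample_fasta(in_path, out_path, fraction, seed=None):
    """Write a random subset holding about `fraction` of the records.

    Hypothetical helper: loads all records, then samples without
    replacement. Returns the number of records written.
    """
    records = list(read_fasta(in_path))
    rng = random.Random(seed)  # pass a seed for a reproducible subset
    n_keep = round(len(records) * fraction)
    kept = rng.sample(records, n_keep)
    with open(out_path, "w") as out:
        for header, seq in kept:
            out.write(f">{header}\n{seq}\n")
    return len(kept)
```

Calling `subsample_fasta("seqs.fasta", "subset.fasta", 0.01, seed=42)` would keep roughly 1% of the records, comparable in spirit to `--sample_pct 1` above. The file names here are placeholders.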

Colin

Thanks @colinbrislawn! Unfortunately, that command won't work with this kind of data: it is designed to work with SampleData[PairedEndSequencesWithQuality | SequencesWithQuality], which are basically FASTQ files.

A FASTA file typically (but not exclusively) represents FeatureData[Sequence]. You can use this command to filter this type of data:

https://docs.qiime2.org/2018.11/plugins/available/feature-table/filter-seqs/
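
Since filter-seqs filters rather than subsamples, one way to get a random subset is to pick the sequence IDs yourself and hand that list to the filter. The sketch below only builds the ID list. The helper name `write_random_id_file`, the `feature-id` header line, and the idea of passing the resulting file as the metadata input of filter-seqs are assumptions for illustration, so please check them against the documentation linked above.

```python
import random


def write_random_id_file(fasta_path, ids_path, n_keep, seed=None):
    """Pick `n_keep` random sequence IDs from a FASTA file and write them
    to a one-column, tab-separated file.

    Hypothetical helper. The "feature-id" header is an assumption about
    what a QIIME 2 metadata file expects as its ID column name.
    """
    ids = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # The ID is the first whitespace-delimited token after ">".
                parts = line[1:].split()
                if parts:
                    ids.append(parts[0])
    rng = random.Random(seed)  # pass a seed for a reproducible selection
    chosen = rng.sample(ids, min(n_keep, len(ids)))
    with open(ids_path, "w") as out:
        out.write("feature-id\n")
        for seq_id in chosen:
            out.write(seq_id + "\n")
    return chosen
```

The FASTA file read here should be the same one that was imported as FeatureData[Sequence], so that the IDs match the ones inside the artifact.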

Thanks! :qiime2: :t_rex:
