What is the QIIME2 equivalent of split_sequence_file_on_sample_ids?


I’m preparing sequences for submission to NCBI SRA and it requires sequences to be split into files by sample. I used to do this with http://qiime.org/scripts/split_sequence_file_on_sample_ids.html, but that is no longer working. Is there an equivalent command in QIIME2?


QIIME2 2020.6

Hi, @bo_stevens :wave:

Welcome to the forum!

This question was discussed a bit in this Topic recently:

One bit of advice was missing from that thread: using a feature table to filter the sequences. Consulting the tutorials mentioned in the Topic I linked to above may be enough to get you where you need to go, but don’t hesitate to follow up if you run into any issues.

Thanks, not sure how I missed that!

So, yes, that post covers the same topic, but I don’t think the question was answered. If I have 42 samples in a single FASTQ file, I would like to split them into 42 corresponding files, one per sample. The QIIME 2 tools don’t appear to do that. The output in your example is only one file.

Hi Bo,

Can you confirm what kind of data you’re working with? Do you have a multiplexed FASTQ file? I.e., do you need to submit sequences with or without quality?

In your original question, you referenced FASTA files. In your follow-up, you referenced FASTQ files. In QIIME 2, FASTA files typically represent FeatureData[Sequence], while FASTQ files typically represent SampleData[SequencesWithQuality]. Forgive me if this is familiar territory, but I just want to make sure we’re on the same page so I can be helpful.
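In case it helps while we sort that out, here is a minimal sketch of doing the split outside of QIIME 2 with plain Python. It is not a QIIME 2 command, and it rests on assumptions that may not match your data: the input is an uncompressed, four-line-per-record FASTQ file, and each read label follows the QIIME 1 post-split-libraries convention `<SampleID>_<read number>`. The function name `split_fastq_by_sample` and the output file naming are hypothetical, chosen only for illustration.

```python
import os
from collections import defaultdict
from itertools import islice


def split_fastq_by_sample(fastq_path, out_dir):
    """Write one FASTQ file per sample ID and return {sample_id: read_count}.

    Assumption: read labels look like '@<SampleID>_<n> ...', so the sample ID
    is everything before the last underscore of the first header token.
    A label without an underscore is used whole as the sample ID.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}               # sample_id -> open output file
    counts = defaultdict(int)  # sample_id -> number of reads written
    try:
        with open(fastq_path) as fastq:
            while True:
                # A FASTQ record is assumed to span exactly four lines.
                record = list(islice(fastq, 4))
                if not record:
                    break
                fields = record[0][1:].split()
                if len(record) < 4 or not record[0].startswith("@") or not fields:
                    raise ValueError("malformed FASTQ record: %r" % record[:1])
                sample_id = fields[0].rsplit("_", 1)[0]
                if sample_id not in handles:
                    out_path = os.path.join(out_dir, sample_id + ".fastq")
                    handles[sample_id] = open(out_path, "w")
                handles[sample_id].writelines(record)
                counts[sample_id] += 1
    finally:
        # Close every per-sample file, even if a record was malformed.
        for handle in handles.values():
            handle.close()
    return dict(counts)
```

Calling `split_fastq_by_sample("seqs.fastq", "per_sample")` (both paths are placeholders) would leave one `<SampleID>.fastq` file per sample in `per_sample/`, so 42 samples would give 42 files. If your labels use a different convention, only the line that derives `sample_id` should need to change.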
