Demultiplex the merged sequences with sample name in the header

Hi,
I have a merged FASTQ file with the sample name in the header, as shown below. The sample name in the header is the only identifier that distinguishes the sequences. How can I demultiplex this and import it into QIIME 2? I want to use DADA2 to pick rep_seqs later. Thanks in advance.

@M01056:153:000000000-AFD83:1:1101:16123:1874–Sample1
TACGTAGGGGGCGAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCACGTAGGCGGTCCTTCAAGTCGGAAGTGAAATCTCAAGGCTCAACCTTGAAATTGCTTTCGATACTGGGGGACTTGAGGCAGGTAGGGGAGTGTGGAATTCCTGGTGTAGCGGTGAAATGCGCAGATATCAGGAGGAACACCAGTGGCGAAGGCGGCACTCTGGGCCTGTACTGACGCTTAGGTGCGAAAGCGTGGGGAGCAAACAGG
+
GGHHDGHGGGGGGGGGGEEGFHHGGGGGHHHHHHHHGEEFHHHHEGHGGHFGHGGGGGHHHHHHFHGGCGGH7HHHHHGGHHHHHHGGHHHHGEGHHHGGHHHEHGGFHGGHCC?DGGHGGGGGGGHHGFFGGGHFFHHHHFGHGGHHFHFCEEDFFHHFGGGGGFDFFHFHGFGFGHGGGD0FFGHHGGGDHHCGEEEF4HGEG1E?FGGFGFD@EGDGGGFF3FGGGGGHCGEEE?HGGGHGHHHFGGBGB
@M01056:153:000000000-AFD83:1:1101:13352:1907–Sample2
TACGTAGGGTGCAAGCATTATCCGGATTTATTGGGCGTAAAGCGTCCGTCGGCGTTTTATCAAGTTTTGACTTTAATACTGGAGCTTAACTCCAGCTACAGGTTGAAATACTGATAGAATTGAGTTTACTAGGGGGAGCTGGAATTCTCGGTGGAGGAGTGAAATCCGTAGATATCGAGAGGAACACCATTCGCGAAGGCGGGCTCCTGGAGTATAACTGACGCTCAGGGACGAAAGTTTGGGTAGCAAAAGGG
+
GGHHHHHGGHGGHHHHHHHHHHHGGGGGHHHHHHHHGGGGHHHGGGGGHH;GGGGGGGHHHGHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHHHHGHHHGHHHHHHHHHHHHHHGHHGHHHHHHHHHHHHHHGHHHHHHHHHHHGGGGGHGHHGHGGGHHHHHHGGFGHGGGGFGHHGGGGGGGHFHHGHHHHHHHHHHGGGGGGHHGHGFGHGHHHHHHHHHGHGHHHHHHHHHH

Hey @Lu_Yang,

Sorry for the delayed response and thanks for the sample reads!

Based on those samples, it looks like the sample ID is delimited with a -- in the header? We don’t have anything in QIIME 2 that recognizes that, or anything that can demultiplex based on matching something in a FASTQ header.

Where did you get your data from? Does it exist in a different format by any chance?

Thanks!

Hi, @ebolyen,
Happy Thanksgiving! Thanks for your reply!
This data was merged with FLASH. I do not have the original samples. Over the past few days I have demultiplexed all of the samples myself, so the merged reads for each sample are now in separate fastq.gz/fastq files.
Is there any way to import these demultiplexed samples and then process them with DADA2 or Deblur? And is it appropriate to handle this data in QIIME 2?
Again, thanks.
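
For anyone landing here with the same kind of file, splitting on the header outside of QIIME 2 could look roughly like the sketch below. It is not a QIIME 2 feature. It assumes strict four-line FASTQ records and that the sample name is everything after the last occurrence of a delimiter you pass in (the thread suggests `--`, but check your own headers). The function names are made up for illustration, and all reads are held in memory, so a very large file would need a streaming variant.

```python
import gzip
import os
from collections import defaultdict


def split_fastq_by_header(handle, delimiter):
    """Group four-line FASTQ records by the sample name after `delimiter`."""
    records = defaultdict(list)
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Split on the LAST delimiter so hyphens earlier in the read ID survive.
        read_id, sep, sample = header.rpartition(delimiter)
        if not sep or not sample:
            raise ValueError(f"no sample name found in {header!r}")
        records[sample].append((read_id, seq, qual))
    return records


def write_per_sample(records, out_dir):
    """Write one gzipped FASTQ per sample and return {sample: path}."""
    paths = {}
    for sample, reads in records.items():
        path = os.path.join(out_dir, f"{sample}.fastq.gz")
        with gzip.open(path, "wt") as out:
            for read_id, seq, qual in reads:
                out.write(f"{read_id}\n{seq}\n+\n{qual}\n")
        paths[sample] = path
    return paths
```

Typical use would be to open the merged file in text mode, call `split_fastq_by_header(handle, "--")`, and pass the result to `write_per_sample` with an existing output directory.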

You too!

Perfect! You should be able to follow this section of the Importing Tutorial to get your data into a QIIME 2 artifact!

What sequencing technology did you use? We've mostly focused on Illumina amplicon sequencing at this point, but we're looking to add more types of analysis to QIIME 2 in the nearish future.

Let me know if that helps!
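
If the tutorial section in question is the manifest-based import (an assumption, since the link target is not shown here), the manifest for a folder of per-sample files could be generated with something like the sketch below. The column names `sample-id`, `absolute-filepath` and `direction` follow the CSV manifest layout from the tutorial of this era as best I recall, so please check them against the tutorial for your release. Taking the sample ID from the file name and the helper name `write_manifest` are illustrative choices, not anything QIIME 2 requires.

```python
import csv
import os


def write_manifest(fastq_dir, manifest_path):
    """Write a CSV manifest with one row per FASTQ file found in `fastq_dir`."""
    rows = []
    for name in sorted(os.listdir(fastq_dir)):
        for suffix in (".fastq.gz", ".fastq"):
            if name.endswith(suffix):
                # Assumption: the file name minus its extension is the sample ID.
                sample_id = name[: -len(suffix)]
                path = os.path.abspath(os.path.join(fastq_dir, name))
                # Merged reads are listed as a single "forward" read per sample.
                rows.append((sample_id, path, "forward"))
                break
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        writer.writerows(rows)
    return rows
```

The resulting file would then be handed to `qiime tools import`. The exact type and format options differ between releases, so take those from the tutorial rather than from this sketch.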

Hi, @ebolyen,
Thanks for your quick response.
My samples are 16S Illumina data. Based on your suggestion, could you give more detail about which format I should use? My data are demultiplexed, merged paired-end reads. Should I treat them as “Casava 1.8 single-end demultiplexed fastq”?
Would all later processing then treat the data as single-end? Is that appropriate? I am not sure.
I have tried importing the data as single-end and running DADA2 in single-end mode, but the result does not look right.
Is there a way around this?

Thanks!

Hey @Lu_Yang,

We actually have a bunch of features for joined data which will be in the next release. But basically, yes that will work soon, you'll just be specifying a slightly different semantic type (SampleData[JoinedSequencesWithQuality]).

DADA2 works best if it can denoise the forward and reverse reads independently before merging. It can kind of work with joined data, but it requires the overlapping quality scores to basically match the profile of the quality scores around it and that really depends on what read-joiner you used. (Then there's also the typical trimming and removal of non-biological sequence requirements.)
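
One rough way to see whether the joined region stands out is to look at the mean quality score at each position of the merged reads. The sketch below is only an illustration of that check, not something DADA2 or QIIME 2 does for you. It assumes strict four-line FASTQ records with no blank lines and Phred+33 encoding, and the function name is hypothetical.

```python
def mean_quality_by_position(handle, offset=33):
    """Return the mean Phred score at each read position of a FASTQ file."""
    totals = []  # sum of scores seen at each position
    counts = []  # number of reads long enough to cover each position
    for line_number, line in enumerate(handle):
        if line_number % 4 != 3:
            continue  # only every fourth line holds quality characters
        for i, ch in enumerate(line.rstrip("\n")):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]
```

A sudden jump or plateau in the middle of the profile, where the forward and reverse reads overlapped, would suggest the read-joiner rewrote the scores there.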

I would recommend waiting for the 2017.11 release where there should be a few other analysis options that you can use for this data.

Hi, @ebolyen,

WOW! Cool! Looking forward to that!
Thanks for your info. I will wait for the new release.
Have a great holiday!

The QIIME 2 2017.11 release has expanded support for analyzing paired-end reads! See the paired-end reads community tutorial for more details.