Merging Fastq.gz files from WGS Sequencing

eburchard · October 19, 2018, 2:14pm

Hello all,

I have several fastq.gz files from multiple WGS sequencing runs that I would like to merge. Can I do this in the same way that you would merge any other read files? For example, my files look something like the following:

B1_run1_lane1_R1.fastq.gz
B1_run1_lane1_R2.fastq.gz
B1_run2_lane1_R1.fastq.gz

etc...

I would like to merge all the R1 and R2 files into single files so that each sample has only two files (forward and reverse).

Thanks in advance, this forum has been enormously helpful to me!

mouldinator · October 19, 2018, 3:37pm

hey!
If you import them using the manifest method (really easy) into a .qza, they become a single artifact you can use in QIIME 2 =D If you need them for something that requires fastq.gz format, then just export the artifact to one big ole fastq (sorry, there's not a one-step method I can think of off the top of my head).

Nicholas_Bokulich · October 22, 2018, 6:16pm

So the issue is that you have replicates for each sample and you want these merged together?

As @mouldinator mentioned, importing fastq data into QIIME 2 using the manifest format or the Casava 1.8 format will cause these files to be imported into a single artifact. Downstream analysis should be streamlined so you do not need to worry about juggling multiple files, and merging multiple runs will happen further downstream.
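
For illustration, here is a minimal Python sketch of building a manifest for files named like the ones in the original post. It rests on several assumptions: the header `sample-id,absolute-filepath,direction` is the comma-separated manifest layout as documented for QIIME 2 releases around the time of this thread (check the importing tutorial for your version), the function name `build_manifest` is hypothetical, and each sample/run/lane combination is given its own ID, since whether a manifest accepts repeated IDs is not something this thread confirms.

```python
import csv
import io
import re
from pathlib import Path

# Assumed naming scheme, based on the example filenames in the original post:
# <sample>_<run>_<lane>_<R1|R2>.fastq.gz
NAME_RE = re.compile(r"^(?P<unit>.+)_(?P<read>R[12])\.fastq\.gz$")
DIRECTION = {"R1": "forward", "R2": "reverse"}


def build_manifest(in_dir):
    """Return manifest text with one row per fastq.gz file found in in_dir."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Assumed header for the comma-separated manifest layout.
    writer.writerow(["sample-id", "absolute-filepath", "direction"])
    for path in sorted(Path(in_dir).glob("*.fastq.gz")):
        match = NAME_RE.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed naming scheme
        # Each sample/run/lane combination becomes its own ID in this sketch.
        writer.writerow([match["unit"], str(path.resolve()), DIRECTION[match["read"]]])
    return buf.getvalue()
```

Writing the returned text to a file (for example `Path("manifest.csv").write_text(build_manifest("reads"))`, where both names are placeholders) gives something to pass to the import step.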

Otherwise, QIIME 2 is probably not the tool for the job, e.g., if you want to merge before inputting to a different program. In that case, you should probably just use some basic bash commands to gunzip, cat, and gzip based on filename and read direction.
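
The same idea can be sketched in Python for anyone who would rather script it. This is a minimal sketch under stated assumptions, not a tested pipeline: the filename pattern is inferred from the examples in the original post, and `merge_fastq_gz` is a hypothetical helper name. It relies on the fact that gzip files can be concatenated byte for byte into a valid multi-member gzip file, so no decompress and recompress step is needed. Files are appended in sorted filename order, which keeps R1 and R2 in matching order only if every run and lane has both read files.

```python
import re
import shutil
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme, based on the example filenames in the original post:
# <sample>_run<N>_lane<N>_<R1|R2>.fastq.gz
NAME_RE = re.compile(r"^(?P<sample>.+?)_run\d+_lane\d+_(?P<read>R[12])\.fastq\.gz$")


def merge_fastq_gz(in_dir, out_dir):
    """Concatenate per-run fastq.gz files into one file per sample and read direction."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group input files by (sample, read direction), in sorted filename order.
    groups = defaultdict(list)
    for path in sorted(in_dir.glob("*.fastq.gz")):
        match = NAME_RE.match(path.name)
        if match:
            groups[(match["sample"], match["read"])].append(path)

    merged = {}
    for (sample, read), paths in sorted(groups.items()):
        target = out_dir / f"{sample}_{read}.fastq.gz"
        with open(target, "wb") as out:
            for path in paths:
                # Raw byte copy: concatenated gzip members form a valid gzip file.
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[(sample, read)] = target
    return merged
```

Calling `merge_fastq_gz("raw_reads", "merged_reads")` (both directory names are placeholders) would leave files such as `B1_R1.fastq.gz` and `B1_R2.fastq.gz` in the output directory. It is worth confirming that the merged R1 and R2 files contain the same number of reads before passing them to a paired-end tool.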

Good luck!

system · November 23, 2018, 12:16am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.