Hello! I have downloaded a single fastq.gz file that contains both forward and reverse reads from paired-end runs, for several different samples. Is there a way to import it into QIIME 2 without having R1 and R2 in separate files, or should I run some other software first?
How are the reads differentiated? Do the FASTQ headers include information on which direction the read was sequenced in? Can you share the first 10-20 lines of the file? Thanks!
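In case it helps, here is a small sketch for printing the first lines of a gzipped FASTQ without decompressing the whole file. The function name `peek_fastq` and the file name in the usage comment are placeholders, not anything from your data.

```shell
# Print the first N lines (default 20, i.e. five 4-line FASTQ records)
# of a gzip-compressed file to the terminal.
peek_fastq() {
    gzip -cd "$1" | head -n "${2:-20}"
}

# Usage (hypothetical file name):
#   peek_fastq reads.fastq.gz 20
```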
Thanks @botellaflotante! It looks like you have what is known as an "interleaved" file. Unfortunately, we don't support this directly in QIIME 2 (the format has become less common these days, at least from what I have seen here on this forum).
Your QIIME 2 environment already comes with cutadapt, so you could give it a try there. Once the reads are demultiplexed and split up, you could follow a traditional QIIME 2 import.
But I just want to share what I've done in the past to quickly split apart interleaved fastq files, in case this serves as a useful backup, or perhaps some inspiration to learn how to use grep!
The first grep grabs the headers indicating forward or reverse, plus the 3 lines that follow each one.
The second grep removes the separator lines (the "--" lines that grep inserts between groups when the -A option is used).
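The two-step approach described above can be sketched roughly as follows. This is an illustration, not necessarily the exact command used: it assumes 4-line records and SRA-style headers such as "@SRR0000000.5.1 5 length=150" (forward) and "@SRR0000000.5.2 5 length=150" (reverse), and the function and file names are made up. The pattern includes a trailing space so that only the ".1" or ".2" directly before the first space is matched.

```shell
# Split an interleaved FASTQ into forward and reverse files.
# Assumptions: 4 lines per record; forward headers end in ".1 " and
# reverse headers end in ".2 " before the rest of the header text.
# Usage: split_interleaved IN.fastq FORWARD.fastq REVERSE.fastq
split_interleaved() {
    # First grep: matching header lines plus the 3 lines after each.
    # Second grep: drop the "--" separators that -A puts between groups.
    grep -A 3 '^@.*\.1 ' "$1" | grep -v '^--$' > "$2"
    grep -A 3 '^@.*\.2 ' "$1" | grep -v '^--$' > "$3"
}

# Usage (hypothetical file names):
#   gzip -cd interleaved.fastq.gz > interleaved.fastq
#   split_interleaved interleaved.fastq forward.fastq reverse.fastq
```

It is worth checking afterwards that both output files have the same number of lines and that the count is a multiple of 4. If your headers look different, the patterns will need adjusting.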
Thank you! I think qiime cutadapt does not allow the interleaved option, but the grep did the job! The file was interleaved because I downloaded many reads from SRA, and that download produced a single interleaved file. I am assuming there is no need to trim any adapters from these...
I actually found an error with this. I think the pattern in the first grep should be '@.*\.1 ' (with a trailing space); otherwise the output is wrong and doesn't work. Also, '.1' alone will cause headers like '.1.2' or '.10.2' to be included in the forward reads…
Thanks again
Sure, you may need to adjust to optimize for the patterns that actually occur in your sequences... grep is a powerful tool, but not an intelligent one!
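One quick way to check a pattern before running it on the full file is to count how many of a few sample headers it matches. A small sketch, assuming SRA-style headers (the accession below is made up, and only the first header is a forward read):

```shell
# Three sample headers: one forward read, two reverse reads.
headers='@SRR0000000.1.1 1 length=150
@SRR0000000.1.2 1 length=150
@SRR0000000.10.2 10 length=150'

# Loose pattern: the unescaped "." matches any character, so every
# header containing "<anything>1" is counted.
loose=$(printf '%s\n' "$headers" | grep -c '.1')

# Stricter pattern: a literal ".1" followed by a space, on a line
# starting with "@".
strict=$(printf '%s\n' "$headers" | grep -c '^@.*\.1 ')

echo "loose=$loose strict=$strict"   # prints: loose=3 strict=1
```

The loose pattern matches all three headers, including both reverse reads, while the stricter one matches only the forward header.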