I'm a newbie in this field and I'm trying to convert a bunch of these demultiplexed .fastq files to a single .qza file.
The publishers of the dataset state that they followed the Earth Microbiome Project protocol, which I think means that they did paired-end sequencing:
Samples were sequenced according to the Earth Microbiome Project protocols (Gilbert et al., 2014). Briefly, DNA was extracted using a MoBio Power soil kit (Carlsbad, CA), and the V4 region of the 16S rRNA gene was amplified using barcoded primers (Walters et al., 2015). Sequencing was performed using an Illumina MiSeq.
However, when I downloaded the data from the ENA Browser page (Bulk Download Files button -> in submitted form), I couldn't find any trace of paired-end sequencing.
These are the files:
This is what's inside:
How can I distinguish between forward and reverse reads, and what am I missing?
Qiita - link with the metadata, but no raw sequences
ENA Browser - only raw sequences, downloaded in the Submitted file format
Welcome to the forum @aleksa.krsmanovicc!
I am reclassifying your post as “other bioinformatics tools” because your question seems to concern EBI and this specific dataset more than it does QIIME 2. I do see a QIIME 2 importing question, though, which I will preemptively answer (if you run into trouble with importing, please open a new, separate topic).
I recommend getting in touch with the authors of that dataset for more details regarding their protocol.
For the purposes of QIIME 2, you will need to import these data using a manifest format. Even though the authors used the EMP protocol, all data in EBI are deposited as per-sample fastq files. So you will import these data using manifest format and proceed directly to denoising (the data are already demultiplexed for you!)
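As a rough illustration of the manifest step, here is a minimal Python sketch that writes a single-end manifest for a directory of per-sample fastq files. It assumes the tab-separated V2 manifest layout (a `sample-id` column and an `absolute-filepath` column), assumes one file per sample, and assumes the file name minus its extension is an acceptable sample ID. The function name and paths are hypothetical, so adjust them to your own download.

```python
import csv
from pathlib import Path

# File extensions treated as fastq; longest first so ".fastq.gz" wins over ".fastq".
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def build_single_end_manifest(fastq_dir, manifest_path):
    """Write a tab-separated single-end manifest for every fastq file in fastq_dir.

    Assumption: one file per sample, and the file name without its
    extension is used as the sample ID.
    """
    fastq_dir = Path(fastq_dir).resolve()  # the manifest needs absolute paths
    rows = []
    for path in sorted(fastq_dir.iterdir()):
        for suffix in FASTQ_SUFFIXES:
            if path.name.endswith(suffix):
                sample_id = path.name[: -len(suffix)]
                rows.append((sample_id, str(path)))
                break  # skip the remaining suffixes once one matches

    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample-id", "absolute-filepath"])
        writer.writerows(rows)
    return rows


if __name__ == "__main__":
    # Hypothetical locations; replace with your own.
    written = build_single_end_manifest("ena_download", "manifest.tsv")
    print(f"Wrote {len(written)} samples to manifest.tsv")
```

The resulting file can then be passed to `qiime tools import` with `--type 'SampleData[SequencesWithQuality]'` and `--input-format SingleEndFastqManifestPhred33V2`. The Phred33 variant is an assumption here (it is typical for recent Illumina data), so check the quality encoding of your files before importing.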
Thank you for the fast response Nicholas, I’m going to treat them as forward reads.