Help With Importing .fastq data

I am working with sample data that I downloaded from a study on NCBI SRA. The data is all in .fastq files, with each file representing one of the patients in the study. I have no index/mapping file, so I am almost certain that the data has been demultiplexed. However, I do not know how to tell whether the data is single-end or paired-end. I am trying to create an OTU table from my data, and the QIIME 2 installation I am using is on a Linux VM (QIIME 2 VirtualBox image). I have tried using qiime tools import with --path, --type, etc., but I keep running into errors. Do you know how I can import the data? I should also mention that the study has 54 files and about 30 GB of data in all.

Hi,

I’m not part of the QIIME2 team, but I have done a fair amount of importing demultiplexed data. Look for the fastq manifest formats in the import tutorial (https://docs.qiime2.org/2018.6/tutorials/importing/) for help. You’ll need to make a manifest file to import the data.

Have you tried checking the paper that the NCBI SRA data came from? The methods section will probably tell you whether the data was generated with paired-end (PE) or single-end (SE) sequencing. You could also look at the headers in some of your fastq files. Paired sequences from Illumina data have headers that begin with the same information but end with either 1:N:0:1 for the forward read or 2:N:0:1 for the reverse read. For other sequencer formats, the read headers should still distinguish forward and reverse reads, but may use a different format.
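If it helps, the header check can be scripted. This is only a rough sketch: the function name `read_numbers` and the regular expression are my own, and it assumes the Illumina-style header comment described above (for example `1:N:0:1`). Files downloaded from an archive may have had their headers rewritten, so an empty result does not prove the data is single-end.

```python
import re
from itertools import islice

# Matches the trailing comment of an Illumina-style header, e.g. " 1:N:0:1".
# Assumption: read number, filter flag (Y/N), control number, index.
ILLUMINA_COMMENT = re.compile(r"\s([12]):[YN]:\d+:\S*$")


def read_numbers(fastq_lines, max_records=1000):
    """Return the set of read numbers ('1', '2') found in the first records.

    `fastq_lines` is any iterable of lines, such as an open file handle.
    """
    seen = set()
    # A fastq record spans four lines; the header is the first of each group.
    for header in islice(fastq_lines, 0, max_records * 4, 4):
        match = ILLUMINA_COMMENT.search(header.rstrip("\n"))
        if match:
            seen.add(match.group(1))
    return seen
```

You would call it with an open file, for example `read_numbers(open(path))`, or with `gzip.open(path, "rt")` for compressed files. A result containing both read numbers from a single file suggests forward and reverse reads are stored together in that file.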

Hope this helps!


I am trying to import paired-end fastq data into QIIME 2 but got this error:
[image]
My mapping file looked for the most part like:
sample-id,absolute-filepath,direction
SRR5057575,/Documents/SRR5057/SRR5057575.fastq,forward
SRR5057576,/Documents/SRR5057/SRR5057576.fastq,forward
SRR5057577,/Documents/SRR5057/SRR5057577.fastq,forward
SRR5057578,/Documents/SRR5057/SRR5057578.fastq,forward
SRR5057579,/Documents/SRR5057/SRR5057579.fastq,forward
SRR5057580,/Documents/SRR5057/SRR5057580.fastq,forward
SRR5057581,/Documents/SRR5057/SRR5057581.fastq,forward
SRR5057582,/Documents/SRR5057/SRR5057582.fastq,forward
SRR5057583,/Documents/SRR5057/SRR5057583.fastq,forward
SRR5057584,/Documents/SRR5057/SRR5057584.fastq,forward
SRR5057585,/Documents/SRR5057/SRR5057585.fastq,forward
SRR5057586,/Documents/SRR5057/SRR5057586.fastq,forward
SRR5057587,/Documents/SRR5057/SRR5057587.fastq,forward
SRR5057588,/Documents/SRR5057/SRR5057588.fastq,forward
SRR5057589,/Documents/SRR5057/SRR5057589.fastq,forward
SRR5057590,/Documents/SRR5057/SRR5057590.fastq,forward
SRR5057591,/Documents/SRR5057/SRR5057591.fastq,forward
SRR5057592,/Documents/SRR5057/SRR5057592.fastq,forward
SRR5057593,/Documents/SRR5057/SRR5057593.fastq,forward
SRR5057594,/Documents/SRR5057/SRR5057594.fastq,forward
SRR5057595,/Documents/SRR5057/SRR5057595.fastq,forward
SRR5057596,/Documents/SRR5057/SRR5057596.fastq,forward
followed by the same lines repeated, but with reverse instead of forward.

I’ve been really stuck on this, so any help is greatly appreciated!

Hi, I think for paired-end data you should have a single manifest file with two lines per sample: the first line gives the path to the forward read, and the second gives the path to the reverse read. If you have paired-end data with both the forward and reverse reads in a single file, you may need to split the file into two before proceeding.
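To illustrate the two-lines-per-sample layout, here is a small sketch that builds such a manifest from a folder of already split files. The column names are the ones from the manifest shown earlier in this thread. The `_1.fastq` / `_2.fastq` suffixes and the function names are assumptions for illustration only, so adjust them to match how your files are actually named.

```python
import csv
from pathlib import Path


def manifest_rows(fastq_dir, fwd_suffix="_1.fastq", rev_suffix="_2.fastq"):
    """Return (sample-id, absolute-filepath, direction) rows, two per sample."""
    # resolve() turns the folder into an absolute path, as the manifest needs.
    fastq_dir = Path(fastq_dir).resolve()
    rows = []
    for fwd in sorted(fastq_dir.glob(f"*{fwd_suffix}")):
        sample_id = fwd.name[: -len(fwd_suffix)]
        rev = fwd.with_name(sample_id + rev_suffix)
        if not rev.exists():
            raise FileNotFoundError(f"no reverse read file for {sample_id}: {rev}")
        rows.append((sample_id, str(fwd), "forward"))
        rows.append((sample_id, str(rev), "reverse"))
    return rows


def write_manifest(rows, handle):
    """Write the header and the rows as comma-separated text."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["sample-id", "absolute-filepath", "direction"])
    writer.writerows(rows)
```

For example, `write_manifest(manifest_rows("SRR5057"), open("manifest.csv", "w", newline=""))` would write one forward and one reverse line for every sample found in that folder.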


Thanks, the thing is, when I tried that I got the same error:
All paths in manifest must be absolute but found in relative path
Do you know what might be causing this?

Hi @Vik,

Could you attach your manifest file so we can have a look at the formatting of the file?

Thanks!

manifest_5057.txt (4.9 KB)
I also tried using $/ and ~/ in the destinations column. As mentioned before, I am using a Linux VM version of QIIME 2 2018.6.

Thanks,
Vik

Couple of things I see in your manifest file:

  1. You have the same file name for both the forward and reverse reads. Are you certain this is paired-end and not single-end data?
  2. The file path doesn’t look like a full file path. Do you have the SRR5057 folder in your root directory? If not, you will need to edit your manifest file to use the full file path. You could try navigating into the SRR5057 folder and then typing pwd to get the full path.
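Both points can be checked before running the import. The sketch below is not part of QIIME 2; `check_manifest` is a hypothetical helper that assumes the comma-separated layout with the sample-id, absolute-filepath and direction columns shown earlier in the thread.

```python
import csv
import os


def check_manifest(lines):
    """Return a list of human-readable problems found in manifest lines."""
    problems = []
    first_direction = {}  # path -> direction it was first listed with
    # Skip blank lines and comment lines before parsing the CSV.
    rows = csv.DictReader(
        line for line in lines if line.strip() and not line.startswith("#")
    )
    for row in rows:
        sample = row["sample-id"]
        path = row["absolute-filepath"]
        direction = row["direction"]
        if not os.path.isabs(path):
            problems.append(f"{sample}: path is not absolute: {path}")
        elif not os.path.isfile(path):
            problems.append(f"{sample}: file not found: {path}")
        if path in first_direction and first_direction[path] != direction:
            problems.append(
                f"{sample}: {path} is listed as both forward and reverse"
            )
        first_direction.setdefault(path, direction)
    return problems
```

Running `check_manifest(open("manifest.csv"))` returns an empty list when every path is absolute, exists on disk, and is used for only one direction. Note that a path beginning with `~/` is reported as not absolute, because this check does not expand the home directory.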

Hope this helps!


Thanks so much for the advice. So far, no errors!

