I did MiSeq paired-end sequencing and used other software to merge and demultiplex the sequences. Now I am trying to import the sequence data (each sample has one fastq file) into QIIME 2.
Error:
ValueError: InPath('/var/folders/zr/vqccwhr90rl1hdcykhvdly100000gp/T/q2-SingleLanePerSampleSingleEndFastqDirFmt-x_5h6kva/BS_0_2_2_4_L001_R1_001.fastq.gz') is not formatted as a FastqGzFormat file.
All my sequence files are in fastq format. Does anybody know how to fix it? Thanks.
Hi @eDNA! The file that QIIME 2 is complaining about has the name BS_0_2_2_4_L001_R1_001.fastq.gz, which looks like a pretty different naming scheme compared to the one example sample you posted above (BS_0_1_1.fastq). We typically see filenames that are pretty consistently named. Are you sure your manifest file has the right filenames in it? Can you please post your complete manifest file and a listing of the files in the data directory (a screenshot, or the output of ls or tree)? That will help us troubleshoot this problem. Thanks!
Yes, my manifest file has the right filenames in it.
The listing of files in the data directory
$ ls
BS_0_1_1.fastq BS_15_1_3.fastq BS_30_2_1.fastq
BS_0_1_2.fastq BS_15_2_1.fastq BS_30_2_2.fastq
BS_0_1_3.fastq BS_15_2_2.fastq BS_30_2_3.fastq
BS_0_2_1.fastq BS_15_2_3.fastq BS_500_1_1.fastq
BS_0_2_2.fastq BS_30_1_1.fastq BS_500_1_2.fastq
BS_0_2_3.fastq BS_30_1_2.fastq BS_500_1_3.fastq
BS_15_1_1.fastq BS_30_1_2_2.fastq BS_Blank1.fastq
BS_15_1_2.fastq BS_30_1_3.fastq BS_NC1.fastq
I followed the "Fastq manifest" formats tutorial. The name "BS_0_2_2_4_L001_R1_001.fastq.gz" looks like one from the "Casava 1.8 single-end demultiplexed fastq" tutorial. Did the command think I was following the "Casava 1.8 single-end demultiplexed fastq" tutorial? The error mentions "q2-SingleLanePerSampleSingleEndFastqDirFmt".
I used one sample to test. I think the error was caused by sequence data (lower case in DNA sequence).
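One way to check this suspicion before importing is to scan a file for lowercase bases. The following is only a minimal sketch, assuming uncompressed FASTQ files with exactly four lines per record; the function name count_lowercase_records is made up for illustration and is not part of QIIME 2.

```python
import re

# Any lowercase ASCII letter on a sequence line counts as a hit.
LOWERCASE_BASE = re.compile(r"[a-z]")


def count_lowercase_records(fastq_path):
    """Count records whose sequence line contains lowercase characters.

    Assumption: the file is an uncompressed FASTQ file with exactly
    four lines per record (header, sequence, '+', quality).
    """
    count = 0
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # Index 1 within each group of four lines is the sequence line.
            if line_number % 4 == 1 and LOWERCASE_BASE.search(line):
                count += 1
    return count
```

For example, count_lowercase_records("VP_15_2_3.fastq") returning a value greater than zero would suggest that the file contains lowercase sequences.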
--------------- 1st test ---------------------
My command:
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path ImportTest2 \
  --output-path Test2_single-end-demux.qza \
  --source-format SingleEndFastqManifestPhred33
Error: ValueError: InPath('/var/folders/zr/vqccwhr90rl1hdcykhvdly100000gp/T/q2-SingleLanePerSampleSingleEndFastqDirFmt-87too7ky/VP_15_2_3_0_L001_R1_001.fastq.gz') is not formatted as a FastqGzFormat file.
@jairideout I found that you mentioned the lowercase issue in DNA sequences in other posts.
My demultiplexed data (upper case) is in one fastq file (e.g. the VP_P1_assigned.fastq I used for the 2nd test). I split the file based on the "sample" attribute of each sequence to obtain one fastq file per sample so that I can import them into QIIME 2, but the resulting sequences are in lower case. Does anyone have any idea how to import my data into QIIME 2? I am very keen to use QIIME 2 to analyze my data. Thank you.
I did another test. I converted the sequences in VP_15_2_3.fastq to uppercase and it can be imported. I will have to convert each of my 280 fastq files before I use QIIME2.
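For a large number of files, the conversion can be scripted instead of done file by file. This is a rough sketch rather than an official QIIME 2 tool: it assumes uncompressed FASTQ files with exactly four lines per record, and the function names uppercase_fastq and uppercase_directory are invented for this example.

```python
from pathlib import Path


def uppercase_fastq(src, dst):
    """Copy a FASTQ file, uppercasing only the sequence line of each record.

    Assumption: uncompressed FASTQ with exactly four lines per record.
    """
    with open(src) as fin, open(dst, "w") as fout:
        for line_number, line in enumerate(fin):
            # Only index 1 of each group of four lines holds bases.
            # Headers and quality strings are copied unchanged, because
            # uppercasing a quality string would alter the scores.
            fout.write(line.upper() if line_number % 4 == 1 else line)


def uppercase_directory(in_dir, out_dir):
    """Convert every *.fastq file in in_dir and write the copies to out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(in_dir).glob("*.fastq")):
        dst = out_path / src.name
        uppercase_fastq(src, dst)
        converted.append(dst)
    return converted
```

Calling uppercase_directory("raw_fastq", "upper_fastq") (both directory names are placeholders) leaves the original files untouched and writes the converted copies to a separate directory. The manifest file would then need to point to the converted copies.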
My problem has been solved so far. Looking forward to getting results from QIIME2.
Great, thanks for following up @eDNA. We have an existing open issue on one of our bug trackers about the lowercase sequence issue. We will update this thread when a solution has been implemented, but at this time we have no ETA on when that will happen (I would assume before the end of 2018). Sounds like you have a workaround in place to get you moving forward. Thanks!