Error importing 16S reads into QIIME 2 - not valid FASTQ?

Hello qiime team,

I have been trying to import my data into QIIME 2, but it keeps giving me errors, even though I've tried several different approaches.

My data can be found at:

https://github.com/mercadocapote/16s_diadema_ajmc2020/tree/master/rawsequences

My data are Illumina MiSeq dual-index paired-end sequences in FASTQ format, and I am trying to import them into QIIME 2. There are four files that were generated by Illumina: I1 and I2 (the barcodes stripped from the sequences during demultiplexing) and R1 and R2 (the forward and reverse reads). I have tried importing the data with several different manifest files; maybe there is a step that I am missing or overlooking. Thank you for your time and patience.


I was using the following script and other variations of it:

qiime tools import \
  --type 'SampleData[PairedEndSequencesWithQuality]' \
  --input-path rawsequences \
  --output-path paired-end-demux.qza

Error output:

There was a problem importing rawsequences:
rawsequences/B2_S150_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:
Header on line 1 is not FASTQ, records may be misaligned

However, to my understanding, B2 is a FASTQ file.


qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path rawsequences --output-path paired-end-demux.qza --input-format FastqGzFormat

Error output:

An unexpected error has occurred:
No transformation from <class 'q2_types.per_sample_sequences._format.FastqGzFormat'> to <class 'q2_types.per_sample_sequences._format.SingleLanePerSamplePairedEndFastqDirFmt'>
See above for debug info.


qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path manifest1.tsv --output-path paired-end-demux.qza --input-format PairedEndFastqManifestPhred33V2

There was a problem importing manifest1.tsv:
/tmp/q2-SingleLanePerSamplePairedEndFastqDirFmt-sk30d3vy/1C_8_L001_R1_001.fastq.gz is not a(n) FastqGzFormat file:
Header on line 5 is not FASTQ, records may be misaligned

To my understanding, 1C_8_L001_R1_001.fastq.gz is a valid FastqGzFormat file.

Hello @mercadocapote, there seems to be something a bit funky going on with your data. I cloned your GitHub repo, ran that first import command, and got the same error message as you. I then opened rawsequences/B2_S150_L001_R1_001.fastq.gz to see what it contains.



rawsequences/B2_S150_L001_R1_001.fastq.gz doesn't contain B2_S150_L001_R1_001.fastq like it should; it contains manifest.tsv instead! I compared the file contained within rawsequences/B2_S150_L001_R1_001.fastq.gz to rawsequences/manifest.tsv and they're completely identical.
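
For anyone who wants to repeat this kind of check without an archive viewer, here is a minimal Python sketch. It assumes the file was written by a tool that records the original filename in the gzip header (the optional FNAME field); not every tool does, in which case the first function returns None. The function names gzip_stored_name and same_content are made up for illustration and are not part of QIIME 2.

```python
import gzip
import struct
from pathlib import Path


def gzip_stored_name(path):
    """Return the original filename recorded in a gzip header, or None."""
    with open(path, "rb") as fh:
        header = fh.read(10)  # fixed-size part of the gzip header
        if len(header) < 10 or header[:2] != b"\x1f\x8b":
            raise ValueError(f"{path} is not a gzip file")
        flags = header[3]
        if flags & 0x04:  # FEXTRA set: skip the optional extra field
            (xlen,) = struct.unpack("<H", fh.read(2))
            fh.read(xlen)
        if not flags & 0x08:  # FNAME not set: no name was stored
            return None
        name = bytearray()
        # The stored name is a zero-terminated byte string.
        while (byte := fh.read(1)) not in (b"", b"\x00"):
            name += byte
        return name.decode("latin-1")


def same_content(gz_path, plain_path):
    """True if the decompressed gzip payload equals the plain file's bytes."""
    with gzip.open(gz_path, "rb") as fh:
        return fh.read() == Path(plain_path).read_bytes()
```

Calling gzip_stored_name("rawsequences/B2_S150_L001_R1_001.fastq.gz") and same_content("rawsequences/B2_S150_L001_R1_001.fastq.gz", "rawsequences/manifest.tsv") from the cloned repository would be one way to reproduce the comparison described above.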



Not really sure how this happened, but it appears to be the primary source of your troubles.
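
Since other files in the directory may have the same problem, it could be worth screening all of them before retrying the import. The sketch below is a rough pre-check, not a replacement for QIIME 2's own validation: it only looks at the first four lines of each *.fastq.gz file and tests that they have the basic FASTQ shape (an '@' header, a sequence, a '+' separator, and a quality string of the same length). The function names are hypothetical.

```python
import gzip
from pathlib import Path


def first_record_problem(path):
    """Describe what is wrong with the first FASTQ record, or return None."""
    try:
        with gzip.open(path, "rt", encoding="ascii", errors="replace") as fh:
            lines = [fh.readline().rstrip("\r\n") for _ in range(4)]
    except (OSError, EOFError) as exc:  # not gzip-compressed, or truncated
        return f"cannot read as gzip: {exc}"
    header, seq, plus, qual = lines
    if not header.startswith("@"):
        return "line 1 does not start with '@'"
    if not plus.startswith("+"):
        return "line 3 does not start with '+'"
    if not seq or len(seq) != len(qual):
        return "sequence and quality lines differ in length"
    return None


def check_directory(directory):
    """Map each *.fastq.gz file name to its problem (None means it looks OK)."""
    files = sorted(Path(directory).glob("*.fastq.gz"))
    return {p.name: first_record_problem(p) for p in files}
```

Running check_directory("rawsequences") and printing only the entries whose value is not None would list the files that need to be regenerated or re-downloaded.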

