Issues importing old fastq files into Qiime2-2023.2

**Hi all,**

I am just getting into the latest Qiime2 to re-analyze some old data from a microbial ecology data set. I am using Qiime2-2023.2 in miniconda on WSL for Windows (installation went smoothly).

Unfortunately I am having issues importing the fastq files into Qiime2.

The files are saved as
1_SE-16S_V3-V4_CR5JT_AGGCAGAA-CTCTCTAT_L001_R1.fastq.gz
1_SE-16S_V3-V4_CR5JT_AGGCAGAA-CTCTCTAT_L001_R2.fastq.gz
2_SE-16S_V3-V4_CR5JT_TCCTGAGC-CTCTCTAT_L001_R1.fastq.gz
2_SE-16S_V3-V4_CR5JT_TCCTGAGC-CTCTCTAT_L001_R2.fastq.gz
The leading number in each file name is the sample number.

The content of the files looks as follows:
@M00547:316:000000000-CR5JT:1:1101:18590:1066 1:N:0:NGGCAGAA+NTCTCTAT
CCTACGGGGGGCAGCAGCAAGGAATATTGTGCAATGGGCGAAAGCCTGACACAGCGACGCCGCGTGGGNGATGAAGGCTTTCGGGTCGTAAACCCCTTTTCTCAGGGAGGAAAAATGACAGTACCTGAGNNNNNNNCCACGGCTAACTACGTGCNAGCAGCCGCGGTANANCGTAGGTGGCAAGCGTTATCCGGATTTANTGNGNNTAAAGAGGGCGCAGGCGNCTNTTCAAGTCGGATGTAAAATCTCCCGGCTTAACCGGGAGGGACCATTCGATACTGTTAAGCTAGAGTGCAACAG
+
CCCCCGGGGGEDACFFGGEEGGG8CEFFAFEGGGGGGGGGGGGGGGGFGGGGGGGGGGEGGGGGGGGG#:@FFGGGGGGGGGGG7FGGGG@FGGGGGGGGGGGGGGGGGGGGG>FGGGGGGGGGGGGGG#######88=FFGGGEGGGGGEFGG#/8CFGGGGGEGGG#2#2;ACGGGFFGGGGGGGGGFGGGGG5EGC#22#2##12;FDFDFGGEGGEGGG#21#2;CFEGGGGGGGGGGGGFFGC7CFEEEEGGGGGGGGEEEECEGGGGFGGG*7F7CFGDFGGFFGFB?47CF><
@M00547:316:000000000-CR5JT:1:1101:11856:1069 1:N:0:NGGCAGAA+NTCTCTAT
CCTATGGGAGGCAGCAGCAAGGAATATTGGGCAATGGGCGAAAGCCTGACCCAGCGACACCGCGTGGGNGAAGAAGGCCTTCGGGTTGTAAACCCCTTTTATCGGGGAAGAATTCTGACGGTACCCGATGNNNNNNCCTCGGCTAACTACGTGCNAGCAGCCGCGGTANTNCGTAGGAGGCGAGCGTTATCCGGATTTANTGNGCNTAAAGCGGGTGTAGGCGNCTNGTCAAGTCGGATGTGAAATCTCCCGGCTTAACTGGGAGGGTGCATTCGATACTGATGGGCTAGAGTGCAGCAG
+
...

I am not sure if the data has been demultiplexed or not.

I have been trying this command to import the files:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path raw_data --input-format SingleEndFastqManifestPhred33V2 --output-path combined
But I am getting the following error:
An unexpected error has occurred:
No transformation from <class 'q2_types.per_sample_sequences._format.SingleEndFastqManifestPhred33V2'> to <class 'q2_types.per_sample_sequences._format.SingleLanePerSamplePairedEndFastqDirFmt'>

Any help with importing this data would be very much appreciated :slight_smile:

P.S.: I have also tried the "--input-format CasavaOneEightSingleLanePerSampleDirFmt" and "--input-format SingleLanePerSamplePairedEndFastqDirFmt" options with the import command.

@ausme1,
It looks like you are trying to import your data with a type designed for paired-end sequences (--type) and a format designed for single-end sequences (--input-format) at the same time. Before importing, you will need to either confirm that you are working with paired-end sequences or process all of the data as single-end, as you cannot import paired-end and single-end sequences together.

I would try dropping the --input-format parameter: QIIME 2 can infer a format from --type and will then use the default format associated with the type you are importing as.

From a general perspective, I believe you will want to follow the instructions found in the Fastq Manifest Imports section of the importing tutorial. It appears that your barcodes might be found in the file name of each of your per sample files, which also makes me think your data has not been demultiplexed yet. The best way to determine this would be to reach out to the sequencing center or review any correspondence that you have with them.
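As a rough illustration of the manifest route, here is a minimal Python sketch that pairs the R1/R2 files by their leading sample number and produces the tab-separated text of a paired-end manifest. The regular expression, the function names `build_manifest_rows` and `manifest_text`, and the use of the leading number as the sample ID are my assumptions based on the file names posted above, not something QIIME 2 prescribes. The header columns are the ones the importing tutorial lists for `PairedEndFastqManifestPhred33V2`.

```python
import csv
import io
import re
from pathlib import Path

# Assumed naming scheme, taken from the file names in the first post:
# <sample>_<anything>_R1.fastq.gz (forward) and ..._R2.fastq.gz (reverse).
READ_PATTERN = re.compile(r"^(?P<sample>[^_]+)_.+_R(?P<read>[12])\.fastq\.gz$")


def build_manifest_rows(fastq_paths):
    """Pair R1/R2 files by leading sample number; return (id, fwd, rev) rows."""
    pairs = {}
    for path in map(Path, fastq_paths):
        match = READ_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        pairs.setdefault(match["sample"], {})[match["read"]] = str(path)

    rows = []
    for sample in sorted(pairs):
        reads = pairs[sample]
        if set(reads) != {"1", "2"}:
            raise ValueError(f"sample {sample} lacks a forward or reverse file")
        rows.append((sample, reads["1"], reads["2"]))
    return rows


def manifest_text(rows):
    """Render the rows as tab-separated text with the paired-end V2 header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(
        ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]
    )
    writer.writerows(rows)
    return buffer.getvalue()
```

You would pass in absolute paths (for example from `Path("raw_data").resolve().glob("*.fastq.gz")`), write the returned text to a file such as `manifest.tsv`, and then point `--input-path` at that file, together with `--input-format PairedEndFastqManifestPhred33V2`, rather than at the `raw_data` directory.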

Thanks for the reply!

From what I remember, chances are high that they are already demultiplexed and the file names are just arbitrary.

I tried leaving out the --input-format parameter and received the following error:

There was a problem importing raw_data:

Missing one or more files for SingleLanePerSamplePairedEndFastqDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001.fastq.gz'

I get the same if I use --type 'SampleData[SequencesWithQuality]'.

P.S.: I have since been told that these are demultiplexed and paired-end. Going by the fastq manifest instructions, I should be using --type 'SampleData[PairedEndSequencesWithQuality]'.

Does anyone know what this error I am receiving means? Can't find anything about it online. Not sure if this should be in the technical support forum.

Hey Sebastian,

Thank you for your patience while the forums were busy.

Now that you have confirmed your reads are paired and you are using the fastq manifest format, can you post the newest command you tried? If you are willing, you could also post your fastq manifest file so we can check its formatting.
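On the "Missing one or more files" error itself: the quoted string is a regular expression describing the file names that the default directory format expects, and your names end in `_L001_R1.fastq.gz` without the `_001` part before the extension. The sketch below checks this in plain Python. The pattern is copied from the error message; the helper name `matches_casava`, the hypothetical renamed file, and the assumption that the pattern has to match the whole file name are mine.

```python
import re

# Pattern copied verbatim from the error message in the post above.
CASAVA_PATTERN = re.compile(r".+_.+_L[0-9][0-9][0-9]_R[12]_001.fastq.gz")


def matches_casava(filename):
    """Return True if the whole file name fits the pattern from the error."""
    return CASAVA_PATTERN.fullmatch(filename) is not None


# One of the file names from the first post.
original = "1_SE-16S_V3-V4_CR5JT_AGGCAGAA-CTCTCTAT_L001_R1.fastq.gz"

# Hypothetical variant with "_001" added before the extension.
with_set_number = original.replace("_R1.fastq.gz", "_R1_001.fastq.gz")

print(matches_casava(original))
print(matches_casava(with_set_number))
```

This only shows why the names are rejected. It does not tell you whether the importer would accept renamed files or which sample IDs it would derive from them, which is one reason the manifest format is the safer route for files named like yours.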
