Error when importing sequence files - Qiime2-2021.4

Hi,

I have received MiSeq paired-end sequences. When I tried to import the files as follows:
qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path sequences \
    --input-format CasavaOneEightSingleLanePerSampleDirFmt \
    --output-path demux-paired-end.qza

I got this error message:

There was a problem importing sequences:

Missing one or more files for CasavaOneEightSingleLanePerSampleDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz'

My guess is that the sequence files I was sent are named differently from the convention shown in the QIIME 2 tutorial.
For example, one of my files is named "MI.M05812_0186.001.FLD0289.P1_R1.fastq.gz" (the sample ID is P1),
whereas the tutorial's file names (e.g., L2S357_15_L001_R2_001.fastq.gz) start with the sample identifier.
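
One way to test this guess is to match each file name against the pattern quoted in the error message. The sketch below is only an illustration: the function name check_casava_names is made up here, and the pattern is the one from the error message with ^ and $ anchors added so that the whole name has to match.

```shell
# Pattern copied from the error message, anchored to the whole file name.
casava_pattern='^.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz$'

# Hypothetical helper: print "ok" or "MISMATCH" for each file name given.
check_casava_names() {
    for name in "$@"; do
        if printf '%s\n' "$name" | grep -Eq "$casava_pattern"; then
            printf 'ok        %s\n' "$name"
        else
            printf 'MISMATCH  %s\n' "$name"
        fi
    done
}

# The two file names mentioned in this post.
check_casava_names \
    MI.M05812_0186.001.FLD0289.P1_R1.fastq.gz \
    L2S357_15_L001_R2_001.fastq.gz
```

With these two names, the facility's file is reported as a mismatch (it has no _L001_ lane field and no trailing _001) and the tutorial's file as ok.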

I would appreciate your clarification on this issue.
Thanks!

Hi @Eman,

Please see:

-Mike

So, I have to create a manifest for the sequence files like this:

sample-id forward-absolute-filepath reverse-absolute-filepath
sample-1 $PWD/some/filepath/sample0_R1.fastq.gz $PWD/some/filepath/sample1_R2.fastq.gz
sample-2 $PWD/some/filepath/sample2_R1.fastq.gz $PWD/some/filepath/sample2_R2.fastq.gz

But in this example the sample ID appears at the beginning of each file name, whereas my manifest would look like this:
sample-id forward-absolute-filepath reverse-absolute-filepath
P1 $PWD/some/filepath/MI.M05812_0186.001.FLD0289.P1_R1.fastq.gz $PWD/some/filepath/MI.M05812_0186.001.FLD0289.P1_R2.fastq.gz

I am confused!!

Ahh, so you want to import using a manifest? This was not clear in your initial post. I thought you were indeed trying to import using the CASAVA format.

In your case, the --input-format value should be PairedEndFastqManifestPhred64V2 or PairedEndFastqManifestPhred33V2. Thus, your final command should be something like:

qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path manifest-file.txt \
    --input-format PairedEndFastqManifestPhred33V2 \
    --output-path demux-paired-end.qza

Hopefully, this will get you going :truck:.

Hi,
I changed into the directory that contains a folder named demo, which holds a few selected sequences to import, plus the manifest file.
I created manifest.txt for these demo sequences first, planning to apply the same approach to the full set once it works. Initially, I used the sample IDs from my metadata file (P1, P2, ...) as the sample-id and added the columns Forward-absolute-filepath and Reverse-absolute-filepath. Then I got this error:

There was a problem importing manifest.txt:

manifest.txt is not a(n) PairedEndFastqManifestPhred33V2 file:

'forward-absolute-filepath' is not a column in the metadata. Available columns: 'Forward-absolute-filepath', 'Reverse-absolute-filepath'

So, I changed the sample-id as below (including the multiplex key)

Sample-id Forward-absolute-filepath Reverse-absolute-filepath
FLD0289.P1 $PWD/demo/filepath/MI.M05812_0186.001.FLD0289.P1_R1.fastq.gz $PWD/demo/filepath/MI.M05812_0186.001.FLD0289.P1_R2.fastq.gz

And I got the same error. I am not sure how to create the manifest file (I use an Excel sheet and then save it as a .txt file), or exactly what data to include in each column. According to the information provided by the genomics facility:
MI.M05812_0186 is the run name,
FLD0289 is the multiplex key,
MI.M05812_0186.001.FLD0289.P1 is the read set ID, followed by the direction of the sequence (R1 or R2),
and the quality offset is 33.

In my previous project I imported the data as Casava 1.8 paired-end demultiplexed fastq, because the file names were compatible with the CASAVA format. But here, I do not know how to proceed.

Could you please explain this to me and, if possible, tell me how I can convert these sequences into CASAVA format?

Thanks!

Please provide the command you ran.

It looks like you changed the column headers to begin with upper case. These should be as you originally had them:
sample-id forward-absolute-filepath reverse-absolute-filepath
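
A quick way to catch this kind of problem before importing is to compare the first line of the manifest with the header above. This is a minimal sketch, not part of QIIME 2: check_manifest_header is a hypothetical helper, and it assumes the columns are tab-separated. The check is deliberately strict, so it may reject a header that QIIME 2 itself would tolerate.

```shell
# Expected header: lower-case column names separated by tabs.
expected=$(printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath')

# Hypothetical helper: return 0 if the first line of the file matches exactly.
check_manifest_header() {
    # Ignore a trailing carriage return left by Windows-style line endings.
    actual=$(head -n 1 "$1" | tr -d '\r')
    if [ "$actual" = "$expected" ]; then
        echo "header ok"
    else
        echo "header mismatch:"
        printf '  expected: %s\n  found:    %s\n' "$expected" "$actual"
        return 1
    fi
}
```

For example, check_manifest_header manifest.txt would report a mismatch for a header that starts with Sample-id and Forward-absolute-filepath.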

Please see the Import documentation I linked earlier, and follow the manifest format description exactly. Have you tried downloading and running the examples provided there to make sure they work?

Did you try the other option I suggested, e.g. PairedEndFastqManifestPhred64V2? Try this after you try importing with PairedEndFastqManifestPhred33V2 using the corrected headers.
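
If you are unsure which of the two offsets applies, you can look at the quality strings themselves: characters with ASCII codes below 59 (anything from ! to :) do not occur in Phred64-encoded data. The sketch below checks the first 1000 reads of one gzipped file. The function name guess_phred_offset is made up for this example, and it assumes a standard four-line FASTQ layout. It can only confirm Phred33; a file with no low characters is either Phred64 or uniformly high-quality Phred33.

```shell
# Hypothetical helper: inspect the quality lines of a gzipped FASTQ file.
guess_phred_offset() {
    gzip -dc "$1" | LC_ALL=C awk '
        # Every 4th line is a quality string; "!" to ":" only occur with offset 33.
        NR % 4 == 0 && /[!-:]/ { low = 1; exit }
        # Stop after the first 1000 reads (4 lines per read).
        NR >= 4000             { exit }
        END { print (low ? "Phred33" : "possibly Phred64") }'
}
```

For example, guess_phred_offset MI.M05812_0186.001.FLD0289.P1_R1.fastq.gz should print Phred33 if the offset reported by the sequencing facility is correct.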

In the manifest file, the sample-id can be whatever you'd like to label the sample. It does not need to match the file names or anything else for that matter.
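
For many samples, the manifest can be generated from the file names instead of typed by hand. Here is a rough sketch under some assumptions: every forward file ends in _R1.fastq.gz with a matching _R2.fastq.gz next to it, and the text after the last dot of the read set ID (P1 in your example) is used as the sample-id. The function name make_manifest is hypothetical; adapt the sample-id rule to whatever labels you prefer.

```shell
# Hypothetical helper: print a paired-end manifest for one directory.
make_manifest() {
    # Resolve the directory to an absolute path.
    dir=$(cd "$1" && pwd) || return 1
    # Header: lower-case column names, separated by tabs.
    printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n'
    for fwd in "$dir"/*_R1.fastq.gz; do
        [ -e "$fwd" ] || continue                # no forward files found
        rev="${fwd%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$rev" ]; then
            echo "warning: no reverse file for $fwd" >&2
            continue
        fi
        base=$(basename "$fwd" _R1.fastq.gz)     # e.g. MI.M05812_0186.001.FLD0289.P1
        sample=${base##*.}                       # e.g. P1
        printf '%s\t%s\t%s\n' "$sample" "$fwd" "$rev"
    done
}
```

Running make_manifest demo/filepath > manifest.txt would then write one row per sample, and manifest.txt can be passed to --input-path as in the command above.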

Yes, the problem was the capitalized headers. It works now. Thanks so much!


Glad it worked! :tada:
