Hello,
currently I'm using a manifest file to import already demultiplexed, paired-end .fastq.gz files that were recently generated by Illumina MiSeq sequencing. The format of the manifest file looks like:
I import this data with the manifest file above using the following command:
qiime tools import \
--type SampleData[PairedEndSequencesWithQuality] \
--input-path manifest_file_PD1.tsv \
--input-format PairedEndFastqManifestPhred33V2 \
--output-path raw_reads_PD1.qza
summarize it by:
qiime demux summarize \
--i-data raw_reads_PD1.qza \
--o-visualization raw_reads_PD1.qzv
and next view it with:
qiime tools view raw_reads_PD1.qzv
This shows all my reads per sample and the quality plots etc. I like using a manifest file for reproducibility and overview and to easily edit the sample IDs when importing.
I now noticed that there is also another way to import this kind of data, following the example given on the page Importing data — QIIME 2 2022.2.0 documentation under "Casava 1.8 paired-end demultiplexed fastq".
So I can also import this data using the following command:
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path PD_1_data \
--input-format CasavaOneEightSingleLanePerSampleDirFmt \
--output-path raw_reads_PD1_import_test.qza
and then again summarize and view it:
qiime demux summarize \
--i-data raw_reads_PD1_import_test.qza \
--o-visualization raw_reads_PD1_import_test.qzv
qiime tools view raw_reads_PD1_import_test.qzv
Next I viewed (qiime tools view) and compared the .qzv files from the 2 import methods.
In the "Overview" tab, the sections "Demultiplexed sequence counts summary", the histogram and "Per-sample sequence counts" look exactly the same for both files (except for the sample ID names in the per-sample sequence counts). So, I have imported the same number of reads per sample. This is what I expected, because I thought both methods above should do exactly the same thing?
However, when I checked the "Interactive quality plot" tab, I see there are some differences. The differences are very small, but they are there:
- Quality plot using manifest file import:
- Quality plot using CasavaOneEight import method:
How is this possible? What is the difference between these 2 methods for importing data? Because the differences are so minor, I do not think they have any practical consequences, but I would expect the quality plots to be exactly the same.
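One way to rule out that the imported reads themselves differ would be to export both artifacts and compare the decompressed FASTQ files for the same sample. The following is only a rough sketch: `same_reads` is a hypothetical helper (not part of QIIME 2), the export directory names are placeholders, and the file names inside each exported directory would have to be filled in by hand, since the two import methods may name them differently.

```shell
# Hypothetical helper: compare two gzipped FASTQ files by decompressed content.
# The .gz files are decompressed before comparison because two archives with
# identical reads can still differ in their gzip metadata or compression level.
same_reads() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "missing input" >&2
        return 2
    fi
    # cksum prints a CRC and a byte count for the decompressed stream
    if [ "$(gzip -dc "$1" | cksum)" = "$(gzip -dc "$2" | cksum)" ]; then
        echo "identical"
    else
        echo "different"
    fi
}

# Possible usage (output directories and file names are placeholders):
#   qiime tools export --input-path raw_reads_PD1.qza --output-path export_manifest
#   qiime tools export --input-path raw_reads_PD1_import_test.qza --output-path export_casava
#   same_reads export_manifest/<forward reads of one sample>.fastq.gz \
#              export_casava/<forward reads of the same sample>.fastq.gz
```

If this prints "identical" for every pair of files, the two artifacts hold the same reads, and the difference would have to come from how the quality plot is produced rather than from the import.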
Thanks!
PS: I now noticed that in the manifest command I did not put SampleData[PairedEndSequencesWithQuality] between single quotes, but the import worked, so that cannot be the reason, right?