Unusual difference between "Demultiplexed sequence length summary" and "Sequencing company: read length"

Hello,

QIIME 2 version: q2cli 2022.11.1, installed with Miniconda.

I have a question about the "Demultiplexed sequence length summary" in demux.qzv.
mort_demux.qzv (309.5 KB)

After I import my data into QIIME 2 (ITS region sequences, already demultiplexed, using a manifest file) and run a summary, the Demultiplexed sequence length summary shows all reads having almost the same length of 251 nt.

However, when I look at the data received from the sequencing company, the read length differs between samples and, more concerning, the average read length is much longer (around 350 bp). I attach a picture of 3 samples for a better understanding.



I've read that the summary is computed from only a subset of the reads, but the difference seems too large to be explained by subsampling.

Could someone please advise me on how to interpret this?

Thank you in advance!

Best regards,
Riva

Hi @Riva ,

It looks like the company's summary is of analyzed data, almost certainly merged reads.

The Demultiplexed sequence length summary is computed from the raw reads.

So you are comparing different things. The raw reads are all ~250 nt, but after the forward and reverse reads are merged they have different lengths, with a mean of 346.8 nt.
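
To make the arithmetic concrete, here is a rough sketch of how a merged length relates to the raw read lengths. It assumes a simple model in which the full forward and reverse reads are joined over their overlap, with no primer or quality trimming; the company's actual pipeline is not known, so the function names and the overlap values are only illustrative.

```python
def merged_length(fwd_len: int, rev_len: int, overlap: int) -> int:
    """Length of a merged pair when the two reads share `overlap` bases.

    Simplified model: no trimming, reads overlap only at their 3' ends.
    """
    if overlap < 0 or overlap > min(fwd_len, rev_len):
        raise ValueError("overlap must be between 0 and the shorter read length")
    return fwd_len + rev_len - overlap


def implied_overlap(fwd_len: int, rev_len: int, merged_len: float) -> float:
    """Overlap implied by an observed merged length under the same model."""
    return fwd_len + rev_len - merged_len


if __name__ == "__main__":
    # Two 251 nt reads that overlap by 155 nt merge into one 347 nt sequence.
    print(merged_length(251, 251, 155))
    # A mean merged length of 346.8 nt implies a mean overlap of about 155 nt.
    print(round(implied_overlap(251, 251, 346.8), 1))
```

Because ITS amplicons vary in length, the overlap differs from read pair to read pair, which would explain why the merged lengths differ between samples even though every raw read is about 251 nt.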

Good luck!

Hello Sir,

Oh, I understand now! The reads are merged during processing, for example with DADA2, so demux.qzv only reports the lengths of the raw reads.
Looking at the outputs of qiime dada2 denoise-paired, I see that the Sequence Length Statistics in dada2_rep_set.qzv reflect the lengths of the merged reads.
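
For anyone who wants to double-check raw read lengths outside of QIIME 2, below is a minimal sketch that summarizes sequence lengths straight from a FASTQ file. It assumes standard four-line FASTQ records (no wrapped sequences), and the function names and the file name in the usage note are hypothetical.

```python
import gzip
from statistics import mean
from typing import Iterable, Iterator


def read_lengths(handle: Iterable[str]) -> Iterator[int]:
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record holds the sequence
            yield len(line.strip())


def summarize(lengths: Iterable[int]) -> dict:
    """Return the read count plus min, mean, and max length."""
    values = list(lengths)
    if not values:
        raise ValueError("no reads found")
    return {"n": len(values), "min": min(values), "mean": mean(values), "max": max(values)}


def summarize_fastq(path: str) -> dict:
    """Summarize a plain or gzip-compressed FASTQ file (chosen by file extension)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize(read_lengths(handle))
```

Calling summarize_fastq("sample_R1.fastq.gz") on one of the raw forward-read files (hypothetical file name) should give lengths close to what demux.qzv reports, while running it on merged reads should give the longer, more variable lengths.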

I apologize for taking your time, but I am really grateful for your help! I've really been able to learn a lot here.

Best wishes,
Riva
