Question about quality plots


(Taeho Kim) #1

Hi,

I am working on 16S rRNA data.
The raw data consists of roughly 30 samples and
each sample has its forward and reverse FASTQ files.

When I imported the data, I used a manifest file
with the “PairedEndFastqManifestPhred33” input format,
so that I could make a single “output.qza”.
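
For reference, here is a minimal sketch of how such a manifest could be generated. It assumes the comma-separated layout with the header `sample-id,absolute-filepath,direction`, which is the layout I understand this manifest format to use; please check the importing documentation for your QIIME 2 version. The sample IDs, file paths, and the `build_manifest` helper are hypothetical.

```python
import csv
import io


def build_manifest(samples):
    """Return manifest text for paired-end reads.

    `samples` maps a sample ID to a (forward_path, reverse_path) tuple
    of absolute file paths. The column names are an assumption about
    the comma-separated manifest layout.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample-id", "absolute-filepath", "direction"])
    for sample_id in sorted(samples):
        forward_path, reverse_path = samples[sample_id]
        # Each sample contributes two rows: one per read direction.
        writer.writerow([sample_id, forward_path, "forward"])
        writer.writerow([sample_id, reverse_path, "reverse"])
    return buf.getvalue()


# Hypothetical sample IDs and paths, for illustration only.
samples = {
    "sample-1": ("/data/run1/sample-1_R1.fastq.gz",
                 "/data/run1/sample-1_R2.fastq.gz"),
    "sample-2": ("/data/run1/sample-2_R1.fastq.gz",
                 "/data/run1/sample-2_R2.fastq.gz"),
}

manifest_text = build_manifest(samples)
print(manifest_text)
```

Every sample contributes one forward row and one reverse row, so all of them end up in a single imported artifact.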

Now I can see the quality plots for the forward and reverse reads
to decide where to truncate in the DADA2 step.
At this point, I feel a bit confused because this pair of quality plots covers all the samples together.

That is, I could instead generate a pair of quality plots for each sample individually.
The quality plots for each sample might show different trends,
implying that I would have to use a distinct truncation for each sample.

I am sorry if this is a duplicate thread, but
I was wondering whether my pooled approach is still okay.

Thank you for reading. Any input would be much appreciated.


(Nicholas Bokulich) #2

Yes, this is the normal approach in QIIME 2.

There would be no way to trim each sample differently in QIIME 2, so there would be no point in generating per-sample quality plots.

Besides, even if you did make per-sample plots, you would still just be taking an average across a population of sequences. The deviations in read quality should be random and should not differ much between samples on the same run.
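
To illustrate the pooling idea, here is a rough sketch that decodes Phred+33 quality strings and averages the scores per position across reads from two samples. It uses a plain mean, which is not necessarily the statistic shown in the interactive quality plot. The quality strings, the threshold of 30, and both helper functions are made up for the example.

```python
def phred33_scores(quality_line):
    """Decode one FASTQ quality line (Phred+33) into integer scores."""
    return [ord(ch) - 33 for ch in quality_line]


def mean_quality_by_position(quality_lines):
    """Mean quality score at each read position, pooled over all reads."""
    sums, counts = [], []
    for line in quality_lines:
        for i, score in enumerate(phred33_scores(line)):
            if i == len(sums):
                # First read that reaches this position.
                sums.append(0)
                counts.append(0)
            sums[i] += score
            counts[i] += 1
    return [s / c for s, c in zip(sums, counts)]


def first_position_below(means, threshold):
    """Index of the first position whose mean is below `threshold`, or None."""
    for i, mean in enumerate(means):
        if mean < threshold:
            return i
    return None


# Made-up quality strings: 'I' decodes to 40 and '5' decodes to 20.
sample_a = ["IIIIIIII", "IIIIII55"]
sample_b = ["IIIIIII5", "IIIII555"]

pooled = mean_quality_by_position(sample_a + sample_b)
cutoff = first_position_below(pooled, threshold=30)  # hypothetical threshold
print(pooled)
print(cutoff)
```

Here the pooled means fall gradually toward the 3' end, and a single cutoff position is read off the pooled profile rather than one per sample.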

But most importantly, you want your sequences trimmed in the same way so that you can compare them downstream. If you trim paired-end reads to different lengths, some samples may merge better than others, leading to artificial differences in ASV abundance.
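
As a back-of-the-envelope illustration of why truncation length affects merging: the overlap left between a forward and a reverse read is roughly the sum of the two truncation lengths minus the amplicon length. The sketch below assumes a 253 nt amplicon and a minimum required overlap of 12 nt. Both numbers are assumptions for the example, so check the values that apply to your primers and your DADA2 settings.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Approximate overlap (nt) between truncated forward and reverse reads."""
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f, trunc_len_r, amplicon_len, min_overlap=12):
    """True if the expected overlap reaches the assumed minimum overlap."""
    overlap = expected_overlap(trunc_len_f, trunc_len_r, amplicon_len)
    return overlap >= min_overlap


AMPLICON_LEN = 253  # hypothetical amplicon length in nt

# Two hypothetical truncation choices applied to the same amplicon.
generous = (150, 150)
aggressive = (140, 120)

for trunc_f, trunc_r in (generous, aggressive):
    overlap = expected_overlap(trunc_f, trunc_r, AMPLICON_LEN)
    print(trunc_f, trunc_r, overlap, can_merge(trunc_f, trunc_r, AMPLICON_LEN))
```

With these numbers the first choice leaves 47 nt of overlap and the second only 7 nt, so reads truncated the second way would largely fail to merge. If different samples were truncated differently, that difference would show up as an artifact in the merged read counts.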


(Taeho Kim) #3

Thank you for your kind answer, Nicholas_Bokulich.

I feel a bit embarrassed because this question revealed
my lack of understanding of the fundamental steps of the procedure.
But I am still building my knowledge, and this forum helps me a lot!


(Nicholas Bokulich) #4

No need to feel embarrassed — we are all learning! I am glad this forum is helpful to you!


(Taeho Kim) #5

Thank you, Nicholas_Bokulich!