Help interpreting interactive quality plots

Hi,

I used Ion Torrent sequencing technology to sequence my 16S rDNA amplicons. I am trying to interpret the interactive quality plot that I obtained, which looks like this.
[image: interactive quality plot]

According to the primers I used, my PCR products should have a sequence length of about 256 bp. Is there a reason why the x-axis of my interactive quality plot shows more than 1000 sequence bases? I am under the assumption that the plot shows the quality (Phred score) for each base of the sequence, starting from the first base.
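
One quick way to see whether most reads are really that long, or whether only a few long reads exist, is to tabulate the read lengths directly from the FASTQ file. The snippet below is only a minimal sketch: `read_lengths` is a hypothetical helper name, the three toy reads are made up, and it assumes a plain 4-line-per-record FASTQ. I am also assuming, without having confirmed it, that a handful of very long reads would be enough to stretch the plot's x-axis.

```python
import io

def read_lengths(fastq_handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ."""
    for i, line in enumerate(fastq_handle):
        if i % 4 == 1:  # the second line of every record holds the bases
            yield len(line.strip())

# Toy data standing in for a real file (hypothetical reads, not from this thread).
toy_fastq = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
    "@read3\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
)

lengths = sorted(read_lengths(toy_fastq))
print("reads:", len(lengths))
print("min / median / max:", lengths[0], lengths[len(lengths) // 2], lengths[-1])
```

For a real file, the `io.StringIO(...)` object could be swapped for `open(path)` or, for a gzipped file, `gzip.open(path, "rt")`.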

I had trouble filtering sequences during the denoising step. When I adjusted the denoising parameters to retain more sequences, my sequences could not be assigned taxonomy at the species or genus level. Could there be a link between this and my interactive quality plot?

Hi @Brigitta1,
This is definitely a head-scratcher :thinking:

This plot is saying that your sequences are longer than 1500 bp.

You are correct.

This might be because your sequences are not 16S? I am not quite sure.

Where did you get these sequences? Could you reach out to the sequencer (or whoever gave you those sequences) and ask if these sequences are what they were expecting?

Hi @cherman2

It is the V4 region of the 16S rDNA gene. I conducted the DNA extraction and amplification on the samples. I used the F515 and R806 primers for the amplification step.

Hi @Brigitta1,
Can you send me your demux.qzv, your dada2 command that you ran, and your rep-seqs.qzv so that I can look into this more?

Hi @cherman2

demultiplexed_seqs: a05a13a5-b438-4ee4-9f52-3e955b4d0953
trunc_len: 240
trim_left: 30
max_ee: 2.0
trunc_q: 2
pooling_method: 'independent'
chimera_method: 'consensus'
min_fold_parent_over_abundance: 1.0
allow_one_off: False
n_threads:

I have also sent you the files that you wanted to take a look at.

There's one more thing: I used the "SingleEndFastqManifestPhred33V2" input format to import my sequences (FASTQ files), as they were sequenced using Ion Torrent sequencing technology, but I got the following message when opening the demux file.
"Danger: Some of the forward PHRED quality values are out of range. This is likely because an incorrect PHRED offset was chosen on import of your raw data. You can learn how to choose your PHRED offset during import in the importing tutorial."

Could this have something to do with the sequences being unexpectedly long?
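
To check the offset independently of the import step, one option is to look at the raw ASCII range of the quality characters. This is a rough sketch under stated assumptions: `quality_ascii_range` and `guess_offset` are hypothetical helper names, the quality strings are toy data, and the rule of thumb used (characters below ASCII 59 cannot occur in a +64 encoding) is a common heuristic, not the check QIIME 2 itself performs.

```python
def quality_ascii_range(quality_lines):
    """Return (lowest, highest) ASCII code seen across FASTQ quality strings."""
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    return min(codes), max(codes)

def guess_offset(lowest, highest):
    """Heuristic guess at the Phred offset; returns None when ambiguous."""
    if lowest < 59:
        return 33   # characters this low do not fit a +64 encoding
    if highest > 74:
        return 64   # characters this high would mean Q > 41 under +33
    return None     # range is compatible with either encoding

# Toy quality lines (hypothetical, not taken from the files in this thread).
toy_quals = ["!''*((((***+", "IIII5555"]

lowest, highest = quality_ascii_range(toy_quals)
offset = guess_offset(lowest, highest)
print("ASCII range:", lowest, highest)
print("guessed offset:", offset)
if offset is not None:
    print("score range:", lowest - offset, highest - offset)
```

If the guessed offset is 33 but the decoded scores still look unusually high, that would point at the quality values themselves and not at the import format.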

Hi @Brigitta1,
I BLASTed the rep-seqs that you sent me. What I saw was that your sequences do align with 16S, but they do not seem to align until about 80 bp into each sequence. This indicates to me that there is an adapter at the front of your sequences that is not getting trimmed off. (This would explain your lack of taxonomic identification.)

I am not sure what command you are running, but have you looked into dada2 denoise-ccs? This should allow you to trim off your adapters.

Additionally, my hypothesis for why your sequences are so long is that, since this is long-read sequencing, the sequencer read all of your V4 region plus any adapters or barcodes, and then continued reading even though there was no sequence left to read.


Hi @cherman2

I did try denoising even after using the plugin to trim adapters (Cutadapt) but it didn't work.

I used the denoise-pyro plugin, as my sequences were obtained using an Ion Torrent sequencer.

I tried this but it didn't seem to work either. I will share the output files with you.

Since these are metagenomic samples from different sites, the forward barcoded primer (front) is the same. However, the reverse barcoded primer (Adapter) differs from sample to sample.
Ex:
Forward barcoded primer 5'- CCT CTC TAT GGG CAG TCG GTG ATG TGC CAG CMG CCG CGG TAA -3'
Reverse barcoded primer for sample 1 5'- CCA TCT CAT CCC TGC GTG TCT CCG ACT CAG CTA AGG TAA CGA TGG ACT ACV SGG GTA TCT AAT -3'
Reverse barcoded primer for sample 2 5'- CCA TCT CAT CCC TGC GTG TCT CCG ACT CAG TAA GGA GAA CGA TGG ACT ACV SGG GTA TCT AAT -3'

If I trim these two sequences, do you think the adapters will get removed?

Forward primer (F515) 5'- GTG CCA GCM GCC GCG GTA A -3' (19-mer)
Reverse primer (R806) 5'- GGA CTA CVS GGG TAT CTA AT -3' (20-mer)
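
Before trimming, it may help to check where these primers actually sit in a read. The script below is only a sketch: `find_primer` and the toy read are hypothetical and made up for illustration, and it looks for exact matches only (honouring IUPAC codes such as M, V and S), whereas Cutadapt tolerates mismatches. It searches both orientations, since a single-end read may contain the reverse complement of a primer.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def clean(primer):
    """Upper-case a primer and drop the spaces used for readability."""
    return primer.upper().replace(" ", "")

def reverse_complement(primer):
    return clean(primer).translate(COMPLEMENT)[::-1]

def find_primer(read, primer):
    """Return (orientation, start) of the first exact match, or None."""
    candidates = (("forward", clean(primer)),
                  ("revcomp", reverse_complement(primer)))
    for label, seq in candidates:
        match = re.search("".join(IUPAC[base] for base in seq), read)
        if match:
            return label, match.start()
    return None

F515 = "GTG CCA GCM GCC GCG GTA A"
R806 = "GGA CTA CVS GGG TAT CTA AT"

# Toy read (hypothetical): 80 filler bases, then one concrete instance of F515.
toy_read = "ACGT" * 20 + "GTGCCAGCAGCCGCGGTAA" + "TTTT"

print("R806 reverse complement:", reverse_complement(R806))
print("F515 in toy read:", find_primer(toy_read, F515))
print("R806 in toy read:", find_primer(toy_read, R806))
```

A primer that shows up far from position 0 in real reads would be consistent with extra adapter or barcode sequence sitting in front of it.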

Thank you in advance!

Hi @Brigitta1,
Those adaptors should work.
Can you try running Cutadapt and show me what your output/error is?

You are right, I got these 2 methods flipped in my head.