I used Ion Torrent sequencing technology to sequence my 16S rDNA amplicons, and I am trying to interpret the interactive quality plot that I obtained. The quality plot looks like this.
According to the primers I used, my PCR products should have a sequence length of about 256 bp. Is there a reason why the x-axis of my interactive quality plot shows more than 1000 sequence bases? I am assuming that the plot shows the quality (Phred score) for each base of the sequence, starting from the first base.
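One way to check whether a few unusually long reads are stretching the x-axis is to tabulate read lengths directly from the FASTQ file. The following is a minimal Python sketch, assuming a plain four-line-per-record FASTQ; the `read_lengths` helper and the two-record example are hypothetical and only for illustration.

```python
from collections import Counter
import io


def read_lengths(handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record holds the bases
            yield len(line.strip())


# Hypothetical two-record FASTQ, used only to show the idea.
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGT\n+\nIIII\n"
)

lengths = list(read_lengths(example))
length_counts = Counter(lengths)  # maps read length -> number of reads
print(sorted(length_counts.items()))  # [(4, 1), (8, 1)]
```

For a gzipped file, the same function could be given `gzip.open(path, "rt")` in place of the in-memory example. If most reads cluster near the expected amplicon length and only a small number run past 1000 bases, those few long reads are probably what extends the plot.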
I had trouble filtering sequences during the denoising step. When I adjusted the denoising parameters to retain more sequences, the sequences could not be assigned taxonomy at the species or genus level. Could there be a link between this and my interactive quality plot?
I have also sent you the files that you wanted to take a look at.
There's one more thing: I used the "SingleEndFastqManifestPhred33V2" input format to import my sequences (FASTQ files), as they were sequenced using Ion Torrent technology, but I got the following error message when opening the demux file.
" Danger: Some of the forward PHRED quality values are out of range. This is likely because an incorrect PHRED offset was chosen on import of your raw data. You can learn how to choose your PHRED offset during import in the importing tutorial"
Could this have something to do with the sequences being unexpectedly long?
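To see which quality values a file actually contains, the quality lines can be decoded by hand. This is a rough sketch, assuming the usual ASCII encoding in which a Phred score is the character code minus the offset (33 or 64); `phred_range` and `guess_offset` are hypothetical helper names, and the guess is only a heuristic.

```python
def phred_range(quality_line, offset=33):
    """Return (min, max) Phred scores encoded in one FASTQ quality line."""
    scores = [ord(ch) - offset for ch in quality_line]
    return min(scores), max(scores)


def guess_offset(quality_lines):
    """Heuristic guess of the offset from the lowest character seen.

    Characters below ASCII 64 ('@') would give negative scores under
    offset 64, so their presence points to offset 33. If every character
    is 64 or above, the data are ambiguous and 64 is only a guess.
    """
    lowest = min(ord(ch) for line in quality_lines for ch in line)
    return 33 if lowest < 64 else 64


# Hypothetical quality strings for illustration.
print(phred_range("!5IJ"))       # (0, 41)
print(guess_offset(["!5IJ"]))    # 33
print(guess_offset(["hhhh"]))    # 64
```

Running `phred_range` over every fourth line of the real FASTQ would show the actual minimum and maximum scores under offset 33, which can then be compared with the range the warning refers to.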
I BLASTed the rep-seqs that you sent me. What I saw was that your sequences do align with 16S, but they do not seem to align until 80 bp into each sequence. This suggests to me that there is an adapter at the front of your sequences that is not getting trimmed off, which would explain your lack of taxonomic identification.
I am not sure what command you are running, but have you looked into dada2 denoise-ccs? This should allow you to trim off your adapters.
Additionally, my hypothesis for why your sequences are so long is that, since this is long-read sequencing, the sequencer read all of your V4 region plus any adapters or barcodes, and then continued reading even though there was no sequence left to read.
I did try denoising after trimming the adapters with the Cutadapt plugin, but it didn't work.
I used the denoise-pyro method because my sequences were obtained using an Ion Torrent sequencer.
I tried this but it didn't seem to work either. I will share the output files with you.
These are metagenomic samples from different sites. The forward barcoded primer (front) is the same for all of them, whereas the reverse barcoded primer (adapter) differs from sample to sample.
Forward barcoded primer: 5'- CCT CTC TAT GGG CAG TCG GTG ATG TGC CAG CMG CCG CGG TAA -3'
Reverse barcoded primer for sample 1: 5'- CCA TCT CAT CCC TGC GTG TCT CCG ACT CAG CTA AGG TAA CGA TGG ACT ACV SGG GTA TCT AAT -3'
Reverse barcoded primer for sample 2: 5'- CCA TCT CAT CCC TGC GTG TCT CCG ACT CAG TAA GGA GAA CGA TGG ACT ACV SGG GTA TCT AAT -3'
If I trim these two sequences, do you think the adapters will get removed?
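Before trimming, it may help to check whether these primers actually occur in the reads and where. The sketch below, in Python, strips the spaces from the primers as written above, expands the degenerate IUPAC bases (M, V, S) into a regular expression, and cuts out whatever lies between the forward primer and the reverse complement of the reverse primer. All helper names and the synthetic `read` are hypothetical. The search is exact-match only, so unlike an error-tolerant trimmer it will miss primers that contain sequencing errors, and whether the full barcoded primers are still present in the reads depends on upstream processing, which is an assumption here.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def clean(primer):
    """Remove the spaces used for readability and upper-case the primer."""
    return primer.replace(" ", "").upper()


def reverse_complement(seq):
    """Reverse complement, keeping degenerate IUPAC codes consistent."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Compile a regex matching every sequence the degenerate primer covers."""
    return re.compile("".join(IUPAC[base] for base in clean(primer)))


def trim_between(read, fwd_primer, rev_primer):
    """Return the part of the read between the forward primer and the
    reverse-complemented reverse primer; a missing primer is left untrimmed."""
    fwd = primer_regex(fwd_primer).search(read)
    rev = primer_regex(reverse_complement(clean(rev_primer))).search(read)
    start = fwd.end() if fwd else 0
    end = rev.start() if rev and rev.start() >= start else len(read)
    return read[start:end]


forward = "CCT CTC TAT GGG CAG TCG GTG ATG TGC CAG CMG CCG CGG TAA"
reverse_s1 = ("CCA TCT CAT CCC TGC GTG TCT CCG ACT CAG CTA AGG TAA "
              "CGA TGG ACT ACV SGG GTA TCT AAT")

# Synthetic read for illustration: one concrete version of each primer
# around a made-up 12-base insert.
insert = "ACGTTGCAACGT"
read = (clean(forward).replace("M", "A")
        + insert
        + reverse_complement(clean(reverse_s1).replace("V", "A").replace("S", "C")))

print(trim_between(read, forward, reverse_s1))  # ACGTTGCAACGT
```

If neither pattern is found in a sample of real reads, the primers may be present in a different orientation or only in part, which would be worth knowing before choosing trimming parameters.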