I don't understand why my visualization looks like this. Is it correct? I discovered that the quality scores of the data I downloaded from NCBI are all equal to 30; maybe that's why the graph is a straight line T__T
Yes! If the sequences that you downloaded all have a quality score of 30 then there should not be any variation in the quality score.
You would see variation if the quality differed across sequences.
Looks like you are good to go!
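If you want to double-check that every base really has the same score, here is a minimal sketch that tallies the Phred scores in a FASTQ file. It assumes standard 4-line records (no wrapped sequence lines) and the usual Phred+33 encoding; the function name and the tiny example reads are made up for illustration.

```python
from collections import Counter


def quality_score_counts(fastq_lines, phred_offset=33):
    """Count how often each Phred score occurs across all reads.

    Assumes 4-line FASTQ records and Phred+33 encoding (both assumptions).
    """
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 3:  # the 4th line of each record is the quality string
            counts.update(ord(ch) - phred_offset for ch in line.rstrip("\n"))
    return counts


# Made-up example: two reads whose quality strings are all '?' (ASCII 63)
example = [
    "@read1", "ACGT", "+", "????",
    "@read2", "TTGA", "+", "????",
]
counts = quality_score_counts(example)
print(sorted(counts))  # a single value here means there is no variation
```

To run it on a real file, pass an open text handle instead of the list, for example one from `gzip.open(path, "rt")`. If only one score comes back, a flat line in the quality plot is exactly what you should expect.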
What technology was used to sequence your data? It is pretty unusual for Illumina runs to be so consistent, and if you used something like a long-read sequencing technology, some of your initial processing steps will use slightly different tools.
The instrument is an Illumina MiSeq, and the data is metagenomic 16S rRNA. I chose a small dataset to make it easy to process with qiime2. Yes, but this data is a bit strange compared to similar data I found on NCBI (in terms of quality scores). It was also weird to download: the sequencing is paired-end, but there was only one fastq file per sample (the 2 original files per sample are not free egress), so I used awk to split it into forward and reverse files and then imported them into qiime2. I'm not sure whether their data comes with any quality guarantee.
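For anyone doing the same split, here is a rough Python sketch of de-interleaving. It assumes the single file is interleaved (records alternate forward, reverse, forward, reverse) with 4-line records; the thread does not say how the file was actually laid out, so check that first. The function name and the example reads are hypothetical.

```python
def deinterleave(fastq_lines):
    """Split interleaved FASTQ lines into (forward, reverse) line lists.

    Assumes 4-line records that alternate forward/reverse (an assumption,
    not something confirmed for this dataset).
    """
    if len(fastq_lines) % 4 != 0:
        raise ValueError("line count is not a multiple of 4")
    # Group the lines into 4-line records
    records = [fastq_lines[i:i + 4] for i in range(0, len(fastq_lines), 4)]
    if len(records) % 2 != 0:
        raise ValueError("odd number of records; file is not cleanly paired")
    forward = [line for rec in records[0::2] for line in rec]
    reverse = [line for rec in records[1::2] for line in rec]
    return forward, reverse


# Made-up example: one read pair stored back to back
interleaved = [
    "@pair1/1", "ACGT", "+", "????",
    "@pair1/2", "TGCA", "+", "????",
]
forward, reverse = deinterleave(interleaved)
print(forward[0], reverse[0])
```

After splitting, it is worth confirming that both files contain the same number of reads and that the read names pair up in the same order before importing.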
For the next step, should I use ASVs or OTUs?
Given these rather impossible quality scores, I would recommend OTU clustering. ASV techniques generally require some amount of error information to be present (if all reads have the same quality score for all bases, then there's zero useful quality information).
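To make the "zero useful quality information" point concrete, here is a small sketch using the standard Phred definition, P = 10^(-Q/10). With every base at Q30, every position claims the same error probability, so nothing distinguishes a reliable base from an unreliable one. The 250 bp read length and the function names are illustrative assumptions, not values from this dataset.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def expected_errors(quality_scores):
    """Sum of per-base error probabilities for one read."""
    return sum(phred_to_error_probability(q) for q in quality_scores)


# Hypothetical 250 bp read where every base is reported as Q30
uniform_read = [30] * 250
per_base = {phred_to_error_probability(q) for q in uniform_read}

print(per_base)                       # one distinct error probability
print(expected_errors(uniform_read))  # depends only on read length
```

Because the per-base probability is identical everywhere, the expected error count is just read length times a constant, which is why the scores carry no information about where the errors actually are.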