The number of ASVs is much smaller than the number of OTUs

hao_wen · March 15, 2025, 6:01am

Hello everyone,
When I analyzed my data, I obtained approximately 5000 ASVs, but the sequencing company gave me about 15000 OTUs. I know the two methods work on different principles, but I would like to know whether this difference is normal. If it is not, I would like to know where the problem lies.
My sample was collected from soil, with primers 338F-806R.
I first used fastp to perform quality control and filtering on the data.

fastp -i ./raw_data/Y2_0_5.338F_806R.R1.raw.fastq.gz -W 4 -M 20 -o ./clean_data/Y2_0_5.338F_806R.R1.clean.fq -I ./raw_data/Y2_0_5.338F_806R.R2.raw.fastq.gz -O ./clean_data/Y2_0_5.338F_806R.R2.clean.fq -h ./clean_data/Y2_0_5.reads.fastp.html

This is the quality control result of one of the samples.
[file:///C:/Users/wenhao/Desktop/Y2_0_5.reads.fastp.html](file:///C:/Users/wenhao/Desktop/Y2_0_5.reads.fastp.html)
Then I removed the primers:

qiime cutadapt trim-paired --i-demultiplexed-sequences ./dePrimer/paired-end-demux.qza --p-cores 2 --p-front-f ACTCCTACGGGAGGCAGCAG --p-front-r GGACTACHVGGGTWTCTAAT --o-trimmed-sequences ./dePrimer/paired-end-trimmed-seqs.qza

These are the summary files from before and after primer removal:
paired-end-demux.qzv (312.1 KB)
paired-end-trimmed-seqs.qzv (318.5 KB)
Finally, I performed denoising:

qiime dada2 denoise-paired --i-demultiplexed-seqs ./dePrimer/paired-end-trimmed-seqs.qza --p-n-threads 2 --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 0 --p-trunc-len-r 0 --o-table ./dada2/feature_table.qza --o-representative-sequences ./dada2/rep-seqs.qza --o-denoising-stats ./dada2/stats.qza

Appreciated.

Nicholas_Bokulich · March 15, 2025, 8:05am

Hi @hao_wen ,
This is quite normal.

OTU clustering can significantly inflate diversity estimates when careful quality control is not done to remove noisy sequences:
https://www.nature.com/articles/nmeth.2276

When denoisers are compared with OTU clustering, denoisers typically yield fewer ASVs than OTUs (when additional QC steps are not applied), and the counts can differ by around an order of magnitude.

So your observation lines up with this pretty well. It is not a problem, just a characteristic of the methods. Long story short: if you want to use OTU clustering, more careful QC needs to be applied, because the "raw" OTUs will inflate diversity estimates.
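
If it helps to put the two numbers on a more equal footing, one option is to cluster the denoised ASVs into 97% OTUs and count the features again. The sketch below is only an illustration under assumptions: the input paths are taken from the denoising command in the original post, the `./otu97/` output paths and the `run` helper are hypothetical, and the 97% threshold is an assumed value (the company's actual pipeline and settings are not known). By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Sketch: cluster DADA2 ASVs into de novo 97% OTUs, then summarize the table.
# Inputs are the DADA2 outputs from the original post; ./otu97 is a hypothetical output directory.
TABLE=./dada2/feature_table.qza
SEQS=./dada2/rep-seqs.qza
OUT=./otu97

# Hypothetical helper: print the command unless DRY_RUN=0, in which case run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run mkdir -p "$OUT"

# De novo clustering of the ASVs at an assumed 97% identity threshold.
run qiime vsearch cluster-features-de-novo \
  --i-table "$TABLE" \
  --i-sequences "$SEQS" \
  --p-perc-identity 0.97 \
  --o-clustered-table "$OUT/table-97.qza" \
  --o-clustered-sequences "$OUT/rep-seqs-97.qza"

# Summarize the clustered table; the visualization reports the number of features.
run qiime feature-table summarize \
  --i-table "$OUT/table-97.qza" \
  --o-visualization "$OUT/table-97.qzv"
```

Because clustering only merges features, the clustered table cannot contain more features than the ASV table. This does not reproduce the company's OTU table; it only shows how many 97% OTUs remain once the reads have been denoised first.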

Good luck!
