"Number of features" differs depending on "--p-min-fold-parent-over-abundance" in DADA2

HANA · June 27, 2022, 1:51am

Hello. I'm Hana.

I am currently analyzing NGS data of the V3-V4 region, extracted from human feces.
After removing the primers, I have a few questions about the denoising step with DADA2.

After chimera removal, the remaining "percentage of input non-chimeric" was less than 50%.
So, after searching the forum, I set the DADA2 parameter --p-min-fold-parent-over-abundance to 8 and ran the analysis again.

Comparing the two runs, "percentage of input non-chimeric" did not change significantly, but "Number of features" changed significantly.

I don't understand this result.
Please let me know what I did wrong.

Also, which result files should be used for downstream analysis?

Thanks for the reply.

Here are the commands I used.

qiime dada2 denoise-paired \
--i-demultiplexed-seqs paired-end-demux-trimmed.qza \
--p-trim-left-f 0 \
--p-trim-left-r 0 \
--p-trunc-len-f 260 \
--p-trunc-len-r 220 \
--p-n-threads 4 \
--o-representative-sequences rep-seqs.qza \
--o-table table.qza \
--o-denoising-stats stats.qza

qiime dada2 denoise-paired \
--i-demultiplexed-seqs paired-end-demux-trimmed.qza \
--p-trim-left-f 0 \
--p-trim-left-r 0 \
--p-trunc-len-f 260 \
--p-trunc-len-r 220 \
--p-n-threads 4 \
--p-min-fold-parent-over-abundance 8 \
--o-representative-sequences rep-seqs_t.qza \
--o-table table_t.qza \
--o-denoising-stats stats_t.qza

rep-seqs.qzv (860.9 KB)
stats.qzv (1.2 MB)
table.qzv (575.5 KB)

rep-seqs_t.qzv (3.2 MB)
stats_t.qzv (1.2 MB)
table_t.qzv (1.3 MB)

Keegan-Evans · July 6, 2022, 5:31pm

@HANA,

--p-min-fold-parent-over-abundance controls how many times more abundant a "parent" sequence must be than a candidate sequence before that candidate can be considered a chimera of it. It sounds like you have a large number of distinct sequences, each accounting for few reads, that are flagged as chimeric when the required abundance of the parent sequences is low, and that are no longer flagged when you require the parents to be proportionally much more abundant. I think they may in fact be chimeras, so I would keep the output from the denoising run with the defaults (that is, the lower required relative abundance of parent sequences): you have plenty of data left, and it excludes as many chimeras as possible.
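
To make the parent-abundance rule a bit more concrete, here is a minimal Python sketch. It is not DADA2's implementation: the read counts and sequence names are made up, the strict "greater than" comparison is my reading of the DADA2 documentation, and the real check also aligns each candidate against its eligible parents, which this toy skips. It only shows which sequences could still be tested as two-parent chimeras at a given fold threshold.

```python
# Toy illustration only: the counts and names below are hypothetical, and the
# real DADA2 check additionally aligns each candidate against its parents.

def eligible_parents(abundances, candidate, min_fold):
    """Return the sequences abundant enough to count as parents of `candidate`."""
    threshold = min_fold * abundances[candidate]
    return sorted(
        name
        for name, count in abundances.items()
        if name != candidate and count > threshold  # assumed strict comparison
    )


def can_be_tested_as_bimera(abundances, candidate, min_fold):
    """A two-parent chimera needs at least two eligible parents."""
    return len(eligible_parents(abundances, candidate, min_fold)) >= 2


# Hypothetical read counts for four denoised sequences.
abundances = {"seqA": 1000, "seqB": 400, "seqC": 100, "seqD": 20}

for min_fold in (1, 8):
    testable = [
        name
        for name in abundances
        if can_be_tested_as_bimera(abundances, name, min_fold)
    ]
    print(f"min_fold={min_fold}: can still be flagged as chimeric: {testable}")
```

In this toy example, seqC and seqD can both be tested at a fold of 1, but only seqD at a fold of 8, because seqC no longer has two sufficiently abundant parents. A sequence that cannot be tested cannot be removed, so it stays in the table as a feature. That is consistent with the feature count going up while the percentage of reads retained barely moves, if the affected sequences each carry few reads.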

HANA · July 8, 2022, 12:49am

Thank you for your kind reply.
I will proceed as you advised.
