Low sequence counts after running Dada2

cgregg1227 · June 30, 2020, 8:42pm

Hi all,

I am getting a low total sequence count after running my samples through dada2, and I am unsure whether it is due to low-quality samples or to the parameters I am setting for dada2. Here is the command I used:

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs /mnt/nfs/proj/bcp/Microbiome/16S/Data/BCPE/paired-end-demux.qza \
  --p-trunc-len-f 290 \
  --p-trunc-len-r 270 \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --o-table ./table-dada2b.qza \
  --o-representative-sequences ./rep-seqsb.qza \
  --o-denoising-stats ./dada2_statsb.qza

I would greatly appreciate it if someone could give me some guidance on which parameters I should set. I've also attached the demux.qzv.

Thank you,
Collin

paired-end.demux.qzv (300.3 KB)

ChrisKeefe · June 30, 2020, 10:29pm

Hi @cgregg1227!
Do you suspect a quality issue, or a parameterization problem? Why? What can you infer from the "interactive quality plot" tab in the paired-end.demux.qzv you shared? Or from your DADA2 denoising-stats?

There are a ton of great posts on this forum on this topic. If you don't feel confident answering these questions yet, do a quick search, and then come back to this topic with more specific questions and we can tackle em together.

All the best,
Chris

cgregg1227 · June 30, 2020, 10:45pm

Hi @ChrisKeefe,

I suspect a parameterization problem, because my samples looked pretty good based on their quality scores. Additionally, someone else has run these samples through another pipeline, and their sampling depth was much higher than what I have been getting. I think my issue is that I am trimming my sequences incorrectly, and so far I have not been able to figure out how to trim them correctly.

Thank you,
Collin

ChrisKeefe · June 30, 2020, 11:39pm

Good insight, @cgregg1227. I agree - your quality scores look great to me. Do you understand why you're trimming your sequences in the first place? Can you infer why you're losing sequences from your DADA2 denoising stats?
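One way to reason about paired-end truncation is a quick overlap check. This is only a sketch: the amplicon length below is a hypothetical placeholder (it is not stated anywhere in this thread), and the 12 nt figure is DADA2's default minimum overlap for merging. Separately, keep in mind that reads shorter than the truncation length are discarded during filtering.

```shell
trunc_f=290        # --p-trunc-len-f from the command earlier in this thread
trunc_r=270        # --p-trunc-len-r from the command earlier in this thread
amplicon_len=460   # ASSUMPTION: hypothetical amplicon length, substitute your own
min_overlap=12     # DADA2 default minimum overlap required to merge a read pair

# Bases shared by the forward and reverse reads once both are truncated
overlap=$(( trunc_f + trunc_r - amplicon_len ))

if [ "$overlap" -ge "$min_overlap" ]; then
  echo "overlap after truncation: ${overlap} nt (enough to merge)"
else
  echo "overlap after truncation: ${overlap} nt (too short to merge)"
fi
```

If the overlap comes out comfortably above the minimum, merging is unlikely to be where reads are lost, and the filtering step deserves a closer look.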

cgregg1227 · July 1, 2020, 12:11am

@ChrisKeefe I'm trying to trim off the lower-quality bases so that they don't cause dada2 to drop the sample. I thought I was doing that correctly by truncating the forward read at 290 and the reverse read at 270, but when I did so I ended up losing a lot of my samples. Also, I'm not sure how to interpret the Dada2 denoising stats. I'll try to do some digging on that, which may help me understand what is wrong with my parameters. Thank you for the help!

Best,
Collin

ChrisKeefe · July 1, 2020, 12:39am

Sounds like you've got the right idea!
You'll find good examples of how to interpret denoising stats here on the forum, but the CliffsNotes version is this: read from left to right, and each column represents the sequences left after some operation is performed. Finding the steps that drop a lot of sequences will let you diagnose why you're losing seqs.
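As a rough sketch of that left-to-right reading, the snippet below computes the share of reads that survive each step relative to the step before it. It assumes the stats have been exported to a TSV with the usual columns (input, filtered, denoised, merged, non-chimeric). The function name `step_retention` and every count in the sample file are made up for illustration; they are not from this thread.

```shell
# Hypothetical stats.tsv with invented counts, laid out like an exported DADA2 stats table
printf 'sample-id\tinput\tfiltered\tpercentage of input passed filter\tdenoised\tmerged\tpercentage of input merged\tnon-chimeric\tpercentage of input non-chimeric\n' > stats.tsv
printf '#q2:types\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\n' >> stats.tsv
printf 'sampleA\t10000\t4000\t40\t3800\t3500\t35\t3400\t34\n' >> stats.tsv
printf 'sampleB\t20000\t18000\t90\t17500\t9000\t45\t8800\t44\n' >> stats.tsv

# Print, per sample, the percentage of reads kept by each step relative to the previous step
step_retention() {
  awk -F'\t' '
    # Guard against division by zero for samples that lost every read
    function pct(part, whole) { return whole > 0 ? 100 * part / whole : 0 }
    # Skip the header row and the #q2:types directive row
    NR == 1 || $1 ~ /^#/ { next }
    # Columns: 2=input 3=filtered 5=denoised 6=merged 8=non-chimeric
    {
      printf "%s filter=%.1f%% denoise=%.1f%% merge=%.1f%% chimera=%.1f%%\n",
        $1, pct($3, $2), pct($5, $3), pct($6, $5), pct($8, $6)
    }
  ' "$1"
}

step_retention stats.tsv
```

With these invented numbers, sampleA shows `filter=40.0%`, so most of its loss happens at filtering, while sampleB shows `merge=51.4%`, so most of its loss happens at merging. Different steps point to different parameter fixes.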

cgregg1227 · July 1, 2020, 3:49am

@ChrisKeefe I haven't been able to find many forum pages explaining what each column represents in terms of the operation performed. Do you have any recommended pages to look at that link the step to the column within the denoising stats .qzv?

Thank you,
Collin

ChrisKeefe · July 1, 2020, 3:29pm

I don't have a great one-stop reference for you, @cgregg1227, but the DADA2 tutorial will do the job well enough. Search it for denoisedF and you'll find a table showing counts after each step in the process. Above that table, you'll find section headers that describe the filtering, merging, and chimera removal steps in some detail.
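If it helps, here is a minimal sketch for getting your own per-step counts out of the stats artifact. The input path is the one from the command earlier in this thread; the output names are arbitrary choices, and the guard simply skips everything when qiime or the file is not available.

```shell
# Stats artifact written by the denoise-paired command earlier in this thread
stats_qza=./dada2_statsb.qza

if command -v qiime >/dev/null 2>&1 && [ -f "$stats_qza" ]; then
  # Render the stats as a sortable table that can be opened at view.qiime2.org
  qiime metadata tabulate \
    --m-input-file "$stats_qza" \
    --o-visualization ./dada2_statsb.qzv

  # Or unpack the underlying TSV into a directory (directory name is an arbitrary choice)
  qiime tools export \
    --input-path "$stats_qza" \
    --output-path ./dada2_stats_export
else
  echo "qiime or $stats_qza not found; skipping"
fi
```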

In what step(s) do you see the most sequence attrition? If you had to guess, what does that mean about your data?

system · August 1, 2020, 9:29pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.