DADA2 truncation

Hello, I wanted some clarification in truncation for DADA2.

--p-trim-left 0 \
--p-trunc-len 0

If I use the above command, does this mean I am essentially using the whole sequence for analysis? Does 0 for both ends mean I won't be truncating anything? Thank you!

Hi @ek_97, welcome back!

That's a great question! I'll go over what each of these parameters does below, which should provide some clarification on your question above.

--p-trim-left: Position at which read sequences (forward or reverse) should be trimmed due to low quality. This trims the 5’ end of the input sequences (i.e. the left side), which will be the bases that were sequenced in the first cycles.

--p-trunc-len: Position at which read sequences (forward or reverse) should be truncated due to a decrease in quality. This truncates the 3’ end of the input sequences (i.e. the right side), which will be the bases that were sequenced in the last cycles. Reads that are shorter than this value will be discarded. After this parameter is applied there must still be at least a 12 nucleotide overlap between the forward and reverse reads. If 0 is provided, no truncation or length filtering will be performed.

If you set --p-trim-left to 0, no reads will be trimmed from the left side of your sequences. However, if you set --p-trunc-len to 0, you will truncate all reads from your sequences (since this is truncating everything greater than the position specified - which in this case would be 0).

If you'd like to use the full length of your sequences, I'd recommend just leaving out these parameters altogether - your sequences will not be trimmed/truncated unless specified by the above parameters.
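On the 12-nucleotide overlap mentioned in the --p-trunc-len description: here is a rough way to check whether a pair of truncation lengths still leaves enough overlap for merging. This is only a back-of-the-envelope sketch, not QIIME 2 or DADA2 code. It assumes you know the amplicon length after primer removal, that the forward and reverse reads start at opposite ends of the amplicon, and that nothing is trimmed from the left. The function names and the 253 bp amplicon are hypothetical.

```python
MIN_OVERLAP = 12  # minimum overlap quoted in the --p-trunc-len description


def overlap_after_truncation(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> int:
    """Bases shared by the forward and reverse reads after truncation.

    Assumes both reads start at opposite ends of the amplicon, so whatever
    the two truncated reads cover beyond the amplicon length is overlap.
    A negative result means there is a gap between the reads.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(amplicon_len: int, trunc_len_f: int, trunc_len_r: int) -> bool:
    """True if the truncated reads still overlap by at least MIN_OVERLAP bases."""
    return overlap_after_truncation(amplicon_len, trunc_len_f, trunc_len_r) >= MIN_OVERLAP


amplicon_len = 253  # hypothetical amplicon length, for illustration only
for trunc_f, trunc_r in [(150, 150), (150, 110)]:
    overlap = overlap_after_truncation(amplicon_len, trunc_f, trunc_r)
    print(trunc_f, trunc_r, overlap, can_merge(amplicon_len, trunc_f, trunc_r))
```

With these made-up numbers, truncating both reads at 150 leaves 47 bases of overlap, while truncating the reverse read at 110 leaves only 7, which is below the 12 required.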

Hope this helps!

Cheers,
Liz

From denoise-paired: Denoise and dereplicate paired-end sequences — QIIME 2 2021.4.0 documentation

If 0 is provided, no truncation or length filtering will be performed

Have any primers already been removed from the sequence? Usually those are what you want to trim off.

Hi, thank you for your responses! I have a few more questions I wanted to clarify with you.

--p-trunc-len: Position at which read sequences (forward or reverse) should be truncated due to a decrease in quality. This truncates the 3’ end of the input sequences (i.e. the right side), which will be the bases that were sequenced in the last cycles. Reads that are shorter than this value will be discarded.

Wouldn't the --p-trunc-len discard all the sequences greater than this value since this command would be truncating the right side? So if I set the value to 150, wouldn't that get rid of all the bases above the 150 position, and not below?

In addition, what happens if I set the value of --p-trunc-len to higher than the whole sequence length? For example, if my whole sequence is around 150 bases, but I don't know the exact length, so what if I set the value to 151? Would it not truncate anything if my base was shorter than 151?

Thank you again!

This part:

Reads that are shorter than this value will be discarded.

is saying that if you set the truncation length longer than a sequence, that sequence will be discarded.

Once the too-short sequences are gone, the rest are truncated from the 3' end to the specified length.

Got it, thanks! So for example if my sequences are 150 bases long but I set the value for --p-trunc-len as 151, then nothing will be truncated and my sequence will remain as 150 bases?

Great follow-up questions @ek_97! Happy to address these below.

You're exactly correct, good catch! I'll make sure we get that documentation updated.

That's correct - your sequences will be left untouched if you specify a length in --p-trunc-len that doesn't have any reads greater than the given value.

For example, if my whole sequence is around 150 bases, but I don't know the exact length, so what if I set the value to 151?

Calling out this specific question, I'd recommend using view.qiime2.org with the associated .qzv from your DADA2 analysis to determine what the length of your reads is. This information can be found under the interactive quality plot tab, which may also be useful for you when determining whether or not you need to trim/truncate your sequences due to a drop in quality.

I go into further detail on this tool, and some more specifics of the --p-trim-left and --p-trunc-len parameters in this forum post, which you may find useful as well.

Cheers,
Liz

Hi Liz,

This is perfect. Thank you for the help!

Just to clarify,
If you have reads that are exactly 151 bp long,
truncating at 152 (or longer) will discard all your reads and cause QIIME 2 to exit with an error.
Truncating at 151 or at 0 is equivalent: neither performs any truncation.
If you have mixed-length reads, you need to set it to 0 to avoid truncating.
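To make those cases concrete, here is a small toy model of the truncation rule as described in this thread. It is not the actual DADA2 or QIIME 2 implementation: `apply_trunc_len` is a hypothetical helper, reads are plain strings, and quality scores and --p-trim-left are ignored.

```python
from typing import List


def apply_trunc_len(reads: List[str], trunc_len: int) -> List[str]:
    """Toy model of --p-trunc-len as described in this thread.

    trunc_len == 0: no truncation and no length filtering.
    trunc_len > 0: reads shorter than trunc_len are discarded, and the
    remaining reads are cut at the 3' end down to trunc_len bases.
    """
    if trunc_len < 0:
        raise ValueError("trunc_len must be >= 0")
    if trunc_len == 0:
        return list(reads)
    kept = [read for read in reads if len(read) >= trunc_len]
    return [read[:trunc_len] for read in kept]


reads_151 = ["A" * 151] * 3          # every read is exactly 151 bp
mixed = ["A" * 140, "A" * 151]       # mixed-length reads

# 152 is longer than every read, so nothing survives
# (per the post above, QIIME 2 would exit with an error here).
print(len(apply_trunc_len(reads_151, 152)))

# 151 and 0 give the same result when all reads are exactly 151 bp.
print(apply_trunc_len(reads_151, 151) == apply_trunc_len(reads_151, 0))

# With mixed lengths, 151 drops the 140 bp read; 0 keeps both untouched.
print([len(r) for r in apply_trunc_len(mixed, 151)])
print([len(r) for r in apply_trunc_len(mixed, 0)])
```

The point of the sketch is the order of operations: filtering by length happens first, and only the reads that survive are truncated.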
