Dada2 parameters

Gal · June 4, 2018, 12:25pm

Hello,
I imported my data (see attached paired-end-demux.qzv (286.8 KB)) and now I'm trying to choose the best parameters for DADA2 denoising. I used the parameters below and got only 49 features. These samples are low in diversity, but the analysis I received from the provider shows much higher diversity than this. Can you please help?

Thanks!!

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 220 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats stats[DADA2Stats]

Mehrbod_Estaki · June 4, 2018, 5:45pm

Hi @Gal,

Can you confirm that your primers/adapters have indeed been removed from your reads before running DADA2? I ask because your visualization shows an oddly clean 5' end and extends beyond the 300 bp limit of typical Illumina runs.
If these have not been removed beforehand, DADA2 will behave oddly, much as you described.
Also, what is the expected overlap of your primer set? If sufficient overlap doesn't exist, many reads will fail to merge and be discarded, leading to falsely low diversity.
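
One rough way to check whether primers are still attached is to tally how the reads begin: if nearly every read starts with the same one or two strings, those are probably non-biological sequence. The sketch below is only an illustration and assumes you have a plain FASTQ stream for one sample (for example, from an exported copy of the demultiplexed data). The function name top_prefixes and the default prefix length of 20 are made up for this example.

```shell
# Tally the most common first-N bases across reads in a FASTQ stream.
# Reads FASTQ on stdin; N defaults to 20 (an arbitrary choice for this sketch).
# Prints up to five "count prefix" lines, most frequent first.
top_prefixes() {
  awk -v n="${1:-20}" 'NR % 4 == 2 { print substr($0, 1, n) }' |
    sort | uniq -c | sort -rn | head -n 5
}

# Example call on a gzipped FASTQ file (the file name is a placeholder):
# gzip -dc sample_R1.fastq.gz | top_prefixes 20
```

If one or two prefixes account for almost all reads, compare them against the primer sequences reported by your sequencing provider.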

Gal · June 5, 2018, 12:51pm

Hi Mehrbod
Thanks for your reply. The primers were not removed; I tried removing them using trim-f and it didn't change much. I should have sufficient overlap: the primer set overlap is 135 bp.

time qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux.qza \
  --p-trunc-len-f 280 \
  --p-trunc-len-r 200 \
  --p-trim-left-f 60 \
  --p-trim-left-r 40 \
  --o-representative-sequences rep-seqs-dada2.qza \
  --o-table table-dada2.qza \
  --o-denoising-stats stats[DADA2Stats]
table.qzv (306.1 KB)
paired-end-demux.qzv (288.7 KB)

Mehrbod_Estaki · June 5, 2018, 5:32pm

@Gal,

Thanks for the update. I agree that the diversity seems oddly low. A few additional thoughts/questions. You mentioned that you expect low diversity in the samples but the facility has shown higher diversity. Could you elaborate on those points? First, what are the samples, and what exactly is the target: 16S, 18S, ITS, etc.? What is the expected target length? What method does your facility use for their analyses? Is this from a 2x300 or 2x250 run?

Are you sure the trimming of 60 and 40 is enough to remove all of the non-target sequence? I ask because the combination of primers, adapters, and barcodes in many designs adds up to over 60 nt, which means that even with your trimming parameters you might still have some left over. Something to double check.
If this is from a 2x250 bp run, then I'm guessing your overlap is insufficient as well. But let's assume it is from a 2x300 bp run: depending on your target and the natural variation in the target, you might still have insufficient overlap. You are truncating 120 bp from your reads (20 bp from the forward and 100 bp from the reverse) and have an overlap of 135 bp, so only about 15 bp is left. The overlap requirement for DADA2 is generally 20 nt plus the natural variation in your target. So you might very well be missing the overlap needed to merge your reads, and thus discarding a lot of them. One easy way to confirm this is to run just your forward reads (which look great, by the way) and compare the output to what you have with the paired-end results. This circumvents the potential merging problem altogether.
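
The overlap arithmetic above can be sketched as a small shell function. This is only a back-of-the-envelope check under the same assumptions as the paragraph: a 2x300 bp run and the 135 bp full-length overlap quoted earlier in the thread. The function name remaining_overlap is made up for this example.

```shell
# remaining_overlap READ_LEN FULL_OVERLAP TRUNC_F TRUNC_R
# Overlap left after truncation = full-length overlap minus the bases
# cut from the 3' end of each read by --p-trunc-len-f / --p-trunc-len-r.
remaining_overlap() {
  echo $(( $2 - ($1 - $3) - ($1 - $4) ))
}

# Second run in this thread (trunc 280/200), assuming 2x300 bp reads:
remaining_overlap 300 135 280 200   # prints 15

# First run in this thread (trunc 280/220), same assumption:
remaining_overlap 300 135 280 220   # prints 35
```

A result below roughly 20 nt (plus whatever length variation the target has) suggests that merging is likely to fail for many reads.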
Give these a try and we'll go from there.
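
For the forward-reads-only comparison, one option is a minimal sketch like the following, which wraps qiime dada2 denoise-single in a function. It assumes that denoise-single accepts the paired-end artifact and uses only the forward reads (worth confirming against the documentation for your QIIME 2 release). The function name and the output file names are hypothetical, and the example values simply reuse the forward-read settings from earlier in the thread.

```shell
# denoise_forward_only IN_QZA TRIM_LEFT TRUNC_LEN
# Runs DADA2 on forward reads only, so no read merging is involved.
# Output file names below are placeholders chosen for this sketch.
denoise_forward_only() {
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs "$1" \
    --p-trim-left "$2" \
    --p-trunc-len "$3" \
    --o-representative-sequences rep-seqs-dada2-fwd.qza \
    --o-table table-dada2-fwd.qza \
    --o-denoising-stats stats-dada2-fwd.qza
}

# Example call, reusing the forward-read values from the thread:
# denoise_forward_only paired-end-demux.qza 60 280
```

If the forward-only run yields many more features than the paired-end run, failed merging is the likely culprit.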

Gal · June 7, 2018, 2:05pm

Hi Mehrbod

Thanks heaps for your reply! I tried processing only the forward reads using Deblur, which increased the number of features by a lot!

I then continued processing based on the 'moving pictures' tutorial and all went well, except that I could not assign taxonomy. I know my classifier works OK: I tested it with the tutorial rep-seqs. However, when I tested it with my sequences (attached), it only recognises them up to the kingdom level.

Could it be because the primers were not removed? And if so, how do I remove them?

Thanks again
rep-seqs.qzv (624.7 KB)

time qiime deblur denoise-16S \
  --i-demultiplexed-seqs demux-filtered.qza \
  --p-trim-length 280 \
  --o-representative-sequences rep-seqs-deblur.qza \
  --o-table table-deblur.qza \
  --p-sample-stats \
  --o-stats deblur-stats.qza

Mehrbod_Estaki · June 7, 2018, 8:57pm

Hi @Gal,

When I look at your rep-seqs.qzv, it looks to me as if your primers are still intact. See how the beginnings of all your reads are one of only two different combinations; a quick Google search showed that they are indeed primers. You'll want to find out the exact sequences and lengths of the primers used, and in fact of any other non-biological sequences (barcodes, spacers, adapters, etc.) that may still be in your reads, and remove all of them. You can use the cutadapt tool to do this. This would certainly explain why you are not getting any assignments past Kingdom. Give that a try and we'll go from there.
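
As a rough illustration of the cutadapt step, the sketch below wraps the q2-cutadapt plugin's trim-paired action in a function. The primer sequences are not given anywhere in this thread, so they are passed in as arguments and must be replaced with the actual sequences from your protocol. The function name and the output file name in the example call are hypothetical.

```shell
# trim_primers IN_QZA FWD_PRIMER REV_PRIMER OUT_QZA
# Removes a 5' primer from the forward reads (--p-front-f) and from the
# reverse reads (--p-front-r) of a paired-end demultiplexed artifact.
trim_primers() {
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences "$1" \
    --p-front-f "$2" \
    --p-front-r "$3" \
    --o-trimmed-sequences "$4"
}

# Example call; FWD_PRIMER and REV_PRIMER are placeholders that you must
# set to your own primer sequences before running:
# trim_primers paired-end-demux.qza "$FWD_PRIMER" "$REV_PRIMER" trimmed-demux.qza
```

The trimmed artifact can then be summarized and denoised in place of the original demultiplexed data.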

Gal · June 8, 2018, 10:23pm

Thanks!!! it worked

system · July 10, 2018, 4:23am

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.