I ran DADA2 denoising on single-end cDNA sequencing data covering the V4 region of E. coli. I noticed that after DADA2 all the reads were filtered out in three samples; please see the attached visualizations.
Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed-2.qzv (304.6 KB)
Ionexpress_81-83-84-57-59-60_single-end-dada2-rep-seqs.qzv (204.7 KB)
Ionexpress_81-83-84-57-59-60_single-end-rep-seqs-stats-dada2.qzv (1.2 MB)
Ionexpress_81-83-84-57-59-60_single-end-rep-seqs-table-dada2.qzv (316.4 KB)
I would really appreciate your comments on what this signifies. Does it mean that none of the reads passed the filter and all of them were noisy?
Could you explain what you did to the reads prior to importing them into QIIME 2? Also, what sequencing platform was used here? You have reads 400+ bp long, which is not typical for Illumina data (unless you merged reads beforehand), yet you ran dada2 denoise-single, which is more appropriate for Illumina data. On that note, you selected a truncation length of 427, though most of your reads appear to be shorter than that; since DADA2 discards any read shorter than the truncation length, most of your reads are thrown out right off the bat and never reach the denoising step. I should also note that the V4 region is only around 250 bp long to begin with, so something happening prior to import may be causing this problem.
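A quick way to check this outside of QIIME 2 is to look at the read-length distribution directly in a fastq file. This is a generic sanity check, not a command from the thread; the tiny `sample.fastq` written here is only a stand-in for one of your per-sample files.

```shell
# Build a two-read toy fastq (stand-in for a real per-sample file).
printf '@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\nIIII\n' > sample.fastq

# Every 2nd line of each 4-line fastq record is the sequence; print its
# length, then tabulate. Output: one line per distinct length (count, length).
awk 'NR % 4 == 2 { print length($0) }' sample.fastq | sort -n | uniq -c
```

If most counts sit well below your chosen `--p-trunc-len`, DADA2 will discard most of the data.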
One additional note: it looks as though you used cutadapt to remove two different primers in separate passes. This may well have been intentional, but I wanted to point it out in case it wasn't.
Thanks @Mehrbod_Estaki. The platform is Ion Torrent. I imported the raw data and did not do anything to the reads prior to importing. After importing, I trimmed the primers from the 5' end of the reads like this:
qiime cutadapt trim-single --i-demultiplexed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux.qza --p-front ^GGACTACTGGGGTATCTAAT --p-times 2 --p-cores 10 --p-error-rate 0 --p-match-read-wildcards --p-match-adapter-wildcards --o-trimmed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed-1.qza
qiime cutadapt trim-single --i-demultiplexed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed-1.qza --p-front ^GTGCCAGCCGCCGCGGTAA --p-times 2 --p-cores 10 --p-error-rate 0 --p-match-read-wildcards --p-match-adapter-wildcards --o-trimmed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed-2.qza
I checked the primers and they are at the 5' end (the beginning) of the reads, so I used the two commands above to trim them one at a time; the output of the first cutadapt run was used as input to the second.
Please let me know your comments.
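For what it's worth, the two passes above could likely be collapsed into one call, since `--p-front` can be supplied more than once in recent q2-cutadapt releases. This is only a sketch mirroring the thread's artifact names; the merged output filename is an assumption, and you should confirm repeated-option support with `qiime cutadapt trim-single --help` before relying on it.

```shell
# Sketch: remove both anchored 5' primers in a single cutadapt pass.
# Primer sequences are copied from the thread; output name is assumed.
PRIMER_1="^GGACTACTGGGGTATCTAAT"
PRIMER_2="^GTGCCAGCCGCCGCGGTAA"

# Guarded so the sketch is a no-op where QIIME 2 is not installed.
if command -v qiime >/dev/null 2>&1; then
  qiime cutadapt trim-single \
    --i-demultiplexed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux.qza \
    --p-front "$PRIMER_1" \
    --p-front "$PRIMER_2" \
    --p-times 2 \
    --p-cores 10 \
    --p-error-rate 0 \
    --p-match-read-wildcards \
    --p-match-adapter-wildcards \
    --o-trimmed-sequences Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed.qza
fi
```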
And yes, I see from the quality score plot that read lengths go to 400 and above; however, most reads are around 275 bp, so I have started another DADA2 run truncating at that length.
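The re-run described above might look something like the following. This is a sketch, not a verified command line: the truncation length of 275 comes from the quality plot, the input artifact name mirrors the thread, and the output filenames are assumptions.

```shell
# Sketch of the re-run: truncating at 275 nt keeps the bulk of the reads,
# whereas the earlier --p-trunc-len 427 discarded nearly everything
# (DADA2 drops reads shorter than the truncation length).
TRUNC_LEN=275

# Guarded so the sketch is a no-op where QIIME 2 is not installed.
if command -v qiime >/dev/null 2>&1; then
  qiime dada2 denoise-single \
    --i-demultiplexed-seqs Ionexpress_81-83-84-57-59-60_single-end-demux-trimmed-2.qza \
    --p-trim-left 0 \
    --p-trunc-len "$TRUNC_LEN" \
    --o-representative-sequences rep-seqs-dada2-275.qza \
    --o-table table-dada2-275.qza \
    --o-denoising-stats stats-dada2-275.qza
fi
```

`--p-trim-left 0` is used here because the primers were already removed with cutadapt; if primers were still present, a nonzero left trim would be needed instead.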
Great! Just wanted to make sure the double primer trimming was intentional. I'm learning a little about Ion Torrent data these days; it sounds like the longer reads you are seeing are somewhat expected, and I like your way of dealing with them by simply truncating at a reasonable length. Let us know how this works out. Also, have a look here regarding special considerations for Ion Torrent data with DADA2.