Dear all,
I am using RStudio to analyze my samples. I exported the demux-paired-end.qza data to R and followed this tutorial:
https://benjjneb.github.io/dada2/tutorial.html
I ran the following commands on my data.
Filtering and trimming:
out <- filterAndTrim(fnFs, filtFs, fnRs, filtRs, truncLen=c(200, 200),
                     maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE,
                     compress=TRUE, multithread=FALSE) # On Windows set multithread=FALSE
Merging paired reads:
mergers <- mergePairs(dadaFs, filtFs, dadaRs, filtRs, verbose = TRUE)
The output here is:
  abundance forward reverse nmatch nmismatch nindel prefer accept
1       284      18      13    106         0      0      1   TRUE
2       282      21      30    106         0      0      1   TRUE
3       268      12      54    106         0      0      1   TRUE
4       253      16      12    106         0      0      1   TRUE
5       228      19      19    106         0      0      1   TRUE
6       222      17      14    106         0      0      1   TRUE
# Construct sequence table
seqtab <- makeSequenceTable(mergers)
dim(seqtab)
Output: 13 9641
Inspecting the distribution of sequence lengths:
table(nchar(getSequences(seqtab)))
Output:
 292  293  294  295  296  297  301  305
   1    3 1865 4119  305 3345    1    2
seqtab.nochim <- removeBimeraDenovo(seqtab, method="consensus", multithread=FALSE, verbose=TRUE)
dim(seqtab.nochim)
sum(seqtab.nochim)/sum(seqtab)
Output:
13 1431
0.5271426
getN <- function(x) sum(getUniques(x))
track <- cbind(out, sapply(dadaFs, getN), sapply(dadaRs, getN), sapply(mergers, getN), rowSums(seqtab.nochim))
colnames(track) <- c("input", "filtered", "denoisedF", "denoisedR", "merged", "nonchim")
rownames(track) <- sample.names
head(track)
Output:
                                              input filtered denoisedF denoisedR merged nonchim
Bac18-041119-F2-R22_S233_L001_R1_001.fastq.gz 25548    24682     24185     23374  21075   13273
Bac18-041119-F2-R23_S249_L001_R1_001.fastq.gz 28864    28044     27416     26477  22356   18298
Bac18-041119-F2-R24_S265_L001_R1_001.fastq.gz  3323     3167      3085      3024   2440    2300
Bac18-041119-F3-R13_S74_L001_R1_001.fastq.gz  20947    19977     19558     19489  18377   10665
Bac18-041119-F3-R14_S90_L001_R1_001.fastq.gz  29165    28268     27616     27432  25183   12378
Bac18-041119-F3-R16_S122_L001_R1_001.fastq.gz 32142    31383     31061     30701  29621   15933
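Most of the read loss happens at the chimera-removal step: sum(seqtab.nochim)/sum(seqtab) is 0.527, so roughly 47% of the merged reads are removed as chimeras, while filtering, denoising, and merging each lose comparatively few reads. What steps should I take to avoid losing so many reads at chimera removal?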
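To put numbers on this per sample, here is a minimal base R sketch that recomputes the fraction of merged reads surviving chimera removal, using only the values from the six rows shown in the tracking table. The data frame name `track6`, the shortened sample labels, and the `kept` column are my own naming for illustration and are not objects from the DADA2 workflow.

```r
# Values copied from the six rows of the tracking table shown above.
# Sample labels are shortened for readability (illustrative names only).
track6 <- data.frame(
  merged  = c(21075, 22356, 2440, 18377, 25183, 29621),
  nonchim = c(13273, 18298, 2300, 10665, 12378, 15933),
  row.names = c("F2-R22", "F2-R23", "F2-R24", "F3-R13", "F3-R14", "F3-R16")
)

# Fraction of merged reads that survive chimera removal, per sample
track6$kept <- round(track6$nonchim / track6$merged, 3)
print(track6)

# Fraction of unique sequences that survive: 1431 of 9641 columns of seqtab
asv_kept <- 1431 / 9641
print(round(asv_kept, 3))
```

On these six samples the retained fraction varies widely between samples, and only about 15% of the unique sequences remain after chimera removal even though they carry about 53% of the reads.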
Can someone please help?
Thank you so much!