I am new to QIIME 2 and am working through some of my own samples for practice. I have 27 samples from tree root nodules that I want to analyze for alpha and beta diversity, and I'd also like to try some of the other analysis tools QIIME 2 offers. I have imported the sequences into QIIME 2, but I cannot seem to find the right dada2 parameters to get past the denoising step. I have read many posts on the QIIME 2 forum about demux.qzv results and denoising parameters, but I still don't understand how to apply them to my data.
The demux.qzv file is:
demux.qzv (309.2 KB)
You'll see that several of the samples have fewer than 1,000 sequences, with the lowest at 48 (I don't know why this is so low, since I gave the sequencing center double the required DNA concentration).
My understanding is that the adapters were already trimmed from these sequences, which may explain why the quality plot in the demux file looks the way it does?
I have tried the following dada2 parameters:
qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trim-left-f 20 --p-trim-left-r 20 --p-trunc-len-f 250 --p-trunc-len-r 249 --o-table dada5table.qza --o-representative-sequences dada5rep-seqs.qza --o-denoising-stats dada5denoising-stats.qza
dada5table.qzv (452.8 KB)
qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trunc-len-f 251 --p-trunc-len-r 251 --o-table dada6table.qza --o-representative-sequences dada6rep-seqs.qza --o-denoising-stats dada6denoising-stats.qza
dada6table.qzv (431.7 KB)
qiime dada2 denoise-paired --i-demultiplexed-seqs demux.qza --p-trim-left-f 10 --p-trim-left-r 10 --p-trunc-len-f 170 --p-trunc-len-r 190 --o-table dada4table.qza --o-representative-sequences dada4rep-seqs.qza --o-denoising-stats dada4denoising-stats.qza
dada4table.qzv (402.7 KB)
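To make the three runs above easier to compare, here is a rough sketch of how many bases of each read are left after trimming and truncation. It assumes the retained length is simply the trunc-len value minus the trim-left value; the `retained` and `summary` helpers are made up for illustration and are not part of QIIME 2.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bases kept per read,
# assuming kept = trunc_len - trim_left.
retained() {
  local trunc_len=$1 trim_left=$2
  echo $(( trunc_len - trim_left ))
}

# Columns: run, trunc-len-f, trim-left-f, trunc-len-r, trim-left-r
# (values copied from the three commands above; trim-left is 0 when not set).
summary() {
  local name tf lf tr lr f r
  while read -r name tf lf tr lr; do
    f=$(retained "$tf" "$lf")
    r=$(retained "$tr" "$lr")
    echo "$name forward=$f reverse=$r combined=$(( f + r ))"
  done <<'EOF'
dada5 250 20 249 20
dada6 251 0 251 0
dada4 170 10 190 10
EOF
}

summary
```

The combined length is what the forward and reverse reads have available for merging, so the dada4 settings (340 bases combined) leave the least room for overlap of the three runs. Whether that is enough depends on the length of my amplicon, which I am not sure of.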
As you can tell, the shortest truncation lengths (dada4table) remove the most features, yet no trimming or truncation (dada6table) still gives about 447 fewer features than trimming 20 bp with little truncation (dada5table). I don't understand these discrepancies, nor whether my data is even usable, since an acceptable sampling depth seems unattainable even with the best dada2 results.
Any help or direction is greatly appreciated.