I got only a few merged sequences. Would it be possible to adjust the command to get more? The amplicon region is V4V5, and I ran 2x250 bp on the MiSeq platform.
According to trimmed_seqs_bacteria.qza, the quality of the reverse reads is not so good. Should I try just the forward reads with the qiime dada2 denoise-single command? Would that have any influence on the results?
The quality of your reads is actually quite good. Those big dips at the tails of your reads probably represent reads that didn't contain the primer sequences when you used cutadapt to remove them. Those few hundred reads are therefore longer and have low-quality tails.
Try running again with forward truncate of 231 and reverse truncate of 218.
This is the maximum length of reads you can afford. If you get unsatisfactory merging from this, then you are right to drop your reverse reads and use the forward reads only. That will certainly increase your read count significantly. However, I should mention that your current numbers are not too bad to begin with. You can probably just stick with whatever you get out of this new run.
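To see why these truncation lengths leave little room: for a pair to merge, the truncated forward and reverse reads together have to span the amplicon plus a minimum overlap (12 nt is DADA2's default). Here is a rough sketch of that arithmetic. The amplicon length of 400 nt is a made-up placeholder, not a measured value for this V4V5 primer pair, so substitute the length of your own amplicon after primer removal.

```python
# Sketch: how much overlap is left for merging after truncation?
DADA2_MIN_OVERLAP = 12  # DADA2's default minimum overlap when merging pairs


def merge_overlap(trunc_len_f: int, trunc_len_r: int, amplicon_len: int) -> int:
    """Overlap (nt) between the truncated forward and reverse reads."""
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f: int, trunc_len_r: int, amplicon_len: int,
              min_overlap: int = DADA2_MIN_OVERLAP) -> bool:
    """True if the truncated reads still overlap by at least min_overlap nt."""
    return merge_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


# ASSUMPTION: hypothetical amplicon length after primer removal.
AMPLICON_LEN = 400

for f, r in [(231, 218), (230, 217), (200, 200)]:
    overlap = merge_overlap(f, r, AMPLICON_LEN)
    ok = can_merge(f, r, AMPLICON_LEN)
    print(f"trunc-len-f={f} trunc-len-r={r}: overlap={overlap} nt, mergeable={ok}")
```

Truncating harder improves per-read quality but eats into the overlap, so with a long amplicon there is a floor below which nothing merges at all.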
Thanks for your suggestion! I will reset the parameters and run it again.
Actually, I have already tried running with a forward truncate of 230 and a reverse truncate of 217, but I lost lots of reads after filtering. Here is part of the stats file.
Hi @hesongbing,
If you have already tried that combination then there is no point in trying mine.
At this point, like I said, you could just move forward with what you have, since your lowest sample still has over 7k reads, which is fairly good for most data types.
If you do run only your forward reads (and set a safe truncating value like 200), you should see a significant increase in the number of retained reads. The downside is that you lose some taxonomic resolution, but those differences are rarely detrimental to an experiment imo. The overall patterns of your results should stay the same, so you would be pretty safe with either approach.
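For reference, a forward-only run could look roughly like the sketch below. It only assembles and prints the command line so you can review it before running it yourself. The output file names are placeholders I made up, and I am assuming that trimmed_seqs_bacteria.qza can be passed directly to denoise-single (as far as I know it accepts paired-end artifacts and uses only the forward reads, but check against your QIIME 2 version).

```python
import shlex


def denoise_single_cmd(seqs: str, trunc_len: int, prefix: str = "forward-only") -> list:
    """Build a `qiime dada2 denoise-single` command as an argument list.

    The output names derived from `prefix` are hypothetical placeholders.
    """
    return [
        "qiime", "dada2", "denoise-single",
        "--i-demultiplexed-seqs", seqs,
        "--p-trunc-len", str(trunc_len),
        "--o-table", f"{prefix}-table.qza",
        "--o-representative-sequences", f"{prefix}-rep-seqs.qza",
        "--o-denoising-stats", f"{prefix}-stats.qza",
    ]


cmd = denoise_single_cmd("trimmed_seqs_bacteria.qza", 200)
# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(cmd))
```

Keeping the outputs under a separate prefix means the paired-end results are not overwritten, so the two runs can be compared afterwards.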
I think your first run looks pretty good. Yes, you're losing 66% of your reads, and you may be able to recover some more, but you are likely getting rid of bad-quality sequences that you don't want in your samples anyway. Can I ask what kinds of samples these are? Are they samples with heavy eukaryotic contamination?
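If it helps to put numbers on the loss per sample, the percentage can be computed from the input and final read counts in the denoising stats. This is a minimal sketch. The sample names and counts below are invented for illustration and are not taken from the stats file in this thread.

```python
def percent_lost(input_reads: int, retained_reads: int) -> float:
    """Percentage of input reads that did not survive denoising."""
    if input_reads <= 0:
        raise ValueError("input_reads must be positive")
    return 100.0 * (input_reads - retained_reads) / input_reads


# ASSUMPTION: hypothetical (input, retained) read counts per sample.
stats = {
    "sample-A": (30000, 10200),
    "sample-B": (21000, 7140),
}

for sample, (n_in, n_out) in stats.items():
    lost = percent_lost(n_in, n_out)
    print(f"{sample}: kept {n_out} of {n_in} reads ({lost:.0f}% lost)")
```

A sample can lose a large fraction of its reads and still keep several thousand, which is usually the number that matters for downstream analysis.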
Hm, thank you for the clarification. Did you include a positive control in your runs? You can use it to check whether something went wrong with the run, just in case (e.g., a failed positive control would suggest there was something wrong with the number of sequences you recovered). I am unfamiliar with the V4V5 region, so I am not sure if this recovery is typical. Ben
Oh, sorry, I meant a positive sample where you know the exact composition (such as a culture). We run positive controls of a mock group of bacteria so we can confirm that nothing went wrong with our run (upstream of the QIIME2 analysis).
This is just in case there was a problem w/ our run (failure of primers/polymerases, etc.). It's ok if you did not; I would just run the rest of the pipeline w/ your DADA2 results to see what you get.
Great, I don't think you have much to worry about with the reads being filtered out by DADA2. I had a similar loss in a "good" run of the 16S v3v4 region (we lost 50-60% of the sequences at the DADA2 denoise step), and the data at the end looked great (supporting our hypothesis). Ben