QIIME 2 is a great tool. Before putting it into production, we would like to do some QC and benchmarking on a mock community. We purchased the Zymo mix (ZymoBIOMICS Microbial Community Standards | ZYMO RESEARCH) and processed both the DNA (with two different library prep kits) and the whole cells (with two different DNA extraction methods and one library kit). According to the Zymo user guide, although each bacterial component should be present at 12%, the theoretical composition is expected to be Pseudomonas aeruginosa (4.2%), Escherichia coli (10.1%), Salmonella enterica (10.4%), Lactobacillus fermentum (18.4%), Enterococcus faecalis (9.9%), Staphylococcus aureus (15.5%), Listeria monocytogenes (14.1%), and Bacillus subtilis (17.4%). However, my QIIME 2 results for the mock community are quite different from what I expected. I basically followed the “Moving Pictures” tutorial; here are the key steps I took:
- After demultiplexing our paired-end (PE) reads, I noticed that the data quality is not great (see paired-end-demux.qzv), so I treated the PE reads as single-end, set --p-trunc-len 120, and ran DADA2 using:
qiime dada2 denoise-single --i-demultiplexed-seqs demux.qza --p-trim-left 0 --p-trunc-len 120 --o-representative-sequences single_rep-seqs-dada2.qza --o-table single_table-dada2.qza --verbose
More than 75% of the reads remain after DADA2, but there also appear to be many amplicon sequence variants in my mock community (see attached single_table-dada2.qzv).
- For taxonomic analysis, I followed “Moving Pictures”, using a pre-trained Naive Bayes classifier and the q2-feature-classifier plugin. This classifier was trained on the Greengenes 13_8 99% OTUs, with the sequences trimmed to include only 250 bases from the V4 region of the 16S gene, which is the same region as our amplicon. As expected given the number of sequence variants, it shows a much more complex community than my input mock community (see taxa-bar-plots.qzv).
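To make the mismatch concrete, the sketch below shows one way genus-level read counts could be scored against the expected composition quoted from the Zymo user guide. It is only an illustration under stated assumptions: the function name `compare_to_expected`, the genus-level keys (derived from the species names above), and the example counts are all hypothetical and are not taken from the attached results. The counts would have to be exported from the taxa bar plot by hand.

```python
# Expected composition (%) from the Zymo user guide, keyed by genus.
# Assumption: genus-level labels are used so they can be matched against
# classifier output; the percentages are the ones quoted in this post.
EXPECTED_PCT = {
    "Pseudomonas": 4.2,
    "Escherichia": 10.1,
    "Salmonella": 10.4,
    "Lactobacillus": 18.4,
    "Enterococcus": 9.9,
    "Staphylococcus": 15.5,
    "Listeria": 14.1,
    "Bacillus": 17.4,
}


def compare_to_expected(observed_counts, expected=EXPECTED_PCT):
    """Compare observed genus read counts with the expected percentages.

    Returns (rows, unexpected_pct, bray_curtis) where each row is
    (genus, expected %, observed %, observed minus expected).
    """
    total = sum(observed_counts.values())
    if total <= 0:
        raise ValueError("observed_counts must contain at least one read")

    # Convert raw read counts to percentages of the sample.
    observed_pct = {g: 100.0 * n / total for g, n in observed_counts.items()}

    rows = []
    for genus in sorted(set(expected) | set(observed_pct)):
        exp = expected.get(genus, 0.0)
        obs = observed_pct.get(genus, 0.0)
        rows.append((genus, exp, obs, obs - exp))

    # Share of reads assigned to genera that are not in the mock community.
    unexpected_pct = sum(p for g, p in observed_pct.items() if g not in expected)

    # Bray-Curtis dissimilarity between the two percentage profiles
    # (0 means identical profiles, 1 means no genera shared).
    bray_curtis = sum(abs(r[3]) for r in rows) / sum(r[1] + r[2] for r in rows)
    return rows, unexpected_pct, bray_curtis


if __name__ == "__main__":
    # Hypothetical placeholder counts, NOT real results from this run.
    example_counts = {"Bacillus": 300, "Listeria": 250, "Staphylococcus": 250,
                      "Lactobacillus": 150, "SomeOtherGenus": 50}
    rows, unexpected, bc = compare_to_expected(example_counts)
    for genus, exp, obs, diff in rows:
        print(f"{genus:<16} expected {exp:5.1f}%  observed {obs:5.1f}%  diff {diff:+6.1f}")
    print(f"Unexpected genera: {unexpected:.1f}% of reads, Bray-Curtis: {bc:.3f}")
```

A high share of reads in unexpected genera would point toward spurious sequence variants or misclassification, while a low share with a large Bray-Curtis value would point toward skewed proportions among the eight expected members.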
Does anyone have insight on this? I am sure someone has already done this kind of exercise. Would you mind sharing the parameters you used for the dada2 denoise-single step? Thanks, Gary
taxa-bar-plots.qzv (1.1 MB)
single_table-dada2.qzv (774.5 KB)
paired-end-demux.qzv (282.8 KB)