Odd results with Mr DNA data


I am using QIIME 2 2021.2 in VirtualBox and attempting to taxonomically classify my 16S data using the Silva 132 database (i.e., I built and trained my own classifier). I followed the QIIME 2 tutorials for the most part. However, my data was sequenced by Mr DNA, and that lab provides a tool known as the fastq processor to remove forward and reverse primers (515F/806R) as well as barcodes. The sequences are paired-end and already demultiplexed (Casava 1.8). Attached below are the .qzv file I used when denoising my data, the representative sequences and table .qzv files that were generated from denoising, and finally the taxonomy.qzv file that was generated after I tested the classifier. I have also included all the code (sans the file paths) I used for these procedures below:

qiime tools import \
  --type SampleData[PairedEndSequencesWithQuality] \
  --input-path reads \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-paired-end.qza

qiime demux summarize \
  --i-data demux-paired-end.qza \
  --o-visualization demux-QA.qzv

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trunc-len-f 297 \
  --p-trunc-len-r 265 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs_081321.qza \
  --o-denoising-stats denoising-stats.qza

qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv

qiime feature-table tabulate-seqs \
  --i-data rep-seqs_081321.qza \
  --o-visualization rep-seqs.qzv

qiime feature-classifier classify-sklearn \
  --i-classifier classifier2.qza \
  --i-reads rep-seqs_081321.qza \
  --o-classification 16SInitialTaxonomy2.qza

qiime metadata tabulate \
  --m-input-file 16SInitialTaxonomy2.qza \
  --o-visualization 16SInitialTaxonomy2.qzv

I can also include the code I used to build my classifier; I just didn't want to overdo it in this initial post.

As I said before, attached are the .qzv files from the initial import of my samples into q2 (demux-QA.qzv), the results from denoising the data (rep-seqs_081321.qzv, table.qzv), and the result from testing my classifier (16SInitialTaxonomy2.qzv). Any help with this would be hugely appreciated!!!


16SInitialTaxonomy2.qzv (1.2 MB) demux-QA.qzv (313.4 KB) rep-seqs_081321.qzv (197.0 KB) table.qzv (379.4 KB)

Hi @bkramer ,
It does not look like the odd results have anything to do with the taxonomy classifier — it is classifying a single sequence because this appears to be the number of input sequences (judging from the table and rep seqs files that you supplied).

You have one ASV with a total frequency of 1, even though the number of sequences post-demux is much larger. It seems quite likely that something went wrong during denoising; perhaps the paired-end reads failed to merge? You should check out your dada2 denoising stats summary for clues.
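As a rough sketch of what checking the denoising stats could look like on the command line: the helper below (a hypothetical name, `summarize_denoising_stats`) prints how many reads each sample kept at each DADA2 stage. It assumes the stats artifact has already been exported to a tab-separated `stats.tsv`, and that the header uses the column names `sample-id`, `input`, `filtered`, `merged`, and `non-chimeric`; adjust these if your file differs.

```shell
# Assumption: denoising-stats.qza was exported beforehand, for example with
#   qiime tools export --input-path denoising-stats.qza --output-path exported-stats
# which should leave a tab-separated file at exported-stats/stats.tsv.

# Hypothetical helper: print per-sample read counts at each DADA2 stage.
summarize_denoising_stats() {
  awk -F'\t' '
    # Map header names to column numbers so column order does not matter
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Skip directive rows such as "#q2:types"
    /^#/ { next }
    {
      printf "%s\tinput=%d\tfiltered=%d\tmerged=%d\tnon-chimeric=%d\n",
        $(col["sample-id"]), $(col["input"]), $(col["filtered"]),
        $(col["merged"]), $(col["non-chimeric"])
    }
  ' "$1"
}
```

Calling `summarize_denoising_stats exported-stats/stats.tsv` then gives one line per sample. A steep drop between `input` and `filtered` would point at the filtering and truncation settings, while a steep drop between `filtered` and `merged` would point at a merging failure.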

Good luck!

P.S., on a side note:

Since you are using 515f/806r primers, you could use the pre-trained SILVA 138 (or 132) classifiers from the QIIME 2 website to save some hassle.

Hi @Nicholas_Bokulich ,

Thanks for getting back to me! I think there were issues with the files I was putting through fastqprocessor, which led to downstream problems. So I started over with the original fasta.gz files I received from Mr. DNA, and the data came out much better... dare I say perfect.
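For anyone hitting something similar, one quick sanity check on preprocessed files before importing is to confirm that each forward/reverse pair still contains the same number of reads. The sketch below is not what was run in this thread; it assumes gzip-compressed FASTQ input with four lines per record, and `count_fastq_reads` and `check_pair` are hypothetical helper names.

```shell
# Hypothetical helper: count reads in a gzip-compressed FASTQ file.
# Assumes the standard four-lines-per-record layout.
count_fastq_reads() {
  gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Hypothetical helper: compare read counts of a forward/reverse pair.
# Returns non-zero when the counts differ.
check_pair() {
  fwd=$(count_fastq_reads "$1")
  rev=$(count_fastq_reads "$2")
  if [ "$fwd" = "$rev" ]; then
    echo "OK: $fwd reads in both files"
  else
    echo "MISMATCH: $fwd forward vs $rev reverse reads"
    return 1
  fi
}
```

Usage would look like `check_pair forward.fastq.gz reverse.fastq.gz`, with the two placeholder names replaced by one sample's actual R1 and R2 files. A mismatch, or a fractional count, suggests the preprocessing step dropped or truncated records in one of the files.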

Thank you for your help!

@bkramer ,
Thanks for replying and indicating the solution!

I have marked your post as the solution, and re-titled this post for better findability for others in the future.
