Vsearch consensus is giving lots of unassigned reads

victoriamesa · May 8, 2020, 5:19pm

Sorry to bother you again, but I am still having some problems.
In the end I chose to work only with the forward sequences, because the quality scores of the reverse reads were low.
My barplot shows a lot of unassigned sequences in some samples.

Here are my commands:
qiime cutadapt demux-single \
  --i-seqs MULTIPLEXED_SINGLE/multiplexed-seqs.qza \
  --m-barcodes-file MULTIPLEXED_SINGLE/metadata.tsv \
  --m-barcodes-column Barcode \
  --p-error-rate 0 \
  --o-per-sample-sequences MULTIPLEXED_SINGLE/demultiplexed-seqs.qza \
  --o-untrimmed-sequences MULTIPLEXED_SINGLE/untrimmed.qza \
  --verbose

The low percentage of reads reported as having adapters seems strange to me. Is that expected?

qiime cutadapt trim-single \
  --i-demultiplexed-sequences MULTIPLEXED_SINGLE/demultiplexed-seqs.qza \
  --p-front GTGCCAGCMGCCGCGGTAA \
  --p-front GGACTACHVGGGTWTCTAAT \
  --p-error-rate 0 \
  --o-trimmed-sequences MULTIPLEXED_SINGLE/trimmed-seqs3.qza \
  --verbose
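
One rough way to check whether a primer is actually present in the reads is to count how many reads begin with it. The sketch below does that for an uncompressed FASTQ file. It is only an illustration under stated assumptions: the function names `iupac_to_regex` and `count_primer_starts` are made up for this example and are not part of QIIME 2 or cutadapt, the demo reads are invented, and real data would first have to be exported from the .qza artifact and decompressed.

```shell
# Translate IUPAC degenerate bases (M, R, W, ...) into regex character classes.
iupac_to_regex() {
  printf '%s' "$1" | sed \
    -e 's/M/[AC]/g' -e 's/R/[AG]/g' -e 's/W/[AT]/g' \
    -e 's/S/[CG]/g' -e 's/Y/[CT]/g' -e 's/K/[GT]/g' \
    -e 's/V/[ACG]/g' -e 's/H/[ACT]/g' -e 's/D/[AGT]/g' \
    -e 's/B/[CGT]/g' -e 's/N/[ACGT]/g'
}

# Count reads whose sequence line starts with the primer.
# $1 = primer (IUPAC), $2 = uncompressed FASTQ with 4 lines per record.
count_primer_starts() {
  awk -v re="^$(iupac_to_regex "$1")" '
    NR % 4 == 2 { total++; if ($0 ~ re) hits++ }
    END { printf "%d of %d reads start with the primer\n", hits, total }
  ' "$2"
}

# Demo on two invented reads: the first starts with the primer, the second does not.
demo=$(mktemp)
printf '@r1\nGTGCCAGCAGCCGCGGTAATACG\n+\nIIIIIIIIIIIIIIIIIIIIIII\n' > "$demo"
printf '@r2\nTACGGAGGGTGCAAGCGTTAATC\n+\nIIIIIIIIIIIIIIIIIIIIIII\n' >> "$demo"
result=$(count_primer_starts GTGCCAGCMGCCGCGGTAA "$demo")
echo "$result"   # prints: 1 of 2 reads start with the primer
rm -f "$demo"
```

If almost no reads start with the primer, a low "with adapters" percentage would be expected rather than alarming. Note also that GGACTACHVGGGTWTCTAAT is usually cited as the 806R reverse primer, so in forward-only reads it may not occur at the 5' end at all.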

qiime dada2 denoise-single \
  --i-demultiplexed-seqs Gut_microbiota/MicrobiotaPerro/MULTIPLEXED_SINGLE/trimmed-seqs3.qza \
  --p-trim-left 20 \
  --p-trunc-len 220 \
  --p-n-threads 2 \
  --o-table MULTIPLEXED_SINGLE/table3FORW.qza \
  --o-representative-sequences MULTIPLEXED_SINGLE/rep-seqs3FORW.qza \
  --o-denoising-stats MULTIPLEXED_SINGLE/denoising-stats3FORW.qza \
  --verbose

denoising-stats3FORW.qza:

qiime feature-classifier classify-consensus-vsearch \
  --i-query MULTIPLEXED_SINGLE/rep-seqs3FORW.qza \
  --i-reference-reads ref-seqs97.qza \
  --i-reference-taxonomy ref-taxonomy-silva_132_97_16S.qza \
  --o-classification MULTIPLEXED_SINGLE/taxonomy-vsearch.qza

In taxonomy-vsearch I don't see many unassigned:
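
A taxonomy listing counts features, while a barplot is weighted by reads, so a handful of unassigned features can still account for most reads in a sample. The sketch below compares the two views. It assumes a taxonomy TSV with a header row and the columns Feature ID and Taxon (as in an exported taxonomy file), plus a hypothetical two-column file of total reads per feature. The function name `summarize_unassigned`, the file layout of the frequency table, and all demo values are invented for illustration.

```shell
# Compare the share of unassigned features with the share of unassigned reads.
# $1 = taxonomy TSV with header: Feature ID <TAB> Taxon <TAB> ...
# $2 = frequency TSV without header: feature ID <TAB> total read count (hypothetical layout)
summarize_unassigned() {
  awk -F '\t' '
    NR == FNR { freq[$1] = $2; next }   # first file: read counts per feature
    FNR == 1  { next }                  # second file: skip the taxonomy header
    {
      n_feat++; n_reads += freq[$1]
      if ($2 == "Unassigned") { u_feat++; u_reads += freq[$1] }
    }
    END {
      printf "features unassigned: %d/%d\n", u_feat, n_feat
      printf "reads unassigned: %d/%d\n", u_reads, n_reads
    }
  ' "$2" "$1"
}

# Demo with invented values: one unassigned feature out of three carries most reads.
tax=$(mktemp); frq=$(mktemp)
printf 'Feature ID\tTaxon\tConsensus\n' > "$tax"
printf 'f1\tD_0__Bacteria;D_1__Firmicutes\t1.0\n' >> "$tax"
printf 'f2\tD_0__Bacteria;D_1__Bacteroidetes\t1.0\n' >> "$tax"
printf 'f3\tUnassigned\t1.0\n' >> "$tax"
printf 'f1\t100\nf2\t50\nf3\t850\n' > "$frq"
summary=$(summarize_unassigned "$tax" "$frq")
echo "$summary"
# prints:
# features unassigned: 1/3
# reads unassigned: 850/1000
rm -f "$tax" "$frq"
```

If the two ratios differ strongly on real data, the abundant unassigned features are the ones worth inspecting first.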

I would appreciate your review and guidance, as I don't know how to continue.
Thank you so much.

Nicholas_Bokulich · May 11, 2020, 3:29pm

Hi @victoriamesa,
There has been a lot of good discussion on this forum about the sources and causes of unclassified sequences. I suggest starting here to diagnose:

In your case, I also notice that some of your samples have very few sequences... at least a few of these appear to be the samples with very few classified reads. So those are samples that should probably be dropped anyway, and the reads in those samples could just represent some background noise (e.g., reagent contamination, index hopping?).
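
As a minimal sketch of how low-depth samples could be flagged before dropping them, the snippet below lists samples whose total read count falls under a cutoff. The input is a hypothetical two-column TSV of sample ID and read count, the function name `low_depth_samples` is made up for this example, and the cutoff of 1000 reads and the demo counts are arbitrary illustrations rather than recommendations.

```shell
# List samples with fewer reads than a minimum.
# $1 = TSV without header: sample ID <TAB> read count (hypothetical layout)
# $2 = minimum read count a sample needs to be kept
low_depth_samples() {
  awk -F '\t' -v min="$2" '$2 + 0 < min + 0 { print $1 }' "$1"
}

# Demo with invented counts and an arbitrary cutoff of 1000 reads.
counts=$(mktemp)
printf 'S1\t12000\nS2\t85\nS3\t9400\nS4\t310\n' > "$counts"
flagged=$(low_depth_samples "$counts" 1000)
echo "$flagged"
# prints:
# S2
# S4
rm -f "$counts"
```

Within QIIME 2 itself, qiime feature-table filter-samples with --p-min-frequency removes samples below a chosen total frequency from a feature table. A sensible cutoff depends on the sequencing depth of the run.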

Good luck!
