Hi! I have run a few analyses and have not run into this issue previously. At first I thought it might be because I was running 2024.5, but I no longer think that is the issue.
I am getting very few features (47) from 122 samples, with a total frequency of 149; previous runs gave about 2,072 features with a total frequency of 4,631,486. When I run through to classification, a lot of the results are unassigned or even blank for the samples. I tried both the pre-trained classifier provided and training my own on reference reads extracted with my primers. I have never run into this issue before, so I appreciate any input.
Below is the code that I ran in the terminal (conda) with qiime2-amplicon-2024.5:
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path SFB_manifest.tsv --output-path reads.qza --input-format PairedEndFastqManifestPhred33V2
qiime demux summarize --i-data reads.qza --o-visualization reads-QA.qzv
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences reads.qza \
  --p-front-f CCTACGGGNGGCWGCAG \
  --p-front-r GACTACHVGGGTATCTAATCC \
  --p-match-adapter-wildcards \
  --p-match-read-wildcards \
  --p-discard-untrimmed \
  --o-trimmed-sequences paired-end-demux-trimmed.qza
qiime demux summarize --i-data paired-end-demux-trimmed.qza --o-visualization demux_reads-QA.qzv
Below was the visualization:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux-trimmed.qza \
  --p-trunc-len-f 285 \
  --p-trunc-len-r 200 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
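One thing I can sanity-check here is whether the truncation lengths still fit the reads once the primers are removed, since DADA2 discards reads shorter than the truncation length. The sketch below is only arithmetic, not a QIIME 2 command. The function names are hypothetical, and the 300 nt raw read length is an assumption that should be replaced with the actual length shown in the demux summary.

```shell
# Hypothetical helper (not part of QIIME 2): read length left after a
# 5' primer of the given length has been trimmed off.
post_trim_len() {
  echo $(( $1 - $2 ))
}

# Hypothetical helper: reports "too-long" when the truncation length ($2)
# exceeds the available read length ($1), in which case DADA2 would
# discard the read, and "ok" otherwise.
check_trunc() {
  if [ "$2" -gt "$1" ]; then
    echo "too-long"
  else
    echo "ok"
  fi
}

RAW_LEN=300   # ASSUMPTION: replace with the real read length from the demux summary
FWD_PRIMER="CCTACGGGNGGCWGCAG"
REV_PRIMER="GACTACHVGGGTATCTAATCC"

# ${#VAR} gives the primer length in characters.
fwd_len=$(post_trim_len "$RAW_LEN" "${#FWD_PRIMER}")
rev_len=$(post_trim_len "$RAW_LEN" "${#REV_PRIMER}")

echo "forward: ${fwd_len} nt after trimming, trunc-len-f 285: $(check_trunc "$fwd_len" 285)"
echo "reverse: ${rev_len} nt after trimming, trunc-len-r 200: $(check_trunc "$rev_len" 200)"
```

If either line reports "too-long" with the real read length filled in, the per-sample "filtered" column in denoising-stats.qza would be the next place to look.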
qiime feature-table summarize --i-table table.qza --o-visualization table.qzv --m-sample-metadata-file SFB_manifest.tsv
Table summary / frequency:
From a different run, for comparison (run on 2024.2):
The number of reads per sample seemed OK (except for 2-3 samples). It is not as high as in previous runs but still seems adequate.
per-sample-fastq-counts_after taking out primers.txt (3.1 KB)
Classifier trained on reference reads extracted with my primers:
qiime feature-classifier extract-reads \
  --i-sequences silva-138-99-seqs.qza \
  --p-f-primer CCTACGGGNGGCWGCAG \
  --p-r-primer GACTACHVGGGTATCTAATCC \
  --o-reads ref-seqs.qza
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads ref-seqs.qza --i-reference-taxonomy silva-138-99-tax.qza --o-classifier classifier.qza
qiime feature-classifier classify-sklearn \
  --i-classifier classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza
qiime metadata tabulate \
  --m-input-file taxonomy.qza \
  --o-visualization taxonomy.qzv
qiime taxa barplot \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --m-metadata-file SFB_manifest.tsv \
  --o-visualization taxa-bar-plots.qzv
With the pre-trained classifier:
qiime feature-classifier classify-sklearn \
  --i-classifier silva-138-99-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy2.qza
qiime metadata tabulate \
  --m-input-file taxonomy2.qza \
  --o-visualization taxonomy2.qzv
qiime taxa barplot \
  --i-table table.qza \
  --i-taxonomy taxonomy2.qza \
  --m-metadata-file SFB_manifest.tsv \
  --o-visualization taxa-bar-plots2.qzv
I think the classification problem stems from the low feature count, but I have never run into this before. Happy to provide any other information!