Low Feature Count noticed after qiime feature-classifier classify-sklearn

divyaprince321 · July 21, 2023, 1:25pm

Hi all,
I am having an issue with my dataset.
When I ran qiime feature-classifier classify-sklearn with the Silva and Greengenes databases available on the QIIME 2 website, I got only a few taxa:
taxonomy.tsv (1.1 KB)
I then went back and trained a classifier on the most recent Greengenes release. Training succeeded, but when I reran qiime feature-classifier classify-sklearn I got the same number of taxonomic feature IDs (9).
However, when I check the sequencing depth, it seems good:
per-sample-fastq-counts.tsv (192 Bytes)
Can anyone tell me where the main issue is?
Thanks and regards

colinvwood · July 21, 2023, 6:40pm

Hello @divyaprince321,

If you feel comfortable doing so, would you mind sharing the representative sequences that were classified and the feature table artifact?

divyaprince321 · July 24, 2023, 12:58pm

Hi @colinvwood,
Sure, here are the requested files:
table.qza (12.6 KB)
rep-seqs.qza (11.3 KB)
Please let me know if you need any other files.
Thanks and regards

colinvwood · July 24, 2023, 5:09pm

Hello @divyaprince321,

You have only 8 features in your feature table, so you got only 8 taxonomy assignments (not 9, by the way).

divyaprince321 · July 24, 2023, 5:35pm

Hi Colin,
Whether it is 8 or 9, why is the number so low when the sequence count/depth seems good?
How can I track this down?
Thanks and regards

colinvwood · July 24, 2023, 5:37pm

Hello @divyaprince321,

As I said above, you have 8 taxonomy assignments because you only have 8 features to classify. Why would you expect more taxonomy assignments than you have features?
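The relationship between features and taxonomy assignments can be checked by counting both. The following is a minimal sketch using two tiny stand-in files with made-up content (not the poster's data); with real artifacts, the files would presumably come from `qiime tools export`. It also shows that the total line count of `taxonomy.tsv` includes the header row, which is one possible (assumed, not confirmed in this thread) way to arrive at 9 for 8 features.

```shell
# Stand-in files with made-up content; real ones would come from `qiime tools export`.
workdir=$(mktemp -d)

cat > "$workdir/dna-sequences.fasta" <<'EOF'
>feature-a
ACGTACGT
>feature-b
TTGCAATG
EOF

printf 'Feature ID\tTaxon\tConfidence\n' > "$workdir/taxonomy.tsv"
printf 'feature-a\td__Bacteria\t0.99\n' >> "$workdir/taxonomy.tsv"
printf 'feature-b\td__Bacteria\t0.87\n' >> "$workdir/taxonomy.tsv"

# Each FASTA record starts with '>', so this counts features.
n_features=$(grep -c '^>' "$workdir/dna-sequences.fasta")

# Count taxonomy rows, skipping the header line.
n_assignments=$(awk 'NR > 1' "$workdir/taxonomy.tsv" | wc -l | tr -d ' ')

# The raw line count includes the header, so it is one more than the feature count.
n_lines=$(wc -l < "$workdir/taxonomy.tsv" | tr -d ' ')

echo "features=$n_features assignments=$n_assignments file_lines=$n_lines"
rm -rf "$workdir"
```

The classifier assigns one taxonomy per feature, so the first two numbers always match; only the raw line count is off by one.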

divyaprince321 · July 24, 2023, 6:02pm

Hi again,
That is what I want to know: why did I get only 8 features?
Is there a reason, and how can I get more?
Is this an issue with the sequencing, or something else?
Thanks

colinvwood · July 24, 2023, 6:14pm

Hello @divyaprince321,

That could be a DADA2 issue, or it could be that you started with a small number of sequences.

divyaprince321 · July 24, 2023, 6:30pm

Thanks again.
How can I overcome this issue?

cherman2 · July 24, 2023, 6:37pm

Hi @divyaprince321,
Unfortunately, you have not given us enough information about your data to help with your question.
Low feature counts have been asked about many times on the forum, so please search the forum to see whether a previous post is helpful to you.

If you cannot find anything, can you please provide additional information?
Can you describe your (1) importing, (2) demultiplexing, and (3) DADA2 denoising steps?
Could you also upload your dada2-stats.qzv?

divyaprince321 · July 25, 2023, 1:11pm

Hi @cherman2,
Thanks for your support.
For importing, I used:
qiime tools import --type SampleData[PairedEndSequencesWithQuality] --input-path /home/qiime2/Desktop/metadata/manifest.tsv.txt --output-path /home/qiime2/Desktop/manifest.qza --input-format PairedEndFastqManifestPhred33
The datasets imported successfully.
Then I ran the command
qiime demux summarize --i-data /home/qiime2/Desktop/manifest.qza --o-visualization /home/qiime2/Desktop/demux.qzv
demux.qzv (317.8 KB)
Followed by
qiime dada2 denoise-paired --verbose --i-demultiplexed-seqs /home/qiime2/Desktop/demux.qza.qzv --p-trim-left-f 0 --p-trim-left-r 0 --p-trunc-len-f 220 --p-trunc-len-r 220 --o-table /home/qiime2/Desktop/table.qza --o-representative-sequences /home/qiime2/Desktop/rep-seqs.qza --p-n-threads 12 --o-denoising-stats /home/qiime2/Desktop/dada-stats.qza
I skipped the qiime cutadapt trim-paired step because the sequences came from the agency, and I thought they might have already removed the primers.
dada-stats.qza (10.9 KB)

Please find the files attached, and let me know if you need any additional information.
Thanks and regards

cherman2 · July 25, 2023, 3:46pm

Hi @divyaprince321,
I think I have found the reason that you have fewer features than you are expecting.

I would highly recommend looking at your dada2-stats after running DADA2; it provides a really good summary of how many sequences passed each filtering step.

As you can see, almost 0% of your sequences passed filtering. Please take a look at these posts to see if they can help.

There are a lot of forum posts about features not passing dada2.
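To see which step loses the reads, the per-step counts in the stats table can be turned into percentages of the input. The following is a minimal sketch on a stand-in table with made-up numbers; the column names (`input`, `filtered`, `merged`, `non-chimeric`) and the `#q2:types` row are assumptions about what an exported DADA2 stats table looks like, so check them against your own file.

```shell
workdir=$(mktemp -d)

# Stand-in table with made-up numbers. Columns are looked up by header name,
# so any extra columns in a real file would simply be ignored.
{
  printf 'sample-id\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n'
  printf '#q2:types\tnumeric\tnumeric\tnumeric\tnumeric\tnumeric\n'
  printf 'sample-1\t100000\t80000\t79000\t400\t300\n'
  printf 'sample-2\t50000\t1000\t900\t800\t750\n'
} > "$workdir/stats.tsv"

report=$(awk -F'\t' '
  # Map header names to column numbers.
  NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
  # Skip the type-annotation row.
  /^#/ { next }
  {
    input = $(col["input"])
    # Avoid dividing by zero for empty samples.
    if (input == 0) next
    printf "%s filtered=%.1f%% merged=%.1f%% non-chimeric=%.1f%%\n", $1,
      100 * $(col["filtered"]) / input,
      100 * $(col["merged"]) / input,
      100 * $(col["non-chimeric"]) / input
  }' "$workdir/stats.tsv")

echo "$report"
rm -rf "$workdir"
```

In this made-up table, sample-1 keeps most reads through quality filtering and loses them at merging, while sample-2 loses them at the quality filter. The two patterns point to different fixes (truncation lengths versus read quality), which is why the step matters.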

You said you ran cutadapt on your data, but according to your provenance you did not run cutadapt before DADA2, and you cannot run it after DADA2, so I am a little confused about when you ran cutadapt. If there are adapters in your sequences, that could be part of the reason your sequences are not passing filtering.

Hope that helps!

divyaprince321 · July 25, 2023, 4:08pm

Hi again
Thanks for such a detailed overview.
As I mentioned, I did not run cutadapt, because the sequences came from the agency and they might have already removed the primers.
Thanks and regards
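Whether the primers were already removed can be checked directly on the reads rather than assumed. The following is a minimal sketch that converts the IUPAC codes of the forward primer (the V3-V4 primer quoted later in this thread, without its adapter overhang) into a regular expression and counts how many reads start with it. The FASTQ content is a made-up stand-in with three reads, and the assumption that primers sit at the very start of each read should be verified on real data.

```shell
# Forward locus-specific primer (IUPAC codes: N = any base, W = A or T).
primer="CCTACGGGNGGCWGCAG"

# Translate the IUPAC codes used in this primer into regex character classes.
pattern=$(printf '%s' "$primer" | sed -e 's/N/[ACGT]/g' -e 's/W/[AT]/g')

workdir=$(mktemp -d)

# Stand-in FASTQ with made-up reads: two start with the primer, one does not.
cat > "$workdir/reads.fastq" <<'EOF'
@read1
CCTACGGGAGGCAGCAGTGGGGAATATTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
@read2
CCTACGGGTGGCTGCAGTGAGGAATATTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
@read3
TGGGGAATATTGCACAATGGGCGCAAGCC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
EOF

# In a 4-line FASTQ record, the sequence is the second line of each record.
total=$(awk 'NR % 4 == 2' "$workdir/reads.fastq" | wc -l | tr -d ' ')
with_primer=$(awk 'NR % 4 == 2' "$workdir/reads.fastq" | grep -c -E "^$pattern")

echo "$with_primer of $total reads start with the primer"
rm -rf "$workdir"
```

For a gzipped file, the same pipeline can read from `gzip -dc reads.fastq.gz` instead of the file path. If most reads start with the primer, it was not removed and a trimming step is likely still needed.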

cherman2 · July 25, 2023, 4:16pm

@divyaprince321,

Okay, sorry that was unclear.

Please answer the following questions:

Did you read my recommended forum posts? Was there anything helpful there?

Do you understand what step in the filtering process is losing all your reads?

What region are you working with?

What are your primer positions?

divyaprince321 · July 26, 2023, 6:12pm

Hi @cherman2,
I hope you are doing well.

@ Did you read my recommended forum posts? Was there anything helpful there?
Yes, I am working through them and trying to understand them.
@ What region are you working with?
I am working with the V3-V4 region.
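For context on why the region matters: paired reads can only be merged if, after truncation, they still overlap. The arithmetic can be sketched as below, under two loudly stated assumptions that are not from this thread: an approximate V3-V4 amplicon length of 460 nt with primers included (real lengths vary between taxa), and a minimum merge overlap of 12 nt, which is believed to be the DADA2 default. The truncation lengths are the ones from the denoise command posted earlier.

```shell
amplicon_len=460   # ASSUMPTION: approximate V3-V4 amplicon length, primers included
trunc_len_f=220    # --p-trunc-len-f from the posted command
trunc_len_r=220    # --p-trunc-len-r from the posted command
min_overlap=12     # ASSUMPTION: minimum overlap required for merging

# Bases covered by both reads; a negative value means a gap between them.
overlap=$(( trunc_len_f + trunc_len_r - amplicon_len ))

if [ "$overlap" -ge "$min_overlap" ]; then
  verdict="enough overlap to merge"
else
  verdict="too little overlap to merge"
fi

echo "overlap=$overlap nt: $verdict"
```

Under these assumptions the two truncated reads would not reach each other, which would show up as a large loss at the merged step of the DADA2 stats. Whether that is what happened here can only be confirmed from the stats table itself.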

@ What are your primer positions?

16S Amplicon PCR Forward Primer:
5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3'
16S Amplicon PCR Reverse Primer:
5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3'
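Each of these primers is an Illumina adapter overhang followed by the locus-specific 16S primer. The following is a minimal sketch that strips the overhang and prints (without running) a `qiime cutadapt trim-paired` command using the locus-specific parts. The input name `manifest.qza` follows the import step earlier in the thread, `trimmed.qza` is a hypothetical output name, and the expectation that reads begin at the locus-specific primer rather than at the overhang is an assumption to verify on the actual reads.

```shell
# Full primers as posted: adapter overhang followed by locus-specific primer.
fwd_full="TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG"
rev_full="GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC"

# Illumina adapter overhangs at the 5' end of each primer.
fwd_overhang="TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
rev_overhang="GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# Remove the overhang prefix to leave the locus-specific primer.
fwd_primer=${fwd_full#"$fwd_overhang"}
rev_primer=${rev_full#"$rev_overhang"}

echo "forward primer: $fwd_primer"
echo "reverse primer: $rev_primer"

# Dry run: build and print the trimming command for review instead of executing it.
# trimmed.qza is a hypothetical output name.
trim_cmd="qiime cutadapt trim-paired \
--i-demultiplexed-sequences manifest.qza \
--p-front-f $fwd_primer \
--p-front-r $rev_primer \
--p-discard-untrimmed \
--o-trimmed-sequences trimmed.qza"

echo "$trim_cmd"
```

With `--p-discard-untrimmed`, read pairs in which the primers are not found are dropped, so this option only makes sense if the primers are actually still present in the reads.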

Please advise on how to proceed.