Database for ITS2

Hello all,
I tried using both the SILVA and UNITE databases. I have soil samples, ITS2, V3-V4. The problem is that I could not find any arbuscular mycorrhizal fungi (AMF) in the taxa bar plot. We found plenty of AMF when running fatty acid methyl ester (FAME) analysis in our lab. I would be grateful for suggestions on which database is better for ITS2.
Thank you!

Hi @umanand

For ITS2, the UNITE database is preferable, as it is specifically designed for this region (and ITS1). Because this database is continually updated and curated by experts, it is the most accurate and robust database for your needs.

Can you share which primer set you are using for your ITS2 region? The set you are using may simply not be amplifying AMF well, which would leave those taxa under-represented. Also, if your AMF taxa are present in low numbers, I would check your sequencing depth to make sure it isn't too low to detect them.

Can you also supply your taxa bar-plots?

Hi Mike,
I used only the forward reads for analysis. I have attached the primer information provided by the sequencing facility and the taxa bar plot qzv file.
I used the following command to trim the primers:
# Trim primers and adapters using cutadapt
qiime cutadapt trim-single \
  --i-demultiplexed-sequences demux-single-end.qza \
  --p-front TCGATGAAGAACGCAGCG \
  --p-error-rate 0.1 \
  --o-trimmed-sequences trimmed-seqs.qza
I would be happy to provide any other information needed.
Thank you!

ITS2_Primer
taxa-bar-plots.qzv (393.4 KB)

Hi @umanand

Could you provide the following just so I can have a clearer picture:

The output from 'qiime demux summarize', the parameters/command you used for DADA2, and the denoising stats from DADA2.

Did you use a pre-trained classifier, or did you train and evaluate your own? If you trained your own, can you provide the details?

Many thanks!

Hi Mike,
Please find the demux.qzv attached.
demux-single-end.qzv (291.8 KB)

I used the following command for denoising:
qiime dada2 denoise-single \
  --i-demultiplexed-seqs demux-single-end.qza \
  --p-trim-left 0 \
  --p-trunc-len 285 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza

I used the following commands to train and run the classifier:

# Import the UNITE sequences
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path sh_qiime_release_2024/sh_refs_qiime_ver10_97_s_04.04.2024.fasta \
  --output-path unite-seqs.qza

# Import the UNITE taxonomy
qiime tools import \
  --type 'FeatureData[Taxonomy]' \
  --input-format HeaderlessTSVTaxonomyFormat \
  --input-path sh_qiime_release_2024/sh_taxonomy_qiime_ver10_97_s_04.04.2024.txt \
  --output-path unite-taxonomy.qza

# Train the classifier
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads unite-seqs.qza \
  --i-reference-taxonomy unite-taxonomy.qza \
  --o-classifier unite-classifier.qza

# Classify your sequences
qiime feature-classifier classify-sklearn \
  --i-classifier unite-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza

I did not know which classifier to use, so I trained one on the latest release of the UNITE database.

Thank you!

Hi @umanand

Looking at your demux visualization file, your forward reads look very good. I think your parameter choice of --p-trunc-len 285 was justified. Can I ask why the reverse reads were not used in tandem with the forward reads?

The rest of your commands follow a logical flow. Would you be able to provide the 'denoising-stats.qza' file?

Can you confirm that you downloaded and used this UNITE database release: "Includes singletons set as RefS (in dynamic files)"?

I just searched all taxonomy files in this download and I was able to find the Phylum Glomeromycota present in all of them.

Can you manually open the taxonomy text file and confirm that the Phylum Glomeromycota is not present?
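
The same check can be done from the command line. This is only a minimal sketch: it assumes the taxonomy file stores lineages as semicolon-separated strings with a `p__` prefix for phylum (for example `k__Fungi;p__Glomeromycota;...`), and `count_phylum` is a made-up helper name, not part of QIIME 2 or UNITE.

```shell
# Count reference entries assigned to a given phylum in a taxonomy text file.
# Assumption: lineage strings contain "p__<Phylum>".
# count_phylum is a hypothetical helper name used for illustration only.
count_phylum() {
  # $1 = phylum name, $2 = path to the taxonomy text file
  # grep -c prints the number of matching lines; "|| true" stops a count
  # of zero (grep exit status 1) from being treated as a failure.
  grep -c "p__$1" "$2" || true
}

# Example call (path taken from the import command in this thread):
# count_phylum Glomeromycota sh_qiime_release_2024/sh_taxonomy_qiime_ver10_97_s_04.04.2024.txt
```

A result of 0 would mean the phylum is absent from the reference taxonomy itself. Any positive count means the reference contains it, and the problem is more likely upstream (primers, trimming) or in the classification step.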

If it is present, then I would look into evaluating your classifier and the predicted taxonomy.
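
One rough way to inspect the predicted taxonomy is to export it and tally the phylum-level assignments. The sketch below assumes `taxonomy.qza` has been exported with `qiime tools export`, which writes a tab-separated `taxonomy.tsv` with a header row and the columns Feature ID, Taxon, and Confidence. `phylum_counts` and the `Unassigned_at_phylum` label are hypothetical names chosen for this example.

```shell
# Export first (not run here):
#   qiime tools export --input-path taxonomy.qza --output-path exported-taxonomy
# Then tally features per phylum from exported-taxonomy/taxonomy.tsv.
# phylum_counts is a hypothetical helper name used for illustration only.
phylum_counts() {
  # $1 = path to the exported taxonomy.tsv
  awk -F'\t' 'NR > 1 {
    phylum = "Unassigned_at_phylum"      # label used when no p__ rank is present
    n = split($2, ranks, ";")
    for (i = 1; i <= n; i++) {
      rank = ranks[i]
      gsub(/^ +| +$/, "", rank)          # tolerate spaces around separators
      if (rank ~ /^p__/) phylum = rank
    }
    counts[phylum]++
  }
  END { for (p in counts) printf "%d\t%s\n", counts[p], p }' "$1" |
    sort -k1,1nr -k2,2                   # largest counts first
}

# Example call:
# phylum_counts exported-taxonomy/taxonomy.tsv
```

This counts features (ASVs), not reads, so it will not match the relative abundances in the bar plot. It does show quickly whether any feature was assigned to `p__Glomeromycota` and how many features stopped at kingdom level or were left unassigned.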

I hope this will bring us closer to finding the answer to this :smiley:

Hi Mike,
The reverse reads were not as good as the forward reads, so I decided to use only the forward reads.
For the UNITE database, I used "sh_qiime_release_s_04.04.2024.tgz", and I have no idea where I went wrong.
I also tried using "sh_refs_qiime_ver10_dynamic_all_04.04.2024.fasta". I found "Glomeromycota", but the major phyla comprising around 15-20% are missing.
Please find the attached denoising-stats.qza file.
denoising-stats.qza (12.8 KB)
Thank you!
