where can I find a taxonomy tutorial

SAMRENDRA01 · April 18, 2020, 3:11pm

Thank you sir for resolve my problem.
i have sone more questions related to TAXONOMY assigning and Denoising process.

when i did denoising of PairedEnd Demultiplexed sequences by DADA2 it gives me 2194 feature. which is quite low. can i did denoising separately for forwards and reverses the sequence by DADA2 and after the denoising should i merge this separated sequences for getting high feature count?
i also facing the problem in the Taxonomic assignment of v3-v4 region. f have any tutorial for taxonomic aasignment please provide me.

colinbrislawn · April 18, 2020, 5:27pm

How many features were you expecting? Clustering has had issues with OTU inflation, but modern denoising methods minimize this problem. I think 2194 features might be small for OTUs, but might be reasonable for ASVs.

...yes, you could run dada2 denoise-single twice: once on your forward read and once on your reverse reads.

... yes, you could merge these features...
but if denoise-paired makes use bases in the area of overlap to estimate the error rate and correct for it. Using denoise-single twice removes this helpful step.

The default pipeline should work well! (Are you worried it's not working for your data?)

Of course! The official tutorials are listed here.
You could look for sections on "Taxonomic analysis" and "Taxonomic classification."

Let us know if you have any questions!

And welcome to the Qiime 2 forums! :qiime2:

Colin

SAMRENDRA01 · April 18, 2020, 7:26pm

ok thank you, sir, for your response i have some more queries you said 2194 ASVs is reasonable but when I perform a taxonomic assignment for these ASVs it count decrease as 700 hundred. do have any idea how I will overcome this problem?

colinbrislawn · April 18, 2020, 8:55pm

Hum... Interesting!

You should have the same number of features before and after taxonomy classification.

What are the commands you are using to classify features? Have you completed the Moving Pictures tutorial or the PD-Mice tutorial?

Colin

SAMRENDRA01 · April 19, 2020, 3:14pm

sir, i am working on v3 v4 16s region sequence and when i check the denoising sequence stats its showed me min seq. length 256 and maximum seq. length 433. that means all my ASVs (2194) is in under this sequence range(256-433)?
sir i also dont have the sequence primer (v3-v4) because vendor company not give me the details of primers, so in this case how can i do taxonomic assignment for v3 v4 region?

SAMRENDRA01 · April 19, 2020, 3:15pm

yes i did 3 times but same results came every time. i think i am doing mistake in taxonomic assignment part.

colinbrislawn · April 19, 2020, 7:51pm

I think so. If you would like to post the denoising visualization, I can take a look and confirm what the numbers mean.

If you don't have the primers, you can still use a pre-trained classifier. Try the full length silva classifers on this page.

Then follow the tutorials I mentioned above!

If you post 1) the full command you ran and 2) the full output, I can help you solve this problem!

Colin

SoilRotifer · April 19, 2020, 11:09pm

Hi @SAMRENDRA01,

Can you not ask the vendor for the primer sequences? I also ran into a case where the vendor would not share the primer sequences, because they were proprietary. However, they were willing to provide us with the length of the V3V4 primers used (the first 16 bp of the forward read and the first 24 bp of the reverse read). This enabled us to trim off those bases from the beginning (e.g. setting the trim options in DADA2). If the vendor is not willing to do provide even this information, you can simply trim off the first 30 bp from the beginning of each read and leave it at that.

There are only a few popular primer sets for these regions. I'd simply suggest looking the sequences directly and see if these primers are similar to the beginning of your sequences resemble any of them. Although, the primers used may be proprietary, but they should give you an idea of how long (or what other primer set they are based on... Another trick is to make a sequence logo of the first 30 bp of the of the read and see if that would be usable for trimming.

However, as @colinbrislawn suggested, please provide us with a set of commands you used up until this point. Taxonomy assignment should only return the classification of your reads, not removing them. If anything you'll simply obtain unassigned or unclassified as the taxonomy assignment. Also, using the full -length classifier as he suggested is a good start for trouble-shooting the problem.

-Best
-Mike

SAMRENDRA01 · April 20, 2020, 7:01am

Thanks, sir for your response, I am discussed my whole procedure.

I choose the manifest protocol for data import which is most suitable for my case because I have demultiplexed paired-end sequence. so I ran unzip

qiime tools import
--type 'SampleData[PairedEndSequencesWithQuality]'
--input-path manifest
--output-path paired-end-demux.qza
--input-format PairedEndFastqManifestPhred33V2
so i get demux_seqs.qzv i am attached the image

after this i was going for denoising protocol so i ran this command
qiime dada2 denoise-paired
--i-demultiplexed-seqs demux.qza
--p-trim-left-f 23
--p-trim-left-r 23
--p-trunc-len-f 240
--p-trunc-len-r 240
--o-table table.qza
--o-representative-sequences rep-seqs.qza
--o-denoising-stats denoising-stats.qza
so i get 3167 feature count (image attached)

after that i was going for taxonomic assignment, here i am using the pretrained classifier which is available in qiime 2 web (https://data.qiime2.org/2020.2/common/silva-132-99-nb-classifier.qza)
the command which i used for taxonomic classification is
qiime feature-classifier classify-sklearn
--i-classifier silva-132-99-nb-classifier.qza
--i-reads rep-seqs.qza
--o-classification taxonomy.qza

qiime metadata tabulate
--m-input-file taxonomy.qza
--o-visualization taxonomy.qzv
so I got the resulting image attached

and after for taxa barplot, i run
qiime taxa barplot
--i-table table.qza
--i-taxonomy taxonomy.qza
--m-metadata-file sample-metadata.tsv
--o-visualization taxa-bar-plots.qzv
so i got the result Screenshot 2020-04-20 at 12.18.45|689x350
now I have some questions

my complete process at every step is correct?
when i have got 3167 feature count but when i create taxonomic bar plot at the species level (level 7) i got only 970 results in the CSV format. why it's happening i mean why i am not getting 3167 counts in CSV ? please clear my doubt.
my sequence is from v3 v4 region and i am using full-length silva classifier which is might be v4 region i am not sure it just an assumption, does this deviate my result?

SoilRotifer · April 20, 2020, 2:42pm

Thanks for providing these details @SAMRENDRA01. It appears everything looks okay, though others may point out anything I have missed.

when i have got 3167 feature count but when i create taxonomic bar plot at the species level (level 7) i got only 970 results in the CSV format. why it’s happening i mean why i am not getting 3167 counts in CSV ?

I think there is a misunderstanding here. Many plots, like the taxonomy barplot, will collapse the separate ESVs / OTUs together that share the same taxonomy for the given rank-level you are observing. That is, you may have many sequence variants that map to the same taxonomy, which is almost always the case.

You can see this for yourself, look at the visualization you made, i.e. taxonomy.qzv, and search for a given label you are interested in, let's say Pseudomonas. You may see many ESVs that map to that taxonomy.

Similarly, when you view this information via the barplot, these separate ESVs will become merged under the taxonomy string that they share, for a given rank within the barplot.

Does this make sense?
-Mike

SAMRENDRA01 · April 20, 2020, 3:51pm

Thank you sir for explanation.

SAMRENDRA01 · April 20, 2020, 4:27pm

sir please solve my some more problems
1.how can i make classic otu table in qiime 2?
2. please provide me v3 v4 region silva classifier.
3. is this possible can i denoising forward read first then denoising the reverse read and after that marge the two read into one read? if yes how can i do this?

Francisco · April 20, 2020, 6:43pm

Hi there!

Open your barplot.qzv, select the desired taxonomy rank level and download the CSV

from here: SILVA 138 Classifiers
and i recommend this file (but its up to you!)
SILVA-138-SSURef-noSpeciesLabels-full-length-classifier.qza

Ask @SoilRotifer for the differences between version 1 and 2.

its possible to denois them separately, but its recommended to denois them together

SAMRENDRA01 · April 22, 2020, 4:47pm

hi @SoilRotifer when i run the v3v4 classifier which you provided it shows error.
The scikit-learn version (0.21.2) used to generate this artifact does not match the current version of scikit-learn installed (0.22.1). Please retrain your classifier for your current deployment to prevent data-corruption errors.
i don't have the primers information of my Illumina sequence and service provider also not giving response to my mail. so can you please provide my pre-trained v3v4 region classifier? also, i don't have the much powerful machine to train the classifier by myself. so if possible please provide me siliva classifier that will very help for me. thanks for helping.

Francisco · April 22, 2020, 8:02pm

Hey there!

Those classifiers work on Qiime2 version 2019.10 , and i gess you have version 2020.1 . You can create a conda enviroment with version 2019.10 (like installing qiime2 but 2019.10 without uninstalling v 2020.1

SAMRENDRA01 · April 23, 2020, 6:24am

thank you sir @Francisco Frans, but how can i create a conda environment with version 2019.10, without installing 2020.2 version ( currently i am using ).

thermokarst · April 24, 2020, 2:21pm

Please follow the instructions here:

https://docs.qiime2.org/2019.10/install/native/

SAMRENDRA01 · April 24, 2020, 2:39pm

thank you @thermokarst