Dear admins,
Many researchers work on the 16S gene. What factors determine whether to use global (VSEARCH) or local (BLAST+) alignment? What criteria should I use to pick the right one?
Thanks
Qiimer
The papers (BLAST, VSEARCH) are a good place to start. The VSEARCH paper in particular spends some time comparing its (optimal) algorithm to BLAST's optimal-with-certain-parameters approach.
Thanks, sir.
I just have a simple question. Many people use the VSEARCH classifier for 16S analysis. Would it be problematic to use the BLAST+ classifier to analyze 16S reads?
I'm not an expert on this, @TurboQiimer, but BLAST has been around for a long time, and I'd be surprised if there's a "problem" with using it in this context. As with most things, I suspect your decision is going to be one of "what's the best tool for the job?"
BLAST is a popular tool on the NCBI website. There is no doubt that it is a useful and effective alignment tool, but from browsing the QIIME 2 forum, I noticed that VSEARCH is widely used for classification. I read the two papers you suggested. The part I needed was explained in mathematical or algorithmic terms that I could not follow. Yeah... you are right! I simply want to know which one is the best, VSEARCH or BLAST, although the BLAST output suggests that it works well in my case! I am really in a dilemma!
Thanks in advance for your guidance.
Qiimer
I'm not really sure how to answer this question, because it depends on how you define 'best'. If you were asking about protein alignment, I would say blast (because vsearch does not do protein alignment).
In your comparison, are you using positive controls with a known taxonomic composition? How do the vsearch and blast+ classifications compare to your expected results?
Colin
If you look at this, you will see that BLAST generates many more taxa than VSEARCH!
As a researcher, which do you prefer and recommend? The results are very different!
As the old saying goes, "A picture is worth a thousand words."
By the way, in reality, which result do I have in my samples? The one from BLAST+ or the one from VSEARCH?
Qiimer
This is the big question. More is not always better.
Are these positive controls with a known composition?
Colin
What do you mean by that? Do you mean that the VSEARCH result is the one to accept?
Could you please explain a little bit more? I am not sure I understood the point of your question!
Thank you very much for replying, especially with the New Year around the corner!
Qiimer
Sure thing.
You have been asking about the taxonomy classification results of blast and vsearch, and have shown that they produce different results.
> in reality, which result do I have in my samples?
One way to answer this question is to use a mock community with a known mix of microbes. For example, in this paper, they simulate microbial samples so they know the exact composition of each sample. Then they analyse these samples in different ways, and just like you, they observed that different analysis methods produced slightly different results.
So which result is best? Well, because they knew the true composition of their mock samples, they could find the analysis method that most closely matches that true composition.
Here is an example of a mock sample being classified by two programs:
Taxonomy | True % in mock sample | program1 | program2 |
---|---|---|---|
taxa1 | 50% | 49% | 35% |
taxa2 | 40% | 39% | 35% |
taxa3 | 10% | 12% | 30% |
> I simply want to know which one is the best
Based on these example results, I would say program 1 is the best, as it most closely matches the expected composition of the mock sample.
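To make "most closely matches" concrete, here is a minimal Python sketch that scores each program against the mock composition from the table above. Using Bray-Curtis dissimilarity is my assumption for illustration (benchmarking papers may use other metrics), and the `bray_curtis` function and the dictionaries are hypothetical names, not part of QIIME 2, BLAST+, or VSEARCH.

```python
# Percent abundances copied from the mock-sample table above.
expected = {"taxa1": 50, "taxa2": 40, "taxa3": 10}
program1 = {"taxa1": 49, "taxa2": 39, "taxa3": 12}
program2 = {"taxa1": 35, "taxa2": 35, "taxa3": 30}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 0 = identical profiles, 1 = no shared taxa."""
    taxa = set(a) | set(b)
    # Sum of absolute abundance differences over every taxon seen in either profile.
    diff = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    # Total abundance across both profiles, used to normalize the difference.
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return diff / total


scores = {
    "program1": bray_curtis(expected, program1),
    "program2": bray_curtis(expected, program2),
}
# The lowest dissimilarity is the closest match to the known composition.
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("closest to the mock composition:", best)
```

With the numbers from the table, program1 scores 0.02 and program2 scores 0.20, so program1 is the closer match. Note that this calculation only works because the `expected` profile is known.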
Here is an example of a real sample being classified by the same two programs.
Taxonomy | True % in Sample1 | program1 | program2 |
---|---|---|---|
taxa1 | ? | 55% | 25% |
taxa2 | ? | 27% | 27% |
taxa3 | ? | 18% | 48% |
> I simply want to know which one is the best
We don't know which program is best because we do not know the true composition of Sample1.
So are we stuck? Based on these two examples, would you use program1 or program2?