BLAST consensus question

SoilRotifer · September 19, 2022, 5:31pm

Correct. At least up to the best LCA it can resolve to.
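To make the "best LCA it can resolve to" idea concrete, here is a minimal Python sketch of how a consensus lineage can be derived from several BLAST hits. It is only an illustration, not the actual classifier code: the function name `consensus_taxonomy`, the `min_consensus` parameter, and the example lineages are all hypothetical. Setting `min_consensus=1.0` gives a strict LCA, and lower values allow a majority vote at each rank.

```python
from collections import Counter


def consensus_taxonomy(hits, min_consensus=0.51):
    """Return the deepest lineage prefix that at least `min_consensus`
    of all hits agree on (illustrative sketch, not the real classifier)."""
    # Split each semicolon-delimited lineage into a list of rank labels.
    split_hits = [[rank.strip() for rank in hit.split(";")] for hit in hits]
    consensus = []
    candidates = split_hits
    depth = 0
    while candidates:
        # Labels at the current rank among hits still matching the consensus.
        labels = [hit[depth] for hit in candidates if len(hit) > depth]
        if not labels:
            break
        label, count = Counter(labels).most_common(1)[0]
        # Support is measured against all hits, not just the remaining ones.
        if count / len(split_hits) < min_consensus:
            break
        consensus.append(label)
        candidates = [
            hit for hit in candidates if len(hit) > depth and hit[depth] == label
        ]
        depth += 1
    return "; ".join(consensus) if consensus else "Unassigned"


# Hypothetical hit lineages for a single query sequence.
example_hits = [
    "k__Animalia; p__Rotifera; c__Bdelloidea; g__Adineta",
    "k__Animalia; p__Rotifera; c__Bdelloidea; g__Philodina",
    "k__Animalia; p__Rotifera; c__Monogononta",
]

print(consensus_taxonomy(example_hits))                     # majority vote per rank
print(consensus_taxonomy(example_hits, min_consensus=1.0))  # strict LCA
```

With these made-up hits, the majority vote stops at the class level because the two genera disagree, while the strict LCA stops at the phylum. That is the same behaviour you see when a query cannot be resolved below some rank.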

Were they inaccurate or unresolved? Generally speaking, this often results from a reference database that is not as expansive as it could be, i.e. it does not contain enough target reference sequences. Additionally, the marker gene region may not provide enough resolution for your organisms of interest. It could also be a combination of these two issues, or something else.

I would classify your sequences against the MIDORI database without trimming out the amplicon region. Sometimes in silico extraction of amplicon sequences via PCR primer matches fails, for the reasons I've outlined in the RESCRIPt tutorial I linked earlier, potentially leading to issues in taxonomy assignment. If you are better able to identify taxa with the full-length MIDORI database, that would suggest that the primer-based extraction did not work as well.

If you obtain similar results, then it could mean that the marker gene itself does not provide the needed resolution. I've had this problem with various metazoan eDNA studies myself. In that case, the next step is to try another marker gene, like 12S rRNA or something else.

I honestly do not know. I only mentioned it for purposes of providing another option. At least try a step that does not depend too heavily upon PCR primer searches to extract the amplicon region.

In fact, you can take the amplicon region you've extracted from the MIDORI database and use it as a reference to extract even more amplicon regions from MIDORI, at least those that might have been missed by the initial PCR primer search and extraction. That is, this will help ensure that you've extracted as much of your amplicon region of interest from MIDORI as possible. I guess I am saying try using both MIDORI and RESCRIPt.

Besides, there are many approaches to constructing and curating a reference database. The best thing to do is use the approach that best suits your needs.

Perhaps others on the forum have suggestions or approaches I'm not aware of?

-Cheers!
-Mike