Difficulty assigning taxonomy to cyanobacterial ASVs

Hello! I am trying to find the best method and database to assign taxonomy to my 1,107 ASVs. (For context, my samples were collected from seawater that was filtered and prepared for sequencing, bioinformatics, etc.) Because of this, I was expecting to see a rich diversity of Cyanobacteria. I first tried the SILVA database (Silva 138.1 prokaryotic SSU taxonomic training data formatted for DADA2), but classification only went down to the genus level and a lot of my ASVs had "NA" values.

I next tried blastn (using my rep seqs.fasta file) against the "db nt" database with several different sets of parameters. Specifically, to improve my chances of seeing cyanobacterial taxonomy (if present), I tried these entrez queries: "txid1117[Organism:exp] AND 16S[Title] NOT uncultured[Title]", "txid1117[Organism:exp] NOT uncultured[Title]", and "txid1117[Organism:exp]". For each blastn run, I kept these parameters the same: evalue 1e-20, perc_identity 90, and max_target_seqs 10. Unfortunately, these three iterations of blastn only identified 1096 ASVs, 1104 ASVs, and 1368 ASVs (with 263 uncultured ASVs), respectively. Since these three iterations of blastn only assign taxonomy to about 10% of my ASVs, I'm unsure what else I should try to improve my ASV representation.

I also tried placing my unassigned ASVs on a phylogenetic tree built from reference sequences, but since so many of them are unassigned, this method hasn't been very successful. Lastly, I tried pre-training a classifier to the species level, specific to the 16S V4 and V5 region I sequenced, with no luck. I have only just started learning bioinformatics, so any and all advice on methods to improve ASV taxonomy assignment would be much appreciated. Thank you!

I just wanted to offer some encouragement. It sounds like you are doing a pretty good job of exploring things as far as classification goes.

I also wonder whether the classification is actually fine. It's hard to tell, but having some ASVs end up without good taxonomic annotations is basically inevitable.

But presuming the classification really isn't any good, you should become suspicious of your ASVs next. What happens if you BLAST with a lower percent identity? Does the top hit leave a large fraction of the query unaligned? That would be a good clue that the ASVs themselves are the issue.
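If it helps, here is a rough sketch of how that check could be scripted. It assumes the BLAST results were written in tabular form with a custom column list, for example `-outfmt "6 qseqid sseqid pident length qlen qstart qend evalue bitscore"`. The function names, the example rows, and the 0.9 coverage cutoff are all made up for illustration, so adjust them to your own data.

```python
# Assumed column order, matching a tabular request such as:
#   -outfmt "6 qseqid sseqid pident length qlen qstart qend evalue bitscore"
COLUMNS = ["qseqid", "sseqid", "pident", "length", "qlen",
           "qstart", "qend", "evalue", "bitscore"]

# Hypothetical example rows (tab-separated), not real results.
EXAMPLE = "\n".join([
    "ASV_1\tref_A\t99.2\t370\t373\t1\t370\t1e-180\t660",
    "ASV_2\tref_B\t91.0\t200\t373\t1\t200\t1e-60\t250",
    "ASV_2\tref_C\t88.0\t180\t373\t5\t184\t1e-40\t180",
])


def top_hits(lines):
    """Keep the highest-bitscore hit for each query sequence."""
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, line.split("\t")))
        query = hit["qseqid"]
        if query not in best or float(hit["bitscore"]) > float(best[query]["bitscore"]):
            best[query] = hit
    return best


def query_coverage(hit):
    """Fraction of the query covered by this single alignment."""
    aligned = int(hit["qend"]) - int(hit["qstart"]) + 1
    return aligned / int(hit["qlen"])


def suspicious_asvs(lines, min_coverage=0.9):
    """Return ASVs whose top hit aligns to less than min_coverage of the query."""
    flagged = {}
    for asv, hit in top_hits(lines).items():
        coverage = query_coverage(hit)
        if coverage < min_coverage:
            flagged[asv] = round(coverage, 3)
    return flagged


if __name__ == "__main__":
    print(suspicious_asvs(EXAMPLE.splitlines()))
```

With real output you would pass the lines of your results file in place of `EXAMPLE.splitlines()`. ASVs that get flagged, where only part of the sequence aligns to its best hit, are the ones worth inspecting by hand first.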

Hello! Thank you so much, I appreciate your words of encouragement! I will try BLASTing with a lower pident.
