Matching ASVs with 16S Sanger amplicons

So, alongside the 16S metataxonomic sequencing (Illumina MiSeq), my team took a culturomics approach on the same pool of samples, resulting in a set of Sanger amplicons from the isolated strains. These amplicons cover the 8F-907R region, while the metataxonomic data cover the 385F-805R region, so hypothetically each ASV's region is contained within the culturomics amplicons.

The Sanger data were first basecalled with Tracy and trimmed with Cutadapt. I then imported them into q2, dereplicated them using VSEARCH (as suggested in other posts), and assigned taxonomy with the NB classifier. After checking the taxonomy, I saw full correspondence between the isolate and ASV taxonomies (yay!).

Now I'm trying to retrieve the ASVs that might belong to the isolates, or be closely related to them, in order to determine how efficient the culturomics was (kind of like when an article says "15% of bacteria found in common by metagenomics and culturomics").

At first I tried a BLAST search, but the issue with this method is that I only get how much of a sequence is shared (as an identity percentage), which doesn't necessarily satisfy other criteria such as a phylogenetic one. Also, the amplicons have regions that are completely identical, which messes up the BLAST searches.
I tried clustering the 16S Sanger amplicons, but that led nowhere. I tried SEPP, but I just got errors and zero phylogenies (apparently it's not designed for Sanger sequences; I don't know what I was expecting tbh). I also haven't added the Sanger amplicons to my reference database (SILVA), as the taxonomy of these amplicons isn't clear and I don't want to confuse the classifier with fuzzy points. So now I'm trying to find a straightforward and painless method to perform this analysis. That's the reason for this post.

So, forum members, what kind of methods would you recommend to perform this kind of search?

Hi @WeedCentipede,

Given your question, I think our recent work here may be of interest:

Carper, et al. (2021). “Cultivating the Bacterial Microbiota of Populus Roots.”

In one portion of this work, we compared the 16S rRNA gene sequences obtained from our culture collection to the 16S rRNA amplicon data of prior surveys. From this we obtained a high-level overview of how commonly these isolates might be present within the different plant compartments.
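
To make the idea of such a comparison concrete, here is a minimal Python sketch. It is only an illustration and not necessarily the method used in the paper: it assumes the ASV region lies fully inside the longer Sanger amplicon and looks for exact containment (in either orientation), then reports how much of the community abundance is covered by matched ASVs. The function names, the toy sequences, and the exact-match criterion are all my own assumptions; in practice you would probably allow some mismatches with an identity threshold.

```python
from typing import Dict, List

# Translation table used to complement DNA bases (assumes plain A/C/G/T).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def match_asvs_to_isolates(
    asvs: Dict[str, str], isolates: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each ASV id to the isolate ids whose sequence contains it exactly.

    Hypothetical helper: exact substring matching is a strict criterion
    chosen for illustration only.
    """
    hits: Dict[str, List[str]] = {}
    for asv_id, asv_seq in asvs.items():
        forward = asv_seq.upper()
        reverse = reverse_complement(forward)
        matched = [
            iso_id
            for iso_id, iso_seq in isolates.items()
            if forward in iso_seq.upper() or reverse in iso_seq.upper()
        ]
        if matched:
            hits[asv_id] = matched
    return hits


def recovered_fraction(
    hits: Dict[str, List[str]], abundances: Dict[str, float]
) -> float:
    """Fraction of total abundance carried by ASVs that matched an isolate."""
    total = sum(abundances.values())
    matched = sum(abundances.get(asv_id, 0) for asv_id in hits)
    return matched / total if total else 0.0


# Toy data, invented for the example.
asvs = {"asv1": "ACGTTGCA", "asv2": "GGGGCCCC", "asv3": "TTTTAAAC"}
isolates = {"iso1": "TTACGTTGCATT", "iso2": "AAGTTTAAAAGG"}
abundances = {"asv1": 50, "asv2": 30, "asv3": 20}

hits = match_asvs_to_isolates(asvs, isolates)
print(hits)
print(recovered_fraction(hits, abundances))
```

Reporting both the number of matched ASVs and their share of total abundance is useful, since a few cultured strains can account for a large part of the community (or the reverse).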


Hi @SoilRotifer

That paper is exactly what I was looking for, thanks!

Small question though: in order to assign taxonomy with vsearch using the 16S isolate database, is there any database curation process to take into account? Maybe assign taxonomy to the isolates through different methods and then join them? idk

Thank you so much for your answer,

Well, you could try out our really cool sequence curation plugin, RESCRIPt. You should be able to work through the tutorials linked there. It'll be a good starting point to see what you can do with sequence curation.

You can do just that! That is, you can assign taxonomy to

  1. your isolate database
  2. SILVA and/or GreenGenes (see the Data resources page)
  3. perhaps append your isolate sequences to the NCBI RefSeq db?
  4. any combination ...

Then you can pick the one that appears most reasonable to you, or you can run rescript merge-taxa to return a taxonomy from all of the different approaches listed above, given certain parameters... One example use case is outlined here.
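
To give a feel for what merging taxonomies from several approaches can mean, here is a small Python sketch of a last-common-ancestor (LCA) style consensus. This is not RESCRIPt's implementation, just an illustration under my own assumptions: taxonomy strings are semicolon-delimited, ranks are compared position by position, and the function names are hypothetical.

```python
from typing import List


def split_ranks(taxonomy: str) -> List[str]:
    """Split a semicolon-delimited taxonomy string into trimmed rank labels."""
    return [rank.strip() for rank in taxonomy.split(";") if rank.strip()]


def lca_taxonomy(taxonomies: List[str]) -> str:
    """Return the deepest lineage shared by all non-empty taxonomy strings.

    Illustrative helper: ranks are compared in order and the consensus
    stops at the first rank where the assignments disagree.
    """
    ranked = [split_ranks(t) for t in taxonomies if t and t.strip()]
    if not ranked:
        return "Unassigned"
    consensus: List[str] = []
    # zip stops at the shortest lineage, so the consensus is never deeper
    # than the least resolved assignment.
    for ranks in zip(*ranked):
        if len(set(ranks)) != 1:
            break
        consensus.append(ranks[0])
    return "; ".join(consensus) if consensus else "Unassigned"


# Toy assignments for one isolate from three hypothetical approaches.
assignments = [
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; g__Pseudomonas",
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; g__Azotobacter",
    "d__Bacteria; p__Proteobacteria; c__Gammaproteobacteria",
]
print(lca_taxonomy(assignments))
```

An LCA consensus is conservative: it only keeps the ranks on which every approach agrees, which helps when the isolate taxonomy is fuzzy. Other merge strategies (for example, keeping the longest or most frequent assignment) trade that caution for resolution.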

Thank you so much! It's so much clearer for me now.

