Epipyxis sp. and Neotessella volvocina detected in groundwater

Hi there,


I did 16S rRNA amplicon sequencing of some DNA and cDNA extracts from groundwater samples some time ago (targeting the V3–V4 region). I did all the processing and analysis in QIIME 2.

I used 'dada2 denoise-paired' to denoise and merge the paired-end reads. I then assigned taxonomic labels to the resulting ASVs with a classifier trained (using 'q2-feature-classifier') on the V3–V4 variable region of pre-formatted representative 16S rRNA sequences derived from the SILVA rRNA database (release 138) using RESCRIPt.

Here are the resulting taxa bar plots: taxa-bar-plots.qzv (1.7 MB)

My question

What am I to make of the fact that the following two taxa were detected in the groundwater from one of my wells (at relative abundances between 2.5% and 26.8%)?

  1. d__Bacteria;p__Cyanobacteria;c__Cyanobacteriia;o__Chloroplast;f__Chloroplast;g__Chloroplast;s__Epipyxis_sp.

  2. d__Bacteria;p__Cyanobacteria;c__Cyanobacteriia;o__Chloroplast;f__Chloroplast;g__Chloroplast;s__Neotessella_volvocina

Epipyxis appears to be a mixotrophic freshwater alga. It has been detected in surface water in a previous study, which targeted the same region of the 16S rRNA gene as we did (V3–V4) and used the same database for assigning taxonomy (SILVA).

Neotessella volvocina appears to be a freshwater alga too.

These two taxa were not detected at >0.1% relative abundance in any of the other groundwater samples in the study. They were, however, detected at low levels in one negative control (≤2.2% relative abundance), which is one reason to be cautious about interpretation.

My interpretation

Algae are known indicators of surface water intrusion in groundwater wells (see the quote below, from this paper):

Fungi and algae are generally not found in detectable concentrations in groundwater. However, both types of microbes can migrate into groundwater supplies from surface or near-surface sources, such as surface water bodies and sewer lines. The presence of algae cells in a groundwater supply is an indication that the water in the well or spring is probably originating from a nearby stream or lake; algae grow only in surface water, not groundwater.

That paper is from 1997, and refers (presumably) to the detection of algae via culture-based or microscopy methods. Nevertheless, I've been wondering whether detection of the above taxa (Epipyxis and Neotessella volvocina) in 16S data from groundwater could indicate that surface water from a nearby river is intruding through the subsurface towards the well from which the groundwater sample was taken. The river is within about 60 m of the well.

Is this in any way a reasonable possibility? I still consider myself an amateur when it comes to understanding bioinformatics and microbiome analyses, so I could be way off in my assumptions. One thing that makes me hesitate is that 16S rRNA amplicon sequencing is presumably not the method of choice for studying algae, so I worry about making deductions about algae based on 16S data alone. The other consideration is that these taxa were detected at low levels in one negative control, so cross-contamination of the amplicon libraries cannot be entirely ruled out as an explanation.
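To make the negative-control concern a bit more concrete, here is a minimal Python sketch of one possible screen: it flags any taxon whose relative abundance in a sample is less than some multiple of its relative abundance in the negative control. The taxon names, the read counts, and the 10x threshold are all made-up placeholders chosen for illustration, not values from this study, and the function names are hypothetical. This is a rough heuristic under those assumptions, not an established decontamination method.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def flag_possible_contaminants(sample_counts, control_counts, min_ratio=10.0):
    """Return taxa whose sample abundance is less than min_ratio times
    their abundance in the negative control (assumed threshold)."""
    sample = relative_abundance(sample_counts)
    control = relative_abundance(control_counts)
    flagged = []
    for taxon, s in sample.items():
        c = control.get(taxon, 0.0)
        # Taxa absent from the control are never flagged by this screen.
        if c > 0 and s / c < min_ratio:
            flagged.append(taxon)
    return sorted(flagged)


# Placeholder counts for illustration only (not real data).
well_sample = {"taxon_A": 268, "taxon_B": 25, "taxon_C": 707}
negative_control = {"taxon_A": 22, "taxon_B": 5, "taxon_D": 973}

flagged = flag_possible_contaminants(well_sample, negative_control)
print(flagged)
```

With these placeholder counts, taxon_A is about 12 times more abundant in the sample than in the control and passes, while taxon_B is only about 5 times more abundant and is flagged. Any taxon that passes such a screen would still need independent support before being interpreted.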

Curious to hear other people's ideas, and thanks in advance for any help!




One important point first: as we highlight in our RESCRIPt tutorial, SILVA does not curate species labels. We warn about this under the drop-down menu "Species-labels: caveat emptor!"

I'd suggest that you fetch trustworthy sequences of these algal taxa from GenBank and place them into a single FASTA file. Then add some legitimate cyanobacteria too. Then add your two putative algal ASV sequences to that same file. From there I'd generate a sequence alignment and a phylogeny. What you should see is two main clades: 1) true cyanobacteria, and 2) chloroplasts. If your sequences land within the chloroplast clade, then you can trust that they are indeed chloroplasts and not misidentified bacteria.
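The file-assembly step can be sketched in plain Python. This is only an illustration: the group labels, record names, and short sequences below are placeholders rather than real GenBank records, and the helper names `read_fasta` and `combine_fasta` are hypothetical. Each header is prefixed with its group so that the tips are easy to tell apart in the final tree.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_fasta(groups):
    """Merge several FASTA sources into one string, prefixing each
    header with its group label."""
    records = []
    for label, handle in groups.items():
        for header, seq in read_fasta(handle):
            records.append(f">{label}|{header}\n{seq}")
    return "\n".join(records) + "\n"


# Placeholder records; in practice these would be open file handles.
groups = {
    "algal_ref": StringIO(">ref_chloroplast_1\nACGTACGT\n"),
    "cyano_ref": StringIO(">ref_cyano_1\nACGTTTGA\nCC\n"),
    "query_asv": StringIO(">asv_0001\nACGTACGA\n"),
}
combined = combine_fasta(groups)
print(combined)
```

The combined file would then go to whichever aligner and tree builder you prefer (MAFFT and FastTree are common choices), after which you can check which clade the `query_asv` tips fall into.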

You can also submit your potential algal sequences to SILVA's ACT tool and see whether they actually fall within an algal (chloroplast) clade.

But I'd still be wary of overclaiming without other supporting data.


Hi @SoilRotifer,

I did actually know that the SILVA species labels are unreliable, but forgot to mention it in my post above. Thanks for the reminder! I should have written Neotessella rather than Neotessella volvocina.

You made some nice suggestions. Thank you very much for those! I'll report back if I make any headway.

Not sure what I would do without this forum, by the way. Always great ideas here! :+1:


Try a different database (curated and updated), such as MIMt, and check whether the results are the same, since many of the SILVA assignments can be wrong due to misclassifications. MIMt has every single sequence taxonomically annotated at species level, following the most recent NCBI taxdump file.
You can get the MIMt database at https://mimt.bu.biopolis.pt
