Samples with high contamination from chloroplast DNA in 16S analysis

Sparkle · October 21, 2019, 2:03pm

Hello! This is a conceptual question, but I'd like to hear your opinions here.
I'm running a 16S analysis on some old plant samples that were sequenced by Roche 454 pyrosequencing 5 years ago.
I used the latest version (gg_13_8) of the Greengenes database to build my taxonomy reference file and then performed the taxonomic analysis, resulting in the following barplot:

The dark blue part of the bars is labelled as 'k__Bacteria;p__Cyanobacteria;c__Chloroplast;o__Streptophyta;f__;g__;s__'

Sadly, I think the problem is due to a bad primer choice (27 and 533).

I don't think there is any way to obtain something useful from these results, because the contamination affects more than half of the samples, which makes comparisons impossible. I'm asking just out of curiosity.

Reading the QIIME 2 tutorial, I noticed that the feature-classifier extract-reads method trims the reference sequences to the region amplified by the primers before the classifier is trained, and that this has been shown to improve results. I doubt it would help in my case, but do you think it could make the results even slightly better?

Any other suggestions?
Thanks in advance.

Nicholas_Bokulich · October 21, 2019, 2:12pm

Hi @Sparkle

No. Since your issue is chloroplast amplification, extract-reads will not help: your reads are chloroplast any way you cut it.

The only advice I can give is to use qiime taxa filter-table to drop all chloroplast reads and hope you have enough reads and samples left over to make a meaningful analysis.

Sparkle · October 21, 2019, 2:24pm

Thanks for your answer!

I had already tried that.

qiime taxa filter-table --i-table uchime-dn-out_cr_new/table-nonchimeric-wo-borderline.qza --i-taxonomy gg_13_8/gg_13_8_otus/taxonomy.qza --p-exclude mitochondria,chloroplast --o-filtered-table table-no-mitochondria-no-chloroplast_cr.qza

As I said, in the plot chloroplast DNA is represented by the dark blue part of the bars. Sadly, too many samples (6 out of 12) consist solely of this type of DNA.
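
For anyone who wants to put numbers on this, one way to check how many reads each sample keeps after filtering is to total the columns of the filtered table. This is only a rough sketch: it assumes the filtered table has already been exported to a tab-separated file (for example with qiime tools export followed by biom convert --to-tsv), that the header row starts with "#OTU ID", and that there is no extra taxonomy column. The function name sample_totals and the file name table.tsv are made up for illustration and are not part of QIIME 2.

```shell
#!/bin/sh
# sample_totals: print "sample<TAB>total reads" for every sample column of a
# tab-separated feature table. Hypothetical helper, not a QIIME 2 command.
# Assumptions: the header row starts with "#OTU ID", any other line starting
# with "#" is a comment, and every data column holds counts.
sample_totals() {
    awk -F '\t' '
        /^#OTU ID/ { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        /^#/       { next }   # skip other comment lines
        NF > 1     { for (i = 2; i <= n; i++) total[i] += $i }
        END        { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
    ' "$1"
}
```

Calling sample_totals table.tsv prints one line per sample. Samples with a total of zero, or close to it, are the ones that consisted almost entirely of chloroplast or mitochondrial reads.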

I think the experiment has to be repeated, with a more specific primer choice and maybe with Illumina reads this time.