Hi everyone,
First of all, congratulations on the tool and its environment. It is amazing how well it works and how streamlined everything is thanks to the tutorials and the support in the forums.
I apologize if there is already a similar topic, but after a few days of searching I could not find one (perhaps you can also point me in other directions).
We work with axenic D. melanogaster that we inoculated with 4 specific Acetobacter, Leuconostoc and Lactobacillus strains, all of which were Sanger-sequenced. The flies were fed different diets, and we want to track how the bacterial composition changes over successive generations.
However, contamination seems to have occurred along the way, as some Gluconobacter are detected after a few generations. We also want to check for contamination by other Acetobacter, as they are frequently found in D. melanogaster.
In our case, we are amplifying the V3-V4 region and, as I understand it, it is quite difficult to get down to the species level with such a short fragment. Also, given my limited computational power, I classify against the SILVA or NCBI databases with BLAST consensus. That consensus seems hard to reach for Acetobacter and, as I understand it, simply taking the top BLAST hit is not appropriate.
As a result, we see a lot of Acetobacter but cannot resolve them to the species level, so we cannot tell whether there is contamination with other Acetobacter.
My supervisor suggested that I compare the 16S Illumina reads with the initial Sanger sequences, and to be honest I have no idea how to do it! (I am new to bioinformatics.)
I thought that I could build a reference database from the Sanger sequences I have and use it in QIIME. Reads that are not fully classified against it would then point to contamination with strains other than our 4 original ones. Is this feasible? If so, where do I start in order to construct the sequence and taxonomy files?
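To make the question more concrete, here is a rough Python sketch of how I imagine the two files could be built. Everything in it is an assumption on my side: the strain IDs, lineages and sequences are made-up placeholders (not our real data), the function names are my own, and the "Feature ID" / "Taxon" header is only my understanding of what a tab-separated taxonomy file should look like, so please correct me if the expected format is different.

```python
import csv
import io

# Hypothetical placeholder records: (ID, lineage, sequence).
# None of these are real strain names or real 16S sequences.
REFERENCE_STRAINS = [
    ("sanger_strain_1", "g__Acetobacter; s__placeholder_1", "acgtacgtacgt"),
    ("sanger_strain_2", "g__Leuconostoc; s__placeholder_2", "ttgacc ggtaac"),
    ("sanger_strain_3", "g__Lactobacillus; s__placeholder_3", "GGCATTAGCCAA"),
]

IUPAC_DNA = set("ACGTRYSWKMBDHVN")


def clean_sequence(seq):
    """Remove whitespace, uppercase, and reject non-IUPAC DNA characters."""
    cleaned = "".join(seq.split()).upper()
    bad = set(cleaned) - IUPAC_DNA
    if not cleaned or bad:
        raise ValueError(f"invalid sequence (unexpected characters: {sorted(bad)})")
    return cleaned


def check_unique_ids(records):
    """Raise ValueError if the same ID appears more than once."""
    seen = set()
    for seq_id, _, _ in records:
        if seq_id in seen:
            raise ValueError(f"duplicate ID: {seq_id}")
        seen.add(seq_id)


def build_fasta(records):
    """Return FASTA text with one '>ID' header line per cleaned sequence."""
    check_unique_ids(records)
    return "".join(f">{seq_id}\n{clean_sequence(seq)}\n" for seq_id, _, seq in records)


def build_taxonomy_tsv(records):
    """Return a two-column TSV mapping each ID to its lineage string."""
    check_unique_ids(records)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Feature ID", "Taxon"])  # assumed header, see note above
    for seq_id, lineage, _ in records:
        writer.writerow([seq_id, lineage])
    return buffer.getvalue()


if __name__ == "__main__":
    # In practice these two strings would be saved to a .fasta and a .tsv file.
    print(build_fasta(REFERENCE_STRAINS))
    print(build_taxonomy_tsv(REFERENCE_STRAINS))
```

The important point, as far as I can tell, is that the IDs in the sequence file and in the taxonomy file match exactly, which is why both are generated from the same list of records.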
If this approach is incorrect, could you guide me to a more realistic strategy?
Thank you very much
Jaime