I’d like to pose a question to all of you about how to handle my sequencing data.
Background: I sequenced 400 soil samples. For the library prep I used four primer sets: one each for bacteria, fungi, protists and metazoa. I first ran four separate PCRs on each sample and pooled the amplification products by sample. I then ran a second PCR to attach sample-specific barcodes. After the barcodes were attached, I pooled all the samples and sent them for sequencing (MiSeq 2x300bp).
When I got the demultiplexed sequences back, I first divided them by organismal group: based on the locus-specific primer used in the first PCR, I separated the reads into bacterial, fungal, protist and metazoan datasets. I then ran QIIME2 separately on each of the four datasets.
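To illustrate what I mean by splitting on the locus-specific primer, here is a toy sketch of the idea only. The primer sequences and reads are made up, and the matching is a simple exact prefix match; real primers are longer, often contain degenerate bases, and a proper tool would tolerate mismatches.

```python
# Hypothetical forward primers (made up for illustration only).
PRIMERS = {
    "bacteria": "ACGTAC",
    "fungi": "GGTCAT",
    "protists": "TTAGCC",
    "metazoa": "CCATGA",
}


def assign_group(read, primers=PRIMERS):
    """Return the group whose primer the read starts with, or None."""
    for group, primer in primers.items():
        if read.startswith(primer):
            return group
    return None


def split_reads(reads, primers=PRIMERS):
    """Bin reads by primer group; reads matching no primer go to 'unassigned'."""
    bins = {group: [] for group in primers}
    bins["unassigned"] = []
    for read in reads:
        bins[assign_group(read, primers) or "unassigned"].append(read)
    return bins


# Made-up reads: three start with one of the primers, one matches nothing.
reads = ["ACGTACGGGT", "GGTCATAAAC", "NNNNNNNNNN", "CCATGATTTT"]
bins = split_reads(reads)
```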
For the taxonomy assignment, I trained three separate classifiers on Silva 138: one each for bacteria, fungi and metazoa. For protists I trained a classifier on PR2.
Once the taxonomy was assigned, I could see that all of the ‘locus-specific’ eukaryotic primers also amplified organisms from other groups (e.g. in the fungal dataset I also got many protists and some metazoa; in the metazoa dataset I got many protists; and in the protist dataset I got many fungi). Although this wasn’t a surprise, I am wondering what to do with the ASVs that ended up in the ‘wrong’ dataset.
As a first step, for each dataset I filtered out the ASVs that weren’t assigned to the group of interest (e.g. from the fungal dataset I discarded all protist and metazoan ASVs and kept only the fungal ASVs).
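In case it helps to make the filtering step concrete, this is roughly the logic, written as a small pandas sketch rather than the actual QIIME2 commands. The ASV IDs, counts and taxonomy strings are made up, and `split_by_group` is a hypothetical helper. It also keeps the off-target ASVs in a separate table instead of throwing them away.

```python
import pandas as pd

# Made-up fungal-primer dataset: rows are ASVs, columns are samples.
table = pd.DataFrame(
    {"sample1": [12, 3, 7, 1], "sample2": [0, 5, 2, 4]},
    index=["asv1", "asv2", "asv3", "asv4"],
)

# Made-up taxonomy strings; real classifier output looks different.
taxonomy = pd.Series(
    {
        "asv1": "Eukaryota;Fungi;Ascomycota",
        "asv2": "Eukaryota;Protist;Cercozoa",
        "asv3": "Eukaryota;Fungi;Basidiomycota",
        "asv4": "Eukaryota;Metazoa;Nematoda",
    }
)


def split_by_group(table, taxonomy, keyword):
    """Split a table into (on-target, off-target) ASVs by a taxonomy keyword."""
    labels = taxonomy.reindex(table.index).fillna("")
    is_target = labels.str.contains(keyword, regex=False)
    return table[is_target], table[~is_target]


fungal_only, off_target = split_by_group(table, taxonomy, "Fungi")
```

Here `fungal_only` is what I kept, and `off_target` holds the protist and metazoan ASVs that I would like to reuse.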
But my question is: would it be a waste to just discard those ASVs? Wouldn’t it be possible to merge, for example, the protists found in the fungal dataset with the protists in the protist dataset?
To do this I would need a way to merge the tables after the taxonomy assignment.
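Something like the following pandas sketch is what I have in mind. It rests on two assumptions of mine: ASVs from different primer sets cover different regions and so cannot be matched by sequence, which means the tables have to be collapsed to taxon labels first; and the labels from the different classifiers have already been harmonized to the same names, which is exactly the part I am unsure about. All tables, labels and function names are made up.

```python
import pandas as pd


def collapse_to_taxon(table, taxonomy):
    """Sum the counts of all ASVs (rows) that share the same taxon label."""
    labels = taxonomy.reindex(table.index)
    return table.groupby(labels).sum()


def merge_collapsed(tables):
    """Stack taxon-level tables and sum counts for taxa present in several."""
    stacked = pd.concat(tables)
    return stacked.groupby(level=0).sum().fillna(0).astype(int)


# Made-up protist dataset (protist primers).
protist_table = pd.DataFrame(
    {"sample1": [10, 5], "sample2": [0, 3]}, index=["pA", "pB"]
)
protist_tax = pd.Series({"pA": "Cercozoa", "pB": "Ciliophora"})

# Made-up protist ASVs recovered from the fungal dataset.
offtarget_table = pd.DataFrame(
    {"sample1": [2, 4], "sample2": [1, 0]}, index=["fX", "fY"]
)
offtarget_tax = pd.Series({"fX": "Cercozoa", "fY": "Apicomplexa"})

merged = merge_collapsed(
    [
        collapse_to_taxon(protist_table, protist_tax),
        collapse_to_taxon(offtarget_table, offtarget_tax),
    ]
)
```

In this toy case the Cercozoa counts from both datasets are added together, while Ciliophora and Apicomplexa each come from a single dataset.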
My second question is: does it make sense to merge taxa that come from different taxonomy assignment methods? For fungi and metazoa I used the same Silva database but trained with different settings, and for protists I even used a different database. So I’m afraid that merging the data (if even possible) isn’t a good idea.
Thank you in advance for your support! I hope I was clear enough in explaining.