Taxonomic analysis issue

microme · September 23, 2019, 8:29pm

From my rep-seqs file generated after DADA2, I can see that I have host DNA in addition to the bacterial DNA. The region I amplified is V4-V6.
rep-seqsPE.qzv (269.2 KB)

I still proceeded with the subsequent taxonomic analysis steps using SILVA 132, with a classifier I had trained according to my primer region. In the end, I generated a taxonomy table in which almost all features are either Unassigned or D_0__Bacteria. I went ahead anyway because I assumed that I could filter out the unwanted sequences using qiime taxa filter-seqs and qiime taxa filter-table.

When I use the commands below to filter the unwanted taxa out of both my rep-seqs and my feature table:
Command 1
qiime taxa filter-seqs \
  --i-sequences /test/rep-seqsPE.qza \
  --i-taxonomy /test/testsilva_97_v4-v6_taxonomy.qza \
  --p-exclude archaea,eukaryota,mitochondria,chloroplast,Unassigned, \
  --o-filtered-sequences /test/rep-seqs-bac.qza

For command 1 I get: Plugin error from taxa:
All features were filtered, resulting in an empty collection of feature sequences.
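
One thing worth checking in Command 1 is the trailing comma after Unassigned in --p-exclude, which Command 2 does not have. Below is a minimal sketch of why that could matter. It assumes the filter splits the list on commas and treats each term as a case-insensitive substring match; that is an assumption about the plugin's behaviour, not something confirmed in this thread. The `excluded` helper and the three taxonomy strings are hypothetical and only for illustration.

```python
def excluded(taxon: str, exclude: str) -> bool:
    """Return True if any comma-separated term occurs in the taxon string.

    Assumption: each term is matched as a case-insensitive substring.
    """
    # A trailing comma makes split() return a final empty term "".
    terms = exclude.split(",")
    # The empty string is a substring of every string, so it matches everything.
    return any(term.lower() in taxon.lower() for term in terms)


# Hypothetical taxonomy labels, for illustration only.
taxa = [
    "D_0__Bacteria;D_1__Proteobacteria",
    "D_0__Bacteria",
    "Unassigned",
]

with_comma = "archaea,eukaryota,mitochondria,chloroplast,Unassigned,"
without_comma = "archaea,eukaryota,mitochondria,chloroplast,Unassigned"

kept_with_comma = [t for t in taxa if not excluded(t, with_comma)]
kept_without_comma = [t for t in taxa if not excluded(t, without_comma)]

print(kept_with_comma)     # []
print(kept_without_comma)  # ['D_0__Bacteria;D_1__Proteobacteria', 'D_0__Bacteria']
```

If the plugin behaves like this sketch, removing the trailing comma should make Command 1 keep the same features as Command 2.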

Command 2
qiime taxa filter-table \
  --i-table /test/tablePE.qza \
  --i-taxonomy /test/testsilva_97_v4-v6_taxonomy.qza \
  --p-exclude archaea,eukaryota,mitochondria,chloroplast,Unassigned \
  --o-filtered-table /test/table-bac.qza

However, for command 2 I generate a table that looks like:

Command for visualizing:
qiime metadata tabulate \
  --m-input-file /data/p281301/test/table-bac.qza \
  --o-visualization /data/p281301/test/table-bac.qzv
The taxonomy table I generate consists of Unassigned and Bacteria. But when I followed the links from the rep-seqs file to the NCBI database, I did spot a few bacterial genera (and species); this is not reflected in the taxonomy table. For example, if the NCBI database shows E. coli as a result, I should correspondingly get taxonomy down to genus level and not just the domain D_0__Bacteria. Where am I going wrong?
I have already looked here, here, here, and here.

microme · September 23, 2019, 8:33pm

Also to let you know, I have looked this up here, here, here, here.

ben · September 23, 2019, 9:16pm

Your first link is expired. Would like to take a look at it. Ben

edit: when I had this problem, the dada2 portion did not have enough overlap and ended up without partial reads.

thermokarst · September 23, 2019, 9:35pm

Hey @ben, the first link is fine - here is a view link to it. You might've run into a situation where you downloaded and pre-cached the viz. Thanks!

PS - you can also just download the QZV and open directly in q2view.

microme · September 23, 2019, 10:19pm

Sorry, @ben was right: the link was not working, so I edited the post and uploaded the qzv file instead, but I did not mention that I had corrected it. Sorry for the confusion.

microme · September 26, 2019, 3:00pm

I performed taxonomic analysis on the single-end reads, which cover the V4 region. Earlier, when I did the taxonomic analysis (here) using paired-end reads for the same samples, I found that almost all of the features corresponded to either D_0__Bacteria or Unassigned. I assumed this might be because I did not successfully merge them.

So I repeated the same steps with the single-end reads, and for a few features I did get family- and genus-level classification (but still very few). The remaining ones are still Unassigned (these are from the host), while for several features I still see D_0__Bacteria: what does this mean? Can anything be done to improve this classification? The now-improved taxonomy file for the single-end reads is below.
taxonomySE.qzv (1.3 MB)

Nicholas_Bokulich · September 26, 2019, 3:49pm

Chances are you were using an inappropriate reference dataset. You can search the forum for other examples, but 99% of the time this happens when, e.g., someone has V1-3 reads and is using a V4 classifier, or even V4-6 reads and a V4 classifier. The classifier needs to be trained on reference sequences that cover at least the target region (so, e.g., full-length 16S is fine for classification of V4).
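
The coverage rule above can be sketched as a simple interval check. This is only an illustration: the `Region` type, the `covers` helper, and the coordinates are hypothetical placeholders (loosely in the spirit of E. coli numbering), not exact positions for any primer set.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A region of the 16S gene as a (start, end) pair of positions."""
    start: int
    end: int


def covers(reference: Region, target: Region) -> bool:
    """Return True if the reference region fully contains the target region."""
    return reference.start <= target.start and reference.end >= target.end


# Illustrative coordinates only (hypothetical, not exact primer positions).
FULL_LENGTH = Region(1, 1500)
V4 = Region(515, 806)
V4_V6 = Region(515, 1100)

print(covers(FULL_LENGTH, V4))  # True: a full-length classifier covers V4 reads
print(covers(V4, V4_V6))        # False: a V4 classifier does not cover V4-V6 reads
print(covers(V4_V6, V4))        # True: a V4-V6 classifier covers V4 reads
```

Under these placeholder coordinates, the only failing case is the one where the classifier region is narrower than the amplicon.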

One of two things:

1. the reads are too short
2. the reads are not really bacterial but are host-derived; spot check a few with NCBI BLAST (exclude uncultured/unclassified/unknown) to confirm.

As for improving the classification:

1. same issue as #1; the classifier does not fully overlap your target region, so you are getting bad classifications.
2. your reference database may have poor coverage of the target taxa.
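
To judge whether a change (a different classifier, single-end versus paired-end reads) actually helps, it can be useful to count how deep each assignment goes. The sketch below assumes SILVA 132 style strings with ranks separated by semicolons; the `deepest_rank` and `depth_summary` helpers and the example labels are hypothetical and are not taken from the files in this thread.

```python
from collections import Counter
from typing import Iterable


def deepest_rank(taxon: str) -> int:
    """Number of ranks assigned: 0 for Unassigned, 1 for domain only, and so on."""
    if taxon.strip().lower() == "unassigned":
        return 0
    # Ignore empty fragments such as those left by a trailing semicolon.
    return len([level for level in taxon.split(";") if level.strip()])


def depth_summary(taxa: Iterable[str]) -> Counter:
    """Count how many features reach each classification depth."""
    return Counter(deepest_rank(t) for t in taxa)


# Hypothetical labels, for illustration only.
example = [
    "Unassigned",
    "D_0__Bacteria",
    "D_0__Bacteria",
    "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;"
    "D_3__Enterobacteriales;D_4__Enterobacteriaceae;D_5__Escherichia-Shigella",
]

summary = depth_summary(example)
print(dict(sorted(summary.items())))  # {0: 1, 1: 2, 6: 1}
```

A large share of features at depth 0 or 1 would be consistent with the causes listed above: host reads, reads that are too short, or a classifier that does not span the amplified region.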

system · October 27, 2019, 9:49pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.