Unassigned reads k__bacteria;__;__;__;__;__; only in one sample type, murine samples

Nicholas_Bokulich · June 12, 2018, 7:13pm

This is probably NOT an annotation issue (unless you mean that the lack of animal DNA in the reference is to blame). Do you have any posts to reference? This post may hold the answer.

Particularly in low-biomass samples, a high proportion of unassigned reads will probably be host DNA and/or other non-target DNA or artifacts. That is better than cross-contamination, which would be much more difficult to eliminate.

I'd recommend doing a preliminary check (e.g., NCBI blast a few unassigned reads) just to see what these reads might be, then filter out all unassigned reads without giving it another thought.
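As a rough illustration of that preliminary check, here is a minimal Python sketch that pulls a handful of unassigned sequences out of a FASTA file so they can be pasted into NCBI BLAST. It assumes you have already exported your representative sequences to FASTA and loaded your taxonomy into a plain dict of feature ID to taxon string. The function names, the `"Unassigned"` label, and the input layout are all assumptions for illustration, not part of any QIIME API.

```python
import random


def parse_fasta(lines):
    """Yield (feature_id, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the ID, i.e. the first token after '>'.
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sample_unassigned(taxonomy, fasta_lines, n=5, label="Unassigned", seed=0):
    """Return up to n (id, sequence) pairs whose taxon equals `label`.

    `taxonomy` is a hypothetical dict mapping feature ID -> taxon string.
    """
    hits = [
        (fid, seq)
        for fid, seq in parse_fasta(fasta_lines)
        if taxonomy.get(fid, "").strip() == label
    ]
    rng = random.Random(seed)  # fixed seed so the pick is reproducible
    return rng.sample(hits, min(n, len(hits)))
```

Something like `for fid, seq in sample_unassigned(tax, open("seqs.fasta")): print(f">{fid}\n{seq}")` would print a few records ready to paste into BLAST (`seqs.fasta` is a placeholder file name).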

That's very stringent. I would lower that, personally. But that does not actually seem related to the issue you are having, since unassigned is only high in the low-biomass samples, suggesting that it may be some artifact/background noise/host DNA.

Closed reference OTU picking will remove these before you ever see them, because they do not match the reference database. Open reference would build novel OTUs, but the different filtering/chimera checking methods between QIIME 1 and QIIME 2 could be leading to this disparity.

Lower your % similarity threshold a bit, filter all unassigned features, and don't look back.
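To make the filtering step concrete, here is a small Python sketch of what "filter all unassigned features" amounts to. It assumes the feature table and taxonomy have been loaded into plain dicts; `drop_unassigned`, the dict layout, and the default `"Unassigned"` label are hypothetical choices for illustration, and features with no taxonomy entry are treated as unassigned.

```python
def drop_unassigned(counts, taxonomy, labels=("Unassigned",)):
    """Return a copy of `counts` without unassigned features.

    counts:   dict of feature ID -> {sample ID: read count}
    taxonomy: dict of feature ID -> taxon string
    labels:   taxon strings to treat as unassigned (assumed values)
    """
    kept = {}
    for fid, row in counts.items():
        taxon = taxonomy.get(fid)
        # Drop features with a missing taxonomy entry or a listed label.
        if taxon is None or taxon.strip() in labels:
            continue
        kept[fid] = dict(row)
    return kept
```

In QIIME 2 itself, I believe `qiime taxa filter-table` with `--p-exclude` does this job; check `--help` for the exact options in your release.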

I hope that helps!
