How to remove a certain number of reads found in the negative control


I am looking at parasite diversity in snails, but some reads in my negative control were assigned to two parasites: one has 112 reads in the negative and the other has 99. Both parasites are also found in some, but not all, of the snail samples. Some samples have very high read counts for these parasites, others have fairly low counts, and some have none. I do not want to remove these sequences entirely, as I know these parasites are expected in the snails, so this must be cross-contamination. Someone suggested that I just subtract the number of reads found in the negative from each sample. Is there a way to do this with the filter table plugin in QIIME 2? Is this the best thing to do, or would it be better to leave the data as is and mention that these two parasites were also found in the negative and should be interpreted with caution?


Subtracting the negative's read counts from each sample is a bad idea! :grimacing: Cross-contamination is not a simple additive process, so you cannot just subtract a specific number of reads from each sample.
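To make that concrete, here is a minimal Python sketch of what a flat subtraction would do. Only the 112 comes from the question; the sample names and per-sample read counts are made up for illustration, and the helper functions are hypothetical, not part of QIIME 2 or any plugin.

```python
# Illustration only: the per-sample counts below are invented, not real data.
# Only NEGATIVE_READS (112) is taken from the question.
NEGATIVE_READS = 112

sample_reads = {
    "snail_A": 15000,  # heavily infected snail (made-up value)
    "snail_B": 150,    # lightly infected snail (made-up value)
    "snail_C": 40,     # low-level detection (made-up value)
    "snail_D": 0,      # parasite not detected
}


def subtract_negative(reads, negative_reads):
    """Naive correction: subtract the negative-control count, flooring at zero."""
    return {name: max(count - negative_reads, 0) for name, count in reads.items()}


def fraction_removed(before, after):
    """Fraction of each sample's reads removed (0.0 if the sample had no reads)."""
    return {
        name: (before[name] - after[name]) / before[name] if before[name] else 0.0
        for name in before
    }


corrected = subtract_negative(sample_reads, NEGATIVE_READS)
removed = fraction_removed(sample_reads, corrected)

for name in sample_reads:
    print(f"{name}: {sample_reads[name]} -> {corrected[name]} reads "
          f"({removed[name]:.0%} removed)")
```

With these made-up numbers, the heavily infected sample is left almost untouched while the low-level detection is erased completely, so the same "correction" means something very different from sample to sample.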

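Yes, leaving the reads in and mentioning that these two parasites were also found in the negative, and should be interpreted with caution, would be best.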

Or a better approach: you can use decontam, an R package developed by the same developer behind dada2 (it is not yet available as a QIIME 2 plugin, but maybe later this year). decontam will distinguish reagent contaminants found in negative controls (which are filtered out) from cross-contaminants (which are kept, since there is no easy solution). Doing this, assuming these two ASVs are not filtered out, will allow you to say that they were detected in negative controls but appear to represent cross-contamination, because you used a statistical approach (decontam) to identify background contaminants.

I hope that helps! :grin:


Thanks @Nicholas_Bokulich! That is good to know!