Wrong values for the "Retained" sequences percentage in "Interactive Sample Detail"


(Arnon Lieber) #1

Dear developers,

Just a short bug report.
While viewing table.qzv (in this example it is from the Fecal Transplantation exercise) under the “Interactive Sample Detail” tab, I was playing with the rarefaction depth. I noticed that the “retained sequences” percentage is lowest at both ends. Please find attached two snapshots with the slider at the two ends of the scale.

image

image

Let me know if you need further details here,
Good luck,
Arnon


(Nicholas Bokulich) #2

Thanks for reporting @arnon!

Perhaps I misunderstand you, but this looks consistent to me. The key point is that any samples with < X sequences are dropped, where X = the rarefaction depth.

You rarefy at 61 sequences per sample, and you get a total of 61 * 121 samples = 7381 sequences.

You rarefy at 8374 sequences per sample and only 1 sample reaches that depth, so you get a total of 8374 sequences.

So this makes sense: few sequences are retained at both low and high rarefaction depths because (1) a low depth draws very few sequences from each sample, and (2) a high depth draws more sequences per sample, but you lose 99% of your samples in the process!
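
To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above. It is only an illustration, not the code the visualization actually runs: the function name `retained` and the per-sample counts are made up, and the only rule it assumes is the one stated above (samples with fewer than X sequences are dropped, and every kept sample contributes exactly X sequences).

```python
def retained(sample_counts, depth):
    """Return (samples_kept, sequences_kept, fraction_of_sequences_kept).

    Assumed rule: a sample is kept only if it has at least `depth`
    sequences, and each kept sample contributes exactly `depth` sequences.
    """
    total = sum(sample_counts)
    samples_kept = sum(1 for count in sample_counts if count >= depth)
    sequences_kept = samples_kept * depth
    return samples_kept, sequences_kept, sequences_kept / total


# Hypothetical per-sample sequence counts (not taken from the tutorial data).
counts = [100, 900, 1000, 1000, 1100, 1200]

for depth in (100, 1000, 1200):
    samples_kept, sequences_kept, fraction = retained(counts, depth)
    print(f"depth={depth}: {samples_kept} samples, "
          f"{sequences_kept} sequences ({fraction:.1%} retained)")
```

With these toy counts, the retained fraction is small at the lowest depth (all samples kept, but only 100 sequences each), peaks in the middle, and drops again at the highest depth (only one sample kept).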


(Arnon Lieber) #3

Sorry Nick, you are totally right. I was treating the “Sampling Depth” as a minimum cut-off for filtering by library size (i.e. we keep all the samples at or above that depth, with all of their sequences), so 61, for example, should retain most of the sequences. My confusion…
All the best.
A