Hi @MMC_northS,
"reads-derep" indicates the number of reads remaining in a sample after dereplication and filtering by the threshold provided by --p-min-size. By default, this will omit singletons, so if your data do not have many singletons, "reads-derep" will be quite similar to "reads-raw".
The number of reads following deblur itself ("reads-deblur") may be misleading, as that count is taken before the negative and positive filters are applied. The output table from q2-deblur that you will most likely be using should correspond to "reads-hit-reference", which is the number of reads that passed the positive reference filter (applied after the negative filter).
"artifact" and "reference" refer to the negative and positive filters, respectively. The negative filtering database is generally composed of adapters and the PhiX genome. The positive filtering database is generally composed of sequences of the target amplicon type.
"derep" indicates stats about the dereplication stage of the algorithm. The two derep columns describe the number of unique reads observed after dereplication ("unique-reads-derep") and the total number of reads left after dereplication and filtering based on --p-min-size ("reads-derep").
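To make the relationship between the derep columns concrete, here is a minimal Python sketch. It is not deblur's actual code: the function name `derep_stats` is hypothetical, `min_size=2` is chosen only to mirror "omit singletons", and it assumes the unique count is taken after the min-size filter.

```python
from collections import Counter


def derep_stats(reads, min_size=2):
    """Toy illustration of how the derep columns relate to each other.

    Assumption: unique sequences seen fewer than `min_size` times are
    dropped, and both derep counts are computed on what remains.
    """
    # Dereplicate: collapse identical sequences and count occurrences.
    counts = Counter(reads)
    # Keep only unique sequences that meet the abundance threshold.
    kept = {seq: n for seq, n in counts.items() if n >= min_size}
    return {
        "reads-raw": len(reads),            # all input reads
        "unique-reads-derep": len(kept),    # distinct sequences retained
        "reads-derep": sum(kept.values()),  # total reads retained
    }


# Three copies of one sequence, two of another, and one singleton.
example = ["ACGT"] * 3 + ["TTGA"] * 2 + ["GGCC"]
stats = derep_stats(example)
```

In this toy sample only the singleton is dropped, so "reads-derep" stays close to "reads-raw", which is the situation described above for data with few singletons.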
When you have a second, can you provide the exact commands used? The column "filtered-by-min-length" does not appear in the deblur stats; is this in reference to q2-quality-filter? If so, that plugin truncates sequences using a sliding window over the PHRED scores when a minimum quality is observed, and after the truncation a minimum length filter is applied (the defaults are based on Bokulich et al. 2013).
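As a rough illustration of that truncate-then-length-filter idea, here is a small Python sketch. It is not the plugin's implementation: the function name `truncate_and_filter`, its parameter names, and the default values are all hypothetical and chosen only for the example.

```python
def truncate_and_filter(seq, quals, min_quality=4, window=3,
                        min_length_fraction=0.75):
    """Toy quality truncation followed by a minimum length filter.

    Assumptions (illustrative only): the read is cut at the start of the
    first run of `window` consecutive scores at or below `min_quality`,
    and it is discarded if the truncated read is shorter than
    `min_length_fraction` of its original length.
    """
    run = 0
    cut = len(seq)  # default: no truncation
    for i, q in enumerate(quals):
        if q <= min_quality:
            run += 1
            if run >= window:
                cut = i - window + 1  # start of the low-quality run
                break
        else:
            run = 0
    truncated = seq[:cut]
    # Minimum length filter applied after truncation.
    if len(truncated) < min_length_fraction * len(seq):
        return None
    return truncated


read = "ACGTACGTAC"
kept = truncate_and_filter(read, [30] * 8 + [2, 2], window=2)
dropped = truncate_and_filter(read, [30] * 4 + [2] * 6, window=2)
```

A read that is dropped by the length check in a sketch like this is the kind of read a "filtered-by-min-length" style column would be counting.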
Best,
Daniel