Deblur - discarded sequences

Hi,

Is there a way to access the actual sequences that Deblur throws out (by sequence ID, or in a FASTA file, for example)? I am currently running Deblur from the command line. I know QIIME 2 has a stats option, but I don’t think it outputs what I’m looking for.

I also know I can opt to save intermediate files, but I’m not sure whether any of those files tell me what I need. I’m assuming the .derep file merges all identical sequences, but as far as I can tell there is no way to know which sequences went into each of the “representative” sequences, or conversely which sequences were ultimately eliminated.

Thanks,

Maitreyi

Hi @maitreyi,

Sorry for the delayed response. I believe you’d need to examine the set of sequences in the input that are not represented by the set of sequences in the output. I’m not aware of a way to do this directly with Deblur, so it may require a custom script. It may be possible to infer the cluster information from the temporary output; that may be a better question to ask on the Deblur issue tracker (https://github.com/biocore/deblur/issues).
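
A minimal sketch of that comparison in Python, assuming the input reads and the Deblur output sequences are both available as FASTA and that reads were trimmed to a fixed length before denoising. The file names and the `trim_length` value are placeholders, not names Deblur is guaranteed to use. It only flags reads whose trimmed sequence is not an exact match to an output sequence, so reads that were folded into a neighboring sequence as likely errors are flagged too. Treat it as an approximation, not a record of what Deblur actually did.

```python
def parse_fasta(lines):
    """Yield (seq_id, sequence) pairs from an iterable of FASTA lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            header = line[1:].split()
            # Keep only the first token of the header as the ID.
            seq_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def unrepresented_reads(input_records, output_seqs, trim_length=None):
    """Yield (seq_id, sequence) for input reads with no exact match in the output.

    trim_length is an assumption about how the reads were trimmed; pass None
    to compare full-length sequences instead.
    """
    output_set = set(output_seqs)
    for seq_id, seq in input_records:
        if trim_length is not None:
            if len(seq) < trim_length:
                # Too short to be trimmed to the target length.
                yield seq_id, seq
                continue
            candidate = seq[:trim_length]
        else:
            candidate = seq
        if candidate not in output_set:
            yield seq_id, seq


if __name__ == "__main__":
    # Hypothetical paths and trim length; replace with your own.
    with open("deblur_output.fasta") as fh:
        kept = [seq for _, seq in parse_fasta(fh)]
    with open("input_reads.fasta") as fh:
        for seq_id, seq in unrepresented_reads(parse_fasta(fh), kept, trim_length=150):
            print(f">{seq_id}\n{seq}")
```

Run as a script, this prints the flagged reads in FASTA format, so the output can be redirected to a file and inspected by sequence ID.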

Best,
Daniel
