Please see the following post for updated and more complete instructions if the hacky method described below is of use to you:
Hi qiime2 community,
I’ve generated ASVs with deblur and would like to cluster them at various identity thresholds to look at the data at different resolutions. This is easy to accomplish with the “qiime vsearch cluster-features-de-novo” command, which generates a new BIOM table containing abundances for each cluster. The IDs in this table come from the original ASV table (I’m guessing each one is the centroid that represents its cluster). However, I couldn’t see any way to get the membership of these clusters (i.e. which of the original ASVs were merged into each cluster). Have I overlooked an option, or is this not possible with the plugin yet? In VSEARCH itself, the “--uc” option looks like one way to preserve this information. I also considered running VSEARCH outside of QIIME 2, but then it doesn’t look easy to retain the abundance information in the BIOM table generated by QIIME 2 (VSEARCH requires a FASTA file as input, with no option for BIOM input as far as I can tell).
Unfortunately, not at this time. We have not yet had a really compelling reason to add an OTU map output in QIIME 2, and we have not had too many people ask about this feature… but this could be added as a feature at some point (contributions are always welcome).
Yes — I think using vsearch directly would be the only way to get this information, unfortunately.
The --uc option is used internally by q2-vsearch… but the uc file just isn’t provided as an output. So it would probably be fairly straightforward (famous last words!) to fork q2-vsearch, modify the code to output a uc file, and call it a day.
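For anyone who does get hold of a uc file, here is a minimal sketch of how cluster membership could be read from it. It relies on the standard uc layout (tab-separated, ten columns, record type in column 1, query label in column 9, target label in column 10). The function name `parse_uc` and the example records are made up for illustration, and stripping a `;size=N` suffix from labels is an assumption about how the sequences were labelled, not something confirmed in this thread.

```python
import csv
import io
from collections import defaultdict


def parse_uc(handle):
    """Map each centroid label to the labels of all sequences in its cluster."""
    members = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comments
        record_type = row[0]
        # Assumption: labels may carry annotations such as ";size=100"; drop them.
        query = row[8].split(";")[0]
        target = row[9].split(";")[0]
        if record_type == "S":
            # Seed record: the query is the centroid of a new cluster.
            members[query].append(query)
        elif record_type == "H":
            # Hit record: the query was assigned to the target's cluster.
            members[target].append(query)
        # "C" records summarise each cluster and add no new membership info.
    return dict(members)


# Hypothetical records in uc layout (10 tab-separated columns each).
EXAMPLE_UC = "\n".join(
    "\t".join(fields)
    for fields in [
        ["S", "0", "250", "*", "*", "*", "*", "*", "asv1;size=100", "*"],
        ["H", "0", "250", "98.0", "+", "0", "0", "250M", "asv2;size=10", "asv1;size=100"],
        ["S", "1", "250", "*", "*", "*", "*", "*", "asv3;size=5", "*"],
        ["C", "0", "2", "*", "*", "*", "*", "*", "asv1;size=100", "*"],
        ["C", "1", "1", "*", "*", "*", "*", "*", "asv3;size=5", "*"],
    ]
) + "\n"

clusters = parse_uc(io.StringIO(EXAMPLE_UC))
for centroid, asvs in clusters.items():
    print(centroid, asvs)
```

To use this on a real file, pass an open file handle instead of the `io.StringIO` wrapper.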
Sorry we don’t have an out-of-the-box solution at this time!
Thanks very much for your prompt reply! I would certainly be interested in helping to develop this option, though I might need some time to learn how to fork GitHub repositories and all that fancy developer business. In the meantime, is there any quick hack I could use to save the temporary file that the --uc flag writes to? I tried looking into the underlying scripts and modifying a few things (in an attempt to write a file with the --uc information), but it didn’t seem to alter the behaviour of QIIME 2. I am using a conda env and tried to alter the “_cluster_sequences.py” script, but QIIME 2 seemed to ignore my changes, probably because I was editing the wrong copy.
Hi Nicholas and anyone else who might want this information:
The simplest way I could think of to do this was to change line 183 of “_cluster_sequences.py” to:
    with tempfile.NamedTemporaryFile(delete=False) as out_uc:
This prevents the temporary “uc” file generated by VSEARCH from being deleted when the “with” block exits. Once q2-vsearch has finished clustering, you can copy the file to a permanent location and you’re off to the races. I tested it and the result looks good. Thanks again for your help!
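For anyone curious why this works, here is a small standalone sketch of the `delete=False` behaviour using only the Python standard library. It does not touch q2-vsearch at all: `fake_vsearch` and `keep_temp_output` are hypothetical stand-ins I made up, with `fake_vsearch` playing the role of the external command that writes to the temp file's path.

```python
import os
import shutil
import tempfile


def fake_vsearch(uc_path):
    """Hypothetical stand-in for an external tool that writes a uc file."""
    with open(uc_path, "w") as fh:
        fh.write("S\t0\t250\t*\t*\t*\t*\t*\tasv1\t*\n")


def keep_temp_output(write_output, destination):
    """Let write_output fill a temp file, then copy it to destination."""
    # delete=False means the file is NOT removed when the with block exits.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        write_output(tmp.name)
    # The temp file still exists here, so it can be copied somewhere permanent.
    shutil.copy(tmp.name, destination)
    os.remove(tmp.name)  # clean up manually, since nothing else will
    return tmp.name


out_dir = tempfile.mkdtemp()
saved = os.path.join(out_dir, "clusters.uc")
tmp_name = keep_temp_output(fake_vsearch, saved)
print("saved copy exists:", os.path.exists(saved))
print("temp file removed:", not os.path.exists(tmp_name))
```

Note that with `delete=False` nothing cleans the file up automatically, so after the one-line hack the temp files will accumulate in your system's temp directory until you remove them yourself.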
If this is a feature that other folks need or want in the future, we could consider adding a parameter to the vsearch plugin that preserves this .uc output file. If you are interested, this could be a great ‘first submission’ to QIIME 2!