vsearch de novo output --uc option

jcmcnch · October 9, 2020, 12:09am

Hi folks,

If you want to use the simple hacky workaround I alluded to before to preserve the UC file from qiime2, here's a way to do it (my previous instructions don't seem to work anymore and were probably not that clear anyway):

1. Follow the steps here to set up a qiime2 development conda environment.
2. Make sure you're in the environment - your shell should look something like this: (qiime2-dev) jesse@ubuntu-delltron:/testing $
3. Download my hacked* q2-vsearch env as follows: git clone git@github.com:jcmcnch/q2-vsearch.git, then enter the directory q2-vsearch.
4. Make and install the vsearch package: make dev && make test. You should get some warnings but no errors.
5. Now refresh the cache in your conda environment to make sure the updated q2-vsearch gets used when you do your clustering (NB: I only changed this for de novo clustering): qiime dev refresh-cache
6. Run de novo clustering in qiime2. Look in the text output for the location of the UC file and copy it somewhere to keep the info as illustrated in the image below.
7. Optional - you can then parse the UC file using another script I've put together here to get membership info and a summary (i.e. which ASVs go into each centroid found in the qiime2 table output). The script has some basic documentation about how to run it which you can access by passing the --help flag.
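
For anyone curious what the membership parsing in the optional last step involves, here is a minimal Python sketch. It is not the script linked above, just an illustration that assumes the standard 10-column tab-separated UC layout written by VSEARCH: column 1 is the record type (S = centroid, H = hit, C = cluster summary), column 9 is the query label and column 10 is the target (centroid) label. The function name parse_uc and the sample labels asv1, asv2, asv3 are made up for the example, and labels are kept exactly as they appear in the file.

```python
import io
from collections import defaultdict


def parse_uc(handle):
    """Return {centroid_label: [member_labels]} from an open UC file handle.

    Assumes the standard 10-column, tab-separated UC layout.
    Each centroid is listed as the first member of its own cluster.
    """
    clusters = defaultdict(list)
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # not a well-formed UC record
        record_type, query, target = fields[0], fields[8], fields[9]
        if record_type == "S":
            # Centroid record: the query is the centroid itself.
            clusters[query].append(query)
        elif record_type == "H":
            # Hit record: the query was clustered into the target centroid.
            clusters[target].append(query)
        # "C" records only summarise cluster sizes, so they are ignored here.
    return dict(clusters)


# Hypothetical three-sequence example, for illustration only.
SAMPLE_UC = (
    "S\t0\t253\t*\t*\t*\t*\t*\tasv1\t*\n"
    "H\t0\t253\t99.2\t+\t0\t0\t253M\tasv2\tasv1\n"
    "S\t1\t250\t*\t*\t*\t*\t*\tasv3\t*\n"
    "C\t0\t2\t*\t*\t*\t*\t*\tasv1\t*\n"
    "C\t1\t1\t*\t*\t*\t*\t*\tasv3\t*\n"
)

if __name__ == "__main__":
    for centroid, members in parse_uc(io.StringIO(SAMPLE_UC)).items():
        print(f"{centroid}\t{len(members)}\t{','.join(members)}")
```

To run it on a real file, replace the io.StringIO(SAMPLE_UC) handle with open("your_file.uc"), pointing at wherever you copied the UC file.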

*I only changed one line in one Python file so that the UC temp file doesn't get deleted. This workaround, although bush-league, saves you from having to run VSEARCH outside of qiime2, which is pretty convenient in terms of data processing, and it avoids the comparability issues discussed above by Chantelle, Colin, and Devon.