qiime2 to qiime1 method

Hi
I want to use a database other than the Greengenes one provided by qiime2.

I want to use “pick_open_reference_otus.py” of qiime1.
What file should I use for qiime2?

Thank you.

Hi @shinseung,

Qiime2 is agnostic to databases, so you can use whichever you prefer, though there are some compatibility requirements. For convenience, Qiime2-compatible versions of Greengenes and Silva (and Unite for ITS) are available on the data resources page, but you can use a custom-made database as well. Search the forum for posts on how to prepare custom databases.

In Qiime2, we typically advise users to use denoising methods such as DADA2 and Deblur rather than OTU picking methods (e.g. pick_open_reference_otus.py). However, if you would still rather use OTU picking, you can follow the tutorial here. Note that you can also run a denoising method first, to take advantage of its superior quality control and chimera removal, and then do OTU picking afterwards.

Thank you for your response.
I ran qiime vsearch cluster-features-open-reference with my own database and got table-or-85.qza.
I checked the table as a TSV: the top shows OTU IDs from my database and the bottom shows Greengenes OTU IDs. What is the problem?
Also, I don’t understand the difference between classify-consensus-vsearch and vsearch cluster-features-open-reference.

Thank you ^^ ;;

Hi @shinseung,

The 85% OTUs are only used as an example in that tutorial. Please carefully read the blue Note box in that link:

Note

Open-reference OTU clustering is generally performed at a higher percent identity, but 85% is used here so users of this tutorial don’t have to download a larger reference database.

I would recommend using a reference database clustered at a higher percent identity, say 97% or 99%.

Sorry, I’m not sure I understand what you are asking here. Could you please clarify? Do you see an error? If so, could you provide us with the exact command you’re typing and the exact error message?

Those are two very different actions that perform different functions. cluster-features-open-reference (in the vsearch plugin) clusters your sequences at some percent identity against a user-provided reference database, and performs de novo clustering on the reads that do not hit that database. This is what you want to do if your end goal is to create an OTU table. Again, I would reiterate that we recommend doing DADA2/Deblur instead, or at least first, before OTU clustering.
On the other hand, classify-consensus-vsearch (in the feature-classifier plugin) is used to assign taxonomy to your sequences based on a user-provided reference database. The only thing connecting the two is that they both use vsearch under the hood for their searching; otherwise they do very different jobs.
You can read more about every plugin by either browsing its documentation or simply adding the --help flag after the command in the terminal to see the docs there. Example: qiime vsearch cluster-features-open-reference --help
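
To make the contrast concrete, here is a rough sketch of what the taxonomy-assignment side could look like. It is a small Python helper that only builds and prints the two commands, so you can review them before pasting them into a terminal. All file names (`my_taxonomy.tsv`, `ref-seqs.qza`, `ref-taxonomy.qza`, `rep-seqs.qza`, `taxonomy-vsearch`) are placeholders I made up, and I am assuming your taxonomy file is a two-column, tab-separated file without a header line. The flag names are as I remember them, so please confirm them with --help for your installed version.

```python
import shlex


def taxonomy_import_cmd(tsv="my_taxonomy.tsv", out="ref-taxonomy.qza"):
    """Build the command that imports a headerless two-column taxonomy TSV.

    Both file names are hypothetical placeholders.
    """
    return [
        "qiime", "tools", "import",
        "--type", "FeatureData[Taxonomy]",
        "--input-format", "HeaderlessTSVTaxonomyFormat",
        "--input-path", tsv,
        "--output-path", out,
    ]


def classify_cmd(query="rep-seqs.qza", ref_reads="ref-seqs.qza",
                 ref_taxonomy="ref-taxonomy.qza", out_dir="taxonomy-vsearch"):
    """Build the taxonomy-assignment command.

    --output-dir is used instead of naming each output, because the set of
    outputs differs between releases. The directory must not exist yet.
    """
    return [
        "qiime", "feature-classifier", "classify-consensus-vsearch",
        "--i-query", query,
        "--i-reference-reads", ref_reads,
        "--i-reference-taxonomy", ref_taxonomy,
        "--output-dir", out_dir,
        "--verbose",
    ]


# Print shell-ready commands; nothing is executed here.
for cmd in (taxonomy_import_cmd(), classify_cmd()):
    print(shlex.join(cmd))
```

Note that this step only labels sequences with taxonomy; it does not change or cluster your feature table.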

Thank you for your response.
The “id_taxonomy.txt” and “qiime_full.fasta” files are customized data. The company provides this database and instructs us to create OTUs using “pick_open_reference_otus.py” and “parallel_assign_taxonomy_uclust.py”. I don’t know how to do this in qiime2. I imported “id_taxonomy.txt” and “qiime_full.fasta” as “taxonomy.qza” and “otu.qza”. Should I then run cluster-features-open-reference?

Thank you

Hi @shinseung,

How did you make these files exactly? It’s not clear what these files are. Are these based on a reference taxonomy like greengenes/Silva? Or are these your own reads that you have acquired, done OTU clustering, and assigned taxonomy? Can you tell us exactly what type of data you have, which reference database you want to use, and what is your ultimate goal?

It’s hard to compare scripts from Qiime1 directly to Qiime2, as Qiime2 is not just an update of the previous version; it is a completely new system with a very different structure. I would strongly recommend you go over some of the introductions/tutorials, starting with the overview tutorial, and then follow one of the actual example tutorials, such as the Moving Pictures tutorial. If you feel you are already familiar with Qiime1 and just want to skip the basics, you can read the tutorial for advanced users.
Ultimately, what you are looking for are two separate steps. 1) You have fasta/fastq files which you want to turn into an OTU table. I’ve provided you with the links to do that. 2) You then use those results to create a taxonomy file, which is a separate process altogether; see the Moving Pictures tutorial I linked for an example of how to get a taxonomy artifact.

The file I have is:

id_taxonomy.txt

qiime_full.fasta

I also have a dada2_table.qza file.

Thank you.

Hi @shinseung,

On the one hand you said you had a custom database, but you also mentioned using Greengenes. Can you clarify what exactly you are using? In the images you posted, the bottom one appears to be your reference OTUs and the top one the corresponding taxonomy file. You don’t need the taxonomy file for the open-reference clustering step, but you will need it in a later step if you want to assign taxonomies to your features (instead of the hashes that are assigned by default). I’ve provided you links with instructions on how to do this second step.

But first, here is what you need for open-reference clustering:

  1. sequences: the rep-seqs.qza you obtain at the end of dada2.
  2. table: your feature/OTU table. You said you have this in dada2_table.qza.
  3. reference-sequences: your custom reference database. It will look like the bottom image, but you will have to import it as a qiime2 artifact first (it can’t stay in its original .fasta form).
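
The three inputs above could be wired together roughly as follows. This is only a sketch: a small Python helper that builds and prints the import and clustering commands without running them. `qiime_full.fasta`, `dada2_table.qza` and `rep-seqs.qza` are the names used in this thread; `ref-seqs.qza` and the three output names are hypothetical, and 0.97 is just the example identity suggested earlier, not a required value.

```python
import shlex


def import_reference_cmd(fasta="qiime_full.fasta", out="ref-seqs.qza"):
    """Build the command that imports a reference FASTA as a QIIME 2 artifact.

    The output name ref-seqs.qza is a hypothetical placeholder.
    """
    return [
        "qiime", "tools", "import",
        "--type", "FeatureData[Sequence]",
        "--input-path", fasta,
        "--output-path", out,
    ]


def open_reference_cmd(table="dada2_table.qza", seqs="rep-seqs.qza",
                       ref="ref-seqs.qza", perc_identity=0.97):
    """Build the open-reference clustering command.

    Output file names are hypothetical; the percent identity should match
    how you want the OTUs defined.
    """
    return [
        "qiime", "vsearch", "cluster-features-open-reference",
        "--i-table", table,
        "--i-sequences", seqs,
        "--i-reference-sequences", ref,
        "--p-perc-identity", str(perc_identity),
        "--o-clustered-table", "table-or.qza",
        "--o-clustered-sequences", "rep-seqs-or.qza",
        "--o-new-reference-sequences", "new-ref-seqs-or.qza",
        "--verbose",
    ]


# Print shell-ready commands; nothing is executed here.
for cmd in (import_reference_cmd(), open_reference_cmd()):
    print(shlex.join(cmd))
```

Printing first makes it easy to copy the exact command into a forum post if something goes wrong.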

If you have more issues with these, please a) re-read my answers above, then b) copy and paste the exact commands you are running, making sure you add the --verbose flag at the end of each command, and c) copy and paste the exact error message you are receiving.

Thank you for answer.
I created an OTU table and it came out as follows:

The rows above are from the custom database (up to line 1197).
Are the OTU IDs listed below that not OTU IDs from the custom database?
Can we delete everything from line 1198 down and still use the OTU table?

Thank you.

Hi @shinseung,
It looks like you used open-reference OTU clustering, which basically performs two main steps:

  1. closed-reference OTU clustering against a reference dataset. The OTU representative sequence is the reference sequence itself, so these are the custom database IDs in your table.
  2. de novo OTU clustering of any sequences that fail to align to the reference database. These are given the de novo OTU IDs that you see in your table above (at least that is where those IDs must be coming from if they are not IDs in your custom database).

No, deleting those rows would defeat the purpose of open-reference OTU clustering. If you are not interested in anything that does not align to your reference database, you could run closed-reference OTU clustering instead.
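
If you want to check which OTU IDs in the table come from your custom database and which are de novo, rather than relying on a line number, a quick comparison along these lines might help. It is only a sketch: the IDs and sequences are made-up toy data, and I am assuming the sequence ID is the first word of each FASTA header line.

```python
def fasta_ids(lines):
    """Return the set of sequence IDs (first word after '>') in FASTA lines."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def split_otu_ids(otu_ids, reference_ids):
    """Split OTU IDs into (reference hits, de novo), preserving input order."""
    ref = [i for i in otu_ids if i in reference_ids]
    denovo = [i for i in otu_ids if i not in reference_ids]
    return ref, denovo


# Toy data standing in for the custom reference FASTA and the table's OTU IDs.
reference_fasta = [">ref_001 example description", "ACGT", ">ref_002", "GGCC"]
otu_ids = ["ref_001", "a1b2c3", "ref_002", "d4e5f6"]

ref, denovo = split_otu_ids(otu_ids, fasta_ids(reference_fasta))
print("reference OTUs:", ref)
print("de novo OTUs:", denovo)
```

With real data you would read the header lines from the reference FASTA and the ID column from the exported table instead of the toy lists.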