Use PR2 in QIIME 2

Hi, I’m trying to use PR2 to analyze my data from 18S amplicon sequencing.

I found a list of files on the PR2 website.

I think I should be using pr2_mothur.fasta.gz and its matching taxonomy file, but what should I do to them so that I can use PR2 in QIIME 2?

Thank you.

Check out the format of other compatible databases, like SILVA or Greengenes. Get PR2 in that format and you are ready to run. :running_man:
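To make "that format" concrete, here is a minimal sketch of the two-file layout QIIME 2-compatible references use: a FASTA of reference sequences plus a tab-separated taxonomy file keyed on the same IDs. The IDs and taxonomy strings below are invented for illustration; they are not real PR2 records.

```shell
# Reference sequences: plain FASTA, one ID per record
# (IDs and taxonomy strings are made up for this example).
cat > ref-seqs.fasta <<'EOF'
>seq1
ACGTACGTACGTACGT
>seq2
TGCATGCATGCATGCA
EOF

# Reference taxonomy: tab-separated, same IDs, semicolon-delimited ranks.
printf 'seq1\tEukaryota;Alveolata;Dinoflagellata\n' >  ref-taxonomy.tsv
printf 'seq2\tEukaryota;Stramenopiles;Ochrophyta\n' >> ref-taxonomy.tsv

# Sanity check: every FASTA ID should have a taxonomy line, and vice versa.
grep '^>' ref-seqs.fasta | sed 's/^>//' | sort > ids.txt
cut -f1 ref-taxonomy.tsv | sort > tax-ids.txt
diff ids.txt tax-ids.txt && echo "IDs match"   # prints "IDs match"
```

Once your PR2 files look like this, they can be imported as FeatureData[Sequence] and FeatureData[Taxonomy] artifacts.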

There are a few other forum topics related to PR2 that might be useful; check those out in case they help.

Good luck! Let us know if you have other questions.

Hi, I followed the tutorial “Training feature classifiers with q2-feature-classifier” and assigned taxonomy to my samples. However, the assignment does not look right to me: too many ASVs receive the same taxonomy, and in many cases the taxonomy does not seem correct, since SILVA and a BLAST search both give me the same result, which is different from the PR2 assignment.

I’m wondering if that’s because of a problem with training the classifier.

The commands I used to train the classifier were:

qiime tools import --type 'FeatureData[Sequence]' --input-path pr2_version_4.10.0_mothur.fasta --output-path ref-sequence.qza

qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path pr2_version_4.10.0_mothur.tax --output-path ref-taxonomy.qza

qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads ref-sequence.qza --i-reference-taxonomy ref-taxonomy.qza --o-classifier pr2-classifier.qza

I think they all look right to me, but I’m wondering if there could be an issue I did not spot. Or do you think it could be because of something else?

Thank you so much.

Why is this a problem? Do you know the (approximate) composition of these samples? Many ASVs receiving the same taxonomy just means these sequences are different, not that the taxonomy should be different. You could, e.g., have many low-abundance ASVs that receive the same classification, but their relative abundances could still average out to an “expected” composition.

I’d recommend using taxa barplot to check this before drawing conclusions. Sharing plots like these could also make it clearer to us what the problem is.

The SILVA website’s aligner and QIIME 2 BLAST are very similar: both are LCA alignment-based classifiers, so they are expected to behave similarly. So just because they agree does not necessarily mean that classify-sklearn is wrong. However, I would not expect them to give a very different classification from classify-sklearn; if they do, that might indicate a training issue.

How different are the classifications? Is the issue that BLAST/SILVA give deeper classifications (e.g., to species level) and classify-sklearn gives shallower classifications (e.g., to family level)?

Your commands look correct. You could use extract-reads (as shown in the tutorial) to trim your reference sequences. That will probably increase accuracy a bit. Note that classify-sklearn has not been optimized for 18S reads, so fiddling with some parameters (e.g., the confidence parameter) may also help. It’s worth a try, though without an 18S mock community you really don’t know the “true” composition of a sample.
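As a sketch of that suggestion (the primer sequences below are placeholders, not your actual 18S primers, and rep-seqs.qza is an assumed name for your representative sequences):

```shell
# Trim the reference to the amplified region before training
# (substitute your actual 18S primers for the placeholders).
qiime feature-classifier extract-reads \
  --i-sequences ref-sequence.qza \
  --p-f-primer YOUR_FORWARD_PRIMER \
  --p-r-primer YOUR_REVERSE_PRIMER \
  --o-reads ref-seqs-trimmed.qza

# Retrain on the trimmed reference.
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads ref-seqs-trimmed.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --o-classifier pr2-classifier-trimmed.qza

# Classify, experimenting with the confidence cutoff
# (check `--help` for the default value in your release).
qiime feature-classifier classify-sklearn \
  --i-classifier pr2-classifier-trimmed.qza \
  --i-reads rep-seqs.qza \
  --p-confidence 0.5 \
  --o-classification taxonomy-pr2-trimmed.qza
```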

Good luck!

The first plot is based on SILVA assignment down to level 7 and the second is based on PR2 assignment down to level 9. Given that I have over 6000 features, I feel PR2 is giving me too few assignments.

Thanks for sharing the plots! This helps — and these results do look suspicious.

So let me get this straight:

  1. Classification with classify-consensus-blast against the SILVA 18S reference database gives somewhat reasonable results. (for some reason in your earlier message I thought you were also using the aligner on the SILVA website and getting similar results to BLAST — which would make sense — so I apologize for that misunderstanding)
  2. Classification with classify-sklearn against the PR2 database is getting mostly unassigned and poorly assigned features.

Classification with PR2 is giving so few taxonomy assignments because they are shallower: the classifier cannot confidently assign at deeper levels, so it is assigning at order or class level for the most part. This is resulting in most features being collapsed, even though they probably do belong to different species.

The problem here is almost certainly with the database. When we see poor classification like this (when results should be better, and your blast comparison is a good example), it is usually due to one of the following:

  1. The database is no good to begin with (I know nothing about PR2 but would assume you wouldn’t be using it if this were the case!)
  2. The database is trimmed to different regions than the query sequences. So PR2 may have only fragments of 18S, for example, that don’t fully overlap with your query sequences (which are also fragmentary). This results in poor classification.
  3. The query sequences contain non-biological sequence (e.g., barcodes or adapter sequences). Ironically, BLAST would be less sensitive to this, since it is just looking for the best alignments (even if those alignments are terrible!)

I would bet on the second possibility above, maybe the third — you should check to make sure these query/reference sequences actually overlap completely, and that the query sequences are “clean”.
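One quick way to eyeball the overlap question is to compare sequence-length distributions between your representative sequences and the reference. A minimal sketch, assuming you have plain FASTA files (e.g., from `qiime tools export`); the demo input here is invented so the function can be exercised directly:

```shell
# Print the length of each record in a FASTA file,
# accumulating across wrapped sequence lines.
fasta_lengths () {
  awk '/^>/ { if (len) print len; len = 0; next }
       { len += length($0) }
       END { if (len) print len }' "$1"
}

# Tiny demo input (made up) to show the output shape.
cat > demo.fasta <<'EOF'
>a
ACGTACGT
>b
ACGTACGTACGT
ACGT
EOF

fasta_lengths demo.fasta   # prints 8 then 16
```

Running this on both your rep-seqs and the PR2 reference makes it obvious if, say, the reference records are much shorter than your amplicons, which would point to the trimming problem above.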

Another good diagnostic test — switch up the classifiers/databases. What if you use classify-consensus-blast against PR2, and classify-sklearn against SILVA? This could help determine whether it’s the database or classifier that’s misbehaving (and either can be fixed, but this will help diagnose)
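A sketch of one half of that swap (artifact names are assumptions based on your earlier commands; rep-seqs.qza stands in for your representative sequences):

```shell
# BLAST local alignment + LCA taxonomy against the PR2 reference
# you already imported.
qiime feature-classifier classify-consensus-blast \
  --i-query rep-seqs.qza \
  --i-reference-reads ref-sequence.qza \
  --i-reference-taxonomy ref-taxonomy.qza \
  --o-classification taxonomy-blast-pr2.qza
```

The mirror-image run (classify-sklearn against a SILVA-trained classifier) uses the same classify-sklearn command you already have, just with the SILVA classifier artifact.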

Hi, thanks for replying. I think I will first try to use classify-consensus-blast against PR2. However, I actually used classify-sklearn for both SILVA and PR2. Does that make a difference?

Also, I’m not sure if this helps, but a colleague of mine ran the same data through phyloseq with PR2 and I think he does get a reasonable assignment. I think this means the database is good, but something is wrong either with the files (sequence and taxonomy) I used for QIIME 2 (I downloaded them from PR2’s website, and I think phyloseq uses different files) or with the classifier?

Oh, I see. That does help: it narrows this down to a formatting issue (i.e., PR2 is not a bad database; the data you are presenting to QIIME 2 is probably bad).

This strongly weighs in favor of the second possibility I listed above, the reference being trimmed to different regions than your query sequences. That would be the most likely formatting problem, since an outright invalid file format should be caught by QIIME 2 at import time.

A bunch of other users have posted about using the PR2 database; you may want to check out (and possibly post to) those topics to see if those users can provide any advice on formatting the PR2 database for use in :qiime2:

I hope that helps!

Hi, I just got my result from classify-consensus-blast against PR2.

The taxa plot looks a lot more reasonable.

What is the difference between classify-sklearn and classify-consensus-blast? Is one better than the other?

Also, the assignment I got is still different from BLAST results. I checked a few sequences and the assignments disagree even at the first few levels.


This is not all that surprising, as I noted above:

Those alignments are very likely not terrible (since you have pretty good taxonomy assignments), there is probably just extra noise in there (e.g., the database does not fully overlap your query sequences) that is less upsetting to BLAST local alignment.

classify-consensus-blast is blast local alignment followed by LCA taxonomy classification. classify-sklearn is a naive Bayes classifier based on kmer frequency, similar to RDP classifier. These have not been tested on 18S, but on 16S and fungal ITS classify-sklearn does a little better. For more details and benchmark results, see here.

Something is making PR2 behave badly and it is probably the way the database is pre-processed/trimmed prior to training the classifier.

You mean BLAST + PR2 vs. classify-sklearn + PR2?

What about BLAST + PR2 vs. classify-sklearn + SILVA or BLAST + SILVA?

You could use qiime quality-control evaluate-taxonomy to compare how close these assignments are. (This just gives you a quick assessment of the similarity between the two at each taxonomic level. Anything with F > 0.9 is pretty similar)
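A sketch of what that comparison could look like, with the two classification artifacts from the earlier runs (file names are assumptions, and the flags here are from memory, so verify them against `qiime quality-control evaluate-taxonomy --help` in your release):

```shell
# Compare two taxonomy artifacts level by level;
# treat one as "expected" and one as "observed".
qiime quality-control evaluate-taxonomy \
  --i-expected-taxa taxonomy-blast-pr2.qza \
  --i-observed-taxa taxonomy-sklearn-pr2.qza \
  --o-visualization taxonomy-comparison.qzv
```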

If I were in your shoes, I would probably just use the SILVA database and use either classifier (whichever one gives more pleasing results, because I already know both are pretty good on other marker genes) — or use PR2 with BLAST (it looks like you get good classification but the fact that this looks like a database formatting issue makes me slightly uneasy). The reason: it looks like BLAST + PR2 is not enormously different from classify-sklearn + SILVA (though this is just a passing glance) and I would rather not spend the time debugging a database formatting issue if SILVA gives me similar results.

The other problem: when you have multiple databases and multiple classifiers you wind up with the “two watches” problem (4 in your case!). Do you really know which is most accurate? I linked to an 18S mock community above that you could use to figure out which seems to be best — but that’s a single old data set and might not really generalize to your data.

Hence back to my “if I were in your shoes” suggestion: lacking benchmarking data or any other evidence to suggest one “watch” is better than the other, I would throw away the broken one and just stick with the one that I like the most. :watch:


Hi, I tried qiime quality-control evaluate-taxonomy.

This is for BLAST+PR2 vs sklearn+SILVA.

This is for BLAST+PR2 vs sklearn+PR2.

I'm not sure how to use the files. Is there a guide to them?

I think one explanation for what I got from BLAST+PR2 vs sklearn+SILVA is that PR2 and SILVA organize taxonomic levels differently and each level means a different thing between the two.

Yep, sorry! That didn’t occur to me when I made that recommendation… so this tool will be useful for comparing classifications made with the same database, but not between databases.

Oh well, I suppose we can “eyeball it” :eyes:


This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.