vsearch closed reference OTU picking error: No matches were identified to reference_sequences

Something mysterious happened in my QIIME 2 run:
```
qiime vsearch cluster-features-closed-reference \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --i-reference-sequences 85_otus.qza \
  --p-perc-identity 0.85 \
  --o-clustered-table table-cr-85.qza \
  --o-clustered-sequences rep-seqs-cr-85.qza \
  --o-unmatched-sequences unmatched-cr-85.qza
```
When I run the command above with the 85_otus.qza from the official site (https://data.qiime2.org/2019.10/tutorials/otu-clustering/85_otus.qza), everything works. When I run the same command with a .qza that I imported myself from 85_otus.fasta, an error occurs. This is the message:

Plugin error from vsearch:

No matches were identified to reference_sequences. This can happen if sequences are not homologous to reference_sequences, or if sequences are not in the same orientation as reference_sequences (i.e., if sequences are reverse complemented with respect to reference sequences). Sequence orientation can be adjusted with the strand parameter.

Debug info has been saved to /tmp/qiime2-q2cli-err-hdo1wth7.log


85_otus.qza is the file downloaded from the official site.
86_otus.qza is the file I imported with the command below.
The picture shows the difference when I list them on the command line: the official 85_otus.qza is green, while the .qza files I imported myself (85, 97, 99) are all white and none of them work.
```
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path 85_otus.fasta \
  --output-path 85_otus.qza
```

85_otus.qza (1.7 MB) 86_otus.qza (1.7 MB)

Hi @Richard,
A few things to check.
Is this 16S data or some other gene? If 16S, the below comments may be relevant. If not 16S you need to use a different reference sequence file.

If you go back to that closed-reference OTU clustering tutorial, there is a blue box that advises against using 85_otus.qza; it is used in the tutorial only because it is small. Instead, you want to use a reference sequence file clustered at a higher identity, like 97 or 99%. This may actually solve your issue. If not,
are your sequences all in the same orientation? If you think they may not be for some reason, you may want to include the --p-strand parameter and set it to both in vsearch.
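
As a rough sketch of that suggestion, here is the command from the question with --p-strand both added. The file names are the ones from the original post; the REF and IDENTITY variables are placeholders I introduced so the reference file and threshold are easy to swap, and the command is only executed if qiime and the input table are actually present.

```shell
# Sketch only: the original command plus --p-strand both.
# REF and IDENTITY are placeholder variables, not QIIME 2 settings.
REF="${REF:-85_otus.qza}"        # swap in a 97% or 99% reference file
IDENTITY="${IDENTITY:-0.85}"     # fraction between 0 and 1

# Build the command as positional parameters so it can be printed first.
set -- qiime vsearch cluster-features-closed-reference \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --i-reference-sequences "$REF" \
  --p-perc-identity "$IDENTITY" \
  --p-strand both \
  --o-clustered-table table-cr-85.qza \
  --o-clustered-sequences rep-seqs-cr-85.qza \
  --o-unmatched-sequences unmatched-cr-85.qza

# Show what would be run.
echo "$*"

# Run it only when qiime is installed and the input table exists.
if command -v qiime >/dev/null 2>&1 && [ -f table.qza ]; then
  "$@"
fi
```

With --p-strand both, vsearch also compares the reverse complement of each query sequence against the reference, so reads in the opposite orientation can still match.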

The different colors you see in your terminal do not indicate whether a file is valid. They only reflect file properties such as type and permissions, and they depend on your color settings. For example, blue is typically a directory and green is (I think) an executable file. I wouldn’t base any troubleshooting on those colors.
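
If you want to check the artifacts themselves rather than their listing color, something along these lines should work. It is a minimal sketch: check_artifact is a helper name I made up, the two file names are the attachments from the question, and the qiime tools peek and qiime tools validate steps are skipped when qiime is not on your PATH.

```shell
# Sketch only: inspect an artifact's permissions and contents.
# check_artifact is a hypothetical helper, not part of QIIME 2.
check_artifact() {
  artifact="$1"
  if [ ! -f "$artifact" ]; then
    echo "$artifact: not found"
    return 1
  fi
  # Permissions (e.g. the executable bit) are what usually drive ls colors.
  ls -l "$artifact"
  if command -v qiime >/dev/null 2>&1; then
    qiime tools peek "$artifact"      # prints UUID, type and data format
    qiime tools validate "$artifact"  # checks the artifact's contents
  else
    echo "qiime not on PATH; skipping peek/validate for $artifact"
  fi
}

for f in 85_otus.qza 86_otus.qza; do
  check_artifact "$f" || true
done
```

If both files report the same type and pass validation, the color difference is most likely just a permission difference between the downloaded and the imported file.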

Let’s start with those and go from there.


Amazing, @Mehrbod_Estaki! Following your tips solved my problem completely, and I now get the right result. Thanks very much!


Hi @Richard,
That is great to hear!
For the sake of completeness and others who may be looking at this forum in the future, can you clarify which tips you followed exactly that solved your problem?


A combination of changes solved my problem.

The necessary step was adding the --p-strand parameter. The second parameter to pay attention to is --p-perc-identity: you should try different values to find the one that matches best. In my study, 95 worked well and 99 was too strict; in your study, a lower number may give a better result. Thanks a lot for your help, @Mehrbod_Estaki.
