rescript dereplicate plugin error

I am having difficulty running rescript dereplicate. I am getting an error that I cannot find a solution for.

I am running QIIME 2 version 2024.5 on an external server, where I believe it was installed through conda.

I am using the silva-138_2_ssu_nr99 sequence and taxonomy files.

I have been running into some problems trying to dereplicate. When I run the below command…

qiime rescript dereplicate --i-sequences silva-138_2-ssu-nr99-seqs_cleaned.qza --i-taxa corrected_silva-138_2-ssu-nr99-tax.qza --p-mode 'super' --o-dereplicated-sequences silva-138_2-ssu-nr99_clean-derep-super.qza --o-dereplicated-taxa corrected_silva-138_2-ssu-nr99-tax-derep-super.qza

I get the following error

/home/share/anaconda/envs/qiime2-amplicon-2024.5/lib/python3.9/site-packages/qiime2/core/cache.py:468: UserWarning: Your temporary cache was found to be in an inconsistent state. It has been recreated.

warnings.warn(

Plugin error from rescript:

'CP013075.1696123.1697664'

Debug info has been saved to /tmp/qiime2-q2cli-err-abalx_cf.log

I have searched my corrected_silva-138_2-ssu-nr99-tax file and there is no feature ID that is CP013075.1696123.1697664

I have attached the file here for your reference.

qiime2-q2cli-err-abalx_cf.txt (4.7 KB)

Prior to receiving this error I ran the following

qiime rescript cull-seqs --i-sequences silva-138_2-ssu-nr99-seqs.qza --o-clean-sequences silva-138_2-ssu-nr99-seqs_cleaned.qza

qiime rescript dereplicate --i-sequences silva-138_2-ssu-nr99-seqs_cleaned.qza --i-taxa corrected_silva-138_2-ssu-nr99-tax.qza --p-mode 'super' --o-dereplicated-sequences silva-138_2-ssu-nr99_clean-derep-super.qza --o-dereplicated-taxa corrected_silva-138_2-ssu-nr99-tax-derep-super.qza

When I ran the above script I ran into problems with duplicate feature IDs, so I did the following

qiime tools export --input-path silva-138_2-ssu-nr99-tax.qza --output-path silva-138_2-ssu-nr99-tax

mv taxonomy.tsv silva-138_2-ssu-nr99-tax.tsv

awk '!seen[$1]++' silva-138_2-ssu-nr99-tax.tsv > corrected_silva-138_2-ssu-nr99-tax.tsv

qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path corrected_silva-138_2-ssu-nr99-tax.tsv --output-path corrected_silva-138_2-ssu-nr99-tax.qza
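Before dropping rows with the awk one-liner above, it may help to see which feature IDs are duplicated and whether the duplicate rows disagree on the taxonomy string, since awk '!seen[$1]++' silently keeps only the first row for each ID. The following is a minimal sketch, assuming a tab-separated file with the feature ID in column 1 and the taxonomy string in column 2; the function names are hypothetical helpers, not part of QIIME 2 or RESCRIPt.

```shell
# Hypothetical helper: print every feature ID (column 1) that occurs more than once.
list_duplicate_ids() {
    cut -f1 "$1" | sort | uniq -d
}

# Hypothetical helper: print feature IDs whose duplicate rows carry
# different taxonomy strings (column 2). Assumes tab-separated input.
conflicting_duplicate_ids() {
    awk -F'\t' '
        # ID seen before with a different taxonomy string: record a conflict.
        ($1 in tax) && tax[$1] != $2 { conflict[$1] = 1 }
        # First time this ID is seen: remember its taxonomy string.
        !($1 in tax)                  { tax[$1] = $2 }
        END { for (id in conflict) print id }
    ' "$1" | sort
}
```

For example, list_duplicate_ids silva-138_2-ssu-nr99-tax.tsv would list the repeated IDs, and conflicting_duplicate_ids on the same file would show which of those map to more than one taxonomy string. If the second list is not empty, keeping only the first row per ID discards real disagreement in the file.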

Any guidance would be appreciated.

Hi @pbrannock,

From where did you download the silva-138_2_ssu_nr99 files? Assuming these are QZA files, what version of QIIME 2 was used to make them? You can view that in the provenance with QIIME 2 View.

Without looking through the files, I'd guess that this error is related to the prior editing / parsing of the taxonomy file. It is okay for the taxonomy file IDs to be a superset of the sequence file IDs. This error appears to be the other way around: the sequence file IDs are a superset of the taxonomy file IDs, so at least one sequence (here 'CP013075.1696123.1697664') has no matching taxonomy entry.
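One way to check this guess is to compare the two ID sets directly. The sketch below assumes both artifacts have been exported with qiime tools export, and that the exports produce a FASTA file (typically dna-sequences.fasta) and a tab-separated taxonomy file (typically taxonomy.tsv); the function names are hypothetical helpers, not QIIME 2 commands.

```shell
# Feature IDs from a FASTA file: first whitespace-delimited token of each header line.
fasta_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | LC_ALL=C sort -u
}

# Feature IDs from a taxonomy TSV (column 1), skipping a "Feature ID" header if present.
taxonomy_ids() {
    awk -F'\t' '$1 != "Feature ID" && $1 != "" { print $1 }' "$1" | LC_ALL=C sort -u
}

# Print IDs that have a sequence but no taxonomy row.
# Usage: ids_missing_taxonomy <sequences.fasta> <taxonomy.tsv>
ids_missing_taxonomy() {
    seq_ids_tmp=$(mktemp)
    tax_ids_tmp=$(mktemp)
    fasta_ids "$1" > "$seq_ids_tmp"
    taxonomy_ids "$2" > "$tax_ids_tmp"
    # comm -23 keeps lines that appear only in the first (sorted) file.
    LC_ALL=C comm -23 "$seq_ids_tmp" "$tax_ids_tmp"
    rm -f "$seq_ids_tmp" "$tax_ids_tmp"
}
```

If ids_missing_taxonomy prints any IDs, the sequence file contains features the taxonomy file does not, which would be consistent with the ID named in the plugin error being absent from the taxonomy file.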

To be safe, I'd recommend using qiime rescript edit-taxonomy ... to make any taxonomy adjustments. This will minimize unexpected errors.

Also, I'd advise against using --p-mode 'super' with SILVA taxonomy strings, as this can result in hybrid, nonsensical taxonomies for some groups. I'd leave this parameter at its default, --p-mode 'uniq'.

Finally, that version of QIIME 2 is almost 2 years old. I'd inquire about installing the most recent version (2026.1) on that system.

Hi @SoilRotifer,

I got the SILVA files from the server that I use. I assume they obtained them from the SILVA site and used the same version of QIIME 2 (2024.5) to make the QZA files. I tried to run the following command

qiime rescript get-silva-data \
--p-version '138.2' \
--p-target 'SSURef_NR99' \
--o-silva-sequences silva-138.2-ssu-nr99-rna-seqs.qza \
--o-silva-taxonomy silva-138.2-ssu-nr99-tax.qza

but the QIIME 2 version I have access to does not offer 138.2. I know there is a workaround, but I do not have permissions to do that.

I do not think qiime rescript edit-taxonomy will work for me. My initial problem is that silva-138_2-ssu-nr99-tax.qza has duplicate feature IDs. When I run the following command (regardless of the --p-mode) I get an error listing a number of duplicated IDs. I could not find a way to remove duplicate feature IDs with edit-taxonomy; I could only see how to edit the taxonomic string.

qiime rescript dereplicate --i-sequences silva-138_2-ssu-nr99-seqs_cleaned.qza --i-taxa silva-138_2-ssu-nr99-tax.qza --p-mode 'super' --o-dereplicated-sequences silva-138_2-ssu-nr99_clean-derep-super.qza --o-dereplicated-taxa silva-138_2-ssu-nr99-tax-derep-super.qza

After searching, I found that the fix is to run the commands below to remove the duplicated taxonomy entries

qiime tools export --input-path silva-138_2-ssu-nr99-tax.qza --output-path silva-138_2-ssu-nr99-tax

mv taxonomy.tsv silva-138_2-ssu-nr99-tax.tsv

awk '!seen[$1]++' silva-138_2-ssu-nr99-tax.tsv > corrected_silva-138_2-ssu-nr99-tax.tsv

qiime tools import --type 'FeatureData[Taxonomy]' --input-format HeaderlessTSVTaxonomyFormat --input-path corrected_silva-138_2-ssu-nr99-tax.tsv --output-path corrected_silva-138_2-ssu-nr99-tax.qza

I have requested that the latest version of QIIME 2 be installed on the server. I am waiting to hear back.

I'd suggest not trusting these files. See below.

There should be no duplicates when running through the RESCRIPt pipeline. I suspect something is very wrong. Again, you can look through the provenance for those files via QIIME 2 View. You should be able to see what they did to make the QZA files. However, it should not be possible to make a QZA file with duplicate IDs present. Duplicate IDs are checked for during the QZA file generation process. At this stage an error should be raised.

Solution:
There is a workaround to make a 138.2 database with your version of QIIME 2. All you need to do is take a more manual approach, as outlined in the original tutorial, by clicking the drop-arrow "The gritty details", just under "Hard mode". Just wget the listed files and run the import commands (the rescript get-silva-data command simply automates these parts for you). Then you are ready to proceed with the rest of the curation pipeline. Skip or choose whichever steps you think appropriate.

FYI, SILVA just recently started to generate some premade QIIME 2 files. Check those out and see if they work.
