Deblur with --p-no-hashed-feature-ids not working

Hi Qiime2 support team,

I have been struggling to merge data from 3 different sequencing runs because of differences in the feature IDs. I have a slightly older dataset that was deblurred using qiime2 version 2018.2, in which the feature IDs were md5 hashes by default. The other two datasets I am trying to merge it with currently have the sequences as the feature IDs, which was also the default when they were deblurred using qiime2 2019.1.

In order to merge these datasets, I have tried a couple of things.

  1. Writing a python script to reassign the feature ID sequences in the newer datasets to be md5 hashes. This was done using the hashlib python package. I was able to verify the md5 assignments by comparing them to the older dataset, and it seemed to work within the rep-seqs files. However, when I later went to merge these datasets, I realized by looking within the “feature detail” tab of each of my feature tables that the two datasets still had sequences as feature IDs, which posed problems for taxonomic classification. I then tried exporting these feature tables as biom files, converting them to tsv, and writing a python script to convert the feature IDs to md5 hashes, but when converting back to a biom table I kept getting duplication errors.

  2. So, since that was getting super complicated, I decided it was worth re-deblurring the older dataset so that the feature IDs were sequences instead of md5 hashes. Here is a sample command of what I ran:

qiime deblur denoise-16S
--i-demultiplexed-seqs PMI3_spring_NIJ-1_demux-filtered.qza
--p-trim-length -1
--o-representative-sequences PMI3_spring_NIJ-1_rep-seqs_noHash.qza
--o-table PMI3_spring_NIJ-1_table_noHash.qza
--o-stats PMI3_spring_NIJ-1_deblur-stats_noHash.qza
--p-sample-stats
--p-no-hashed-feature-ids

I got a couple of errors. The --p-sample-stats command was “not found”, which I am a little confused about. Also, when I converted my output qza files to qzv files to look at them, most of the feature IDs were still md5 hashes, while some of them were sequences. I then wondered whether the order of things in the above command was wrong, and tried to run this one:

qiime deblur denoise-16S
--i-demultiplexed-seqs PMI3_spring_NIJ-1_demux-filtered.qza
--p-trim-length -1
--p-no-hashed-feature-ids
--o-representative-sequences PMI3_spring_NIJ-1_rep-seqs_noHash2.qza
--o-table PMI3_spring_NIJ-1_table_noHash2.qza
--p-sample-stats
--o-stats PMI3_spring_NIJ-1_deblur-stats_noHash2.qza \

And got the errors:

Error: Missing option: --o-table
Error: Missing option: --o-representative-sequences
Error: Missing option: --o-stats

Does anyone have insight as to how I can get this to work? I feel like I am almost there, but that maybe there is something I don’t understand about where to put optional parameters within my command. It takes about 6 hours for this to run, so any insight about how to do this right the next time would be much appreciated. Thank you all so much!

Heather

The code formatting on your post is a little off (next time put it in a backtick fence!), so the following suggestion might be an artifact of the formatting rather than a real issue, but bear with me. That error looks like it is due to missing backslashes at the ends of the lines: without them, the shell treats each line as a separate command. I would double-check for the missing backslashes; if they are all there, please re-copy and paste the commands into a backtick fence. Thanks!
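
For reference, here is a rough sketch of the first command with a trailing backslash on every line except the last, using the options and file names from the post. The `dry_run` wrapper is a hypothetical helper added only for illustration: it prints the assembled command instead of running it, so you can confirm the shell sees a single command before committing to a 6-hour run.

```shell
# Hypothetical dry-run helper (not part of QIIME 2): prints its arguments
# as one line instead of executing them.
dry_run() {
  printf '%s\n' "$*"
}

# Every line except the last ends in a backslash, so the shell joins
# them into one command and all options reach the same invocation.
# Remove the leading `dry_run` to actually run deblur.
dry_run qiime deblur denoise-16S \
  --i-demultiplexed-seqs PMI3_spring_NIJ-1_demux-filtered.qza \
  --p-trim-length -1 \
  --o-representative-sequences PMI3_spring_NIJ-1_rep-seqs_noHash.qza \
  --o-table PMI3_spring_NIJ-1_table_noHash.qza \
  --o-stats PMI3_spring_NIJ-1_deblur-stats_noHash.qza \
  --p-sample-stats \
  --p-no-hashed-feature-ids
```

If the printed line contains every option, the continuations are intact. Note that nothing may follow a backslash on its line, not even a space, or the continuation is lost.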

We haven't changed the default for that plugin (to the best of my knowledge). The default is to emit hashed feature IDs; check out the docs:

  --p-hashed-feature-ids / --p-no-hashed-feature-ids
                                  If true, hash the feature IDs.  [default:
                                  True]

By default, hashing is enabled. You are specifically opting out of it by specifying --p-no-hashed-feature-ids.
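
If it helps to check which form a given ID is in, here is a minimal shell sketch. It assumes, as the original post reports verifying with hashlib, that a hashed feature ID is the MD5 hex digest of the sequence itself. `seq_to_md5` and `is_md5_id` are hypothetical helper names, and `md5sum` is assumed to be available (GNU coreutils).

```shell
# Hypothetical helper: MD5 hex digest of a sequence string.
# printf '%s' avoids hashing a trailing newline.
seq_to_md5() {
  printf '%s' "$1" | md5sum | cut -d ' ' -f 1
}

# Hypothetical helper: succeeds if the ID looks like an MD5 digest,
# i.e. exactly 32 lowercase hexadecimal characters.
is_md5_id() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{32}$'
}

seq="ACGT"                 # placeholder sequence, not from the dataset
id=$(seq_to_md5 "$seq")

if is_md5_id "$id"; then
  echo "hashed ID: $id"
fi
if ! is_md5_id "$seq"; then
  echo "sequence ID: $seq"
fi
```

Running a check like this over the IDs of an exported table would show whether a file mixes the two forms.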

See my comments above about missing backslashes --- just a hunch.

@thermokarst Oh my goodness, thank you so much. Here I am thinking about this problem in the most complicated way, and it’s just something as simple as backslashes. I’ll rerun it and post if something else is off.

Thank you!!
