Plugin error from deblur: Deblur cannot operate on sample IDs that contain underscores.

I tried to run deblur on my joined-read data (“Result vsjoin.qza appears to be valid at level=max.”) and got the following error message:

Plugin error from deblur:

*** Deblur cannot operate on sample IDs that contain underscores. The following ID(s) contain one or more underscores: DB-05-S81_L001, DB-06-S25_L001, DB-07-S2_L001, DB-08-S34_L001, DB-09-S23_L001, DBA-05-S78_L001, DBA-06-S11_L001, DBA-07-S38_L001, DBA-08-S24_L001, DBA-09-…***

I am now stuck. Please give me a hint!

At first I thought this was like the case where dada2 does not accept my samples with 46-character read names from the MiSeq machine, so that I must always first rename all reads to fewer than 40 characters. I had almost started renaming my sample files and updating the metadata file, to start the whole pipeline again, when it occurred to me that I had already used deblur a month ago on filenames containing underscores. So this error message may not even be true. I even found the deblur log from that time, which says:

more deblur.log
INFO(140639269128000)2019-10-11 16:18:50,532:*************************
INFO(140639269128000)2019-10-11 16:18:50,532:deblurring started
WARNING(140639269128000)2019-10-11 16:18:50,532:deblur version 1.1.0 workflow started on /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-dpwmebh7
WARNING(140639269128000)2019-10-11 16:18:50,532:parameters: {‘logger’: <Logger main (INFO)>, ‘is_worker_thread’: None, ‘jobs_to_start’: 1, ‘log_file’: ‘/mnt/sda1/home/kp/old.duna/deblur.log’, ‘log_level’: 2, ‘keep_tmp_files’: None, ‘threads_per_sample’: 1, ‘min_size’: 2, ‘min_reads’: 10, ‘left_trim_length’: 0, ‘trim_length’: 249, ‘indel_max’: 3, ‘indel_prob’: 0.01, ‘error_dist’: [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005], ‘mean_error’: 0.005, ‘overwrite’: True, ‘neg_ref_db_fp’: (), ‘neg_ref_fp’: (), ‘pos_ref_db_fp’: (), ‘pos_ref_fp’: (), ‘output_dir’: ‘/tmp/tmpk1t0rc4u’, ‘seqs_fp’: ‘/tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-dpwmebh7’}
INFO(140639269128000)2019-10-11 16:18:50,532:error_dist is : [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005]
INFO(140639269128000)2019-10-11 16:18:50,532:deblur main program started
INFO(140639269128000)2019-10-11 16:18:50,532:processing directory /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-dpwmebh7
INFO(140639269128000)2019-10-11 16:18:50,534:building negative db sortmerna index files
INFO(140639269128000)2019-10-11 16:18:50,534:build_index_sortmerna files [’/home/kp/.conda/envs/qiime2-2019.7/lib/python3.6/site-packages/deblur/support_files/artifacts.fa’] to dir /tmp/tmpk1t0rc4u/deblur_wo
INFO(140639269128000)2019-10-11 16:18:50,584:building positive db sortmerna index files
INFO(140639269128000)2019-10-11 16:18:50,584:build_index_sortmerna files [’/home/kp/.conda/envs/qiime2-2019.7/lib/python3.6/site-packages/deblur/support_files/88_otus.fasta’] to dir /tmp/tmpk1t0rc4u/deblur_w
INFO(140639269128000)2019-10-11 16:19:27,202:processing per sample fasta files
INFO(140639269128000)2019-10-11 16:19:27,203:--------------------------------------------------------
INFO(140639269128000)2019-10-11 16:19:27,203:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-dpwmebh7/RR15_S86_L001_R1_001.fastq.gz
INFO(140639269128000)2019-10-11 16:19:51,510:dereplicate seqs file /tmp/tmpk1t0rc4u/deblur_working_dir/RR15_S86_L001_R1_001.fastq.gz.trim
INFO(140639269128000)2019-10-11 16:19:51,682:remove_artifacts_seqs file /tmp/tmpk1t0rc4u/deblur_working_dir/RR15_S86_L001_R1_001.fastq.gz.trim.derep
INFO(140639269128000)2019-10-11 16:19:52,548:total sequences 11084, passing sequences 11084, failing sequences 0
INFO(140639269128000)2019-10-11 16:19:52,549:multiple_sequence_alignment seqs file /tmp/tmpk1t0rc4u/deblur_working_dir/RR15_S86_L001_R1_001.fastq.gz.trim.derep.no_artifacts
INFO(140639269128000)2019-10-11 16:22:01,894:deblurring 11084 sequences
INFO(140639269128000)2019-10-11 16:23:58,183:6116 unique sequences left following deblurring
INFO(140639269128000)2019-10-11 16:23:58,220:remove_chimeras_denovo_from_seqs seqs file /tmp/tmpk1t0rc4u/deblur_working_dir/RR15_S86_L001_R1_001.fastq.gz.trim.derep.no_artifacts.msa.deblurto working dir /tmp
INFO(140639269128000)2019-10-11 16:24:03,286:finished processing file
INFO(140639269128000)2019-10-11 16:24:03,298:-------------------------


Hi @Peter,

The problem is right in the title: Deblur cannot operate on sample IDs that contain underscores.

So, you have a few options to fix it. You can (1) rename your samples to remove the underscores (my suggestion is to just re-import with a manifest), or (2) run deblur outside of QIIME 2, where it will work with the underscores, but it’s a bit more painful to get the results back in. So, my suggestion is to use a manifest to import with periods or dashes in the names, or to drop the lane designation unless there’s something really special there. I might even double-check against your original metadata file, because much of your ID looks like things that get added on the sequencer, and there’s no point in keeping the full designation if you’re not working with that full ID.
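
If it helps, here is a minimal sketch of how a manifest with underscore-free sample IDs could be generated. It is only an illustration under several assumptions: that the raw paired-end files end in `_R1_001.fastq.gz` and `_R2_001.fastq.gz`, that the lane designation looks like `_L001`, and that the header columns match the tab-separated `PairedEndFastqManifestPhred33V2` manifest format (please check the importing documentation for your QIIME 2 release). The function names `clean_sample_id`, `build_manifest_rows`, and `write_manifest` are hypothetical helpers, not part of QIIME 2 or deblur.

```python
import csv
import re
from pathlib import Path

# Assumption: the lane designation is a trailing "_L" plus three digits, e.g. "_L001".
LANE_SUFFIX = re.compile(r"_L\d{3}$")


def clean_sample_id(sample_id):
    """Drop a trailing lane suffix, then replace any remaining underscores with dashes."""
    without_lane = LANE_SUFFIX.sub("", sample_id)
    return without_lane.replace("_", "-")


def build_manifest_rows(reads_dir,
                        fwd_suffix="_R1_001.fastq.gz",
                        rev_suffix="_R2_001.fastq.gz"):
    """Return (sample-id, forward path, reverse path) tuples for each forward read file."""
    rows = []
    for fwd in sorted(Path(reads_dir).glob("*" + fwd_suffix)):
        old_id = fwd.name[: -len(fwd_suffix)]      # file name without the read suffix
        rev = fwd.with_name(old_id + rev_suffix)   # assumed matching reverse read file
        rows.append((clean_sample_id(old_id), str(fwd.resolve()), str(rev.resolve())))
    return rows


def write_manifest(rows, out_path):
    """Write a tab-separated manifest; refuse to write if cleaned IDs collide."""
    ids = [row[0] for row in rows]
    if len(set(ids)) != len(ids):
        raise ValueError("cleaned sample IDs are not unique")
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        # Assumed header for the paired-end V2 manifest format.
        writer.writerow(["sample-id",
                         "forward-absolute-filepath",
                         "reverse-absolute-filepath"])
        writer.writerows(rows)
```

Calling `write_manifest(build_manifest_rows("raw_reads"), "manifest.tsv")` (both paths are placeholders) would produce a file to pass to `qiime tools import`. The sample IDs in the metadata file need to be changed to the same cleaned IDs, otherwise later steps will not find the samples.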


Thanks, Justine.
That clarified the case. I just did not understand how I could use deblur inside QIIME 2 the other day if I cannot do that now.
But from now on I will leave this enigma alone; I will re-import the data and start again.

Again, thanks a lot, have a merry Xmas