INFO(139936696375104)2021-09-30 15:10:17,219:*************************
INFO(139936696375104)2021-09-30 15:10:17,219:deblurring started
WARNING(139936696375104)2021-09-30 15:10:17,219:deblur version 1.1.0 workflow started on /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl
WARNING(139936696375104)2021-09-30 15:10:17,219:parameters: {'seqs_fp': '/tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl', 'output_dir': '/tmp/tmpkn_3riuh', 'pos_ref_fp': (), 'pos_ref_db_fp': (), 'neg_ref_fp': (), 'neg_ref_db_fp': (), 'overwrite': True, 'mean_error': 0.005, 'error_dist': [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005], 'indel_prob': 0.01, 'indel_max': 3, 'trim_length': 150, 'left_trim_length': 0, 'min_reads': 10, 'min_size': 2, 'threads_per_sample': 1, 'keep_tmp_files': True, 'log_level': 2, 'log_file': '/home/iasst/transcriptome/cutadapt/deblur.log', 'jobs_to_start': 1, 'is_worker_thread': False, 'logger': }
INFO(139936696375104)2021-09-30 15:10:17,219:error_dist is : [1, 0.06, 0.02, 0.02, 0.01, 0.005, 0.005, 0.005, 0.001, 0.001, 0.001, 0.0005]
INFO(139936696375104)2021-09-30 15:10:17,219:deblur main program started
INFO(139936696375104)2021-09-30 15:10:17,219:processing directory /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl
INFO(139936696375104)2021-09-30 15:10:17,220:building negative db sortmerna index files
INFO(139936696375104)2021-09-30 15:10:17,220:build_index_sortmerna files ['/home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa'] to dir /tmp/tmpkn_3riuh/deblur_working_dir
INFO(139936696375104)2021-09-30 15:10:17,296:building positive db sortmerna index files
INFO(139936696375104)2021-09-30 15:10:17,296:build_index_sortmerna files ['/home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/88_otus.fasta'] to dir /tmp/tmpkn_3riuh/deblur_working_dir
INFO(139936696375104)2021-09-30 15:11:04,718:processing per sample fasta files
INFO(139936696375104)2021-09-30 15:11:04,718:--------------------------------------------------------
INFO(139936696375104)2021-09-30 15:11:04,718:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1069_6_L001_R1_001.fastq.gz
INFO(139936696375104)2021-09-30 17:57:51,949:dereplicate seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim
INFO(139936696375104)2021-09-30 18:00:04,806:remove_artifacts_seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-09-30 18:00:04,950:sortmerna error on file /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-09-30 18:00:04,950:stdout : [process:1372] === Options processing starts ... ===
Found value: sortmerna
Found flag: --reads
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim.derep of previous flag: --reads
Found flag: --ref
Found value: /home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa,/tmp/tmpkn_3riuh/deblur_working_dir/artifacts of previous flag: --ref
Found flag: --aligned
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim.derep.sortmerna of previous flag: --aligned
Found flag: --blast
Found value: 3 of previous flag: --blast
Found flag: --best
Found value: 1 of previous flag: --best
Found flag: --print_all_reads
Previous flag: --print_all_reads is Boolean. Setting to True
Found flag: -v
Previous flag: -v is Boolean.
Setting to True
Found flag: -e
Found value: 100 of previous flag: -e
[process:1456] Processing option: aligned with value: /tmp/tmpkn_3riuh/deblur_working_dir/R1069_6_L001_R1_001.fastq.gz.trim.derep.sortmerna
[process:1456] Processing option: best with value: 1

Usage: sortmerna -ref FILE [-ref FILE] -reads FWD_READS [-reads REV_READS] [OPTIONS]:
-------------------------------------------------------------------------------------------------------------
| option   type-format   description   default |
-------------------------------------------------------------------------------------------------------------

[REQUIRED]
--ref PATH Required  Reference file (FASTA) absolute or relative path.
    Use mutliple times, once per a reference file
--reads PATH Required  Raw reads file (FASTA/FASTQ/FASTA.GZ/FASTQ.GZ).
    Use twice for files with paired reads.
    The file extensions are Not important. The program automatically recognizes the file format as flat/compressed, fasta/fastq

[COMMON]
--workdir PATH Optional  Workspace directory  [default: USRDIR/sortmerna/run/]
    Default structure: WORKDIR/
        idx/ (References index)
        kvdb/ (Key-value storage for alignments)
        out/ (processing output)
        readb/ (pre-processed reads/index)
--kvdb PATH Optional  Directory for Key-value database  [default: WORKDIR/kvdb]
    KVDB is used for storing the alignment results.
--idx-dir PATH Optional  Directory for storing Reference index.  [default: WORKDIR/idx]
--readb PATH Optional  Storage for pre-processed reads  [default: WORKDIR/readb/]
    Directory storing the split reads, or the random access index of compressed reads
--fastx BOOL Optional  Output aligned reads into FASTA/FASTQ file
--sam BOOL Optional  Output SAM alignment for aligned reads.
--SQ BOOL Optional  Add SQ tags to the SAM file
--blast STR Optional  output alignments in various Blast-like formats
    Sample values:
        '0' - pairwise
        '1' - tabular (Blast - m 8 format)
        '1 cigar' - tabular + column for CIGAR
        '1 cigar qcov' - tabular + columns for CIGAR and query coverage
        '1 cigar qcov qstrand' - tabular + columns for CIGAR, query coverage, and strand
--aligned STR/BOOL Optional  Aligned reads file prefix [dir/][pfx]  [default: WORKDIR/out/aligned]
    Directory and file prefix for aligned output i.e. each output file goes into the specified directory with the given prefix.
    The appropriate extension: (fasta|fastq|blast|sam|etc) is automatically added.
    Both 'dir' and 'pfx' are optional.
    The 'dir' can be a relative or an absolute path.
    If 'dir' is not specified, the output is created in the WORKDIR/out/
    If 'pfx' is not specified, the prefix 'aligned' is used
    Examples:
        '-aligned $MYDIR/dir_1/dir_2/1' -> $MYDIR/dir_1/dir_2/1.fasta
        '-aligned dir_1/apfx' -> $PWD/dir_1/apfx.fasta
        '-aligned dir_1/' -> $PWD/aligned.fasta
        '-aligned apfx' -> $PWD/apfx.fasta
        '-aligned (no argument)' -> WORKDIR/out/aligned.fasta
--other STR/BOOL Optional  Non-aligned reads file prefix [dir/][pfx]  [default: WORKDIR/out/other]
    Directory and file prefix for non-aligned output i.e. each output file goes into the specified directory with the given prefix.
    The appropriate extension: (fasta|fastq|blast|sam|etc) is automatically added.
    Must be used with 'fastx'.
    Both 'dir' and 'pfx' are optional.
    The 'dir' can be a relative or an absolute path.
    If 'dir' is not specified, the output is created in the WORKDIR/out/
    If 'pfx' is not specified, the prefix 'other' is used
    Examples:
        '-other $MYDIR/dir_1/dir_2/1' -> $MYDIR/dir_1/dir_2/1.fasta
        '-other dir_1/apfx' -> $PWD/dir_1/apfx.fasta
        '-other dir_1/' -> $PWD/dir_1/other.fasta
        '-other apfx' -> $PWD/apfx.fasta
        '-other (no argument)' -> aligned_out/other.fasta i.e. the same output directory as used for aligned output
--num_alignments INT Optional  Positive integer (INT >=0).
    If used with '-no-best' reports first INT alignments per read reaching E-value threshold, which allows to lower the CPU time and memory use.
    Otherwise outputs INT best alignments.
    If INT = 0, all alignments are output
--no-best BOOL Optional  Disable best alignments search  [default: 1]
    By default the exchaustive alignments search is performed by searching '-min_lis N' candidate alignments
    If N == 0: All candidate alignments are searched
    If N > 0: N best alignments are searched.
    Naturally the larger is the N, the longer is the search time.
    Explanation: A read can potentially be aligned (reaching E-value threshold) to multiple reference sequences. The 'best' alignment is the highest scoring alignment out of All alignments of a Read. To find the Best alignment - an exhaustive search over All references has to be performed. 'best 1' and 'best 0' (all the bests) are Equally intensive processes requiring the exhaustive search, although the size of reports will differ.
--min_lis INT Optional  Search all alignments having the first INT longest LIS  [default: 2]
    LIS stands for Longest Increasing Subsequence, it is computed using seeds' positions to expand hits into longer matches prior to Smith - Waterman alignment.
--print_all_reads BOOL Optional  Output null alignment strings for non-aligned reads to SAM and/or BLAST tabular files  [default: False]
--paired BOOL Optional  Flags paired reads  [default: False]
    If a single reads file is provided, use this option to indicate the file contains interleaved paired reads when neither 'paired_in' | 'paired_out' | 'out2' | 'sout' are specified.
--paired_in BOOL Optional  Flags the paired-end reads as Aligned, when either of them is Aligned.  [default: False]
    With this option both reads are output into Aligned FASTA/Q file
    Must be used with 'fastx'.
    Mutually exclusive with 'paired_out'.
--paired_out BOOL Optional  Flags the paired-end reads as Non-aligned, when either of them is non-aligned.  [default: False]
    With this option both reads are output into Non-Aligned FASTA/Q file
    Must be used with 'fastx'.
    Mutually exclusive with 'paired_in'.
--out2 BOOL Optional  Output paired reads into separate files.  [default: False]
    Must be used with 'fastx'.
    If a single reads file is provided, this options implies interleaved paired reads
    When used with 'sout', four (4) output files for aligned reads will be generated: 'aligned-paired-fwd, aligned-paired-rev, aligned-singleton-fwd, aligned-singleton-rev'.
    If 'other' option is also used, eight (8) output files will be generated.
--sout BOOL Optional  Separate paired and singleton aligned reads.  [default: False]
    To be used with 'fastx'.
    If a single reads file is provided, this options implies interleaved paired reads
    Cannot be used with 'paired_in' | 'paired_out'
--zip-out STR/BOOL Optional  Controls the output compression  [default: Yes/True]
    By default the report files are produced in the same format as the input i.e. if the reads files are compressed (gz), the output is also compressed.
    The default behaviour can be overriden by using '-zip-out'.
    The possible values: Y(es), N(o), T(rue), F(alse). No value means 'True'.
    The values are Not case sensitive i.e. 'Yes, YES, yEs, Y, y' are all OK
    Examples:
        '-reads freads.gz -zip-out n' : generate flat output when the input is compressed
        '-reads freads.flat -zip-out' : compress the output when the input files are flat
--match INT Optional  SW score (positive integer) for a match.  [default: 2]
--mismatch INT Optional  SW penalty (negative integer) for a mismatch.  [default: -3]
--gap_open INT Optional  SW penalty (positive integer) for introducing a gap.  [default: 5]
--gap_ext INT Optional  SW penalty (positive integer) for extending a gap.  [default: 2]
-e DOUBLE Optional  E-value threshold.  [default: 1]
    Defines the 'statistical significance' of a local alignment.
    Exponentially correllates with the Minimal Alignment score.
    Higher E-values (100, 1000, ...) cause More reads to Pass the alignment threshold
-F BOOL Optional  Search only the forward strand.  [default: False]
-N BOOL Optional  SW penalty for ambiguous letters (N's) scored as --mismatch
-R BOOL Optional  Search only the reverse-complementary strand.  [default: False]

[OTU_PICKING]
--id INT Optional  %%id similarity threshold (the alignment must still pass the E-value threshold).  [default: 0.97]
--coverage INT Optional  %%query coverage threshold (the alignment must still pass the E-value threshold)  [default: 0.97]
--de_novo_otu BOOL Optional  Output FASTA file with 'de novo' reads  [default: False]
    Read is 'de novo' if its alignment score passes E-value threshold, but both the identity '-id', and the '-coverage' are below their corresponding thresholds i.e. ID < %%id and COV < %%cov
--otu_map BOOL Optional  Output OTU map (input to QIIME's make_otu_table.py).  [default: False]
    Cannot be used with 'no-best because the grouping is done around the best alignment'

[ADVANCED]
--passes INT,INT,INT Optional  Three intervals at which to place the seed on the read (L is the seed length)  [default: L,L/2,3]
--edges INT Optional  Number (or percent if INT followed by %% sign) of nucleotides to add to each edge of the read prior to SW local alignment  [default: 4]
--num_seeds BOOL Optional  Number of seeds matched before searching for candidate LIS  [default: 2]
--full_search INT Optional  Search for all 0-error and 1-error seed matches in the index rather than stopping after finding a 0-error match (<1%% gain in sensitivity with up four-fold decrease in speed)  [default: False]
--pid BOOL Optional  Add pid to output file names.  [default: False]
-a INT Optional  DEPRECATED in favour of '-threads'. Number of processing threads to use. Automatically redirects to '-threads'  [default: numCores]
--threads INT Optional  Number of Processing threads to use  [default: 2]

[INDEXING]
--index INT Optional  Build reference database index  [default: 2]
    By default when this option is not used, the program checks the reference index and builds it if not already existing.
    This can be changed by using '-index' as follows:
        '-index 0' - skip indexing. If the index does not exist, the program will terminate and warn to build the index prior performing the alignment
        '-index 1' - only perform the indexing and terminate
        '-index 2' - the default behaviour, the same as when not using this option at all
-L DOUBLE Optional  Indexing: seed length.  [default: 18]
-m DOUBLE Optional  Indexing: the amount of memory (in Mbytes) for building the index.  [default: 3072]
-v BOOL Optional  Produce verbose output when building the index  [default: True]
--interval INT Optional  Indexing: Positive integer: index every Nth L-mer in the reference database e.g. '-interval 2'.  [default: 1]
--max_pos INT Optional  Indexing: maximum (integer) number of positions to store for each unique L-mer.  [default: 1000]
    If 0 - all positions are stored.

[HELP]
-h BOOL Optional  Print help information
--version BOOL Optional  Print SortMeRNA version number

[DEVELOPER]
--dbg_put_db BOOL Optional
--cmd BOOL Optional  Launch an interactive session (command prompt)  [default: False]
--task INT Optional  Processing Task  [default: 4]
    Possible values:
        0 - align. Only perform alignment
        1 - post-processing (log writing)
        2 - generate reports
        3 - align and post-process
        4 - all
--dbg-level INT Optional  Debug level  [default: 0]
    Controls verbosity of the execution trace. Default value of 0 corresponds to the least verbose output. The highest value currently is 2.
ERROR(139936696375104)2021-09-30 18:00:04,950:stderr : [opt_default:939] ERROR: Option: 'best' is not recognized
WARNING(139936696375104)2021-09-30 18:00:04,960:No sequences left after artifact removal in file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1069_6_L001_R1_001.fastq.gz
WARNING(139936696375104)2021-09-30 18:00:04,960:deblurring failed for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1069_6_L001_R1_001.fastq.gz
INFO(139936696375104)2021-09-30 18:00:04,960:--------------------------------------------------------
INFO(139936696375104)2021-09-30 18:00:04,960:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1073_8_L001_R1_001.fastq.gz
INFO(139936696375104)2021-09-30 23:42:51,515:dereplicate seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim
INFO(139936696375104)2021-09-30 23:47:39,437:remove_artifacts_seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-09-30 23:47:39,446:sortmerna error on file /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-09-30 23:47:39,446:stdout : [process:1372] === Options processing starts ... ===
Found value: sortmerna
Found flag: --reads
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim.derep of previous flag: --reads
Found flag: --ref
Found value: /home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa,/tmp/tmpkn_3riuh/deblur_working_dir/artifacts of previous flag: --ref
Found flag: --aligned
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim.derep.sortmerna of previous flag: --aligned
Found flag: --blast
Found value: 3 of previous flag: --blast
Found flag: --best
Found value: 1 of previous flag: --best
Found flag: --print_all_reads
Previous flag: --print_all_reads is Boolean.
Setting to True
Found flag: -v
Previous flag: -v is Boolean. Setting to True
Found flag: -e
Found value: 100 of previous flag: -e
[process:1456] Processing option: aligned with value: /tmp/tmpkn_3riuh/deblur_working_dir/R1073_8_L001_R1_001.fastq.gz.trim.derep.sortmerna
[process:1456] Processing option: best with value: 1
ERROR(139936696375104)2021-09-30 23:47:39,446:stderr : [opt_default:939] ERROR: Option: 'best' is not recognized
WARNING(139936696375104)2021-09-30 23:47:39,446:No sequences left after artifact removal in file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1073_8_L001_R1_001.fastq.gz
WARNING(139936696375104)2021-09-30 23:47:39,446:deblurring failed for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1073_8_L001_R1_001.fastq.gz
INFO(139936696375104)2021-09-30 23:47:39,446:--------------------------------------------------------
INFO(139936696375104)2021-09-30 23:47:39,446:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1071_7_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 02:58:48,976:dereplicate seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim
INFO(139936696375104)2021-10-01 03:01:16,486:remove_artifacts_seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 03:01:16,495:sortmerna error on file /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 03:01:16,495:stdout : [process:1372] === Options processing starts ... ===
Found value: sortmerna
Found flag: --reads
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim.derep of previous flag: --reads
Found flag: --ref
Found value: /home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa,/tmp/tmpkn_3riuh/deblur_working_dir/artifacts of previous flag: --ref
Found flag: --aligned
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim.derep.sortmerna of previous flag: --aligned
Found flag: --blast
Found value: 3 of previous flag: --blast
Found flag: --best
Found value: 1 of previous flag: --best
Found flag: --print_all_reads
Previous flag: --print_all_reads is Boolean.
Setting to True
Found flag: -v
Previous flag: -v is Boolean. Setting to True
Found flag: -e
Found value: 100 of previous flag: -e
[process:1456] Processing option: aligned with value: /tmp/tmpkn_3riuh/deblur_working_dir/R1071_7_L001_R1_001.fastq.gz.trim.derep.sortmerna
[process:1456] Processing option: best with value: 1
False -N BOOL Optional SW penalty for ambiguous letters (N's) scored as --mismatch -R BOOL Optional Search only the reverse-complementary strand. False [OTU_PICKING] --id INT Optional %%id similarity threshold (the alignment 0.97 must still pass the E-value threshold). --coverage INT Optional %%query coverage threshold (the alignment must 0.97 still pass the E-value threshold) --de_novo_otu BOOL Optional Output FASTA file with 'de novo' reads False Read is 'de novo' if its alignment score passes E-value threshold, but both the identity '-id', and the '-coverage' are below their corresponding thresholds i.e. ID < %%id and COV < %%cov --otu_map BOOL Optional Output OTU map (input to QIIME's make_otu_table.py). False Cannot be used with 'no-best because the grouping is done around the best alignment' [ADVANCED] --passes INT,INT,INT Optional Three intervals at which to place the seed on L,L/2,3 the read (L is the seed length) --edges INT Optional Number (or percent if INT followed by %% sign) of 4 nucleotides to add to each edge of the read prior to SW local alignment --num_seeds BOOL Optional Number of seeds matched before searching 2 for candidate LIS --full_search INT Optional Search for all 0-error and 1-error seed False matches in the index rather than stopping after finding a 0-error match (<1%% gain in sensitivity with up four-fold decrease in speed) --pid BOOL Optional Add pid to output file names. False -a INT Optional DEPRECATED in favour of '-threads'. Number of numCores processing threads to use. Automatically redirects to '-threads' --threads INT Optional Number of Processing threads to use 2 [INDEXING] --index INT Optional Build reference database index 2 By default when this option is not used, the program checks the reference index and builds it if not already existing. This can be changed by using '-index' as follows: '-index 0' - skip indexing. 
If the index does not exist, the program will terminate and warn to build the index prior performing the alignment '-index 1' - only perform the indexing and terminate '-index 2' - the default behaviour, the same as when not using this option at all -L DOUBLE Optional Indexing: seed length. 18 -m DOUBLE Optional Indexing: the amount of memory (in Mbytes) for 3072 building the index. -v BOOL Optional Produce verbose output when building the index True --interval INT Optional Indexing: Positive integer: index every Nth L-mer in 1 the reference database e.g. '-interval 2'. --max_pos INT Optional Indexing: maximum (integer) number of positions to 1000 store for each unique L-mer. If 0 - all positions are stored. [HELP] -h BOOL Optional Print help information --version BOOL Optional Print SortMeRNA version number [DEVELOPER] --dbg_put_db BOOL Optional --cmd BOOL Optional Launch an interactive session (command prompt) False --task INT Optional Processing Task 4 Possible values: 0 - align. Only perform alignment 1 - post-processing (log writing) 2 - generate reports 3 - align and post-process 4 - all --dbg-level INT Optional Debug level 0 Controls verbosity of the execution trace. Default value of 0 corresponds to the least verbose output. The highest value currently is 2. 
ERROR(139936696375104)2021-10-01 03:01:16,495:stderr : [opt_default:939] ERROR: Option: 'best' is not recognized
WARNING(139936696375104)2021-10-01 03:01:16,495:No sequences left after artifact removal in file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1071_7_L001_R1_001.fastq.gz
WARNING(139936696375104)2021-10-01 03:01:16,496:deblurring failed for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1071_7_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 03:01:16,496:--------------------------------------------------------
INFO(139936696375104)2021-10-01 03:01:16,496:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1038_4_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 06:11:33,870:dereplicate seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim
INFO(139936696375104)2021-10-01 06:13:36,060:remove_artifacts_seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 06:13:36,069:sortmerna error on file /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 06:13:36,069:stdout : [process:1372] === Options processing starts ... ===
Found value: sortmerna
Found flag: --reads
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim.derep of previous flag: --reads
Found flag: --ref
Found value: /home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa,/tmp/tmpkn_3riuh/deblur_working_dir/artifacts of previous flag: --ref
Found flag: --aligned
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim.derep.sortmerna of previous flag: --aligned
Found flag: --blast
Found value: 3 of previous flag: --blast
Found flag: --best
Found value: 1 of previous flag: --best
Found flag: --print_all_reads
Previous flag: --print_all_reads is Boolean.
Setting to True
Found flag: -v
Previous flag: -v is Boolean. Setting to True
Found flag: -e
Found value: 100 of previous flag: -e
[process:1456] Processing option: aligned with value: /tmp/tmpkn_3riuh/deblur_working_dir/R1038_4_L001_R1_001.fastq.gz.trim.derep.sortmerna
[process:1456] Processing option: best with value: 1
ERROR(139936696375104)2021-10-01 06:13:36,069:stderr : [opt_default:939] ERROR: Option: 'best' is not recognized
WARNING(139936696375104)2021-10-01 06:13:36,070:No sequences left after artifact removal in file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1038_4_L001_R1_001.fastq.gz
WARNING(139936696375104)2021-10-01 06:13:36,070:deblurring failed for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1038_4_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 06:13:36,070:--------------------------------------------------------
INFO(139936696375104)2021-10-01 06:13:36,070:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1039_5_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 09:21:53,883:dereplicate seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim
INFO(139936696375104)2021-10-01 09:24:18,053:remove_artifacts_seqs file /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 09:24:18,061:sortmerna error on file /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim.derep
ERROR(139936696375104)2021-10-01 09:24:18,061:stdout : [process:1372] === Options processing starts ... ===
Found value: sortmerna
Found flag: --reads
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim.derep of previous flag: --reads
Found flag: --ref
Found value: /home/iasst/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/deblur/support_files/artifacts.fa,/tmp/tmpkn_3riuh/deblur_working_dir/artifacts of previous flag: --ref
Found flag: --aligned
Found value: /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim.derep.sortmerna of previous flag: --aligned
Found flag: --blast
Found value: 3 of previous flag: --blast
Found flag: --best
Found value: 1 of previous flag: --best
Found flag: --print_all_reads
Previous flag: --print_all_reads is Boolean.
Setting to True
Found flag: -v
Previous flag: -v is Boolean. Setting to True
Found flag: -e
Found value: 100 of previous flag: -e
[process:1456] Processing option: aligned with value: /tmp/tmpkn_3riuh/deblur_working_dir/R1039_5_L001_R1_001.fastq.gz.trim.derep.sortmerna
[process:1456] Processing option: best with value: 1
ERROR(139936696375104)2021-10-01 09:24:18,061:stderr : [opt_default:939] ERROR: Option: 'best' is not recognized
WARNING(139936696375104)2021-10-01 09:24:18,061:No sequences left after artifact removal in file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1039_5_L001_R1_001.fastq.gz
WARNING(139936696375104)2021-10-01 09:24:18,061:deblurring failed for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1039_5_L001_R1_001.fastq.gz
INFO(139936696375104)2021-10-01 09:24:18,062:--------------------------------------------------------
INFO(139936696375104)2021-10-01 09:24:18,062:launch_workflow for file /tmp/q2-SingleLanePerSampleSingleEndFastqDirFmt-xmy_e8dl/R1037_3_L001_R1_001.fastq.gz