Hi @jwdebelius,
First of all, I reran my workflow with a shorter final trim length in order to keep about 90% of the sequences, but this did not change the error.
These are the commands I used, from import up to the point where the error in the regional alignment for V6 occurred. To determine the truncation lengths, I viewed the demultiplexed-seqs artifact: the sequence quality drops after positions 180 and 200 (forward and reverse reads, respectively). I have tried smaller values, but the same error occurred later on.
I am not sure whether the "--p-mixed-orientation" parameter of cutadapt demux-paired is necessary, but I ran the analysis with both TRUE and FALSE, and both runs still led to the error in the regional alignment.
I ran exactly the same commands for the V45 region, except for adapted truncation lengths, final trim length and, of course, primer sequences, and that produced a viewable alignment_vis_V45 artifact.
- Import
qiime tools import \
--type MultiplexedPairedEndBarcodeInSequence \
--input-path $dirData/sequences/MB-Lib1/ \
--output-path $dirOut/multiplexed-seqs.qza
- Demultiplex by region (same parameters for all regions; because of the multiplexing scheme, the barcode (9 nt) plus the forward primer (19 nt) together act as the barcode, so the whole "barcode" sequence is 28 nt long)
qiime cutadapt demux-paired \
--i-seqs $dirOut/multiplexed-seqs.qza \
--m-forward-barcodes-file ${metadata_V6} \
--m-forward-barcodes-column BC+Primer-Sequenz \
--p-mixed-orientation TRUE \
--o-per-sample-sequences $dirOut/demultiplexed-seqs-V6-0.25.qza \
--o-untrimmed-sequences $dirOut/untrimmed-V6-0.25.qza \
--p-error-rate 0.25 \
--p-cores 23 \
--verbose
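For context on the "--p-error-rate 0.25" setting above, here is a rough sketch of how many mismatches it would tolerate across the 28 nt combined barcode. It assumes cutadapt's rule that the number of allowed errors is the error rate times the sequence length, rounded down; the function name max_barcode_errors is made up for illustration and is not part of QIIME 2 or cutadapt.

```bash
#!/usr/bin/env bash
# Illustrative helper (hypothetical name, not a QIIME 2 / cutadapt command).
# Assumption: allowed errors = floor(barcode_length * error_rate).
max_barcode_errors() {
    local length="$1" rate="$2"
    awk -v n="$length" -v r="$rate" 'BEGIN { printf "%d\n", int(n * r) }'
}

barcode_len=$((9 + 19))   # 9 nt barcode + 19 nt forward primer = 28 nt
max_barcode_errors "$barcode_len" 0.25   # prints 7
```

Under that assumption, up to 7 of the 28 positions may mismatch, which is fairly permissive given that only the first 9 nt differ between samples of the same region.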
- Read Preparation
a) Denoise
qiime dada2 denoise-paired \
--i-demultiplexed-seqs $dirOut/demultiplexed-seqs-V6-0.25.qza \
--p-trunc-len-f 180 \
--p-trunc-len-r 200 \
--o-table ${table_V6_MR} \
--o-representative-sequences ${rep_seqs_V6_MR} \
--p-n-threads 23 \
--o-denoising-stats ${den_stats_V6_MR}
b) Trim to uniform length
qiime sidle trim-dada2-posthoc \
--i-table ${table_V6_MR} \
--i-representative-sequences ${rep_seqs_V6_MR} \
--p-trim-length 220 \
--o-trimmed-table ${table_trimmed_V6_MR} \
--o-trimmed-representative-sequences ${rep_seqs_trimmed_V6_MR}
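To see how many representative sequences are long enough to survive a given trim length, the sequences can be counted by length. The sketch below is a minimal helper, assuming (as far as I understand the posthoc trimming) that sequences shorter than the trim length are dropped; the function name count_seqs_at_least and the demo data are made up for illustration.

```bash
#!/usr/bin/env bash
# Illustrative helper (hypothetical name): count FASTA records whose
# sequence is at least min_len nt long. Wrapped sequences are handled.
count_seqs_at_least() {
    local fasta="$1" min_len="$2"
    awk -v min="$min_len" '
        /^>/ { if (seen && len >= min) kept++; seen = 1; len = 0; next }
        { len += length($0) }
        END { if (seen && len >= min) kept++; print kept + 0 }
    ' "$fasta"
}

# Tiny made-up example: two of the three records reach 10 nt.
demo_fasta="$(mktemp)"
printf '>asv1\nACGTACGTAC\n>asv2\nACGT\n>asv3\nACGTACG\nTACGT\n' > "$demo_fasta"
count_seqs_at_least "$demo_fasta" 10   # prints 2
rm -f "$demo_fasta"
```

With real data, this could be pointed at the dna-sequences.fasta file that "qiime tools export" writes for the untrimmed representative sequences, using 220 as the minimum length.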
- Database preparation
a) Filter database
qiime rescript cull-seqs \
--i-sequences ${ref_sequences_bacterial_in} \
--p-num-degenerates 5 \
--o-clean-sequences ${ref_sequences_bacterial_f}
qiime taxa filter-seqs \
--i-sequences ${ref_sequences_bacterial_f} \
--i-taxonomy ${ref_taxonomy_bacterial} \
--p-exclude mitochondria,chloroplast \
--p-mode contains \
--o-filtered-sequences ${ref_sequences_bacterial_taxf}
b) Prepare regional database for each region
qiime feature-classifier extract-reads \
--i-sequences ${ref_sequences_bacterial_taxf} \
--p-f-primer GCACAAGCRGHGGARCATG \
--p-r-primer CCGYCAATTYMTTTRAGTTT \
--o-reads ${ref_sequences_bacterial_taxf_V6}
qiime sidle prepare-extracted-region \
--i-sequences ${ref_sequences_bacterial_taxf_V6} \
--p-region "V6" \
--p-fwd-primer GCACAAGCRGHGGARCATG \
--p-r-primer CCGYCAATTYMTTTRAGTTT \
--p-trim-length 220 \
--o-collapsed-kmers ${db_V6_kmers} \
--o-kmer-map ${db_V6_kmers_map}
- Sequence reconstruction
a) Regional alignment
qiime sidle align-regional-kmers \
--i-kmers ${db_V6_kmers} \
--i-rep-seq ${rep_seqs_trimmed_V6_MR} \
--p-region V6 \
--p-max-mismatch 5 \
--p-n-workers 23 \
--o-regional-alignment ${alignment_V6}
qiime metadata tabulate \
--m-input-file ${alignment_V6} \
--o-visualization ${alignment_vis_V6}
Attached are the demux-seqs, rep_seqs, rep_seqs_trimmed and db_kmers_map visualizations, plus the empty alignment.qza, for the V6 region.
demultiplexed-seqs-V6.qzv (314.2 KB)
rep_seqs_V6.qzv (1.6 MB)
rep_seqs_trimmed220_V6.qzv (1.3 MB)
silva-V6-kmers_map.qzv (1.3 MB)
alignment_V6_220.qza (109.7 KB)