demux.qzv (292.6 KB)
trimmed_sequences.qzv (296.8 KB)
Hello,
I have paired-end, multiplexed sequences with barcodes. Illumina adapters were ligated on after PCR, so some of the reads in the R1 file are reverse reads that contain no barcode, and some of the reads in the R2 file are forward reads that do contain a barcode.
Based on advice from this forum, I concatenated R1 + R2 into “forward” reads and R2 + R1 into “reverse” reads as follows:
cat Mary-RS-16s-72314_S1_L001_R1_001.fastq Mary-RS-16s-72314_S1_L001_R2_001.fastq > forward.fastq
cat Mary-RS-16s-72314_S1_L001_R2_001.fastq Mary-RS-16s-72314_S1_L001_R1_001.fastq > reverse.fastq
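A quick sanity check on the two concatenated files might look like the sketch below. It is not part of the workflow I actually ran; the function names `count_fastq_reads` and `same_read_order` are made up for illustration, and it assumes plain (uncompressed) FASTQ with exactly four lines per record and Illumina-style headers where the read ID is the first whitespace-delimited field.

```shell
# Count FASTQ records, assuming exactly four lines per record
# (no line-wrapped sequences).
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Succeed only if both files list the same read IDs in the same order.
# Only the first whitespace-delimited field of each header line is used,
# so a trailing "1:N:0:..." / "2:N:0:..." comment is ignored.
# The ID lists are compared through their checksums to avoid holding
# them in memory.
same_read_order() {
    sum_a=$(awk 'NR % 4 == 1 { print $1 }' "$1" | cksum)
    sum_b=$(awk 'NR % 4 == 1 { print $1 }' "$2" | cksum)
    [ "$sum_a" = "$sum_b" ]
}

# Example usage with the files produced by the cat commands above:
#   count_fastq_reads forward.fastq
#   count_fastq_reads reverse.fastq
#   same_read_order forward.fastq reverse.fastq && echo "pairs in sync"
```

If the two counts differ, or the read order does not match, the paired import would presumably be pairing the wrong reads with each other.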
I then imported and demultiplexed these data using the following commands:
qiime tools import --type MultiplexedPairedEndBarcodeInSequence --input-path ~/Riley/AXRS/Fastq --output-path ~/Riley/AXRS/Fastq/multiplexed-seqs.qza
qiime cutadapt demux-paired \
  --i-seqs ~/Riley/AXRS/Fastq/multiplexed-seqs.qza \
  --m-forward-barcodes-file ~/Riley/AXRS/RS_mappingfile_SCFAs_completed_study_only_w_responders.tsv \
  --m-forward-barcodes-column BarcodeSequence \
  --p-error-rate 0 \
  --o-per-sample-sequences ~/Riley/AXRS/Demux-0/demux.qza \
  --output-dir ~/Riley/AXRS/output-dir-0
I tried running DADA2 denoising, but some of the sequences in the rep-seqs file generated by this step still seemed to contain barcodes and primer sequences. I therefore took the output from the demux-paired command and used the cutadapt trim-paired command to find and remove the adapters:
qiime cutadapt trim-paired \
  --i-demultiplexed-sequences ~/Riley/AXRS/Demux-0/demux.qza \
  --p-front-f GTGTGCCAGCMGCCGCGGTAA \
  --p-error-rate 0 \
  --output-dir ~/Riley/AXRS/Demux-0/CutAdapt
I looked at the raw output of this step and it appeared to be successful (I couldn't find the adapters in the sequences). I then summarized the trimmed sequences:
qiime demux summarize \
  --i-data ~/Riley/AXRS/Demux-0/CutAdapt-0/trimmed_sequences.qza \
  --o-visualization ~/Riley/AXRS/Demux-0/CutAdapt-0/trimmed_sequences.qzv
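The manual check for leftover adapters could be scripted roughly as follows. This is only a sketch: `count_primer_hits` is a hypothetical helper name, and it assumes the trimmed reads have already been exported and decompressed to a plain four-line-per-record FASTQ file. The degenerate base M in the `--p-front-f` sequence stands for A or C, so it is written as `[AC]` in the pattern. Only the orientation given in `--p-front-f` is searched; the reverse complement is not checked.

```shell
# Count reads whose sequence line contains the primer passed to --p-front-f.
# IUPAC code M (A or C) is expressed as the character class [AC].
# grep -c exits non-zero when nothing matches, hence the "|| true".
count_primer_hits() {
    awk 'NR % 4 == 2' "$1" \
        | grep -c -E 'GTGTGCCAGC[AC]GCCGCGGTAA' || true
}

# Example usage (sample.fastq is a placeholder file name):
#   count_primer_hits sample.fastq
```

A result of 0 for a sample would be consistent with the primer having been removed from every read in that file, at least in this orientation.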
I then tried running DADA2 denoising on these trimmed sequences as follows, but it just gets stuck at “Denoise remaining samples”:
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs ~/Riley/AXRS/Demux-0/CutAdapt-0/trimmed_sequences.qza \
  --p-trim-left-f 21 \
  --p-trunc-len-f 205 \
  --p-trim-left-r 21 \
  --p-trunc-len-r 206 \
  --o-representative-sequences ~/Riley/AXRS/DADA2-0/rep-seqs-dada2.qza \
  --o-table ~/Riley/AXRS/DADA2-0/table-dada2.qza \
  --output-dir ~/Riley/AXRS/DADA2-0 \
  --verbose
R version 3.4.1 (2017-06-30)
Loading required package: Rcpp
DADA2 R package version: 1.6.0
1) Filtering ........................................................................
2) Learning Error Rates
2a) Forward Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 509215 reads in 113988 unique sequences.
Sample 2 - 111937 reads in 29605 unique sequences.
Sample 3 - 114140 reads in 31532 unique sequences.
Sample 4 - 423736 reads in 99168 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
selfConsist step 7
Convergence after 7 rounds.
2b) Reverse Reads
Initializing error rates to maximum possible estimate.
Sample 1 - 509215 reads in 99541 unique sequences.
Sample 2 - 111937 reads in 28305 unique sequences.
Sample 3 - 114140 reads in 28180 unique sequences.
Sample 4 - 423736 reads in 114631 unique sequences.
selfConsist step 2
selfConsist step 3
selfConsist step 4
selfConsist step 5
selfConsist step 6
Convergence after 6 rounds.
3) Denoise remaining samples ......................
I have attached the qzv files from before and after trimming for reference. Any advice on what I'm doing wrong?
Thank you,
Riley