Demultiplexing and Trimming Adapters from Reads with q2-cutadapt


(Matthew Ryan Dillon) #1

:exclamation: :exclamation: :exclamation: NOTE :exclamation: :exclamation: :exclamation:

This tutorial is a work in progress and is currently incomplete. It demonstrates, at a high level, some of the methods in the q2-cutadapt plugin, available as of QIIME 2 2018.2. Please stay tuned for updates as this tutorial is expanded in the coming weeks.

Multiplexed reads that carry their barcodes in the sequence reads can be demultiplexed in QIIME 2 using the q2-cutadapt plugin, which wraps the cutadapt tool. (Multiplexed sequences prepared with the EMP protocol, where barcode reads are in a separate file, can still be demultiplexed with the q2-demux plugin, as before.) The following tutorial uses a toy dataset to illustrate some of the methods in q2-cutadapt.

Download data used in this tutorial

forward.fastq.gz (770 Bytes)
metadata.tsv (53 Bytes)

The data here consists of single-end reads (6 reads total). There are two samples present in the data, with the following barcodes on the 5’ end:

Sample    Barcode

Import the multiplexed sequences

$ qiime tools import \
  --type MultiplexedSingleEndBarcodeInSequence \
  --input-path forward.fastq.gz \
  --output-path multiplexed-seqs.qza

Demultiplex the reads

$ qiime cutadapt demux-single \
  --i-seqs multiplexed-seqs.qza \
  --m-barcodes-file metadata.tsv \
  --m-barcodes-column Barcode \
  --p-error-rate 0 \
  --o-per-sample-sequences demultiplexed-seqs.qza \
  --o-untrimmed-sequences untrimmed.qza
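
To make it clearer what demultiplexing on a 5’ barcode with an error rate of 0 amounts to, here is a rough awk sketch of the idea. It is not how cutadapt is implemented, and the reads, sample IDs, and barcodes below are made up for illustration (they are not the contents of forward.fastq.gz or metadata.tsv). A read whose sequence starts with a known barcode goes to that sample with the barcode removed; everything else lands in an "untrimmed" file.

```shell
# Hypothetical toy data, written to a temporary directory.
workdir=$(mktemp -d)
printf 'sample-id\tBarcode\nsample_a\tAAAA\nsample_b\tCCCC\n' > "$workdir/metadata.tsv"
printf '%s\n' \
  '@read1' 'AAAAGCTACGGGGGGTTT' '+' 'IIIIIIIIIIIIIIIIII' \
  '@read2' 'CCCCGCTACGGGGGGAAA' '+' 'IIIIIIIIIIIIIIIIII' \
  '@read3' 'AAAAGCTACGGGGGGCCC' '+' 'IIIIIIIIIIIIIIIIII' \
  '@read4' 'GGGGGCTACGGGGGGTTT' '+' 'IIIIIIIIIIIIIIIIII' \
  > "$workdir/toy.fastq"

# Exact-match demultiplexing: one output FASTQ per sample, plus untrimmed.fastq.
counts=$(awk -F'\t' -v out="$workdir" '
  NR == FNR { if (FNR > 1) sample[$2] = $1; next }  # metadata: barcode -> sample
  { rec[(FNR - 1) % 4] = $0 }                       # buffer one 4-line FASTQ record
  FNR % 4 == 0 {
    dest = "untrimmed"
    for (bc in sample)
      if (index(rec[1], bc) == 1) {                 # barcode must start the read
        dest = sample[bc]
        rec[1] = substr(rec[1], length(bc) + 1)     # drop barcode from sequence
        rec[3] = substr(rec[3], length(bc) + 1)     # and from quality string
        break
      }
    file = out "/" dest ".fastq"
    printf "%s\n%s\n%s\n%s\n", rec[0], rec[1], rec[2], rec[3] > file
    count[dest]++
  }
  END { for (d in count) print d, count[d] }
' "$workdir/metadata.tsv" "$workdir/toy.fastq" | sort)
echo "$counts"
```

With this toy input, sample_a receives two reads, sample_b one, and one read with no matching barcode ends up in untrimmed.fastq, which mirrors the role of untrimmed.qza in the command above.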

Trim adapters from demultiplexed reads

If the reads contain sequencing adapters or PCR primers that you’d like to remove, you can trim them next as follows.

$ qiime cutadapt trim-single \
  --i-demultiplexed-sequences demultiplexed-seqs.qza \
  --p-front GCTACGGGGGG \
  --p-error-rate 0 \
  --o-trimmed-sequences trimmed-seqs.qza
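
As a rough illustration of what trimming a 5’ adapter with an error rate of 0 does to each read, the sketch below removes the first exact occurrence of the adapter together with any bases before it, and trims the quality string to match. This is a simplification and not cutadapt itself: it ignores partial matches at the start of the read and any mismatch tolerance, and the three reads are made up for illustration.

```shell
adapter=GCTACGGGGGG   # same sequence passed to --p-front above
trimdir=$(mktemp -d)
# Hypothetical demultiplexed reads (barcodes already removed).
printf '%s\n' \
  '@read1' 'GCTACGGGGGGTTT' '+' 'IIIIIIIIIIIIII' \
  '@read2' 'TTGCTACGGGGGGAA' '+' 'IIIIIIIIIIIIIII' \
  '@read3' 'ACGTACGTACGT' '+' 'IIIIIIIIIIII' \
  > "$trimdir/demuxed.fastq"

awk -v adapter="$adapter" '
  FNR % 4 == 2 {                                  # sequence line
    pos = index($0, adapter)                      # first exact match, 0 if absent
    cut = pos ? pos + length(adapter) - 1 : 0     # number of leading bases to drop
    $0 = substr($0, cut + 1)
  }
  FNR % 4 == 0 { $0 = substr($0, cut + 1) }       # quality line: same trim
  { print }
' "$trimdir/demuxed.fastq" > "$trimdir/trimmed.fastq"
cat "$trimdir/trimmed.fastq"
```

Here read1 becomes TTT, read2 becomes AA, and read3, which does not contain the adapter, is left unchanged.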

Summarize demultiplexed and trimmed reads

$ qiime demux summarize \
  --i-data trimmed-seqs.qza \
  --o-visualization trimmed-seqs.qzv
$ qiime tools view trimmed-seqs.qzv

Regarding paired-end reads

  • The import format for paired-end reads with the barcodes still in the sequence is MultiplexedPairedEndBarcodeInSequence, which expects two files in a directory (forward.fastq.gz and reverse.fastq.gz).
  • Demultiplexing currently only works if the barcodes are in the forward reads — we plan to support dual-indexing strategies in a future release of QIIME 2.
  • Demultiplexing is accomplished with the demux-paired command.
  • Filtering/trimming is accomplished with the trim-paired command.
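
Putting those points together, a paired-end run might look roughly like the sketch below. The wrapper function paired_workflow and its arguments are hypothetical, and the flag names (--m-forward-barcodes-file, --m-forward-barcodes-column, --p-front-f, --p-front-r) are what I’d expect from the plugin rather than something shown in this tutorial, so please confirm them with qiime cutadapt demux-paired --help and qiime cutadapt trim-paired --help for your release.

```shell
# Hypothetical wrapper around the paired-end commands named above.
# Arguments (all supplied by the caller):
#   $1 directory containing forward.fastq.gz and reverse.fastq.gz
#   $2 metadata file   $3 barcode column name
#   $4 forward primer  $5 reverse primer
paired_workflow() {
  if [ "$#" -ne 5 ]; then
    echo "usage: paired_workflow <dir> <metadata> <column> <fwd-primer> <rev-primer>" >&2
    return 2
  fi
  # Import, demultiplex on the forward-read barcodes, then trim both reads.
  qiime tools import \
    --type MultiplexedPairedEndBarcodeInSequence \
    --input-path "$1" \
    --output-path multiplexed-seqs.qza &&
  qiime cutadapt demux-paired \
    --i-seqs multiplexed-seqs.qza \
    --m-forward-barcodes-file "$2" \
    --m-forward-barcodes-column "$3" \
    --p-error-rate 0 \
    --o-per-sample-sequences demultiplexed-seqs.qza \
    --o-untrimmed-sequences untrimmed.qza &&
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences demultiplexed-seqs.qza \
    --p-front-f "$4" \
    --p-front-r "$5" \
    --p-error-rate 0 \
    --o-trimmed-sequences trimmed-seqs.qza
}
```

Defining the function runs nothing on its own. You would call it with your own directory, metadata file, barcode column, and primer sequences; each step only runs if the previous one succeeded.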

(Stephanie Hereira) #2

Hi, do you know if the new version of QIIME (2) has a command that handles paired reads, or that can use the barcodes on the forward and reverse primers at the same time to differentiate a sample?

(Matthew Ryan Dillon) #4

Hi @Steph_Hp!

I’m not quite sure what you’re asking for here, but q2-cutadapt supports demultiplexing paired-end reads, and trimming paired-end reads.

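This sounds like dual-indexing. Please see my note above, under "Regarding paired-end reads": demultiplexing currently only works if the barcodes are in the forward reads, and we plan to support dual-indexing strategies in a future release of QIIME 2.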

Hope that answers your questions, if not please let me know! Thanks! :t_rex:

(Nicholas Bokulich) #19

An off-topic reply has been split into a new topic: How to demux dual-indexed reads

Please keep replies on-topic in the future.
