I’m new to QIIME 2 and I ran into a question while working through the Moving Pictures tutorial.
It’s about the data used by DADA2. The input to DADA2 must meet several criteria, one of which is that non-biological nucleotides (e.g. primers, adapters, linkers, etc.) have been removed.
But in the “Moving Pictures” tutorial, the demultiplexed sequences, with barcodes and primers, are passed directly to DADA2.
I don’t understand why the data can be used this way.
Incorrect. Where do you get this impression? The moving pictures data are in EMP format, which means:
- The barcodes are not present in the sequences, they are present in a separate barcodes file that is discarded after demultiplexing.
- The PCR primers are also used as the sequencing primer and hence are not present in the sequence reads. I just double-checked: the primers are not present in the moving pics fastq sequences.
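If you want to run the same kind of check on your own data, one rough approach is to export the demultiplexed reads (e.g. with `qiime tools export`) and count how many reads start with your forward primer. Here is a minimal sketch. `count_primer_reads` is a made-up helper name, the file name in the usage comment is hypothetical, and the regex is only my assumption of a 515F-style primer with its degenerate bases written out as character classes, so substitute your own primer.

```shell
# Hypothetical helper: count FASTQ reads (read from stdin) whose sequence
# starts with the given primer, written as an extended regular expression.
count_primer_reads() {
  local primer_regex="$1"
  # Assumes a standard 4-line-per-record FASTQ: line 2 of each record is the sequence.
  # grep -c prints 0 but exits non-zero when nothing matches, hence the "|| true".
  awk 'NR % 4 == 2' | grep -c -E "^${primer_regex}" || true
}

# Example usage (file name is a placeholder for one of your exported files):
#   gunzip -c sample_R1.fastq.gz | count_primer_reads 'GTG[CT]CAGC[AC]GCCGCGGTAA'
```

A count at or near zero suggests the primer is not in the reads. A count close to the total number of reads suggests it is, and that it should be trimmed before denoising.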
Perhaps the source of your confusion is that the moving pictures data are in true EMP format (which fits the criteria above). However, not all multiplexed data formats are true EMP (even if some can be demultiplexed with the demux emp-* methods), so maybe you are comparing this to your own data, which are not true EMP? E.g., some formats:
- Contain the barcodes in-line with the sequences (in which case you should use the cutadapt demux-* methods to demultiplex).
- Contain the PCR primers (and sometimes adapters) inside the sequence reads. You can use dada2’s trim parameters, or q2-cutadapt, to trim these prior to denoising.
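For the second case, trimming with q2-cutadapt could look roughly like the sketch below. `trim_primers` is a hypothetical wrapper name and the artifact file names are placeholders. The flags are the ones I know from `qiime cutadapt trim-single`, but check `qiime cutadapt trim-single --help` on your installed version before relying on them.

```shell
# Hypothetical wrapper around q2-cutadapt for single-end data.
# Arguments: <demultiplexed .qza> <forward primer sequence> <output .qza>
trim_primers() {
  local demux_qza="$1" primer="$2" trimmed_qza="$3"
  # --p-front removes the given sequence (and anything before it) from the 5' end of each read.
  qiime cutadapt trim-single \
    --i-demultiplexed-sequences "$demux_qza" \
    --p-front "$primer" \
    --o-trimmed-sequences "$trimmed_qza"
}

# Example usage (file names and primer are placeholders):
#   trim_primers demux.qza GTGCCAGCMGCCGCGGTAA trimmed.qza
```

Alternatively, if the primer has a fixed length and sits at the very start of every read, passing that length to `--p-trim-left` in `qiime dada2 denoise-single` should achieve much the same thing without a separate cutadapt step.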
Thanks for your help. Could you please give me a link describing the EMP format, so I can learn the relevant background?
$ qiime demux emp-single --help
Usage: qiime demux emp-single [OPTIONS]

  Demultiplex sequence data (i.e., map barcode reads to sample ids) for data generated with the Earth Microbiome Project (EMP) amplicon sequencing protocol. Details about this protocol can be found at http://www.earthmicrobiome.org/protocols-and-standards/
Thanks for your help!