Prepare Metadata for paired-end sequencing data


Can anyone give me an example of how to prepare metadata for paired-end sequencing data? The tutorial only covers single-end sequencing data.



Welcome to the forum, @Huiyun_Wu!

You can find examples of working with paired-end sequence data in a few different places.

Most importantly, keep in mind that you can get help directly from QIIME 2 on the command line:

➜ qiime tools import --help
Usage: qiime tools import [OPTIONS]

  Import data to create a new QIIME 2 Artifact. See
  for usage examples and details on the file types and associated semantic
  types that can be imported.

  --type TEXT             The semantic type of the artifact that will be
                          created upon importing. Use --show-importable-types
                          to see what importable semantic types are available
                          in the current deployment.                [required]
  --input-path PATH       Path to file or directory that should be imported.
  --output-path ARTIFACT  Path where output artifact should be written.
  --input-format TEXT     The format of the data to be imported. If not
                          provided, data must be in the format expected by the
                          semantic type provided via --type.
  --show-importable-types Show the semantic types that can be supplied to
                          --type to import data into an artifact.
  --show-importable-formats
                          Show formats that can be supplied to --input-format
                          to import data into an artifact.
  --help                  Show this message and exit.
➜ qiime tools import --show-importable-types

Note the output of the second command above was truncated for demo purposes.

The “Atacama soil microbiome” tutorial works with paired-end data and is a good next step after working through Moving Pictures.

There is also a section in the metadata tutorial: Importing data — QIIME 2 2020.2.0 documentation
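
For reference, the paired-end import in the “Atacama soil microbiome” tutorial uses the `EMPPairedEndSequences` type, which expects a directory holding `forward.fastq.gz`, `reverse.fastq.gz`, and `barcodes.fastq.gz`. Here is a minimal sketch, assuming that layout. The directory name and the helper function `check_emp_paired_dir` are hypothetical, and the import only runs if `qiime` is on your PATH:

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a directory has the three files
# the EMP paired-end layout expects before attempting an import.
check_emp_paired_dir() {
  local dir="$1" f
  for f in forward.fastq.gz reverse.fastq.gz barcodes.fastq.gz; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
}

# Assumed directory name; replace with your own.
seq_dir="emp-paired-end-sequences"

# Import only when the layout looks right and QIIME 2 is available.
if check_emp_paired_dir "$seq_dir" && command -v qiime >/dev/null 2>&1; then
  qiime tools import \
    --type EMPPairedEndSequences \
    --input-path "$seq_dir" \
    --output-path emp-paired-end-sequences.qza
fi
```

If your data is already demultiplexed, or the barcodes are still inside the reads, a different `--type` is needed; `--show-importable-types` lists the options.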

Let us know if you have any more questions :grinning:


Thanks Andrew! I will try it out!

Something to keep in mind, @Huiyun_Wu, is that there aren’t really any requirements in QIIME 2 regarding sample metadata (all of the old QIIME 1 requirements about linker-primer-sequence et al. are gone):

Most metadata needs will be driven by your specific study. Sample metadata is the “secret sauce” that makes your sequencing data interesting and specific to your study.
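
As a minimal sketch of what such a file can look like: a tab-separated table whose first column holds the sample IDs, plus whatever columns your study needs. The sample IDs, barcodes, and the `treatment` column below are made up for illustration; `barcode-sequence` is the column name used in the tutorial's metadata.

```shell
# Write a small, hypothetical sample metadata file (tab-separated).
# All IDs, barcodes, and treatment values are placeholders.
printf 'sample-id\tbarcode-sequence\ttreatment\n'  > sample-metadata.tsv
printf 'sampleA\tACGTACGTACGT\tcontrol\n'         >> sample-metadata.tsv
printf 'sampleB\tTGCATGCATGCA\ttreated\n'         >> sample-metadata.tsv

# Sanity check: every row should have as many columns as the header.
awk -F'\t' 'NR == 1 { n = NF } NF != n { bad = 1 } END { exit (bad ? 1 : 0) }' \
  sample-metadata.tsv && echo "column counts are consistent"
```

If QIIME 2 is installed, `qiime metadata tabulate --m-input-file sample-metadata.tsv --o-visualization tabulated-metadata.qzv` is one way to confirm that the file parses.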


Hi Andrew,

I was following the “Atacama soil” tutorial. The barcode fastq.gz file is already prepared and easy to download. However, I only have the forward and reverse fastq.gz files available for my project. I have the sample ID and barcode sequence information. Is there a tutorial illustrating how to prepare the barcode fastq.gz file?

Also, when I open the sample-metadata.tsv for this tutorial, I can only find one column, “barcode-sequence”. Where is the other barcode of the pair? Is it listed under a different sample ID? In other words, are BAQ1370.1.2, BAQ1370.1.3, BAQ1370.3 the same sample but with different barcodes?

Thanks a lot!

It sounds like you might have multiplexed paired-end reads with the barcodes still in the reads, which means there is no need to create a barcodes.fastq.gz. However, without more information, we can't be sure. If the barcodes are in fact in your reads, you can use this tutorial as a guide:

I recently answered a question that spells out how the mapping between barcodes and sequences works here:

Although that post is about single-end sequences, the same logic applies to paired-end data.
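
One rough way to check whether the barcodes are still in your reads is to count how many forward reads begin with each barcode listed in your metadata. This is only a sketch under assumptions: the barcode sits at the very start of the read with no spacer, the metadata has the sample ID in column 1 and the barcode in column 2, and the function name `count_barcode_hits` and the file names are hypothetical.

```shell
# Hypothetical helper: for each sample in a metadata TSV, count the reads in a
# gzipped FASTQ whose sequence starts with that sample's barcode.
# Usage: count_barcode_hits sample-metadata.tsv forward.fastq.gz
count_barcode_hits() {
  local metadata="$1" fastq="$2"
  gzip -dc "$fastq" | awk -F'\t' '
    # First input (metadata): map barcode -> sample ID, skipping the header
    # row and any comment or directive rows that start with "#".
    FNR == NR {
      if (FNR > 1 && $1 !~ /^#/) sample[$2] = $1
      next
    }
    # Second input (FASTQ on stdin): the sequence is line 2 of each
    # 4-line record.
    FNR % 4 == 2 {
      for (b in sample) if (index($0, b) == 1) hits[b]++
    }
    # Report sample ID, barcode, and hit count (row order is not guaranteed).
    END {
      for (b in sample) printf "%s\t%s\t%d\n", sample[b], b, hits[b]
    }
  ' "$metadata" -
}
```

If most reads match some barcode, the barcodes are probably still in the reads. Zero hits everywhere is not conclusive: the barcodes might be reverse-complemented, sit in the reverse read, or have been removed already.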

