Poll: Tell us about your amplicon sequencing data!

We need your help! To best serve the QIIME 2 community, we are looking for feedback from current and potential users about the format their amplicon sequencing data is in when they are ready to import it into QIIME 2.

We put together some possible options:

When I work with single-end read data, I receive fastq data that is…

  • Already demultiplexed, one .fastq (or .fastq.gz) file per sample
  • Still multiplexed, EMP protocol format (i.e., there is one sequence read file and one barcode read file)
  • Still multiplexed, barcodes are in the sequence reads
  • Still multiplexed, dual barcodes are in the sequence reads
  • Still multiplexed, 3-file Illumina format (one sequence reads file, an index 1 reads file, and an index 2 reads file)
  • Still multiplexed, barcodes in sequence read header lines
  • Something else (feel free to post a description in a reply to this post)

When I work with paired-end read data, I receive fastq data that is…

  • Already demultiplexed, one forward read and one reverse read .fastq (or .fastq.gz) file per sample
  • Still multiplexed, EMP protocol format (i.e., there are forward and reverse sequence read files and one barcode read file)
  • Still multiplexed, barcodes are in the sequence reads
  • Still multiplexed, dual barcodes are in the sequence reads
  • Still multiplexed, 4-file Illumina format (one forward reads file, one reverse reads file, an index 1 reads file, and an index 2 reads file)
  • Still multiplexed, barcodes in sequence read header lines
  • Something else (feel free to post a description in a reply to this post)

I have the following types of artifacts in my sequences that need to be removed…

  • Barcodes
  • Primers or other parts of the sequencing construct
  • Heterogeneity spacers
  • Something else (feel free to post a description in a reply to this post)

Thanks for taking the time to help us out, and happy QIIMEing! :sun_with_face:

Sometimes there are PhiX reads in the sequence data.

Demultiplexed, interleaved paired-end reads.

We (MBL) generate amplicon sequencing datasets that are often provided to end users as demultiplexed raw FASTQ files (paired reads). They contain an inline barcode (either 4 or 9 nt at the start of read 1) and 16S primer sequences in both read 1 and read 2. We may also provide QC'd merged amplicon reads with these sequences removed, but those reads are then in FASTA format. It would be great if the latter were importable into QIIME 2. If this is already the case, please let me know…

Hi @Hilary_Morrison!

If the demultiplexed and QC'd sequences in the FASTA file have sequence IDs that follow the QIIME 1 demux format, you can import the FASTA file and dereplicate the sequences and/or perform OTU picking using q2-vsearch. Check out the q2-vsearch Community Tutorial for details and example data.
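
Before importing, it can help to check that the sequence IDs look right. Here is a minimal Python sketch, assuming the QIIME 1 demux convention is that the first whitespace-delimited token of each FASTA header has the form `<sample-id>_<seq-id>`, with the rightmost underscore separating the two parts. The function names `split_qiime1_id` and `check_fasta_ids` are hypothetical helpers for illustration, not part of QIIME 2.

```python
def split_qiime1_id(header):
    """Split a FASTA header line into (sample_id, seq_id).

    Assumption: the record ID is the first whitespace-delimited token and
    follows '<sample-id>_<seq-id>', split on the rightmost underscore.
    Returns None if the header does not match that pattern.
    """
    if not header.startswith(">"):
        return None
    tokens = header[1:].split()
    if not tokens:
        return None
    sample_id, sep, seq_id = tokens[0].rpartition("_")
    if not sep or not sample_id or not seq_id:
        return None
    return sample_id, seq_id


def check_fasta_ids(lines):
    """Return the header lines that do not match the assumed pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if line.startswith(">") and split_qiime1_id(line) is None
    ]
```

For example, calling `check_fasta_ids(open("seqs.fna"))` on your file (the filename here is only a placeholder) returns a list of the headers that would need renaming; an empty list means every header fits the assumed pattern.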