Poll: Tell us about your amplicon sequencing data!

q2d2 · February 13, 2018, 4:40pm

We need your help! In an effort to best serve the QIIME 2 community, we are looking for feedback from users (and potential users) of QIIME 2 about what format their amplicon sequencing data is in when they are ready to import it into QIIME 2.

We put together some possible options:

When I work with single-end read data, I receive fastq data that is...

Already demultiplexed, one .fastq (or .fastq.gz) file per sample
Still multiplexed, EMP protocol format (i.e., there is one sequence read file and one barcode read file)
Still multiplexed, barcodes are in the sequence reads
Still multiplexed, dual barcodes are in the sequence reads
Still multiplexed, 3-file Illumina format (one sequence reads file, an index 1 reads file, and an index 2 reads file)
Still multiplexed, barcodes in sequence read header lines
Something else (feel free to post a description in a reply to this post)

When I work with paired-end read data, I receive fastq data that is...

Already demultiplexed, one forward read and one reverse read .fastq (or .fastq.gz) file per sample
Still multiplexed, EMP protocol format (i.e., there are forward and reverse sequence read files and one barcode read file)
Still multiplexed, barcodes are in the sequence reads
Still multiplexed, dual barcodes are in the sequence reads
Still multiplexed, 4-file Illumina format (one forward reads file, one reverse reads file, an index 1 reads file, and an index 2 reads file)
Still multiplexed, barcodes in sequence read header lines
Something else (feel free to post a description in a reply to this post)

I have the following types of artifacts in my sequences that need to be removed...

Barcodes
Primers or other parts of the sequencing construct
Heterogeneity spacers
Something else (feel free to post a description in a reply to this post)

Thanks for taking the time to help us out, and happy QIIMEing!

Nastassia_Patin · February 15, 2018, 1:09pm

Sometimes there are PhiX reads in the sequence data.

Adam_Rivers · February 15, 2018, 3:10pm

Demultiplexed, interleaved paired-end reads.

Hilary_Morrison · February 20, 2018, 3:52pm

We (MBL) generate amplicon sequencing datasets that are often provided to end users as demultiplexed raw FASTQ files (paired reads). They contain an inline barcode (either 4 or 9 nt at the start of read 1) and 16S primer sequences in both read 1 and read 2. We may also provide QC'd merged amplicon reads with these sequences removed, but those reads are in FASTA format. It would be great if the latter were importable into QIIME. If this is already the case, please let me know...

jairideout · February 20, 2018, 5:04pm

Hi @Hilary_Morrison!

If the demultiplexed and QC'd sequences in the FASTA file have sequence IDs that follow the QIIME 1 demux format, you can import the FASTA file and dereplicate the sequences and/or perform OTU picking using q2-vsearch. Check out the q2-vsearch Community Tutorial for details and example data.
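Before importing, it can help to check that the FASTA headers really follow that ID convention. The snippet below is a minimal sketch, not part of QIIME 2 or the tutorial: it assumes each sequence ID has the form `<sample-id>_<seq-id>`, with the sample ID taken as everything before the last underscore. The function name `check_fasta_ids` and the example records are hypothetical.

```python
import re

# Assumption: IDs look like "<sample-id>_<seq-id>", where the sample ID is
# everything before the LAST underscore (sample IDs may contain underscores).
HEADER_RE = re.compile(r"^(?P<sample>.+)_(?P<seq>[^_\s]+)$")


def check_fasta_ids(lines):
    """Return (sequence count per sample, list of IDs that do not match)."""
    counts = {}
    bad = []
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines, only headers matter here
        fields = line[1:].split()
        seq_id = fields[0] if fields else ""  # ignore any description text
        match = HEADER_RE.match(seq_id)
        if match is None:
            bad.append(seq_id)
        else:
            sample = match.group("sample")
            counts[sample] = counts.get(sample, 0) + 1
    return counts, bad


# Hypothetical records, for illustration only.
example = [
    ">sampleA_1 extra description",
    "ACGT",
    ">sampleA_2",
    "ACGG",
    ">sample_B_1",
    "TTGA",
    ">noseparator",
    "GGCC",
]
counts, bad = check_fasta_ids(example)
print(counts)
print(bad)
```

For a real file, pass an open file handle instead of the list, e.g. `check_fasta_ids(open("seqs.fna"))`, where `seqs.fna` is a placeholder name. Any IDs reported in `bad` would need renaming before the import is likely to work as intended.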