Importing Multiplexed Paired End Sequence With Adaptors --> Demultiplexed Sequences

jgcx · November 17, 2021, 7:47am

Thank you in advance to anyone who is willing to help. I have added an update below.

I am new to QIIME2 and a biologist with a minimal bioinformatics background. I became interested in QIIME2 after a university website-based pipeline crashed, and I am now looking for an alternative for analyzing 16S rDNA sequence data. I want to find a way to perform the same functions that pipeline did using QIIME2. To give some background: the barcoded amplicons were generated via PCRs that contained an adaptor, one of 1-72 different barcoded forward primers (depending on the number of samples), a reverse primer (the same one for all of the barcoded samples), and the usual ingredients (Taq and water). A barcoded amplicon was generated for each sample, and these were pooled into a library that was then sequenced. After the run on our Ion GeneStudio S5 was complete, a raw FASTQ file was generated containing the sequences for every sample. I am interested only in reads that are 265 bp to 285 bp long, nothing more and nothing less.

So far, I have obtained an Ubuntu laptop and gone through the QIIME2 tutorials and other online help to do the following.

Successfully installed QIIME2 and checked that it is up to date and installed correctly
Converted my raw FASTQ file from our Ion GeneStudio S5 machine into a .gz file
Attempted to make a metadata file in Excel that contains my sample IDs and barcodes. The file has two columns, one named sample-id and one named barcode-sequences. The sample IDs are BC_1 through BC_72, and each barcode sequence sits next to its corresponding sample ID.
Converted the text file containing the 4 forward primers into a .gz file via the P7Zip application.
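A note on the metadata file described above: QIIME 2 reads metadata as tab-separated text rather than as an Excel workbook, and the column header has to match exactly what is later passed to --m-barcodes-column. A minimal sketch of such a file, assuming the file name barcodes.tsv and two made-up barcode sequences (neither comes from the original post):

```shell
# Write a tab-separated barcode file. The sample IDs follow the BC_1..BC_72
# naming from the post; the file name and barcode sequences are placeholders.
metadata_file="barcodes.tsv"
{
  printf 'sample-id\tbarcode-sequence\n'
  printf 'BC_1\tACGTACGTAC\n'
  printf 'BC_2\tTGCATGCATG\n'
} > "$metadata_file"

# Sanity check: every row should have exactly two tab-separated fields.
awk -F '\t' 'NF != 2 { bad = 1 } END { exit bad + 0 }' "$metadata_file" \
  && echo "metadata looks well-formed"
```

Exporting from Excel as "Text (Tab delimited)" should give an equivalent file; the check at the end only catches rows with missing or extra columns, not wrong barcodes.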

Now I have questions, and I am stuck.

Will QIIME2 read the 4 forward primers since it is just a text file that was converted into a .gz file?

Since I have the sequence in a text document of the adaptor, do I need to also convert this text file into a .gz file?

Since the same reverse sequence is used for each sample, do I need to make a text file of this sequence and convert it to a .gz file?
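A note on the primer and adaptor questions above: the q2-cutadapt command later in this post passes the adaptor to --p-front as a plain text string, so these sequences do not appear to need files of their own, compressed or otherwise. If the reverse primer is to be trimmed from the 3' end of single-end reads (for example with the --p-adapter option of trim-single), it would normally be given as its reverse complement. A small sketch of that conversion, assuming a made-up primer and a helper name, revcomp, that is purely illustrative:

```shell
# Reverse-complement a DNA sequence, including IUPAC degenerate bases.
# S, W and N are their own complements, so they pass through unchanged.
revcomp() {
  printf '%s\n' "$1" | rev | tr 'ACGTRYKMBVDHacgtrykmbvdh' 'TGCAYRMKVBHDtgcayrmkvbhd'
}

# Placeholder sequence, not the primer used in the original experiment.
reverse_primer="AACCGGTTRY"
revcomp "$reverse_primer"
```

The printed string is what would be pasted after the trimming option, in place of the primer as ordered.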

Once I complete the above, how do I use the commands below with these files, if all or some of the above is needed?

qiime tools import \
  --type MultiplexedPairedEndBarcodeInSequence \
  --input-path muxed-pe-barcode-in-seq \
  --output-path multiplexed-seqs.qza

cutadapt demux

q2-demux and/or q2-cutadapt

dada2

denoise

I have not gotten to the library part of the tutorial yet since I am focused on the demultiplexing first. Eventually, however, I would like to know how to get QIIME2 to use the SILVA SSU database.

----------UPDATE-------

As an update, I have been experimenting with the following commands. Please ignore the Enter Here/Press Enter text. I am still unsure whether I am using the right commands.

I would like to know how to do the following:

Keep only reads that are 265 bp to 285 bp long.
Input the forward primers that come before the barcode primers.
Input the reverse primer.
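For the length window in the first item, one option outside QIIME 2 is to filter a plain FASTQ stream by sequence length. The sketch below is an illustration, not a QIIME 2 feature: the function name filter_fastq_by_length and the file names in the comment are hypothetical, and it assumes standard four-line FASTQ records. It works on FASTQ text, not on .qza artifacts, and it measures whatever is in the record, so it would need to run after barcodes and primers are trimmed if the 265-285 bp window refers to the amplicon itself.

```shell
# Keep only FASTQ records whose sequence is between min and max bases long.
# Assumes plain 4-line records (header, sequence, "+", qualities).
filter_fastq_by_length() {
  min="$1"; max="$2"
  awk -v min="$min" -v max="$max" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 {
      n = length(seq)
      if (n >= min && n <= max) {
        print header; print seq; print plus; print $0
      }
    }
  '
}

# Example with hypothetical file names:
#   gzip -dc trimmed-reads.fastq.gz | filter_fastq_by_length 265 285 | gzip > filtered.fastq.gz
```

Both bounds are inclusive, so a 265 bp read and a 285 bp read are both kept.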

qiime tools import \
  --type MultiplexedSingleEndBarcodeInSequence \
  --input-path "/location of the document/name of computer/file location of the document in the computer/name of the sequencing file.fastq.gz" \
  --output-path multiplexed-seqs.qza
Press Enter
qiime cutadapt demux-single \
  --i-seqs multiplexed-seqs.qza \
  --m-barcodes-file "/location of the document/name of the metadata file containing the barcode sequence information.tsv" \
  --m-barcodes-column barcode-sequence \
  --p-error-rate 0 \
  --o-per-sample-sequences demultiplexed-seqs.qza \
  --o-untrimmed-sequences untrimmed.qza \
  --verbose
Press Enter
qiime cutadapt trim-single \
  --i-demultiplexed-sequences demultiplexed-seqs.qza \
  --p-front CCATCTCATCCCTGCGTGTCTCCGACTCAG \
  --p-error-rate 0 \
  --o-trimmed-sequences trimmed-seqs.qza \
  --verbose
Press Enter
qiime demux summarize \
  --i-data trimmed-seqs.qza \
  --o-visualization trimmed-seqs.qzv
Press Enter
qiime tools view trimmed-seqs.qzv

thermokarst · November 22, 2021, 2:55pm

Hi @jgcx - can you point us to the one main/central question you have here? In general we try to keep this forum to one question per Discourse topic - thanks for your help!

jgcx · November 22, 2021, 9:19pm

Hello @thermokarst - For starters I am using Ion Torrent to obtain sequencing information. Where can I find a tutorial utilizing sequence information from Ion Torrent (Ion GeneStudio S5 Sequencer) to complete demultiplexing a multiplexed FASTQ file?

thermokarst · November 23, 2021, 2:44pm

Hi @jgcx - we don't generally develop platform-specific tutorials. These platforms tend to vary widely. Instead we focus on developing general-purpose tutorials that demonstrate how to import and work with a wide range of data, from many different sources. You can start by reviewing our documentation at docs.qiime2.org. In particular I'll recommend the Overview Tutorial that explains the general steps in a typical analysis. Lastly, there has been a bit of discussion regarding IonTorrent data here on the forum, feel free to search around for that.

system · December 24, 2021, 8:44pm

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.