Thank you in advance to anyone who is willing to help. I have added an update below.
I am a biologist with a minimal bioinformatics background, and I am new to QIIME2. I became interested in QIIME2 after a university web-based pipeline crashed, and I am now looking for an alternative for analyzing 16S rDNA sequence data that performs the same functions as that pipeline.
Some background: the barcoded amplicons were generated via PCRs that contained an adaptor, 1-72 different barcoded forward primers (depending on the number of samples), a reverse primer (the same one for all of the barcoded samples), and the usual ingredients (Taq and water). The barcoded amplicons from each sample were pooled into a library, which was then sequenced. After the run on our Ion GeneStudio S5 was complete, a raw FASTQ file was generated containing the sequences for every sample. I am only interested in reads from 265 bp to 285 bp, nothing shorter and nothing longer.
So far, I have obtained an Ubuntu laptop and worked through the QIIME2 tutorials and other online help to do the following:
- Installed QIIME2 and checked that it is up to date and installed correctly.
- Converted the raw FASTQ file from our Ion GeneStudio S5 machine into a .gz file.
- Attempted to make a metadata file in Excel that contains my sample IDs and barcodes. It has two columns, one named sample-id and one named barcode-sequences. The sample IDs are BC_1 through BC_72, and each barcode sequence sits next to its corresponding sample ID.
- Converted the text file containing the 4 forward primers into a .gz file with the P7Zip application.
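For reference, the metadata table described in the list above can also be written as a plain tab-separated file instead of going through Excel. This is only a minimal sketch: the function name `write_metadata` and the barcodes shown are hypothetical placeholders, and the column headers simply follow the names given in this post. Whatever header is used for the barcode column has to match, character for character, the value later passed to `--m-barcodes-column`.

```python
import csv
import io


def write_metadata(samples, handle):
    """Write (sample_id, barcode) pairs as a tab-separated metadata table."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Header names follow the post; the first column identifies the sample.
    writer.writerow(["sample-id", "barcode-sequences"])
    for sample_id, barcode in samples:
        # Strip stray whitespace and normalise case, since spreadsheet
        # exports often carry both.
        writer.writerow([sample_id, barcode.strip().upper()])


# Hypothetical placeholder barcodes, NOT the real barcode sequences.
example = [("BC_1", "acgtacgtac"), ("BC_2", "TGCATGCATG ")]
buffer = io.StringIO()
write_metadata(example, buffer)
print(buffer.getvalue(), end="")
```

To write a real file, pass an opened file handle (for example `open("metadata.tsv", "w", newline="")`) instead of the in-memory buffer.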
Now I have questions, and I am stuck.
Will QIIME2 be able to read the 4 forward primers, given that they are in a plain text file that was converted into a .gz file?
I have the adaptor sequence in a text document. Do I also need to convert this text file into a .gz file?
The same reverse primer is used for every sample. Do I need to put its sequence in a text file and convert that to a .gz file?
Once I complete the above, how do I use the commands below with these files, if all or some of the above is needed?
qiime tools import \
  --type MultiplexedPairedEndBarcodeInSequence \
  --input-path muxed-pe-barcode-in-seq \
  --output-path multiplexed-seqs.qza
cutadapt demux
q2-demux and/or q2-cutadapt
denoise
I have not reached the library part of the tutorial yet, since I am focusing on demultiplexing first. Eventually, though, I would like to know how to get QIIME2 to use the SILVA SSU database.
----------UPDATE-------
As an update, I have been experimenting with the following command lines. Please ignore the Enter Here/Press Enter text. I am still unsure whether I am using the right commands.
I would like to know how to do the following:
- How to keep only reads from 265 bp to 285 bp.
- How to input the forward primers that come before the barcode primers.
- How to input the reverse primer.
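To make the 265 bp to 285 bp requirement concrete, here is a minimal sketch of a length filter that runs outside QIIME2 on a gzipped FASTQ file. It is an illustration, not a QIIME2 feature: the function names are hypothetical, and it assumes every record occupies exactly four lines (no wrapped sequences). Note that the length is measured on the read as it is in the file, so whether the barcode and adaptor are still attached changes which reads pass.

```python
import gzip


def filter_by_length(lines, min_len=265, max_len=285):
    """Yield the lines of 4-line FASTQ records whose sequence length
    falls within [min_len, max_len], inclusive."""
    record = []
    for line in lines:
        record.append(line)
        if len(record) == 4:
            # record[1] is the sequence line; strip the trailing newline.
            if min_len <= len(record[1].strip()) <= max_len:
                yield from record
            record = []


def filter_fastq_gz(in_path, out_path, min_len=265, max_len=285):
    """Apply filter_by_length to a .fastq.gz file and write a new one."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(filter_by_length(src, min_len, max_len))
```

A call such as `filter_fastq_gz("reads.fastq.gz", "reads.265-285.fastq.gz")` would then produce a file containing only the reads in that window; both file names here are placeholders.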
qiime tools import \
  --type MultiplexedSingleEndBarcodeInSequence \
  --input-path "/location of the document/name of computer/file location of the document in the computer/name of the sequencing file.fastq.gz" \
  --output-path multiplexed-seqs.qza
Press Enter
qiime cutadapt demux-single \
  --i-seqs multiplexed-seqs.qza \
  --m-barcodes-file "/location of the document/name of the metadata file containing the barcode sequence information.tsv" \
  --m-barcodes-column barcode-sequence \
  --p-error-rate 0 \
  --o-per-sample-sequences demultiplexed-seqs.qza \
  --o-untrimmed-sequences untrimmed.qza \
  --verbose
Press Enter
qiime cutadapt trim-single \
  --i-demultiplexed-sequences demultiplexed-seqs.qza \
  --p-front CCATCTCATCCCTGCGTGTCTCCGACTCAG \
  --p-error-rate 0 \
  --o-trimmed-sequences trimmed-seqs.qza \
  --verbose
Press Enter
qiime demux summarize \
  --i-data trimmed-seqs.qza \
  --o-visualization trimmed-seqs.qzv
Press Enter
qiime tools view trimmed-seqs.qzv
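As a conceptual aid for the trim-single step above, the sketch below shows roughly what removing a 5' adaptor with an error rate of 0 means: if the adaptor occurs in the read, it and everything before it are removed, and the quality string is cut at the same position. This is a simplified illustration with a hypothetical function name, not cutadapt's actual implementation. In particular, it only handles complete, exact occurrences of the adaptor and ignores the partial matches at the start of a read that cutadapt can also detect.

```python
# Adaptor sequence taken from the --p-front value used in this post.
ADAPTER = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"


def trim_front(seq, qual, adapter=ADAPTER):
    """Remove an exact 5' adaptor occurrence and everything preceding it.

    Returns the read unchanged when the adaptor is not found.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    cut = pos + len(adapter)
    # Sequence and quality strings must stay the same length.
    return seq[cut:], qual[cut:]
```

For example, a read consisting of two stray bases, the adaptor, and then `ACGT` would come back as just `ACGT` with a four-character quality string.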