Illumina data import with sequence as fastq.gz, primer and barcodes as txt files

Hi Users,
I have qiime2-2019.4 installed through Miniconda on Linux Mint 19.1.
I have sequences of bacterial V3-V4 regions. The outsourcing company provided the forward and reverse sequences as fastq.gz files, and the primers and barcodes as text (shown at the end of this post). I also asked them for the barcodes in fastq.gz format, but they are reluctant to provide them. I am a novice at this and have not been able to get past the first step using the import methods from the tutorials (Moving Pictures or FMT).

I would appreciate your help with:

  1. How to import the data
  2. The subsequent workflow, such as whether and how to remove the barcode and primer sequences

Waiting for your kind response…

View of the forward file in less:

@700823F:460:HTJ2MBCX2:2:1101:4501:2076 1:N:0:GAGATTCC+GTACTGAC
ACGCCGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAGTGATGAAGGCTTTCGGGTCGTAAAACTCTGTTGTTAGGGAAGAACAAGTACAAGAGTAACTGCTTGTACCTTGACGGTACCTAACCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGTT
+
<<.<GAGGIIIIIIIIIGGIIIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIIIIIIIIIIIIIIGGGGGGGIGIIIGGIIGIGIGGIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIGGGGGIIGIIIIIIIIGIIIGIIGIIIGIIIGIIGIGGIIIIIIGIIIIIGIIIIGGGGIGIIIIIIIIIIIIGGGGGGGGIIIIIGGIGGIIGIGGGIIIIIIGIIIIIIIGGGGGGIIGG
@700823F:460:HTJ2MBCX2:2:1101:6407:2205 1:N:0:GAGATTCC+GTACTGAC
GACTACTGGGGTATCTAATCCTGTTTGATCCCCACGCTTTCGTGCCTCAGCGTCAATCATACTTTAGTAAGCTGCCTTCGCAATTGGTGTTCTGTGACATATCTATGCATTTCACCGCTACTTGTCACATTCCGCCTACCTCAAGTACATTCAAGCCTATCAGTATCAAAGGCACTGCGATAGTTAAGCTACCGTCTTTCACCCCTGACTTAATAGGCCGCCTACGCACCCTTTAAACCCAATAAATCCG
+
GGGGGGAGGGIIIIIIGGIIGIIIIIGGGIGIGGI<GGIIIGGGGIIIIGIGGGGGIIIGGGGIGIGGAGAGGGGGAGGGGGAGGIGGGGGAGGIGGGIIGGGIIIIGGIIIGGIGIIIGIIIGGGGGGIGIIGGIGGGGIGGAGGGIGGAGGAGGGGIGGGIIGGGGGGGGG.AGAGGGGIIIIIGGGGGIIIIIII.GGGGGGGAA.AGAGGGAGGGGGAAGGGGIIAGGGGGGGGI.7GGIIGGAAG

View of the reverse file in less:
@700823F:460:HTJ2MBCX2:2:1101:4501:2076 1:N:0:GAGATTCC+GTACTGAC
ACGCCGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAGTGATGAAGGCTTTCGGGTCGTAAAACTCTGTTGTTAGGGAAGAACAAGTACAAGAGTAACTGCTTGTACCTTGACGGTACCTAACCAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGTT
+
<<.<GAGGIIIIIIIIIGGIIIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIIIIIIIIIIIIIIGGGGGGGIGIIIGGIIGIGIGGIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIGIIIIIIIGGGGGIIGIIIIIIIIGIIIGIIGIIIGIIIGIIGIGGIIIIIIGIIIIIGIIIIGGGGIGIIIIIIIIIIIIGGGGGGGGIIIIIGGIGGIIGIGGGIIIIIIGIIIIIIIGGGGGGIIGG
@700823F:460:HTJ2MBCX2:2:1101:6407:2205 1:N:0:GAGATTCC+GTACTGAC
GACTACTGGGGTATCTAATCCTGTTTGATCCCCACGCTTTCGTGCCTCAGCGTCAATCATACTTTAGTAAGCTGCCTTCGCAATTGGTGTTCTGTGACATATCTATGCATTTCACCGCTACTTGTCACATTCCGCCTACCTCAAGTACATTCAAGCCTATCAGTATCAAAGGCACTGCGATAGTTAAGCTACCGTCTTTCACCCCTGACTTAATAGGCCGCCTACGCACCCTTTAAACCCAATAAATCCG
+
GGGGGGAGGGIIIIIIGGIIGIIIIIGGGIGIGGI<GGIIIGGGGIIIIGIGGGGGIIIGGGGIGIGGAGAGGGGGAGGGGGAGGIGGGGGAGGIGGGIIGGGIIIIGGIIIGGIGIIIGIIIGGGGGGIGIIGGIGGGGIGGAGGGIGGAGGAGGGGIGGGIIGGGGGGGGG.AGAGGGGIIIIIGGGGGIIIIIII.GGGGGGGAA.AGAGGGAGGGGGAAGGGGIIAGGGGGGGGI.7GGIIGGAAG

Barcodes and forward/reverse primer sequences:
Library ID | Sample name | Index 1 | Index 2
LIB32530 | YR2-CR2-P1 | GAGATTCC | GTACTGAC

V3 Forward : CCTACGGGNBGCASCAG
V4 Reverse : GACTACNVGGGTATCTAATCC


Hi @ashisroybarman,
Welcome to the forum!
Thank you for providing the details of your problem; they are very helpful for troubleshooting.
Can you clarify one more thing for us: do you have a single set of forward and reverse fastq files containing all of your samples (multiplexed), or a forward and a reverse file for every sample (demultiplexed)? From the example you sent, it looks as though the barcodes have already been removed from the reads and placed in the header line (e.g. GAGATTCC+GTACTGAC).
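If it helps, here is a minimal sketch of one way to check this yourself rather than by eye. It pulls the index pair out of a header line like the ones quoted above and tests whether a read begins with a given primer, treating the degenerate bases (N, B, S, V, ...) as IUPAC codes. The function names are made up for illustration and are not part of QIIME 2, and the check only looks for an exact match at the very start of the read.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def index_pair_from_header(header):
    """Return (index1, index2) from a Casava 1.8 style FASTQ header line."""
    # The part after the space looks like "1:N:0:GAGATTCC+GTACTGAC".
    comment = header.split(" ", 1)[1]
    index_field = comment.split(":")[3]
    index1, _, index2 = index_field.partition("+")
    return index1, index2


def starts_with_primer(read, primer):
    """True if the read begins with the (possibly degenerate) primer."""
    if len(read) < len(primer):
        return False
    return all(base in IUPAC[code] for base, code in zip(read, primer))


# Second record quoted in the question (sequence truncated for brevity).
header = "@700823F:460:HTJ2MBCX2:2:1101:6407:2205 1:N:0:GAGATTCC+GTACTGAC"
read = "GACTACTGGGGTATCTAATCCTGTTTGATCCCCACGCTTTCG"

print(index_pair_from_header(header))
print(starts_with_primer(read, "GACTACNVGGGTATCTAATCC"))  # V4 reverse primer
```

For that second record the check against the V4 reverse primer comes back true, so it may be worth running something like this over the real files to see whether primers are still present at the start of the reads.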


Hi Mehrbod_Estaki,
Many thanks for your quick reply, and sorry for my late response.
I have a forward and a reverse file for every sample: 14 files altogether for 7 soil samples. Regarding your question, I am new to this and not sure whether they have been demultiplexed.
I want to analyze and compare two groups (4 soil samples in one group and 3 in the other) from two completely different experiments.
Best regards,
Ashis

I have also tried importing my R1 and R2 files by following the “Casava 1.8 paired-end demultiplexed fastq” section of the importing tutorial (URL: https://docs.qiime2.org/2019.7/tutorials/importing/#sequence-data-with-sequence-quality-information-i-e-fastq), but it produced the following error:

"There was a problem importing casava-18-paired-end-demultiplexed:

Missing one or more files for CasavaOneEightSingleLanePerSampleDirFmt: '.+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz'".
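The pattern in that error message describes the file names the Casava 1.8 importer expects. As a rough sketch, the same pattern can be used to test file names before importing. The two example names below are hypothetical, and requiring the whole name to match is an assumption about how the pattern is applied.

```python
import re

# Pattern copied from the error message above.
CASAVA_PATTERN = re.compile(r".+_.+_L[0-9][0-9][0-9]_R[12]_001\.fastq\.gz")


def is_casava_name(filename):
    """True if the whole file name fits the Casava 1.8 naming pattern."""
    return CASAVA_PATTERN.fullmatch(filename) is not None


# Hypothetical file names, for illustration only.
for name in ["YR2-CR2-P1_S1_L001_R1_001.fastq.gz", "YR2-CR2-P1_R1.fastq.gz"]:
    print(name, is_casava_name(name))
```

Names that fail this check would either need renaming to the sample_barcode_lane_read_001.fastq.gz layout or a different import format.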

Hi @ashisroybarman,
Since you have a separate forward and reverse file for each sample, these reads have already been demultiplexed, and from the looks of your reads the facility has already removed the primers and barcodes. They have left the barcodes in the header line of each read for reference. How nice of them. That means you don’t need to worry about any of that, so you can simply import your reads and follow along with one of the many QIIME 2 tutorials. I recommend starting with the Moving Pictures tutorial.

As for importing, you’re going to want to use the manifest-format import; the Casava 1.8 format only works if your fastq file names follow the Casava 1.8 naming convention. If you’re not sure, just use the manifest instead, since it places no restrictions on naming.

Hope this helps. I know things can seem tedious when starting out, but it does get much, much easier once you start understanding the basic concepts. Good luck!
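For anyone landing here later, a manifest for paired-end data can be written by hand or generated with a short script. The sketch below assumes the files sit in one directory and are named like SAMPLE_R1.fastq.gz and SAMPLE_R2.fastq.gz. That naming is a guess, as is the build_manifest helper itself, so adjust the tags to match your files. It writes the comma-separated layout (sample-id, absolute-filepath, direction) described in the importing tutorial for the PairedEndFastqManifestPhred33 format.

```python
import csv
from pathlib import Path


def build_manifest(read_dir, manifest_path, fwd_tag="_R1", rev_tag="_R2"):
    """Write a paired-end fastq manifest and return the rows written."""
    read_dir = Path(read_dir).resolve()  # the manifest needs absolute paths
    rows = []
    for path in sorted(read_dir.glob("*.fastq.gz")):
        name = path.name[: -len(".fastq.gz")]
        if name.endswith(fwd_tag):
            sample, direction = name[: -len(fwd_tag)], "forward"
        elif name.endswith(rev_tag):
            sample, direction = name[: -len(rev_tag)], "reverse"
        else:
            continue  # skip files that do not follow the assumed naming
        rows.append((sample, str(path), direction))
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sample-id", "absolute-filepath", "direction"])
        writer.writerows(rows)
    return rows
```

Calling build_manifest("reads", "manifest.csv") should then allow an import along the lines of `qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path manifest.csv --output-path paired-end-demux.qza --input-format PairedEndFastqManifestPhred33`. The quality characters in the excerpts above (for example `.` and `7`) are consistent with Phred33 encoding.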

A post was split to a new topic: Need help importing a manifest format

Hi Mehrbod_Estaki,
Thank you for your valuable suggestions.


11 posts were split to a new topic: Strange dependencies in my QIIME 2 environment
