Hello,
I have a dataset of sequences from pollen collected from birds. The samples come from different elevations and seasons, representing four expeditions. We amplified each sample three times, and did this for two markers, ITS-1 and ITS-2. Each sample was amplified with a different forward primer, whereas the reverse primer was always the same. The triplicates were then combined into one sample. Subsequently, we produced partial libraries combining samples from both markers, and these partial libraries were tagged with Illumina tags. From the sequencer I received each library split according to the tag used. Now I want to analyze them, but I am confused about how to build the metadata file (for demultiplexing) needed to split each library back into the original samples.
Can someone help me out with this?
I understand I will have to run the pipeline twice, once for each marker (ITS-1 and ITS-2). What I am having problems with is how to map the sequences to the original samples. Basically, the problem is that I have combined internal barcode indexing (a barcode in the read sequence) with Illumina indexing (in the i5 and i7 indices). This means that the fastq files delivered by the sequencing facility for these libraries are only partially demultiplexed, i.e. they have been demultiplexed according to the Illumina indexing only and must be demultiplexed further.
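To make the second demultiplexing step concrete, here is a minimal Python sketch of what I imagine has to happen to each read. The barcode sequences and sample names are made up for illustration, and the function name is my own. It only does exact matching at the start of the read, so a real tool would presumably also need to tolerate sequencing errors.

```python
from typing import Dict, Optional, Tuple


def assign_by_inline_barcode(
    read: str, barcode_to_sample: Dict[str, str]
) -> Tuple[Optional[str], str]:
    """Return (sample, read with barcode removed), or (None, read) if nothing matches."""
    for barcode, sample in barcode_to_sample.items():
        # The internal barcode is assumed to sit at the very start of the read.
        if read.startswith(barcode):
            return sample, read[len(barcode):]
    return None, read


# Hypothetical barcodes and sample names, not my real ones.
barcodes = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}

print(assign_by_inline_barcode("ACGTACGGGTTTAAAC", barcodes))
print(assign_by_inline_barcode("NNNNNNGGGTTTAAAC", barcodes))
```

The second call shows a read whose start matches no known barcode; such reads would stay unassigned.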
So now I have c. 200 folders (for both R1 and R2), which correspond to the Illumina tags used; however, each of these folders contains sequences from 5 to 6 original samples.
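My guess is that the metadata has to identify each original sample by the combination of Illumina-tag folder, marker, and internal forward barcode. The Python sketch below shows the kind of table I have in mind. The folder names, barcodes, sample names, and column headers are all hypothetical and are not the format required by any particular pipeline.

```python
import csv
import io

# Hypothetical rows: (Illumina-tag folder, marker, internal forward barcode, original sample).
rows = [
    ("lib_001", "ITS-1", "ACGTAC", "bird01_exp1"),
    ("lib_001", "ITS-2", "ACGTAC", "bird01_exp1"),
    ("lib_001", "ITS-1", "TGCATG", "bird02_exp1"),
    ("lib_002", "ITS-1", "ACGTAC", "bird07_exp2"),
]


def build_lookup(rows):
    """Map (library, marker, barcode) to sample, refusing ambiguous combinations."""
    lookup = {}
    for library, marker, barcode, sample in rows:
        key = (library, marker, barcode)
        if key in lookup:
            raise ValueError(f"duplicate key: {key}")
        lookup[key] = sample
    return lookup


def to_tsv(rows):
    """Render the rows as a tab-separated table with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["library", "marker", "forward_barcode", "sample"])
    writer.writerows(rows)
    return buf.getvalue()


lookup = build_lookup(rows)
print(lookup[("lib_002", "ITS-1", "ACGTAC")])
print(to_tsv(rows))
```

The same barcode can appear in different folders, which is why the folder has to be part of the key.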
Thanks