I have reads with barcodes in the sequence. The sequencing facility told me that all adapters had been removed, but the FastQC report shows that they have not. Illumina Universal Adapter and polyG sequences are detected in both the forward and the reverse reads.
I demultiplexed my reads with cutadapt:
qiime cutadapt demux-paired
I then tried to remove the remaining adapters with cutadapt as well. After some experimenting, I was able to remove the Illumina adapters and polyG with this command, but only from the forward reads:
qiime cutadapt trim-paired
For --p-anywhere-r I tried the original, the reverse, the reverse complement, and the complement of the Illumina universal adapter sequence, but none of them removed the adapters from the reverse reads.
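For reference, the four orientations mentioned above can be generated on the command line instead of being typed by hand, which rules out transcription errors. This is only a minimal sketch: it assumes a POSIX shell with `rev` and `tr` available, and the function name `orientations` is made up for illustration.

```shell
# Hypothetical helper: print the four orientations of a DNA sequence.
# Assumes only the unambiguous bases A, C, G, T (upper or lower case).
orientations() {
    s=$1
    printf 'original\t%s\n' "$s"
    printf 'reverse\t%s\n' "$(printf '%s\n' "$s" | rev)"
    printf 'complement\t%s\n' "$(printf '%s\n' "$s" | tr 'ACGTacgt' 'TGCAtgca')"
    printf 'reverse_complement\t%s\n' "$(printf '%s\n' "$s" | rev | tr 'ACGTacgt' 'TGCAtgca')"
}
```

Calling `orientations` with an adapter sequence prints one tab-separated, labelled line per orientation, which can then be pasted into the trimming command.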
These are the multiqc plots of the demultiplexed reads before and after trimming.
Do you have any idea how I can remove these adapters from the reverse reads?
I am completely out of ideas, so any help is really appreciated!
I have reads with barcodes in the sequence. The sequencing facility told me that all adapters had been removed, but the FastQC report shows that they have not.
They said that the barcodes were removed or that the adapters were removed? Or are you using those two terms to refer to the barcodes? They removed the adapters (not barcodes) but did not demultiplex?
I'm not familiar with the figures you shared. Do they show that an adapter sequence you told the tool to look for is in your reads? Do the Illumina adapter sequences that you shared come from the library prep protocol or from somewhere else?
Thank you for the quick response, and sorry for not being clear on that.
I do distinguish between adapters and barcodes: adapters are the pieces required and attached by the sequencing facility, and barcodes are part of my primers and distinguish my samples.
Yes, exactly: they told me they had removed the adapters and that I only had to demultiplex the reads. But when I checked the demultiplexed reads with FastQC, every file got a red warning for adapter content.
The figures were only meant to show that, after trimming, the adapters remain only in the reverse reads.
Originally I looked at the FastQC reports, which told me that the adapter sequences found are Illumina universal adapters. I googled the sequences because I couldn't get ahold of the sequencing facility, and eventually I found the sequence used in the command above (--p-anywhere-f AGATCGGAAGAGCACACGTCTGAACTCCAGTCA).
The only remaining problem is that I don't know which sequence, parameter, or other option to use to get rid of these adapters in the reverse reads.
Here are the pictures of the FastQC output for a single sample (R1 and R2) before and after trimming with cutadapt:
R1 after trimming:
R1 before trimming:
R2 before trimming:
R2 after trimming:
Is the following correct? FastQC (this is the software you're using, right?) has a built-in set of adapter sequences that it searches for by default, and it found the "Illumina Universal Adapter" and "PolyG" sequences in your forward and reverse reads. It doesn't tell you what the actual sequence is, so you searched online for them, and what you found is what you provided to cutadapt. Cutadapt successfully removed the adapter sequences from the forward reads but failed to remove them from the reverse reads.
If this is the case I think it's safe to say that you're simply searching for the wrong adapter sequence in the reverse reads (whatever you found online was incorrect).
I know you said that you couldn't get ahold of the sequencing center, but I would try again because it really is their responsibility to provide this information. Thanks to their negligence, your analysis stalled before you could even get started.
Otherwise, maybe try some more adapter sequences by trial and error. From a quick Google search for "fastqc illumina universal adapter", there seem to be different answers as to which sequence(s) are actually used.
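The trial and error can be made cheaper by counting, before any trimming, how many reverse reads contain each candidate sequence. The following is a rough sketch, not a definitive method: it assumes standard 4-line FASTQ records, matches exactly (no mismatches), and the function names `count_hits` and `scan_candidates` are hypothetical.

```shell
# Hypothetical helper: count reads in a FASTQ file that contain a sequence.
# Usage: count_hits reads.fastq[.gz] SEQUENCE
# Assumes standard 4-line FASTQ records and exact (mismatch-free) matching.
count_hits() {
    case $1 in
        *.gz) gzip -cd "$1" ;;
        *)    cat "$1" ;;
    esac | awk -v a="$2" 'NR % 4 == 2 && index($0, a) > 0 { n++ } END { print n + 0 }'
}

# Hypothetical helper: report the hit count for each candidate sequence.
# Usage: scan_candidates reads.fastq[.gz] SEQ1 SEQ2 ...
scan_candidates() {
    file=$1
    shift
    for cand in "$@"; do
        printf '%s\t%s\n' "$cand" "$(count_hits "$file" "$cand")"
    done
}
```

For example, `scan_candidates sample_R2.fastq.gz AGATCGGAAGAGC AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT` (the file name is a placeholder) checks two candidates. As far as I know, the first is the short sequence FastQC matches for its "Illumina Universal Adapter" entry, and the second is the read 2 adapter Illumina publishes for TruSeq libraries. Whether either applies to this library prep is an assumption that the sequencing facility would have to confirm.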
Typically in our lab, we use tools such as BBDuk or Trimmomatic for adapter removal, which can iterate through a list of adapter sequences (I'm not sure whether this can be done with cutadapt). Once this process is complete, we import the filtered reads into QIIME2.
I recommend giving BBDuk a try on the demultiplexed reads (export or unzip demultiplexed-seqs-V45-0.25.qza), as it includes a predefined list of adapter sequences.
bbduk.sh in1=raw_reads/R1.fastq.gz in2=raw_reads/R2.fastq.gz out1=bb_out/R1.fastq out2=bb_out/R2.fastq ref=bb_adapters.fa k=17 mink=7 ktrim=rl hdist=1 qtrim=r trimq=20 minlen=100 tpe tbo
Adjust the other parameters as required.
In the predefined list, I was able to locate only one of the adapter sequences you mentioned: "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA". The other one was not present. As mentioned above, this could be the reason the adapters were not removed from the reverse reads.
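If a sequence turns out to be missing from the reference list, it can be checked for and appended before running BBDuk. This is a sketch under assumptions: the helper names `has_adapter` and `add_adapter` are invented for illustration, and each adapter in the FASTA file is assumed to sit on a single line.

```shell
# Hypothetical helper: succeed if SEQUENCE occurs in the adapter FASTA file.
# Assumes each adapter sits on a single line (no wrapped FASTA records).
# This is a substring match, so a short sequence contained in a longer
# adapter counts as present.
has_adapter() {
    grep -v '^>' "$1" | grep -q -i -F -- "$2"
}

# Hypothetical helper: append a record unless the sequence is already there.
# Usage: add_adapter FILE NAME SEQUENCE
add_adapter() {
    if has_adapter "$1" "$3"; then
        echo "already present: $3"
    else
        printf '>%s\n%s\n' "$2" "$3" >> "$1"
        echo "added: $2"
    fi
}
```

The extended file can then be passed to BBDuk through the same `ref=` argument used in the command above.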
Predefined List of Adapters - If it is helpful