I have some samples that were sequenced for the V4 region of 16S. Before importing them into QIIME 2, I ran FastQC and noticed that the adapter content check raises alerts, with most samples getting a warning for "polyA". Here is the plot from the MultiQC report:
Has anything like this ever happened to you? Is this somewhat expected?
Thanks in advance.
For trimming adapters and primers from raw reads at the beginning of the pipeline, consider the cutadapt plugin.
DADA2 also lets you trim from the start and end of both R1 and R2, so you may be able to handle all of this within the denoising step.
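To make the positional trimming idea concrete, here is a minimal Python sketch of what "trim from the start, truncate at a fixed length" does to a single read. This is not DADA2's code: the function name `trim_and_truncate` and its parameters are made up for illustration, and the behavior is only modeled on DADA2's filtering, where a read is truncated first, reads shorter than the truncation length are dropped, and then bases are removed from the start.

```python
from typing import Optional


def trim_and_truncate(read: str, trim_left: int = 0, trunc_len: int = 0) -> Optional[str]:
    """Illustrative positional trimming of one read (hypothetical helper).

    trunc_len: keep only the first trunc_len bases; 0 means no truncation.
               Reads shorter than trunc_len are discarded (returns None).
    trim_left: number of bases to drop from the 5' end after truncation.
    """
    if trunc_len > 0:
        if len(read) < trunc_len:
            return None  # too short: the read is filtered out
        read = read[:trunc_len]
    return read[trim_left:]


# Toy read: 20 nt of "real" sequence followed by a 10 nt poly-A tail.
toy_read = "ACGT" * 5 + "A" * 10
kept = trim_and_truncate(toy_read, trim_left=4, trunc_len=20)
print(kept, len(kept))
```

In `qiime dada2 denoise-paired` the corresponding settings are `--p-trim-left-f`, `--p-trim-left-r`, `--p-trunc-len-f`, and `--p-trunc-len-r`. Note that positional trimming like this only removes a poly-A stretch if it sits at a consistent position across reads, which would need to be checked first.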
@colinbrislawn thank you for your answer!
The issue is that, according to the genomics facility we used, these sequences already went through an adapter and primer removal step with cutadapt, so I find it very odd that these are still showing up.
Ah. There are many ways to trim and a boatload of settings in cutadapt to match.
While we would hope that the sequencing facility would get all the adapters all the time, it looks like some unexpected stuff remains in your reads.
I don't think trimming poly-A tails is common for amplicons...
Would you expect poly-A tails in a 16S region for biological reasons?
Would you expect poly-A tails on Illumina for technical reasons?
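One way to get a handle on those two questions is to measure how many reads actually end in a long run of A's. Below is a rough Python sketch for that, under some assumptions: the FASTQ files use plain 4-line records, a "tail" means at least 10 consecutive A's at the 3' end (an arbitrary threshold), and the helper names `read_sequences` and `poly_a_fraction` are hypothetical. It is a quick diagnostic, not a replacement for FastQC.

```python
import gzip
import re
from typing import Iterable, Iterator


def read_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def poly_a_fraction(seqs: Iterable[str], min_run: int = 10) -> float:
    """Fraction of reads ending in at least min_run consecutive A's."""
    pattern = re.compile(r"A{%d,}$" % min_run)
    total = 0
    hits = 0
    for seq in seqs:
        total += 1
        if pattern.search(seq.upper()):
            hits += 1
    return hits / total if total else 0.0


# Toy example: only the first of three reads ends in a long poly-A run.
toy_reads = ["ACGT" + "A" * 12, "ACGTACGT", "AAAAC"]
print(poly_a_fraction(toy_reads))
```

On real data this could be called as `poly_a_fraction(read_sequences("sample_R1.fastq.gz"))`, where the file name is a placeholder. If the fraction is high in every sample, including any controls, that would point toward a technical source rather than a biological one.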
Well, when I run these sequences through BLAST, the majority get no hits or only partial, seemingly random ones. So I wouldn't say these appear to be biological.
Regarding technical issues, I am not very knowledgeable on that topic, so I really don't have a hypothesis...