I have the following issue. I am analysing 16S rRNA 250bp paired end mouse gut microbiome sequencing data (V3-V4 region) that have been obtained using Nextera tagmentation technology. Both forward and reverse read quality looks ok to me (see
. The problem I have is with the Nextera adapters. If I run fastqc on my samples I get the following picture for adapters’ content (two examples):
, where you can see that adapter content throughout my reads is high (varies from 20 to 50% depending on sample, on average probably around 30%). Naturally, after running cutadapt a good fraction of my reads become shorter than 250bp and thus after de-noising with dada2 I get a lot of losses if I try to use trunc-len=250 for both forward and reverse reads (
, where I trimmed the first 10bp of each read; I also increased –-p-max-ee to 5 to try and keep as many reads as possible). My first question is: does anybody have experience with Nextera adapters? Is this usual behavior? I have seen a previous post about this (nextera transposase contamination
) but the fraction of reads carrying adapters appeared to be much lower than what I find.
Independent on the Nextera issue I do think my data are still usable, however, I also believe that I need to find a way of keeping more of the reads after filtering and denoising. Given that I have very little wiggle room for cutting if I want my forward and reverse reads to merge I concluded that my best option would be to use only the forward reads and cut them to a length that allows me to keep for most samples >50% of them. This seems to work nicely in terms of sheer volume of reads that are retained (
), however, I wonder how short is too short for cutting my reads. I know there is no absolute answer to that but I would like to get a feel for what sounds ok and what is definitely too short. Is 200bp ok? Could I potentially go shorter? I guess it’s a balancing act between keeping as many reads as I can and keeping the reads as long as possible to get enough resolution for my taxonomic analysis.
I will welcome any comments, ideas and suggestions!