q2-cutadapt output

This is quite perplexing indeed. :thinking:

When I see such consistent loss across all the samples, it usually means that the primer sequences are slightly incorrect, or that something else is consistently wrong... :man_shrugging: What sequencing protocol was used to generate this data? Do you have a reference?

These primers are identical to the ones below, except for the T in place of K in the reverse primer. That should not be an issue, since K is the IUPAC code for G or T (Herlemann et al. 2011).

  • Bakt_341F: CCTACGGGNGGCWGCAG
  • Bakt_805R: GACTACHVGGGTATCTAATCC
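To illustrate why a T in place of a K should still match, here is a minimal sketch that expands IUPAC codes into a regular expression. This is only an illustration of the ambiguity codes, not how cutadapt does its matching internally, and the helper names `primer_to_regex` and `matches` are made up for this example.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_to_regex(primer):
    """Turn a degenerate primer into a regex with one character class per base."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer.upper()))

def matches(primer, seq):
    """True if seq is one of the concrete sequences the degenerate primer allows."""
    return primer_to_regex(primer).fullmatch(seq.upper()) is not None

# Toy check (hypothetical 3-mers): K accepts G or T, but not A.
k_accepts_t = matches("ACK", "ACT")
k_accepts_g = matches("ACK", "ACG")
k_accepts_a = matches("ACK", "ACA")

# One concrete expansion of Bakt_805R (H -> T, V -> A).
bakt_805r_ok = matches("GACTACHVGGGTATCTAATCC", "GACTACTAGGGTATCTAATCC")
```

So a primer written with T at a position where the reference has K is just one of the sequences the degenerate primer already covers.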

I would rerun with the following changes. I know you've tried much of this already; I am only adding two new ones:

  • try using --p-anywhere-f instead of --p-front-f
  • try using --p-anywhere-r instead of --p-front-r
  • do not use --p-no-indels (--no-indels in plain cutadapt)
  • remove the ^ anchor from the beginning of the primer sequences
  • keep --p-match-adapter-wildcards
  • keep --p-match-read-wildcards
  • keep --p-discard-untrimmed
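Put together, the rerun could look roughly like the sketch below. It only assembles and prints a `qiime cutadapt trim-paired` command so you can check it before running it. The file names `demux.qza` and `trimmed.qza` are placeholders I made up, and I am using the two primers listed above, so swap in your own artifact names and primer sequences.

```python
import shlex

# Primers from the list above, with no leading ^ anchor.
FWD_PRIMER = "CCTACGGGNGGCWGCAG"
REV_PRIMER = "GACTACHVGGGTATCTAATCC"

cmd = [
    "qiime", "cutadapt", "trim-paired",
    "--i-demultiplexed-sequences", "demux.qza",  # placeholder input name
    "--p-anywhere-f", FWD_PRIMER,                # instead of --p-front-f
    "--p-anywhere-r", REV_PRIMER,                # instead of --p-front-r
    "--p-match-adapter-wildcards",
    "--p-match-read-wildcards",
    "--p-discard-untrimmed",
    "--o-trimmed-sequences", "trimmed.qza",      # placeholder output name
    "--verbose",
]
# --p-no-indels is deliberately left out, so indels stay allowed.

# Print the command for inspection; nothing is executed here.
print(shlex.join(cmd))
```

If you had raised --p-error-rate in your earlier runs, add it back with whatever value you were using.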

Sometimes odd things happen with sequencing. I think at one point I simply had to trim some bases off either end of the primers... but you'd think this would be solved by simply increasing --p-error-rate, as you've done. Otherwise, if you are not using staggered primers (Lundberg et al. 2013), you can forgo cutadapt and just use the trimming options provided in Deblur or DADA2. But it is nice to use cutadapt as a form of quality control.

Would it be possible to share the demuxed data? You can DM me if you do not want to share it publicly.
