Non-16S amplicons

I have a question about a possible workflow for my data.

My paired FASTQ files are shotgun sequencing data of a gene amplicon, and I want to extract FASTA sequences of all existing variants together with their frequencies. The amplicons were generated by hemi-nested PCR with 2 degenerate forward primers and 1 degenerate reverse primer; the target length was ~422 bp. I am thinking about a pipeline like this:
Import fq > merge reads (vsearch join-pairs) > remove reads that are not ~422 bp long or that lack primers (extract-reads??) > dereplicate/cluster (dada2?) > visualize (tabulate-seqs?)
Is this a good idea, or is QIIME maybe not the right tool for this? I would be very grateful for any advice and suggestions.
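
To make the "remove reads of the wrong length > dereplicate" part of that pipeline concrete, here is a rough sketch in plain Python, independent of QIIME. The function name `dereplicate`, the 10 bp tolerance, and the toy reads in the usage note are assumptions made for illustration only, and the sketch does no error correction of the kind dada2 or deblur perform; it simply counts identical sequences.

```python
from collections import Counter


def dereplicate(merged_reads, target_len=422, tolerance=10):
    """Keep reads near the target length and count each unique sequence.

    target_len comes from the post (~422 bp); tolerance is an assumed value.
    Returns (sequence, count, relative_frequency) tuples, most abundant first.
    """
    # Length filter: drop merged reads that are far from the expected amplicon size
    kept = [read.upper() for read in merged_reads
            if abs(len(read) - target_len) <= tolerance]
    # Dereplication: identical sequences collapse into one entry with a count
    counts = Counter(kept)
    total = sum(counts.values())
    return [(seq, n, n / total) for seq, n in counts.most_common()]
```

With toy reads such as `["ACGTACGTACGT"] * 3 + ["ACGAACGAACGA", "AC"]` and `target_len=12, tolerance=1`, the short read is dropped and the two remaining variants are reported with their counts and relative frequencies.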


Hi @abomba,

Welcome to the forum!
I'm afraid I have never processed shotgun fragments obtained from amplicons, so I am probably raising more questions than proposing answers (and to be honest, I am not familiar with the kit you mentioned either).
How long are your sequences? What I cannot figure out is what happens if you stitch together the sequences from the shotgun data. I am not sure you are going to get back the full amplicon.
My gut feeling is that trying the MetaPhlAn2 plugin or the SHOGUN plugin within QIIME 2 may be more useful for getting an overview of the data!

I'm curious to see what the general opinion is on this.

Good luck


I misunderstood the data owner and made a mistake in my previous post. The data are from amplicon sequencing, not shotgun. The PCR products were ligated with barcodes and sequenced 2x300 bp. When I merged the reads, most of them were almost exactly 422 bp long.



Good for you! It is much simpler this way!
Only one correction to your pipeline then:
If you merge the amplicon reads with vsearch join-pairs, do not use dada2 to identify the ASVs; it is designed to work with unmerged data!
If you want to pre-merge the sequences, just replace dada2 with deblur!


deblur denoise-other requires reference sequences, which I don't have. Of course I could download some sequences from GenBank and use them, but I was thinking about some sort of primer-based filtering. Or maybe I should use dada2 without the pre-merge stage.
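
As an illustration of what primer-based filtering with degenerate primers could look like, here is a minimal Python sketch that expands IUPAC codes into regular expressions. The helper names (`primer_regex`, `reverse_complement`, `has_primers`) and the toy primers in the usage note are hypothetical, not the real primers from this experiment. It assumes merged reads in forward orientation and requires an exact match, whereas real primer-trimming tools usually tolerate a few mismatches.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the degenerate codes
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern string."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement a sequence, keeping IUPAC codes consistent."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_primers(read, fwd_primers, rev_primer):
    """True if the read starts with any forward primer and ends with the
    reverse complement of the reverse primer (exact match, no mismatches)."""
    read = read.upper()
    starts = any(re.match(primer_regex(p), read) for p in fwd_primers)
    end_pattern = primer_regex(reverse_complement(rev_primer)) + "$"
    ends = re.search(end_pattern, read) is not None
    return starts and ends
```

For example, with the made-up forward primers `["GAYTC", "GARTC"]` and reverse primer `"CCRTT"`, a read would be kept only if it begins with one of the forward primers and ends with `AAYGG`, the reverse complement of the reverse primer.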

Sorry, I did not realize your amplicons are not from 16S!
My suggestion would be to avoid the pre-merging then.
