I am trying to build a tree with SEPP for further analyses after denoising sequences with DADA2, but the flowchart of sequence alignment and phylogeny building in the QIIME2 tutorial confused me a little. The flowchart suggests that the input to the q2-fragment-insertion plugin is of type FeatureData[AlignedSequence]; however, the plugin only accepts FeatureData[Sequence] artifacts.
Here is my script:
from qiime2.plugins.alignment.methods import mafft, mask
from qiime2.plugins.fragment_insertion.methods import sepp
# dada2_sequences is of type FeatureData[Sequence], produced by the DADA2 pipeline
mafft_alignment = mafft(dada2_sequences) # Multiple sequence alignment
masked_mafft_alignment = mask(mafft_alignment.alignment) # Masking sites
frag_insertion = sepp(representative_sequences = masked_mafft_alignment.masked_alignment, threads = 16) # fragment insertion
I got the following error message:
TypeError: Argument to parameter 'representative_sequences' is not a subtype of FeatureData[Sequence].
I also checked the GitHub repository for q2-fragment-insertion. The example there uses data of type FeatureData[Sequence] rather than FeatureData[AlignedSequence], and that works perfectly for me, too. The script looks like this:
frag_insertion = sepp(representative_sequences = dada2_sequences, threads = 16) # fragment insertion
So it seems that aligning and masking the sequences before fragment insertion is unnecessary (passing the masked alignment raises a type error). Would aligning the sequences beforehand make any difference to fragment insertion? Any comment, clarification, or explanation would be appreciated. Thanks!