I'm trying to trim the primers (515F 5′-GTG CCA GCM GCC GCG GTA A-3′ and 806R 5′-GGA CTA CVS GGG TAT CTA AT-3′) of single end 454 reads (downloaded from NCBI) with Cutadapt (qiime2-2020.11). I've used the following:
What am I missing? Why are the trimmed sequences still over 600 base pairs long?
I will be using DADA2 later (I can truncate to around 300 bp at that point, where the quality drops); for now I just want to make sure the sequences are primer-free.
Hi @Natali_Hernandez,
I am really rusty on the 454 side, so apologies in advance if I chase red herrings ...
I am wondering whether, because you used linked-style adapters, cutadapt is looking for the reverse primer sequence at the end of the read. Did you try using "--p-front GTGCCAGCMGCCGCGGTAA"
and maybe, in a second step, "--p-anywhere ATTAGAWACCCBDGTAGTCC"?
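In case it helps to double-check the primer strings, here is a minimal Python sketch that reverse-complements a primer while keeping the IUPAC ambiguity codes. The function name `reverse_complement` is just illustrative and not part of QIIME 2 or cutadapt. Note that ATTAGAWACCCBDGTAGTCC is the reverse complement of the GGACTACHVGGGTWTCTAAT variant of 806R, whereas the 806R written in the question (GGACTACVSGGGTATCTAAT) gives a slightly different string, so it may be worth confirming which variant was used in the study.

```python
# IUPAC nucleotide codes and their complements (degenerate codes included).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(primer: str) -> str:
    """Return the reverse complement of a primer, keeping IUPAC ambiguity codes."""
    cleaned = primer.upper().replace(" ", "")  # tolerate "GGA CTA ..." spacing
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(cleaned))


# 806R exactly as written in the question
print(reverse_complement("GGA CTA CVS GGG TAT CTA AT"))  # ATTAGATACCCSBGTAGTCC
# 806R variant whose reverse complement matches the string suggested above
print(reverse_complement("GGACTACHVGGGTWTCTAAT"))  # ATTAGAWACCCBDGTAGTCC
```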
This is my first go with 454 sequences, so I am not sure if this is normal. I think I am looking at something similar to this one; as suggested there, I'll try RESCRIPt. I'm just unsure about --i-reference-sequences reference-sequences.qza: can this be any other set of sequences, i.e. Illumina rather than 454?
Is the last picture the result of trimming with '--p-front GTGCCAGCMGCCGCGGTAA' and then trimming that output again with "--p-anywhere ATTAGAWACCCBDGTAGTCC"?
In a 454 dataset, it is normal to see a wide range of read lengths, with quality dropping towards the tail. Some sequences may also be concatemers or noise, so I would not be surprised if you end up filtering the sequences by a minimum and maximum length. In the post you linked, the original question arose because they started the analysis with two fastq files from a single 454 run and had to reorient one of them to proceed. How many fastq files do you have? In any case, it is a very good idea to double-check that your reads are in the orientation you expect and, if needed, to reorient them with RESCRIPt (for that you can use the reference sequence file from the database you use for taxonomy identification, e.g. SILVA or Greengenes).
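To illustrate the min/max length filter mentioned above, here is a rough Python sketch that works on plain 4-line FASTQ records. The helper names (`parse_fastq`, `filter_by_length`) and the thresholds are hypothetical, chosen only for the example; it assumes unwrapped FASTQ (exactly four lines per read) and is not how any QIIME 2 plugin does it.

```python
from typing import Iterable, Iterator, List, Tuple

FastqRecord = Tuple[str, str, str]  # (header, sequence, quality)


def parse_fastq(lines: List[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from unwrapped 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = (line.strip() for line in lines[i:i + 4])
        yield header, seq, qual


def filter_by_length(records: Iterable[FastqRecord],
                     min_len: int, max_len: int) -> List[FastqRecord]:
    """Keep only records whose sequence length is within [min_len, max_len]."""
    return [rec for rec in records if min_len <= len(rec[1]) <= max_len]


# Toy data: one read in range, one too short, one too long.
example = [
    "@read1", "ACGTACGTAC", "+", "IIIIIIIIII",
    "@read2", "ACG", "+", "III",
    "@read3", "ACGTACGTACGTACGTACGT", "+", "IIIIIIIIIIIIIIIIIIII",
]

kept = filter_by_length(parse_fastq(example), min_len=5, max_len=15)
print([header for header, _, _ in kept])  # ['@read1']
```

For real data, the thresholds would come from the read-length distribution you see after primer trimming.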
But I get this error:
(qiime2-2021.4) MacBook-Air-de-Natali:MarcellusCluff2014 natali$ qiime tools export --input-path single-end-marcellus2014.qza --output-path single-end-marcellusFD.qza --output-format 'FeatureData[Sequence]'
Traceback (most recent call last):
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/qiime2/sdk/util.py", line 90, in parse_format
format_record = pm.formats[format_str]
KeyError: 'FeatureData[Sequence]'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/miniconda3/envs/qiime2-2021.4/bin/qiime", line 11, in <module>
sys.exit(qiime())
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/q2cli/builtin/tools.py", line 63, in export_data
source = result.view(qiime2.sdk.parse_format(output_format))
File "/opt/miniconda3/envs/qiime2-2021.4/lib/python3.8/site-packages/qiime2/sdk/util.py", line 92, in parse_format
raise TypeError("No format: %s" % format_str)
TypeError: No format: FeatureData[Sequence]
How can I convert a SampleData[SequencesWithQuality] artifact into a FeatureData[Sequence] artifact?
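Maybe you should do nothing ... at this stage! (The error itself arises because --output-format expects a file format, whereas FeatureData[Sequence] is a semantic type; exporting does not convert between types.)
What about denoising what you have and reorienting the resulting ASVs before taxonomy identification? I think the ASV artifact should be fine for that plugin.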