Hi, developer,
Appreciate your time.
I have multiple FASTA files, one per sample. I want to pick OTUs/ASVs and check for chimeras in QIIME 2. What is the procedure?
Sincerely.
Brandon
Hi, I had a similar issue and resolved it by importing my samples using a manifest. I believe you are looking for this:
https://docs.qiime2.org/2019.4/tutorials/importing/?highlight=manifest
If you don't have either EMP or Casava format, you need to import your data into QIIME 2 manually by first creating a "manifest file" and then using the qiime tools import command with different specifications than in the EMP or Casava import commands.
First, you'll create a text file called a "manifest file", which maps sample identifiers to fastq.gz or fastq absolute filepaths that contain sequence and quality data for the sample (i.e. these are FASTQ files). The manifest file also indicates the direction of the reads in each fastq.gz or fastq file. The manifest file will generally be created by you, and it is designed to be a simple format that doesn't put restrictions on the naming of the demultiplexed fastq.gz / fastq files, since there is no broadly used naming convention for these files. You can call the manifest file whatever you want. As well, the manifest format is Metadata-compatible, so you can re-use the manifest file to bootstrap your Sample Metadata, too.
The manifest file is a tab-separated (i.e., .tsv) text file. The first column defines the Sample ID, while the second (and optional third) column defines the absolute filepath to the forward (and optional reverse) reads. All of the rules and behavior of this format are inherited from the QIIME 2 Metadata format.
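For illustration, here is a minimal Python sketch of how such a single-end manifest could be generated from a directory of FASTQ files. The function name write_manifest is hypothetical, the header names sample-id and absolute-filepath are an assumption that should be checked against the importing tutorial for your QIIME 2 release, and deriving the sample ID from the file name is only one possible convention.

```python
import csv
from pathlib import Path


def write_manifest(fastq_dir, manifest_path):
    """Write a tab-separated manifest: sample ID, then absolute FASTQ path."""
    fastq_dir = Path(fastq_dir).resolve()
    # Collect both compressed and uncompressed FASTQ files, in a stable order.
    files = sorted(
        p for p in fastq_dir.iterdir()
        if p.name.endswith((".fastq", ".fastq.gz"))
    )
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        # Assumed header names; verify them for your QIIME 2 release.
        writer.writerow(["sample-id", "absolute-filepath"])
        for path in files:
            # Assumption: the sample ID is the file name up to the first dot.
            sample_id = path.name.split(".")[0]
            writer.writerow([sample_id, str(path)])
    return len(files)
```

Calling write_manifest("reads", "manifest.tsv") would list every .fastq and .fastq.gz file found in the (hypothetical) reads directory and return how many were written.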
Hi, @ErikaGanda,
Thank you for the help. It is helpful.
Could I get some further help? Can I analyze my imported demux.qza with DADA2 or Deblur, or do I need to pick OTUs and check for chimeras?
Really appreciate it!
Sincerely.
Brandon
Hi @Brandon,
The manifest approach @ErikaGanda points you towards is the correct one if you have FASTQ files. In your original inquiry you mentioned you had FASTA files, which the manifest import does not currently support. From the importing tutorial:
Other FASTA formats like FASTA files with differently formatted sequence headers or per-sample demultiplexed FASTA files (i.e. one FASTA file per sample) are not currently supported.
Unfortunately, it sounds like you fall into this category if you do indeed have separate FASTA files for each sample. I would recommend either starting with raw FASTQ files in QIIME 2 if you have access to them or, if that is not an option, combining your files elsewhere (e.g. QIIME 1) and then importing your combined .fna file into QIIME 2 as per the importing tutorial mentioned.
If you only have FASTA files without quality scores, you will not be able to run DADA2 or Deblur and will have to use OTU picking methods.
Hi @Mehrbod_Estaki,
Thank you for your info.
My data are in this format
>A1-22751
GGTACCAGCAGCCGCGGTAATACGGAGGGGGCTAGCGTTGTTCGGAATTACTGGGCGTAA
AGAGTGCGTAGGCGGTTTAGTAAGTTGGAAGTGAAAGCCCGGGGCTTAACCTCGGAATTG
CTTTCAAAACTACTAATCTAGAGTGTAGTAGGGGATGATGGAATTCCTAGTGTAGAGGTG
AAATTCTTAGATATTAGGAGGAACACCGGTGGCGAAGGCGGTCATCTGGGCTACAACTGA
CGCTGATGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGT
AAACGATGAGTGCTAGATATCGGAAGATTCTCTTTCGGTTTCGCAGCTAACGCATTAAGC
ACTCCGCCTGGGGAGTACGGTCGCAAGATTAAACCTCAAAGGAATTGACGGAGTCTC
>A1-25524
GGTACCAGCAGCCGCGGTAATTCGGAGGGGGCTAGCGTTGTTCGGAATTACTGGGCGTAA
AGAGTGCATAGGCGGTTTAGTAAGTTGGAAGTGAAAGCCCGGGGCTTAACCTCGGAATTG
CTTTCAAAACTACTAATCTAGAGTGTAGTAGGGGATGATGGAATTCCTAGTGTAGAGGTG
AAATTCTTAGATATTAGGAGGAACACCGGTGGCGAAGGCGGTCATCTGGGCTACAACTGA
CGCTGATGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGT
AAACGATGAGTGCTAGATATCGGAAGATTCTCTTTCGGTTTCGCAGCTAACGCATTAAGC
ACTCCGCCTGGGGAGTACGGTCGCAAGATTAAACCTCAAAGGAATTGACGGAGTCTC
I tried
qiime demux summarize \
  --i-data demux.qza \
  --o-visualization demux.qzv
But when I use
qiime vsearch dereplicate-sequences --i-sequences seqs.qza --o-dereplicated-table table.qza --o-dereplicated-sequences rep-seqs.qza
It gives me error
Plugin error from vsearch:
Parameter 'sequences' received an argument of type FeatureData[Sequence]. An argument of subtype SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequ$
See above for debug info.
Is there any other way for me to analyze this data?
Thanks so much.
Hi @Brandon,
It looks like you do indeed have FASTA files, but have a look at the importing tutorial for the required formatting to make sure your FASTA files follow it.
The ID in each header must follow the format <sample-id>_<seq-id>. <sample-id> is the identifier of the sample the sequence belongs to, and <seq-id> is an identifier for the sequence within its sample.
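As a rough way to check this, the sketch below flags FASTA headers that do not look like sample ID, underscore, sequence ID. The helper name bad_headers and the regular expression are illustrative only. In particular, splitting at the last underscore is an assumption about how the IDs are parsed, not something stated in this thread.

```python
import re

# Assumption: the ID is split at the LAST underscore, so a sample ID may
# contain underscores but a sequence ID may not.
HEADER_RE = re.compile(r"^>(?P<sample>\S+)_(?P<seq>[^\s_]+)")


def bad_headers(fasta_lines):
    """Return the header lines that do not match >sample-id_seq-id."""
    return [
        line.rstrip("\n")
        for line in fasta_lines
        if line.startswith(">") and not HEADER_RE.match(line)
    ]
```

For example, a header such as >A1-22751 contains no underscore, so it would be reported, while >A1_22751 would pass.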
Can you tell us how you actually imported your separately demultiplexed FASTA files into QIIME 2 initially? That is to say, how did you end up with your seqs.qza? Did that work?
Also, your error message from vsearch seems to have been cut off in your paste. Could you please re-run the command with the --verbose flag and share the full error message with us?
Hi, @Mehrbod_Estaki
I ran the command below in qiime2-2019.1:
qiime tools import \
  --input-path seqs.fna \
  --output-path sequences.qza \
  --type 'FeatureData[Sequence]'
About the vsearch error:
Traceback (most recent call last):
  File "/home/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-128>", line 2, in dereplicate_sequences
  File "/home/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 199, in bound_callable
    self.signature.check_types(**user_input)
  File "/home/miniconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/core/type/signature.py", line 301, in check_types
    name, kwargs[name].type, spec.qiime_type))
TypeError: Parameter 'sequences' received an argument of type FeatureData[Sequence]. An argument of subtype SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequences] is required.
Plugin error from vsearch:
Parameter 'sequences' received an argument of type FeatureData[Sequence]. An argument of subtype SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequences] is required.
See above for debug info.
Thank you for your kindness.
Hi @Brandon,
So, as the error message implies, your seqs.fna file was imported as the type FeatureData[Sequence], which is not a supported type in dereplicate-sequences. This is because you set the type incorrectly when you imported your file. See the OTU clustering tutorial for an example of this workflow.
That being said, your FASTA files are not in the right format either; see my previous comment about the header ID requirements. You'll have to change these files so that they follow the <sample-id>_<seq-id> header format prior to importing in order for this to work.
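One possible way to do the relabelling is sketched below in Python. It is only a sketch under stated assumptions: the function name combine_fastas is hypothetical, each file is assumed to hold exactly one sample, the file name (minus extension) is used as the sample ID, and the original record IDs are discarded in favour of a per-sample counter.

```python
from pathlib import Path


def combine_fastas(fasta_paths, out_path):
    """Concatenate per-sample FASTA files, relabelling each record."""
    total = 0
    with open(out_path, "w") as out:
        for path in map(Path, fasta_paths):
            # Assumption: the file name without its extension is the sample ID.
            sample_id = path.stem
            count = 0
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        count += 1
                        # The original header is dropped; records are
                        # numbered within each sample instead.
                        out.write(f">{sample_id}_{count}\n")
                    elif line.strip():
                        # Sequence lines are copied through unchanged.
                        out.write(line.strip() + "\n")
            total += count
    return total
```

The combined file could then be imported with the type SampleData[Sequences], one of the types the error message lists as accepted, instead of FeatureData[Sequence].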
Good luck!
Thank you @Mehrbod_Estaki I see it. Appreciate your help.