I also have a mapping file & it looks like this: Fasting_Map.txt
I want to import both and then remove chimeras from the fasta file.
I do not have a qual file for the fasta file.
Now my questions are:
Based on what the sequences inside the seqs.fna file look like, which data format do my sequences match? I am asking because I am not sure which instructions to follow to import my data. Should I follow this:
Per-feature unaligned sequence data (i.e., representative sequences)
Unaligned sequence data is imported from a fasta formatted file containing DNA sequences that are not aligned (i.e., do not contain - or . characters). The sequences may contain degenerate nucleotide characters, such as N, but some QIIME 2 actions may not support these characters. See the scikit-bio fasta format description for more information about the fasta format.
Importing data:
qiime tools import \
  --input-path sequences.fna \
  --output-path sequences.qza \
  --type 'FeatureData[Sequence]'
or should I follow this: Per-feature aligned sequence data (i.e., aligned representative sequences)
Aligned sequence data is imported from a fasta formatted file containing DNA sequences that are aligned to one another. All aligned sequences must be exactly the same length. The sequences may contain degenerate nucleotide characters, such as N, but some QIIME 2 actions may not support these characters. See the scikit-bio fasta format description for more information about the fasta format.
Importing data:
qiime tools import \
  --input-path aligned-sequences.fna \
  --output-path aligned-sequences.qza \
  --type 'FeatureData[AlignedSequence]'
I looked into the scikit-bio fasta format description, and I am still lost on how to import my data.
I downloaded the sample sequences provided for both tutorials:
Per-feature unaligned sequence data (i.e., representative sequences)
and Per-feature aligned sequence data (i.e., aligned representative sequences)
I then compared these to the FASTA file that I have that is already demultiplexed.
Based on my comparison, I imported my data as FeatureData[Sequence]. I then ran the vsearch dereplication step with these options and got an error:
--i-sequences sequences.qza
--o-dereplicated-table table.qza
--o-dereplicated-sequences rep-seqs.qza
Plugin error from vsearch: Argument to parameter 'sequences' is not a subtype of SampleData[JoinedSequencesWithQuality] | SampleData[SequencesWithQuality] | SampleData[Sequences].
Debug info has been saved to /var/folders/nk/qmrj4r6d3jl131r7r4h959rw0000gn/T/qiime2-q2cli-err-jw1kfhiw.log
Great job looking at what was available and trying to pick the right types. You are very, very close; however, the axis (SampleData vs FeatureData) is wrong. You want to import as SampleData[Sequences] instead of FeatureData[Sequence].
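As a rough sketch, the import with the corrected type could look like the following. It reuses the same `qiime tools import` flags quoted earlier in this thread and only changes `--type`. The helper name `import_demux_fasta` and the output file name `seqs.qza` are placeholders chosen for illustration, and the command assumes an activated QIIME 2 environment.

```shell
# Hypothetical helper: import demultiplexed reads as SampleData[Sequences].
# Requires an activated QIIME 2 environment; file names are placeholders.
import_demux_fasta() {
  # $1 = demultiplexed FASTA file (e.g. seqs.fna)
  # $2 = output artifact path (e.g. seqs.qza)
  qiime tools import \
    --input-path "$1" \
    --output-path "$2" \
    --type 'SampleData[Sequences]'
}

# Usage:
# import_demux_fasta seqs.fna seqs.qza
```

Wrapping the call in a function is only a convenience; running the three-flag command directly works the same way.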
The reason it’s SampleData[Sequences] instead of FeatureData[Sequence] is that we haven’t selected those reads as features (and counted them up) yet. You should be able to follow along with this tutorial which starts with seqs.fna, dereplicates, and then does OTU picking.
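For reference, a minimal sketch of the dereplication step, assuming seqs.fna has already been imported as SampleData[Sequences] and saved as `seqs.qza` (a placeholder name). The options are the same ones used in the failing command above; only the input artifact differs. The helper name `dereplicate_reads` is hypothetical.

```shell
# Hypothetical helper: dereplicate reads that were imported as SampleData[Sequences].
# Requires an activated QIIME 2 environment with the vsearch plugin.
dereplicate_reads() {
  # $1 = SampleData[Sequences] artifact (e.g. seqs.qza)
  qiime vsearch dereplicate-sequences \
    --i-sequences "$1" \
    --o-dereplicated-table table.qza \
    --o-dereplicated-sequences rep-seqs.qza
}

# Usage:
# dereplicate_reads seqs.qza
```

The outputs, `table.qza` and `rep-seqs.qza`, are the counted features and their sequences, which is the point where the data moves from the SampleData axis to the FeatureData axis.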
Assuming the file still exists, could you attach this file: /var/folders/nk/qmrj4r6d3jl131r7r4h959rw0000gn/T/qiime2-q2cli-err-21hvz62b.log to your reply? Otherwise, try rerunning with --verbose and paste the output.
Unfortunately it isn’t clear what is going wrong yet, but at least you have the types worked out!