Dear Qiime,
My initial fastq.gz samples underwent host contamination removal and merging within qiime2.
I found chimera removal within the qiime2 framework took too long, so I exported my samples to perform vsearch-based chimera removal.
Now I wish to bring my fasta files back into qiime2 to begin taxonomic analysis; however, I am struggling with the syntax of the import function.
My samples (only running 3 atm for a pilot) are all in the same directory:
(qiime2-amplicon-2024.2) root@b1c1dff011df:/Home/Data_vsearch/test# pwd
/Home/Data_vsearch/test
(qiime2-amplicon-2024.2) root@b1c1dff011df:/Home/Data_vsearch/test# ls
A11d1JCon4A_0_L001_R1_001_nonchimera.fasta B11d2JCon1B_2_L001_R1_001_nonchimera.fasta
B11d1CCon1B_1_L001_R1_001_nonchimera.fasta
I believe these fasta files conform to qiime2's DNAfasta requirements, as each record has 2 lines (a header and a sequence) and the sequence is on a single line (import demultiplexed fasta files into Qiime2 - #4 by colinvwood).
(qiime2-amplicon-2024.2) root@b1c1dff011df:/Home/Data_vsearch/test# head A11d1JCon4A_0_L001_R1_001_nonchimera.fasta
>A00707:180:HCLMHDSX7:2:1101:10303:26663;size=1
AATTAGAGTTAACAATAATCGGCAGCACCTCTGGTGTCAGGCCAACAGCCGCAGCTAAAGCAAAAATTAAGCTTTCTCCCCAATCGCCTTTAGTCAAGCCATTAATGACAAACAGTAGTGGGATGATAATTGC
>A00707:180:HCLMHDSX7:2:1101:10700:17018;size=1
ACTTATGGACGTCGGATCCTTCAAAGCAAGGT
>A00707:180:HCLMHDSX7:2:1101:10737:31422;size=1
ACTATTTATTACGCaaaaaagtgcaaatttttttcagaaatttaaaaatttagacacgaaaaaaGCCGATGCAAATGCATCGAC
>A00707:180:HCLMHDSX7:2:1101:10782:29684;size=1
ACATGAAAGAGATTACAAAAACAGTTATGATTGCTACTCATGATATGCAGCTGGTCTGCCAGTGGGCGGACAGGATCCTTGTCTTGTGCCAGGGAAAGATT
>A00707:180:HCLMHDSX7:2:1101:1081:16266;size=1
GAATATAGGGAGAGATTATCCTTTCCGCTTAAAAATGGGTAAATTGCAGGATTTTCGATCAAGGCCCCAACATTTTGTAGAGCCTTGTGATTATTGGCAGTAATGGGCTGATTGTTAAAAG
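To make that check explicit, here is a minimal sketch that tests the layout described above: every record is one header line starting with '>' followed by exactly one sequence line. The function name check_two_line_fasta is hypothetical, and this only checks the line layout; it is not QIIME 2's own validation.

```shell
# Hypothetical helper: succeed only if the file alternates
# header line ('>...') / single sequence line, as described in the post.
check_two_line_fasta() {
    awk '
        BEGIN { bad = 0 }
        # Odd-numbered lines must be headers.
        NR % 2 == 1 && !/^>/ { bad = 1; print "line " NR ": expected a header" }
        # Even-numbered lines must be sequences, not headers.
        NR % 2 == 0 && /^>/  { bad = 1; print "line " NR ": expected a sequence" }
        END {
            # A trailing header without a sequence leaves an odd line count.
            if (NR % 2 == 1) { bad = 1; print "odd number of lines" }
            exit bad
        }
    ' "$1"
}

# Example call (file name taken from the listing above):
# check_two_line_fasta A11d1JCon4A_0_L001_R1_001_nonchimera.fasta && echo "layout OK"
```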
However, I am continuously getting this error from the command below.
qiime tools import \
--input-path /Home/Data_vsearch/test \
--output-path /Home/Data_vsearch/sequences.qza \
--type 'FeatureData[Sequence]' \
--input-format DNASequencesDirectoryFormat
There was a problem importing /Home/Data_vsearch/test/:
Missing one or more files for DNASequencesDirectoryFormat: 'dna-sequences.fasta'
Based on other posts (Import data problem), the solution to these types of issues is sometimes in the syntax, which I am hoping is the case here. However, even after looking through the tutorial I cannot find the solution. Can anyone help, please?
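One untested idea, going only by the error message: it names a single file, 'dna-sequences.fasta', so the directory import may expect all sequences in one file of that name. Below is a minimal sketch of building such a directory. The function name combine_fasta and the output directory are hypothetical, and this has not been confirmed as the recommended route for per-sample files.

```shell
# Hypothetical helper: combine per-sample FASTA files into one file with the
# name the error message asks for ('dna-sequences.fasta').
combine_fasta() {
    local src_dir="$1"   # directory holding the *_nonchimera.fasta files
    local out_dir="$2"   # separate directory to pass to `qiime tools import`
    mkdir -p "$out_dir" || return 1
    # `awk 1` prints every line and adds a final newline if a file lacks one,
    # so the last record of one file never merges with the next header.
    awk 1 "$src_dir"/*.fasta > "$out_dir/dna-sequences.fasta"
}

# Example call (input path from the post; the output directory is an assumption):
# combine_fasta /Home/Data_vsearch/test /Home/Data_vsearch/combined
```

The resulting directory could then be given as --input-path to the import command above. Note that the records keep their original read headers, so nothing in the combined file records which sample a read came from.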
Note: I corrected the lowercase bases in the sequences above with AWK and can import the fasta files individually:
(qiime2-amplicon-2024.2) root@b1c1dff011df:/Home/Data_vsearch/test# qiime tools import --input-path A11d1JCon4A_0_L001_R1_001_nonchimera.fasta --output-path sequences1.qza --type 'FeatureData[Sequence]'
qiime tools import --input-path B11d1CCon1B_1_L001_R1_001_nonchimera.fasta --output-path sequences2.qza --type 'FeatureData[Sequence]'
qiime tools import --input-path B11d2JCon1B_2_L001_R1_001_nonchimera.fasta --output-path sequences3.qza --type 'FeatureData[Sequence]'
Imported A11d1JCon4A_0_L001_R1_001_nonchimera.fasta as DNASequencesDirectoryFormat to sequences1.qza
Imported B11d1CCon1B_1_L001_R1_001_nonchimera.fasta as DNASequencesDirectoryFormat to sequences2.qza
Imported B11d2JCon1B_2_L001_R1_001_nonchimera.fasta as DNASequencesDirectoryFormat to sequences3.qza
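For anyone hitting the same lowercase issue, here is a minimal sketch of the kind of AWK fix mentioned in the note. The function name uppercase_fasta and the file names in the example call are hypothetical, and this is not necessarily the exact command that was run.

```shell
# Hypothetical helper: uppercase the sequence lines of a FASTA file while
# leaving header lines (those starting with '>') untouched.
uppercase_fasta() {
    awk '/^>/ { print; next } { print toupper($0) }' "$1"
}

# Example call (file names are placeholders):
# uppercase_fasta input.fasta > input_upper.fasta
```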