Dear QIIME2 developers,
I am trying to import Oxford Nanopore MinION reads into qiime2-2019.7 to assign taxonomy without doing any clustering. I have already demultiplexed and quality-filtered them, and I am working with 3 fastq.gz files. I know that, after importing the sequences as SampleData[SequencesWithQuality], I am supposed to dereplicate them in order to get a FeatureData[Sequence] artifact and a FeatureTable[Frequency] artifact.
However, qiime vsearch dereplicate-sequences changes the read IDs, and this causes issues with the downstream analysis. So I kept only the table.qza FeatureTable[Frequency] artifact produced by vsearch dereplicate-sequences and re-imported my reads, converted to FASTA, as a FeatureData[Sequence] artifact. But qiime feature-classifier classify-consensus-blast complains about the format, giving this error:
Plugin error from feature-classifier:
/tmp/qiime2-archive-lq34dzub/9f66d5cf-09b9-4b09-94f3-1960b1ab2f0b/data/dna-sequences.fasta is not a(n) DNAFASTAFormat file:
Invalid characters on line 22 (does not match IUPAC characters for a DNA sequence).
Debug info has been saved to /tmp/qiime2-q2cli-err-dgnhghd_.log
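To see which lines of my FASTA file trigger this, I think a check along these lines could work. It is only a minimal sketch: find_non_iupac_lines is a hypothetical helper name, and I am assuming the validator accepts only the uppercase IUPAC DNA codes ACGTRYKMSWBDHVN, which may not match what qiime2-2019.7 actually checks.

```shell
# Hypothetical helper: print "line_number: content" for every non-header line
# that contains a character outside the assumed uppercase IUPAC DNA alphabet.
# Assumption: the accepted alphabet is ACGTRYKMSWBDHVN; adjust it if the
# validator in your QIIME 2 release accepts a different set.
find_non_iupac_lines() {
    awk '!/^>/ && /[^ACGTRYKMSWBDHVN]/ { print NR ": " $0 }' "$1"
}

# Usage with the path from my commands (uncomment to run):
# find_non_iupac_lines "$READS_MERGED"
```

Under this assumption, lowercase bases, U, and Windows line endings (a trailing carriage return) would all be reported, so the output should be read as a list of candidates rather than a definitive diagnosis.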
The commands I am using are:
DB=/home/simone/QIIME2_analysis/database/PRJNA33175_Bacterial_16S_sequence.qza
TAXONOMY=/home/simone/QIIME2_analysis/database/PRJNA33175_Bacterial_16S_taxonomy.qza
READS_MERGED=/home/simone/QIIME2_analysis/MinION_test/reads/reads.fasta
MANIFEST=/home/simone/QIIME2_analysis/MinION_test/manifest.txt
qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path $MANIFEST \
  --input-format 'SingleEndFastqManifestPhred33V2' \
  --output-path sequences.qza
qiime vsearch dereplicate-sequences \
  --i-sequences sequences.qza \
  --o-dereplicated-table table.qza \
  --o-dereplicated-sequences rep-seqs.qza
rm rep-seqs.qza # removing them because of renaming issue
qiime tools import \
  --type 'FeatureData[Sequence]' \
  --input-path $READS_MERGED \
  --input-format 'DNAFASTAFormat' \
  --output-path rep-seqs.qza
qiime feature-classifier classify-consensus-blast \
  --i-query rep-seqs.qza \
  --i-reference-reads $DB \
  --i-reference-taxonomy $TAXONOMY \
  --p-perc-identity 0.8 \
  --p-maxaccepts 1 \
  --o-classification taxonomy.qza
What do you think I should do?
Thank you very much