I’ve been having the same problem, so I’ve been following this thread to try to solve it. After following the script you suggested, I tried to import my reference-hit.seqs.fa file again, and this time the error said: reference-hit.seqs.fa is not a(n) QIIME1DemuxFormat file
Earlier, the error was: not a(n) DNAFASTAFormat file
This is confusing to me, because I don’t see why a single file would be required to have two different formats. Is there a way to force the plugin to choose one format or the other?
qiime tools import --input-path reference-hit.seqs.fa --output-path rep-seqs.qza --type SampleData[Sequence] #Error: reference-hit.seqs.fa is not a(n) QIIME1DemuxFormat file
I’m realizing now, as I type this, that I must have copied something wrong, because on my second try I used a different type (SampleData[Sequence] instead of FeatureData[Sequence]). So, I just ran the import again using FeatureData[Sequence] on the data we changed with the grep commands, and I received the “not a(n) DNAFASTAFormat file” error again.
My best guess is that the program isn’t reading it as a .fasta file because the ID/sequence name is the same as the sequence itself. But if I just changed the IDs, that would probably mess things up downstream, right?
First off, I recommend you see my post here, which recommends using q2-deblur instead of deblur. deblur produces data that needs a bit of massaging before it can be loaded into QIIME 2, and that is where the advantage of q2-deblur comes in: it does that clean-up for you!
Yep, I am noticing that too!
The data is unchanged: grep is just a tool for searching within files, so it is non-destructive. @antgonza was asking @Jingsi_Tang to run some grep commands to get a sense of the structure of the data, to make sure that there wasn't anything too crazy happening in the fasta file.
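If you want a similar read-only look at the file from Python, here is a minimal sketch. It is not part of QIIME 2 or deblur; the function name summarize_fasta and the fields it reports are made up for illustration. It only reads the file and counts records, including how many have an ID identical to the sequence, which is the pattern you suspected.

```python
from pathlib import Path


def summarize_fasta(path):
    """Read a FASTA file (without modifying it) and report basic structure.

    Hypothetical helper for illustration only; not a QIIME 2 or deblur API.
    """
    records = []              # list of (record_id, sequence) tuples
    record_id, chunks = None, []

    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue          # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                records.append((record_id, "".join(chunks)))
            # The ID is the first whitespace-delimited token after ">".
            parts = line[1:].split()
            record_id, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)   # sequence lines may wrap across lines

    if record_id is not None:
        records.append((record_id, "".join(chunks)))

    return {
        "n_records": len(records),
        "id_equals_sequence": sum(1 for rid, seq in records if rid == seq),
        "empty_sequences": sum(1 for _, seq in records if not seq),
    }
```

Calling summarize_fasta("reference-hit.seqs.fa") would return a small dictionary of counts. A large id_equals_sequence count would be consistent with the IDs being the sequences themselves, though on its own it does not tell you why the import is rejected.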
I can't say for sure, but ID-cleanup is one of the things happening in the q2-deblur plugin, so if possible, I would recommend re-running your deblur step in q2-deblur!