Trouble importing MegaHit assemblies

Hey @Nick_D!

I disagree; I think you explained it very well!

I don't think so. I think @Nicholas_Bokulich and I are on the same page as you (more below).

This is what we mean by "per-sample assemblies": you have one file per sample containing the assembly sequences. As I mentioned above, in order to dereplicate in QIIME 2 you will need to multiplex before importing and dereplicating. That means merging the per-sample files into a single FASTA file, with the sample ID prefixed to each sequence header (which is a step in the reverse direction).

For what it's worth, I think in the future it will make sense for us to create a new import format in q2-types to support this schema.


Sample workflow

cd path/to/fasta/files

touch merged.fasta

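# Prefix each sequence header with its sample ID (the file name without extension).
# merged.fasta is skipped because it also matches *.fasta.
for f in *.fasta; do
  [ "$f" = merged.fasta ] && continue
  fn="$(basename -- "$f")"; fn="${fn%.*}"
  sed "s/^>\(.*\)$/>${fn}_\1/" "$f" >> merged.fasta
done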

qiime tools import \
  --input-path merged.fasta \
  --output-path seqs.qza \
  --type 'SampleData[Sequences]'

qiime vsearch dereplicate-sequences \
  --i-sequences seqs.qza \
  --o-dereplicated-table table.qza \
  --o-dereplicated-sequences rep-seqs.qza \
  --verbose

qiime feature-table summarize \
  --i-table table.qza \
  --o-visualization table.qzv
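
If you want to check what the header renaming does before running it on real assemblies, here is a minimal sketch on toy data. The file names (sampleA.fasta, sampleB.fasta) and contig IDs are made up for illustration, and it assumes bash with a standard sed; it only exercises the merge step, not any QIIME 2 command.

```shell
#!/usr/bin/env bash
# Toy check of the multiplexing step. All file names and contig IDs are hypothetical.
set -euo pipefail

# Work in a scratch directory so no real files are touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Two made-up per-sample assemblies.
printf '>contig1\nACGT\n>contig2\nGGCC\n' > sampleA.fasta
printf '>contig1\nTTAA\n' > sampleB.fasta

# Start from an empty merged file.
: > merged.fasta

for f in *.fasta; do
  # The output file also matches *.fasta, so skip it.
  if [ "$f" = "merged.fasta" ]; then
    continue
  fi
  # Sample ID = file name without its extension.
  fn="$(basename -- "$f")"
  fn="${fn%.*}"
  # Rewrite ">header" as ">sampleID_header" and append to the merged file.
  sed "s/^>\(.*\)$/>${fn}_\1/" "$f" >> merged.fasta
done

# Show the resulting headers.
grep '^>' merged.fasta
```

Each header in merged.fasta should now start with the name of the file it came from, followed by an underscore, and the number of headers should equal the total across the input files.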