qiime moshpit bin-contigs-metabat: FeatureData[MAG] nowhere to be found

Dear @misialq and @Nicholas_Bokulich lab,

When binning contigs using qiime moshpit bin-contigs-metabat, the output produced for --o-mags is of type SampleData[MAGs]. However, dereplicating MAGs and classifying MAGs with 'qiime moshpit classify-kraken2' both require a FeatureData[MAG] artifact instead. Browsing through your tutorials and help files, I was unable to find out how to locate or create the FeatureData[MAG] artifact. What I did see in my temp dir was a hash directory for every MAG created, each with a data folder containing a FASTA-format file. I suspect this is the data I am looking for. I am wondering why it is still in temp and not output as FeatureData[MAG]. Are these MAGs unfinished somehow?

Happy to hear from you on how you advise me to proceed.

Cheers,
Pieter

Hey @pietervanveelen,

Great question, thanks! The idea is that SampleData[MAGs] represents non-dereplicated MAGs (straight from binning), while FeatureData[MAG] corresponds to the dereplicated ones. The dereplication action currently implemented in q2-moshpit accepts any DistanceMatrix describing the pairwise similarity between all the MAGs in the SampleData artifact; we then use that matrix to find the non-redundant set of MAGs. This is not really documented anywhere yet, but one way to obtain such a matrix is to run sourmash through its QIIME 2 plugin and pass the result as input to the dereplicate action. If you want to see the steps involved in this process, you can check out our semi-official tutorial here.
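
To make the idea of "using a distance matrix to find the non-redundant set" a bit more concrete, here is a minimal Python sketch. It is only an illustration and not the q2-moshpit implementation: the function name `dereplicate`, the single-linkage grouping, the `max_distance` cutoff and the "keep the longest MAG" rule are all assumptions made for this example, and the MAG IDs, distances and lengths are made-up toy values.

```python
def dereplicate(ids, dist, max_distance, lengths):
    """Group MAGs whose pairwise distance is <= max_distance (single linkage)
    and keep the longest MAG of each group as its representative.

    Hypothetical helper for illustration only; not the q2-moshpit code.
    """
    # Union-find structure: every MAG starts in its own group.
    parent = {mag_id: mag_id for mag_id in ids}

    def find(x):
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every pair that is close enough.
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            if dist[a][b] <= max_distance:
                parent[find(ids[a])] = find(ids[b])

    # Collect the members of each group.
    clusters = {}
    for mag_id in ids:
        clusters.setdefault(find(mag_id), []).append(mag_id)

    # Representative = longest MAG in the group (ties broken by ID).
    return {
        max(members, key=lambda m: (lengths[m], m)): sorted(members)
        for members in clusters.values()
    }


# Toy input: mag_a and mag_b are near-identical, mag_c is unrelated.
ids = ["mag_a", "mag_b", "mag_c"]
dist = [
    [0.0, 0.005, 0.60],
    [0.005, 0.0, 0.58],
    [0.60, 0.58, 0.0],
]
lengths = {"mag_a": 2_100_000, "mag_b": 2_400_000, "mag_c": 3_000_000}

result = dereplicate(ids, dist, max_distance=0.01, lengths=lengths)
print(result)
```

In this toy case the three binned MAGs collapse into two representatives, which mirrors how a per-sample collection of MAGs is reduced to a smaller non-redundant set.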

Let me know if you need more information :slight_smile:

Cheers,
Michal

Hello @misialq,

Thanks, that is very helpful. I'll get onto trying that soon.
I'm gonna bug you about another moshpit request, but I'll create another topic for it.

Best,
Pieter

If sourmash is the recommended way to go, it would be great to include it in future core distributions of qiime2-metagenome...

Hi @Mechah,

Jumping in for @misialq here! This is currently in the works, and we are hoping to have q2-sourmash included in the metagenome distribution for the 2024.10 release. :slightly_smiling_face:

Cheers :lizard:
