Issue with clawback sequence-variants-from-samples

Hello qiime2 community,
I was trying to assemble class weights for my samples. My sequences cover the V3-V4 region, and Qiita didn't seem to be much help there, so I followed this

from this link.

But when I run qiime clawback sequence-variants-from-samples --i-samples feature_table.qza --o-sequences sv.qza --verbose I get an error.
Error:

Plugin error from clawback:

Invalid characters in sequence: ['3', 'Z', 'o', 't', 'u'].
Valid characters: ['-', 'K', 'Y', 'N', 'D', 'R', 'H', '.', 'M', 'V', 'G', 'T', 'B', 'W', 'A', 'C', 'S']
Note: Use lowercase if your sequence contains lowercase characters not in the sequence's alphabet.

See above for debug info.

However, my OTU table was not generated by deblur/dada2, so I cannot use the --p-no-hashed-feature flag, which seems to be a solution to the error.

I would like to know if there is any other way to run the command on my feature_table.qza (168.0 KB)

Thanks in advance.

Hi @abhishake,

I am moving your post into the Community Plugin Support category, since you are working with q2-clawback (which is not one of the QIIME 2 core plugins). Someone in there should hopefully have thoughts or suggestions on how to address this issue!

Cheers,
Liz

Hi @abhishake,

Thanks for your question and I'm sorry for the slow response.

The issue is that the feature table has to have ASVs as its feature ids if you want to use sequence-variants-from-samples. However, if you have a fasta file that lists the sequences for your feature ids, you don't need sequence-variants-from-samples, because that's what sequence-variants-from-samples is trying to generate.
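To see which feature ids trigger that error, a minimal sketch along these lines may help. It only assumes that the alphabet is the one printed in the "Valid characters" line of the error message; the feature ids below are hypothetical examples, not taken from the attached table.

```python
# Alphabet copied from the "Valid characters" line of the error message.
VALID_CHARS = set("-KYNDRH.MVGTBWACS")


def invalid_characters(feature_id):
    """Return the sorted characters of feature_id that are not in VALID_CHARS."""
    return sorted(set(feature_id) - VALID_CHARS)


def ids_that_are_not_sequences(feature_ids):
    """Map each offending feature id to the characters that make it invalid."""
    report = {}
    for feature_id in feature_ids:
        bad = invalid_characters(feature_id)
        if bad:
            report[feature_id] = bad
    return report


# Hypothetical feature ids: one OTU-style label and one sequence-style id.
example_ids = ["Zotu3", "ACGTACGT"]
print(ids_that_are_not_sequences(example_ids))
```

Any id that shows up in the report is a label rather than a sequence, which is the situation described above.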

If you have such a fasta file, you can import it into an artifact and use it in place of sv.qza where it appears in that tutorial.
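Before importing, it may be worth confirming that the fasta file really covers every feature id in the table. The following is a rough sketch in plain Python, not part of QIIME 2 or q2-clawback; the function names, the fasta text, and the feature ids are all made up for illustration.

```python
def fasta_ids(fasta_text):
    """Collect record ids: the first whitespace-delimited token after '>'."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                ids.add(header[0])
    return ids


def missing_from_fasta(feature_ids, fasta_text):
    """Return the feature ids that have no matching fasta record, sorted."""
    return sorted(set(feature_ids) - fasta_ids(fasta_text))


# Hypothetical data: three feature ids, only two of them present in the fasta.
example_fasta = ">Zotu1\nACGT\n>Zotu2 some description\nGGCC\n"
example_feature_ids = ["Zotu1", "Zotu2", "Zotu3"]
print(missing_from_fasta(example_feature_ids, example_fasta))
```

An empty list would mean every feature id has a sequence in the fasta file.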

Please let us know if that doesn't make sense or work.

Ben

@BenKaehler thanks for your reply.

I did as you suggested and followed the rest of the tutorial.

Once I got the weights, I opened the biom file and found that all taxa had the same weight. The weights were of the order of 5E-11.

Can you please shed some light on when that happens? Or did I make some error?

Thank you for your time.

No worries! For technical reasons, taxa that were not observed at all in your sequences get almost zero weights. I suspect that when you look at your biom file you are seeing that many of them have almost zero weights, which is not unusual.

If you look through all of the taxa, are they all exactly 5E-11?


@BenKaehler I looked through all the taxa and the exact value is 5.83362501458407E-11

And all of the taxa had this exact weight.
This is the screenshot

To cross-check, I went to the GitHub code, where I found unobserved_weight: float = 1e-6 in the signature of the function generate_class_weights. I am therefore unsure whether there has been an error.

Thank you.
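One way to relate the two numbers quoted above is to divide the default by the reported weight. This is arithmetic only. Whether q2-clawback actually rescales the unobserved weight this way is an assumption that nothing in this thread confirms.

```python
# Both values are quoted in the post above.
reported_weight = 5.83362501458407e-11  # weight seen for every taxon
unobserved_weight = 1e-6                # default found in generate_class_weights

# If the ratio is very close to a whole number, the reported weight may be
# the default divided by a count (for example, a number of taxa). That
# interpretation is an assumption, not something read from the plugin code.
ratio = unobserved_weight / reported_weight
print(ratio)
print(round(ratio))
```

The ratio comes out within rounding error of 17142, so it may be worth checking whether that matches the number of taxa in the reference data.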