Keep ASV IDs when converting to relative abundance

emmlemore · March 31, 2023, 6:58pm

Hi there,

I am trying to convert a feature table with taxonomy to relative abundance. I was able to do this successfully by following this.

I converted the .biom table to a .tsv table that shows the taxonomy and the relative abundance. However, how do I get the ASV IDs? For instance, for this bacterium: d__Bacteria;p__Bdellovibrionota;c__Bdellovibrionia;o__Bacteriovoracales;f__Bacteriovoracaceae;g__Peredibacter

I would like to know the ASV ID (e.g., ASV4, ASV30, ASV492).

Thank you.

Mehrbod_Estaki · March 31, 2023, 10:00pm

Hi @emmlemore,
If you are following the instructions in the link you provided, you can't get the ASV IDs anymore. Over there, we collapsed the ASVs to a higher taxonomic level (phylum, genus, etc.). Once we do that, you can't revert to an ASV table, so the feature IDs at ASV resolution are lost.

If you give us a bit more info on what you are specifically looking to achieve, perhaps we can help come up with a different solution.
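To illustrate why the collapse can't be undone, here is a minimal Python sketch on made-up data. It is not the QIIME 2 implementation: the feature IDs, counts, shortened taxonomy strings, and function names are all hypothetical. Collapsing sums the counts of every ASV that shares a taxon, so the collapsed table no longer records which ASVs contributed. The per-ASV taxonomy mapping, however, can still tell you which feature IDs were assigned to a given taxon.

```python
from collections import defaultdict

# Hypothetical toy data: IDs, counts, and (shortened) lineages are made up.
asv_counts = {"asv_a": 10, "asv_b": 5, "asv_c": 85}
asv_taxonomy = {
    "asv_a": "d__Bacteria;p__Bdellovibrionota;g__Peredibacter",
    "asv_b": "d__Bacteria;p__Bdellovibrionota;g__Peredibacter",
    "asv_c": "d__Bacteria;p__Proteobacteria;g__Escherichia",
}


def collapse(counts, taxonomy):
    """Sum the counts of all features that share the same taxonomy string."""
    collapsed = defaultdict(int)
    for feature_id, count in counts.items():
        collapsed[taxonomy[feature_id]] += count
    return dict(collapsed)


def features_for_taxon(taxonomy, taxon):
    """List the feature IDs assigned to a taxon, using the per-ASV taxonomy."""
    return sorted(fid for fid, label in taxonomy.items() if label == taxon)


collapsed = collapse(asv_counts, asv_taxonomy)
peredibacter_asvs = features_for_taxon(
    asv_taxonomy, "d__Bacteria;p__Bdellovibrionota;g__Peredibacter"
)
```

In this sketch the collapsed table holds a single Peredibacter row, while the lookup against the uncollapsed taxonomy still returns both contributing feature IDs.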

emmlemore · April 3, 2023, 1:21pm

Hi @Mehrbod_Estaki

Okay, so I am trying to follow along here, but I am a bit lost as to what the files are at each conversion step.

So after assigning taxonomy, I performed the following to get the taxonomy at the genus level:
qiime taxa collapse \
  --i-table /home/Rocks/outputs/qza_intermediates/rocks16S_table.qza \
  --i-taxonomy /home/Rocks/outputs/qza_intermediates/rocks16S_taxonomy.qza \
  --p-level 6 \
  --o-collapsed-table genus16S_table.qza

Then converted that file into the relative frequency table:
qiime feature-table relative-frequency \
  --i-table genus16S_table.qza \
  --o-relative-frequency-table genus16S_relabund_table.qza

Then converted that relative abundance file into a BIOM file:
qiime tools export \
  --input-path genus16S_relabund_table.qza \
  --output-path genus16S_relabund

Then converted the BIOM file into a TSV file:
biom convert -i 16S_relabund.biom -o 16S_relabund.tsv --to-tsv

This is what the TSV file currently looks like:

But this is what I want it to look like:

From the tutorial linked above, it seems the feature IDs are kept as random letters and numbers (e.g., 4b5eeb300368260019c1fbc7a3c718fc) instead of ASV1.

Please advise. Thank you for your time.

emmlemore · April 3, 2023, 2:30pm

Hi @Mehrbod_Estaki

I started over with the original table.qza file, followed the tutorial linked above, and was able to get the feature table with taxonomy. However, the Feature ID is still gibberish instead of "ASV1, ASV2", etc. I saw this where you said it's not a good idea to rename them. I guess I can create a separate column and manually add ASV1, ASV2, etc. beside each Feature ID so that each Feature ID is still retained. However, I have lost the relative abundance values that I got after the taxa collapse command.

How do I convert to relative abundance with the new table-with-taxonomy.biom/table-with-taxonomy.tsv file? The taxa collapse command only accepts table.qza and taxonomy.qza as input artifacts...

Please advise, thank you.
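The "separate column" idea above could be sketched as follows. This is only an illustration: the function name is hypothetical, the second feature ID is made up, and numbering the features in table order is an assumption (any stable order would do). The original hash IDs are kept as the keys, so nothing is renamed.

```python
def add_asv_labels(feature_ids):
    """Map each original feature ID to a short label (ASV1, ASV2, ...).

    Labels follow the order of the input list, which is an arbitrary
    choice made for this sketch.
    """
    return {fid: f"ASV{i}" for i, fid in enumerate(feature_ids, start=1)}


feature_ids = [
    "4b5eeb300368260019c1fbc7a3c718fc",  # example ID quoted earlier in the thread
    "00000000000000000000000000000000",  # made-up placeholder ID
]
labels = add_asv_labels(feature_ids)

# Tab-separated rows pairing each original ID with its label.
rows = [f"{fid}\t{labels[fid]}" for fid in feature_ids]
```

Because the labels depend on the order of the table, they are only meaningful within one analysis, whereas the hash IDs stay comparable across tables.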

gregcaporaso · April 3, 2023, 9:11pm

Hi @emmlemore, you can convert your ASV feature table to relative frequency using the command:

qiime feature-table relative-frequency

That will give you the relative frequency with the sequence hash ids. (A note on those ids: while they look like gibberish, the same ASV sequence will always result in the same id. That makes it possible to relate feature ids across studies, as long as the same primers and trim/truncation parameters are used.)

Does this get you closer to what you're looking for?
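As a rough illustration of why the same sequence always gets the same ID, here is a minimal Python sketch. It assumes the feature IDs are MD5 hashes of the ASV sequence, which is consistent with the 32-character hexadecimal IDs shown in this thread but should be treated as an assumption about your particular pipeline. The function name and the sequences are made up.

```python
import hashlib


def feature_id(sequence):
    """Return the MD5 hex digest of a DNA sequence (assumed ID scheme)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


# Made-up sequences, for illustration only.
id_first = feature_id("ACGTACGTACGT")
id_repeat = feature_id("ACGTACGTACGT")
id_trimmed = feature_id("ACGTACGTAC")  # same read, truncated two bases earlier
```

The first two IDs are identical because the input sequence is identical. The third differs because the sequence is shorter, which is why the same primers and trim/truncation parameters are needed for IDs to match across studies.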

emmlemore · April 4, 2023, 12:29pm

Hi @gregcaporaso

I was able to get the outputted relative frequencies. However, I am a little confused about this calculation. When I downloaded the absolute values, I was able to get the relative abundance like so: (feature ID frequency / total frequencies for each sample) x 100. The sum of each column was 100.

However, when I ran qiime feature-table relative-frequency, it appears to perform only this: feature ID frequency / total frequencies for each sample. The sum of each column was 1. And according to the usage definition here it says "Convert frequencies to relative frequencies by dividing each frequency in a sample by the sum of frequencies in that sample."

So is relative frequency different from relative abundance? Why is it not multiplied by 100 to get a %?

Thank you.

gregcaporaso · April 5, 2023, 8:54pm

@emmlemore, values that sum to 1.0 and values that sum to 100 are just two ways of representing the exact same information (as a fraction or a percentage, respectively).
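The equivalence can be checked with a small Python sketch on made-up counts for a single sample. The function name and the numbers are hypothetical; the calculation is the one quoted above, dividing each frequency by the sum of frequencies in the sample.

```python
def relative_frequency(counts):
    """Divide each count by the sample total, giving fractions that sum to 1."""
    total = sum(counts.values())
    return {fid: count / total for fid, count in counts.items()}


# Made-up counts for one sample.
sample = {"asv_a": 10, "asv_b": 5, "asv_c": 85}

fractions = relative_frequency(sample)
# Multiplying by 100 only changes the scale, not the information.
percentages = {fid: value * 100 for fid, value in fractions.items()}
```

Here the fractions sum to 1 and the percentages sum to 100, up to floating-point rounding, and each percentage is simply its fraction multiplied by 100.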