LEfSe assigns a taxon that is present in more samples of one group as the biomarker for the other group, where it occurs in fewer samples

jrhaulung · August 9, 2024, 8:07am

Dear Qiime friends,

I am using LEfSe to identify differentially abundant taxa between two groups, PN (12 samples) and PT (12 samples), for my microbiome study. Dokdo was used to produce the input file (input_table.tsv in the attached file) for the LEfSe analysis. After processing with the following scripts, 2 and 12 genera were assigned to PN and PT, respectively. Of these fourteen differentially abundant genera, all show higher sequence frequencies and are present in more samples in their assigned group than in the other group, except Rothia, which was chosen as a representative biomarker for PN. (The LDA plot is shown in the attached file.)

According to input_table.tsv and the level 6 taxonomy barplot output, Rothia is present in 4 samples in the PN group and 10 samples in the PT group, with frequencies of (0.19%, 20.52%, 0.90%, 0.70%) and (0.69%, 0.44%, 0.48%, 4.91%, 0.46%, 0.28%, 0.54%, 1.42%, 0.55%, 1.42%) in the positive samples, respectively.

I wonder whether this odd-looking result is common in LEfSe output, and
how I should properly explain why Rothia was chosen to represent the PN group.

Any suggestions would be highly appreciated.

Sincerely,

Jrhau Lung

(I have learned that the different algorithms used to handle zero inflation when determining differentially abundant taxa in microbiome studies can assign totally different taxa from the same 16S sequence dataset. But assigning a taxon that is present in more samples of another group to represent a group in which it occurs in fewer samples still looks odd to me.)
table.qza (596.6 KB)
silva-taxonomy-138-99-V3-V4.qza (260.5 KB)
sample-metadata.tsv (11.4 KB)

colinbrislawn · August 9, 2024, 7:07pm

Hello Jrhau,

I would need to see the Qiime2 inputs to Dokdo to investigate further...

Speaking to the 'different algorithms' you mentioned, LEfSe has been superseded by newer and arguably better methods.

From http://galaxy.biobakery.org/

Please consider using MaAsLin2 as a long term alternative for LEfSe.

Within Qiime2, I recommend qiime composition ancombc.

Better statistics should lead to better results!

jrhaulung · August 12, 2024, 12:20am

Hi colinbrislawn,
Sorry for forgetting to upload the detailed files needed to replicate the analysis.
I hope you can find the cause for us. Your help will be highly appreciated.

sincerely

Jrhau

colinbrislawn · August 18, 2024, 8:44pm

Hello Jrhau,

I appreciate your patience.

I don't see anything wrong with these files...

I suspect that the issue is with the LEfSe test itself.
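One possible reading, offered as a sketch rather than a confirmed diagnosis: as I understand the LEfSe output format, a discriminative feature is labeled with the class that has the highest mean, so a single high-abundance sample can pull the label toward a group where the taxon is otherwise rare. The snippet below uses only the Rothia percentages quoted earlier in this thread and assumes every sample not listed has 0% Rothia. The `summarize` helper and the variable names are hypothetical, and this is not LEfSe's own code.

```python
from statistics import mean, median

N_PER_GROUP = 12  # samples per group, as stated in the thread

# Non-zero Rothia relative abundances (%) as quoted in the thread
pn_positive = [0.19, 20.52, 0.90, 0.70]
pt_positive = [0.69, 0.44, 0.48, 4.91, 0.46, 0.28, 0.54, 1.42, 0.55, 1.42]


def summarize(positive, n_samples=N_PER_GROUP):
    """Pad with zeros for samples without Rothia (an assumption), then summarize."""
    values = positive + [0.0] * (n_samples - len(positive))
    return {
        "prevalence": len(positive) / n_samples,  # fraction of samples with Rothia
        "mean": mean(values),                     # sensitive to the 20.52% sample
        "median": median(values),                 # robust to a single outlier
    }


pn = summarize(pn_positive)
pt = summarize(pt_positive)

for name, stats in (("PN", pn), ("PT", pt)):
    print(
        f"{name}: prevalence={stats['prevalence']:.0%}, "
        f"mean={stats['mean']:.2f}%, median={stats['median']:.2f}%"
    )
```

With these numbers the PN mean (about 1.86%) is higher than the PT mean (about 0.93%), driven almost entirely by the one 20.52% sample, while prevalence and median both favor PT (median 0.51% versus 0%). A mean-based label would therefore point to PN even though Rothia is more widespread in PT.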

Have you had a chance to run this through ANCOM? The Qiime2 plugin now makes barplots just like the one you posted, and ANCOM is a better fit for amplicons.
ancombc: Analysis of Composition of Microbiomes with Bias Correction — QIIME 2 2024.5.0 documentation
da-barplot: Differential abundance bar plots — QIIME 2 2024.5.0 documentation

jrhaulung · August 21, 2024, 8:54am

Hi colinbrislawn,
We appreciate your assistance and confirmation. We will definitely try ANCOM in our downstream analysis. Thank you again.

sincerely,

Jrhau