LEfSe after QIIME2 to test at all taxonomic levels

Hello,

I have been told that the instructions at this link (Lefse after QIIME2) create an input file suitable for testing differences at the genus level, and that an input file in which each taxon has a summed abundance is more suitable for testing at all taxonomic levels. Does anyone know how to create this second file in QIIME 2?

Thank you,
Akriti

Hello Akriti,

I was running into this issue as well after following the same steps listed there (Lefse after QIIME2) and collapsing the table down to level 6.

If I am not mistaken (I would love to have a moderator/admin/power user verify or correct my understanding), the problem for us arose when representative sequences could not be resolved with enough specificity. The collapse step simply groups those rep seqs, determines their relative abundance, labels them with the most specific taxon possible (the next taxonomic level up), and creates a new row with that label. For the most part this was not an issue for us, as we interpreted it the same way in our external analysis.

For example:
1 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Corynebacterium
2 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Corynebacterium 1
3 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Lawsonella
4 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae

In this case, the row ending in “|Corynebacteriaceae” (Row 4) would only include the relative abundances of those rep seqs that could not be resolved past the Corynebacteriaceae family and does not include Rows 1-3. But the issue is that LEfSe does not interpret those rows in the same manner - instead, it assumes that the “|Corynebacteriaceae” row (Row 4) contains the relative abundance values of the entire Corynebacteriaceae family (including Rows 1-3) regardless of further specificity. This is where we ran into issues.
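A minimal Python sketch of the mismatch, using made-up relative abundances for a single sample (the numbers are hypothetical and `clade_total` is a helper written only for this illustration). It compares the value actually stored in Row 4 with the family total that a cumulative reading of Row 4 would imply:

```python
# Hypothetical relative abundances for one sample (values invented for illustration).
family = "Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae"
rows = {
    family + "|Corynebacterium": 0.10,    # Row 1
    family + "|Corynebacterium 1": 0.05,  # Row 2
    family + "|Lawsonella": 0.02,         # Row 3
    family: 0.03,                         # Row 4: rep seqs unresolved past family
}


def clade_total(table, clade, sep="|"):
    """Sum the clade's own row plus every row nested below it."""
    return sum(
        value
        for taxon, value in table.items()
        if taxon == clade or taxon.startswith(clade + sep)
    )


unresolved_only = rows[family]            # what the collapsed table stores in Row 4
family_total = clade_total(rows, family)  # what the whole family actually adds up to

print("Row 4 as stored:", unresolved_only)
print("Whole family:   ", round(family_total, 2))
```

With these invented numbers the stored value is 0.03 while the whole family accounts for about 0.20, which is the gap described above.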

But what we also noticed was that when taxa were resolved to the same levels or at least one level more specific than the target comparison taxa level, there were not any issues. So using the example above, even though LEfSe was misinterpreting Row 4 as total cumulative relative abundances for the Corynebacteriaceae family, LEfSe did not have issues identifying which relative abundances were corresponding to the Corynebacteriales order as we did not have unresolved rep seqs at that level and thus no row that ended with “|Corynebacteriales”. So we assumed it was able to extrapolate that all relative abundance rows that contained “|Corynebacteriales|” (Rows 1-4 in this example) belonged to the Corynebacteriales order and it ran those values through the LEfSe analysis pipeline correctly.

Thus, our proposed solution was to simply add an additional taxonomic level labelled “|Unknown” to all rows that were not resolved to the L6-Genus level. That way there were no rows that LEfSe interpreted as cumulative relative abundances for a specific taxonomic level, and it was instead forced to determine that value itself. Additionally, “|Unknown” would not be detected unless those ungrouped rep seqs had a relative abundance large enough to pass the threshold to begin with.

For example:
1 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Corynebacterium
2 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Corynebacterium 1
3 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Lawsonella
4 Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae|Unknown
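The relabelling can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: taxon strings use "|" as the separator (as in the rows above), genus is the sixth level, and `pad_unresolved` is a hypothetical helper name, not a QIIME 2 or LEfSe function. It appends a single "Unknown" level, matching the description above, rather than padding all the way down to genus.

```python
def pad_unresolved(taxon, target_depth=6, label="Unknown", sep="|"):
    """Append one `label` level to a taxon string that stops short of target_depth.

    Rows already resolved to target_depth (genus by default) are returned unchanged.
    """
    levels = taxon.split(sep)
    if len(levels) < target_depth:
        levels.append(label)
    return sep.join(levels)


family = "Bacteria|Actinobacteria|Actinobacteria|Corynebacteriales|Corynebacteriaceae"
taxa = [
    family + "|Corynebacterium",
    family + "|Corynebacterium 1",
    family + "|Lawsonella",
    family,  # unresolved past the family level
]

for taxon in taxa:
    print(pad_unresolved(taxon))
```

Only the last row changes, gaining the "|Unknown" suffix.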

By doing this, we increased the number of discriminative features detected by LEfSe above our LDA threshold from 11 to 14 for our dataset.

Again, this seems to have solved our LEfSe issues for testing differences at all levels 1-6, but we would like to run it by the community to see whether this solution is logical and free of significant issues.

Thank you,
Daniel

Hi @dann818,

Recently I too have been playing around with the LEfSe + QIIME 2 combination, so I was very happy to come across your post.

I'm not sure I understood 100% of what you wrote, but I generally agree that preparing an input file for LEfSe can be tricky, especially given the distinction between relative abundance and cumulative relative abundance. For example, in the example input file provided by the original authors (lefse · biobakery/biobakery Wiki · GitHub), each column, which corresponds to a sample, doesn't sum to 1 because the relative abundance of the taxon [ Bacteria ] is the cumulative relative abundance of [ Bacteria|Acidobacteria ], [ Bacteria|Bacteroidetes ], ... Therefore, this is not the same as a QIIME 2 feature table of type FeatureTable[RelativeFrequency], whose columns do sum to 1. I think this is what you are trying to point out in your post (correct me if I am wrong).

The good news is: I believe LEfSe is smart enough to calculate cumulative relative abundance on its own when it's given a FeatureTable[RelativeFrequency] that contains no cumulative rows. I have written a short tutorial for performing LEfSe with a QIIME 2 feature table, which I will copy and paste below. Compare the input_table.tsv file with the formatted_table.tsv file. The first is the input file for LEfSe and has only 220 taxa, all of them at the genus level. The second is what LEfSe creates after formatting the first, and it has 429 taxa at ranks ranging from Kingdom to Genus.
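The jump from 220 to 429 taxa is consistent with every genus-level row also contributing its value to each of its ancestors. The Python sketch below illustrates that idea only; the taxa and numbers are invented, `add_ancestor_totals` is a hypothetical helper, and this is not LEfSe's own code.

```python
from collections import defaultdict


def add_ancestor_totals(table, sep="|"):
    """Return a dict holding each original row plus summed values for every ancestor."""
    totals = defaultdict(float)
    for taxon, value in table.items():
        levels = taxon.split(sep)
        # Credit the value to the row itself and to each shorter prefix (its ancestors).
        for depth in range(1, len(levels) + 1):
            totals[sep.join(levels[:depth])] += value
    return dict(totals)


# One sample from a hypothetical relative-frequency table; the column sums to 1.
sample = {
    "Bacteria|Firmicutes|Bacilli": 0.40,
    "Bacteria|Firmicutes|Clostridia": 0.35,
    "Bacteria|Bacteroidetes|Bacteroidia": 0.25,
}

expanded = add_ancestor_totals(sample)
for taxon in sorted(expanded):
    print(taxon, round(expanded[taxon], 2))
```

Three input rows become six, and the expanded column no longer sums to 1 because each ancestor repeats the abundance of its descendants.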

I hope you find my tutorial helpful and please let me know if you have any questions.

LEfSe

In this section, I will walk you through how I run the LEfSe (linear discriminant analysis effect size) tool. But before I do that, it is important for you to acknowledge this:

LEfSe method is more a discriminant analysis method rather than a DA method. (Lin and Peddada, 2020; PMID: 33268781)

In order to use LEfSe, you will need to open two Terminal windows: one for your usual QIIME 2 environment and another for running LEfSe. For the latter, you should create a new conda environment and install LEfSe as described below.

  1. Terminal for running QIIME 2 and Dokdo:
$ conda activate qiime2-2020.8
  2. Terminal for running LEfSe:
$ conda create -n lefse -c conda-forge python=2.7.15
$ conda activate lefse
$ conda install -c bioconda -c conda-forge lefse

After you have both terminals set up, you can create an input file for LEfSe from a QIIME 2 feature table. We will use the "Moving Pictures" tutorial as an example (run below in the QIIME 2 terminal).

$ dokdo prepare-lefse \
-t data/moving-pictures-tutorial/table.qza \
-x data/moving-pictures-tutorial/taxonomy.qza \
-m data/moving-pictures-tutorial/sample-metadata.tsv \
-o output/Useful-Information/input_table.tsv \
-c body-site \
-u subject \
-w "[body-site] IN ('tongue', 'gut', 'left palm')"

Click here to view the input_table.tsv file.

Next, we need to format the input table (run below in the LEfSe terminal):

$ lefse-format_input.py \
output/Useful-Information/input_table.tsv \
output/Useful-Information/formatted_table.in \
-c 1 \
-u 2 \
-o 1000000 \
--output_table output/Useful-Information/formatted_table.tsv

Click here to view the formatted_table.in file. Click here to view the formatted_table.tsv file.
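The -c 1 and -u 2 options in the command above point the formatter at the first row for the class and the second row for the subject, with one column per sample. The Python sketch below builds a tiny table in that layout so the structure is easier to picture. It is an assumption-laden illustration: the row labels, sample values, and the `build_lefse_input` helper are all invented, and the real input_table.tsv produced by Dokdo may differ in detail.

```python
def build_lefse_input(class_row, subject_row, feature_rows):
    """Return tab-separated text: class row, subject row, then one row per feature.

    Each row is a (label, values) pair with one value per sample.
    """
    rows = [class_row, subject_row] + list(feature_rows)
    n_samples = len(class_row[1])
    lines = []
    for label, values in rows:
        if len(values) != n_samples:
            raise ValueError("row %r does not have one value per sample" % label)
        lines.append("\t".join([label] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"


# Hypothetical three-sample table; labels and numbers are invented.
text = build_lefse_input(
    ("body-site", ["gut", "gut", "tongue"]),              # row 1, selected by -c 1
    ("subject", ["subject-1", "subject-2", "subject-1"]),  # row 2, selected by -u 2
    [
        ("Bacteria|Firmicutes|Bacilli", [0.2, 0.3, 0.6]),
        ("Bacteria|Bacteroidetes|Bacteroidia", [0.8, 0.7, 0.4]),
    ],
)
print(text)
```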

We can run LEfSe with (run below in the LEfSe terminal):

$ run_lefse.py \
output/Useful-Information/formatted_table.in \
output/Useful-Information/output.res

Which will give:

Number of significantly discriminative features: 199 ( 199 ) before internal wilcoxon
Number of discriminative features with abs LDA score > 2.0 : 199

Click here to view the output.res file.
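If you want the discriminative features as a list rather than a plot, the .res file can be read with a few lines of Python. The sketch assumes the tab-separated layout described in the LEfSe documentation (feature name, log of the highest class mean, class, LDA score, p-value, with the class and LDA fields left empty for non-discriminative features). Please check that against your own output.res before relying on it. `read_discriminative` and the example rows are invented for illustration.

```python
def read_discriminative(res_text, min_lda=2.0):
    """Return (feature, class, lda) tuples with |LDA| above min_lda, largest first."""
    hits = []
    for line in res_text.splitlines():
        fields = line.split("\t")
        # Assumed columns: feature, log highest mean, class, LDA score, p-value.
        if len(fields) < 4 or not fields[3].strip():
            continue  # no LDA score recorded: feature was not discriminative
        try:
            lda = float(fields[3])
        except ValueError:
            continue
        if abs(lda) > min_lda:
            hits.append((fields[0], fields[2], lda))
    return sorted(hits, key=lambda hit: -abs(hit[2]))


# Invented rows in the assumed layout, not real results.
example = (
    "Bacteria.Proteobacteria\t4.90\ttongue\t4.35\t0.0031\n"
    "Bacteria.Firmicutes\t5.41\tgut\t4.87\t0.0012\n"
    "Bacteria.Tenericutes\t2.10\t\t\t-\n"
)

for feature, cls, lda in read_discriminative(example):
    print(feature, cls, lda)
```

To use it on a real file, pass the file's contents, for example `read_discriminative(open("output/Useful-Information/output.res").read())`.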

We can then list the discriminative features and their LDA scores (run below in the LEfSe terminal):

$ lefse-plot_res.py \
output/Useful-Information/output.res \
output/Useful-Information/output.pdf \
--format pdf

Click here to view the output.pdf file.

Finally, you can create a cladogram for the discriminative features (run below in the LEfSe terminal):

$ lefse-plot_cladogram.py \
output/Useful-Information/output.res \
output/Useful-Information/output.cladogram.pdf \
--format pdf

Click here to view the output.cladogram.pdf file.

Hello @sbslee !!

What nice code and what a nice tutorial! It will be really helpful for all of us who are facing problems with this LEfSe and QIIME 2 combination :slight_smile:

I have followed your tutorial and, almost without problems, reached the step where the input table has to be formatted. The only problems were some commands I was missing (Dokdo, for example, had been neither downloaded nor installed).

lefse-format_input.py \
output/Useful-Information/input_table.tsv \
output/Useful-Information/formatted_table.in \
-c 1 \
-u 2 \
-o 1000000 \
--output_table output/Useful-Information/formatted_table.tsv

I have a couple of questions in this step:

  1. I have tried several times, but I can't find a solution for the error I get when running the command above:

lefse-format_input.py: command not found

I can't think of any solution, or of where I could get or install this command.

  2. My other question is about formatted_table.in. What is it? What does it contain? And is it an input file or an output file?

Maybe I will have more questions as I move forward with the tutorial, but these two are what I have so far.

Thank you so much in advance!

I'm looking forward to getting the whole code running :smiley:

Bests,

Miriam

Hi @MiriamGorostidi,

Did you install LEfSe in a fresh conda environment as instructed in the tutorial? If you did, did you activate the conda environment where LEfSe is installed before you ran the command (note that this is not the QIIME 2 environment)? I'm asking because the error suggests your environment can't find LEfSe at all.
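A "command not found" error means the shell could not locate the script in any directory on PATH. The Python sketch below mimics that lookup so it can be run from inside the activated environment. `find_on_path` is a hypothetical helper written for this post, not part of LEfSe or conda, and it sticks to standard-library calls that exist in both Python 2.7 and Python 3.

```python
import os


def find_on_path(command, path=None):
    """Return the full path of an executable named `command` on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Prints the script's location if the active environment provides it, otherwise None.
print(find_on_path("lefse-format_input.py"))
```

If this prints None while the LEfSe environment is supposedly active, the environment is either not activated in that terminal or LEfSe was not installed into it.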

The formatted_table.in file is generated by the lefse-format_input.py command and serves as the input file for run_lefse.py. It contains relative abundance and cumulative relative abundance data. See my tutorial above for more details.

I hope this helps! Let me know if you have additional questions.

Hi @sbslee !
How are you? Thank you so much for your rapid answer!

You were right... I thought I had installed everything, but it seems I had not. I have now reached almost the last step. However, I get the following error when creating the final PDF documents:

lefse-plot_res.py \
output/Useful-Information/output.res \
output/Useful-Information/output.pdf \
--format pdf

lefse-plot_cladogram.py \
output/Useful-Information/output.res \
output/Useful-Information/output.cladogram.pdf \
--format pdf

When running the commands above, I get the following error:

clade_sep parameter too large, lowered to 0.266967773438
Traceback (most recent call last):
  File "/home/unidad/anaconda3/envs/lefse/bin/lefse-plot_cladogram.py", line 341, in <module>
    draw_tree(params['output_file'],clad_tree,params)
  File "/home/unidad/anaconda3/envs/lefse/bin/lefse-plot_cladogram.py", line 294, in draw_tree
    ax = fig.add_subplot(111, polar=True, frame_on=False, axis_bgcolor=params['back_color'] )
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/figure.py", line 1257, in add_subplot
    a = subplot_class_factory(projection_class)(self, *args, **kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/axes/_subplots.py", line 77, in __init__
    self._axes_class.__init__(self, fig, self.figbox, **kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/projections/polar.py", line 854, in __init__
    Axes.__init__(self, *args, **kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 541, in __init__
    self.update(kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/artist.py", line 888, in update
    for k, v in props.items()]
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/artist.py", line 881, in _update_property
    raise AttributeError('Unknown property %s' % k)

AttributeError: Unknown property axis_bgcolor

Do you know how I could fix this?

Thank you so much for your help :smiley:

@MiriamGorostidi,

First of all, it seems like you were able to run lefse-plot_res.py without any issues at least and the error is coming from lefse-plot_cladogram.py. Can you confirm this? For example, can you show me the output/Useful-Information/output.pdf file?

Secondly, you may have one or more outdated libraries installed in your LEfSe environment. Can you show me the result of conda list after activating your LEfSe environment?

(lefse) sbslee@x86_64-apple-darwin13 ~ % conda list
# packages in environment at /Users/sbslee/opt/anaconda3/envs/lefse:
#
# Name                    Version                   Build  Channel
_r-mutex                  1.0.1               anacondar_1    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
backports_abc             0.5                        py_1    conda-forge
biom-format               2.1.7                    py27_0    bioconda
bwidget                   1.9.14               h694c41f_0    conda-forge
bzip2                     1.0.8                hc929b4f_4    conda-forge
ca-certificates           2020.12.5            h033912b_0    conda-forge
cairo                     1.16.0            h0ab9d94_1001    conda-forge
cctools_osx-64            949.0.1             h2f0f38f_19    conda-forge
certifi                   2019.11.28       py27h8c360ce_1    conda-forge
clang                     11.0.1               h694c41f_1    conda-forge
clang-11                  11.0.1          default_hf8bb9ca_1    conda-forge
clang_osx-64              11.0.1               hb91bd55_0    conda-forge
clangxx                   11.0.1          default_hf8bb9ca_1    conda-forge
clangxx_osx-64            11.0.1               h7e1b574_0    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
compiler-rt               11.0.1               h654b07c_0    conda-forge
compiler-rt_osx-64        11.0.1               h8c5fa43_0    conda-forge
curl                      7.68.0               h8754def_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
fontconfig                2.13.1            h1027ab8_1000    conda-forge
freetype                  2.10.4               h4cff582_1    conda-forge
fribidi                   1.0.10               hbcb3906_0    conda-forge
functools32               3.2.3.2                    py_3    conda-forge
future                    0.18.2           py27h8c360ce_1    conda-forge
futures                   3.3.0            py27h8c360ce_1    conda-forge
gettext                   0.19.8.1          haf92f58_1004    conda-forge
gfortran_osx-64           4.8.5                h22b1bf0_8    conda-forge
glib                      2.66.3               h519c658_0    conda-forge
graphite2                 1.3.13            h2e338ed_1001    conda-forge
gsl                       2.4               ha2d443c_1006    conda-forge
h5py                      2.10.0          nompi_py27h106b333_102    conda-forge
harfbuzz                  2.4.0                h92b87b8_1    conda-forge
hdf5                      1.10.5          nompi_h0cbb7df_1103    conda-forge
icu                       58.2              h0a44026_1000    conda-forge
jpeg                      9d                   hbcb3906_0    conda-forge
kiwisolver                1.1.0            py27h5cd23e5_1    conda-forge
krb5                      1.16.4               h1752a42_0    conda-forge
ld64_osx-64               530                 hea264c1_19    conda-forge
ldid                      2.1.2                h7660a38_2    conda-forge
lefse                     1.0.8.post1              py27_2    bioconda
libblas                   3.8.0               14_openblas    conda-forge
libcblas                  3.8.0               14_openblas    conda-forge
libclang-cpp11            11.0.1          default_hf8bb9ca_1    conda-forge
libcurl                   7.68.0               h709d2b2_0    conda-forge
libcxx                    11.0.1               habf9029_0    conda-forge
libedit                   3.1.20191231         h0678c8f_2    conda-forge
libffi                    3.2.1             hb1e8313_1007    conda-forge
libgfortran               3.0.1                         0    conda-forge
libglib                   2.66.3               h2575888_0    conda-forge
libiconv                  1.16                 haf1e3a3_0    conda-forge
liblapack                 3.8.0               14_openblas    conda-forge
libllvm11                 11.0.1               h223d4b2_0    conda-forge
libopenblas               0.3.7                hd44dcd8_1    conda-forge
libpng                    1.6.37               h7cec526_2    conda-forge
libssh2                   1.9.0                h8a08a2b_5    conda-forge
libtiff                   4.2.0                h355d032_0    conda-forge
libwebp-base              1.1.0                hbcb3906_3    conda-forge
libxml2                   2.9.9                hd80cff7_2    conda-forge
linecache2                1.0.0                      py_1    conda-forge
llvm-openmp               11.0.1               h7c73e74_0    conda-forge
llvm-tools                11.0.1               h223d4b2_0    conda-forge
lz4-c                     1.9.3                h046ec9c_0    conda-forge
make                      4.3                  h22f3db7_1    conda-forge
matplotlib                2.1.2                    py27_1    conda-forge
matplotlib-base           2.1.2            py27h31f9439_1    conda-forge
ncurses                   6.2                  h2e338ed_4    conda-forge
numpy                     1.16.5           py27hde6bac1_0    conda-forge
openssl                   1.1.1i               h35c211d_0    conda-forge
pandas                    0.24.2           py27h86efe34_0    conda-forge
pango                     1.42.4               haa940fe_4    conda-forge
pcre                      8.44                 hb1e8313_0    conda-forge
pip                       20.1.1             pyh9f0ad1d_0    conda-forge
pixman                    0.38.0            h01d97ff_1003    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqi                      0.3.2                    py27_1    bioconda
python                    2.7.15          h8e446fc_1011_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                2.7                     1_cp27m    conda-forge
pytz                      2020.1             pyh9f0ad1d_0    conda-forge
r-base                    3.5.3                hb1347aa_0  
r-codetools               0.2_16          r35h6115d3f_1001    conda-forge
r-coin                    1.3_1             r35h159158b_0    conda-forge
r-lattice                 0.20_41           r35h17f1fa6_1    conda-forge
r-libcoin                 1.0_5             r35h26f5615_1    conda-forge
r-mass                    7.3_51.6          r35h17f1fa6_1    conda-forge
r-matrix                  1.2_18            r35h26f5615_2    conda-forge
r-matrixstats             0.56.0            r35h17f1fa6_0    conda-forge
r-modeltools              0.2_23            r35h6115d3f_0    conda-forge
r-multcomp                1.4_13            r35h6115d3f_0    conda-forge
r-mvtnorm                 1.0_11            r35haf69682_2    conda-forge
r-sandwich                2.5_1             r35h6115d3f_1    conda-forge
r-survival                3.1_12            r35h17f1fa6_0    conda-forge
r-th.data                 1.0_10            r35h6115d3f_1    conda-forge
r-zoo                     1.8_7             r35h17f1fa6_0    conda-forge
readline                  8.0                  h0678c8f_2    conda-forge
rpy2                      2.8.6           py27r35hfc83f80_2    conda-forge
scipy                     1.2.1            py27hab3da7d_2    conda-forge
setuptools                44.0.0                   py27_0    conda-forge
singledispatch            3.4.0.3         pyh9f0ad1d_1001    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h17101e1_0    conda-forge
subprocess32              3.5.4            py27h0b31af3_0    conda-forge
tapi                      1100.0.11            h9ce4665_0    conda-forge
tk                        8.6.10               h0419947_1    conda-forge
tktable                   2.10                 h49f0cf7_3    conda-forge
tornado                   5.1.1           py27h1de35cc_1000    conda-forge
traceback2                1.4.0                    py27_0    conda-forge
unittest2                 1.1.0                      py_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                haf1e3a3_1    conda-forge
zlib                      1.2.11            h7795811_1010    conda-forge
zstd                      1.4.8                hf387650_1    conda-forge
(lefse) sbslee@x86_64-apple-darwin13 ~ % 

Finally, can you make sure you are using the exact input files in the tutorial?

Hello @sbslee !

I'm sorry I didn't explain myself well... I was not able to run lefse-plot_res.py either. Instead, I got the following error, so I can't show you the output.pdf file because it was not generated.

Traceback (most recent call last):
  File "/home/unidad/anaconda3/envs/lefse/bin/lefse-plot_res.py", line 177, in <module>
    else: plot_histo_hor(params['output_file'],params,data,len(data['cls']) == 2,params['report_features'])
  File "/home/unidad/anaconda3/envs/lefse/bin/lefse-plot_res.py", line 70, in plot_histo_hor
    ax = fig.add_subplot(111,frame_on=False,axis_bgcolor=params['back_color'])
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/figure.py", line 1257, in add_subplot
    a = subplot_class_factory(projection_class)(self, *args, **kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/axes/_subplots.py", line 77, in __init__
    self._axes_class.__init__(self, fig, self.figbox, **kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 541, in __init__
    self.update(kwargs)
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/artist.py", line 888, in update
    for k, v in props.items()]
  File "/home/unidad/anaconda3/envs/lefse/lib/python2.7/site-packages/matplotlib/artist.py", line 881, in _update_property
    raise AttributeError('Unknown property %s' % k)
AttributeError: Unknown property axis_bgcolor

Here is the output of the conda list command. I hope the error is due to outdated libraries. However, I don't understand how that is possible: if this is the first time I am using LEfSe and I have just installed it, how can the libraries be out of date? I'm quite new to command-line programming :sweat_smile:

conda list
# packages in environment at /home/unidad/anaconda3/envs/lefse:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
_r-mutex                  1.0.1               anacondar_1    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
backports_abc             0.5                        py_1    conda-forge
binutils_impl_linux-64    2.35.1               h193b22a_2    conda-forge
binutils_linux-64         2.35                hc3fd857_29    conda-forge
biom-format               2.1.7                    py27_0    bioconda
bwidget                   1.9.14               ha770c72_0    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cairo                     1.16.0            hcf35c78_1003    conda-forge
certifi                   2019.11.28       py27h8c360ce_1    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
curl                      7.71.1               he644dc0_3    conda-forge
cycler                    0.10.0                     py_2    conda-forge
dbus                      1.13.6               he372182_0    conda-forge
expat                     2.2.10               h9c3ff4c_0    conda-forge
fontconfig                2.13.1            hba837de_1004    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
functools32               3.2.3.2                    py_3    conda-forge
future                    0.18.2           py27h8c360ce_1    conda-forge
futures                   3.3.0            py27h8c360ce_1    conda-forge
gcc_impl_linux-64         7.5.0               hda68d29_13    conda-forge
gcc_linux-64              7.5.0               he2a3fca_29    conda-forge
gettext                   0.19.8.1          hf34092f_1004    conda-forge
gfortran_impl_linux-64    7.5.0               h56cb351_18    conda-forge
gfortran_linux-64         7.5.0               ha081f1e_29    conda-forge
glib                      2.66.1               h680cd38_0    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gsl                       2.6                  he838d99_2    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
gxx_impl_linux-64         7.5.0               h64c220c_13    conda-forge
gxx_linux-64              7.5.0               h547f3ba_29    conda-forge
h5py                      2.10.0          nompi_py27h513d04c_102    conda-forge
harfbuzz                  2.4.0                h9f30f68_3    conda-forge
hdf5                      1.10.5          nompi_h7c3c948_1111    conda-forge
icu                       64.2                 he1b5a44_1    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kernel-headers_linux-64   2.6.32              h77966d4_13    conda-forge
kiwisolver                1.1.0            py27h9e3301b_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
lefse                     1.0.8.post1              py27_1    bioconda
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcurl                   7.71.1               hcdd3856_3    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgfortran-ng            7.5.0               h14aa051_18    conda-forge
libgfortran4              7.5.0               h14aa051_18    conda-forge
libgomp                   9.3.0               h2828fa1_18    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          pthreads_hb3c22a3_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libssh2                   1.9.0                hab1572f_5    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.0                h7f98852_0    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
linecache2                1.0.0                      py_1    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
make                      4.3                  hd18ef5c_1    conda-forge
matplotlib                2.2.5                ha770c72_3    conda-forge
matplotlib-base           2.2.5            py27h250f245_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
numpy                     1.16.5           py27h95a1406_0    conda-forge
openssl                   1.1.1i               h7f98852_0    conda-forge
pandas                    0.24.2           py27hb3f55d8_0    conda-forge
pango                     1.42.4               h7062337_4    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pip                       20.1.1             pyh9f0ad1d_0    conda-forge
pixman                    0.38.0            h516909a_1003    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqi                      0.3.2                    py27_1    bioconda
pyqt                      5.9.2            py27hcca6a23_4    conda-forge
python                    2.7.15          h5a48372_1011_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                2.7                    1_cp27mu    conda-forge
pytz                      2020.1             pyh9f0ad1d_0    conda-forge
qt                        5.9.7                h0c104cb_3    conda-forge
r-base                    3.6.3                h316533a_2    conda-forge
r-codetools               0.2_18            r36hc72bb7e_0    conda-forge
r-coin                    1.4_1             r36hcfec24a_0    conda-forge
r-lattice                 0.20_41           r36hcfec24a_3    conda-forge
r-libcoin                 1.0_8             r36he454529_0    conda-forge
r-mass                    7.3_53            r36hcfec24a_0    conda-forge
r-matrix                  1.3_2             r36he454529_0    conda-forge
r-matrixstats             0.58.0            r36hcfec24a_0    conda-forge
r-modeltools              0.2_23            r36h6115d3f_1    conda-forge
r-multcomp                1.4_16            r36hc72bb7e_0    conda-forge
r-mvtnorm                 1.1_1             r36h31ca83e_1    conda-forge
r-sandwich                3.0_0             r36h142f84f_0    conda-forge
r-survival                3.2_7             r36hcfec24a_0    conda-forge
r-th.data                 1.0_10            r36h6115d3f_2    conda-forge
r-zoo                     1.8_8             r36hcdcec82_0    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
rpy2                      2.8.6           py27r36hd767a1f_2    conda-forge
scipy                     1.2.1            py27h921218d_2    conda-forge
sed                       4.8                  he412f7d_0    conda-forge
setuptools                44.0.0                   py27_0    conda-forge
singledispatch            3.4.0.3         pyh9f0ad1d_1001    conda-forge
sip                       4.19.8          py27hf484d3e_1000    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h74cdb3f_0    conda-forge
subprocess32              3.5.4            py27h516909a_0    conda-forge
sysroot_linux-64          2.12                h77966d4_13    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
tktable                   2.10                 hb7b940f_3    conda-forge
tornado                   5.1.1           py27h14c3975_1000    conda-forge
traceback2                1.4.0                    py27_0    conda-forge
unittest2                 1.1.0                      py_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.12               h516909a_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.8                ha95c52a_1    conda-forge

Thank you so much in advance :slight_smile:


@MiriamGorostidi,

Good question! When you installed LEfSe in your conda environment, you also installed additional packages at specific versions, which you can see with $ conda list. These versions may not be the most up to date. In fact, I think I found the issue. If you compare your list to mine, you will see that I'm using matplotlib 2.1.2 while you are using matplotlib 2.2.5. I'm not exactly sure how you got that particular version in the first place, but the error you reported (AttributeError: Unknown property axis_bgcolor) appears to occur because axis_bgcolor was removed in matplotlib 2.2.0 (see the changelog here). To fix the issue, remove matplotlib from your LEfSe environment and then reinstall a version below 2.2.0:

$ conda activate lefse
$ conda remove matplotlib
$ conda install matplotlib=2.1.2
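
To double-check which side of the 2.2.0 cutoff an environment is on, the comparison can be sketched in Python. This is only a minimal illustration: the helper names `parse_version` and `lacks_axis_bgcolor` are made up for this example, and the parser assumes plain dotted numeric versions such as 2.1.2 or 2.2.5.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.2.5' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def lacks_axis_bgcolor(matplotlib_version):
    """Return True for matplotlib 2.2 and later, where axis_bgcolor is gone."""
    # Only major.minor matters for this cutoff, so compare the first two parts.
    return parse_version(matplotlib_version)[:2] >= (2, 2)


# The two versions compared in this thread.
for version in ("2.1.2", "2.2.5"):
    status = "2.2.0 or later" if lacks_axis_bgcolor(version) else "below 2.2.0"
    print("{}: {}".format(version, status))
```

Comparing integer tuples rather than raw strings avoids the trap where "2.10" would sort before "2.2" alphabetically.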

Hope this fixes the problem. Good luck!


@sbslee

Thank you so much for everything you have done! The problem is fixed and I finally got the two PDF files :slight_smile:
Now I have an additional (and I hope final) question: how are we supposed to interpret the results when we get a __ instead of a bacterial name? I had a look at your PDFs and you also get that kind of result (compared with my PDFs, yours are much more complete and longer; I have a really small number of samples). I suppose those are sequences that could not be mapped or classified to any bacteria, aren't they? If so, how should we interpret them? Should we just ignore them?

I have attached my PDFs below, so you can have a look at them :smiley:

Thank you so much again!

Hi @MiriamGorostidi,

That's an excellent question! For a general explanation of how to interpret things like __ and g__, I will link this earlier but clear answer from the forum: Follow-up on "'unique' taxonomy strings that seem to be shared" - #2 by Nicholas_Bokulich

The distinction is that the first row (ending in ; ) cannot be confidently classified beyond family level (probably because a close match does not exist in the reference database). So sequences receiving that classification can be any taxon in f__Geodermatophilaceae. The second row (ending in g__;s__) DOES have a close match in the reference database and hence is confidently classified at species level — unfortunately, that close match does not have genus or species-level annotations. This does not in any way imply that these two different taxonomic affiliations are related beyond the family level, so it would probably be inappropriate (or at least presumptuous) to collapse these at species level.
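
The cases described in the quote can be told apart mechanically from the last label of a taxonomy string. Here is a small Python sketch, assuming Greengenes-style rank prefixes (f__, g__, s__); the helper name `describe_label` and the three category names are made up for this illustration.

```python
def describe_label(label):
    """Classify one rank label taken from a taxonomy string.

    'unclassified': bare '__' or empty, the classifier stopped above this rank
    'unannotated':  prefix only (e.g. 'g__'), a reference match exists but
                    carries no name at this rank
    'named':        prefix plus name, e.g. 'f__Geodermatophilaceae'
    """
    label = label.strip()
    if label in ("", "__"):
        return "unclassified"
    prefix, sep, name = label.partition("__")
    if sep and not name:
        return "unannotated"
    return "named"


for label in ("f__Geodermatophilaceae", "g__", "__"):
    print("{!r}: {}".format(label, describe_label(label)))
```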

Now that we are (hopefully) on the same page with what those double underscores mean, what I ended up doing was updating my dokdo package to handle the double underscores better. For example, instead of outputting k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Intrasporangiaceae|f__Intrasporangiaceae|__, the 1.6.0-dev version now outputs k__Bacteria|p__Actinobacteria|c__Actinobacteria|o__Actinomycetales|f__Intrasporangiaceae|f__Intrasporangiaceae_x__L6. This way, if this particular taxon is returned as significant by LEfSe, it will show up as f__Intrasporangiaceae_x__L6 instead of __. For more examples, scroll up and click the files linked to my original tutorial; I updated those files. I will link one PDF file here as an example. output.pdf - Google Drive
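
The renaming described above can be approximated in a few lines of Python. This is not Dokdo's actual code, only a sketch that reproduces the one example given, assuming a six-level input string that ends in a bare __: the bare __ at level N is replaced by the last classified label plus _x__LN. The function name `relabel_unclassified` and the handling of several consecutive __ levels are assumptions.

```python
def relabel_unclassified(taxon, sep="|"):
    """Replace bare '__' ranks with '<last classified label>_x__L<level>'."""
    last_classified = None
    relabeled = []
    for level, rank in enumerate(taxon.split(sep), start=1):
        if rank == "__" and last_classified is not None:
            # Assumption: every unclassified level reuses the deepest
            # classified label seen so far, tagged with its own level number.
            relabeled.append("{}_x__L{}".format(last_classified, level))
        else:
            if rank != "__":
                last_classified = rank
            relabeled.append(rank)
    return sep.join(relabeled)


before = ("k__Bacteria|p__Actinobacteria|c__Actinobacteria|"
          "o__Actinomycetales|f__Intrasporangiaceae|__")
print(relabel_unclassified(before))
```

With a label like this, a significant feature shows up in the LEfSe plots under a name that still says which family it belongs to, instead of an anonymous __.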

Finally, if you wish to re-run your analysis with the 1.6.0-dev version of Dokdo, make sure to re-install Dokdo:

$ pip uninstall dokdo
$ git clone https://github.com/sbslee/dokdo
$ cd dokdo
$ git checkout 1.6.0-dev
$ pip install .

Hope this helps and let me know if you have any questions!


Thank you @sbslee !!
Really helpful and clear to understand :slight_smile:

Hi @sbslee,
I am trying to use LEfSe following the commands you provided here.
While preparing the files I got the error below (see the last line):
dokdo prepare-lefse \
  -t table.qza \
  -x taxonomy.qza \
  -m metadata.tsv \
  -o output/Useful-Information/input_table.tsv \
  -c health-state \
  -u age \
  -w "[health-state] IN ('healthy', 'patient')"

Traceback (most recent call last):
  File "/usr/local/bin/dokdo", line 5, in <module>
    from dokdo.__main__ import main
  File "/usr/local/lib/python3.8/dist-packages/dokdo/__init__.py", line 1, in <module>
    from .api import *
  File "/usr/local/lib/python3.8/dist-packages/dokdo/api/__init__.py", line 1, in <module>
    from .get_mf import get_mf
  File "/usr/local/lib/python3.8/dist-packages/dokdo/api/get_mf.py", line 1, in <module>
    from qiime2 import Metadata
ModuleNotFoundError: No module named 'qiime2'

I am using QIIME 2 2021.4 and Python 3.8.
Any advice?

Thank you

@Jalalalzanin,

It looks like you ran the command ($ dokdo prepare-lefse ...) in an environment where QIIME 2 is not installed. Did you make sure to activate your qiime2-2021.4 environment with $ conda activate before running the command?
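
One quick way to check this from inside Python is to ask the interpreter where it lives and whether it can locate the qiime2 module. Here is a minimal sketch using only the standard library; the helper name `module_available` is made up for this example.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the named module."""
    return importlib.util.find_spec(name) is not None


# sys.executable shows which Python is actually running; inside an activated
# conda environment it normally points into that environment's directory.
print("Interpreter:", sys.executable)
print("qiime2 importable:", module_available("qiime2"))
```

If the second line prints False even though the environment is activated, the command is probably being run by a different Python installation than the one QIIME 2 was installed into.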

Thank you for the prompt reply.
Yes, I am sure that QIIME 2 was installed and activated.

Thanks for confirming that. The error is still puzzling to me because it indicates that Dokdo isn't able to find QIIME 2 in the current environment. If you don't mind, could you show me the results of

  1. $ qiime info
  2. $ dokdo -v

in the command line?

System versions
Python version: 3.8.8
QIIME 2 release: 2021.4
QIIME 2 version: 2021.4.0
q2cli version: 2021.4.0

Installed plugins
alignment: 2021.4.0
composition: 2021.4.0
cutadapt: 2021.4.0
dada2: 2021.4.0
deblur: 2021.4.0
demux: 2021.4.0
diversity: 2021.4.0
diversity-lib: 2021.4.0
emperor: 2021.4.0
feature-classifier: 2021.4.0
feature-table: 2021.4.0
fragment-insertion: 2021.4.0
gneiss: 2021.4.0
longitudinal: 2021.4.0
metadata: 2021.4.0
phylogeny: 2021.4.0
quality-control: 2021.4.0
quality-filter: 2021.4.0
sample-classifier: 2021.4.0
taxa: 2021.4.0
types: 2021.4.0
vsearch: 2021.4.0

The second command, dokdo -v, gave me the same error as before, even though Dokdo is installed. This is the output after running dokdo -v:

Traceback (most recent call last):
  File "/usr/local/bin/dokdo", line 5, in <module>
    from dokdo.__main__ import main
  File "/usr/local/lib/python3.8/dist-packages/dokdo/__init__.py", line 1, in <module>
    from .api import *
  File "/usr/local/lib/python3.8/dist-packages/dokdo/api/__init__.py", line 1, in <module>
    from .get_mf import get_mf
  File "/usr/local/lib/python3.8/dist-packages/dokdo/api/get_mf.py", line 1, in <module>
    from qiime2 import Metadata
ModuleNotFoundError: No module named 'qiime2'

@Jalalalzanin,

I found the problem! You are currently using version 1.0.0 of Dokdo, which is badly outdated and doesn't even have the prepare-lefse command. Please re-install the latest version, 1.10.0:

$ git clone https://github.com/sbslee/dokdo
$ cd dokdo
$ pip install .
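
To confirm which Dokdo version is actually installed before and after re-installing, the package metadata can be queried from Python 3.8 or later. This is a minimal sketch; `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name dokdo.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("dokdo:", installed_version("dokdo"))
```

Run it with the same interpreter that runs the dokdo command, since each Python installation keeps its own set of packages.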

Please let me know if this doesn't solve the issue.

It is working now.
I will go through the remaining commands for the LEfSe analysis and let you know whether it completes correctly.

Thank you
