ITSxpress 2.0 in QIIME 2 2023.2 with EMP data

Dear colleagues,

I have attempted to apply the new version of ITSxpress, 2.0 (https://github.com/USDA-ARS-GBRU/itsxpress-tutorial/blob/master/ITSxpress-tutorial.md#updates-to-itsxpress), in QIIME 2 2023.2, as specified in the protocol. It seems to function correctly with different combinations of primers from different projects, for both the ITS1 and ITS2 regions. However, I am not obtaining results with projects sequenced with the EMP protocol (https://earthmicrobiome.org/protocols-and-standards/its/). All the samples I have tried show 0 reads after ITSx; I have checked different combinations of samples (including mocks) and options (R1+R2, only R1; ITS1, ITS2, or ALL, etc.).

I suspect that this is due to the Illumina ITS-EMP protocol, in which the forward primer does not appear in the sequence. This, I guess, would prevent the ITSx algorithm from "fishing" the region (ITS1 in this case). I would like to confirm whether this is conceptually correct, or otherwise investigate further.

Thank you very much for your time, and for sharing so much knowledge,

Jose Morillo

Hello @Adam_Rivers, checking to see if you have any input on this topic. No one here on the core qiime team is very familiar with this sort of data or this plugin. Thanks!

We can take a look. Can you run your command in --verbose mode and share the command and outputs? @seinarsson do you have input?

I have the same problem. My FASTQ.gz files are, in theory, already free of adapters and indexes, and I don't know whether that affects q2-itsxpress. I also did the filtering without ITSxpress, but the result was 9 representative sequences, without truncating the sequences.

Hello! Thanks for offering your assistance with this. I am pasting here the output of 1) ITSxpress using only the R1 sequences (this was my intention, due to the poor quality of R2; however, it results in an error and doesn't produce any output); and 2) ITSxpress paired, which does produce output but with 0 sequences. In this example, I've taken two samples from the project with EMP primers; cutadapt works fine with the same samples. Thanks!

Trim single

qiime itsxpress trim-single \
  --i-per-sample-sequences demux-single-end.qza \
  --p-region ITS1 \
  --p-taxa F \
  --p-cluster-id 0.995 \
  --p-threads 18 \
  --o-trimmed trimmed.qza \
  --verbose

vsearch v2.22.1_linux_x86_64, 125.6GB RAM, 20 cores
https://github.com/torognes/vsearch

Reading file /tmp/itsxpress_2qy_zeka/seq.fq.gz 100%
0 nt in 0 seqs
Masking 100%
Sorting by abundance 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 0
Singletons: 0

ERROR:root:Could not perform ITS identification with hmmserach. The error was:

Error: Sequence file /tmp/itsxpress_2qy_zeka/rep.fa is empty or misformatted

Traceback (most recent call last):
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/main.py", line 564, in _search
p4.check_returncode()
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/subprocess.py", line 448, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['hmmsearch', '--domtblout', '/tmp/itsxpress_2qy_zeka/domtbl.txt', '-T', '10', '--cpu', '18', '--tformat', 'fasta', '--F1', '1e-6', '--F2', '1e-6', '--F3', '1e-6', '/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/ITSx_db/HMMs/F.hmm', '/tmp/itsxpress_2qy_zeka/rep.fa']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2cli/commands.py", line 352, in __call__
results = action(**arguments)
File "", line 2, in trim_single
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
outputs = self.callable_executor(scope, callable_args,
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in callable_executor
output_views = self._callable(**view_args)
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2_itsxpress/_itsxpress.py", line 116, in trim_single
results = main(per_sample_sequences=per_sample_sequences,
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/q2_itsxpress/_itsxpress.py", line 212, in main
sobj._search(hmmfile=hmmfile, threads=threads)
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/main.py", line 567, in _search
raise e
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/main.py", line 564, in _search
p4.check_returncode()
File "/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/subprocess.py", line 448, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['hmmsearch', '--domtblout', '/tmp/itsxpress_2qy_zeka/domtbl.txt', '-T', '10', '--cpu', '18', '--tformat', 'fasta', '--F1', '1e-6', '--F2', '1e-6', '--F3', '1e-6', '/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/ITSx_db/HMMs/F.hmm', '/tmp/itsxpress_2qy_zeka/rep.fa']' returned non-zero exit status 1.

Plugin error from itsxpress:

Command '['hmmsearch', '--domtblout', '/tmp/itsxpress_2qy_zeka/domtbl.txt', '-T', '10', '--cpu', '18', '--tformat', 'fasta', '--F1', '1e-6', '--F2', '1e-6', '--F3', '1e-6', '/home/jmorillo/miniconda3/envs/qiime2-2023.2/lib/python3.8/site-packages/itsxpress/ITSx_db/HMMs/F.hmm', '/tmp/itsxpress_2qy_zeka/rep.fa']' returned non-zero exit status 1.

See above for debug info.

Trim paired

qiime itsxpress trim-pair-output-unmerged \
  --i-per-sample-sequences demux-paired-end.qza \
  --p-region ITS1 \
  --p-taxa F \
  --p-cluster-id 0.995 \
  --p-threads 18 \
  --o-trimmed ITSx_trimmed_paired.qza \
  --verbose

vsearch v2.22.1_linux_x86_64, 125.6GB RAM, 20 cores

Reading file /tmp/itsxpress_7n1y6ppy/seq.fq.gz 100%
575769 nt in 2515 seqs, min 134, max 506, avg 229
Masking 100%
Sorting by abundance 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 425 Size min 1, max 299, avg 5.9
Singletons: 282, 11.2% of seqs, 66.4% of clusters

vsearch v2.22.1_linux_x86_64, 125.6GB RAM, 20 cores

Reading file /tmp/itsxpress_qmf2onwa/seq.fq.gz 100%
720393 nt in 3614 seqs, min 92, max 444, avg 199
Masking 100%
Sorting by abundance 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 623 Size min 1, max 632, avg 5.8
Singletons: 464, 12.8% of seqs, 74.5% of clusters

Thanks so much. Do you have a link to the EMP files that cause this issue, so that I can reproduce it? @seinarsson should be back soon and can probably figure this out for us. Removing the adapters should not cause errors: primer sets usually sit pretty deep within conserved regions, and matching is based on similarity to the conserved anchor regions, not the primers. The error in trim-single occurs before clustering and hmmsearch; we'll look into it.
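
The trim-single log above reports "0 nt in 0 seqs", so one thing worth ruling out is that the reads reaching ITSxpress are really empty. The following is a minimal shell sketch, not part of ITSxpress or QIIME 2: the helper names count_reads and report_reads are made up for illustration, and it assumes standard four-line FASTQ records in gzipped files (for example the directory passed to qiime tools import, or the output of qiime tools export).

```shell
# Count the reads in one gzipped FASTQ file.
# Assumes four lines per record (no wrapped sequence lines).
count_reads() {
    local lines
    lines=$(gzip -cd "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Print "file<TAB>reads" for every *.fastq.gz file in a directory.
report_reads() {
    local dir="$1" f
    for f in "$dir"/*.fastq.gz; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        printf '%s\t%s\n' "$(basename "$f")" "$(count_reads "$f")"
    done
}

# Possible usage (directory name taken from the import command in this thread):
#   report_reads raw_data_R1
```

A count of 0 for any file would point to the input itself rather than to the primers.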

Thank you very much. Your comments about the primers are very interesting - I thought they were the problem. You should be able to access the files using this link. The two samples come from soil (the project has 150 samples). When I use a pipeline without ITSx, I obtain a 'normal' fungal composition. However, I'm interested in testing if ITSx improves the taxonomic resolution, and reduces the percentage of 'unknown' fungi (those ASVs not classified at the phylum level for example).

Hi @ja.morillo, can you share with me the manifest file (for trim-single) you used, in the Dropbox folder you've already shared? Can you also share the commands you used to import the data into QIIME 2?

I'm having some trouble recreating the problem.

Hello,
Thanks, sure. Here it is.
No manifest file is needed. The import works well if I use cutadapt or another QIIME 2 command instead, so I guess that the import part is OK. You just need to create the directory raw_data_R1 containing only the two R1 files.

conda activate qiime2-2023.2

  • import:

qiime tools import \
  --type 'SampleData[SequencesWithQuality]' \
  --input-path raw_data_R1 \
  --input-format CasavaOneEightSingleLanePerSampleDirFmt \
  --output-path demux-single-end.qza

qiime demux summarize \
  --i-data demux-single-end.qza \
  --o-visualization demux-single-end.qzv

qiime tools view demux-single-end.qzv

  • ITSx, trim-single

qiime itsxpress trim-single \
  --i-per-sample-sequences demux-single-end.qza \
  --p-region ITS1 \
  --p-taxa F \
  --p-cluster-id 0.995 \
  --p-threads 18 \
  --o-trimmed trimmed.qza \
  --verbose

I'm still having trouble replicating this issue. Did you try running this on just the two R1 files you shared and did you get the same error? Can you share the demux-single-end.qza file (of the whole dataset) in the dropbox folder?

I've run the commands as you did and the trimming seems to work. I'm wondering if there is a specific file that is causing an error, in which case I need to modify ITSxpress to catch the error so that the rest of the files can be trimmed.

Finally, there have been a few issues with dependencies before. Can you do:

conda list > env_list.txt

and share the file? I'd like to rule out any issue with another package.

This is what I did:

qiime tools import --type 'SampleData[SequencesWithQuality]' --input-path R1/ --input-format CasavaOneEightSingleLanePerSampleDirFmt --output-path demux-single-end.qza

qiime itsxpress trim-single --i-per-sample-sequences demux-single-end.qza --p-region ITS1 --p-taxa F --p-cluster-id 0.995 --p-threads 16 --o-trimmed trimmed_forum.qza --verbose

qiime demux summarize --o-visualization trimmed_forum.qzv --i-data trimmed_forum.qza

trimmed_forum.qzv (297.3 KB)

@ja.morillo Actually, judging from the traceback, it looks like you are using ITSxpress 1.8.0 or 1.8.1. ITSxpress has been updated recently, and the package directory is no longer called q2_itsxpress. I can confirm this if you share the following file:

conda list > env_list.txt

You may be able to just reinstall itsxpress:

#Conda or mamba or micromamba
micromamba install -c bioconda itsxpress

However, I'd recommend you create a new environment and start fresh to make sure there aren't any issues:

wget https://data.qiime2.org/distro/core/qiime2-2023.5-py38-linux-conda.yml
micromamba env create -n qiime2_itsxpress2_0_0 --file qiime2-2023.5-py38-linux-conda.yml
micromamba activate qiime2_itsxpress2_0_0
micromamba install -c bioconda itsxpress

You can replace micromamba with conda/mamba.
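
To see which ITSxpress-related packages an environment actually contains, the saved conda list output can be filtered. This is only a sketch: list_matching is a hypothetical helper, and it assumes the usual conda list layout, with the package name in the first column and the version in the second.

```shell
# Print "name<TAB>version" for every package whose name contains a pattern.
# $1 = substring to look for, $2 = file written by `conda list > env_list.txt`
list_matching() {
    awk -v pat="$1" '$0 !~ /^#/ && index($1, pat) { print $1 "\t" $2 }' "$2"
}

# Possible usage:
#   conda list > env_list.txt
#   list_matching itsxpress env_list.txt
```

The version column should then show whether the environment holds a 1.8.x release or 2.0.0.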

OK, thanks again! First, I tried installing the new version of ITSxpress, as the old version is likely the issue (even though ITSxpress 1.8.0 works well for the other primer combinations). But I'm having an installation issue. I followed the previous steps and got the following:

conda update conda

conda activate qiime2-2023.5 # fresh install

conda install -c bioconda itsxpress # yes to all updates

qiime dev refresh-cache # I added this, but it does not help

qiime itsxpress

Error: QIIME 2 has no plugin/command named 'itsxpress'.

conda list > env_list.txt

For some reason I don't understand, if I do the same with the previous version of QIIME 2 (qiime2-2023.2, the one I was using), ITSxpress 1.8 is installed and it works as I explained before. With qiime2-2023.5, ITSxpress 1.8.0 is also installed (instead of 2.0, which I expected), but it doesn't work.
I guess it is something related to the conda version or channels.
I'm using conda instead of mamba (it is a shared computer); I guess that's not the problem. I've tried on two computers with recent versions of Ubuntu, with the same result. Thanks!!
env_list.txt (50.6 KB)

I'm wondering if there is a version of ITSxpress in the "base" environment that may be causing an issue. ITSxpress 1.8.0 probably works with newer QIIME 2 installs, but if you choose cluster-id=1.0, you'll likely run into an issue due to a breaking change in new versions of vsearch. I'd prefer that we get 2.0.0 installed on your end.

Can you try this in your environment?

conda install -c bioconda itsxpress=2.0.0

Hello again,
I think we are getting close to what is causing the issue. I can't install ITSxpress 2.0 in qiime2-2023.5 (in a new, unused, fresh env):

conda activate qiime2-2023.5
conda install -c bioconda itsxpress=2.0.0

Solving environment: -
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

The following specifications were found to be incompatible with your system:

  • feature:/linux-64::__glibc==2.29=0
  • feature:|@/linux-64::__glibc==2.29=0

Your installed version is: 2.29

I tried this on two different computers, one of them with a fresh Ubuntu installation, but the error was the same. So I guess it is something related to the QIIME 2 env, but to be honest I am not sure.

Just in case, my Linux system is:
(base) jamorillo@jamorillo-DNA-decoding:~$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.2 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Hello,
Has anyone managed to install ITSxpress (2.0) in the latest version of QIIME 2 (2023.5)?
I'm starting a new project and would like to use it if that can be arranged.
Thanks a lot!

@seinarsson, @Adam_Rivers - it sounds like ITSxpress can't be used in the most recent version of QIIME 2 (but I haven't tried this myself yet). We have a new release coming out later this week - that will be 2023.7. Would you like any help getting ITSxpress to work with the most recent version of QIIME 2, once the new release comes out? Just let us know what we can do.

I've tried reproducing this error but I haven't been able to. I'm able to install QIIME 2 2023.5 (and versions back to at least 2022.8) with ITSxpress 2.0.0 on my personal computer (Ubuntu 20.04), on an HPC, and using GitHub Actions to test on the latest Ubuntu.

@gregcaporaso/@Adam_Rivers do either of you have issues installing ITSxpress with the most recent QIIME 2?

@ja.morillo one idea I have for you is to install micromamba or mamba to use instead of conda, since conda tends to give you unclear error messages. Hopefully we can try and isolate what is causing the conflict that way.

https://mamba.readthedocs.io/en/latest/micromamba-installation.html#umamba-install

Another idea comes from a Stack Overflow thread ("python - Conda UnsatisfiableError with linux64 and glibc"): you may have locked conda into a specific version of Python, but conda won't give you this information, since it has poor error messages. So if you can install one of the mambas, hopefully it will either resolve the conflict or give a better explanation of what is causing it.

Another thought I had is you may have to set the order of your conda channels? You can try this and then do the installation of Qiime and ITSxpress:

conda config --prepend channels bioconda
conda config --prepend channels conda-forge

Maybe there are some legacy packages that are causing an issue.

Thanks a lot for the ideas and your time, and for implementing ITSxpress. I will try. As an alternative, I was also considering running ITSx outside of qiime2 and then importing the sequences into qiime2, in case I can't run it within the q-environment.
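
If the standalone route is taken, the trimmed FASTQ files can be brought back into QIIME 2 through a manifest. A rough sketch follows, under stated assumptions: make_manifest is a hypothetical helper, the directory name trimmed_reads and the sample file names are invented, sample IDs are taken as the file name up to the first underscore, and the commented itsxpress options should be checked against itsxpress --help for the installed version.

```shell
# Standalone trimming of one sample (comment only; verify the flags with `itsxpress --help`):
#   itsxpress --fastq sampleA_R1.fastq.gz --single_end --region ITS1 --taxa Fungi \
#       --outfile trimmed_reads/sampleA_trimmed.fastq.gz --threads 4

# Write a single-end QIIME 2 manifest (tab-separated, V2 layout) to stdout
# for every *.fastq.gz file in the given directory.
make_manifest() {
    local dir f name
    dir=$(cd "$1" && pwd) || return 1      # manifest needs absolute paths
    printf 'sample-id\tabsolute-filepath\n'
    for f in "$dir"/*.fastq.gz; do
        [ -e "$f" ] || continue            # skip if the glob matched nothing
        name=$(basename "$f")
        printf '%s\t%s\n' "${name%%_*}" "$f"   # sample ID = text before first "_"
    done
}

# Possible usage:
#   make_manifest trimmed_reads > manifest.tsv
#   qiime tools import --type 'SampleData[SequencesWithQuality]' \
#       --input-path manifest.tsv --input-format SingleEndFastqManifestPhred33V2 \
#       --output-path trimmed.qza
```

Deriving sample IDs from file names is a convenience here; a hand-written manifest works just as well if the naming scheme differs.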

This topic was automatically closed 31 days after the last reply. New replies are no longer allowed.