Error running picrust2_pipeline.py

Hello,
I ran the picrust2_pipeline.py script on 2 different datasets. On the first dataset (FASTA + BIOM) the script worked and generated the picrust2_out_pipeline folder.
But when I tested it on my second dataset, I got this error:

**Error running this command:
place_seqs.py --study_fasta home/input_dataset/ASV-sequences_exp2.fasta --ref_dir /opt/conda/envs/picrust2/lib/python3.8/site-packages/picrust2/default_files/prokaryotic/pro_ref --out_tree picrust2_out_pipeline/out.tre --processes 4 --intermediate picrust2_out_pipeline/intermediate/place_seqs --min_align 0.8 --chunk_size 5000 --placement_tool epa-ng

Standard error of the above failed command:
Stopping - all 1633 input sequences aligned poorly to reference sequences (--min_align option specified a minimum proportion of 0.8 aligning to reference sequences).**

What should I do in this case? How can I solve the problem, please? (My input files were generated using the QIIME 2 pipeline: the sequencing type is single-end, and the read length kept after the PHRED quality check is 250.)

Hello @safe_assli,

Welcome to the forums! :qiime2:

I'm glad the first data set worked well for you. Looks like the second data set was too different from the database.

This is a common issue with PICRUSt2; we are relying on an inference of an inference, so database matching matters!

What biological environment do your two data sets come from?
Is it possible that an upstream issue is causing poor alignment, like primers sneaking into your second dataset?
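If you want a quick way to test the primer idea, here is a minimal sketch that counts how many ASVs still begin with a primer. The function names are made up for illustration, and the primer you pass in is an assumption: substitute the one actually used for your amplicons. It allows IUPAC degenerate codes and a small number of mismatches.

```python
# Sketch: count how many ASVs still begin with a primer sequence.
# All names here are hypothetical helpers, not part of QIIME 2 or PICRUSt2.

# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def starts_with_primer(seq, primer, max_mismatches=1):
    """True if seq begins with primer, allowing IUPAC codes and a few mismatches."""
    if len(seq) < len(primer):
        return False
    mismatches = sum(
        base not in IUPAC.get(code, "")
        for base, code in zip(seq.upper(), primer.upper())
    )
    return mismatches <= max_mismatches


def count_primer_hits(records, primer, max_mismatches=1):
    """Return (hits, total) over an iterable of (header, sequence) pairs."""
    hits = total = 0
    for _, seq in records:
        total += 1
        hits += starts_with_primer(seq, primer, max_mismatches)
    return hits, total
```

You could call it as `count_primer_hits(read_fasta("ASV-sequences_exp2.fasta"), "GTGYCAGCMGCCGCGGTAA")`, where the primer string is only an example. If most ASVs are hits, primer trimming upstream would be worth a look.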


Thank you for replying.
They both come from animal microbiomes.
The only difference is that for my first dataset the read length kept after the PHRED quality check is 400.

I have another question: could processing the data as single-end instead of paired-end, along with a read length of 250, be a potential cause of this issue?

Yes, differences in amplicon size will affect coverage, which in turn will change what matches the database and the quality of the PICRUSt2 inference.

To help me understand the problem, can you tell me more about your data set? What region, what sequencing technology, etc.?

Let's look for clues! :mag_right:
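One clue that is easy to collect is the length distribution of the ASVs in each data set. Below is a rough sketch for summarizing it from a FASTA file; the helper names are hypothetical, and it assumes a plain nucleotide FASTA like the one passed to `--study_fasta`.

```python
# Sketch: summarize sequence lengths in a FASTA file.
# Helper names are hypothetical; this is not part of QIIME 2 or PICRUSt2.
from collections import Counter
from statistics import median


def parse_fasta_lengths(lines):
    """Return a list of sequence lengths from an iterable of FASTA lines."""
    lengths, current = [], None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize_lengths(lengths):
    """Summarize a non-empty list of sequence lengths."""
    counts = Counter(lengths)
    return {
        "n": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
        "most_common": counts.most_common(1)[0],  # (length, count)
    }
```

For example, `with open("ASV-sequences_exp2.fasta") as fh: print(summarize_lengths(parse_fasta_lengths(fh)))`. Running it on both FASTA files and comparing the summaries should show whether the second data set's sequences are much shorter than the first's.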


Problem solved: it turned out that the data were paired-end, so when I ran the QIIME 2 pipeline on both the forward and reverse reads, the PICRUSt2 script worked.

