Fungal ITS tree based on ghost-tree


I want to build a fungal ITS tree with ghost-tree, but I can't follow the author's tutorial very well. Below are the steps I wrote myself; I'm unsure about some of them. Could someone correct them for me? Any help is greatly appreciated:

  1. Build the ghost-tree myself, because I don't know how to choose among the 0.80, 0.90, and 1.00 pre-built ghost-trees. Can I build a 0.99 ghost-tree?

qiime tools import \
--input-path SILVA_132_SSURef_Nr99_tax_silva_full_align_trunc.fasta \
--type FeatureData[AlignedSequence] --input-format AlignedRNAFASTAFormat \
--output-path SILVA_132_SSURef_Nr99_tax_silva_full_align_trunc.qza

Silva Taxonomy File

qiime tools import \
--input-path tax_slv_ssu_132.txt \
--type SilvaTaxonomy \
--output-path tax_slv_ssu_132.qza \
--input-format SilvaTaxonomyFormat

Silva Accession ID Map

qiime tools import \
--input-path tax_slv_ssu_132.acc_taxid \
--type SilvaAccession \
--output-path tax_slv_ssu_132.acc_taxid.qza \
--input-format SilvaAccessionFormat

Extract fungi

qiime ghost-tree extract-fungi \
--i-aligned-silva-file SILVA_132_SSURef_Nr99_tax_silva_full_align_trunc.qza \
--i-accession-file tax_slv_ssu_132.acc_taxid.qza \
--i-taxonomy-file tax_slv_ssu_132.qza \
--o-aligned-seqs silva_fungi_only_full_aligned_132.qza

### Filter alignment positions. I don't understand how the values of these two parameters (0.9 and 0.8) should be chosen.
qiime ghost-tree filter-alignment-positions \
--i-aligned-sequences-file silva_fungi_only_full_aligned_132.qza \
--p-maximum-gap-frequency 0.9 \
--p-maximum-position-entropy 0.8 \
--o-aligned-seqs silva_fungi_only_full_aligned_132_FILTERED.qza
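
As a rough illustration of what a maximum gap frequency cutoff means, the sketch below computes the fraction of gap characters in each column of a tiny made-up alignment and lists the columns that a 0.9 cutoff would keep. This is not ghost-tree's implementation: the file name `toy_aln.fasta`, the sequences, and the exact rule (keep a column when its gap fraction is at or below the cutoff) are all assumptions for illustration, and the entropy filter is not covered.

```shell
# Toy aligned FASTA (hypothetical data, one line per sequence).
cat > toy_aln.fasta <<'EOF'
>s1
AC-G
>s2
A--G
>s3
AT-G
>s4
G--C
EOF

# For each column, count '-' and '.' as gaps, then print "column:gap_fraction"
# for every column whose gap fraction is at or below the cutoff (assumed rule).
kept_columns=$(awk -v max_gap=0.9 '
  /^>/ { next }                       # skip FASTA header lines
  {
    n++                               # number of sequences read so far
    len = length($0)                  # alignment width
    for (i = 1; i <= len; i++) {
      c = substr($0, i, 1)
      if (c == "-" || c == ".") gaps[i]++
    }
  }
  END {
    for (i = 1; i <= len; i++) {
      frac = gaps[i] / n
      if (frac <= max_gap) printf "%d:%.2f\n", i, frac
    }
  }
' toy_aln.fasta)

echo "$kept_columns"
```

In this toy alignment, column 3 consists only of gaps, so it is the one column dropped at a 0.9 cutoff. A lower cutoff would remove more columns, so the value trades alignment length against the amount of missing data per position.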

ghost-tree extensions group-extensions \
'sh_refs_qiime_ver8_97_10.05.2021.fasta' 0.97 \

ghost-tree scaffold hybrid-tree-foundation-alignment \
'otu_map_97_qiime_ver8_97_10.05.2021.txt' \
'sh_taxonomy_qiime_ver8_97_10.05.2021.txt' \
'sh_refs_qiime_ver8_97_10.05.2021.fasta' \
'silva_fungi_only_full_aligned_132_FILTERED.fasta' \

### So I got my own pre-built ghost-tree: ghost_tree_97_qiime_ver8_97_10.05.2021 (ghost_tree.nwk and ghost_tree_extension_accession_ids.txt)

#### 2. The DADA2 results were then reclustered against the reference:
time qiime dada2 denoise-paired \
  --i-demultiplexed-seqs paired-end-demux-trimmed.qza \
  --p-trim-left-f 0 --p-trim-left-r 0  --p-trunc-len-f 0 --p-trunc-len-r 0 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza

qiime vsearch cluster-features-closed-reference \
  --i-table table.qza \
  --i-sequences rep-seqs.qza \
  --i-reference-sequences sh_refs_qiime_ver8_97_10.05.2021.qza \
  --p-perc-identity 0.97 \
  --o-clustered-table table-cr-97.qza \
  --o-clustered-sequences rep-seqs-cr-97.qza \
  --o-unmatched-sequences unmatched-cr-97.qza

#### Finally, do I only need to filter the 0.97 pre-built ghost-tree to match the IDs in my table-cr-97.qza file? Is the above process correct? What changes should I make? Thank you very much.
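
One way to sanity-check the last step before any filtering is to compare the feature IDs in the clustered table with the tip IDs in the tree. The sketch below does this with plain shell tools on two toy ID lists. The file names `table_ids.txt` and `tree_ids.txt`, the IDs themselves, and the one-ID-per-line layout are assumptions for illustration; real lists would first have to be exported from table-cr-97.qza and from the ghost-tree output.

```shell
# Hypothetical ID lists, one ID per line (toy data for illustration).
printf 'SH003\nSH001\nSH007\n' > table_ids.txt        # feature IDs from the clustered table
printf 'SH001\nSH002\nSH003\nSH004\n' > tree_ids.txt  # tip IDs present in the tree

# comm requires sorted input.
sort -u table_ids.txt > table_ids.sorted.txt
sort -u tree_ids.txt > tree_ids.sorted.txt

# IDs found in both lists: these features have a matching tip in the tree.
shared=$(comm -12 table_ids.sorted.txt tree_ids.sorted.txt)

# IDs found only in the table: these features have no tip in the tree.
missing=$(comm -23 table_ids.sorted.txt tree_ids.sorted.txt)

echo "shared: $(echo "$shared" | tr '\n' ' ')"
echo "missing from tree: $(echo "$missing" | tr '\n' ' ')"
```

If the second list is non-empty, the table and the tree do not fully agree, and one of them has to be filtered before running phylogenetic diversity metrics. Within QIIME 2, the `qiime phylogeny` plugin provides a `filter-table` action for restricting a table to the tips of a tree; its help text for the installed version is the place to check the exact options.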

Hi Weibo! Sorry you are having issues using ghost-tree and sorry for the delay. Unfortunately I no longer have time to work on any ongoing issues with ghost-tree as I have moved on to new projects. Thank you for understanding.
