Large portion of Cyanobacteria

@SoilRotifer,

Could you please check whether my commands for merging the NCBI and SILVA rep-seqs and taxonomy are correct? In steps 5 and 6 I simply merge the corresponding sequence and taxonomy files from NCBI and SILVA.

# Step 1: download the NCBI reference sequences and taxonomy
qiime rescript get-ncbi-data \
--p-query '33175[BioProject] OR 33317[BioProject]' \
--o-sequences /user/asga9989/NCBI/ncbi-refseqs-unfiltered.qza \
--o-taxonomy /user/asga9989/NCBI/ncbi-refseqs-taxonomy-unfiltered.qza

# Step 2: filter sequences by minimum length per taxon
qiime rescript filter-seqs-length-by-taxon \
--i-sequences /user/asga9989/NCBI/ncbi-refseqs-unfiltered.qza \
--i-taxonomy /user/asga9989/NCBI/ncbi-refseqs-taxonomy-unfiltered.qza \
--p-labels Archaea Bacteria \
--p-min-lens 900 1200 \
--o-filtered-seqs /user/asga9989/NCBI/ncbi-refseqs.qza \
--o-discarded-seqs /user/asga9989/NCBI/ncbi-refseqs-tooshort.qza

# Step 3: keep only the taxonomy entries for the retained sequences
qiime rescript filter-taxa \
--i-taxonomy /user/asga9989/NCBI/ncbi-refseqs-taxonomy-unfiltered.qza \
--m-ids-to-keep-file /user/asga9989/NCBI/ncbi-refseqs.qza \
--o-filtered-taxonomy /user/asga9989/NCBI/ncbi-refseqs-taxonomy.qza

# Step 4: fit and evaluate a classifier on the NCBI reference alone
qiime rescript evaluate-fit-classifier \
--i-sequences /user/asga9989/NCBI/ncbi-refseqs.qza \
--i-taxonomy /user/asga9989/NCBI/ncbi-refseqs-taxonomy.qza \
--o-classifier /user/asga9989/NCBI/ncbi-refseqs-classifier.qza \
--o-evaluation /user/asga9989/NCBI/ncbi-refseqs-classifier-evaluation.qzv \
--o-observed-taxonomy /user/asga9989/NCBI/ncbi-refseqs-predicted-taxonomy.qza

# Step 5: merge the SILVA and NCBI sequences
qiime feature-table merge-seqs \
--i-data /user/asga9989/Taxonomy_output/training-feature-classifiers/silva-138-ssu-nr99-seqs-derep-uniq.qza \
--i-data /user/asga9989/NCBI/ncbi-refseqs.qza \
--o-merged-data /user/asga9989/NCBI/SILVA-NCBI-merged-rep-seqs.qza

# Step 6: merge the SILVA and NCBI taxonomy
qiime feature-table merge-taxa \
--i-data /user/asga9989/Taxonomy_output/training-feature-classifiers/silva-138-ssu-nr99-tax-derep-uniq.qza \
--i-data /user/asga9989/NCBI/ncbi-refseqs-taxonomy.qza \
--o-merged-data /user/asga9989/NCBI/SILVA-NCBI-merged-rep-taxonomy.qza

# Step 7: train a classifier on the merged reference
qiime feature-classifier fit-classifier-naive-bayes \
--i-reference-reads /user/asga9989/NCBI/SILVA-NCBI-merged-rep-seqs.qza \
--i-reference-taxonomy /user/asga9989/NCBI/SILVA-NCBI-merged-rep-taxonomy.qza \
--o-classifier /user/asga9989/NCBI/SILVA-NCBI-mergerd-classifier.qza

# Step 8: classify my representative sequences
qiime feature-classifier classify-sklearn \
--i-classifier /user/asga9989/NCBI/SILVA-NCBI-mergerd-classifier.qza \
--i-reads /user/asga9989/atra-rep-seqs.qza \
--o-classification /user/asga9989/NCBI/atra-vs-SILVA-NCBI-classifier-taxonomy.qza

# Step 9: tabulate the classification for viewing
qiime metadata tabulate \
--m-input-file /user/asga9989/NCBI/atra-vs-SILVA-NCBI-classifier-taxonomy.qza \
--o-visualization /user/asga9989/NCBI/atra-vs-SILVA-NCBI-classifier-taxonomy.qzv

Hi @Sabrin,

Everything looks fine to me. However, note that GenBank and SILVA use slightly different taxonomic nomenclature and different rank prefixes. Before merging the taxonomy, you'll likely want to run qiime rescript edit-taxonomy ... as I've described here:

-Mike

@SoilRotifer yes true, I got different prefixes in my final classification.
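
One quick way to see such a mismatch is to count the top-rank prefixes in the taxonomy. The sketch below assumes the taxonomy has already been exported to a tab-separated file (for example with qiime tools export); the feature IDs and taxonomy strings are made up for illustration and are not taken from this dataset.

```shell
#!/usr/bin/env bash
# Made-up stand-in for an exported taxonomy.tsv (Feature ID <TAB> Taxon).
tsv=$(mktemp)
printf 'Feature ID\tTaxon\n' > "$tsv"
printf 'silva1\td__Bacteria; p__Cyanobacteria\n' >> "$tsv"
printf 'silva2\td__Archaea; p__Crenarchaeota\n' >> "$tsv"
printf 'ncbi1\tk__Bacteria; p__Cyanobacteria\n' >> "$tsv"

# Skip the header, take the Taxon column, keep only the first rank,
# strip everything after the double underscore, then count each prefix.
prefix_counts=$(tail -n +2 "$tsv" | cut -f2 | cut -d';' -f1 \
  | sed 's/__.*/__/' | sort | uniq -c | awk '{print $2 "=" $1}')

echo "$prefix_counts"
rm -f "$tsv"
```

If more than one prefix shows up at the top rank (here both d__ and k__), the two sources have not been harmonized yet.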

First question: what if I want to keep the short sequences rather than filter them out? Can I just skip step 2 (qiime rescript filter-seqs-length-by-taxon) and step 3, and use ncbi-refseqs-unfiltered.qza and ncbi-refseqs-taxonomy-unfiltered.qza directly in step 4 (qiime rescript evaluate-fit-classifier)?

Second question: I am now trying to use qiime rescript edit-taxonomy to fix the prefix mismatch. Should I change the k__ prefix in the NCBI taxonomy file before merging it with the SILVA taxonomy file, like this?

qiime rescript edit-taxonomy \
--i-taxonomy ncbi-refseqs-taxonomy-unfiltered.qza \
--p-search-strings 'k__Bacteria' \
--p-replacement-strings 'd__Bacteria' \
--o-edited-taxonomy ncbi-refseqs-edited-taxonomy-unfiltered.qza

Finally, is there a simple tutorial I can follow to export my files into R formats for further visualization?

Thanks a lot for your continuous help.

You can do whatever you like to suit your needs. The tutorial simply offers a series of examples of how to process your data.

I'd be more generic and simply do the following (just in case you have some Archaea...):

qiime rescript edit-taxonomy \
    --i-taxonomy ncbi-refseqs-taxonomy-unfiltered.qza \
    --p-search-strings 'k__' \
    --p-replacement-strings 'd__' \
    --o-edited-taxonomy ncbi-refseqs-edited-taxonomy-unfiltered.qza
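
To illustrate why the generic search string is safer, here is a minimal sketch that uses sed on two made-up taxonomy strings. sed only stands in for a plain substring replacement here; it is not how RESCRIPt implements edit-taxonomy, and the taxonomy strings are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical taxonomy strings, made up for illustration only.
ncbi_taxa='k__Bacteria; p__Cyanobacteria
k__Archaea; p__Euryarchaeota'

# Narrow replacement: only the exact string 'k__Bacteria' is rewritten,
# so the Archaea line keeps its 'k__' prefix.
narrow=$(printf '%s\n' "$ncbi_taxa" | sed 's/k__Bacteria/d__Bacteria/')

# Generic replacement: the 'k__' prefix is rewritten on every line,
# whatever name follows it.
generic=$(printf '%s\n' "$ncbi_taxa" | sed 's/k__/d__/')

printf 'narrow:\n%s\n\ngeneric:\n%s\n' "$narrow" "$generic"
```

With the narrow pattern the Archaea entry is left with k__, so the merged taxonomy would still mix prefixes; the generic pattern converts both lines.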

Try to keep the questions on topic. Ideally, the R export question should be a separate post. But if you search this forum you'll be able to find many examples and discussions.
