Hello, me again!
I have a question about forward filling of taxonomic ranks, as explained in the RESCRIPt Tutorial.
Everything in this tutorial seemed to work well for me, and I've had a few goes at training an amplicon-specific classifier to use with my own small dataset.
I recently ran a small set of samples - DNA extracted from boar semen - with the aim of getting this pipeline working properly before moving on to human samples.
I used a mock community that I'd prepared myself, one that is relevant to the community I've found so far in semen. In this sequencing run in particular I see evidence of contaminants and index-hopping. My next step was to experiment with some tools outside of QIIME 2 to identify and quantify these, and as part of that I wanted to create a table of read counts in CSV format that I could then work into an unspread.py script.
I have followed the advice here about exporting my feature table and taxonomy as a tsv, but have not been successful at merging the taxonomic metadata to the biom-tsv file using the command below:
'biom add-metadata -i exported/feature-table.biom -o table-with-taxonomy.biom --observation-metadata-fp biom-taxonomy.tsv --sc-separated taxonomy'
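To make clear what I am ultimately after, here is a rough sketch of the merge done outside of biom, using only the Python standard library. It is an illustration rather than a tested workflow: the function name is hypothetical, and it assumes the feature table was converted to TSV with the usual '# Constructed from biom file' comment line and '#OTU ID' header, and that the taxonomy TSV has the feature ID in the first column and the taxon string in the second.

```python
import csv
import io


def merge_counts_with_taxonomy(table_tsv, taxonomy_tsv):
    """Return CSV text: one row per feature, sample counts plus a Taxon column.

    Assumed layouts (not verified against any particular export):
      table_tsv    -> optional '# Constructed ...' line, '#OTU ID' header, counts
      taxonomy_tsv -> header row, then feature ID and taxon in columns 1 and 2
    """
    # Build a lookup of feature ID -> taxonomy string.
    tax_rows = csv.reader(io.StringIO(taxonomy_tsv), delimiter="\t")
    next(tax_rows)  # skip the header row
    taxonomy = {row[0]: row[1] for row in tax_rows if len(row) >= 2}

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(table_tsv), delimiter="\t"):
        if not row or row[0].startswith("# Constructed"):
            continue  # skip blank lines and the biom comment line
        if row[0].startswith("#"):
            # Header line: keep the sample names, add a Taxon column.
            writer.writerow(["feature_id"] + row[1:] + ["Taxon"])
        else:
            # Features missing from the taxonomy file are labelled Unassigned.
            writer.writerow(row + [taxonomy.get(row[0], "Unassigned")])
    return out.getvalue()
```

Reading the two exported files into strings and passing them to this function should give one CSV row per feature, with the read counts followed by the taxon string.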
I have searched the forum and found posts from others about this issue, none of which seem to have been solved. I was trying to figure out why this step might not have worked, and was wondering if it is because I have some incomplete strings in my final taxonomic annotation file. E.g.:
'd__Bacteria;k__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;;'
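To illustrate what I mean by gaps, and what I expected forward filling to do with them, here is a small Python sketch. The two function names are my own, and the fill rule (copy the last named label into each blank field) is only my assumption of the idea, not RESCRIPt's actual implementation.

```python
def empty_rank_positions(taxon, sep=";"):
    """Return the 0-based positions of blank rank fields in a taxonomy string."""
    fields = [field.strip() for field in taxon.split(sep)]
    return [i for i, field in enumerate(fields) if field == ""]


def forward_fill(taxon, sep=";"):
    """Copy the last non-empty label into each blank field that follows it.

    This is a naive illustration: it reuses the previous label verbatim
    (including its rank prefix), which may differ from what RESCRIPt does.
    """
    filled = []
    last_label = ""
    for field in (f.strip() for f in taxon.split(sep)):
        if field:
            last_label = field
        filled.append(field if field else last_label)
    return sep.join(filled)


example = ("d__Bacteria;k__Bacteria;p__Proteobacteria;"
           "c__Gammaproteobacteria;o__Enterobacterales;;")
print(empty_rank_positions(example))  # positions of the two blank fields
print(forward_fill(example))
```

For the string above, the last two fields are blank, and the sketch fills both with 'o__Enterobacterales'.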
I hadn't really considered these gaps before, but now I went back to the RESCRIPt tutorial to try and figure out where I might have gone wrong.
I've gone through all the steps about 3 times now.
The steps I figured were most important, and which I've therefore played around with, are:
'qiime rescript parse-silva-taxonomy \
--i-taxonomy-tree taxtree-silva-138.1-nr99.qza \
--i-taxonomy-map taxmap-silva-138.1-ssu-nr99.qza \
--i-taxonomy-ranks taxranks-silva-138.1-ssu-nr99.qza \
--o-taxonomy silva-138.1-ssu-nr99-tax.qza'
Here I've added --p-rank-propagation TRUE (although I'm aware this is the default setting), and I've also tried this both with and without specifying ranks via the '--p-ranks' option.
The other command I've tried in various combinations is dereplicate, where I initially selected '--p-mode "super"' and have since tried '"uniq"' in case this was interfering with forward filling.
Ultimately, each time I make a visualization of the final taxonomy.qza I get the same blank ranks, with no forward filling.
I'm wondering if I am misunderstanding something here, or missing an important step, or I dunno - I am a bit of a novice at everything.
Anyway, I've run a lot of commands, so I have only included those I thought most relevant for now, but I will obviously post more if needed.
I am running qiime2-2023.2 in a conda environment, and I just want to say that both the quality of the tools you folk have created and the level of support here have been amazing. Thanks for everything.
Richard