Hello, me again!
I have a question about forward filling of taxonomic ranks, as explained in the RESCRIPt Tutorial.
Everything in this tutorial seemed to work well for me, and I've had a few goes at training an amplicon-specific classifier to use with my own small dataset.
I recently ran a small set of samples (DNA extracted from boar semen) with the aim of getting this pipeline working properly before moving on to human samples.
I used a mock community that I'd prepared myself, one that is relevant to the community I've found so far in semen. In this sequencing run in particular I see evidence of contaminants and index hopping. My next step was to experiment with some tools outside of QIIME 2 to identify and quantify these, and as part of that I wanted to create a table of read counts in CSV format that I could then work into an unspread.py script.
I have followed the advice here about exporting my feature table and taxonomy as TSV files, but I have not been successful at merging the taxonomic metadata into the exported BIOM file using the command below:
'biom add-metadata -i exported/feature-table.biom -o table-with-taxonomy.biom --observation-metadata-fp biom-taxonomy.tsv --sc-separated taxonomy'
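For reference, this is roughly the merged table I am trying to end up with, written as a minimal Python sketch rather than with biom. It assumes the feature table has already been converted to TSV (a "# Constructed from biom file" comment line followed by a "#OTU ID" header row) and that the exported taxonomy TSV has "Feature ID" and "Taxon" columns. The function name and the output column names are just ones I made up for illustration.

```python
import csv
import io


def merge_counts_with_taxonomy(table_tsv: str, taxonomy_tsv: str) -> str:
    """Return CSV text: feature ID, taxon, then one read-count column per sample."""
    # Map feature ID -> taxon string.
    # Assumption: the taxonomy TSV has columns named "Feature ID" and "Taxon".
    tax_reader = csv.DictReader(io.StringIO(taxonomy_tsv), delimiter="\t")
    taxa = {row["Feature ID"]: row["Taxon"] for row in tax_reader}

    # Drop blank lines and the "# Constructed from biom file" comment line,
    # but keep the "#OTU ID" header row (it has no space after the '#').
    lines = [
        ln for ln in table_tsv.splitlines()
        if ln.strip() and not ln.startswith("# ")
    ]
    rows = csv.reader(lines, delimiter="\t")
    header = next(rows)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["feature_id", "taxon"] + header[1:])
    for row in rows:
        # Features missing from the taxonomy file are labelled "Unassigned".
        writer.writerow([row[0], taxa.get(row[0], "Unassigned")] + row[1:])
    return out.getvalue()
```

In practice I would read the two files with open(...).read(), pass the text in, and write the returned string to a .csv file.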
I have searched the forum and found posts from others about this issue, none of which seem to have been solved. I was trying to figure out why this step might not have worked, and was wondering if it is because I have some incomplete strings in my final taxonomic annotation file. E.g.:
I hadn't really considered these gaps before, so I went back to the RESCRIPt tutorial to try and figure out where I might have gone wrong.
I've gone through all the steps about 3 times now.
The steps I figured were most important, and hence the ones I've played around with, are:
'qiime rescript parse-silva-taxonomy'
Here I've added --p-rank-propagation TRUE (although I'm aware this is the default setting), and I've also tried this both with and without specifying ranks using the '--p-ranks' option.
The other command which I've tried various combinations of is dereplicate, where initially I selected '--p-mode "super"' and have since tried '"uniq"' just in case this was messing with forward filling.
Ultimately, each time I make a visualization of the final taxonomy.qza I get the same blank ranks, with no forward filling.
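To make clear what I expected to see, here is a small Python sketch of forward filling as I understand it from the tutorial: an empty rank label inherits the name of the nearest labelled rank above it. This is only my assumption about the intended result, not RESCRIPt's actual implementation, and the function name is hypothetical.

```python
def forward_fill_ranks(taxon: str) -> str:
    """Fill empty rank labels (e.g. 'o__') with the nearest named rank above."""
    filled = []
    last_name = None
    for label in taxon.split(";"):
        label = label.strip()
        if not label:
            continue  # ignore stray separators
        if "__" not in label:
            filled.append(label)  # not in 'prefix__name' form; leave untouched
            continue
        prefix, _, name = label.partition("__")
        if name:
            last_name = name  # remember the most recent named rank
        elif last_name is not None:
            name = last_name  # propagate it into the empty rank
        filled.append(f"{prefix}__{name}")
    return "; ".join(filled)


example = "d__Bacteria; p__Firmicutes; c__Bacilli; o__; f__; g__"
print(forward_fill_ranks(example))
```

Note that this only handles ranks that are present but blank; a taxonomy string that simply stops at a higher rank has nothing for it to fill.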
I'm wondering if I am misunderstanding something here, or missing an important step, or I dunno - I am a bit of a novice at everything.
Anyway, there are a lot of commands I've run, so I have only included those I thought most relevant for now, but obviously I will post more if needed.
I am running qiime2-2023.2 in a conda environment, and I just want to say that both the quality of the tools you folk have created and the level of support here have been amazing. Thanks for everything.