Is there a way to see the actual file names that were the inputs for the various steps shown in the provenance chain when visualizing a .qza or .qzv object?
For example, I have a file, taxa-bar-plots.qzv, and I would like to make sure that its input files were all prefixed with 300bp. I also created files with 400bp prefixes to see the influence of --p-trim-length on the deblur output before moving on to downstream steps.
The provenance chain just lists the steps but not the input files as far as I can tell.
The filename isn't recorded as part of provenance. I will ask @ebolyen to provide some detail as to why, but as I understand it, it boils down to the fact that filenames are mutable, while something like the artifact's UUID isn't. @ebolyen, is there a technical limitation/hurdle here, too?
Here is a simple Python script to demonstrate how to get a mapping of Artifact UUIDs to filenames:
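import qiime2
import pathlib
# Find every .qza/.qzv file under the current directory (recursively)
artifacts = pathlib.Path('.').glob('**/*.qz*')
# Map each result's UUID to the path of the file it was read from
artifact_map = {qiime2.Artifact.peek(str(a)).uuid: str(a) for a in artifacts}
print(artifact_map)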
{'00d4b3d4-a036-4d99-8a3d-25866fe519dd': 'beta-rarefaction.qzv',
'158f42e6-0576-4ba6-9b7b-7ebf572a0b5b': 'alpha-rare-without-spaces.qzv',
'2ecc209a-1e9f-4e2f-a417-c0e01c313170': 'table.qza',
'45158415-22ed-4601-9373-f461042311f6': 'alpha-rare-with-spaces.qzv'}
That still requires a bit of manual work, though, to identify which file is which. We still have plenty of plans in the provenance (and citation) departments, so stay tuned!
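In the meantime, here is a rough sketch of how the matching step might be automated using only the standard library. It rests on two assumptions about the archive layout that you should verify against your own files: that a .qza/.qzv is a zip file whose single top-level directory is named after the result's UUID, and that the UUIDs of upstream artifacts appear as directory names under `<uuid>/provenance/artifacts/`. The function names (`root_uuid`, `ancestor_uuids`, `inputs_on_disk`) are made up for this example and are not part of QIIME 2.

```python
import pathlib
import zipfile


def root_uuid(path):
    """Return the archive's top-level directory name (assumed to be its UUID)."""
    with zipfile.ZipFile(path) as zf:
        names = zf.namelist()
    return names[0].split('/')[0] if names else None


def ancestor_uuids(path):
    """Collect directory names found under <root>/provenance/artifacts/.

    Assumption: each of these directories is named after an upstream UUID.
    """
    found = set()
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            parts = name.split('/')
            if (len(parts) > 3 and parts[1] == 'provenance'
                    and parts[2] == 'artifacts' and parts[3]):
                found.add(parts[3])
    return found


def inputs_on_disk(target, search_dir='.'):
    """Map each upstream UUID of `target` to a matching file, or None."""
    uuid_to_file = {}
    for p in pathlib.Path(search_dir).glob('**/*.qz*'):
        if zipfile.is_zipfile(p):
            uuid_to_file[root_uuid(p)] = str(p)
    return {u: uuid_to_file.get(u) for u in ancestor_uuids(target)}


# Example call (hypothetical file name taken from the question above):
# print(inputs_on_disk('taxa-bar-plots.qzv'))
```

Any UUID that maps to None is an upstream artifact with no corresponding file under the search directory, for instance because the file was deleted or lives elsewhere. For the question above, you could then check whether every filename that did match starts with 300bp.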