Huggingface Datasets metadata
A DatasetInfo object contains dataset metadata such as the version.
Adding the pre-existing attributes is described in the “Create a dataset loading script” guide, but apparently you can’t add custom ones through it.
Option 1 - subclass DatasetBuilder
The “Build and load” docs touch on the topic and suggest subclassing BuilderConfig, the class that is then used by the DatasetBuilder.
Option 2 - subclass Dataset
There is an example of this, though not for this problem; I don’t really like it, but whatever.
The best solution
I ended up just not adding metadata; I basically needed things that can be recovered anyway from a Features object with
The lack of easy support for custom metadata is really strange to me. It sounds like something quite useful to many people (e.g. “Dataset created with version XX of converter program”), and I see no reason why HF doesn’t do this.
I have a strong intuitive feeling that I’m misunderstanding the logic on some level, and that the answer I need is closer in spirit to “why would you want to add custom attributes to X, you could just ….”
Does everyone use separate key/values in the dataset itself or something?