In the middle of the desert you can say anything you want

29 Jan 2024

Dumping pretty cyrillic UTF YAML and JSON files


  • PyYAML: allow_unicode=True
  • JSON: ensure_ascii=True


My favourite params for dumping both, esp. if Ukrainian/cyrillic/UTF is involved

All of the below are params one can pass to to_[json|yaml][_file]() of Wizard Mixin Classes — Dataclass Wizard 0.22.3 documentation


		params = dict(
		allow_unicode=True,  # write Ukrainian as Ukrainian
		sort_keys=False,  # so template is first in the YAML for readability
	self.to_yaml_file(yaml_target, **params)

default_flow_style prefers lists like this (from docu):

>>> print yaml.dump(yaml.load(document), default_flow_style=False)
a: 1
  c: 3
  d: 4


to_json(indent=4, ensure_ascii=False)

The difference being:

(Pdb++) created_tasks[0][0].to_json()
'{"question": "\\u042f\\u043a\\u0435 \\u0441\\u043b\\u043e\\u0432\\u043e \\u043a\\u043e\\u0440\\u043e\\u0442\\u0448\\u0435: \\"\\u043a\\u0456\\u0442\\"\\u0447\\u0438 \\"\\u0441\\u043e\\u0431\\u0430\\u043a\\u0430\\"?", "correctAnswer": "\\u043a\\u0456\\u0442", "templateUuid": "1da85d6e7cf5440cba54e3a9b548a037", "taskInstanceUuid": "6ac71cd524474684abfec0cfa3ef5e1e", "additionalMetadata": {"kind": "less", "template_n": 2, "t1": "\\u043a\\u0456\\u0442","t2": "\\u0441\\u043e\\u0431\\u0430\\u043a\\u0430", "reversed": false}}'
(Pdb++) created_tasks[0][0].to_json(ensure_ascii=False)
'{"question": "Яке слово коротше: \\"кіт\\" чи \\"собака\\"?", "correctAnswer": "кіт", "templateUuid": "1da85d6e7cf5440cba54e3a9b548a037", "taskInstanceUuid": "6ac71cd524474684abfec0cfa3ef5e1e", "additionalMetadata": {"kind": "less", "template_n": 2, "t1": "кіт", "t2": "собака", "reversed": false}}'
Nel mezzo del deserto posso dire tutto quello che voglio.