serhii.net

In the middle of the desert you can say anything you want

UNLISTED

25 Jul 2025

Unsymbols ML generation notes

The plan

  • My impression is that Stable-Diffusion-style models are better than GANs here: less training fuss and less memorization

Ideas

  • ChatGPT says:

    Add a style token (font) and content token (Unicode code point). During sampling, drop the content token → the model synthesises “glyph-like” shapes no Unicode ever had.

    Condition-free sampling. To get shapes outside Unicode, drop or randomise the content token (for FontDiffuser) or sample a latent without the class embedding (for DeepSVG/GlyphGAN). This produces “glyph-like” forms that aren’t tied to any code point.

    If the class was one-hot, pass either an all-zeros vector or a random soft vector.

  • We can take a GAN model trained on e.g. Latin and then fine-tune it on the rest, instead of training from scratch
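The "drop or randomise the content token" idea above can be sketched in a few lines. This is a minimal numpy sketch, not any model's actual API: `NUM_CLASSES` and `content_vector` are hypothetical names, standing in for however FontDiffuser/GlyphGAN actually encode the content class.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 52  # hypothetical: one class per supported code point

def content_vector(class_id=None, soft=False):
    """Build the content conditioning vector.

    class_id given        -> ordinary one-hot conditioning
    class_id=None, soft=False -> all-zeros "no content" vector
    class_id=None, soft=True  -> random soft vector (softmax over noise),
                                 i.e. a class that no real glyph has
    """
    if class_id is not None:
        vec = np.zeros(NUM_CLASSES)
        vec[class_id] = 1.0
        return vec
    if soft:
        logits = rng.normal(size=NUM_CLASSES)
        e = np.exp(logits - logits.max())  # stable softmax
        return e / e.sum()
    return np.zeros(NUM_CLASSES)
```

Feeding the all-zeros or random-soft vector to a model trained on one-hot classes is exactly the condition-free sampling trick: the style conditioning stays intact while the content signal is absent or smeared across classes.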

Tentative plan

  1. Generate chars
  2. Generate unseen chars
  3. Filter

Generate chars

  • FontDiffuser
    • try running existing checkpoint
    • play w/ the sampling script to make it sample condition-free
  • DeepSVG
    • they explicitly do font generation at the end of the paper, training with a letter-class label on a vector fonts dataset. So:
    • Get weights
    • Play with latent operations: deepsvg/notebooks/latent_ops.ipynb (alexandre01/deepsvg)
    • Try to reproduce their fonts thingy
    • try to generate unseen glyphs from their fonts weights
    • train on my own fonts dataset
  • Run GlyphGAN on correctly sized pictures in a notebook, on a GPU, and see if training stabilizes
    • Run an existing checkpoint (GAN w/ correct size, style conditioning) to get existing chars
    • Try condition-free sampling

Filter chars

  • todo

Play with more creative char generation

Resources

Font generation

Handwriting

Generation

Movements

Misc

Datasets

  • Font Book: a lot of variations for English chars; used in GlyphGAN; 2700 chars