KO on CJK fonts

mumchristmas
Posts: 3
Joined: 14 Dec 2021

KO on CJK fonts

Post by mumchristmas »

We are doing some research on CJK kerning technology. We just found that the KO engine works fine on CJK characters, but it won't output any CJK auto pairs when we click the Kern On button. To get auto values, we have to set the pairs one by one in the Glyphs UI.
Is there a Python API to generate the auto kerning pairs with KO?
SCarewe
Posts: 100
Joined: 23 Apr 2021

Re: KO on CJK fonts

Post by SCarewe »

Hello, in order to have Kern On autokern your pairs, you need to add them to the pair_frequencies.txt file (which can be found if you open the plugin in the Finder, click "Show Package Contents" and navigate to Contents/Resources/).

Add your pairs and give them an arbitrary number that puts them high enough in the prioritisation (depending on whether you have other glyphs in the font).

As for generating a list of all possible pair combinations, I'm not at all knowledgeable enough on CJK to tell which approach is best. I assume a list of all possible combinations would be ridiculously long, with Wikipedia stating that something like 3,000 characters are needed at minimum for Chinese (those alone resulting in 9,000,000 possible combinations). So, for that part, I hope Tim can give some insight :)
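
To put the numbers in perspective, here is a rough Python sketch that counts only the adjacent pairs that actually occur in a text, instead of enumerating every combination. Everything in it is an assumption for illustration: the helper names `count_pairs` and `is_cjk_ideograph` are made up, the sample string stands in for a real corpus, and only the basic CJK Unified Ideographs block is checked. It does not write pair_frequencies.txt, since the format of that file is not described in this thread.

```python
from collections import Counter


def is_cjk_ideograph(ch):
    """True for characters in the basic CJK Unified Ideographs block.

    Assumption: only U+4E00..U+9FFF is checked, to keep the sketch short.
    """
    return "\u4e00" <= ch <= "\u9fff"


def count_pairs(text, is_wanted):
    """Count adjacent character pairs where both characters pass is_wanted."""
    counts = Counter()
    for left, right in zip(text, text[1:]):
        if is_wanted(left) and is_wanted(right):
            counts[left + right] += 1
    return counts


# Upper bound from the post above: every ordered pair of 3,000 characters.
print(3000 * 3000)  # 9000000

# Hypothetical stand-in for a real corpus.
sample = "中文字体中文排版"
pairs = count_pairs(sample, is_cjk_ideograph)

# Most frequent pairs first; these counts could drive the prioritisation.
for pair, n in pairs.most_common(3):
    print(pair, n)
```

On a real corpus, the pairs that actually occur should be a small fraction of the 9,000,000 theoretical ones, and their counts could serve as the prioritisation numbers.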
mumchristmas
Posts: 3
Joined: 14 Dec 2021

Re: KO on CJK fonts

Post by mumchristmas »

Yes, CJK has too many characters, and a "complete" model would drive everyone crazy.

We plan to run a round of parameterized statistics on the left and right shapes of the glyphs, then use kerning groups to reduce the number of kerning pairs. Glyph frequency statistics should also be considered.

It's just a simple idea, and the workflow is still being set up.
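
A minimal sketch of the grouping idea, just to show how much it could shrink the pair count. All of it is hypothetical: the function `group_by_profile`, the `step` parameter, and the margin values are invented for illustration and are not measured from a real font. Each glyph side is described by a few margins (in font units) sampled at fixed heights, and glyphs whose quantized margins match end up in the same kerning group.

```python
from collections import defaultdict


def group_by_profile(profiles, step=20):
    """Group glyphs whose side profiles match after quantization.

    profiles: dict mapping glyph name -> tuple of margins in font units.
    step: bucket size in font units (hypothetical tuning parameter).
    """
    groups = defaultdict(list)
    for name, margins in profiles.items():
        key = tuple(round(m / step) for m in margins)
        groups[key].append(name)
    return list(groups.values())


# Hypothetical margins, NOT measured from a real font.
left_profiles = {
    "口": (40, 40, 40),
    "日": (42, 41, 39),
    "人": (120, 80, 20),
    "目": (45, 44, 41),
}
right_profiles = {
    "口": (40, 40, 40),
    "日": (40, 40, 40),
    "人": (20, 80, 120),
    "目": (80, 80, 80),
}

left_groups = group_by_profile(left_profiles)
right_groups = group_by_profile(right_profiles)

# A pair is kerned between the right side of the first glyph
# and the left side of the second glyph.
glyph_pairs = len(right_profiles) * len(left_profiles)
group_pairs = len(right_groups) * len(left_groups)
print(glyph_pairs, group_pairs)  # 16 6
```

Simple bucketing like this can split two nearly identical shapes that fall on either side of a bucket boundary, so a proper clustering step would probably be needed in practice.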
Tim Ahrens
Site Admin
Posts: 404
Joined: 11 Jul 2019

Re: KO on CJK fonts

Post by Tim Ahrens »

Sorry about the late reply. Yes, Sebastian is right, the pairs need to go into the pair_frequencies.txt file. I’m more than happy to do that for you (i.e. for everyone). So far, I have only included scripts that I know enough about, as I don’t want to include scripts that are usually not kerned at all.

When we are talking about CJK, isn’t it necessary to distinguish between the different scripts?
  • Chinese: Sorry, I don’t know enough about kerning there but I’m keen to understand. Can you help? Which characters are kerned against which? I was assuming that the ideographic characters (i.e. the ~10,000 characters) are unkerned, at least they are not kerned between themselves? Is there any kerning with punctuation?
  • Japanese: I know how Japanese is kerned but it’s implemented as a combination of sidebearing adjustments via the palt feature, combined with “real”, i.e. pair kerning. In order to generate sensible pair kerning, KO first needs to “understand” palt. I am planning to implement that.
  • Korean: Sorry, again, I don’t know enough. Which characters are kerned against which in Korean? It would be great if you could help me get some insight.
Thanks!