Description
some machine learning could be used to adjust the grid size of components when they're composed...
though i believe that's clearly future work...
about decomposition,
https://www.babelstone.co.uk/CJK/index.html -> ids.txt
there's an actively maintained dataset in IDS (Ideographic Description Sequence) format, not sure if you're already using it. the entries could be easily parsed and transformed directly into your format, though it's to be expected that quite a few of them will need to be hand corrected. visualizing them with this project may help to identify errors too.
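to illustrate the "easily parsed" part, here's a minimal sketch of reading one IDS string into a tree. it assumes only the twelve description characters in U+2FF0..U+2FFB (⿰ through ⿻) and plain prefix notation; `parse_ids` and the nested-tuple output are made up for illustration and are not this project's format. any per-line layout or extra markup in ids.txt would have to be stripped first, which isn't shown here.

```python
# Ideographic Description Characters, grouped by how many operands they take.
BINARY_IDCS = set("⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻")
TERNARY_IDCS = set("⿲⿳")


def parse_ids(ids):
    """Parse a prefix-notation IDS string into a nested tuple tree.

    A leaf is a single character; an inner node is (operator, child, ...).
    """
    chars = list(ids)  # iterate by code point, so non-BMP characters are fine
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(chars):
            raise ValueError(f"truncated IDS: {ids!r}")
        ch = chars[pos]
        pos += 1
        if ch in BINARY_IDCS:
            return (ch, node(), node())
        if ch in TERNARY_IDCS:
            return (ch, node(), node(), node())
        return ch  # anything else is treated as a component

    tree = node()
    if pos != len(chars):
        raise ValueError(f"trailing characters in IDS: {ids!r}")
    return tree


print(parse_ids("⿰氵工"))      # ('⿰', '氵', '工')
print(parse_ids("⿰氵⿰古月"))  # ('⿰', '氵', ('⿰', '古', '月'))
```

malformed sequences raise `ValueError`, which is one cheap way to flag entries that need hand correction.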
personally i'd suggest not breaking the components up too finely or deeply. once you're down to some mid-level components, they're most likely standalone components in origin (look them up at zdic.net or hanziyuan.net for example) that were generalized and merged during 隸變 (clericalization) and 楷化 (regularization into regular script), so that they now look like combinations of semantically unrelated sub-shapes. in other words, there's a good chance of fake orthogonality. for example 它→宀匕, 宁→宀丁, 寅→宀?, but in fact none of them is really composed with 宀. also, too many levels of composition degrade output quality.
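a rough sketch of what "stop at mid-level components" could look like in code, assuming a decomposition table keyed by character and a hand-maintained set of components that should never be split. `expand`, `table`, `atomic` and `max_depth` are all hypothetical names, not anything from this project.

```python
def expand(ch, table, atomic, max_depth):
    """Expand ch via table, stopping at atomic components or at max_depth.

    table maps a character to (operator, part, ...); atomic is the set of
    characters that must stay whole even if the table could split them.
    """
    if max_depth == 0 or ch in atomic or ch not in table:
        return ch
    op, *parts = table[ch]
    return (op, *[expand(p, table, atomic, max_depth - 1) for p in parts])


# toy table: 它 is listed as ⿱宀匕 by shape, though it isn't really built from 宀
table = {
    "蛇": ("⿰", "虫", "它"),
    "它": ("⿱", "宀", "匕"),
}

print(expand("蛇", table, atomic=set(), max_depth=3))
# ('⿰', '虫', ('⿱', '宀', '匕'))
print(expand("蛇", table, atomic={"它"}, max_depth=3))
# ('⿰', '虫', '它')
```

the `atomic` set handles the fake-orthogonality cases one by one, while `max_depth` is a blunt global cap on composition levels; either can be used without the other.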