Reinventing the Wheel

(about 7 years ago)

This is just a little post about an alternative system for writing Japanese that I came up with in my spare time, mostly to prove to myself that a language with thousands of characters and a tiny handful of syllables could be reproduced faithfully and accurately. Rather than describing every last detail, let's just say it uses the same basic shape to represent each consonant and the same accent mark for each vowel. I made "an online proof-of-concept IME":cun to illustrate how it works.

Try typing in things like "hiragana":hiragana or "DAINIPPONKOKU":nippon. In the latter, the capitals indicate an on'yomi "kanji":kanji reading, which cun compresses to one symbol per kanji for extra efficiency. That is, in any word of Chinese origin, each kanji, regardless of its reading, can be replaced by a single letter.

h3. Examples

!http://www.brymck.com/images/cun_name.png!
"My name":myname (ブライアン・リー・マッケルビー Buraian Rī Makkerubī)

!http://www.brymck.com/images/yukiguni.png!
"The opening lines of Yasunari Kawabata's Snow Country":yukiguni (国境の長いトンネルを抜けると雪国であった。 Kunizakai no nagai tonneru o nukeru to yukiguni de atta.)

h3. Some Notes on Syllables in Japanese

Japanese syllables are broken into moras. A mora is like a unit of time, such that a short syllable contains one mora and a long syllable contains two moras. For example, the word Nippon 日本 ("Japan") can be broken into two syllables (Nip + pon) and four moras (Ni + p + po + n).

|_\2. Word |_. Definition |_\4. Moras |_. Length |
| koko | ここ | here | ko | ko | | | 2 moras |
| kokko | 国庫 | national treasury | ko | k | ko | | 3 moras |
| kōko | 公庫 | public loan corporation | ko | o | ko | | 3 moras |
| kōkō | 高校 | high school | ko | o | ko | o | 4 moras |
| konkon | コンコン | knocking sound | ko | n | ko | n | 4 moras |
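
The splitting in the table above can be sketched in a few lines of Python. This is only an illustration and not the code behind the IME: the function name @split_moras@ is made up for this example, and it assumes a single lowercase Hepburn-style word in which long vowels are written with macrons or doubled vowels.

```python
import unicodedata

VOWELS = set("aiueo")
# Write macron vowels out as doubled vowels, so "kōko" becomes "kooko".
MACRONS = str.maketrans({"ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo"})


def split_moras(word):
    """Split one romaji word into a list of moras (hypothetical helper)."""
    s = unicodedata.normalize("NFC", word.lower()).translate(MACRONS)
    moras = []
    i = 0
    while i < len(s):
        c = s[i]
        nxt = s[i + 1] if i + 1 < len(s) else ""
        if c in VOWELS:
            # A bare vowel, including the second half of a long vowel.
            moras.append(c)
            i += 1
        elif c == "n" and nxt not in VOWELS and nxt != "y":
            # Moraic n: at the end of the word or before a consonant.
            moras.append("n")
            i += 2 if nxt == "'" else 1  # skip the apostrophe in "on'yomi"
        elif nxt == c or (c == "t" and s[i + 1:i + 3] == "ch"):
            # Geminate: the first half of a doubled consonant ("kk", "pp", "tch").
            moras.append(c)
            i += 1
        else:
            # Ordinary mora: consonant(s) up to and including the next vowel.
            j = i
            while j < len(s) and s[j] not in VOWELS:
                j += 1
            if j == len(s):
                raise ValueError(f"no vowel after {s[i:]!r}")
            moras.append(s[i:j + 1])
            i = j + 1
    return moras


print(split_moras("nippon"))  # ['ni', 'p', 'po', 'n']
```

The sketch makes no attempt to validate its input, so anything that is not plausible romaji will produce odd moras or raise an error.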

We can neatly group these moras together in a table. Note that some consonants undergo changes in a few places (t + i = chi, t + u = tsu), and that some entries in the "d" row were deleted because they are pronounced the same as those in the "z" row:

|_. |_. a |_. i |_. u |_. e |_. o |_. ya |_. yu |_. yo |
|_. - | a | i | u | e | o | ya | yu | yo |
|_. k | ka | ki | ku | ke | ko | kya | kyu | kyo |
|_. g | ga | gi | gu | ge | go | gya | gyu | gyo |
|_. s | sa | shi | su | se | so | sha | shu | sho |
|_. z | za | ji | zu | ze | zo | ja | ju | jo |
|_. t | ta | chi | tsu | te | to | cha | chu | cho |
|_. d | da | - | - | de | do | - | - | - |
|_. n | na | ni | nu | ne | no | nya | nyu | nyo |
|_. m | ma | mi | mu | me | mo | mya | myu | myo |
|_. h | ha | hi | fu | he | ho | hya | hyu | hyo |
|_. b | ba | bi | bu | be | bo | bya | byu | byo |
|_. p | pa | pi | pu | pe | po | pya | pyu | pyo |
|_. r | ra | ri | ru | re | ro | rya | ryu | ryo |

Others: wa, n, "geminate":gemination, long vowel

In cun, the above is represented "thusly":cunfull.

Follow the link above to see the entire alphabet and almost all possible symbols. What should stand out is how organized it is in comparison to "hiragana":wikihiragana.

To make a consonant geminate, you repeat the consonant part of the letter in the same amount of space, so for example ıc (あか aka) becomes ıε (あっか akka). To nasalize a vowel, you add a horizontal line, such that ı (あ a) becomes ī (あん an). How a long vowel is written depends on the vowel, but you can generally see the pattern in the "IME":kseries.

But in any case, that's 103 moras (13 rows × 8 columns - 5 duplicates + 4 special moras) necessary to describe every distinct unit of time in Japanese. Yet the Japanese are expected to learn 1,006 kanji in primary school and another 939 in secondary school. By contrast, the basic unit in English is a syllable, of which approximately 5,000 see actual use.[1] [2] Despite the complexities of English, however, only around 13% of words are pronounced differently from how they are spelled. And of course, there's no reason someone couldn't still use kanji for disambiguation, much as Korean does with hanja (and increasingly sparingly, as people can usually understand from context).
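
As a sanity check on the figure of 103, the mora table can be rebuilt and counted in Python. This is a rough sketch: the names @mora_inventory@, @IRREGULAR@ and @MERGED_INTO_Z@ are invented for the example, and the romanization follows the table earlier in this post.

```python
COLUMNS = ["a", "i", "u", "e", "o", "ya", "yu", "yo"]
ROWS = ["", "k", "g", "s", "z", "t", "d", "n", "m", "h", "b", "p", "r"]

# Cells where the consonant changes (t + i = chi, t + u = tsu, and so on).
IRREGULAR = {
    "si": "shi", "sya": "sha", "syu": "shu", "syo": "sho",
    "zi": "ji", "zya": "ja", "zyu": "ju", "zyo": "jo",
    "ti": "chi", "tu": "tsu", "tya": "cha", "tyu": "chu", "tyo": "cho",
    "hu": "fu",
}
# "d" row cells dropped because they sound the same as the "z" row.
MERGED_INTO_Z = {"di", "du", "dya", "dyu", "dyo"}
# The four extras listed under "Others".
SPECIAL = ["wa", "n", "geminate", "long vowel"]


def mora_inventory():
    """Return every mora in the table plus the special ones."""
    regular = [
        IRREGULAR.get(row + col, row + col)
        for row in ROWS
        for col in COLUMNS
        if row + col not in MERGED_INTO_Z
    ]
    return regular + SPECIAL


inventory = mora_inventory()
print(len(inventory))  # 13 * 8 - 5 + 4 = 103
```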

Anyway, I wouldn't propose this as a spelling reform, but it's nonetheless interesting how inertia can outweigh a more logical approach.

fn1. Tamaoka, Katsuo and Makioka, Shoga. "??Frequency of Occurrence for Units of Phonemes, Morae, and Syllables Appearing in a Lexical Corpus of a Japanese Newspaper??," Behavior Research Methods, Instruments, & Computers 36, no. 3 (2004), 531-547. "http://brm.psychonomic-journals.org/content/36/3/531.full.pdf+html":http://brm.psychonomic-journals.org/content/36/3/531.full.pdf+html

fn2. Duanmu, San. ??Syllable Structure: The Limits of Variation??. Oxford: Oxford University Press, 2008. "http://books.google.com/books?id=K5HR4oMYIlUC":http://books.google.com/books?id=K5HR4oMYIlUC

[cun]http://www.brymck.com/cun
[nippon]http://www.brymck.com/cun?romaji=DAINIPPONKOKU
[hiragana]http://www.brymck.com/cun?romaji=hiragana
[kanji]http://en.wikipedia.org/wiki/Kanji
[myname]http://www.brymck.com/cun?romaji=buRAIan%20RII%20makkeruBII
[yukiguni]http://www.brymck.com/cun?romaji=kunizakai%20no%20nagai%20tonneru%20o%20nukeru%20to%20yukiguni%20deatta
[gemination]http://en.wikipedia.org/wiki/Gemination
[cunfull]http://www.brymck.com/images/cun.png
[wikihiragana]http://en.wikipedia.org/wiki/Hiragana
[kseries]http://www.brymck.com/cun?romaji=ka%20kaa%20ki%20kii%20ku%20kuu%20ke%20kee%20ko%20koo