fix vietnamese telex: qu-/gi- onset tone placement and dd+tone
transvi fixes:

1. qu-/gi- onset tone placement. The u after q, and the i after g when a
   vowel follows, are onset glides rather than the rime nucleus, so the tone
   must skip them: qua -> quá (was qúa), gia -> giá. The onset was previously
   passed straight through to the app, so transvi never saw it and toned the
   glide. Keep the onset in the preedit by adding qu-/gi- clusters to
   telex.map (mktelex.py onsets(), appended additively to the curated map),
   and add onsetglide() so transvi skips the glide. gi- with no following
   vowel keeps i as the nucleus (gì, gìn).

2. A tone key on a vowel-less preedit (e.g. "đ" from dd) now commits the
   preedit and lets the tone key pass through (eat=0), matching the engine
   commit-on-passthrough invariant, instead of eating it into the commit.

Verified against the running engine: qua/quan/quay/quê/quên/quyển,
gia/già/giàu/giữ/giúp/giống, gì/gìn, dd+s; unchanged mua->mùa, của, lúa;
all non-qu/gi words byte-identical to before.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
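The two rules in the message can be sketched in Python as follows. This is an illustration only, not transvi's code: `onset_glide_len`, `add_tone`, and `handle_tone_key` are hypothetical names, and the sketch deliberately puts the tone on the first vowel after the onset, so it does not cover rimes where the tone belongs on a later vowel (for example quyển).

```python
import unicodedata

VOWELS = set("aeiouy")

# Telex tone keys mapped to Unicode combining marks.
TONE_MARKS = {
    "s": "\u0301",  # sắc (acute)
    "f": "\u0300",  # huyền (grave)
    "r": "\u0309",  # hỏi (hook above)
    "x": "\u0303",  # ngã (tilde)
    "j": "\u0323",  # nặng (dot below)
}


def base_letter(ch):
    """Strip diacritics and lowercase, so 'ữ' -> 'u' and 'ê' -> 'e'."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def is_vowel(ch):
    return base_letter(ch) in VOWELS


def onset_glide_len(syllable):
    """Length of a leading qu-/gi- onset whose glide must not be toned.

    Hypothetical stand-in for the onsetglide() mentioned in the message.
    """
    s = syllable.lower()
    if s.startswith("qu"):
        return 2
    # gi- counts as an onset only when a vowel follows; otherwise i is the nucleus.
    if s.startswith("gi") and len(s) > 2 and is_vowel(s[2]):
        return 2
    return 0


def add_tone(syllable, key):
    """Tone the first vowel after the onset; return None if there is no vowel."""
    for idx in range(onset_glide_len(syllable), len(syllable)):
        if is_vowel(syllable[idx]):
            toned = unicodedata.normalize("NFC", syllable[idx] + TONE_MARKS[key])
            return syllable[:idx] + toned + syllable[idx + 1:]
    return None


def handle_tone_key(preedit, key):
    """Return (committed_text, new_preedit, eat) for a tone key press.

    Assumed model of rule 2: a vowel-less preedit is committed and the
    key is not eaten, so it passes through to the application.
    """
    toned = add_tone(preedit, key)
    if toned is None:
        return preedit, "", False
    return "", toned, True
```

With this sketch, `add_tone("qua", "s")` tones the a rather than the u, `add_tone("gi", "f")` tones the i because no vowel follows it, and `handle_tone_key("đ", "s")` commits "đ" without eating the s.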
@@ -31,7 +31,10 @@ upper = str.maketrans(
 def addtone(v, t):
     return v.translate(tone[t])
 
+entries = []
+
 def emit(input, output):
+    entries.append((input, output))
     print(f"{input}\t{output}")
 def up(s):
     c = s[0].translate(upper)
@@ -147,6 +150,24 @@ def final():
     for t in tone:
         emit(i+c+t, o.replace(v, addtone(v, t), 1)+c)
 
+def onsets():
+    # Keep the qu-/gi- onset in the preedit so the tone lands on the rime
+    # nucleus, not on the onset glide (qua->quá not qúa, gia->giá not gía).
+    # transvi (onsetglide) knows to skip the glide; here we only need the
+    # composed clusters to exist so the preedit accumulates them.
+    vowels = set("aeiouy")
+    tones = set("sfrxj")
+    base = [(i, o) for (i, o) in list(entries)
+            if i and i[0] in vowels and not (set(i) & tones)]
+    for i, o in base:
+        if i[0] != 'u':  # no qu+u syllable; u is the glide
+            emit("qu" + i, "qu" + o)
+        emit("gi" + i, "gi" + o)
+    # gi- with the i as nucleus (no following vowel): gì, gìn, gìm, ...
+    for c in ["", "c", "m", "n", "p", "t", "ch", "ng", "nh"]:
+        if c:
+            emit("gi" + c, "gi" + c)
+
 vowel1()
 vowel2()
 vowel3()
@@ -157,3 +178,4 @@ tone1mod()
 tone2mod()
 escape()
 final()
+onsets()
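
To see what rows the new onsets() pass contributes, the filter and expansion can be replayed on a toy entry list. This is a minimal standalone sketch: `onset_rows` and the `toy` list are made up for illustration, whereas the real script works on the entries recorded by emit().

```python
VOWELS = set("aeiouy")
TONE_KEYS = set("sfrxj")


def onset_rows(entries):
    """Derive qu-/gi- rows from toneless, vowel-initial (input, output) rows."""
    rows = []
    for i, o in entries:
        # Skip empty, consonant-initial, and already-toned inputs.
        if not i or i[0] not in VOWELS or set(i) & TONE_KEYS:
            continue
        if i[0] != "u":  # no qu+u syllable; u is the glide
            rows.append(("qu" + i, "qu" + o))
        rows.append(("gi" + i, "gi" + o))
    # gi- followed directly by a coda, where i is the nucleus.
    for c in ["c", "m", "n", "p", "t", "ch", "ng", "nh"]:
        rows.append(("gi" + c, "gi" + c))
    return rows


# Toy stand-in for the curated map: three toneless vowels, one toned
# vowel, and one consonant entry.
toy = [("a", "a"), ("ee", "ê"), ("uw", "ư"), ("as", "á"), ("dd", "đ")]
rows = onset_rows(toy)
```

Here the toned row ("as", "á") and the consonant row ("dd", "đ") are filtered out, "uw" yields only a gi- row, and the coda loop adds eight gi- rows such as ("gin", "gin"). Toned forms of the clusters are left to transvi, which is why only toneless rows are derived.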