Re: Fast UTF-8 strlen function



NoDot wrote:
Sevag Krikorian wrote:

So many different character formats, it's just insane. Everyone should
just speak English and be done with it!


I think Beth's already commented along these lines.


It would be nice to come up with a new phonetic alphabet that just uses
the standard 27 keys.  If you drop the redundancies in English
characters, that would free up several possible keys for adding new
'sounds' ... that also leaves plenty of space in the byte for useful
'symbolic' characters.

eg:
ku = 'q' -- frees up 'q' for a new sound
s - c - k -- just use 's' for all 's' and 'c' sounds, free up 'c'
use 'k' for all 'k' and 'c' words that sound like 'k'
use 'c' for ch and change 'ch' to the guttural version "loch" or "ach"

It's possible to fit all sounds in use by all languages in 27 keys along
with 2-key combos.


Um, look at the phonetics of the word "through". The "th" sound is a
consonant sound that really could go almost anywhere.

Keep in mind that a word like "through" had a different pronunciation at an earlier time in history, with a more guttural sound for "gh".
For the modern representation using phonetic sounds, I think: th-r-u


Don't forget,
also, the Japanese "*y*" sets made of a consonant at the start, the
"y", and a vowel that goes with "y" ("a", "o", and "u", if memory
serves). Next, my favorite Japanese syllable: "tsu". It's one that
takes a few moments of practice, but it's fun to say. Lastly, we have
the Japanese "n" syllable, and it's the only case of repeating with
"na", "ne", "ni", "no", "nu", and "n".


Isn't that the case with all languages? The vowel modifies the consonant. It seems you already have the representations of Japanese syllables using Latin characters.


The case of a syllable like "tsu" can be broken down, it is already a consonant "ts" + a vowel "u".
When you write "tsu", do you mean "ts" as in "boo_ts_" + u, or more a "dsu" as in "bu_ds_" + u? Sorry, not familiar with Japanese. If it's like Chinese, I expect it to sound like "dsu" as in Sun Tsu.
The point is any sound can be represented by 27 characters + combos if there is international agreement on what letters represent what sounds.


A funny thing with the "ts" sound: an English-speaking friend of mine had difficulty pronouncing it, which surprised me as it is already used in English ("boo_ts_"). He could easily pronounce "boots" but not "ts" by itself.

(Oh, and what vowels are we going to use? Japanese uses "a" ("father"),
"e" (short 'e'), "i" (long 'e'), "o" ("'Oh,' you don't say!"), and "u"
("through"). Should the short 'i' ("bit") be included?)


All the vowels are already in the standard Latin set. We'll keep all those as they are important in phonetic combos. Short vowels/consonants can be represented by a back-quote character "`". That leaves tilde '~' for maybe long vowels/consonants. If 'hard' is desired, use the same letter twice: "kk" for hard "k".


Let's step back for a minute and say you did try coming up with this:
what kind of culture would it be aimed at? If it comes up online, then
you'll definitely need Information Technology terminology, acronyms,
and other things in the vocabulary. Give it a few seconds thought, and
you'll see it isn't such an easy problem.



I'm not talking about a new language (though that would be nice for the linguists to figure out). I'm talking about people writing their native language using the Latin-characters.


For example, I write these Hai phrases in Latin characters:

How are you?
"inch bes es?"

However, if I want to ask if you want an apple:

"kh'ntsor guzes?"

It starts to get confusing without a strict set of rules:
"kh"	guttural, like the "ch" in the German "ach"
'n	short n
ts	like the "ts" in "boots"
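
A minimal sketch of how such rules could be applied mechanically, assuming the only multi-character units are the three listed above ("kh", "'n", "ts") and that everything else is read one character at a time. The function name split_units is hypothetical and only for illustration:

```python
# Multi-character units taken from the three rules above (an assumption:
# a real rule set would list every combo in use).
UNITS = ["kh", "'n", "ts"]

def split_units(text, units=UNITS):
    """Split text into sound units, trying the longest units first."""
    ordered = sorted(units, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for unit in ordered:
            if text.startswith(unit, i):
                out.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit starts here: take a single character.
            out.append(text[i])
            i += 1
    return out

print(split_units("kh'ntsor"))  # ['kh', "'n", 'ts', 'o', 'r']
```

With a fixed list of units the split is unambiguous, which is the point of having a strict set of rules.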

The Armenian alphabet has 36 different syllables; remove the redundancies and you still have about 32. Yet with letter combos, every syllable can be represented by the Latin character set as it is now. The Latin character set can even be 'cleaned' up by removing the redundancies there, to make room for more syllables and keep down the sheer number of combos.

So there really is no need for me to use a special keyboard or some convoluted character format to see a graphical representation of the Armenian alphabet.

--
[kain]
http://www.geocities.com/kahlinor


NoDot,
...

P.S. The phonetics of "through" might turn out to be "thru", for the
interested.

Hey, we came up with the same thing!

Here is my quick and dirty chart

a	a in car
b	b in bat
c	ch in chore
ch	ch in German ach
d	d in door
dh	th in that
ds	ds in buds
e	a in cat
'e	i in bit
f	f in fat
g	g in good
gh	hard guttural, no equivalent
h	h in hat
i	ee in feel
j	j in jog
jh	j in French Jean
k	k in took
'k	ke in rake
kh	possible alternate for "ch" above
l	l in late
m	m in moon
n	n in moon
o	o in oh
p	p in peer
q			free
r	r in rat
s	s in seer
t	t in rat
th	th in through
ts	ts in boots
u	oo in moon
v	v in vat
w	w in wool
x			free
y	y in yield
z	z in zoo

It's a pretty comprehensive list which leaves 2 free ones. Also, if we use "kh" instead of the proposed "ch", move the proposed "c" to "ch", that will free up another letter "c" for whatever I missed. Some work still needs to be done on the vowels, but nothing that can't be worked out to fit easily in 7 bits.
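
As a rough check of the "7 bits" claim, here is a small sketch that lists the chart's symbols and confirms they are spelled only with 7-bit characters. The names CHART and is_7bit are made up for this example, and the final check assumes text written purely in this scheme, in which case the UTF-8 byte count and the character count are the same:

```python
# Symbols copied from the chart above, including the two marked "free" (q, x).
CHART = ("a b c ch d dh ds e 'e f g gh h i j jh k 'k kh l m n o p q "
         "r s t th ts u v w x y z").split()

def is_7bit(text):
    """True if every character of text has a code point below 128."""
    return all(ord(ch) < 128 for ch in text)

# Every chart symbol is spelled with 7-bit characters only.
assert all(is_7bit(sym) for sym in CHART)

# For 7-bit text, the UTF-8 byte count equals the character count,
# so counting bytes is enough to get the string length.
phrase = "kh'ntsor guzes?"
assert is_7bit(phrase)
assert len(phrase.encode("utf-8")) == len(phrase)
```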


--
[kain]
http://www.geocities.com/kahlinor


