Re: [ Attn: Randy ] Ad-hoc Parsing?
From: Herbert Kleebauer (klee_at_unibwm.de)
Date: 12/25/04
- Next message: Herbert Kleebauer: "Re: a 'turbo' assembly language"
- Previous message: Herbert Kleebauer: "Re: [ Attn: Randy ] Ad-hoc Parsing?"
- In reply to: Phil Carmody: "Re: [ Attn: Randy ] Ad-hoc Parsing?"
- Next in thread: Phil Carmody: "Re: [ Attn: Randy ] Ad-hoc Parsing?"
- Reply: Phil Carmody: "Re: [ Attn: Randy ] Ad-hoc Parsing?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sat, 25 Dec 2004 11:36:19 +0100
Phil Carmody wrote:
> Herbert Kleebauer <klee@unibwm.de> writes:
> > > > Now, with your superior shell and your posted method
> > > >
> > > > phil@nonospaz:tmp$ echo -n -e '\x01' >| crap
> > > >
> > > > you would need 4 characters for any binary byte giving a total
> > > > of 8192 bytes. In this case I prefer to use the inferior shell
> > > > which needs only 1812 bytes.
With your table given below I get:
  0-  6: 1236 * 2 = 2472
  7- 13:   42 * 2 =   84
 14- 15:   11 * 3 =   33
 16- 31:  105 * 4 =  420
 32- 38:   36 * 1 =   36
     39:    0 * 2 =    0
 40- 91:  195 * 1 =  195
     92:    0 * 2 =    0
 93-126:  301 * 1 =  301
127-159:   26 * 4 =  104
160-255:   96 * 1 =   96
-------------------------
               3741 bytes
Compared to the 1871 bytes, this is still twice the size.
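The total above can be rechecked mechanically. The following is a minimal Python sketch, assuming each row of the table reads "code range: occurrences in the file * characters per byte"; the names ROWS and totals are made up for this illustration and are not from the thread.

```python
# Each row: (first code, last code, occurrences in the file, chars per byte).
# The numbers are copied from the table above; the reading of the columns
# is an assumption.
ROWS = [
    (0, 6, 1236, 2),
    (7, 13, 42, 2),
    (14, 15, 11, 3),
    (16, 31, 105, 4),
    (32, 38, 36, 1),
    (39, 39, 0, 2),
    (40, 91, 195, 1),
    (92, 92, 0, 2),
    (93, 126, 301, 1),
    (127, 159, 26, 4),
    (160, 255, 96, 1),
]


def totals(rows):
    """Return (number of raw bytes, number of encoded characters)."""
    raw_bytes = sum(count for _, _, count, _ in rows)
    encoded_chars = sum(count * width for _, _, count, width in rows)
    return raw_bytes, encoded_chars


if __name__ == "__main__":
    raw_bytes, encoded_chars = totals(ROWS)
    print(raw_bytes, encoded_chars, round(encoded_chars / raw_bytes, 2))
```

The occurrence counts add up to 2048, which matches the 8192 / 4 figure quoted earlier, and the encoded size comes out at 3741 characters.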
Now to your table itself.
> Character codes 0-7 can be done in 2 characters - \#
And what happens if the code 0x03 is followed by the
character '3'? I suppose \33 would be interpreted as
octal 33, so 2 characters are not always sufficient
for codes 0-7.
> Character codes 7-13 can be done in 2 characters - \a, \b, \t, \n, \v, \f, \r
> Character codes 14-15 can be done in 3 characters - \x#
The same as above: what happens when 0x0f is followed by the
letter 'a'? I suppose \xfa would be read as the single byte 0xfa.
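Both objections come down to greedy digit matching. The sketch below is a toy decoder in Python, not the shell's actual parser: it assumes C-style rules (a backslash followed by up to three octal digits, or \x followed by up to two hex digits, always taking as many digits as possible). The function name decode and these rules are assumptions made for illustration; a real shell may differ in detail.

```python
OCT_DIGITS = "01234567"
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode(text):
    """Toy greedy decoder for \\ooo and \\xhh escapes (assumed C-style rules).

    Only these two escape forms are handled; everything else is copied
    through unchanged. Input is assumed to be ASCII/Latin-1 text.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "x":
                start, max_digits, alphabet, base = i + 2, 2, HEX_DIGITS, 16
            elif nxt in OCT_DIGITS:
                start, max_digits, alphabet, base = i + 1, 3, OCT_DIGITS, 8
            else:
                start = None
            if start is not None:
                end = start
                # Greedy: swallow as many digits as the escape form allows.
                while (end < len(text) and end - start < max_digits
                       and text[end] in alphabet):
                    end += 1
                if end > start:
                    out.append(int(text[start:end], base) & 0xFF)
                    i = end
                    continue
        out.append(ord(ch))
        i += 1
    return bytes(out)


if __name__ == "__main__":
    print(decode(r"\33"))    # short octal form: the '3' is swallowed
    print(decode(r"\0033"))  # padded octal form keeps the '3' literal
    print(decode(r"\xfa"))   # short hex form: the 'a' is swallowed
    print(decode(r"\x0fa"))  # padded hex form keeps the 'a' literal
```

Under these assumptions the short forms only work when the next character is not a valid digit. Otherwise the escape has to be padded to its full width (\003 or \x0f), which costs 4 characters instead of 2 or 3.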
> Character codes 32-126 \ 39,92 can be done in 1 character - themselves
> Character code 39, 92 can be done in 2 characters - \', \\
> Character codes 160-255 can be done in 1 character - themselves
This will make your script a binary file. Maybe the shell doesn't
have a problem with binary files, but many editors do. If you
open the file in an ASCII editor to change a single byte, the
complete file can be corrupted.
> Assuming a binary with no bias towards any particular character, that's
> a mean chars/byte of
>
> (14*2+16*3+93*1+2*2+33*4+96*1)/256
> 1.56640625
>
> Which is substantially less than 4.
3741/2048 = 1.8
But as I already said, it doesn't matter whether this factor
is 4 (using \xnn for every byte), 1.8 (in the example above,
which will result in encoding errors) or 0.9 (in the case of
the self-extracting compressed ASCII encoding used in my batch
program). They are all in the same category, far above the
43 bytes for the com program solution.
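The factors being compared can be reproduced with a few lines of Python. This is only a sketch of the arithmetic quoted in the thread; the names CLASSES, uniform_mean and measured_factor are invented for this illustration.

```python
# (number of codes in the class, characters per byte), exactly as quoted
# in the formula above. Note that the class sizes as quoted sum to 254;
# the sketch reproduces the quoted arithmetic unchanged.
CLASSES = [(14, 2), (16, 3), (93, 1), (2, 2), (33, 4), (96, 1)]


def uniform_mean(classes, alphabet_size=256):
    """Mean chars/byte if every byte value were equally likely."""
    return sum(size * width for size, width in classes) / alphabet_size


def measured_factor(encoded_chars, raw_bytes):
    """Chars/byte for one concrete file."""
    return encoded_chars / raw_bytes


if __name__ == "__main__":
    print(uniform_mean(CLASSES))                   # quoted uniform estimate
    print(round(measured_factor(8192, 2048), 2))   # \xnn for every byte
    print(round(measured_factor(3741, 2048), 2))   # table-based encoding
    # The thread mentions both 1812 and 1871 bytes for the batch version;
    # either figure rounds to the same factor at one decimal place.
    print(round(measured_factor(1812, 2048), 1),
          round(measured_factor(1871, 2048), 1))
```

The uniform estimate differs from the measured factor because the concrete file is heavily biased towards the low codes, which are never encoded in a single character.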
> Of course, with Here Documents, as long as you don't mind binary data
> in your script file, then there's no character-escaping, so no overhead.
And if you do mind binary data, you have to use the
\xnn form for every byte > 127, which will increase your
1.56 factor substantially.
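How large that increase would be can be estimated from the numbers already given. The Python sketch below assumes that only codes 160-255 change (from 1 character to the 4-character \xnn form), since 127-159 are already counted at 4 characters. The helper name escape_high_bytes is hypothetical.

```python
def escape_high_bytes(total_chars, high_bytes, old_width=1, new_width=4):
    """Total encoded size after re-encoding `high_bytes` bytes from
    old_width characters each to new_width characters each."""
    return total_chars + high_bytes * (new_width - old_width)


if __name__ == "__main__":
    # Uniform estimate: 401 is the numerator of the quoted formula,
    # and 96 of the 256 codes lie in 160-255.
    uniform = escape_high_bytes(401, 96) / 256
    # Concrete file: 3741 characters in total, 96 bytes in 160-255.
    measured = escape_high_bytes(3741, 96) / 2048
    print(uniform, round(measured, 2))
```

Under these assumptions the uniform estimate rises from about 1.57 to about 2.69 characters per byte, and the factor for the concrete file rises from about 1.83 to about 1.97.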