Re: HLA Lib
- From: "randyhyde@xxxxxxxxxxxxx" <randyhyde@xxxxxxxxxxxxx>
- Date: 19 Mar 2007 16:02:17 -0700
On Mar 19, 3:06 pm, "vid...@xxxxxxxxx" <vid...@xxxxxxxxx> wrote:
>> Sure, take a look at the conversion routines.
> I just did. The lowest-level functions seemed to be done pretty fast. But
> the higher-level functions suffered from a lot of abstraction there, for
> example separate predicting number of digits before padding, and
> separate conversion.
Actually, the "separate predicting number of digits before padding"
was researched quite well. Take a look at the threads over in CLAX for
the idea. I tried a *lot* of different ways to do generic numeric
conversion. (actually, the "subtract" algorithm in the library still
needs to be replaced by Terje's algorithm; "subtract" is faster on a P4
[where I developed the code], but Terje's algorithm is faster on most
other processors).
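
To make the "predict the number of digits, then pad, then convert" idea concrete, here is a minimal sketch in C rather than HLA. It is not the HLA stdlib code and it is neither the "subtract" algorithm nor Terje's algorithm; count_digits and format_padded are hypothetical names, and the comparison chain is just one simple way to size the field before the digits are written right to left.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of decimal digits in v (1..10),
   found by comparisons only, with no division. */
static unsigned count_digits(uint32_t v)
{
    if (v < 10u) return 1;
    if (v < 100u) return 2;
    if (v < 1000u) return 3;
    if (v < 10000u) return 4;
    if (v < 100000u) return 5;
    if (v < 1000000u) return 6;
    if (v < 10000000u) return 7;
    if (v < 100000000u) return 8;
    if (v < 1000000000u) return 9;
    return 10;
}

/* Hypothetical helper: write v right-justified in a field of at least
   `width` characters, padded on the left with `pad`.
   Returns the number of characters written (not counting the NUL),
   or 0 if `out` (capacity `cap`) is too small. */
static size_t format_padded(uint32_t v, unsigned width, char pad,
                            char *out, size_t cap)
{
    unsigned digits = count_digits(v);
    size_t total = digits > width ? digits : width;
    if (total + 1 > cap) return 0;

    /* Because the digit count is known up front, the padding can be
       emitted first and the digits stored in their final positions. */
    memset(out, pad, total - digits);
    char *p = out + total;
    *p = '\0';
    do {
        *--p = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0);
    return total;
}
```

For example, format_padded( 42, 6, ' ', buf, sizeof buf ) leaves "    42" in buf without any intermediate copy.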
> I worked on this for a very long time for FASMLIB until it reached a good state.
> The general answer is: use your own read buffering for files.
Just something I'm avoiding. If someone wants to do buffered I/O, I
encourage them to rethink their algorithms if speed is an issue.
Because once you start doing all the conversions and buffering and
stuff like that, you're almost *always* paying a heavy price for all
that. Better to rethink the I/O model and try to use fileio.read and
fileio.write (or memory-mapped files). Yes, I could make the stuff
faster at the expense of a fair amount of extra complexity. But that's
not the purpose of the fileio code. The fileio code exists to make
light-weight stuff *easy*, not to be a performance demon. Again, for
*really fast* code, people shouldn't be thinking in terms of doing
file I/O in an item-by-item manner as you would with the usual
fileio routines. If this means that calls to the C stdlib FILE
routines will be faster than the HLA stdlib code, great! I consider
this to be a good thing. It makes the programmer stop thinking in such
high-level terms when performance is an issue.
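
As a rough illustration of the "rethink the I/O model" point, the following C sketch (C, not HLA, and not part of either library) pulls the whole input into memory with block reads and only then converts the items. read_all and sum_ints are hypothetical names; in HLA the analogous pieces would be fileio.read (or a memory-mapped file) followed by the conv.* routines.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from fp into one NUL-terminated
   heap buffer using block reads. Returns NULL on allocation failure.
   The caller frees the result. */
static char *read_all(FILE *fp, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap + 1);
    if (!buf) return NULL;
    for (;;) {
        size_t got = fread(buf + len, 1, cap - len, fp);
        len += got;
        if (got == 0) break;              /* EOF or read error */
        if (len == cap) {                 /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2 + 1);
            if (!bigger) { free(buf); return NULL; }
            buf = bigger;
            cap *= 2;
        }
    }
    buf[len] = '\0';
    if (len_out) *len_out = len;
    return buf;
}

/* Hypothetical helper: convert whitespace-separated decimal integers
   straight out of the in-memory text. Stores their sum in *sum_out and
   returns how many were converted. Stops at the first non-number. */
static size_t sum_ints(const char *text, long *sum_out)
{
    size_t count = 0;
    long sum = 0;
    const char *p = text;
    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p) break;              /* nothing converted */
        sum += v;
        ++count;
        p = end;
    }
    *sum_out = sum;
    return count;
}
```

The I/O cost here is a handful of large reads, independent of how many items the file contains; all per-item work happens on memory.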
>> Sure, all this could be handled, but that
>> would involve having HLA emit some code behind the user's back to
>> execute whenever the program quits, and surely you've seen the
>> reaction to that kind of stuff around here. With the single exception
>> of exception handling, the HLA stdlib does not require any startup or
>> cleanup code. I want to keep it that way. Again, if someone wants
>> buffering, they can use memory-mapped files (or fileio.read) and use
>> the conv.* library to do their processing.
> Some init and finishing code is a must-have for a solidly designed
> library. How are you going to implement a heap for Linux without it?
The first memory-management call sets things up. There is no
"finishing code" required for heap management.
> It doesn't have to be hidden, you can force the user to always call it :)
That is a fundamental difference in philosophy between your library
and mine. Mine is intended to be used by beginning students as well as
by more advanced programmers. Requiring a whole bunch of calls at the
beginning of their program to initialize things they aren't explicitly
using requires a lot of explanation that you just don't want to go
into early on in the course. And thus far, I've gotten away without
having any explicit initialization code up front other than for
exception handling. If this means that the standard fileio routines
won't be able to do buffering, I'm okay with that. Of course, at some
point I *could* add a bufferio package to the library that requires
appropriate initialization and cleanup calls, but I just don't see
that as an absolute necessity for the current go-round. There are much
more important things that the library needs, such as a zstring
package, a full set of filesys functions (e.g., directory enumeration,
stuff like that).
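
The "first memory-management call sets things up" approach mentioned earlier can be sketched as follows. This is C rather than HLA and is not the HLA stdlib allocator: my_alloc, heap_init, and the fixed static arena are all assumptions made for illustration (a real Linux heap would obtain its memory from the OS instead).

```c
#include <stdbool.h>
#include <stddef.h>

#define ARENA_SIZE 4096u

/* Stand-in for memory obtained from the OS (an assumption of this sketch). */
static unsigned char arena[ARENA_SIZE];
static size_t arena_used;
static bool heap_ready = false;
static unsigned init_count = 0;     /* only here to show init runs once */

/* One-time setup. Nothing outside this file ever calls it. */
static void heap_init(void)
{
    arena_used = 0;
    heap_ready = true;
    ++init_count;
}

/* Hypothetical bump allocator: the first call initializes the heap,
   so the user never writes an explicit startup call. Returns NULL
   when the arena is exhausted. */
static void *my_alloc(size_t n)
{
    if (!heap_ready) heap_init();
    if (n > ARENA_SIZE - arena_used) return NULL;
    void *p = arena + arena_used;
    arena_used += n;
    return p;
}
```

The cost of doing it this way is one flag test per allocation call; in exchange, a beginner's program needs no initialization or cleanup calls for a package it may never use.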
>>> ???
>> stdin.readln will raise the EOF exception if you insist on reading
>> beyond the EOF. If you're writing filter programs that generally take
>> file data on the stdin, you can handle EOF that way. If you *really*
>> want to use generic fileio functions on stdin, just call
>> stdin.handle() and use the resulting handle as the parameter to the
>> fileio functions. stdin is mainly intended for console-like input.
> But other functions regarding stdin seem to act as if stdin should
> never give EOF.
They are all based around readln. The exception percolates up.
> For example, stdin._geti_ fails to read a number if it's the
> last thing in a redirected file.
_geti_ is based around readln. Readln will return a line feed the
*first* time EOF occurs; it raises an exception if you try to read more
data from the input. This behavior, btw, was modified based on *your*
suggestions from the last time we had this discussion. In any case,
let's look at _geti_ from the perspective of a number appearing just
before the EOF:
procedure _geti_;
@noframe;
@nodisplay;
begin _geti_;
push( ecx );
push( esi );
// If the input buffer is empty, read a new line from the
// standard input device.
// NOTE: because we're processing digits, NeedsInput will be false
// as there is still additional data in the input buffer. Therefore,
// we skip over the following call to ReadLn.
cmp( NeedsInput, false );
je gotInput;
ReadLn();
gotInput:
// The following code skips over leading delimiter characters in the
// input buffer.
// Presumably, we're processing digits at this point, so it doesn't
// really do anything
// (if there are delimiters, it will skip over them up to that last
// integer on the line).
xor( eax, eax );
mov( InputIndex, ebx );
mov( CharsInBuf, ecx );
SkipDelimsLoop:
mov( InputBuffer[ ebx ], al );
cmp( al, '_' );
je isDelim;
bt( eax, (type dword Delimiters ));
jnc NoMoreDelims;
isDelim:
inc( ebx );
cmp( ebx, ecx );
jb SkipDelimsLoop;
// We are at the end of the line, so read a new line from the
// user.
ReadLn();
mov( InputIndex, ebx );
mov( CharsInBuf, ecx );
jmp SkipDelimsLoop;
NoMoreDelims:
// Okay, now we process the digits up to the end of the input buffer.
// Note that ReadLn always puts a zero at the end of the input buffer,
// so we know there is a valid delimiter/sentinel character at the end
// of the buffer (even if we've hit EOF):
// Point esi at the start of the text and call conv.atoi64 to
// convert this text to an integer.
lea( esi, InputBuffer[ebx] );
try
conv.atoi64( [esi] ); // Convert to an integer.
// Verify that we end with a valid delimiter:
movzx( (type char [esi]), ecx );
bt( ecx, Delimiters );
jc goodDelim;
raise( ex.ConversionError );
goodDelim:
anyexception
mov( true, NeedsInput);
raise( eax );
endtry;
// Compute new InputIndex value.
sub( &InputBuffer, esi );
mov( esi, InputIndex );
pop( esi );
pop( ecx );
ret();
end _geti_;
Now there might be a bug in there somewhere. If you've got an example
that fails, I'd love to see it. But based on the way ReadLn and _geti_
work, it *should* handle EOF at the end of the digit string just fine.
Cheers,
Randy Hyde
P.S. Just for completeness, here's the ReadLn code:
procedure ReadLn;
@noframe;
@nodisplay;
@noalignstack;
// Ugly static variables means this code is not reentrant, but then,
// we couldn't have two separate threads simultaneously reading from
// the stdin, anyway.
static
lastWasEOF :boolean := false;
lastWasCR :boolean := false;
inputChar :char;
begin ReadLn;
push( eax );
push( ecx );
mov( -1, ecx ); // Becomes 0 on first iteration.
repeatUntilEOLN:
add( 1, ecx );
stdin.read( inputChar, 1 );
test( eax, eax ); // Check for EOF
mov( inputChar, al );
jnz notAtEOF;
// If we hit end of file, then return a linefeed
// on the first EOF and raise EOF on the second.
cmp( lastWasEOF, false );
jne raiseEOF;
mov( stdio.lf, al );
mov( true, lastWasEOF );
jmp processChar;
notAtEOF:
mov( false, lastWasEOF );
processChar:
mov( al, InputBuffer[ecx] );
// If the last character we read was a CR, then set
// lastWasCR to true so we'll ignore the next character
// if it is a LF.
//
// Technically, this is a win32-ism, but it probably
// doesn't hurt to do this for the *NIX OSes as well.
cmp( al, stdio.cr );
jne tryLF;
// Carriage return is a special case.
// If the input is coming from the console,
// we treat CR like the end of line. But if
// we're reading data from a file, then we
// need to read one more character to see if
// it is a LF (because we don't want to leave
// the LF around for the next read operation).
//
// Technically, this is a win32-ism, but it probably
// doesn't hurt to do this for the *NIX OSes as well.
mov( true, lastWasCR );
jmp readlnDone;
// Bail if we encounter a LF:
tryLF:
cmp( al, stdio.lf );
jne chkIndex;
// If the current character is a LF and the
// last character was a CR, then quietly ignore
// this LF because we've already processed the
// end of the line.
cmp( lastWasCR, false );
je readlnDone;
// At this point, the previous character was a CR and the
// current one is a LF, so discard the LF and clear lastWasCR:
mov( false, lastWasCR );
sub( 1, ecx ); // Don't use this LF char.
jmp repeatUntilEOLN;
// Bail if buffer overflow:
chkIndex:
mov( false, lastWasCR ); // At this point, the last char
cmp( ecx, 1022 ); // was not a CR.
jb repeatUntilEOLN;
readlnDone:
// Save away the character count:
mov( ecx, CharsInBuf );
// Zero terminate the input buffer (other stdlib code depends
// upon this buffer being zero terminated).
mov( 0, InputBuffer[ ecx ] );
// Initialize the index into the input buffer.
mov( 0, InputIndex );
// Initialize the NeedsInput variable with false
// since we just read some input (even if it was
// the empty string we should do this).
mov( false, NeedsInput );
pop( ecx );
pop( eax );
ret();
raiseEOF:
raise( ex.EndOfFile );
end ReadLn;
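
For readers who do not follow HLA, the EOF rule above (hand back a line feed on the first EOF, report EOF on the second) can be modeled in C. This is a simplified sketch, not a translation of the library code: LineReader, reader_init, and read_line are hypothetical names, the "stdin" is an in-memory string, EOF is reported with a return code instead of an exception, and the buffer-overflow handling is simplified.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_CHARS 1023
enum { RL_OK = 0, RL_EOF = -1 };

typedef struct {
    const char *src;              /* simulated stdin contents */
    size_t pos, len;
    bool last_was_eof;            /* previous read hit EOF */
    bool last_was_cr;             /* previous line ended with CR */
    char line[LINE_MAX_CHARS + 1];
    size_t chars_in_buf;
} LineReader;

static void reader_init(LineReader *r, const char *src)
{
    memset(r, 0, sizeof *r);
    r->src = src;
    r->len = strlen(src);
}

/* Read one line into r->line (always zero terminated, line end stripped).
   Returns RL_OK, or RL_EOF on the second consecutive end of input. */
static int read_line(LineReader *r)
{
    size_t n = 0;
    for (;;) {
        int c = r->pos < r->len ? (unsigned char)r->src[r->pos++] : -1;
        if (c < 0) {
            if (r->last_was_eof) return RL_EOF;  /* second EOF: report it */
            r->last_was_eof = true;
            c = '\n';                            /* first EOF: act like LF */
        } else {
            r->last_was_eof = false;
        }
        if (c == '\r') { r->last_was_cr = true; break; }
        if (c == '\n') {
            if (!r->last_was_cr) break;          /* plain LF ends the line */
            r->last_was_cr = false;              /* LF after CR: swallow it */
            continue;
        }
        r->last_was_cr = false;
        r->line[n++] = (char)c;
        if (n == LINE_MAX_CHARS) break;          /* buffer full */
    }
    r->line[n] = '\0';
    r->chars_in_buf = n;
    return RL_OK;
}
```

With the input "12\r\n345" (no line end after the last number), the first call yields "12", the second yields "345" because the first EOF acts as the line terminator, and only the third call returns RL_EOF. That is the behavior the digit-conversion code relies on: the number just before EOF still sits in a zero-terminated buffer when it is converted.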