Re: help with parsing and dcg (swi-prolog in particular)

On 2005-05-21, Brian Hulley <brianh@xxxxxxxxxxxx> wrote:
> Brian Hulley wrote:
>> blindsearch wrote:
>> > I don't think a single list could handle it. The file I was reading in
>> > was 2.5mb, there were 8199 frames.
> You could also modify the convert_file_to_list predicate so that it
> counted parentheses - when the opening bracket is closed, you stop
> reading from the file and return the list which should contain the
> chars for a single concept (assuming the original file is bracketed
> correctly, which from your first post seemed to be the case)
> Thus your main loop would be something like:

Just some stylistic issues. see/1 and seen/0 are old Edinburgh-style
I/O. New code should use the ISO open/3 and close/1 instead.
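
As a rough illustration of the open/close style, here is a minimal
sketch that counts the character codes in a file. The predicate names
count_codes/2 and count_codes_/3 are made up for this example and are
not part of the code under discussion; error handling is left out.

```prolog
% Count the character codes in File using an explicit stream
% instead of see/seen. (Hypothetical example predicate.)
count_codes(File, Count) :-
    open(File, read, Stream),
    count_codes_(Stream, 0, Count),
    close(Stream).

% Read until get_code/2 returns -1 (end of file).
count_codes_(Stream, N0, N) :-
    get_code(Stream, Code),
    (   Code =:= -1
    ->  N = N0
    ;   N1 is N0 + 1,
        count_codes_(Stream, N1, N)
    ).
```

Because the stream is passed around explicitly, no goal in the loop
depends on what the current input stream happens to be.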

> go :-
> see(File),
> get_byte(Char),

get_byte/1 is ISO, but it is probably not what you want. It appears
you're trying to parse a text file, so you should use get_code/1. In
SWI-Prolog 5.4.x the two still behave the same, but in 5.5.x get_byte
reads a byte and get_code reads a Unicode character code, dealing with
whatever character encoding the file has (ISO-8859-1, UTF-8, UCS-2, etc.).
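
To make the difference concrete, a small sketch that reads the same
file both ways. file_codes/2, file_bytes/2 and their helpers are names
invented here. It assumes a Prolog system where get_byte/2 wants a
stream opened with type(binary), as ISO prescribes.

```prolog
% Text view: get_code/2 decodes according to the stream's encoding.
file_codes(File, Codes) :-
    open(File, read, Stream),
    read_codes(Stream, Codes),
    close(Stream).

read_codes(Stream, Codes) :-
    get_code(Stream, C),
    (   C =:= -1
    ->  Codes = []
    ;   Codes = [C|T],
        read_codes(Stream, T)
    ).

% Binary view: get_byte/2 returns the raw bytes, undecoded.
file_bytes(File, Bytes) :-
    open(File, read, Stream, [type(binary)]),
    read_bytes(Stream, Bytes),
    close(Stream).

read_bytes(Stream, Bytes) :-
    get_byte(Stream, B),
    (   B =:= -1
    ->  Bytes = []
    ;   Bytes = [B|T],
        read_bytes(Stream, T)
    ).
```

For a plain ASCII file both lists are identical. For a file in a
multi-byte encoding the byte list is longer than the code list.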

> start_with(Char).
> start_with(-1) :- seen. % end of file

You need a cut here. The second clause also matches -1, so without the
cut a choicepoint is left, and if you backtrack into this predicate it
will start reading from whatever stream is current (probably the user's
terminal). Here you also see a reason for open/close: instead of
starting to read from some unknown stream, you will get an error that
the stream does not exist.
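
A sketch of the same pattern with the cut in place and an explicit
stream. stream_codes/2 and codes_from/3 are hypothetical names; the
structure mirrors start_with/1, but it only collects the codes.

```prolog
% Read every remaining code from Stream into a list.
stream_codes(Stream, Codes) :-
    get_code(Stream, Code),
    codes_from(Code, Stream, Codes).

% End of file: the cut commits to this clause, so the general
% clause below is never tried for -1 on backtracking.
codes_from(-1, _Stream, Codes) :- !,
    Codes = [].
codes_from(Code, Stream, [Code|Codes]) :-
    get_code(Stream, Next),
    codes_from(Next, Stream, Codes).
```

With the cut, stream_codes/2 has exactly one solution for a given
stream. Without it, backtracking would enter the second clause with
Code = -1 and try to read again.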

> start_with(Char) :-
> get_next_concept_list(Char,List,NextChar),
> phrase(concept(Concept), List, Remainder), % ignore remainder (?)
> write(Concept),nl,
> start_with(NextChar).
> Garbage collection will ensure that you don't run out of space in this
> kind of loop (using tail recursion).

Only if there are no choicepoints left by get_next_concept_list or
concept! Write them carefully; you can use the graphical debugger to
see the choicepoints. A poor man's solution is to add a cut before
entering the recursion, but this is bad style.
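
If you would rather check from the toplevel than from the debugger,
here is one possible sketch. It assumes a recent SWI-Prolog that
provides setup_call_cleanup/3, whose cleanup goal runs as soon as the
main goal has no choicepoints left. exits_det/2 is a name made up for
this example.

```prolog
% Det is true if Goal succeeds without leaving a choicepoint,
% false if it succeeds with a choicepoint still open.
% Fails if Goal fails. Assumes SWI-Prolog's setup_call_cleanup/3.
exits_det(Goal, Det) :-
    setup_call_cleanup(true, Goal, Flag = true),
    (   Flag == true
    ->  Det = true
    ;   Det = false
    ),
    !.   % discard any remaining alternatives of Goal
```

For example, exits_det(member(_, [1,2]), D) gives D = false, because
member/2 can still try the second element, while
exits_det(memberchk(1, [1,2]), D) gives D = true. Running the same test
on get_next_concept_list and on the phrase/3 call would show whether
the loop can really run in constant space.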