When using chars (lists of characters, thus atoms of length one) to represent text, we have the following options for writing them within terms:
"First,"the double quoted list notation (6.3.7) is the most efficient one, requiring at least n+2 characters. But it can only be read back if the Prolog flagdouble_quotesis set tochars.['N',e,x,t,',']comes the list notation with at least 2n+1 characters. While it is nice and relatively compact, it implies that also operators are used when writing other data since it is enabled withignore_ops(false), and this necessitates that the same operators will be present when reading, making it quite brittle.'.'('L','.'(a,'.'(s,'.'(t,'.'(',',[])))))the canonical notation which uses functional form also for lists requiring at least 7n+2 characters. That is a lot, but for interoperability (and that includes interoperability with the same system) it is best since it neither depends on thedouble_quotesflag nor the various operator declarations.
Writing chars in canonical notation can be done in constant space. But for reading, the situation is a bit more tricky. After all, a sequence starting with '.'(a, may also refer to a term '.'(a,Further,b). So a naive reading will have to wait (and use space) until the entire list of chars is read in. On the other hand, it seems to be a safe bet that '.'(a, will be a list constructor '.'(a,Further). In other words,
How to read a term in canonical notation with constant auxiliary space for the reading of chars within?
In case it helps just consider terms sampleterm/1. So consider the reading of all such terms written in canonical form. And, if you like, formulate it as a DCG.
sampleterm([]).
sampleterm(a).
sampleterm(b).
sampleterm('.'(E,Es)) :- % the real list constructor
sampleterm(E),
sampleterm(Es).
sampleterm('.'(E,F,G)) :- % no list constructor
sampleterm(E),
sampleterm(F),
sampleterm(G).
If such space efficient reading is possible, then systems that support a compact internal representation of chars like Scryer and Trealla could even go a tiny step further.
Ah, lest I forget what I have tried: read/1 indeed, but currently it was not ideal.
The following straightforward code is based on Prolog streams.
It focuses on reading "real lists sampletrees" from repositionable streams.
For (1) non-repositionable streams and (2) handling
'.'/3we fall back toread/1.The main predicate is
read_sampleterm/1:read_sampleterm(Term) :- current_input(S), ( stream_property(S,reposition(true)), stream_property(S,position(P)), ( aux_read_sampleterm_1(Term0), get_char('.') % this is sloppy! -> true ; set_stream_position(S,P), fail ) -> Term = Term0 ; read(Term) % fallback ).Note that above code is sloppy: at the end of the read, we need to ensure that EOF or a character that does not combine with
'.'follows.The actual reading is done by the following auxiliary predicates:
aux_read_sampleterm_1(Term) :- get_char(Ch), aux_read_sampleterm_2(Ch,Term,0). % use indexing aux_read_sampleterm_2('\'',[X|Xs],N0) :- get_char('.'), get_char('\''), get_char('('), aux_read_sampleterm_1(X), get_char(','), N1 is N0 + 1, get_char(Ch), aux_read_sampleterm_2(Ch,Xs,N1). aux_read_sampleterm_2('[',[],N) :- get_char(']'), eat_rparens(N). aux_read_sampleterm_2(a,a,N) :- eat_rparens(N). aux_read_sampleterm_2(b,b,N) :- eat_rparens(N). eat_rparens(N) :- ( N > 0 -> get_char(')'), N0 is N - 1, eat_rparens(N0) ; true ).To show some simple use cases we read from files:
read_sampleterm_from_file(File,Term) :- open(File,read,S,[type(text)]), current_input(S0), set_input(S), read_sampleterm(Term0), set_input(S0), close(S), Term = Term0.Sample queries using GNU Prolog 1.5.0:
First,
sample1.txt:We get:
| ?- read_sampleterm_from_file('sample1.txt',T). T = [a,b] yesNext,
sample2.txt:We get:
| ?- read_sampleterm_from_file('sample2.txt',T). T = [a,b|a] yessample3.txtis next:'.'('.'(a,'.'(b,'.'(a,[]))),[]).We get:
| ?- read_sampleterm_from_file('sample3.txt',T). T = [[a,b,a]] (1 ms) yesNote that running above tests were run without the "fallback option".