----------------------------------------------------------------------- Notes on Parsing in Prolog (produced by Diane Horton) ----------------------------------------------------------------------- - Parsing in Prolog is quite easy because we can exploit Prolog's backtracking ability to do much of the work. - We simply have to tell Prolog about the grammar, and also what we want the parse trees to look like. - To do this, we'll write one predicate for every non-terminal in the grammar. - As an example, we'll use this grammar: ::= . ::= | ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 - Exercise: 3.519 is an example "sentence" in the language. Draw its parse tree. - In our parser code below, we will assume that the sentence has been read and stored as a list of atoms, such as [3, ., 5, 1, 9]. You will make the same assumption in your project. Using difference lists ---------------------- - With this approach, each of the parsing predicates has 3 arguments. - For example, a parser for the little grammar above would include a predicate part(Tree, Input, Remainder) that succeeds iff Input begins with a , Remainder is what's left of Input after parsing the , and Tree is the piece of parse tree that represents the . - Here's a parser for the whole grammar (not just for ). digit(_, [H|Rest], Rest):- member(H, [0,1,2,3,4,5,6,7,8,9]). part(_, Input, Remainder):- digit(_, Input, Remainder). part(_, Input, Remainder):- digit(_, Input, R1), part(_, R1, Remainder). real-number(_, Input, Remainder):- part(_, Input, [.|R1]), part(_, R1, Remainder). Notice that I have used "_" everywhere the pieces of parse tree would go. - Here is an example of what might happen if we queried the part predicate. Of course, the user would not directly ask such a thing; it would happen during the parsing of a full real-number. ?- part(Str, [5, 1, 9], Rem). Rem = [1,9] Str = _82; Rem = [9] Str = _82; Rem = [] Str = _82; no - Notice that the first solution above used up only the first element of the input, leaving a remainder of [1, 9]. If this goal were to be satisfied during parsing of the "519" component of the real-number "3.519", we would eventually hit a dead end -- the "19" at the end would have no way of being used up. - Exercise: By hand, trace the goal real-number(ParseTree, [3, ., 5, 1, 9], Rem). far enough to observe that this is true. - But here is the beauty of Prolog: It just backs up and keeps trying other ways to parse "519" as beginning with (or completely made of) a . Eventually it will find the 3rd solution above, which *will* fit in with solving the initial goal. - Exercise: Run the goal real-number(ParseTree, [3, ., 5, 1, 9], Rem). with tracing turned on, to see the backtracking in action. Using DCG notation for difference lists --------------------------------------- (recommended as the most elegant solution) real_number --> part, [.], part. part --> digit. part --> digit, part. digit --> [H],{member(H, [0,1,2,3,4,5,6,7,8,9])}. % or equivalently % digit([H|Rest], Rest):- member(H, [0,1,2,3,4,5,6,7,8,9]). Note: Between curly brackets you can put normal Prolog code. Add parse trees ----------------- The names of the functors in the internal nodes of the trees are the nonterminals in the CFG. Example: ?- real_number(Tree, [3,.,5,6],[]). Tree = real_number(part(digit(3)), '.',part(digit(5), part(digit(6)))) Yes real_number / | \ part . part | / \ digit digit part | | | 3 5 digit | 6 real_number(real_number(P1,.,P2)) --> part(P1), [.], part(P2). part(part(D)) --> digit(D). part(part(D,P)) --> digit(D), part(P). digit(digit(H)) --> [H],{member(H, [0,1,2,3,4,5,6,7,8,9])}. Not using difference lists -------------------------- - It is also possible to write our predicates without the remainder argument. But In order to do so, you have to find a way to give Prolog the ability to back up and try alternative ways of breaking down a "sentence" -- as we saw happening above.