Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
940 views
in Technique[技术] by (71.8m points)

list - Sentence parsing and matching in Prolog

I'm trying to create a sentence parser in Prolog. I want the sentence to be parsed into three separate lists that will be matched with a suggested outcome.

For example, here's the code I have come up with so far...


This is the vocabulary which will be used to parse certain words:

det(the).
det(a).
det(an).
noun(cat).
noun(boy).
noun(girl).
noun(grandfather).
noun(person).
noun(mat).
noun(party).
noun(book).
noun(problem).
noun(father).
noun(student).
noun(golf).
noun(conversation).
noun(challenge).
noun(person).
noun(problem).
noun(sat).
verb(thrives).
verb(loves).
verb(likes).
prep(on).
prep(with).
adj(big).
adj(lonely).
adj(elderly).
adj(teenage).
adj(good).
adj(fat).

sentence(Sentence,sentence(Noun_Phrase,Verb_Phrase)):-
      np(Sentence,Noun_Phrase,Rem),
      vp(Rem,Verb_Phrase).

Code for parsing:

np([X|T],np(det(X),NP2),Rem):- /* Det NP2 */
    det(X),
    np2(T,NP2,Rem).
np(Sentence,Parse,Rem):- np2(Sentence,Parse,Rem). /* NP2 */
np(Sentence,np(NP,PP),Rem):- /* NP PP */
    np(Sentence,NP,Rem1),
    pp(Rem1,PP,Rem).

np2([H|T],np2(noun(H)),T):- noun(H). /* Noun */
np2([H|T],np2(adj(H),Rest),Rem):- adj(H),np2(T,Rest,Rem).

pp([H|T],pp(prep(H),Parse),Rem):- /* PP NP */
    prep(H),
    np(T,Parse,Rem).

vp([H| []], vp(verb(H))):- /* Verb */
    verb(H).

vp([H|T], vp(verb(H), Rem)):- /* VP PP */
    vp(H, Rem),
    pp(T, Rem, _).

vp([H|T], vp(verb(H), Rem)):- /* Verb NP */
    verb(H),
    np(T, Rem, _).

I then want to get the output from this with something like this...

?- sentence([a,very,young,boy,loves,a,manual,problem],Parse).

And then somehow suggest a present out of a fact like this...

present(construction_kit, subject(very,young,boy), verb(loves), object(manual,problem)).
present(a_golfing_sweater, subject(young,father), verb(loves), object(golf)).

As you can probably see, I would want to match this with the 'construction_kit' present.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Let's try and trim this down to a small problem so we can see how we might approach it. Let's start with a simpler grammar.

sentence(NP, VP) --> noun_phrase(NP), verb_phrase(VP).
noun_phrase(np(Noun, Adj)) --> det, adjectives(Adj), noun(Noun).

det --> [D], { det(D) }.
det --> [].

noun(N) --> [N], { noun(N) }.

adjectives([]) --> [].
adjectives([A|As]) --> adjective(A), adjectives(As).
adjective(A) --> [A], { adj(A) }.

verb_phrase(vp(Verb, Noun)) --> verb(Verb), noun_phrase(Noun).

verb(V) --> [V], { verb(V) }.

This is not a complete grammar even for your problem, but it will parse some sentences:

?- phrase(sentence(NP,VP), [a,teenage,boy,loves,a,big,problem]).
NP = np(boy, [teenage]),
VP = vp(loves, np(problem, [big])) 

Now we can make a present database. I'm just going to consider the "head" of the noun phrase for now.

present('construction kit', boy, loves, problem).

Now you can deploy a very simple query to try and match them up:

?- phrase(sentence(np(Noun,_), vp(Verb, np(Object, _))),
   [a,teenage,boy,loves,a,big,problem]), 
   present(Suggestion, Noun, Verb, Object).
Noun = boy,
Verb = loves,
Object = problem,
Suggestion = 'construction kit' ;

Now you can see immediately your biggest problem: if you try to match a more complex grammar, you're going to have to have a corresponding increase in the complexity of your query which turns that into a suggestion. The solution to this kind of problem, where two things appear to increase in complexity in concert, is always to create an abstraction between them, but that is out of scope for this question.

Edit It's worth noting that if you want to build a general recommendation system, this is not a modern approach. I would recommend the book Programming Collective Intelligence for a quick overview of several approaches, backed by fairly readable Python.

Edit Is it clear that making the grammar more complex will result in more complexity in matching presents?

For starters, I have omitted entirely your prepositional phrase production. I just didn't include that. If it were there, would it have bearing on the suggestion? Consider a sentence like "a teenage boy who loves to code loves a big problem"—the natural extension of my noun_phrase//1 to cover this case will miss out on an important detail of this child which might lead you to a different present than the construction set.

Supposing you want to handle that supplemental information, you will have to add more patterns to the first argument to phrase/3. In fact, you'll probably accept whatever comes out and then dispatch to some kind of suggestion system.

One thing @TomasBy gets right is that we are using a very simple intermediate representation here, which is just (noun,verb,object). But if we make the grammar more complex, we will have to capture and handle more of it in this representation which will raise the complexity of the suggestion system. I don't see how to extend this simple representation to handle prepositional phrases, for instance, unless you just throw them away.

Converting the DCG back to ordinary Prolog

This is straightforward to do, but Prolog will do it for you. You can simply load the DCG above and use listing/0 to get the answer.

| ?- listing.

% file: user

sentence(A, B, C, D) :-
    noun_phrase(A, C, E),
    verb_phrase(B, E, D).

noun_phrase(np(A, B), C, D) :-
    det(C, E),
    adjectives(B, E, F),
    noun(A, F, D).

det([A|B], C) :-
    det(A),
    B = C.
det(A, A).

noun(A, [A|B], C) :-
    noun(A),
    B = C.

adjectives([], A, A).
adjectives([A|B], C, D) :-
    adjective(A, C, E),
    adjectives(B, E, D).

adjective(A, [A|B], C) :-
    adj(A),
    B = C.

verb_phrase(vp(A, B), C, D) :-
    verb(A, C, E),
    noun_phrase(B, E, D).

verb(A, [A|B], C) :-
    verb(A),
    B = C.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...