Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
249 views
in Technique[技术] by (71.8m points)

regular language - The difference between Chomsky type 3 and Chomsky type 2 grammar

I'm having trouble articulating the difference between Chomsky type 2 (context free languages) and Chomsky type 3 (Regular languages).

Can someone out there give me an answer in plain English? I'm having trouble understanding the whole hierarchy thing.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

A Type II grammar is a Type III grammar with a stack

A Type II grammar is basically a Type III grammar with nesting.

Type III grammar (Regular):

Use Case - CSV (Comma Separated Values)

Characteristics:

  • can be read with a using a FSM (Finite State Machine)
  • requires no intermediate storage
  • can be read with Regular Expressions
  • usually expressed using a 1D or 2D data structure
  • is flat, meaning no nesting or recursive properties

Ex:

this,is,,"an "" example",

"of, a",type,"III
",grammar

As long as you can figure out all of the rules and edge cases for the above text you can parse CSV.

Type II grammar (Context Free):

Use Case - HTML (Hyper Text Markup Language) or SGML in general

Characteristics:

  • can be read using a DPDA (Deterministic Pushdown Automata)
  • will require a stack for intermediate storage
  • may be expressed as an AST (Abstract Syntax Tree)
  • may contain nesting and/or recursive properties

HTML could be expressed as a regular grammar:

<h1>Useless Example</h1>
<p>Some stuff written here</p>
<p>Isn't this fun</p>

But it's try parsing this using a FSM:

<body>
  <div id=titlebar>
    <h1>XHTML 1.0</h1>
    <h2>W3C's failed attempt to enforce HTML as a context-free language</h2>
  </div>
  <p>Back when the web was still pretty boring, the W3C attempted to standardize away the quirkiness of HTML by introducing a strict specification</p
  <p>Unfortunately, everybody ignored it.</p>
</body>

See the difference? Imagine you were writing a parser, you could start on an open tag and finish on a closing tag but what happens when you encounter a second opening tag before reaching the closing tag?

It's simple, you push the first opening tag onto a stack and start parsing the second tag. Repeat this process for as many levels of nesting that exist and if the syntax is well-structured, the stack can be un-rolled one layer at a time in the opposite level that it was built

Due to the strict nature of 'pure' context-free languages, they're relatively rare unless they're generated by a program. JSON, is a prime example.

The benefit of context-free languages is that, while very expressive, they're still relatively simple to parse.


But wait, didn't I just say HTML is context-free. Yep, if it is well-formed (ie XHTML).

While XHTML may be considered context-free, the looser-defined HTML would actually considered Type I (Ie Context Sensitive). The reason being, when the parser reaches poorly structured code it actually makes decisions about how to interpret the code based on the surrounding context. For example if an element is missing its closing tags, it would need to determine where that element exists in the hierarchy before it can decide where the closing tag should be placed.

Other features that could make a context-free language context-sensitive include, templates, imports, preprocessors, macros, etc.

In short, context-sensitive languages look a lot like context-free languages but the elements of a context-sensitive languages may be interpreted in different ways depending on the program state.

Disclaimer: I am not formally trained in CompSci so this answer may contain errors or assumptions. If you asked me the difference between a terminal and a non-terminal you'll earn yourself a blank stare. I learned this much by actually building a Type III (Regular) parser and by reading extensively about the rest.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...