A context-free grammar is a four-tuple (V, Σ, R, S) where
Here is an example grammar
S → 0S0
S → 1S1
S→ε
If a grammar offers multiple options for what a variable can expand to, we can consolidate all of those options into a single rule that offers several alternatives on the right hand side separated by | symbols.
For example,
S → 0S0
S → 1S1
S→ε
becomes
S → 0S0 | 1S1 | ε
A derivation is a sequence of applications of grammar rules, starting from the string S and continuing until there are only terminal symbols left in the string.
Here is an example derivation using the previous example grammar.
S → 0S0 →01S10→011110
A context-free language is the set of all strings that can be derived from a particular context-free grammar.
To demonstrate that a particular string is in the language, we have to present a derivation for that string.
To show that a particular language is a context-free language, we must construct a grammar for the language and then argue that the grammar generates exactly L.
Consider the language
L = { x | x contains at least two 1s }
I claim that the following grammar generates L:
S → AAB
A → 0A | A0 | 1A | A1 | 1
B → 0B | 1B | ε
The intuition behind this grammar is that any string in L can be decomposed into three parts: a substring containing at least one 1, a second substring containing at least one 1, and a third substring containing 0 or more 1s. The rules for the variable A generate substrings that have to have at least one 1. The rules for the variable B generate substrings that contain 0 or more 1s.
A context-free grammar is in Chomsky normal form if all of its rules obey the following constraints:
See https://en.wikipedia.org/wiki/Chomsky_normal_form
We convert the previous example grammar
S → AAB
A → 0A | A0 | 1A | A1 | 1
B → 0B | 1B | ε
into Chomsky normal form.
1) Eliminate start symbol from right-hand sides: not needed
2) Elimate rules with nonsolitary terminals:
S → AAB
A → ZA | AZ | OA | AO | O
B → ZB | OB | ε
Z → 0
O → 1
3) Eliminate long right hand sides:
S → AC
C → AB
A → ZA | AZ | OA | AO | O
B → ZB | OB | ε
Z → 0
O → 1
4) Eliminate ε rules
S → AC
C → AB | A
A → ZA | AZ | OA | AO | O
B → ZB | OB | Z | O
Z → 0
O → 1
5) Eliminate unit rules
S → AC | AA | OC | AO | OA | OO
C → AB | OB | AZ | AO | OZ | OO
A → ZA | AZ | OA | AO | ZO | OZ | OO
B → ZB | OB | ZZ | ZO | OO | OZ
Z → 0
O → 1