regular expressions
Post on 18-Nov-2014
Regular Expressions
Regular Expression
• A regular expression (RE) is defined inductively:
  a = an ordinary character from Σ
  ε = the empty string
Regular Expression
R|S = either R or S
RS = R followed by S (concatenation)
R* = concatenation of R zero or more times (R* = ε | R | RR | RRR | ...)
RE Extensions
R? = ε | R (zero or one R)
R+ = RR* (one or more R)
RE Extensions (cont.)
[abc] = a|b|c (any of the listed characters)
[a-z] = a|b|...|z (range)
[^ab] = c|d|... (anything but 'a' or 'b')
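The extensions can be checked against their expansions with a small script. This is a minimal sketch that assumes Python's `re` module as the regex engine; the names `PAIRS`, `SAMPLES`, and `same_on_samples` are made up for the illustration, and comparing on a handful of sample strings is a sanity check, not a proof.

```python
import re

# Each pair is (shorthand, expansion using only |, concatenation, and *).
# Python has no ε symbol, so the empty alternative is written "(?:|a)".
PAIRS = [
    (r"a?", r"(?:|a)"),        # R? = ε | R
    (r"a+", r"aa*"),           # R+ = RR*
    (r"[abc]", r"(?:a|b|c)"),  # [abc] = a|b|c
]

SAMPLES = ["", "a", "aa", "aaa", "b", "c", "ab", "d"]


def same_on_samples(p, q, samples=SAMPLES):
    """True if p and q accept exactly the same sample strings."""
    return all(
        bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples
    )


for shorthand, expansion in PAIRS:
    print(shorthand, expansion, same_on_samples(shorthand, expansion))

# [^ab] matches any single character other than 'a' or 'b'.
print(bool(re.fullmatch(r"[^ab]", "c")), bool(re.fullmatch(r"[^ab]", "a")))
```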
Regular Expression
RE        Strings in L(R)
a         "a"
ab        "ab"
a|b       "a", "b"
(ab)*     "", "ab", "abab", ...
(a|ε)b    "ab", "b"
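The table can be reproduced by brute force: enumerate every string over a small alphabet up to some length and keep the ones the expression matches in full. The sketch below assumes Python's `re` module; the helper `language` and its `max_len` cutoff are illustrative choices, and ε is written as an empty alternative.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=4):
    """List the strings over `alphabet` of length <= max_len that match fully."""
    regex = re.compile(pattern)
    words = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if regex.fullmatch(w):
                words.append(w)
    return words


# (a|ε)b is written (a|)b because Python has no ε symbol.
for pattern in ["a", "ab", "a|b", "(ab)*", "(a|)b"]:
    print(f"{pattern:8} {language(pattern)}")
```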
Example: integers
• integer: a non-empty string of digits
• digit = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
• integer = digit digit*
Example: identifiers
• identifier: a string of letters or digits, starting with a letter
• C identifier: [a-zA-Z_][a-zA-Z0-9_]*
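Both examples translate almost directly into a regex library. A minimal sketch, assuming Python's `re` module; the function names are hypothetical, and the identifier check only tests the shape of the string (it does not exclude C keywords).

```python
import re

DIGIT = r"[0-9]"                          # digit = '0'|'1'|...|'9'
INTEGER = re.compile(f"{DIGIT}{DIGIT}*")  # integer = digit digit*
C_IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_integer(s):
    """True if s is a non-empty string of digits."""
    return INTEGER.fullmatch(s) is not None


def is_c_identifier(s):
    """True if s has the shape of a C identifier."""
    return C_IDENTIFIER.fullmatch(s) is not None


print(is_integer("2014"), is_integer(""), is_integer("12a"))
print(is_c_identifier("_tmp1"), is_c_identifier("1abc"))
```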
Regular Definitions
• Writing a regular expression for some languages can be difficult because the expression becomes quite complex. In those cases we can use regular definitions.
• We can give names to regular expressions and use these names as symbols to define other regular expressions.
• A regular definition is a sequence of definitions of the form:
    d1 → r1
    d2 → r2
    ...
    dn → rn
  where each di is a distinct name and each ri is a regular expression over the symbols in Σ ∪ {d1, d2, ..., di-1}.
Specification of Patterns for Tokens: Regular Definitions
• Example:
    letter → A | B | … | Z | a | b | … | z
    digit → 0 | 1 | … | 9
    id → letter ( letter | digit )*
    digits → digit digit*
Regular Definitions (cont.)
• Ex: Identifiers in Pascal
    letter → A | B | ... | Z | a | b | ... | z
    digit → 0 | 1 | ... | 9
    id → letter ( letter | digit )*
  – If we try to write the regular expression for identifiers without using regular definitions, it becomes complex:
    (A|...|Z|a|...|z) ( (A|...|Z|a|...|z) | (0|...|9) )*
• Ex: Unsigned numbers in Pascal
    digit → 0 | 1 | ... | 9
    digits → digit+
    opt-fraction → ( . digits )?
    opt-exponent → ( E (+|-)? digits )?
    unsigned-num → digits opt-fraction opt-exponent
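A regular definition can be mimicked in code by building each pattern from previously defined ones. The sketch below assumes Python's `re` module and uses f-strings for the substitution; the variable names mirror the definitions above (with `-` replaced by `_`, and `ident` standing in for `id`), and the `(?:...)` groups are only there to keep the substituted pieces properly parenthesized.

```python
import re

# Each name is defined only in terms of earlier names, as in a regular definition.
digit = r"[0-9]"
digits = rf"(?:{digit})+"
opt_fraction = rf"(?:\.{digits})?"
opt_exponent = rf"(?:E[+-]?{digits})?"
unsigned_num = rf"{digits}{opt_fraction}{opt_exponent}"

letter = r"[A-Za-z]"
ident = rf"{letter}(?:{letter}|{digit})*"

UNSIGNED_NUM = re.compile(unsigned_num)
IDENT = re.compile(ident)

for text in ["42", "3.14", "6.02E+23", "1E5", ".5", "1.", "2E"]:
    print(f"{text:10} {UNSIGNED_NUM.fullmatch(text) is not None}")
```

Printing `unsigned_num` shows the fully expanded expression, which illustrates why the named form is easier to read.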
Specification of Patterns for Tokens: Notational Shorthand
• The following shorthands are often used:
  – + : one or more instances
  – ? : zero or one instance
    r+ = rr*
    r? = r | ε
    [a-z] = a | b | c | … | z
• Examples:
    digit → [0-9]
    num → digit+ ( . digit+ )? ( E (+|-)? digit+ )?
Definition
• For primitive regular expressions:
    L(∅) = ∅
    L(ε) = {ε}
    L(a) = {a}
Definition (continued)
• For regular expressions r1 and r2 (here + denotes union, written | on the earlier slides):
    L(r1 + r2) = L(r1) ∪ L(r2)
    L(r1 · r2) = L(r1) L(r2)
    L(r1*) = (L(r1))*
    L((r1)) = L(r1)
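The inductive definition of L(r) reads naturally as a recursive function. The sketch below is an illustration under stated assumptions: regular expressions are represented as nested tuples such as `("union", r1, r2)` (a representation invented here, not taken from the slides), and because L(r) can be infinite the function only returns the strings of length at most `max_len`.

```python
def lang(r, max_len):
    """Strings of length <= max_len in L(r), following the inductive definition."""
    kind = r[0]
    if kind == "empty":                      # L(∅) = ∅
        return set()
    if kind == "eps":                        # L(ε) = {ε}
        return {""}
    if kind == "sym":                        # L(a) = {a}
        return {r[1]} if max_len >= 1 else set()
    if kind == "union":                      # L(r1 + r2) = L(r1) ∪ L(r2)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "concat":                     # L(r1 · r2) = L(r1) L(r2)
        left, right = lang(r[1], max_len), lang(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                       # L(r1*) = (L(r1))*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            # Extend the newest strings by one more factor from L(r1).
            frontier = {
                x + y for x in frontier for y in base
                if len(x + y) <= max_len
            } - result
            result |= frontier
        return result
    raise ValueError(f"unknown node: {kind}")


# (ab)* as a tuple tree
AB_STAR = ("star", ("concat", ("sym", "a"), ("sym", "b")))
print(sorted(lang(AB_STAR, 4)))
```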
Concatenation of Languages
• If L1 and L2 are languages, we can define the concatenation
    L1L2 = {w | w = xy, x ∈ L1, y ∈ L2}
• Examples:
  – {ab, ba}{cd, dc} = {abcd, abdc, bacd, badc}
  – Ø{ab} = Ø
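For finite languages the definition is a one-line set comprehension. A small sketch using Python sets of strings; the function name `concat` is an illustrative choice.

```python
def concat(l1, l2):
    """L1L2 = {xy | x in L1, y in L2} for finite languages given as sets."""
    return {x + y for x in l1 for y in l2}


print(sorted(concat({"ab", "ba"}, {"cd", "dc"})))  # ['abcd', 'abdc', 'bacd', 'badc']
print(concat(set(), {"ab"}))                       # set()
```

The second call shows why Ø{ab} = Ø: there is no x to pick from the empty language, so no string xy can be formed.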
Kleene Closure
• L* = ⋃_{i=0}^{∞} L^i = L^0 ∪ L^1 ∪ L^2 ∪ …
• Examples:
  – {ab, ba}* = {ε, ab, ba, abab, abba, …}
  – Ø* = {ε}
  – {ε}* = {ε}
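Since L* is usually infinite, code can only compute a finite part of the union. The sketch below computes L^0 ∪ L^1 ∪ … ∪ L^k for a finite language given as a Python set; `power` and `kleene_up_to` are hypothetical helper names.

```python
def power(lang, i):
    """L^i, using L^0 = {ε} and L^(i+1) = L^i L."""
    result = {""}
    for _ in range(i):
        result = {x + y for x in result for y in lang}
    return result


def kleene_up_to(lang, k):
    """L^0 ∪ L^1 ∪ ... ∪ L^k, a finite approximation of L*."""
    result = set()
    for i in range(k + 1):
        result |= power(lang, i)
    return result


print(sorted(kleene_up_to({"ab", "ba"}, 2)))
print(kleene_up_to(set(), 5))   # only ε, contributed by L^0
print(kleene_up_to({""}, 5))    # only ε
```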
Example
• Regular expression r = (0+1)* 00 (0+1)*
    L(r) = { all strings with at least two consecutive 0s }
Example
• Regular expression r = (1+01)* (0+ε)
    L(r) = { all strings without two consecutive 0s }
Equivalent Regular Expressions
• Definition:
• Regular expressions r1 and r2 are equivalent if L(r1) = L(r2)
Example
• L = { all strings without two consecutive 0s }
    r1 = (1+01)* (0+ε)
    r2 = (1*011*)* (0+ε) + 1* (0+ε)
    L(r1) = L(r2) = L, so r1 and r2 are equivalent regular expressions.
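A quick way to gain confidence in such an equivalence is to compare the two expressions on every string up to some length. This is only a bounded check, not a proof. The sketch assumes Python's `re` module, with the slide's `+` rewritten as `|` and ε as an empty alternative; `words` and `first_disagreement` are illustrative names.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def first_disagreement(p, q, alphabet="01", max_len=8):
    """Return a shortest string on which p and q disagree, or None."""
    rp, rq = re.compile(p), re.compile(q)
    for w in words(alphabet, max_len):
        if bool(rp.fullmatch(w)) != bool(rq.fullmatch(w)):
            return w
    return None


R1 = r"(1|01)*(0|)"            # (1+01)*(0+ε)
R2 = r"(1*011*)*(0|)|1*(0|)"   # (1*011*)*(0+ε) + 1*(0+ε)

print(first_disagreement(R1, R2))
# Compare r1 with the plain-English description of L.
print(all(bool(re.fullmatch(R1, w)) == ("00" not in w) for w in words("01", 8)))
```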
Assignment
• Σ = {0, 1}
• What is the language for
  – 0*1*
• What is the regular expression for
  – {w | w has at least one 1}
  – {w | w starts and ends with same symbol}
  – {w | |w| 5}
  – {w | every 3rd position of w is 1}
  – L+ = L^1 ∪ L^2 ∪ …
  – L? (means an optional L)
Regular Expressions and Regular Languages
Theorem
• Languages Generated by Regular Expressions = Regular Languages
Standard Representations of Regular Languages
• FAs
• NFAs
• Regular Expressions
Elementary Questions about Regular Languages
Membership Question
Question: Given a regular language L and a string w, how can we check if w ∈ L?
Answer: Take the DFA that accepts L and check if w is accepted.
[Figure: two DFAs, one that accepts w (w ∈ L) and one that rejects w (w ∉ L)]
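The membership test amounts to running the DFA on w. A minimal sketch, assuming a DFA is stored as a dictionary with `start`, `finals`, and a total transition table `delta`; this representation and the example automaton `NO_00` are made up for the illustration.

```python
def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = dfa["start"]
    for symbol in w:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["finals"]


# Hypothetical DFA for "no two consecutive 0s" over {0, 1}.
# q0: last symbol was not 0, q1: last symbol was 0, dead: "00" has been seen.
NO_00 = {
    "start": "q0",
    "finals": {"q0", "q1"},
    "delta": {
        ("q0", "0"): "q1", ("q0", "1"): "q0",
        ("q1", "0"): "dead", ("q1", "1"): "q0",
        ("dead", "0"): "dead", ("dead", "1"): "dead",
    },
}

print(accepts(NO_00, "0101"), accepts(NO_00, "1001"))
```

The check takes one table lookup per input symbol, so it runs in time linear in |w|.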
Question: Given a regular language L, how can we check if L is empty (L = ∅)?
Answer: Take the DFA that accepts L. Check if there is any path from the initial state to a final state. L is empty exactly when no such path exists.
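[Figure: two DFAs, one with a path from the initial state to a final state (L ≠ ∅) and one without such a path (L = ∅)]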
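The emptiness test is a graph reachability problem. The sketch below uses breadth-first search over a DFA stored as a dictionary with `start`, `finals`, and `delta`; the representation and the two example automata are assumptions made for the illustration.

```python
from collections import deque


def reachable(dfa):
    """States reachable from the start state (breadth-first search)."""
    seen = {dfa["start"]}
    queue = deque([dfa["start"]])
    while queue:
        state = queue.popleft()
        for (src, _symbol), dst in dfa["delta"].items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def is_empty(dfa):
    """L is empty exactly when no final state is reachable."""
    return not (reachable(dfa) & dfa["finals"])


# Hypothetical DFAs over {a}: in NONEMPTY the final state q is reachable,
# in EMPTY it is cut off from the start state p.
NONEMPTY = {"start": "p", "finals": {"q"},
            "delta": {("p", "a"): "q", ("q", "a"): "q"}}
EMPTY = {"start": "p", "finals": {"q"},
         "delta": {("p", "a"): "p", ("q", "a"): "q"}}

print(is_empty(NONEMPTY), is_empty(EMPTY))
```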
Question: Given a regular language L, how can we check if L is finite?
Answer: Take the DFA that accepts L. Check if there is a walk with a cycle from the initial state to a final state. L is infinite exactly when such a walk exists.
[Figure: two DFAs, one with a cycle on a walk from the initial state to a final state (L is infinite) and one without (L is finite)]
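One way to implement the finiteness test is to keep only the "useful" states (reachable from the initial state and able to reach a final state) and look for a cycle among them. The sketch below is one possible realization, not necessarily the procedure the slides had in mind; the dictionary-based DFA representation and the example automata are assumptions.

```python
def is_finite(dfa):
    """True if the DFA's language is finite (no cycle through a useful state)."""
    edges = {(src, dst) for (src, _sym), dst in dfa["delta"].items()}

    def closure(seeds, step):
        seen, stack = set(seeds), list(seeds)
        while stack:
            s = stack.pop()
            for t in step(s):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    forward = closure({dfa["start"]}, lambda s: [d for (a, d) in edges if a == s])
    backward = closure(dfa["finals"], lambda s: [a for (a, d) in edges if d == s])
    remaining = forward & backward  # useful states

    # Repeatedly drop states with no edge into the remaining set;
    # any state that survives lies on, or leads into, a cycle.
    changed = True
    while changed:
        changed = False
        for s in list(remaining):
            if not any(a == s and d in remaining for (a, d) in edges):
                remaining.discard(s)
                changed = True
    return not remaining


# Hypothetical DFA accepting only the string "ab".
ONLY_AB = {
    "start": "s0", "finals": {"s2"},
    "delta": {
        ("s0", "a"): "s1", ("s0", "b"): "dead",
        ("s1", "a"): "dead", ("s1", "b"): "s2",
        ("s2", "a"): "dead", ("s2", "b"): "dead",
        ("dead", "a"): "dead", ("dead", "b"): "dead",
    },
}
# Hypothetical DFA accepting a*, which is infinite because of the self-loop.
A_STAR = {"start": "p", "finals": {"p"}, "delta": {("p", "a"): "p"}}

print(is_finite(ONLY_AB), is_finite(A_STAR))
```

The cycle on the `dead` state in `ONLY_AB` does not count, because no final state can be reached from it.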
From RE to ε-NFA
• For every regular expression R, we can construct an ε-NFA A such that L(A) = L(R).
• Proof by structural induction:
  Base cases: Ø, ε, a
  [Figure: the ε-NFA for each base case]
From RE to ε-NFA (cont.)
  Inductive cases: R+S, RS, R*
  [Figure: the ε-NFA for each inductive case, built from the ε-NFAs for R and S]
Example: (0+1)*1(0+1)
[Figure: ε-NFA for (0+1)*1(0+1), with transitions labeled 0 and 1]
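The structural induction can be turned into code with one case per kind of expression. The sketch below follows the standard Thompson-style construction, which may differ in detail from the diagrams on the slides; the tuple representation of expressions, the `Builder` class, and the helper names are all assumptions made for the illustration.

```python
from itertools import count

EPS = None  # label used for ε-transitions


class Builder:
    """Builds ε-NFA fragments, each with one start and one accept state."""

    def __init__(self):
        self.ids = count()
        self.edges = []  # (source, label, target)

    def build(self, r):
        kind = r[0]
        s, f = next(self.ids), next(self.ids)
        if kind == "empty":      # Ø: no transition, f is unreachable
            pass
        elif kind == "eps":      # ε: a single ε-transition
            self.edges.append((s, EPS, f))
        elif kind == "sym":      # a: a single transition labeled a
            self.edges.append((s, r[1], f))
        elif kind == "union":    # R+S: branch into R or into S
            for sub in r[1:]:
                s1, f1 = self.build(sub)
                self.edges += [(s, EPS, s1), (f1, EPS, f)]
        elif kind == "concat":   # RS: R's accept state feeds S's start state
            s1, f1 = self.build(r[1])
            s2, f2 = self.build(r[2])
            self.edges += [(s, EPS, s1), (f1, EPS, s2), (f2, EPS, f)]
        elif kind == "star":     # R*: skip R, or loop through it
            s1, f1 = self.build(r[1])
            self.edges += [(s, EPS, s1), (f1, EPS, s1), (f1, EPS, f), (s, EPS, f)]
        else:
            raise ValueError(f"unknown node: {kind}")
        return s, f


def eps_closure(states, edges):
    """All states reachable from `states` using only ε-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(r, w):
    """Build the ε-NFA for r and simulate it on w."""
    b = Builder()
    start, final = b.build(r)
    current = eps_closure({start}, b.edges)
    for symbol in w:
        moved = {d for (s, l, d) in b.edges if s in current and l == symbol}
        current = eps_closure(moved, b.edges)
    return final in current


def sym(a):
    return ("sym", a)


# (0+1)*1(0+1): strings whose second-to-last symbol is 1
BIT = ("union", sym("0"), sym("1"))
R = ("concat", ("concat", ("star", BIT), sym("1")), BIT)
print(accepts(R, "0110"), accepts(R, "01"))
```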
Example: (a+b)*aba