python programming - xi. string manipulation and regular expressions
![Page 1: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/1.jpg)
PYTHON PROGRAMMING Text Processing
XI. String Manipulation and Regular Expressions
Engr. Ranel O. Padon
![Page 2: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/2.jpg)
PYTHON PROGRAMMING TOPICS
I. Introduction to Python Programming
II. Python Basics
III. Controlling the Program Flow
IV. Program Components: Functions, Classes, Packages, and Modules
V. Sequences (Lists and Tuples), and Dictionaries
VI. Object-Based Programming: Classes and Objects
VII. Customizing Classes and Operator Overloading
VIII. Object-Oriented Programming: Inheritance and Polymorphism
IX. Randomization Algorithms
X. Exception Handling and Assertions
XI. String Manipulation and Regular Expressions
XII. File Handling and Processing
XIII. GUI Programming Using Tkinter
![Page 3: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/3.jpg)
Text Processing
String Manipulation
Regular Expressions
![Page 4: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/4.jpg)
TEXT PROCESSING
* used to develop text editors, word processors,
page-layout software, computerized typesetting systems,
and other text-processing software
* used to search for patterns in text
* used to validate user-inputs
* used to process the contents of text files
![Page 5: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/5.jpg)
STRING MANIPULATION
Strings are made up of Characters.
Characters are made up of:
Digits (0, 1, 2, …, 9)
Letters (a, b, c, …, z)
Symbols (@, *, #, $, %, &, …)
![Page 6: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/6.jpg)
String Methods
![Page 7: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/7.jpg)
String Methods
![Page 8: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/8.jpg)
String Methods
![Page 9: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/9.jpg)
String Methods
![Page 10: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/10.jpg)
String Methods
![Page 11: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/11.jpg)
STRING MANIPULATION | Samples
![Page 12: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/12.jpg)
STRING MANIPULATION | Samples
![Page 13: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/13.jpg)
STRING MANIPULATION | Samples
![Page 14: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/14.jpg)
STRING MANIPULATION | Samples
![Page 15: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/15.jpg)
STRING MANIPULATION | Samples
![Page 16: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/16.jpg)
STRING MANIPULATION | Samples
![Page 17: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/17.jpg)
STRING MANIPULATION | Samples
![Page 18: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/18.jpg)
STRING MANIPULATION | Samples
![Page 19: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/19.jpg)
STRING MANIPULATION | Samples
![Page 20: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/20.jpg)
STRING MANIPULATION | Samples
![Page 21: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/21.jpg)
REGULAR EXPRESSIONS
To test whether a string contains a day of the week using string methods,
you have to check for “Monday”, “Tuesday”, and so on:
you would need to call the find() method seven times.
Regular expressions solve this more elegantly, with a single pattern.
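The contrast can be sketched as follows. This is a minimal illustration, not the slide author's code; the names `DAYS`, `has_day_with_find`, and `has_day_with_regex` are hypothetical.

```python
import re

# Hypothetical list of the seven day names to look for.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def has_day_with_find(text):
    """String-method approach: call find() once per day name."""
    for day in DAYS:
        if text.find(day) != -1:
            return True
    return False

# Regex approach: one pattern built with alternation, "Monday|Tuesday|...".
DAY_PATTERN = re.compile("|".join(DAYS))

def has_day_with_regex(text):
    """Single search against the combined pattern."""
    return DAY_PATTERN.search(text) is not None

print(has_day_with_find("See you on Friday"))   # True
print(has_day_with_regex("See you on Friday"))  # True
print(has_day_with_regex("See you soon"))       # False
```

Both functions give the same answers; the regex version simply states the seven alternatives in one pattern.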
![Page 22: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/22.jpg)
REGULAR EXPRESSIONS
* use string methods for simple text processing
* string methods are more readable and simpler
than regular expressions
![Page 23: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/23.jpg)
REGULAR EXPRESSION
text pattern that a program uses to find substrings that will
match the required pattern
expression that specifies a set of strings
a pattern matching mechanism
also known as Regex
introduced in the 1950s as part of formal language theory
![Page 24: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/24.jpg)
REGULAR EXPRESSIONS
very powerful! hundreds of lines of code could be reduced to
an elegant one-line regular expression.
used to construct compilers, interpreters, text editors, …
used to search & match text patterns
used to validate text data formats especially input data
![Page 25: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/25.jpg)
REGULAR EXPRESSIONS
Popular programming languages have RegEx capabilities:
Perl, JavaScript, PHP, Python, Ruby, Tcl,
Java, C, C++, C#, .NET, …
![Page 26: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/26.jpg)
REGEX
![Page 27: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/27.jpg)
REGEX | General Concepts
Alternative
Grouping
Quantification
Anchors
Meta-characters
Character Classes
![Page 28: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/28.jpg)
REGEX | General Concepts
Alternative: |
Grouping: ()
Quantification: ? + * {m,n}
Anchors: ^ $
Meta-characters: . [ ] [-] [^ ]
Character Classes: \w \d \s \W …
![Page 29: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/29.jpg)
REGEX | Alternative
“ranel|ranilio” == “ranel” or “ranilio”
“gray|grey” == “gray” or “grey”
![Page 30: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/30.jpg)
REGEX | Grouping
“ran(el|ilio)” == “ranel” or “ranilio”
“gr(a|e)y” == “gray” or “grey”
“ra(mil|n(ny|el))” == “ramil” or “ranny” or “ranel”
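Alternation and grouping can be tried out in Python with the standard `re` module. The sketch below uses `re.fullmatch` so that the whole string must match; the helper name `whole_match` is hypothetical.

```python
import re

def whole_match(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# Alternation: either side of | may match.
print(whole_match("gray|grey", "grey"))            # True

# Grouping: the alternation applies only inside the parentheses.
print(whole_match("gr(a|e)y", "gray"))             # True
print(whole_match("gr(a|e)y", "groy"))             # False

# Nested groups: "ramil", "ranny" or "ranel".
for name in ["ramil", "ranny", "ranel", "ran"]:
    print(name, whole_match("ra(mil|n(ny|el))", name))
```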
![Page 31: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/31.jpg)
REGEX | Quantification | ?
? == zero or one of the preceding element
“rani?el” == “raniel” or “ranel”
“colou?r” == “colour” or “color”
![Page 32: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/32.jpg)
REGEX | Quantification | *
* == zero or more of the preceding element
“goo*gle” == “gogle” or “google” or “gooooogle”
“(ha)*” == “” or “ha” or “haha” or “hahahahaha”
“12*3” == “13” or “1223” or “12223”
![Page 33: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/33.jpg)
REGEX | Quantification | +
+ == one or more of the preceding element
“goo+gle” == “google” or “gooogle” or “gooooogle”
“(ha)+” == “ha” or “haha” or “hahahahaha”
“12+3” == “123” or “1223” or “12223”
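The ?, * and + quantifiers can be compared side by side in Python. This is a small sketch using `re.fullmatch`; the helper name `full` is hypothetical.

```python
import re

def full(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# ? : zero or one of the preceding element
print(full("colou?r", "color"), full("colou?r", "colour"))   # True True
print(full("colou?r", "colouur"))                             # False

# * : zero or more of the preceding element
print(full("goo*gle", "gogle"), full("goo*gle", "gooooogle"))  # True True
print(full("(ha)*", ""))                                       # True
print(full("12*3", "13"))                                      # True

# + : one or more of the preceding element
print(full("goo+gle", "gogle"), full("goo+gle", "google"))     # False True
print(full("(ha)+", ""), full("(ha)+", "haha"))                # False True
print(full("12+3", "13"))                                      # False
```

The only difference between * and + is whether zero repetitions are accepted, which is why "gogle", "" and "13" pass with * but fail with +.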
![Page 34: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/34.jpg)
REGEX | Quantification | {m,n}
{m,n} == m to n times of the preceding element
“go{2,3}gle” == “google” or “gooogle”
“6{3,6}” == “666” or “6666” or “66666” or “666666”
“5{3}” == “555”
“a{2,}” == “aa” or “aaa” or “aaaa” or “aaaaa” …
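A short Python sketch of counted repetition follows. Note that Python's `re` module treats the braces as a quantifier only when they contain no spaces; with a space inside, the braces are matched as literal text. The helper name `full` is hypothetical.

```python
import re

def full(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# {2,3}: two or three "o" characters
for word in ["gogle", "google", "gooogle", "goooogle"]:
    print(word, full("go{2,3}gle", word))

# {3}: exactly three; {2,}: two or more
print(full("5{3}", "555"), full("5{3}", "55"))      # True False
print(full("a{2,}", "aaaaa"), full("a{2,}", "a"))   # True False

# With a space inside the braces, Python matches the braces literally.
print(full("go{2, 3}gle", "google"))                # False
print(full("go{2, 3}gle", "go{2, 3}gle"))           # True
```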
![Page 35: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/35.jpg)
REGEX | Anchors | ^
^ == matches the starting position within the string
“^laman” == “lamang” or “lamang-loob” or “lamang-lupa”
“^2013” == “2013”, “2013-12345”, “2013/1320”
![Page 36: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/36.jpg)
REGEX | Anchors | $
$ == matches the ending position within the string
“laman$” == “halaman” or “kaalaman”
“2013$” == “2013”, “777-2013”, “0933-445-2013”
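Anchors are easiest to see with `re.search`, which otherwise scans for a match anywhere in the string. A minimal sketch, with the hypothetical helper name `found`:

```python
import re

def found(pattern, text):
    """Return True when pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

# ^ pins the match to the start of the string.
print(found("^laman", "lamang-loob"))    # True
print(found("^laman", "halaman"))        # False
print(found("^2013", "2013-12345"))      # True
print(found("^2013", "777-2013"))        # False

# $ pins the match to the end of the string.
print(found("laman$", "kaalaman"))       # True
print(found("laman$", "lamang"))         # False
print(found("2013$", "0933-445-2013"))   # True
```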
![Page 37: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/37.jpg)
REGEX | Meta-characters | .
. == matches any single character (except a newline, by default)
“ala.” == “alat” or “alas” or “ala2” (but not “ala”: one more character is required)
“1.3” == “123” or “143” or “1s3”
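The behaviour of the dot can be checked in Python as sketched below. The helper name `dot_match` is hypothetical; `re.DOTALL` is the standard flag that lets the dot match a newline as well.

```python
import re

def dot_match(pattern, text, flags=0):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text, flags) is not None

# The dot stands for exactly one character.
print(dot_match("ala.", "alat"))              # True
print(dot_match("ala.", "ala"))               # False: nothing left for the dot
print(dot_match("1.3", "1s3"))                # True

# By default the dot does not match a newline; re.DOTALL changes that.
print(dot_match("1.3", "1\n3"))               # False
print(dot_match("1.3", "1\n3", re.DOTALL))    # True

# Escape the dot to match a literal period.
print(dot_match(r"1\.3", "1.3"), dot_match(r"1\.3", "123"))  # True False
```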
![Page 38: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/38.jpg)
REGEX | Meta-characters | [ ]
[ ] == matches a single character that is
contained within the brackets.
“[abc]” == “a” or “b” or “c”
“[aoieu]” == any vowel
“[0123456789]” == any digit
![Page 39: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/39.jpg)
REGEX | Meta-characters | [ - ]
[ - ] == matches a single character that is
contained within the brackets
and the specified range.
“[a-c]” == “a” or “b” or “c”
“[a-z]” == all alphabet letters (lowercase only)
“[a-zA-Z]” == all letters (lowercase & uppercase)
“[0-9]” == all digits
![Page 40: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/40.jpg)
REGEX | Meta-characters | [^ ]
[^ ] == matches a single character that is not contained
within the brackets.
“[^aeiou]” == any non-vowel
“[^0-9]” == any non-digit
“[^abc]” == any character, but not “a”, “b”, or “c”
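The three bracket forms (a set, a range, and a negated set) can be sketched in Python as follows. The sample strings and the function name `letters_only` are made up for illustration.

```python
import re

# [ ] : any one of the listed characters
print(re.findall("[aeiou]", "regular"))      # ['e', 'u', 'a']

# [ - ] : any one character in the range
print(re.findall("[0-9]", "room 42b"))       # ['4', '2']
print(re.fullmatch("[a-c]", "b") is not None)  # True

# [^ ] : any one character NOT in the set
print(re.findall("[^0-9]", "a1b2"))          # ['a', 'b']

def letters_only(text):
    """Drop every character that is not an ASCII letter."""
    return re.sub("[^a-zA-Z]", "", text)

print(letters_only("R2-D2 unit"))            # RDunit
```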
![Page 41: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/41.jpg)
REGEX | Character Classes
Character classes specify a group of characters
to match in a string
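A brief Python sketch of the classes named in this deck (\d, \w, \s, \W) is given below. In Python's `re` module, \d matches a digit, \w a word character (letter, digit, or underscore), \s a whitespace character, and \W any character that is not a word character. The sample strings are made up for illustration.

```python
import re

# \d : digits
digits = re.findall(r"\d", "Room 42")
print(digits)      # ['4', '2']

# \w : word characters (letters, digits, underscore)
words = re.findall(r"\w+", "ranel_padon 2013!")
print(words)       # ['ranel_padon', '2013']

# \s : whitespace (spaces, tabs, newlines)
parts = re.split(r"\s+", "a  b\tc")
print(parts)       # ['a', 'b', 'c']

# \W : anything that is not a word character
non_words = re.findall(r"\W", "a-b c")
print(non_words)   # ['-', ' ']
```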
![Page 42: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/42.jpg)
REGEX | Summary
Alternative: |
Grouping: ()
Quantification: ? + * {m,n}
Anchors: ^ $
Meta-characters: . [ ] [-] [^ ]
Character Classes: \w \d \s \W …
![Page 43: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/43.jpg)
REGEX | Combo
![Page 44: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/44.jpg)
REGEX | Date Validation
“1/3/2013” or “24/2/2020”
(\d{1,2}\/\d{1,2}\/\d{4})
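The date pattern above can be applied in Python as sketched here. The function name `is_date_like` is hypothetical. The pattern checks only the shape of the text (digit counts and slashes), not whether the day or month is a real calendar value.

```python
import re

# The slide's pattern, written as a raw string.
DATE_PATTERN = re.compile(r"(\d{1,2}\/\d{1,2}\/\d{4})")

def is_date_like(text):
    """Return True when the whole text has the form d/m/yyyy or dd/mm/yyyy."""
    return DATE_PATTERN.fullmatch(text) is not None

print(is_date_like("1/3/2013"))     # True
print(is_date_like("24/2/2020"))    # True
print(is_date_like("2020/2/24"))    # False: four digits come first
print(is_date_like("99/99/2013"))   # True: format only, not a calendar check
```

For a real calendar check, the matched text would still need to be parsed, for example with the standard `datetime` module.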
![Page 45: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/45.jpg)
REGEX | Alphanumeric, -, & _
“rr2000” or “ranel_padon” or “Oblan-Padon”
([a-zA-Z0-9-_]+)
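A possible Python check for such identifiers is sketched below. The function name `is_identifier_like` is hypothetical, and the hyphen has been moved to the end of the set, where it is unambiguously a literal hyphen; this is an equivalent rewrite of the slide's pattern, not the original text.

```python
import re

# Letters, digits, underscore and hyphen; the hyphen sits last in the set.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")

def is_identifier_like(text):
    """Return True when text consists only of the allowed characters."""
    return NAME_PATTERN.fullmatch(text) is not None

for sample in ["rr2000", "ranel_padon", "Oblan-Padon", "ranel padon", ""]:
    print(repr(sample), is_identifier_like(sample))
```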
![Page 46: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/46.jpg)
REGEX | Numbers in 1 to 50
“1” or “50” or “14”
(^[1-9]{1}$|^[1-4]{1}[0-9]{1}$|^50$)
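The three alternatives in this pattern cover 1 to 9, 10 to 49, and 50. A small Python sketch, with the hypothetical function name `in_1_to_50`, checks it against a range of inputs:

```python
import re

# The slide's pattern: one digit 1-9, or 10-49, or exactly 50.
RANGE_PATTERN = re.compile(r"(^[1-9]{1}$|^[1-4]{1}[0-9]{1}$|^50$)")

def in_1_to_50(text):
    """Return True when text is the decimal form of an integer from 1 to 50."""
    return RANGE_PATTERN.search(text) is not None

print(in_1_to_50("1"), in_1_to_50("14"), in_1_to_50("50"))   # True True True
print(in_1_to_50("0"), in_1_to_50("51"), in_1_to_50("05"))   # False False False

# Compare the pattern with a plain numeric check for 0..99.
print(all(in_1_to_50(str(n)) == (1 <= n <= 50) for n in range(100)))  # True
```

For numeric ranges, converting with `int()` and comparing is usually simpler; the pattern is mainly useful when validation has to be expressed as a regex.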
![Page 47: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/47.jpg)
REGEX | HTML Tags
“<title>” or “<strong>” or “</body>”
(<(/?[^>]+)>)
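The tag pattern can be used in Python to list the tags in a snippet, as sketched below. The function name `tag_names` and the sample markup are hypothetical. This is a quick text scan, not an HTML parser, and it would be misled by, for example, a ">" inside an attribute value.

```python
import re

# Group 1 is the whole tag with brackets; group 2 is the text inside them.
TAG_PATTERN = re.compile(r"(<(/?[^>]+)>)")

def tag_names(html):
    """Return the content of each <...> tag found in html, in order."""
    return [match.group(2) for match in TAG_PATTERN.finditer(html)]

sample = "<title>Hi</title><body><strong>x</strong></body>"
print(tag_names(sample))
# ['title', '/title', 'body', 'strong', '/strong', '/body']
```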
![Page 48: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/48.jpg)
PYTHON REGEX | Raw String
![Page 49: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/49.jpg)
PYTHON REGEX | Raw String r
Two Solutions:
![Page 50: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/50.jpg)
PYTHON REGEX | Raw String r
Raw Strings are used for enhancing readability.
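The readability gain can be sketched as follows. In an ordinary Python string literal a backslash starts an escape sequence, so a regex backslash has to be doubled; a raw string (prefix r) keeps backslashes as typed. The sample path and variable names are made up for illustration and are not taken from the slides.

```python
import re

# A raw string keeps the backslash: two characters, not a newline.
print(len("\n"), len(r"\n"))        # 1 2

# The same regex written both ways: "one or more digits".
escaped_digits = "\\d+"
raw_digits = r"\d+"
print(escaped_digits == raw_digits)  # True

# Matching a literal backslash needs \\ in the regex itself,
# which is four backslashes in an ordinary string literal.
text = r"C:\temp\notes.txt"
escaped = "\\\\temp"
raw = r"\\temp"
print(escaped == raw)                # True
found = re.search(raw, text).group()
print(found)                         # \temp
```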
![Page 51: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/51.jpg)
PYTHON REGEX | Raw String
![Page 52: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/52.jpg)
PYTHON REGEX | The re Module
![Page 53: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/53.jpg)
PYTHON REGEX | Samples
![Page 54: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/54.jpg)
PYTHON REGEX | Samples
![Page 55: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/55.jpg)
PYTHON REGEX | Samples
![Page 56: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/56.jpg)
PYTHON REGEX | Samples
![Page 57: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/57.jpg)
PYTHON REGEX | Samples
![Page 58: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/58.jpg)
PYTHON REGEX | Samples
![Page 59: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/59.jpg)
PYTHON REGEX | Samples
![Page 60: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/60.jpg)
PYTHON REGEX | Samples
![Page 61: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/61.jpg)
PYTHON REGEX | Samples
![Page 62: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/62.jpg)
![Page 63: Python Programming - XI. String Manipulation and Regular Expressions](https://reader033.vdocuments.site/reader033/viewer/2022051210/54b70ef24a79590f628b4587/html5/thumbnails/63.jpg)
REFERENCES
Deitel, Deitel, Liperi, and Wiedermann - Python: How to Program (2001).
Disclaimer: Most of the images/information used here have no proper source
citation, and I do not claim ownership of these either. I don’t want to reinvent the
wheel, and I just want to reuse and reintegrate materials that I think are useful or
cool, then present them in another light, form, or perspective. Moreover, the
images/information here are mainly used for illustration/educational purposes only,
in the spirit of openness of data, spreading light, and empowering people with
knowledge.