Introduction to Perl Part II

By: Bridget Thomson McInnes
22 January 2004

Page 1: Introduction to Perl Part II

Introduction to Perl

Part II

By: Bridget Thomson McInnes

22 January 2004

Page 2: Introduction to Perl Part II

File Handles

Very simple compared to C/C++!
Are not prefixed with a symbol ($, @, %, etc.)

Opening a file:
    open(SRC, "my_file.txt");

Reading from a file:
    $line = <SRC>;    # reads up to and including the next newline character

Closing a file:
    close(SRC);

Page 3: Introduction to Perl Part II

File Handles cont...

Opening a file for output:
    open(DST, ">my_file.txt");

Opening a file for appending:
    open(DST, ">>my_file.txt");

Writing to a file:
    print DST "Printing my first line.\n";

Safeguarding against opening a nonexistent file:
    open(SRC, "file.txt") || die "Could not open file.\n";
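The open, print and close calls above can be put together in one runnable script. This is a minimal sketch, assuming the three-argument form of open with lexical file handles ($dst, $app, $src) in place of the bareword handles on the slides. The file name perl_part2_demo.txt is hypothetical, and $! is added to each die message to report why the open failed.

```perl
use strict;
use warnings;

my $path = "perl_part2_demo.txt";    # hypothetical file name

# '>' creates the file, or truncates it if it already exists.
open(my $dst, '>', $path) or die "Could not open $path for writing: $!\n";
print $dst "Printing my first line.\n";
close($dst);

# '>>' appends to the end of the file.
open(my $app, '>>', $path) or die "Could not open $path for appending: $!\n";
print $app "Printing my second line.\n";
close($app);

# '<' opens for reading; in list context <$src> returns every line.
open(my $src, '<', $path) or die "Could not open $path: $!\n";
my @lines = <$src>;
close($src);

print "Read ", scalar(@lines), " lines\n";
unlink $path;    # remove the demo file
```

Keeping the mode ('<', '>', '>>') separate from the file name avoids surprises when a file name itself begins with one of those characters.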

Page 4: Introduction to Perl Part II

File Test Operators

Check to see if a file exists:
    if (-e "file.txt") {
        # The file exists!
    }

Other file test operators:
    -r    readable
    -x    executable
    -d    is a directory
    -T    is a text file
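A short sketch of the file test operators in use. The file name file_test_demo.txt is hypothetical; the script creates and removes it so that the results do not depend on what is already in the directory.

```perl
use strict;
use warnings;

my $path = "file_test_demo.txt";    # hypothetical file name
unlink $path if -e $path;           # start from a known state

my $exists_before = (-e $path) ? 1 : 0;

open(my $fh, '>', $path) or die "Could not create $path: $!\n";
print $fh "plain text\n";
close($fh);

my $exists_after = (-e $path) ? 1 : 0;
my $readable     = (-r $path) ? 1 : 0;
my $file_is_dir  = (-d $path) ? 1 : 0;
my $cwd_is_dir   = (-d ".")   ? 1 : 0;

print "exists before: $exists_before, after: $exists_after\n";
print "readable: $readable, is a directory: $file_is_dir\n";
print "current directory is a directory: $cwd_is_dir\n";

unlink $path;    # clean up
```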

Page 5: Introduction to Perl Part II

Quick Program with File Handles

Program to copy a file to a destination file

    #!/usr/local/bin/perl -w
    open(SRC, "file.txt") || die "Could not open source file.\n";
    open(DST, ">newfile.txt") || die "Could not open destination file.\n";
    while ($line = <SRC>) {
        print DST $line;
    }
    close SRC;
    close DST;

Page 6: Introduction to Perl Part II

Some Default File Handles

STDIN : Standard input
    $line = <STDIN>;    # takes input from stdin

STDOUT : Standard output
    print STDOUT "File handling in Perl is sweet!\n";

STDERR : Standard error
    print STDERR "Error!!\n";

Page 7: Introduction to Perl Part II

The <> File Handle

The "empty" file handle reads from the file(s) named on the command line, or from STDIN if none are given:
    $line = <>;

If the program is run as ./prog.pl file.txt, this will automatically open file.txt and read the first line.

If the program is run as ./prog.pl file1.txt file2.txt, this will first read file1.txt and then file2.txt ... the lines arrive as one continuous stream, so by default you will not see where one file ends and the other begins.

Page 8: Introduction to Perl Part II

The <> File Handle cont...

If the program is run as ./prog.pl with no arguments, it will wait for you to enter text at the prompt, and will continue until you enter the EOF character
    - CTRL-D in UNIX
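When the file boundaries do matter, Perl keeps the name of the file currently being read in $ARGV, and eof without parentheses is true at the end of each file. The sketch below assumes two hypothetical files, diamond_a.txt and diamond_b.txt, which it creates itself, and sets @ARGV by hand to simulate the command line.

```perl
use strict;
use warnings;

# Two hypothetical input files, created here so the sketch runs on its own.
my %content = ("diamond_a.txt" => "a1\na2\n", "diamond_b.txt" => "b1\n");
for my $name (sort keys %content) {
    open(my $fh, '>', $name) or die "Could not create $name: $!\n";
    print $fh $content{$name};
    close($fh);
}

# Simulate running: ./prog.pl diamond_a.txt diamond_b.txt
@ARGV = sort keys %content;

my @seen;
while (my $line = <>) {
    chomp $line;
    push @seen, "$ARGV: $line";               # $ARGV is the file being read
    push @seen, "-- end of $ARGV" if eof;     # eof (no parentheses): end of this file
}

print "$_\n" for @seen;
unlink keys %content;    # clean up
```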

Page 9: Introduction to Perl Part II

Example Program with STDIN

Suppose you want to determine if you are one of the Three Stooges

    #!/usr/local/bin/perl
    %stooges = (larry => 1, moe => 1, curly => 1);
    print "Enter your name: ? ";
    $name = <STDIN>; chomp $name;
    if ($stooges{lc($name)}) {
        print "You are one of the Three Stooges!!\n";
    } else {
        print "Sorry, you are not a Stooge!!\n";
    }

Page 10: Introduction to Perl Part II

Chomp and Chop

Chomp : function that deletes a trailing newline from the end of a string.
    $line = "this is the first line of text\n";
    chomp $line;     # removes the newline character
    print $line;     # prints "this is the first line of text"
                     # without a trailing newline

Chop : function that chops off the last character of a string.
    $line = "this is the first line of text";
    chop $line;
    print $line;     # prints "this is the first line of tex"
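A small sketch of the difference, using the return values of the two functions: chomp returns the number of characters it removed, while chop returns the character it removed. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $name = "moe\n";
my $removed_first  = chomp($name);   # 1: one newline removed
my $removed_second = chomp($name);   # 0: no newline left, string unchanged

my $word = "text";
my $chopped_char = chop($word);      # chop removes the last character, whatever it is

print "chomp removed $removed_first, then $removed_second; name is '$name'\n";
print "chop removed '$chopped_char'; word is '$word'\n";
```

Because chomp only ever removes a trailing newline, it is safe to call on input that may or may not end in one; chop is not.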

Page 11: Introduction to Perl Part II

Regular Expressions

What are regular expressions? A few definitions:
    - A regular expression specifies a class of strings: the formal (regular) language defined by that expression
    - In other words, a formula for matching strings that follow a specified pattern.

Some things you can do with regular expressions:
    - Parse the text
    - Add and/or replace subsections of text
    - Remove pieces of the text

Page 12: Introduction to Perl Part II

Regular Expressions cont..

A regular expression characterizes a regular language

Examples in UNIX (shell wildcard patterns, a simpler relative of regular expressions):
    - ls *.c
        Lists all the files in the current directory whose names end in '.c'
    - ls *.txt
        Lists all the files in the current directory whose names end in '.txt'

Page 13: Introduction to Perl Part II

Simple Example for ... ? Clarity

In the simplest form, a regular expression is a string of characters that you are looking for

We want to find all the words that contain the string 'ing' in our text.

The regular expression we would use:
    /ing/

Page 14: Introduction to Perl Part II

Simple Example cont...

What would our program then look like:

    #!/usr/local/bin/perl
    while (<>) {
        chomp;
        @words = split / /;
        foreach $word (@words) {
            if ($word =~ m/ing/) { print "$word\n"; }
        }
    }

Page 15: Introduction to Perl Part II

Regular Expressions Types

Regular expressions are composed of two types of characters:
    - Literals
        Normal text characters
        Like what we saw in the previous program ( /ing/ )
    - Metacharacters
        Special characters
        Add a great deal of flexibility to your search

Page 16: Introduction to Perl Part II

Metacharacters

Match more than just characters

Match line position
    - ^    start of a line    ( caret )
    - $    end of a line      ( dollar sign )

Match any one of the characters in a list : [ ... ]

Example :
    - /[Bb]ridget/    matches Bridget or bridget
    - /Mc[Ii]nnes/    matches McInnes or Mcinnes

Page 17: Introduction to Perl Part II

Our Simple Example Revisited

Now suppose we only want to match words that end in 'ing' rather than just contain 'ing'.

How would we change our regular expression to accomplish this:
    - Previous Regular Expression:
        $word =~ m/ing/
    - New Regular Expression:
        $word =~ m/ing$/
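The effect of the $ anchor can be seen by running both expressions over the same words. A minimal sketch, assuming a made-up word list and using grep to keep the words that match:

```perl
use strict;
use warnings;

my @words = qw(singer running kingdom perl string);

# 'ing' anywhere in the word
my @contain_ing = grep { $_ =~ m/ing/ } @words;

# 'ing' only at the end of the word
my @end_in_ing = grep { $_ =~ m/ing$/ } @words;

print "contain 'ing': @contain_ing\n";
print "end in 'ing':  @end_in_ing\n";
```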

Page 18: Introduction to Perl Part II

Ranges of Regular Expressions

Ranges can be specified in regular expressions

Valid ranges
    - [A-Z]       Upper case Roman alphabet
    - [a-z]       Lower case Roman alphabet
    - [A-Za-z]    Upper or lower case Roman alphabet
    - [A-F]       Upper case A through F Roman characters
    - [A-z]       Valid but be careful: in ASCII it also covers the characters between 'Z' and 'a' ( [ \ ] ^ _ ` )

Invalid ranges
    - [a-Z]       Not valid (in ASCII 'a' comes after 'Z')
    - [F-A]       Not valid (the range runs backwards)

Page 19: Introduction to Perl Part II

Ranges cont ...

Ranges of digits can also be specified
    - [0-9]    Valid
    - [9-0]    Invalid

Negating ranges
    - /[^0-9]/
        Match any character except a digit
    - /[^a]/
        Match any character except an a
    - /^[^A-Z]/
        Match any line that starts with a character other than an upper case letter
        First ^  : start of line
        Second ^ : negation
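A few quick checks of ranges and negation. This is a sketch with made-up test strings; each result is stored as 1 (match) or 0 (no match).

```perl
use strict;
use warnings;

# [A-z] runs from ASCII 'A' (65) to 'z' (122), so it also accepts '_' (95).
my $underscore_wide  = ("_" =~ /[A-z]/)    ? 1 : 0;
my $underscore_exact = ("_" =~ /[A-Za-z]/) ? 1 : 0;

# [^0-9] needs at least one character that is not a digit.
my $has_non_digit = ("2004a" =~ /[^0-9]/) ? 1 : 0;
my $only_digits   = ("2004"  =~ /[^0-9]/) ? 0 : 1;

# ^[^A-Z]: the first character of the string is not an upper case letter.
my $starts_lower = ("perl" =~ /^[^A-Z]/) ? 1 : 0;
my $starts_upper = ("Perl" =~ /^[^A-Z]/) ? 1 : 0;

print "[A-z] accepts '_': $underscore_wide; [A-Za-z] accepts '_': $underscore_exact\n";
print "'2004a' has a non-digit: $has_non_digit; '2004' is all digits: $only_digits\n";
print "'perl' passes /^[^A-Z]/: $starts_lower; 'Perl' passes: $starts_upper\n";
```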

Page 20: Introduction to Perl Part II

Our Simple Example Again

Now suppose we want to create a list of all the words in our text that do not end in 'ing'

How would we change our regular expression to accomplish this:
    - Previous Regular Expression:
        $word =~ m/ing$/
    - New Regular Expression:
        $word !~ m/ing$/

Note that a negated class such as [^ing]$ does not do this: it only requires the last character to be something other than 'i', 'n' or 'g'.
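To select the words that do not end in 'ing', one option is the negated binding operator !~. The sketch below, with a made-up word list, compares it with a negated character class, which tests only the final character.

```perl
use strict;
use warnings;

my @words = qw(running ran sing song perl);

# !~ is true when the pattern does NOT match.
my @not_ing = grep { $_ !~ m/ing$/ } @words;

# [^ing]$ asks only that the last character is not i, n or g.
my @class_attempt = grep { $_ =~ m/[^ing]$/ } @words;

print "do not end in 'ing':  @not_ing\n";
print "last char not i/n/g:  @class_attempt\n";
```

Here 'ran' and 'song' do not end in 'ing', yet the character class rejects them because they end in 'n' and 'g'.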

Page 21: Introduction to Perl Part II

Literal Metacharacters

Suppose that you actually want to look for the character '^' in your text
    - Use the \ symbol
    - /\^/    Regular expression to search for

What does the following regular expression match?
    /[A-Z^]\^/
    - Matches any line that contains (A-Z or ^) followed by ^
    - Inside [ ], a ^ that is not the first character is a literal; outside [ ], it must be escaped with \

Page 22: Introduction to Perl Part II

Patterns provided in Perl

Some patterns
    - \d    [0-9]
    - \w    [a-zA-Z0-9_]
    - \s    [ \r\t\n\f]          (white space pattern)
    - \D    [^0-9]
    - \W    [^a-zA-Z0-9_]
    - \S    [^ \r\t\n\f]

Example : /19\d\d/
    - Looks for any year in the 1900's

Page 23: Introduction to Perl Part II

Using Patterns in our Example

Commonly words are not separated by just a single space but by tabs, returns, etc...

Let's modify our split function to incorporate multiple white space

    #!/usr/local/bin/perl
    while (<>) {
        chomp;
        @words = split /\s+/, $_;
        foreach $word (@words) {
            if ($word =~ m/ing/) { print "$word\n"; }
        }
    }
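The shorthand classes \d and \s can be tried on fixed strings. A minimal sketch with made-up data: the first part keeps the strings that contain a year in the 1900's, the second splits on runs of mixed white space.

```perl
use strict;
use warnings;

# \d\d after a literal 19: any two digits.
my @candidates = ("1987", "2004", "19x7", "in 1999!", "195");
my @nineteen_hundreds = grep { /19\d\d/ } @candidates;

# \s+ treats any run of spaces, tabs and newlines as one separator.
my $text  = "tabs\tand  spaces\nmixed";
my @words = split /\s+/, $text;

print "years found in: ", join(", ", @nineteen_hundreds), "\n";
print "words: ", join(", ", @words), "\n";
```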

Page 24: Introduction to Perl Part II

Word Boundary Metacharacter

Regular expression to match the start or the end of a 'word' : \b

Examples:
    - /Jeff\b/      Match Jeff but not Jefferson
    - /Carol\b/     Match Carol but not Caroline
    - /Rollin\b/    Match Rollin but not Rolling
    - /\bform/      Match form or formation but not Information
    - /\bform\b/    Match form but neither information nor formation
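The boundary examples can be checked directly. A short sketch using the words from the slide:

```perl
use strict;
use warnings;

my @names = qw(Jeff Jefferson);
my @jeff_only = grep { /Jeff\b/ } @names;       # boundary after 'Jeff'

my @forms = qw(form formation Information information);
my @start_form = grep { /\bform/ }   @forms;    # 'form' at the start of a word
my @whole_form = grep { /\bform\b/ } @forms;    # 'form' as a whole word

print "/Jeff\\b/    matches: @jeff_only\n";
print "/\\bform/    matches: @start_form\n";
print "/\\bform\\b/  matches: @whole_form\n";
```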

Page 25: Introduction to Perl Part II

DOT Metacharacter

The DOT metacharacter, '.', symbolizes any character except a newline

/b.bble/
    - Would possibly return : bobble, babble, bubble
/.oat/
    - Would possibly return : boat, coat, goat

Note: remember '.*' usually means a bunch of anything; this can be handy but also can have hidden ramifications.
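One such ramification is that '.*' is greedy: it takes as much text as it can while still letting the rest of the pattern match. The sketch below uses a made-up line of markup and capturing parentheses, which are covered later in the slides, to show what each pattern actually grabbed.

```perl
use strict;
use warnings;

my $line = "<b>bold</b> and <i>italic</i>";

# .* runs to the LAST '>' on the line, not the first one.
my ($greedy) = $line =~ /<(.*)>/;

# A negated class stops at the first '>'.
my ($narrow) = $line =~ /<([^>]*)>/;

print "greedy: $greedy\n";
print "narrow: $narrow\n";
```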

Page 26: Introduction to Perl Part II

PIPE Metacharacter

The PIPE metacharacter is used for alternation

/Bridget (Thomson|McInnes)/
    - Match Bridget Thomson or Bridget McInnes, but the match never spans all of Bridget Thomson McInnes
/B|bridget/
    - Match B or bridget
/^(B|b)ridget/
    - Match Bridget or bridget at the beginning of a line
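Without parentheses the pipe splits the whole pattern, which is why /B|bridget/ means 'B' or 'bridget'. A small sketch with made-up test strings shows the difference that grouping makes:

```perl
use strict;
use warnings;

# Ungrouped: the left alternative is just 'B', so 'Bob' matches.
my $loose_bob = ("Bob" =~ /B|bridget/) ? 1 : 0;

# Grouped: only the first letter alternates.
my $grouped_bob     = ("Bob" =~ /^(B|b)ridget/)           ? 1 : 0;
my $grouped_bridget = ("bridget smith" =~ /^(B|b)ridget/) ? 1 : 0;

# The parentheses also capture which alternative matched.
my ($surname) = "Bridget Thomson McInnes" =~ /Bridget (Thomson|McInnes)/;

print "/B|bridget/ matches 'Bob': $loose_bob\n";
print "/^(B|b)ridget/ matches 'Bob': $grouped_bob, 'bridget smith': $grouped_bridget\n";
print "surname matched: $surname\n";
```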

Page 27: Introduction to Perl Part II

Our Simple Example

Now with our example, suppose that we want to get not only all words that end in 'ing' but also those that end in 'ed'.

How would we change our regular expression to accomplish this:
    - Previous Regular Expression:
        $word =~ m/ing$/
    - New Regular Expression:
        $word =~ m/(ing|ed)$/

Page 28: Introduction to Perl Part II

The ? Metacharacter

The metacharacter, ?, indicates that the character immediately preceding it occurs zero or one time

Examples:
    - /worl?ds/
        Match either 'worlds' or 'words'
    - /m?ethane/
        Match either 'methane' or 'ethane'

Page 29: Introduction to Perl Part II

The * Metacharacter

The metacharacter, *, indicates that the character immediately preceding it occurs zero or more times

Example :
    - /ab*c/    Match 'ac', 'abc', 'abbc', 'abbbc' etc...
    - Matches an a, optionally followed by a sequence of b's, followed by a c.

Sometimes called Kleene's star

Page 30: Introduction to Perl Part II

Our Simple Example again

Now suppose we want to create a list of all the words in our text that end in 'ing' or 'ings'

How would we change our regular expression to accomplish this:
    - Previous Regular Expression:
        $word =~ m/ing$/
    - New Regular Expression:
        $word =~ m/ings?$/
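The ? and * metacharacters can be tried on small word lists. A sketch with made-up words; the ^ and $ anchors are added in the last two patterns so that the whole word has to fit the pattern.

```perl
use strict;
use warnings;

# s? : the final 's' may or may not be there.
my @words = qw(ring rings ringed wings sing ingss);
my @ing_or_ings = grep { /ings?$/ } @words;

# l? : zero or one 'l'.
my @optional = grep { /^worl?ds$/ } qw(words worlds worllds);

# b* : zero or more 'b'.
my @star = grep { /^ab*c$/ } qw(ac abc abbbc abd bc);

print "ings?\$     : @ing_or_ings\n";
print "^worl?ds\$  : @optional\n";
print "^ab*c\$     : @star\n";
```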

Page 31: Introduction to Perl Part II

Modifying Text

Match
    - Up to this point, we have seen attempts to match a given regular expression
    - Example : $variable =~ m/regex/

Substitution
    - Takes match one step further : if there is a match, then replace it with the given string
    - Example : $variable =~ s/regex/replacement/
        $var =~ s/Thomson/McInnes/;
        $var =~ s/Bridgette/Bridget/;
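A runnable sketch of substitution on fixed strings. The /g modifier in the last line is not covered on the slides and is included here as an assumption about what a reader may want next: by default s/// replaces only the first match, and /g replaces every match.

```perl
use strict;
use warnings;

my $name = "Bridgette Thomson";

# s/// returns the number of substitutions made (false if there were none).
my $changed = ($name =~ s/Bridgette/Bridget/);
$name =~ s/Thomson/McInnes/;

my $text = "sing ring";
(my $first_only = $text) =~ s/ing/ed/;     # copy $text, then change the first match
(my $every      = $text) =~ s/ing/ed/g;    # copy $text, then change all matches

print "name: $name (substitutions in first step: $changed)\n";
print "first only: $first_only\n";
print "every:      $every\n";
```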

Page 32: Introduction to Perl Part II

Substitution Example

Suppose when we find all our words that end in 'ing' we want to replace the 'ing' with 'ed'.

    #!/usr/local/bin/perl -w
    while (<>) {
        chomp $_;
        @words = split /\s+/, $_;
        foreach $word (@words) {
            if ($word =~ s/ing$/ed/) { print "$word\n"; }
        }
    }

Page 33: Introduction to Perl Part II

Special Variables Modified by a Match

$&
    - A copy of the text matched by the regex
$`
    - A copy of the target text in front of the match
$'
    - A copy of the target text after the match
$1, $2, $3, etc.
    - The text matched by the 1st, 2nd, etc., set of parentheses. Note : $0 is not included here (it holds the program name)
$+
    - A copy of the highest numbered $1, $2, $3, etc.

Page 34: Introduction to Perl Part II

Our Simple Example once again

Now let's revise our program to find all the words that end in 'ing' without splitting our line of text into an array of words

    #!/usr/local/bin/perl -w
    while (<>) {
        chomp $_;
        # while with /g steps through every match on the line, not just the first
        while ($_ =~ /([A-Za-z]*ing\b)/g) { print "$&\n"; }
    }
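A self-contained sketch of the same idea on a fixed line of text. It assumes the /g modifier, which the slides do not introduce: in list context a match with /g returns the text captured by every match on the line, so all of the 'ing' words are collected in one step.

```perl
use strict;
use warnings;

my $line = "I was singing and dancing in the spring";

# List context + /g: one captured word per match.
my @ing_words = $line =~ /([A-Za-z]*ing\b)/g;

print "$_\n" for @ing_words;
```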

Page 35: Introduction to Perl Part II

Example

    #!/usr/local/bin/perl
    $exp = <STDIN>; chomp $exp;
    if ($exp =~ /^([A-Za-z]+\s)*\bcrave\b(\s[A-Za-z]+)*/) {
        print "$1\n";
        print "$2\n";
    }

    - Run program with string : I crave to rule the world!
    - Results:
        I
         world
    - A group followed by * keeps only the text from its last repetition, so $2 holds ' world' rather than everything after 'crave'
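A repeated group such as (\s[A-Za-z]+)* remembers only its final repetition. The sketch below shows one way to capture everything before and after 'crave' instead. It assumes non-capturing groups, written (?: ... ), which the slides do not cover, and wraps each repeated group in an outer pair of capturing parentheses. The variable names are illustrative.

```perl
use strict;
use warnings;

my $exp = "I crave to rule the world!";

# Outer parentheses capture the whole run; (?: ... ) groups without capturing.
my ($before, $after);
if ($exp =~ /^((?:[A-Za-z]+\s)*)\bcrave\b((?:\s[A-Za-z]+)*)/) {
    ($before, $after) = ($1, $2);
}

# For comparison: capturing the repeated group itself keeps only the last repetition.
my ($last_before, $last_after) = $exp =~ /^([A-Za-z]+\s)*\bcrave\b(\s[A-Za-z]+)*/;

print "before: '$before'\n";
print "after:  '$after'\n";
print "last repetition only: '$last_before' and '$last_after'\n";
```

The final '!' is not part of either capture because [A-Za-z] does not match punctuation.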

Page 36: Introduction to Perl Part II

Example

    #!/usr/local/bin/perl
    $exp = <STDIN>; chomp $exp;
    if ($exp =~ /\bcrave\b/) {
        print "$`\n"; print "$&\n"; print "$'\n";
    }

    - Run program with string : I crave to rule the world!
    - Results:
        I
        crave
         to rule the world!

Page 37: Introduction to Perl Part II

Thank you