scheme in python

48
Scheme Scheme in in Python Python

Upload: kiefer

Post on 05-Jan-2016

33 views

Category:

Documents


0 download

DESCRIPTION

Scheme in Python. Overview. We’ll look at how to implement a simple scheme interpreter in Python This is based on the Scheme in Scheme interpreter we studied before We’ll look at pyscheme 1.6, which was implemented by Danny Yoo as an undergraduate at Berkeley - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Scheme in Python

Scheme in Scheme in PythonPython

Page 2: Scheme in Python

OverviewOverview

We’ll look at how to implement a simple scheme interpreter in Python

This is based on the Scheme in Scheme interpreter we studied before

We’ll look at pyscheme 1.6, which was implemented by Danny Yoo as an undergraduate at Berkeley

Since Python doesn’t optimize for tail recursion, he uses trampolining, which we’ll introduce

Page 3: Scheme in Python

What we need to doWhat we need to do

A representation for Scheme’s native data structures• Pairs (aka, cons cells), symbols, strings,

numbers,Booleans A reader that converts a stream of characters

into a stream of s-expressions• We’ll introduce an intervening step reading

characters and converting to tokens Implement various built-ins

• e.g., cons, car, +, …

Page 4: Scheme in Python

What we won’t need to doWhat we won’t need to do

We can rely on Python for a number of very useful things

• Representing numbers and strings• Garbage collection• Low level I/O

Page 5: Scheme in Python

atomsatoms

Atoms include strings, number, and symbols We’ll use Python’s native representation for

string and numbers Symbols in Scheme are interned – there is a

unique object for each symbol read This is how they differ from strings,

which are not interned• Note: some Lisp implemen-

tations intern small integers

Page 6: Scheme in Python

SymbolsSymbols# A global dictionary that contains all known symbols

__INTERNED_SYMBOLS = {}

class __Symbol(str):

"""A symbol is just a special kind of string"""

def __eq__(self, other): return self is other

def symbol(s):

""""Returns symbol given string, creating new ones if needed”””

global __interned_symbols

if s not in __INTERNED_SYMBOLS:

__INTERNED_SYMBOLS[s] = __Symbol(s)

return __INTERNED_SYMBOLS[s]

# Here are definitions of symbols that we should know

scheme_false = symbol("#f")

scheme_true = symbol("#t")

__empty_symbol = Symbol("")

def isSymbol(s): return type(s) == type(__empty_symbol)

Page 7: Scheme in Python

GCing Unused SymbolsGCing Unused Symbols

If the only reference to a symbol is from theglobal list of interned symbols, it can be garbage collected

We’ll use Python’s weakref’s for this A weak reference is a reference that doesn’t

protect an object from garbage collection Objects referenced only by weak

references are considered unreachable(or "weakly reachable") and may becollected at any time

Page 8: Scheme in Python

using weakrefsusing weakrefs

import weakref

from UserString import UserString as __UserStr

__INTERNED_SYMBOLS = weakref.WeakValueDictionary({})

class __Symbol(__UserStr):

if s not in __INTERNED_SYMBOLS:

# make a temp strong reference

newSymbol = __Symbol(s)

__INTERNED_SYMBOLS[s] = newSymbol

return __INTERNED_SYMBOLS[s]

Page 9: Scheme in Python

Representing pairsRepresenting pairs

The core of scheme only has one kind of data structure – lists– and it is made up out of pairs

What Python types should we use?• A user defined class, Pair• Lists• Tuples• Dictionary• Closures

Page 10: Scheme in Python

Aside: pairs as closuresAside: pairs as closures

Functions are very powerful We can use them to represent cons cells or

pairs We don’t want to do this in practice But it shows the power of programming with

functions

Page 11: Scheme in Python

(define (mycons theCar theCdr)

;; mycons returns a closure that takes a 2-arg function and applies

;; it to the two remembered vlue's, i.e., the pair's car and cdr.

(lambda (f) (f theCar theCdr)))

(define (mycar cell)

;; mycar takes a pair closure and feeds it a 2-arg function that

;; just returns the first arg

(cell (lambda (theCar theCdr) theCar)))

(define (mycdr cell)

;; mycdr takes a pair closure and feeds it a 2-arg function that

;; just returns the first arg

(cell (lambda (theCar theCdr) theCdr)))

(define myempty

;; the empty list is just a function that always returns true.

(lambda (f) #t))

(define (mynull? cell)

;; a pair is not the empty list

(eq? cell myempty))

Page 12: Scheme in Python

exampleexample

> (define p1 (mycons 1 (mycons 2 myempty)))

> p1

#<procedure:.../scheme/pairs.ss:1:31>

> (mycar p1)

1

> (mycdr p1)

#<procedure:.../scheme/pairs.ss:1:31>

> (mycar (mycdr p1))

2

> (mycdr (mycdr p1))

#<procedure:myempty>

Page 13: Scheme in Python

Representing pairsRepresenting pairs

We’ll define a subclass of list to represent a pair

Class Pair(list) : pass The cons functions creates a new cons cell with

a given car and cdr

def cons(car, cdr):return Pair([car, cdr])

Defining built-in functions for pairs will be easy

def car(p): return p[0]

def cdr(p): return p[1]

def cadr(p): return car(cdr(p))

def set_car(p,x): p[0] = x

Page 14: Scheme in Python

Lexical AnalyzerLexical Analyzer

Consume a string of characters, identify tokens, throw away comments and whitespace, and return a list of remaining tokens

Each token will be a (<type>, <token>) tuple like (‘number’, ‘3.145’) or (‘comment’, ‘;; foo’)

Recognize tokens using regular expressions We won’t worry about efficiency

Page 15: Scheme in Python

Token regular expressionsToken regular expressionsPATTERNS = [

('whitespace', re.compile(r'(\s+)')),

('comment', re.compile(r'(;[^\n]*)')),

('(', re.compile(r'(\()')),

(')', re.compile(r'(\))')),

('dot', re.compile(r'(\.\s)')),

('number', re.compile(r'([+\-]?(?:\d+\.\d+|\d+\.|\.\d+|\d+))')),

('symbol', re.compile(r'([a-zA-Z\+\=\?\!\@\#\$\%\^\&\*\-\/\.\>\<][\w\+\=\?\!\@\#\$\%\^\&\*\-\/\.\>\<]*)')),

('string', re.compile(r'"(([^\"]|\\")*)"')),

('\'', re.compile(r'(\')')),

('`', re.compile(r'(`)')),

(',', re.compile(r'(,)')) ]

Page 16: Scheme in Python

Lex ExamplesLex Examples

>>> from lex import *

>>> tokenize("")

[(None, None)]

>>> tokenize(" 1 2. .3 1.3 -4")

[('number', '1'), ('number', '2.'), ('number', '.3'), ('number', '1.3'), ('number', '-4'), (None, None)]

>>> tokenize('foo 12.3foo +')

[('symbol', 'foo'), ('number', '12.3'), ('symbol', 'foo'), ('symbol', '+'), (None, None)]

>>> tokenize('(foo (bar ()))')

[('(', '('), ('symbol', 'foo'), ('(', '('), ('symbol', 'bar'), ('(', '('), (')', ')'), (')', ')'), (')', ')'), (None, None)]

Page 17: Scheme in Python

Raw string notationRaw string notation

>>> s = ‘\nfoo\n’

>>> s

'\nfoo\n'

>>> print s

foo

>>> s = r'\nfoo\n'

>>> s

'\\nfoo\\n'

>>> print s

\nfoo\n

Page 18: Scheme in Python

tokenize()tokenize()def tokenize(s): toks = [] found = True while s and found: found = False for type, regex in PATTERNS: match_obj = regex.match(s) if match_obj: if type not in ('whitespace', 'comment'): toks.append((type, match_obj.group(1))) s = s[match_obj.span()[1] :] found = True break if not found: print "\nNo match'", s, ”’ – tokenize” toks.append(EOF_TOKEN) return tokens

Page 19: Scheme in Python

tokenize() examplestokenize() examples

>>> from lex import *

>>> tokenize('(a 1.0)')[('(', '('), ('symbol', 'a'), ('number', '1.0'), (')', ')'), (None, None)]

>>> tokenize('(define (add1 x)(+ x 1))')[('(', '('), ('symbol', 'define'), ('(', '('), ('symbol', 'add1'), ('symbol', 'x'), (')', ')'), ('(', '('), ('symbol', '+'), ('symbol', 'x'), ('number', '1'), (')', ')'), (')', ')'), (None, None)]

Page 20: Scheme in Python

parseparse

Consume a sequence of tokens and produce a sequence of s-expressions

Use a recursive descent parser We’ll handle just a few special cases, namely

quote and backquote and dotted pairs

Page 21: Scheme in Python

Peeking and eatingPeeking and eating

def peek(tokens):

"""Take a quick glance at the first token in our tokens list.""”

if len(tokens) == 0:

raise ParserError, "While peeking: ran out of tokens.”

return tokens[0]

Page 22: Scheme in Python

Peeking and eatingPeeking and eatingdef eat(tokens, desired_type):

"""If the type of the next token is desired_type, pop it from the list and return it, else return False”””

if len(tokens) == 0:

raise ParserError, 'No tokens left, seeking ' + desired_type

return tokens.pop(0)

if tokens[0][0] == desired_type

else False

Page 23: Scheme in Python

Peeking and eatingPeeking and eatingdef eat_safe(tokens, tokenType):

"""Digest the first token in our tokens list, making sure that we're

biting on the right tokenType of thing.""”

if len(tokens) == 0:

raise ParserError, "While trying to eat %s: ran out of tokens." % tokenType )

if tokens[0][0] != tokenType:

raise ParserError, "Seeking %s got %s" % (tokenType, tokens[0])

return tokens.pop(0)

Page 24: Scheme in Python

parseparsedef parseExpression(tokens):

if eat(tokens, '\''):

return cons(symbol('quote'), cons(parseExpression(tokens), NIL))

if eat(tokens, '`'):

return cons(symbol('quasiquote'), cons(parseExpression(tokens), NIL))

elif eat(tokens, ','):

return cons(symbol('unquote'), cons(parseExpression(tokens), NIL))

elif eat(tokens, '('):

return parse_list_members(tokens)

elif peek(tokens)[0] in ('number’,'symbol’,'string'):

return parse_atom(tokens)

else:

raise ParserError, ”Parsing: no alternatives"

Page 25: Scheme in Python

parse_list_members()parse_list_members()

def parse_list_members(tokens):

if eat(tokens, 'dot'):

final = parseExpression(tokens)

eat_safe(tokens, ')')

return final

if peek(tokens)[0] in ('\'’,'`’,',’,'(’, 'number’,'symbol’,'string'):

return cons(parseExpression(tokens), parse_list_members(tokens))

if eat(tokens, ')'):

return NIL

raise ParserError, "Can't finish list” + tokens

Page 26: Scheme in Python

Recursive descent parsingRecursive descent parsing

Remember one problem with recursive descent parsing is that the grammar has to be right recursive

Another potential problem is recursing too deeply and exceeding the limit on the stack

But maybe we can use tail recursion, which an interpreter or compiler can recognize and execute as iteration?

Not in Python

Page 27: Scheme in Python

Python doesn’t optimize tail recursionPython doesn’t optimize tail recursion

def fact0(n):

# iterative facorial

result = 1

while n>1:

result *= n

n -= 1

return result

def fact1(n):

# simple recursive factorial

return 1 if n==1 else n*fact2(n - 1)

def fact2(n, result=1):

# tail recursive factorial

return result if n==1 else fact2(n-1, n*result)

Page 28: Scheme in Python

Try thisTry this

http://www.csee.umbc.edu/331/fall08/0101/code/python/pyscheme-1.7/src/fact.py

Page 29: Scheme in Python

Default limit is 999Default limit is 999

fact2(1000) and fact3(1000) both die>>> fact2(1000)

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

File "fact.py", line 17, in fact2

return result if n==1 else fact2(n-1, n*result)

File "fact.py", line 17, in fact2

File "fact.py", line 17, in fact2

return result if n==1 else fact2(n-1, n*result)

RuntimeError: maximum recursion depth exceeded

Page 30: Scheme in Python

How to solve this?How to solve this?

You can set the maximum recursion depth higher>>> import sys

>>> sys.getrecursionlimit()

1000

>>> sys.setrecursionlimit(10000)

>>> fact2(1100)

53437084880926377034242155 ... 00000000L

But this is not a general solution And Guido is on the record as not wanting to

optimize tail recursion• http://www.artima.com/forums/flat.jsp?forum=106&thread=147358

Page 31: Scheme in Python

Trampoline StyleTrampoline Style

A trampoline is a loop that iteratively invokes thunk-returning functions

A thunk is just a a piece of code to perform a delayed computation (e.g., a closure)

A single trampoline can express all control transfers of a program

Converting a program to trampolined style is trampolining

This is kind of continuation passing style of programming

Trampolined functions can do tail recursive function calls in stack-oriented languages

Page 32: Scheme in Python

Trampolining is one answerTrampolining is one answer

A way to program using CPS, Continuation Passing Style

CPS is a style of programming where control is passed explicitly as continuations

Trampolining is a simple way toeliminate recursion

We’ll use a simple kind of trampolining Instead of making a recursive call, a

procedure can bounce back up to itscaller with a continuation, which canbe called to proceed with the computation

Page 33: Scheme in Python

PogoPogo

from pogo import pogo, land, bounce

def fact3(n):

# factorial in a trampolined style

return pogo(fact_tramp(n))

def fact_tramp(n, result=1):

return land(result) if n==1 else bounce(fact_tramp, n-1, n*result)

Page 34: Scheme in Python

Variable length argument listsVariable length argument lists

>>> def foo(*args):

print "Number of arguments:", len(args)

print "Arguments are: ", args

>>> foo(1,2,3,'d',5)

Number of arguments: 5

Arguments are: (1, 2, 3, 'd', 5)

>>> def bar(arg1, *rest):

print …

Page 35: Scheme in Python

pogo.pypogo.py

def bounce(function, *args):

"""Returns new trampolined value that continues bouncing"""

return ('bounce', function, args)

def land(value):

"""Returns new trampolined value that lands off trampoline"""

return ('land', value)

Page 36: Scheme in Python

It worksIt works

>>> sys.setrecursionlimit(10)

>>> fact3(100)

93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000L

>>> fact3(1000)

4023872600770937735...00000000000000L

Page 37: Scheme in Python

pogo.pypogo.py

def pogo(bouncer):

try:

while True:

if bouncer[0] == 'land’:

return bouncer[1]

elif bouncer[0] == 'bounce':

bouncer = bouncer[1](*bouncer[2])

else:

traceback.print_exc()

raise TypeError, "not a bouncer”

except TypeError:

traceback.print_exc()

raise TypeError, "not a bouncer”

Page 38: Scheme in Python

See pyscheme1.6See pyscheme1.6

Pyscheme1.6 is written in trampoline style Which was done by hand, as opposed to

using an automatic trampoliner And which I’ve been undoing by hand

Page 39: Scheme in Python

def eval(exp, env):

return pogo.pogo(teval(exp, env, pogo.land))

def teval(exp, env, cont):

if expressions.isIf(exp):

return evalIf(exp, env, cont)

def evalIf(exp, env, cont):

def c(predicate_val):

if isTrue(predicate_val):

return teval(ifConsequent(exp), env, cont)

else:

return teval(ifAlternative(exp), env, cont)

return teval(expressions.ifPredicate(exp), env, c)

Page 40: Scheme in Python

evalevaldef eval(exp, env):

if exp.isSelfEvaluating(exp): return exp

if exp.isVariable(exp):

return env.lookupVariableValue(exp, env)

if exp.isQuoted(exp): return evalQuoted(exp, env)

if exp.isAssignment(exp):

return evalAssignment(exp, env)

if exp.isDefinition(exp):

return evalDefinition(exp, env)

if exp.isIf(exp): return evalIf(exp, env)

if exp.isLambda(exp):

return exp.makeProcedure(exp.lambdaParameters(exp),

exp.lambdaBody(exp), env)

if exp.isBegin(exp):

return evalSequence(exp.beginActions(exp), env)

if exp.isApplication(exp):

return evalApplication(exp, env)

raise SchemeError, "Unknown expr, eval " + str(exp)

Page 41: Scheme in Python

applyapply

def apply(procedure, arguments, env):

if exp.isPrimitiveProcedure(procedure):

return applyPrimProc(procedure, arguments, env)

if exp.isCompoundProcedure(procedure):

newEnv = env.extendEnvironment(

exp.procedureParameters(procedure),

arguments,

exp.procedureEnvironment(procedure))

return evalSequence(exp.procedureBody(procedure), newEnv)

raise SchemeError, "Unknown proc - apply " + str(procedure)

Page 42: Scheme in Python

EnvironmentsEnvironments

An environment will be a list of frames Each frame will be a Python dictionary with

the variable names as keys and their values as values

Page 43: Scheme in Python

envenvTHE_EMPTY_ENVIRONMENT = []

def enclosingEnvironment(env): return env[1:]

def firstFrame(env): return env[0]

def extendEnvironment(var_pairs, val_pairs, base):

new_frame = {}

vars = toPythonList(var_pairs)

vals = toPythonList(val_pairs)

if len(vars) != len vals:

raise SchemeError, "Mismatched vals and vars"

for (var, val) in zip(vars, vals):

new_frame[var] = val

return new_frame + base_env

Page 44: Scheme in Python

Lookup a Variable ValueLookup a Variable Value

def lookupVariableValue(var, env):

while True:

if env == THE_EMPTY_ENVIRONMENT:

raise SchemeError,"Unbound var “+var

frame = firstFrame(env)

if frame.has_key(var):

return frame[var]

env = enclosingEnvironment(env)

Page 45: Scheme in Python

Define/Set a VariableDefine/Set a Variable

def defineVariable(var, val, env):

firstFrame(env)[var] = val

def setVariableValue(var, val, env):

while True:

if env == THE_EMPTY_ENVIRONMENT:

raise SchemeError, "Unbound variable -- SET! " + var

top = firstFrame(env)

if top.has_key(var):

top[var] = val

return

env = enclosingEnvironment(env)

Page 46: Scheme in Python

BuiltinsBuiltins

We’ll define a Python function to handle each of the primitive Scheme functions

Many List functions take any number of args:• (+ 1 2) => 3

• (+ 1 2 3 4 5) => 15

• (+ ) => 0

We can takuse Python’s (new) syntax for functions that take any number or args, e.g.:• If the last parameter in a function’s parameter list is

preceded by a *, it’s bound to a list of the remaining args• def add (*args): sum(args)

Page 47: Scheme in Python

BuiltinsBuiltinsdef allNumbers(numbers):

for n in numbers:

if type(n) not in (types.IntType, types.LongType,

types.FloatType):

return 0

return 1

def schemeAdd(*numbers):

if not allNumbers(numbers):

raise SchemeError, "prim + - non-numeric arg”

return sum(numbers)

Page 48: Scheme in Python

Setting up the initial environmentSetting up the initial environment

def setupEnvironment():

PRIME_PROCEDURES = [ ["car", pair.car],

["cdr", pair.cdr], ["+", schemeAdd], ... ]

init_env = env.extendEnvironment(

pair.NIL, pair.NIL, env.THE_EMPTY_ENVIRONMENT)

for name, proc in PRIME_PROCEDURES:

p = cons(symbol("primitive"), cons(proc, NIL))

defineVariable(symbol(name), p, env)

defineVariable(symbol("#t"),symbol("#t"), init_env)

defineVariable(symbol("#f"), symbol("#f"), init_env)

return initial_environment