PyCon2012 Notes Documentation

Release 0.0.1

Eric Walstad

March 16, 2012


CONTENTS

1 Thursday
   1.1 Sphinx Tutorial
   1.2 Learning Music with Python

2 Friday
   2.1 A Noob Speaks to Noobs: Your First Site in the Cloud
   2.2 Extracting musical information from sound
   2.3 Practical Machine Learning in Python
   2.4 Throwing Together Distributed Services With Gevent
   2.5 Decorators and Context Managers
   2.6 Python Metaprogramming for Mad Scientists and Evil Geniuses
   2.7 Introspecting Running Python Processes

3 Saturday
   3.1 Python meets arduino
   3.2 Python for makers
   3.3 Pragmatic Unicode, or, How do I stop the pain?
   3.4 Coroutines, event loops, and the history of Python generators
   3.5 Militarizing Your Backyard with Python: Computer Vision and the Squirrel Hordes
   3.6 Using fabric to standardize the development process
   3.7 Spatial data and web mapping with Python
   3.8 The Pyed Piper: A Modern Python Alternative to awk, sed and Other Unix Text Manipulation Utilities

4 Sunday
   4.1 Parsing Horrible Things with Python
   4.2 Improving Documentation with “Beginner’s Mind” (or: Fixing the Django Tutorial)
   4.3 More than just a pretty web framework, the Tornado IOLoop

5 Indices and tables


PyCon2012 Notes Documentation, Release 0.0.1

These notes are kept on github: https://github.com/ewalstad/PyCon2012-Notes


CHAPTER

ONE

THURSDAY

1.1 Sphinx Tutorial

Brandon Rhodes

• http://rhodesmill.org/brandon/

• http://twitter.com/#!/brandon_rhodes

• https://us.pycon.org/2012/schedule/presentation/355/

• http://pyvideo.org/video/616/documenting-your-project-with-sphinx

• http://sphinx.pocoo.org/contents.html

– http://sphinx.pocoo.org/rest.html

• http://docutils.sourceforge.net/docs/user/rst/quickref.html

Set up a virtualenv:

    workon sphinxenv
    sphinx-quickstart
    mkdir pg8; cd pg8
    while true; do sleep 1; make html; done
    python -m SimpleHTTPServer

Browse to http://localhost:8000

To make a table of contents:

    Table of Contents
    -----------------

    .. toctree::
       :maxdepth: 2

       guide
       api

To make API docs, use autodoc: edit conf.py and add:

    import os
    import sys
    sys.path.insert(0, os.path.abspath('.'))
    extensions = ['sphinx.ext.autodoc']

then, in your rst file, reference the code with something like:

    .. automodule:: triangles.shape
       :members:

or

    two instances of the :class:`~triangles.shape.Triangle` class and calling one of their
    :meth:`~triangles.shape.Triangle.is_similar()` methods

Set the theme by adding something like this to the conf.py file:

    html_theme = 'agogo'

http://sphinx.pocoo.org/theming.html

Batteries included: Comes with a built-in search engine powered by JavaScript

Can run doctest to verify your docs. In conf.py:

    extensions = [
        'sphinx.ext.autodoc',
        'sphinx.ext.viewcode',
        'sphinx.ext.doctest',
    ]
    doctest_path = sys.path[0:1]
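To illustrate the kind of example a doctest run can verify, here is a minimal sketch of a class whose docstring contains runnable examples. The names `Triangle` and `is_similar()` come from the `triangles.shape` references above, but this implementation and the `run_examples()` helper are assumptions written for illustration, not the tutorial's code. The helper uses the standard library `doctest` module directly rather than Sphinx.

```python
"""Hypothetical stand-in for the tutorial's triangles.shape module."""
import doctest


class Triangle(object):
    """A triangle described by its three side lengths.

    >>> Triangle(3, 4, 5).is_similar(Triangle(6, 8, 10))
    True
    >>> Triangle(3, 4, 5).is_similar(Triangle(3, 4, 6))
    False
    """

    def __init__(self, a, b, c):
        # Store the sides sorted so comparisons ignore argument order.
        self.sides = tuple(sorted((a, b, c)))

    def is_similar(self, other):
        """Return True if both triangles have the same side ratios."""
        # Cross-multiply instead of dividing to avoid float rounding.
        (a1, b1, c1), (a2, b2, c2) = self.sides, other.sides
        return a1 * b2 == a2 * b1 and b1 * c2 == b2 * c1


def run_examples():
    """Run Triangle's docstring examples; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        Triangle.__doc__, {"Triangle": Triangle}, "Triangle", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted
```

If a docstring example drifts out of date, the failed count becomes non-zero, which is the point of testing the docs.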

1.2 Learning Music with Python

Pedro Kroger

@pedrokroger

• https://github.com/kroger/learning-music-with-python

• https://us.pycon.org/2012/schedule/presentation/148/

• No pyvideo yet

• http://musicforgeeksandnerds.com/

• http://pedrokroger.net/

• https://github.com/kroger


CHAPTER

TWO

FRIDAY

2.1 A Noob Speaks to Noobs: Your First Site in the Cloud

Katie Cunningham

• https://us.pycon.org/2012/schedule/presentation/101/

• http://pyvideo.org/video/628/a-noob-speaks-to-noobs-your-first-site-in-the-cl

2.2 Extracting musical information from sound

Adrian Holovaty

• https://us.pycon.org/2012/schedule/presentation/6/

• http://pyvideo.org/video/878/extracting-musical-information-from-sound

• http://morecowbell.dj/

• http://marsyas.info

• http://vamp-plugins.org

• http://aubio.org (Python library)

The Echo Nest (free for personal use): http://the.echonest.com/

WAV file to notes

2.3 Practical Machine Learning in Python

Matt Spitz @mattspitz

• http://mloss.org ML package

• https://us.pycon.org/2012/schedule/presentation/119/

• http://pyvideo.org/video/636/practical-machine-learning-in-python

• https://github.com/mattspits/sluggerml

• http://slideshare.net/mattspitz/practical-machine-learningin-in-python


SluggerML training set: blob of data

data points train a classifier

classifier returns statistics

sources, coalescing, scrubbing

training set

• nltk: natural language processing: including stemming

• mlpy: regression, classification, clustering

• PyML

• PyBrain

• mdp-toolkit: modular data processing toolkit

• scikit-learn

demo

feature selection: identify predictive features

chi-squared feature selection from scikit-learn

nltk NaiveBayesClassifier

tips and tricks

Persistent classifier internals

once trained, save and reuse

Use generators where possible

avoid keeping data in memory

single pass algorithms

conversion pass before training

Multicore text processing

scrubbing: low memory footprint

multiprocessing module
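The first two tips above can be sketched as follows: a generator feeds examples one at a time, training makes a single pass, and the trained model is pickled for reuse. This is only an illustration under stated assumptions. The tab-separated data, the function names, and the toy word-count scorer are all hypothetical; the scorer stands in for a real classifier such as nltk's NaiveBayesClassifier and is not SluggerML's code.

```python
import io
import pickle
from collections import Counter, defaultdict


def read_examples(lines):
    """Yield (label, tokens) pairs one at a time instead of building a list."""
    for line in lines:
        label, _, text = line.strip().partition("\t")
        if text:
            yield label, text.lower().split()


def train(examples):
    """Single pass: count how often each token appears under each label."""
    counts = defaultdict(Counter)
    for label, tokens in examples:
        counts[label].update(tokens)
    return dict(counts)


def classify(model, tokens):
    """Pick the label whose training tokens overlap most with `tokens`."""
    def score(label):
        return sum(model[label][token] for token in tokens)
    return max(sorted(model), key=score)


# Hypothetical tab-separated training data: "<label>\t<text>".
raw = io.StringIO("hit\tline drive to left field\nout\tground ball to short\n")
model = train(read_examples(raw))

# Persist the trained model, then reload it instead of retraining.
blob = pickle.dumps(model)
restored = pickle.loads(blob)
print(classify(restored, ["line", "drive"]))
```

In a real pipeline the pickled bytes would be written to a file, and the scrubbing step could be spread over several cores with the multiprocessing module, which is not shown here.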

The fine print

understand your data and algorithms

ml-class.org is an excellent resource

2.4 Throwing Together Distributed Services With Gevent

Jeff Lindsay @progrium

• https://us.pycon.org/2012/schedule/presentation/288/

• http://pyvideo.org/video/642/throwing-together-distributed-services-with-geven

• https://github.com/progrium/ginkgo


ginkgo.service

services are classes and nested modules: reusable components

start, stop, and reload; can run as daemons

have configuration

    from ginkgo.core import Service

2.5 Decorators and Context Managers

Dave Brondsema

• https://us.pycon.org/2012/schedule/presentation/131/

• http://pyvideo.org/video/883/decorators-and-context-managers

• speakerdeck.com/u/brondsem

memoize decorator

pip install decorator

flattens nested function: helps code look cleaner

class-based decorators using __call__ with the decorator package

decorator(decorator function, function)
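As an illustration of the memoize idea, here is a minimal sketch written with the standard library only. It uses `functools.wraps` instead of the `decorator` package mentioned in the talk, so it still shows the nested function that the package is meant to flatten. The `fib` function and the exposed `cache` attribute are assumptions added for the example.

```python
import functools


def memoize(func):
    """Cache results of `func` keyed by its positional arguments."""
    cache = {}

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    wrapper.cache = cache  # exposed only so the cache can be inspected
    return wrapper


@memoize
def fib(n):
    """Return the n-th Fibonacci number."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))
```

Because the recursive calls go through the decorated name, each value is computed once and then served from the cache.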

context managers

class with __enter__ and __exit__:

from contextlib import closing # (use nested in prev versions)

2.5.1 Ideas for using context managers:

• acquire locks

• set global vars or flags

• timing

• monkey patching

• transactions

nice visual separation of logical code blocks
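The timing idea from the list above might look like the sketch below: a class with `__enter__` and `__exit__`, followed by `contextlib.closing` wrapping an object that only has `close()`. The `Timer` and `Resource` classes are hypothetical names made up for this example.

```python
import time
from contextlib import closing


class Timer(object):
    """Context manager that records how long its block took."""

    def __enter__(self):
        self.start = time.time()
        return self  # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.time() - self.start
        return False  # do not swallow exceptions raised in the block


class Resource(object):
    """Hypothetical object that only offers close(), like many handles."""

    closed = False

    def close(self):
        self.closed = True


with Timer() as timer:
    total = sum(range(1000))

# closing() calls .close() on exit, even though Resource has no __enter__.
with closing(Resource()) as res:
    pass

print(total, timer.elapsed >= 0, res.closed)
```

Returning False from `__exit__` lets any exception from the block propagate, which is usually what a timing or locking helper wants.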

2.6 Python Metaprogramming for Mad Scientists and Evil Geniuses

Walker Hale

• https://us.pycon.org/2012/schedule/presentation/380/

• http://pyvideo.org/video/884/python-metaprogramming-for-mad-scientists-and-evi


I had to sit on the floor for this talk so I didn’t take any notes.

2.7 Introspecting Running Python Processes

Adam Lowry

• https://us.pycon.org/2012/schedule/presentation/466/

• http://pyvideo.org/video/656/introspecting-running-python-processes

• https://github.com/robotadam/socketconsole

• https://github.com/schmichael/mmstats

• https://github.com/Greplin/scales/

• https://github.com/alonho/pystuck

• http://urbanairship.com/jobs They are hiring

gdb-heap

eventlet’s backdoor module

werkzeug debugger

approaches:

new relic: hosted webapp monitoring

graphite: scalable graphing system for time series data, with a Python library for sending data from inside your app

socketconsole: stack trace dumps for Python processes

mmstats: the /proc filesystem for your application:

    import mmstats

bin/slurpstats progress-stats

bin/pollstats

mmash: a web server that finds the metrics you’ve created and displays them
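To give a feel for what a stack trace dump of a running process involves, here is a minimal standard-library sketch that collects the current stack of every thread. It is not socketconsole's code: the `dump_stacks` name is made up, and a real tool would also need a way to trigger the dump from outside the process (socketconsole serves it over a socket).

```python
import sys
import threading
import traceback


def dump_stacks():
    """Return a text dump of the current stack of every running thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    # sys._current_frames() maps thread id -> that thread's topmost frame.
    for ident, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s)\n" % (names.get(ident, "unknown"), ident))
        chunks.extend(traceback.format_stack(frame))
    return "".join(chunks)


print(dump_stacks())
```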


CHAPTER

THREE

SATURDAY

3.1 Python meets arduino

Peter Kropf

• https://us.pycon.org/2012/schedule/presentation/41/

• http://pyvideo.org/video/660/python-meets-the-arduino

• http://firmata.org

• http://modbus.org

• http://github.com/pkropf/pycon2012

pyserial module:

import serial

packages to handle serial protocol

3.2 Python for makers

Hugo Boyer

• https://us.pycon.org/2012/schedule/presentation/282/

• http://pyvideo.org/video/663/python-for-makers

• http://machinetouch.appspot.com/svgEngraver.py

Software engineer at MakerBot

Wrote his own code to produce gcode, turned it into a library. Struggled between writing code to run his CNC machines and finding the time to actually build stuff.


3.3 Pragmatic Unicode, or, How do I stop the pain?

Ned Batchelder

@nedbat

• https://us.pycon.org/2012/schedule/presentation/141/

• http://pyvideo.org/video/948/pragmatic-unicode-or-how-do-i-stop-the-pain

• http://nedbatchelder.com

• http://bit.ly/unipain Blog post of the talk he gave on unicode

• http://nedbatchelder.com/text/unipain/unipain.html#1 Slides

• https://github.com/nedbat

• https://bitbucket.org/ned

everything is bytes, remember that

unicode: assigns characters to code points (integers)

encoding: have to map Unicode code points to bytes somehow

utf-16, utf-32, utf-8, etc

utf-8: the king of encodings; variable length; ASCII chars are still one byte

str vs unicode (Python 2 names):

    str: a sequence of bytes
    unicode: a sequence of code points

know the difference between bytes and codepoints

.encode codepoints to bytes (operation on unicode strings)

.decode bytes to codepoints

len(unicode) => number of code points

len(utf8) => number of bytes

myunicode.encode("ascii", "xmlcharrefreplace")

sys.getdefaultencoding()

unicode sandwich: bytes on the outside, unicode on the inside; encode/decode at the edges

know what you have: bytes or unicode; if bytes, what encoding

encoding is out of band

you cannot infer the encoding of bytes: you must be told, or you have to guess

Data is dirty, sometimes you are told wrong.

test unicode
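The unicode sandwich can be sketched in a few lines. This example uses Python 3 names, where `bytes` is the byte sequence and `str` is the code point sequence (the notes above use the Python 2 names `str` and `unicode`). The sample bytes and the assumption that they are UTF-8 are made up for illustration.

```python
# Bytes arriving from outside; we were told (out of band) they are UTF-8.
raw = b"caf\xc3\xa9"

text = raw.decode("utf-8")      # decode at the edge: bytes -> code points
shout = text.upper()            # work on code points in the middle
out = shout.encode("utf-8")     # encode at the edge: code points -> bytes

print(len(text), len(raw))      # code points vs bytes

# Characters that do not fit the target encoding can be replaced
# with XML character references instead of raising an error.
print(text.encode("ascii", "xmlcharrefreplace"))

# Being told the wrong encoding fails loudly for non-ASCII bytes.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("wrong encoding:", exc.reason)
```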

3.4 Coroutines, event loops, and the history of Python generators

David Mertz

• https://us.pycon.org/2012/schedule/presentation/104/

• http://pyvideo.org/video/668/coroutines-event-loops-and-the-history-of-pytho

Sorry, no notes for this one, either


3.5 Militarizing Your Backyard with Python: Computer Vision and the Squirrel Hordes

Kurt Grandis

• https://us.pycon.org/2012/schedule/presentation/267/

• http://pyvideo.org/video/674/militarizing-your-backyard-with-python-computer

• https://sites.google.com/site/projectsentrygun/

• http://code.google.com/p/python-on-a-chip/

OpenCV open source computer vision:

    import cv
    cv.NamedWindow("camera raw", 1)
    capture = cv.CreateCameraCapture(0)
    img = cv.Query..

cvBlobsLib

cvFind...

O’Reilly Learning OpenCV

Support Vector Machines (SVM):

from svm import *

Use opencv to create color histogram (different for cardinal vs squirrel)

OpenTLD

predator

3.6 Using fabric to standardize the development process

Ricardo Kirkner

• https://us.pycon.org/2012/schedule/presentation/25/

• http://pyvideo.org/video/677/using-fabric-to-standardize-the-development-proce

• https://github.com/ricardokirkner/fabric-pycon2012

Good intro to fabric but the speaker’s examples were all from an old version of fabric.

3.7 Spatial data and web mapping with Python

Paul Smith

@paulsmith

• https://us.pycon.org/2012/schedule/presentation/428/


• http://pyvideo.org/video/680/spatial-data-and-web-mapping-with-python

Predicates answer questions about relationships

indexes:

    R-tree: good for all types of geometries
    quad-tree: best for point data; equivalent to a geohash

indexed queries: nearest neighbors, bounding box, point query

spatial reference systems: WGS 84, spherical Mercator, Texas Centric Albers Equal Area, various State Plane systems
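Two of the ideas above, a bounding box query and a change of spatial reference system, can be sketched in plain Python. The sample points and the rough box are made up for illustration, and the linear scan stands in for what an R-tree or quad-tree index would do far more efficiently on real data. Libraries such as Shapely, Rtree, and PROJ.4 are the practical tools for this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by spherical Mercator


def to_spherical_mercator(lon, lat):
    """Project WGS 84 degrees to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def in_bbox(point, bbox):
    """Bounding-box predicate: is (x, y) inside (minx, miny, maxx, maxy)?"""
    x, y = point
    minx, miny, maxx, maxy = bbox
    return minx <= x <= maxx and miny <= y <= maxy


# Hypothetical sample points as (lon, lat) in degrees.
points = {"a": (-97.74, 30.27), "b": (-122.42, 37.77)}
rough_box = (-106.7, 25.8, -93.5, 36.5)  # illustrative box, not an official extent

# Unindexed query: test every point against the box.
print([name for name, p in sorted(points.items()) if in_bbox(p, rough_box)])
print(to_spherical_mercator(180.0, 0.0))
```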

Formats:

Vector:

ESRI Shapefile (shp, shx, dbf, prj)

GeoJSON

KML

WKT

Raster: GeoTIFF

Libraries: GDAL/OGR

GEOS

PROJ.4

libspatialindex

Python:

Shapely: wrapper around GEOS

Rtree: wraps libspatialindex

Mapnik: library for creating maps; plugins for reading PostGIS, shapefiles, etc.

GeoDjango: bundled with Django; standalone wrappers for GEOS, GDAL, OGR, GeoIP

Kartograph: Python library and CLI; input shapefiles, output SVG; style with CSS, behaviour with JavaScript

Applications:

TileStache: web map tile serving and compositing; serves rendered mapnik tiles; prerendered tiles; vector data; cached

ipython:

QGIS: desktop app

Data sources:

US Census TIGER/Line

http://NationalAtlas.gov

OpenStreetMap

State and local GIS departments

See also OpenLayers, Leaflet, Modest Maps, Wax, Polymaps, TileMill

GDAL: collection of Unix binaries; a Swiss Army knife for conversions, etc.


3.8 The Pyed Piper: A Modern Python Alternative to awk, sed and Other Unix Text Manipulation Utilities

Toby Rosen

• https://us.pycon.org/2012/schedule/presentation/14/

• http://pyvideo.org/video/686/the-pyed-piper-a-modern-python-alternative-to-aw

• http://code.google.com/p/pyp/wiki/intro

Example usage:

    echo hello world | pyp "p.split()"
    ['hello', 'world']

    echo goodbye/cruel/world | pyp "slash"
    [%dgoodbye%cruel%world]]

Command line script that lets you write python code on the command line and use the pipe operator to chain text manipulation operations together.

Looks cool, but there are a lot of special command line options and syntax to learn. I think I’ll stick with the *nix tools.
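The core idea, evaluating a Python expression against each input line and chaining the results, can be sketched as below. This is not how pyp is implemented. The function names are hypothetical, only the variable name `p` is borrowed from the usage example above, and pyp's special variables such as `slash` are not reproduced.

```python
def apply_expression(expression, lines):
    """Evaluate `expression` once per input item with the item bound to `p`."""
    code = compile(expression, "<expression>", "eval")
    # eval() runs arbitrary code, so only use this with expressions you trust.
    return [eval(code, {}, {"p": line}) for line in lines]


def pipeline(lines, *expressions):
    """Chain expressions: each one receives the previous one's results."""
    for expression in expressions:
        lines = apply_expression(expression, lines)
    return lines


print(pipeline(["hello world"], "p.split()"))
print(pipeline(["goodbye/cruel/world"], "p.split('/')", "p[-1].upper()"))
```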


CHAPTER

FOUR

SUNDAY

4.1 Parsing Horrible Things with Python

Erik Rose

• https://us.pycon.org/2012/schedule/presentation/468/

• http://pyvideo.org/video/708/parsing-horrible-things-with-python

• https://github.com/erikrose/media-wiki-parser

Ugh, my head hurts after this one. The speaker describes his experiences in writing a language parser for parsing MediaWiki grammar.

4.2 Improving Documentation with “Beginner’s Mind” (or: Fixing the Django Tutorial)

Karen Rustad

@mllerustad

• https://us.pycon.org/2012/schedule/presentation/422/

• http://pyvideo.org/video/713/improving-documentation-with-beginners-mind-o

• http://PyStar.org

• http://bit.ly/xM1GML Another mis-copied bit.ly link

• https://github.com/aldeka

documentation for the newcomer

step 0: install the target package

make sure there are pointers to where to get more help, early in the tutorial

make sure there are pointers to keep from reinventing wheels

tell the user how to test their apps

list tutorial prereqs early (like South for migrations)


at the end of the tutorial there should be a “what’s next” section.

make it easy for the user to have a working, public facing, product at the end of the tutorial.

say who you are writing for at the beginning of the tutorial

test your docs; have your test runner test your Sphinx docs

have a tutorial for each user type

4.3 More than just a pretty web framework, the Tornado IOLoop

Gavin M. Roy

• https://us.pycon.org/2012/schedule/presentation/328/

• http://pyvideo.org/video/720/more-than-just-a-pretty-web-framework-the-tornad

• http://myyearbook.com They are hiring

Twisted has a non-pythonic reputation

fsm game server

Tornado is a ‘take what you need’ framework

IOLoop is the core of Tornado’s network stack: single instance per process. Used for client libraries and server applications. http://goo.gl/VFeAF (read the source)

IOStream: convenient utility class for dealing with the IOLoop; does most of the work for you

SSLIOStream: for SSL sockets

tornado.netutil.TCPServer

See Hello Async World:

class EchoServer(netutil.TCPServer):

tornado.stack_context.StackContext “Slight Magic”

tornado.ioloop.IOLoop:

.instance

.add_handler

.update_handler

.remove_handler

Events

Timers:

    add_timeout(deadline, callback)
    remove_timeout(timeout)
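To show what a loop with handlers and timers does, here is a toy event loop built on the standard library `selectors` module. It borrows the method names `add_handler`, `remove_handler`, and `add_timeout` from the list above, but it is not Tornado's implementation and its signatures are simplified assumptions (for example, handlers here only watch for readability). The `MiniLoop` class and the socket pair demo are made up for illustration.

```python
import heapq
import selectors
import socket
import time


class MiniLoop(object):
    """Toy event loop: one selector for socket events plus a heap of timers."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.timers = []       # heap of (deadline, sequence, callback)
        self.sequence = 0      # tie-breaker so callbacks are never compared
        self.running = False

    def add_handler(self, sock, callback):
        """Call callback(sock) whenever sock becomes readable."""
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def remove_handler(self, sock):
        self.selector.unregister(sock)

    def add_timeout(self, deadline, callback):
        """Call callback() once time.time() reaches deadline."""
        self.sequence += 1
        heapq.heappush(self.timers, (deadline, self.sequence, callback))

    def stop(self):
        self.running = False

    def start(self):
        self.running = True
        while self.running:
            # Sleep in select() only until the next timer is due.
            wait = None
            if self.timers:
                wait = max(0.0, self.timers[0][0] - time.time())
            for key, _ in self.selector.select(wait):
                key.data(key.fileobj)
            while self.timers and self.timers[0][0] <= time.time():
                heapq.heappop(self.timers)[2]()


received = []
loop = MiniLoop()
left, right = socket.socketpair()


def on_readable(sock):
    received.append(sock.recv(1024))
    loop.remove_handler(sock)
    loop.stop()


loop.add_handler(right, on_readable)
loop.add_timeout(time.time() + 0.05, lambda: left.sendall(b"hello async world"))
loop.start()
left.close()
right.close()
print(received)
```

The timer fires first and writes to one end of the socket pair; the loop then sees the other end become readable and runs the handler, all in a single thread.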


CHAPTER

FIVE

INDICES AND TABLES

• genindex

• search
