ocaml datatypes part ii: an exercise in type design · • design a datatype to describe...
TRANSCRIPT
![Page 1: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/1.jpg)
OCaml Datatypes Part II: An Exercise in Type Design
COS 326 Andrew W. Appel
Princeton University
slides copyright 2013-2015 David Walker and Andrew W. Appel
![Page 2: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/2.jpg)
A Note on Parameterized Type Definitions
![Page 3: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/3.jpg)
type (‘key, ‘val) tree = Leaf | Node of ‘key * ‘val * (‘key, ‘val) tree * (‘key, ‘val) tree type ‘a stree = (string, ‘a) tree type sitree = int stree
type ‘x f = body
arg f
definition:
use:
type f x = body
f arg
definition:
use:
General form: A Better Notation:
![Page 4: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/4.jpg)
Take-home Message • Think of parameterized types like functions:
– a function that take a type as an argument – produces a type as a result
• Theoretical basis:
– System F-omega – a typed lambda calculus with general type-level functions as
well as value-level functions
![Page 5: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/5.jpg)
Example Type Design
5
IBM developed GML (Generalize Markup Language) in 1969 • http://en.wikipedia.org/wiki/IBM_Generalized_Markup_Language • Precursor to SGML, HTML and XML
:h1.Chapter 1: Introduction :p.GML supported hierarchical containers, such as :ol :li.Ordered lists (like this one), :li.Unordered lists, and :li.Definition lists :eol. as well as simple structures. :p.Markup Minimization (later generalized and formalized in SGML), allowed the end-tags to be omitted for the “h1” and “p” elements.
![Page 6: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/6.jpg)
Simplified GML
6
To process a GML document, an OCaml program would: • Read a series of characters from a text file & Parse GML structure • Represent the information content as an OCaml data structure • Analyze or transform the data structure • Print/Store/Communicate results We will focus on how to represent and transform the information content of a GML document.
![Page 7: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/7.jpg)
Example Type Design
7
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
• A GML document consists of: – a list of elements
• An element is either: – a word or markup applied to an element
• Markup is either: – italicize, bold, or a font name
![Page 8: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/8.jpg)
Example Data
8
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
let d = [ Formatted (Bold, Formatted (Font “Arial”, Words [“Chapter”;“One”])); Words [“It”; ”was”; ”a”; ”dark”; ”&”; ”stormy; ”night.”; "A"]; Formatted (Ital, Words[“shot”]); Words [“rang”; ”out.”] ];;
![Page 9: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/9.jpg)
Challenge
9
• Change all of the “Arial” fonts in a document to “Courier”. • Of course, when we program functionally, we implement
change via a function that – receives one data structure as input – builds a new (different) data structure as an output
![Page 10: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/10.jpg)
Challenge
10
• Change all of the “Arial” fonts in a document to “Courier”.
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 11: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/11.jpg)
Challenge
11
• Change all of the “Arial” fonts in a document to “Courier”.
• Technique: approach the problem top down, work on doc first:
let rec chfonts (elts:doc) : doc =
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 12: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/12.jpg)
Challenge
12
• Change all of the “Arial” fonts in a document to “Courier”.
• Technique: approach the problem top down, work on doc first:
let rec chfonts (elts:doc) : doc = match elts with | [] -> | hd::tl ->
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 13: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/13.jpg)
Challenge
13
• Change all of the “Arial” fonts in a document to “Courier”.
• Technique: approach the problem top down, work on doc first:
let rec chfonts (elts:doc) : doc = match elts with | [] -> [] | hd::tl -> (chfont hd)::(chfonts tl)
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 14: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/14.jpg)
Changing fonts in an element
14
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing the font of an element:
let rec chfont (e:elt) : elt =
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 15: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/15.jpg)
Changing fonts in an element
15
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing the font of an element:
let rec chfont (e:elt) : elt = match e with | Words ws -> | Formatted(m,e) ->
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 16: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/16.jpg)
Changing fonts in an element
16
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing the font of an element:
let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) ->
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 17: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/17.jpg)
Changing fonts in an element
17
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing the font of an element:
let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) -> Formatted(chmarkup m, chfont e)
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 18: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/18.jpg)
Changing fonts in an element
18
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing a markup:
let chmarkup (m:markup) : markup =
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 19: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/19.jpg)
Changing fonts in an element
19
• Change all of the “Arial” fonts in a document to “Courier”.
• Next work on changing a markup:
let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m
type markup = Ital | Bold | Font of string type elt = Words of string list | Formatted of markup * elt type doc = elt list
![Page 20: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/20.jpg)
Summary: Changing fonts in an element
20
• Change all of the “Arial” fonts in a document to “Courier” • Lesson: function structure follows type structure
let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m let rec chfont (e:elt) : elt = match e with | Words ws -> Words ws | Formatted(m,e) -> Formatted(chmarkup m, chfont e) let rec chfonts (elts:doc) : doc = match elts with | [] -> [] | hd::tl -> (chfont hd)::(chfonts tl)
![Page 21: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/21.jpg)
Poor Style
21
• Consider again our definition of markup and markup change:
type markup = Ital | Bold | Font of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m
![Page 22: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/22.jpg)
Poor Style
22
• What if we make a change:
type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | _ -> m
the underscore silently catches all possible alternatives this may not be what we want -- perhaps there is an Arial TT font it is better if we are alerted of all functions whose implementation may need to change
![Page 23: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/23.jpg)
Better Style
23
• Original code:
type markup = Ital | Bold | Font of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | Ital | Bold -> m
![Page 24: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/24.jpg)
Better Style
24
• Updated code:
type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font “Arial” -> Font “Courier” | Ital | Bold -> m
..match m with | Font "Arial" -> Font "Courier" | Ital | Bold -> m.. Warning 8: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: TTFont _
![Page 25: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/25.jpg)
Better Style
25
• Updated code, fixed:
• Lesson: use the type checker where possible to help you maintain your code
type markup = Ital | Bold | Font of string | TTFont of string let chmarkup (m:markup) : markup = match m with | Font "Arial" -> Font "Courier" | TTFont "Arial" -> TTFont "Courier" | Font s -> Font s | TTFont s -> TTFont s | Ital | Bold -> m
![Page 26: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/26.jpg)
A couple of practice problems
26
• Write a function that gets rid of immediately redundant markup in a document. – Formatted(Ital, Formatted(Ital,e)) can be simplified to
Formatted(Ital,e) – write maps and folds over markups
• Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are books, and others are conference papers. Journals have a name, number and issue; books have an ISBN number; All of these entries should have a title and author. – design a sorting function – design maps and folds over your bibliography entries
![Page 27: OCaml Datatypes Part II: An Exercise in Type Design · • Design a datatype to describe bibliography entries for publications. Some publications are journal articles, others are](https://reader033.vdocuments.site/reader033/viewer/2022060215/5f05e10f7e708231d4152c4b/html5/thumbnails/27.jpg)
To Summarize
27
• Design recipe for writing OCaml code: – write down English specifications
• try to break problem into obvious sub-problems – write down some sample test cases – write down the signature (types) for the code – use the signature to guide construction of the code:
• tear apart inputs using pattern matching – make sure to cover all of the cases! (OCaml will tell you)
• handle each case, building results using data constructor – this is where human intelligence comes into play – the “skeleton” given by types can almost be done
automatically! • clean up your code
– use your sample tests (and ideally others) to ensure correctness