Knowledge Representation in Digital Humanities Antonio Jiménez Mavillard Department of Modern Languages and Literatures Western University


Post on 11-May-2015


Page 1: Lecture03

Knowledge Representation in Digital Humanities
Antonio Jiménez Mavillard
Department of Modern Languages and Literatures
Western University

Page 2: Lecture03

Lecture 3

* Contents:
  1. Why this lecture?
  2. Discussion
  3. Chapter 3
  4. Assignment
  5. Bibliography

Page 3: Lecture03

Why this lecture?

* This lecture...
  · trains problem-solving skills through algorithm formalization
  · prepares the ground for writing real programs

Page 4: Lecture03

Last assignment discussion

* Time to...
  · consolidate the ideas and concepts covered in the readings
  · discuss issues that arose in the specific solutions to the projects

Page 5: Lecture03

Chapter 3

Fundamentals of Programming

1. Designing algorithms
2. Elements of a program

Page 6: Lecture03

Chapter 3

1 Designing algorithms
  1.1 The programming process
  1.2 What is an algorithm?

Page 7: Lecture03

Chapter 3

2 Elements of a program
  2.1 What is a program?
  2.2 Components of a program
  2.3 Types of errors

Page 8: Lecture03

Designing algorithms


Page 9: Lecture03

The programming process

* Programming cycle:
  1. Define the problem
  2. Plan the solution
  3. Code the program
  4. Test the program
  5. Document the process

Page 10: Lecture03

The programming process

* Define the problem
  · Identify the input data (what we have)
  · Determine the output information (what we want to obtain)

Page 11: Lecture03

The programming process

* Plan the solution
  · Design an algorithm
    + by drawing a flow diagram
    + by writing pseudocode

Page 12: Lecture03

The programming process

* Code the program
  · Translate the algorithm into a programming language

Page 13: Lecture03

The programming process

* Test the program
  · Verify that, for a given input, the program produces the correct output
  · Find and fix errors (debugging):
    + syntax
    + runtime
    + semantic

Page 14: Lecture03

The programming process

* Document the process
  · Describe the problem and the solution
  · Include pseudocode or flow diagrams
  · Report testing results
  · Comment the code

Page 15: Lecture03

References

Glassborow, Francis. “Chapter 1: You Can Program.” You Can Do It!: A Beginner’s Introduction to Computer Programming. Chichester, West Sussex, England; Hoboken, NJ: John Wiley, 2004. Print.

Mohd Harris. “PROG0101 - Fundamentals of Programming.” N. p., n.d. Web. 17 Jan. 2014.

Perry, Greg M. “Chapter 2: Anatomy of a Program.” Absolute Beginner’s Guide to Programming. Indianapolis, Ind.: Que Pub., 2003. Print.

Page 16: Lecture03

What is an algorithm?

* Definitions
  · A detailed plan to solve a problem
  · A step-by-step set of instructions for solving a problem
  · A finite process that, if followed, will solve a problem

Page 17: Lecture03

What is an algorithm?

* Characterized by 5 properties:
  1. Input: initial data
  2. Output: final result

Page 18: Lecture03

What is an algorithm?

* Characterized by 5 properties:
  3. Finiteness: it has to terminate after a finite number of steps
  4. Definiteness: each step has to be unambiguously specified
  5. Effectiveness: each step should be doable by a human in a finite time

Page 19: Lecture03

What is an algorithm?

* Exercise 1
  · A recipe is an algorithm that solves the following problem: how to prepare a meal
  · Search for the recipe of Green Tea Berry Delight (http://allrecipes.com/Recipe/Green-Tea-Berry-Delight/Detail.aspx)

Page 20: Lecture03

What is an algorithm?

* Exercise 1
  · Identify: input, output and steps
  · Answer the following questions:
    + Is the recipe finite?
    + Is each step definite?
    + Is each step effective?

Page 21: Lecture03

What is an algorithm?

* Exercise 1 (solution)
  · Recipe:

Page 22: Lecture03

What is an algorithm?


Page 23: Lecture03

What is an algorithm?

* Exercise 1 (solution)
  · Is the recipe finite? Yes, it is done in 5 minutes
  · Is each step definite? Yes, none of them is ambiguous
  · Is each step effective? Yes, in fact they are meant to be carried out by a human

Page 24: Lecture03

What is an algorithm?

* Exercise 2
  · Design an algorithm to divide two numbers using only additions and subtractions

Page 25: Lecture03

What is an algorithm?

* Exercise 2 (solution)
  · Solve the specific case: divide 7 by 3
    Hint: how many times is 3 contained in 7?
  · Solve the general case: divide A by B

Page 26: Lecture03

What is an algorithm?

* Exercise 2 (solution)
  · How many times is 3 contained in 7?

Page 27: Lecture03

What is an algorithm?

* Exercise 2 (solution)
  · Count the number of subtractions

Page 28: Lecture03

What is an algorithm?

* Exercise 2 (solution)

Page 29: Lecture03

What is an algorithm?

* Exercise 2 (solution)
  Is the algorithm correct? Trace for 7/3:

  Step 0  A is 7, B is 3, C is 0
  Step 1  Is 7 >= 3? Yes
          A is now 7 - 3 = 4
          C is now 0 + 1 = 1
  Step 2  Is 4 >= 3? Yes
          A is now 4 - 3 = 1
          C is now 1 + 1 = 2
  Step 3  Is 1 >= 3? No
  Step 4  Output C, that is 2
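The trace above can be reproduced in code. What follows is a minimal Python 3 sketch of the repeated-subtraction idea, not the slide's own solution: the function name `divide` and its variable names are illustrative assumptions, and it assumes A >= 0 and B > 0 so that the loop is guaranteed to terminate (the finiteness property).

```python
def divide(a, b):
    """Integer-divide a by b using only addition and subtraction.

    Assumption of this sketch: a >= 0 and b > 0. Without it the loop
    below would either never end or give a meaningless answer.
    """
    if a < 0 or b <= 0:
        raise ValueError("this sketch requires a >= 0 and b > 0")
    count = 0              # C in the trace: subtractions performed so far
    while a >= b:          # "Is A >= B?"
        a = a - b          # A is now A - B
        count = count + 1  # C is now C + 1
    return count           # how many times b is contained in a


print(divide(7, 3))  # the case traced above; prints 2
```

Rejecting invalid inputs up front is one design choice; the slides leave the behaviour for B = 0 or negative numbers unspecified.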

Page 30: Lecture03

References

Cormen, Thomas H. “Chapter 1: The Role of Algorithms in Computing.” Introduction to Algorithms. Cambridge, Massachusetts; London: The MIT Press, 2009. Print.

De la Rosa, Javier. “Computer Tools for Linguists.” Yutzu. N. p., n.d. Web. 16 Sept. 2013.

Knuth, Donald E. “Chapter 1: Basic Concepts.” The Art of Computer Programming. Volume 1: Fundamental Algorithms. Vol. 1. Reading, Mass.: Addison-Wesley, 1997. Print.

Page 31: Lecture03

Elements of a program


Page 32: Lecture03

What is a program?

* Definition
  “A program is an implementation of an algorithm in a programming language.”
  (The concrete written program is called source code or just code)

Page 33: Lecture03

What is a program?

* Algorithm vs program
  · Algorithm:
    + Abstract
    + Represented by a flow diagram, pseudocode...
    + For human understanding

Page 34: Lecture03

What is a program?

* Algorithm vs program
  · Program:
    + Concrete
    + Represented in a programming language
    + For computer processing

Page 35: Lecture03

Components of a program

* The content and structure of a program depend on the programming language
* Every programming language is formed by a set of symbols
* The combination of these symbols defines the programs

Page 36: Lecture03

Components of a program

* Programming languages are defined by:
  · Morphology
  · Syntax
  · Semantics

Page 37: Lecture03

Components of a program

* Morphology
  · Symbols: numbers, letters and special characters
  · Symbols are combined to form tokens: the basic elements of a language

Page 38: Lecture03

Components of a program

* Morphology
  · Vocabulary: a set of keywords (special tokens) with specific functionality
  · Examples in Python: def, elif, except, print (print is a keyword in Python 2 only)

Page 39: Lecture03

Components of a program

* Syntax
  · Grammar rules to write a program
    + Tokens
      - How the symbols are combined
      - Examples in Python:
        correct: 3, counter, def
        incorrect: $+1, (&variable

Page 40: Lecture03

Components of a program

* Syntax
  · Grammar rules to write a program
    + Structure
      - The way tokens are arranged
      - Expressions, blocks...
      - Examples in Python:
        correct: a += 1
        incorrect: a $= 1

Page 41: Lecture03

Components of a program

* Semantics
  · Meaning of the program
  · Examples: how to interpret the order of the operators
    + Operator precedence
      x - 2 * 3 ≡ x - (2 * 3)
    + Operator associativity
      x - 2 + 3 ≡ (x - 2) + 3
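The two equivalences on this slide can be checked directly in Python. The sketch below is only an illustration: the value chosen for `x` is an arbitrary assumption, and any number would do.

```python
x = 10  # arbitrary example value (an assumption of this sketch)

# Precedence: * binds tighter than -, so 2 * 3 is evaluated first.
assert x - 2 * 3 == x - (2 * 3)   # 10 - 6 = 4
assert x - 2 * 3 != (x - 2) * 3   # (10 - 2) * 3 = 24

# Associativity: - and + share one precedence level and group left to right.
assert x - 2 + 3 == (x - 2) + 3   # 8 + 3 = 11
assert x - 2 + 3 != x - (2 + 3)   # 10 - 5 = 5

print(x - 2 * 3, x - 2 + 3)  # prints: 4 11
```

The `!=` lines show why the rules matter: the same tokens with a different grouping give a different result.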

Page 42: Lecture03

Components of a program

* A program is a set of instructions
* An instruction is a statement
* A statement is an executable unit of code formed by expressions
* An expression is a combination of tokens
* A token is a sequence of symbols

Page 43: Lecture03

Components of a program

* Exercise 3
  Given the following code...

    fruit = 'banana'
    counter = 0
    index = 0
    while index < len(fruit):
        char = fruit[index]
        if char == 'a':
            counter += 1
        index += 1
    print counter

Page 44: Lecture03

Components of a program

* Exercise 3
  ... identify:
  · Symbols, tokens and keywords
  · Some grammar rules
  · Its semantics

Page 45: Lecture03

Components of a program

* Exercise 3 (solution)
  · symbols: letters, numbers, ', <, (, ), :, =, [, ], and +
  · tokens: fruit, =, 'banana', counter, 0, index, while, <, len, (, ), :, char, [, ], if, ==, 'a', +=, 1, print
  · keywords: while, if, print (len is a built-in function, not a keyword)

Page 46: Lecture03

Components of a program

* Exercise 3 (solution)
  · grammar rules: quotation marks around strings '', a colon after conditions :, indentation for blocks, a closing parenthesis after every opening parenthesis (), a closing bracket after every opening bracket []
  · semantics: counts and prints on screen the number of a's in the word “banana”
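An answer to Exercise 3 can be checked with Python's standard `tokenize` and `keyword` modules. This is a minimal sketch under two assumptions: it targets Python 3, where `print` is a built-in function rather than a keyword (unlike the Python 2 code in the exercise), and the helper name `token_strings` is made up for illustration.

```python
import io
import keyword
import tokenize


def token_strings(source):
    """Return the text of every name, number, string and operator token."""
    wanted = {tokenize.NAME, tokenize.NUMBER, tokenize.STRING, tokenize.OP}
    readline = io.StringIO(source).readline
    return [tok.string
            for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]


# Two lines taken from the exercise code.
source = "while index < len(fruit):\n    index += 1\n"
tokens = token_strings(source)

print(tokens)
# Of these tokens, only 'while' is a reserved keyword; 'len' is an
# ordinary name bound to a built-in function.
print([t for t in tokens if keyword.iskeyword(t)])
```

Layout tokens (newlines, indents and dedents) are filtered out so that the list matches the informal notion of token used on the slides.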

Page 47: Lecture03

References

De la Rosa, Javier. “Computer Tools for Linguists.” Yutzu. N. p., n.d. Web. 16 Sept. 2013.

The Little Introduction To Programming. N. p. Web.


Page 48: Lecture03

Types of errors

* Syntax error
  The code of the program breaks the syntax rules of the programming language
* Runtime error
  The code is syntactically correct but results in illegal operations during execution
* Semantic error
  The program runs but does not behave as expected

Page 49: Lecture03

Types of errors

* Examples in Python
  · Syntax error

    a + 1 = b

    Malformed assignment
  · Runtime error

    a = 4
    b = 0
    c = a / b

    Division by 0

    my_list = [1, 2, 3, 4]
    element = my_list[7]

    Access to a nonexistent element
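Both snippets above are syntactically valid and only fail while the program runs, when Python raises `ZeroDivisionError` and `IndexError` respectively. The sketch below (Python 3) shows one way to observe and handle them; the helper names `safe_divide` and `safe_get` are hypothetical, and returning `None` on failure is just one possible choice.

```python
def safe_divide(a, b):
    """Return a / b, or None when b is 0 (hypothetical helper)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def safe_get(items, position):
    """Return items[position], or None when the position does not exist."""
    try:
        return items[position]
    except IndexError:
        return None


print(safe_divide(4, 0))          # prints None instead of crashing
print(safe_get([1, 2, 3, 4], 7))  # prints None instead of crashing
```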

Page 50: Lecture03

Types of errors

* Examples in Python
  · Semantic error

    fruit = 'banana'
    counter = 0
    index = 0
    while index < len(fruit):
        char = fruit[index]
        if char == 'a':
            counter += 1
        index += 1
    print fruit

    It does not print the number of a's
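A semantic error raises no message, so it only shows up when the output is compared with the expected result. As a minimal sketch in Python 3, the same loop can be wrapped in a function and checked against a known answer; the function name `count_char` is an assumption made for this illustration.

```python
def count_char(word, target):
    """Count how many times target occurs in word, one index at a time."""
    counter = 0
    index = 0
    while index < len(word):
        if word[index] == target:
            counter += 1
        index += 1
    return counter  # return the count, not the word itself


# 'banana' contains three a's; returning the word instead would fail here.
assert count_char('banana', 'a') == 3
print(count_char('banana', 'a'))  # prints 3
```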

Page 51: Lecture03

References

Severance, Dr Charles R. “Chapter 1: Why Should You Learn to Write Programs?” Python for Informatics: Exploring Information. 1 edition. CreateSpace Independent Publishing Platform, 2013. Print.

Page 52: Lecture03

Assignment

* Assignment 3: Playing with algorithms
  · Readings
    + The Role of Algorithms in Computing (Introduction to Algorithms)
    + Strings (Python for Informatics)

Page 53: Lecture03

Assignment

* Assignment 3: Playing with algorithms
  · Project
    + Write an algorithm in Python that cleans a text of punctuation marks

    O Romeo, Romeo! wherefore art thou Romeo?
    Deny thy father and refuse thy name;
    Or, if thou wilt not, be but sworn my love,
    And I'll no longer be a Capulet.

    »

    O Romeo Romeo wherefore art thou Romeo
    Deny thy father and refuse thy name
    Or if thou wilt not be but sworn my love
    And I'll no longer be a Capulet

Page 54: Lecture03

References

Cormen, Thomas H. “Chapter 1: The Role of Algorithms in Computing.” Introduction to Algorithms. Cambridge, Massachusetts; London: The MIT Press, 2009. Print.

Severance, Dr Charles R. “Chapter 6: Strings.” Python for Informatics: Exploring Information. 1 edition. CreateSpace Independent Publishing Platform, 2013. Print.

Page 55: Lecture03

Bibliography

Cormen, Thomas H. Introduction to Algorithms. Cambridge, Massachusetts; London: The MIT Press, 2009. Print.

De la Rosa, Javier. “Computer Tools for Linguists.” Yutzu. N. p., n.d. Web. 16 Sept. 2013.

Glassborow, Francis. You Can Do It!: A Beginner’s Introduction to Computer Programming. Chichester, West Sussex, England; Hoboken, NJ: John Wiley, 2004. Print.

Knuth, Donald E. The Art of Computer Programming. Volume 1: Fundamental Algorithms. Vol. 1. Reading, Mass.: Addison-Wesley, 1997. Print.

Mohd Harris. “PROG0101 - Fundamentals of Programming.” N. p., n.d. Web. 17 Jan. 2014.

Perry, Greg M. Absolute Beginner’s Guide to Programming. Indianapolis, Ind.: Que Pub., 2003. Print.

Severance, Dr Charles R. Python for Informatics: Exploring Information. 1 edition. CreateSpace Independent Publishing Platform, 2013. Print.

The Little Introduction To Programming. N. p. Print.