1 data types primitive data types character string types user-defined ordinal types array types...

34
1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types Ref: Chapter 6 in text

Upload: sylvia-ward

Post on 19-Jan-2016

245 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

1Data Types

• Primitive Data Types• Character String Types• User-Defined Ordinal Types• Array Types• Associative Arrays• Record Types• Union Types• Pointer Types• Ref: Chapter 6 in text

Page 2: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

2Memory Layout: Overview

• Text: – code

– constant data

• Data:– initialized global & static

variables

– global & static variables

– 0 initialized or un-initialized (blank)

• Heap– dynamic memory

• Stack– dynamic

– local variables

– subroutine info

Text

Data

Heap

Stack

0

xxxxxxxx

Page 3: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

3Data Types

• Data type– a collection of data values/objects, and

– a set of predefined operations on these objects

• Evolution of data types:– FORTRAN I (1957) - INTEGER, REAL, arrays

– Ada (1983) - User created unique types enforced by the system

• Design issues for all data types:– Syntax of references to variables

– Which operations are defined and how to specify them

– Mapping to computer representation

• Memory

• Bit pattern

Page 4: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

4Primitive Data Types

Not defined in terms of other data types

• Integers• Floating point types • Decimal• Character• Boolean• in some PLs: Character Strings

Page 5: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

5Integers

• Almost always an exact reflection of hardware– the mapping is trivial

• Up to 8 different integer types in a PL– size– unsigned– signed:

– negative numbers: • 2-complement? -> two bit patterns for 0!• => 2-complement - 1

Value

word_size - 1 bits

Sig

n1 b

it

Page 6: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

6Floating Point

• Model real numbers– Only approximations

– Fractional part: c12-1+c22-2+…+cn2-n

– Can't be exact, e.g. 0.1, etc.

• Mostly at least two floating-point types– sometimes more

• Usually reflects exactly the hardware– but not always

• IEEE Standard

Page 7: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

7IEEE Floating Point Format

• Single precision

• Double precision

• NaN : not a number

Exponent

11 bits

Fraction

52 bits

Sig

n1 b

it

Exponent

8 bits

Fraction

23 bits

Sig

n1 b

it

Page 8: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

8Decimal• Base 10• Business applications (usually money)• Stored as fixed number of decimal digits

– Encoded, not as floating point.

• Advantage – Accuracy

• No rounding errors, no exponent

• Binary representation can’t do this

• Disadvantages– Limited range– Needs more memory

• Example: BCD (binary coded decimal)– Uses 4 bits per decimal digit

• as much space as hexadecimal)

Page 9: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

9Character

• Stored as numeric codes– ASCII

• weird order– EBCDIC– Unicode

• mostly 16 bit, not for ASCII• Arabic, Thai, Korean, Katakana, Hiragana, Chinese (some)

• Often compatible with integer– the same operations

Page 10: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

10Boolean

• true / false• Could be implemented as bits

– Pascal: packed array of boolean

• Often as bytes or words– Java: 32 bits word

– Operations: • not (unary), and, or, =, ≠,

• xor (often), >, <, ≥, ≤, (false < true)

• Sometimes compatible with integer– C, C++– not in Java, Pascal

• Advantage– readability

Page 11: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

11String Types

• Sequences of characters• Design issues:

– A primitive type or just a special array?– Is the length of objects static or dynamic?

• Operations:– Assignment– Comparison (=, >, etc.)– Concatenation (+)– Substring reference– Pattern matching

• Regular expressions

Page 12: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

12String in PLs

• Pascal– packed array of char – not primitive type;– Only assignment and comparison

• Ada, FORTRAN 90, and BASIC– Somewhat primitive– Assignment, comparison, concatenation, substring

S := S1 & S2 (concatenation)

S(2..4) (substring reference)

– FORTRAN has an intrinsic for pattern matching e.g. (Ada)

• C and C++– Not primitive– Use char arrays – Library of functions provide operations

Page 13: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

13String in PLs (cont.)

• Java– Somewhat primitive: String class (not arrays of char)

• Concatenation with +• Objects cannot be changed (immutable)

– StringBuffer is a class for changeable string objects

– Rich libraries provide operations, regular expressions

• Perl and JavaScript– Patterns are defined in terms of regular expressions

• Very powerful, e.g.,

/[A-Za-z][A-Za-z\d]+/

• SNOBOL4 (string manipulation language)– Primitive string type

– Many operations, including elaborate pattern matching predefined

Page 14: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

14String Length: Static vs. Dynamic

1. Static– Implementation: compile-time descriptor – FORTRAN 77, Ada, COBOL, Pascal, Java– e.g. (FORTRAN 90)

• CHARACTER (LEN = 15) NAME;

2. Limited Dynamic Length– Implementation: may need a run-time descriptor

• not in C and C++ – a null character indicates actual length

3. Dynamic– Implementation: needs run-time descriptor

• allocation/deallocation is the problem

– SNOBOL4, Perl, JavaScript, Common Lisp

Page 15: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

15Character String Types

Run-time descriptorfor limited dynamic strings

Limited dynamic string

Maximum length

Actual length

Address

Static string

Length

Address

Compile-time descriptor for static strings

Page 16: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

16String Type Evaluation

• Advantages– Writability – Readability

• Static length primitive type– Inexpensive to provide, why not?

• Dynamic length– Flexibility vs. cost to provide

Page 17: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

17User-Defined Ordinal Types

• Ordinal type – The range of values is mapped onto the set of positive

integers

• Enumeration Types– User enumerates all possible values as symbolic constants– Represented with ordinal numbers

• Design Issues– Can symbolic constants be in more than one type definition?

• hair = (red, brown, blonde);• cat = (brown, striped, black);

– Can they be read/written as symbols?– Allowed as array indices?– Are subranges supported?

Page 18: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

18User-Defined Ordinal Type in PLs

• Pascal– can't reuse values, no input or output of values– can be used as array indices and case selectors– can be compared

• C and C++– like Pascal; can be input and output as integers

• Ada– constants can be reused (overloaded literals)– used as in Pascal; can be input and output

Page 19: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

19User-Defined Ordinal Type in Java

• enum types• Rigorous type checking• Since JDK 1.5 (Java 5)• Not only constants• As classes, can include

– Fields (incl. static final, i.e. constants)– Methods– Constructors (with parameters)– Subclass of Object,

• i.e. toString() is supported (and can be overwritten!)

• Sets of constants are supported!• Compiled into its own *.class file

• The Rolls Royce of enumeration types!

Page 20: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

20Evaluation of User-Defined Ordinal Types

• Readability– e.g., no need to code a color as a number

• Reliability– compiler can check– e.g., for ranges of values

• suppose you have 7 colors coded as integers (1..7), then • 9 is a legal color because it is legal integer!• colors can be added ?!

• Operations specified – e.g., colors can’t be added

• Support can be extensive, e.g. Java

Page 21: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

21Subrange Types

• An ordered contiguous subsequence of an ordinal type• Design Issue: How can they be used?• Pascal

– Behave as their parent types, can be used as – Indices in for-loops – Array indices

– e.g., type pos = 0..MAXINT;• Ada

– Not new types, just constrained existing types – They are compatible– Can be used as in Pascal– Can be used as case constants– e.g., subtype POS_TYPE is INTEGER

range 0..INTEGER'LAST;

Page 22: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

22Subrange Types: Implementation & Evaluation

• Implemented – As parent types – With code to restrict assignments to variables

• Code inserted by the compiler

• Advantages– Readability– Reliability

• restricted ranges improve error detection

Page 23: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

23Arrays

• An ordered collection of homogeneous data elements– An element is identified by its index

• Position relative to the first element

• Design issues– What types are legal for subscripts?– Is the range checked on subscript expressions in references?– When does binding of subscript ranges happen?– When does allocation take place?– What is the maximum number of subscripts?– Can array objects be initialized?– Are any kind of slices allowed?

Page 24: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

24Array Indexing

• Mapping from indices to elements – map (array_name, index_value_list) → elements

• Index Syntax– FORTRAN, PL/I, Ada: use parentheses– Most other languages use brackets

• Subscript types– FORTRAN, C: integer only– Pascal: any ordinal type (integer, boolean, char, enum)– Ada: integer or enum (includes boolean and char)– Java: integer types only

Page 25: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

25Static and Fixed Stack Dynamic Arrays

Based on subscript binding and binding to storage

1.Static– Static range of subscripts and storage bindings

• e.g. FORTRAN 77, some arrays in Ada

– Advantage:• time efficiency (no allocation or deallocation)

2.Fixed stack dynamic– Static range of subscripts, but – Storage is bound at elaboration time

• e.g. Most Java locals, and C non-static locals

– Advantage• space efficiency

Page 26: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

26Dynamic Arrays

3. Stack-dynamic– Subscript range and storage bindings are dynamic– But fixed for the variable’s lifetime– Advantage

• flexibility - size need not be known until the array is used

4. Heap-dynamic – Subscript range and storage bindings are dynamic – Not fixed

• e.g. declare in FORTRAN 90 a dynamic 2-dim array MAT

INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT

– APL, Perl, JavaScript: arrays grow, shrink as needed– Java: arrays are heap-dynamic objects

Page 27: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

27Array Dimensions and Initialization

• Number of subscripts / dimension– FORTRAN I allowed up to three– FORTRAN 77 allows up to seven– Other PLs: no limit

• Array Initialization– A list of values in the order of increasing subscripts– FORTRAN: DATA statement, or in / ... /– C, C++, Java: values in braces

int stuff [] = {2, 4, 6, 8};

– Ada positions for the values can be specifiedSCORE : array (1..14, 1..2) :=

(1 => (24, 10), 2 => (10, 7),

3 => (12, 30), others => (0, 0));

– Pascal: array initialization not allowed

Page 28: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

28Built-in Array Operations

• APL– many (text p. 240-241)

• Ada– Assignment

• to an aggregate constant • to an array name

– Concatenation for single-dimensioned arrays– Relational operators (= and != only)

• FORTRAN 90– A wide variety of intrinsic operations, e.g.

• matrix multiplication• vector dot product

Page 29: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

29Array Slices

• A slice is some substructure of an array– only a referencing mechanism

• Only useful in PLs with array operations• FORTRAN 90:

INTEGER MAT (1:4, 1:4)

MAT(1:4, 1) - the first column

MAT(2, 1:4) - the second row

2. Ada: single-dimensioned arrays onlyLIST(4..10)

Page 30: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

30Slices in FORTRAN 90

Page 31: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

31Implementation of Arrays

• Access function – Maps subscript expressions to an address in memory

• Row major order– By rows– Suppose array with n*m elements with subscripts starting at 0

• i.e. indices (0,0) (0,1) (0,2) … (m-1, n-2) (m-1,n-1)

– then A(i,j) = A(0,0) + (i * n + j) * element_size

• Column major order– By columns – Suppose array with n*m elements with subscripts starting at 0

• i.e. indices (0,0) (1,0) (2,0) … (m-2, n-1) (m-1,n-1)

– then A(i,j) = A(0,0) + (j * m + i) * element_size

Page 32: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

32Memory Access Function 0 … j … n-1

0

i

m-1

i*n

j

0 … j … n-1

0

i

m-1

i

j*m

Row major:Ai,j = A0,0+(i *n+j)*el_size

Column major:Ai,j = A0,0+(j*m+i)*el_size

Page 33: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

33Associative Arrays

• Associative array– An unordered collection of data elements– Indexed by keys– Implemented using hash tables

• Design Issues– What is the form of references to elements?– Is the size static or dynamic?

Page 34: 1 Data Types Primitive Data Types Character String Types User-Defined Ordinal Types Array Types Associative Arrays Record Types Union Types Pointer Types

34Associative Arrays in Perl

• Structure and Operations in Perl– Names begin with %

• e.g., declaration with initialization

%hi_temps = ("Monday"=>77, "Tuesday"=>79,…);

– Subscripting uses braces and keys • e.g., access using key

$hi_temps{"Wednesday"} = 83;

– Elements can be removed with delete • e.g., delete using key

delete $hi_temps{"Tuesday"};