Expressions and Statements
Prepared by
Manuel E. Bermúdez, Ph.D.
Associate Professor
University of Florida
Programming Language Concepts
Lecture 16
7 Categories of Control Constructs
1. Sequencing:
  • After A, execute B.
  • A block is a group of sequenced statements.
2. Selection:
  • Choice between two (or more) statements.
3. Iteration:
  • A fragment is executed repeatedly.
7 Categories of Control Constructs (cont’d)
4. Procedural abstraction:
  • Encapsulate a collection of control constructs in a single unit.
5. Recursion:
  • An expression defined in terms of a simpler version of itself.
6. Concurrency:
  • Two or more fragments executed at the same time.
7 Categories of Control Constructs (cont’d)
7. Non-determinacy:
  • Order is deliberately left unspecified, implying any alternative will work.
Expression Evaluation
• Expressions consist of:
  • A simple object, or
  • An operator or function applied to a collection of expressions.
• Structure of expressions:
  • Prefix (Lisp),
  • Infix (most languages),
  • Postfix (PostScript, Forth, some calculators).
Expression Evaluation (cont’d)
• By far the most popular notation is infix.
• Infix raises some issues.
  • Precedence: specify that some operators, in the absence of parentheses, group more tightly than other operators.
Expression Evaluation (cont’d)
• Associativity: tie-breaker for operators on the same level of precedence.
  • Left associativity: a+b+c evaluated as (a+b)+c
  • Right associativity: a+b+c evaluated as a+(b+c)
  • Different results may or may not accrue:
    • Generally: (a+b)+c = a+(b+c), but (a-b)-c <> a-(b-c)
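A minimal C sketch of why the grouping matters for subtraction. The function names are made up for illustration; C itself always uses the left-associative grouping for a - b - c.

```c
/* Left-associative grouping: what C does for a - b - c. */
int sub_left(int a, int b, int c)
{
    return (a - b) - c;
}

/* Right-associative grouping, shown only for comparison. */
int sub_right(int a, int b, int c)
{
    return a - (b - c);
}
```

With a = 10, b = 4, c = 3, the left grouping gives 3 and the right grouping gives 9.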
Expression Evaluation (cont’d)
• Evaluation order: specify the order in which operands are evaluated.
  • Generally left-to-right (Java), but in some languages the order is left unspecified (C).
Operators and Precedence in Various Languages
• C is operator-richer than most languages.
  • 17 levels of precedence.
  • Some not shown in figure:
    • type casts,
    • array subscripts,
    • field selection (.),
    • dereference and field selection: a->b, equivalent to (*a).b
• Pascal: <, <=, …, in (row 6)
Pitfalls in Pascal
if a < b and c < d then ...
is parsed as
if a < (b and c) < d then ...
This can only type-check if a, b, c, and d are all Booleans.
Pitfalls in C
a < b < c parsed as (a < b) < c,
yielding a comparison between (a < b) (0 or 1) and c.
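A small C sketch of this pitfall. The function names are hypothetical; the first function spells out the grouping C applies to a < b < c, and the second is what was probably meant.

```c
/* What C computes for a < b < c: the 0-or-1 result of a < b is compared with c. */
int chained_less(int a, int b, int c)
{
    return (a < b) < c;
}

/* The likely intent: a, b, c in strictly increasing order. */
int strictly_increasing(int a, int b, int c)
{
    return a < b && b < c;
}
```

For a = 3, b = 2, c = 1 the chained form yields 1 (because 0 < 1), while the intended test yields 0.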
Assignments
• Functional programming:
  • We return a value for surrounding context.
  • Value of expression depends solely on referencing environment, not on the time at which the evaluation occurs.
  • Expressions are "referentially transparent."
Assignments (cont’d)
• Imperative:
  • Based on side-effects.
  • Side-effects influence subsequent computation.
  • Distinction between:
    • Expressions (return a value)
    • Statements (no value returned, done solely for the side-effects).
Variables
• Can denote a location in memory (l-value)
• Can denote a value (r-value)
• Typically,
2+3 := c;
is illegal, as well as
c := 2+3;
if c is a declared constant.
Variables (cont’d)
• Expression on left-hand-side of assignment can be complex, as long as it has an l-value:
(f(a)+3)->b[c] = 2; in C.
• Here we assume f returns a pointer to an array of elements, each of which is a structure containing a field b, an array. Entry c of b has an l-value.
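The assignment above can be tried out with a sketch like the following. The type elem, the array table, the sizes, and the body of f are all assumptions made for illustration; only the assignment itself comes from the slide.

```c
#define ROWS 8
#define COLS 4

/* Hypothetical element type: a structure containing an array field b. */
struct elem {
    int b[COLS];
};

static struct elem table[ROWS];

/* Hypothetical f: returns a pointer to the first element of an array of
   structures. The argument is ignored in this sketch. */
struct elem *f(int a)
{
    (void)a;
    return table;
}

/* Performs the assignment from the slide and returns the stored value. */
int store_two(int a, int c)
{
    (f(a) + 3)->b[c] = 2;   /* the left-hand side is complex but has an l-value */
    return table[3].b[c];
}
```

Here (f(a) + 3) points at the fourth structure, so the assignment writes to table[3].b[c].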
Referencing/Dereferencing
• Consider
b := 2;
c := b;
a := b + c;
• Two readings: value model vs. reference model (figure omitted).
Referencing/Dereferencing (cont’d)
• Pascal, C, C++ use the "value model":
  • Store 2 in b.
  • Copy the 2 into c.
  • Access b, c, add them, store in a.
• Clu uses the "reference model":
  • Let b refer to 2.
  • Let c also refer to 2.
  • Pass references b, c to "+", let a refer to result.
Referencing/Dereferencing (cont’d)
• Java uses the value model for intrinsic types (int, float, etc.) (could change soon !), and the reference model for user-defined types (classes).
Orthogonality
• Features can be used in any combination
• Every combination is consistent.
• Algol 68 was the first language to make orthogonality a major design goal.
Orthogonality In Algol 68
• Expression oriented; no separate notion of statement.
begin
    a := if b < c then d else e;
    a := begin f(b); g(c) end;
    g(d);
    2+3
end
Orthogonality In Algol 68 (cont’d)
• Value of 'if' is either expression (d or e).
• Value of 'begin-end' block is value of last expression in it, namely g(c).
• Value of g(d) is obtained, and discarded.
• Value of entire block is 5.
Orthogonality In Algol 68 (cont’d)
• C does this as well:
  • Value of assignment is value of right-hand-side:
c = b = a++;
Pitfall in C
if (a=b) { ... }
/* assign b to a and proceed if result is nonzero */
• Some C compilers warn against this.
• Different from
if (a==b) { ... }
• Java has a separate boolean type:
  • prohibits using an int as a boolean.
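A short C sketch contrasting the two conditions. The function names are hypothetical. The doubled parentheses in the first function are a common way to tell a compiler that the assignment is deliberate.

```c
/* Assignment used as a condition: stores b in *a, then tests the stored value. */
int assign_and_test(int *a, int b)
{
    if ((*a = b)) {
        return 1;
    }
    return 0;
}

/* Equality test: *a is left unchanged. */
int compare(const int *a, int b)
{
    if (*a == b) {
        return 1;
    }
    return 0;
}
```

Calling assign_and_test with b = 0 overwrites *a with 0 and takes the false branch, which is rarely what the author of if (a=b) had in mind.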
Initialization
• Not always provided (there is assignment).
• Useful for 2 reasons:
1. Static allocation:
  • compiler can place value directly into memory.
  • No execution time spent on initialization.
2. Variable not initialized is a common error.
Initialization (cont’d)
• Pascal has NO initialization.
  • Some compilers provide it as an extension.
  • Not orthogonal, provided only for intrinsics.
• C, C++, Ada allow aggregates:
  • Initialization of a user-defined composite type.
Example: (C)
int a[] = {2,3,4,5,6,7};
• Rules for mismatches between declaration and initialization:
int a[4] = {1,2,3}; /* rest filled with zeroes */
int a[4] = {0}; /* filled with all zeroes */
int a[4] = {1,2,3,4,5,6,7}; /* oops! */
• Additional rules apply for multi-dimensional arrays and structs in C.
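The first two rules can be checked with a sketch like this one. The function names are made up; the initializers are the ones from the slide.

```c
/* Copies a partially initialized array into out[0..3]. */
void partial_init(int out[4])
{
    int a[4] = {1, 2, 3};   /* a[3] is implicitly zero */
    for (int i = 0; i < 4; i++) {
        out[i] = a[i];
    }
}

/* Returns the element count the compiler inferred for an unsized array. */
int inferred_length(void)
{
    int a[] = {2, 3, 4, 5, 6, 7};
    return (int)(sizeof a / sizeof a[0]);
}
```

The third case, with more initializers than elements, is rejected or diagnosed by the compiler, so it is left out of the sketch.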
Uninitialized Variables
• Some Pascal implementations provide default values (e.g. zero for integers), but standard Pascal leaves uninitialized variables undefined.
• C guarantees zero values only for static variables, *garbage* for everyone else !
Uninitialized Variables (cont’d)
• C++ distinguishes between:
  • initialization (invocation of a constructor, no initial value is required)
    • Crucial for user-defined ADT's to manage their own storage, along with destructors.
  • assignment (explicit)
Uninitialized Variables (cont’d)
• Difference between initialization and assignment: variable length string:
  • Initialization: allocate memory.
  • Assignment: deallocate old memory AND allocate new.
Uninitialized Variables (cont’d)
• Java uses reference model, no need for distinction between initialization and assignment.
• Java requires every variable to be "definitely assigned" before using it in an expression.
  • Definitely assigned: every execution path assigns a value to the variable.
Uninitialized Variables (cont’d)
• Catching uninitialized variables at run-time is expensive.
  • Hardware can help, detecting special values, e.g. "NaN" in the IEEE floating-point standard.
  • May need extra storage, if all possible bit patterns represent legitimate values.
Combination Assignment Operators
• Useful in imperative languages, to avoid repetition in frequent updates:
a = a + 1;
b.c[3].d = b.c[3].d * 2; /* ack ! */
• Can simplify:
++a;
b.c[3].d *= 2;
Combination Assignment Operators (cont’d)
• Syntactic sugar for often used combinations.
• Useful in combination with autoincrement operators:
A[--i]=b; equivalent to A[i -= 1] = b;
Combination Assignment Operators (cont’d)
*p++ = *q++; /* ++ has higher precedence than * */
equivalent to
*(t=p, p += 1, t) = *(t=q, q += 1, t);
• Advantage of autoincrement operators:
  • Increment is done in units of the (user-defined) type.
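A minimal sketch of the idiom in use. The function name copy_ints is an assumption; the loop body is the statement from the slide. Because p and q are pointers to int, each ++ advances by one int rather than one byte.

```c
#include <stddef.h>

/* Copies n ints from q to p using the *p++ = *q++ idiom. */
void copy_ints(int *p, const int *q, size_t n)
{
    while (n-- > 0) {
        *p++ = *q++;   /* copy one element, then advance both pointers */
    }
}
```

The same loop would work unchanged for pointers to a structure type, with each increment stepping over one whole structure.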
Comma Operator
• In C, merely a sequence:
int a=2, b=3;
a,b = 6; /* now a=2 and b=6 */
int a=2, b=3;
a,b = 7,6; /* now a=2 and b=7 */
/* = has higher precedence than , */
Comma Operator
• In Clu, "comma" creates a tuple:
a,b := 3,4 assigns 3 to a, 4 to b
a,b := b,a swaps them !
• We already had that in RPAL:
let t=(1,2)
in (t 2, t 1)
Ordering within Expressions
• Important for two reasons:
1. Side effects:
  • One subexpression can have a side effect upon another subexpression:
(b = ++a + a--)
2. Code improvement:
  • Order of evaluation has an effect on register/instruction scheduling.
Ordering within Expressions (cont’d)
• Example: a * b + f(c)
  • Want to call f first, avoid storing (using up a register) for a*b during call to f.
Ordering within Expressions (cont’d)
• Example: a := B[i];
c := a * 2 + d * 3;
• Want to calculate d * 3 before a * 2: Getting a requires going to memory (slow); calculating d * 3 can proceed in parallel.
Ordering within Expressions (cont’d)
• Most languages leave subexpression order unspecified (Java is a notable exception, uses left-to-right)
• Some will actually rearrange subexpressions.
Example (Fortran)
a = b + c
c = c + e + b
rearranged as
a = b + c
c = b + c + e
and then as
a = b + c
c = a + e
Rearranging Can Be Dangerous
• If a,b,c are close to the precision limit (say, about ¾ of largest possible value), then
a + b - c will overflow, whereas
a - c + b will not.
• Safety net: most compilers guarantee to follow ordering imposed by parentheses.
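The overflow claim can be checked without actually overflowing. The sketch below is one way to do it, assuming long long is wider than int on the platform; both function names are made up for illustration.

```c
#include <limits.h>

/* Returns 1 if x + y would not fit in an int. The sum is formed in
   long long (assumed wider than int), so the check itself cannot overflow. */
int add_overflows(int x, int y)
{
    long long sum = (long long)x + (long long)y;
    return sum > INT_MAX || sum < INT_MIN;
}

/* About three quarters of the largest int, as in the slide. */
int three_quarters_max(void)
{
    return INT_MAX / 4 * 3;
}
```

With a, b, and c all equal to three_quarters_max(), the first step of a + b - c overflows, while a - c is 0 and adding b to it stays in range.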
Short Circuit Evaluation
• As soon as we can conclude outcome of the evaluation, skip the remainder of it.
• Example (in Java):
if (list != null && list.size() != 0) System.out.println(list.size());
• Will never throw null pointer exception
Short Circuit Evaluation (cont’d)
• Can't do this in Pascal:
if (list <> nil) and (list^.size <> 0)
• will evaluate list^.size even when list is nil.
• Cumbersome to do it in Pascal:
if list <> nil then
    if list^.size <> 0 then
        writeln(list^.size);
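C behaves like Java here: its && operator short-circuits. A minimal sketch, with a hypothetical list type that has only a size field:

```c
#include <stddef.h>

/* Hypothetical list type with a size field. */
struct list {
    int size;
};

/* && never evaluates l->size when l is NULL. */
int is_nonempty(const struct list *l)
{
    return l != NULL && l->size != 0;
}
```

Passing NULL is safe because the right operand is skipped as soon as the left one is false.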
Short Circuit Evaluation (cont’d)
• So, is short-circuit evaluation always good?
• Not necessarily.
Short Circuit Evaluation (cont’d)
• Here, the idea is to tally AND to spell-check every word, and print the word if it's misspelled and has appeared for the 10th time.
• If the 'and' is short-circuit, the program breaks.
• Some languages (Clu, Ada, C) provide BOTH short-circuit and non short-circuit Boolean operators.
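The code this example refers to is not in the transcript. The C sketch below is a reconstruction under assumptions: tally and misspelled are hypothetical functions, and their only side effect here is a call counter, standing in for updating a word count and running a spell check.

```c
static int tally_calls = 0;
static int spell_calls = 0;

/* Hypothetical: records the word and reports whether this is the 10th
   appearance. In this sketch it only counts calls. */
int tally(const char *word)
{
    (void)word;
    tally_calls++;
    return tally_calls == 10;
}

/* Hypothetical: spell-checks the word; the side effect is the call counter. */
int misspelled(const char *word)
{
    (void)word;
    spell_calls++;
    return 1;
}

/* Short-circuit &&: misspelled() is skipped whenever tally() returns 0. */
int check_short_circuit(const char *word)
{
    return tally(word) && misspelled(word);
}

/* Non-short-circuit &: both operands are always evaluated
   (the order of the two calls is unspecified in C). */
int check_both(const char *word)
{
    return tally(word) & misspelled(word);
}
```

After ten calls to check_short_circuit the spell checker has run once; after ten calls to check_both it has run ten times, which is what the example needs.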
Structured Programming
• Federal Law: Abandon Goto's !
• Originally, Fortran had goto's:
if a .lt. b goto 10
...
10
Structured Programming (cont’d)
• Controversy surrounding Goto's:
• Paper (letter to the editor, Communications of the ACM) in 1968 by E. W. Dijkstra:
  • "Go To Statement Considered Harmful"
  • argument: Goto's create "spaghetti code".
Structured Programming (cont’d)
• Legacy: structured programming: use of
  • sequencing (;)
  • alternation (if)
  • iteration (while)
• Sufficient to solve any problem.
Structured Programming (cont’d)
• Part of focus on *control* during first 40 years in programming.
• During 80's, 90's and beyond, focus shifted to *data* (OO-programming)
Structured Programming (cont’d)
• Common (former) use of goto: break out of loop(s), maybe deeply nested:
while true do begin
if (...) then goto 100;
end;
100: ...
Structured Programming (cont’d)
• In C, this can be accomplished using a 'break' statement, but consider this ...
while (...) {
    switch (...) {
        ...
        goto loop_done; /* break won't do */
    }
}
loop_done: ...
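A runnable sketch of the same shape. The function find_stop and its meaning (stop at the first zero) are assumptions chosen to give the elided parts something concrete to do.

```c
/* Scans codes[0..n-1]; returns the index of the first code equal to 0,
   or n if there is none. A break inside the switch would only leave the
   switch, so goto is used to leave the loop. */
int find_stop(const int *codes, int n)
{
    int i = 0;
    while (i < n) {
        switch (codes[i]) {
        case 0:
            goto loop_done;   /* break won't do */
        default:
            i++;
            break;            /* leaves the switch only */
        }
    }
loop_done:
    return i;
}
```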
Structured Programming (cont’d)
• Today, we use *exceptions*.
  • Exception: Upon a certain (error) condition, allows a program to back out of nested context to some point where it can recover and proceed.
  • Requires unwinding of the stack frame.
  • More later.
Structured Programming (cont’d)
• Semantically, goto's are *very* difficult to understand and implement correctly.
• Some (circumstantial) evidence:
  • RPAL, LPAL, JPAL (JPAL: PAL with jumps)
  • JPAL by far the hardest of the three to describe.
Structured Programming (cont’d)
• When executing a jump, we might be:
  • exiting one or more procedure calls.
  • exiting many nested loops.
  • diving into the middle of a procedure.
  • diving into the middle of a loop.
• What happens to the stack ???
Structured Programming (cont’d)
• Goto's in general described using *continuations*:
• A continuation captures the context (state) in which execution might continue.
• Continuations essential to denotational semantics (more later).
Statement Sequencing
• Basic assumption:
  • A sequence of statements will have side effects.
  • Not always desirable; easier to reason about (prove correct) programs in which functions have no side effects.
• Sometimes side-effects are *very* desirable. Example: rand() function.
  • want it to produce a different number each time it's called.
Statement Selection
• Most languages use a variant of the original if...then...else introduced in Algol 60:
if condition then statement
else if condition then statement
else if condition then statement
...
else statement
Statement Selection (cont’d)
switch (condition) {
    case a: block_a;
    case b: block_b;
    ...
    default: block_c
}
is often syntactic sugar for
if (condition == a) block_a
else if (condition == b) block_b
...
else block_c
Statement Selection (cont’d)
• Some languages require explicit break statements between cases; otherwise control falls through into the cases that follow (e.g. C).
Short-Circuited Conditions
• Design goal: implement if’s efficiently.
• Jump code: efficient organization of code to take advantage of short-circuited boolean expressions.
• Value of expression never stored in a register.
Short-Circuited Conditions (cont’d)
• If the value of the entire expression is needed, we can still use jump code.
• Example (Ada):
found := p /= null and then p.key = val;
equivalent to
if p /= null and then p.key = val then
    found := true;
else
    found := false;
end if;
Short-Circuited Conditions (cont’d)
• Jump code:
    r1 := p
    if r1 = 0 goto L1
    r2 := r1->key
    if r2 <> val goto L1
    r1 := 1
    goto L2
L1: r1 := 0
L2: found := r1
• The target of the first jump could be L2, since r1 is already 0 at that point. Better to perform that improvement in a code optimizer.
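The jump code can be mirrored almost line for line in C with goto. This is only a sketch: the node type and the function name are assumptions, and a C compiler would normally generate such code from the && form on its own.

```c
#include <stddef.h>

/* Hypothetical node type with a key field. */
struct node {
    int key;
};

/* Mirrors the jump code: the Boolean result is produced by control flow
   rather than by combining stored truth values. */
int found_jump(const struct node *p, int val)
{
    int r1;
    if (p == NULL) goto L1;
    if (p->key != val) goto L1;
    r1 = 1;
    goto L2;
L1:
    r1 = 0;
L2:
    return r1;
}
```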
Case/Switch Statements
• Alternative syntax for nested if...then...else statements. Example:
i := (* potentially complicated expression *)
if i = 1 then clause_A
else if (i = 2) or (i = 7) then clause_B
else if (i >= 3) and (i <= 5) then clause_C
else if i = 10 then clause_D
else clause_E
Corresponding CASE statement
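The CASE statement itself did not survive in the transcript. As a stand-in, here is how the same selection might look as a C switch. The function name and the letter return values are assumptions, used so the chosen clause can be observed. Standard C has no range labels, so 3..5 is written out.

```c
/* Returns a letter naming the clause selected for i. */
char select_clause(int i)
{
    switch (i) {
    case 1:
        return 'A';
    case 2: case 7:
        return 'B';
    case 3: case 4: case 5:
        return 'C';
    case 10:
        return 'D';
    default:
        return 'E';
    }
}
```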
Case/Switch Statements (cont’d)
• Purpose of case statement is not only syntactic elegance, but efficiency. Wish to *compute* the address to which to branch.
• So, list ten cases (range of values tested):
  • Store addresses starting at location T (jump table).
  • Calculate r1 (the test expression value).
  • First test for r1 out of range 1..10.
  • Subtract 1 from r1, obtaining an offset (0..9).
  • Get address T[r1], store in r2.
  • Branch to (indirect) r2.
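These steps can be imitated in C with a table of function pointers. This is a sketch under assumptions: the arm functions and their letter results are hypothetical, and a real compiler would store code addresses rather than call functions. The names T, r1, and r2 follow the steps above.

```c
/* Hypothetical arms; each returns a letter so the dispatch can be observed. */
static char clause_a(void) { return 'A'; }
static char clause_b(void) { return 'B'; }
static char clause_c(void) { return 'C'; }
static char clause_d(void) { return 'D'; }
static char clause_e(void) { return 'E'; }

typedef char (*arm_fn)(void);

/* T: one entry per value in 1..10, stored at offsets 0..9. */
static const arm_fn T[10] = {
    clause_a,                       /* 1 */
    clause_b,                       /* 2 */
    clause_c, clause_c, clause_c,   /* 3..5 */
    clause_e,                       /* 6 */
    clause_b,                       /* 7 */
    clause_e, clause_e,             /* 8, 9 */
    clause_d                        /* 10 */
};

char dispatch(int r1)
{
    if (r1 < 1 || r1 > 10) {
        return clause_e();          /* out of range: default arm */
    }
    r1 -= 1;                        /* offset 0..9 */
    arm_fn r2 = T[r1];              /* fetch the address from the table */
    return r2();                    /* indirect branch */
}
```

The cost is one range check and one indexed load, no matter how many labels there are.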
Case/Switch Statements (cont’d)
• Advantages: fast, occupies reasonable space if labels are dense.
• Disadvantage: can occupy enormous amounts of space if values are not dense (e.g. 1, 3..5, 50000..50003)
Case/Switch Statements (cont’d)
• Variations:
  • Use hash table for T.
    • Good idea if total range is large, many missing values, and no large value ranges.
    • Requires a separate entry for each possible value.
  • Use binary search for table T.
    • Good idea if value ranges are large; runs in O(log n) time.
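A sketch of the binary-search variation, using the sparse label set 1, 3..5, 50000..50003 mentioned earlier. The struct layout, the arm numbers, and the function name are assumptions made for illustration.

```c
#include <stddef.h>

/* One table entry per label range lo..hi, sorted by lo.
   arm is a hypothetical arm number. */
struct case_range {
    int lo;
    int hi;
    int arm;
};

static const struct case_range ranges[] = {
    {1, 1, 0},
    {3, 5, 1},
    {50000, 50003, 2}
};

/* Returns the arm for value, or -1 (the default) if no range contains it. */
int find_arm(int value)
{
    size_t low = 0;
    size_t high = sizeof ranges / sizeof ranges[0];
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        if (value < ranges[mid].lo) {
            high = mid;
        } else if (value > ranges[mid].hi) {
            low = mid + 1;
        } else {
            return ranges[mid].arm;
        }
    }
    return -1;
}
```

The table has three entries here, where a jump table over the same labels would need more than 50000.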
Case/Switch Statements (cont’d)
• Combining techniques:
  • Compilers usually generate code for each arm, building up knowledge of the label set.
  • Then use knowledge to choose strategy (binary search or hash table).
• Less sophisticated compilers often generate poor code; programmer must restructure case statement to prevent huge tables, or very inefficient code.
Case/Switch Statements (cont’d)
• Pascal, C don't allow ranges (avoid binary search).
• Standard Pascal doesn't allow a default clause:
  • Run-time semantic error if no case matches expression.
  • Many Pascal compilers *do* allow it as an extension.
Case/Switch Statements (cont’d)
• Modula provides an optional ELSE clause.
• Ada requires labels to cover *ALL* values in the domain of the type of the expression. Ranges and an *others* clause are allowed.
• C, Fortran 90: OK for expression to match no value: statement does nothing.
Case/Switch Statements (cont’d)
• C is different in other respects:
  • A label can have an empty arm,
  • Control "falls" through to next label.
  • Effectively allows lists of values.
  • Example:
switch (grade) {
    case 10: case 9: case 8: case 7:
        printf("Pass"); break;
    default:
        printf("Fail"); break;
}
Case/Switch Statements (cont’d)
•'break' needed to prevent fallthrough.
• If a value matches test expression, fall-through takes place i.e. no more comparisons.
Case/Switch Statements (cont’d)
• Example:
switch (grade) {
    case 10: case 9: case 8: case 7:
        num_pass++;
    case 6:
        borderline++;
    case 0: case 1: case 2: case 3: case 4: case 5:
        fail++;
    default:
        total++; break;
}
• In C, a forgotten break can be a difficult bug to find.
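For contrast, a sketch of the tally with the breaks in place. It assumes the intent was to count each grade in exactly one category and to count every grade in the total. The struct and function names are hypothetical.

```c
/* Hypothetical counters matching the names in the example. */
struct tallies {
    int num_pass;
    int borderline;
    int fail;
    int total;
};

/* Counts one grade in exactly one category; every arm ends in break. */
void count_grade(struct tallies *t, int grade)
{
    switch (grade) {
    case 10: case 9: case 8: case 7:
        t->num_pass++;
        break;
    case 6:
        t->borderline++;
        break;
    case 0: case 1: case 2: case 3: case 4: case 5:
        t->fail++;
        break;
    default:
        break;          /* out-of-range grade: no category */
    }
    t->total++;
}
```

In the version without breaks, a grade of 9 would also bump borderline, fail, and total.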