chapter 2 types, operators and expressions · 2021. 1. 14. · chapter 2 types, operators and...
Chapter 2
Types, operators and expressions
Since Java is sort of based on C, you should
already know many of these things. We will
simply focus on those that are unique in C.
There are essentially four different basic types
in C: char refers to a single byte, capable of
holding one character in the local character
set; int refers to an integer, typically reflect-
ing the “natural” size of integers on the host
machine. This is different from Java, which
specifies four bytes for an int variable. It is
indeed four bytes in turing as well.
The type float refers to a single-precision float-
ing point number; and double refers to a double-
precision floating point. We usually use double
to serve such a purpose.
1
A bit more...
We can also apply a few qualifiers to these basic types. For example,
short int sh;
long int counter;
where the word int can be dropped.
Although the size of int is flexible, short is often 16 bits and long is at least 32 bits. In other words, where long is 32 bits, long in C is equivalent to int in Java.
We can also use signed and unsigned to qualify the char type and any int types.
The value of an unsigned type cannot be negative. For example, if char has 8 bits, then the value of an unsigned char variable ranges between 0 and 255, but the value of a signed char ranges from -128 to 127.
We will see important applications of unsigned later on Page 20.
2
The type long double specifies extended-precis-
ion floating point numbers. Again, the size of
such a type, just as that for other floating-
point objects, really depends on a particular
implementation of the language in a host ma-
chine. As a result, we can only say float,
double and long double could lead to one, two,
or three distinct sizes.
The header files <limits.h> and <float.h> contain
symbolic constants for the sizes of all
these types. Check out the links as included in
the course page.
All these hold because C is a pre-Internet language,
without a universally agreed standard for
implementation.
Labwork: Have a look at Appendix B11 in the
textbook, and write a program to determine
the ranges of char, short, int and long vari-
ables, both signed and unsigned of the turing
version of C.
3
Constants
An integer like 12345 is an int. A long constant
is written with a terminating l or L,
e.g., 123456789L. An unsigned constant is indicated
with a terminating u or U, so that proper storage
can be allocated to hold it.
Just like in Java, a floating-point constant is
taken as a double quantity.
A value can be written in octal (base 8), using a
leading 0 (zero), or in hexadecimal (base 16), using
a leading 0x or 0X, so that proper interpretation can be made.
A character constant is an integer, written as
one character within single quotes, e.g., 'a', with
its value being the numerical value of that
character in the machine’s character set. For
example, in the ASCII set, the decimal value of
'0' is 48, while its hexadecimal value is 0x30.
We obviously should use '0' instead of 48 for
ease of reading.
4
Special characters
We have already mentioned that certain characters are represented with escape sequences, e.g., '\n'
refers to new line. The following won’t compile, as we saw earlier.
#include <stdio.h>
int main() {
printf("Hello,
world!"); }
But, this will.
#include <stdio.h>
int main() {
printf("Hello,%cworld!", 0x0A);}
What is this “0x0A”? Nothing but “\n”.
#include <stdio.h>
int main() {
printf("Hello,\nworld!");}
Check out all the ASCII codes... .
5
In different bases
We can also represent an arbitrary byte-sized
bit pattern with the format of ‘\ooo’, each an
octal digit, where the first ‘o’ is between 0 and
3, and the other two between 0 and 7 (?);
or by '\xhh', where each h is a hexadecimal digit
between 0 and F (=15). For example
#define VTAB '\013'
#define BELL '\007'
can also be represented, in hexadecimal, as
#define VTAB '\xB'
#define BELL '\x7'
The character constant ‘\0’ represents the char-
acter with value 0, and ‘\012’=‘00001010’, i.e.,
‘\xA’ in hex, or 0xA, a number.
Question: What does ‘\xA’ represent? How
about ‘\u00C1’? Let’s run display.c and have
a look at the escape sequence stuff... .
6
Constant expressions
A constant expression only contains constants, e.g.,
#define LEAP 1
int daysOfFirstThreeMonth[31+28+LEAP+31];
A string constant, or a literal, is a sequence of 0 or more characters within double quotes, e.g., "I am a string" or "", the empty string.
String constants can be concatenated at compile time, e.g., "hello," " world" is equivalent to "hello, world".
Question: Is 'x' the same as "x"?
Hint: How many characters does each of them contain?
Question: Is “if c=="a"” correct, where c is of char type?
Answer: Check out charCompare.c...
7
What’s special about a string?
As we mentioned in C Core, a string is really
an array of characters with an extra ‘\0’ at the
end. Thus, the space required by such an array
is one more than the length of the string.
Below shows the gist of strlen(), which goes
through the whole string to decide its length.
int strlen1(char s[]){
    int i;
    i=0;
    while(s[i]!='\0')
        ++i;
    return i;
}
Questions: What do we get for strlen1("abc")?
What does its output tell us of a string s?
The strlen() function and a bunch of others
are declared in the <string.h> header file.
8
Enumeration
An enumeration is a list of constant integer
values, e.g.,
enum boolean {NO, YES};
The first value in such a list is 0, the second
is 1, etc., unless specified otherwise. For ex-
ample,
enum months {JAN=1, FEB, MARCH};
then, FEB stands for 2, etc..
Self study Section 2.4 through Section 2.6 on
declarations, and various operators. Pretty similar
to the Java stuff...
9
Type conversions
We often ran into such issues in Java pro-
gramming. Essentially, when an operator has
operands of different types, they have to be
converted to a common type according to cer-
tain rules. In general, a smaller size is con-
verted to a bigger one.
For example, in f+i, where f is a floating-point
number and i is an integer, i will be converted
to a floating-point number before the addition
is done. For example,
celsius=5*(fahr-32)/9.0;
Notice that, when used as a truth value in
a boolean condition of, e.g., if, for, while,
etc., 0 means false, and anything else means
true.
This is not the same as the boolean type that
you used in Java. Check out the course page
to refresh your memory... .
10
What a character!
A char quantity can be interpreted as a small
integer, taking a byte, thus characters can be
used in any arithmetic expression.
int atoi(char s[]){
    int i, n;
    n=0;
    for (i=0; s[i]>='0'&&s[i]<='9'; ++i)
        n=10*n+(s[i]-'0');
    return n;
}
The following converts an upper case letter to
a lower case one.
int lower(int c){
    if(c>='A' && c<='Z')
        return c+'a'-'A';
    else
        return c;
}
11
Could Dr. Shen lie again?
% > more testATOI.c
#include <stdio.h>
//#include <stdlib.h>
int atoi1(char s[]);
int main(){
    char num1[]={'1', '2', '3', '\0'};
    char num2[]="456";
    printf("num1=%d\n", atoi1(num1));
    printf("num2=%d\n", atoi1(num2));
}
int atoi1(char s[]){
    int n=0;
    int i=0;
    for(i=0; s[i]>='0'&&s[i]<='9'; i++)
        n=10*n+(s[i]-'0');
    return n;
}
% cc testATOI.c
% ./a.out
num1=123
num2=456
How about the HEX case?
12
/home/zshen > more upperToLow.c
#include <stdio.h>
int lower(int c);
int main(){
    char testC = 'A';
    char testD = 'a';
    printf("Lower case of %c is %c\n", testC, lower(testC));
    printf("Lower case of %c is %c\n", testD, lower(testD));
}
int lower(int c){
    if(c>='A' && c<='Z')
        return c+'a'-'A';
    else
        return c;
}
/home/zshen > cc upperToLow.c
/home/zshen > ./a.out
Lower case of A is a
Lower case of a is a
Guess not...
Always test programs out...
13
Labwork
Write a function any(s1, s2) that returns the
first location in the string s1 where any char-
acter in a string s2 occurs, or -1 if nothing in
s2 occurs anywhere in s1.
Send in the whole program, together with sam-
ple outputs for both the case when some letter
in s2 occurs in s1, and none in s2 can be found
in s1.
For example, any("This is fun", "fin") returns
2, (‘f’ occurs in position 8, ‘i’ occurs in 2, and
‘n’ in 10), while any("This is fun", "dead") re-
turns -1.
14
Bitwise operators
I am afraid we did not go through this part
in previous programming courses. But, you
should have learned their concepts in CS 2010
and/or CS 2220, just like the binary addition
(CS 3221 Exercise 2.1-4)?
C is often used as a system programming lan-
guage when it has to deal directly with the
registers.
To serve this purpose well, C provides six log-
ical operators for bit level manipulation.
Question: What are they?
15
Here they go...
&   bitwise AND
|   bitwise inclusive OR
^   bitwise exclusive OR
~   one’s complement
<<  left shift
>>  right shift
The first four logic operations are defined via
the following truth tables.
a  b  a & b  a | b  ~a  a ^ b
0  0    0      0     1    0
0  1    0      1     1    1
1  0    0      1     0    1
1  1    1      1     0    0
16
One’s complement...
The operator ‘~’ simply inverts, i.e., flips, the
bit string representation of its argument. For
example, given the following declaration
int a=70707;
its binary representation in four bytes, in turing,
is the following:
00000000 00000001 00010100 00110011
and the binary representation of “~a” is the
following:
11111111 11111110 11101011 11001100
Finally, to get the 2’s complement of ‘a’, we
simply add a 1 to “~a”:
11111111 11111110 11101011 11001101
17
An application
Question: How to set the last six bits of x to
0, while keeping the other bits intact?
Answer: x = x & ~077;
The bit representation of 33333:
00000000 00000000 10000010 00110101
The bit pattern of “077”:
00000000 00000000 00000000 00111111
The bit pattern of “~077”:
11111111 11111111 11111111 11000000
Here is the result of “33333&~077”:
00000000 00000000 10000010 00000000
Every bit in 33333 will be the same, except the
rightmost six bits are set to 0.
Let’s run start.c
18
... and two’s complement
Two’s complement changes a nonnegative num-
ber into its arithmetic negation.
Given the binary representation of 7:
00000000 00000111
its one’s complement is
11111111 11111000
and its two’s complement is
11111111 11111001,
representing -7. For example,
15 − 7 = 15 + (−7) = 8
    0 0 0 0 1 1 1 1    (15)
+   1 1 1 1 1 0 0 1    (−7)
  1 0 0 0 0 1 0 0 0    (8, once the carry out of the leftmost bit is dropped)
19
An example
Given the following declaration:
int a=33333, b=-77777;
Exp      Binary                                Value
a        00000000 00000000 10000010 00110101   33333
b        11111111 11111110 11010000 00101111   -77777
a&b      00000000 00000000 10000000 00100101   32805
a^b      11111111 11111110 01010010 00011010   -110054
a|b      11111111 11111110 11010010 00111111   -77249
~(a|b)   00000000 00000001 00101101 11000000   77248
On the other hand, given the following:
char c='Z';
Exp    Binary                                Value
c      01011010                              unshifted
c<<1   00000000 00000000 00000000 10110100   left shifted 1
c<<4   00000000 00000000 00000101 10100000   left shifted 4
Question: Is Dr. Shen lying again?
Let’s check it out with bitPrint.c
20
Application is the king!
In general, the bitwise AND (&) can be used to
mask off some set of bits. For example,
n=n & 0177
will set everything but the last 7 bits of n to
0.
The bitwise OR (|) turns bits on. For example,
x=x|0007
will set the last three bits to 1, no matter
what their current values are.
The bitwise exclusive OR (^) sets a 1 in each
position where the two operands have different
bit values, and to 0 when they agree.
Question: Is 1 && 2 the same as 1 & 2?
21
Shifts
The shift operators shift the operand left or
right for certain number of positions. For ex-
ample, “x << 2” shifts the value of x left by
two positions, and always fills the vacant bits
with 0, thus effectively multiplying x by 4.
Right shifting an unsigned value always fills the
vacated bits with 0. For a signed quantity, it
fills with the sign bit (arithmetic shift) on some
machines, and with 0 (logical shift) on
others.
Question: What to do?
Answer: Use unsigned with bitwise right shift
to avoid confusion...
22
Another example
We will go through a function that prints out
the bit representation of an integer, which we
already used in bitPrint.c.
/* Bit print an int expression */
void bitPrint(int v){
    int i, mask=1<<31; /* mask=100...0 */
    for(i=1; i<=32; ++i){
        putchar(((v & mask)==0) ? '0': '1');
        v <<=1;
        if(i%8==0&&i!=32)
            putchar(' ');
    }
}
Question: Should we use unsigned int instead?
Answer: No, since we just print out the
current bits and we only use “<<”. Let’s see what
bitPrint.c does with 33333, and what it prints
for -77777.
23
Cut it open...
• int i, mask=1<<31;
mask starts with a 1 in the least significant
bit, i.e., the one on the rightmost position.
With four bytes long, after shifting to the
left for 31 positions, its most significant,
the leftmost, bit is set to 1.
10000000 00000000 00000000 00000000
• putchar(((v & mask)==0) ? '0': '1');
If the high-order bit of v is off, then the
test of (v & mask)==0 is true, hence a ‘0’
will be printed. Otherwise, a ‘1’ will be
printed.
24
• v <<=1;
The above is the same as v = v<<1; which
is shifted 1 position to the left. For the
first time, the second high-order bit will be
shifted over to the highest-order, the most
significant bit, or the leftmost bit.
• if(i%8==0&&i!=32) putchar(' ');
This piece prints out a blank after each
group of 8 bits are printed. Since there is
no need to print out a blank after the last
digit, we put in the condition about 32.
Hence, bitPrint(33333) will print out the following:
00000000 00000000 10000010 00110101
And bitPrint(-77777) will print out the following:
11111111 11111110 11010000 00101111
25
Precedence and order
Precedence of the operators specifies the or-
der of their evaluation, as shown in Table 2-
1 on Page 53 of the textbook, or the one on
the course page, where the operations of the
higher precedence come earlier in the table.
Those placed in the same row have the same
precedence among each other. You can use
parentheses to change the order.
Given the pattern ~(~0 << n), since ‘~’ has a
higher precedence than ‘<<’, and given the
parentheses, for n = 3, the order of the operations
will be the following:
1. 00000000 00000000 flips to 11111111 11111111
2. 11111111 11111111 shifts to 11111111 11111000
3. 11111111 11111000 flips to 00000000 00000111
26
An application
Question: How many bits of a bit string x
should we shift to the right, so a segment of n bits with its leftmost bit located at position p
will become the rightmost n bits?
Notice that x[p] sits at p, x[p-1] sits at p-1, hence x[l]=x[p-(n-1)] sits at p-(n-1)=p+1-n. We need to move x[l], the last one in the
segment, located at position p+1-n, to position 0.
When we shift it one position to the right, it moves to position (p+1-n)-1; when we shift two positions, it goes over to (p+1-n)-2, and so on.
The pattern seems to be that, if we shift it k
positions, position l ends at (p+1-n)-k.
27
The final kick...
Question: How many times should we shift to
the right, so position l goes to position 0?
Answer: If we shift k positions, position l ends at
(p+1-n)-k. We want (p+1-n)-k=0, thus we must
have k=p+1-n.
For example, let p=4 and n=3, then if we shift
to the right p + 1 − n = 2 positions, the bits
[4, 3, 2] will be located in the rightmost 3
positions. In particular, position 2 now ends at
position 0.
Thus, the following line
x >> ( p+1-n)
moves the segment of size n with its leftmost
position being p to the rightmost segment.
28
Remember this pattern?
As we saw earlier, the following shifting operation
~(~0 << n)
will create a pattern with the rightmost n bits
being 1, and the rest being 0.
Question: What does the following do?
unsigned getbits(unsigned x, int p, int n){
    return x >> (p+1-n) & ~(~0 << n);
}
Answer: It shows us the n-bit segment of x
with its leftmost bit sitting at position p.
Question: Do we have to do the following?
(x >> ( p+1-n)) & (~(~0 << n))
Answer: No, ‘>>’ has higher precedence than
‘&’. (Should we trust Dr. Shen?)
Answer: Let’s check it out...
29
Labwork
Write a function that flips the n bits, starting
from position p downwards.
Send in the whole program, together with sam-
ple output(s).
For example, when n = 3, p = 7, your program
will turn the following bit string
10000010 00110101
into the following where bits 7 through 5 have
been flipped.
10000010 11010101
30
Another application
The bitwise operations are also often used to
pack things together to save space. Remember
that memory capacity was a hot commodity
back then.
Question: How much did a 5MB hard disk
cost back in the early 1980’s?
For example, the following function is to pack
four characters into one unsigned int type vari-
able, which contains four bytes as you found
out in a previous lab.
int pack(char a, char b, char c, char d){
    //Why unsigned?
    unsigned int p=a;
    p=(p<<8)|b;
    p=(p<<8)|c;
    p=(p<<8)|d;
    return p;
}
Question: How does it work?
31
/home/zshen > more pack.c
#include <stdio.h>
unsigned int pack(char a, char, char, char);
void bit_print(unsigned int);
int main(){
    printf("abcd=");
    bit_print(pack('a', 'b', 'c', 'd'));
    putchar('\n');
}
unsigned int pack(char a, char b, char c, char d){
    unsigned int p=a;
    p=p<<8|b;
    p=p<<8|c;
    p=p<<8|d;
    return p;
}
void bit_print(unsigned int v){
    int i, mask=1<<31; /* mask=100...0 */
    for(i=1; i<=32; ++i){
        putchar(((v & mask)==0) ? '0': '1');
        v <<=1;
        if(i%8==0&&i!=32)
            putchar(' ');
    }
}
/home/zshen > cc pack.c
/home/zshen > ./a.out
abcd=01100001 01100010 01100011 01100100
Notice ‘a’ is represented as 0x61 (=01100001).
32
The other direction
Question: How could we unpack the 32 bit
string to get back the four characters?
void unpack(unsigned int p,char *a, char *b,
char *c, char *d){
*a=(p&0xff000000)>>24;
*b=(p&0xff0000)>>16;
*c=(p&0xff00)>>8;
*d=(p&0xff);
}
Notice that p&0xff takes the last 8 bits of p,
(p&0xff00)>>8 takes the next byte then shifts it
8 positions to the right, etc.
Question: Can we take off the leading ‘0’?
A rather important issue here is that char *a
states that a is a pointer to a character type
variable, another sticking point of C.
Let’s check it out with (un)pack.c.
33
To test our program, we can write the following driver:
#include <stdio.h>
int main(){
    char u, v, w, x;
    int pw;
    pw=pack('a', 'b', 'c', 'd');
    bit_print(pw);
    putchar('\n');
    printf("Now, put them back:\n");
    //This is how to call the unpack function
    unpack(pw, &u, &v, &w, &x);
    printf("%c %c %c %c\n", u, v, w, x);
}
and get the following:
/users/faculty/zshen > a.out
We are ready to pack a, b, c and d
01100001 01100010 01100011 01100100
Now, put them back:
a b c d
Question: How does the following work?
unpack(pw, &u, &v, &w, &x);
34
A point about pointers
The expression &u refers to the address of the
variable u, or, figuratively speaking, a pointer
to u. There are two reasons for such a usage.
1. This is required by the unpack function.
2. The purpose is to “hook” the formal pa-
rameter, e.g., *a, and the actual one, e.g., &u,
via the assignment
char *a= &u;
Then, a as a “pointer” to a storage space of
char type will be fed with the address of u,
thus “pointing” at u. Then anything we
do with *a will also happen to u.
We will discuss pointers in detail in Chapter 5.
35
Assignments
Just as in Java, an assignment expression such
as
i=i+2;
can be written as i+=2, here “+=” is referred to
as the assignment operator.
In general, for most of the binary operators,
we have that
expr1 op= expr2
really means
expr1 = expr1 op expr2;
Notice the definition of such operations. For
example, “x*=y+1” means x=x*(y+1) but not
x=x*y+1. (?)
Assignment: Check out Category 14 of the
Precedence table.
36
An application
The following counts the number of 1’s in x.
//Why unsigned?
int bitcount(unsigned x){
int b;
for (b=0; x!=0; x>>=1)
if (x & 01)
b++;
return b;
}
In the above, whenever x is still not completely
zero yet, if its rightmost bit, the least signif-
icant bit, is 1, the counter, b, goes up by 1.
Then x is shifted to the right by one position,
while the vacant bit is to be filled with a 0,
since x is unsigned.
Check out an example of running this program
on the course page.
37
Conditional expression
We sort of know this in Java as well. For ex-
ample, the following
if (a>b)
z=a;
else z=b;
can be expressed in another way
z = (a>b)? a:b;
Have a look at the bitPrint function, on Page 20,
for an application.
Question: What does the following do?
for (i=0; i<n; i++)
    printf("%6d%c", a[i], (i%10==9||i==n-1) ? '\n' : ' ');
Question: What does the following do?
printf("You have %d item%s.\n", n, n==1 ? "":"s");
38