Systems Programming
Fatih Kesgin & Yusuf Yaslan
Istanbul Technical University, Computer Engineering Department
18/10/2005


Outline

● How to assemble and link
– nasm
– ld
– gcc
● Debugging
– Using gdb: breakpoints, registers, memory
● objdump, readelf, nm, ldd

Definition

● Compilers and assemblers create object files containing the generated binary code and data for a source file.
● Linkers combine multiple object files into one; loaders take object files and load them into memory. (In an integrated programming environment, the compilers, assemblers, and linkers are run implicitly when the user tells it to build a program, but they're there under the covers.)

Example: Hello world

segment .data
msg     db  "Hello, world!",10   ; message followed by a newline
len     equ $ - msg              ; length of the message

segment .text
global main

main:
        mov eax,4       ; write syscall
        mov ebx,1       ; stdout
        mov ecx,msg     ; address of output buffer
        mov edx,len     ; length of buffer
        int 80h

        mov eax,1       ; exit syscall
        mov ebx,0       ; success
        int 80h

Assembling Hello.asm

● nasm: the Netwide Assembler, a portable 80x86 assembler
nasm -f elf hello.asm
● To change the output file name, use the -o option:
nasm -f elf hello.asm -o merhaba.o
● The -f elf option tells nasm to output the object code in the Executable and Linking Format (ELF) that Linux uses.
● The object code is still not executable.

Linking Hello.asm

● To create an executable file, we have to link the object file using a linker. We can use the GNU linker ld:
ld hello.o -o hello
● However, this will result in the following warning:
ld: warning: cannot find entry symbol _start; defaulting to 08048080
● ld searches for a _start label to use as the entry point of the linked program. Our entry point is main, not _start.

Linking Hello.asm

● We have to tell ld to use the label main as the entry point:
ld hello.o -o hello -e main
● Let's execute the program.
● Examine the listing file (little endianness).
● nasm can create a listing file for the assembled code with the -l option:
nasm -f elf hello.asm -l hello.lis
● The original source is displayed on the right-hand side and the generated code is shown in hex on the left.

Examining the .lis file

10 00000000 B804000000 mov eax,4

● Here we can see that the opcode for the mov eax instruction with an immediate operand is B8.
● The bytes 04000000 next to it correspond to the immediate operand 4.
● The operand is stored in 32 bits with the least significant byte first (little endian), because our system is a 32-bit little-endian system (Intel).

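The byte order seen in the listing can be reproduced with a small C sketch. The function names store_le32 and encode_mov_eax_imm32 are hypothetical and only illustrate how the five bytes B8 04 00 00 00 come about; this is not how nasm itself is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value with the least significant byte first (little endian). */
void store_le32(uint32_t value, unsigned char out[4])
{
    for (size_t i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFFu);
}

/* Hypothetical helper: encode "mov eax, imm32" as opcode B8 followed by
   the 32-bit immediate in little-endian order. Returns the byte count. */
size_t encode_mov_eax_imm32(uint32_t imm, unsigned char out[5])
{
    out[0] = 0xB8;            /* opcode for mov eax, imm32 */
    store_le32(imm, out + 1); /* immediate operand, low byte first */
    return 5;
}
```

Encoding the immediate 4 this way gives the same five bytes as listing line 10; a larger value such as 0x12345678 would appear in the listing as 78563412.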

Differences between ld and gcc: entry points, size

● We can also use the GNU C compiler gcc to link the object file:
gcc hello.o -o hello
● Note that gcc is not a linker but a compiler.
● gcc is able to determine the type of its input files and take appropriate actions to produce the executable.
● gcc invokes the linker ld in the background to generate the executable.
● Compile hello.c and compare the file sizes.

Differences between ld and gcc: entry points, size

● gcc links the object files to the standard C runtime library by default.
● Linking to the standard C runtime library results in an increase in the size of the executable.
● Because there is a _start label in one of the standard C runtime library object files, ld does not complain about the entry point. (The _start function in the standard C runtime library is responsible for initializing the argc and argv variables for the main function of C programs.)

Russian peasant method of multiplication

● Write the operands on top of two columns. At each step, divide the number in the first column by two and multiply the number in the second column by two. Ignore the remainders of the division operations. Each time you obtain an odd number in the first column, add the number in the second column to the result. Stop when the number in the first column becomes 0.

Russian peasant method of multiplication

● Example: Multiply 92 by 37.

first   second   result
92      37
46      74
23      148      148
11      296      148+296=444
5       592      444+592=1036
2       1184
1       2368     1036+2368=3404
0
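The steps of the worked example can be sketched in plain C. This is only an illustration of the algorithm under the assumption of unsigned 32-bit operands; it is not the course's russian.asm solution, and the name russian_multiply is hypothetical.

```c
/* Russian peasant multiplication, a minimal sketch.
   a is the first column (halved each step), b is the second column
   (doubled each step). Overflow of the result is not checked. */
unsigned russian_multiply(unsigned a, unsigned b)
{
    unsigned result = 0;

    while (a != 0) {
        if (a & 1u)      /* odd number in the first column */
            result += b; /* add the second column to the result */
        a >>= 1;         /* divide by two, ignoring the remainder */
        b <<= 1;         /* multiply by two */
    }
    return result;
}
```

Halving and doubling map directly onto shift instructions, which is why the method suits an assembly exercise.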

Russian peasant method of multiplication

● Write a C program. The main function should read the values from the keyboard, multiply them using the assembly function and display the result on the screen.
– solution: rusmain.c
● Write a function in Intel assembly that multiplies its two operands using the algorithm described above and returns the result.
– solution: russian.asm

Russian peasant method of multiplication

● Replace the call to the assembler function by appropriate inline assembly instructions that implement the described multiplication algorithm.
– solution: rusinl.c (note that AT&T syntax is used in rusinl.c)
● How to assemble/link
– Assembling russian.asm:
nasm -f elf -g russian.asm -o russian.o
– Compiling rusmain.c:
gcc -c -g rusmain.c -o rusmain.o
– Linking the object files russian.o and rusmain.o to produce the russian executable:
gcc -g russian.o rusmain.o -o russian
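For the inline assembly variant, the following is a rough sketch of what such a function might look like with gcc extended asm in AT&T syntax. It is not the contents of rusinl.c; the name russian_inline is hypothetical, and a plain C fallback is assumed for compilers or CPUs other than gcc-compatible x86.

```c
/* Russian peasant multiplication with gcc inline assembly (AT&T syntax).
   Operands: %0 = result, %1 = a (first column), %2 = b (second column). */
unsigned russian_inline(unsigned a, unsigned b)
{
    unsigned result = 0;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ (
        "1:\n\t"
        "testl %1, %1\n\t"   /* first column == 0 ? */
        "jz 3f\n\t"          /* yes: done */
        "testl $1, %1\n\t"   /* first column odd ? */
        "jz 2f\n\t"
        "addl %2, %0\n\t"    /* result += second column */
        "2:\n\t"
        "shrl $1, %1\n\t"    /* first column /= 2 */
        "shll $1, %2\n\t"    /* second column *= 2 */
        "jmp 1b\n\t"
        "3:"
        : "+r"(result), "+r"(a), "+r"(b)
        :
        : "cc");
#else
    /* Portable fallback with the same behavior. */
    while (a != 0) {
        if (a & 1u)
            result += b;
        a >>= 1;
        b <<= 1;
    }
#endif
    return result;
}
```

In AT&T syntax the source operand comes first and the destination last, and immediates carry a $ prefix, which is the reverse of the operand order used in the nasm sources.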

Debugging

● In order to debug the executable file, -g is used during compiling and linking.
gdb russian
● Add breakpoints to the program, run it, and trace the registers and the program:
break main
break russian
run
info registers

Object File Formats

● MS-DOS .COM files
– A .COM file literally consists of nothing other than binary code.
– It is loaded into memory, the segment registers are adjusted, and it is run.
– If the program doesn't fit into a single segment, fix-ups are needed.

Object File Formats

● Unix a.out files
● Computers with hardware memory relocation usually create a new process with an empty address space for each newly run program, in which case programs can be linked to start at a fixed address and require no relocation at load time. The Unix a.out object format handles this situation.
● Layout of an a.out file:
a.out header
text section
data section
other sections

● a.out Header
int a_magic;  // magic number
int a_text;   // text segment size
int a_data;   // initialized data size
int a_bss;    // uninitialized data size
int a_syms;   // symbol table size
int a_entry;  // entry point
int a_trsize; // text relocation size
int a_drsize; // data relocation size
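The header fields listed above can be written as a C struct. The sketch below assumes each field is a 32-bit integer (the slide only says int) and uses the hypothetical names aout_header and aout_memory_size; real systems declare this header in their own system headers with their own types.

```c
#include <stdint.h>

/* The eight a.out header fields from the slide, assumed to be 32 bits each. */
struct aout_header {
    int32_t a_magic;  /* magic number */
    int32_t a_text;   /* text segment size */
    int32_t a_data;   /* initialized data size */
    int32_t a_bss;    /* uninitialized data size */
    int32_t a_syms;   /* symbol table size */
    int32_t a_entry;  /* entry point */
    int32_t a_trsize; /* text relocation size */
    int32_t a_drsize; /* data relocation size */
};

/* Memory needed for the loaded program: text + data + bss.
   The bss takes no space in the file; the loader only reserves
   and zero-fills it. */
uint32_t aout_memory_size(const struct aout_header *h)
{
    return (uint32_t)h->a_text + (uint32_t)h->a_data + (uint32_t)h->a_bss;
}
```

Under the 32-bit assumption the header occupies 32 bytes, and the text section follows it in the file.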

Object File Formats

● COFF (Common Object File Format): something better to support cross-compilation, dynamic linking and other modern system features
– Time Sharing Problem
● UNIX ELF (Executable & Linking Format)
– ELF files come in three slightly different flavors: relocatable, executable, and shared object.
– Relocatable: created by compilers and assemblers, but must be processed by the linker before running.

Object File Formats

– Executable: all relocations are done and all symbols are resolved, except shared library symbols, which are resolved at runtime.
– Shared object: symbol info + runnable code
● ELF Summary
– Complex but good
– Flexible relocatable format, supports C++
– Efficient executable format for virtual memory with dynamic linking
– Cross-compilation and cross-linking enabled
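The three ELF flavors are recorded in the e_type field of the ELF header, so they can be told apart by reading the first few bytes of a file. The sketch below is an illustration only: the function name elf_flavor is hypothetical, and it parses a byte buffer by hand instead of using the system's elf.h so that it stays self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Classify an ELF image from its first bytes.
   Bytes 0-3: magic 0x7f 'E' 'L' 'F'.
   Byte 5:    data encoding (1 = little endian, 2 = big endian).
   Bytes 16-17: e_type (1 = relocatable, 2 = executable, 3 = shared object). */
const char *elf_flavor(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    unsigned type;

    if (len < 18 || memcmp(buf, magic, sizeof magic) != 0)
        return "not ELF";

    if (buf[5] == 1)
        type = (unsigned)buf[16] | ((unsigned)buf[17] << 8);
    else if (buf[5] == 2)
        type = ((unsigned)buf[16] << 8) | (unsigned)buf[17];
    else
        return "unknown encoding";

    switch (type) {
    case 1:  return "relocatable";
    case 2:  return "executable";
    case 3:  return "shared object";
    default: return "other";
    }
}
```

On recent Linux distributions, position-independent executables also carry the shared object type, so a program built with default gcc settings may be reported as "shared object".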

Inspecting the Object File After Compiling

● objdump: display information from object files
objdump -d rusmain.o
● Note that the target address of the call instruction is a dummy value and the address of main is 00000000.
objdump -d russian.o
objdump -d russian
● In the linked executable, the address of the printf call is 80482ec and the new address of main is 080483e0.

Inspecting the Object File After Compiling

● readelf: displays information about ELF files
readelf -a russian.o
readelf -a russian
● nm: list symbols from object files
nm russian.o
nm russian

Static Linking

● gcc -static rusmain.o russian.o -o rus_static
● Let's look at the file size of rus_static:
ll -h
● 477K. Isn't it too big? Why?
objdump -d rus_static
● strace: trace system calls and signals
strace ./russian
● ltrace: a library call tracer
ltrace ./russian