Lecture 5 - Politechnika Śląska · db.zmitac.aei.polsl.pl/kt/lecture5.pdf · 2004-11-04
Assembler Programming
Lecture 5
• MASM. General components. Operators. Identifiers. Statements. Directives. Memory models. Simplified segment directives.
Functions of the assembler
• mnemonic replacement of binary instruction encoding,
• convenient operand notation,
• evaluation of constant expressions,
• symbolic and relative addressing,
• easy communication with standard procedures and the operating system,
• replacement of multiple lines with one symbolic name.
How does the assembler work?
• To create an executable, MASM:
– assembles the source code into an object file,
– calls the linker to link the object files and libraries into an executable file.
• The operating system can then load the executable into memory and run it.
Assembling
• Evaluates conditional-assembly directives.
• Expands macros and macro functions.
• Evaluates constant expressions.
• Encodes instructions and nonaddress operands.
• Saves memory offsets as offsets from their segments.
• Places segments and segment attributes in the object file.
• Saves placeholders for offsets and segments (relocatable addresses).
• Outputs a listing if requested.
• Passes messages directly to the linker.
Linking
• Combines segments according to the instructions in the object files, rearranging the positions of segments that share the same class or group.
• Fills in placeholders for offsets (relocatable addresses).
• Writes relocations for segments into the header of .EXE files (but not .COM files).
• Writes the result as an executable program file.
Loading and running
• The operating system:
– Creates the program segment prefix (PSP).
– Allocates memory for the program, based on the values in the PSP.
– Loads the program.
– Calculates the correct values for absolute addresses from the relocation table.
– Loads the segment registers with values that point to the proper areas of memory.
– Jumps to the first instruction of the program.
MASM - General functions
• Free-form assembler
– the number of spaces between words does not matter
mov ax, [bx]
mov    ax,    [bx]
mov ax , [bx]
mov ax,[bx]
MASM - General functions
• Letter case
– lowercase and uppercase letters are treated the same,
– they are distinguished only in strings,
– automatic conversion of lowercase to uppercase is available as an option.
mov
MOV
mOV
MOv
General language components
• Alphabetical symbols.
• Special characters.
• Operators.
• Constants and constant expressions.
• Reserved words (keywords).
• Symbolic names (identifiers).
• Predefined symbols.
Alphabetical symbols
• Symbols allowed for use:
– Letters: A…Z, a…z
– Digits: 0…9
– Special characters: + - * / = ( ) [ ] < > . , ' " _ : ? @ $ & %
– Hidden ASCII characters: 20h, 0Dh, 0Ah
Special characters
• 20h – space – separates or ends words in the source code,
• 09h – tab – improves legibility of the source code,
• , – comma – separates operands,
• '…' – apostrophes – text delimiter,
• "…" – quotation marks – text delimiter,
• (…) – parentheses – order of expression evaluation,
• 0D0Ah – CRLF (Enter) – end of line,
• ; – semicolon – begins a comment,
• : – colon – delimits labels and segment prefixes.
Special characters
• . – dot – used with structure (record) data types, begins some directives,
• & – ampersand – used in macros,
• <…> – angle brackets – used in macros,
• […] – square brackets – used in address expressions,
• $ – dollar sign – current value of the location counter,
• = – equal sign – directive,
• ? – question mark – undefined value,
• @ – at sign – begins predefined names,
• _ – underscore – used in symbolic names in place of a space.
Operators
• Arithmetic
+ - * / [ ] . MOD
• Logical and shift
AND NOT OR SHL SHR XOR
• Record
MASK WIDTH
• Control flow
! != & && < <= == > >= ||
• Macro
! % & ;; <>
Operators
• Relational
EQ GE GT LE LT NE
• Segment
: LROFFSET OFFSET SEG
• Type
HIGH HIGHWORD LOW LOWWORD LENGTH LENGTHOF PTR SHORT SIZE SIZEOF THIS TYPE
• Miscellaneous
' ' " " : ; DUP
• Runtime operators
CARRY? OVERFLOW? PARITY? SIGN? ZERO?
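A few of these operators in use, as a minimal sketch. It assumes MASM 6.x and a 16-bit MS-DOS target; the names `wtable` and `maxval` are invented for illustration.

```asm
        .MODEL small
        .STACK
        .DATA
wtable  dw  8 DUP (0)               ; DUP repeats the initializer 8 times
maxval  dw  100
        .CODE
        .STARTUP
        mov  bx, OFFSET wtable      ; OFFSET: address of wtable within its group
        mov  cx, LENGTHOF wtable    ; LENGTHOF: number of elements (8)
        mov  dx, SIZEOF wtable      ; SIZEOF: total size in bytes (16)
        mov  ax, TYPE wtable        ; TYPE: size of one element (2)
        mov  BYTE PTR [bx], 0FFh    ; PTR: states the operand size explicitly
        mov  al, LOW 1234h          ; LOW: low-order byte (34h)
        mov  ah, HIGH 1234h         ; HIGH: high-order byte (12h)
        add  ax, maxval
        .IF CARRY?                  ; runtime operator: true if the carry flag is set
            mov ax, 0FFFFh
        .ENDIF
        .EXIT 0
        END
```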
Constants
• Known at assembly time.
• Cannot change during program execution.
• Numeric constants
– EQU
– equal sign (=)
• Text string constants
– EQU
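The difference between EQU and the equal sign can be sketched as follows. This assumes MASM 6.x; `BUF_SIZE`, `counter`, `prompt`, and `buffer` are illustrative names only.

```asm
BUF_SIZE  EQU  256                  ; numeric constant: cannot be given another value later
counter   =    0                    ; "=" defines a numeric constant that may be redefined
counter   =    counter + 1          ; counter is now 1
prompt    EQU  <Enter a number: >   ; EQU with angle brackets defines a text constant

          .MODEL small
          .DATA
buffer    db   BUF_SIZE DUP (?)     ; 256 uninitialized bytes
          .CODE
          mov  cx, BUF_SIZE         ; assembled as mov cx, 256
          mov  ax, counter          ; assembled as mov ax, 1
          END
```

Both kinds of constant exist only during assembly; the object file contains just the resulting immediate values.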
Numeric constants
• Integer constants:
– binary: 1b, 0101B, -10y, 111111Y
– octal: 34o, -746O, 2167q, 0Q
– decimal: 39, 12d, 1200D, -90t, 56T
– hexadecimal: 0h, 14A6h, 0FE3H (must begin with a decimal digit)
• .RADIX base directive:
– (.RADIX 16)
• Floating-point constants:
– decimal notation: 1.0, 3.1415, -0.5
– exponent notation: 1e5, 1.56e-2, -15.7e+12
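A short sketch of the radix suffixes and of .RADIX, assuming MASM 6.x (the variable names are invented). The first four bytes all hold the value 42.

```asm
        .MODEL small
        .DATA
v_bin   db  00101010b       ; binary
v_oct   db  52o             ; octal
v_dec   db  42              ; decimal (the default radix)
v_hex   db  2Ah             ; hexadecimal
v_lead  db  0FFh            ; leading 0 needed, otherwise FFh would read as a name

        .RADIX 16           ; the operand of .RADIX is always read as decimal
v_rdx   db  2A              ; no suffix: read as hexadecimal, again 42
v_rdt   db  42t             ; t forces decimal; d would be taken as a hex digit here
        .RADIX 10           ; restore the default radix
        END
```

This is also why the y and t suffixes exist: under radix 16 the letters b and d are valid hexadecimal digits, so they can no longer serve as binary and decimal suffixes.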
String constants
• A string is an array of characters.
– 'Hello world'
– "123*x=???"
• Equivalent notations:
– mov BH,'A'
– mov BH,"A"
– mov BH,41h
Text macros
• TEXTEQU directive
• A text macro is treated as a single literal element.
hello TEXTEQU <Hello beautiful world>
wp TEXTEQU <WORD PTR>
mov wp [bx], 0
Reserved words - Keywords
• Words that have special meaning in MASM:
– instructions
– directives
– attributes
– operators
– predefined symbols
• They cannot be used as:
– label names
– variable names
– constant names
– etc.
Symbolic names - Identifiers
• Names used to identify constants, variables, addresses, segments, etc.
• Cannot begin with a digit.
• Only the first 31 characters are recognized (the limit of MASM 5.x and earlier; MASM 6.x accepts identifiers of up to 247 characters).
• Can contain special characters such as $, @, _, ?
• Uppercase and lowercase letters are treated as the same.
Symbolic names
• Proper symbolic names
KT123_4
Number_602602602
_@1
?quest
$125
__Right_Here
• Improper symbolic names
12_cats
?
'name'
Hello.world
Right Here
Predefined symbols
Segment symbols:
@code - The name of the code segment (text macro).
@CodeSize - 0 for TINY, SMALL, COMPACT, and FLAT models, and 1 for MEDIUM, LARGE, and HUGE models (numeric equate).
@CurSeg - The name of the current segment (text macro).
@data - The name of the default data group. Evaluates to DGROUP for all models except FLAT. Evaluates to FLAT under the FLAT memory model (text macro).
@DataSize - 0 for TINY, SMALL, MEDIUM, and FLAT models, 1 for COMPACT and LARGE models, and 2 for HUGE model (numeric equate).
Predefined symbols
@fardata - The name of the segment defined by the .FARDATA directive (text macro).
@fardata? - The name of the segment defined by the .FARDATA? directive (text macro).
@Model - 1 for TINY model, 2 for SMALL model, 3 for COMPACT model, 4 for MEDIUM model, 5 for LARGE model, 6 for HUGE model, and 7 for FLAT model (numeric equate).
@stack - DGROUP for near stacks or STACK for far stacks (text macro).
@WordSize - 2 for a 16-bit segment or 4 for a 32-bit segment (numeric equate).
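The segment symbols are typically used to load segment registers and to select code paths at assembly time. The following is a sketch only, assuming MASM 6.x and MS-DOS; `total` and `start` are illustrative names.

```asm
        .MODEL small
        .STACK
        .DATA
total   dw  1234h
        .CODE
start:  mov  ax, @data          ; @data expands to DGROUP in the small model
        mov  ds, ax             ; DS now addresses the default data group
IF @DataSize                    ; nonzero: far data (COMPACT, LARGE, HUGE)
        ECHO Data pointers are far
ELSE                            ; zero: near data (TINY, SMALL, MEDIUM, FLAT)
        ECHO Data pointers are near
ENDIF
        mov  ax, total
        mov  ax, 4C00h          ; terminate through MS-DOS function 4Ch
        int  21h
        END  start
```

Because @DataSize is a numeric equate, the IF is resolved while assembling and only one branch reaches the object file.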
Predefined symbols
Macro functions:
@CatStr( string1 [[, string2...]] ) - Macro function that concatenates one or more strings. Returns a string.
@InStr( [[position]], string1, string2 ) - Macro function that finds the first occurrence of string2 in string1, beginning at position within string1. If position does not appear, search begins at start of string1. Returns a position integer or 0 if string2 is not found.
@SizeStr( string ) - Macro function that returns the length of the given string. Returns an integer.
@SubStr( string, position [[, length]] ) - Macro function that returns a substring starting at position.
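The four string functions can be tried out with literal text items, as in this sketch. It assumes MASM 6.x; all symbol names are invented, and the values in the comments follow from the definitions above with character positions counted from 1.

```asm
joined   TEXTEQU @CatStr(<Hello>, <World>)          ; text macro: HelloWorld
n_chars  =       @SizeStr(<HelloWorld>)             ; 10 characters
pos      =       @InStr(1, <HelloWorld>, <World>)   ; 6: first character is position 1
tail     TEXTEQU @SubStr(<HelloWorld>, 6, 5)        ; text macro: World
missing  =       @InStr(1, <HelloWorld>, <xyz>)     ; 0: string2 not found
        END
```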
Predefined symbols
Environment functions:
@Cpu - A bit mask specifying the processor mode (numeric equate).
@Environ( envvar ) - Value of environment variable envvar (macro function).
@Interface - Information about the language parameters (numeric equate).
@Version - 610 in MASM 6.1 (text macro).
Predefined symbols
File functions:
@FileCur - The name of the current file (text macro).
@FileName - The base name of the main file being assembled (text macro).
@Line - The source line number in the current file (numeric equate).
Date and time functions:
@Date - The system date in the format mm/dd/yy (text macro).
@Time - The system time in 24-hour hh:mm:ss format (text macro).
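A common use of these symbols is to guard against an old assembler and to print build information during assembly. The sketch below assumes MASM 6.x, where a leading % on a line expands the text macros that appear in it.

```asm
IF @Version LT 610
        ECHO This source was written for MASM 6.1 or later
ENDIF
%       ECHO Assembling @FileCur on @Date at @Time
        END
```

Since @Version is a text macro, the IF line is evaluated after substitution, for example as IF 610 LT 610 under MASM 6.1.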
Predefined symbols
Miscellaneous:
$ - The current value of the location counter.
? - In data declarations, a value that the assembler allocates but does not initialize.
@@: - Defines a code label recognizable only between label1 and label2, where label1 is either the start of code or the previous @@: label, and label2 is either the end of code or the next @@: label.
@B - The location of the previous @@: label.
@F - The location of the next @@: label.
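Anonymous labels save inventing names for short local jumps. A minimal sketch, assuming MASM 6.x and MS-DOS, that sums 5+4+3+2+1 in AX:

```asm
        .MODEL small
        .STACK
        .CODE
        .STARTUP
        mov  cx, 5
        xor  ax, ax
@@:     add  ax, cx         ; first anonymous label: head of the loop
        dec  cx
        jnz  @B             ; back to the previous @@:
        cmp  ax, 15
        je   @F             ; forward to the next @@:
        xor  ax, ax         ; skipped when AX = 15
@@:                         ; second anonymous label
        .EXIT 0
        END
```

Each @B or @F refers only to the nearest @@: in that direction, so the two labels never conflict.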
Statements
• name – labels the statement
• operation – defines the action
• operands – list of items on which the operation works
• comment – ignored by the assembler, for documentation purposes only
[[name:]] [[operation]] [[operands]] [[;comment]]
Statement example
• mainlp – is the label
• mov – is the operation
• ax and 7 – are the operands
• everything after the semicolon – is the comment
mainlp: mov ax, 7 ; Load AX with the value 7
Statement examples
again: mov ax, cx ; Load AX with the value
                  ; of CX
compar: repne scas str_1 ; compare the str_1
beginning:
byte_var db 55h
         dw 0AAAh
program segment para ; beginning of the segment
end ; end of the program
Statement examples
such_strange_example mov edx,& pointer_to_the_table[bx+145]
.IF (x > 0) \ ; X must be positive
&& (ax > x) \ ; Result must be > x
&& (cx == 0) ; Check loop counter, too
mov dx, 20h
.ENDIF
Directives
• Create the segments.
• Define symbolic names.
• Define connections between the modules.
• Define variables.
• Reserve memory.
• Provide conditional assembly.
• Change options of the listing file.
• Define macros.
• Other.
Organizing segments
• Physical segment:
– begins at a memory location evenly divisible by 16,
– its hexadecimal address always ends with 0, as in 10000h or 2EA70h,
– 8086/286 processors allow segments up to 64K in size,
– 80386/486 processors in protected mode use 32-bit registers that can hold addresses up to 4 gigabytes.
• Logical segment:
– logical segments contain the three components of a program: code, data, and stack,
– segment registers contain the addresses of the physical memory segments where the logical segments reside.
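The divisible-by-16 rule follows from how a real-mode address is formed: physical address = segment × 16 + offset. The sketch below computes the 20-bit physical address of DS:BX into DX:AX using 8086 instructions only; the offset value is arbitrary, and the calculation applies to real mode, not to protected mode.

```asm
        .MODEL small
        .STACK
        .CODE
        .STARTUP
        mov  bx, 0010h      ; example offset (arbitrary)
        mov  ax, ds
        mov  dx, ax
        mov  cl, 4
        shl  ax, cl         ; AX = low 16 bits of segment * 16
        mov  cl, 12
        shr  dx, cl         ; DX = top 4 bits of segment * 16
        add  ax, bx         ; add the offset
        adc  dx, 0          ; propagate the carry into the high word
        .EXIT 0
        END
```

For a segment value of 2EA7h and offset 0, this yields DX:AX = 0002:EA70, that is, the address 2EA70h quoted above.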
Organizing segments
• Defining segments
– simplified segment directives,
– full segment definitions,
– both kinds can be used in one program.
• Simplified directives
– generate necessary code, specify segment attributes, and arrange segment order.
• Full definitions
– provide more complete control over how the assembler generates segments.
Simplified segment directives
• .MODEL – defines the memory model of the module
• .CODE – starts the code segment
• .CONST – defines the constant data segment
• .DATA – starts the data segment
• .DATA? – starts the uninitialized data segment
• .FARDATA – starts the far data segment
• .FARDATA? – starts the uninitialized far data segment
• .STACK – starts the stack segment
• .STARTUP – generates the start-up code
• .EXIT – generates the exit code
.MODEL directive
• Must precede any other simplified segment directive in the module.
• Defines the attributes:
– memory model
– default calling conventions
– default naming conventions
– operating system
– stack type
.MODEL memorymodel [[, modeloptions ]]
.MODEL directive
Memory model   Default code   Default data   Operating system   Data and code combined
Tiny           Near           Near           DOS                Yes
Small          Near           Near           DOS, Win           No
Medium         Far            Near           DOS, Win           No
Compact        Near           Far            DOS, Win           No
Large          Far            Far            DOS, Win           No
Huge           Far            Far            DOS, Win           No
Flat           Near           Near           Win NT             Yes
.MODEL directive
Memory model   Code segments   Data segments
Tiny           One (code and data combined; DOS only)
Small          One             One
Medium         Multiple        One
Compact        One             Multiple
Large          Multiple        Multiple
Huge           Multiple        Multiple
Flat           One (code and data combined; 32-bit OS)
.MODEL directive
• Language options:
– PASCAL
– BASIC
– FORTRAN
– C
– SYSCALL
• Stack options:
– NEARSTACK
– FARSTACK
.MODEL example
.MODEL small ; Small memory model
.MODEL large, c, farstack ; Large memory model,
                          ; C conventions,
                          ; separate stack
.MODEL medium, pascal ; Medium memory model,
                      ; Pascal conventions,
                      ; near stack (default)
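The language option mainly affects procedures declared with PROC and called with INVOKE: it sets the order in which arguments are pushed, who removes them from the stack, and how public names are decorated. A sketch under the assumption of MASM 6.x and MS-DOS; `AddTwo`, `first`, and `second` are illustrative names.

```asm
        .MODEL small, c
        .STACK
        .CODE
AddTwo  PROC first:WORD, second:WORD   ; prologue and epilogue follow the C convention
        mov  ax, first
        add  ax, second                ; result is returned in AX
        ret
AddTwo  ENDP

        .STARTUP
        INVOKE AddTwo, 3, 4            ; pushes 4, then 3, calls, then cleans the stack
        .EXIT 0
        END
```

With pascal in place of c, INVOKE would push the arguments left to right and the procedure itself would remove them on return.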
.CODE directive
• Starts the code segment.
• Near segments
– SMALL, COMPACT, TINY
– the linker combines multiple code segments into one
– default name of the combined segment: _TEXT
• Far segments
– MEDIUM, LARGE, HUGE
– default name of the segment: MODNAME_TEXT
– "name" overrides the "MODNAME" part
.CODE [[ name ]]
.CODE example
.CODE FIRST ; Beginning of the code segment
            ; named FIRST_TEXT
...
.CODE SECOND ; End of FIRST_TEXT segment
             ; Beginning of the code segment
             ; named SECOND_TEXT
...
END ; End of SECOND_TEXT segment
.DATA directive
• Starts the near data segment.
• Up to 64K in MS-DOS.
• Up to 512 MB in the FLAT model under Windows NT.
• It is placed in the DGROUP group of segments.
• DGROUP itself is limited to 64K.
.DATA
.CONST and .DATA? directives
• .CONST starts the near data segment that holds constant data.
• .DATA? starts the near data segment of uninitialized data.
• They enhance compatibility with high-level languages (HLL).
• Both go into DGROUP.
.CONST
.DATA?
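The three near data directives can be combined in one module, as in this sketch (MASM 6.x and MS-DOS assumed; `counter`, `banner`, and `scratch` are invented names). Because all three segments belong to DGROUP, a single DS value reaches every variable.

```asm
        .MODEL small
        .STACK
        .DATA                       ; initialized near data
counter dw  0
        .CONST                      ; constant near data
banner  db  "Ready", 13, 10, "$"
        .DATA?                      ; uninitialized near data
scratch db  128 DUP (?)
        .CODE
        .STARTUP                    ; loads DS with DGROUP
        mov  dx, OFFSET banner
        mov  ah, 09h                ; MS-DOS function 09h: print $-terminated string
        int  21h
        inc  counter
        mov  scratch[0], 0
        .EXIT 0
        END
```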
.FARDATA, .FARDATA? directives
• .FARDATA
– Starts the far data segment
– Name of the segment: FAR_DATA
• .FARDATA?
– Starts the far data segment of uninitialized data
– Name of the segment: FAR_BSS
.FARDATA
.FARDATA?
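Far data lies outside DGROUP, so a segment register has to be loaded before it can be accessed. A minimal sketch, assuming MASM 6.x and MS-DOS; `far_tab` is an illustrative name and ES is just one possible choice of register.

```asm
        .MODEL small
        .STACK
        .FARDATA                    ; segment FAR_DATA, not part of DGROUP
far_tab dw  1000 DUP (0)
        .CODE
        .STARTUP
        mov  ax, SEG far_tab        ; segment address of the far data
        mov  es, ax
        mov  ax, es:far_tab[2]      ; explicit ES override: read the second word
        .EXIT 0
        END
```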
.STACK directive
• Starts the stack segment.
• Use it only in the main module of an assembler program.
• Allocates 1 KB by default.
• To create a stack of a different size, use the "size" argument.
• The "size" argument gives the stack size in bytes.
.STACK [[ size ]]
.STARTUP and .EXIT directives
• .STARTUP
– Generates start-up code in MS-DOS programs.
– Usually follows .CODE.
• .EXIT
– Generates terminating code in MS-DOS programs.
– A one-byte exit code is returned to the system.
– The default exit code is the value in AL.
.STARTUP
.EXIT [[ returncode ]]
.STARTUP code (NEARSTACK, the default)
@Startup:
mov dx, DGROUP
mov ds, dx
mov bx, ss
sub bx, dx
shl bx, 1 ; If .286 or higher, this is
shl bx, 1 ; shortened to shl bx, 4
shl bx, 1
shl bx, 1
cli ; Not necessary in .286 or higher
mov ss, dx
add sp, bx
sti ; Not necessary in .286 or higher
...
END @Startup
.STARTUP code (FARSTACK)
@Startup:
mov dx, DGROUP
mov ds, dx
...
END @Startup
.EXIT code
mov al, value
mov ah, 04Ch
int 21h
Simplified directives example
.MODEL small, c ; This statement is required
                ; before you can use other
                ; simplified segment directives
.STACK ; Default 1-kilobyte stack
.DATA ; Begin data segment
; Place data declarations here
.CODE ; Begin code segment
.STARTUP ; Generate start-up code
; Place instructions here
.EXIT ; Generate exit code
END
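Filling in the skeleton gives a complete small-model program. This is a sketch for MASM 6.x under MS-DOS; the message text and the name `msg` are made up, and the output uses MS-DOS function 09h, which is not covered in this lecture.

```asm
        .MODEL small
        .STACK                      ; default 1-kilobyte stack
        .DATA
msg     db  "Hello, world!", 13, 10, "$"
        .CODE
        .STARTUP                    ; sets DS (and SS:SP for a near stack)
        mov  dx, OFFSET msg         ; DS:DX points to the string
        mov  ah, 09h                ; MS-DOS function 09h: print $-terminated string
        int  21h
        .EXIT 0                     ; return exit code 0 to MS-DOS
        END
```

No entry label is needed after END, because .STARTUP defines @Startup and makes it the program entry point.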