Rechnernetze und Organisation, Part D3
2010, Ver 0.9
Michael Hutter, Karl C. Posch
www.iaik.tugraz.at/content/teaching/bachelor_courses/rechnernetze_und_organisation/

Contents of lecture

  - TOY
  - x86
  - Networks
  - Hardware, Stack, Input/Output

Contents of part B

  - GCC
  - C-Preprocessor
  - Linker
  - Libraries
  - Inline-Assembly
  - Combining C/C++ with Assembler language

GCC

GCC is a wrapper program. It calls the following tools in turn:

  cpp:  C-Preprocessor
  cc1:  Compiler: translates C code to assembler code; generates x.s
  as:   Assembler: generates object code x.o
  ld:   Linker: combines object files and generates the executable

C-Preprocessor cpp

    % cat y.c
    // y.c
    #define ZZZZ 7
    #include "w.h"
    main()
    {  int x,y,z;
       scanf("%d%d",&x,&y);
       x *= ZZZZ;
       y += ZZZZ;
       z = bigger(x,y);
       printf("%d\n",z);
    }

    % cat w.h
    // w.h
    #define bigger(a,b) (a > b)? (a) : (b)

C-Preprocessor cpp: the output

    % cpp y.c
    # 1 "y.c"
    # 1 "<built-in>"
    # 1 "<command line>"
    # 1 "y.c"
    # 1 "w.h" 1
    # 6 "y.c" 2
    main()
    {  int x,y,z;
       scanf("%d%d",&x,&y);
       x *= 7;
       y += 7;
       z = (x > y)? (x) : (y);
       printf("%d\n",z);
    }

The preprocessor has replaced ZZZZ by 7 and has expanded the macro bigger(x,y) textually.

The linker

Example: gcc -g x.c y.c

  - The compiler generates the object files x.o and y.o
  - The linker resolves cross-file function calls and creates a.out
  - It does not matter whether the original files were written in C, C++, etc.
  - The linker must be called even if there is only one object file. Why?
      - C programs use implicit libraries, like libc.so
      - The C startup code that is linked in together with libc defines the label _start
      - This code sets up the stack

Headers in executable files

  - Define sections and their start addresses
  - The ELF format is used in Linux
  - With readelf -s one can find out the addresses
  - Alternatively use nm
  - For object files: find out the headers with objdump

Libraries

  - A library is a conglomerate of several object files
  - It can be static or dynamic
      - Static: .a
      - Dynamic: .so (called DLL in Windows)
  - Static: included when the executable is built (at link time)
  - Dynamic: included at run time
  - In Unix, library names usually start with lib and end with a version number, e.g. libc.so.6

How to make a static library?

    % gcc -c x.c
    % gcc -c y.c
        Generate the object files x.o and y.o

    % ar r lib8888.a x.o y.o
    % ranlib lib8888.a
        Generate the library file lib8888.a

    % ar t lib8888.a
        Find out what's in a library file

How to make a dynamic library?

    % gcc -g -c -fpic x.c
    % gcc -g -c -fpic y.c
        Generate the object files x.o and y.o

    % gcc -shared -o lib8888.so x.o y.o
        Generate the library file lib8888.so

    % readelf -s lib8888.so
        Find out what's in a library file

How to link a library?

    % gcc -g w.c -lm
        Link the library libm.so; the default directories for
        libraries (/usr/lib or /lib) are searched.
        See /etc/ld.so.cache for the default directories for libraries.

    % gcc -g w.c -lqrs -L/a/b/c
        Indicate a specific directory for the library; here /a/b/c.
        In the case of a dynamic library, gcc only checks for the
        existence of the library libqrs.so in the directory /a/b/c.
        One can also use
            setenv LD_LIBRARY_PATH /a/b/c
        in order to set the path to the library.

    % ldd a.out
        Find out which libraries are used by a.out

Inline assembly code for C++

    // file a.c
    int x;
    main()
    {  scanf("%d",&x);
       asm("pushl x");
    }

Excerpt from the generated assembly code:

       # ...
       call  scanf
    #APP
       pushl x
    #NO_APP
       # ...

Combining C/C++ with assembly language

Why?
  - For hardware-dependent parts of a code
  - For speed optimisation
  - In this class: for learning and understanding

Example Linux:
  - Most code has been written in C: portability across different hardware platforms
  - Some code is written in assembly language: access to certain hardware resources

Example TryAddOne.c

    // TryAddOne.c, example of interfacing C to assembly language
    // paired with AddOne.s, which contains the function addone()
    // compile by assembling AddOne.s first, and then typing
    //
    //    gcc -g -o tryaddone TryAddOne.c AddOne.o
    //
    // to link the two .o files into an executable file tryaddone
    // (recall that gcc invokes ld)

    #include <stdio.h>
    #include <stdlib.h>

    void addone(int *p);   // defined in AddOne.s

    int x;

    int main()
    {  x = 7;
       addone(&x);
       printf("%d\n",x);   // should print out 8
       exit(1);
    }

gcc -S TryAddOne.c generates TryAddOne.s

            .file   "TryAddOne.c"
            .section .rodata
    .LC0:
            .string "%d\n"
            .text
    .globl main
            .type   main, @function
    main:
            leal    4(%esp), %ecx
            andl    $-16, %esp
            pushl   -4(%ecx)
            pushl   %ebp
            movl    %esp, %ebp
            pushl   %ecx
            subl    $20, %esp
            movl    $7, x
            movl    $x, (%esp)
            call    addone
            movl    x, %eax
            movl    %eax, 4(%esp)
            movl    $.LC0, (%esp)
            call    printf
            movl    $1, (%esp)
            call    exit
            .size   main, .-main
            .comm   x,4,4
            .ident  "GCC: (Debian 4.3.2-1.1) 4.3.2"
            .section .note.GNU-stack,"",@progbits

as --gstabs -o AddOne.o AddOne.s

    .text

    # need .globl to make addone visible to ld
    .globl addone

    addone:

       # will use EBX for temporary storage below, and since the calling
       # module might have a value there, better save the latter on the stack
       # and restore it when we leave
       push %ebx

       # at this point the old EBX is on the top of the stack, then the
       # return address, then the argument, so the latter is at ESP+8

       movl 8(%esp), %ebx

       incl (%ebx)   # increment; need the (), since the argument was an address

       # restore value of EBX in the calling module
       pop %ebx

       ret

Stack after push %ebx:

    ESP -> addr       saved value of EBX
           addr + 4   return address
           addr + 8   argument (&x)

Try it out!

    % as --gstabs -o AddOne.o AddOne.s
    % gcc -g -o tryaddone TryAddOne.c AddOne.o

Sections in memory

    Assembly                      C                  Section
    .comm x,4,4                   int x;             .bss: uninitialized data
    .data                         int y = 4;         .data: initialized data
    y:  .long 4
    .section .rodata                                 .rodata: read-only data like strings
    .LC0:  .string "%d\n"
    .text                                            .text: code section starts

With the command nm you can find out about all symbols in a file.

nm tryaddone

The output lists, among others, the address of the subroutine addone. The letter next to each symbol names its section:

  - T stands for .text
  - D stands for .data
  - B stands for .bss
  - R stands for .rodata
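
More arguments

    printf("%d\n",x);

is translated to

    movl x, %eax
    movl %eax, 4(%esp)
    movl $.LC0, (%esp)
    call printf

The first argument (the address of the format string) is at the top of the stack, the second argument (the value of x) is one word below.

Return values from a function

  - For int or char: in register EAX
  - For long long (= 8 bytes): in EDX:EAX
  - For float: the CPU has special registers for this (see Matloff's text "Arithmetic & Logic")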

Calling C-functions from assembly code

    .data
    x:
          .long 1
          .long 5
          .long 2
          .long 18
    sum:
          .long 0
    fmt:
          .string "%d\n"

    .text
    .globl main
    main:
          movl $4, %eax      # EAX counts the remaining elements
          movl $0, %ebx      # EBX holds the running sum
          movl $x, %ecx      # ECX points to the current element
    top:  addl (%ecx), %ebx
          addl $4, %ecx
          decl %eax
          jnz top
    printsum:
          pushl %ebx         # second argument of printf: the sum
          pushl $fmt         # first argument of printf: the format string
          call printf
    done: movl %ebx, sum

GCC asks for label main

    gcc -g -o sum sum.s

  - GCC links the C-startup library, which has the label _start
  - The code starting at _start makes some initialisations and then jumps to the label main
  - Watch out: the C-routine could change the values in the registers!

Local variables live also on the stack

file sum.c:

    int sum(int *x, int n)
    {  int i=0,s=0;
       for (i = 0; i < n; i++)
          s += x[i];
       return s;
    }

gcc -S sum.c:

    ...
    sum:
          pushl %ebp
          movl  %esp, %ebp
          subl  $8, %esp
          movl  $0, -8(%ebp)
          movl  $0, -4(%ebp)

  - Local variables are stored in reverse order
  - Thus, the variable i is pushed first, then the variable s
  - Not all compilers produce exactly the same code as above

The stack frame of a function g()

[Figure: stack frame of g(); not reproduced here]

Stack frames are linked

file bt.c:

    void h(int *w)
    {  int z;
       *w = 13 * *w;
    }

    int g(int u)
    {  int v;
       h(&u);
       v = u + 12;
       return v;
    }

    main()
    {  int x,y;
       x = 5;
       y = g(x);
    }

  - Each stack frame starts with a pointer to the stack frame of the calling function
  - The value in EBP points to the current stack frame
  - In GDB we can use the command bt ("backtrace")

Check out stack frames with GDB

With the same file bt.c:

    (gdb) b h
    Breakpoint 1 at 0x804837a: file bt.c, line 3.
    (gdb) r
    Starting program: /home/test/o/teil_d/unterprogramme/a.out

    Breakpoint 1, h (w=0xbf998dbc) at bt.c:3
    3          *w = 13 * *w;
    (gdb) bt
    #0  h (w=0xbf998dbc) at bt.c:3
    #1  0x080483a3 in g (u=5) at bt.c:8
    #2  0x080483d1 in main () at bt.c:16
    (gdb) f 1
    #1  0x080483a3 in g (u=5) at bt.c:8
    8          h(&u);
    (gdb) p u
    $1 = 5
    (gdb) p v
    $2 = -1208813047

The value of v is arbitrary, since v has not been assigned yet at this point.

The instructions ENTER and LEAVE

Prologue:

    pushl %ebp
    movl  %esp, %ebp
    subl  $8, %esp

does the same as

    enter $8, $0

ENTER is currently not used, since this instruction is too slow.

Epilogue:

    movl %ebp, %esp
    popl %ebp

does the same as

    leave

main() is a function with a stack frame too

file argv.c:

    main(int argc, char** argv)
    {  int i;
       printf("%d %s\n", argc, argv[1]);
    }

  - Upon linking, startup code from libraries is added: the code first starts at the label _start; later the function main is called.
  - Note: argc and argv do not necessarily need to have these names!
  - Translate with gcc -S
  - Run as: a.out abc def

After calling main()

[Figure: stack contents after the call to main(); not reproduced here]

The prologue of main()

    main:                      # after call: ESP points to return address
       leal  4(%esp), %ecx     # ECX = ESP+4: ECX holds pointer to argc
       andl  $-16, %esp        # set ESP to the next lower address which is
                               # 0 modulo 16
       pushl -4(%ecx)          # push the word at address ECX-4 on the stack:
                               # this is the return address
       pushl %ebp              # push EBP: save caller's base pointer
       movl  %esp, %ebp        # save ESP in EBP: set new base pointer
       pushl %ecx              # save value of ECX: points to argc

Preparing the call for printf()

       subl $36, %esp          # increase stack by 9 words
       movl 4(%ecx), %eax      # get argv (the address of the pointer array) into EAX
       addl $4, %eax           # now EAX points to argv[1]
       movl (%eax), %eax       # now EAX points to first argument ("abc")
       movl %eax, 8(%esp)      # pointer to first argument ("abc"),
                               # moved to two words below TOS
       movl (%ecx), %eax       # get argc (=3) to EAX
       movl %eax, 4(%esp)      # second parameter of printf one below TOS
                               # (argc, 3)
       movl $.LC0, (%esp)      # address of 1st parameter of printf on TOS

After call to printf()

       addl $36, %esp          # after call: "pop" 9 words from stack
       popl %ecx               # restore ECX:
                               # points now to argc on stack again
       popl %ebp               # restore EBP:
                               # EBP has now caller's base pointer again
       leal -4(%ecx), %esp     # restore ESP:
                               # ESP points now to return address again
       ret

char **argv

[Figure: memory layout of argv; not reproduced here]

Compiling, assembling, trying out in GDB

    % gcc -S argv.c
    % as --gstabs -o argv.o argv.s
    % gcc -g argv.o
    % gdb -q a.out
    (gdb) b 9
    Breakpoint 1 at 0x80483ae: file argv.s, line 12.
    (gdb) r abc def
    Starting program: /home/test/o/teil_d/unterprogramme/a.out abc def

    Breakpoint 1, main () at argv.s:9
    9          leal 4(%esp), %ecx
    Current language: auto; currently asm

Continuing session in GDB

    (gdb) x/3x $esp
    0xbff88bcc: 0xb7e1a455 0x00000003 0xbff88c54
    (gdb) x/3x 0xbff88c54
    0xbff88c54: 0xbff8a64c 0xbff8a677 0xbff8a67b
    (gdb) x/s 0xbff8a64c
    0xbff8a64c: "/home/test/o/teil_d/unterprogramme/a.out"
    (gdb) x/s 0xbfdd6677
    0xbfdd6677: "abc"
    (gdb) x/s 0xbfdd667b
    0xbfdd667b: "def"
    (gdb)

The three words at ESP are the return address, argc (= 3) and argv. The three words at argv are the pointers to the program name and to the two arguments.

By the way:

    int main(int argc, char **argv, char **envp)

Hardware has no clue of data types

  - Only the compiler defines the scope of variables. The compiler thus watches the programmer.
  - There is no scope at hardware level.
  - In C++ you learn: "A private member of a class cannot be accessed from anywhere outside the class."
  - This is wrong!
  - Right is: "The compiler will refuse to compile any C++ code you write which attempts to access by name a private member of a class from anywhere outside the class."

What should you know by now?

Understand the terms; understand the connections between them; be able to operate with them:

  - C-preprocessor
  - Linking
  - Headers in exe-files
  - readelf, objdump
  - Static & dynamic libraries
  - How to make libraries
  - How to link libraries
  - Inline assembly code
  - Combining C/C++ code with assembly code
  - Sections in memory
  - nm
  - Arguments
  - Return values
  - Calling C-functions from assembly code
  - Local variables
  - Stack frame
  - Enter, leave
  - argc, argv
  - Data types at hardware level?
  - Scope at hardware level?