INRIA - LaBRI Phoenix Group, Oct-05

Program Specialization of Systems Programs

Charles Consel

Phoenix Research Group (formerly known as the Compose Group)

INRIA - LaBRI

October 2005

Phoenix Group: Research Topics

Programming language technology
– Program analysis and transformation
– Program specialization
– Domain-specific languages

Application areas
– Operating systems
– Networking
– Telecommunications

Prototypes
– Tempo
– Devil, Plan-P, Spidle, SPL

Recent Contributors

Anne-Françoise Le Meur, University of Lille
Sapan Bhatia, PhD student in Phoenix
Julia Lawall, University of Copenhagen

Adaptable Programs: Why, Where, How

Introduction

Adaptable Programs: Why?

Problem 1 → Program 1
Problem 2 → Program 2
Problem 3 → Program 3

Problem family → Program family

Problem + Adaptable program → Tool → Adapted program

Adaptable Programs: Where?

Areas:
– Operating system components
– Networking layers

Features:
– Adaptability to a variety of platforms
– Adaptability to a family of services
– Adaptability to a variety of states
– Long-running programs

Adaptable Programs: How?

Program structuring approach: generic programs

Generic program + Customization information → Specializer → Customized program

Specialization of System/Networking Code: Challenges

System code
– Legacy, usually written in C
– Generic but optimized code
– Poor software architecture

Program specialization
– Specialization of a real-size language (i.e., C)
– Specialization of a low-level language
– Specialization at run time (not only compile time)
– Targeting real-size cases

Specializing System/Networking Code

Synthesis Kernel - Columbia
– Ad hoc manual run-time code generation for system calls [SOSP’89]

Synthetix Project - OGI
– Methodology for manual specialization [SOSP’95]

Tempo - Compose/OGI
– Automatic specialization [TOCS’01]
– Sun Remote Procedure Call [ICDCS’98]
– Incremental Checkpointing [DSN’00]

Tempo in One Slide!

Program specializer: Tempo

Specialization declarations

Program analyses
– Context/flow/return sensitive binding-time analysis
– Evaluation-time analysis
– Action analysis

Program transformations
– Compile time
– Run time
– Data specialization

Languages: C, C++, Java

Software architectures

Components:
– IPC
– RPC
– Signals
– TCP/IP

Gains: time (and/or) space

[…SOSP’95, POPL’96, PLDI’99, ECOOP’99, ASE’00, CD’02, EMSOFT’04, LCN’04…]
[…HOSC’99, HOSC’00, TOCS’01, TOPLAS’03, HOSC’04, SCP’04…]
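Data specialization, listed above among the program transformations, can be pictured with a small hand-written sketch. This is not Tempo output: the function `poly_generic`, the cache layout, and the loader/reader names are all hypothetical. The idea shown is only that computations depending on static inputs are run once by a loader and stored, and that a reader then performs the remaining dynamic work.

```c
#include <assert.h>
#include <stddef.h>

/* Generic code: s and o are assumed static (known early), d is dynamic. */
int poly_generic(int s, int o, int d)
{
    return (s * s + o) * d + s * o;
}

/* Cache holding the results of the static sub-computations. */
struct poly_cache {
    int mul;   /* s * s + o */
    int add;   /* s * o     */
};

/* Loader: run once, when the static inputs become known. */
void poly_load(struct poly_cache *c, int s, int o)
{
    assert(c != NULL);
    c->mul = s * s + o;
    c->add = s * o;
}

/* Reader: run for every dynamic input; only dynamic work remains. */
int poly_read(const struct poly_cache *c, int d)
{
    assert(c != NULL);
    return c->mul * d + c->add;
}
```

Unlike code specialization, no new code is generated here: the same reader serves every set of static values, and only the cache contents change.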

Anatomy of Tempo

[Diagram] C program + analysis context → Preprocessing, then either:
– Compile-time specializer + specialization context → Postprocessing → Specialized source
– Run-time specializer generator → Run-time specializer + specialization context → Specialized binary

Integrating Program Specialization in the Software Development Process

Part I

Context

Customizable component ↔ Family of problems

Component + Customization information → Customized component
– Component: customizable for a family of problems
– Customization information: description of a given problem (a usage context)

Component customization?

Software Component Customization

Functional customization
– restriction of the behavior
– ex: JavaBeans

Code customization
– restriction of the behavior
– reduction of the genericity
– smaller & faster code

Code customization

Scope:
– often program fragments

How:
– ifdef compiler directives: code hard to read, hard to maintain, complex
– C++ templates: cumbersome, error-prone, difficult to use, no debugger

Limitation:
– the willingness of the programmer to encode the transformations
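A minimal sketch of the ifdef style of code customization, applied to an integer power function. The macro name `FIXED_EXPON` and the two function names are assumptions made for this illustration. Note how the programmer has to spell out each customized variant by hand, which is the limitation the slide points at.

```c
#include <assert.h>

/* Hypothetical build-time switch, e.g. set with -DFIXED_EXPON=2.
 * A default is assumed here so that the sketch compiles on its own. */
#ifndef FIXED_EXPON
#define FIXED_EXPON 3
#endif

/* Generic version: works for any non-negative exponent. */
int power_generic(int base, int expon)
{
    int accum = 1;
    assert(expon >= 0);
    while (expon > 0) {
        accum *= base;
        expon--;
    }
    return accum;
}

/* Customized version: each variant is encoded manually with directives. */
int power_custom(int base)
{
#if FIXED_EXPON == 0
    (void)base;
    return 1;
#elif FIXED_EXPON == 1
    return base;
#elif FIXED_EXPON == 2
    return base * base;
#elif FIXED_EXPON == 3
    return base * base * base;
#else
    /* No hand-written variant: fall back to the generic code. */
    return power_generic(base, FIXED_EXPON);
#endif
}
```

Every exponent the programmer did not anticipate silently falls back to the generic loop, so nothing checks that the intended customization actually happened.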

Example: power

int power(int base, int expon) {
  int accum = 1;
  while (expon > 0) {
    accum *= base;
    expon--;
  }
  return accum;
}

Example: power

Functional customization

int power(int base, int expon) {
  int accum = 1;
  while (expon > 0) {
    accum *= base;
    expon--;
  }
  return accum;
}

With expon fixed to 3 (schematic notation, not valid C: the argument is fixed, the body is unchanged):

int power(int base, 3) {
  int accum = 1;
  while (expon > 0) {
    accum *= base;
    expon--;
  }
  return accum;
}
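In plain C, functional customization of power can be approximated by a wrapper that fixes the exponent. The wrapper name `cube` is an assumption for this sketch. The behavior is restricted to one exponent, but the generic loop still runs on every call, so the code is neither smaller nor faster.

```c
#include <assert.h>

/* Generic component. */
int power(int base, int expon)
{
    int accum = 1;
    assert(expon >= 0);
    while (expon > 0) {
        accum *= base;
        expon--;
    }
    return accum;
}

/* Functional customization: the exponent is fixed at the interface,
 * but the generic code is executed unchanged. */
int cube(int base)
{
    return power(base, 3);
}
```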

Example: power

Code customization

int power(int base, int expon) {
  int accum = 1;
  while (expon > 0) {
    accum *= base;
    expon--;
  }
  return accum;
}

expon = 3

int power(int base) { return base*base*base; }
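The step from the generic loop to `base*base*base` can be traced by hand. The sketch below uses hypothetical names (`power_3_unrolled`, `power_3`) and shows two stages: first the loop is unrolled because `expon` is known to be 3, then the multiplications by the initial value 1 are simplified away.

```c
#include <assert.h>

/* Generic code. */
int power(int base, int expon)
{
    int accum = 1;
    assert(expon >= 0);
    while (expon > 0) {
        accum *= base;
        expon--;
    }
    return accum;
}

/* Stage 1: expon == 3 is known, so the loop test and the decrement
 * are evaluated ahead of time and the loop is unrolled three times. */
int power_3_unrolled(int base)
{
    int accum = 1;
    accum *= base;   /* iteration with expon == 3 */
    accum *= base;   /* iteration with expon == 2 */
    accum *= base;   /* iteration with expon == 1 */
    return accum;
}

/* Stage 2: the constant 1 is folded into the product. */
int power_3(int base)
{
    return base * base * base;
}
```

All three functions compute the same value for every base; only the amount of work left at run time differs.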

Example: power

template<int expon>
inline int power(const int& base)
{ return power<expon-1>(base) * base; }

template<>
inline int power<1>(const int& base)
{ return base; }

template<>
inline int power<0>(const int& base)
{ return 1; }

– Poor readability
– Semantic transformation

expon = 3

power<3>(base)
→ power<2>(base) * base
→ power<1>(base) * base * base
→ base * base * base

Target: int power(int base) { return base*base*base; }

Do we really get what we wanted?

To Summarize

Component + Customization information → Compiler → Customized component

– Component: source code with hardcoded transformations (what to customize & how)
– Customization information: values
– Customized component: no real guarantee

What We Would Like

Source code + What to customize → Automatic generation of hardcoded transformations → Code with annotated transformations

Code with annotated transformations + Customization information (values) → Compiler → Customized component

– Customized component: guaranteed
– Roles: Developer, User

Our Approach

Component developer: Source code + Customization scenarios → Customizable component

Component user: Customizable component + Customization values → Customized component

Component Developer

Source code + Customization scenarios → Customizable component

Step 1: Developing customizable components

Declaration language: WHAT are the customization parameters
• global variables
• function parameters
• data structure fields
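The three kinds of customization parameters listed above can each be seen in a small C fragment. Everything in this sketch is hypothetical (`checksum_enabled`, `struct channel`, `sum_blocks`); it only illustrates where a global variable, a function parameter, and a data structure field may act as customization parameters, and what a residual function could look like once their values are fixed.

```c
#include <assert.h>
#include <stddef.h>

/* Kind 1: a global variable used as a customization parameter. */
int checksum_enabled = 1;

struct channel {
    int block_size;    /* Kind 3: a data structure field, fixed per usage */
    const int *data;   /* ordinary field, varies from call to call */
};

/* Kind 2: the function parameter `stride` is a customization parameter. */
int sum_blocks(const struct channel *ch, int stride)
{
    int total = 0;
    assert(ch != NULL && stride > 0);
    for (int i = 0; i < ch->block_size; i += stride)
        total += ch->data[i];
    if (checksum_enabled)
        total &= 0xFF;
    return total;
}

/* What a customized version could look like for
 * checksum_enabled = 1, block_size = 4, stride = 2:
 * the loop and the test on the global have disappeared. */
int sum_blocks_spe(const struct channel *ch)
{
    assert(ch != NULL);
    return (ch->data[0] + ch->data[2]) & 0xFF;
}
```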

An example of component: Forward Error Correction Encoders

• Prevent losses and errors
• Transmit redundant information

family of problems
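The simplest member of this family is a parity-bit encoder: one redundant bit is transmitted so that a single flipped bit can be detected. In the sketch below, `parity_bit` follows the function of the same name in the deck's FEC source code, while the receiver-side check `parity_ok` is a hypothetical addition for illustration.

```c
#include <assert.h>

/* Encoder: copies l data bits and appends their XOR as a parity bit. */
void parity_bit(int l, int *vector, int *result)
{
    int res = 0;
    int i;
    assert(l >= 0);
    for (i = 0; i < l; i++) {
        res ^= vector[i];
        result[i] = vector[i];
    }
    result[l] = res;
}

/* Hypothetical receiver-side check: the XOR of all l + 1 bits of a
 * valid codeword is 0; a single flipped bit makes it 1. */
int parity_ok(int l, const int *codeword)
{
    int res = 0;
    int i;
    for (i = 0; i <= l; i++)
        res ^= codeword[i];
    return res == 0;
}
```

A parity bit detects any odd number of flipped bits but cannot correct them; the other encoders of the family trade more redundancy for stronger guarantees.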

Encoder Features

FEC Component Hierarchy

encode
– parity_bit
– lbc_sys, lbc (both built on mult_mat)
– code_conv
– crc
– libstring

Remember power

source code

int power(int base, int expon)
{
  int accum = 1;
  while (expon > 0) {
    accum *= base;
    expon--;
  }
  return accum;
}

declaration module

Module power {
  Defines {
    From power.c {
      Btp :: intern power(D(int) b, S(int) e);
    };
  }
  Exports { Btp; }
}

– S(...): customization parameters
– D(...): other parameters
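To make the S/D declaration concrete, here is a toy generator for the power example: given a value for the static exponent (S), it emits the source text of a residual function in which only the dynamic base (D) remains. This is a hand-written sketch, not how Tempo works internally; the names `gen_power` and `append` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append s to out (capacity size, current length *len).
 * Returns -1 if the result, including the terminator, would not fit. */
static int append(char *out, size_t size, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n >= size)
        return -1;
    memcpy(out + *len, s, n + 1);
    *len += n;
    return 0;
}

/* Emit C source for power specialized to the static exponent e.
 * Returns the length of the generated text, or -1 on overflow. */
int gen_power(char *out, size_t size, int e)
{
    char head[48];
    size_t len = 0;
    int i;

    assert(out != NULL && e >= 0);
    snprintf(head, sizeof head, "int power_%d(int base) { return ", e);
    if (append(out, size, &len, head) < 0)
        return -1;
    if (e == 0 && append(out, size, &len, "1") < 0)
        return -1;
    /* The loop over the static exponent runs now, at generation time. */
    for (i = 0; i < e; i++) {
        if (i > 0 && append(out, size, &len, " * ") < 0)
            return -1;
        if (append(out, size, &len, "base") < 0)
            return -1;
    }
    if (append(out, size, &len, "; }") < 0)
        return -1;
    return (int)len;
}
```

Calling `gen_power(buf, sizeof buf, 3)` fills `buf` with the text of a function equivalent to the customized power shown earlier in the deck.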

Remember power

Component: compute
– Define: compute
– needs to perform power calculation

Component: power
– Define: power

Remember power

Module: power

Module power {
  Defines {
    From power.c {
      Btp :: intern power(D(int), S(int));
    };
  }
  Exports { Btp; }
}

Module: compute

Module compute {
  Imports {
    From power.mdl { Btp; }
  }
  Defines {
    From compute.c {
      Btcomp :: intern compute(D(int), S(int), S(int)) needs { Btp; }
    };
  }
  Exports { Btcomp; }
}

Customization contract
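The slides do not show the body of compute, so the sketch below assumes one: a hypothetical `compute(x, scale, expon)` that scales the result of power. It illustrates what the `needs { Btp; }` clause expresses: compute can only be customized for its two static parameters if power can be customized for a static exponent.

```c
#include <assert.h>

/* power.c: base is dynamic (D), expon is static (S). */
int power(int base, int expon)
{
    int accum = 1;
    assert(expon >= 0);
    while (expon > 0) {
        accum *= base;
        expon--;
    }
    return accum;
}

/* compute.c (assumed body): x is dynamic (D), scale and expon are
 * static (S). The static expon is passed on to power. */
int compute(int x, int scale, int expon)
{
    return scale * power(x, expon);
}

/* A possible residual function for scale = 2, expon = 3:
 * the call to power has been specialized and inlined. */
int compute_2_3(int x)
{
    return 2 * (x * x * x);
}
```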

Hierarchies of Sources & Modules

encode
– parity_bit
– lbc_sys, lbc (both built on mult_mat)
– code_conv
– crc
– libstring

Developing Customizable Components

Component Developer

Source code + Customization scenarios → Customizable component

Step 1: Developing customizable components

Step 2: Making customizable components

Making Customizable Components

– Determine HOW
– Verify that declared customization is possible
– Guaranteed

Component User

Customizable component + Customization values → Customized component

Step 3: Making customized components

Generating Customized Components

– Ready to be integrated

#include "encode.h"
#include "parity_bit.h"
#include "lin_b_code.h"
#include "lin_b_code_sys.h"
#include "code_conv.h"
#include "encode_crc.h"
#include "libstring.h"

/* Check the buffer lengths, then dispatch on the code type. */
void
encode(struct data *donnees, struct code *codage, struct data *result)
{
  if (donnees->longueur != codage->k) {
    perror("donnees->longueur != codage->k");
    exit(1);
  } else if (strcmp(codage->type, "parity_bit") == 0)
    if (result->longueur != codage->k + 1) {
      perror("result->longueur != codage->k + 1");
      exit(1);
    } else
      parity_bit(codage->k, donnees->v, result->v);
  else if (strcmp(codage->type, "lin_b_code") == 0)
    if ((*result).longueur != (*codage).n) {
      perror("result->longueur != codage->n");
      exit(1);
    } else if (strcmp(codage->opt, "syst") == 0)
      lin_b_code_sys(donnees->v, codage->k, codage->n, codage->matrix, result->v);
    else
      lin_b_code(donnees->v, codage->k, codage->n, codage->matrix, result->v);
  else if (strcmp(codage->type, "code_conv") == 0)
    if ((*result).longueur != (2 * (*codage).k)) {
      perror("result->longueur != (2 * codage->k)");
      exit(1);
    } else
      code_conv(donnees->v, codage->k, result->v);
  else if (strcmp(codage->type, "crc") == 0)
    if ((*result).longueur != (*codage).n) {
      perror("result->longueur != codage->n");
      exit(1);
    } else
      encode_crc(donnees->v, codage->k, codage->n, codage->matrix[0], result->v);
  else {
    perror("unknown code type");
    exit(1);
  }
}

#include "parity_bit.h"

/* Copy the l input bits and append their XOR. */
void
parity_bit(int l, int *vector, int *result)
{
  int res;
  int i;
  res = 0;
  for (i = 0; i < l; i++) {
    res ^= vector[i];
    result[i] = vector[i];
  }
  result[l] = res;
}

#include "mult_mat.h"

/* Multiply a k-bit vector by a k x n binary matrix; write n bits at result[indice]. */
void
mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice)
{
  int temp;
  int i, j;

  for (i = 0; i < n; i++) {
    temp = 0;
    for (j = 0; j < k; j++)
      temp ^= vector[j] & matrix[j][i];
    result[i + indice] = temp;
  }
}

#include "lin_b_code_sys.h"
#include "mult_mat.h"

/* Systematic linear block code: k data bits followed by n - k check bits. */
void
lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result)
{
  int sub_matrix_size;
  int i;

  sub_matrix_size = n - k;
  mult_mat(vector, k, sub_matrix_size, matrix, result, k);
  for (i = 0; i < k; i++)
    result[i] = vector[i];
}

#include "lin_b_code.h"
#include "mult_mat.h"

/* Non-systematic linear block code. */
void
lin_b_code(int *vector, int k, int n, int **matrix, int *result)
{
  mult_mat(vector, k, n, matrix, result, 0);
}

#include "code_conv.h"

/* Convolutional code producing 2 * k output bits. */
void
code_conv(int *donnees, int k, int *result)
{
  int temp;
  int i;
  for (i = 0; i < k - 2; i++) {
    temp = donnees[i] ^ donnees[i + 2];
    result[2 * i + 1] = temp;
    temp ^= donnees[i + 1];
    result[2 * i] = temp;
  }
  temp = donnees[k - 2] ^ donnees[k - 1];
  result[2 * k - 4] = temp;
  result[2 * k - 3] = donnees[k - 2];
  result[2 * k - 2] = donnees[k - 1];
  result[2 * k - 1] = donnees[k - 1];
}

#include "encode_crc.h"

/* CRC encoder driven by the generator polynomial poly. */
void
encode_crc(int *donnees, int k, int n, int *poly, int *result)
{
  int l_poly;
  int i, j;

  l_poly = n - k + 1;
  for (i = 0; i < k; i++)
    result[i] = donnees[i];
  for (i = 0; i < k; i++)
    if (result[i] == 1)
      for (j = 0; j < l_poly; j++)
        result[i + j] ^= poly[j];
  for (i = 0; i < k; i++)
    result[i] = donnees[i];
}

extern int strcmp(char *s1, char *s2);

struct data {
  int longueur;
  int *v;
};

struct code {
  int k;
  int n;
  int **matrix;
  char *type;
  char *opt;
};

extern void encode(struct data *donnees, struct code *codage, struct data *result);

extern void parity_bit(int l, int *vector, int *result);

extern void lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result);

extern void code_conv(int *donnees, int l, int *result);

extern void lin_b_code(int *vector, int k, int n, int **matrix, int *result);

extern void mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice);

extern void encode_crc(int *donnees, int k, int n, int *poly, int *result);

INRIA - LaBRIPhoenix Group Oct-0583

#include "encode.h"#include "parity_bit.h"#include "lin_b_code.h"#include "lin_b_code_sys.h"#include "code_conv.h"#include "encode_crc.h"#include "libstring.h"

voidencode(struct data *donnees, struct code *codage, struct data *result){ if (donnees->longueur != codage->k) { perror("donnees->longueur != codage->k"); exit(1);} else if (strcmp(codage->type, "parity_bit") == 0) if (result->longueur != codage->k + 1) { perror("result->longueur != codage->k + 1");

exit(1);} else parity_bit(codage->k, donnees->v, result->v); else if (strcmp(codage->type, "lin_b_code") == 0) if ((*result).longueur != (*codage).n)

{ perror ("result->longueur != codage->n"); exit (1);}

else if (strcmp(codage->opt, "syst") == 0) lin_b_code_sys(donnees->v, codage->k, codage->n, codage->matrix, result->v); else lin_b_code(donnees->v, codage->k, codage->n, codage->matrix, result->v);

else if (strcmp(codage->type, "code_conv") == 0) if ((*result).longueur != (2 * (*codage).k))

{ perror ("result->longueur != (2 * codage->k)"); exit (1);}

else code_conv(donnees->v, codage->k, result->v); else if (strcmp(codage->type, "crc") == 0) if ((*result).longueur != (*codage).n )

{ perror ("result->longueur != codage->n"); exit (1); }

else encode_crc(donnees->v, codage->k, codage->n, codage->matrix[0], result->v); else { perror("unknown code type"); exit(1); }}

#include "parity_bit.h"voidparity_bit(int l, int *vector, int *result){ int res; int i; res = 0; for(i = 0; i < l; i++) { res ^= vector[i]; result[i] = vector[i]; } result[l] = res;}

#include "mult_mat.h"voidmult_mat(int *vector, int k, int n, int **matrix, int *result, int indice){ int temp; int i, j;

for(i = 0; i < n; i++) { temp = 0;

for(j = 0; j < k; j++)temp ^= vector[j] & matrix[j][i];

result[i + indice] = temp; }}

#include "lin_b_code_sys.h"#include "mult_mat.h"voidlin_b_code_sys(int *vector, int k, int n, int **matrix, int *result){ int sub_matrix_size; int i;

sub_matrix_size = n - k; mult_mat(vector, k, sub_matrix_size, matrix, result, k); for(i = 0; i < k; i++) result[i] = vector[i];}

#include "lin_b_code.h"#include "mult_mat.h"voidlin_b_code(int *vector, int k, int n, int **matrix, int *result){ mult_mat(vector, k, n, matrix, result, 0);}

#include "code_conv.h"voidcode_conv(int *donnees, int k, int *result){ int temp; int i; for (i = 0; i < k - 2; i++) { temp = donnees[i] ^ donnees[i + 2]; result[2 * i + 1] = temp; temp ^= donnees[i + 1]; result[2 * i] = temp; } temp = donnees[k - 2] ^ donnees[k - 1]; result[2 * k - 4] = temp; result[2 * k - 3] = donnees[k - 2]; result[2 * k - 2] = donnees[k - 1]; result[2 * k - 1] = donnees[k - 1];}

#include "encode_crc.h"voidencode_crc(int *donnees, int k, int n, int *poly, int *result){ int l_poly; int i, j;

l_poly = n - k + 1; for (i = 0; i < k; i++) result[i] = donnees[i]; for (i = 0; i < k; i++) if (result[i] == 1) for (j = 0; j < l_poly; j++)

result[i + j] ^= poly[j]; for (i = 0; i < k; i++) result[i] = donnees[i];}

extern int strcmp(char *s1, char *s2);

struct data{ int longueur; int *v;};

struct code{ int k; int n; int **matrix; char *type; char *opt;};

extern void encode(struct data *donnees, struct code *codage, struct data *result);

extern void parity_bit(int l, int *vector, int *result);

extern void lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result);

extern void code_conv(int *donnees, int l, int *result);

extern void lin_b_code(int *vector, int k, int n, int **matrix, int *result);

extern void mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice);

extern void encode_crc(int *donnees, int k, int n, int *poly, int *result);

length inlength outtype generatoropt

INRIA - LaBRIPhoenix Group Oct-0584

#include "encode.h"#include "parity_bit.h"#include "lin_b_code.h"#include "lin_b_code_sys.h"#include "code_conv.h"#include "encode_crc.h"#include "libstring.h"

voidencode(struct data *donnees, struct code *codage, struct data *result){ if (donnees->longueur != codage->k) { perror("donnees->longueur != codage->k"); exit(1);} else if (strcmp(codage->type, "parity_bit") == 0) if (result->longueur != codage->k + 1) { perror("result->longueur != codage->k + 1");

exit(1);} else parity_bit(codage->k, donnees->v, result->v); else if (strcmp(codage->type, "lin_b_code") == 0) if ((*result).longueur != (*codage).n)

{ perror ("result->longueur != codage->n"); exit (1);}

else if (strcmp(codage->opt, "syst") == 0) lin_b_code_sys(donnees->v, codage->k, codage->n, codage->matrix, result->v); else lin_b_code(donnees->v, codage->k, codage->n, codage->matrix, result->v);

else if (strcmp(codage->type, "code_conv") == 0) if ((*result).longueur != (2 * (*codage).k))

{ perror ("result->longueur != (2 * codage->k)"); exit (1);}

else code_conv(donnees->v, codage->k, result->v); else if (strcmp(codage->type, "crc") == 0) if ((*result).longueur != (*codage).n )

{ perror ("result->longueur != codage->n"); exit (1); }

else encode_crc(donnees->v, codage->k, codage->n, codage->matrix[0], result->v); else { perror("unknown code type"); exit(1); }}

#include "parity_bit.h"voidparity_bit(int l, int *vector, int *result){ int res; int i; res = 0; for(i = 0; i < l; i++) { res ^= vector[i]; result[i] = vector[i]; } result[l] = res;}

#include "mult_mat.h"voidmult_mat(int *vector, int k, int n, int **matrix, int *result, int indice){ int temp; int i, j;

for(i = 0; i < n; i++) { temp = 0;

for(j = 0; j < k; j++)temp ^= vector[j] & matrix[j][i];

result[i + indice] = temp; }}

#include "lin_b_code_sys.h"#include "mult_mat.h"voidlin_b_code_sys(int *vector, int k, int n, int **matrix, int *result){ int sub_matrix_size; int i;

sub_matrix_size = n - k; mult_mat(vector, k, sub_matrix_size, matrix, result, k); for(i = 0; i < k; i++) result[i] = vector[i];}

#include "lin_b_code.h"#include "mult_mat.h"voidlin_b_code(int *vector, int k, int n, int **matrix, int *result){ mult_mat(vector, k, n, matrix, result, 0);}

#include "code_conv.h"voidcode_conv(int *donnees, int k, int *result){ int temp; int i; for (i = 0; i < k - 2; i++) { temp = donnees[i] ^ donnees[i + 2]; result[2 * i + 1] = temp; temp ^= donnees[i + 1]; result[2 * i] = temp; } temp = donnees[k - 2] ^ donnees[k - 1]; result[2 * k - 4] = temp; result[2 * k - 3] = donnees[k - 2]; result[2 * k - 2] = donnees[k - 1]; result[2 * k - 1] = donnees[k - 1];}

#include "encode_crc.h"voidencode_crc(int *donnees, int k, int n, int *poly, int *result){ int l_poly; int i, j;

l_poly = n - k + 1; for (i = 0; i < k; i++) result[i] = donnees[i]; for (i = 0; i < k; i++) if (result[i] == 1) for (j = 0; j < l_poly; j++)

result[i + j] ^= poly[j]; for (i = 0; i < k; i++) result[i] = donnees[i];}

extern int strcmp(char *s1, char *s2);

struct data{ int longueur; int *v;};

struct code{ int k; int n; int **matrix; char *type; char *opt;};

extern void encode(struct data *donnees, struct code *codage, struct data *result);

extern void parity_bit(int l, int *vector, int *result);

extern void lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result);

extern void code_conv(int *donnees, int l, int *result);

extern void lin_b_code(int *vector, int k, int n, int **matrix, int *result);

extern void mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice);

extern void encode_crc(int *donnees, int k, int n, int *poly, int *result);

length inlength outtype generatoropt

INRIA - LaBRIPhoenix Group Oct-0585

#include "encode.h"#include "parity_bit.h"#include "lin_b_code.h"#include "lin_b_code_sys.h"#include "code_conv.h"#include "encode_crc.h"#include "libstring.h"

voidencode(struct data *donnees, struct code *codage, struct data *result){ if (donnees->longueur != codage->k) { perror("donnees->longueur != codage->k"); exit(1);} else if (strcmp(codage->type, "parity_bit") == 0) if (result->longueur != codage->k + 1) { perror("result->longueur != codage->k + 1");

exit(1);} else parity_bit(codage->k, donnees->v, result->v); else if (strcmp(codage->type, "lin_b_code") == 0) if ((*result).longueur != (*codage).n)

{ perror ("result->longueur != codage->n"); exit (1);}

else if (strcmp(codage->opt, "syst") == 0) lin_b_code_sys(donnees->v, codage->k, codage->n, codage->matrix, result->v); else lin_b_code(donnees->v, codage->k, codage->n, codage->matrix, result->v);

else if (strcmp(codage->type, "code_conv") == 0) if ((*result).longueur != (2 * (*codage).k))

{ perror ("result->longueur != (2 * codage->k)"); exit (1);}

else code_conv(donnees->v, codage->k, result->v); else if (strcmp(codage->type, "crc") == 0) if ((*result).longueur != (*codage).n )

{ perror ("result->longueur != codage->n"); exit (1); }

else encode_crc(donnees->v, codage->k, codage->n, codage->matrix[0], result->v); else { perror("unknown code type"); exit(1); }}

#include "parity_bit.h"

void
parity_bit(int l, int *vector, int *result)
{
    int res;
    int i;

    res = 0;
    for (i = 0; i < l; i++) {
        res ^= vector[i];
        result[i] = vector[i];
    }
    result[l] = res;
}

#include "mult_mat.h"

void
mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice)
{
    int temp;
    int i, j;

    for (i = 0; i < n; i++) {
        temp = 0;
        for (j = 0; j < k; j++)
            temp ^= vector[j] & matrix[j][i];
        result[i + indice] = temp;
    }
}

#include "lin_b_code_sys.h"
#include "mult_mat.h"

void
lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result)
{
    int sub_matrix_size;
    int i;

    sub_matrix_size = n - k;
    mult_mat(vector, k, sub_matrix_size, matrix, result, k);
    for (i = 0; i < k; i++)
        result[i] = vector[i];
}

#include "lin_b_code.h"
#include "mult_mat.h"

void
lin_b_code(int *vector, int k, int n, int **matrix, int *result)
{
    mult_mat(vector, k, n, matrix, result, 0);
}

#include "code_conv.h"

void
code_conv(int *donnees, int k, int *result)
{
    int temp;
    int i;

    for (i = 0; i < k - 2; i++) {
        temp = donnees[i] ^ donnees[i + 2];
        result[2 * i + 1] = temp;
        temp ^= donnees[i + 1];
        result[2 * i] = temp;
    }
    temp = donnees[k - 2] ^ donnees[k - 1];
    result[2 * k - 4] = temp;
    result[2 * k - 3] = donnees[k - 2];
    result[2 * k - 2] = donnees[k - 1];
    result[2 * k - 1] = donnees[k - 1];
}

#include "encode_crc.h"

void
encode_crc(int *donnees, int k, int n, int *poly, int *result)
{
    int l_poly;
    int i, j;

    l_poly = n - k + 1;
    for (i = 0; i < k; i++)
        result[i] = donnees[i];
    for (i = 0; i < k; i++)
        if (result[i] == 1)
            for (j = 0; j < l_poly; j++)
                result[i + j] ^= poly[j];
    for (i = 0; i < k; i++)
        result[i] = donnees[i];
}

extern int strcmp(char *s1, char *s2);

struct data {
    int longueur;
    int *v;
};

struct code {
    int k;
    int n;
    int **matrix;
    char *type;
    char *opt;
};

extern void encode(struct data *donnees, struct code *codage, struct data *result);

extern void parity_bit(int l, int *vector, int *result);

extern void lin_b_code_sys(int *vector, int k, int n, int **matrix, int *result);

extern void code_conv(int *donnees, int l, int *result);

extern void lin_b_code(int *vector, int k, int n, int **matrix, int *result);

extern void mult_mat(int *vector, int k, int n, int **matrix, int *result, int indice);

extern void encode_crc(int *donnees, int k, int n, int *poly, int *result);

length in = 4
length out = 7
type = "lin_b_code"
matrix = {{1, 1, 1}, {1, 1, 0}, {1, 0, 0}, {0, 1, 1}}
opt = "syst"

struct data { int longueur; int *v; };
struct code { int k; int n; int **matrix; char *type; char *opt; };
extern int perror();
extern int exit();
extern void encode_spe(struct data *, struct data *);

extern void encode_spe/*0*/(struct data *donnees, struct data *result)
{
    {
        int *lin_b_code_sys_0_result;
        int *lin_b_code_sys_0_vector;

        lin_b_code_sys_0_vector = (*donnees).v;
        lin_b_code_sys_0_result = (*result).v;
        {
            int *mult_mat_1_result;
            int *mult_mat_1_vector;
            int mult_mat_1_temp;

            mult_mat_1_vector = lin_b_code_sys_0_vector;
            mult_mat_1_result = lin_b_code_sys_0_result;
            mult_mat_1_temp =
                (((0 ^ (unsigned int)((int)((unsigned int)mult_mat_1_vector[0] & 1))) ^
                  (unsigned int)((int)((unsigned int)mult_mat_1_vector[1] & 1))) ^
                 (unsigned int)((int)((unsigned int)mult_mat_1_vector[2] & 1))) ^
                (unsigned int)((int)((unsigned int)mult_mat_1_vector[3] & 0));
            mult_mat_1_result[4] = mult_mat_1_temp;
            mult_mat_1_temp =
                (((0 ^ (unsigned int)((int)((unsigned int)mult_mat_1_vector[0] & 1))) ^
                  (unsigned int)((int)((unsigned int)mult_mat_1_vector[1] & 1))) ^
                 (unsigned int)((int)((unsigned int)mult_mat_1_vector[2] & 0))) ^
                (unsigned int)((int)((unsigned int)mult_mat_1_vector[3] & 1));
            mult_mat_1_result[5] = mult_mat_1_temp;
            mult_mat_1_temp =
                (((0 ^ (unsigned int)((int)((unsigned int)mult_mat_1_vector[0] & 1))) ^
                  (unsigned int)((int)((unsigned int)mult_mat_1_vector[1] & 0))) ^
                 (unsigned int)((int)((unsigned int)mult_mat_1_vector[2] & 0))) ^
                (unsigned int)((int)((unsigned int)mult_mat_1_vector[3] & 1));
            mult_mat_1_result[6] = mult_mat_1_temp;
        }
        lin_b_code_sys_0_result[0] = lin_b_code_sys_0_vector[0];
        lin_b_code_sys_0_result[1] = lin_b_code_sys_0_vector[1];
        lin_b_code_sys_0_result[2] = lin_b_code_sys_0_vector[2];
        lin_b_code_sys_0_result[3] = lin_b_code_sys_0_vector[3];
    }
    return;
}

length in = 4
length out = 7
type = "lin_b_code"
matrix = {{1, 1, 1}, {1, 1, 0}, {1, 0, 0}, {0, 1, 1}}
opt = "syst"
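As a hedged cross-check of the specialized encoder, the sketch below places the generic systematic path next to a hand-simplified equivalent of the residual code and compares them on all 16 four-bit inputs. The names `encode_generic`, `encode_specialized`, and `encoders_agree` are illustrative and do not appear in the original program; the matrix is the one from the customization context, and the simplified function is not Tempo output.

```c
/* Generator matrix from the customization context: k = 4 rows, n - k = 3 columns. */
static int row0[] = {1, 1, 1};
static int row1[] = {1, 1, 0};
static int row2[] = {1, 0, 0};
static int row3[] = {0, 1, 1};
static int *matrix[] = {row0, row1, row2, row3};

/* Generic code as on the slide: result[i + indice] = XOR over j of vector[j] & m[j][i]. */
static void mult_mat(int *vector, int k, int n, int **m, int *result, int indice)
{
    int i, j, temp;

    for (i = 0; i < n; i++) {
        temp = 0;
        for (j = 0; j < k; j++)
            temp ^= vector[j] & m[j][i];
        result[i + indice] = temp;
    }
}

/* Generic systematic encoder with k = 4 and n = 7 passed as ordinary run-time values. */
void encode_generic(int *vector, int *result)
{
    int k = 4, n = 7, i;

    mult_mat(vector, k, n - k, matrix, result, k);
    for (i = 0; i < k; i++)
        result[i] = vector[i];
}

/* Hand-simplified equivalent of the specialized code: loops unrolled, matrix folded in. */
void encode_specialized(int *v, int *r)
{
    r[4] = (v[0] & 1) ^ (v[1] & 1) ^ (v[2] & 1);   /* matrix column 0: 1,1,1,0 */
    r[5] = (v[0] & 1) ^ (v[1] & 1) ^ (v[3] & 1);   /* matrix column 1: 1,1,0,1 */
    r[6] = (v[0] & 1) ^ (v[3] & 1);                /* matrix column 2: 1,0,0,1 */
    r[0] = v[0];
    r[1] = v[1];
    r[2] = v[2];
    r[3] = v[3];
}

/* Returns 1 if both encoders agree on all 16 four-bit inputs, 0 otherwise. */
int encoders_agree(void)
{
    int x, i;

    for (x = 0; x < 16; x++) {
        int v[4], a[7], b[7];

        for (i = 0; i < 4; i++)
            v[i] = (x >> i) & 1;
        encode_generic(v, a);
        encode_specialized(v, b);
        for (i = 0; i < 7; i++)
            if (a[i] != b[i])
                return 0;
    }
    return 1;
}
```

Calling `encoders_agree()` exercises both paths; the specialized version has no loops, no matrix accesses, and no string comparisons left.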

Benchmarks

[Figure: benchmark chart comparing the handwritten, generic, and customized encoders]

Conclusion

Customizable component based on a declarative approach
– no hardcoded transformations
– module aside from the code
– customization properties

Environment
– visualization tools
– compiler for modules: code transformation, verification

Automatic generator of customized component

Applications: XDR library, packet filtering, FFT, system interrupts…

Part II

Specialization of Protocol Stacks in OS Kernels

Introduction

High-speed networks require efficient end-system support

Efficient end-system support = efficient applications (clients, servers) + efficient protocol stacks

For high throughput and low service latency, we need efficient applications and protocol stacks

Overview

Some data points
Application vs Protocol Stack optimization
Program Specialization
Specialization of the protocol stack
Performance Evaluation
Conclusion

Some data points, to begin with

Physical transmission bound vs actual throughput between two Pentium II 700 MHz machines on a 100 Mbps network (MTU 1500):
– Max UDP throughput: 70 Mbps
– Max TCP throughput: 47 Mbps
– HTTP throughput at saturation: 28 Mbps

All limited by the end system

Application vs Protocol stack optimization

            Application                               Protocol Stack
GOAL        Exploit full b/w of protocol stack        Exploit full b/w of hardware
EXTENT      Limited to application, non-intrusive     Implemented in kernel, intrusive
DEPLOYED    Late                                      Early (With OS)
NATURE      Application-specific                      “Application-specific”

Application vs Protocol stack optimization

Application: Best left to the application developer

            Protocol Stack
GOAL        Exploit full b/w of hardware
EXTENT      Implemented in kernel, intrusive
DEPLOYED    Early (With OS)
NATURE      “Application-specific”

Application vs Protocol stack optimization

Application: Best left to the application developer

Protocol Stack (GOAL, EXTENT, DEPLOYED, NATURE): Need a systematic process to customize…

Program Specialization

GENERIC CODE + CONFIGURATION VALUES → Program Specializer (Tempo) → SPECIALIZED CODE

Program Specialization

int tcp_mini_sendmsg(struct sock *sk, void *msg, int size)
{
    int tocopy = 0, copied = 0;

    while ((tocopy = (size < sk->tcp->mss) ? size : sk->tcp->mss)) {
        if ((copied = free_space(sk->write_queue.prev.space))) {
            if (copied > tocopy)
                copied = tocopy;
            add_data(sk->write_queue.prev, msg, copied);
            size = size - copied;
            msg = msg + copied;
        } else {
            struct skbuff *skb = alloc_new_skb();
            add_data(skb, msg, tocopy);
            size = size - tocopy;
            msg = msg + tocopy;
            entail(sk->write_queue, skb);
        }
    }
    return size;
}

Program Specialization

Specialization context:

size=1400
sk={…}

Specialized code:

int tcp_mini_sendmsg(void *msg)
{
    struct skbuff *skb = alloc_new_skb();
    add_data(skb, msg, 1400);
    entail(sk->write_queue, skb);
    return 0;
}
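The effect of this specialization can be reproduced on a small user-space model. The sketch below is only an illustration under stated assumptions: `struct sock` and `struct skbuff` are simplified stand-ins rather than the kernel types, the queue is a fixed array, and the specialization context is taken to be size = 1400, mss = 1400, and a queue whose last buffer is full (or absent) with room for one more buffer. The names `mini_sendmsg` and `mini_sendmsg_1400` are hypothetical.

```c
#include <string.h>

#define MAX_SKBS 8      /* assumed queue capacity */
#define MAX_MSS  1500   /* assumed upper bound on the segment size */

struct skbuff { int len; char data[MAX_MSS]; };
struct sock   { int mss; int queued; struct skbuff queue[MAX_SKBS]; };

/* Free bytes in the last queued buffer, or 0 if the queue is empty. */
static int free_space(const struct sock *sk)
{
    if (sk->queued == 0)
        return 0;
    return sk->mss - sk->queue[sk->queued - 1].len;
}

/* Append n bytes of msg to skb. */
static void add_data(struct skbuff *skb, const char *msg, int n)
{
    memcpy(skb->data + skb->len, msg, (size_t)n);
    skb->len += n;
}

/* Generic path: coalesce into the last buffer when possible, else start a new one. */
int mini_sendmsg(struct sock *sk, const char *msg, int size)
{
    int tocopy, copied;

    while ((tocopy = (size < sk->mss) ? size : sk->mss) > 0) {
        if ((copied = free_space(sk)) > 0) {
            if (copied > tocopy)
                copied = tocopy;
            add_data(&sk->queue[sk->queued - 1], msg, copied);
            size -= copied;
            msg += copied;
        } else {
            struct skbuff *skb;

            if (sk->queued == MAX_SKBS)
                break;                     /* queue full: report leftover bytes */
            skb = &sk->queue[sk->queued++];
            skb->len = 0;
            add_data(skb, msg, tocopy);
            size -= tocopy;
            msg += tocopy;
        }
    }
    return size;
}

/* Residual code for the assumed context: the loop and both tests disappear. */
int mini_sendmsg_1400(struct sock *sk, const char *msg)
{
    struct skbuff *skb = &sk->queue[sk->queued++];

    skb->len = 0;
    add_data(skb, msg, 1400);
    return 0;
}
```

With mss equal to the message size, every buffer the generic path creates is full, so the context stays valid across repeated calls and both functions leave the queue in the same state.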

How specialization is introduced

Application

token = do_customize_send(…);
for (i = 0; i < 100000; i++) {
    customized_send(token, buffer);
}

OS

Spec. opportunities in protocol stacks: Eliminating Genericity

Eliminating lookups
Eliminating option interpretation
Eliminating routing decisions
Optimizing buffer allocation
Optimizing data fragmentation and coalescing

Eliminating lookups

Application: Socket descriptors
Kernel, sockets interface: Socket i-node
Kernel, protocol layer: Socket structures

Kernel: sockfd_lookup(sd)

Eliminating socket options

Application: BLOCKING, UNICAST, NAGLE
Kernel: BLOCKING? UNICAST? NAGLE?

sock_sendmsg(…, …, …, flags)
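To make the idea concrete, here is a minimal sketch of option interpretation being folded away once the flags are known. Everything in it is an assumption for illustration: the option bits, the `enum action` outcomes, and the functions `send_generic` and `send_specialized` are hypothetical and do not correspond to the kernel's actual flags or send path.

```c
/* Hypothetical option bits; the real kernel flags differ. */
enum { OPT_BLOCKING = 1 << 0, OPT_UNICAST = 1 << 1, OPT_NAGLE = 1 << 2 };

/* Hypothetical outcomes of one pass through the send path. */
enum action {
    ACT_SLEEP,           /* wait for queue space (blocking socket) */
    ACT_EAGAIN,          /* fail immediately (non-blocking socket) */
    ACT_DELAY,           /* hold a small segment back (Nagle) */
    ACT_SEND_UNICAST,
    ACT_SEND_MULTICAST
};

/* Generic path: every option is re-tested on every call. */
enum action send_generic(unsigned flags, int queue_full, int small_segment)
{
    if (queue_full)
        return (flags & OPT_BLOCKING) ? ACT_SLEEP : ACT_EAGAIN;
    if ((flags & OPT_NAGLE) && small_segment)
        return ACT_DELAY;
    return (flags & OPT_UNICAST) ? ACT_SEND_UNICAST : ACT_SEND_MULTICAST;
}

/* Residual path for flags == BLOCKING | UNICAST | NAGLE: only dynamic tests remain. */
enum action send_specialized(int queue_full, int small_segment)
{
    if (queue_full)
        return ACT_SLEEP;
    if (small_segment)
        return ACT_DELAY;
    return ACT_SEND_UNICAST;
}
```

The two tests that remain depend on run-time state (queue occupancy, segment size); the three option tests have been resolved at specialization time.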

Eliminating routing decisions

Kernel: System call, TCP/UDP layer, ip_route_output, NIC driver
ip_route_output: FIB, Route cache

connect (int sockfd, struct sockaddr *daddr, int addrlen)

Optimizing buffer allocation

Kernel

if (in_interrupt() && (gfp_mask & __GFP_WAIT)) {
    gfp_mask &= ~__GFP_WAIT;
    npages = (dlen + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
    skb->truesize += dlen;
    ((struct skb_shared_info *)skb->end)->nr_frags = npages;
    for (i = 0; i < npages; i++) {
        ...
    }
}

setsockopt(…, …, …, SO_FIXADU, &size, …)
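When the data unit size is fixed, the page-count computation becomes a compile-time constant. The sketch below illustrates that under assumptions: a 4 KiB page size, a fixed data unit of 1483 bytes (borrowed from the block_size value that appears elsewhere in the talk), and hypothetical names `npages_generic` and `npages_specialized`.

```c
/* Assumed page geometry: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Generic computation, as on the slide: round dlen up to whole pages. */
unsigned long npages_generic(unsigned long dlen)
{
    return (dlen + (SKETCH_PAGE_SIZE - 1)) >> SKETCH_PAGE_SHIFT;
}

/* Assumed fixed application data unit size. */
enum { FIXED_ADU = 1483 };

/* The same expression evaluated once, at compile time, for the fixed size. */
enum { NPAGES_FIXED = (FIXED_ADU + 4095) >> 12 };

/* Residual code: no arithmetic left at run time. */
unsigned long npages_specialized(void)
{
    return NPAGES_FIXED;
}
```

Once the page count is a known constant, a loop bounded by it can be unrolled by the specializer as well.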

Optimizing coalescing and fragmentation

Kernel

while (seglen > 0) {
    copy = MSS_now - last_skb_len;
    if (copy > 0) {
        if (copy > seglen)
            copy = seglen;
        push_into_previous(copy);
    } else {
        copy = min(seglen, MSS_now);
        push_into_current(copy);
    }
    seglen -= copy;
}

setsockopt(…, …, …, SO_FIXADU, &size, …)

Specifying the opportunities to the code specializer

Original C code:

struct sk_buff *sock_alloc_send_pskb(struct sock *sk,
                                     unsigned long header_len,
                                     unsigned long data_len,
                                     int noblock,
                                     int *errcode)
{
    ...
}

Specialization declarations:

Sock_alloc_send_pskb :: intern sock_alloc_send_pskb(
        Spec_sock(struct sock) S(*) sk,
        S(unsigned long) header_len,
        S(unsigned long) data_len,
        S(int) noblock,
        D(int *) errcode) {
    ...
};

Results: Improvements in Performance and Code Size

Execution time decreased by ~25%
Code size decreased by a factor of >15
Throughput improvements:
– UDP: PIII 13%, 486 27%, iPAQ 18%
– TCP: PIII 10%, 486 23%, iPAQ 13%

Specialization Overhead

Specialization
– Run time for fast specialization
– Compile time for fast specialized code

Bottleneck
– Execution of specialization
– Compiler

What about performing specialization remotely?

Conclusion

Problem: Need for fast protocol stacks in high speed networks

Solution: Program specialization at run time

Assessment: Execution time reduced by up to 25%, code size reduced by up to 15x, throughput improved by up to 20%

Part III

Remote Customization of Systems Code for Embedded Devices

Problem Statement

Embedded systems low on resources
OSes generic
Overheads (space, time)

Need for customization of systems code

Outline

Introduction
Code Customization
Remote Customization Infrastructure
Virtualization of memory
Case study: TCP/IP
Performance Evaluation
Conclusion

Dedicated vs. Generic OSes

Dedicated OSes
+ Deeply customized, compact, fast, well-suited
- Lack of support for standards

Generic OSes
+ Support for standards
- Generic, coarse grained abstractions

Industry Trends

[Figure: Embedded Systems Developers Survey, 2002 (axis: percent). Source: Evans Data Corp.]

Generic abstractions

Coarse grained building blocks (performance: 3000+ conn/sec.):

if (poll(listen_pfds, n, -1) > 0) {
    foreach(pfd, listen_pfds) {
        if (hi_r(pfd->revents))
            queue(accept(fd, addr, addr_len));
    }
}

Concrete operations (performance: 7000+ conn/sec.):

foreach(sock, my_sockets) {
    if (sock->sk->accept_queue) {
        sock->ops->accept(sock, new_sock, O_NONBLOCK);
    }
}

Overheads: memory transfers, context switches, sanity checks, data structures

Need for Program Specialization

Remote Customization

[Figure: a remote customization server reached over the network returns customized code]

Specialization Opportunity

Application

for (i = 0; i < 100000; i++) {
    send(…, buffer);
}

OS

How it’s used

Application

token = do_customize_send(…);

for (i = 0; i < 100000; i++) {
    customize_send(token, buffer);
}

OS

Architecture

[Figure: remote customization server (Customizer, Compiler, Runtime Layer, Context Manager, Code Manager) and client (kernel space: Context Manager, Code Manager; user space: Application)]

Customization Request: Application issues customization request

Context Manager: Context manager picks up customization context

syscall=sys_send
fd=4;
daddr=1044321;
flags=32;
......

Code Manager: Check if we have code for the current context

Customization Request: Application issues customization request

fd=4;
daddr=1044321;
flags=32;
addr_len=8;
block_size=1483;
(...)

Runtime Layer: Context manager invokes runtime layer

Customizer: The program customizer is Tempo

Compiler: The customized code is compiled using a standard compiler

Code Manager: Customized code is sent back

Customization Token: Application gets back a customization token (e.g., 0 for the first customization)

Customization Syscall: Application uses customization token as an index (per-process syscall table)
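The token-as-index step can be sketched as a small per-process table of function pointers. This is only an illustration under assumptions: `struct process_table`, `register_customized`, `call_customized`, and `send_fixed_1483` are hypothetical names, and the real mechanism lives in the kernel's system call dispatch rather than in user-space C.

```c
#include <stddef.h>

#define MAX_CUSTOMIZED 16   /* assumed table capacity */

/* A customized entry point; here it takes only the buffer, as in customize_send. */
typedef long (*customized_fn)(const void *buf);

/* Hypothetical per-process table of customized system calls. */
struct process_table {
    customized_fn entry[MAX_CUSTOMIZED];
    int used;
};

/* Install customized code and return its token (the table index), or -1 on failure. */
int register_customized(struct process_table *t, customized_fn fn)
{
    if (fn == NULL || t->used >= MAX_CUSTOMIZED)
        return -1;
    t->entry[t->used] = fn;
    return t->used++;
}

/* Dispatch through the token; invalid tokens are rejected with -1. */
long call_customized(const struct process_table *t, int token, const void *buf)
{
    if (token < 0 || token >= t->used)
        return -1;
    return t->entry[token](buf);
}

/* Stand-in for code specialized for a fixed 1483-byte block: reports bytes "sent". */
long send_fixed_1483(const void *buf)
{
    (void)buf;
    return 1483;
}
```

The first registration yields token 0, matching the example on the slide; each later call costs one bounds check and one indirect call.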

Access to client side memory

Customizer:

movl [socket_pointer],%eax

Runtime Layer:

0xc01f400: 1483 [tcp_mss]
0xc01f363: 0xc01f355 [tp]

1. Run-time layer intercepts dereference, as CPU exception.
2. Run-time layer interprets instruction with values in customization context table.
3. Customization-time functions executed on client
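Step 2 amounts to resolving a client address against the customization context table instead of local memory. The sketch below models only that lookup; it does not trap CPU exceptions or decode instructions. The table contents are the two entries shown on the slide, while `struct ctx_entry` and `ctx_load` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One captured word of client memory: client address and the value stored there. */
struct ctx_entry {
    uint32_t addr;
    uint32_t value;
};

/* Customization context table with the two entries from the slide. */
static const struct ctx_entry context_table[] = {
    { 0xc01f400u, 1483u },        /* tcp_mss */
    { 0xc01f363u, 0xc01f355u },   /* tp */
};

/*
 * Emulates a load from a client address. Returns 1 and stores the value in *out
 * when the address is in the table, 0 when it is unknown to the server.
 */
int ctx_load(uint32_t addr, uint32_t *out)
{
    size_t i;

    for (i = 0; i < sizeof context_table / sizeof context_table[0]; i++) {
        if (context_table[i].addr == addr) {
            *out = context_table[i].value;
            return 1;
        }
    }
    return 0;
}
```

A miss is reported to the caller rather than guessed at; how the actual run-time layer handles an address outside the table is not stated on the slide.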

Outline

Introduction
Code Customization
Remote Customization Infrastructure
Virtualization of memory
Case study: TCP/IP (already presented)
Performance Evaluation (already presented)
Conclusion

Conclusion

Problem:
– Services in generic OSes are slow and bloated

Solution:
– Dynamic program specialization
– Remote specialization for devices with limited resources

Assessment: Exec time… -25%, throughput… +20%, code size… -15x