code complexity


Upload: jorge-silva

Post on 18-Dec-2014


Category:

Technology



DESCRIPTION

Small talk about code complexity and a general good-practices checklist.

TRANSCRIPT

Page 1: Code complexity
Page 2: Code complexity

Topics

•  Define: complexity

•  Why tackle complexity

•  Big O notation

•  Code metrics
   •  McCabe’s Cyclomatic/Conditional
   •  Martin’s Software Package Metrics
   •  Other metrics

•  Semantic complexity / Clean Code

•  The myth of the genius programmer
   •  Code Reviews

Page 3: Code complexity

Define: complexity

the state or quality of being intricate or complicated -  Apple Dictionary

SYNONYMS complication, problem, difficulty, twist, turn, convolution, entanglement; intricacy, complicatedness, involvement, convolutedness.

ANTONYMS simplicity.

Page 4: Code complexity

Complexity

It is threefold:

1.  "Essential" or unavoidable complexity, which is at the essence of the problem.

2.  "Accidental" complexity, which covers the work that does not really have much to do with the problem at hand but needs doing anyway.

3.  "Unnecessary" complexity, which is just noise.


Page 5: Code complexity

Why tackle complexity Corollary 1

For every 10-percent increase in problem complexity, there is a 100-percent increase in the software solution’s complexity.

Page 6: Code complexity

Why tackle complexity Corollary 2

The most important factor in attacking complexity is not the tools and techniques that programmers use but rather the quality of the programmers themselves.

Page 7: Code complexity

Why tackle complexity Corollary 3

[Pie chart] Software Cost: Maintenance 60%, Others 40%

[Pie chart] Maintenance Cost: Enhancement 60%, Error correction 17%, Others 23%

Page 8: Code complexity

Why tackle complexity Corollary 4

…software’s "60/60" rule, that is that maintenance typically consumes 40 to 80% (60% average) of software costs, and then that enhancement is responsible for roughly 60% of software maintenance costs, while error correction is about 17%...

Robert Glass, Frequently Forgotten Fundamental Facts about Software Engineering, 2001

Page 9: Code complexity

Why tackle complexity

Because the hard part of building a software system does not lie in the code itself.

The hard part is maintaining it.

Anyone can give orders to a computer. Not everyone is able to do so while being explicit about his or her intentions.

Page 10: Code complexity

Why tackle complexity

The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility

Edsger Dijkstra, 1972

Page 11: Code complexity

Code metrics McCabe’s Cyclomatic

•  Counts the linearly independent paths through a routine

•  It is a function of code branching complexity. It boils down to a number: the higher that number, the worse.

•  It’s not additive, i.e., calls to other functions do not affect the caller’s score.

•  High values have been correlated with low reliability and frequent errors

Page 12: Code complexity

Code metrics McCabe’s Cyclomatic

•  M = McCabe’s complexity

•  Directed Graph Approach: M = E – N + 2P
   E = Edges, N = Nodes, P = # Connected Components

•  Imperative code Approach: M = 2 + π – s = 2 + #If + #Loop + #Case – #Return
   π = decision points, s = exit points

Page 13: Code complexity

Code metrics McCabe’s Cyclomatic

M = 2 + 1 + 1 – 1 = 3

void f1(int n) {
    for (int i = 0; i < n; i += 1) {
        if (a) {
            f2();
        }
        f3();
    }
}

Page 14: Code complexity

Code metrics McCabe’s Cyclomatic

void f1(int n) {
    for (int i = 0; i < n; i += 1) {
        if (a) {
            f2();
        }
        f3();
    }
}

M = 9 – 8 + 2*1 = 3
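Both counts for f1 can be reproduced with a few lines of Java. This is only a sketch: the class and method names (`Cyclomatic`, `fromGraph`, `fromCounts`) are invented here for illustration and do not come from any metrics tool.

```java
/** Illustrative helper (hypothetical names) for the two formulas on the slides. */
final class Cyclomatic {

    /** Directed graph approach: M = E - N + 2P. */
    static int fromGraph(int edges, int nodes, int connectedComponents) {
        return edges - nodes + 2 * connectedComponents;
    }

    /** Imperative code approach: M = 2 + #If + #Loop + #Case - #Return. */
    static int fromCounts(int ifs, int loops, int cases, int returns) {
        return 2 + ifs + loops + cases - returns;
    }
}
```

For f1, `Cyclomatic.fromCounts(1, 1, 0, 1)` (one if, one loop, one exit point) and `Cyclomatic.fromGraph(9, 8, 1)` both evaluate to 3.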

Page 15: Code complexity

Code metrics McCabe’s Cyclomatic

•  The higher the M, the worse;

•  McCabe himself defined:
   M <= 10         OK
   10 < M <= 15    ~OK; please justify
   M > 15          NOK

•  Other standards say:
   1-10     OK
   11-20    ~OK; justify please
   21-50    NOK
   51+      NOK; untestable

•  Define your rule and follow it
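"Define your rule and follow it" can be as simple as one function that a build step calls for every routine. The sketch below encodes the 1-10 / 11-20 / 21-50 / 51+ bands from this slide; the class name `ComplexityRating` and the returned labels are illustrative assumptions, and a team would substitute its own limits.

```java
/** Hypothetical rating helper; the bands mirror the slide and are a team choice. */
final class ComplexityRating {

    static String rate(int m) {
        if (m < 1) {
            // A single routine always has at least one path.
            throw new IllegalArgumentException("M must be at least 1, got " + m);
        }
        if (m <= 10) {
            return "OK";
        }
        if (m <= 20) {
            return "~OK; justify please";
        }
        if (m <= 50) {
            return "NOK";
        }
        return "NOK; untestable";
    }
}
```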

Page 16: Code complexity

Code metrics McCabe’s Cyclomatic

•  Awareness during development.

“programmers should keep track of the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.”

Tom McCabe

Page 17: Code complexity

Code metrics McCabe’s Cyclomatic

•  Applications:
   •  Indicates the minimum # of white box tests that need to be run in order to obtain sufficient coverage of the module.
      •  More tests may be necessary because of path coverage.
      •  branch coverage <= cyclomatic complexity <= # paths
   •  May help code conciseness by limiting the size of an imperative module.

Page 18: Code complexity

Code metrics McCabe’s Cyclomatic

•  Is it worth it?
   •  McCabe’s metric was used on one 77,000-line program to identify problem areas. The program had a post-release defect rate of 0.31 defects per thousand lines of code. A 125,000-line program had a post-release defect rate of 0.02 defects per thousand lines of code – William T. Ward, Hewlett Packard
   •  Similar results were observed in other companies, such as Steve McConnell’s Construx Software.

Page 19: Code complexity

Code metrics Software package metrics

•  Robert Martin’s software package metrics

   •  Number of classes and interfaces
      •  Classes, pure abstract classes, abstract classes

   •  Afferent Couplings / Ca (package responsibility)
      •  Number of packages that depend on classes within a package (inwards)

   •  Efferent Couplings / Ce (package independence)
      •  Number of packages the classes within a package depend upon (outwards)

   •  Abstractness / A
      •  Ratio of abstract classes and interfaces to the total number of classes in the package
      •  0 <= A <= 1

   •  Instability / I (resilience to change)
      •  I = Ce / (Ce + Ca)
      •  0 <= I <= 1
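As a rough illustration, instability and abstractness are plain ratios. The sketch below assumes abstractness is the share of abstract classes and interfaces among all classes in the package (Martin's usual definition); the class name `PackageMetrics`, the parameter names and the handling of empty packages are assumptions made for this example.

```java
/** Hypothetical helper for two of Martin's package metrics. */
final class PackageMetrics {

    /** Instability: I = Ce / (Ce + Ca). */
    static double instability(int efferent, int afferent) {
        int total = efferent + afferent;
        // Assumption: a package with no couplings at all is reported as 0.
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    /** Abstractness: abstract classes and interfaces divided by all classes. */
    static double abstractness(int abstractTypes, int totalTypes) {
        // Assumption: an empty package is reported as 0.
        return totalTypes == 0 ? 0.0 : (double) abstractTypes / totalTypes;
    }
}
```

A package that depends on 3 other packages and has 1 package depending on it gets I = 3 / (3 + 1) = 0.75, so it leans towards the unstable end.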

Page 20: Code complexity

Code metrics Software package metrics

•  Robert Martin’s software package metrics

   •  Distance from the Main Sequence (balance between abstractness and stability)
      •  D = |A + I - 1|, 0 <= D <= 1
      •  D = 0 indicates a package that is coincident with the main sequence
      •  D = 1 indicates a package that is as far from the main sequence as possible
      •  Ideal packages are either completely abstract and stable (x=0, y=1) or completely concrete and unstable (x=1, y=0)

[Figure: the main sequence, drawn from (x=0, y=1) to (x=1, y=0), with instability I on the x axis and abstractness A on the y axis]
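A small sketch of the distance calculation, assuming Martin's normalized form D = |A + I - 1|, which is 0 on the line joining the two ideal corners and 1 at the other two corners. The class name `MainSequence` is made up for this example.

```java
/** Hypothetical helper: normalized distance from the main sequence. */
final class MainSequence {

    /** D = |A + I - 1|, with A and I both expected in the range 0..1. */
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }
}
```

Both ideal corners give 0: `distance(1.0, 0.0)` for a completely abstract, stable package and `distance(0.0, 1.0)` for a completely concrete, unstable one. A concrete package that everything depends on, `distance(0.0, 0.0)`, gives 1.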

Page 21: Code complexity

Code metrics Other metrics

•  Bugs per line of code;

•  Code coverage;

•  Cohesion
   •  Degree to which the elements of a module belong together

•  Coupling
   •  Cohesion counterpart; high cohesion may mean low coupling and vice versa.

•  Program execution time

•  Nesting levels in control constructs

•  Variable span (number of lines between successive references to variables)

•  Variable lifetime (number of lines a variable is in use)

•  Others (use tools such as Sonar, Fortify, etc.)

Page 22: Code complexity

Big O notation

•  Also known as asymptotic notation;

•  Describes the limiting behavior of a function when the argument tends to a particular value or infinity;

•  Used to classify algorithms, in time and space, by how they respond to input size;

Page 23: Code complexity

Big O notation

•  Also known as asymptotic notation; expressed as O(x);

•  Describes the limiting behavior of a function when the argument tends to a particular value or infinity;

•  Used to classify algorithms, in time and space, by how they respond to input size;

•  Gives an upper bound on growth, so it is usually quoted for the worst-case scenario of an algorithm;

•  Ω(x) (Big Omega, a lower bound) and Θ(x) (Big Theta, a tight bound) also exist and offer different measures.

Page 24: Code complexity

Big O notation O(1) Constant Complexity

•  Input size does not affect the complexity;

•  Example: Given a binary representation of a number, say if it is even or odd
   •  If the rightmost bit is one then the number is odd; even otherwise
   •  No matter how many bits the number representation has, we only need to look at the first bit (right to left).

0111101011010100100101 (odd)    0111101011010100100100 (even)
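The parity check reads one character regardless of how long the input is. A minimal sketch, assuming the number arrives as a string of '0' and '1' characters; the class name `Parity` is hypothetical.

```java
/** Hypothetical helper: O(1) parity test on a binary string. */
final class Parity {

    /** Returns true when the binary string encodes an odd number. */
    static boolean isOdd(String bits) {
        if (bits.isEmpty()) {
            throw new IllegalArgumentException("empty binary string");
        }
        // Only the rightmost bit is inspected, whatever the length of the input.
        return bits.charAt(bits.length() - 1) == '1';
    }
}
```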

Page 25: Code complexity

Big O notation O(n) Linear Complexity

•  Example: Add two numbers
   •  Line the numbers up (to the right)
   •  Add the digits in a column, writing the last digit of that addition in the result;
   •  The 'tens' part of that number is carried over to the next column.

•  If we add two 100-digit numbers together we have to do 100 additions. If we add two 10,000-digit numbers we have to do 10,000 additions. See the pattern?

     1234
   + 5678
   ------
     6912
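The column-by-column procedure translates into a single right-to-left pass, which is where the linear cost comes from. This is a sketch that assumes both inputs are non-negative decimal digit strings; `ColumnAddition` is an illustrative name.

```java
/** Hypothetical helper: schoolbook addition on decimal digit strings. */
final class ColumnAddition {

    static String add(String a, String b) {
        StringBuilder result = new StringBuilder();
        int i = a.length() - 1;
        int j = b.length() - 1;
        int carry = 0;
        // One pass over the columns, right to left: O(n) single-digit additions.
        while (i >= 0 || j >= 0 || carry > 0) {
            int sum = carry;
            if (i >= 0) {
                sum += a.charAt(i--) - '0';
            }
            if (j >= 0) {
                sum += b.charAt(j--) - '0';
            }
            result.append((char) ('0' + sum % 10)); // last digit goes into the result
            carry = sum / 10;                       // the 'tens' part is carried over
        }
        // Digits were produced least significant first.
        return result.reverse().toString();
    }
}
```

`ColumnAddition.add("1234", "5678")` returns "6912", matching the worked example.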

Page 26: Code complexity

Big O notation O(log n) Logarithmic Complexity

•  Example: Phone Book
   •  Given a person's name, find the phone number by picking a point about halfway through the part of the book you haven't searched yet
   •  Check to see whether the person's name is at that point
   •  If it is not, repeat the process about halfway through the part of the book where the person's name lies.

•  You can simply divide and conquer, and you only need to explore a tiny fraction of the entire space before you eventually find someone's phone number.

•  A bigger phone book will still take you longer, but the time won't grow as quickly as the size of the book does.

•  See the pattern? (also, you could just use the internet)
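The phone book lookup is a binary search. The sketch below assumes the book is two parallel arrays, with the names sorted alphabetically; the class name `PhoneBook` and the array layout are choices made for this example only.

```java
/** Hypothetical helper: O(log n) lookup in a sorted phone book. */
final class PhoneBook {

    /** Returns the number stored for the name, or null when the name is absent. */
    static String lookup(String[] sortedNames, String[] numbers, String name) {
        int low = 0;
        int high = sortedNames.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // halfway through the unsearched part
            int cmp = name.compareTo(sortedNames[mid]);
            if (cmp == 0) {
                return numbers[mid];
            } else if (cmp < 0) {
                high = mid - 1; // the name lies in the first half
            } else {
                low = mid + 1;  // the name lies in the second half
            }
        }
        return null;
    }
}
```

Each iteration halves the part of the book still to be searched, so doubling the number of entries adds only one more comparison.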

Page 27: Code complexity

Big O notation O(n²) Quadratic Complexity

•  Example: Multiply two numbers
   •  Line the numbers up (to the right)
   •  Take the first digit in the bottom number and multiply it in turn against each digit in the top number;
   •  and so on through each digit

•  For 4-digit numbers we need to do 16 multiplications (and 7 adds). For 100-digit numbers we need to do 10,000 multiplications and 200 adds. See the pattern?

•  Should it be O(n² + 2n)? No: as n grows the n² term dominates, so the lower-order term is dropped.

        1234
      x 5678
      ------
        9872
       86380
      740400
   + 6170000
   ---------
     7006652
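The nested loop below is what makes schoolbook multiplication quadratic: every digit of one number meets every digit of the other. It is a sketch under the assumption that both inputs are non-negative decimal digit strings; `LongMultiplication` is an illustrative name.

```java
/** Hypothetical helper: schoolbook multiplication on decimal digit strings. */
final class LongMultiplication {

    static String multiply(String top, String bottom) {
        int[] cells = new int[top.length() + bottom.length()];
        // n * n single-digit multiplications for two n-digit inputs.
        for (int j = bottom.length() - 1; j >= 0; j--) {
            for (int i = top.length() - 1; i >= 0; i--) {
                int product = (top.charAt(i) - '0') * (bottom.charAt(j) - '0');
                int sum = cells[i + j + 1] + product;
                cells[i + j + 1] = sum % 10;
                cells[i + j] += sum / 10; // carry into the next column to the left
            }
        }
        StringBuilder result = new StringBuilder();
        for (int digit : cells) {
            if (result.length() > 0 || digit != 0) {
                result.append(digit); // skip leading zeros
            }
        }
        return result.length() == 0 ? "0" : result.toString();
    }
}
```

`LongMultiplication.multiply("1234", "5678")` returns "7006652", the same total as the worked example.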

Page 28: Code complexity

Big O notation O(n!) Factorial Complexity

•  Example: The travelling salesman
   •  You have N towns
   •  Each of those towns is linked to 1 or more other towns by a road of a certain distance
   •  Find the shortest tour that visits every town

Page 29: Code complexity

Big O notation O(n!) Factorial Complexity

•  Imagine 3 towns A, B & C
   •  A → B → C
   •  A → C → B
   •  B → C → A
   •  B → A → C
   •  C → A → B
   •  C → B → A

•  There are 3 equivalent pairs: A-B-C = C-B-A, A-C-B = B-C-A, B-A-C = C-A-B; so there are 3 distinct possibilities

•  Take this to 4 towns and you have 12 possibilities.

•  With 5 it's 60.

•  6 becomes 360. See the pattern?
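The pattern in these counts is n! / 2: every ordering of the towns is a route, and a route and its reverse are counted once. A short sketch follows; the class name `Routes` and the upper limit of 20 towns (the largest factorial that fits in a Java long) are choices made for this example.

```java
/** Hypothetical helper: number of distinct routes through n towns. */
final class Routes {

    /** Returns n! / 2, counting a route and its reverse as one. */
    static long distinctRoutes(int towns) {
        if (towns < 2 || towns > 20) {
            // 21! no longer fits in a long.
            throw new IllegalArgumentException("supported range is 2..20 towns");
        }
        long factorial = 1;
        for (int k = 2; k <= towns; k++) {
            factorial *= k;
        }
        return factorial / 2;
    }
}
```

A brute-force search has to examine every one of these routes, and for just 10 towns that is already 1,814,400 of them.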

Page 30: Code complexity

Types of complexity Semantic

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

Donald Knuth – Literate Programming, Centre for the Study of Language & Information, 1992

Page 31: Code complexity

Types of complexity Semantic

•  Good naming
   •  Aha moment: Steve Yegge’s Execution in the Kingdom of Nouns
   •  Names should fully and accurately describe the member they represent (see slide 13)

Page 32: Code complexity

Types of complexity Semantic

•  High quality routines & classes
   •  Is the reason for creating a new routine/class sufficient?
   •  Have all the parts of the routine/class that would benefit from being put into routines/classes of their own been put into routines/classes of their own?
   •  Does the name describe everything the routine/class does?
   •  Does the routine/class have strong, functional cohesion – doing one and only one thing and doing it well?
   •  Does the routine/class have loose coupling – are the routine/class connections to other routines/classes small, intimate, visible and flexible?
   •  Is the length of the routine determined naturally by its function and logic rather than by an artificial coding standard?

Page 33: Code complexity

Types of complexity Semantic

•  High quality routines & classes
   •  Does the routine have 5 or fewer parameters?
   •  Is each input parameter used?
   •  Is each output parameter used?
   •  Does the routine avoid using input parameters as working variables?
   •  If the routine is a function, does it return a valid value under all possible circumstances?
   •  Does the routine parameter list, taken as a whole, present a consistent interface abstraction?

Page 34: Code complexity

Types of complexity Semantic

•  High quality routines & classes
   •  Does the routine protect itself from bad input data?
   •  Have you used assertions to document assumptions, including pre-conditions and post-conditions?
      •  Check out Eiffel’s invariants; Java also supports them to some extent
   •  Have assertions been used only to document conditions that should never occur?
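One way to read the checklist in Java terms: reject bad input with an exception that is always active, and keep `assert` for conditions that should never be false. The routine below is an illustrative sketch with hypothetical names (`IntegerRoot`, `isqrt`); Java only evaluates `assert` statements when the JVM is started with `-ea`.

```java
/** Hypothetical example: input validation versus assertions. */
final class IntegerRoot {

    /** Returns the largest r such that r * r <= n. */
    static int isqrt(int n) {
        // Bad input from callers is always rejected, with or without -ea.
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative, got " + n);
        }
        long r = 0;
        while ((r + 1) * (r + 1) <= n) {
            r++;
        }
        // Post-condition: documents a state that should never be false.
        assert r * r <= n && (r + 1) * (r + 1) > n : "isqrt post-condition violated";
        return (int) r;
    }
}
```

The split matters because assertions can be switched off: anything a caller could legitimately get wrong belongs in the always-on check, not in the `assert`.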

Page 35: Code complexity

Types of complexity Semantic

•  High quality routines & classes
   •  Have debugging aids been installed in such a way that they can be activated/deactivated without a great deal of fuss?
   •  Is the amount of defensive programming code appropriate – neither too much nor too little?
   •  Have you used offensive programming techniques to make errors difficult to overlook during development?
      •  Make sure asserts abort the program
      •  Be sure the code in each case statement’s default or else clause fails hard or is otherwise impossible to overlook
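A default clause that fails hard might look like the sketch below. The `Shipping` class, the `Speed` enum and the day counts are invented for illustration; the point is only that an unhandled case throws instead of silently falling through.

```java
/** Hypothetical example of a switch whose default clause fails hard. */
final class Shipping {

    enum Speed { STANDARD, EXPRESS }

    static int deliveryDays(Speed speed) {
        switch (speed) {
            case STANDARD:
                return 5;
            case EXPRESS:
                return 1;
            default:
                // A newly added enum constant cannot slip through unnoticed.
                throw new IllegalStateException("unhandled speed: " + speed);
        }
    }
}
```

If someone later adds a third constant to `Speed` and forgets this switch, the first call with the new value throws immediately rather than returning a plausible but wrong number.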

Page 36: Code complexity

The myth of the genius programmer

•  Also known as “Don't be afraid to show your code ™”

•  You are not one in a million.
   •  And if you were, there would be 10 like you in Portugal alone. Think about this.

A pervasive elitism hovers in the background of collaborative software development: everyone secretly wants to be seen as a genius.

How to avoid this trap and gracefully exchange personal ego for personal growth and super-charged collaboration.

http://www.youtube.com/watch?v=0SARbwvhupQ

Page 37: Code complexity

The myth of the genius programmer Code reviews

•  Do them
   •  Knowledge sharing – functional and business.
   •  Engage with your peers.
   •  Learn from others – be open-minded.

•  Criticize and be criticized
   •  But be polite and respectful while doing both

•  Be like water, my friend – Bruce Lee

Page 38: Code complexity

That’s it folks