Greg Wilson - We Know (but ignore) More Than We Think

Upload: devto

Post on 01-Nov-2014

DESCRIPTION

Over the past thirty years, software engineering researchers have learned a lot about how software actually gets built, and about which tools and practices make developers more productive. However, most working programmers don’t know these studies exist, and don’t base their own work on the best available evidence. This talk will describe some of the most interesting results, and explore why the gulf between research and practice persists.

TRANSCRIPT

Page 1: Greg Wilson - We Know (but ignore) More Than We Think

2013-03-22

1

What We Actually Know About Programming

And What We Ought To Do Next

Greg Wilson

http://software-carpentry.org

March 2013

Best Practices for Scientific Computing 2

You are free to:

Copy, share, adapt, or re-mix;

Photograph, film, or broadcast;

Blog, live-blog, or post video of;

This presentation.

Provided that:

You attribute the work to its author and respect the rights and licenses associated with its components.

Arrr, Matey

Seven Years’ War (actually 1754-63)

Britain lost 1,512 sailors to enemy action...

...and almost 100,000 to scurvy

James Lind (1716-94)

1747: (possibly) the first-ever controlled medical experiment

×  cider ×  sulfuric acid ×  vinegar

×  sea water √  oranges ×  barley water

Of course, no-one paid attention until a proper Englishman repeated the experiment in 1794...

Oh, the Irony

1950: Hill & Doll publish a case-control study comparing smokers with non-smokers

Their follow-up, now called the “British Doctors” study, began in 1951 and ran until 2001

It Took a While...

#1: Smoking causes lung cancer

#2: Most people would rather fail than change

“What happens ‘on average’ is of no help when one is faced with a specific patient…”

What They Found

The Cochrane Collaboration (http://www.cochrane.org/) now archives results from hundreds of medical studies

“[Using domain-specific languages] leads to two primary benefits. The first, and simplest, is improved programmer productivity... The second...is...communication with domain experts.”

– Martin Fowler, IEEE Software, July/August 2009

So Where Are We?

Look Closer

One of the smartest guys in the industry...

...made two substantive claims of fact…

...in a peer-reviewed journal...

...without a single citation…

…because nobody expected one

Growing emphasis on empirical studies in software engineering since the mid-1990s

Papers describing new tools or practices routinely include results from some kind of field study

Many are flawed or incomplete, but standards are constantly improving

A New Hope

Rigorous inspections can remove 60-90% of errors before the first test is run. (Fagan 1975)

A Classic Result

Rigorous inspections can remove 60-90% of errors before the first test is run. (Fagan 1975)

The first review and the first hour matter most. (Cohen 2006)

A Classic Result Refined

Most Often Misquoted

The best programmers are up to 28 times more productive than the worst.

Or 10, or 40, or 100, or whatever other large number pops into the head of someone who can’t be bothered to look up the reference...

Sackman, Erikson, and Grant (1968): “Exploratory experimental studies comparing online and offline programming performance.”

Pick That Apart

1.  Study was designed to compare batch vs. interactive, not measure productivity

2.  How was productivity measured, anyway?

3.  Best vs. worst exaggerates any effect

4.  Twelve programmers for an afternoon

Next major study was 54 programmers... ...for up to an hour
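Point 3 in the list above (best vs. worst exaggerates any effect) can be illustrated with a small sketch. The completion times below are invented for illustration and are not taken from Sackman, Erikson, and Grant or from any other study. The function names are hypothetical too. The sketch compares a best-vs-worst ratio against a more robust ratio built from the medians of the slower and faster halves of the same group.

```python
from statistics import median

# Hypothetical task-completion times in minutes for twelve programmers.
# These numbers are invented for illustration; they are not from any study.
times = [20, 25, 28, 30, 32, 35, 38, 40, 45, 50, 60, 140]


def best_vs_worst(samples):
    """Ratio of the slowest time to the fastest time."""
    return max(samples) / min(samples)


def half_vs_half(samples):
    """Ratio of the median of the slower half to the median of the faster half."""
    ordered = sorted(samples)
    mid = len(ordered) // 2
    return median(ordered[mid:]) / median(ordered[:mid])


print(f"best vs. worst: {best_vs_worst(times):.1f}x")            # 7.0x
print(f"slower half vs. faster half: {half_vs_half(times):.1f}x")  # 1.6x
```

With this made-up data, a single slow participant accounts for most of the headline number: dropping the last value (`times[:-1]`) cuts the best-vs-worst ratio from 7.0 to 3.0, while the half-vs-half ratio moves much less.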

Boehm et al (1975): “Some Experience with Automated Aids to the Design of Large-Scale Reliable Software.”

...and many, many more since

1. Most errors are introduced during requirements analysis and design

2. The later they are removed, the more expensive it is to take them out

[Figure: plot with axes labeled “time” and “number / cost”]

Another Classic Result

Pessimists: “If we tackle the hump in the error injection curve, fewer bugs will get to the expensive part of the fixing curve.”

Optimists: “If we do lots of short iterations, the total cost of fixing bugs will go down.”

That Explains a Lot

Nagappan et al (2007) & Bird et al (2009):

Geography has little correlation with software quality

Isn’t That Interesting…

Nagappan et al (2007) & Bird et al (2009):

Distance in the org chart is a much better predictor

Isn’t That Interesting…

Are any metrics better at predicting faults/effort than LOC? No.

A Few More Results

Do more frequent releases improve software quality? Yes, but it also changes the nature of the bugs.

A Few More Results

Are there better ways to teach programming? Yes: media-based instruction and peer instruction.

A Few More Results

Sampling Bias

I focus on quantitative studies because they’re what I know best

A lot of the best work in this field is using qualitative methods

All Together Now

Andy Oram & Greg Wilson (eds): Making Software: What Really Works, and Why We Believe It. O'Reilly, 2010, ISBN 978-0596808327.

http://neverworkintheory.org

Where to Start?

Many practices can be monitored automatically

But top-down initiatives usually don’t work

1.  How do you identify people with good ideas?

2.  How do you reward people for good ideas?

3.  How do they share those ideas with peers?

4.  How do you tell if it actually worked?

Remember: some changes will be qualitative

Words to Live By

If you build a man a fire, you'll keep him warm for a night.

If you set a man on fire, you'll keep him warm for the rest of his life.

— Terry Pratchett

http://software-carpentry.org

Thank You