
Volume 22, July 2011

A joint newsletter of the Statistical Computing & Statistical Graphics Sections of the American Statistical Association

A Word from our 2011 Section Chairs

RICHARD HEIBERGER, COMPUTING

Statistical Computing, as we in this section all know, is the center of Statistics. The program for the upcoming JSM 2011 in Miami Beach indicates that centrality. The Section on Statistical Computing is the main sponsor of 27 technical sessions and activities, and cosponsor of 50 more, for a total of 77. The session and paper titles cover the entire gamut of statistical activity and many scientific and social issues. The cosponsors include most of the sections of ASA and many of the asso-

Continued on page 2 . . .

JÜRGEN SYMANZIK, GRAPHICS

Is there a fifth force in particle physics, in addition to the four previously known fundamental forces: electromagnetism, the strong nuclear force, the weak nuclear force, and gravitation? A possible new force, called the "technicolor" force, was recently described in an article by three physicists (http://prl.aps.org/abstract/PRL/v106/i25/e251803), based on unexpected experimental observations (http://prl.aps.org/abstract/PRL/v106/i17/e171801). The discovery of such a new force would

Continued on page 2 . . .

Contents of this volume:

A Word from our 2011 Section Chairs ..... 1
Statistical Graphics and InfoVis — Twins separated at Birth? ..... 4
Visualization: It's More than Pictures! ..... 5
Visualization, Graphics, and Statistics ..... 9
Heuristic Methods in Finance ..... 13
Section Officers ..... 20


Computing Chair
Continued from page 1.

ciated societies. Topics include: career development, climate, clinical trials, data mining, environment, financial analysis, functional data analysis, genetics, graphics, health policy, high-dimensional methods, mapping, marketing, Monte Carlo simulation, national security, physical and engineering sciences, public affairs, quality and productivity, risk, social media, teaching. There is something for everyone in the collection.

The Program Chair for the Section this year is David J. Poole. With the help of all of you who are presenting, he has done a magnificent job. Just the Invited Paper Session topics are fascinating: Advancement in Hierarchical Models, The Future of Statistical Computing Environments, Computational Methods for Space-Time Correlated Data, Statistics in Computational Advertising (what goes on behind the scenes when a search engine decides which advertisements to display for a specific search), and Statistical Analysis of Actigraphy Data (actigraphy is a relatively non-invasive method of monitoring human rest/activity cycles).

In addition to invited papers at the JSM, the Section cosponsors the Data Expo (this year with posters related to the 2010 Deepwater Horizon Oil Spill), the Interface on Computer Science and Statistics, student awards, and Continuing Education courses at the JSM. This year we have two courses: "Data Stream Mining: Tools and Applications" by Simon Urbanek and "Bootstrap Methods and Permutation Tests for Doing and Teaching Statistics" by Tim Hesterberg.

The centrality of Statistical Computing, and computing in general, in our world implies responsibility. Part of our mission is to get everyone who does data analysis to do it well. We therefore must work toward getting high-quality statistical computing and graphics systems accessible to everyone, on any computer (cell phones included) and at every level, from introductory courses to highly complex, large and very large applications. Many of the sessions at the upcoming JSM reflect the work that our members have done in making high-quality software and thinking processes accessible. We have sessions on software design for teaching, and on statistical teaching using high-quality software.

The main social event is the annual "Section on Statistical Computing and Section on Statistical Graphics Joint Mixer" on Monday evening. I hope to see many of you there, as well as at the rest of the JSM.

Richard M. Heiberger
Temple University

Graphics Chair
Continued from page 1.

be considered a major revolution in the world of particle physics. But why does an editorial written by the current chair of the Statistical Graphics section start with discoveries from particle physics? The answer, of course, is simple: because of the statistical graphics that supported this discovery.

A less technical summary of these findings and a freely accessible version of the most relevant figure can be found in two articles at the NewScientist. Of course, according to our standards, this is a relatively simple statistical figure. Nevertheless, this may be the most important piece of a puzzle that transforms many years of scientific research into a new physical theory and model.

It appears as if statistical graphics have helped to detect the unknown and unexpected — again! Most of us know the classical examples from the last 150 years where statistical graphics have helped to discover the previously unknown. This includes John Snow's discovery that the 1854 cholera epidemic in London most likely was caused by a single water pump on Broad Street, a fact he observed after he had displayed the deaths arising from cholera on a map of London. A second, well-known example is Florence Nightingale's polar area charts from 1857, the so-called Nightingale's Rose (sometimes incorrectly called coxcombs), which demonstrated that the number of deaths from preventable diseases by far exceeded the number of deaths from wounds during the Crimean War. These figures convinced Queen Victoria to improve sanitary conditions in military hospitals. Many additional important scientific discoveries based on the proper visualization of statistical data could be mentioned, but the most important fact is: new discoveries based on the visualization of data can happen here and now!

This is a message we should carry to our collaborators, students, supervisors, etc.: statistical graphics (or visual data mining, visual analytics, or any other name you like) typically do not provide a final answer. But statistical graphics often help to detect the unexpected, formulate new hypotheses, or develop new models. Later on, additional experiments or ongoing data collection, as well as more formal methods (and p-values if you really want), may be used to verify some of the original graphical findings.

As can be seen in the examples above, statistical graphics are of universal use. Nevertheless, graphics may also be unique to some particular application areas. This naturally leads me to invite you to the Joint Statistical Meetings (JSM), where new graphical methods and their applications are presented. I do not want to repeat the entire JSM graphics program here, but rather say that Webster West, our 2011 Program Chair, has put together an excellent set of sessions for the 2011 JSM, held from July 30 to August 4, 2011, in Miami Beach, Florida, at the Miami Beach Convention Center. So, let me just mention a few highlights here: First, there is the 2011 Data Expo (Mon, 8/1/2011, 2:00pm to 3:50pm) with posters related to the 2010 Deepwater Horizon Oil Spill. Next, there is the Section on Statistical Computing and Section on Statistical Graphics Joint Mixer (Mon, 8/1/2011, 7:30pm to 10:00pm), which is credited only to Statistical Computing in the online program and may be missed at first glance. For everything else (from invited sessions through topic-contributed sessions to contributed sessions and roundtables), start at http://www.amstat.org/meetings/jsm/2011/onlineprogram/index.cfm and search for details. Enjoy the JSM and/or the beaches!

This may sound strange to many of you, but plans for the JSM 2012, to be held from July 28 to August 2, 2012, in San Diego, California, at the San Diego Convention Center, are already far advanced. If you have any suggestions for an invited session for 2012, please contact our 2011 Program Chair-Elect (and 2012 Program Chair), Hadley Wickham ([email protected]), as soon as possible.

Finally, if you always wanted to know what your section membership fees are used for, here is a brief summary provided by John ("Jay") Emerson, our section secretary/treasurer: (1) We contributed $375 to the 2011 Cavell Brownie Scholars mentoring program (http://www.amstat-online.org/2010mentoringprogram/CavellBrownieScholarsProgram.php). (2) We contributed $2,000 to support graduate students at the Interface 2011 conference. (3) We contributed $1,750 to support color printing of the special 2009 Data Expo issue of JCGS. Just a reminder that the Statistical Graphics section offers color publishing grants that have so far received very few applications. For details on the application procedure, please refer to http://stat-graphics.org/graphics/grants.html. (4) We supported Seth Roberts' invited presentation at the JSM 2010 with $450. (5) Other recurring annual expenses are for food and drinks at the mixer and the officers' meeting at the JSM, and for various awards, such as "The Statistical Computing and Graphics Award" and the "Student Paper Competition".

Did you stay with me until the very end? If so, let me reveal one more fact: Another group of researchers from particle physics that conducted the same experiments did not detect any anomalies in their data and figures (http://www.bbc.co.uk/news/science-environment-13722986 and http://www-d0.fnal.gov/Run2Physics/WWW/results/final/HIGGS/H11B/). Currently, researchers from both groups are comparing their results to come to a final joint conclusion.

Jürgen Symanzik
Utah State University


Statistical Graphics and InfoVis — Twins separated at Birth?
Nicholas Lewin-Koh and Martin Theus, Eds.

This volume features two articles, both looking at aspects of "graphical displays of quantitative data". In the first paper, "Visualization: It's More than Pictures!", Robert Kosara sheds light on the topic from the point of view of an InfoVis person, i.e., someone who primarily learned how to design tools and techniques for data visualization. With the second article, "Visualization, Graphics, and Statistics" by Andrew Gelman and Antony Unwin, we get a similar view, but now from someone whose primary training is in math and/or statistics.

Given this set-up, we might think that we have a good idea how both sides would argue, and what assets each side would claim: computer scientists are good at designing tools for data visualization and statisticians are good at doing the analysis; and consequently, neither knows much about the expertise of the other discipline.

[Figure 1 shows "Analysts (statisticians, domain experts)" and "Tool builder (computer scientists)" on either side of a wall, with the labels "problem oriented, interactive" and "technology centric, non-interactive" and the question "? interaction ?" between them.]

Figure 1: Is there a "wall" between the two promoters of graphical displays? (Taken from http://www.theusrus.de/blog/the-wall-what-wall/.)

Reading the two papers, you will find out that, while there is certainly some truth behind this simple classification, the overlap and agreement is larger than one would probably think. The common and most important understanding is that there is a story to be told with the data. Graphics are the most powerful tool to do this, no matter what your training and background is.

Nonetheless, there is still a lot to be learned from each other, and the one or the other difference or misunderstanding might spur the discussion between the two sides. As a platform for this discussion you can use the post at http://www.theusrus.de/blog/InfoVis-and-StatGraphics/ — we are looking forward to a lively exchange, which might even end up in a collaboration!


Visualization: It's More than Pictures!
Robert Kosara

Introduction

Information visualization is a field that has had trouble defining its boundaries, and that consequently is often misunderstood. It doesn't help that InfoVis, as it is also known, produces pretty pictures that people like to look at and link to or send around. But InfoVis is more than pretty pictures, and it is more than statistical graphics.

The key to understanding InfoVis is to ignore the images for a moment and focus on the part that is often lost: interaction. When we use visualization tools, we don't just create one image or one kind of visualization. In fact, most people would argue that there is not just one perfect visualization configuration that will answer a question [4]. The process of examining data requires trying out different visualization techniques, settings, filters, etc., and using interaction to probe the data: filtering, brushing, etc.

The term visual analytics captures this process quite well, and it also gives a better idea of what most visualization is used for: analysis. Analysis is not a static thing, and can rarely be done by looking at a static image. Visualization and visual analytics use images, but the images are only one part of visualization.

Cheap Thrills

It is no wonder that many people think that visualization is primarily about pretty and colorful pictures, even smart people like Andrew Gelman. What readers see on popular websites like FlowingData [8] and infosthetics [3], and what makes them so popular, are the pictures. In many cases, they provide only minimal context, and readers are mostly left to look at the images as images, rather than figure out what they are actually trying to tell them.

Another issue is the blurred boundary between actual visualization and data art, which is often ignored on purpose to have more interesting images to choose from. The result is that the expectation many people have of visualization images is similar to that of a piece of art: you can look at it and like it or not, but you don't get any actual information out of it. In fact, I have argued that what Gelman calls "that puzzling feeling" is actually what sets pragmatic visualization apart from data art [2].

Data art clearly has its place, and the more pragmatic visualization community can learn from it. But when we're talking about visualization in the context of statistics and the analysis of data, we need to draw a clear distinction. Visualization is not art any more than statistics is.

Goals

So what does visualization do, then? The main idea is to provide insight into data. This is how scientific visualization got started in the 1980s: the huge amounts of data produced by the then-recent supercomputers required new ways of analysis. Scientific visualization made it possible to see the effects of design changes on the pressure distribution of an airplane wing, for example. The same thing could be done with number crunching in theory, but it was a lot more immediate and obvious where things went wrong when the model was actually shown as an image.

Another, more recent, goal is making data accessible. A lot of data is already available in principle, but not in a form that normal people would want to play with. There is still a difference between data being technically available and actually being accessible to a broad audience. Creating a visualization makes it possible for people to start poking around in the data and perhaps discover interesting facts that nobody has seen before.

Finally, to borrow Tableau's tagline [6], the goal of visualization is to make analytics fast. Sure, a lot of questions can be asked of a data warehouse by writing 150-line SQL queries, but changing parameters or exploring variations is going to be difficult this way. An interactive visualization system makes it possible to do that and ask many more questions in much less time. This is not only a worthwhile goal in a business context, but also in the sciences and many other fields: the easier and quicker it is to ask questions, the more questions can be asked.


Figure 1: Spirals are useful for finding periodicity in data (from [1]). (a) The bar chart shows no obvious periodic pattern; (b) the spiral set to 25 days hints at a periodic pattern, but this is clearly not the correct time frame; (c) at 28 days, the pattern is very clearly visible.

Example: Perceive Patterns

A common question about time series data is whether the data is periodic, and if yes, what the period is. A common way of finding out is to draw the data on a spiral [1]. By changing the number of data points shown per full turn of the spiral (that number is constant, of course), patterns become visible. Figure 1 shows an example of sick-leave data that has an interesting periodic pattern: in 28 days, there are four periods, which means that there is a weekly pattern: more people call in sick on Mondays than later in the week.

The way this pattern was discovered is deceptively simple. All it took was to play with a slider that allowed the user to change the number of days shown per turn of the spiral. Slide it back and forth, and soon you will see a pattern (if there is one). With a bit of practice, you can even tell when you're getting close, as there are telltale signs around the optimal value.

The key here is not just the way of displaying the data, but also the interaction. Without it, it would take much longer to find the correct interval, or require some very educated guessing. The power of visualization is that it allows the user to find things he or she may not have expected, and thus would not have been looking for.
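
To make this concrete, here is a minimal R sketch (not the tool from [1]) that wraps a series onto a spiral with an adjustable period; the sick-leave counts are simulated and all parameter choices are illustrative.

    ## Wrap a series onto a spiral; values line up radially when 'period' fits.
    plot_spiral <- function(x, period) {
      turns <- length(x) / period
      theta <- 2 * pi * (seq_along(x) - 1) / period      # angle within one cycle
      r     <- seq(1, turns + 1, length.out = length(x)) # radius grows per turn
      plot(r * cos(theta), r * sin(theta),
           cex = 2 * x / max(x), pch = 16, asp = 1, axes = FALSE,
           xlab = "", ylab = "", main = paste0("period = ", period, " days"))
    }

    ## Simulated sick-leave counts with a weekly pattern (Mondays are high).
    set.seed(1)
    sick <- rpois(28 * 6, lambda = ifelse(seq_len(28 * 6) %% 7 == 1, 12, 4))
    op <- par(mfrow = c(1, 2))
    plot_spiral(sick, period = 25)   # wrong period: no clear alignment
    plot_spiral(sick, period = 28)   # four aligned spokes: a weekly pattern
    par(op)

Binding the period to a slider in an interactive front end then gives exactly the back-and-forth described above.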

Example: Filter the Flood

A beautiful example of the integration of analysis and visualization is a system for visualizing network traffic data [7]. To be able to deal with the enormous amount of data, the system includes a declarative logic system that can apply rules to find certain patterns in the data. The idea is to identify patterns of known good data and filter that data out, so that what remains is the data that needs to be examined more closely (Figure 2).

Instead of having to write the declarations by hand, however, the system allows the user to select data points and creates a rule from the selection. The user can then apply that rule to other traffic to see if it matches the right data, and even examine and edit the actual definition directly. Creating and refining definitions of different traffic patterns is relatively straightforward this way, especially for a network security expert.

One of the most clever design decisions in this system is to focus on the known good traffic, rather than trying to define what is suspicious. New types of scans and attacks are developed all the time, so keeping up with them is practically impossible. Also, defining the bad traffic would defeat a big advantage of the visual part of this system: being able to see new patterns as they emerge.

Figure 2: Event diagrams showing flows during residual analysis (from [7]). (a) Original unidentified traffic; (b) flows with the "mail" label; (c) the residual after filtering out "mail" from (a); (d) flows with the "scan" label; (e) the residual after filtering out the "scan" label from (c).

By treating the known good traffic as irrelevant, it can be removed, and the user can focus on the parts that may be suspicious. Each part is done by the component that is best suited for it. The machine uses the rules to sift through and filter large amounts of data, and the user tries to understand what remains and tweaks the rules (or finds a way to fend off a break-in attempt).

Conclusions

Visualization cannot exist without visual representations, and those representations need to be designed so that they can be effectively and efficiently perceived. There is no question that more effective visual representations will result in better analysis and easier comprehension of data. But the images aren't everything.

There is also a vast open field of research that makes good use of statistics to enhance visualization. A few attempts at this exist [5], but a lot more can be done. Despite the relatively new field of visual analytics, visualization research is still very strongly focused on visual representation, with too little attention being paid to interaction, analysis, and cognitive effects.

And yet, visualization is much, much more than what it appears to be at first glance. The real power of visualization goes beyond visual representation and basic perception. Real visualization means interaction, analysis, and a human in the loop who gains insight. Real visualization is a dynamic process, not a static image. Real visualization does not puzzle, it informs.

Bibliography

[1] W. Aigner, S. Miksch, W. Müller, H. Schumann, and C. Tominski. Visual methods for analyzing time-oriented data. Transactions on Visualization and Computer Graphics, 14(1):47–60, 2008.

[2] R. Kosara. Visualization criticism — the missing link between information visualization and art. In Proceedings of the 11th International Conference on Information Visualisation (IV), pages 631–636. IEEE CS Press, 2007.

[3] A. V. Moere. information aesthetics. http://infosthetics.com/.

[4] B. Shneiderman. The eyes have it: A task by data type taxonomy for information visualization. In Proceedings Visual Languages, pages 336–343. IEEE Computer Society Press, 1996.

[5] C. A. Steed, P. J. Fitzpatrick, T. Jankun-Kelly, and J. E. S. II. Practical North Atlantic hurricane trend analysis using parallel coordinates and statistical techniques. In Proceedings of the 2008 Workshop on Geospatial Visual Analytics, 2008.

[6] Tableau software. http://tableausoftware.com/.

[7] L. Xiao, J. Gerth, and P. Hanrahan. Enhancing visual analysis of network traffic using a knowledge representation. In Proceedings Visual Analytics Science and Technology (VAST), pages 107–114. IEEE CS Press, 2006.

[8] N. Yau. FlowingData. http://flowingdata.com/.

Robert Kosara
UNC Charlotte
[email protected]
http://eagereyes.org/


Visualization, Graphics, and Statistics
Andrew Gelman and Antony Unwin¹

¹ We thank the Institute of Education Sciences for grants R305D090006-09A and ED-GRANTS-032309-005, and the National Science Foundation for grants SES-1023189 and SES-1023176.

Quantitative graphics, like statistics itself, is a young and immature field. Methods as fundamental as histograms and scatterplots are common now, but that was not always the case. More recent developments like parallel coordinate plots are still establishing themselves. Within academic statistics (and statistically inclined applied fields such as economics, sociology, and epidemiology), graphical methods tend to be seen as diversions from more "serious" analytical techniques. Statistics journals rarely cover graphical methods, and Howard Wainer has reported that, even in the Journal of Computational and Graphical Statistics, 80% of the articles are about computation, only 20% about graphics.

Outside of statistics, though, infographics and data visualization are more important. Graphics give a sense of the size of big numbers, dramatize relations between variables, and convey the complexity of data and functional relationships. Journalists and graphic designers recognize the huge importance of data in our lives and are always looking out for new modes of display, sometimes to more efficiently portray masses of information that their audiences want to see in detail (as with sports scores, stock prices, and poll reports), sometimes to help tell a story (as with annotated maps), and sometimes just for fun: a good data graphic can be as interesting as a photograph or cartoon.

We and other graphically-minded statisticians have been thinking a lot recently about the different perspectives of statisticians and graphic designers in displaying data. But first we would like to emphasize some key places in which we agree with the infographics community, some reasons why we and they generally prefer numbers to be graphed rather than written.

• A well-designed graph can display more information than a table of the same size, and more information than numbers embedded in text. Graphical displays allow and encourage direct visual comparisons.

• It has been argued that tables are commonly read as crude graphs: what you notice in a table of numbers are (a) the minus signs, and thus which values are positive and which are negative, and (b) the length of each number, that is, its order of magnitude. In a table of statistical results you might also note the boldface type or stars that indicate statistical significance. A table is a crude form of log-scale graph. If we really must display numbers in tables with many significant figures, it would probably generally be better to display them like this: 3.1416, so as not to distract the readers with those later unimportant digits.

• A graph can tell a story so easily. A line going up tells one story, a line going down tells another, and a line that goes up and then down is yet another possibility. It is the same with scatterplots and more elaborate displays. Yes, a table of numbers can tell a story too—especially in an area such as baseball where, as sabermetrician Bill James wrote, numbers such as .406 or 61 evoke images and history—but in general the possibilities of storytelling are greater and more direct with a graph. Storytelling is important in journalism and advertising (of course) but also in science, where data can either motivate and illustrate a logical argument or refute it.

In short, graphs are a good way to convey relationships and also reveal deviations from patterns, to display the expected and the unexpected.

Now we turn to differences between statistical graphics and infovis. In statistical graphics we aim for transparency, to display the data points (or derived quantities such as parameter estimates and standard errors) as directly as possible, without decoration or embellishment. As indicated by our remarks above, we tend to think of a graph as an improved version of a table. The good thing about this approach is that it keeps us close to the data. The bad thing is that it limits our audience. We as statisticians think we're keeping it simple and clean when we display a grid of scatterplots, but the general public—and even researchers in many scientific fields—don't have practice reading these graphs, and can often miss the point or simply tune out.

In contrast, practitioners of information visualization use data graphics more generally as a means of communication, in competition (and collaboration) with photographs, cartoons, interviews, and so forth. For example, a news article about health care costs might include some reportage (perhaps with some numbers gleaned from government documents), quotes from experts, an interview with a sick person who cannot get health insurance, a photograph of a high-tech MRI machine, a how-much-do-you-know quiz on the prices of medical procedures—and a data visualization showing medical costs and service use in different parts of the country. The visualization is graded partly on how cool it looks: "cool" grabs the reader's attention and draws him or her into the story.

We hope that, by recognizing our different goals and perspectives, graphic designers and statisticians can work together. For example, a website might feature a dramatic visualization that, when clicked on, reveals an informative static statistical graphic that, when clicked on, takes the interested reader to an interactive graphic and a spreadsheet with data summaries and raw numbers.

We illustrate some of our points with two examples. The first is Florence Nightingale's famous visualization of deaths in the Crimean War. Here is Nightingale's graph from 1858 (for more details, see Hugh Small's presentation at http://www.florence-nightingale-avenging-angel.co.uk/Coxcomb.htm):

Figure 1: Florence Nightingale's famous visualization of deaths in the Crimean War is attractive and draws the viewer in closer so as to understand what is being conveyed.

And now our presentation of the same information using R:


[Figure 2 consists of two time-series panels: "Mortality rates in the Crimean War from April 1854 to March 1856" (annual mortality rates per thousand, with lines for zymotic diseases; wounds and injuries; and all other causes, and an annotation marking the arrival of the sanitary commission) and "British Army Size in the Crimean War from April 1854 to March 1856" (average army size).]

Figure 2: Our re-plotting of Nightingale's data shows the data and their patterns much more clearly, but in a visually less striking way. As is often the case, two smaller plots can show data much more directly than is possible from a single graph, no matter how clever.

Nightingale's visualization and ours both have their strengths. When it comes to displaying the data and their patterns, we much prefer the plain statistical graphs. The most salient visual feature of Nightingale's graph is that a year is divided into twelve months, a fact that we already knew ahead of time. The trends and departures from trend are much clearer when plotted directly as time series. This is no criticism of Nightingale: the standard statistical techniques of today were not so easily available in the mid-1800s, and in any case her graph did the job of attracting attention better than ours do, in any era.

Nightingale's graph is intriguing and visually appealing—much more so than our bland graph—and, as is characteristic of the best infographics, the appeal is centered on the data display itself. A reader who sees this graph is invited to stare at it, puzzle it out, and understand what it is saying. In some ways, the weaknesses of the graph from a statistical point of view—it is difficult to read, the main conclusions to be drawn from the data are not clear, indeed it is a bit of a challenge to figure out exactly what the graph is saying at all—are strengths from the infovis perspective. Given that the graph is attractive enough, and the subject important enough, to motivate the reader to go in deeper, the challenges in reading the graph induce a larger intellectual investment in the viewer and a motivation to see the raw data.

And once policymakers were alerted by Nightingale's dramatic visualization, they were able to scan the columns of numbers directly and understand what was going on: the patterns in these time series are clear enough that we imagine a careful study of a tabular display would suffice. The role of the graph was to dramatize the problem and motivate people to go back and look at the numbers.

In a modern computing environment, a display such as Nightingale's could link to a more direct graphical presentation such as ours, which in turn could link to a spreadsheet with the data. The statistical graphic serves as an intermediate step, allowing readers to visualize the patterns in the data.

Our second example concerns the survival rates of different groups who sailed on the Titanic's maiden voyage. Figure 3 is a doubledecker plot showing the survival rates by sex (males on the left and females on the right) and, within sex, by class (first, second, third, crew). The widths of the bars are proportional to the numbers in each group, so that we get a rough idea of their relative sizes, though it is the survival rates that are of most interest.

It is easy to see two expected conclusions: that female survival rates were higher than male survival rates for all possible comparisons, and that female survival rates went down with class. It is also obvious, though more surprising, that the lowest male survival rate was in the second class. The fact that the male crew survival rate was higher than the male survival rates in the second and third classes must at least partly be due to the lifeboats being manned with crew members to accompany the passengers. All of these conclusions may be drawn directly from the display, but no one would claim it is an attention-grabbing graphic! We looked on the internet (a.k.a. googled) to see if these data had been presented in an infographic display and found several statistical displays, not all of them clear cut or easy to read, but no infographic ones. This is a good example where cooperation between statisticians and infographics experts could really pay off: we have an interesting dataset and several interesting conclusions to present, and we would like to do it in an attractive and stimulating way without losing any statistical clarity. Just wanting to do that is not enough; we need design expertise, and we look forward to someone from the infographics side taking up the challenge of helping us.

Andrew Gelman
Department of Statistics and Department of Political Science
Columbia University, New York
[email protected]
http://www.stat.columbia.edu/~gelman/

Antony Unwin
Department of Mathematics
University of Augsburg
[email protected]
http://www.rosuda.org

[Figure 3 shows bars for Male and Female, each subdivided by Class (1st, 2nd, 3rd, Crew) and shaded by Survived (Yes/No).]

Figure 3: A doubledecker plot showing the survival rates on the Titanic by sex and, within sex, by class. This graph shows several interesting comparisons but could benefit from improvement in graphic design.


Heuristic Methods in Finance
Enrico Schumann and David Ardia¹

¹ The views expressed in this paper are the sole responsibility of the authors and do not necessarily reflect those of VIP Value Investment Professionals AG, aeris CAPITAL AG, or any of their affiliates. Any remaining errors or shortcomings are the authors' responsibility.

Heuristic optimization methods and their application to finance are discussed. Two illustrations of these methods are presented: the selection of assets in a portfolio and the estimation of a complicated econometric model.

Heuristic methods in Finance

Models in finance

Finance is, at its very heart, all about optimization. In financial theory, decision makers are optimizers: households maximize their utility, firms maximize profits or minimize costs. In more applied work, we may look for portfolios that best match our preferences, or for trading rules that find the ideal point in time for buying or selling an asset. And of course, the ubiquitous estimation or calibration of model parameters is nothing but optimization.

In this note we will describe a class of numerical techniques, so-called heuristics, that can be used to solve optimization models. An optimization model consists of an objective function and possibly a number of constraints, i.e., the model is a precise, mathematical description of a given problem. But the process of financial modeling comprises two stages. We start with an actual problem – such as "how to invest?" – and translate this problem into a model; then we move from the model to its numerical solution. We will be concerned with the second stage. Yet we cannot overemphasize the importance of the first stage. It may be interesting to work on a challenging optimization model, but if the model is not useful then neither is its solution.

It turns out that many financial models are difficult to solve. For combinatorial models it is the size of the search space that causes trouble. Such problems typically have an exact solution method – write down all possible solutions and pick the best one – but this approach is almost never feasible for realistic problem sizes. For continuous problems, issues arise when the objective function is not smooth (e.g., has discontinuities or is noisy), or when there are many local optima. Even in the continuous case we could attempt complete enumeration by discretizing the domain of the objective function and running a grid search. But again, this approach is not feasible in practice once the dimensionality of the model grows.

Researchers and operators in finance often go a long way to make models tractable, that is, to formulate them such that they can compute the quantities of interest either in closed form or with the computational tools that are at hand. When it comes to optimization, models are often shaped such that they can be solved with "classical" optimization techniques like linear and quadratic programming. But this comes at a price: we have to construct the model in such a way that it fulfills the requirements of the particular method. For instance, we may need to choose a quadratic objective function, or approximate integers with real numbers. To paraphrase John Tukey, when we solve such a model we get a precise answer, but it becomes more difficult to say if we have asked the right question.

An alternative strategy is the use of heuristic optimization techniques, or heuristics for short. Heuristics aim at providing good and fast approximations to optimal solutions; to stay with Tukey's famous statement, heuristics may be described as seeking approximate answers to the right questions. (In theory, the solution of a model is the optimum; it is not necessary to speak of optimal solutions. But practically, a solution is rather the result that we get from a piece of software, so it is meaningful to distinguish between good and bad solutions.) Optimization heuristics are often very simple, easy to implement and to use; there are essentially no constraints on the model formulation; and any changes to the model are quickly implemented. But of course, there must be a downside: heuristics do not provide the exact solution of the model but only a stochastic approximation. Yet such an approximation may still be better than a poor deterministic solution or no solution at all. If a model can be solved with a classical method, it is no use to try a heuristic. The advantage comes when classical methods cannot solve the given model, i.e., when a model is difficult. Such models are far more common in finance than is sometimes thought.

What is a heuristic?

The aim in optimization is to

    minimize_x f(x, data)

with f a scalar-valued function and x a vector of decision variables. To maximize a function f, we minimize −f. In most cases, this optimization problem will be constrained. Optimization heuristics are a class of numerical methods that can solve such problems. Well-known examples are Simulated Annealing, Genetic Algorithms (or, more generally, Evolutionary Algorithms), and Tabu Search. It is hard to give a general definition of what constitutes a heuristic. Typically, the term is characterized through several criteria such as the following (e.g., Zanakis and Evans [12], Barr et al. [2]): (i) the method should give a "good" stochastic approximation of the true optimum, with "goodness" measured by computing time or solution quality; (ii) the method should be robust to changes in the given problem, in particular the problem size; (iii) the technique should be easy to implement; and (iv) implementing and using the technique should not require any subjective elements. Of course, such a definition is not unambiguous, and even in the optimization literature the term is used with different meanings.

How do heuristics work?

Very roughly, we can divide heuristics into iterative search methods and constructive methods. Constructive methods start with an empty solution and then build a solution in a stepwise manner by adding components until a solution is completed. An example: in a Traveling Salesman Problem we are given a set of cities and the distances between them. The aim is to find the route of minimal length such that each city is visited once. We could start with one city and then add the remaining cities one at a time (e.g., always choosing the nearest city) until a complete tour is created. The procedure terminates once we have found one complete solution.
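
As an illustration, here is a minimal R sketch of this constructive nearest-neighbour rule; the function name and the random example cities are ours.

    ## Build a tour by always visiting the nearest unvisited city.
    nearest_neighbour_tour <- function(D, start = 1) {
      n    <- nrow(D)
      tour <- start
      while (length(tour) < n) {
        last      <- tour[length(tour)]
        remaining <- setdiff(seq_len(n), tour)
        tour <- c(tour, remaining[which.min(D[last, remaining])])
      }
      tour
    }

    set.seed(1)
    xy <- matrix(runif(20), ncol = 2)            # 10 random cities in the unit square
    nearest_neighbour_tour(as.matrix(dist(xy)))  # one complete (not necessarily optimal) tour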

For iterative search methods, we repeatedly change an existing complete solution to obtain a new solution. Such methods are far more relevant in finance, so we will concentrate on them. To describe an iterative search method, we need to specify (i) how we generate new solutions from existing solutions, (ii) when to accept such a new solution, and (iii) when to stop the search. These three decisions define a particular method; in fact, they are the building blocks of many optimization methods, not just of heuristics. As an example, think of a steepest descent method. Suppose we have a current (or initial) solution x_c and want to find a new solution x_n. Then the rules could be as follows (a short code sketch follows the list):

(i) We estimate the slope (the gradient) of f at x_c, which gives us the search direction. The new solution x_n is then x_c − step size · ∇f(x_c).

(ii) If f(x_n) < f(x_c) we accept x_n, i.e., we replace x_c by x_n.

(iii) We stop if no further improvements in f can be found, or if we reach a maximum number of function evaluations.
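
A minimal R sketch of these three rules, using a finite-difference gradient; the step-size handling and the stopping values are illustrative choices, not a recommended implementation:

    steepest_descent <- function(f, x0, step = 0.1, tol = 1e-8, max_iter = 1000) {
      grad <- function(x, h = 1e-6)                  # numerical gradient of f at x
        sapply(seq_along(x), function(j) {
          e <- replace(numeric(length(x)), j, h)
          (f(x + e) - f(x - e)) / (2 * h)
        })
      xc <- x0
      for (k in seq_len(max_iter)) {
        g <- grad(xc)
        if (max(abs(g)) < tol) break                 # rule (iii): stop when no improvement is possible
        xn <- xc - step * g                          # rule (i): move along the negative gradient
        if (f(xn) < f(xc)) xc <- xn                  # rule (ii): accept only improvements
        else step <- step / 2                        # otherwise shrink the step and retry
      }
      xc
    }

    steepest_descent(function(x) sum((x - c(1, 2))^2), x0 = c(0, 0))  # converges to (1, 2)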

Problems will mostly occur with steps (i) and (ii). There are models in which the gradient does not exist, or cannot be computed meaningfully (e.g., when the objective function is not smooth). Hence we may need other approaches to compute a search direction. The acceptance criterion for steepest descent is strict: if there is no improvement, a candidate solution is not accepted. But if the objective function has several minima, this means we will never be able to move away from a local minimum, even if it is not the globally best one.

Heuristics follow the same basic pattern (i)–(iii), but they have different rules that are better suited for problems with noisy objective functions or multiple minima. In fact, almost all heuristics use one or both of the following principles.

Trust your luck. Classical methods are deterministic: given a starting value, they will always lead to the same solution. Heuristics make deliberate use of randomness. New solutions may be created by randomly changing old solutions, or we may accept new solutions only with a given probability.

Don't be greedy. When we compute new candidate solutions in the steepest descent method, we choose a (locally) optimal search direction (it is steepest descent, after all). Many heuristics put up with "good" search directions, in many cases even random directions. Also, heuristics generally do not enforce continuous improvements; inferior solutions may be accepted. This is inefficient for a well-behaved problem with a single optimum, but it allows these methods to move away from local minima.

As a concrete example, we look at Threshold Accepting (a variant of Simulated Annealing); a short code sketch follows the three rules below.

(i) We randomly choose an x_n close to x_c. For instance, when we estimate the parameter values of a statistical model, we could randomly pick one of the parameters and perturb it by adding a bit of noise.

(ii) If f(x_n) < f(x_c) we accept x_n, as before. But if f(x_n) > f(x_c), we also accept it as long as f(x_n) − f(x_c) is smaller than a fixed threshold (which explains the method's name), i.e., we accept a new solution that is worse than its predecessor, as long as it is not too much worse. Thus, we can think of Threshold Accepting as a biased random walk. (Simulated Annealing works the same way, but we would accept an inferior solution with a certain probability.)

(iii) We stop, say, after a fixed number of iterations.
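
Here is a minimal R sketch of such a Threshold Accepting loop for an unconstrained continuous problem; the neighbourhood, the threshold sequence, and the test function are illustrative choices of ours:

    threshold_accepting <- function(f, x0, n_iter = 5000, tau0 = 0.5) {
      xc <- x0
      fc <- f(xc)
      thresholds <- seq(tau0, 0, length.out = n_iter)  # shrink the threshold towards zero
      for (k in seq_len(n_iter)) {
        xn <- xc + rnorm(length(xc), sd = 0.05)        # random neighbour of the current solution
        fn <- f(xn)
        if (fn - fc < thresholds[k]) {                 # accept unless it is too much worse
          xc <- xn
          fc <- fn
        }
      }
      list(par = xc, value = fc)
    }

    ## A bumpy objective with many local minima:
    f <- function(x) sum(x^2) + sum(sin(20 * x))
    threshold_accepting(f, x0 = 2)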

Stochastic solutions

Almost all heuristics are stochastic algorithms. Running the same technique twice, even with the same starting values, will typically result in different solutions. Thus, we can treat the result (i.e., the decision variables x and the associated objective function value) of an optimization heuristic as a random variable with some distribution D. We do not know what D looks like, but there is a simple way to find out for a given problem: we run a reasonably large number of restarts, for each restart we store the results, and finally we compute the empirical distribution function of these results as an estimate of D. For a given problem (often problem class), the shape of D will depend on the chosen method. Some techniques will be more appropriate than others and give less variable and on average better results. And D will often depend on the settings of the method, most importantly the number of iterations – the search time – that we allow for.

Unlike classic optimization techniques, heuristics can escape from local minima. Intuitively then, if we let the algorithm search for longer, we can hope to find better solutions. Thus the shape of D is strongly influenced by the amount of computational resources spent (often measured by the number of objective function evaluations). For minimization problems, when we increase computational resources, the mass of D will move to the left, and the distribution will become less variable. Ideally, when we let the computing time grow ever longer, D should degenerate into a single point, the global minimum. Unfortunately, it's never possible to ensure this practically.

Illustrations

Asset selection with Local Search

We can make these ideas more concrete through an example, taken from Gilli et al. [5]; sample code is given in the book. Suppose we have a universe of 500 assets (for example, mutual funds), completely described by a given variance–covariance matrix, and we are asked to find an equal-weight portfolio with minimal variance under the constraint that we hold only between K_inf and K_sup assets in the portfolio. This is a combinatorial problem, and here are several strategies to obtain a solution.

(1) Write down all portfolios with feasible cardinality, compute the variance of each portfolio, and pick the one with the lowest variance.

(2) Choose k portfolios randomly and keep the one with the lowest variance.

(3) Sort the assets by their marginal variance. Then construct an equal-weight portfolio of the K_inf assets with the lowest variance, then a portfolio of the K_inf + 1 assets with the lowest variance, and so on, up to a portfolio of the K_sup assets with the lowest variance. Of those K_sup − K_inf + 1 portfolios, pick the one with the lowest variance.

Approach (1) is infeasible. Suppose we were to check cardinalities between 100 and 150. For 100 out of 500 alone we have 10^107 possibilities, and that leaves us 101 out of 500, 102 out of 500, and so on. Even if we could evaluate millions of portfolios in a second it would not help. Approach (2) has the advantage that it is simple, and we can scale computational resources (increase k). That is, we can use the trade-off between available computing time and solution quality. Approach (2) can be thought of as a sample substitute for Approach (1). In Approach (3) we ignore the covariation of assets (i.e., we only look at the main diagonal of the variance–covariance matrix), but we only have to check K_sup − K_inf + 1 portfolios. There may be cases, however, in which we would wish to include correlation.

We set up an experiment. We create an artificial data set of 500 assets, each with a randomly assigned volatility (the square root of variance) of between 20% and 40%. Each pairwise correlation is set to 0.6. We compute "best-of-k" portfolios (i.e., Approach (2)): we sample 1000 portfolios and keep only the best one; we also try "best-of-100 000" portfolios and, for intuition, "best-of-1" portfolios (i.e., purely random ones).

Figure 1, in its upper panel, shows the estimated cumulative distribution functions of portfolio volatilities; each curve is obtained from 500 restarts. Such an empirical distribution function is an estimate of D for the particular method. We see that completely random portfolios produce a distribution with a median of about 23.5%. (What would happen if we drew more portfolios? The shape of D would not change, since we are merely increasing our sample size. But our estimates of the tails would become more precise.) We also plot the distribution of the "best-of-1000" and "best-of-100 000" portfolios. For this latter strategy, we get a median volatility below 21%. We also add the results for Approach (3); there are no stochastics in this strategy.

Now let us try a heuristic. We use a simple Local Search. We start with a random feasible portfolio and compute its volatility. This is our current solution x_c, the best solution we have so far. We now try to improve it iteratively. In each iteration we compute a new portfolio x_n as follows. We randomly pick one asset from our universe. If this asset is already in the portfolio, we remove it; if it is not in the portfolio, we add it. Then we compute the volatility of this new portfolio. If it is lower than the old portfolio's volatility, we keep the new portfolio, i.e., x_n replaces x_c; if not, we stick with x_c. We include constraints in the simplest way: if a new portfolio has too many or too few assets, we always consider it worse than its predecessor and reject it. We run this search with 100, 1000, and 10 000 iterations. For each setting, we conduct 500 restarts; each time we register the final portfolio's volatility. Results are shown in the lower panel of Figure 1.
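
The following rough R sketch mirrors this Local Search (it is ours, not the code from Gilli et al. [5]); Sigma is the variance–covariance matrix, K_inf and K_sup are the cardinality bounds, and the artificial data at the end correspond to the experiment described above:

    portfolio_vol <- function(sel, Sigma) {            # volatility of an equal-weight portfolio
      w <- rep(1 / sum(sel), sum(sel))
      sqrt(drop(t(w) %*% Sigma[sel, sel, drop = FALSE] %*% w))
    }

    local_search <- function(Sigma, K_inf, K_sup, n_iter = 10000) {
      nA  <- nrow(Sigma)
      sel <- logical(nA)
      sel[sample(nA, sample(K_inf:K_sup, 1))] <- TRUE  # random feasible starting portfolio
      fc  <- portfolio_vol(sel, Sigma)
      for (i in seq_len(n_iter)) {
        new    <- sel
        j      <- sample(nA, 1)                        # pick one asset at random ...
        new[j] <- !new[j]                              # ... and add or remove it
        if (sum(new) < K_inf || sum(new) > K_sup) next # reject infeasible neighbours
        fn <- portfolio_vol(new, Sigma)
        if (fn < fc) { sel <- new; fc <- fn }          # keep only improvements
      }
      list(assets = which(sel), vol = fc)
    }

    ## Artificial data: 500 assets, volatilities between 20% and 40%, correlation 0.6.
    set.seed(42)
    vols  <- runif(500, 0.2, 0.4)
    C     <- matrix(0.6, 500, 500); diag(C) <- 1
    Sigma <- outer(vols, vols) * C
    local_search(Sigma, K_inf = 100, K_sup = 150)$vol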

[Figure 1 shows estimated cumulative distribution functions of portfolio volatility (horizontal axis roughly 0.16 to 0.24): the upper panel compares purely random portfolios, "best-of-1000", "best-of-100 000", and the sort-by-marginal-variance approach; the lower panel compares Local Search with 100, 1000, and 10 000 steps.]

Figure 1: Upper panel: random portfolios. For example, for one restart of the "best-of-100 000" strategy, we sample 100 000 portfolios and keep the best one. The distributions are estimated from 500 restarts. The sort-by-marginal-variance approach is deterministic, so its result is a constant. Lower panel: Local Search. Each distribution is estimated from 500 restarts.

Already with 1000 iterations we are clearly better than the "best-of-100 000" strategy (though we have used only one-hundredth of the function evaluations). With 10 000 iterations we seem to converge to a point at about 16%. A few remarks: first, we have no proof that we have found the global optimum. But we can have some confidence that we have found a good solution. Second, we can practically make the variance of D as small as we want. With more iterations (and possibly a few other refinements), we could, for all practical purposes, have the distribution "converge". But, third, in many cases we do not need to have D collapse; for financial problems a good solution is fine, given the quality of financial data [6, 7].


Econometric model fitting with Differential Evolution

Our second illustration is taken from Mullen et al. [8], who consider the estimation of a Markov-switching GARCH (MSGARCH) model. MSGARCH models are econometric models used to forecast the volatility of financial time series, which is of primary importance for financial risk management. The estimation of MSGARCH models is a non-linear constrained optimization problem and is a difficult task in practice. A robust optimizer is thus required. In that regard, the authors report the best performance for the Differential Evolution (DE) algorithm compared with traditional estimation techniques.

DE is a search heuristic introduced by Storn and Price [10] and belongs to the class of evolutionary algorithms. The algorithm uses biology-inspired operations of crossover, mutation, and selection on a population in order to minimize an objective function over the course of successive generations. Its remarkable performance as a global optimization algorithm on continuous problems has been extensively explored; see, e.g., Price et al. [9].

Let NP denote the number of parameter vec-tors (members) x ∈ Rd in the population, whered denotes the dimension. In order to create the ini-tial generation, NP guesses for the optimal valueof the parameter vector are made, either using ran-dom values between bounds or using values givenby the user. Each generation involves creation of anew population from the current population mem-bers {xi | i = 1, . . . , NP}, where i indexes the vec-tors that make up the population. This is accom-plished using differential mutation of the populationmembers. An initial mutant parameter vector vi iscreated by choosing three members of the popula-tion, xi1 , xi2 and xi3 , at random. Then vi is gen-erated as vi = xi1 + F · (xi2 − xi3), where F is apositive scale factor whose effective values are typ-ically less than one. After the first mutation opera-tion, mutation is continued until d mutations havebeen made, with a given crossover probability. Thecrossover probability controls the fraction of the pa-rameter values that are copied from the mutant.Mutation is applied in this way to each member ofthe population. The objective function values as-sociated with the children are then determined. Ifa trial vector has equal or lower objective functionvalue than the previous vector it replaces the previ-ous vector in the population; otherwise the previ-

Note that DE uses both strategies described above to overcome local minima: it does not only keep the best solution but accepts inferior solutions, too, since the method evolves a whole population of solutions in which some solutions are worse than others. And DE has a chance ingredient, as it randomly chooses the solutions to be mixed and mutated. For more details, see Price et al. [9] and Storn and Price [10].
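The following is a bare-bones R sketch of the DE iteration just described (differential mutation, crossover, selection). It is meant only to illustrate the mechanics and is not the implementation used in [1] or [8]; the objective function OF and the bound vectors lower and upper are assumed to be supplied by the user.

DEsketch <- function(OF, lower, upper,
                     NP = 50, F = 0.7, CR = 0.9, nGenerations = 200) {
    d <- length(lower)
    ## initial generation: NP random vectors between the bounds (one per column)
    P  <- matrix(runif(NP * d, lower, upper), nrow = d)
    Fv <- apply(P, 2, OF)
    for (g in seq_len(nGenerations)) {
        for (i in seq_len(NP)) {
            ## differential mutation: combine three other members chosen at random
            ## (bound handling after mutation is omitted for brevity)
            r <- sample(setdiff(seq_len(NP), i), 3)
            v <- P[, r[1]] + F * (P[, r[2]] - P[, r[3]])
            ## crossover: copy parameter values from the mutant with probability CR
            copy    <- runif(d) < CR
            u       <- P[, i]
            u[copy] <- v[copy]
            ## selection: the trial vector replaces the member if it is not worse
            fu <- OF(u)
            if (fu <= Fv[i]) {
                P[, i] <- u
                Fv[i]  <- fu
            }
        }
    }
    list(xbest = P[, which.min(Fv)], OFvalue = min(Fv))
}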

We report below some results of Mullen et al. [8], who fit their model to the Swiss Market Index. For the DE optimization, the authors rely on the package DEoptim [1], which implements DE in the R language [3]. For comparison, the model is also estimated using standard unconstrained and constrained optimization routines available in R, as well as more complex methods able to handle non-linear equality and inequality constraints. The model estimation is run 50 times for each optimization routine, with random starting values in the feasible parameter set used when needed (the same random starting values are used for the various methods). Boxplots of the objective function (i.e., the negative log-likelihood, which must be minimized) at the optimum for convergent estimations are displayed in Figure 2. We notice that the standard approaches (i.e., function optim with all its methods) perform poorly compared with the optimizers that can handle more complicated constraints (i.e., functions constrOptim, constrOptim.nl and solnp). DE compares favorably with the two best competitors in terms of negative log-likelihood values and is more stable across the runs.
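As a simple illustration of how DEoptim is called (not the MSGARCH code of [8], whose likelihood and constraints are more involved), the sketch below minimizes the two-dimensional Rosenbrock test function over box constraints; the control settings are arbitrary example values.

library(DEoptim)

## a standard test function standing in for the negative log-likelihood
rosenbrock <- function(x) 100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2

set.seed(42)
res <- DEoptim(fn = rosenbrock,
               lower = c(-2, -2), upper = c(2, 2),
               control = DEoptim.control(NP = 40, itermax = 200,
                                         F = 0.8, CR = 0.9, trace = FALSE))

res$optim$bestmem   ## best parameter vector found
res$optim$bestval   ## objective function value at that vector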

Conclusion

In this note, we have briefly described optimization heuristics, but of course we could only scratch the surface of how these methods work and where they can be applied. After all, for most people optimization is a tool, and what matters is how this tool is applied. Heuristics offer much in this regard: they allow us to solve optimization models essentially without restrictions on the functional form of the objective function or the constraints. Thus, when it comes to evaluating, comparing, and selecting models, researchers and operators can focus more on a model’s financial or empirical qualities instead of having to worry about how to handle it numerically. We have argued initially that financial modeling comprises two stages: putting an actual problem into model form, and then solving this model. With heuristics, we become much more powerful at the second stage; it remains to use this power in the first stage.


Figure 2: Boxplots of the 50 values of the objective function (i.e., the negative log-likelihood) at the optimum obtained by the various optimizers available in R: function optim with method "Nelder-Mead" (unconstrained), method "BFGS" (unconstrained), method "CG" (unconstrained), method "L-BFGS-B" (constrained), method "SANN" (unconstrained); function constrOptim (constrained); function constrOptim.nl of the package alabama [11]; function solnp of the package Rsolnp [4]; and function DEoptim of the package DEoptim [1]. More details can be found in Mullen et al. [8].


Bibliography

[1] Ardia, D., Mullen, K. M., Peterson, B. G., Ulrich, J., 2011. DEoptim: Differential Evolution Optimization in R. URL http://CRAN.R-project.org/package=DEoptim

[2] Barr, R. S., Golden, B. L., Kelly, J. P., Resende, M. G. C., Stewart, W. R., 1995. Designing and reporting on computational experiments with heuristic methods. Journal of Heuristics 1 (1), 9–32.

[3] R Development Core Team, 2009. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL http://www.R-project.org

[4] Ghalanos, A., Theussl, S., 2010. Rsolnp: General Non-linear Optimization Using Augmented Lagrange Multiplier Method. URL http://CRAN.R-project.org/package=Rsolnp

[5] Gilli, M., Maringer, D., Schumann, E., 2011. Numerical Methods and Optimization in Finance. Elsevier.

[6] Gilli, M., Schumann, E., 2010. Optimal enough? Journal of Heuristics 17 (4), 373–387.

[7] Gilli, M., Schumann, E., 2010. Optimization in financial engineering – an essay on ‘good’ solutions and misplaced exactitude. Journal of Financial Transformation 28, 117–122.

[8] Mullen, K. M., Ardia, D., Gil, D. L., Windover, D., Cline, J., 2011. DEoptim: An R package for global optimization by differential evolution. Journal of Statistical Software 40 (6), 1–26.

[9] Price, K. V., Storn, R. M., Lampinen, J. A., 2006. Differential Evolution: A Practical Approach to Global Optimization. Springer-Verlag, Berlin, Germany.

[10] Storn, R., Price, K., 1997. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11 (4), 341–359.

[11] Varadhan, R., 2010. alabama: Constrained Nonlinear Optimization. URL http://CRAN.R-project.org/package=alabama

[12] Zanakis, S. H., Evans, J. R., 1981. Heuristic “optimization”: Why, when, and how to use it. Interfaces 11 (5), 84–91.

Enrico Schumann
VIP Value Investment Professionals AG, [email protected]

David Ardia
aeris CAPITAL AG, [email protected]


Section Officers

Statistical Computing Section Officers 2011

Richard M. Heiberger, [email protected]

(215) 808-1808

Luke Tierney, Past [email protected]

(319) 335-3386

Karen Kafadar, [email protected]

(812) 855-7828

Usha S. Govindarajulu, Secretary/[email protected]

(617) 525-1237

Montserrat Fuentes, COMP/COS [email protected]

(919) 515-1921

David J. Poole, Program [email protected]

(973) 360-7337

Chris Volinsky, Program [email protected]

(973) 360-8644

Hadley A. Wickham, Publications [email protected]

(515) 450-8171

John Castelloe, Computing Section [email protected]

(919) 531-5728

Rick G. Peterson (see right)

Nicholas Lewin-Koh, Newsletter [email protected]

Statistical Graphics Section Officers 2011

Juergen Symanzik, [email protected]

(435) 797-0696

Simon Urbanek, [email protected]

(973) 360-7056

Heike Hofmann, [email protected]

(515) 294-8948

John W. Emerson, Secretary/[email protected]

(203) 215-3540

Mark Greenwood, GRPH COS Rep [email protected]
(406) 994-1962

Michael Lawrence, GRPH COS Rep [email protected]

(515) 708-3239

Webster West, Program [email protected]

(803) 351-5087

Hadley A. Wickham, Program [email protected]

(515) 450-8171

Donna F. Stroup, Council of [email protected]

(404) 218-0841

Rick G. Peterson, ASA Staff [email protected]

(703) 684-1221

Martin Theus, Newsletter [email protected]


The Statistical Computing & Statistical Graphics Newsletter is a publication of the Statistical Computing and Statistical Graphics Sections of the ASA. All communications regarding the publication should be addressed to:

Nicholas Lewin-Koh
Editor Statistical Computing [email protected]

Martin Theus
Editor Statistical Graphics [email protected]

All communications regarding ASA membership and the Statistical Computing and Statistical Graphics Sections, including change of address, should be sent to:

American Statistical Association
1429 Duke Street
Alexandria, VA 22314-3402 USA
TEL (703) 684-1221
FAX (703) [email protected]
