Baby Steps: Agile Transformation at BabyCenter.com
Post on 10-Mar-2017
In 2003, BabyCenter.com looked a lot like other emerging Internet software sites: a fragile production environment, little institutional knowledge, and lots of tech debt. The site offered expert content for new and expecting parents, a store, and a community bulletin-board service supported by an engineering staff of four Java and three Web developers. The number of projects exceeded the number of developers, and the projects were large, with delivery dates set long before anyone had scoped the work. Network operations for a site that got 5 million daily page views consisted of three people.
Of course, there were also rotating triple-A priorities, work assignments by email thread, and late delivery of waterfall requirements and design specifications. By some accounts, 85 percent of engineering time was spent fighting fires in production, fixing critical bugs, and maintaining the system, not on the new feature development that management and users wanted.
But by 2008, BabyCenter had established a production environment with 99.85 percent uptime, while tripling its engineering resources, rebuilding all of its core properties, moving into the social media arena, and delivering international sites to more than 10 countries. The unyielding regularity of feature rollouts increased the trust between engineering and business managers, who can now predict to within a day when a new feature will hit production. BabyCenter.com is the leading Internet destination in its niche worldwide, with 6 million visitors per month.
First Steps with Scrum
So what happened? While recovering from back surgery, BabyCenter's vice president of engineering read Agile Software Development with Scrum, coauthored by Ken Schwaber, a leader in agile development and cocreator of the scrum process framework (see the sidebar, "An Overview of Scrum"). On returning to work, the VP gave copies of Schwaber's book to the company's developers and other business stakeholders. One engineer and a product manager decided to run with the process in a small, noncritical pilot project. They used month-long scrum sprints and daily stand-up meetings, and both sides appreciated the results.
Three months later, the pilot project was deemed a success, and the engineer moved to a longer-term project with other engineers. In addition, the company sent 10 people to Schwaber's certified scrummaster class in San Jose. These people included the engineering VP, lead developers, all product managers, and the QA lead. Toward the end of this period, BabyCenter hired a chief architect with experience in agile development and concepts such as scrum.
Cultural Changes
While the first half-year focused on exploratory pilot projects and training, the company then began to build the foundations of cultural change and best practices that would shape the marathon ahead.
During this transition phase, engineers and product managers cooperated to implement all the pieces of scrum, including the product owner (PO), team, and scrummaster process roles. They created product backlogs and held sprint reviews. They continued to grow their knowledge by inviting agile guru Mike Cohn to educate all engineers, product managers, and other interested parties in user story writing and estimation techniques. Finally, we created a company backlog: a prioritized list of the top projects and the engineers working on them.
Creating the company backlog was a wrenching cultural change that deflated the all-too-common fantasy that engineering could squeeze in an unlimited number of projects. Requiring the business stakeholders to enumerate all the desired projects and then force-rank them clarified the reality of limited resources.
Keith Nottonson, Yahoo!
Ken DeLong, BabyCenter.com
1520-9202/08/$25.00 © 2008 IEEE. Published by the IEEE Computer Society. IT Pro, September/October 2008. computer.org/ITPro
Many stakeholders were unhappy when a favorite project fell below others on the list. However, the prioritization put the onus on business stakeholders to supply compelling business reasons for their projects. Given the prioritized company backlog, engineers could remain focused on higher-value projects deemed to provide good ROI to the company.
"No engineer left behind" was a new policy implemented to ensure that no engineer was stranded on a team alone. If there's only one engineer on a team, code reviews and pair programming are extremely difficult. Getting stuck or sick can halt development indefinitely. Learning new skills, such as test-driven development (TDD), and institutionalizing knowledge become practically impossible.
To implement the new policy, engineering reduced the number of agile teams until each team included at least two engineers. At first, not everyone could see the need for this arrangement, but the benefits of policies that encouraged and supported software engineering best practices soon showed up in consistently predictable product releases.
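The interplay between the force-ranked company backlog and the two-engineer minimum can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BabyCenter's actual tooling: the `Project` class, the `staff_projects` function, the round-robin assignment, and all sample names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    rank: int  # 1 = top of the force-ranked company backlog (hypothetical field)


def staff_projects(backlog, engineers, min_team_size=2):
    """Staff only as many top-ranked projects as the team-size minimum allows.

    Returns (teams, deferred): a dict mapping project name to its engineers,
    and the list of projects that must wait for capacity.
    """
    if min_team_size < 1:
        raise ValueError("min_team_size must be at least 1")

    ranked = sorted(backlog, key=lambda p: p.rank)
    # With N engineers and a minimum of k per team, at most N // k teams exist.
    max_teams = len(engineers) // min_team_size
    active, deferred = ranked[:max_teams], ranked[max_teams:]

    teams = {p.name: [] for p in active}
    if active:
        # Assumption: engineers are dealt out round-robin; a real assignment
        # would also weigh skills and project size.
        for i, engineer in enumerate(engineers):
            teams[active[i % len(active)].name].append(engineer)
    return teams, deferred


if __name__ == "__main__":
    backlog = [Project("store", 3), Project("community", 1),
               Project("newsletter", 4), Project("content", 2)]
    engineers = ["e1", "e2", "e3", "e4", "e5", "e6", "e7"]
    teams, deferred = staff_projects(backlog, engineers)
    print(teams)
    print([p.name for p in deferred])
```

With seven engineers and a minimum team size of two, at most three projects can be active at once, so everything ranked fourth or lower waits. That is the limit on capacity that the force-ranked list makes visible to stakeholders.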
The PO's role in scrum was another tricky cultural issue. Previously, product managers worked more on the creative side, not really wishing to get down into the software details. But the scrum process demands more PO involvement.
An Overview of Scrum
Scrum is a lightweight process framework, commonly used with agile software development. It provides a simple set of rules that govern the development process.
Scrum focuses on time-boxed iterations that increase feedback for continuous improvement to both the development product and process. Scrum is based on the lean software development ideas that came out of Toyota for manufacturing cars. Hirotaka Takeuchi and Ikujiro Nonaka first described these ideas in 1986 in the Harvard Business Review [1]. In 1995, Ken Schwaber, a cocreator with Jeff Sutherland of the scrum process, presented a seminal paper describing it at an OOPSLA workshop [2].
The scrum model describes three development process roles: product owner (PO), team, and scrummaster. Simply put, the PO defines and prioritizes a work product, the team does the work, and the scrummaster removes all obstacles.
The process is iterative and incremental (see Figure A). The team works for a period of time called a sprint, usually one to four weeks. A sprint begins with a sprint-planning meeting, where the PO presents a prioritized list of user requirements expressed as stories, which describe a minimal set of marketable product features. The prioritized list is called the product backlog. Sprints include 15-minute daily stand-up meetings with all team members standing in a circle telling their teammates what they finished yesterday, what they plan to complete today, and what obstacles are holding them back. The scrummaster is responsible for handling the obstacles.
Each sprint concludes with a sprint review, where the team shows the PO which stories it successfully implemented. The team also meets for a sprint retrospective, where members discuss what went well and what could go better in the next sprint.
Then it starts all over again.
Figure A. Scrum iterative process. Products are developed in one- to four-week sprints that generate working software increments. (Diagram labels: Sprint; Working increment of the software.)
References
1. H. Takeuchi and I. Nonaka, "The New New Product Development Game," Harvard Bus. Rev., 1 Jan. 1986.
2. K. Schwaber, "Scrum Development Process," Business Object Design and Implementation: OOPSLA '95 Workshop Proc., J. Sutherland et al., eds., Springer, 1997.
BabyCenter 2.0
Growing up is hard to do. In 2007, technical reasons, such as an aging infrastructure and high proprietary-software costs, prompted BabyCenter.com to rebuild its core infrastructure on open source Java platforms. At the same time, the company decided to move to a new data center. The project would consume the entire engineering team's time for the better part of a year. Consequently, it also offered the ideal opportunity to revamp not only the technical platform but also the whole product development culture.
The BabyCenter 2.0 project began by adopting two-week sprints, TDD, and the Framework for Integrated Test (FIT) automated functional test framework. The process evolved quite a bit over the project's course. Sprint retrospectives were vital. Because BabyCenter was working on an entirely new technology stack, the team couldn't push to production after each sprint. Furthermore, the early user stories were ill-defined ("I want to port this feature"). Finally, the team lacked the final user-interface requirements, which initially left many stories unfinished. These stories had to be split and dragged into new sprints or left in the purgatory of PMD ("pretty much done"). Many nights, the team went home with a deep understanding of why scrum is called "controlling chaos."
But the results were impressive. The majority of functionality was rebuilt and some new features were added, going from zero lines of code to a production launch in seven and a half months. From the first day, the site handled between 3 and 6 million page views per day. The first three weeks after launch included not a single emergency code fix or on-call event. This caught the team at a loss because we had set aside two weeks for firefighting.
After the initial launch, the ship tightened up enormously. We enforced the discipline of rolling code to production at the end of every sprint, and we stopped rolling code fixes in between unless they involved a priority 1 (P1) bug. Many people in both engineering and business operations thought it couldn't be done this way. Engineers wanted longer sprints; business managers wanted to be able to roll a file on a whim. However, the process supporters held firm.
One year after launching BabyCenter 2.0, roughly one-third of the release branches have zero check-ins after being cut, and most of the rest need fewer than four small bug fixes before release. Rollouts typically pass unnoticed (from the firefighting point of view), and on-call weeks often pass without a single event. The team's pace has become predictable. At one point, it was moving faster than the business could supply requirements.
BabyCenter Adapts to Scrum
Making all this happen required two major cultural changes. The first involved BabyCenter's adaptations to the scrum process.
PO involvement. A new PO group came onboard near the time of BabyCenter 2.0's launch. Not only were they friendly to scrum and agile ideas, but they were also technical enough to understand what the engineering team was doing, even suggesting solutions from time to time.
The new PO group took seriously several practices that are crucial to success but had never been truly embraced by the organization before. They attend daily stand-ups as well as demos and planning meetings. They are familiar with the actual project software, not just the requirements document wish list.
Stakeholder involvement. For many years, we had trouble getting stakeholders to actually look at the software. Now, they know that if they don't review a feature before the final pre-launch QA, it will go into production and, unless it's a P1 bug, won't change for two weeks.
This discipline has increased stakeholder involvement significantly and eliminated a lot of frustration and angry encounters. Also, because a feature can, in fact, be changed in two weeks, the pressure to get it right the first time is alleviated. More of a "test and learn" culture is emerging.
User stories. The quality of requirements has improved. Rather than vague functional clumps ("build a calendar tool that does cool stuff"), we get real user stories ("I want to see my calendar entry for today based on my baby's due date"). This helps keep features focused and small.
Minimum deployable features. As a corollary to user stories, focusing on minimum deployable features lets us break a feature into small user stories that often capture basic functionality in one sprint. We can then refine and iterate in later sprints on the basis of actual user data.
Systematic planning. Missing a prioritization meeting means missing your chance to get on the sprint. Gone are the days when a business manager or other stakeholder could run to an individual engineer at an arbitrary time and ask for some pet project to be implemented. Engineers know that they have management support when they say, "Sorry, that's not on the sprint; go talk to my PO."
BabyCenter Adapts Scrum
At the same time that BabyCenter has adapted to scrum, it has also adapted the process to some specific company needs.
Two-tiered planning. The company keeps a list of stories for the current sprint and the next. Beyond this one-month horizon, there are no stories, only themes.
Each month the POs revisit the theme plan, rejiggering according to business needs. Then they start breaking the upcoming month's work into user stories with more clearly defined requirements.
PO prioritization meetings. BabyCenter has a large number of stakeholders. To deal with this, the POs instituted a prioritization meeting one week before sprint planning. They present the story point budget for the upcoming sprint at this meeting, and the horse-trading begins. Stories that make the next sprint cut are rolled out to production in almost exactly four weeks.
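The arithmetic behind a sprint cut against a story point budget can be sketched as follows. This is a minimal illustration, not the procedure BabyCenter used: the `cut_sprint` function, the sample stories, and the rule that a smaller, lower-ranked story may fill leftover budget are all assumptions. The negotiation itself happens between people; the code only shows what the budget allows once a ranking is agreed.

```python
def cut_sprint(ranked_stories, point_budget):
    """Pick stories in priority order until the story point budget is spent.

    ranked_stories: list of (title, points) tuples, highest priority first.
    Returns (selected, deferred, remaining_points).
    """
    if point_budget < 0:
        raise ValueError("point_budget must not be negative")

    selected, deferred = [], []
    remaining = point_budget
    for title, points in ranked_stories:
        if points <= remaining:
            selected.append(title)
            remaining -= points
        else:
            # Assumption: a story that does not fit is deferred, and smaller
            # stories further down the ranking may still use what is left.
            deferred.append(title)
    return selected, deferred, remaining


if __name__ == "__main__":
    # Hypothetical stories and estimates, for illustration only.
    stories = [("due-date calendar entry", 5),
               ("photo upload", 8),
               ("newsletter opt-in", 3),
               ("fix profile link", 2)]
    selected, deferred, remaining = cut_sprint(stories, point_budget=10)
    print("in sprint:", selected)
    print("deferred:", deferred)
    print("unused points:", remaining)
```

A stricter variant would stop at the first story that does not fit, which keeps the ranking absolute at the cost of leaving some budget unused.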
Sprint preview. Two days prior to the sprint, the team attends a sprint preview. This gives team members a chance to look over the sprint and discuss implementation options, with time to sleep on it, so to speak.
Team leads' prep. The day before sprint planning, the team leads often go through the planning and tracking tool and assign tasks for the predictable and repetitive items. The actual sprint-planning meeting can then focus on the new, interesting, and risky stories. However, this was sometimes taken too far and left the team feeling as if it had lost ownership of the implementation. So the team is now careful about how much it uses this practice.
In-sprint visibility. The scrummaster sends an email to engineering and product management every day, detailing the deliverables the team is waiting for and any obstacles. Public visibility increases the likelihood of timely resolution enormously. Also during the sprint, the team holds biweekly meetings with business owners to review completed stories. This allows time for feedback within the sprint and serves as a kind of distributed demo.
BabyCenter.com has become a world-class marathon runner over the past four years. Starting with those first baby steps, it has moved steadily, walking its way into agile thinking, tools, and processes. The company now efficiently and consistently releases prioritized marketable features. Other teams can incorporate these lessons, regardless of their agi...