Architectures You've Always Wondered About eMag







From InfoQ


  • Architectures You Always Wondered About // eMag Issue 31 - Aug 2015

    eMag Issue 31 - August 2015


    Service Architectures at Scale


    Eric Evans on DDD at 10

    ARTICLE Microservice trade-offs


    Architectures you Always Wondered About

    Lessons learnt from adopting Microservices at eBay, Google, Gilt, Hailo and nearForm


    Martin Fowler on Microservice Trade-Offs

    Many development teams have found the microservices architectural style to be a superior approach to a monolithic architecture. But other teams have found them to be a productivity-sapping burden. Like any architectural style, microservices bring costs and benefits. To make a sensible choice you have to understand these and apply them to your specific context.

    Eric Evans on the interplay of Domain-Driven Design, microservices, event-sourcing, and CQRS

    The interview covers an introduction to DDD, how the community's understanding of DDD has changed in the last 10 years, strategic design, how to use DDD to design microservices, and the connection between microservices and the DDD bounded context.

    Lessons Learned Adopting Microservices at Gilt, Hailo and nearForm

    This article contains an extensive interview on the microservices adoption process, the technologies used, and the benefits and difficulties of implementing microservices, with representatives from Gilt, Hailo and nearForm.

    Building a Modern Microservices Architecture at Gilt

    After living with microservices for three years, Gilt can see advantages in team ownership, boundaries defined by APIs, and complex problems broken down into small ones, Yoni Goldberg explained in a presentation at the QCon London 2015 conference. Challenges still exist in tooling, integration environments, and monitoring.

    Evolutionary Architecture

    Randy Shoup talks about designing and building microservices based on his experience of working at large companies such as Google and eBay. Topics covered include the real impact of Conway's law, how to decide when to move to a microservice-based architecture, organizing team structure around microservices, and where to focus on the standardization of technology and process.

    Service Architectures at Scale: Lessons from Google and eBay

    Randy Shoup discusses modern service architectures at scale, using specific examples from both Google and eBay. He covers some interesting lessons learned in building and operating these sites. He concludes with a number of experience-based recommendations for other smaller organizations evolving to - and sustaining - an effective service ecosystem.









    This eMag has had an unusual history. When we started to plan it, the intent had been to look at the different architectural styles of a number of the well-known Silicon Valley firms. As we started to work on it, though, it became apparent that nearly all of them had, at some level, converged towards the same architectural style - one based on microservices, with DevOps and some sort of agile (in the broadest sense) management approach.

    According to ThoughtWorks Chief Scientist Martin Fowler, the term microservice was discussed at a workshop of software architects near Venice in May 2011, to describe what the participants saw as a common architectural style that many of them had begun exploring recently. In May 2012, the same group decided on microservices as the most appropriate name.

    When we first started talking about the microservices architectural style at InfoQ in 2013, I think many of us assumed that its inherent operational complexity would prevent the approach from being widely adopted particularly quickly. Yet a mere three years on from the term being coined it has become one of the most commonly cited approaches for solving large-scale horizontal scaling problems, and most large web sites, including Amazon and eBay, have evolved from a monolithic architecture to a microservices one. Moreover, the style has spread far beyond its Bay Area roots, seeing widespread adoption in many organisations.

    In this eMag we take a look at the state of the art in both theory and practice.

    Martin Fowler provides a clear and concise summary of the trade-offs involved when choosing to work with the style.

    Eric Evans talks about the interplay of Domain-Driven Design, microservices, event-sourcing, and CQRS.

    Randy Shoup describes experiences of working with microservices from his time at eBay and Google. He focuses on the common evolutionary path from monoliths to microservices and paints a picture of a mature services environment at Google. In a follow-up interview he elaborates on some of the lessons from this experience.

    Then Abel Avram speaks to three companies - Gilt, Hailo and nearForm - about their experiences, covering both building a microservices platform from scratch and re-architecting a monolithic platform by gradually introducing microservices. In the follow-up presentation summary we take a more detailed look at Gilt.

    Charles took over as head of the editorial team at InfoQ in March 2014, guiding content creation including news, articles, books, video presentations, and interviews. Prior to taking on the full-time role at InfoQ, Charles led InfoQ's Java coverage, and was CTO for PRPi Consulting, a remuneration research firm that was acquired by PwC in July 2012. For PRPi, he had overall responsibility for the development of all the custom software used within the company. He has worked in enterprise software for around 20 years as a developer, architect, and development manager.



    Microservice trade-offs by Martin Fowler


    Many development teams have found the microservices architectural style to be a superior approach to a monolithic architecture. But other teams have found them to be a productivity-sapping burden. Like any architectural style, microservices bring costs and benefits. To make a sensible choice you have to understand these and apply them to your specific context.

    Martin Fowler is an author, speaker, and general loud-mouth on software development. He's long been puzzled by the problem of how to componentize software systems, having heard more vague claims than he's happy with. He hopes that microservices will live up to the early promise its advocates have found.

    Microservices provide benefits:

    Strong Module Boundaries: Microservices reinforce modular structure, which is particularly important for larger teams.

    Independent Deployment: Simple services are easier to deploy, and since they are autonomous, are less likely to cause system failures when they go wrong.

    Technology Diversity: With microservices you can mix multiple languages, development frameworks and data-storage technologies.


    ...but come with costs:

    Distribution: Distributed systems are harder to program, since remote calls are slow and are always at risk of failure.

    Eventual Consistency: Maintaining strong consistency is extremely difficult for a distributed system, which means everyone has to manage eventual consistency.

    Operational Complexity: You need a mature operations team to manage lots of services, which are being redeployed regularly.

    Strong Module Boundaries

    The first big benefit of microservices is strong module boundaries. This is an important benefit yet a strange one, because there is no reason, in theory, why a microservice should have stronger module boundaries than a monolith.

    So what do I mean by a strong module boundary? I think most people would agree that it's good to divide up software into modules: chunks of software that are decoupled from each other. You want your modules to work so that if I need to change part of a system, most of the time I only need to understand a small part of that system to make the change, and I can find that small part pretty easily. Good modular structure is useful in any program, but becomes exponentially more important as the software grows in size. Perhaps more importantly, it grows more in importance as the team developing it grows in size.

    Advocates of microservices are quick to introduce Conway's Law, the notion that the structure of a software system mirrors the communication structure of the organization that built it. With larger teams, particularly if these teams are based in different locations, it's important to structure the software to recognize that inter-team communications will be less frequent and more formal than those within a team. Microservices allow each team to look after relatively independent units with that kind of communication pattern.

    As I said earlier, there's no reason why a monolithic system shouldn't have a good modular structure. [1] But many people have observed that it seems rare, hence the Big Ball of Mud is the most common architectural pattern. Indeed this frustration with the common fate of monoliths is what's driven several teams to microservices. The decoupling with modules works because the module boundaries are a barrier to references between modules. The trouble is that, with a monolithic system, it's usually pretty easy to sneak around the barrier. Doing this can be a useful tactical shortcut to getting features built quickly, but done widely such shortcuts undermine the modular structure and trash the team's productivity. Putting the modules into separate services makes the boundaries firmer, making it much harder to find these cancerous workarounds.

    An important aspect of this coupling is persistent data. One of the key characteristics of microservices is Decentralized Data Management, which says that each service manages its own database and any other service must go through the service's API to get at it. This eliminates Integration Databases, which are a major source of nasty coupling in larger systems.
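    The shape of this rule can be sketched in a few lines. This is a minimal illustration, not from the article, and the service names (`InventoryService`, `OrderService`) are hypothetical: each service keeps its store private, and collaborators reach the data only through the owning service's public API.

    ```python
    # Sketch of Decentralized Data Management: each service owns its store,
    # and other services reach the data only through its public API.
    # All names here are hypothetical illustrations.

    class InventoryService:
        def __init__(self):
            self._stock = {"sku-1": 5}   # private store: no other service touches it

        def stock_level(self, sku: str) -> int:
            """The public API: the only way in to the data."""
            return self._stock.get(sku, 0)

    class OrderService:
        def __init__(self, inventory: InventoryService):
            self._inventory = inventory  # depends on the API, not the database

        def can_fulfil(self, sku: str, qty: int) -> bool:
            return self._inventory.stock_level(sku) >= qty

    inventory = InventoryService()
    orders = OrderService(inventory)
    assert orders.can_fulfil("sku-1", 3)       # 5 in stock covers 3
    assert not orders.can_fulfil("sku-1", 9)   # but not 9
    ```

    Because `OrderService` never touches `_stock` directly, the inventory team can change its storage representation freely: the coupling an Integration Database would create never forms.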

    It's important to stress that it's perfectly possible to have firm module boundaries with a monolith, but it requires discipline. Similarly you can get a Big Ball of Microservice Mud, but it requires more effort to do the wrong thing. The way I look at it, using microservices increases the probability that you'll get better modularity. If you're confident in your team's discipline, then that probably eliminates that advantage, but as a team grows it gets progressively harder to keep disciplined, just as it becomes more important to maintain module boundaries.

    This advantage becomes a handicap if you don't get your boundaries right. This is one of the two main reasons for a Monolith First strategy, and why even those more inclined to run with microservices early stress that you can only do so with a well-understood domain.

    But I'm not done with caveats on this point yet. You can only really tell how well a system has maintained modularity after time has passed. So we can only really assess whether microservices lead to better modularity once we see microservice systems that have been around for at least a few years. Furthermore, early adopters tend to be more talented, so there's a further delay before we can assess the modularity advantages of microservice systems written by average teams. Even then, we have to accept that average teams write average software, so rather than compare the results to top teams we have to compare the resulting software to what it would have been under a monolithic architecture - which is a tricky counter-factual to assess.

    All I can go on for the moment is the early evidence I have heard from people I know who have been using this style. Their judgement is that it is significantly easier to maintain their modules.

    One case study was particularly interesting. The team had made the wrong choice, using microservices on a system that wasn't complex enough to cover the Microservice Premium. The project got in trouble and needed to be rescued, so lots more people were thrown onto the project. At this point the microservice architecture became helpful, because the system was able to absorb the rapid influx of developers and the team was able to leverage the larger team numbers much more easily than is typical with a monolith. As a result the project accelerated to a productivity greater than would have been expected with a monolith, enabling the team to catch up. The result was still a net negative, in that the software cost more staff-hours than it would have done if they had gone with a monolith, but the microservices architecture did support ramp-up.

    Distribution

    So microservices use a distributed system to improve modularity. But distributed software has a major disadvantage: the fact that it's distributed. As soon as you play the distribution card, you incur a whole host of complexities. I don't think the microservice community is as naive about these costs as the distributed objects movement was, but the complexities still remain.

    The first of these is performance. You have to be in a really unusual spot to see in-process function calls turn into a performance hot spot these days, but remote calls are slow. If your service calls half-a-dozen remote services, each of which calls another half-a-dozen remote services, these response times add up to some horrible latency characteristics.

    Of course you can do a great deal to mitigate this problem. Firstly you can increase the granularity of your calls, so you make fewer of them. This complicates your programming model: you now have to think of how to batch up your inter-service interactions. It will also only get you so far, as you are going to have to call each collaborating service at least once.
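    The batching idea can be sketched as follows. This is an illustration with hypothetical names (`price`, `prices`, `remote`), using a counter to stand in for network round trips: a coarse-grained endpoint serves the whole batch in one hop instead of one hop per item.

    ```python
    # Sketch of coarsening call granularity: a batch of items costs one
    # simulated network hop instead of one hop per item. All names hypothetical.

    CALLS = {"count": 0}
    PRICES = {"a": 10, "b": 20, "c": 30}

    def remote(fn, *args):
        """Stand-in for a network round trip; counts each hop."""
        CALLS["count"] += 1
        return fn(*args)

    def price(sku):                      # fine-grained endpoint: one SKU per call
        return PRICES[sku]

    def prices(skus):                    # coarse-grained endpoint: many SKUs per call
        return {s: PRICES[s] for s in skus}

    fine = [remote(price, s) for s in ("a", "b", "c")]   # 3 hops
    batched = remote(prices, ("a", "b", "c"))            # 1 hop
    assert sum(fine) == sum(batched.values()) == 60
    assert CALLS["count"] == 4           # 3 fine-grained hops + 1 batched hop
    ```

    The trade-off Fowler describes is visible in the signature: the caller now has to gather its needs up front rather than asking for each item as it goes.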

    The second mitigation is to use asynchrony. If you make six asynchronous calls in parallel you're now only as slow as the slowest call instead of the sum of their latencies. This can be a big performance gain, but comes at another cognitive cost. Asynchronous programming is hard: hard to get right, and much harder to debug. But most microservice stories I've heard need asynchrony in order to get acceptable performance.
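    The six-calls-in-parallel point can be demonstrated in a few lines of asyncio. The service names and delays here are illustrative, with `asyncio.sleep` standing in for remote latency: sequentially the six calls would take the sum of their delays, in parallel they take roughly the longest one.

    ```python
    # Sketch: six calls issued in parallel finish in roughly the time of the
    # slowest one, not the sum of all six latencies. Names/delays illustrative.
    import asyncio
    import time

    async def call_service(name: str, latency: float) -> str:
        await asyncio.sleep(latency)               # stand-in for a remote call
        return name

    async def fan_out():
        start = time.monotonic()
        results = await asyncio.gather(
            *(call_service(f"svc-{i}", 0.05 * (i + 1)) for i in range(6))
        )
        elapsed = time.monotonic() - start
        return results, elapsed

    results, elapsed = asyncio.run(fan_out())
    # Sequential would take ~1.05s (0.05 + 0.10 + ... + 0.30); parallel ~0.30s.
    assert len(results) == 6 and elapsed < 0.9
    ```

    The cognitive cost shows up as soon as one of the gathered calls can fail: you now have to decide what the combined result means when some branches succeed and others do not.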

    Right after speed is reliability. You expect in-process function calls to work, but a remote call can fail at any time. With lots of microservices, there are even more potential failure points. Wise developers know this and design for failure. Happily the kinds of tactics you need for asynchronous collaboration also fit well with handling failure, and the result can improve resiliency. That's not much compensation, however; you still have the extra complexity of figuring out the consequences of failure for every remote call.
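    One common design-for-failure tactic can be sketched as a fallback wrapper. The names here (`get_recommendations`, `with_fallback`) are hypothetical illustrations: when the remote call fails, the caller degrades gracefully to a safe default instead of propagating the error.

    ```python
    # Sketch of designing for failure: wrap a remote call so that when it
    # fails, the caller degrades gracefully. All names are hypothetical.

    def get_recommendations(user_id: str) -> list:
        # Stand-in for a remote call that happens to be failing right now.
        raise ConnectionError("recommendation service unreachable")

    def with_fallback(call, fallback):
        """Run a remote call; on failure, return a safe default instead."""
        try:
            return call()
        except ConnectionError:
            return fallback

    POPULAR = ["item-1", "item-2"]       # precomputed, always-available default
    items = with_fallback(lambda: get_recommendations("u-42"), POPULAR)
    assert items == POPULAR              # degraded, but the page still renders
    ```

    This is the "figuring out the consequences of failure for every remote call" work Fowler describes: someone has to decide, call by call, what an acceptable degraded answer looks like.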

    And that's just the top two Fallacies of Distributed Computing.

    There are some caveats to this problem. Firstly, many of these issues crop up with a monolith as it grows. Few monoliths are truly self-contained; usually there are other systems, often legacy systems, to work with. Interacting with them involves going over the network and running into these same problems. This is why many people are inclined to move more quickly to microservices to handle the interaction with remote systems. This issue is also one where experience helps: a more skillful team will be better able to deal with the problems of distribution.

    But distribution is always a cost. I'm always reluctant to play the distribution card, and think too many people go distributed too quickly because they underestimate the problems.

    Eventual Consistency

    I'm sure you know websites that need a little patience. You make an update to something, it refreshes your screen and the update is missing. You wait a minute or two, hit refresh, and there it is.

    This is a very irritating usability problem, and is almost certainly due to the perils of eventual consistency. Your update was received by the pink node, but your get request was handled by the green node. Until the green node gets its update from pink, you're stuck in an inconsistency window. Eventually it will be consistent, but until then you're wondering if something has gone wrong.
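    The pink/green scenario can be simulated in miniature. This is purely illustrative, with two dicts standing in for the two nodes: a read routed to green sees stale data until replication from pink catches up.

    ```python
    # Sketch of the inconsistency window: a write lands on the pink node,
    # a read is routed to the green node before replication. Illustrative only.

    pink, green = {}, {}

    def write(key, value):
        pink[key] = value                # the update lands on the pink node

    def replicate():
        green.update(pink)               # replication runs some time later

    def read(key):
        return green.get(key)            # this request was routed to green

    write("profile", "new-photo")
    stale = read("profile")              # None: inside the inconsistency window
    replicate()
    fresh = read("profile")              # eventually consistent
    assert stale is None and fresh == "new-photo"
    ```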

    Inconsistencies like this are irritating enough, but they can be much more serious. Business logic can end up making decisions on inconsistent information; when this happens it can be extremely hard to diagnose what went wrong, because any investigation will occur long after the inconsistency window has closed.

    Microservices introduce eventual consistency issues because of their laudable insistence on decentralized data management. With a monolith, you can update a bunch of things together in a single transaction. Microservices require multiple resources to update, and distributed transactions are frowned on (for good reason). So now, developers need to be aware of consistency issues, and figure out how to detect when things are out of sync before doing anything the code will regret.

    "Microservices are the first post-DevOps revolution architecture." - Neal Ford

    The monolithic world isn't free from these problems. As systems grow, there's more of a need to use caching to improve performance, and cache invalidation is the other Hard Problem. Most applications need offline locks to avoid long-lived database transactions. External systems need updates that cannot be coordinated with a transaction manager. Business processes are often more tolerant of inconsistencies than you think, because businesses often prize availability more (business processes have long had an instinctive understanding of the CAP theorem).

    So like with other distributed issues, monoliths don't entirely avoid inconsistency problems, but they do suffer from them much less, particularly when they are smaller.

    Independent Deployment

    The trade-offs between modular boundaries and the complexities of distributed systems have been around for my whole career in this business. But one thing that's changed noticeably, just in the last decade, is the role of releasing to production. In the twentieth century production releases were almost universally a painful and rare event, with day/night weekend shifts to get some awkward piece of software to where it could do something useful. But these days, skillful teams release frequently to production, many organizations practicing Continuous Delivery, allowing them to do production releases many times a day.

    This shift has had a profound effect on the software industry, and it is deeply intertwined with the microservice movement. Several microservice efforts were triggered by the difficulty of deploying large monoliths, where a small change in part of the monolith could cause the whole deployment to fail. A key principle of microservices is that services are components and thus are independently deployable. So now when you make a change, you only have to test and deploy a small service. If you mess it up, you won't bring down the entire system. After all, due to the need to design for failure, even a complete failure of your component shouldn't stop other parts of the system from working, albeit with some form of graceful degradation.

    This relationship is a two-way street. With many microservices needing to deploy frequently, it's essential you have your deployment act together. That's why rapid application deployment and rapid provisioning of infrastructure are Microservice Prerequisites. For anything beyond the basics, you need to be doing continuous delivery.

    The great benefit of continuous delivery is the reduction in cycle time between an idea and running software. Organizations that do this can respond quickly to market changes, and introduce new features faster than their competition.

    Although many people cite continuous delivery as a reason to use microservices, it's essential to mention that even large monoliths can be delivered continuously too. Facebook and Etsy are the two best-known cases. There are also plenty of cases where attempted microservices architectures fail at independent deployment, where multiple services need their releases to be carefully coordinated [2]. While I do hear plenty of people arguing that it's much easier to do continuous delivery with microservices, I'm less convinced of this than I am of their practical importance for modularity - although naturally modularity does correlate strongly with delivery speed.

    Operational Complexity

    Being able to swiftly deploy small independent units is a great boon for development, but it puts additional strain on operations as half-a-dozen applications now turn into hundreds of little microservices. Many organizations will find the difficulty of handling such a swarm of rapidly changing tools to be prohibitive.

    This reinforces the important role of continuous delivery. While continuous delivery is a valuable skill for monoliths, one that's almost always worth the effort to get, it becomes essential for a serious microservices setup. There's just no way to handle dozens of services without the automation and collaboration that continuous delivery fosters. Operational complexity is also increased due to the increased demands of managing these services and monitoring them. Again, a level of maturity that is useful for monolithic applications becomes necessary if microservices are in the mix.

    Microservice proponents like to point out that since each service is smaller, it's easier to understand. But the danger is that complexity isn't eliminated, it's merely shifted around to the interconnections between services. This can then surface as increased operational complexity, such as the difficulties in debugging behavior that spans services. Good choices of service boundaries will reduce this problem, but boundaries in the wrong place make it much worse.

    Handling this operational complexity requires a host of new skills and tools - with the greatest emphasis being on the skills. Tooling is still immature, but my instinct tells me that even with better tooling, the low bar for skill is higher in a microservice environment.

    Yet this need for better skills and tooling isn't the hardest part of handling these operational complexities. To do all this effectively you also need to introduce a devops culture: greater collaboration between developers, operations, and everyone else involved in software delivery. Cultural change is difficult, especially in larger and older organizations. If you don't make this up-skilling and cultural change, your monolithic applications will be hampered, but your microservice applications will be traumatized.

    Technology Diversity

    Since each microservice is an independently deployable unit, you have considerable freedom in your technology choices within it. Microservices can be written in different languages, use different libraries, and use different data stores. This allows teams to choose an appropriate tool for the job; some languages and libraries are better suited for certain kinds of problems.

    Discussion of technical diversity often centers on the best tool for the job, but often the biggest benefit of microservices is the more prosaic issue of versioning. In a monolith you can only use a single version of a library, a situation that often leads to problematic upgrades. One part of the system may require an upgrade to use its new features but cannot because the upgrade breaks another part of the system. Dealing with library versioning issues is one of those problems that gets exponentially harder as the code base gets bigger.

    There is a danger here that there is so much technology diversity that the development organization can get overwhelmed. Most organizations I know do encourage a limited set of technologies. This encouragement is supported by supplying common tools for such things as monitoring that make it easier for services to stick to a small portfolio of common environments.

    Don't underestimate the value of supporting experimentation. With a monolithic system, early decisions on languages and frameworks are difficult to reverse. After a decade or so such decisions can lock teams into awkward technologies. Microservices allow teams to experiment with new tools, and also to gradually migrate systems one service at a time should a superior technology become relevant.

    Secondary Factors

    I see the items above as the primary trade-offs to think about. Here are a couple more things that come up that I think are less important.

    Microservice proponents often say that services are easier to scale, since if one service gets a lot of load you can scale just it, rather than the entire application. However I'm struggling to recall a decent experience report that convinced me that it was actually more efficient to do this selective scaling compared to doing cookie-cutter scaling by copying the full application.

    Microservices allow you to separate sensitive data and add more careful security to that data. Furthermore, by ensuring all traffic between microservices is secured, a microservices approach could make it harder to exploit a break-in. As security issues grow in importance, this could migrate to becoming a major consideration for using microservices. Even without that, it's not unusual for primarily monolithic systems to create separate services to handle sensitive data.

    Critics of microservices talk about the greater difficulty in testing a microservices application than a monolith. While this is a true difficulty - part of the greater complexity of a distributed application - there are good approaches to testing with microservices. The most important thing here is to have the discipline to take testing seriously; compared to that, the differences between testing monoliths and testing microservices are secondary.

    Summing Up

    Any general post on any architectural style suffers from the Limitations of General Advice. So reading a post like this can't lay out the decision for you, but such articles can help ensure you consider the various factors that you should take into account. Each cost and benefit here will have a different weight for different systems, even swapping between cost and benefit (strong module boundaries are good in more complex systems, but a handicap to simple ones). Any decision you make depends on applying such criteria to your context, assessing which factors matter most for your system and how they impact your particular context. Furthermore, our experience of microservice architectures is relatively limited. You can usually only judge architectural decisions after a system has matured and you've learned what it's like to work with, years after development began. We don't have many anecdotes yet about long-lived microservice architectures.

    Monoliths and microservices are not a simple binary choice. Both are fuzzy definitions that mean many systems would lie in a blurred boundary area. There are also other systems that don't fit into either category. Most people, including myself, talk about microservices in contrast to monoliths because it makes sense to contrast them with the more common style, but we must remember that there are systems out there that don't fit comfortably into either category. I think of monoliths and microservices as two regions in the space of architectures. They are worth naming because they have interesting characteristics that are useful to discuss, but no sensible architect treats them as a comprehensive partitioning of the architectural space.

    For more information about microservices, start with my Microservice Resource Guide, where I've selected the best information on the what, when, how, and who of microservices.

    Sam Newman's book is the key resource if you want to find out more about how to build a microservice system.

    That said, one general summary point that seems to be widely accepted is that there is a Microservice Premium: microservices impose a cost on productivity that can only be made up for in more complex systems. So if you can manage your system's complexity with a monolithic architecture then you shouldn't be using microservices.

    But the volume of the microservices conversation should not let us forget the more important issues that drive the success and failure of software projects. Soft factors such as the quality of people on the team, how well they collaborate with each other, and the degree of communication with domain experts will have a bigger impact than whether to use microservices or not. On a purely technical level, it's more important to focus on things like clean code, good testing, and attention to evolutionary architecture.

    Footnotes

    1: Some people consider monolith to be an insult, always implying poor modular structure. Most people in the microservices world don't do this; they define monolith purely as an application built as a single unit. Certainly microservices advocates believe that most monoliths end up being Big Balls of Mud, but I don't know any who would argue that it's impossible to build a well-structured monolith.

    2: The ability to deploy services independently is part of the definition of microservices. So it's reasonable to say that a suite of services that must have its deployments coordinated is not a microservice architecture. It is also reasonable to say that many teams that attempt a microservice architecture get into trouble because they end up having to coordinate service deployments.

    Further Reading

    Sam Newman gives his list of the benefits of microservices in Chapter 1 of his book (the essential source for details of building a microservices system).

    Benjamin Wootton's post, Microservices - Not A Free Lunch!, on High Scalability, was one of the earliest, and best, summaries of the downsides of using microservices.

    Acknowledgements

    Brian Mason, Chris Ford, Rebecca Parsons, Rob Miles, Scott Robinson, Stefan Tilkov, Steven Lowe, and Unmesh Joshi discussed drafts of this article with me.


    Eric Evans on Domain-Driven Design at 10 Years

    Listen on SE Radio

    The show will be about Domain-Driven Design at 10; it's already 10 years since you came up with the idea. Some of the listeners might not be that familiar with domain-driven design, so Eric, can you give us a short introduction to domain-driven design, what it is and how it is special?

    In its essence, domain-driven design is a way of using models for creating software, especially the part of the software that handles complex business requirements.

    So the particular way in domain-driven design, the thing that we focus on, is that we want a language where we can really crisply, concisely describe any situation in the domain and describe how we're going to solve it or what kind of calculations we need to do. That language would be shared between business people, specialists in that domain, as well as software people who will be writing the software, and that we call the ubiquitous language because it runs through that whole process.

We don't do as most projects do. We don't talk to the business people sort of on their terms and then go and have very technical conversations about how the software will work, separately. We try to bring those conversations together to create

Eric Evans is the author of Domain-Driven Design: Tackling Complexity in Software. Eric now leads Domain Language, a consulting group which coaches and trains teams applying domain-driven design, helping them to make their development work more productive and more valuable to their business.

Eberhard Wolff works as a freelance consultant, architect and trainer in Germany. He is currently interested in Continuous Delivery and technologies such as NoSQL and Java. He is the author of several books and articles and regularly speaks at national and international conferences.




this conceptual model with very clear language. And that's a very difficult thing to do, and it can't be done in a global sense. You can't come up with the model for your entire organization. Attempts to do that are very counterproductive.

So the other ingredient in that is what we call the bounded context. So I have a clear boundary, perhaps a particular application that we're working on. Within this boundary, we say this is what my words mean, this is the relationship between the concepts. Everything is clear and we work hard to make it clear. But then outside that boundary, all sorts of different rules apply in some parts of the system; perhaps no rules really apply.

Out of that comes the rest of domain-driven design, but that's really the essence of it. It's a particular way of dealing with these complex parts of our systems.

And I guess that is also why so many people are interested in it, because that's really what a lot of software engineers do. Can you give an example of such a bounded context and how models might be different there? Because I think that's one of the very interesting parts of DDD; at least it was for me.

I can give some examples. One common thing is that different parts of an organization might deal with the domain in a very different way. And there may even already be software, there probably is already software, that deals with those different parts.

So take some company that does e-commerce. There's a part of the software where we're taking orders. So we are very focused on what kind of items are in the inventory and how much they cost, and how do we collect these items together into some kind of a shopping cart? And then eventually the order is created, and then there's payment and all of those concerns.

But then in shipping, perhaps they're not really that interested in most of those issues. What they care about an item is: what kind of box will it fit into, and how much does it weigh, and which kind of shipping did you pay for, and do we ship it all in one box, or this one's out of stock so we're going to go ahead and ship the part we've got and then send the rest later, and how do we keep track of an order that's been partially shipped but part of it's still waiting?

Although in theory you could create a single model that would represent all these different aspects, in practice that's not what people usually do, and it works better to separate those two contexts and say, well, we basically have a shipping system here and an order-taking system, and perhaps other things too - I'm not saying it would just be those two. You could create concepts so general and versatile that you could handle all these cases. But we're usually better off with more specialized models: a model that handles really well the idea of an order as a set of physical objects that fit into certain kinds of boxes and that you may or may not have available at this time; and another one that says, well, here are items that you've chosen and here are similar items; just totally different issues.
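As a rough sketch of the idea (the class and field names here are illustrative, not from the interview), the two contexts might model the "same" item with entirely separate types that share only an identifier:

```python
from dataclasses import dataclass

# Ordering context: an item is something a customer selects and pays for.
@dataclass(frozen=True)
class CatalogItem:
    sku: str
    name: str
    price_cents: int

# Shipping context: the "same" item is a physical object with weight and size.
@dataclass(frozen=True)
class Parcel:
    sku: str
    weight_grams: int
    fits_box: str  # e.g. "small", "medium", "large"

# Neither model depends on the other, so each context can evolve
# its own language and rules independently.
book = CatalogItem(sku="BK-42", name="DDD", price_cents=5499)
parcel = Parcel(sku="BK-42", weight_grams=700, fits_box="small")
```

The point is not the specific fields but the separation: pricing rules never leak into the shipping model, and box-packing concerns never clutter the catalog.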

I think that that's a very good short introduction to DDD, and in particular the bounded contexts are, I think, really one of the interesting things here, as you said, where you would have totally different, let's say, perspectives on items whether you're working on shipping or ordering things. So looking back on those ten years of DDD, what was the impact of DDD in your opinion?

Well, it's very hard for me to see that. I mean, I do have a sense it's had an impact, and sometimes people tell me that it had a big influence on them. I think the thing that makes me feel best is occasionally someone says it sort of brought back the fun of software development for them or made them really enjoy software development again, and that particularly makes me feel good. It's so hard for me to judge what the overall impact is. I really don't know.

It's always good if you can bring back the joy in work again. To me the impact of the book is quite huge. I would even call it a movement, sort of a DDD movement. Do you think that's true? Would you call it a movement or is it something different to you?

I think that probably that was my intention, that I had the sense that I wanted to start a movement, maybe a very diffuse movement, but it would be nice to think it had a little of that quality to it.

One of the things that I really emphasize, and this part had a crusading quality to it, is that when we're working on software, we need to keep a really sharp focus on the underlying domain in the business. And we shouldn't look at our jobs as just technologists, but really our job is -- and this is a difficult job -- as this person who can start to penetrate into the complexity and tangle of these domains and start to sift


that apart in a way that allows nice software to be written. And sometimes when you really do it well, the problem becomes significantly easier on a technical level.

Yeah. So I think it's about, let's say, model mining or knowledge mining; that's what a lot of our job is about.


Was the impact of your book and the movement different from what you had expected? Are there any surprises?

Well, one pleasant surprise is that it hasn't just kind of faded. Most books go through a fairly short cycle: they become very popular for three or four years and then they kind of drop into the background. And DDD has stayed a pretty relevant topic for a long time now; really, like 11 years, since it's now 2015. So that genuinely surprises me, actually, but obviously in a good way.

And I couldn't agree more. To me it's almost like a timeless thing that you've created there. What part of the DDD movement did you learn the most from? What gave you the biggest progress in your own knowledge and skillset?

Oh, I think that -- and this I think is closely related to why people still pay attention to DDD -- that DDD has not been static. The core concepts of it have been very stable - let's say I can express them better now. But among the people who are really doing it, there have been a couple of big shifts in how people tend to build these systems.

So the first big one was when event sourcing and CQRS came on the scene, and the way that we went about designing and building a DDD system changed a lot. When did that happen? That was maybe 2007. Anyway, after a few years, and it may be that just about the time that we would have been following that natural cycle of what's new and moving on to the next thing, DDD kind of got a big renovation. I learned a tremendous amount from those people, from Greg Young, Udi Dahan, and the others who went around and around about this thing in CQRS and just really shook things up.

My way of thinking about things changed a lot, and I think the way most people think about DDD now is significantly different because of that. There have been a few other things, but that was certainly the first big one.

Do you think there are any circumstances where a DDD approach would fail? And how would you deal with them, or is it something that can be made to work in any project?

So there are a few aspects to that. That's an interesting question, because certainly DDD projects fail all the time. It's not unusual. Of course, some of that is just that anything difficult fails sometimes, so we needn't worry about that. And I think DDD is hard. So what would make a DDD project more likely to fail than other times? I think that one of the most common things is a tendency to slip into perfectionism: whenever people are serious about modeling and design, they start slipping toward perfectionism. Other people start slipping toward very broad scope: we will model the whole thing - even if we have five different bounded contexts, we'll model each one with great elegance and all at the same time.

So some projects fail because they get too ambitious. They don't walk before they run. Some of them fail because they are in an area where the strategy of the organization isn't very clear. Let me put it this way: the greatest value of DDD comes when you're very close to the bone of what's strategically important within the company. That's when it pays off the most. You need a certain degree of complexity, intricacy in the problem, or at least some fuzziness, or else there's no point to all that thinking. But also, you need to be in some strategically relevant thing. But along with that goes a certain amount of the rough and tumble of the politics of organizations. Some organizations change their strategies.

I've seen all these things happen. There are all sorts of other things. Of course, sometimes the skills of the team are the problem. So you might have a situation where they get off to a good start and then their execution isn't very good. Of course, bad programming will undermine any software approach. Ultimately, the code has to be good. Well, the code has to be competent. That one hurts a lot of projects.

Of course, since I mentioned bounded context earlier, I want to underline how fundamental that is; it's not an easy discipline. If we've established that the shipping context and the order-taking context are separate, and so we've made a boundary between them, there is some kind of translation layer there. But some programmer in the shipping context needs a piece of data, and that data is


over in the order-taking context. Now, what does he do? Does he just say, "Well, I can just write a query on the order-taking database. That will just take me 15 minutes." Or, "I could go and talk to people about how we are going to enhance the interface and translator between those two contexts, and then how would this piece of information be modeled within our context, and then I'll do my feature."

So it takes a certain level of discipline to maintain that boundary. And when the boundary is gone, the two models start clashing and it all goes down the tubes pretty fast. Another variation on this, of course, is if you have a lot of legacy systems and you're trying to do some new work. Ideally, you do that by isolating the new work within a bounded context and talking to the legacy through some kind of translator.
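A minimal sketch of such a translation layer, with hypothetical types and field names (none of these come from the interview): the translator is the one place where the order-taking language is converted into the shipping language, so neither context queries the other's data directly.

```python
from dataclasses import dataclass

# Order-taking context's notion of a confirmed order (illustrative fields).
@dataclass(frozen=True)
class ConfirmedOrder:
    order_id: str
    lines: tuple   # (sku, quantity) pairs
    express: bool

# Shipping context's notion of the same fact, in its own language.
@dataclass(frozen=True)
class ShipmentRequest:
    reference: str
    items: tuple
    service_level: str

def translate(order: ConfirmedOrder) -> ShipmentRequest:
    """Translation layer: the only place where the two languages meet."""
    return ShipmentRequest(
        reference=order.order_id,
        items=order.lines,
        service_level="express" if order.express else "standard",
    )

req = translate(ConfirmedOrder("o-1", (("BK-42", 2),), express=True))
```

If the shipping programmer needs new data, the disciplined move is to extend `translate` and the shipping model, rather than to write a quick query against the order-taking database.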

Anyway, I could go on, but I think it's not too surprising that some of them fail.

I agree. As you said, I guess DDD only makes sense if you have a certain complexity, and with that comes risk, but also the potential value of the software, I guess. What I found interesting about what you just said is that people get overambitious at some points and try to reach for a perfect state. To me that is natural for a lot of technical people. So I was wondering whether you have any secret sauce for how to focus the ambition on those parts of the system where it's really important, and to live with not-so-great quality in the other parts, and stop your ambition there? Is there any tip that you have for that?

I think that you summed it up very well there. That's what you need to do, and you need to have a boundary between the two. There's quite a bit in DDD about that; part of the strategic design part is how to decide which parts are which. Like, there's a general category of generic subdomains where we say, well, there's nothing all that special; we're not creating something here that we want to innovate; this is something we want to do in a standard way. In fact, the ideal solution here would be: let's not even write any software; let's see if we can get it off the shelf.

Then there's lots of stuff that just keeps the lights on, where whether you have some brilliant insight or not isn't going to really change the outcome very much. And then there are those leverage points. This is the general thing, but it's very hard, I admit, because first of all, it's hard to know. You'll often get it wrong, and you'll often choose a topic which may turn out not to have been very core to the strategic value. But still, I think there's a lot of value in trying.

Another thing is the perfectionism, because even if you've zeroed in on a certain part that was strategically valuable, perfectionism can still kill you. You have to deliver, and you have to deliver fairly quickly. In fact, DDD depends on iteration. We assume that you don't understand the problem very well at the beginning, that you'll get it wrong the first time. And so it's essential to get the first one done quick and then get on to the second iteration and get that done quick too, because you'll probably get that wrong too. And then get on to the third one, which is where you're probably going to have some clue by then, and that third one might be fairly good. And if you can get it done fairly quick, then you'll have time to do the fourth one, which is going to be really elegant.

And I'm serious. I think it's a weird paradox, but perfectionism prevents people from creating really elegant designs, because it slows them down in the early iterations so they don't get to do enough iterations. Multiple iterations - I mean iteration as in doing the same thing over, not iteration as when people really are talking about increments, where it's "let's do a little iterative requirement at a time". I mean doing the same feature set, then redoing the same feature set again but with a nicer design, with a new insight into what it means. That's the key: move quick.

That sounds pretty interesting, to focus on the number of reiterations instead of reaching for a perfect solution at the very beginning. One thing I'm really wondering is if you ever plan to update the book; is there anything you would like to change in the book?

I'm pretty sure I'm not going to update the book. I might write something new at some point. But anyway, I think I probably won't change the book. But if I did, or rather if I were going to try and explain DDD better, certainly one thing that I have realized for a long time is that the concept of the bounded context is much too late in the book. All the strategic design stuff is way back at the back. I treat it as an advanced topic. There's some logic to that, but the trouble is that it's so far back that most people never get to it, really. So I would at least weave that into the first couple of chapters. The ubiquitous language is in chapter 2, so that's all right. But


I would have bounded context there in chapter 2 or 3.

Another thing I would do is try to change the presentation of the building blocks. The building blocks are things like the entities and value objects and, later, domain events and stuff like that. They are important things, but there is a big chunk of that stuff right in the early middle part. Most people don't get past it, and they come away thinking that that's really the core of DDD, whereas in fact it's really not. It's an important part just because it helps people bridge from the conceptual modeling of DDD to the necessity of having a program, having code that really reflects that model, and bridging that gap is difficult; that's why there was so much emphasis on that. But I really think that the way I arranged the book gives people the wrong emphasis. So the biggest part of what I'd do is rearrange those things.

It makes a lot of sense, I guess. I agree that strategic design is really, really important and really one of those things that a lot of people don't think about when they hear the term DDD. Recently, we have seen a trend towards microservices architectures, and we've already had quite a few discussions about microservices on the show. So how do microservices fit into the DDD world? Is there a relation?

I'm quite enthusiastic about microservices. I think that it helps people who want to do DDD quite a bit. And I also think that certain aspects of DDD can help people do microservices better. So when I say it helps people do DDD, I've already talked about bounded contexts and how important that is. If you think about what people do when they do microservices in a serious way, the interior implementation of a microservice is very isolated. Everything is supposed to go through that interface. Any kind of data that they contain is exclusively held by them. There's really a tight boundary, and that is what you need.

The bounded context is a concept for which, in more traditional architectures, there weren't very good ways to implement it; to really establish the boundary. So it seems to me that microservices has delivered us a practical and popular way of defining and sticking to those boundaries. And that's a big deal. And the emphasis on the micro - well, someone once asked me, "What's the difference between microservices and the old SOA services?" And I said, "Well, I think part of it is the micro." These services are smaller. That's just a convention, of course, but it's an important convention: the idea that a very small piece of software would be very isolated and doing its own thing. If you think about my example of order taking versus shipping, of course those are too big to be a single microservice. It would probably be a little cluster of them each. But this notion that you would take them separately would come very naturally in a microservices architecture. So that's one way, the big way, in which I see that it helps DDD.

    So when you say that shipping an order would be a cluster of microservices, does that mean that you would think that the bounded context would be a cluster of microservices? Is that what you are saying?

That is exactly what I'm saying. And this, by the way, kind of points to where I think DDD can be helpful to microservices, because they have the same problem that SOA had, in the sense that there is a vagueness about who can understand the interface of a service.

So within a bounded context, let's say the interior of a microservice, there are things that are crystal clear, or at least they could be. So let's say that we've declared one of these microservices to be a context, and every concept in there is very consistent throughout. So we have said that an order line item means this, and it has certain properties and certain rules about how you combine it with other line items, and whatever; all of these things are very clear in terms of their language and their rules.

Now we go to the interface of the service. And so there we would have a certain language -- unless you view a service as just some strings and numbers going in and out. But that's not the way people view services, and not the way they do well-designed services. Well-designed services have a kind of language about what they do. They have a contract.

So if we say, all right, well, then when you send an order to this microservice, this is what it means. And I don't just mean the fields that are in it; I mean this is the concept of what it is. Now, if you zoom out a little bit, you see that typically what people do is that they have little clusters of services that essentially speak the same language.

So if my team is working on an ordering system, we may have a model and we might -- let's say we have five microservices and they speak the same language, and we've worked hard to make that language clear. And then over here we're saying, well, we really are dealing with a different set of problems. These microservices speak a different language. If you send an order over here, it might mean a little bit different thing. And even if we try to disambiguate with longer names, we won't always get that right.

So it's better to just say: over here is a different language. And that means that if a message goes from a microservice within the cluster to another one, that's going to be really easy. But if a message goes from one of these clusters to another, we might need to put it through a component that can translate it. So this is one way in which I think that once we look at multiple microservices, we need to think: do clusters of them belong in different bounded contexts?

There's also the issue of the inside and the outside. The outside of these -- microservices are what I'm really talking about now -- the outside of a microservice might speak a different language than the inside. You know, you might say: pass an order to this microservice, or pass this stream of orders to this microservice. Inside, it's going to crunch away with a model that views orders in a statistical fashion, let's say, and comes up with some recommendations or something.

Well, the interior then is using a quite different model. The interior of that microservice is a different bounded context. But as I said, that whole cluster is speaking the same language to each other. So we have this interchange context, where we define what messages mean as they move between contexts, and then we have the interior context of each service.
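The inside/outside split Evans describes can be sketched roughly like this (a hypothetical recommendation service; all names are illustrative): the exterior speaks the interchange language of order messages, while the interior models orders purely as frequency data.

```python
from collections import Counter

class RecommendationService:
    """Interior bounded context: orders are just frequency statistics
    here, not the rich order concept the sending cluster uses."""
    def __init__(self):
        self._counts = Counter()

    def receive(self, order_message: dict) -> None:
        # Exterior (interchange) language: a plain message of sku/quantity
        # lines, as agreed between contexts.
        for sku, qty in order_message["lines"]:
            self._counts[sku] += qty

    def top_sellers(self, n: int = 1) -> list:
        # Interior language: "top sellers" is a concept that only
        # exists inside this context's statistical model.
        return [sku for sku, _ in self._counts.most_common(n)]

svc = RecommendationService()
svc.receive({"lines": [("BK-42", 2), ("MUG-7", 1)]})
svc.receive({"lines": [("BK-42", 1)]})
```

The message format at `receive` belongs to the interchange context; the `Counter` and `top_sellers` belong to the interior one, and the two can change independently as long as the interface contract holds.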

It makes a lot of sense. So what I'm wondering is, if a bounded context is usually a cluster of microservices, is there any way that you can think of to tell whether certain functionality should be implemented in a microservice on its own or just be part of another microservice? Because obviously, if there is a cluster that is a bounded context, it's not "one bounded context is one microservice"; it's a set of microservices. So I'm wondering whether there is a rule that would give us an idea of whether we should break this functionality apart into an additional microservice, and a bounded context doesn't seem to cut it.

So first of all, yeah, if a cluster of microservices is a context - or rather, the exterior, the message passing between them, would be a -- sorry, a bounded context -- and then the inside of each is another bounded context. But now you're saying, well, suppose that we have a new piece of functionality we need to put somewhere; should I put it as another microservice in this cluster?

Well, I think that this is basically, though, the same kind of question we always have to answer when we're designing things. Does this piece of functionality fit within some existing structure? Or is it going to start distorting that structure out of a good shape? And if so, then where else should I put it? I think the factors that go into that - I'm not being too original here - are: how coupled is it to the other functions within the cluster? Like, if there's a necessarily chatty relationship with three different components of that cluster, then it seems very likely we're going to want to keep it in that cluster.

Another factor, though, would be the expressiveness of that cluster. The expressiveness of that particular bounded context's language - does it express this concept well? Can I extend that language to express this concept well? And if so, then it might be a good fit. If not, then how much price am I going to pay in proliferation of different bounded contexts? You know, there's a tradeoff, of course.

So there's no real answer there. It's like: here's where we have to do good work as designers.

And that's probably the hard part about it.

A little trial and error helps too. That's another reason to not be too perfectionist. You won't get it right anyway, so save time for iteration. Go back and do it again.

    Yeah, and do it better the next time. Okay. So you already mentioned the term CQRS. Can you explain what that is?

I remember trying to understand that for a couple of years. So I will say that event sourcing and CQRS came along at almost the same time, and the community of people that were working on them was very interwoven, and the two things were not so clearly delineated; certainly not at the time. But I do think there are two distinct architectural concepts there. They often work well together, but it's useful to think of them separately.

The one that immediately made sense to me, that just spoke to me instantly, was event sourcing. And then CQRS was a little bit less obvious to me. But I do think it's a good technique. So in essence, CQRS says that you break your system into components that either are read-only things or are things that process updates.

So let's take the order taking; you know, that ordering example. When a new order comes in, in CQRS we'd put that in the form of a command - command meaning the C in CQRS. And the command would be "enter this order" or something like that. Now, it goes to some microservice, let's imagine, whose job is to take this kind of command and see if it can be done; like it might say, "Oh, I'm not going to take this order because we no longer carry that item."

So to commands, as they're defined in CQRS, you can say no. So that would be the response to that. Or let's say, okay, we do go ahead and we process the order and we reduce the inventory and we initiate a shipment and we send a message about that. Some events come out: some events that say things like "the inventory has been reduced", and another event that says "there's a new order that has to be shipped". This is the responsibility of that command-processing part.

Now the query - that's the Q. Sometimes, you'd say, well, a user might want to look at the catalog to decide what he wants to order. The user might want to see the status of his order: "Has my order been shipped yet?", things like that. So this is the Q part: I want to see the status of my order. And the idea is that this part of the system would be kept very simple. There would be a part of the system where you'd have to figure out how to ship an order. But once it had been shipped, you'd update the status in a query part that would say: this order has been shipped.

So queries that way can scale differently than the command processing. And in a system where you have to do a lot -- if this were an e-commerce system where we were handling thousands of orders a minute, but maybe we're handling even more queries, we can scale them independently; we can recognize that queries take less processing power, perhaps, and that since there's no change happening, we don't have to worry about consistency rules.

So the query part is very simple and fast, and scales that way. The command part is where we have to deal with all the issues of, well, what if a command came in to cancel the order and we've already shipped it? What are the rules around that? Does the command still get processed? I mean, it will get processed, but does the order still get cancelled? On and on. All that rule stuff goes into figuring out how to respond to a command.
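The split Evans describes might look roughly like this in code (a deliberately tiny sketch; the class and event names are assumptions for illustration): the command side validates and emits events, and a separate, very simple read model is updated from those events.

```python
# Command side: decides whether an order can be placed and emits events.
class OrderCommands:
    def __init__(self, inventory: dict):
        self.inventory = inventory

    def place_order(self, order_id: str, sku: str, qty: int) -> list:
        if self.inventory.get(sku, 0) < qty:
            return []  # command rejected: a command can be told "no"
        self.inventory[sku] -= qty
        # Events describing what happened, for other parts to consume.
        return [("OrderPlaced", order_id), ("InventoryReduced", sku, qty)]

# Query side: a read model kept simple and fast, updated from events.
class OrderStatusView:
    def __init__(self):
        self.status = {}

    def apply(self, event: tuple) -> None:
        if event[0] == "OrderPlaced":
            self.status[event[1]] = "placed"

commands = OrderCommands({"BK-42": 5})
view = OrderStatusView()
for event in commands.place_order("o-1", "BK-42", 2):
    view.apply(event)
```

Because the query side holds no business rules, it can be replicated and scaled independently of the command side, which is where all the consistency decisions live.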

    So we should probably explain that CQRS is Command Query Responsibility Segregation if I remember correctly?


    You already said that there is a relation to event sourcing. It seemed to me that the C part, the commands, are the events in event sourcing. Is that what the relationship is like?

Well, I think you could have an event-sourced system that was not CQRS. For example, you could just have a module that responds to queries and also can process commands, and if you had that, you wouldn't really be practicing CQRS, because you wouldn't be separating them. But another thing is that in event sourcing, let's say that we have an order object. The old traditional OO way of doing this is that that order object might have a field that says it's been shipped. In event sourcing we say, well, we don't have a field or anything like that. What we have is a series of events, and when it shipped, an event was created: "it has shipped".

So when we want to know the status of that order, we just go and find the events relevant to it and then roll them up. The classic example might be -- well, I'll use the example I first heard Greg Young use to explain the point.

So let's say that you are doing some kind of stock-trading application. Someone says: sell 10,000 shares of IBM above a certain price. So this order goes out. It's 10,000. And now the question comes, well, how many shares are still to be sold? So each time we execute this order - let's say we sell 1,000, and then in a separate transaction we sell 2,000 more - we have events. So here we have really three events: one was "sell 10,000", then an event that said we sold 1,000, and then another event that said we sold 2,000. Now the question is: how much remains to be sold? How much IBM should we sell at that price? At the time of the query, we can find what events are visible to us and we can calculate it.

So in the old days, we'd have had that object and it would have had 10,000, and then the first sale comes in and we'd subtract 1,000. So now it would say sell 9,000, and then another 2,000 comes in and we'd say sell 7,000. And event-sourcing systems don't even have that field. Or if you do, it's an immutable field that expresses the original order to sell 10,000, and then you've got a completely separate object, an event object, that says we sold 1,000, and another one that says we sold 2,000. If you want to know, you can figure it out. You look at all those events


    and you say, 10,000 minus 1,000 minus 2,000.
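Greg Young's example, as retold above, can be sketched as a fold over an immutable event history (the event names here are my own shorthand): nothing is updated in place, and the remaining quantity is derived on demand.

```python
# Event-sourced state: the order and its executions are stored only
# as immutable events, never as a mutated "remaining" field.
events = [
    ("OrderPlaced", 10_000),  # sell 10,000 shares
    ("Executed", 1_000),      # sold 1,000
    ("Executed", 2_000),      # sold 2,000
]

def remaining(history) -> int:
    """Roll up the events to derive how much is still to be sold."""
    total = 0
    for kind, qty in history:
        if kind == "OrderPlaced":
            total += qty
        elif kind == "Executed":
            total -= qty
    return total

print(remaining(events))  # prints 7000
```

Appending a new `("Executed", ...)` event is all a write ever does; every view of current state is just another rollup of the same history.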

And that is the concept of event sourcing, basically. So what I'm wondering is: what is the relationship of CQRS and event sourcing to DDD, then?

Well, event sourcing, I think, is easier to illustrate because it's a modeling technique. It's talking about how you would represent the domain. If you look at before that, with the emphasis on entities and values, this is placing the emphasis in a little bit different place. It's saying certain things happen in the domain; in the domain we have got an order execution, and that should be explicitly modeled.

So it's really making the model more explicit about the changes. Take the old-style OO system, where things change and the objects represent the way things are in our most current view - and this also is the typical relational-database approach - but they don't show you what happened. They just show you how things are now. Whereas event sourcing shifts and says: let's model the state change rather than the state. And we can derive the state from the state change.

So now we say we executed an order; that's the thing we store. We executed an order again; that's the thing we store. If you want to have that other view, it's just a rollup of this. So it's really a modeling technique, and it emerged from trying to apply DDD to certain kinds of problems that were very event-centric and also where they had to handle very high volume. With this, you can scale up the updates, because if your updates are very frequent and your reads are less frequent, for example, you can be inserting events into the system without having to update an object in place every time.

    The objects all become immutable, which has certain technical benefits, especially if you're trying to scale things and parallelize things. So I think it fit into DDD so naturally because it's really a revamping of the building blocks, is one way to look at it, but it's a little more radical than that.

    One thing that I'm really wondering about: if I look at DDD, and in particular the model part, it really seems to be an object-oriented approach, I guess, because there are those value objects, entities, and all these kinds of things. It's rather easy to think about how that would be implemented using object-oriented techniques. In the last few years there has been a shift to functional programming. So do you think that DDD can be applied to functional programming too, even though it was originally expressed in rather object-oriented terms?

    Yes, that is one of the big things that's happened over these 11 years. The reason that everything's expressed in terms of objects is because objects were king in 2003, 2004; what else would I have described it as? People who wanted to address complex domains wanted to develop a model of that domain to help them deal with the complexity, and they used objects. The building blocks were an attempt to describe certain things that help those kinds of models actually succeed.

    Now, if you are going at it from a functional point of view, then your model is going to look quite different, or rather your implementation is going to look quite different. I think that event sourcing actually points a good way, because, you know, I mentioned that if you've applied full-on event sourcing, the objects are immutable, which is a start toward the functional perspective; because instead of having objects we change in place, we have some kind of data structure that we use a function to derive another data structure from.

    So imagine an event-sourced system, and let me just throw microservices in. You have a microservice; you pass some events to it, and it computes the results and passes out another stream of events that say, as a consequence of this, this is what happens. So I pass in: we executed a trade for 2,000 and we executed another trade for 1,000, and it passes out an event that says the order has been reduced to 7,000, whatever.

    So it's pretty easy to imagine implementing that as a function, actually, perhaps more natural than OO in fact. You've got a stream of events and you want to use it to compute another stream of events; that really cries out for a function to me.
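That functional shape can be sketched with hypothetical names: a pure function takes the stream of trade events and yields the stream of resulting "order reduced" events, mutating nothing in place:

```python
def process(initial_quantity, trades):
    """Pure function: a stream of trade quantities in, a stream of
    'order reduced to N' events out. No object is updated in place."""
    remaining = initial_quantity
    out = []
    for qty in trades:
        remaining -= qty
        out.append(("order_reduced_to", remaining))
    return out

# Evans's example: an order for 10,000, then trades of 2,000 and 1,000.
print(process(10_000, [2_000, 1_000]))
# [('order_reduced_to', 8000), ('order_reduced_to', 7000)]
```

The same input stream always yields the same output stream, which is what makes this style easy to test, replay, and parallelize.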

    Yeah, absolutely. It sounds somewhat like an Actor model, even.

    Yes. Well, some people in the DDD community have really been into the Actor model. Vaughn Vernon, for example, talked a lot about using the Actor model. Indeed, it does seem to be a good fit. It seems to correspond closely to another one of the building blocks, which we haven't talked about yet. The original book talked about a building block called aggregate, which was sort of trying to describe a set of objects which would have rules about their internal data consistency and somehow would be allowed to enforce those rules.

    So people have said, well, if you take that unit, the unit of whatever I'm trying to make consistent in any given change, and you give that responsibility to a single actor, now you imagine an actor receiving events, or commands, and it has to figure out whether it can move from one state to another in a way that respects the invariants of that particular aggregate. And so that's an application of the Actor model, pulling in a little bit of the old aggregates plus events and commands. A lot has been going on when we start talking about it; the way people really build these things is so different.

    We should probably say a few words about Actors. An Actor is something that gets events from the outside and executes them sequentially. It is a model for parallel computation where you have multiple Actors exchanging events and each of them works sequentially, but the system as a whole is parallel because all these Actors work in parallel on their own event streams. That's basically the idea, and that seems to be a good fit to the aggregates, as you just said; the DDD aggregates.

    Right. And your description is a very nice summary of the technical properties of these things. If I try to describe why this is so useful: we have these conceptual models of the domain, and we're trying to make a software system that respects these concepts and expresses them.

    So there's a lot of different state within a big system, and one of the things that will keep you from going parallel, like you do with Actors, is having no boundary where you can say that the result of this computation does not immediately affect anything else, that we can handle that asynchronously. And that's exactly what aggregates give you. They define a subset of the state which has rules about how you can change that state, and you say any kind of resulting changes elsewhere will be handled asynchronously.

    That's what an aggregate does. And it's related to the domain because you have to look at the business to know what really can be changed independently; where will there be consequences to getting things out of sync?

    Yes. It seems like a good fit of a certain technology or technical approach to a certain domain approach.

    Yeah, because when we first were doing the aggregate thing, well before I wrote my book, back in the late 90s at least, it was difficult to implement aggregates; there wasn't really a technical artefact to hang your hat on. So the nice thing about the Actor model is that it gives you something to say: we have decided that we're going to make each aggregate the responsibility of one Actor. Now I can really tell another programmer, okay, this is my aggregate, because I made an Actor for it.
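One way to make "one Actor per aggregate" concrete, as a sketch only (a real system would more likely use an actor framework such as Akka than raw threads, and all names here are hypothetical): a single thread owns the aggregate's state and drains a mailbox of commands sequentially, so every invariant check sees a consistent state:

```python
import queue
import threading

class OrderAggregateActor:
    """One actor owns one aggregate: commands are queued and applied
    strictly one at a time, so the invariant (never sell more than
    remains) is always checked against consistent state."""

    def __init__(self, initial_quantity):
        self.remaining = initial_quantity
        self.rejected = []
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, sell_quantity):
        # Asynchronous send: callers never touch the state directly.
        self._mailbox.put(sell_quantity)

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        while True:
            qty = self._mailbox.get()
            if qty is None:
                return
            if qty <= self.remaining:   # invariant holds: accept
                self.remaining -= qty
            else:                       # would oversell: reject
                self.rejected.append(qty)

actor = OrderAggregateActor(10_000)
for qty in (1_000, 2_000, 8_000):  # the last command violates the invariant
    actor.tell(qty)
actor.stop()
print(actor.remaining, actor.rejected)  # 7000 [8000]
```

Because only the actor's own thread mutates the state, many such actors can run in parallel without locks on each other, which is the fit to aggregates described above.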

    It really helps if you can have an explicit thing. This is why, by the way, I think objects are still a valuable concept. An object says that here's a software artefact that makes explicit something in our conceptual model; that there's this thing, an important thing in the domain, that we have defined. There's a certain state around it. There are some rules and behaviours around it.

    Personally, I've taken a holiday from OO for a few years to freshen up my thinking, but we don't want to throw out the baby with the bathwater. Making things explicit is good.

    Another technology that has been on the rise in the last few years is NoSQL. Is there any relation between NoSQL and DDD too, or are they not related?

    NoSQL, of course, is unlike event sourcing and CQRS; the people who came up with those concepts really were DDDers who were trying to come up with a better way to do DDD. That's not true at all of NoSQL; it came from a totally different world. They were very interested in the technical properties of what they were doing and so on, and above all I think the motivator of a lot of these things was speed. However, I actually think that it's a great boon for DDD.

    But one of the biggest handicaps that we've had for a long time is this obsession with everything being stored in a relational database. Data in a relational database has to be structured a certain way. In the days when objects were dominant, the relational database was also still the dominant database, so we used OR mappers, object-relational mappers. Of course, people still use these; I say it as if it's in the past. And then people would talk about the impedance mismatch. Well, what's the impedance mismatch? It just says that the fundamental conceptual structure of an object is different from that of a relational table or set of tables. The way they relate to each other is different.

    The trouble here: I think it was Erik Meijer who I heard make this point. He said that when we say NoSQL, we should make "No" an acronym for "not only", so we should say NotOnlySQL. His point was that the problem isn't the relational database, which is a brilliant thing; it's a wonderful, powerful tool. But when you use it for everything, you encounter data that doesn't fit into it, that isn't the right shape, and you have to twist it into that shape. When you're dealing with a problem that fits, there's just nothing else like it; it's a fantastic tool, but we've used it for everything, and it's hard to find a tool that works well for everything. I think that's another problem with objects: they were used for everything, and of course they're not good at everything.

    This relates to DDD because we're trying to take all these different concepts from the domain and give them a tangible form in the software. Sometimes there's a natural shape to it, and that shape is more often object-like than relational. If it's object-like, maybe you do want to represent it as objects, but then you have to cram it into relational tables with relations.

    So instead of that, maybe you use a key-value store, which is a very natural fit to objects, actually. Object structures really are just references to references of references. It's got that same kind of tree structure (graph structure, anyway, though good ones have more of a tree structure). So it's a better fit to some kinds of problems.

    And then the nice thing about NoSQL is that it's a relatively diverse world. There are the graph databases, since I did mention graphs, and there are things that are really nicely modeled as graphs. If you say, "How am I going to model this thing?", sometimes people think modeling means OO modeling: oh, I have to draw a UML diagram of it and then implement it in C# or Java. That's not what modeling means. Modeling means to create abstractions that represent important aspects of your problem and then put those to work.

    So sometimes the natural abstraction is a graph. You want to say, well, how do these people relate to each other? The graph databases, Neo4j and things like that, allow us to choose a tool that actually fits the kind of problem we're trying to solve. I don't now have to twist it into objects and then figure out how to do graph logic over objects while, by the way, I'm also stuffing the object data into a relational database. Instead, I use a graph database and ask graph questions using a graph query language. This is the world of NoSQL to me: we can choose a tool that fits well with the problem we're trying to solve.

    I think the point that you're making is quite important. What you're talking about is how those NoSQL databases give you an advantage in modeling data, while a lot of people still think that NoSQL is all about scaling and big-data issues. That is one of the benefits, but it's probably not even the most important one. It's more about this flexibility, as you said, and the more natural modeling and the different alternatives to relational databases. So I think that's a very good point.

    Yeah, and you know I agree with you, and I think the main reason that people think of it as primarily a scaling technique is because that's where it came from; that was the real driver behind it. It probably took the absolute necessity of those big-data people; we were so deeply rooted in the relational database that it would take something like that to get us loose. But I do think that the great opportunities for NoSQL are in the world of complex problems where the things we want to do just don't really fit the relational model. Sometimes they don't fit the OO model. We can choose the thing that fits.

    That's actually a very good way to sum it up. So thanks a lot for taking the time, and thanks for all the interesting answers and insights. I enjoyed it a lot.

    Oh, thank you.


    Service Architectures at Scale: Lessons from Google and eBay

    Watch on InfoQ

    Evolution of service architectures

    Service architectures of large-scale systems, over time, seem to evolve into systems with similar characteristics.

    In 1995, eBay was a monolithic Perl application. After five rewrites, it is a set of microservices written in a polyglot of programming languages. Twitter, on its third generation of architecture, went from a monolithic Rails application to a set of polyglot microservices. Amazon started out as a monolithic C++ application, moved to services written in Java and Scala, and today is a set of polyglot microservices. In the case of Google and eBay, there are hundreds to thousands of independent services working together.

    Unlike the old way of building services with strict tiers, these services exist in an ecosystem with layers of dependencies. These dependencies resemble a graph of relationships rather than a hierarchy. These relationships evolved without centralized, top-down design: evolution rather than intelligent design. Developers create services or extract them from other services or products. Sometimes they group these extracted services into a common service. Services that are no longer needed are deprecated or removed. Services must justify their existence.

    Randy Shoup has experience with service architecture at scale at Google and eBay. In his talk, Service Architectures at Scale: Lessons from Google and eBay, he presents the major lessons learned from his experiences at those companies.

    The diagram below is an example of a set of services at Google. These services evolved into a hierarchy of clean layers. The hierarchy was an emergent property; it was not designed that way.

    Cloud Datastore is a NoSQL service in Google's publicly available App Engine. Megastore provides multi-row transactions and synchronous replication among nearby data centers. Bigtable is data-center-level structured storage of key-value pairs. Everything is built on Google's distributed file system, Colossus. At the lowest level is Borg, the cluster-management infrastructure responsible for assigning resources to the processes and containers that need them.

    Each layer adds something not in the layer below, yet is general enough to be used in the layer above. Everything at Google runs on Colossus, and almost everything uses Bigtable. Megastore has many different use cases; it was originally written for Google apps such as Gmail, and the Cloud Datastore team ended up building on it.

    This was never a top-down design. It grew from the bottom up. Colossus was built first. Several years later, Bigtable was built. Several years after that, Megastore came into being, and several years after that, Cloud Datastore migrated onto Megastore.

    Architecture without the architect

    Nobody at Google has the title of architect. There is no central approval for technology decisions. Mostly, individual teams make technology decisions for their own purposes.

    In the early days of eBay, central approval from its Architectural Review Board was required for all large-scale projects. Despite the great number of talented people on that board, they usually got involved when it was far too late to make changes, and the board ended up being a bottleneck. The board's only influence was the ability to say no at the last minute.

    It would have been much better to have these smart, experienced people produce something really usable by individual teams, such as a library, a tool, a service, or even a set of guidelines that people could use on their own, rather than having teams learn at the last minute that a particular replication style (for example) was not going to work.

    Standardization without central control

    Standardizing the communication between services and the common infrastructure components is very important.

    At Google, there is a proprietary network protocol called Stubby; eBay usually uses RESTful HTTP-style communication. For serialization formats, Google uses protocol buffers, while eBay tends to use JSON. For a structured way of expressing the interface, Google uses protocol buffers and eBay usually uses a JSON schema.

    Standardization occurs naturally because it is painful for a particular service to support many different network protocols with many different formats.

    Common pieces of infrastructure are standardized without central control. Source-code control, configuration-management mechanisms, cluster management, monitoring systems, alerting systems, and diagnostic and debugging tools all evolve out of conventions.

    Standards become standards not by fiat, but by being better than the alternatives. Standards are encouraged rather than enforced, by having teams provide a library that implements, for example, the network protocol. Service dependencies on particular protocols or formats also encourage standardization.

    Code reviews also provide a means for standardization. At Google, every piece of code checked into the common source-control system is reviewed by at least one peer programmer. Searching through the codebase also encourages standardization: you discover whether somebody else has already done what you need. It becomes easy to do the right thing and harder to do the wrong thing.

    Nonetheless, there is no standardization at Google around the internals of a service. There are conventions and common libraries, but no standardization. The four commonly used programming languages are C++, Java, Python, and Go. There is no standardization around frameworks or persistence mechanisms.

    Proven, reusable capabilities are spun out as new services with new teams. The Google File System was written to support search; as a distributed, reliable file system, others then used it too. Bigtable was first used by search, then more broadly. Megastore was originally built for Google application storage. Google App Engine came from a small group of engineers who saw the need to provide a mechanism for building new webpages. Gmail came out of an internal side project. App Engine and Gmail were later made available to the public.

    When a service is no longer used or is a failure, its team members are redeployed to other teams, not fired. Google Wave was a failure, but the operational-transformation technology that allowed real-time propagation of typing events across the network ended up in Google Apps. The idea of multiple people concurrently editing a document in Google Docs came straight out of Google Wave.


    More common than a service being a failure is a new generation, or version, of a service that leads to deprecating the older versions.

    Building a service as a service owner

    A well-performing service in a large-scale ecosystem has a single purpose and a simple, well-defined interface, and is very modular and independent. Nowadays, people call these microservices. While the word is relatively new, the concept is relatively old; what has happened is that the industry has learned from its past mistakes.

    A service owner has a small team, typically three to five people. The team's goals are to provide client functionality, quality software, stable performance, and reliability. Over time, these metrics should improve. Given a limited set of people and resources, it makes sense to use common, proven tools and infrastructure, to build on top of other services, and to automate the building, deploying, operating, and monitoring of the service.

    Using the DevOps philosophy, the same team owns the service from creation to deprecation, from design to deployment to maintenance and operation. Teams have the freedom to choose their technologies, methodologies, and working environment. They also have accountability for the results.

    As a service owner, you are focused on your service, not the hundreds to thousands of services in the broader infrastructure. You do not have to worry about the complete ecosystem; there is a bounded cognitive load. You only need, as they say at Amazon, a team that can be fed by two large pizzas. This both bounds the complexity and makes for high-bandwidth communication. Conway's law plays out to your advantage.

    The relationship between service teams is very structured. Although everyone is working for the same company, you want to think about other teams as vendors or customers. You want to be cooperative but very clear about ownership and who is responsible for what. Defining and maintaining a good interface is a large part of it. The other critical part is that the customer or client team can choose whether or not to use the service. No top-down directive exists, for example, to store data in Bigtable.

    Teams end up defining a service-level agreement that their clients can rely on to meet their own objectives. Otherwise, a client can build whatever functionality they need themselves, and that new functionality could become the next generation of the service.

    To make sure that costs are properly allocated and economic incentives are aligned, customer teams pay for the use of a service. Things given away for free are not used optimally. In one case, a service using App Engine dropped from 100% to 10% of its resource consumption overnight once it had to pay for that use. Begging and pleading with that client to reduce consumption had not worked, because the team had other priorities. In the end, they got better response times with the reduced resource use.

    On the other hand, since the service team is charging for use, they are driven to keep service quality high by using practices such as agile development and test-driven development. Charging for use also provides incentives for making small, easy-to-understand changes: a thousand-line change is not ten times riskier than a hundred-line change, it is more like a hundred times riskier. All submitted code is peer reviewed. Every submission to source-code control causes the automated tests and acceptance tests to run on all the dependent code. In aggregate, Google ends up running millions of automated tests every day, all in parallel.

    Stability of the interface is important. The key mantra is: never break your clients' code. You often have to keep multiple versions of the interface, and possibly multiple deployments. Fortunately, most changes do not affect the interface. You do, however, want an explicit deprecation policy so that you can move your clients to the latest version and retire the older one.

    Predictable performance is important. Service clients want minimal performance variation. Suppose a service has a median latency of one millisecond, but at the 99.99th percentile the latency is one second; it is a thousand times slower about 0.01% of the time. If a request fans out to 5,000 machines, as it might in a Google-scale operation, you are going to wind up slow on roughly half of your requests. Predictable performance is more important than average performance; low latency with inconsistent performance is not low latency at all. It is also easier for clients to program against a service with consistent performance. Latency at the 99.99th percentile becomes much more important as services use lots of other services and lots of different instances.
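The fan-out arithmetic above can be checked with a quick sketch, under the simplifying assumption that machine latencies are independent; with a 1-in-10,000 slow tail and a fan-out of 5,000, about 39% of requests (roughly half) touch at least one slow machine:

```python
# Probability that a request fanning out to n machines gets at least
# one response from the slow tail (here, beyond the 99.99th percentile).
def slow_fraction(n_machines, tail_probability=1e-4):
    return 1 - (1 - tail_probability) ** n_machines

print(f"{slow_fraction(5_000):.0%}")  # 39%
```

This is why shrinking the 99.99th-percentile latency matters far more than shrinking the median once requests fan out widely.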

    Low-level details can be important even in large systems. Google App Engine found that the periodic reallocation of C++ STL containers resulted in latency spikes at a really low but periodic rate that was visible to clients.

    Large-scale systems are exposed to failures. While many failures can happen in software and hardware, the interesting ones are the sharks and backhoes. Google has suffered network disruptions because sharks apparently like the taste of trans-Atlantic cables and bite through them. Lots of fiber goes through lightly populated areas of the central United States; once, a guy digging several meters deep with a backhoe to bury a horse cut through a line of Google fiber. You can never predict these occurrences, but you need to be able to recover from them.

    Most disruptions are caused by mistakes. Just as security needs defense in depth, you need resilience in depth. You need to be resilient to machine, cluster, and data-center failures. You need load balancing and flow control when invoking other services, because failures can happen on the network and off it. You need to be able to rapidly roll back changes that have errors.

    You never deploy to hundreds, thousands, or tens of thousands of machines at once. You typically pick one system, or one instance of a new software version, as a test: a canary in a coal mine. If that looks good, you stage a rollout to 10% or 20%, then do a staged rollout to the rest of the system. Sometimes you may have to roll back when you are at 50%, so it is extremely important that you are able to roll back rapidly. Feature flags, which let you turn features on and off through configuration, are very useful. They allow you to decouple code deployment from feature deployment. You might completely deploy with the feature turned off, and then turn the feature on in a staged fashion. If there is a performance issue, business failure, or bug, you can turn it off faster than you could roll back code.
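A minimal sketch of the feature-flag idea described above; the flag name, storage, and bucketing scheme here are all hypothetical, but the point is that enabling a feature for a percentage of users, or disabling it entirely, is a configuration change rather than a code rollback:

```python
import hashlib

# Hypothetical config: feature name -> percent of users enabled.
# Raising the number stages the rollout; setting it to 0 "rolls back"
# the feature without redeploying any code.
FLAGS = {"new_checkout": 20}

def is_enabled(feature, user_id):
    percent = FLAGS.get(feature, 0)
    # Hash feature + user id so each user lands in a stable bucket
    # in [0, 100) and sees a consistent experience across requests.
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percent

enabled = sum(is_enabled("new_checkout", uid) for uid in range(10_000))
print(enabled)  # roughly 2,000 of 10,000 users
```

Hashing gives a stable, deterministic assignment, so a user does not flip between old and new behavior as the rollout percentage holds steady.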

    Anti-patterns

    You can never have too much monitoring. You can have too much alerting, so you want to avoid alert fatigue.

    Service-oriented architecture has gotten a bad name, not because the ideas were wrong but because of the mistakes the industry made along the way, out of lack of experience.

    One anti-pattern is a service that does too much. Amazon, eBay, Twitter, and Google have ecosystems of tiny, clean services. A service that is too large or has too much responsibility ends up being a miniature monolith. It becomes difficult to understand and very scary to change, and it ends up accreting many more upstream and downstream dependencies than you would otherwise want.

    Shared persistence is another anti-pattern. If you share a persistence layer among services, you break encapsulation. People can accidentally, or on purpose, do reads and writes into your service and disrupt it without going through the public interface. You end up unwittingly reintroducing coupled services.

    The modern approach of microservices has small, isolated services independent of one another. The resulting ecosystems are healthy and growing.



    Randy Shoup on Microservices, Conway's Law, and Evolutionary Architecture

    Watch on InfoQ

    InfoQ: Could you brief us on what you have been talking about today?

    Sure. Most of the big websites that we know about (Amazon, eBay, Google, Twitter, etc.) started out as monoliths and have ultimately all gone through convergent evolutions, not coordinated in any way, and have ended up at what we are starting to call polyglot microservices. I wanted to explore why that was the case.

    I talked a bit about the monolith and why you might want to have one. I mean, we say "monolith" and we often mean it as a slur, but the reality is that for most systems, or certainly for most stages of a company, it is a perfectly appropriate architectural approach. You do not need to distribute if you do not need to. I talked a little bit about the pros and cons of monoliths and then flipped over to what it looks like with microservices and, again, the pros and cons: simple individual pieces, but now you have to deal with coordination among them. I talked a little bit about what it is like to be an owner of a service in a large-scale microservice ecosystem, like at Google or Netflix. Then I closed with some anti-patterns, which I have personally committed, in every sense of the word "committed", and I talked about why those are inappropriate and how you could do something better.

    InfoQ: You mentioned that a lot of your experience comes from large organizations such as Google and eBay. Do you think these lessons apply to smaller organizations, and is there, perhaps, a cut-off with respect to the size of the organization?

    That is an excellent question. Yes, I think the lessons are applicable, but the better answer is that there is a spectrum: when is it appropriate to apply one or the other? It was not something that I addressed in this talk specifically, but on my SlideShare, I briefly mention the phases of startups in particular and when it is appropriate to take one step or another.

    In April 2015, Shoup delivered From the Monolith to Microservices: Lessons from Google and eBay at Craft Conference in Budapest. InfoQ spoke with him after his lecture.


    In the early phase of a startup, we do not even have a business model; we don't have product-market fit; we do not have a product. So it is inappropriate, I think, to think about any architecture or even any technology. If a WordPress blog or buying ads on Google is the right way for you to test your hypothesis about how to move forward, you should totally do that and not build anything. Then there is a phase where we have product-market fit and we think people are willing to pay for it, and now we are trying to grow that business, and typically that ramps up more slowly than we would like. Again, that is a situation where we start minimal. It is not about the technology, and it is certainly not about scaling that technology or the organization.

    We typically have a group of people that can fit around a conference table. This is not the point at which to split the architecture up into small services, divide into small teams, etc. That comes later! Right now, we are one team and we are building one thing: the simplest thing that could possibly work. Then, one hopes, you will start to hit the limits of the monolithic organization and the monolithic architecture, and that is what I call the scaling phase, where you hit a certain inflection point. In company size and organization size, that point seems to be between the 20-25-person mark and the 100-person mark.

    I don't know why this is true, but I have observed that there are sort of stable points for organization size. Everyone fits around a conference table up to 20-25 people, and the next point seems to be around 100. It is in that transition where things change. You can make a single team work at 20-25; you are rickety, but you can still behave as a single team with fluid roles and so on. But as soon as you are beyond that, and certainly if you scale up to 100, you need to flip the organization and the technology to subdivide into teams with well-defined responsibilities, and that is a good point at which to switch from a monolithic approach to what I would term microservices.

    InfoQ: If the organization has grown to the point at which they say, "Now is the right time", what are the first steps, either at a technical level or at an organizational level, that it should take?

    I am glad you asked that with both technical and organizational aspects in mind. Conway's law teaches us that the organizational structure is reflected in architecture. So, it is maybe a bit counterintuitive, but when you are at the point where the monolith is slowing you down (not earlier), the first step you should take, or at least coextensively with dealing with the technology, is to change the organization to the structure you want to have. So, subdivide into three-to-five-person teams (typically "two-pizza" teams, in the metaphor) and have them be responsible for individual parts.

    That naturally lends itself to the technology being split up. There are lots of monoliths that are very highly coupled, in fact most of them, and so it is not a trivial exercise to break them up. As a practical matter, here is what I recommend to the people I consult with now. First, we have to agree that it is a good idea to move to the new model.

    Step zero is to take a real customer problem, something with real customer benefit, maybe a reimplementation of something you have or, ideally, some new piece of functionality that was hard to do in the monolith. Do that in the new way first, and what you are trying to do there is to learn from the mistakes you inevitably will make going through that first transition, right? You do it in a relatively safe way, but at the same time you do it with real benefit at the end. At the end, if nothing else, you have produced


    real tangible business value out of this work.

    That is step zero, and what we have done is that we have gotten comfortable with the idea that we can do it. It hopefully achieves the goals of velocity and isolation and so on that we were expecting, and we have learned a lot. Now we should go through the most important problems next, tackling the highest-ROI vertical slices with real customer benefit, and then keep going until you run out of patience. That is how eBay did it when it went from version two, which was a monolithic C++ DLL, to a more partitioned set of individual Java applications. When it went through that transition, which overall took many years, it first did that step zero on more than a pilot, way more than a prototype: something tangibly, highly valuable was produced. Then eBay reverse-sorted the pages of the site by revenue and worked on the highest-revenue ones first (and did a bunch of things in parallel), which seems a bit odd and a bit risky, except it had already de-risked it by doing that step zero.

    So now you are saying, "I only have limited resources to apply against this migration path over time, and at some point I am going to run out of ROI; I am going to run out of organizational resources that I am interested in investing in this." That is okay, because you have done the biggest ones first. This certainly was true in 2010 or 2011, when I was last at eBay, and it might still be true: there were still pages on the site that were on the version-two architecture, simply because they continued to work. They got 100,000 hits a day, no big deal, and they were neither painful enough to migrate nor had sufficient ROI to migrate. So they just stayed, and they happily stayed.

    InfoQ: In the talk, you mentioned that, with Google, the architecture was evolutionary and not necessarily by design. Google and the like are known for having the brightest and the best hands. Do you think more guidelines are required for small organizations?

    Well, it is always nice to have the best and the brightest, but I think there is lots of good and bright all around. There are many more smart people who do not work at Google or Amazon than there are who do work at them, so I don't worry too much about that. But are there guidelines for smaller organizations? Absolutely. And again, the meta-point with all these things is: only solve problems that you actually have. I think it is great to talk about these things. Maybe people find some value in listening to me talk about them, but I am increasingly trying to be very clear, when I describe what works well for eBay or Google, to describe why that is true and that everything is a trade-off. Google and Amazon are intentionally trying to optimize for the velocity of large-scale organizations, which means lots of things moving in parallel with little coordination. They behave like a thousand tiny companies rather than one monster company, and that is why those companies move fast and other ones do not. But in the case of smaller organizations, if you are all one team, do not subdivide. You should continue to be one team. If you do subdivide into two or three teams, be pragmatic and subdivide the architecture step by step, not into a thousand different things but into two things, three things, 10 things, and so on. I think it is important to know that you are going to evolve.

    Again, every successful company has evolved. I'll say it another way: no successful company that we have ever heard of has the same architecture today that it had when it started. Don't get too bitter and angry with yourself if the first thing you try is not the thing that lasts forever. In fact, if you had done the thing that was going to live for five or 10 years when you started out, we would probably never have heard of you, because you would have spent all your time building for some far future that never came rather than building things that met near-term customer needs in the near term.

    InfoQ: One thing I picked up from your talk was the need to standardize the connectivity among microservices, if you'd like. Do you have any guidelines for how to lead or manage those standardization efforts?

    Sure. I just want to repeat that part, because so often large enterprises, many of whom I have worked for, have this visceral idea that we should never duplicate effort and we should standardize on technologies and operating procedures and so on.

    One of the things that may be interesting to know about the Netflixes and the Amazon.coms and the Googles of the world is that they tend not to standardize on the internals of services. A service has a well-defined interface boundary that isolates and encapsulates a module within it, and as long as the implementation respects the interface that they export and have agreed to, it really does not matter what is inside. Is it Haskell? Is it Ruby? Is it Basic? It actually should not matter as long as it meets the outside needs, and that is what encapsulation and isolation actually mean. So, those big ecosystems do not standardize the internals of services, although there are common conventions, and it is not as if people are inventing new ways to do things all the time. But what you do need to standardize is the communication.
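This boundary-versus-internals point can be sketched in a few lines. The sketch below is illustrative, not from the interview: two versions of a hypothetical inventory service share only a request/response contract, so their internals are free to differ completely.

```python
# Two implementations of the same hypothetical service contract:
# input {"sku": ...} -> output {"sku": ..., "in_stock": bool}.

def inventory_service_v1(request):
    """Naive internals: a linear scan over an in-memory list."""
    items = [{"sku": "A1", "qty": 3}, {"sku": "B2", "qty": 0}]
    for item in items:
        if item["sku"] == request["sku"]:
            return {"sku": item["sku"], "in_stock": item["qty"] > 0}
    return {"sku": request["sku"], "in_stock": False}

def inventory_service_v2(request):
    """Rewritten internals (dict lookup); the external contract is unchanged."""
    stock = {"A1": 3, "B2": 0}
    return {"sku": request["sku"], "in_stock": stock.get(request["sku"], 0) > 0}

# Consumers depend only on the message shape, so either implementation
# can be deployed without coordinating with callers.
for service in (inventory_service_v1, inventory_service_v2):
    assert service({"sku": "A1"}) == {"sku": "A1", "in_stock": True}
```

Because callers observe only the boundary, the language and data structures behind it can change at will, which is exactly the freedom these ecosystems exploit.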

    It is a big network, and you need to standardize, in some sense, the arcs in the network, not the internals of the nodes. That should match our intuition about, for example, how economies work or how human interactions work. When we are having a conversation, we have to agree on a common language. If I speak English and you speak Hungarian back (I do not speak Hungarian, unfortunately), that would not work.

    It is the same with economic interactions. If you are a shop owner and I want to buy some fruit from you, we have this agreement: I am going to pay you in some currency and that has some meaning. But, by the same token, that does not mean that we have to have a global language or a global currency, because in reality we have neither a global language nor a global currency. We just need to agree on conventions for particular interactions. So, how do you deal with that?

    Well, I'll describe what happens at the Amazon.coms and the Googles. They often start with one thing, because they are small at the time, and there is that one standard that everybody communicates with, and if that is perfect then it can always be that way.

    But over time, they are going to learn: "Oh, I can make this faster and add more flow control." There is a bunch of things that you can add to a network protocol to solve problems that you have at scale. What happens in reality is that there becomes version two of the protocol, version three of the protocol, and so on, and over time those things get adopted by more and more services, as those services need the capabilities of the new version or as the consumers of those services demand, in some sense, the capabilities that are in that protocol. So that is how it happens: evolutionarily, more than by dictate.
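One way to let protocol versions coexist like this is a versioned message envelope. The sketch below is a hypothetical example (the field names "v", "payload", and "flow" are invented for illustration): a receiver accepts both an old and a new envelope version, so producers can upgrade at their own pace.

```python
def handle(message):
    """Decode a message whose hypothetical envelope carries a version field."""
    version = message.get("v", 1)  # messages predating versioning count as v1
    if version == 1:
        # v1 carried only a bare payload
        return {"body": message["payload"]}
    if version == 2:
        # v2 added flow-control metadata alongside the payload
        return {"body": message["payload"], "window": message["flow"]["window"]}
    raise ValueError(f"unsupported protocol version {version}")

# Old and new producers coexist while consumers adopt v2 at their own pace.
assert handle({"payload": "hi"}) == {"body": "hi"}
assert handle({"v": 2, "payload": "hi", "flow": {"window": 8}}) == \
       {"body": "hi", "window": 8}
```

Nothing forces every service to move at once; a service adopts v2 only when it needs the new capability, which mirrors the evolutionary adoption described above.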

    InfoQ: What do you think is coming after microservices?

    Maybe I am insufficiently imaginative, but while microservices as a word is new, the concept is old. It is SOA done properly. Are there other ways of organizing software? Of course there are. But there is a reason why the Amazon.coms and the Googles and the Netflixes and the Gilts and the Yelps and everybody else are ultimately rediscovering, through convergent evolution, this same general concept. So I think microservices is a real thing. Maybe the word will die, but I think that if we have this conversation in three or four years, there will no longer be microservices in anybody's talk titles. We will not be talking about microservices, because it is just going to be a thing that we do.

    The analogy that I think of here is NoSQL. If we were having this conversation three or four years ago, the hot topic would not have been Docker and microservices, because neither of them existed, but NoSQL systems. Now, it is not that NoSQL systems have gone away, and it is not that they are not important anymore, but the fact that Netflix uses Cassandra is not the subject of a talk; it is only a line item: "Oh, we use Cassandra." And that is sufficiently descriptive that we do not say much more about it. Anyway, I think that the next thing about microservices is that we will stop talking about microservices, but we will continue doing them.



    Lessons Learned Adopting Microservices at Gilt, Hailo, and nearForm


    Richard Rodger is a technology entrepreneur who has been involved in the Irish Internet industry since its infancy. Richard founded an Internet startup in 2003. He subsequently joined the Telecommunication Software and Systems Group (TSSG) and became CTO of one of its successful spin-off companies, FeedHenry Ltd. More recently, he became CTO and founder of nearForm. Richard holds degrees in computer science (WIT) and mathematics and philosophy (Trinity College, Dublin). Richard is a regular conference speaker and is a thought leader on system architectures using Node.js. Richard is the author of Mobile Application Development in the Cloud, published by Wiley. He tweets at @rjrodger.

    Adrian Trenaman is VP of engineering at Gilt. Ade is an experienced, outspoken software engineer, communicator, and leader with more than 20 years of experience working with technology teams throughout Europe, the US, and Asia in industries as diverse as financial services, telecom, retail, and manufacturing. He specializes in high-performance middleware, messaging, and application development, and is pragmatic, hard-working, collaborative, and results-oriented. In the past, he has held the positions of CTO of Gilt Japan, tech lead at Gilt Groupe Ireland, distinguished consultant at FuseSource, Progress Software, and IONA Technologies, and lecturer at Maynooth University (formerly the National University of Ireland Maynooth). He became a committer for the Apache Software Foundation in 2010, has acted as an expert reviewer to the European Commission, and has spoken at numerous tech events. Ade holds a Ph.D. in computer science from the National University of Ireland Maynooth, a diploma in business development from the Irish Management Institute, and a B.A. (Mod.) in computer science from Trinity College, Dublin.

    Feidhlim O'Neill has spent over 20 years working in a variety of tech companies in the UK and US, from startups to NASDAQ 100 companies. He spent 10 years at Yahoo in a variety of senior positions in service and infrastructure engineering. Feidhlim works at Hailo, where he oversees their new Go-language microservices platform built on AWS.




    We interviewed representatives from three companies (Gilt, Hailo, and nearForm) who have agreed to share their experiences in either building a microservices platform from scratch or in re-architecting a monolithic platform by gradually introducing microservices. The interviewees are: Adrian Trenaman, SVP of engineering at Gilt; Feidhlim O'Neill, VP of platform and technical operations for Hailo; and Richard Rodger, CTO of nearForm.

    InfoQ: Please tell us about the microservices adoption process in your company. Why microservices? What technologies have you used to implement them? How long did it take?

    Adrian Trenaman: Adoption is high, now at approximately 300 services. Adoption was driven by an organizational structure of autonomous, KPI-driven teams, supported by integrated tooling that makes it easy to create and deploy services. Adoption was also spurred by the adoption of Scala as a new way to write services.

    We had a number of large monolithic applications and services. It was getting increasingly harder to innovate fast as multiple teams committed to the same codebase and competed for test and deployment windows. Adopting a microservices architecture offered smaller, easy-to-understand units of deployment that teams can deploy at will.

    We are using Scala, SBT, ZooKeeper, Zeus (Riverbed) Traffic Manager, PostgreSQL, RDS, Mongo, Java, Ruby, Backbone, Kafka, RabbitMQ, Kinesis, Akka, Actors, Gerrit, OpenGrok, Jenkins, REST, and apiDoc.

    The adoption process took 1.5 to two years and is ongoing.

    Feidhlim O'Neill: Hailo went through a re-platforming exercise, and our new platform was built from the ground up using microservices. Microservices was just evolving as a viable software architecture, and we felt it supported how we wanted to work.

    We trialed a number of technologies and ultimately decided on a combination of what we knew (Cassandra, ZooKeeper, etc.) and some new technologies. Selecting Go as our primary language was one of the riskiest choices but has paid off. From project kick-off to the first live components took about six months. The full migration took around 12 months.

    Richard Rodger: We are an enterprise Node.js consultancy (one of the largest!), so we were naturally drawn towards the microservice style, as it is a natural fit for the lightweight and network-friendly nature of Node.js. We began to adopt it after inviting Fred George, one of the earliest advocates, to speak at one of our meetups. We found him to be inspirational.

    As we began to adopt microservices, we tried out a number of approaches. In some sense, there is a tiering to the architecture, in that many adopters are simply splitting large web apps into lots of little web apps, whereas people like Fred are going fully asynchronous for each unit of business logic. We have run all these variants in production, and what we have found is that this choice is not as important as it looks on the surface. More important is to provide a message-transportation layer between services that abstracts this question away. Then you have the freedom to arrange communications between your services as appropriate, whilst ensuring that your developers do not have to worry about the transport layer or the evils of service discovery.
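The transport abstraction described here can be illustrated with a toy in-process bus. This is a hypothetical sketch (the `MessageBus` class and its `send`/`listen` methods are invented names, not a real library): service code only sends and receives messages, and never learns who is on the other end or which transport carries the message.

```python
class MessageBus:
    """Toy in-process stand-in for a message-transportation layer.

    Services call send() and listen() and never see addresses, protocols,
    or the identity of their counterparts."""

    def __init__(self):
        self._handlers = []

    def listen(self, handler):
        """Register a service to receive messages; it doesn't know the senders."""
        self._handlers.append(handler)

    def send(self, message):
        """Deliver a message; the sender doesn't know the receivers.
        Behind this call the transport could be HTTP, TCP, a queue, or
        (as here) a plain function call; callers can't tell the difference."""
        replies = [h(message) for h in self._handlers]
        return replies[0] if replies else None

bus = MessageBus()
bus.listen(lambda msg: {"total": msg["price"] * msg["qty"]})
assert bus.send({"price": 10, "qty": 3}) == {"total": 30}
```

Swapping the delivery mechanism inside `send` for a real network transport would not change a line of service code, which is the point of the abstraction.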

    We use the microservice architecture for a very simple reason: we can build better systems and deliver them more quickly. It is much easier to deal with changing requirements, before and after go-live, because you change only small pieces at a time rather than making high-risk full redeployments. Microservices are easy to specify and test. If you think about it, they are black boxes that react to certain messages and possibly emit certain other messages. This is a very clean interface that you can define and test very clearly. Scaling is much easier. The whole

    If we were to consider Gartner's Hype Cycle, microservices are perhaps about to hit the peak of inflated expectations. There is a good number of early adopters, and microservices are quite present in the specialized media, including here on InfoQ. Successful implementations like those at Amazon, Google, Netflix, etc. have demonstrated that this technology is viable and worth considering.


    application does not have to be performant, only the parts that are the bottlenecks. And you can scale them by adding more instances of a given service, a service that, by definition, is stateless and therefore easy to scale linearly. Finally, microservices make project management so much easier. Each microservice should take about a week to write. This gives you nice, easy blocks of effort to work with. If a developer makes a mess, you can, literally, genuinely, throw the code away and start again. It's only a week of work. This means that technical debt accumulates much more slowly. And that helps you move faster.

    We build using Node.js, and it really is perfect for microservices. For communication between services, we have an abstraction layer that gives us the flexibility we need. As we've grown, we've found the need to build some services in other languages. This can happen for many reasons: performance, integration, or simply the availability of talent, both internally and at our clients. We've defined the abstraction layer as a simple protocol so that it's easy to add new services in other languages. Calling it a protocol is almost too much; in fact, it's really just the exchange of JSON documents, embellished with some pattern matching. For message transport, we've used everything from point-to-point HTTP and TCP to web sockets to messaging systems, and even Redis Pub/Sub.
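The "JSON documents plus pattern matching" idea can be sketched in a few lines. This is an illustrative toy, not nearForm's actual layer (the `add_route`/`dispatch` names and the `role`/`cmd` keys are invented for the example): a message matches a pattern when it contains all of the pattern's key/value pairs, and the matching handler is invoked.

```python
import json

routes = []

def add_route(pattern, handler):
    """Register a handler for messages containing the pattern's key/value pairs."""
    routes.append((pattern, handler))

def dispatch(doc):
    """Parse a JSON document and route it by property matching."""
    message = json.loads(doc)
    # First registered pattern whose pairs are all present in the message wins.
    for pattern, handler in routes:
        if all(message.get(k) == v for k, v in pattern.items()):
            return handler(message)
    raise LookupError("no service matches this message")

add_route({"role": "checkout", "cmd": "total"},
          lambda m: {"total": sum(m["prices"])})

assert dispatch(json.dumps({"role": "checkout", "cmd": "total",
                            "prices": [5, 7]})) == {"total": 12}
```

Because routing is by message content rather than by address, a new service in any language can join simply by speaking the same JSON shapes.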

    Our learning and adoption took about two years to fully develop into a high-performing approach. These days, there's so much more reference material, books, case studies, and conference talks, so that time is much shorter. In fact, speaking of books, our team is writing at least two on the subject of microservices, so look out for those later this year.

    InfoQ: When does it make sense to do microservices?

    AT: It makes sense:

    - When you can isolate a piece of domain functionality that a single service can own.

    - When the service can fully own read and write access to its own data store.

    - When multiple teams are contributing to a monolithic system but keep stepping on each other's toes.

    - When you want to implement continuous deployment.

    - When you favor an emergent architecture rather than a top-down design.

    FO: We really wanted to have parallel mission product-development teams that were fully independent. By decomposing our business logic into hundreds of microservices, we are able to sustain parallel changes across multiple business lines.

    RR: It makes sense whenever you prefer to keep your business open, even if that means losing a percentage of revenue due to errors. It's funny how the world of enterprise software seems to glorify the notion of absolute correctness. Every database and application should have only ACID transactions. And then, when you ask the leadership of those organizations which they prefer, you find that keeping the shop doors open is much more important. For example, consumer barcodes are not always accurate; the price at the till does not always match the price on the label. Supermarkets somehow seem to stay open.

    Microservices, as an architecture, value availability over consistency. They keep your site, mobile app, or service up and running. There will be errors in some percentage of the data. You get to tune that percentage by increasing capacity, but you never get away from it completely. If your business can tolerate errors, then microservices are for you.

    Obviously, there are systems that need to be 100% accurate, and the best way to achieve this is with large-scale (and expensive) monoliths, both in terms of software and hardware. Financial, medical, and real-time systems are obvious examples. But there is a large amount of software that is pointlessly slow and expensive to build, simply because we aren't paying attention to business realities.

    InfoQ: What are some of the difficulties in implementing microservices?

    AT:

    - It's hard to replicate production into a staging environment. You need to either test in production or invest in sandbox/stage automation.

    - Ownership: you end up with a lot of services. As teams change, services can become orphaned.

    - Performance: the call stack can become complex, with cycles and redundant calls. You can solve this with lambda-architectural approaches.

    - Deployment: you need clear, consistent technology for continuously deploying software.

    - Client dependencies: avoid writing service clients that pull in large numbers of dependent libraries, which can lead to conflicts. Also, rolling out en-masse changes to those libraries is time consuming.

    - Audit and alerting: you need to move towards tracking and auditing business metrics rather than just low-level performance metrics.

    - Reporting: having decentralized your data, your data team will probably need you to send data to the data warehouse for analysis. You need to look at real-time data transports to get the data out of your service.

    - Accidental complexity: the complexity of the system moves out of the code and gets lost in the white space, the interconnectivity between services.

    FO: [You need to resist] the temptation to go live before you have all the automation and tooling complete. For example, debugging 200 services without the right tools at the macro (what's changed) and micro (trace) levels is nigh on impossible. You need to figure in automation and tooling from day one, not as an afterthought.

    RR: Not abstracting your message transportation will end in tears. Typically, people start out by writing lots of small web servers, a sort of mini-SOA with JSON, then run into problems with dependency management and the really nasty one: service discovery. If your service needs to know where any other services are on the network, you're heading for a world of pain. Why? You've just replicated a monolith, but now your function calls are HTTP calls: not a huge win once things get big. Instead, think messages first. Your service sends messages out into the world, but does not know or care who will get them. Your service receives messages from the world, but does not know or care who sent them. It's up to you as architect to make this a reality, but it's not that hard. Even if you are doing point-to-point behind the scenes for performance reasons, still make sure your service code does not know this, by writing a library to serve as an abstraction layer.

    InfoQ: What are the benefits of implementing microservices?

    AT:

    - Faster time to market.

    - Continuous deployment.

    - Easy-to-understand components (notwithstanding that the complexity sometimes just moves elsewhere in the system).

    - Easy to create, easy to tear down (although you need to have a clean-shop mentality).

    FO: Parallel continuous deployments and ease of refactoring services.

    RR: The big benefit is speed of execution. You and your team will deliver faster on an ongoing basis because you have reduced deployment risk (it's easy to roll back: just stop the new service!), removed the need for big refactoring (it's only a week of code), and removed hard-coded dependencies on language platforms or even things like databases.

    The other benefit is that you have less need of project-management ceremony. Microservice systems suffer so much less from the standard pathologies of software development that strict development processes are not as necessary to ensure delivery. It's easy to see why high levels of unit-test coverage are a must for monoliths, or that pair programming is going to help, or any of the agile techniques. The costs of technical debt in a monolith are so much higher, so it makes sense for the team to be micromanaged. In the microservice world, because the basic engineering approach is just much better suited to underspecified and rapidly changing requirements, you have less need for control. Again, one week of bad code won't kill you, and you'll see it right away.

    InfoQ: How do microservices compare to a traditional SOA system?

    AT: For me, microservices is just taking SOA further, adapting the concept to avoid monolithic services/codebases and focusing on delivering continuous innovation to production across multiple teams.

    FO: Decomposing the business logic into independent services is probably the main takeaway. Someone once described microservice architecture to me as an SOA design pattern, and I guess that makes a lot of sense. There are lots of similarities, monolith versus micro being the main difference.

    RR: It's a radically different approach. There's no concept of strict schemas. There's an insistence on small services. There's recognition that the edges are smart and the network dumb, so complexity does not build up in weird places. You don't have to deal with versioning issues. Why? You run new and old versions of a service together, at the same time, and gradually migrate over, all the while watching your performance and correctness measures. The lack of strict schemas is exactly what makes this possible.
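Running old and new versions side by side usually comes down to splitting traffic between them. The sketch below is hypothetical (the `make_router` helper is invented for illustration): a fixed share of requests goes to the new version, and that share is ramped up as confidence grows.

```python
def make_router(new_version_share):
    """Route a fixed share of traffic to the new service version while
    the rest still hits the old one; ramp the share up over the migration."""
    counter = {"n": 0}

    def route(request, old_handler, new_handler):
        counter["n"] += 1
        # Deterministic round-robin split for this sketch; real systems
        # would hash on a request attribute or use weighted routing.
        if (counter["n"] % 100) < new_version_share * 100:
            return new_handler(request)
        return old_handler(request)

    return route

route = make_router(0.10)  # start by sending 10% of traffic to the new version
hits = [route({}, lambda r: "old", lambda r: "new") for _ in range(100)]
assert hits.count("new") == 10
```

While both versions run, performance and correctness measures are compared; if the new version misbehaves, setting the share back to zero is the rollback.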


    Building a Modern Microservices Architecture at Gilt: The Essentials


    Goldberg, lead software engineer at Gilt, describes the company as a flash-sales business. A typical sale offers a limited but discounted inventory, starting at a specific time and running for a specific period, usually 36 hours. With tens of thousands of people coming to the website at once to buy items, Gilt experiences an extreme, short spike in traffic that generates about 80% of its revenue. Every decision that may affect website performance has to take into consideration this traffic spike of 50 to 100 times the regular traffic.

    As a traditional startup in 2007, Gilt used Ruby on Rails, PostgreSQL, and Memcached. Things went great, but two years later it had a 200,000-line codebase, and increasing traffic overloaded the thousands of Ruby processes required to run against the database. With everyone working on the same codebase, deployment could take up to two weeks due to all the integration tests needed. The biggest hurdle was that if something went wrong, they had a really hard time finding the root cause.

    Macro/microservices era

    At this point, besides moving to the JVM, Gilt entered what Goldberg calls a macro/microservices era. He distinguishes between a macroservice, which handles a specific domain, e.g. sales or payments, and a microservice, which you get by breaking a macroservice down into smaller services. Gilt created 10 macroservices for the core business, services that are still in use. With all other services depending on these, the core services need to perform


    Since 2010, Lead Software Engineer Yoni Goldberg has led the engineering behind several critical projects at Gilt, including personalization, the Gilt Insider loyalty program, SEO/optimization, and other customer-facing initiatives.

  • Architectures you Always Wondered About // eMag Issue 31 - Aug 2015 35

    with good SLAs to keep downtime to a minimum. Their checkout service is one example of a core service: when that service is not responding, users can't place orders and the company doesn't make any money. A set of less-critical supporting services, e.g. for user preferences, use the core services; these are good for the user experience, but the business will still function if one goes down. On top of these supporting services, another set of services generates views for all users.

    While Gilt built services, it also introduced dedicated data stores for each service, providing the best database for each one's needs. This new architecture solved 99% of their scaling problems but left developers with some of the old problems, as the new services were semi-monolithic and lacked clear ownership of code. Problems with deployments and long integration cycles remained. The main problem, though, was that it wasn't fun to develop code.

    Moving to microservices

    To overcome the remaining problems, Gilt created many more microservices and empowered teams to take responsibility not only for developing a service but also for testing, deploying, and monitoring it. This also clarified ownership; a team basically became the owner of a service. According to Goldberg, the biggest benefit came from the smaller scope of a microservice, which made it easier to grasp. It's easy to understand a service composed of just a few thousand lines, and to understand another team's microservice when you move there to contribute as a developer. The architecture removed the huge pain point of deployment dependency among teams. Now, they could move to continuous deployment, with each team deciding for itself when to deploy, even multiple times a day.

    During this move, Gilt started to work with what it calls LOSA (lots of small applications), breaking webpages into smaller pieces, basically microservices for web apps. This lets teams work more independently from other teams, and Goldberg thinks it has created a lot of innovation and a focus on the user experience.

    Current challenges

    Despite Gilt's successful move from Rails to a microservices architecture, Goldberg emphasizes that the company still has some core challenges.

    Deployment

    From the start, each team semi-manually deployed services with its own, different method. A lack of integration made it hard to execute tests to make sure a change didn't break something else. Gilt solved this by building a tool around sbt that helped teams first deploy to an integration-test environment and then release to production. During the last year, the company has been working to bring operations to the teams, adopting Docker and moving to the cloud. One downside Goldberg notes is that deployments are now slower, but he hopes that this will speed up in the coming years.

APIs

During the last year, Gilt has been moving away from an RPC style of communication and instead building REST APIs. The main advantage Goldberg sees is that a well-defined API solves a couple of problems, most importantly discoverability. Because all APIs are available in one place, finding what is available can be done with one search. The API will also provide documentation; by looking at the models and the resources, it's possible to understand what's available and how it's meant to work. With the documentation generated from the code, it will always be correct and reflect any changes made to the exposed resources.
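The idea of documentation that cannot drift from the code can be sketched by deriving a resource description directly from a model definition. The `UserProfile` model and its fields below are hypothetical, not Gilt's actual API.

```python
from dataclasses import dataclass, fields

@dataclass
class UserProfile:
    """A user's public profile."""
    user_id: str
    display_name: str
    loyalty_tier: str

def describe(model):
    """Derive API documentation from the model itself, so the docs
    always reflect the fields the code actually exposes."""
    return {
        "resource": model.__name__,
        "description": (model.__doc__ or "").strip(),
        "fields": [f.name for f in fields(model)],
    }

doc = describe(UserProfile)
```

If a field is added or renamed in the model, the generated `doc` changes with it; there is no separate document to forget to update.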

Dependencies

All these microservices have many dependencies among them. The biggest challenge Goldberg sees for developers is that for every change they have to make sure they don't break any other service. If they must make a breaking change, they do it in small steps, and all clients must be moved to the new endpoint before the old one can be deleted. Another problem they have experienced is that many of their small web applications repeat calls to one or several services to, for example, generate a user profile. To eliminate these redundant calls, Gilt has created what Goldberg calls a mid-tier microservice: a service that knows the calls needed to create the user profile and which the web applications can call instead. This mid-tier microservice knows how to optimize, perhaps by caching, to reduce the number of calls made.

Ownership

As in most organizations, staff turns over at Gilt. With all the microservices around, the company must make sure it has enough developers who understand the different codebases, and for Goldberg the main solution is code reviews. When every commit needs to be reviewed by at least one other developer, it increases the likelihood that one more developer really understands the code. Goldberg also emphasizes that teams, not individuals, own services, because even though individuals may leave, teams usually stay longer. Teams also have the ability to transfer a service between teams, which he really values.

Another concept is data ownership. The move to one database per microservice has resulted in around a hundred relational databases, and Gilt must manage the schema for each. Goldberg describes how Gilt completely separates the schema from the service code, which brings some subtle but important constraints: changes are required to be incremental and there are no rollbacks, so the teams have to be really conscious about every change they make.
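A forward-only migration scheme like the one described can be sketched as an ordered list of incremental steps plus a table recording which steps have already been applied. The table and migration names here are illustrative; since there is no rollback path, each step must be safe to run on its own (e.g. add a column and backfill before anything depends on it).

```python
import sqlite3

# Ordered, append-only list of incremental schema changes (hypothetical).
MIGRATIONS = [
    ("001_create_users", "CREATE TABLE users (id TEXT PRIMARY KEY)"),
    ("002_add_email", "ALTER TABLE users ADD COLUMN email TEXT"),
]

def migrate(conn):
    """Apply every migration not yet recorded, in order, forward only."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_version")}
    for name, sql in MIGRATIONS:
        if name not in applied:  # only missing steps run; nothing is ever undone
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version VALUES (?)", (name,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # idempotent: a second run finds nothing new to apply
```

Rolling forward instead of back means a mistake is fixed by appending a new migration, never by rewriting or reverting an old one.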

Monitoring

With so many dependencies among services, Goldberg emphasizes that monitoring is important. Gilt uses several tools — e.g. New Relic, Boundary, and Graphite — to collect the metrics it especially cares about, but has also developed its own monitoring system, CAVE, which is open source. CAVE's basic function is to set up rules and raise alerts, for example when the total order value of all US shipments during any five-minute window drops below a threshold value, or when the 99th-percentile response time exceeds a set value. This is a technique Goldberg finds better than watching raw metrics.
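A rule of the kind CAVE evaluates — alert when the total order value over a sliding five-minute window falls below a floor — can be sketched with a timestamped event queue. This is an illustrative sketch of the rule's semantics, not CAVE's actual rule engine; the threshold and timestamps are made up.

```python
from collections import deque

class WindowFloorRule:
    """Fires when the sum of recorded values inside the sliding
    window drops below a configured floor."""

    def __init__(self, window_seconds=300, floor=1000.0):
        self.window = window_seconds
        self.floor = floor
        self.events = deque()  # (timestamp, order_value), oldest first

    def record(self, ts, value):
        self.events.append((ts, value))

    def breached(self, now):
        # Evict events that have aged out of the window.
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()
        return sum(v for _, v in self.events) < self.floor

rule = WindowFloorRule(window_seconds=300, floor=1000.0)
rule.record(0, 600.0)
rule.record(100, 600.0)
```

At `now=200` both orders are inside the window (total 1200, above the floor); by `now=400` the first order has aged out, the windowed total falls to 600, and the rule fires.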

Takeaway

For Goldberg, the biggest advantage Gilt has gained from microservices is ownership by team. He believes that when team members own a service, they tend to treat it like their baby. Another big promise of microservices he mentions is breaking complex problems into small ones that everyone can understand, one service at a time.

Two challenges that Goldberg thinks remain are monitoring, for lack of tooling, and the integration and developer environment.

Goldberg's final advice for starting with microservices is to begin with a feature that does not yet exist and build it as a microservice. He thinks it is hard to get acceptance for breaking down something that already exists and works; building something new will be much easier for people to accept.

