
Serverless Architecture for the Internet of Things

Ben Kehoe @ben11kehoe Cloud Robotics Research Scientist

2016-05-26

Transition to the cloud: “Treat servers like cattle, not pets” (traces back to Bill Baker at Microsoft)

Transition to serverless: Treat servers like roaches

Contents

1.  iRobot + me
2.  iRobot’s needs
3.  Cloud architecture
4.  Serverless ops
5.  Looking forward

iRobot + me

•  Consumer robotics
    –  Roomba (vacuuming)
    –  Braava (hard floor care)
    –  Other (Looj, Verra)
    –  Create
•  Global
    –  Over 100 countries
    –  Only 40% North America
    –  Antarctica?
•  High volume
    –  2.4 million robots last year
•  Me
    –  Cloud Robotics Research Scientist
    –  R&D, cloud architecture, IoT/smart home
    –  Background: UAVs, surgical robots, physics, theater, …


iRobot’s needs

•  IoT/Smart Home
    –  From the consumer’s perspective, the cloud is sometimes hidden in (consumer) IoT
        •  This is not intuitive
        •  Smart Home is a better term
    –  The consumer may never interact with an IoT device from outside their home
    –  The cloud may enable functionality that the consumer only uses through direct physical interaction
    –  This is especially true of robots
•  Enterprise
    –  Global
    –  Scalable
    –  Secure
    –  Auditable


•  Cloud infrastructure for our customer-facing system is undifferentiated heavy lifting for us
    –  On the other hand, big data may be a key part of our business, so cloud infrastructure in that area is more relevant for us
•  This is where serverless architecture comes in
    –  Hurray!
    –  Development limited mostly to business logic
    –  Accept inefficiencies in system design due to available service functionality in exchange for vastly reduced ops complexity


Cloud Architecture

•  Users, apps, robots
    –  Users to apps: one to many
    –  Users to robots: many to many
    –  “Accountless apps”

System architecture


•  Local connections
•  Triangle of trust
•  Two entry points: AWS IoT and API Gateway


•  Robots
    –  No AWS credentials
    –  Certificates --> can only authenticate with AWS IoT
        •  Not even API Gateway custom auth :-(
        •  BYO Cert (mfg-ing logistics)
        •  Use presigned URLs for e.g. S3 get/put
    –  OTA firmware update
•  Apps
    –  Cognito identity --> AWS credentials
    –  “Accountless” functionality (UX driven)
    –  Uses the triangle of trust
•  Admin console
    –  ADFS sign in
    –  Served through separate API Gateway
        •  Protip: single-page web app, files served thru API GW using S3 service proxying, API calls using relative paths --> client always in sync with backend

Cloud architecture for IoT

•  Computation: Lambda and IoT Rules
•  Lots of SQS queues
•  Storage: DynamoDB, IoT Shadows
•  Security: Rube Goldberg WAF for API Gateway
•  The *: Elasticsearch and RDS

Cloud architecture

•  Enterprise needs
    –  Scale
        •  No problem!*
        •  Lambda limits are the most worrying, CloudWatch Events limits are the most annoying
            –  Mostly because of SQS
    –  Global
        •  Actually the biggest downside to serverless
            –  Regional availability
            –  Vendor lock-in
    –  Security
        •  WAF
        •  CloudWatch
        •  3rd party tools
    –  Auditability
        •  CloudTrail

Serverless ops

•  Serverless IT and Ops
    –  Infrastructure as code
    –  Build artifacts
    –  Inspectability
    –  Deploy from dev machine or test server
    –  Deploy from working dir or git commit
    –  Auditability
        •  Security
        •  Billing
•  CloudFormation is great for deployment. Slower is ok for us.
    –  Use CloudFormation custom resources to deploy and manage arbitrary resources
        •  E.g., API Gateway + WAF
    –  Give CloudFormation some syntactical sugar
•  Still need to deploy and manage custom resource Lambdas
•  Still need to deploy artifacts into S3
    –  Lambda source code
    –  CloudFormation templates
    –  Etc.
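The custom resources mentioned in the list above follow a simple protocol: CloudFormation invokes a Lambda with a request event and then waits for a JSON document to be PUT to a presigned URL carried in that event. A minimal sketch of the response side is below. It uses only the standard library; the `provision` and `send` parameters are hypothetical hooks added for illustration, and this is not the API of the cfnlambda library.

```python
import json
import urllib.request


def build_response(event, status, physical_id, data=None, reason=None):
    """Build the JSON body CloudFormation expects back from a custom resource."""
    body = {
        "Status": status,  # "SUCCESS" or "FAILED"
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }
    if reason:
        body["Reason"] = reason  # CloudFormation requires this on failure
    return body


def send_response(event, body):
    """PUT the response to the presigned URL supplied in the request event."""
    request = urllib.request.Request(
        event["ResponseURL"],
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": ""},
    )
    urllib.request.urlopen(request)


def handler(event, context, provision=None, send=send_response):
    """Hypothetical handler; `provision` stands in for the real resource logic."""
    try:
        # RequestType is "Create", "Update", or "Delete".
        physical_id, data = provision(
            event["RequestType"], event["ResourceProperties"]
        )
        body = build_response(event, "SUCCESS", physical_id, data)
    except Exception as exc:
        # Always answer, otherwise the stack waits until it times out.
        physical_id = event.get("PhysicalResourceId", "failed-to-create")
        body = build_response(event, "FAILED", physical_id, reason=str(exc))
    send(event, body)
    return body
```

Catching every exception and still sending a FAILED response matters in practice, because a custom resource that never answers leaves the stack operation hanging.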


•  Our deployment tool is named “cloudr”
•  “clowder” is the collective noun for cats
•  Builds Lambda source code
•  Deploys artifacts to S3
•  Creates/compiles CloudFormation templates, injects S3 locations from previous step
•  Deploys and manages custom resource Lambdas using hash of source as alias
    –  Uses our cfnlambda library
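Two of the steps above, deriving an alias from a hash of the source and injecting the S3 location into a template, can be sketched as follows. This is an illustration under assumptions, not cloudr's code: the function names, the `src-` prefix, and the 16-character truncation are all made up for the example.

```python
import copy
import hashlib
from pathlib import Path


def source_hash(source_dir):
    """Hash relative file names and contents under source_dir in a stable order."""
    digest = hashlib.sha256()
    root = Path(source_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def alias_for(source_dir):
    """Derive an alias name from the source hash (naming scheme is an assumption)."""
    # Lambda alias names may not be purely numeric, so add a letter prefix.
    return "src-" + source_hash(source_dir)[:16]


def inject_code_location(template, logical_id, bucket, key):
    """Return a copy of a CloudFormation template dict whose function points at S3."""
    updated = copy.deepcopy(template)
    properties = updated["Resources"][logical_id]["Properties"]
    properties["Code"] = {"S3Bucket": bucket, "S3Key": key}
    return updated
```

Because the alias depends only on the source contents, redeploying unchanged code maps to the same alias, while any edit produces a new one.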

cloudr

Source

•  Creates an application consisting of a set of microservices
    –  One stack per microservice
        •  CloudFormation template defined by user, with syntactic sugar
        •  DynamoDB table for service discovery added automagically, name injected into Lambda functions
        •  Required and provided resources defined by the user
    –  One stack for the application as a whole
        •  A custom resource for each microservice stack
        •  Cross-service policies created based on the declared dependencies
        •  Service discovery tables populated from info contained in this stack
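To make the idea of policies derived from declared dependencies concrete, here is a small sketch. The declaration format (`provides` and `requires` dictionaries) and the function name are hypothetical; the slides do not show how cloudr actually represents these declarations.

```python
def cross_service_policies(services):
    """Build one IAM policy document per service from declared dependencies.

    `services` maps a service name to a dict with:
      "provides": {resource_name: arn}
      "requires": [{"service": ..., "resource": ..., "actions": [...]}]
    This layout is an assumption made for the example.
    """
    policies = {}
    for name, spec in services.items():
        statements = []
        for dep in spec.get("requires", []):
            provider = services.get(dep["service"], {})
            provided = provider.get("provides", {})
            if dep["resource"] not in provided:
                raise ValueError(
                    f"{name} requires {dep['service']}.{dep['resource']}, "
                    "which no service provides"
                )
            statements.append({
                "Effect": "Allow",
                "Action": sorted(dep["actions"]),
                "Resource": provided[dep["resource"]],
            })
        if statements:
            policies[name] = {"Version": "2012-10-17", "Statement": statements}
    return policies
```

Generating grants only from declarations keeps each service's permissions limited to what it has said it needs, and an undeclared or misspelled dependency fails at build time rather than at runtime.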


•  How do we actually roll out updates?
•  This is the biggest area where serverless offerings are lacking
•  With IaaS and lower-level PaaS, you get lots of control
    –  Canary deployments
    –  Roll out behind the load balancer, or set up a new load balancer with a whole separate fleet
•  What can we do serverless on AWS today?

Serverless deployment

•  Rolling out a deployment “behind the load balancer” is impossible
•  Canary deployments are impossible if we update in place
•  So how do we host multiple versions simultaneously?
    –  For API Gateway, multiple versions can coexist as separate stages or separate APIs
    –  For IoT, no such luck
        •  One MQTT server per account (in a region)
        •  Certificates can only exist in one account (in a region)
        •  (╯°□°) ┻━┻


•  Solution for IoT: topic prefix
•  All rules for an instance of an application listen on prefixed topics
•  What about /$aws/ topics?
    –  Robot sets prefix in the shadow
    –  Rules on shadow switch on that field
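The topic-prefix scheme above can be sketched with a few string helpers that build the topic names and the rule queries for one application instance. The prefix values, the topic layout, and the shadow field name `state.reported.topicPrefix` are assumptions for illustration; the slides do not give the real names.

```python
def prefixed_topic(prefix, topic):
    """Place a topic under the prefix that identifies one application instance."""
    return f"{prefix.strip('/')}/{topic.lstrip('/')}"


def rule_sql(prefix, topic_filter):
    """Rule query that only sees messages published under this instance's prefix."""
    return f"SELECT * FROM '{prefixed_topic(prefix, topic_filter)}'"


def shadow_rule_sql(prefix, field="state.reported.topicPrefix"):
    """Rule query for shadow updates, which live on reserved topics.

    Reserved topics cannot be prefixed, so the rule filters on a field the
    robot writes into its shadow (the field name here is hypothetical).
    """
    return (
        "SELECT * FROM '$aws/things/+/shadow/update/accepted' "
        f"WHERE {field} = '{prefix}'"
    )
```

For example, `rule_sql("v2", "robots/+/status")` yields a query over `v2/robots/+/status`, so rules belonging to a `v1` instance never see that traffic.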


•  How do you switch clients over?
•  A separate global API Gateway for service discovery
    –  Well-known url using custom domain names
•  Client service discovery returns three items:
    –  API Gateway base url
    –  MQTT host
    –  MQTT topic prefix
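A discovery response carrying the three items listed above might look like the sketch below. The field names, the instance names `blue` and `green`, and the example.com hosts are placeholders, not values from the talk.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Endpoints:
    """The three items a client needs to reach one application instance."""
    api_base_url: str
    mqtt_host: str
    mqtt_topic_prefix: str


# Hypothetical table of deployed instances; in practice this would live in
# a datastore behind the discovery API rather than in code.
INSTANCES = {
    "blue": Endpoints("https://blue.api.example.com/v1", "mqtt.example.com", "blue"),
    "green": Endpoints("https://green.api.example.com/v1", "mqtt.example.com", "green"),
}


def discover(active_instance):
    """Return the discovery payload for the currently active instance as JSON."""
    return json.dumps(asdict(INSTANCES[active_instance]), sort_keys=True)
```

Switching clients over then amounts to changing which instance the discovery endpoint reports, since clients pick up the new base URL and topic prefix on their next lookup.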


•  When an app wants to communicate with a robot, how do we make sure it talks to the same instance the robot is talking to?
    –  Separate service discovery for robots and apps
    –  Robot service discovery: where should I be?
        •  Robot updates “where am I” in the cloud
    –  App service discovery: where is this robot?
    –  Quadrilateral of trust
•  A third service discovery for app’s non-robot-related calls
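The split between robot and app discovery can be illustrated with an in-memory sketch. The class and method names are hypothetical, and a real system would keep this state in a datastore; the point is only that apps follow the instance a robot last reported, not the instance it has been assigned.

```python
class DiscoveryRegistry:
    """Tracks where each robot should be and where it actually is."""

    def __init__(self, default_instance):
        self._default = default_instance
        self._assigned = {}  # robot_id -> instance the robot should move to
        self._current = {}   # robot_id -> instance the robot last reported

    def assign(self, robot_id, instance):
        """Operator action: direct a robot to a particular instance."""
        self._assigned[robot_id] = instance

    def where_should_i_be(self, robot_id):
        """Robot service discovery."""
        return self._assigned.get(robot_id, self._default)

    def report_location(self, robot_id, instance):
        """Robot records "where am I" after connecting to an instance."""
        self._current[robot_id] = instance

    def where_is_robot(self, robot_id):
        """App service discovery; None if the robot has never reported."""
        return self._current.get(robot_id)


registry = DiscoveryRegistry(default_instance="blue")
registry.report_location("robot-1", registry.where_should_i_be("robot-1"))
registry.assign("robot-1", "green")  # the robot has not moved yet
print(registry.where_is_robot("robot-1"))  # still the old instance
```

Until the robot next checks in and reports its new location, apps keep being sent to the instance the robot is actually connected to.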


Conclusion

•  The awesome
    –  Zero unmanaged EC2 instances
    –  Zero Elastic Beanstalk applications
•  The good
    –  Lambda service in isolation
        •  Scaling
        •  Development
        •  Testing
    –  API Gateway features
    –  AWS IoT
        •  BYO Certificates
        •  Rules Engine computation
        •  Pricing!


•  The bad
    –  Deployment gets complicated
    –  We could get a lot of mileage out of MQTT “retain”
    –  IoT fleet operations
    –  WAF for API Gateway
    –  VPC support
•  The ugly
    –  Lambda SQS integration
    –  IoT instances/certificate limitations


•  IoT is complicated
•  Serverless is the way
    –  Development
    –  Deployment
    –  Operations
•  iRobot’s solution: cloudr
    –  (Hopefully) will be open source


•  iRobotCorporation on GitHub
    –  cfnlambda
    –  sqslambda*
    –  ADFS credentials refreshing*
•  We’re hiring
•  @ben11kehoe on Twitter
