ops for noops - operational challenges for serverless apps

Post on 19-Jan-2017

Ops for NoOps: Operational challenges for serverless apps

Eric Windisch CTO IOpipe, Inc.

ERIC WINDISCH

@ewindisch

Founder & CTO of IOpipe, Inc. www.iopipe.com

ex-Docker, ex-Cloudscaling.

Builder of clouds, destroyer of monoliths.

EVOLUTION CREATES CHALLENGES

➤ Fear, uncertainty, and doubt for new users:

➤ What problems will I run into with this new platform?

➤ What will I do when those problems happen?

➤ Will I know about those problems when they happen?

➤ Is it secure?

➤ What tools to use?

SERVERLESS DEVELOPER PROFILES

➤ Frameworks: SLS, Zappa, Apex, DIY, others.

➤ Event sources: API Gateway, SNS, S3, Kinesis, others. (Alexa and AWS IoT sources are relatively infrequent)

➤ Languages: Node, Python, Java, Go, C, Ruby.

➤ Regions: all of them (us-east, us-west, etc.); several are moving to new international regions (Sydney, etc.)

➤ Events: 0-100m+ events per day

➤ Stage: dev/test through production

CLOUDWATCH

➤ Basic “super-outside” metrics:

➤ Errors

➤ Logs

➤ Invocations/time

➤ Duration

➤ Memory

➤ This is what Datadog, Sumologic, etc. ingest.

HARD PROBLEMS

➤ Cold-starts

➤ Especially painful for Java users.

➤ Relationship of metrics vs logs.

➤ Lack or difficulty of profiling & tracing tools. When do GCs happen?

➤ Retries - why/when & in relation to event sources

➤ AWS account level limits (& when to bump them up)

➤ Difficulty of managing unsupported languages: C, C++, Go, Ruby, etc.

➤ Debugging of & visibility into distributed systems

➤ Are failures at event-source or lambda function?

➤ Kinesis!!!

➤ Cross-invocation leaks

➤ Memory leaks

➤ File descriptor leaks

➤ Backend process visibility

➤ Thread/callback leaks.

➤ etc.
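Cold-starts and cross-invocation leaks can both be observed from inside a function, because module-level state survives whenever the platform reuses a process. The following is a minimal sketch of that idea, not a description of any particular product. The names `track_invocation` and `count_open_fds` are hypothetical, and the file descriptor count assumes a Linux-style `/proc` filesystem is available.

```python
import os
import time

# Module-level state survives between invocations when the platform reuses
# the same process, which is what makes cold-start and leak detection possible.
_COLD_START = True
_INVOCATIONS = 0
_BASELINE_FDS = None


def count_open_fds():
    """Return the number of open file descriptors, or None without /proc (assumption: Linux)."""
    try:
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


def track_invocation():
    """Hypothetical helper: call at the top of a handler to record per-invocation facts."""
    global _COLD_START, _INVOCATIONS, _BASELINE_FDS
    was_cold = _COLD_START
    _COLD_START = False
    _INVOCATIONS += 1

    fds = count_open_fds()
    if _BASELINE_FDS is None:
        # The first measurement in this process becomes the baseline.
        _BASELINE_FDS = fds

    if fds is None or _BASELINE_FDS is None:
        fd_growth = None
    else:
        fd_growth = fds - _BASELINE_FDS

    return {
        "cold_start": was_cold,       # True only for the first call in this process
        "invocation": _INVOCATIONS,   # how many times this process has been reused
        "open_fds": fds,
        "fd_growth": fd_growth,       # steady growth across invocations suggests a leak
        "timestamp": time.time(),
    }
```

A handler would call `track_invocation()` once per event and ship the result with its logs. A `fd_growth` value that keeps climbing across warm invocations points to a file descriptor leak.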

INSIDE THE PROCESS

➤ We install into your process, around your functions.

➤ Import a library, use a decorator (or low-level reporting API)

➤ Gets info via NodeJS process var, Python sys, etc.

➤ Timing information for wrapped function(s).

➤ Stacktrace reporting.

➤ Extra logging / events pushed by developers.

➤ & looks outside…
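The decorator approach for instrumenting a function from inside the process can be sketched in a few lines of Python. This is an illustration of the general technique only, not the IOpipe library's API. The names `monitored`, `report`, and `REPORTS` are hypothetical, and the in-memory list stands in for whatever backend would really receive the data.

```python
import functools
import sys
import time
import traceback

REPORTS = []  # hypothetical stand-in for a real reporting backend


def report(record):
    """Hypothetical sink: a real wrapper would send this record over the network."""
    REPORTS.append(record)


def monitored(func):
    """Wrap a handler to capture timing, runtime info, and stack traces."""
    @functools.wraps(func)
    def wrapper(event, context):
        record = {
            "function": func.__name__,
            "python_version": sys.version.split()[0],  # info gathered via sys
            "error": None,
            "stacktrace": None,
        }
        start = time.perf_counter()
        try:
            return func(event, context)
        except Exception as exc:
            record["error"] = repr(exc)
            record["stacktrace"] = traceback.format_exc()
            raise  # never swallow the caller's exception
        finally:
            # Runs on both success and failure, so every invocation is reported.
            record["duration_ms"] = (time.perf_counter() - start) * 1000.0
            report(record)
    return wrapper


@monitored
def handler(event, context):
    return {"echo": event}
```

Re-raising inside the wrapper keeps the platform's own error handling and retry behavior unchanged, while the `finally` block guarantees that one report is produced per invocation.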

METRICS & ANALYTICS

INTO THE BLACK BOX

GITHUB.COM/IOPIPE/LAMBDA-SHELL

OUTSIDE THE FUNCTION - INSIDE THE BLACK BOX

➤ Reuse of containers and VMs

➤ Cold-starts by VM, container, and app process.

➤ Tenancy of VMs (how many containers)

➤ Host VM processes(!!) & processes in other containers(!!!)

➤ Limited & very likely to go away…probably per-tenant VMs anyway

➤ Spawned processes
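One way to look outside the function is to enumerate whatever processes the runtime lets it see. The sketch below assumes a Linux-style `/proc` filesystem and makes no claim about what any given platform actually exposes. The function name `visible_processes` is hypothetical.

```python
import os


def visible_processes(proc_root="/proc"):
    """Return sorted (pid, command line) pairs for processes visible under proc_root."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not Linux, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are process IDs
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # the process exited or access was denied
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\x00", b" ").decode("utf-8", "replace").strip()
        found.append((int(entry), cmdline))
    return sorted(found)
```

Comparing the result across invocations shows spawned processes that outlive a single call, and any entry that does not belong to the function itself hints at what else shares the container.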

SECURITY

➤ I founded the Docker Security Team…

➤ FYI - Lambda’s not Docker!

➤ Lambda’s not perfect! (Security never is!)

➤ Amazon did a good job.

➤ Re-inventing the wheel means repeating some mistakes solved elsewhere…

➤ Still… AWS did a pretty good job.

➤ Don’t worry about it.

➤ Some questions can only be answered by AWS or with more data! TBD!

APP MANAGEMENT

➤ Actionable metrics from inside & outside the function.

➤ Ingest CloudTrail for context-aware intelligence.

➤ Where events originate, retries, etc.

➤ Alarms -> Lambda invocation

➤ triggers AWS services, PagerDuty, IFTTT, Zapier, etc.

➤ Real-time visibility. Daily, Weekly, Monthly reporting.
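The "Alarms -> Lambda invocation" flow can be sketched as a small handler. The slides do not say how the alarm reaches the function, so this sketch assumes a CloudWatch alarm delivered through an SNS topic, where the alarm JSON arrives as a string in `Records[].Sns.Message`. The name `alarm_handler` and the `page`/`log` routing rule are hypothetical.

```python
import json


def alarm_handler(event, context):
    """Turn SNS-delivered CloudWatch alarm notifications into routing decisions."""
    actions = []
    for record in event.get("Records", []):
        # SNS delivers the alarm as a JSON string inside the message body.
        alarm = json.loads(record["Sns"]["Message"])
        state = alarm.get("NewStateValue", "UNKNOWN")
        actions.append({
            "alarm": alarm.get("AlarmName", "unknown"),
            "state": state,
            "reason": alarm.get("NewStateReason", ""),
            # Hypothetical rule: page a human only while the alarm is firing.
            "target": "page" if state == "ALARM" else "log",
        })
    return actions
```

In a real deployment the `page` branch would call out to a service such as PagerDuty, and the function returns plain data here so the routing logic can be tested without network access.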

GETTING HELP

➤ Gitter…

➤ https://gitter.im/serverless/serverless

➤ Slack…

➤ https://serverless-forum.slack.com/signup

➤ IOpipe Slack (for registered users!)

➤ Forums…

➤ Amazon - https://forums.aws.amazon.com/index.jspa

Eric Windisch CTO IOpipe, Inc.

Register for FREE beta access:

www.iopipe.com

Q&A
