Production Debugging Web Applications


Upload: ido-flatow

Post on 06-Jan-2017


Page 1: Production debugging web applications

© Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com

SELA DEVELOPER PRACTICE

December 11-15, 2016

Ido Flatow

Production Debugging Web Applications

Page 2: Production debugging web applications

THE STORIES YOU ARE ABOUT TO HEAR ARE BASED ON ACTUAL CASES. LOCATIONS, TIMELINES, AND NAMES HAVE BEEN CHANGED FOR DRAMATIC PURPOSES AND TO PROTECT THOSE INDIVIDUALS WHO ARE STILL LIVING.

Page 3: Production debugging web applications

For the Next 60 Minutes…

Introduction
Service hangs
Unexplained exceptions
High memory consumption

Page 4: Production debugging web applications

Why Are You Here?

You are going to hear about
  Bugs in web applications
  Tips for better coding
  Debugging tools, and when to use them
You will not leave here as expert debuggers! Sorry
But… you will leave with a good starting point
And probably anxious to check your code

Page 5: Production debugging web applications

How Are We Going to Do This?

What did the client report?
Which steps did we use to troubleshoot the issue?
What did we find?
How did we fix it?
Which tools did we use?

Page 6: Production debugging web applications

The Tired WCF Service

Client
  Local bank
Reported
  WCF service works fine for a few hours, then stops handling requests
  Clients call the service, wait, then time out
  Server CPU is high
Workaround
  Restart the IIS application pool

Page 7: Production debugging web applications

Troubleshooting

Configured WCF to output performance counters
Used Performance Monitor to watch WCF’s counters, specifically
  Instances
  Percent Of Max Concurrent Calls

Page 8: Production debugging web applications

Troubleshooting - cntd

Waited for the service to hang
Inspected counter values
  Value was at 100% (101.563% to be exact)
  At this point, no clients were active!
Reminder - WCF throttles concurrent calls (16 x #Cores)
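As a worked check on the counter value: with the 16 x #Cores default, a 4-core server would allow 64 concurrent calls, and 65 calls counted against that limit works out to 101.5625%, which rounds to the 101.563% reading. The core count and the 65 calls are assumptions chosen to match the reading, not figures from the case, and the helper names below are hypothetical.

```csharp
using System;

public static class ThrottleMath
{
    // Default concurrent-call limit as stated on the slide: 16 per core.
    public static int DefaultMaxConcurrentCalls(int cores)
    {
        return 16 * cores;
    }

    // Mirrors what a "percent of max concurrent calls" value expresses:
    // calls currently counted by the throttle, relative to the limit.
    public static double PercentOfMax(int callsInProgress, int maxConcurrentCalls)
    {
        return 100.0 * callsInProgress / maxConcurrentCalls;
    }
}
```

For example, `ThrottleMath.PercentOfMax(65, ThrottleMath.DefaultMaxConcurrentCalls(4))` evaluates to 101.5625. A value pinned at or above 100% while no clients are active suggests that calls are stuck inside the service and are never released by the throttle.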

Page 9: Production debugging web applications

Troubleshooting - cntd

Watched w3wp thread stacks with Process Explorer
  Noticed many .NET threads in a sleep loop
Issue found - requests hung in the service, causing it to throttle new requests
Fixed the code to stop the endless loop – problem solved!
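The slides do not show the offending loop or the actual fix. As a minimal sketch of one way to keep a polling loop from spinning forever, the hypothetical helper below gives up after a deadline, so a request that can never succeed fails instead of holding a throttle slot indefinitely.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class BoundedWait
{
    // Polls the condition until it is true or the timeout elapses.
    // Returns true if the condition was met, false if the wait timed out.
    public static bool WaitUntil(Func<bool> condition, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
            {
                return false; // give up instead of sleeping forever
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}
```

A caller would treat a `false` result as a failure and return an error to the client, which frees the WCF call slot.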

Page 10: Production debugging web applications

The Tools in Use

Performance Monitor (perfmon.exe)
  View counters that show the state of various application aspects
  Most people use it to check CPU, memory, disk, and network state
  .NET CLR has useful counters for memory, GC, JIT, locks, threads, exceptions, etc.
  Other useful counters: WCF, ASP.NET, IIS, and database providers
Sysinternals Process Explorer
  Alternative to Task Manager
  Select a process and view its managed and native threads and stacks
  Examine each thread’s CPU utilization
  View .NET CLR performance counters per process
  https://download.sysinternals.com/files/ProcessExplorer.zip

Page 11: Production debugging web applications

Why We Do Volume Tests

Client
  QA team. Government collaboration app
Reported
  MVC web application works in regular day-to-day use
  Application succeeded under load tests
  Under volume tests, application throws unexplained errors
  Returns HTTP 500, with no specific error message
  Application logs are not showing any relevant information
Workaround
  None. Failed under volume tests

Page 12: Production debugging web applications

Troubleshooting

Checked Event Viewer for errors, found nothing
Used Fiddler to view the HTTP 500 response
  Error text was too general, not very useful

Page 13: Production debugging web applications

Troubleshooting - cntd

Decided to use IIS Failed Request Tracing
  Luckily, the MVC app had an exception filter that used tracing
  Created a Failed Request Tracing rule for HTTP 500
  Added the System.Web.IisTraceListener to the web.config
Waited for the test to reach its breaking point…

Page 14: Production debugging web applications

Troubleshooting - cntd

Opened the newly created trace file in IE
  Found an error! Exception in JSON serialization - string too big
Stack Overflow to the rescue…

Page 15: Production debugging web applications

Troubleshooting - cntd

Ran the test again – failed again!
Checked the JavaScriptSerializer serialization code
  Where is MaxJsonLength set?
  Inspected MVC’s JsonResult code
  Found the code that configured the serializer

Page 16: Production debugging web applications

Troubleshooting – almost done

Code fix was quite easy
  Before: return Json(data);
  After: return new JsonResult { Data = data, MaxJsonLength = NEW_MAX_SIZE };
But how big was our JSON string? 5MB? 1GB?
Time to grab a memory dump…

Page 17: Production debugging web applications

Troubleshooting – just one more thing

Quickest way to dump on an exception - DebugDiag

Page 18: Production debugging web applications

Troubleshooting – final piece of the puzzle

Tricky part, using WinDbg to find the values

Page 19: Production debugging web applications

Troubleshooting – final piece of the puzzle

Which thread had the exception - !Threads

Page 20: Production debugging web applications

Troubleshooting – final piece of the puzzle

Get the thread’s call stack - !ClrStack
JavaScriptSerializer.Serialize takes a StringBuilder …

Page 21: Production debugging web applications

Troubleshooting – final piece of the puzzle

List objects in the stack - !DumpStackObjects (!dso)

Page 22: Production debugging web applications

Troubleshooting – final piece of the puzzle

Get the object’s fields and values - !DumpObj (!do)

Page 23: Production debugging web applications

The Tools in Use

Fiddler
  HTTP(S) proxy and web debugger
  Inspect, create, and manipulate HTTP(S) traffic
  View message content according to its type, such as image, XML/JSON, and JS
  Record traffic, save for later inspection, or export as web tests
  http://www.fiddlertool.com
IIS Failed Request Tracing
  Troubleshoot request/response processing failures
  Collects traces from IIS modules, ASP.NET pipeline, and your own trace messages
  Writes each HTTP context’s trace messages to a separate file
  Create trace file on: status code, execution time, event severity
  http://www.iis.net/learn/troubleshoot/using-failed-request-tracing

Page 24: Production debugging web applications

The Tools in Use

Decompilers
  Browse content of .NET assemblies (.dll and .exe)
  Decompile IL to C# or VB
  Find usage of a field/method/property
  Some tools support extensions and Visual Studio integration
  http://ilspy.net
  https://www.jetbrains.com/decompiler
  http://www.telerik.com/products/decompiler.aspx

Page 25: Production debugging web applications

The Tools in Use

DebugDiag
  Memory dump collector and analyzer
  Can generate stack trees, mini dumps, and full dumps
  Automatic dump on crash, hung requests, perf. counter triggers, etc.
  Contains an analysis tool that scans dump files for known issues
  https://www.microsoft.com/en-us/download/details.aspx?id=49924
WinDbg
  Managed and native debugger, for processes and memory dumps
  Shows lists of threads, stack trees, and stack memory
  Query the managed heap(s), object content, and GC roots
  Various extensions to view HTTP requests, detect deadlocks, etc.
  https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk

Page 26: Production debugging web applications

Leaking Memory In .NET – It Is Possible!

Client
  Local insurance company
Reported
  Worker process memory usage increases over time
  Not sure if it’s a managed or a native issue
Workaround
  Increase application pool recycling to twice a day

Page 27: Production debugging web applications

Troubleshooting

First, need to know if the leak is native or managed
Checked process memory with Sysinternals VMMap
  Looking at multiple snapshots, seems to be managed (.NET) related

Page 28: Production debugging web applications

Troubleshooting - cntd

Time to get some memory dumps
  Need several dumps, so we can compare them
  Very simple to do, using Windows Task Manager
Next, open them and compare memory heaps

Page 29: Production debugging web applications

Troubleshooting - cntd

Compared the dumps with Visual Studio 2015 (requires the Enterprise edition)

Page 30: Production debugging web applications

Troubleshooting - cntd

Didn’t take long to notice the culprit and reason
  Hundreds of DimutFile objects, each containing large byte arrays

Page 31: Production debugging web applications

Troubleshooting - cntd

These objects were not “leaked”, they were cached!
Recommended fix included
  Do not cache many large objects
  Cache using an expiration (sliding / fixed)
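The slides do not show the caching code or name the cache API the application used. As a minimal sketch of caching with an expiration, the example below uses `System.Runtime.Caching.MemoryCache` as a stand-in (it needs a reference to the System.Runtime.Caching assembly). The `FileCache` class name and the timeout values are assumptions for illustration.

```csharp
using System;
using System.Runtime.Caching;

public static class FileCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // Sliding expiration: the entry is evicted once it has not been
    // accessed for 10 minutes (assumed value).
    public static void AddSliding(string key, byte[] content)
    {
        var policy = new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(10) };
        Cache.Set(key, content, policy);
    }

    // Fixed (absolute) expiration: the entry is evicted one hour after
    // insertion, no matter how often it is read (assumed value).
    public static void AddFixed(string key, byte[] content)
    {
        var policy = new CacheItemPolicy { AbsoluteExpiration = DateTimeOffset.UtcNow.AddHours(1) };
        Cache.Set(key, content, policy);
    }

    // Returns null when the entry is missing or has already expired.
    public static byte[] Get(string key)
    {
        return (byte[])Cache.Get(key);
    }
}
```

A policy takes either a sliding or an absolute expiration, not both. Either way, large byte arrays stop accumulating for the lifetime of the worker process.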

Page 32: Production debugging web applications

Troubleshooting – wait a second…

The memory diff. had another suspicious leak
  Why are we leaking the HomeController?

Page 33: Production debugging web applications

Troubleshooting - cntd

Checked roots
  Controller is also cached, why?
  Referenced by the CacheItemRemovedCallback event

Page 34: Production debugging web applications

Troubleshooting - cntd

Checked the code one last time
  CacheItemRemoved is registered to the event, but it is an instance method
  Note - adding an instance method to a global event may leak its containing object
The fix - change the callback method to static
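The reason the static callback fixes the leak can be seen from the delegate itself: a delegate to an instance method stores a reference to that instance in its `Target` property, while a delegate to a static method has no target. The sketch below uses a simplified, hypothetical `HomeController` and plain `Action<string>` delegates instead of the real cache callback type.

```csharp
using System;

public class HomeController
{
    // Instance method: a delegate to it holds a reference to this controller.
    public void OnItemRemoved(string key) { }

    // Static method: a delegate to it has no object to keep alive.
    public static void OnItemRemovedStatic(string key) { }
}

public static class CallbackDemo
{
    // True when the delegate keeps an object instance reachable.
    public static bool HoldsInstance(Delegate callback)
    {
        return callback.Target != null;
    }

    public static Action<string> InstanceCallback(HomeController controller)
    {
        return controller.OnItemRemoved;
    }

    public static Action<string> StaticCallback()
    {
        return HomeController.OnItemRemovedStatic;
    }
}
```

`CallbackDemo.HoldsInstance(CallbackDemo.InstanceCallback(new HomeController()))` is true, and `CallbackDemo.HoldsInstance(CallbackDemo.StaticCallback())` is false. Anything long-lived that stores the first delegate (a cache entry, a static event) keeps the controller reachable for as long as the delegate is stored.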

Page 35: Production debugging web applications

The Tools in Use

Sysinternals VMMap
  Helps in understanding and optimizing memory usage
  Shows a breakdown of the process memory types
  Displays virtual and physical memory
  Can show a detailed memory map of address spaces and usage
  https://technet.microsoft.com/en-us/sysinternals/vmmap.aspx
Visual Studio managed memory debug (Enterprise)
  Part of Visual Studio’s dump debugger
  Displays list of object types and their inclusive/exclusive sizes
  Tracks each object’s root paths
  Compare memory heaps between dump files
  https://msdn.microsoft.com/en-us/library/dn342825.aspx

Page 36: Production debugging web applications

When SSL/TLS Fails…

Client
  Airport shuttle service site
Reported
  Application suddenly fails to communicate with external services over HTTPS
  Error is “Could not establish trust relationship for the SSL/TLS secure channel”
  Cannot reproduce the error in dev/test
Workaround
  Restart IIS (iisreset.exe)

Page 37: Production debugging web applications

Troubleshooting

Checked Event Viewer for any related error
  Found the SSL/TLS error in the Application and System logs
According to MSDN documentation, error code 70 indicates a protocol version problem (protocol_version)

Page 38: Production debugging web applications

Troubleshooting - cntd

Used Microsoft Message Analyzer (network sniffer) to watch the TLS handshake messages
  Before issue starts – client asks for TLS 1.0, handshake completes
  After issue starts – client asks for TLS 1.2, handshake stops

Page 39: Production debugging web applications

Troubleshooting - cntd

Checked the Server Hello, it returned TLS 1.1, not 1.2
Switched to TCP view to verify client’s behavior
  Client indeed sends a FIN, and server responds with an RST

Page 40: Production debugging web applications

Troubleshooting – moment of clarity

Developer remembered adding code to support the new PayPal standard of using only TLS 1.2
  Code was set to use only TLS 1.2, removing support for TLS 1.0 and 1.1
Suggested fix
  Use enum flags to support all TLS versions – Tls | Tls11 | Tls12
  This is the actual default for .NET 4.6 and on
  For .NET 4.5.2 and below – default is Ssl3 | Tls
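The slides do not show the original code. As a sketch of the suggested fix, the example below assumes the protocol was set through `ServicePointManager.SecurityProtocol`, the usual place for this in .NET Framework. That setting is static and applies to all outgoing HTTPS calls in the application domain, which would fit the reported symptom that unrelated external services broke and an IIS restart helped. The `TlsSetup` class name is hypothetical.

```csharp
using System;
using System.Net;

public static class TlsSetup
{
    // All three TLS versions combined as enum flags, as on the slide.
    public static SecurityProtocolType AllTlsVersions
    {
        get
        {
            return SecurityProtocolType.Tls
                 | SecurityProtocolType.Tls11
                 | SecurityProtocolType.Tls12;
        }
    }

    public static void Configure()
    {
        // Problematic (assumed original): allows TLS 1.2 only, for every call.
        // ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        // Suggested: offer TLS 1.0, 1.1, and 1.2, and let the handshake
        // settle on the highest version the remote server supports.
        ServicePointManager.SecurityProtocol = AllTlsVersions;
    }
}
```

An alternative that adds TLS 1.2 without dropping whatever is already enabled is `ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;`.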

Page 41: Production debugging web applications

The Tools in Use

Microsoft Message Analyzer
  Replaces Microsoft’s Network Monitor (NetMon)
  Captures, displays, and analyzes network traffic
  Can listen on local/remote NICs, loopback, Bluetooth, and USB
  Supports capturing HTTPS pre-encryption, using Fiddler proxy component
  https://www.microsoft.com/en-us/download/details.aspx?id=44226
Event Viewer (eventvwr.exe)
  Discussed previously

Page 42: Production debugging web applications

Additional Tools (for next time…)

Process monitoring
  IIS Request Monitoring, Sysinternals Process Monitor
Tracing and logs
  PerfView (CLR/ASP.NET/IIS ETW tracing), IIS/HTTP.sys logs, IIS Advanced Logging, Log Parser Studio
Dumps
  Sysinternals ProcDump, DebugDiag Analysis
Network sniffers
  Wireshark

Page 43: Production debugging web applications

How to Start?

Understand what is happening
Be able to reproduce the problem “on-demand”
Choose the right tool for the task
When in doubt – get a memory dump!

Page 44: Production debugging web applications

Resources
  You had them throughout the slides

My Info
  @IdoFlatow // [email protected] // http://www.idoflatow.net/downloads