The Northwestern Mutual Life Insurance Company – Milwaukee, WI
Application Monitoring
Jeremy Kalsow
Why Application Monitoring
• Majority of all corporations
• Northwestern Mutual
• Total 1,000+ servers
• Team is 6 people
• Team uses 16 servers
• Average 50 applications per server
• Need a way to know status fast
What is it?
• The ability to monitor performance and availability
• Gather metrics
• Show trends
• Pretty pictures for management
Why?
• Trends predict future problems
• Solve application issues faster
• Uptime relates directly to profit for many companies
• View all applications, servers, databases and other items being monitored with a single dashboard.
Types of Monitoring
• Fault
• Performance
• Configuration
• Security
• Accounting
Fault
• Detects major errors
• Easy to implement
• Examples
– Network loss
– Database connectivity
• Very Important
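A fault check for the two examples above (network loss, database connectivity) can be as small as a TCP connection attempt. The following is a minimal sketch, not the presenter's tooling; the function name `can_connect` and the timeout default are assumptions made for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes the socket again on exit.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A scheduler could call something like `can_connect("db.example.internal", 5432)` (a hypothetical host and port) every minute and raise a fault when it returns `False`. A successful connection only shows that the port is reachable, not that the database answers queries correctly.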
Fault
| | Type of Monitoring | What to Monitor | When to monitor |
|---|---|---|---|
| Hardware | CPU utilization | CPU load | Load > 99% for x minutes |
| | Memory utilization | Memory load | Load > 99% for x minutes |
| | Storage | System available space | System out of space |
| Applications | Application available | Application working | Working or error |
| | Application logs | Error log monitoring | If error occurred |
| Databases | Database online | Database is online | Database is up/down |
| Network | Latency | Latency | Latency > acceptable range |
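The "Load > 99% for x minutes" rule can be read as a sustained-threshold check: alert only when every recent sample is above the limit, so a single spike does not raise a fault. The sketch below assumes one sample per minute; the function name and the sample values are illustrative, not taken from the slides.

```python
from typing import Sequence


def sustained_breach(samples: Sequence[float], threshold: float,
                     min_consecutive: int) -> bool:
    """Return True if the latest `min_consecutive` samples all exceed `threshold`."""
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(samples) < min_consecutive:
        # Not enough history yet to call the breach sustained.
        return False
    return all(value > threshold for value in samples[-min_consecutive:])


# Hypothetical CPU load readings, one per minute (percent).
cpu_load = [72.0, 99.5, 99.8, 100.0, 99.9, 99.6]

# Alert if the load stayed above 99% for the last 5 minutes.
if sustained_breach(cpu_load, threshold=99.0, min_consecutive=5):
    print("CPU load fault")
```

The same function would cover the memory row of the table; only the samples and the value chosen for x change.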
Performance
• Slow Performance
• Service Level Agreements
• Metrics
• Old and New Metrics
• Visual Display
Performance
http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html
Configuration
• Configuration variables
• Connectivity
• Speed
• Performance
• Proactive
• Servers and Applications
Configuration
• Why would the configuration change?
• Hardware
• Storage
• Service packs
• Hot fixes
• Windows Updates
Security
• Attempts to access the system
• Open ports
• Inventories
• Firewall
• Packets
• System events
• Blocked Exploits
Accounting
• Monitors Usage
• Generally used for fees
• Profit/Loss
• Examples
– Electric company
– Northwestern Mutual
Types of Monitoring Recap
• Fault
• Performance
• Configuration
• Security
• Accounting
Types of Monitoring Recap
• Historical data
• Baseline test
• Current test
• Performance disagreements
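One way to read the baseline-versus-current comparison is as a ratio of mean response times, with a tolerance that decides when the current test counts as a regression. This is a sketch under that assumption; the 10% tolerance and the timings are made up for illustration.

```python
from statistics import mean
from typing import Sequence


def regression_ratio(baseline_ms: Sequence[float],
                     current_ms: Sequence[float]) -> float:
    """Ratio of current to baseline mean response time (1.0 means unchanged)."""
    return mean(current_ms) / mean(baseline_ms)


def is_regression(baseline_ms: Sequence[float], current_ms: Sequence[float],
                  tolerance: float = 0.10) -> bool:
    """True if the current test is slower than baseline by more than `tolerance`."""
    return regression_ratio(baseline_ms, current_ms) > 1.0 + tolerance


# Hypothetical response times in milliseconds.
baseline = [200.0, 210.0, 190.0]   # stored from an earlier release
current = [250.0, 240.0, 260.0]    # measured on the release under test
print(is_regression(baseline, current))
```

Keeping the baseline numbers on record gives both sides of a performance disagreement the same data to argue from.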
Types of Monitoring Recap
• Allows for trends to be seen
• Modifications can be made
• Trends over multiple releases
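A trend over multiple releases can be summarized as the slope of a least-squares line through one measurement per release. The sketch below assumes evenly spaced releases and uses invented response times; it illustrates the idea and does not describe any particular monitoring product.

```python
from typing import Sequence


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of `values` against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements to compute a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical mean response time (ms) for four consecutive releases.
per_release_ms = [200.0, 210.0, 220.0, 230.0]
print(trend_slope(per_release_ms))  # positive slope: each release is slower
```

A positive slope on a response-time metric is a cue to make modifications before the trend turns into a fault.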
Types of Monitoring Recap
• Monitoring is important
• Not enough time is given
• Implemented after discovery of an issue
• Monitoring only in areas of known problems
• Adding monitoring requires time and money
Challenges of application monitoring
• Various types of systems
• Shared
• Clustered
• Virtualized
• Production logging
Shared Systems
• 1 server / Multiple applications
• System resources are shared
• Tracking individual usage is difficult
• Many applications may be impacted
• Server without access (production)
Clustered Systems
• Applications on more than one server
• Avoid single point of failure
• May be hard to target the issue
Production Logging
• Generally Limited
• Most errors repeated in test
• Application downtime
• Use of company resources
Implement Application Monitoring
• Plan Early
• Monitor Proactively
• Create a Recovery Plan
• Create and use SLAs
Plan Early
• Planning stage
• Add monitoring during development
• Late additions cover known issues
Monitor Proactively
• Harder to implement
• Issues are dealt with before end user knows
Monitor Proactively
• Tool-based approach
• Easy and relatively fast setup
• No code
• Multiple applications
Monitor Proactively
• Logging is directly in the code
• Less efficient
• More specific
• Developers have less time
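For the code-based approach, a small timing helper built on Python's standard `logging` module shows what "logging directly in the code" can look like. This is a sketch only: the logger name `app.monitoring`, the helper `timed`, the one-second slowness limit, and the operation name `load_policy` are all assumptions for illustration.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("app.monitoring")


@contextmanager
def timed(operation: str, slow_after_seconds: float = 1.0):
    """Log how long the wrapped block takes; log the traceback if it fails."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        # logger.exception records the message at ERROR level with the traceback.
        logger.exception("%s failed", operation)
        raise
    finally:
        # Runs on success and on failure, so the duration is always recorded.
        elapsed = time.perf_counter() - start
        level = logging.WARNING if elapsed > slow_after_seconds else logging.INFO
        logger.log(level, "%s took %.3f s", operation, elapsed)


logging.basicConfig(level=logging.INFO)
with timed("load_policy"):
    time.sleep(0.01)  # stand-in for real application work
```

The helper is specific to the operation it wraps, which is the advantage of this approach. Its cost is that a developer has to add and maintain it in every place worth measuring.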
Create a Recovery Plan
• Fast resolution
• Knowledge management
Recovery Plan Template
Service Level Agreements
• The percentage of time the services will be up (uptime)
• How many people can use the application at once without performance issues
• Performance metrics and benchmarks to be used with performance monitoring alerts
• The rules for notification announcements
• What statistics will be monitored, and when and where they will be available
• Acceptable response time
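The uptime percentage in an SLA translates directly into a downtime budget, which is often easier to reason about. The arithmetic can be sketched as follows; the 30-day period and the function names are assumptions, since an actual SLA defines its own measurement window.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted in the period at the given uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def measured_uptime_percent(downtime_minutes: float, period_days: int = 30) -> float:
    """Uptime percentage achieved, given the observed downtime in the period."""
    total_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_minutes / total_minutes)


# A 99.9% target over 30 days leaves roughly 43 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))
```

Comparing `measured_uptime_percent` against the agreed target at the end of each period gives a simple pass or fail for this SLA item.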
Service Level Agreements
Using the Statistics
• Visual display
• Alerts
• Tickets
![Page 32: Application Monitoring](https://reader036.vdocuments.site/reader036/viewer/2022062408/568139b6550346895da151ea/html5/thumbnails/32.jpg)
Visual (Dashboard)
• Easily view statistics
• Comparison results
• Trend comparison
• Cross Platform
• Auto-generated management reports
Dashboard
Alerts and Tickets
• Auto-generated alerts
• Tickets for queue system
• Vital information in each
Alerts and Tickets
• Most common: Email
• Text, popup, printout, recording and more
• Tickets: auto-generated
• Knowledge databases
• Common fixes and resolutions
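Since email is the most common alert channel, an auto-generated alert can be sketched with Python's standard `email.message.EmailMessage`. The `Alert` fields below are one guess at the "vital information" each alert should carry; the field names, addresses, and subject format are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.message import EmailMessage


@dataclass
class Alert:
    application: str
    server: str
    check: str          # which monitor fired, e.g. "database connectivity"
    detail: str         # what the monitor observed
    detected_at: datetime


def to_email(alert: Alert, sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) an email carrying the alert's key facts."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = (
        f"[ALERT] {alert.application} on {alert.server}: {alert.check}"
    )
    message.set_content(
        f"Application: {alert.application}\n"
        f"Server: {alert.server}\n"
        f"Check: {alert.check}\n"
        f"Detail: {alert.detail}\n"
        f"Detected at: {alert.detected_at.isoformat()}\n"
    )
    return message


example = Alert("billing", "app01", "database connectivity",
                "connection refused",
                datetime(2011, 10, 31, 12, 0, tzinfo=timezone.utc))
print(to_email(example, "monitor@example.com", "oncall@example.com")["Subject"])
```

The message could be delivered with `smtplib.SMTP(...).send_message(message)`, and the same `Alert` record could feed a ticket in a queue system, so the alert and the ticket carry identical details.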
Application Monitoring
• Maximize application uptime
• Higher end user satisfaction
• Higher Profit
References
• Polozoff, A. (2003, April 9). Proactive Application Monitoring. IBM - United States. Retrieved October 20, 2011, from http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html
• Choice. (2009, December 20). Application Monitoring. Adminschoice - Unix Made Easy. Retrieved October 31, 2011, from http://adminschoice.com/application-monitoring
• Application Monitoring Software - uptime software. (n.d.). Server Monitoring Software - IT Systems Management, Capacity Planning, Application and Server Monitoring Tool by uptime software. Retrieved October 31, 2011, from http://www.uptimesoftware.com/application-monitoring.php
• Marko, K. (2005, December 30). Proactive Application Monitoring. Processor.com: Data Center IT Equipment at Processor, Routers, Storage, Rackmount Servers, Computer Room Cabling and Flooring. Retrieved October 29, 2011, from http://www.processor.com/editorial/article.asp?article=articles%2Fp2752%2F43p52%2F43p52.asp
• IT Service Level Agreement Templates | ContinuityPlanTemplates. (n.d.). ContinuityPlanTemplates | Free Business Continuity Plan (BCP) Templates. Retrieved October 30, 2011, from http://www.continuityplantemplates.com/it-service-level-agreement-templates
XML
Upcoming events with Dashboard
• Ability to display visualized graphs and other pertinent information
• Ability to click a failed component and have the system auto-generate a ticket
• Ability to alert others of the issue found
• Performance monitoring as well as fault