Using SCOM for VMware Monitoring: Experts’ Tips


Post on 01-Jun-2015






Monitoring with the virtualization stack in mind; data sources and solution architectures; actionable alerts vs. alert noise; creating a dynamic user experience; reporting for trending, capacity planning and forecasting; override behavior and 4 tips for proper override tuning.


  • 1. Using SCOM for VMware monitoring: Experts’ Tips
    Cameron Fuller, Microsoft System Center MVP
    Alec King, Product Management Director at Veeam Software
    Pete Zerger, Microsoft System Center MVP

  • 2. What's in?
    - Monitoring with the virtualization stack in mind
    - Data sources and solution architectures
    - Actionable alerts vs. alert noise
    - Creating a dynamic user experience
    - Reporting for trending, capacity planning and forecasting
    - Override behavior and 4 tips for proper override tuning

  • 3. Storage monitoring
    Storage monitoring solutions can be very esoteric and expensive. A virtualization monitoring solution that points SCOM and VMware administrators to problems at this layer is a welcome addition indeed.

  • 4. When SCOM is blind
    There are aspects of the virtual infrastructure that are not visible to SCOM through its agents:
    - vSphere-specific metrics for the VM, such as balloon memory and CPU wait time;
    - Important components of vSphere, including ESX(i) hosts, clusters and vCenter Server;
    - Automation within vSphere directly affecting VMs, such as Distributed Resource Scheduler (DRS), vMotion and High Availability (HA).

  • 5. Continuous Visibility
    Without visibility of vSphere as a whole, it is impossible to monitor such critical aspects of vSphere as:
    - ESX(i) host performance
    - Physical hardware status
    - VM status and location

  • 6. Virtualization stack
    By monitoring with the virtualization stack in mind, you can effectively combine the power of SCOM and vSphere monitoring to provide a more comprehensive monitoring solution.

  • 7. Data sources and solution architectures
    You can get information on vSphere performance from different data sources, including:
    - SNMP;
    - Syslog;
    - vmcontrol-based APIs;
    - Web Service SDK.
    It is important to understand which sources expose the data needed to quickly identify and isolate common issues, and to understand the pros and cons of each approach.

  • 8. SNMP & Syslog: Pros & Cons
    SNMP
    - Pros: vSphere is SNMP-enabled, and SNMP monitoring is configurable through the System Center Operations Console UI (user interface).
    - Cons: Many VMware Infrastructure 3 (VI3), vSphere 4 and vSphere 4.1 events and performance metrics are not exposed via SNMP. VMware is no longer focusing on management via SNMP.
    Syslog
    - Pros: Syslog messages from vSphere are a rich source of hardware monitoring data, and rules that utilize Syslog can be configured using the UI wizards in the System Center Operations Console.
    - Cons: Messages tend to be cryptic, and most are of little use to most operators without time-consuming translation to a human-interpretable error message.

  • 9. vmcontrol-based APIs & Web Service SDK
    vmcontrol-based APIs (including vmPerl / vmCOM)
    COM was deprecated some time ago, so you should avoid using this altogether. While vSphere does offer an SDK for Perl, Perl is not a Windows-friendly option, and is probably the least documented in terms of samples on the Internet.
    Web Service SDK (requires vCenter Server)
    This is the richest source of vSphere performance data with the lowest impact on host and guest performance. It provides the most complete picture of infrastructure and application performance and health, with the best scalability from a vSphere perspective.

  • 10. Other methods
    There are also various other methods available for monitoring servers in the virtual infrastructure. For more practical information on these methods and the pros and cons of each of them, please refer to the full version of Veeam's featured white paper, Best Practices for Monitoring VMware with SCOM.

  • 11. Catch the problem before it grows
    The best thing to keep in mind for productive VMware monitoring is: catch the little problems before they become major issues that cause service interruptions. To put this principle into practice, you should set critical alerts appropriately.

  • 12. One Man's Trash Is Another Man's Treasure
    The goal of SCOM is to notify administrators of issues before they become a major problem. But with all the alerts, SCOM is way too noisy (at least in some environments). Noise is generated when alerts (usually sent via email) are created by SCOM for situations that are not actionable, relevant or unique.

  • 13. When SCOM is noisy
    When someone tells you SCOM is noisy, you should remember that:
    1. You can generate alerts that are only displayed in the SCOM console and are not necessarily sent by email.
    2. The SCOM management packs are designed to be used by organizations that range from relatively small to large-scale enterprise environments, and thus should be configured accordingly.
    3. SCOM requires proper tuning to fully match the requirements of an organization, and it's perfectly tunable (see slide 15 for a hint)!

  • 14. Crucial Alerts vs. Alert Noise
    Tune the SCOM alert system to provide only actionable alerting that matches the needs of your business. That way you ensure that the notifications that are sent will be acted upon and not regarded as alert noise.

  • 15. Customize your User Experience
    The SCOM console needs to be targeted to show only the information that is relevant to the specific user, including the user's servers and applications.

  • 16. Dynamic User Experience
    Using user groups is a great way to ensure that every operator is getting information relevant to his or her responsibilities. There are 2 general categories of groups in SCOM:
    - Static groups are created by adding specific entities to the group as explicit members. Static groups are useful when there are no specific criteria that define group membership.
    - Dynamic groups are created by setting criteria that define what entities are members of the group. Entities matching those criteria are automatically added to the group.

  • 17. Capacity Trends
    The SCOM Data Warehouse provides reports on trends over a significant period of time that is customizable to your business requirements.
    By integrating performance data into SCOM, we can generate trend reports on custom metrics (such as CPU Wait Time) to see if additional capacity is currently required.

  • 18. Capacity Planning & Forecasting
    In order to identify and resolve an issue before it affects the users, SCOM should be able not only to show trends in performance metrics but also to forecast those metrics based upon the history of the data. However, SCOM lacks an off-the-shelf report for forecasting and capacity planning. To make projections of future resource consumption, you need to integrate additional reporting software, such as Veeam Reporter.

  • 19. Veeam Reporting Solution
    Veeam Reporter, now available as a part of the Veeam ONE solution, provides extensive capacity planning and change management functionality.
    - It discovers, documents and analyzes your entire virtual infrastructure and maintains a complete history of all objects, settings and changes.
    - Armed with knowledge of your virtual environment's configuration and its past and current utilization, it makes recommendations for resource allocations and acquisitions.

  • 20. Overrides
    When customizing SCOM for VMware monitoring, don't forget about the overrides:
    - Overrides are used to enable, disable or modify monitoring elements in a management pack (rules, monitors, discoveries, and so on);
    - Override precedence is a "rule of the road" that governs override behavior when multiple overrides are present on a single rule, monitor or discovery.

  • 21. Override precedence: rules of the road
    Take these rules of the road into consideration to control override precedence:
    - The most specific override takes precedence.
    - An enforced override takes precedence.
    - Overrides in unsealed management packs take precedence.
    - Class overrides from contained or hosted instances take precedence over class overrides of the instance.
    - Instance overrides with the higher relative depth take precedence over those with a lower depth.

  • 22. Taking full control over Overrides
    Override precedence is just one thing. To take full control over overrides, keep the following 4 tips in mind when tuning your SCOM environment:
    1. Use classes where possible
    2. Target overrides to a group vs. an entity
    3. Use dynamic groups when possible
    4. Create relevant groups for entities

  • 23. Tip #1. Use classes where possible
    If a class already exists that meets your requirements, use the class instead of a custom group (unless you need to create the group for other reasons, such as customizing the user experience).

  • 24. Tip #2. Target overrides to a group vs. an entity
    Using groups allows these overrides to be portable, so that they can be moved between SCOM environments. This approach also minimizes the number of overrides required, by gathering similar systems and applying a single override to the group.

  • 25. Tip #3. Use dynamic groups when possible
    Static groups require maintenance to add and remove members of the group as agents are deployed or removed from the environment. Dynamic groups do not require this added maintenance.

  • 26. Tip #4. Create relevant groups for entities
    Groups can contain any type of entity, not just servers. SCOM automatically provides an alert when the default application pool or default website is not running. While removal of these websites and application pools is the recommended approach, you can instead create a group that contains the entities that match the name and then disable the monitor for that group of entities.

  • 27. Got more questions?
    For more details, download the free white paper Best Practices for Monitoring VMware with SCOM.

  • 28. Get more free content (webinars, white papers, product demos):
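
The difference between static and dynamic groups (Tip #3), and the idea of disabling a monitor for a name-matched group (Tip #4), can be illustrated with a small Python model. This is only a sketch of the concept, not SCOM's API: the classes `Entity`, `StaticGroup` and `DynamicGroup`, the function `monitored_entities`, and the sample host and entity names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Entity:
    """A monitored object (hypothetical model, not a SCOM type)."""
    host: str
    name: str
    kind: str  # e.g. "Website" or "ApplicationPool"


@dataclass
class StaticGroup:
    """Membership is an explicit list that someone has to maintain."""
    name: str
    members: FrozenSet[Entity]

    def resolve(self, inventory: List[Entity]) -> List[Entity]:
        return [e for e in inventory if e in self.members]


@dataclass
class DynamicGroup:
    """Membership is recomputed from criteria on every resolve."""
    name: str
    criteria: Callable[[Entity], bool]

    def resolve(self, inventory: List[Entity]) -> List[Entity]:
        return [e for e in inventory if self.criteria(e)]


def monitored_entities(inventory: List[Entity], disabled_for) -> List[Entity]:
    """Entities still monitored after one 'disable' override on a group."""
    excluded = set(disabled_for.resolve(inventory))
    return [e for e in inventory if e not in excluded]


if __name__ == "__main__":
    inventory = [
        Entity("web01", "Default Web Site", "Website"),
        Entity("web01", "Intranet", "Website"),
    ]
    # Criteria-based group: anything whose name starts with "Default".
    defaults = DynamicGroup("Default IIS objects",
                            lambda e: e.name.startswith("Default"))
    # A new server appears; the dynamic group needs no manual update.
    inventory.append(Entity("web02", "Default Web Site", "Website"))
    for entity in monitored_entities(inventory, defaults):
        print(entity.host, entity.name)
```

In this model a single override on the dynamic group covers every matching entity, including ones discovered later, whereas a static group would have to be edited each time a server is added or removed.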

