ibm operations analytics for z systems v2.2 - client short pres
TRANSCRIPT
© 2015 IBM Corporation
z Systems Software
IBM Operations Analytics for z Systems
Transforming Data into Insights: The Next Generation of IT Service Management
Agenda
Why IT Analytics?
Overview of IBM Operations Analytics for z Systems
– Functional capability
– What’s New in 2015
– Architecture
– Out-of-the-box Value
– Customize to meet your needs
– Integration with Service Management tooling
Additional Detail
– Bring Your Own Data – Example using HMC log
– IOAz V2.2 Details
  – CICS insights
  – Network insights
  – Security insights
  – Log Forwarder improvements
Solution Branding
Note that the distributed version of the product is now named IBM Operations Analytics – Log Analysis
This solution was previously branded as IBM SmartCloud Analytics - Log Analysis.
The support to search and analyze z/OS logs was initially provided in March 2014 under the following product names:
– IBM SmartCloud Analytics - Log Analysis z/OS - Insight Packs – SYSLOG V1.1
– IBM SmartCloud Analytics - Log Analysis z/OS - Insight Packs - IBM WebSphere® Application Server V1.1
Subsequent releases were named with the SmartCloud brand until April 2015, when Version 2 of the product was rebranded to IBM Operations Analytics for z Systems.
Initial release under the new name: IBM Operations Analytics for z Systems v2.1 (GA on April 24, 2015)
Current release: IBM Operations Analytics for z Systems v2.2 (GA on October 16, 2015)
Analytics for System z addresses rapid growth of data and next generation technology
Much greater amount of critical IT operational data (SMF, log, journal) than distributed-only environments.
– Focus on problem determination and time to resolution while placing a premium on availability of services and applications.
– 100x to 1000x explosion in data flooding existing tools.
– New runtimes and programming languages needing complex instrumentation.
By 2016, 40% of Global 2000 enterprises will have IT operations analytics architecture in place, up from < 1% in 2014, looking to integrate across their enterprise to reduce outages (Gartner).
90% of the Fortune 1000 companies are running z and have ‘Systems of Record’ dependencies for transactional processing and data serving applications.
Too long to isolate and diagnose problems in applications and infrastructure.
– Complex application workloads span multiple platforms
– Increasing amounts of IT data: performance metrics, events, infrastructure logs, application logs, configuration files, traces
Existing IT tools need additional data analysis capabilities to manage Systems of Engagement
– 100x to 1000x explosion in data flooding existing tools.
– New runtimes and programming languages needing complex instrumentation.
Reactive analytics misses critical information, leading to outages
– Need to move to a more proactive model
– Analysing ALL information is better for predicting problems.
Is managing IT today like sipping from a fire hose?
New technologies like cloud, mobile and big data are already challenging current Enterprise tools
Predict: Proactive Outage Avoidance
– Predict problems before they occur
Search & Analyze:
– Quickly search and analyze large volumes of data from a single search bar
– Perform log and performance analysis while searching
– Correlate messages from multiple logs for end-to-end problem diagnosis
Optimize:
– Improve performance across IT Infrastructure
IBM is focused on managing end-to-end analytics for improved performance and workload management
IBM Analytics solutions for z Systems
– Predict: IBM zAware (Proactive Outage Avoidance)
– Search & Analyze: IBM Operations Analytics for z Systems (Faster Problem Resolution)
– Optimize: IBM Capacity Management Analytics (CMA) (Optimized Performance)
IBM Operations Analytics for z Systems
Analyze various types of data (logs, metrics, events, trouble tickets) from multiple sources (mainframe and distributed)
Locate problems from system, configuration, software logs and performance metrics using rapid index search and pattern analysis
Isolate issues across various domains including OS, Middleware, applications, etc.
Leverage Expert Advice via links to support documentation and operations notes to resolve problems quickly
Visualize search results with analytic tools to rapidly determine root cause
Out-of-the-box analysis and insights for z/OS, WebSphere, DB2, CICS, IMS, MQ, Network, Security as well as distributed systems
Enable early error detection and broaden scope of automation with event notifications
Fully customizable to meet your needs
SEARCH
ANALYZE Launch to Support Doc
RESOLVE
INTEGRATE
Accelerate problem isolation and identification … Reduce mean time to repair
New in 2015
• Network insights
• Security insights
• Event notification
• Hadoop support
• Analysis of performance metrics (SMF real time Data Provider)
• Integration with existing Service Management tooling (Automation, Monitoring, Event and Incident Management)
• Role-based access control
• Multi-time zone support
IBM Operations Analytics Architecture and Flows
[Figure: architecture diagram. Recoverable labels are listed below.]
Platforms shown: Mainframe z/OS, z/Linux, distributed applications
Components shown: z/OS Log Forwarder, NetView Message Gatherer, SMF Real-time Data Provider, Log File Agent or Logstash, Generic Receiver, Operations Analytics Server, Annotators, Indexers, Insight Pack (z/OS), Distributed Systems Insight Packs, Search, Current/Archive Tier, Hadoop Tier, Alerts
Alert Actions shown: Index, Log, Script, SNMP, EIF
Data shown: z/OS Syslog, WAS SYSOUT, WAS SYSPRINT, CICS MSGUSR, CICS EYULOG, DB2, Joblogs, NetView Netlog, USS Log Files, SMF Data, DB2 App, Syslog, Web Access Log, Other Logs
The IBM Operations Analytics server is installed on z System (or x System) running Linux (64 bit)
z/OS Insight Packs are installed on the IBM Operations Analytics server
z/OS Log Forwarder / SMF Data Provider installed on each z/OS LPAR where you want to provide Search and Analysis
Simple search interface EASY to customize
Log data is analysed and insights are surfaced as you search
Find problems you didn’t know existed
Search specific logs or ALL logs
Timeframe
Enter search string
Save My Search
Quick Searches, Analysis, Annotations, Patterns, Expert Advice, Dashboards will populate the Navigation tree
Easy to use – Quick Search
Domain-specific ‘Quick Searches’ available out-of-the-box or create and save your own
Provided with every z/OS Insight Pack
Provided by subject matter experts, support teams and customers
Immediate value out of the box
Easy to modify or create and save your own
Dashboards, Information Links and Expert Advice
Visualize the data with Dashboards
Quick links to additional information and support documents. Provided with every Insight Pack
Expert Advice to access white papers, tech notes, APARs, etc. for faster problem resolution
Dashboard views created by subject matter experts, support teams and customers
Immediate value out of the box
Easy to modify or create and save your own
Quickly and easily access IBM Support Portal based Expert Advice from Log Analysis
Search for expert advice with the click of a button
Launch to Tech Note
All IBM support site documents that reference messages from search results
Analyze logs as you Search
Insights are surfaced automatically as you search. Patterns are surfaced based on the log type.
Provided with every Insight Pack
Logs are analysed automatically
Log data is categorized by hostname, data source, message type, message source, etc.
Patterns/Insights are surfaced to help you focus on the source of the problem. For example, log analysis automatically surfaces Java exceptions in application logs.
Perform searches and analyse multiple logs, organized per the needs of your enterprise.
Create your own Insight Pack for any text logs with time stamps
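To illustrate the kind of pattern surfacing described on this slide, here is a minimal Python sketch that counts Java exception class names in a few log lines. It does not show how the product's annotators work; the regular expression, the function name and the sample lines are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical pattern: a dotted package prefix followed by a class name
# ending in "Exception" or "Error", e.g. java.lang.NullPointerException.
JAVA_EXCEPTION = re.compile(
    r"\b(?:[a-z_][\w$]*\.)+[A-Z][\w$]*(?:Exception|Error)\b"
)

def count_java_exceptions(lines):
    """Count occurrences of each Java exception class name in the log lines."""
    counts = Counter()
    for line in lines:
        counts.update(JAVA_EXCEPTION.findall(line))
    return counts

# Invented sample lines, loosely shaped like application server output.
sample = [
    "[10/16/15 09:00:01] SystemErr R java.lang.NullPointerException",
    "[10/16/15 09:00:02] SystemOut O request completed",
    "[10/16/15 09:00:03] SystemErr R Caused by: java.sql.SQLException: timeout",
    "[10/16/15 09:00:04] SystemErr R java.lang.NullPointerException",
]

for name, n in count_java_exceptions(sample).most_common():
    print(name, n)
```

Grouping by exception class rather than by raw line is what makes a dashboard-style count possible: repeated stack traces collapse into one row per exception type.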
Sample dashboard: View your log and metric data however you like
Out-of-the-box dashboards (Example – Display message counts and Java exceptions)
OR
Build Your Own Dashboard with the click of the mouse
Visualizing the Data: Search and Analyze SMF Data (New in 2015)
Analyze your SMF data AND your log data for a complete view of the enterprise.
CPU utilization, Working Set Size, Paging & IO Rates
Perform simple free-form searches using the standard set of search keywords and operators
Build complex queries with range searches and DateMath functions
To learn more, consult Online Help available from the Learn More → Search Bar → Search query syntax menu
BYOD – Bring your own Data – The z/OS Log Forwarder can be configured to forward your text logs to enable Search, Analysis, Dashboards and Expert advice.
BYOIP – Build your own Insight Pack
BYOV – Build your own Views (Graphs, Charts and Dashboards)
Out-of-the-Box capabilities provide immediate value. Additionally, IOA can easily be tailored to your specific needs.
Create your own – Queries, Dashboards, Feeds
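The slide mentions range searches and DateMath functions without showing one. The sketch below composes such a query string in Python, assuming a Lucene/Solr-style syntax (`field:[start TO end]` ranges and `NOW-3DAYS` date math). The field name `timestamp` and the helper functions are hypothetical; the product's online help remains the reference for the actual syntax.

```python
def range_clause(field, start, end):
    """Build an inclusive range clause such as field:[start TO end]."""
    return f"{field}:[{start} TO {end}]"

def build_query(terms, ranges=()):
    """AND together free-form terms and range clauses into one query string."""
    # Quote multi-word terms so they are searched as a phrase.
    parts = [f'"{t}"' if " " in t else t for t in terms]
    parts += [range_clause(*r) for r in ranges]
    return " AND ".join(parts)

# "timestamp" is an assumed field name; NOW-3DAYS is Solr-style date math.
query = build_query(["error"], ranges=[("timestamp", "NOW-3DAYS", "NOW")])
print(query)  # error AND timestamp:[NOW-3DAYS TO NOW]
```

Building the string from parts keeps the free-form terms and the time range separate, so a saved search can swap in a different interval without touching the keywords.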
Customer Experiences
Large Insurance Company (29 hours down to 29 minutes)
Experienced an application outage that resulted in the team working around the clock for 29 hours. Multiple customers and IBM support staff pored through logs and traces to determine the root cause of the issue. After the issue was resolved, the logs were captured and sent to the IBM lab for analysis using IBM Operations Analytics for z Systems. Within minutes, the IBM team was able to focus in on the root cause of the problem and to find the relevant PTF to resolve the issue through the integrated expert advice.
State Agency (up and running in 2.5 hours)
Was able to download, install, configure and use IBM Operations Analytics for z Systems to search their logs in 2.5 hours.
Numerous Customers (improve visibility and find problems you weren’t aware of)
Errors lurk in logs that are never examined because they don’t necessarily cause SLA or performance problems. For example, IBM Operations Analytics for z Systems found:
– Over 4,000 invalid login attempts in a three-day period that had otherwise gone unnoticed
– MQ channel errors causing MQ errors in logs from distributed systems that were not being monitored
– SQL errors in multiple logs
New capabilities in 4Q, 2015
General capabilities (delivered via IBM Operations Analytics – Log Analysis and included with IOAz)
– Additional real-time alerting actions: SNMP Traps, EIF Events
– Role-based access control
– Support for multiple time zones and time intervals
– Service Desk Extension: Incident and service request analytics
z/OS capabilities (included in the z/OS Insight Pack)
– Additional CICS insights from SMF 110 and EYULOG
– Additional network insights from NetView netlog
– Security insights
– Pattern-based configuration for z/OS Log Forwarder job log data gatherer
– Additional out-of-the-box searches for DB2 and MQ
– Translation of z/OS Insight Packs (English + 10 languages) and documentation
Alerting actions: SNMP Traps, EIF Events
Benefit:
– Utilize your existing event management tooling to track, highlight, enrich, correlate and act upon conditions that IBM Operations Analytics for z Systems identifies in your operational data, through the use of SNMP Traps, Informs or EIF events.
– Broaden your scope of automation. Use NetView or other automation tools to take automatic action on any messages or other operational data as long as that data is consumed by IBM Operations Analytics for z Systems. This expands your current automation capabilities to automate on ANY data source that is fed into IBM Operations Analytics.
IOA now enables you to generate SNMP Traps and EIF Events. This is in addition to existing notifications (text, email, etc.)
Role-based access control and audit
Benefit:
– Role-based access control and auditing capabilities enable customers to maintain compliance with their data segregation and access control requirements. It is of special interest for service provider environments in which segregation of data is of particular importance.
Support for multiple time zones and time intervals
Benefit:
– All users connected to a single IOA Log Analytics server, regardless of their location, are able to view search results and graphs in their local time zone or in a different time zone of their choice. This new capability is particularly helpful for teams that are distributed across multiple time zones.
– Applications can now specify more than a single occurrence of a relative time interval. Instead of specifying “Last Day”, applications can specify “Last 3 Days” for example.
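The two benefits above can be sketched in a few lines of Python: one relative interval ("Last 3 Days") resolved once on the server, then rendered for users in different time zones. This is only an illustration of the idea, not the product's implementation; the function names are hypothetical and fixed UTC offsets stand in for named time zones.

```python
from datetime import datetime, timedelta, timezone

def relative_window(now_utc, count, unit="days"):
    """Return (start, end) in UTC for a relative interval such as 'Last 3 Days'."""
    return now_utc - timedelta(**{unit: count}), now_utc

def to_zone(ts, offset_hours):
    """Convert an aware timestamp to a fixed-offset zone (stand-in for a named zone)."""
    return ts.astimezone(timezone(timedelta(hours=offset_hours)))

now = datetime(2015, 10, 16, 12, 0, tzinfo=timezone.utc)
start, end = relative_window(now, 3)

# The same instant, displayed for two users in different locations.
print(to_zone(start, -4).isoformat())  # 2015-10-13T08:00:00-04:00
print(to_zone(start, 2).isoformat())   # 2015-10-13T14:00:00+02:00
```

Keeping the stored window in UTC and converting only for display means both users are looking at exactly the same set of records.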
Integration with Service Management Solutions
IOAz integrates with Monitoring, Automation and Event Management
Automation: NetView / SA (or other Automation tooling)
– Receive and enrich, action or forward Events from ANY log source (not just Syslog)
Event Management: Netcool Operations Insight (NOI)
– Launch to IOAz to analyze logs and metrics (IOA is included with NOI)
– Search and analyze Events
– Receive, correlate, enrich and action Events from IOAz (NOI or other Event Management System)
Incident Management: IBM Service Desk (or other incident management / trouble ticketing solutions)
– Generate Events to create Trouble Tickets
– Analyze Trouble Tickets
Monitoring: OMEGAMON
– Launch in context to IBM Operations Analytics from OMEGAMON and ITM workspaces
– OMEGAMON Insight Pack to analyze ITM logs (RKLVLOG)
Service Management Unite (included with Performance Management and Service Management Suites)
– Launch in context to analyze logs and SMF data in context of performance problem diagnosis
Using IOAz to broaden the scope of Event Management and Automation
Event processing
Generate Events from ANY log message(s) or other data in IOA
Notifications can be in the form of:
– Text message
– Email
– SNMP Trap
– EIF Event
Enhance your Visibility & Automation Capabilities
Increase scope of log monitoring and automation
Improve event correlation
IOAz can generate notifications for messages from any log in your enterprise.
Be Proactive!
Getting the most out of IOA notification capabilities
IBM Operations Analytics provides the ability to generate events based on messages, combinations of messages over time, number of occurrences, etc.
Notifications can be generated from any data source:
– Messages from Mainframe and Distributed Logs
– SMF data
– Events
– Other
Examples include:
– Send an email or text message whenever a specific message (or messages) is written to a log. For example, message IRRB069I (RACF is being shut down)
– Generate an SNMP Trap or EIF event when there are more than 500 failed logon attempts in a 30-minute period
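The second example (more than 500 failed logon attempts in 30 minutes) is a sliding-window threshold. The following is a minimal Python sketch of that logic, assuming events arrive in time order. The class and method names are hypothetical and this is not the product's alert runtime.

```python
from collections import deque
from datetime import datetime, timedelta

class ThresholdRule:
    """Fire when more than `limit` matching events occur within `window`."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._times = deque()  # timestamps of events still inside the window

    def observe(self, ts):
        """Record one matching event; return True if the threshold is exceeded."""
        self._times.append(ts)
        cutoff = ts - self.window
        # Drop events that have aged out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) > self.limit

rule = ThresholdRule(limit=500, window=timedelta(minutes=30))
t0 = datetime(2015, 10, 16, 9, 0)
# One failed logon per second: the 501st event exceeds the threshold.
fired = [rule.observe(t0 + timedelta(seconds=i)) for i in range(501)]
print(fired.index(True))  # 500
```

A deque keeps the check cheap: each event is appended once and removed once, so the cost per event stays constant even during a burst of failed logons.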
Notifications
Index alert action (i.e., send events back into IOA so they can be searched)
– You can use the index alert action template to index any triggered alerts.
Email / Text alert action
– You can use the email template to send an email when a condition is met. Emails can easily be sent as text messages by most carriers.
EIF alert action
– You can use the EIF template to send an EIF formatted event when a condition is met.
SNMP Trap alert action
– You can use the SNMP Trap template to send an SNMP Trap when a condition is met.
Script alert action
– You can use the Script template to execute a custom script when a condition is met.
Write to Log alert action
– You can use the Write to Log template to write an entry to a log file of your choice when a condition is met.
Send notifications in many forms …
Sending Events to any Event Receiver
[Figure: Data Source 1 … Data Source N feed the IOA Server (Ingestion Pipeline, Alert Runtime). Alert Actions (Index, Log, Script, SNMP, EIF) send events to Event Receivers: an Event Management System (NOI, OMNIbus or other Event Management tool), Automation (NetView/SA or other automation tool), or any Event Processor.]
IOA can generate standard SNMP Traps and/or EIF Events that can be received and processed by ANY Event Receiver.
Event Configuration is Simple
From IOAz
– Specify the message or messages to trigger the Notification
– Choose the event criteria (message IDs, number of occurrences, time period, etc.)
– Specify address of Event Receiver hostname/port or email address
From your automation tool
Create an automation statement(s) to:
– Enrich the Event
– Forward the Event
– Automate to correct the problem
– Other
From your Event Management tool
– Enrich the Event
– Correlate with other Events and Log Messages
– Automate to correct the problem
– Create Trouble Tickets
– Other
Increase the scope of automation to include ANY log message
– Most z/OS automation tools are limited to z/OS Syslog and Console messages
Correlate z/OS Events with Events from distributed systems to resolve problems end-to-end
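The three IOAz-side settings listed above (trigger messages, event criteria, receiver address) can be pictured as one small record. The Python sketch below is only a way to visualize that configuration; the class, its field names and the receiver host are hypothetical and do not reflect the product's actual configuration format.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class EventRule:
    message_ids: frozenset   # message IDs that trigger the notification
    min_occurrences: int     # number of occurrences required
    period: timedelta        # time period in which they must occur
    receiver_host: str       # hypothetical Event Receiver hostname
    receiver_port: int       # Event Receiver port

    def __post_init__(self):
        if not self.message_ids:
            raise ValueError("at least one message ID is required")
        if self.min_occurrences < 1:
            raise ValueError("min_occurrences must be >= 1")
        if not 0 < self.receiver_port < 65536:
            raise ValueError("receiver_port must be a valid port number")

    def matches(self, line):
        """True if the log line starts with one of the configured message IDs."""
        first = line.split(maxsplit=1)
        return bool(first) and first[0] in self.message_ids

rule = EventRule(
    message_ids=frozenset({"IRRB069I"}),
    min_occurrences=1,
    period=timedelta(minutes=1),
    receiver_host="receiver.example.com",  # placeholder address
    receiver_port=162,
)
print(rule.matches("IRRB069I RACF is being shut down"))  # True
```

Validating the rule when it is created catches an empty message list or a bad port before any event is ever sent to the receiver.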
Event driven automation scenarios
There are many scenarios where events can drive automation. Prior to IOAz, these scenarios were limited to events being driven from Syslog, because most z/OS automation tools only monitor the z/OS Syslog.
Since IOAz has access to many more logs than Syslog, we now have the ability to drive automation from messages coming from other logs and even other platforms.
We have included just a few examples in the subsequent slides. The possibilities are endless.
Benefit:
• The subject matter expert can now access messages from ANY log in the enterprise
• Events coming from IOA can be consumed by ANY Event receiver to automate, enrich, correlate or forward Events or generate trouble tickets
• Events can be generated in SNMP or EIF format. As a result, the events can be consumed by any Event Receiver (Event Management or Automation tool).
• Since IBM Operations Analytics for z Systems can generate events from ANY message it consumes and NetView can act as an event receiver, NetView can now automate on ANY log message (not just messages from Syslog). This scenario will work with any automation tool that can drive automation from events.
• This feature gives customers the ability to ‘TAKE ACTION’ on any messages being consumed by IBM Operations Analytics for z Systems.
Alerting actions: SNMP Traps, EIF Events
Sample scenario for MQ
WebSphere MQ channel stopped abnormally
– MQ server runs on Windows with an MQ channel defined to MQ running on z/OS.
– MQ server detects that the MQ channel to z/OS is not active and writes error messages to the Windows MQ AMQError log. Subsequent MQ communications fail.
Without IBM Operations Analytics for z Systems:
The ‘Channel down’ message is never proactively observed and the support team(s) struggle for hours to debug the problem and finally re-initiate the Channel.
With IBM Operations Analytics for z Systems:
– IBM Operations Analytics for z Systems detects the problem through MQ error messages written to the Windows MQ AMQError log.
– IBM Operations Analytics for z Systems generates an SNMP Trap (or EIF event) and forwards it to NetView (or other automation solution).
– Automation is driven from this event and resolves the problem by issuing a command to restart the MQ channel.
Outage avoided with IOAz!
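The detect-then-restart flow in this scenario can be sketched in Python as follows. The sketch assumes the error log reports an abnormally ended channel with message AMQ9999 and shows the channel name in single quotes. The log line layout and the channel name are invented for illustration, and the generated MQSC `START CHANNEL` command is only printed here, whereas the automation tool in the scenario would actually issue it.

```python
import re

# Assumption: AMQ9999 marks an abnormally ended channel, and the channel
# name follows the word "Channel" in single quotes on the same line.
CHANNEL_DOWN = re.compile(r"AMQ9999\w?:.*?Channel '(?P<name>[^']+)'")

def restart_commands(log_lines):
    """Yield one MQSC START CHANNEL command per distinct channel reported down."""
    seen = set()
    for line in log_lines:
        m = CHANNEL_DOWN.search(line)
        if m and m.group("name") not in seen:
            seen.add(m.group("name"))
            yield f"START CHANNEL({m.group('name')})"

# Invented sample lines; WIN.TO.ZOS is a hypothetical channel name.
sample = [
    "AMQ9999: Channel 'WIN.TO.ZOS' to host 'zos1(1414)' ended abnormally.",
    "AMQ9002: Channel 'OTHER.CHL' is starting.",
    "AMQ9999: Channel 'WIN.TO.ZOS' to host 'zos1(1414)' ended abnormally.",
]

for cmd in restart_commands(sample):
    print(cmd)  # START CHANNEL(WIN.TO.ZOS)
```

Deduplicating by channel name matters because the same failure is typically logged repeatedly while communications keep failing, and a single restart command is enough.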
Alerting actions: SNMP Traps, EIF Events
Issue reported immediately with IOAz. Maintenance backed out. Problem avoided!
Sample scenario for DB2 DDF
DB2 DDF applications timed out
– DB2 runs on z/OS; IBM Operations Analytics for z Systems collects the DB2MSTR address space log.
– The customer applies bulk maintenance for z/OS and DB2 over the weekend. After application of maintenance, DB2 DDF applications experience time-outs.
Without IBM Operations Analytics for z Systems:
Because maintenance occurs on a Saturday, operators do not catch the resulting problem until later. The DBA is notified on Saturday evening, a PMR is opened against IBM DB2, and diagnostics are started with the DB2 and TCP/IP L2 teams. By Monday morning, none of the agents can run transactions. DB2 and z/OS maintenance have to be backed out.
With IBM Operations Analytics for z Systems:
IBM Operations Analytics is able to detect the time-out problem immediately after the maintenance is applied. Operators are notified immediately and are able to determine the root cause of the issue. End users do not experience downtime when they come into work on Monday morning.
Search and analyze events, logs and metrics using IOA and Netcool Operations Insight.
– Easily identify ‘related’ Events that may be candidates for suppression
– Identify “difficult to spot” seasonal events that often result in regular periodic problems
– Easily identify which events occur in clusters
– Leverage visualizations that help you quickly isolate more severe and significant problems.
Also provides opportunities for event reduction, thus improving operational efficiency.
Log Analysis and Event Management in Netcool Operations Insight (IOA is included in the box with NOI)
Log Analysis – Streamline Incident Management
Incident Management
The traditional incident management process usually begins with one or more trouble tickets being opened for an incident (for example, slow response time for a specific application).
The first step is to engage the application support team and associated Subject Matter Experts for each of the application components (WebSphere, CICS, DB2, etc.). Each SME examines data from their specific subsystem, and we usually experience a phenomenon commonly referred to as ‘ticket hopping’. During the ticket hopping phase, the trouble ticket will be reassigned multiple times before it lands on the correct SME’s lap. Over the lifetime of the incident, there is very little collaboration with respect to data and there’s usually a fair amount of ‘finger pointing’.
In the post mortem session, we usually conclude that the ‘time to resolution’ is very high and so is the number of people involved in the process of diagnosing the problem.
With IBM Operations Analytics for z Systems:
IBM Operations Analytics will provide a unified view of the data, enabling the application support team to quickly focus on the problem component. The ability to search and analyze the data helps to quickly identify the problem area and the expert advice feature assists in finding the solution or workaround.
If an SME is needed for a specific component, you can transfer the ticket to that SME with the data that was surfaced by IOAz. Post mortem reveals that time to resolution is significantly decreased, by as much as 50%, with less involvement by the SME community.
To be more proactive and improve mean time to recovery even more, the team can incorporate the use of IOA notifications to immediately notify (Text, Email, SNMP Trap or EIF Event) that a problem is occurring. Early detection will significantly decrease time to resolution and automation can be triggered to resolve the issue before the problem affects the end user.
Integration with Performance Monitoring
Launch IOA from OMEGAMON performance monitoring workspaces to search logs in context
OMEGAMON + IBM Operations Analytics – Launch in Context from TEP
The One-Two Punch: Combine two very powerful tools to ensure performance and high availability of your enterprise.
Perform log analysis in context of OMEGAMON workspaces
– This approach enables OMEGAMON users to perform in-context log analysis while doing problem determination
– From your OMEGAMON workspace, use the IOA search bar to search logs (using LPAR or Sysplex as the default context)
– Easy to implement: Configure TEP to display the IOA search bar
Search and Analyze Operational Data in Context
Select a row first. In this example, a row specifies a Queue Manager.
Specify a search string and timeframe to analyze operational data from the appropriate system(s)
Analysis of Operational Data
IBM Operations Analytics analyzes log, metric and event data and surfaces insights
– Built on industry expertise
– Expert Advice for faster time to resolution
– Expand analysis to include additional data sources (from mainframe and distributed systems)
Launch into IBM Operations Analytics to analyze logs and other operational data to gain additional perspective and insights and help diagnose root cause.
Integration with existing Service Management solutions (in a nutshell)
Powerful tools integrate to ensure performance and high availability of your enterprise.
IBM Operations Analytics
• Search and analyze logs, metrics, events and incident reports
• Launch from ITM, OMEGAMON, Service Management Unite & NOI
• Alert, enrich, correlate and automate
Integrated tools:
– IBM zAware: Surface anomalies
– Service Management Unite
– Automation & Problem Determination: NetView
– Performance Monitoring: ITM/OMEGAMON
– Event Management: OMNIbus/NOI
– Incident Management: Control Desk
Send us your logs!
Request a product demo using logs from your own test, development or production environments
– IBM will load your logs into an IBM Operations Analytics server, then demo the results back to you
  • A secure, dedicated drop box will be assigned to you
  • You will be sent detailed upload instructions via email
  • Any file uploaded will be automatically moved to a dedicated IBM Operations Analytics environment within 24 hours
  • All log data will be purged from the IBM Operations Analytics environment within 48 hours after the demo event
To request your hosted demo, visit: http://services-useast.skytap.com:18280/WebDemo/
Or Take IOAz for a Test Drive
A guided demo is provided online at: http://ibm.biz/ioazlivedemo
Announcing the IBM Operations Analytics for z Systems Early Access and Beta Program!
In 2015, we built on the strong foundation established over recent months as we develop and implement our product roadmap.
We are looking for customers and business partners worldwide who would like to help influence our roadmap and test new capabilities. The program is open-ended; interested participants may join at any time and stay on as long as they wish. That said, it is our desire to establish a set of “customer sponsor” relationships that will become instrumental in shaping the future of our offering.
To see the full program announcement, and to learn how to sign up, please visit us in our developerWorks community at:
https://ibm.biz/BdEkZV
IOA for z Systems Early Access and Beta Program
Additional IBM Operations Analytics Reference Material
Analytics Overview Video
– https://www.youtube.com/watch?v=OQJapWiQECs
IOA for z Systems videos:
– Overview: https://goo.gl/XsmGtq
– Domain Insights: https://goo.gl/ktXYpL
– Installation and Configuration: https://goo.gl/3QLm2N
IOA for z Systems Documentation
– Knowledge Center: http://www.ibm.com/support/knowledgecenter/SS55JD
IOA – Log Analysis (server) Documentation
– Service Management Connect: http://www.ibm.com/developerworks/servicemanagement/ioa/log/index.html
– Knowledge Center: http://www.ibm.com/support/knowledgecenter/SSPFMY