© 2012 IBM Corporation
Tivoli Workload Automation: Planner functionality and recovery actions
TWS Distributed – 8.2.x
[Diagram: Command Line, Job Scheduling Console, 8.2 TWS Master, 8.2 TWS Domain Managers, 8.2 TWS FTA, Jobs]
Limitations:
• TMF dependency for JSC connection
• No open APIs for external applications
• No LDAP support
• Scheduling DB schema cannot be exported and accessed by external applications
• Difficult DB recovery procedure
TWS Distributed – 8.6
[Diagram: CLI, WebUI server (TDWC 8.6), HTTP Load Balancer, DB2 in HA, 8.6 TWS Master + Backup Master, 8.6 TWS Domain Managers + Backup DM, 8.6 FTAs, Jobs]
Production Plan Generation
A new script JnextPlan (that replaces the old Jnextday script) creates the production plan.
The following command produces a production plan that starts today at 00:00 (start of day) and ends at 23:59.
JnextPlan
The script allows the creation of a production plan covering multiple days or a few hours. Running the following command results in a production plan that starts today at 00:00 and finishes tomorrow at 23:59.
JnextPlan -for 4800
JnextPlan can also be run with a zero-minute extension. This updates static information such as Workstations, Windows users, and Calendars, and removes completed Job Streams (Carry Forward must be set to ALL), without adding new job stream instances.
JnextPlan -for 0000
Production Plan Generation
JnextPlan syntax:
JnextPlan [-from mm/dd/[yy]yy [hhmm [tz | timezone tzname]]] [-to mm/dd/[yy]yy [hhmm [tz | timezone tzname]]] | [-for [h]hhmm] [-days n]
-from sets the start time of the new production plan. The date format is specified in the localopts file; hhmm gives the hours and minutes, and tz is the time zone. The “start of day” option can be updated with a new command-line tool called “optman”, which replaces the globalopts file.
-to sets the end time of the new plan. The date format is the same as for -from. If -to is not specified, the default is the date and time specified in -from plus 23 hours and 59 minutes.
-for is the plan extension expressed as a length of time, in the format hhhmm, where hhh are the hours and mm are the minutes. If it is not specified, the default is 24 hours.
-days is the plan extension expressed in days.
Default values maintain backward compatibility. Use “optman ls” to show the default values (including the “start of day” time) stored in the database. In this release the globalopts file no longer exists.
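The arithmetic behind -for and -days can be illustrated with a small Python sketch. It is not part of TWS: the function names are hypothetical, it assumes that -for and -days simply add together, and it reports the plan end as the last minute covered, matching the 23:59 convention used in the examples on these slides.

```python
from datetime import datetime, timedelta


def parse_for(value: str) -> timedelta:
    """Parse a -for argument in [h]hhmm form, e.g. '4800' means 48 hours."""
    if not value.isdigit() or not 4 <= len(value) <= 5:
        raise ValueError(f"expected [h]hhmm, got {value!r}")
    hours, minutes = int(value[:-2]), int(value[-2:])
    if minutes > 59:
        raise ValueError(f"minutes out of range in {value!r}")
    return timedelta(hours=hours, minutes=minutes)


def plan_window(start: datetime, for_arg: str = "2400", days: int = 0):
    """Return (start, last minute covered) for a plan that begins at `start`.

    Assumption: -for and -days add together. A zero-length extension
    covers no new time, so the end is reported as None.
    """
    length = parse_for(for_arg) + timedelta(days=days)
    if not length:
        return start, None
    return start, start + length - timedelta(minutes=1)


start_of_day = datetime(2012, 3, 5, 0, 0)
print(plan_window(start_of_day))          # default 24-hour plan
print(plan_window(start_of_day, "4800"))  # 48-hour plan
```

With a start of day of 00:00, the default call ends at 23:59 on the same day and the "4800" call ends at 23:59 on the following day, as described for the two JnextPlan examples.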
Jnextday and JnextPlan
TWS 8.2.x (Jnextday):
schedulr
compiler
reptr -pre Symnew
conman "continue & link @!@;noask"
conman "continue & stop @!@;wait;noask"
wmaeutil.cmd ALL -stop
stageman
conman "continue & start"
reptr -post …/schedlog/M$DATE
rep8 -F …. -i …/schedlog/M$DATE
logman
TWS 8.6.0 (JnextPlan):
MakePlan
– planman (Preproduction plan generation, Symnew creation)
– reptr -pre Symnew
– conman "continue & link @!@;noask"
SwitchPlan
– conman "continue & stop @!@;wait;noask"
– stageman
– planman confirm
– conman "continue & start"
CreatePostReports
– reptr -post …/schedlog/M$DATE
– rep8 -F …. -i …/schedlog/M$DATE
UpdateStats
– logman
StartAppServer
Final Schedule & JnextPlan
The Final Job Stream is now made up of five different jobs:
■ StartAppServer: checks that WAS is running and starts it if not.
■ MakePlan: creates Symnew and makes the pre-production report.
■ SwitchPlan: stops the TWS agents, runs stageman to merge the old Symphony and Symnew, confirms the switch of the plan to the planner, then starts the Master and the Symphony distribution.
■ CreatePostReports: creates the post-production reports.
■ UpdateStats: runs logman to update the Pre-Production plan, job history, and statistics. In FINAL it runs in parallel to CreatePostReports.
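The ordering of the five jobs can be sketched as a small dependency graph in Python. The slide only states that UpdateStats runs in parallel to CreatePostReports; the other "follows" relationships below (StartAppServer first, then MakePlan, then SwitchPlan) are an assumption based on the order in which the jobs are described, not a copy of the actual FINAL definition.

```python
from graphlib import TopologicalSorter

# job -> set of jobs it follows (assumed ordering, see the note above)
FINAL_DEPENDENCIES = {
    "STARTAPPSERVER": set(),
    "MAKEPLAN": {"STARTAPPSERVER"},
    "SWITCHPLAN": {"MAKEPLAN"},
    "CREATEPOSTREPORTS": {"SWITCHPLAN"},
    "UPDATESTATS": {"SWITCHPLAN"},
}


def execution_waves(dependencies):
    """Group jobs into waves; jobs in the same wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for wave in execution_waves(FINAL_DEPENDENCIES):
    print(" + ".join(wave))
```

Under these assumptions the last wave contains both CREATEPOSTREPORTS and UPDATESTATS, which is the parallel step mentioned on the slide.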
Production Plan status
planman showinfo: retrieves information about the production plan status.
Production Plan Extension
JnextPlan and Production Plan extension
■ When the production plan already exists and JnextPlan is run, the production plan is extended (by default for 24 hours).
■ After the extension, the new production plan contains the new instances for the extension period and all the job stream instances not yet completed, which are carried forward.
Note: in TWS 8.6 the enCarryForward keyword specifies the “carry forward” property. This keyword is stored in the database and its default value is ALL. When data is migrated from a previous version, the value is copied from the previous configuration.
enCarryForward keyword
■ all: ignores whether the Carry Forward key is enabled in the job stream definitions, and carries forward all uncompleted job streams.
■ yes: carries forward only those uncompleted job streams that have the Carry Forward key enabled.
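The effect of the two values described above can be illustrated with a short Python sketch. The JobStreamInstance type and the function name are hypothetical and exist only for this illustration; only the "all" and "yes" values covered on the slide are handled.

```python
from dataclasses import dataclass


@dataclass
class JobStreamInstance:
    name: str
    completed: bool
    carry_forward: bool  # Carry Forward key in the job stream definition


def carried_forward(instances, en_carry_forward="all"):
    """Return the names of the instances kept when the plan is extended."""
    setting = en_carry_forward.lower()
    if setting not in ("all", "yes"):
        raise ValueError(f"value not covered by this sketch: {en_carry_forward!r}")
    return [
        inst.name
        for inst in instances
        if not inst.completed and (setting == "all" or inst.carry_forward)
    ]


plan = [
    JobStreamInstance("PAYROLL", completed=True, carry_forward=True),
    JobStreamInstance("BACKUP", completed=False, carry_forward=False),
    JobStreamInstance("BILLING", completed=False, carry_forward=True),
]
print(carried_forward(plan, "all"))
print(carried_forward(plan, "yes"))
```

With "all", both uncompleted job streams are kept; with "yes", only the one whose Carry Forward key is enabled is kept. Completed job streams are dropped in both cases.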
Planning considerations
Job Streams are not copied into the Current Plan:
■ If they don’t have a Run Cycle
■ If the Run Cycle doesn’t result in a “run” day in the planning period
Ad-Hoc submission is allowed:
■ If the Job Stream is defined in the database
■ If the Job Stream is not draft and is valid
■ The ON request flag is no longer mandatory (it was already ignored in previous releases)
Planning considerations (continued)
DRAFT / ACTIVE definition
■ DRAFT: defined in the database; not used for Production Plan
■ ACTIVE: defined in the database and used for Production Plan
Sample usage scenario: JobStreamA must not run tomorrow
– Set JobStreamA to DRAFT
– Extend Current Plan
– JobStreamA is not included in the Production Plan
VALID FROM definition
■ A JobStream can have multiple versions
■ A VALID FROM date specification can differentiate JobStream versions
Sample usage scenario: JobStreamA is JobA -> JobB -> JobC; JobD must be added to the workflow and needs to be ready to run in 2 days, when the new application goes live in production
– Modify JobStreamA, insert the new definition (JobA … JobC -> JobD), and add the VALID FROM date specification
– Extend the Plan
– JobStreamA will have 2 versions in the plan, in accordance with the dates
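The DRAFT and VALID FROM scenarios above can be sketched in Python as a version lookup. The JobStreamVersion type and version_for function are hypothetical, and the selection rule (the non-draft version with the latest VALID FROM date that is not after the run date) is an assumption made for illustration, not a statement of how the planner is implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class JobStreamVersion:
    name: str
    jobs: List[str]
    valid_from: Optional[date] = None  # None: valid from the beginning
    draft: bool = False


def version_for(versions, run_date):
    """Pick the version of a job stream that applies on run_date, if any."""
    candidates = [
        v for v in versions
        if not v.draft and (v.valid_from is None or v.valid_from <= run_date)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v.valid_from or date.min)


versions = [
    JobStreamVersion("JobStreamA", ["JobA", "JobB", "JobC"]),
    JobStreamVersion("JobStreamA", ["JobA", "JobB", "JobC", "JobD"],
                     valid_from=date(2012, 3, 7)),
]
print(version_for(versions, date(2012, 3, 6)).jobs)  # before go-live
print(version_for(versions, date(2012, 3, 7)).jobs)  # from go-live on
```

A plan that spans both dates would therefore contain both versions of JobStreamA, each on its own days, and a version marked as draft is never selected.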
Planning considerations (continued)
Production Plan
■ By default starts at 00:00 but can be modified
■ By default covers 24 hours (normal workdays)
■ A longer or shorter period can be specified
■ Can span a few days (during weekends or holidays) or more
■ Once created the first time, it is always extended
■ The extension can vary from 1 minute to several days
Symphony Size
■ Same size and structure as the expanded Symphony of 8.2.x
■ Longer plans will produce larger Symphony files
■ At least 512 bytes (1 Symphony record) for each Job Stream Instance and for each Job Instance
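The 512-byte rule gives a quick lower bound on the Symphony size. The Python sketch below is only that arithmetic; the function name and the instance counts are made up for the example, and the real file is larger because it also holds other records.

```python
SYMPHONY_RECORD_BYTES = 512  # one Symphony record, per the slide


def min_symphony_bytes(job_stream_instances: int, job_instances: int) -> int:
    """Lower bound: one record per job stream instance and per job instance."""
    return SYMPHONY_RECORD_BYTES * (job_stream_instances + job_instances)


# Hypothetical plan: 1,000 job stream instances and 10,000 job instances.
size = min_symphony_bytes(1_000, 10_000)
print(f"{size} bytes, about {size / 1024 / 1024:.1f} MiB")
```

Doubling the plan length roughly doubles the instance counts, and with them this lower bound.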
Symphony with more than 24hrs
[Diagram: Database (the collection of all defined scheduling objects: Jobs, Job Streams, NT Users, Prompts, Workstations, Resources, Calendars); Pre Production (LTP) (Job Stream instances and external dependencies for several days); JnextPlan; Plan, “Symphony” (the scheduled workload for one or more production days)]
■ The Symphony file contains the objects needed for the production plan period: Workstations, Calendars, Job Streams, Jobs, Dependencies
■ JnextPlan runs on the Master Scheduler as part of the production plan: it extends the Production Plan and creates a new Symphony file
■ The Pre Production Plan contains job stream instances calculated in advance for several days, and external dependencies resolved on those instances according to matching criteria
Production Plan Extension
[Diagram: Current Plan extension. Elements: Old Symphony; Database (Jobs, Calendars, Job Streams, Resources, Workstations, …); Pre Production Plan (10 days; today, tomorrow); Symnew; New Symphony. Steps: remove completed job streams; add detail for next plan period]
StartAppServer
Checks that WAS is running and starts it if not.
In case of failure:
Rerun the job
MakePlan
Replans or Extends Pre-Production plan if needed.
Produces the Symnew file.
Generates Pre-Production reports in the joblog.
In case of failure:
■ The global lock may be left set; use planman unlock to reset it.
■ Rerun the job to recover:
– The Pre-Production plan is automatically re-verified and updated.
– Symnew is recreated.
MakePlan
How to stop it:
■ Stopping the job may not stop processing that is still running inside WAS or on the DB.
■ Force the DB statement closure if a DB statement is running too long, causing MakePlan to abend.
■ A WAS restart is required if processing is still running in WAS and MakePlan does not terminate.
Best Practice:
Check whether database statistics are enabled. If not, it is strongly suggested to schedule the runstatistics script stored in the dbtools TWS directory.
MakePlan – Error messages
If MakePlan stdlist shows the following messages:
AWSBEH023E Unable to establish communication with the server on host "127.0.0.1" using port "31116".
This error means that the application server (eWAS) is down and MakePlan cannot continue. Start eWAS and check the eWAS logs to identify why it stopped.
AWSBEH021E The user "twsuser" is not authorized to access the server on host "127.0.0.1" using port "31116".
This is an authorization error. Check the twsuser credentials in the useropts file.
AWSJPL018E The database is already locked.
This means that a previous MakePlan run was stopped and the global lock was not reset. To recover, run “planman unlock”.
MakePlan – Error messages
If MakePlan stdlist shows the following messages:
AWSJPL006E An internal error has occurred. A database object "xxxx" cannot be loaded from the database.
In general “xxxx” is an object such as a workstation, job, or job stream. This error means that the connection with the database is broken. Check SystemOut.log and the ffdc directory, where additional information about the database issue is logged.
AWSJPL017E The production plan cannot be created because a previous action on the production plan did not complete successfully. See the message help for more details.
This error means that a previous operation on the preproduction plan was performed but finished with an error. In general it appears when “ResetPlan -scratch” was run but did not finish successfully.
AWSJPL704E An internal error has occurred. The planner is unable to extend the preproduction plan
This error means that MakePlan is not able to extend the preproduction plan. Several root causes are possible, generally related to the database, such as no space left in the tablespace or full transaction logs. Check SystemOut.log or the ffdc directory for more information.
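The MakePlan messages and suggestions above lend themselves to a simple lookup table. The Python sketch below is a hypothetical helper, not a TWS tool: it scans the text of a stdlist for the message IDs listed on these slides and returns the matching suggestion, paraphrased from the slides.

```python
import re

# Suggestions paraphrased from the slides, keyed by message ID.
RECOVERY_HINTS = {
    "AWSBEH023E": "Start eWAS and check its logs to find out why it stopped.",
    "AWSBEH021E": "Check the twsuser credentials in the useropts file.",
    "AWSJPL018E": "Run 'planman unlock' to reset the global lock.",
    "AWSJPL006E": "Check SystemOut.log and the ffdc directory for the database error.",
    "AWSJPL017E": "A previous plan operation did not finish; see the message help.",
    "AWSJPL704E": "Check SystemOut.log or the ffdc directory (tablespace, transaction logs).",
}

MESSAGE_ID = re.compile(r"AWS[A-Z]{3}\d{3}E")


def recovery_hints(stdlist_text: str):
    """Return (message ID, hint) pairs in order of first appearance."""
    hints = {}
    for message_id in MESSAGE_ID.findall(stdlist_text):
        if message_id in RECOVERY_HINTS:
            hints.setdefault(message_id, RECOVERY_HINTS[message_id])
    return list(hints.items())


sample = "AWSJPL018E The database is already locked."
for message_id, hint in recovery_hints(sample):
    print(message_id, "->", hint)
```

Message IDs that are not in the table are ignored, so the helper stays silent rather than guessing at a recovery action.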
SwitchPlan
■ Stops all the CPUs
■ Runs stageman
– To merge the old Symphony file with Symnew
– To archive the old Symphony file in the schedlog directory
■ Runs planman confirm to update plan status information in the DB (e.g. plan end date and current run number)
■ Restarts the master to distribute the Symphony file and restart scheduling
In case of failure:
1) Planman confirm has not been run yet (check logs and “planman showinfo”): rerun SwitchPlan
2) Planman confirm has failed: manually run “planman confirm” and “conman start”
3) Planman confirm has already been run (e.g. plan end date has been updated): run “conman start”
How to stop it:
If conman stop is hanging, just kill the conman command. This may impact plan distribution, which will need to stop the agents left running before distributing the new Symphony.
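The three failure cases above can be written as a small decision function. This Python sketch is illustrative only: the function and its two boolean inputs are hypothetical, and the operator is assumed to have already determined them from the logs and from “planman showinfo”.

```python
def switchplan_recovery(confirm_was_run: bool, confirm_succeeded: bool = False):
    """Return the recovery actions for a failed SwitchPlan, in order."""
    if not confirm_was_run:
        # Case 1: planman confirm has not been run yet.
        return ["rerun SwitchPlan"]
    if not confirm_succeeded:
        # Case 2: planman confirm was run but failed.
        return ["planman confirm", "conman start"]
    # Case 3: planman confirm already completed (plan end date updated).
    return ["conman start"]


print(switchplan_recovery(confirm_was_run=False))
print(switchplan_recovery(confirm_was_run=True, confirm_succeeded=False))
print(switchplan_recovery(confirm_was_run=True, confirm_succeeded=True))
```

The function only returns the actions as text; running them is left to the operator.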
SwitchPlan – Error messages
If SwitchPlan stdlist shows the following messages:
■ STAGEMAN:AWSBHV082E The previous Symphony file and the Symnew file have the same run number. They cannot be merged to form the new Symphony file.
There are several possible causes for the Symphony and Symnew run numbers to be the same:
1. MAKEPLAN did not extend the run number in the Symnew file.
2. SWITCHPLAN was executed before MAKEPLAN.
3. The stageman process has been run twice on the same Symnew file without resetting the plan or deleting the Symphony file.
■ AWSJCL054E The command "CONFIRM" has failed.
■ AWSJPL016E An internal error has occurred. A global option "confirm run number" cannot be set
In general, these error messages appear when the last step of SwitchPlan, “planman confirm”, fails. Check SystemOut.log for more information and rerun “planman confirm”.
UpdateStats
Runs logman to update job statistics and history
Extends the Pre-production plan if its length is shorter than minLen
In case of failure:
■ Rerun the job or manually run “logman <file>” on the latest schedlog file.
■ If not run, the statistics and history will be partial. Pre-Production plan is updated anyway at the beginning of Makeplan.
How to stop it:
■ Kill the job or the logman process. The statistics and history will be partial until the job or logman is rerun.
CreatePostReports
Generates Post-Production reports in the job output
In case of failure:
■ Rerun the job if reports are needed
Recovery Plan Procedure: Symphony Corruption
Follow these steps on the master domain manager:
1) Set the job limit to 0, using conman or the Tivoli Dynamic Workload Console. This prevents all jobs from starting.
2) logman -prod: updates the Pre-Production Plan.
3) planman showinfo: retrieves the start time of the first non-completed job stream instance and the end time of the production plan.
4) ResetPlan: archives the current Symphony file.
5) JnextPlan -from -to: creates a new Symphony file for the period in which there are still outstanding jobs. Only incomplete job stream instances are included in the new Symphony file.
6) Set the job limit to the previous value. The Symphony file is distributed and the production cycle starts again.
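The steps above can be sketched as a Python helper that prints an ordered checklist with the JnextPlan time range filled in. It is a hypothetical planning aid, not a TWS tool: it runs nothing, the two timestamps are assumed to be read by the operator from the “planman showinfo” output, and the mm/dd/yyyy hhmm format is an assumption, since the real date format comes from the localopts file.

```python
from datetime import datetime


def tws_timestamp(moment: datetime) -> str:
    """Format a timestamp as 'mm/dd/yyyy hhmm' (assumed date format)."""
    return moment.strftime("%m/%d/%Y %H%M")


def recovery_steps(first_incomplete_start: datetime, plan_end: datetime):
    """Return the recovery checklist; the manual steps are plain descriptions."""
    return [
        "Set the job limit to 0 (conman or Tivoli Dynamic Workload Console)",
        "logman -prod",
        "planman showinfo",
        "ResetPlan",
        f"JnextPlan -from {tws_timestamp(first_incomplete_start)} "
        f"-to {tws_timestamp(plan_end)}",
        "Set the job limit back to its previous value",
    ]


# Hypothetical values read from the planman showinfo output.
steps = recovery_steps(datetime(2012, 3, 5, 6, 0), datetime(2012, 3, 6, 23, 59))
for number, step in enumerate(steps, start=1):
    print(f"{number}) {step}")
```

Keeping the checklist as plain text means the operator still reviews and runs each command by hand, which suits a recovery situation.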