etl operations manual web viewload progress is monitored in datastage director. the status view...

<<Application Name>> Operations Manual <<Month Day, YYYY>>

Upload: vudang

Post on 30-Jan-2018

Page 1: ETL Operations Manual Web viewLoad progress is monitored in Datastage Director. The Status view lists all jobs in the selected repository folder, with summary information for each

<<Application Name>>Operations Manual

<<Month Day, YYYY>>

Document Revision History

Date Version Author Level of Review Summary of Comments

Contents

1 INTRODUCTION

1.1 Purpose

1.2 Scope

1.3 Roles and Responsibilities

1.4 Document Maintenance

1.5 References

2 APPLICATION OVERVIEW

2.1 Key Features

2.2 Application Technologies

2.3 System Interfaces

2.4 Application Server File Locations and Paths

2.5 Application File Structure

3 PREREQUISITES

3.1 Collecting Input Files

4 PROCESS EXECUTION

4.1 Scheduler

5 MODIFYING PARAMETER

5.1.1 Datastage Administrator

5.1.2 Datastage Sequences

5.1.3 How to Modify Variables in Datastage Sequences

5.1.4 Datastage Jobs

5.1.5 How to Modify Project Variables in Datastage Jobs

6 MONITORING PROGRESS

7 POST-LOAD ACTIVITIES

8 VERIFYING RESULTS

8.1 Verifying Results in DataStage Director

8.2 Verifying Results in Application Audit Reports

8.3 Verifying Results in RDBMS

9 COMMON ISSUES

9.1 Recovering from an Aborted Job

10 APPENDIX A: SFTP SERVERS

Index of Tables

Table 1: Roles and Responsibilities

Table 2: References

Table 3: Application Technologies

Table 4: Application Interfaces

Table 5: Application File Structure

Table 6: SFTP servers

1 Introduction

1.1 Purpose

The purpose of this Operations document is to provide operations guidance for the <<System Name>> for post implementation sustainment of this system. This document is not intended to address design and configuration details provided in system Functional Design (FD) or Technical Design (TD) documents and/or system build/configuration documents, which are provided separately.

1.2 Scope

This document describes the <<system name>> load processes and provides essential information for the operation of the process by the operations team. This document is not intended to substitute for a Standard Operating Procedure (SOP) in its scope or level of detail.

1.3 Roles and Responsibilities

The roles that support the operations of the application, and the specific responsibilities of each role, are outlined below:

Roles Responsibilities

Business Owner

As a main stakeholder for the application, the Business Owner has ultimate responsibility to keep the application operational during normal working hours. The Business Owner will manage the ongoing operations and sustainment of the application and will provide direction to the operations teams as specified by the roles contained in this table. The Business Owner will also coordinate with source system owners and business users to implement any necessary application and/or scheduling changes during sustainment operations.

Database Administrator (DBA)

The DBA will be responsible for providing sustainment support for the application database, including data repositories and ETL Server Metadata repositories. The DBA will support the operations team in maintenance of the physical databases and will perform database modifications, storage, and backup and restore operations.

Operations Team

Facilitate direct contacts between the Source System Owners' technical leads and the Operations Team to support the management and collection of data.

Review data load statuses and jobs.

Coordinate and schedule on-demand processes, including loading manual data files (if applicable) provided by the business functional team.

Monitor rejected records and errors received from source systems, and correct the data errors with the business functional team.

Source System Owners

Provide contacts and contact information (name, phone number, and email address) for the persons responsible for data extraction and delivery, and report ongoing changes to those contacts.

Automate data extraction and transport of data files to a designated secure server within the application environment.

Manage exceptions to the scheduled delivery of data to the application.

Designate and provide contact information for technical leads.

Provide notification of security incident(s) when detected, so that immediate action can be taken to determine the impact of any potential breach of data.

Network/System Administrator (SA)

The System Administrator will support sustainment activities for the physical infrastructure and networking components of the application, including physical servers, network security and connectivity within internal system components and with external systems, firewall settings, storage devices, etc.

Intranet/Internet Team

The Intranet Support Staff will be responsible for the intranet or web-based components of the application. The Intranet/Internet Team will be responsible for maintenance of the intranet/internet application server, web pages or portal, and user account security and access control of the intranet/web server.

Business Users

Business Users will consume the reports and information generated by the application. Business Users will provide feedback on the application report functionality, recommend changes, and specify new system requirements as necessary.

Table 1: Roles and Responsibilities

1.4 Document Maintenance

This is a living document, which may need to be updated throughout the application life cycle whenever changes implement, alter, and/or retire processes.

1.5 References

This section lists the applicable application assets and other documents closely related to or referenced by this document:

Supporting Documentation

Repository/Path/URL

Description

Application Administrator’s Handbook

This handbook provides for the storage of important environment information, frequently asked questions (FAQs), standards, and other useful guidance for project teams and ETL Administrators.

Application Design Documents

Functional Design (FD) and Technical Design (TD) documents, and other documents detailing the design characteristics of the application.

Application Developer’s Handbook

This handbook provides for the storage of important environment information, frequently asked questions (FAQs), coding standards, and other useful guidance for project teams and ETL developers.

Application Security Matrix

This is a secured matrix of system identifiers (SystemIDs) and passwords for the application and associated interface partner systems.

Interface Control Documents

Documentation defining the interface contract details, such as schedules, procedures, formats, data volumes, data growth rates, etc.

Software Requirements Specification (SRS)

A software requirements specification (SRS) is a comprehensive description of the intended purpose and environment for the application. The SRS fully describes what the application will do and how it will be expected to perform.

Table 2: References

2 Application Overview

2.1 Key Features

The application provides the following essential functionalities:

1. Data Integration and Organization

The application collects and stores data from source systems, organizes and optimizes the data to meet business intelligence requirements, and enriches the data to provide actionable information and insight to support business operations.

2. Business Intelligence

The application provides business analytics and reporting capabilities to support daily business operations.

2.2 Application Technologies

Table 3, shown below, provides an overview of the technologies of which the application is comprised.

Technology Type Technology Description

Hardware (server platform)

Platform (Windows, Linux, etc.)

Scheduling Software

Relational Database Management System (RDBMS)

Integration Application

Business Intelligence/Reporting Application

Table 3: Application Technologies

2.3 System Interfaces

Application interfaces are interactions that cross independent application boundaries to exchange information. These interfaces are defined below:

Interface Direction Interface Type Description

<<Inbound or Outbound>>

<<SOA, Data Replication, ETL/ELT, FTP/SFTP, XML>>

<<Interface name, Description of information being exchanged>>

Table 4: Application Interfaces

2.4 Application Server File Locations and Paths

The major application file paths are outlined below:

Application File Location Name File Path

DataStage Engine Directory \IBM\InformationServer\Server\DSEngine

DataStage Project Folder \IBM\InformationServer\Server\Projects

Project Working Files (e.g. Surrogate keys) \IBM\InformationServer\Server\Projects\<<ProjectName>>

2.5 Application File Structure

The application requires a file folder structure for storing and processing files from source systems, as well as for ETL process logs and errors, to store processed and failed files, and other temporary file storage needs. The application file structure below describes application working file structures.

Main Folder: \<<ApplicationFolderPath>>

Sub-Folders:

\JobControl

\Logs

\Errors

\Scripts

\SourceFiles

\SQLFiles

\Work

Table 5: Application File Structure
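
As an illustration only, the working folders listed in Table 5 could be checked, and created where missing, with a short script. The following is a minimal sketch in Python and is not part of the application: the function name `ensure_app_folders` is hypothetical, the root path you pass in stands for <<ApplicationFolderPath>>, and treating the sub-folders as siblings directly under the main folder is an assumption.

```python
from pathlib import Path

# Sub-folder names taken from Table 5; treating them as siblings directly
# under the main folder is an assumption.
SUB_FOLDERS = [
    "JobControl",
    "Logs",
    "Errors",
    "Scripts",
    "SourceFiles",
    "SQLFiles",
    "Work",
]


def ensure_app_folders(root):
    """Create any missing sub-folders under root and return the names created."""
    root_path = Path(root)
    created = []
    for name in SUB_FOLDERS:
        folder = root_path / name
        if not folder.is_dir():
            folder.mkdir(parents=True)
            created.append(name)
    return created
```

Running the function a second time against the same root returns an empty list, which makes it usable as a quick pre-load check.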

3 Prerequisites

3.1 Collecting Input Files

<Information about collecting/placing input files, whether automated or manual.>

4 Process Execution

The process is

4.1 Scheduler

As enterprises expand the sharing of data across disparate platforms, reliable file transfer and batch process orchestration become indispensable. The application uses a scheduler system to initiate the process, schedule jobs, and control the relationships and dependencies between jobs. Depending on the scheduler system chosen, the dependency rules may be defined in the execution strategy and enforced at runtime by the scheduling system, by the Datastage control sequences, or by both.
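
To make the idea of job dependencies concrete, the sketch below shows how a set of dependency rules determines a valid execution order. It is a minimal illustration in Python using the standard `graphlib` module (Python 3.9 or later), not a description of how any particular scheduler works; the job names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical job names; each key lists the jobs it must wait for.
DEPENDENCIES = {
    "load_staging": {"collect_input_files"},
    "load_dimensions": {"load_staging"},
    "load_facts": {"load_dimensions"},
    "refresh_reports": {"load_facts"},
}


def execution_order(dependencies):
    """Return one job order in which every job follows its predecessors."""
    return list(TopologicalSorter(dependencies).static_order())
```

If the rules contain a circular dependency, `static_order()` raises `graphlib.CycleError`, which is the kind of configuration mistake worth catching before a schedule is changed.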

This application uses the <<Scheduler Tool Name>> process scheduler. To update the scheduler process:

<<Place instructions here >> <<Place instructions here >>

If a change to the Datastage Control Sequences is required (other than parameter value updates), please see the applicable Technical Design (TD) document and the Developer’s Handbook, and follow the established Change Management (CM) process.

5 Modifying Parameter

Parameters may occasionally need to be modified to support changes in the environment and/or security. The parameters that may need to be changed occur in the following general areas of the application:

5.1.1 Datastage Administrator

It is strongly recommended that you save a backup copy of the DSParams file and review the Administrator’s Handbook before attempting any changes in the Datastage Administrator.
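
One way to take the backup copy is a small script that writes a timestamped copy to a separate folder. The following is a minimal sketch in Python under stated assumptions: the function name `backup_file` is hypothetical, and the location of the DSParams file on your server is not given in this manual and should be confirmed against the Administrator’s Handbook.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source, backup_dir):
    """Copy source into backup_dir under a timestamped name and return the new path."""
    source_path = Path(source)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    target = target_dir / f"{source_path.name}.{stamp}.bak"
    shutil.copy2(source_path, target)  # copy2 also preserves file timestamps
    return target
```

Call the function with the full path to the DSParams file and a backup folder of your choice. Keeping the timestamp in the file name means earlier backups are not overwritten.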

5.1.1.1 How to Modify Project Variables in Datastage:

1. Open Datastage Administrator

2. Select the “Projects” Tab

3. Select the <<ProjectName>> project

<<Add Screenshot Here>>

4. Select the “Properties” button

5. Select the “Environment” button

<<Add Screenshot Here>>

6. In the “Categories” menu select the appropriate category containing the parameter to be modified

<<Add Screenshot Here>>

7. Select the “Details” row by double-clicking the “Value”

8. Make the desired change and click away from the row.

9. Click the “OK” button

10. Click the “OK” button

11. Click the “Close” button

5.1.2 Datastage Sequences

It is strongly recommended that you save an exported backup copy of the sequence(s) and review the Developer’s Handbook before attempting any changes in the Datastage Design Client.

5.1.3 How to Modify Variables in Datastage Sequences

To modify Datastage Sequence variables:

1. Open the Datastage Design Client

2. Navigate to the sequence to be modified

<<Add Screenshot Here>>

3. Select the sequence to be changed, right-click, and select “EDIT”

<<Add Screenshot Here>>

4. Open the “Job Properties”

<<Add Screenshot Here>>

5. Select the “Parameters” tab

6. Select the “Default Values” column for the variable to be modified

<<Add Screenshot Here>>

<<Add Screenshot Here>>

7. Make the desired change and click “OK”

<<Add Screenshot Here>>

8. Save the job and Compile.

9. Close job

5.1.4 Datastage Jobs

It is strongly recommended that you save an exported backup copy of the job(s) and review the Developer’s Handbook before attempting any changes in the Datastage Design Client.

5.1.5 How to Modify Project Variables in Datastage Jobs

To modify Datastage job variables:

1. Open the Datastage Design Client

2. Navigate to the job to be modified

<<Add Screenshot Here>>

3. Select the job to be changed, right-click, and select “EDIT”

<<Add Screenshot Here>>

4. Open the “Job Properties”

<<Add Screenshot Here>>

5. Select the “Parameters” tab

6. Select the “Default Values” column for the variable to be modified

<<Add Screenshot Here>>

7. Make the desired change and click “OK”

<<Add Screenshot Here>>

8. Save the job and Compile.

9. Close job

6 Monitoring Progress

Load progress is monitored in Datastage Director. The Status view lists all jobs in the selected repository folder, with summary information for each job. The “Started” and “On date” columns reflect the time and date (respectively) that a job/sequence was last started.

<<Director Example Status Screenshot here>>

Figure 1: Director, status view

To view the message log for a job or sequence, highlight the job or sequence and click the Log button (yellow notebook). Message severity is denoted in the Type column, as well as by the icon to the left of the message.

<<Director Example Log Screenshot here>>

Figure 2: Director, Log view

A successful load should produce no warnings or errors. To determine whether any data was rejected, see the section on identifying and handling rejects.
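
Where the DataStage command-line client is available, the status and log information described above can usually also be read with the `dsjob` command. The following is a minimal sketch in Python, assuming that `dsjob` is installed and on the PATH, and that its `-jobinfo` and `-logsum` options behave in your installed version as described in the product documentation (please confirm before relying on them). The function names are hypothetical, and the project and job names you pass in are placeholders.

```python
import subprocess


def jobinfo_command(project, job):
    """Build the dsjob call that reports a job's current status."""
    return ["dsjob", "-jobinfo", project, job]


def warning_log_command(project, job):
    """Build the dsjob call that summarises warning entries in a job's log."""
    return ["dsjob", "-logsum", "-type", "WARNING", project, job]


def run_command(command):
    """Run a command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout
```

Building the command separately from running it makes it easy to review exactly what will be executed on the ETL server before it is run.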

7 Post-Load Activities

<Description of post-load activities>

8 Verifying Results

8.1 Verifying Results in DataStage Director

A successful load should end with all jobs in a “Finished” status, with no errors or warnings. Examine the logs for each job in Director to determine whether any rejects were produced (rejects will not be logged as errors or warnings).
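
The success rule above can be expressed as a simple check. The following is a minimal sketch in Python, not an interface to Director: the `JobResult` record and both function names are hypothetical, and the values are assumed to have been transcribed or exported from the Director Status and Log views.

```python
from dataclasses import dataclass


@dataclass
class JobResult:
    """Hypothetical record of one job's outcome, e.g. transcribed from Director."""
    name: str
    status: str
    warnings: int = 0
    errors: int = 0


def load_succeeded(results):
    """True only if every job finished with no warnings and no errors."""
    return all(
        r.status == "Finished" and r.warnings == 0 and r.errors == 0
        for r in results
    )


def jobs_needing_review(results):
    """Names of jobs that break the success rule, in their original order."""
    return [r.name for r in results if not load_succeeded([r])]
```

This check does not detect rejects, because rejects are not logged as errors or warnings; the job logs still need to be examined for them.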

8.2 Verifying Results in Application Audit Reports

<<Instructions on how to verify execution results using the application audit reports, If Applicable>>.

8.3 Verifying Results in RDBMS

<<Instructions on how to verify execution results using the database, If Applicable>>.

9 Common Issues

9.1 Recovering from an Aborted Job

<Instructions on recovery from an aborted run.>

10 Appendix A: SFTP Servers

To connect to the SFTP servers, use a file transfer program, such as Filezilla, that can connect to an SFTP (Secure FTP) server.

Server name/address Port Source systems

Table 6: SFTP servers