
Page 1: Data Links - IBM Redbooks · Data Links: Managing Files Using DB2 December 2001 International Technical Support Organization SG24-6280-00

ibm.com/redbooks

Data Links
Managing Files Using DB2

Rodolphe Michel
Amit Arora
Kevin Crooks
Aman Lalla
David Shields

Understand the Data Links architecture, unleashed for the first time

Explore planning, migration, the Reconcile utility, and recovery

Learn about HSM and HACMP support

Front cover


Data Links: Managing Files Using DB2

December 2001

International Technical Support Organization

SG24-6280-00


© Copyright International Business Machines Corporation 2001. All rights reserved.
Note to U.S Government Users – Documentation related to restricted rights – Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

First Edition (December 2001)

This edition applies to IBM DB2 Universal Database EE V7 and Data Links V7.

Comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099

When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you.

Take Note! Before using this information and the product it supports, be sure to read the general information in “Special notices” on page 343.


Contents

Figures
Tables

Preface
  The team that wrote this redbook
  Special notice
  IBM trademarks
  Comments welcome

Chapter 1. Introduction
  1.1 Why Data Links
  1.2 Data Links overview
    1.2.1 Data Links File Manager (DLFM)
    1.2.2 Data Links File System Filter (DLFF)
    1.2.3 The DATALINK data type
  1.3 Applications that use Data Links
    1.3.1 Link Integrity+
    1.3.2 VPM with DB2 Data Links

Chapter 2. Technical architecture
  2.1 Overview of the Data Links architecture
    2.1.1 Data Links server
    2.1.2 DB2 Universal Database server
    2.1.3 DB2 client
  2.2 DATALINK data type
    2.2.1 Attributes of DATALINK type
    2.2.2 Scalar functions for DATALINK data type
    2.2.3 DATALINK options
  2.3 Security/authentication
    2.3.1 Concept of tokenized file names
    2.3.2 Database configuration parameters
    2.3.3 How access tokens work
  2.4 Data Links on UNIX and Windows
    2.4.1 Data Links File Manager (DLFM)
    2.4.2 Data Links File System Filter (DLFF)
    2.4.3 Linking and unlinking files
    2.4.4 Transaction support
  2.5 Data Links on DCE-DFS


    2.5.1 Data Links File Manager (DLFM)
    2.5.2 Data Manager Application (DMAPP)
    2.5.3 Data Links File System Cache Manager (DLFS-CM)

Chapter 3. Application development
  3.1 Choosing suitable applications for using Data Links
  3.2 Transactional semantics for files in the application
  3.3 Data Links versus LOBs
    3.3.1 Using LOBs
    3.3.2 Using Data Links
  3.4 Application development tasks
    3.4.1 Application deployment considerations
    3.4.2 Checking whether Data Links has been enabled
    3.4.3 Choosing DATALINK options
    3.4.4 Changing DATALINK options
    3.4.5 Querying DATALINK options
  3.5 Coding considerations
    3.5.1 Host variable declaration
    3.5.2 Creating and linking a new file
    3.5.3 Reading a linked file
    3.5.4 Updating a linked file
    3.5.5 Unlinking a file
    3.5.6 Scalar functions used with the DATALINK data type
    3.5.7 Error handling
  3.6 Using multiple file servers
    3.6.1 Supporting multiple links to the same file
  3.7 Migrating existing applications to use Data Links
    3.7.1 Migrating an application that uses files
    3.7.2 Migrating an application that uses LOBs

Chapter 4. Planning Data Links deployment
  4.1 Deployment options
    4.1.1 Single server implementation
    4.1.2 Single Universal Database and multiple DLFMs
    4.1.3 Multiple Universal Databases and single DLFM
    4.1.4 Multiple DLFMs on a single host
    4.1.5 Multiple DB2s and multiple DLFMs
  4.2 File systems and sizing
    4.2.1 The DLFM backup (archive directory)
    4.2.2 Data Links controlled file systems
    4.2.3 Using NFS and NIS
  4.3 Planning the backup of the DLFM_DB database
  4.4 Performance tuning tips


    4.4.1 Optimum logging levels
    4.4.2 Location of file servers
    4.4.3 Number of files per directory
    4.4.4 Token algorithms
    4.4.5 DLFM backup, home, and log directories

Chapter 5. Data Links Manager administration
  5.1 Identifying the tables and servers in Data Links
  5.2 Checking for Data Links control over a file system
  5.3 Other useful DLFM commands

Chapter 6. Using Tivoli Storage Manager
  6.1 Introduction to Tivoli Storage Manager
    6.1.1 Storage device concepts
    6.1.2 Policy concepts
    6.1.3 Security concepts
    6.1.4 Communication methods
  6.2 Data Links with the Backup-Archive Client
  6.3 Data Links and Tivoli Space Manager
    6.3.1 Overview of Tivoli Space Manager
    6.3.2 Tools, processes, and interfaces
    6.3.3 Data Links support for HSM
    6.3.4 Current restrictions

Chapter 7. High Availability support on AIX
  7.1 Introduction
  7.2 HACMP cluster configuration for hot standby
    7.2.1 Hot standby setup for a host DB2 server
    7.2.2 Hot standby setup for a Data Links server
  7.3 HACMP cluster configuration for mutual takeover
    7.3.1 Configuration
    7.3.2 Sequence of events
  7.4 The scripts
    7.4.1 Additional considerations for DB2 Universal Database Version 6
    7.4.2 Final considerations

Chapter 8. Creating a new database
  8.1 Overview
  8.2 Backup
  8.3 EXPORT (dlfm_export)
  8.4 The db2look command
  8.5 The restore command
  8.6 Copying the linked files
  8.7 DLFM commands


  8.8 Running the Import utility
  8.9 Running the Load utility

Chapter 9. Data replication
  9.1 Overview of DB2 replication
  9.2 Why replicate linked files
  9.3 Supported platforms
  9.4 Replication components
    9.4.1 Change-capture
    9.4.2 Apply
    9.4.3 Subscription sets and subscription set members
  9.5 Data Links replication
    9.5.1 Capturing DATALINK values
    9.5.2 How Apply handles DATALINK values
  9.6 Implementing replication with Data Links
    9.6.1 Before we begin
    9.6.2 Defining the replication source
    9.6.3 Defining the subscription set and subscription set member
    9.6.4 Configuring the source database
    9.6.5 Binding the Capture and Apply programs
    9.6.6 Creating the password file for the Apply program
    9.6.7 Configuration files used by ASNDLCOPY
    9.6.8 Configuration files used by ASNDLCOPYD
    9.6.9 Starting and stopping the Capture and Apply programs

Chapter 10. The Reconcile utility
  10.1 Overview
  10.2 When to run the Reconcile utility
  10.3 Situations that require the Reconcile utility
    10.3.1 Reconcile algorithm

Chapter 11. Recovery
  11.1 Overview
    11.1.1 Crash recovery
    11.1.2 Version or full database recovery
    11.1.3 Restore and rollforward recovery
  11.2 DLFM backup considerations
    11.2.1 Environment backup considerations
  11.3 DLFM restore considerations
  11.4 Recovery history file
    11.4.1 Events recorded in the history file
    11.4.2 Data recorded in the history file
  11.5 Restoring an offline backup without rollforward
  11.6 Restoring and rolling forward to a point in time


  11.7 Tablespace recovery
  11.8 Recovering the dlfm_db to a point in time

Chapter 12. Garbage collection
  12.1 Garbage collection
  12.2 Garbage collection scenario

Chapter 13. Migrating to DB2 UDB Version 7
  13.1 Migration options
    13.1.1 DB2IMIGR and MIGRATE database commands
    13.1.2 Migrating the DB2 UDB V6.x database server
    13.1.3 Migrating databases using an offline backup

Chapter 14. Moving a Data Links file system to a new disk
  14.1 Migrating a DLFS-enabled file system (AIX)
  14.2 Migrating a DLFS-enabled file system (Solaris)

Chapter 15. Replacing or upgrading a machine
  15.1 Replacing or upgrading a DB2 machine
    15.1.1 Assumption
    15.1.2 Steps to perform
  15.2 Replacing or upgrading a DLFM machine
    15.2.1 Steps to perform

Chapter 16. Problem determination
  16.1 Solving problems
    16.1.1 Problem solving process
    16.1.2 Information needed to analyze a problem
    16.1.3 DB2 Universal Database or DLFM ‘hang’ situations
    16.1.4 DB2 Universal Database or DLFM crash
    16.1.5 The DB2 Trace
  16.2 Solutions to common problems
    16.2.1 Available resources
    16.2.2 DLFM server problems
    16.2.3 DB2 server problems
    16.2.4 File system problems
    16.2.5 Frequently Asked Questions (FAQs)

Appendix A. BNF specifications for DATALINK

Appendix B. Overview of DCE-DFS on AIX
  Distributed Computing Environment (DCE)
  Distributed File Service (DFS)


Appendix C. VPM and Data Links
  Installation overview
    Installing DB2 Data Links Manager 6.1 GA
    Preliminary installation steps
    Data Links post-installation
  Making Data Links work with VPM
    VPM and Data Link tokens
    Adapting VPM to work with Data Links
    Writing a model
  Additional information

Appendix D. Logging priorities for DLFF and DLFSCM
  Modifying the DLFF logging priorities on AIX
  Modifying the DLFSCM logging priorities in DCE-DFS (on AIX)
  Modifying the DLFF logging priorities on Solaris
  Modifying the DLFF logging level on Windows

Related publications
  IBM Redbooks
    Other resources
  Referenced Web sites
  How to get IBM Redbooks
    IBM Redbooks collections

Special notices

Index


Figures

1-1 Architecture of the Data Links technology
2-1 Data Links overview in UNIX and Windows environments
2-2 Data Links overview in a DCE-DFS environment
2-3 DATALINK data type
2-4 Retrieving the Data Link value
2-5 Accessing Data Linked files through a browser
2-6 DATALINK column definition syntax
2-7 Relationship between DB2 servers and Data Links servers
2-8 DLFM process model: DB2 server
2-9 DLFM process model: Data Links Manager
2-10 DLFM process model: Complete picture
2-11 Attributes before the link operation
2-12 Attributes after the link operation
2-13 Overview of Data Links implementation
2-14 Link-file operation
2-15 Control flow of SQL insert statement
2-16 Unlink process
2-17 Commit processing transactions
2-18 DLFMs in a single DCE cell
2-19 The DMAPP implementation
2-20 Data Links architecture on DCE-DFS
3-1 DATALINK access token
3-2 DATALINK options stored in SYSCOLPROPERTIES table
3-3 Using multiple DLFM file servers
3-4 Externalizing LOB data
3-5 Moving LOB table data to DATALINK table
4-1 Single server implementation
4-2 Single UDB and one to many DLFMs
4-3 Multiple UDBs and a single DLFM
4-4 Multiple DB2 and multiple DLFMs
5-1 Select from sysibm.syscolproperties
5-2 List databases and Data Links Managers
5-3 The dlfs file systems
6-1 Storage management
6-2 Policy concepts
6-3 Tivoli Space Manager overview
6-4 Data Links and Tivoli Space Manager
6-5 Selective Migration of READ PERMISSION DB file


6-6 dostatfs.c
6-7 VFS numbers of DLFS and FSM
6-8 Result of dostatfs on /dlfsfsmtest
6-9 dsmls utility behavior
7-1 Host DB2 (or) Data Links File Manager cluster
7-2 Mutual takeover environment
7-3 The /var/db2 files show the global variables and instances
7-4 The dlfs_cfg file must exist on both servers
7-5 The contents of /etc/vfs
7-6 List of dlfm_ programs
8-1 The steps used to create the new database
8-2 Backup database command
8-3 Quiesce and export to the IXF file type
8-4 Contents of the export control file
8-5 Sample dlfm_export
8-6 Export using delimited output
8-7 Delimited file before and after editing
8-8 The db2look command and the output it produced
8-9 Restore command, get dbm cfg, and list datalinks managers
8-10 Sample dlfm_import
8-11 The dlfm add_db and dlfm add_prefix commands
8-12 Import delimited file with DATALINK column type
8-13 The Load utility
9-1 Change Capture
9-2 Defining a replication source
9-3 Apply program data flow
9-4 Subscription set and subscription set members
9-5 DATALINK values before and after replication
9-6 File reference mapping
9-7 SOURCE.MANAGERS table
9-8 SOURCE.MANAGERS table contents
9-9 Environment before replication
9-10 Defining a replication source
9-11 Selecting columns to be replicated
9-12 Saving the replication source definition
9-13 SQL to define the replication source
9-14 Defining the replication source by running an SQL file
9-15 Viewing the replication source
9-16 Defining the replication subscription
9-17 Define replication subscription dialog
9-18 Changing the target table name
9-19 Selecting the primary key for the target
9-20 Restricting replicated rows


9-21 Subscription timing
9-22 Saving the replication subscription
10-1 Reconcile warning when DLFM server is not available
10-2 Extract of a lock snapshot for a table being reconciled
10-3 Output of the db2_recon_aid utility with the CHECK option
10-4 Extract of db2diag.log showing a table in DRP state
10-5 Extract of a DB2DART report showing a table in DRP state
10-6 Determining when to run the Reconcile utility
11-1 Two-phase commit
11-2 Version or full database recovery
11-3 Rollforward recovery
11-4 Asynchronous archive request
11-5 Processing that takes place during a backup
11-6 Environment backup considerations
11-7 Processing that takes place during a restore
11-8 Restore with the WITHOUT DATALINK option
11-9 Restore without specifying the WITHOUT DATALINK option
11-10 Selecting results prior to insert and restore
11-11 The ls results of the Data Link file system prior to insert
11-12 Inserting and selecting after a new link
11-13 List files after the link operation has completed
11-14 Restore command and files that were unlinked
11-15 Restore of an offline backup
11-16 List history to find backup and point in time
11-17 Restore with rolling forward and rollforward pending status
11-18 Rollforward to obtain minimum CUT time
11-19 Rollforward and log messages
11-20 Select statement with warning message
11-21 Reconcile command and log messages
11-22 Restore and rollforward to a point-in-time
11-23 Removing dlfm_backup files and removing a Data Linked file
11-24 Tablespace restore and rollforward
11-25 Using db2dart to see the table status of DRP
11-26 Selecting the data before reconcile is run
11-27 Reconcile and the exceptions
11-28 The ddl to create the exception table for reconcile
11-29 Information from the exception table for the reconcile
11-30 Selecting the data after reconcile has run
11-31 Tablespace recovery scenario
11-32 Restore command and dlfm stop
11-33 Rollforward and messages
11-34 The list registered databases output
11-35 The db2_recon_aid utility and output


11-36 DLFM_DB database point-in-time recovery
12-1 Expired database backups
12-2 Four database backups are taken
12-3 Active database backup being restored
12-4 Database backups taken with a new log sequence number
12-5 Backup (BK1) is marked as expired
12-6 New log sequence created after restore of backup (BK6)
12-7 Garbage collection marks backup BK2 as expired
12-8 All backups prior to and including BK5 are marked as expired
12-9 Inactive databases may become active because they are retained
13-1 DB2DART utility output reporting no errors
13-2 Verifying that the database can be migrated with the db2ckmig utility
13-3 Instance migration using the db2imigr utility
13-4 Connecting to a database that requires migration
13-5 Successful migration of the database using the migrate command
13-6 Verifying that the database can be migrated with the db2ckmig utility
13-7 Instance migration using the db2imigr utility
13-8 Successful migration of the DLFM instance
13-9 Output of the db2set command
13-10 DB2DART utility output reporting no errors
13-11 Stopping DB2 Services on Windows NT
13-12 Verifying that the database can be migrated with the db2ckmig utility
13-13 Extract of a recovery history file
13-14 Restoring into an existing database
13-15 Rollforward completing with a warning
16-1 Extract of an entry written to the db2diag.log file
16-2 Information about each component in the db2diag.log file
16-3 Extract of a trap file
16-4 Extract of a trace entry in the formatted trace file
16-5 Information about each component in a formatted trace file
16-6 An SQL1036 error message when connecting to the database
16-7 Extract of the DB2DIAG.LOG with the SQL1036 error message
16-8 Output of the DB2 Trace format command
16-9 Function flow structure
16-10 Extract of the trace flow file
16-11 Extract of trace flow showing the SQL1036 error
16-12 Trace format file
B-1 DCE architecture
B-2 CDS entry in DNS format
C-1 DB2 V6 Fixpak 5
C-2 Interpreting DL_FEATURES values
C-3 Creating a model in VPM
C-4 Creating and saving a model


C-5 Confirm Write
C-6 Saved model in VPM
C-7 Read-Only file
C-8 Opening a model in CATIA
C-9 A model in CATIA
C-10 File under Data Links control now
C-11 Backup directory
C-12 Files backed up under the Backup directory


Tables

2-1 Arguments to the SQLBuildDataLink function
2-2 Possible combinations of DATALINK attributes
2-3 DLFM results and corresponding actions by DLFF
3-1 DATALINK options
3-2 Host language variable declaration for DATALINKS data type
4-1 Parameters that can affect the size of the archive directory
B-1 Some commonly used terms in DCE-DFS environment
C-1 Creating your file systems


Preface

The amount of data that is stored digitally is growing rapidly because computer systems and storage systems have become very affordable. The file paradigm is very common for such data types as video, image, text, graphics, and engineering drawings because capture, edit, and delivery tools use the file paradigm for these data types. A large number of applications store, retrieve, and manipulate data in files. Many of these applications need search capabilities to find the data in the files. These search capabilities, however, do not require physically bringing the data into the database system, because their raw content is not needed for the query.

Typically, you would extract features of an image or a video and store them in the database for performing a search on the extracted features. The applications combine the search capabilities of SQL with the advantages of working directly with files to manipulate the raw data. In general, the approach involves the ability to store a reference to such files, along with parametric data that describes their contents.

Data Links is a new feature of DB2 Universal Database (UDB) that extends the management umbrella of the relational database management system (RDBMS) to data stored in external operating system files, as if the data were stored directly in the database. Data Links provides several levels of control over external data such as referential integrity, access control, coordinated backup and recovery, and transaction consistency.

This IBM Redbook provides you with sufficient information to effectively deploy Data Links in a complex environment. First it describes the technical architecture of Data Links, developing applications in a Data Links environment, and planning a deployment of Data Links. Then, it covers administering a Data Links environment, setting up Tivoli Storage Manager as a backup server with Data Links, and implementing high-availability cluster multiprocessing (HACMP) with Data Links. It includes a full chapter on data replication and, in particular, the replication of Data Linked files. It then describes the Reconcile utility and how the DB2 backup and recovery mechanism supports Data Links. This redbook concludes by providing some hints and tips for problem determination in a Data Links environment.


This IBM Redbook is intended to be read by anyone who requires both introductory and detailed information on Data Links. Prior to reading this redbook, you should have a good understanding of DB2 Universal Database, and in particular, be familiar with data replication, database backup, and recovery concepts.

The team that wrote this redbook

This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization (ITSO), San Jose Center.

Rodolphe Michel is a Senior Data Management Specialist for DB2 UDB on UNIX and Windows NT at the ITSO, San Jose Center, where he conducts projects on all areas of DB2 UDB. He writes extensively and teaches IBM classes and workshops worldwide on all areas of DB2 Universal Database.

Amit Arora is a Sr. Software Engineer in IBM India Software Labs. He has two years of experience as a developer in the Data Links Project. He holds a Bachelor of Engineering (Honors) degree in Computer Science from REC Durgapur, India. His areas of expertise include UNIX internals and Data Links technology.

Kevin Crooks is a Database Administrator for the Boeing Company in Seattle, Washington (USA). He has 12 years of experience on DB2 for OS/390 and four years of expertise in the DB2 Universal Database field. He has worked at Boeing for 15 years. His areas of expertise include Data Links and DB2 UDB on AIX. He is also an IBM certified DB2 UDB database administrator (DBA).

Aman Lalla is a DB2 UDB Engine Support Specialist at the IBM Toronto Laboratory in Canada. He has five years of experience with DB2 on the UNIX and Intel platforms. His areas of expertise include database recovery and problem determination. He has two years of Data Links experience. Prior to joining the IBM Toronto Lab, he was part of IBM Global Services South Africa providing on-site DB2 Common Server/UDB customer support.

David Shields is a DB2 Database Administrator for the Boeing Company in Seattle, Washington (USA). He has worked with DB2 for five years, including two years on OS/390 and three years on AIX. He provides database support to the Boeing engineering communities in Seattle and St. Louis, Missouri. He also worked as an IMS DBA for nine years prior to working with DB2.


Thanks to the following people for their contributions to this project:

Nagraj Alur
Karen Brannon
Vitthal Gogate
Joshua W Hui
Inderpal Narang (Inventor of the Data Links technology)
Ajay Sood
Mahadevan Subramanian
Parag Tijare
IBM Almaden Research Center, San Jose, USA

Poorna Ambati
Frank Butt
Steven Elliot (Manager of the DB2 UDB Data Links Development)
Graziela Kunde
Bomma Shashidhar
Mohan V Singamshetty
S R Sreejith
IBM Silicon Valley Lab, San Jose, USA

Suparna Bhattacharya
Amit Das
IBM Software Labs, Bangalore, India

Brian Baker and Amr Roushdi, of IBM Dassault Systèmes International Competency Center (IDSICC), Paris, France, who gave us permission to reproduce their report “Installing & Configuring VPM with DB2 Data Links” in Appendix C, “VPM and Data Links” on page 307.

Special notice

This publication is intended to help database developers, database administrators, and system administrators to deploy a Data Links environment. The information in this publication is not intended as the specification of any programming interfaces that are provided by DB2 Universal Database or Data Links. See the PUBLICATIONS section of the IBM Programming Announcement for the above products for more information about what publications are considered to be product documentation.


IBM trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States and/or other countries:

e (logo)®
AFS®
AIX®
AS/400®
DataPropagator™
DB2®
DB2 Universal Database™
DFS™
DPI®
DRDA®
IBM®
IBM.COM™
Informix™
MVS™
Redbooks Logo
OS/2®
OS/390®
Perform™
Redbooks™
RETAIN®
S/390®
SP™
Tivoli®
TME®
Lotus®
Lotus Notes®
Notes®
Domino™

Comments welcome

Your comments are important to us!

We want our IBM Redbooks to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:

• Use the online Contact us review redbook form found at:

ibm.com/redbooks

• Send your comments in an Internet note to:

[email protected]

• Mail your comments to the address on page ii.


Chapter 1. Introduction

DB2 is the IBM family of relational database management system (RDBMS) products, with DB2 Universal Database (UDB) being the company's flagship for the implementation of object-relational extensibility. Data Links is a new feature of DB2 UDB that extends the management umbrella of the RDBMS to data stored in external operating system files, as if the data were stored directly in the database. DB2 Data Links is available in the following environments:

• Journaled File System (JFS) on IBM AIX
• File System Migrator (FSM) on IBM AIX
• Distributed File Service (DFS) in Transarc’s Distributed Computing Environment (DCE) on IBM AIX
• UNIX File System (UFS) on SUN Solaris
• NTFS-formatted drive on Windows NT
• Integrated File System (IFS) on IBM eServer iSeries (AS/400)

Data Links provides several levels of control over external data such as referential integrity, access control, coordinated backup and recovery, and transaction consistency.

Referential integrity is supported with Data Links to ensure that users cannot delete or rename any external file as long as it is referenced in the database. Access control is enhanced because DB2 permissions are used to grant or deny a user the ability to read a referenced external file; read access control is optional. With coordinated backup and recovery, the DBMS is responsible for backup and recovery of external data in synchronization with the associated database; this type of control over external data is also optional. Transaction consistency requires that changes that affect both the database and an external file be executed within a transactional context to preserve the logical integrity and consistency of the data.

1.1 Why Data Links

The amount of data stored digitally is growing rapidly because computer systems and storage systems have become very affordable. The file paradigm is very common for such data types as video, image, text, graphics, and engineering drawings because capture, edit, and delivery tools use the file paradigm for these data types. A large number of applications store, retrieve, and manipulate data in files.

These applications may use files to store their data for one or more of the following reasons:

• Cost

You should consider the expense required to rewrite applications that use standard file I/O semantics so that they use a database as a repository. Also, your applications may use existing tools that work with the file paradigm. Replacing these tools can be expensive.

• Performance

The store-and-forward model of data may be unacceptable for performance reasons. For example, it may be unacceptable for the database manager to materialize a Binary Large Object (BLOB) into a file, and the converse, each time the data needs to be accessed as a file. Also, when data is captured in high volumes, you may not want to store it in the database.

• Network considerations

You want to access data directly from a file server that is physically close to a workstation. For example, the file server can be configured so that the network distance to the user is much shorter than the distance to the database where all the BLOBs are stored. The number of bytes that flow for a large object is much larger than the number of bytes for the answer to an SQL query. Network distance between resources is, therefore, a significant consideration.


• Isochronous delivery

The application uses a stream server because it has real-time requirements for delivery and capture. The data is expected to be large, and you may require isochronous delivery. An example of isochronous delivery may be a video server that delivers high-quality (or “jitter-free”) video to a client workstation in real time. In these kinds of applications, it is likely that such data will not be moved into the database as a BLOB, but rather stay on the file server.

Many of these applications need search capabilities to find the data in the files. These search capabilities, however, do not require physically bringing the data into the database system, because the raw content of the files is not needed for the query. Typically, you would extract features of an image or a video and store them in the database for performing a search on the extracted features. Examples of the features that can be extracted from an image are color, shape, and texture. The IBM DB2 Universal Database Extender for Image product supports extraction and search functions on such features.

In general, these applications combine the search capabilities of SQL with the advantages of working directly with files to manipulate the raw data by storing a reference to such files along with parametric data that describes their contents. The DB2 relational extenders for text, voice, image (and so on) provide this functionality. The extenders allow you to specify whether the object itself is to be maintained inside or outside the database.

Currently, the DB2 relational extenders do not provide referential integrity between files on a server and their references in databases. Therefore, it is possible to independently delete either the reference or the file. Moreover, the extenders do not provide access control to the related files or coordinated backup and recovery schemes for a database and its associated files.

DB2 Data Links technology solves these problems and provides the functionality required by such applications. Future releases of the DB2 relational extenders will use Data Links technology.

1.2 Data Links overview

By extending the reach of the RDBMS to operating system files, Data Links gives users flexibility to store data inside or outside the database as appropriate. To store and reference data outside of a DBMS, a database application developer declares a column of DATALINK data type when creating an SQL table. The value stored in the DATALINK column is then used to represent and reference data in an external file.


Figure 1-1 illustrates the architecture of the Data Links technology. As shown in this figure, Data Links has two components:

• Data Links engine
• Data Links Manager

Figure 1-1 Architecture of the Data Links technology

[Figure 1-1 labels: DB2 Application; DB2 Server with Data Links Ext.; SQL Access Path; Control Path for Data Links Integrity; Standard File Access Protocol; Data Links Manager on File Server; Data Links File Manager (DLFM); db2agents; DLFM_DB (meta data repository); Data Links File System Filter (DLFF); Native File System: JFS, NTFS, UFS (Solaris), DFS-DCE/AIX; Storage; Archive Server]

The Data Links engine resides on the host database server and is implemented as part of the database (DB2) engine code. It is responsible for processing SQL requests involving DATALINK columns such as table creation, and select, insert, delete, and update of records with a DATALINK column.

The Data Links Manager consists of two components:

• Data Links File Manager (DLFM)
• Data Links File System Filter (DLFF)

At a high level, DLFM applies constraints on the files that are referenced by the host database, and DLFF enforces the constraints when file system commands or operations affect these files. For example, a file rename or delete would be rejected if that file was referenced by the database.

1.2.1 Data Links File Manager (DLFM)

The Data Links File Manager resides with the file server, which can be local or remote to the host database server, and plays a key role in managing external files. It is responsible for executing the link/unlink operations with transactional semantics within the file system. To do this, DLFM maintains its own DB2 repository about files that are linked to (referenced in) the database. When a file is initially linked to the database, the DLFM applies the constraints for referential integrity, access control, and backup and recovery as specified in the DATALINK column definition. If the DBMS controls read access, for example, the DLFM changes the owner of the file to the DBMS and marks the file “read only” as well.

All these changes to the DLFM repository and to the file system are applied as part of the same DBMS transaction as the initiating SQL statement. If the SQL statement is rolled back, the changes made by the DLFM on the file system side are undone as well.

The DLFM is also responsible for coordinating backup and recovery of external files with the database. When the DBMS transaction that includes a Link File operation commits and the DBMS is responsible for recovery of the file, the DLFM initiates a backup of the newly linked file. This file backup is done asynchronously and is not part of the database transaction for performance reasons.

In addition, note that by doing it this way, the database backup itself is not slowed down because the referenced file has already been backed up. This is particularly important in the case of very large files. Coordinated backup and recovery of external files with DB2 data can be done directly to disk or through an archive server supported by DB2 UDB, such as Tivoli Storage Manager.
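The following Python sketch is one way to picture the link behavior described in this section. It is a toy model and not the DLFM implementation: the names ToyDLFM, LinkTransaction, and FileState, the owner name dbms, and the in-memory backup queue are all assumptions made for illustration. It shows that the changes made to a file during a link operation are undone if the transaction rolls back, and that a backup request is queued only at commit, to be serviced later outside the transaction.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileState:
    owner: str
    read_only: bool = False
    linked: bool = False


@dataclass
class ToyDLFM:
    """Toy stand-in for a file manager; not the real DLFM interface."""
    files: Dict[str, FileState]
    backup_queue: List[str] = field(default_factory=list)

    def begin(self) -> "LinkTransaction":
        return LinkTransaction(self)


class LinkTransaction:
    def __init__(self, dlfm: ToyDLFM) -> None:
        self.dlfm = dlfm
        self._undo: Dict[str, FileState] = {}  # path -> state before the link
        self._to_backup: List[str] = []

    def link(self, path: str, db_controls_read: bool, recovery: bool) -> None:
        state = self.dlfm.files[path]
        if state.linked:
            raise ValueError(f"{path} is already linked")
        self._undo[path] = FileState(state.owner, state.read_only, state.linked)
        state.linked = True
        if db_controls_read:
            state.owner = "dbms"    # hypothetical owner name for the DBMS
            state.read_only = True  # file is marked read-only as well
        if recovery:
            self._to_backup.append(path)

    def rollback(self) -> None:
        # Undo the file-system-side changes along with the SQL statement.
        for path, old in self._undo.items():
            self.dlfm.files[path] = old
        self._undo.clear()
        self._to_backup.clear()

    def commit(self) -> None:
        # Backups are only queued here; the copy itself happens later,
        # asynchronously, and is not part of the transaction.
        self.dlfm.backup_queue.extend(self._to_backup)
        self._undo.clear()
        self._to_backup.clear()
```

In this model, a rolled-back link leaves the file owner, the read-only flag, and the backup queue exactly as they were before the transaction started.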

1.2.2 Data Links File System Filter (DLFF)

The Data Links File System Filter is a thin, database-control layer on the file system that intercepts certain file system calls (for example, file-open, file-rename, and file-delete) issued by the application. If the file is referenced in a database, the DLFF is responsible for enforcing referential integrity constraints and access-control requirements defined for the file. This ensures that any access request meets DBMS security and integrity requirements.

The DLFF will, for example, reject a user-level request to rename or delete a file referenced by the database. This avoids “dangling pointers” in which a file is referenced by the database, but the actual file does not exist. DLFF also validates any authorization token embedded in the file pathname for a file-open operation.
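As an illustration only, the following Python sketch models the kind of checks that a filter layer like this might perform. The names ToyFilter and LinkedFileError are hypothetical, and the token is passed as a separate argument for simplicity. The sketch does not show how the real DLFF intercepts file system calls or how a token is embedded in a pathname.

```python
class LinkedFileError(PermissionError):
    """Raised when a request would violate a database constraint on a file."""


class ToyFilter:
    """Toy model of a file system filter; all names here are hypothetical."""

    def __init__(self, linked, valid_tokens):
        # linked maps a file path to True when the database controls read access
        self.linked = dict(linked)
        self.valid_tokens = set(valid_tokens)

    def rename(self, old_path, new_path):
        self._reject_if_linked("rename", old_path)
        return ("renamed", old_path, new_path)

    def delete(self, path):
        self._reject_if_linked("delete", path)
        return ("deleted", path)

    def open(self, path, token=None):
        # Only files under database read control need a valid token.
        if self.linked.get(path, False) and token not in self.valid_tokens:
            raise LinkedFileError(f"open rejected: no valid token for {path}")
        return ("opened", path)

    def _reject_if_linked(self, operation, path):
        # A linked file must not disappear, or the database would be left
        # holding a dangling reference.
        if path in self.linked:
            raise LinkedFileError(
                f"{operation} rejected: {path} is referenced by the database"
            )
```

Files that are not linked pass straight through the filter, which matches the idea that only referenced files are subject to database constraints.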

Data Links provides a new and innovative DBMS capability. By providing tight integration of file system data with the object-relational DBMS, Data Links allows DB2 UDB to guarantee the integrity of data whether it is stored inside or outside the database. Although companies in the CAD/CAM application marketplace were the early supporters of Data Links, Data Links applies to application problems in a wide variety of market segments, especially as it relates to content management. Web, Internet, and e-commerce applications are important examples of these new market segments.

1.2.3 The DATALINK data type

Data Links technology includes the DATALINK data type, which is implemented as an SQL data type in DB2 Universal Database and references an object stored external to a database.

You use the DATALINK data type, just like any other SQL data type, to define columns in tables. In NT File System (NTFS) and JFS environments, the DATALINK values encode the name of a Data Links server containing the file and the filename in terms of a Uniform Resource Locator (URL). The DATALINK value is robust in terms of integrity, access control, and recovery. DB2 treats a DATALINK value as if the object were stored in the database. You register a set of known Data Links servers. The only Data Links server names that you can specify in a DATALINK value are those that have been registered to a DB2 database.
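To make the URL form of a DATALINK value more concrete, the following Python sketch splits a URL into its scheme, server name, and file path, and rejects server names that are not in a set of registered servers. The host names, the function name split_datalink_url, and the REGISTERED_SERVERS set are hypothetical. This is a sketch of the idea only and does not show how DB2 itself stores or validates DATALINK values.

```python
from urllib.parse import urlsplit

# Hypothetical set of Data Links servers registered to a database.
REGISTERED_SERVERS = {"dlfs1.example.com"}


def split_datalink_url(url, registered=REGISTERED_SERVERS):
    """Return (scheme, server, path) for a URL, or raise ValueError.

    The server name is compared in lowercase, because urlsplit()
    returns the hostname in lowercase.
    """
    parts = urlsplit(url)
    server = parts.hostname or ""
    if not parts.scheme or not server:
        raise ValueError(f"not a complete URL: {url!r}")
    if server not in registered:
        raise ValueError(f"server {server!r} is not registered")
    return parts.scheme, server, parts.path
```

For example, a URL that names dlfs1.example.com is accepted and split into its parts, while a URL that names any other host raises ValueError.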

In Distributed Computing Environment-Distributed File Service (DCE-DFS) environments, the Data Links Manager is registered for the entire cell, and linked files are referred to in terms of a URL that uses the dfs scheme and the DFS pathname of the file.

Even though the DATALINK value represents an object that is stored outside the database system, you can use SQL queries to search parametric data to obtain the file name that corresponds to the query result. You can create indexes on files containing video, images, text, or other media formats, and store those attributes in tables along with the DATALINK value. With a central repository of files on a file server and DATALINK data types in a database, you can obtain answers to questions like:

• What do I have?
• Where can I find what I’m looking for?
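The pattern of searching on attributes and getting back a file reference can be sketched with Python's built-in sqlite3 module. SQLite has no DATALINK data type, so a plain TEXT column named image_ref stands in for the DATALINK column here, and nothing in this sketch enforces integrity or access control on the files. The table, column names, and URLs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xray ("
    " patient_id INTEGER,"
    " body_part  TEXT,"
    " taken_on   TEXT,"
    " image_ref  TEXT)"  # stand-in for a DATALINK column
)
conn.executemany(
    "INSERT INTO xray VALUES (?, ?, ?, ?)",
    [
        (42, "chest", "2001-03-01", "http://dlfs1.example.com/xray/42-chest.img"),
        (42, "knee", "2001-05-17", "http://dlfs1.example.com/xray/42-knee.img"),
        (77, "chest", "2001-04-09", "http://dlfs1.example.com/xray/77-chest.img"),
    ],
)


def find_images(conn, body_part):
    """Search the descriptive attributes and return the matching file references."""
    rows = conn.execute(
        "SELECT image_ref FROM xray WHERE body_part = ? ORDER BY taken_on",
        (body_part,),
    )
    return [row[0] for row in rows]
```

The query never reads the image files themselves. It searches only the attributes stored in the table and returns the references, which the application can then use to open the files directly.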

These are examples of applications that can use the DATALINK data type:

• Medical applications, in which X-rays are stored on the file server and the attributes are stored in a database.

• Entertainment industry applications that perform asset management of video clips. The video clips are stored on a file server, but attributes about the clips are stored in a database. Access to the video clips must be controlled based on the database privileges for accessing the meta information.

• World Wide Web applications that manage millions of files and allow access control based on database privileges.

• Financial applications, which require distributed capture of check images and a central location for those images.

• CAD/CAM applications, where the engineering drawings are kept as files, and the attributes are stored in the database. Queries are run against the drawing attributes.

1.3 Applications that use Data Links

Two applications illustrate the wide range of uses that can benefit from Data Links:

• Link Integrity+

• Dassault Systèmes’ VPM product

1.3.1 Link Integrity+

Link Integrity+ is a Web asset integrity solution from the IBM Almaden Research Center in San Jose, California. It exploits IBM DB2 UDB’s unique Data Links technology to guarantee the referential integrity (RI) of an intranet’s Web objects such as Web pages, hyperlinks, images, server-side programs, and templates.

While there are many products in the marketplace that report on broken links and missing images “after-the-fact”, Link Integrity+ proactively prevents the occurrence of broken links and the irksome “404 file not found” message. It does this by inhibiting any malicious or accidental changes to Web pages that could compromise the referential integrity of Web assets.

Link Integrity+'s architecture supports a two-phase approach to delivering Web content:

• Phase 1: Validates the referential integrity of hyperlinks, images, server-side programs, and templates

• Phase 2: “Installs” the Web content on the Web site in atomic fashion, with minimal transient problems

Link Integrity+ also supports the enforcement of an organization's guidelines for Web content such as the inclusion of appropriate headers, footers, and disclaimers. A critical Link Integrity+ function is its support of multiple independent webmaster domains within a geographically distributed intranet of heterogeneous Web servers. It includes an e-mail and pager notification system that alerts webmasters when Web pages in their domain are affected by deletions or updates of Web pages in another webmaster's domain.

Link Integrity+ exploits IBM DB2 UDB Data Links, Java JDBC, Java Mail, Java Beans Activation Framework, Java Native Interface (JNI), Structured Query Language (SQL), and Extensible Markup Language (XML) technologies in its implementation.

The Link Integrity+ architecture provides the ability to deliver a significantly higher level of Web asset integrity to an organization's intranet. It synergistically integrates the IBM DB2 UDB unique Data Links technology with innovative application design. Link Integrity+ is a prototype that demonstrates that its architecture is capable of supporting the “real world” environment of geographically distributed heterogeneous Web sites with multiple webmasters managing multiple domains or sub-domains. Its staging area approach enables the enforcement of referential integrity of Web assets and the enforcement of an organization's guidelines for Web content.

Because it is the main conduit for getting content on the Web, Link Integrity+ can be made sensitive to information of interest to specific individuals. In other words, Link Integrity+ can be integrated with personalization and information delivery systems to notify individuals, and to deliver information to them in a timely fashion, based on available user profiles and subscription information. The Link Integrity+ trigger mechanism for use by content developers significantly enhances the productivity of webmasters by taking over routine and mundane activities, and alerting them only when problems are detected. By guaranteeing the integrity of an intranet's hyperlinks, Link Integrity+ minimizes the chance that an end user encounters the “404 file not found” message, which contributes to a positive experience for the user visiting the Web site. Note that an end user may still see the “404 file not found” message because of page caching in the browser, Internet Service Provider (ISP), proxy, and other caches.

For more information, refer to “Link Integrity+: A Web Asset Integrity Solution”, Nagraj Alur, Ramani Ranjan Routray, IBM Almaden Research Center paper.

1.3.2 VPM with DB2 Data Links

This application demonstrates how IBM middleware (DB2 and Data Links) can provide solutions for data archive and restoration on a large enterprise basis. This applies specifically when working with CATIA and VPM from IBM and Dassault Systèmes.

Data Links technology has been supported in VPM since the general availability (GA) of VPM 1.2. This technology support provides four primary capabilities:

• Logical data consistency: For example, an engineer cannot delete or rename a file that is referenced by its corresponding part description in the database.

• Transaction consistency: If a transaction is rolled back in the database, the link to the appropriate version of the file at this site is maintained.

• Security and access: Files controlled by Data Links can either be totally protected by the database, preventing unauthorized file system access, or opened, to allow file system access.

• Synchronized backup and recovery: Using DB2 with Data Links ensures consistent backup and recovery of ENOVIAVPM meta data and the associated CATIA models. This makes the overall process more automatic and less database administrator (DBA)-intensive. In the past, administrative tasks were performed outside of the CATIA environment. This required a separate backup strategy for external CATIA files, which introduced a large risk of inconsistencies between the database and related external files.

For additional information, refer to Appendix C, “VPM and Data Links” on page 307.


Chapter 2. Technical architecture

This chapter provides a detailed description of the Data Links technical architecture. The following topics are discussed:

• Overview of the Data Links architecture

• The SQL data type DATALINK

• How Data Links maintains security

• The different components of Data Links on AIX, Solaris, and Windows

• The different components of Data Links on DCE-DFS

2.1 Overview of the Data Links architecture

DB2 Data Links can be installed in the following environments:

• Journaled File System (JFS) on IBM AIX

• File System Migrator (FSM) on IBM AIX

• Distributed File Service (DFS) in Transarc’s Distributed Computing Environment (DCE) on IBM AIX

• UNIX File System (UFS) on Sun Solaris

• NTFS-formatted drive on Windows NT

• Integrated File System (IFS) on IBM eServer iSeries (AS/400)

A typical environment using DB2 Data Links Manager (DLM) has the following components:

• Data Links server

• DB2 Universal Database server

• DB2 client

The following sections provide a brief overview of these components.

2.1.1 Data Links server

A Data Links server consists of the following components:

• Data Links File Manager (DLFM)

• One of the following filters:

– Data Links File System Filter (DLFF) in JFS, FSM, NTFS, and UFS environments
– Data Manager Application (DMAPP) in DCE-DFS environments

• DB2 Logging Manager

Note: FSM is the file system filter for Tivoli Space Manager client (also known as Hierarchical Storage Manager (HSM)), which provides the space management capabilities. Data Links support for Tivoli Space Manager is discussed in Chapter 6, “Using Tivoli Storage Manager” on page 107.

Note: Refer to Appendix B, “Overview of DCE-DFS on AIX” on page 301, for an overview of DCE-DFS.

Note: Data Links on iSeries and AS/400 is out of the scope of this book. Refer to the IBM Redbook DB2 UDB for AS/400 Object Relational Support, SG24-5409, for Data Links implementation on the iSeries.

Data Links File Manager (DLFM)

DLFM is a set of user-level processes that keeps track of all the files on a particular Data Links server that are linked to a DB2 database. The DLFM receives and processes link-file and unlink-file messages that arise from SQL INSERT, UPDATE, and DELETE statements that reference a DATALINK column.

For each linked file, the DLFM tracks:

• The database instance

• The fully qualified table name

• The column name referred to in the SQL statement

Another vital role of the DLFM is to respond to all the queries sent by the DLFF (in AIX, Solaris, and Windows NT) or the DMAPP (in a DCE-DFS environment). These queries can be requests for a file or token information.

To maintain referential integrity, DLFF (on UNIX and Windows NT) and DMAPP (on DCE-DFS) must be able to recognize all the files that are under the control of Data Links.

When you create a table, you can specify options for the DATALINK column. One of these options is RECOVERY=YES. This option allows DB2 to provide point-in-time, rollforward recovery for any file that has a reference in this DATALINK column. Therefore, if this option is specified when the table is created, the DLFM keeps track not only of the currently linked files, but also of the previously linked files.

Definition: A token is a dynamically generated string used to give users read access to a file under READ PERMISSION DB control. The DLFF rejects any operation that tries to access a READ PERMISSION DB file without a valid token, unless the operation originates from the super-user. Refer to 2.3.3, “How access tokens work” on page 32, for a fuller explanation of tokens.

Note: All the DATALINK options (for example, READ PERMISSION DB and RECOVERY=YES) are discussed in 2.2.3, “DATALINK options” on page 26.


Data Links File System Filter (DLFF)

This is a filter file system layer that sits on top of base file systems such as JFS, FSM, UFS, and NTFS. It is also known as Data Links File System (DLFS). DLFF maintains referential integrity by ensuring that linked files are not deleted or renamed, and that their attributes are not changed. In the case of READ PERMISSION DB files, it also filters commands to ensure that proper access authority exists. AIX and Solaris file systems under the control of a DLFF can be NFS-exported and mounted on a DB2 client. Windows NT file systems under DLFF control can be netshared.

Data Manager Application (DMAPP)

DMAPP maintains referential integrity of Data Linked files in a DCE-DFS environment. It filters commands to ensure that files linked under Data Links control are not deleted or renamed, and that their attributes are not changed.

The DMAPP monitors file sets that reside in DMLFS aggregates that are Data Manager-enabled. Once an aggregate is Data Manager-enabled, the aggregate can contain file sets that may be brought under Data Links control. The DMAPP can then manage the data within these file sets after the aggregate is exported into the namespace. Making a Links File System aggregate Data Manager-enabled is part of the Storage Management Toolkit (SMT) provided by Transarc.

DB2 Logging Manager

This is a component of DLFM that maintains the logging information in the DLFM_DB database. This DLFM_DB database contains registration information about the databases that can connect to a Data Links server. It also contains information about the mount points of the file systems on AIX or Solaris, or the sharename of the drives on Windows NT, that are managed by a DLFF. The DLFM_DB database also contains information about files that have been linked, unlinked, or backed up on a Data Links server or in a DCE cell. This database is created during the installation of DB2 Data Links Manager.

Note: Refer to Appendix B, “Overview of DCE-DFS on AIX” on page 301, for an introduction to DCE-DFS concepts and the various terms used here.

Note: As you have seen, DLFF is the filter file system on the Data Links server on AIX, Solaris and Windows NT. In DCE-DFS environments, the file system operations at the file server are filtered by the DMAPP. Therefore, it is also known as DLFS-DMAPP.


2.1.2 DB2 Universal Database server

The DB2 Universal Database server is the location of the main database where the Data Links Manager is registered. In NTFS, JFS, FSM, and UFS environments, more than one Data Links Manager can be registered with a database.

The database may have tables with columns of the DATALINK data type. In DCE-DFS environments, the DB2 server can only register one DCE cell. Also, the DFS client must be installed on the DB2 server in order to allow access to configuration information that is stored in DFS.

The DB2 server and the Data Links server use a reserved port to communicate with each other. The default port is 50100; a different value can be configured at installation time.

2.1.3 DB2 client

The client connects to a remote DB2 server and accesses the tables with DATALINK columns. In UNIX environments, the remote client may directly access the Data Linked files by exporting the Data Links file system (file system under the Data Links control) from the Data Links server and mounting it through the Network File System (NFS) on the DB2 client. In Windows NT, the drive under Data Links control can be shared with the DB2 clients.

Figure 2-1 shows an overview of the interaction between a DB2 server, the DB2 Data Links Manager components, the backup media, and a remote client application in NTFS and JFS environments. The DB2 server has the user database (also known as the host database), which has the CELEBS table with a DATALINK column.

The client application performs the following actions to access a Data Linked file:

1. The client application issues a CONNECT statement to a database on a DB2 server.

2. The application then issues a SELECT statement on the table that contains a DATALINK column and receives the URL.

3. The application on the DB2 client uses a shared drive (on Windows NT) or an NFS-mount directory (on AIX or Solaris) to access the file from the file server (Data Links server).

Note: Data Links implementation for more than one DCE cell is currently not available, but IBM is considering it as a future enhancement.


These steps are shown in Figure 2-1 (in UNIX and Windows environments) and in Figure 2-2 (in DCE-DFS environments).

Figure 2-1 Data Links overview in UNIX and Windows environments

In DCE-DFS environments on AIX systems, the applications use the DFS client, which is also a DB2 client, to connect to the database and access the files. A DB2 Data Links DFS Client Enabler, also known as the DLFS Cache Manager (DLFS-CM), is required to access the files referenced by the DATALINK columns created with READ PERMISSION DB specified.

Figure 2-2 gives an overview of Data Links implementation in the DCE-DFS environment. It shows the steps that an application on the DB2 client (which is also a DFS client) follows to access a Data Linked file. You can see that only one of the Data Links servers in a DCE cell can have the DB2 Logging Manager (DLFM_DB). The other Data Links servers on different nodes connect to this DLFM_DB as a DB2 client.

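[Figure 2-1 labels: DB2 client; DB2 server with the user database (CELEBS table, DATALINK column); Data Links server with the DLFM (Data Links File Manager), Logging Manager, DLFM_DB database, DLFF (Data Links Filesystem Filter), and native file system holding the file; backup media. Arrows are numbered (1), (2), and (3); arrow (3) is labeled “Shared Directory or NFS Mount”.]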


Figure 2-2 Data Links overview in a DCE-DFS environment

[Figure 2-2 labels: DB2 Data Links Manager in one DCE-DFS cell; DB2 server with the user database (CELEBS table, DATALINK column); two Data Links servers (DFS servers), each with DLFM, DMAPP, DMLFS, a file, and backup media, one holding the DLFM_DB database and the other acting as a DB2 client to DLFM_DB; DB2/DFS client with the DB2 client application, DB2 DFS Client Enabler, and DFS client. Arrows are numbered (1), (2), and (3).]

Note: Do not confuse DLFF with DLFS-CM. As mentioned earlier, DLFF is the file system filter layer on the Data Links server on AIX, Solaris, and Windows NT. DLFS-CM is the filter layer on a DB2 client in a typical DCE-DFS environment, from where applications access Data Linked files.


2.2 DATALINK data type

DB2 Data Links Manager introduces the base SQL data type DATALINK, which is now part of the ANSI, ISO, and ODBC standards. The DATALINK column allows you to store a reference to a file (in the form of a URL) that you want to put under the control of the Data Links File Manager (Figure 2-3). The files referenced by the DATALINK values are treated by DB2 as if they were stored inside the database, and therefore, can fully benefit from RDBMS properties like referential integrity, access control, and recovery.

Figure 2-3 DATALINK data type

Even though the DATALINK value represents an object that is stored outside the database system, SQL queries can be used to search meta data (related information stored in other columns along with the DATALINK values) to obtain the file name that corresponds to the query result. Indexes on files containing video, image, text, or media formats can be created and stored as attributes in tables along with the DATALINK values.

Typically, an application programmer inserts rows in these tables with meta-data about the file and its file reference (DATALINK value). The referenced file is said to be “linked” under Data Links control. Applications can then search on this meta-data to locate the files of interest. Next, the applications can access these files using native file system APIs (such as fopen and fread) or a Web browser.

[Figure 2-3 labels: a table in a DB2 UDB database whose DATALINK values reference files on UNIX and Windows NT file servers.]


For example, the CELEBS table contains a DATALINK column that has URLs to pictures of various celebrities. This table also stores some meta-data information about each celebrity. Using this information, a search (using an SQL SELECT statement) can be done on the table for your favorite celebrities. Figure 2-4 shows how you can access the DATALINK value of the picture for celebrities from India whose pictures are in the .jpg format.

Figure 2-4 Retrieving the Data Link value

Now the picture can be accessed via a Web browser using the URL (or the DATALINK value), as shown in Figure 2-5.


Figure 2-5 Accessing Data Linked files through a browser

For the applications to update or delete a file on the file system that is under Data Links control, they have to unlink the file by deleting the corresponding entry from the DB2 UDB table first. The reason is that Data Links enforces referential integrity on all the files under its control, and therefore, does not allow any delete or update operation on the file.
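The unlink-before-delete rule can be illustrated with a toy model. This is only a sketch in Python, not the Data Links implementation: the class ToyFileServer, its methods, and the exception LinkedFileError are hypothetical names, and the model assumes that linking requires the file to exist.

```python
class LinkedFileError(Exception):
    """Raised when an operation would break a link (toy stand-in for a filter rejection)."""


class ToyFileServer:
    """Toy model of a file server whose linked files are protected."""

    def __init__(self):
        self.files = set()    # paths that exist on the file system
        self.linked = set()   # paths currently referenced by a DATALINK value

    def create(self, path):
        self.files.add(path)

    def link(self, path):
        # Models an INSERT of a DATALINK value that references the file.
        # Assumption for this sketch: the file must already exist.
        if path not in self.files:
            raise FileNotFoundError(path)
        self.linked.add(path)

    def unlink(self, path):
        # Models a DELETE of the row that holds the DATALINK value.
        self.linked.discard(path)

    def delete(self, path):
        # A linked file cannot be deleted; it must be unlinked first.
        if path in self.linked:
            raise LinkedFileError(f"{path} is linked; unlink it first")
        self.files.remove(path)


server = ToyFileServer()
picture = "/datalinks/celebs/images/am.jpg"
server.create(picture)
server.link(picture)

try:
    server.delete(picture)      # rejected while the file is linked
    blocked = False
except LinkedFileError:
    blocked = True

server.unlink(picture)          # remove the database reference first
server.delete(picture)          # now the delete succeeds
```

In the real product the rejection comes from the file system filter rather than from application code; the model only shows the order of operations an application has to follow.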

Note: HTTP protocol considers the semi-colon (;) to be a special character, and therefore, its equivalent escape sequence (“%3B”) is used instead. If an application does a select on the table with a DATALINK column, receives a tokenized file name (in case of READ PERMISSION DB), and wants to access that file through a Web browser, it needs to replace the “;” character with the “%3B” escape sequence itself.
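The substitution described in this note can be sketched in a few lines of Python. The function name escape_token_separator is hypothetical; the sample value is the tokenized URL used later in this chapter.

```python
def escape_token_separator(url: str) -> str:
    """Replace the ';' that separates access token and file name with '%3B'."""
    return url.replace(";", "%3B")


tokenized = "HTTP://SOL-E/datalinks/celebs/images/04E2_CluJ3k__x2rxA5IJl1Q;am.jpg"
browser_url = escape_token_separator(tokenized)
print(browser_url)
```

A plain string replacement is enough here because the semi-colon is the only character the note singles out; the sketch does not attempt general URL encoding.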

Note: The Update-in-place feature (which allows you to update a file while it is under the Data Links control) will be available in DB2 Universal Database V8.x.


Data Links has been designed to support a distributed computing environment, with capabilities that include a DATALINK column in a DB2 UDB table that can reference multiple file systems spread over one or more file servers associated with different operating systems, such as UNIX and Windows.

A single Data Links Manager can be associated with DATALINK columns in one or more DB2 UDB databases. A DATALINK column can reference files residing in Transarc’s distributed file system DCE-DFS. Bi-directional coordinated replication of Data Linked files is supported in an atomic, automatic, and consistent way in conjunction with DB2 UDB’s database replication capabilities. Chapter 9, “Data replication” on page 161, discusses replication of DB2 UDB tables having DATALINK columns and the files referenced by them.

In JFS, UFS, and NTFS environments, the DATALINK values encode the name of a Data Links server that contains the file and the file name. The only Data Links server names that can be specified in the DATALINK value are those that have been registered to a DB2 database. In DCE-DFS environments, the Data Links Manager is registered for the entire cell.

A number of scalar functions (column functions) are provided with the DATALINK data type that allow access to individual parts of the URL, such as the server address part. These functions also allow you, for example, to convert a VARCHAR (the normal way you would define a URL) into the DATALINK data type.

A DATALINK value can be assigned to (or inserted into) a column in any of the following ways:

• DLVALUE scalar function: This function can be used to create a new DATALINK value and assign it to a column. Unless the value contains only a comment or the URL is exactly the same, the assignment links the file.

• SQLBuildDataLink CLI function: A DATALINK value can be constructed as a CLI parameter of the CLI function SQLBuildDataLink. This value can then be assigned to a column. Unless the value contains only a comment or the URL is exactly the same, the assignment links the file. The function interface is shown here:

SQLRETURN SQLBuildDataLink(
    SQLHSTMT        StatementHandle,
    SQLCHAR FAR    *LinkType,
    SQLINTEGER      LinkTypeLength,
    SQLCHAR FAR    *DataLocation,
    SQLINTEGER      DataLocationLength,
    SQLCHAR FAR    *Comment,
    SQLINTEGER      CommentLength,
    SQLCHAR FAR    *DataLinkValue,
    SQLINTEGER      BufferLength,
    SQLINTEGER FAR *StringLengthPtr);

The function arguments of SQLBuildDataLink() are explained in Table 2-1.

Table 2-1 Arguments to the SQLBuildDataLink function

Data type      Argument             Use      Description
SQLHSTMT       StatementHandle      input    Used only for diagnostic reporting
SQLCHAR *      LinkType             input    Always set to SQL_DATALINK_URL
SQLINTEGER     LinkTypeLength       input    The length of the LinkType value
SQLCHAR *      DataLocation         input    The complete URL value to be assigned
SQLINTEGER     DataLocationLength   input    The length of the DataLocation value
SQLCHAR *      Comment              input    The comment, if any, to be assigned
SQLINTEGER     CommentLength        input    The length of the Comment value
SQLCHAR *      DataLinkValue        output   The DATALINK value that is created by the function
SQLINTEGER     BufferLength         input    Length of the DataLinkValue buffer
SQLINTEGER *   StringLengthPtr      output   A pointer to a buffer in which the length of
                                             *DataLinkValue (excluding the null-termination
                                             character) is returned. If DataLinkValue is a null
                                             pointer, no length is returned. If the number of
                                             bytes available to return is greater than
                                             BufferLength minus the length of the
                                             null-termination character, then SQLSTATE 01004
                                             is returned. In this case, subsequent use of the
                                             DATALINK value may fail.


The DATALINK value can be retrieved from the table by running a SELECT statement. Portions of a DATALINK value can be assigned to host variables by the use of scalar functions (such as DLLINKTYPE or DLURLPATH). These functions are discussed in 2.2.2, “Scalar functions for DATALINK data type” on page 24.

2.2.1 Attributes of DATALINK type

A DATALINK data type can have the following attributes:

• Link type: Currently the only supported type of link is the URL.

• Data location: The location of the file linked with a reference within DB2, in the form of a URL.

The allowed scheme names for this URL are:

– HTTP
– FILE
– UNC
– DFS

The other parts of the URL are:

– The file server name for the HTTP, FILE, and UNC schemes
– The cell name for the DFS scheme
– The full file path name within the file server or cell

See Appendix A, “BNF specifications for DATALINK” on page 297, for more information on the full Backus Naur Form (BNF) specifications for DATALINKs.

• Comment: Descriptive information (254 bytes maximum) can be specified. This is intended for application-specific uses such as further or alternative identification of the location of the data.

Note: In a READ PERMISSION DB case, the URL value returned by an SQL SELECT query on a table with a DATALINK column has an access token attached to it. Therefore, in CLI, embedded SQL, and JDBC programs, take care to assign sufficient storage to the variables that hold DATALINK values. The storage must be large enough to accommodate both the URL value and the access token embedded in it.


A DATALINK value can possibly have only a comment attribute and an empty data location attribute. Such a value may even be stored in a column, but of course, no file will be linked to such a column. The total length of the comment and the data location attribute of a DATALINK value is currently limited to 200 bytes.
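An application can check the combined length before it builds a DATALINK value. The following Python sketch takes the 200-byte limit from the paragraph above; the function name fits_datalink_limit is hypothetical, and counting bytes in UTF-8 is an assumption, because the actual byte count depends on the database code page.

```python
MAX_TOTAL_BYTES = 200  # combined limit for data location plus comment, as stated above


def fits_datalink_limit(data_location: str, comment: str = "") -> bool:
    """Return True if data location and comment together fit the stated limit."""
    # Assumption: lengths are counted in UTF-8 bytes for this illustration.
    location_bytes = len(data_location.encode("utf-8"))
    comment_bytes = len(comment.encode("utf-8"))
    return location_bytes + comment_bytes <= MAX_TOTAL_BYTES


short_ok = fits_datalink_limit("http://sol-e/datalinks/celebs/images/am.jpg", "comment")
too_long = fits_datalink_limit("http://sol-e/" + "x" * 200)
print(short_ok, too_long)
```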

2.2.2 Scalar functions for DATALINK data type

Built-in scalar functions are provided for DATALINK data types. They are:

• Function to build a DATALINK value:

DLVALUE: The DLVALUE function returns a DATALINK value. When the function is on the right-hand side of a SET clause in an UPDATE statement or is in a VALUES clause in an INSERT statement, it usually also creates a link to a file. However, if only a comment is specified (in which case the data-location is a zero-length string), the DATALINK value is created with empty linkage attributes so there is no file link.

The following SQL statement can be used for inserting a row in the CELEBS table having two VARCHAR columns (country and pic_format) and one DATALINK column (picture):

EXEC SQL INSERT INTO CELEBS
    VALUES ('India', 'jpg', DLVALUE('http://sol-e/datalinks/celebs/images/salman.jpg'));

• Functions to extract the encapsulated values from a DATALINK value:

– DLCOMMENT: This function returns the comment value, if it exists, from a DATALINK value. The result of the function is VARCHAR(254).

Given a DATALINK value that was inserted into the picture column of a row in the CELEBS table using the following scalar function, DLCOMMENT(picture) returns the value 'comment':

DLVALUE('http://sol-e/datalinks/celebs/images/am.jpg','URL','comment')

Note: Leading and trailing blank characters are trimmed while parsing data location attributes as URLs. Also, the scheme names (http, file, unc, dfs) and host are case-insensitive and are always stored in the database in uppercase. When a DATALINK value is fetched from a database, an access token is embedded within the URL attribute when appropriate. It is generated dynamically and is not a permanent part of the DATALINK value stored in the database.

Note: Data Links cannot be exchanged with a Distributed Relational Database Architecture (DRDA) server.

– DLLINKTYPE: This function returns the linktype value from a DATALINK value. The result of the function is VARCHAR(4).

Considering the DATALINK value as shown in Example 2-3, DLLINKTYPE(picture) will return the value 'URL'.

– DLURLCOMPLETE: The DLURLCOMPLETE function returns the data location attribute from a DATALINK value with a link type of URL. When appropriate, the value includes a file access token. If the DATALINK value only includes the comment, the result that is returned is a zero-length string. The result of the function is VARCHAR(254).

The DLURLCOMPLETE(picture) function on the CELEBS table would return the complete URL:

HTTP://SOL-E/datalinks/celebs/images/04E2_CluJ3k__x2rxA5IJl1Q;am.jpg

Here, 04E2_CluJ3k__x2rxA5IJl1Q; is the access token attached with the file name, since picture has the attribute of READ PERMISSION DB.

– DLURLPATH: The DLURLPATH function returns the path and file name necessary to access a file within a given server from a DATALINK value with a linktype of URL. When appropriate, the value includes a file access token. If the DATALINK value only includes the comment, the result returned is a zero length string. The result of the function is VARCHAR(254).

The DLURLPATH(picture) function on the CELEBS table would return:

/datalinks/celebs/images/04E2_Cln8Jk__xjh7A5Lkl0L;am.jpg

Here 04E2_Cln8Jk__xjh7A5Lkl0L; is the access token attached with the file name, since picture has the attribute of READ PERMISSION DB.

– DLURLPATHONLY: The DLURLPATHONLY function returns the path and file name necessary to access a file within a given server from a DATALINK value with a linktype of URL. The value returned never includes a file access token. Again, if the DATALINK value only includes the comment the result returned is a zero length string. The result of the function is VARCHAR(254).

The DLURLPATHONLY(picture) function on the CELEBS table would return:

“/datalinks/celebs/images/am.jpg”

Note that the file name returned in the path does not have an access token attached, although the picture column has the attribute of READ PERMISSION DB.

– DLURLSCHEME: The DLURLSCHEME function returns the scheme from a DATALINK value with a linktype of URL. The value is always in uppercase. If the DATALINK value only includes the comment, the result returned is a zero-length string. The result of the function is VARCHAR(20).

Here DLURLSCHEME(picture) would return the value 'HTTP'.

– DLURLSERVER: The DLURLSERVER function returns the file server from a DATALINK value with a linktype of URL. The value is always in uppercase. If the DATALINK value only includes the comment the result returned is a zero length string. The result of the function is VARCHAR(254).

DLURLSERVER(picture) would return the name of the server as 'SOL-E'.
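The decomposition performed by these scalar functions can be imitated outside the database. The following is a minimal Python sketch, not DB2 code: the helper names simply mirror the SQL function names, and the assumption that the access token is the part of the file name before the semicolon is inferred from the examples above.

```python
from urllib.parse import urlsplit


def dlurlscheme(url: str) -> str:
    """Return the scheme in uppercase, as DLURLSCHEME is described to do."""
    return urlsplit(url).scheme.upper()


def dlurlserver(url: str) -> str:
    """Return the file server name in uppercase, as DLURLSERVER is described to do."""
    return urlsplit(url).netloc.upper()


def dlurlpath(url: str) -> str:
    """Return the path and file name, including any access token."""
    return urlsplit(url).path


def dlurlpathonly(url: str) -> str:
    """Return the path and file name with the access token removed.

    Assumption: a token, when present, precedes the file name and is
    separated from it by a semicolon (token;filename).
    """
    directory, _, name = urlsplit(url).path.rpartition("/")
    _token, separator, rest = name.partition(";")
    return f"{directory}/{rest if separator else name}"


url = "HTTP://SOL-E/datalinks/celebs/images/04E2_CluJ3k__x2rxA5IJl1Q;am.jpg"
print(dlurlscheme(url), dlurlserver(url), dlurlpathonly(url))
```

Inside DB2, the functions operate on a DATALINK value rather than on a plain string, so this sketch only illustrates how the parts of the URL relate to each other.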

It is important to distinguish between these DATALINK references to files and the LOB file reference variables. The similarity is that both contain a representation of a file. However, consider these points:

► DATALINKs are retained in the database, and both the links and the data in the linked files can be considered as a natural extension of data in the database.

► File reference variables exist temporarily on the client, and they may be considered as an alternative to a host program buffer.

2.2.3 DATALINK options

You can define what level of control you want for a linked file by specifying certain options when defining the DATALINK column using the CREATE TABLE or ALTER TABLE ADD COLUMN SQL statements. Options include file access permission and the level of recovery support you want. You can specify the following options at the time you create the tables with DATALINK columns:

► LINKTYPE URL

This defines the type of link as a URL.

► NO LINK CONTROL

Specifies that there will not be any check made to determine that the file exists. Only the syntax of the URL will be checked. There is no database manager control over the file.

Note: The argument to all the above functions must be an expression that results in a value of the DATALINK data type. If the argument is null, the result is the null value.


► FILE LINK CONTROL

Specifies that a check should be made for the existence of the file. Additional options may be used to give the database manager further control over the file.

There are additional options to define the level of database manager control of the file link, which include:

– INTEGRITY: Specifies the level of integrity of the link between a DATALINK value and the actual file.

• ALL: Any file specified as a DATALINK value is under the control of the database manager and may not be deleted or renamed using standard file system programming interfaces.

– READ PERMISSION: Specifies how the permission to read the file, specified in a DATALINK value, is determined.

• FS: The read access permission is determined by the file system permissions. Such files can be accessed without retrieving the file name from the column.

• DB: The read access permission is determined by the database. Access to the file is only allowed by passing a valid file access token, returned on retrieval of the DATALINK value from the table, in the open operation.

– WRITE PERMISSION: Specifies how permission to write to the file specified in a DATALINK value is determined.

• FS: The write access permission is determined by the file system permissions. Such files can be accessed without retrieving the file name from the column.

• BLOCKED: Write access is blocked. The file cannot be directly updated. In order to update a file, it should be copied, the copy should then be updated, and finally the DATALINK value should be updated to point to the new copy of the file.

– RECOVERY: Specifies whether DB2 will support point in time recovery of files referenced by values in this column.

• YES: DB2 will support point in time recovery of files referenced by values in this column. This value can only be specified when INTEGRITY ALL and WRITE PERMISSION BLOCKED are also specified.

• NO: Specifies that point in time recovery will not be supported.

– ON UNLINK: Specifies the action taken on a file when a DATALINK value is changed or deleted (unlinked). Note that this is not applicable when READ PERMISSION FS is used.


• RESTORE: Specifies that when a file is unlinked, the Data Links File Manager attempts to return the file to the owner with the permissions that existed at the time the file was linked. In the case where the user is no longer registered with the file server, the result is product-specific. This can only be specified when INTEGRITY ALL and WRITE PERMISSION BLOCKED are also specified.

• DELETE: Specifies that the file will be deleted when it is unlinked. This can only be specified when READ PERMISSION DB and WRITE PERMISSION BLOCKED are also specified.

► MODE DB2OPTIONS

This mode defines a set of default file link options. The options defined by DB2OPTIONS are:

– INTEGRITY ALL

– READ PERMISSION FS

– WRITE PERMISSION FS

– RECOVERY NO

Therefore, the DATALINK column definition syntax can be represented as shown in Figure 2-6.

Note: In the DB2OPTIONS mode, since the write control is under file system control, the ON UNLINK option is not applicable. This option is now valid only for files linked with DATALINK column(s) having the READ PERMISSION DB and WRITE PERMISSION BLOCKED attributes.


Figure 2-6 DATALINK column definition syntax

   DATALINK [(integer)] [datalink-options-clause]

   datalink-options-clause:
      LINKTYPE URL { NO LINK CONTROL
                   | FILE LINK CONTROL { file-link-options-clause | MODE DB2OPTIONS } }

   file-link-options-clause:
      INTEGRITY ALL
      READ PERMISSION { FS | DB }
      WRITE PERMISSION { FS | BLOCKED }
      RECOVERY { NO | YES }
      ON UNLINK { RESTORE | DELETE }

Now using these options, a DATALINK column may have one of the sets of attributes shown in Table 2-2.

Table 2-2 Possible combinations of DATALINK attributes

   Optn. #   Read   Write     Recovery   Unlink
   1         FS     FS        No         N/A
   2         FS     Blocked   No         N/A
   3         FS     Blocked   Yes        N/A
   4         DB     Blocked   No         Delete
   5         DB     Blocked   Yes        Delete
   6         DB     Blocked   No         Restore
   7         DB     Blocked   Yes        Restore

(The Referential Integrity and DB Access columns of the original table are not reproduced.)

The other combinations of the DATALINK attributes are not currently supported (in DB2 Universal Database V7.x). Some of them are for future enhancements (for example, WRITE PERMISSION DB, which would be available from DB2 V8.x), and others are not mutually compatible (for example, READ PERMISSION DB and WRITE PERMISSION FS together).
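The compatibility rules stated for the options can be checked mechanically. The following Python sketch encodes them as a small predicate and enumerates the combinations that survive. The function name and the use of None for "not applicable" are illustrative assumptions, and the rule that a READ PERMISSION DB column always carries an ON UNLINK action is inferred from Table 2-2 rather than stated in the text.

```python
from itertools import product


def is_supported(read, write, recovery, on_unlink):
    """Return True if the option combination matches the rules described above."""
    # READ PERMISSION DB together with WRITE PERMISSION FS is not supported.
    if read == "DB" and write == "FS":
        return False
    # RECOVERY YES requires WRITE PERMISSION BLOCKED (INTEGRITY ALL is implied).
    if recovery == "YES" and write != "BLOCKED":
        return False
    # ON UNLINK is not applicable with READ PERMISSION FS.
    if read == "FS":
        return on_unlink is None
    # Assumption taken from Table 2-2: READ PERMISSION DB rows always name an action.
    return on_unlink in ("RESTORE", "DELETE")


combos = [
    combo
    for combo in product(("FS", "DB"), ("FS", "BLOCKED"), ("NO", "YES"),
                         (None, "DELETE", "RESTORE"))
    if is_supported(*combo)
]

for number, combo in enumerate(combos, start=1):
    print(number, combo)
```

Under these assumptions the enumeration yields the same seven sets of attributes that the table lists, although not necessarily in the same order.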


2.3 Security/authentication

We have seen that along with the file system benefits, Data Links also provides RDBMS benefits including referential integrity, coordinated backup and recovery, and access control. This section describes how Data Links provides access control.

2.3.1 Concept of tokenized file names

DB2 and the Data Links Manager together provide file access control. When a DATALINK column (with READ PERMISSION DB) is accessed (using a SELECT statement), DB2 generates an access token and embeds it in the pathname of the file.

There are several conditions for file access control to operate correctly:

► File access control is provided only if the READ PERMISSION DB option is specified on the DATALINK column when the table is created.

► To access a READ PERMISSION DB file, the user needs the SELECT privilege on the table (or the SQL view) that has the DATALINK column under which the file’s reference exists.

► Any file system API or command can be used to read the file. As shown earlier, files can also be accessed by the Web browser.

► The method of generating the access token is a secret shared between DB2 and the Data Links File Manager. The DLFF contacts DLFM to validate a token.

► To be valid, an access token must be generated and used within a specified time interval as defined in the Data Links Access Token Expiry Interval (dl_expint) database configuration parameter.

► For each access (SELECT statement), a new token is generated and remains valid for the time specified by dl_expint.

► Web addresses with embedded access tokens must be used by the application to access the files. Any attempt to open, read, or otherwise manipulate a file using Web addresses without the access token results in an access violation.

Note: Obviously the “write” operation is more of a security threat than the “read” operation on any file. Therefore, having the “read” control under DB and the “write” control under FS for a file does not make sense and is not supported.


2.3.2 Database configuration parameters

The following DB2 database configuration parameters pertain to Data Links:

► DATALINKS

This option is specified in the database manager configuration. Using this, you can enable or disable the support of Data Links. A value of YES specifies that Data Links support is enabled for Data Links Manager linking files stored in native file systems (for example, JFS on AIX). A value of NO specifies that Data Links support is not enabled. The default value is NO.

► DL_EXPINT

This is a database configuration parameter that specifies the interval of time (in seconds) during which the generated file access control token is valid. The number of seconds the token is valid begins from the time it is generated. The Data Links File Filter checks the validity of the token against this expiry time. This parameter can have values ranging from 1 to 31,536,000. The largest value corresponds to one year (365 days). The default value for this parameter is 60 seconds. This parameter applies to the DATALINK columns that specify READ PERMISSION DB.

► DL_NUM_COPIES

This database parameter specifies the number of additional copies of a file to be made in the archive server (such as a Tivoli Storage Manager server) when a file is linked to the database. Its value ranges from 0 to 15. The default value for this parameter is 0. This parameter applies to the DATALINK columns that specify RECOVERY YES.

► DL_DROP_TIME

This parameter specifies the interval of time (in days) that files are retained on an archive server (such as an ADSM/Tivoli Storage Manager server) after a DROP DATABASE is issued. The value of this parameter ranges from 0 to 365. The default value for this parameter is 1 day. A value of 0 means that the files are deleted immediately from the archive server when the DROP command or statement is issued. (The actual file is not deleted unless the ON UNLINK DELETE parameter was specified for the DATALINK column.) This parameter applies to the DATALINK columns that specify RECOVERY YES.

Note: The products offered by Tivoli Storage Manager have now replaced the Adstar Distributed Storage Manager (ADSM) product set.


► DL_UPPER

The parameter indicates whether the file access control tokens use uppercase letters. A value of YES specifies that all letters in an access control token are upper case. A value of NO specifies that the token can contain both uppercase and lowercase letters. The default value is NO. This parameter applies to the DATALINK columns that specify READ PERMISSION DB.

► DL_TOKEN

This parameter specifies the algorithm used in the generation of DATALINK file access control tokens. It may have either of two values: MAC0 or MAC1 (message authentication code). The value of MAC1 generates a more secure message authentication code than MAC0, but also has more performance overhead. Again this parameter applies to the DATALINK columns that specify READ PERMISSION DB.
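The ranges and defaults listed for these parameters can be summarized as a small validation routine. This Python sketch is only an illustration: the class and helper names are hypothetical, nothing here talks to DB2, and the MAC0 default for DL_TOKEN is an assumption because the text does not state a default for that parameter.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, the documented DL_EXPINT maximum


@dataclass(frozen=True)
class DataLinksConfig:
    """Hypothetical holder for the Data Links database configuration values."""
    dl_expint: int = 60      # token validity in seconds (1 to 31,536,000)
    dl_num_copies: int = 0   # additional archive copies (0 to 15)
    dl_drop_time: int = 1    # days to keep archived files after DROP DATABASE (0 to 365)
    dl_upper: str = "NO"     # YES means tokens use uppercase letters only
    dl_token: str = "MAC0"   # assumed default; the text names only MAC0 and MAC1

    def __post_init__(self) -> None:
        if not 1 <= self.dl_expint <= SECONDS_PER_YEAR:
            raise ValueError("dl_expint must be between 1 and 31,536,000")
        if not 0 <= self.dl_num_copies <= 15:
            raise ValueError("dl_num_copies must be between 0 and 15")
        if not 0 <= self.dl_drop_time <= 365:
            raise ValueError("dl_drop_time must be between 0 and 365")
        if self.dl_upper not in ("YES", "NO"):
            raise ValueError("dl_upper must be YES or NO")
        if self.dl_token not in ("MAC0", "MAC1"):
            raise ValueError("dl_token must be MAC0 or MAC1")


def is_valid(**overrides) -> bool:
    """Return True if the defaults plus the given overrides pass validation."""
    try:
        DataLinksConfig(**overrides)
        return True
    except ValueError:
        return False


print(is_valid(dl_expint=300), is_valid(dl_num_copies=16))
```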

2.3.3 How access tokens work

Depending on the configuration parameters discussed above, the DB2 engine creates a token for every DATALINK value (with READ PERMISSION DB) that qualifies for the SQL SELECT statement. The token embeds the time at which it expires. This expiry time is calculated by adding the expiry interval (specified by the DL_EXPINT database configuration parameter) to the current time (the time at which the SQL SELECT statement is issued on the DATALINK column).

Applications and users use this tokenized file name to access the files in the file system. The Data Links File System Filter, which sits on top of the native file system, checks for a token in the file name. If a token is attached to the file name and no file with the tokenized name already exists (it is unlikely, but possible, that the tokenized file name exists as a normal or linked file), DLFF contacts one of the DLFM daemons (the Upcall daemon) and asks it to validate the token. DLFM then uses DB2 engine code to verify the token. According to the result of this verification, DLFM prepares and sends a response to DLFF.

This response can either be “allowed” or “not allowed”. Therefore, if DLFF receives a response of “allowed”, it calls the base file system operation and lets the operation complete. Otherwise, it returns an error to the application. DLFM returns “not allowed” in two cases:

► The user does not have the privilege to do the action.

► The request cannot be processed due to some error.

Important: In spite of the token being valid, the application still may receive an error while trying to access the READ PERMISSION DB file on the Data Links server. This may happen if the system clocks of the DB2 server and the Data Links server are not synchronized with each other.

When a file is linked under Data Links control with the READ PERMISSION DB attribute, DLFM changes its owner to the dlfm UID, which is unique to each Data Links server. DLFM also changes the permissions of this file to “read only” by the owner (that is, the dlfm UID). If the owner of the file had “execute permission” before linking, it is retained even after linking the file.

When a user tries to access a READ PERMISSION DB file without a token, DLFF returns a “permission denied” error message and does not allow the operation to go through. It recognizes these files by the owner (as described earlier, the owner of a READ PERMISSION DB file is the dlfm UID).
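The expiry arithmetic and the validation step can be sketched in a few lines of Python. This is a hypothetical illustration only: the actual token format and the MAC0 and MAC1 algorithms are not described here, so HMAC-SHA256, the token layout, and the function names below are all stand-ins chosen for the example.

```python
import hashlib
import hmac


def _mac(path: str, expiry: int, secret: bytes) -> str:
    """Stand-in message authentication code over the path and expiry time."""
    message = f"{path}|{expiry}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


def make_token(path: str, secret: bytes, now: int, expint: int) -> str:
    """Database side: expiry time = current time + expiry interval."""
    expiry = now + expint
    return f"{expiry:x}_{_mac(path, expiry, secret)}"


def validate_token(token: str, path: str, secret: bytes, now: int) -> bool:
    """File server side: check the MAC, then check the embedded expiry time."""
    expiry_hex, _, mac = token.partition("_")
    try:
        expiry = int(expiry_hex, 16)
    except ValueError:
        return False
    expected = _mac(path, expiry, secret)
    return hmac.compare_digest(mac.encode(), expected.encode()) and now <= expiry


secret = b"shared-secret"  # hypothetical secret known to both sides
token = make_token("/datalinks/celebs/images/am.jpg", secret, now=1000, expint=60)
print(validate_token(token, "/datalinks/celebs/images/am.jpg", secret, now=1030))
```

Because the expiry time is computed on one machine and checked on another, a file server clock that runs ahead by more than the expiry interval rejects a freshly generated token, which is the situation the Important note above warns about.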

2.4 Data Links on UNIX and Windows

In 2.1, “Overview of the Data Links architecture” on page 12, you saw that the Data Links implementation on AIX, Solaris, and Windows NT has the following components:

► DB2 server

► DB2 client

► Data Links File Manager

► Data Links File System Filter

Of these components, DLFM and DLFF are part of the Data Links Manager (DLM). This section discusses these components in detail. Then it describes what happens when a file is linked or unlinked by an application.

Figure 2-7 shows how one DB2 server can be registered with one or more Data Links servers and vice versa. There is a many-to-many (M:N) relationship between DB2 servers and Data Links servers.

Note: DB2 Logging Manager is a part of DLFM. Therefore, DLFM can be said to have two components:

► Logging Manager

► Daemon processes


Figure 2-7 Relationship between DB2 servers and Data Links servers (diagram labels: DB2 Server-1, DB2 Server-2, ... DB2 Server-M; Data Links Server-1 on AIX, Data Links Server-2 on Solaris, ... Data Links Server-N on Windows NT)

2.4.1 Data Links File Manager (DLFM)

DLFM is a sophisticated SQL application with a set of daemon processes that reside at a file server node. These processes work cooperatively with the host database servers to manage external files. DLFM receives DBMS calls to perform the link/unlink operations using unit of work consistency and maintains a list of all files under its control.

Unit of work consistency: This concept means that the link/unlink operation will be within the same commit scope as the other SQL performed in the same unit of work.

When a file is linked under the Data Links control in a database, the DLFM applies the constraints for referential integrity, access control, and backup and recovery as specified in the DATALINK column. If the DATALINK column specifies READ PERMISSION DB access, for example, the DLFM changes the owner of the file to the DBMS (a special user ID registered with DLFF on each Data Links server, known as dlfm UID) and marks the file “read only”, whenever any file is linked under it. All of these changes to the DLFM repository (the DLFM_DB database) and to the file system are applied as part of the same DBMS transaction as the initiating SQL statement. Therefore, if the transaction is rolled back, the changes made by the DLFM are undone as well.
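Unit of work consistency can be pictured as an undo log that travels with the transaction. The Python sketch below is a simplified model under stated assumptions: the class name is hypothetical, the set of linked files stands in for the DLFM repository, and no real file system or DB2 transaction is involved.

```python
class LinkTransaction:
    """Toy unit of work: link/unlink changes are undone if the transaction rolls back."""

    def __init__(self, linked_files: set):
        self._linked = linked_files  # shared state standing in for the DLFM repository
        self._undo = []              # undo actions, applied in reverse order on rollback

    def link(self, path: str) -> None:
        if path in self._linked:
            raise ValueError(f"{path} is already linked")
        self._linked.add(path)
        self._undo.append(lambda: self._linked.discard(path))

    def unlink(self, path: str) -> None:
        if path not in self._linked:
            raise KeyError(f"{path} is not linked")
        self._linked.remove(path)
        self._undo.append(lambda: self._linked.add(path))

    def commit(self) -> None:
        self._undo.clear()

    def rollback(self) -> None:
        while self._undo:
            self._undo.pop()()


linked = {"/datalinks/a.jpg"}
tx = LinkTransaction(linked)
tx.link("/datalinks/b.jpg")
tx.unlink("/datalinks/a.jpg")
tx.rollback()
print(sorted(linked))
```

After the rollback, the set of linked files is back to its state before the transaction started, mirroring the statement that the changes made by the DLFM are undone together with the initiating SQL.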

To reduce the number of messages between the database server and the DLFM, the DLFM maintains a set of meta-data on the file systems and the files that are under database control. Other information is stored by the DB2 Logging Manager in the DLFM_DB database.

DLFM handles interactions with the database backup/restore process. You can use the option for the DATALINK column to indicate whether a linked file should participate in point-in-time recovery of the DB2 table. DLFM can interface with Tivoli Storage Manager to take copies of the external files when the database backup occurs.

The actual copying is asynchronous to the database transaction triggering the backup of the file (via an SQL INSERT or UPDATE statement), since the external file size can be quite large (tens of megabytes). Once the backup of the file has completed, DLFM signals to the calling database that the backup can be marked as complete. The completion of the asynchronous copy operation is checked when the database server performs a backup of the SQL tables involving DATALINK columns.

The DLFM tracks different versions of a referenced file and maintains the backup status of each in order to support point-in-time recovery. The DBMS also provides the DLFM with a recovery ID (RECOV_ID) for a file, whenever it is linked or unlinked, to help synchronize recovery of files with data. This is important because a file with the same name but different content may be linked and unlinked several times. Without a separate recovery ID (RECOV_ID) for each link operation, DLFM would not be able to restore the file to match the database state.
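The role of the recovery ID can be shown with a small lookup. In this Python sketch the record layout, the integer recovery IDs, and the archive copy names are illustrative assumptions; the point is only that the file name alone does not identify a version, while the pair of file name and recovery ID does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical record of one link operation for a file."""
    name: str
    recov_id: int        # assumed to be unique per link operation
    archive_copy: str    # where the archived content of this version lives


def version_for_restore(versions, name: str, recov_id: int) -> Optional[FileVersion]:
    """Pick the archived version that matches the restored database state."""
    for version in versions:
        if version.name == name and version.recov_id == recov_id:
            return version
    return None


# The same file name was linked twice with different content.
history = [
    FileVersion("/datalinks/report.pdf", recov_id=101, archive_copy="archive/0001"),
    FileVersion("/datalinks/report.pdf", recov_id=205, archive_copy="archive/0002"),
]

match = version_for_restore(history, "/datalinks/report.pdf", 101)
print(match.archive_copy if match else "no matching version")
```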

Note: Chapter 6, “Using Tivoli Storage Manager” on page 107, discusses using Tivoli Storage Manager for external file backup.

Note: Chapter 11, “Recovery” on page 201, discusses backup and recovery in detail.

DB2 Logging Manager

This component maintains a database of information (or meta data) related to all the linked files. The name of the database in which all this information is maintained is DLFM_DB. It contains the following tables:

► DFM_DBID: This table stores the database registration information. This table has the following main columns:

– DBID: This field in this table represents the unique combination of the host database name, instance name, and host machine name. This is the primary key for this table.

– HOSTNAME: The hostname registered with the DLFM.

– DBINST: The DB2 instance on the HOSTNAME.

– DBNAME: The database name (containing tables with the DATALINK column) on HOSTNAME, with schema DBINST.

– XN_ID: Once the database DBNAME is dropped, the entry for that database is marked as deleted, and the transaction identifier of the transaction doing this is stored in XN_ID attribute.

► DFM_PRFX: This table stores the prefix registration information. It has the following main columns:

– PRFX_ID: A unique identifier that represents this prefix. This is the primary key of this table.

– PRFX_NAME: This attribute maintains the name of the prefix.

► DFM_ACCESS: This table is used for controlling user access to files. It also stores directory patterns in which a particular user can link/unlink files in a particular prefix (the mount point of the file system under Data Links control).

► DFM_RCFILE: This table maintains a list of files received from the host DB2 (the database having the DATALINK column) during the RECONCILE process. It contains the list of files that are in the DB2 host database table. All the reconcile instances (running on one or more DB2 databases having tables with DATALINK columns) share this table. The main columns in this table are:

– PID: Process ID of the reconcile child on dlfm.

– DBID: The identifier for database where reconcile is going on.

– PRFXID: The identifier for the file system (or the prefix) where the files are stored.

– STEMANAME: The name of the files.

► DFM_BOOT: This table maintains the boot information needed during the DLFM startup.

► DFM_GRP: This table consists of file group entries. Each group entry corresponds to a DATALINK column in an SQL table on the host database. An entry is created in this table when the first file reference is inserted in the corresponding DATALINK column.

Prefix: This is the mount point (or stub) where the DLFS is mounted.

► DFM_FILE: This is the most frequently accessed table. It holds information about the linked and unlinked files on the file server. A new entry is created in this table whenever a file is linked in a host database registered with this DLFM. The table retains unlinked file entries in case the files need to be restored in the future via the host database restore utility (for example, when the DATALINK column under which they were linked has the ON UNLINK RESTORE option set). If the ON UNLINK DELETE option was set instead, the table still retains the entry for the unlinked file until it is deleted by the Garbage Collector daemon. The columns of interest defined in this table are: DBID, FILENAME, TRANSACTION ID, RECOVERY ID, and FILE STATUS.

► DFM_XNSTATE: This table keeps track of all the active DLFM transactions. Transaction state is maintained for each transaction as long as it is active. The transaction state is first kept in an in-memory table when the transaction starts. The entry is inserted into the SQL table when the transaction begins the first phase of the commit processing. Once the transaction is completed, its entry is removed from the table.

► DFM_ARCHIVE: This table contains the file and group entries that need to be archived to the archive server. When the Load utility is used to insert a large number of datalink files into a DATALINK column on the host database, instead of replicating each file entry in the archive table, only a group entry is inserted.

The entry from the archive table (DFM_ARCHIVE) is processed to make a copy of one file or a set of files. The corresponding entry is deleted from this table once the copy is over.

► DFM_BACKUP: This table stores information regarding backups. A new sequence number is assigned to each new backup that is taken. This table is mainly used by the Garbage Collector daemon.

► DFM_DIR: This table maintains the directory hierarchy for each prefix (or mount point). This is used for disk crash recovery support on UNIX.

► DFM_URL: This table is required for the extension of Link Integrity+ (refer to 1.3.1, “Link Integrity+” on page 7, for additional information). It contains mapping for Web-root directory support. This table is currently not being used.

Note: The DB2 Load utility is capable of efficiently moving large quantities of data into newly created tables, or into tables that already contain data. The utility can handle all data types, including large objects (LOBs), user-defined types (UDTs), and DATALINKs. The Load utility is faster than the Import utility, because it writes formatted pages directly into the database, while the Import utility performs SQL INSERTs.

DLFM process model

The DLFM has a main daemon that spawns a child agent (or process) when a connect request from a DB2 agent is received. The child agent then establishes a connection with the requesting DB2 agent. This child agent will serve all subsequent requests from the same connection. DLFM’s main daemon waits for another connect request from the same or different host DB2.

Applications on the host DB2 side would establish separate connections with the DLFM. Therefore, they are served by separate child agents on the DLFM side. In addition to the child agent, DLFM provides several other services implemented as daemons and they are also spawned by the main DLFM daemon.
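The one-child-agent-per-connection pattern can be modeled as follows. This Python sketch is a deliberate simplification: the real DLFM spawns operating system processes and communicates over TCP/IP, whereas here plain objects stand in for the daemon and its child agents, and all class and method names are hypothetical.

```python
class ChildAgent:
    """Stands in for a DLFM child agent that serves one DB2 connection."""

    def __init__(self, connection_id: str):
        self.connection_id = connection_id
        self.handled = []

    def handle(self, request: str) -> str:
        self.handled.append(request)
        return f"{self.connection_id}: {request} done"


class MainDaemon:
    """Stands in for the main DLFM daemon that accepts connect requests."""

    def __init__(self):
        self.children = {}

    def connect(self, connection_id: str) -> ChildAgent:
        # A new child agent is created for every connect request.
        agent = ChildAgent(connection_id)
        self.children[connection_id] = agent
        return agent

    def request(self, connection_id: str, request: str) -> str:
        # All later requests on a connection go to the same child agent.
        return self.children[connection_id].handle(request)


daemon = MainDaemon()
daemon.connect("db2agent-1")
daemon.connect("db2agent-2")
print(daemon.request("db2agent-1", "link /datalinks/a.jpg"))
```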

The DLFM process model can be described from two angles:

► How DLFM processes interact with the DB2 server

Figure 2-8 shows how DB agents on the DB2 server interact with DLFM daemons.

Figure 2-8 DLFM process model: DB2 server (diagram labels: DB2 Server with db2agent processes and a database table holding an INT column and a DATALINK column, for example 1, http://sol-e/datalinks/images/abc.gif; SQL: SELECT, INSERT, DELETE; utilities: BACKUP/RESTORE, LOAD, IMPORT, EXPORT; TCP/IP; Data Links Manager with DLFM: DLFMD, DLFM_CHILD, Async Daemon)

► How DLFM processes interact with the rest of the Data Links components and the archive server

Figure 2-9 shows how DLFM daemons interact with other components of Data Links, like DLFF and DB2 Logging Manager. It also shows how these daemons interact with the archive server (Tivoli Storage Manager, for example).

Figure 2-9 DLFM process model: Data Links Manager (diagram labels: db2agent; TCP/IP; DLFMD; DLFM_CHILD processes; IPC; Metadata in DB2 tables; Garbage Collection Daemon; Define-group Daemon; Delete-group Daemon; Upcall Daemon; Change-Own Daemon; DLFF; Object Access/Integrity Subsystem; Streams driver, File System Driver, DMAPP; Native File System: JFS, Solaris, NTFS, DFS-DCE (AIX); Archive Subsystem: Copy Daemon, Retrieve Daemon; Archive Server: LOCAL DISK/TSM (ADSM)/XBSA)

Figure 2-10 gives the complete picture of the DLFM process model.

Figure 2-10 DLFM process model: Complete picture (diagram labels: Host DBM with SQL Appl. and DB agent; DLFMD; Child agent; DelGrpd; Chownd; Retrieved; Upcalld; GCd; Copyd; DLFF; DB2 Logging Manager; Archive Server)

The following sections describe the functionality and service provided by each of these daemons.

Delete Group daemon

Whenever a table is dropped on the host DB2 side, the corresponding file groups on the DLFM server, if any, must also be deleted. The DATALINK columns in the dropped table may reference many files, and all of those files need to be unlinked. During the forward progress of the transaction, the file groups are marked as deleted by the current transaction in the GROUP table. During prepare processing, the child agent notes the number of groups deleted by this transaction and records it with the transaction entry in the transaction table. The commit processing checks whether any groups were deleted (by checking the deleted group count in the transaction entry for the current transaction) and, if so, sends the transaction ID to the Delete Group daemon. Using the transaction ID, the Delete Group daemon finds all the groups deleted in this transaction and then unlinks all the files in each group.

The unlinking of the files by this daemon is asynchronous, and the commit processing for the dropped table does not wait for it to complete. Note that the group entry is not deleted until all the files in that group have been unlinked. As long as this transaction does not commit, the same file name is not allowed to be relinked. Therefore, if the DLFM fails before the Delete Group daemon has completed unlinking all the files from the deleted groups, then, after the DLFM restarts, the Delete Group daemon can still pick up all the committed transactions from the transaction table and resume its work.
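The hand-off from commit processing to the Delete Group daemon can be sketched with a simple queue. This Python model rests on several assumptions: dictionaries stand in for the GROUP and transaction tables, the daemon is run by an explicit call instead of as a separate process, and all names are hypothetical.

```python
from collections import deque


class DlfmModel:
    """Toy model of group deletion and asynchronous unlinking."""

    def __init__(self, groups):
        self.groups = {gid: set(files) for gid, files in groups.items()}
        self.deleted_by = {}    # group id -> transaction that marked it deleted
        self.pending = deque()  # transaction IDs handed to the Delete Group daemon

    def drop_table(self, txn_id, group_ids):
        # Forward progress: groups are only marked as deleted.
        for gid in group_ids:
            self.deleted_by[gid] = txn_id

    def commit(self, txn_id):
        # Commit hands the transaction ID over and does not wait for the unlinking.
        deleted_count = sum(1 for t in self.deleted_by.values() if t == txn_id)
        if deleted_count:
            self.pending.append(txn_id)

    def run_delete_group_daemon(self):
        unlinked = []
        while self.pending:
            txn_id = self.pending.popleft()
            for gid in [g for g, t in self.deleted_by.items() if t == txn_id]:
                files = self.groups[gid]
                while files:
                    unlinked.append(files.pop())
                # The group entry goes away only after all its files are unlinked.
                del self.groups[gid]
                del self.deleted_by[gid]
        return unlinked


dlfm = DlfmModel({"g1": {"/a", "/b"}, "g2": {"/c"}})
dlfm.drop_table(txn_id=7, group_ids=["g1"])
dlfm.commit(7)
print(sorted(dlfm.run_delete_group_daemon()))
```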

Garbage Collector daemon

The Garbage Collector daemon is another asynchronous process that cleans up the DLFM meta data. There are two types of cleanups:

► Cleanup triggered by the database backup

► Cleanup of the deleted groups whose lifetime has expired

The cleanup triggered by the database backup consists of cleaning up old backup entries according to the policy of keeping last N backups.

Backup entries older than the last N backups, together with the corresponding unlinked file entries in the FILE table, are removed by the Garbage Collector daemon. It also removes the copies of those files from the archive server.

The cleanup of the deleted groups is based on the expiry of the lifetime. Each deleted file group is assigned a life span. Once the lifetime expires, the Garbage Collector daemon removes those deleted file group entries as well as the associated unlinked file entries from the DLFM meta data tables. If archive copies associated with the unlinked file entries exist, they are also deleted from the archive server.

The Upcall daemon services requests from DLFF to determine if a file is in the linked state. If the file is linked, a user’s request to delete, rename, or move it via file system APIs is rejected by the DLFF. The main purpose of this daemon is to enforce referential integrity for the linked files.
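Both kinds of cleanup reduce to simple selections. The Python sketch below illustrates them under stated assumptions: backups are represented only by their sequence numbers, deleted groups by a mapping from group ID to expiry time, and the function names are hypothetical.

```python
def prune_backups(backup_sequence_numbers, keep_last_n):
    """Split backup entries into those kept (the last N) and those removed."""
    if keep_last_n < 1:
        raise ValueError("keep_last_n must be at least 1")
    ordered = sorted(backup_sequence_numbers)
    kept = ordered[-keep_last_n:]
    removed = ordered[:len(ordered) - len(kept)]
    return kept, removed


def expired_groups(deleted_groups, now):
    """Return the deleted groups whose lifetime has expired at time `now`."""
    return sorted(gid for gid, expires_at in deleted_groups.items() if expires_at <= now)


kept, removed = prune_backups([5, 1, 4, 2, 3], keep_last_n=3)
print(kept, removed)
print(expired_groups({"g1": 100, "g2": 200}, now=150))
```

In the real system, removing a backup entry also means removing the related unlinked file entries and their archive copies, which this sketch leaves out.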

Note: The value of N can be specified in the database configuration, by setting the NUM_DB_BACKUPS parameter.

Chown daemon

The Chown daemon is a special process whose effective user ID is root. The Chown daemon needs superuser privilege since it manipulates attributes (such as ownership, permissions, etc.) of the files belonging to different users. A child agent communicates with the Chown daemon whenever it needs to access the file information, such as the file system ID, the inode, the last modification time, the owner, the group, etc. During commit processing, the child agent sends a request with a file name to the Chown daemon either to take over the file (that is, change the owner and access permissions, if required) or to release the file to the file system (that is, restore the original owner and access permissions).

For example, suppose before linking a file under Data Links with the READ PERMISSION DB attribute, the attributes of the file look like the example in Figure 2-11.

Figure 2-11 Attributes before the link operation

After the link operation, the Chown daemon changes the owner and the permissions of the file. The attributes of the file (seen as root) would then look like the example in Figure 2-12.

Figure 2-12 Attributes after the link operation

Now if ON UNLINK RESTORE is one of the attributes of the DATALINK column, under which the “demofile” file is linked, then after the unlink operation, the Chown daemon restores the original attributes of the file (Figure 2-11).

Since the Chown daemon runs as superuser, it is important to guard against unauthorized requests. Therefore, the child agent must authenticate itself properly when it communicates with the Chown daemon.

Note: For READ PERMISSION DB files, Data Links implementation demands that the owner of the file (while it is under database control) should be changed to the dlfm UID (configured at installation time).
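The take-over and release behavior described for the Chown daemon can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class and method names, the `DLFM_UID` value, and the read-only mode `0o400` are hypothetical, and no real file is touched (a real implementation would need root privilege to change ownership).

```python
from dataclasses import dataclass

DLFM_UID = 220  # hypothetical value; the real dlfm UID is configured at installation time


@dataclass
class FileAttrs:
    owner_uid: int
    mode: int  # permission bits, for example 0o644


class ChownDaemonSketch:
    """Toy model of take-over and release; it only manipulates dictionaries."""

    def __init__(self):
        self.files = {}  # file name -> current attributes
        self.saved = {}  # file name -> attributes recorded before the link

    def take_over(self, name):
        attrs = self.files[name]
        self.saved[name] = FileAttrs(attrs.owner_uid, attrs.mode)
        # READ PERMISSION DB: the owner becomes dlfm and the file is made
        # read-only (0o400 is an assumed mode, not taken from the source).
        self.files[name] = FileAttrs(DLFM_UID, 0o400)

    def release(self, name):
        # ON UNLINK RESTORE: put back the owner and permissions saved at link time.
        self.files[name] = self.saved.pop(name)


daemon = ChownDaemonSketch()
daemon.files["demofile"] = FileAttrs(owner_uid=1001, mode=0o644)
daemon.take_over("demofile")
linked = daemon.files["demofile"]
daemon.release("demofile")
restored = daemon.files["demofile"]
```

In this model, `linked` carries the dlfm owner with no write bits set, and `restored` equals the attributes the file had before the link operation.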


Copy daemon

The Copy daemon is responsible for copying linked files from the file system to an archive server or to disk. When a file is linked, it is copied asynchronously by the Copy daemon if DLFM is responsible for restoring the file after a database restore.

Retrieve daemon

The Retrieve daemon is responsible for restoring files from the archive server or from disk. When the host database is restored to a point in the past, the file system state may be out of sync with the new database state. As part of re-synchronization, files are restored by the Retrieve daemon from the archive server, if necessary.

Upcall daemon

The purpose of the Upcall daemon is to interact with the DLFF. The DLFF requests some information from DLFM by making an upcall. The Upcall mechanism in UNIX uses a stream-based driver. In Windows NT, the Upcall mechanism uses asynchronous DeviceIoControl calls with buffered I/O.

2.4.2 Data Links File System Filter (DLFF)

The DLFF module supports the file system functionality required by Data Links. On UNIX platforms, it is implemented as a Virtual File System, which layers just above the native UNIX file system (JFS on AIX, UFS on Solaris, etc.). The approach is quite similar on Windows NT, except that the implementation is in the form of a filter file system driver layered over the native file system (NTFS).

The Data Links File System Filter enforces data integrity by making sure that no file interaction is allowed that is incompatible with the information in the database. It is designed to interfere as little as possible with the actual application. The DLFF is responsible for the interception of certain file system calls, such as file open, rename, and delete. Other file system calls, such as read and write, are simply passed on to the underlying native file system. When a DATALINK value is retrieved from the database (assuming that the read permission is controlled by the database), an access token is automatically generated. When an application attempts to open the file, the DLFF intercepts the file open call and ensures that the access token is valid, which means that the application may indeed open the file.

Virtual File System (VFS): This is an abstraction of a physical file system implementation. It provides a consistent interface to multiple file systems, both local and remote. This consistent interface allows the user to view the directory tree on the running system as a single entity even when the tree is made up of a number of diverse file system types. Therefore, the VFS interface (also known as the VNODE interface) provides a bridge between the physical file system (which manages storage of data) and the logical file system (which provides support for the system call interface).

If the token is invalid or has expired, DLFF returns an error to the application. If the token is valid, DLFF calls the base (native) file system’s operations to complete the user’s request. At this stage, the DLFF no longer interferes with any of the read or write operations on the file, leaving performance virtually unaffected. Because the DLFF must integrate with the file system, it is different for each platform. However, the API is consistent across the platforms, which lets you mix and match database and file server implementations.

File system operations intercepted by DLFF

This section discusses how DLFF handles some of the important file system operations.

Open file

When an open request arrives for a file (for example, by running cat on UNIX or type on Windows), DLFF looks for an embedded token in the file name. It does so by searching for the “;” character at a fixed location in the file name. Even if it finds “;” in the right position, it first checks whether a file with that exact name (including the apparent token) exists. If such a file exists, DLFF treats the file name as having no token. If not, DLFF assumes the embedded string to be a token generated by the host database and strips it from the file name. It then checks whether the stripped file name exists. If it does not exist, DLFF returns an error (ENOENT on UNIX). If the file exists, the token (along with the file name and some other information) is passed to the DLFM as a query to check its validity.

DLFM then checks the validity of the token and returns the result to DLFF. If the result is an error, DLFF returns it to the application (or user) who had requested the “open” operation. If DLFM says that the operation is allowed, DLFF changes the effective UID of the process (trying to open the file) to the dlfm UID (owner of READ PERMISSION DB files), and then calls the base file system “open” operation to complete the user’s request.

If the file name did not have a token embedded, the open request is passed directly to the base file system.

Note: Even though dlfm user is the owner and has read permission on the READ PERMISSION DB file, it needs a valid token to access it. This feature avoids security threats coming from NFS clients having a UID equivalent to the dlfm user.
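The open-file decision sequence can be sketched as a single function. This is an illustration only: the token layout (a fixed-length token followed by “;” and then the file name), the `TOKEN_LEN` value, the function name, and the use of EACCES for a rejected token are all assumptions, since the source specifies only that “;” sits at a fixed location and that ENOENT is returned for a missing file.

```python
import errno

TOKEN_LEN = 8  # hypothetical token length; the real token format is defined by DB2


def resolve_open(name, existing_files, token_is_valid):
    """Return (file_to_open, errno_code) following the open-file steps.

    existing_files: set of file names present in the directory.
    token_is_valid: callable(token, stripped_name) standing in for the DLFM query.
    """
    has_separator = len(name) > TOKEN_LEN and name[TOKEN_LEN] == ";"
    if not has_separator or name in existing_files:
        # No token, or a real file happens to carry this exact name:
        # the request goes straight to the base file system.
        return name, 0
    token, stripped = name[:TOKEN_LEN], name[TOKEN_LEN + 1:]
    if stripped not in existing_files:
        return None, errno.ENOENT
    if not token_is_valid(token, stripped):
        return None, errno.EACCES  # assumed error code for a rejected token
    return stripped, 0


files = {"demofile", "ABCDEFGH;literal"}
check = lambda token, name: token == "GOODTOKN"

ok = resolve_open("GOODTOKN;demofile", files, check)
bad_token = resolve_open("BADTOKEN;demofile", files, check)
missing = resolve_open("GOODTOKN;missing", files, check)
literal = resolve_open("ABCDEFGH;literal", files, check)
```

The `literal` case shows why the existence check comes first: a file whose real name contains “;” in the token position is opened under its own name rather than being mistaken for a tokenized request.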


Delete file

If DLFF receives a “delete file” request, it returns an error if the file is owned by the dlfm UID (for READ PERMISSION DB). If not, it contacts DLFM to check whether the corresponding file is linked under Data Links control. DLFM returns one of the results listed in Table 2-3.

Table 2-3 DLFM results and corresponding actions by DLFF

Result from DLFM | Action by DLFF
File under Data Links control | DLFF returns an error to signify unauthorized access
File not under Data Links control | DLFF passes the request to the base file system

Rename file

Rename file processing is similar to delete processing, except that in rename, the destination file is also checked to determine whether it is under Data Links control.

For example, a user wants to rename a file file1 to file2. In this case, DLFF checks the access permissions for file1 and checks that file2 (that is, the destination file) is not under Data Links control (to maintain referential integrity).

Rename directory

Renaming a directory is not allowed in DLFF.

Set permissions

No permissions can be set or modified on a file that is under WRITE PERMISSION BLOCKED control. If the owner of the file is the dlfm UID, the DLFF rejects any set/modify permission operation. Otherwise, the DLFF contacts the DLFM to check if the file is Data Linked with the WRITE PERMISSION BLOCKED attribute. If so, it disallows any setting of the “w” bit on the file.

Note: The superuser (root on UNIX and Administrator on Windows) can perform any operation on any file. DLFM is not contacted for any of the requests originating from the superuser (except for logging purposes).

How DLFF interacts with DLFM

DLFF can recognize files under READ PERMISSION DB control, since they are owned by a unique user identifier (dlfm UID). However, when a file is linked under some other Data Links control option, DLFF prevents the rename and delete operations on the file too. In addition, for the files linked under WRITE PERMISSION BLOCKED, DLFF does not allow anyone (other than the superuser) to enable “w” access to the file. In order to do so, DLFS needs to find out whether a given file is linked under Data Links control. Because this information is not directly available to the DLFS VFS, it is obtained by making an upcall from the kernel-mode VFS to the user-mode DLFM Upcall daemon.

Figure 2-13 shows how an application accesses files under Data Links control and how DLFF and DLFM interact with each other.

Figure 2-13 Overview of Data Links implementation

[Figure 2-13 labels. User level: Application, DLFM Upcall Daemon, DB2 Logging Manager. Kernel level: System Call Handler, File System Calls, Other Kernel Services, Logical File System, VFS/Vnode Interface, DLFS Kernel Extension (Streams Driver, DLFF VFS), Native VFS, NFS VFS, Other VFS, Disk Device Driver, Other Device Drivers, Disk.]

This section discusses the implementation of the upcall mechanism by which DLFF interacts with DLFM. This mechanism can be broken down into the following steps:

1. The DLFM Upcall daemon sleeps waiting for an upcall from DLFF.

2. The vnode operation for which an upcall needs to be made is invoked in the context of the thread that attempts to perform a rename, delete, or enable write permissions. Let us call this thread the “operation thread”.

3. DLFF makes an upcall by filling in a query buffer and waking up the DLFM Upcall daemon, and puts the operation thread to sleep waiting for the results of the upcall.

4. This results in a context switch to the DLFM Upcall daemon, which reads the query details and then queries the DB2 Logging Manager to service the upcall.

5. Once the results are available, the DLFM Upcall daemon sends them to the DLFF driver via a reply buffer.

6. At this stage, there is a context switch from the DLFM Upcall daemon back to the operation thread. The DLFM Upcall daemon is put to sleep waiting for the next upcall, while the operation thread wakes up, reads the results of the upcall from the reply buffer, and decides whether to allow the operation to proceed.

7. If another upcall needs to be made while the DLFM Upcall daemon is processing an upcall (for example, if a second rename request comes in before the results of an upcall made for an earlier rename are available), then the second upcall has to be queued until the DLFM Upcall daemon is ready to service it.

8. When the results of an upcall are available, there may be multiple operation threads that are waiting for the results of their requests. The mechanism ensures that the correct thread is woken up and is passed the results of the upcall that it had requested.
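The steps above can be modeled in user space with ordinary threads. The following is a sketch only, not the kernel implementation: a queue stands in for the stream-based driver, a per-request event stands in for the sleep and wake-up of the operation thread, and `LINKED_FILES`, `Upcall`, and `may_rename` are hypothetical names.

```python
import queue
import threading

LINKED_FILES = {"/data/report.pdf"}  # stand-in for the DLFM metadata lookup


class Upcall:
    def __init__(self, path):
        self.path = path                # query buffer
        self.is_linked = None           # reply buffer
        self.done = threading.Event()   # wakes the waiting operation thread


pending = queue.Queue()  # upcalls wait here while the daemon is busy (step 7)


def upcall_daemon():
    while True:
        call = pending.get()            # sleep until an upcall arrives (step 1)
        if call is None:                # shutdown marker used only by this demo
            break
        call.is_linked = call.path in LINKED_FILES   # service the query (step 4)
        call.done.set()                 # wake exactly the thread that asked (step 8)


def may_rename(path):
    """Runs in the operation thread: make an upcall and wait for the reply."""
    call = Upcall(path)
    pending.put(call)                   # fill the query buffer, wake the daemon (step 3)
    call.done.wait()                    # sleep until the reply buffer is filled
    return not call.is_linked           # decide whether the operation may proceed (step 6)


daemon = threading.Thread(target=upcall_daemon)
daemon.start()
results = {}
workers = [
    threading.Thread(target=lambda p=p: results.__setitem__(p, may_rename(p)))
    for p in ("/data/report.pdf", "/data/scratch.txt")
]
for w in workers:
    w.start()
for w in workers:
    w.join()
pending.put(None)
daemon.join()
```

Because each request carries its own event and reply field, a result can never be delivered to the wrong waiting thread, which is the property step 8 requires.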

2.4.3 Linking and unlinking files

Linking a file and unlinking it from database control are the two most frequent operations; they correspond to SQL INSERT and DELETE respectively. Whenever an application inserts a file reference into a DATALINK column, the corresponding file on the server is linked by the DLFM. Linking involves applying certain constraints on the file so that subsequent renaming and deletion of the referenced file via normal file system APIs (or commands) are prevented, which preserves referential integrity from the host database.

Furthermore, the access control mode of the DATALINK column determines the partial or full takeover of the file. In full access control (READ PERMISSION DB), the file ownership is changed to DB (to the dlfm user) and the file is marked read-only. Also an access token assigned by the host database is needed to access such a file.


All the files linked to the host database are guarded against move, delete, and rename operations by the DLFM and the DLFF. When a file is linked, the DLFM puts a new entry in the FILE table (dfm_file). This entry consists of (among other information):

• Database ID (DBID)
• Transaction ID (XN_ID)
• File name (STEMNAME)
• Recovery ID (RECOV_ID)

Recovery ID (RECOV_ID) generated at the host database consists of the database ID and a timestamp. It is guaranteed to be unique and monotonically increasing. For every link-file operation, the DLFM makes the following two checks:

1. If a link entry already exists for the same file in the DLFM meta-data table (dfm_file table), it rejects the link-file operation since the file is already in the linked state.

2. If an unlink entry exists for the same file in the DLFM table whose unlink transaction has not committed (that is, it is in the in-flight or in-doubt state), it rejects the link-file operation since the outcome of the unlink transaction is still unknown.

Figure 2-14 shows the steps followed by the DLFM when a link-file request is received.

Note: Superusers (root on UNIX and Administrator on Windows) can access files under full access control, without any token. They can perform any operation on these files.


Figure 2-14 Link-file operation

[Figure 2-14: flowchart. A LINK request is rejected if a link entry already exists in dfm_file, or if an unlink entry exists whose transaction has not committed. Otherwise, for link type READ PERMISSION DB and WRITE PERMISSION BLOCKED, the owner of the file is changed to the DLFM admin UID and the file is made read-only; for link type WRITE PERMISSION BLOCKED, the write permissions are removed from the file. An entry for the file is then put in the dfm_file table and the backup of the file is triggered.]

Now let us look at the complete sequence of steps followed by the various components of DB2 Data Links when a DB2 client issues an SQL INSERT to insert a DATALINK value. Figure 2-15 explains the control flow for the entire process.

Figure 2-15 Control flow of SQL insert statement

[Figure 2-15 components: DB2 Client; DB2 Server with DataLink Extensions (db2agents); Data Links Manager (DLFM daemons, DLFM_DB, DLFF). The numbered arrows in the figure correspond to the steps listed here.]

In Figure 2-15, the following actions occur:

1. The application issues an SQL INSERT involving a DATALINK column.

2. In the connect phase, the following process begins:

a. Based on the server name in the URL, the DB2 agent on the DB2 server issues a connect request to the corresponding DLFM.

If the DB2 agent is already connected to this DLFM, then no connect request is issued.

The DLFM checks whether this DB2 server is allowed to connect, depending on the existence of a corresponding entry in the dfm_dbid table.

Note: An entry in the dfm_dbid table is created when the database is registered with the DLFM using the dlfm add_db command.

b. The prefix name and the stem name are extracted from the URL.

An invariant unique identifier, prefix-ID, is returned for the prefix name. This is to support variances of file system with respect to mount points.

c. A sub-transaction between DB2 and the DLFM is started.

The DB2 server sends the transaction ID to the DLFM.

d. At this stage, the DB2 server instructs the DLFM to link the file.

e. The DLFM checks that the file exists and is not a symbolic link. It also makes some other checks as shown in Figure 2-14.

f. The DLFM inserts certain file meta data into DLFM_DB.

3. The application on the DB2 client issues an SQL COMMIT statement.

4. The DB2 server performs a two-phase commit with the DLFMs involved in this transaction. It involves the following steps:

a. The DB2 server sends out a prepare-to-commit the sub-transaction (first phase of the two-phase commit).

b. The DLFM hardens the meta data in the DLFM_DB database.

c. If all the DLFMs involved in this operation respond YES to prepare-to-commit, the DB2 server sends out the actual commit order to all the DLFMs (second phase of the two-phase commit). Otherwise, the user transaction is aborted.

At this stage, the sub-transaction is finally committed.

d. Constraints are applied to the file to support referential integrity. This may involve changing the attributes and ownership of the file.
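The two-phase commit in step 4 can be sketched from the point of view of the DB2 server acting as coordinator. This is a simplified model under stated assumptions: `DlfmStub` and `commit_host_transaction` are hypothetical names, each stub only records its state, and the sketch sends Abort to every participant rather than tracking which ones actually prepared.

```python
class DlfmStub:
    """Stand-in for one DLFM taking part in the host transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "active"

    def prepare(self, xn_id):
        if self.can_prepare:
            self.state = "prepared"   # metadata hardened in DLFM_DB (step 4b)
        return self.can_prepare

    def commit(self, xn_id):
        self.state = "committed"      # file takeover happens in this phase (step 4d)

    def abort(self, xn_id):
        self.state = "aborted"


def commit_host_transaction(xn_id, dlfms):
    """Phase 1: collect votes. Phase 2: commit only if every DLFM voted yes."""
    votes = [d.prepare(xn_id) for d in dlfms]
    if all(votes):
        for d in dlfms:
            d.commit(xn_id)
        return "committed"
    for d in dlfms:
        d.abort(xn_id)
    return "aborted"


good = [DlfmStub("dlfm1"), DlfmStub("dlfm2")]
outcome_ok = commit_host_transaction(101, good)

mixed = [DlfmStub("dlfm1"), DlfmStub("dlfm2", can_prepare=False)]
outcome_failed = commit_host_transaction(102, mixed)
```

In the second run, `dlfm1` prepares successfully but is still aborted, because a single negative vote decides the outcome of the whole user transaction.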

During an unlink-file operation, the DLFM marks the table entry for the file as unlinked and updates the unlink transaction ID and the unlink timestamp in the entry. At any given time, the DLFM FILE table (dfm_file) can have, at most, one linked entry for a given file. There can be multiple unlinked entries for a file because many successive link and unlink operations could have taken place for the same file.

The unlinked entry is used in the coordinated backup-and-restore operation to identify the correct version of the file from the archive server, if needed. In this case, the unlinked file entry is later removed by the Garbage Collector daemon (described in “DLFM process model” on page 38) when it is no longer needed. If file recovery is not needed, the unlinked entry is deleted in the second phase of the commit processing.

Note that the entry is not deleted sooner than in the second phase of commit since it would not be possible to undo the action, if the transaction’s outcome is aborted after the first phase.

During the link-file operation, file-entry checking and insertion must be an atomic operation. Otherwise, there is a small time window during which two DLFM agents could both check for, and not find, the linked entry for a file and then proceed to insert the two linked entries for the same file.


To enforce the atomicity of the link operation, a unique index on the file name column and a new check-flag attribute is defined. During the link-file operation, the check-flag is set to zero, and during the unlink-file operation, the check-flag is set to the recovery ID provided by the host database. Because recovery IDs are unique, this index prevents two linked entries from being inserted for the same file but allows multiple unlinked entries for it.
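The effect of the unique index can be demonstrated with an in-memory SQLite database. This is a sketch, not the DLFM schema: the three-column table, the index name `one_linked`, and the use of plain integers as recovery IDs are assumptions made for illustration (a real recovery ID combines the database ID and a timestamp).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE dfm_file (stemname TEXT, check_flag INTEGER, xn_id INTEGER)")
db.execute("CREATE UNIQUE INDEX one_linked ON dfm_file (stemname, check_flag)")


def link(stemname, xn_id):
    # A linked entry always carries check_flag = 0, so a second linked
    # entry for the same file violates the unique index.
    db.execute("INSERT INTO dfm_file VALUES (?, 0, ?)", (stemname, xn_id))


def unlink(stemname, recov_id):
    # Unlinking replaces 0 with the unique recovery ID, which frees the
    # (stemname, 0) slot while keeping the old entry for recovery.
    db.execute(
        "UPDATE dfm_file SET check_flag = ? WHERE stemname = ? AND check_flag = 0",
        (recov_id, stemname),
    )


link("demofile", xn_id=1)
try:
    link("demofile", xn_id=2)       # a concurrent or repeated link attempt
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

unlink("demofile", recov_id=1001)
link("demofile", xn_id=3)           # linking again is allowed after the unlink
unlink("demofile", recov_id=1002)
entries = db.execute("SELECT check_flag FROM dfm_file ORDER BY check_flag").fetchall()
```

The database, not the agent code, enforces the check-and-insert atomically, so two agents racing to link the same file cannot both succeed.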

Figure 2-16 shows the unlink process.

Figure 2-16 Unlink process

[Figure 2-16: flowchart. On an UNLINK request, the table entry is marked as “unlinked” and the unlink transaction ID and timestamp of the entry are updated. If ON UNLINK RESTORE is specified, the original permissions of the file are restored; otherwise, the file is deleted.]

During the forward progress of a transaction, DLFM manipulates the entries in the FILE table as per link/unlink file operations. If the transaction needs to be rolled back, DLFM uses the recovery mechanism provided by the local database to undo the actions of the transaction. The file server, on the other hand, does not support transactional semantics in general. Therefore, the actual takeover or release of the file from the file system is done during the second phase of the two-phase commit process by the Chown daemon (as described in “DLFM process model” on page 38).

DLFM also supports the unlinking of a file from one DATALINK column and the re-linking of the same file to another DATALINK column within the same transaction. This is an important customer requirement where current and old versions of the file are maintained in separate tables.

When an error occurs during regular link or unlink processing, DLFM reports the error status to the host database, which results in either statement-level (savepoint) or transaction-level rollback at the host database. If a link or unlink file request is initiated by a savepoint rollback at the host database, then any error reported by the DLFM local database results in rolling back the entire transaction at the host database. This is because DLFM treats the local database as a “black box” and it is not possible to roll back a rollback.

In addition, if a severe error, such as deadlock, occurs in the local database, the host database rolls back the full transaction. This is because the current transaction has already been rolled back in the local database. Also since DLFM does not write recovery log records for its own link and unlink file operations, it is not possible to do a database-style rollback. In the design, undoing a link (or unlink) file operation is done by sending the DLFM another link (or unlink) file request but with a special in_backout flag set to true. For a link file request with in_backout set, DLFM deletes the linked file entry that was inserted by the current transaction. For an unlink request with the flag set, the unlinked file entry is restored back to linked state.
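The in_backout behavior can be sketched with a list of dictionaries standing in for the FILE table. Only the `in_backout` flag comes from the text; the function names, the entry layout, and the `unlink_xn_id` field are hypothetical.

```python
def link_file(entries, name, xn_id, in_backout=False):
    if in_backout:
        # Undo a link: delete the linked entry inserted by this transaction.
        entries[:] = [
            e for e in entries
            if not (e["name"] == name and e["state"] == "linked" and e["xn_id"] == xn_id)
        ]
    else:
        entries.append({"name": name, "state": "linked", "xn_id": xn_id})


def unlink_file(entries, name, xn_id, in_backout=False):
    for e in entries:
        if e["name"] != name:
            continue
        if in_backout and e["state"] == "unlinked" and e.get("unlink_xn_id") == xn_id:
            # Undo an unlink: restore the entry to the linked state.
            e["state"] = "linked"
            del e["unlink_xn_id"]
        elif not in_backout and e["state"] == "linked":
            e["state"] = "unlinked"
            e["unlink_xn_id"] = xn_id


table = []
link_file(table, "file1", xn_id=1)
link_file(table, "file1", xn_id=1, in_backout=True)    # savepoint rollback of the link
after_link_backout = list(table)

link_file(table, "file2", xn_id=2)
unlink_file(table, "file2", xn_id=3)
unlink_file(table, "file2", xn_id=3, in_backout=True)  # savepoint rollback of the unlink
```

Each undo is an ordinary forward request carrying the flag, which is why no separate recovery log is needed for link and unlink operations.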

2.4.4 Transaction support

When a new transaction is started by the application, the host database assigns a new transaction ID. In the case of an XA transaction, the host database also generates a local transaction ID that is different from the global XA transaction ID.


A transaction ID is associated with a particular database so that there is no problem with transaction ID being the same between different databases. The transaction ID generated at a specific database is guaranteed to be monotonically increasing, which is absolutely essential. This ID is passed to the DLFM on each API invocation. The DLFM associates the transaction ID with each operation that changes the DLFM meta data and state. The reason is that the DLFM does not have logging services of its own, but uses a local database for persistence and logging (DB2 Logging Manager). By associating the transaction ID with the operation, and storing them in the database tables, it can relate the actions performed by a particular transaction. This is important because:

• The actions done by a DLFM for a particular sub-transaction may need to be undone if the host transaction aborts after the sub-transaction completes the prepare phase (that is, the first phase of the two-phase commit protocol) in the DLFM. The DLFM therefore records the transaction ID as persistent information, along with other information, in the FILE table, and entries associated with a transaction are identified by this ID during commit processing.

• Certain actions on the file system have to be performed during the second phase of the two-phase commit processing of the transaction.

DLFM uses the two-phase commit protocol to enforce the transactional semantics. Four APIs are provided by the DLFM for this purpose:

• Begin Transaction
• Prepare
• Commit
• Abort

A sub-transaction starts when the host database makes a Begin Transaction API call to a DLFM.

Note: The X/Open Distributed Transaction Processing (DTP) standard is an open standard for OnLine Transaction Processing (OLTP). DTP can be described as middleware that allows an application (possibly transaction oriented) to be distributed across multiple machines in a heterogeneous environment. DTP comprises the Application, the Resource Manager, and the Transaction Manager. XA is the protocol between the Resource Manager and the Transaction Manager defined by the DTP standard. For details on this, refer to Chapter 10 in the DB2 Administration Guide, which is available online at: http://www.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v7pubs.d2w/en_main


The transaction ID (XN_ID) generated at the host database is passed along with the Begin Transaction call. All subsequent API calls by the host database within the same transaction for linking and unlinking files are tagged with the same transaction ID and are processed within the same transaction context by the DLFM. Once all operations are done under the present transaction, the host database sends a Prepare request to the DLFM as part of its commit processing. Prepare request processing on the DLFM ensures that all the operations on the file server are made persistent by issuing an SQL commit to the local database.

A separate transaction table is used for keeping the transaction ID, its state, and other related information. The entry for the current transaction is not inserted into the transaction table until the Prepare request for the transaction has arrived. After the Prepare request has completed successfully on all DLFMs, the host database sends a Commit transaction request to the DLFMs. On the other hand, if the Prepare request fails, an Abort request is sent to the DLFMs.

Normally the Prepare, Commit, and Abort APIs are invoked by the host database as part of an application’s SQL COMMIT. If the transaction is a branch of a global (distributed) transaction, the Prepare request to the DLFM is invoked as part of the global prepare processing and the Commit/Abort request is invoked only when the outcome of the global transaction is known.

It is assumed that the commit transaction processing should not fail on the DLFM side if the Prepare transaction processing is successful. But that is not always true because there is a major difference between the database’s SQL commit processing and the DLFM’s commit processing (refer to Figure 2-17).

Note: It is possible that files may be linked or unlinked to multiple DLFMs in a given host database transaction. This implies that a host DB2 transaction may involve sub-transactions on multiple DLFMs. For sake of clarity, we restrict the discussion of the transaction management to only one DLFM.

Important: When multiple DLFMs are involved in a transaction, if one of the DLFMs fails to prepare the transaction, the host database sends an Abort request to all the remaining DLFMs, even though they may have prepared successfully.


Figure 2-17 Commit processing transactions

[Figure 2-17 contents, shown as two timelines]

SQL Transaction (Txn):
Update R1 | Update R2 | Prepare Txn. | Commit/Abort Txn.
Write R1 log | Write R2 log | Force logs | Release locks

DLFM Transaction:
link file1 | link file2 | Prepare Txn. | Commit/Abort Txn.
DLFM: sql insert | sql insert | insert/commit | del/upt/commit
DB: write log | write log | force logs | log/rel. locks

The SQL COMMIT processing does not acquire any new locks. It, in fact, releases all the locks acquired by the present transaction. On the other hand, the DLFM uses the SQL interface to update the meta data and its state stored in its local database during commit processing. For a Commit request, for example, DLFM retrieves entries from the FILE table and deletes an entry from the TRANSACTION table. This, in turn, requires additional locks to be acquired by the DLFM. Since deadlocks are always possible when new locks are acquired, retry logic is included in the commit processing, and it keeps retrying until it succeeds. However, if a deadlock occurs among committing or aborting transactions, retry does not break the deadlock. In our case, deadlocks have been found to occur between a committing transaction and one of the DLFM daemons, but not between two or more committing or aborting transactions. This is because table entries inserted or updated by two concurrent transactions are always disjoint.

Note: This is enforced by the corresponding locking of the host database.

Therefore, this retry logic can resolve deadlocks formed in the DLFM commit/abort processing.
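The retry logic can be sketched as a simple loop. This is an illustration only: `DeadlockError`, `commit_with_retry`, and the simulated `flaky_commit_step` are hypothetical, and a real implementation would react to the deadlock error reported by the local database.

```python
class DeadlockError(Exception):
    """Hypothetical error standing in for a deadlock reported by the local database."""


def commit_with_retry(do_commit_work):
    """Repeat the commit-phase work until it succeeds; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            do_commit_work()
            return attempts
        except DeadlockError:
            # The local database has rolled back this attempt; try again.
            continue


failures_left = [2]  # simulate two deadlocks with a DLFM daemon, then success


def flaky_commit_step():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise DeadlockError()


attempts_used = commit_with_retry(flaky_commit_step)
```

An unbounded loop is acceptable here only because, as the text explains, the deadlock partner is a DLFM daemon rather than another committing transaction, so a later attempt can get through.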

During a Prepare transaction processing, the DLFM inserts an entry into the TRANSACTION table and marks the transaction as prepared. If the DLFM fails after the transaction is prepared, that transaction remains in an in-doubt state. It is the host database’s responsibility to resolve the in-doubt transactions with the DLFM. Either the host database restart processing does it, or if the DLFM is unavailable at the restart, the host database spawns a daemon whose sole task is to poll the DLFM periodically and resolve the in-doubts when the DLFM is up. In-doubt transactions are resolved based on the outcome of the parent transactions in the host database.

2.5 Data Links on DCE-DFS

Section 2.4, “Data Links on UNIX and Windows” on page 33, showed you how Data Links is implemented on UNIX and Windows NT. This section discusses Data Links implementation on DCE-DFS.

A DCE-DFS setup can involve a network of multiple nodes organized into separate cells or administrative domains, presenting a uniform location transparent file system namespace to any file system client within the environment. Data Links has been implemented for a single cell environment in DCE-DFS. A typical setup contains a few file server machines and several file system client machines.

Unlike the Data Links implementation on UNIX and Windows, where Data Links consists only of a server, Data Links in a DCE-DFS environment consists of both a server and a client. Every file server (where Data Linked files are stored) has a Data Links server on it, and every DFS client node from which READ PERMISSION DB Data Linked files have to be accessed has a Data Links client on it.

Data Links in the DCE-DFS environment has the following components:

• Data Links File Manager (DLFM)
• Data Manager Application (DMAPP)
• Data Links File System Cache Manager (DLFS-CM)

DLFM and DMAPP are part of the Data Links server. DLFS-CM is the Data Links client, which is also known as the DFS Client Enabler for Data Links.

Let us now look at the role of each of these components in detail.


2.5.1 Data Links File Manager (DLFM)

As on UNIX and Windows, DLFM has two components:

• DLFM daemons

The process model of these daemons is similar to the one described in “DLFM process model” on page 38.

• DB2 Logging Manager

DB2 Logging Manager maintains all the information regarding Data Linked files in the tables of a database of its own (DLFM_DB). There may be more than one DLFM in a cell, but there is only one DB2 Logging Manager (that is, a single DLFM_DB database in the entire cell). Therefore, if any DLFM daemons have to access information from the Logging Manager, they do so by connecting to the node that has the DLFM_DB database (refer to Figure 2-18).

Note: Complete location transparency in DCE-DFS is implemented at the cell level: within a given cell, a portion of the namespace can be on a fileset that resides on any file server within the cell, without the user needing to be aware of it. In fact, the fileset can have read-only replicas on different file servers, and a fileset can even be migrated from one file server to another, transparently to the user, while it is in use by a file system client. For this reason, metadata (that is, information about linked files maintained by the Data Links Manager) is maintained in a per-cell common database, rather than separately for each file server.

Figure 2-18 DLFMs in a single DCE cell

2.5.2 Data Manager Application (DMAPP)

The core functionality of DLFS on the DFS File Server is achieved by DMAPP running in user mode on the file server to intercept events (data manager events) that are generated for various file system calls and provide the extra access control and referential integrity features for files linked under Data Links. The design of the DMAPP component is based on the DFS Storage Management Toolkit (SMT), an implementation of the Data Management Application Programming Interface (DMAPI) for DCE-DFS. DMAPI is a standard for a user level programming interface to implement logical extensions of the operating system for supporting data management applications, which need to intercept file system operations in a manner that is transparent to file system applications and users. DFS SMT also implements some extensions to DMAPI to support DFS-specific aspects including security aspects and file set-level operation notifications among others.

Figure 2-18 (diagram labels): Within a single DCE cell, Node 1 runs a DLFM server and the DB2 Logging Manager (DLFM_DB); Node 2 through Node n each run a DLFM server and a DB2 client.

With this approach, DMAPP is implemented as a user-level data management application that receives, and can respond to, events corresponding to file system operations in a transparent manner, both for accesses via the file-exporter (through the DFS path) and via the local OS (through the local mount path).

This application can also invoke callbacks into the file system, through certain specific DM API calls, as required for performing Data Links specific processing. In addition, there are some DM API calls that are used for determining the security context of the caller (which is required for recognizing privileged users). Upcalls to DLFM are handled through a suitable user-level IPC mechanism with DLFM, since both applications are in user mode.

DMAPP process model

The main application runs as a single thread, waiting for events. As events are received, the DM application spawns a thread to handle each one. At boot time, DMAPP is started and initialized for the aggregates on the file server host machine. A DM aggregate is an LFS aggregate on the server machine that has been converted to the DMLFS type by the dmaggr command, which activates the DFS SMT on it. Whenever a DM aggregate on the DFS file server is exported into the DCE namespace, a mount event is generated. At this point, the events to be intercepted for the aggregate are set, so that all future operations on files under that portion of the file server namespace are redirected to the DMAPP.

The DMAPP performs some pre-processing if required (depending on the operation requested), and then passes the operation on to the underlying file system to do the rest of the job (by responding to the events appropriately). Then, if required, the DMAPP does some post-processing on the results of the operation and also does some logging, mainly for disk-crash utility purposes. In some cases, however, when the DMAPP needs to disallow the operation (to provide Data Links functionality), it may directly return an error status instead of passing the operation on to the next layer. Because the mount event is intercepted for every aggregate that is exported, and all operations on file sets in that aggregate are intercepted from then on, DB-linked files are not accessible unless DMAPP can intercept the request. The events for an aggregate are reset on receiving an unexport request. Figure 2-19 shows the DMAPP implementation and how it interacts with SMT and DLFM.
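The process model and the pre-processing and post-processing steps described above can be sketched in Python. This is only an illustrative model and does not use the DFS SMT or DMAPI interfaces: the event tuples, the LINKED_FILES set, the status strings, and the rule that removing a linked file is denied are all assumptions made for the example.

```python
import queue
import threading

# Hypothetical stand-in for what DMAPP would learn from DLFM over IPC.
LINKED_FILES = {"/dfs/projects/plan.doc"}


def handle_event(event, results, lock):
    """Handle one file system event: pre-process, then deny or pass on."""
    operation, path = event
    # Pre-processing: decide whether Data Links rules forbid the operation.
    if operation == "remove" and path in LINKED_FILES:
        status = "DENIED"  # error returned directly, file system not called
    else:
        status = "PASSED"  # operation handed to the underlying file system
    # Post-processing: record the outcome (stands in for logging).
    with lock:
        results.append((operation, path, status))


def run_dmapp(events):
    """Main thread waits for events and spawns one worker thread per event."""
    pending = queue.Queue()
    for event in events:
        pending.put(event)
    pending.put(None)  # sentinel: no more events in this sketch

    results, lock, workers = [], threading.Lock(), []
    while True:
        event = pending.get()  # blocks until an event arrives
        if event is None:
            break
        worker = threading.Thread(target=handle_event, args=(event, results, lock))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()
    return sorted(results)
```

Calling run_dmapp with a remove event for the linked file and a read event for the same file yields one DENIED and one PASSED entry; the one-thread-per-event structure mirrors the text, while a production design might prefer a bounded worker pool.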

Note: The file-exporter (also known as the DFS exporter) is the interface through which applications on the DFS client access the files on the DFS server.

Figure 2-19 The DMAPP implementation

Figure 2-19 (diagram labels): DB2 application, DB2 server, DFS client, DFS server, DFS exporter, system call handler, logical file system, VFS/Vnode interface (NFS VFS, CDFS VFS, other VFS, LFS, DMLFS VFS), VFS+, DMLFS kernel, DM aggregate, DFS Storage Management Toolkit (SMT), DMAPP, DLFM, disk device driver, and other device drivers. The diagram separates user mode from kernel mode, marks the DFS mount path and the local mount path, shows DMAPP and DLFM communicating through IPC, and distinguishes Data Links components from DB2 components. The numbers (1) to (5) in the diagram correspond to the steps below.

In Figure 2-19, the implementation process flows like this:

1. The DB2 application connects to the DB2 server database and obtains the DATALINK value of the file.

2. The application then accesses the file either through the DFS mount path or through the local mount path.

3. The DMLFS aggregate generates events corresponding to the request from the application and passes them to DFS SMT.

4. DFS SMT passes on these events to the DMAPP.

5. The DMAPP interacts with DLFM (if required) and determines whether the request has to be allowed.

2.5.3 Data Links File System Cache Manager (DLFS-CM)

To improve performance, DFS implements enhanced caching at the DFS client. It is essential that users cannot access READ PERMISSION DB files without a valid token. If no Data Links component on the DFS client could intercept a user’s request, anybody could access READ PERMISSION DB files from the cache without a token.

The Data Links component that is required for this is known as the Data Links File System Cache Manager (DLFS-CM), also known as the DFS Client Enabler for Data Links. DLFS-CM is a kernel-level file system filter (like DLFF in the Data Links implementation on UNIX and Windows) that sits on top of the DFS Client Cache Manager and filters some operations for proper Data Links functionality.

Therefore, the main responsibility of DLFS-CM is to provide support for specialized database access control through encrypted tokens embedded in the file name, for READ PERMISSION DB files. The main functionality of DMAPP is to provide referential integrity.

Figure 2-20 explains the complete architecture of Data Links on DCE-DFS.

Figure 2-20 Data Links architecture on DCE-DFS

In Figure 2-20, the architecture flows as explained here:

1. The application on DFS client accesses the file under the Data Links enabled file set.

2. The logical file system layer passes the request to the DLFS-CM VFS.

3. DLFS-CM further passes the request to the base (DFS client VFS).

4. The DFS Client VFS interacts with the DFS file-exporter (the interface between the DFS client and DFS server) and sends a request to access the file. Finally, DMAPP is informed about the request through events; it interacts with DLFM and authenticates the request (as discussed for Figure 2-19 on page 61).

Figure 2-20 (diagram labels): On the DFS client (DLFS enabled): the DB2 client application in user mode, with a connection to DB2; the logical file system, JFS, the DLFS-CM VFS, the DFS client VFS and VFS+, and the disk driver in the kernel. On the DFS file server (DLFS-enabled filesets): the logical file system, VFS, VFS+, the DFS file exporter, DM LFS with the DMAPI kernel extension, DFS-LFS, the Data Manager Application (DMAPP), and the DLFM upcall daemon, connected through events, API calls, and IPC. The legend distinguishes Data Links components from DCE-DFS components. The numbers (1) to (4) in the diagram correspond to the steps above.

Chapter 3. Application development

This chapter discusses how application programs interact with the Data Links File Manager, and how to plan and create those applications. It begins with a discussion of what types of applications might benefit from the use of Data Links. Next it discusses how Data Links gives you the ability to apply advanced database programming concepts to an environment based on files, while at the same time allowing the use of traditional file-based APIs. It also compares the pros and cons of using large objects (LOBs) with Data Linked files and then discusses the day-to-day tasks that an application developer must deal with when using files managed by Data Links.

3.1 Choosing suitable applications for using Data Links

Any application that creates or retrieves large numbers of files is an ideal candidate to use Data Links. The content of the files can be anything: MP3 audio, video clips, images of any kind, engineering drawings, HTML files, word processing documents, etc. Data such as these are, in many cases, not stored in a database. Many existing applications store data in files for the performance benefits of direct file access without the overhead of a DBMS. Data Links provides the benefits of a database to externally stored files (security, integrity, transactional consistency, recoverability) without impacting the performance of applications that access those files.

Some examples of industries that use large numbers of files that could be managed with Data Links are:

• Aerospace and automotive engineering: Three-dimensional part geometry files
• Banking: Check images
• E-commerce: HTML files and images of products
• Bio-technology: Genetic sequences stored in files
• Music industry: Downloadable MP3 and WAV files
• Insurance: Insurance policies
• Internet services: E-mail messages, HTML files, downloadable software
• Medical industry: X-ray images

3.2 Transactional semantics for files in the application

Data Links applies many of the benefits of relational database technology to the management of external files. The concepts of access control, referential integrity, and backup and recovery can now be applied to external files as well as to data stored in a database. This helps to ensure that all of the data used by the enterprise is secure, up to date, and recoverable. Data Links provides these capabilities by storing information about the files being managed in its own DB2 database (DLFM_DB). The information kept about the files depends on the options chosen when defining the corresponding DATALINK columns.

When a file is linked, Data Links enforces referential constraints (if they have been defined) by validating that the file exists on the specified server and in the specified location. If the DATALINK column has also been defined as recoverable (RECOVERY YES option), Data Links makes a backup copy of the file and records in the DLFM_DB database the attributes of the file, such as when the file was linked, the owner of the file, and the file access permissions. This makes it possible to perform a coordinated point-in-time recovery of a DB2 database and the external files linked to the database. If any of the options to apply database access controls to the file have been selected, Data Links changes the owner of the file or the file access permissions.

All of the actions performed by Data Links (making the backup copy, changing the owner and access permissions, and recording file attributes in DLFM_DB) occur within the scope of a database transaction. If an SQL statement that links or unlinks a file fails, or the transaction is rolled back for any other reason, the changes made by Data Links and the information recorded in the DLFM_DB database are also rolled back. This ensures that the state of a DB2 database and its linked files is always consistent.
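The rollback behavior described above can be illustrated with a small undo-log model in Python. This is a sketch of the idea only, not how DLFM is implemented: the LinkTransaction class, the dictionaries standing in for the file system and DLFM_DB, and the permission strings are hypothetical. The owner name dlfm and the read-only permissions follow the READ PERMISSION DB behavior described elsewhere in this chapter.

```python
class LinkTransaction:
    """Toy model: every Data Links action registers a matching undo step."""

    def __init__(self, file_state, dlfm_db):
        self.file_state = file_state  # path -> {"owner": ..., "mode": ...}
        self.dlfm_db = dlfm_db        # path -> attributes recorded at link time
        self._undo = []

    def link(self, path):
        original = dict(self.file_state[path])
        # Record the original attributes, as DLFM_DB would.
        self.dlfm_db[path] = original
        self._undo.append(lambda: self.dlfm_db.pop(path, None))
        # Take over the file: change the owner and make it read-only.
        self.file_state[path] = {"owner": "dlfm", "mode": "r--r--r--"}
        self._undo.append(lambda: self.file_state.__setitem__(path, original))

    def rollback(self):
        # Undo the recorded actions in reverse order.
        while self._undo:
            self._undo.pop()()

    def commit(self):
        # Once committed, the changes can no longer be undone.
        self._undo.clear()
```

After link followed by rollback, both dictionaries return to their starting contents, which is the consistency property the text describes.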

3.3 Data Links versus LOBs

This section discusses the pros and cons of using large objects (LOBs) compared with Data Linked files.

3.3.1 Using LOBs

DB2 provides the ability to store large character or binary strings in what is generically referred to as a large object (LOB). Character strings can be stored in a Character Large Object (CLOB), while graphic or binary strings can be stored in a Binary Large Object (BLOB). Data requiring double-byte character sets, such as documents written in Kanji, can be stored in Double-Byte Character Large Objects (DBCLOBs).

LOBs can be used to store the actual content of files inside of a DB2 database. The advantage of doing this is that the data stored in an LOB can be recoverable; it is backed up when the db2 backup database command is run. This can also be a negative, because every time the database backup is performed, all of the LOB data must be written to the backup. If the volume of data stored in LOB columns is large, the database backup can take a long time to complete, as well as require a large amount of storage media to hold the output of the backup.

To be recoverable, the LOB column must be defined as LOGGED. This means that when the LOB column is populated, modified, or deleted, its data is written to the DB2 log file. Because DB2 logging is a serialized process, meaning that only one process at a time can write data to the log file while others wait, performance can suffer, particularly when logging very large LOBs. The performance impact due to logging activity for LOB columns can be avoided by defining the LOB column as NOT LOGGED, but doing so makes the data stored in the LOB column non-recoverable.

Another advantage of using the LOB data types is that access to the data can be controlled using table access privileges, for example, SQL GRANTs. Any user who has not been granted the SELECT privilege cannot read the data stored in an LOB column. Users must also be specifically given the privilege to issue the SQL INSERT, UPDATE, or DELETE statements.

Application programs use file reference variables to transfer data between an external file and an LOB. The file reference variable does not contain the data, but points to the data with a path name and file name. Data is inserted into an LOB column using the SQL INSERT statement with a file reference variable which points to an input file containing the data. Data is copied from an LOB column into an output file using the SQL SELECT statement with a file reference variable pointing to the output file. When using a file reference variable, the entire content of the LOB or the input file is copied. If the amount of data moving between the file and the LOB column is large, performance can suffer.

3.3.2 Using Data Links

Data Links provides many of the benefits of LOBs without the drawbacks. Files that are linked to a database can be recoverable (when the DATALINK column is defined using the RECOVERY YES option). When a file is linked, the file data is not written to the DB2 log file, as with LOBs; instead, a backup copy of the file is made and stored in a separate file system or directory. The backup copy is made asynchronously, so the application program that linked the file does not have to wait for it to be created. In addition, the files are backed up only once, when they are linked. This means that the db2 backup database command does not back up the same files repeatedly, so performance is improved and the amount of media required to store the output of the backup is reduced.
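The difference in backup volume can be made concrete with a little arithmetic. The Python sketch below uses hypothetical figures (200 GB of file content and 52 weekly backups) and assumes the content does not change during the period; it is an illustration of the reasoning, not a sizing tool.

```python
def lob_backup_volume_gb(lob_data_gb, number_of_backups):
    """LOB data is written in full by every database backup."""
    return lob_data_gb * number_of_backups


def linked_file_backup_volume_gb(file_data_gb):
    """Each linked file is copied once, when it is linked."""
    return file_data_gb


# Hypothetical workload: 200 GB of content, one backup a week for a year.
print(lob_backup_volume_gb(200, 52))      # 10400
print(linked_file_backup_volume_gb(200))  # 200
```

Under these assumptions the LOB approach writes 52 times as much backup data. Files that are replaced by new linked versions would add one further copy per version.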

Data Links also provides the flexibility to control access to the linked files through either table access privileges, for example, SQL GRANTs, or through file system access privileges. When a DATALINK column is defined using the READ PERMISSION DB option, access to the linked files is only granted to those users with the SQL SELECT privilege on the table containing the DATALINK column. Users who do not have this privilege cannot access the linked files. The advantage of this is increased security over the linked files. The disadvantage is that any legacy applications that use native file system APIs to read files may need to be changed to access a DB2 table before being able to access those files.

If the DATALINK column was defined with the READ PERMISSION FS option, access to the linked files is not controlled by table access privileges, but is instead controlled by the access permissions of the linked files. This means that the linked files can be accessed without accessing the DB2 table containing the DATALINK column. The advantage of this is that legacy applications do not need to be changed to access linked files. The disadvantage is that the linked files may be somewhat less secure.
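The two read-access models can be summarized as a simple decision function. The Python sketch below is a simplification under stated assumptions: the function name and its boolean parameters are hypothetical, and the access token that READ PERMISSION DB also requires is treated as implied by the SELECT privilege, because the token is obtained by reading the DATALINK value.

```python
def can_read_linked_file(read_permission, has_select_privilege, fs_allows_read):
    """Return True if a read is allowed under the given READ PERMISSION option.

    read_permission is "DB" or "FS", as chosen for the DATALINK column.
    """
    if read_permission == "DB":
        # Access follows the SQL SELECT privilege on the table.
        return has_select_privilege
    if read_permission == "FS":
        # Access follows the ordinary file system permissions.
        return fs_allows_read
    raise ValueError("read_permission must be 'DB' or 'FS'")
```

For example, a legacy program with file system read access but no SELECT privilege is allowed under "FS" and refused under "DB", which is the trade-off between compatibility and security described above.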

One other advantage of using Data Links instead of LOBs is that data does not have to be transferred between a file and the database. Therefore, performance is improved. Applications can access the data in the linked files without the overhead of database logging, or the overhead of data transfer between the client application and the server, and yet still have the benefits of recoverability and database security controls.

3.4 Application development tasks

This section discusses the day-to-day tasks that an application developer must deal with when using files managed by Data Links.

3.4.1 Application deployment considerations

Applications that use Data Links need access to both the DB2 database that stores the URLs pointing to the linked files and the file system(s) where the files reside. Because the database server, the DLFM server and the file systems managed by it, and the application code can be hosted on different machines and different operating systems, some thought must be given to how these components interact. The database administrator who installs and configures Data Links on the DLFM server is responsible for ensuring that DLFM and any DB2 database that will use it can communicate (for details, see DB2 Data Links Manager Quick Beginnings, GC09-2966). The application programmer needs to be concerned only with how the program will access the DB2 database and the linked files.

Enabling access to a DB2 database

Any program that accesses a DB2 database must do so through the DB2 Client Application Enabler (CAE) software. Whether your program runs on each end-user workstation, on an application server, or on the DB2 database server, it always connects to the database through the DB2 CAE. There is a version of the CAE available for each of the supported platforms (including Windows, OS/2, and many UNIX platforms such as AIX, Solaris, Linux, and HP-UX). Programmers need the Application Development CAE, which provides tools for building applications that access DB2 databases. End users usually need only the Run-Time CAE. For a complete description of how to install and configure the DB2 client software, see DB2 Data Links Manager Quick Beginnings, GC09-2966, for your client platform.

Enabling access to linked files

An application program must also be able to access the file system on which the linked files reside. If the program is running on a Windows client, and DLFM and the linked files are on a Windows server, access to the linked files can be accomplished by sharing the drive on the server and mapping it on the client. NFS can be used to access the files on a UNIX file server (Data Links server). The FTP and HTTP protocols can be used, independent of the platform, to access a Data Linked file.

3.4.2 Checking whether Data Links has been enabled

Before you can use the DATALINK data type in your applications, the DB2 database must be enabled for Data Links. Typically, the database administrator who installs and configures the DLFM server also registers the databases that will use that DLFM server, and registers the DLFM server with those databases. This registration process is described in detail in DB2 Data Links Manager Quick Beginnings, GC09-2966.

You can obtain a list of all the Data Links File Managers that are registered with the database using the list datalinks managers command:

db2 list datalinks managers for <dbname>

Here, <dbname> is the name of the database.

The command should return output similar to this:

There are 1 DB2 Data Links Managers for database sample
Type = Native Port = 50100 Name = UNUNBIUM.ALMADEN.IBM.COM

You can also check that the DB2 instance has been enabled for Data Links support by verifying that the database manager configuration parameter DATALINKS is set to YES. On AIX, you could use this command:

db2 get database manager configuration | grep -i datalinks

The command should return output similar to this:

Data Links support (DATALINKS) = YES

If the DLFM server is registered with the database and the DATALINKS configuration parameter is set to YES, you are ready to begin using the DATALINK data type in your applications.
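A program can make the same check by scanning the configuration output for the DATALINKS line. The Python sketch below parses text in the format of the sample output shown above; the function name is hypothetical, and the exact layout of the configuration output on other platforms or releases is an assumption.

```python
import re


def datalinks_enabled(dbm_cfg_output):
    """Return True if the DATALINKS parameter is reported as YES."""
    match = re.search(r"\(DATALINKS\)\s*=\s*(\w+)", dbm_cfg_output, re.IGNORECASE)
    return bool(match) and match.group(1).upper() == "YES"


sample = "Data Links support (DATALINKS) = YES"
print(datalinks_enabled(sample))  # True
```

In practice the text passed to the function would be the captured output of the db2 get database manager configuration command shown earlier. A missing DATALINKS line is treated as not enabled.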

3.4.3 Choosing DATALINK options

Before you create a table using the DATALINK data type, you must decide which options to use. Section 2.2.3, “DATALINK options” on page 26, describes in detail each of the valid options. This section discusses some considerations to make before you choose which DATALINK options to use. Table 3-1 shows the DATALINK options that are discussed with a brief description of the option.

Table 3-1 DATALINK options

DATALINK option    Description
-----------------  ------------------------------------------------------------------
LINK CONTROL       Validates file references
INTEGRITY ALL      Prevents deletion of linked file
READ PERMISSION    Allows/disallows file to be read outside of database control
WRITE PERMISSION   Allows/disallows file to be written to outside of database control
RECOVERY           Provides ability to restore unlinked files
ON UNLINK          Deletes or restores file when it is unlinked

How much control should the database have over the linked files?

The answer to this question alone may help you decide which DATALINK options to use. At first glance, it may seem obvious that using Data Links to manage files stored outside of the database would imply a desire to have database controls over those files.

The question is: to what degree do you want to impose those controls?

Data Links gives you the ability to apply the concept of referential integrity to the files being managed. Before a reference to a file can be put into the database, the file must exist. This helps you ensure that there are no invalid file references. But perhaps the application does not care whether the file really exists. Data Links allows you to choose whether you want to use referential integrity for your files through the use of the LINK CONTROL option. If the NO LINK CONTROL option is used, none of the remaining options apply, and indeed, cannot be used.

Do you want to allow files that are referenced by the database to be deleted outside the scope of database control?

The DATALINK option, INTEGRITY ALL, gives you another way to enforce referential integrity by disallowing the deletion of a file that has been linked.

How much control do you need over who reads your files?

Traditional file access controls can be used. Anyone who has permission to read the file does so by using standard file system APIs without being authorized by the database. This is particularly useful if you already have applications in place that access the files that will be managed with Data Links. By using READ PERMISSION FS these existing applications do not need to be modified to use SQL queries to retrieve a file reference before accessing a file. If, however, you want to use database access controls, you can use READ PERMISSION DB. This controls access to the linked files by requiring an access token to read the file. This access token is given to the user when the DATALINK value is read from the table. Any attempt to access a file without the access token is rejected.

Figure 3-1 shows the access token portion and the file name portion of the DATALINK value returned from a SELECT statement. Here is an example of trying to access the file without using the access token:

copy World_Domination_Plan.doc /tmp
copy: World_Domination_Plan.doc: The file access permissions do not allow the specified action

The correct way to access the file is to use the access token with the file name. Note that because the access token is separated from the file name by a semicolon, which UNIX shells treat as a command separator, the access token and file name may need to be enclosed in quotation marks. Here is an example of using the access token in a copy command:

copy "042E2_Ckg9sE__A.hqFsTnJm_;World_Domination_Plan.doc" /tmp
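An application that receives a tokenized name sometimes needs the plain file name as well, for example for display. The Python sketch below separates the two at the semicolon, as described above; the function name is hypothetical, and the sketch assumes the token appears only in the last component of the path and that plain file names do not themselves contain a semicolon.

```python
import posixpath


def split_access_token(path):
    """Split '<dir>/<token>;<name>' into (token, '<dir>/<name>').

    Returns (None, path) unchanged when no token is present.
    """
    directory, last_component = posixpath.split(path)
    token, separator, file_name = last_component.partition(";")
    if not separator:
        return None, path
    return token, posixpath.join(directory, file_name)
```

The tokenized form is the one to pass to file system calls when the column uses READ PERMISSION DB; the plain form is useful only for presentation or logging.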

Access to the file is controlled by granting the SELECT privilege on the table with the DATALINK column to authorized users. Any user who does not have the SELECT privilege on the table does not have access to the linked files. The disadvantage of using READ PERMISSION DB is that existing applications that access files need to be rewritten to access the database.

Figure 3-1 DATALINK access token

Figure 3-1 (diagram content): The My_Projects table holds a DATALINK value that points to /projects/World_Domination_Plan.doc. The statement

   SELECT dlurlpath(datalink_column) INTO :V1
   FROM My_Projects

returns

   V1 = "/projects/04E2_Ckg9sE__A.hqFsTnJm_;World_Domination_Plan.doc"

in which the access token (04E2_Ckg9sE__A.hqFsTnJm_) is separated from the file name (World_Domination_Plan.doc) by a semicolon.

The WRITE PERMISSION option and the RECOVERY option must be considered together. If you want the ability to restore the database to a point in time and have the linked files automatically restored to the same point in time, you must use the RECOVERY YES option. When using RECOVERY YES, a copy is made of each file that is linked, and characteristics of the file such as location, file size, and creator are recorded in the DLFM_DB database by the DLFM.

When a point-in-time recovery is performed on the DB2 database, DB2 requests DLFM to restore the files to the state they were in at that point in time. To provide this service, DLFM needs to have a backup copy of each different version of the file. If a file is changed outside the scope of database control, DLFM has no way of knowing what changes have been made to the file, and therefore cannot support point-in-time recovery. The WRITE PERMISSION BLOCKED option forces the user to make changes to a copy of the original linked file, and then update the DATALINK value to point to the copy. This causes the original file to become unlinked, and the new file to become linked and therefore backed up by DLFM. With this method, DLFM knows about each different version of the file and can participate in point-in-time recovery. This is why the WRITE PERMISSION BLOCKED option is required if you use the RECOVERY YES option.

If you are not interested in coordinated point-in-time recovery of the database and linked files, you can always use WRITE PERMISSION FS. This gives you the freedom to modify linked files without accessing the database and without updating the DATALINK values that point to the files.
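Two of the dependencies between options described in this section (NO LINK CONTROL excludes every other option, and RECOVERY YES requires WRITE PERMISSION BLOCKED) can be captured in a small pre-check. The Python sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and DB2 itself may enforce further rules that are not modeled here.

```python
def check_datalink_options(options):
    """Return a list of problems found in a hypothetical options dictionary.

    Keys used here: "link_control" (bool), "write_permission" ("FS" or
    "BLOCKED"), "recovery" ("YES" or "NO"). Any other key counts as an
    additional option.
    """
    problems = []
    extra = {key: value for key, value in options.items() if key != "link_control"}
    if not options.get("link_control", False):
        if extra:
            problems.append("NO LINK CONTROL cannot be combined with other options")
        return problems
    if options.get("recovery") == "YES" and options.get("write_permission") != "BLOCKED":
        problems.append("RECOVERY YES requires WRITE PERMISSION BLOCKED")
    return problems
```

An empty list means that neither rule is violated. Running such a check before issuing CREATE TABLE is cheap compared with recreating the table later.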

You also must decide what to do with files that become unlinked. When you delete a row containing a DATALINK value that points to a linked file (that is, the DATALINK value is not NULL and not zero-length), or you set the DATALINK value to NULL or zero length, the linked file becomes unlinked. You can choose to have the unlinked file immediately deleted from the file system or directory by using the ON UNLINK DELETE option (note that if you are using the RECOVERY YES option, the backup copy of the file is not deleted). This can be a useful cleanup mechanism: unneeded files do not waste disk space, and you do not have to implement a separate cleanup process.

However, you may decide that it is better to keep the file. If you are using READ PERMISSION DB, the owner of the file is changed to dlfm and its access permissions are set to read-only when the file is linked. The ON UNLINK RESTORE option resets the file owner to the original owner and resets the access permissions to their original state.

3.4.4 Changing DATALINK options

Once a table is created with a DATALINK column and the DATALINK options are chosen, the only way to modify those options is to export the data from the table, drop the table, recreate the table with the new DATALINK options, and load or import the data into the new table. There is no command that changes the DATALINK options, so choose these options with care.

3.4.5 Querying DATALINK options

When a table is created with a DATALINK column, the DB2 system catalog records the DATALINK options that were used. These options are recorded in the SYSIBM.SYSCOLPROPERTIES table in the DL_FEATURES column. These options can be viewed using the DB2 Command Line Processor with the following query:

db2 select colname, tabschema, tabname, dl_features from sysibm.syscolproperties

COLNAME            TABSCHEMA          TABNAME            DL_FEATURES
------------------ ------------------ ------------------ -----------
PICTURE            TARGET             MGR_COPY           UFAFBYR

Figure 3-2 provides the meaning of each of the values stored in the DL_FEATURES column.
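A DL_FEATURES value can also be decoded programmatically. The Python sketch below uses the codes given for Figure 3-2 and assumes that the seven characters appear in the order in which the figure lists them (Linktype first, On Unlink last); the names DL_FEATURE_POSITIONS and decode_dl_features are hypothetical.

```python
# One (option name, {code: meaning}) entry per character position,
# in the order assumed from Figure 3-2.
DL_FEATURE_POSITIONS = [
    ("Linktype", {"U": "URL"}),
    ("Link Control", {"F": "FILE", "N": "No"}),
    ("Integrity", {"A": "ALL", "N": "None"}),
    ("Read Permission", {"F": "FS", "D": "DB"}),
    ("Write Permission", {"F": "FS", "B": "Blocked"}),
    ("Recovery", {"Y": "Yes", "N": "No"}),
    ("On Unlink", {"R": "Restore", "D": "Delete", "N": "Not applicable"}),
]


def decode_dl_features(value):
    """Translate a DL_FEATURES string such as 'UFAFBYR' into option names."""
    value = value.strip()
    if len(value) != len(DL_FEATURE_POSITIONS):
        raise ValueError("expected %d characters" % len(DL_FEATURE_POSITIONS))
    decoded = {}
    for code, (name, meanings) in zip(value, DL_FEATURE_POSITIONS):
        decoded[name] = meanings.get(code, "unknown code " + code)
    return decoded
```

For the value UFAFBYR from the query output above, this yields URL, FILE, ALL, FS, Blocked, Yes, and Restore for the seven options in turn.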

74 Data Links: Managing Files Using DB2

Page 97: Data Links - IBM Redbooks · Data Links: Managing Files Using DB2 December 2001 International Technical Support Organization SG24-6280-00

Figure 3-2 DATALINK options stored in SYSCOLPROPERTIES table

3.5 Coding considerationsOne of the advantages of using Data Links to manage files is that applications built to access those files can use traditional file system APIs to read and write the files. A programmer who is familiar with building applications that access DB2 databases only needs to learn a few new functions that deal with the DATALINK data type. For a complete discussion of how to build applications using DB2, see DB2 Application Development Guide, SC09-2949, and DB2 Application Building Guide, SC09-2948.

This section discusses how to declare a host variable for the DATALINK data type, how to link a file, read a linked file, update a linked file, and unlink a file. It also discusses the scalar functions used with the DATALINK data type, and it reviews some of the common programming errors encountered when using Data Links.

3.5.1 Host variable declaration

The DB2 host languages provide no host variable support for the DATALINK data type. This means that whenever dealing with a DATALINK column, programs must treat it as if it were a VARCHAR data type. Table 3-2 shows how to define a variable to hold a DATALINK value in each of the supported DB2 host languages.

Figure 3-2 content: DL_FEATURES column from the SYSIBM.SYSCOLPROPERTIES table, example value UFAFBYR

Position  Option            Values
1         Linktype          U=URL
2         Link Control      F=FILE, N=No
3         Integrity         A=ALL, N=None
4         Read Permission   F=FS, D=DB
5         Write Permission  F=FS, B=Blocked
6         Recovery          Y=Yes, N=No
7         On Unlink         R=Restore, D=Delete, N=Not applicable

Table 3-2 Host language variable declaration for DATALINK data type

Note that although the length of the host variable used to hold a DATALINK value is a maximum of 254 bytes, the maximum length of the data location portion plus the comment portion is 200 bytes. The additional 54 bytes are reserved for holding the access token that is generated by DLFM when retrieving a DATALINK value from a table defined using the READ PERMISSION DB option.

3.5.2 Creating and linking a new file

Application programs using Data Links create files in the same way that they always have, by using the standard file APIs provided by their programming language of choice. When using the C language, the programmer might use the fopen function to create a file. For DLFM to link the file, the file must reside in a file system or directory that is managed by DLFM.

After you create the file, you build the DATALINK value in the form of a URL that points to the file. The URL consists of four parts:

• The URL scheme
• The hostname on which the file resides
• The path name of the file
• The file name itself

You need to store these values in a string variable. Here is a sample C code fragment to do this:

Host language  DATALINK variable declaration

C              struct tag {short int, char[254]} DATALINK_VAR_NAME
               or
               char DATALINK_VAR_NAME[255]

JAVA           String DATALINK_VAR_NAME

PERL           Host variables are not used. DATALINK values are placed into an array column. See Chapter 22, “Programming in Perl”, in DB2 Application Development Guide, SC09-2949.

COBOL          77 DATALINK_VAR_NAME PIC X(254)

FORTRAN        SQL TYPE IS VARCHAR(254) DATALINK_VAR_NAME

REXX           No host variable declaration is necessary. Host variable data type and size are determined at run time.

char URL[255];
strcpy(URL, "HTTP://myhost.com/projects/World_Domination_Plan.doc");
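When the four parts of the URL come from separate variables, the string can be assembled with a length check instead of a fixed strcpy. The following is a minimal sketch, assuming that no comment is stored with the DATALINK value, so that the 200-byte limit described in 3.5.1 applies to the URL alone. The function name build_datalink_url and the constant DL_MAX_LOCATION are illustrative and are not part of DB2.

```c
#include <stdio.h>
#include <string.h>

/* Limit on the data location portion, taken from section 3.5.1.
   Assumption: no comment portion is used. */
#define DL_MAX_LOCATION 200

/* Assembles scheme://host<dir>/<file> into out.
   dir must start with '/' and must not end with '/'.
   Returns 0 on success, or -1 if an argument is missing, the result
   does not fit in out, or the result is longer than 200 bytes. */
int build_datalink_url(char *out, size_t outsize, const char *scheme,
                       const char *host, const char *dir, const char *file)
{
    if (!out || !scheme || !host || !dir || !file)
        return -1;
    int n = snprintf(out, outsize, "%s://%s%s/%s", scheme, host, dir, file);
    if (n < 0 || (size_t)n >= outsize || n > DL_MAX_LOCATION)
        return -1;
    return 0;
}
```

For example, build_datalink_url(URL, sizeof URL, "HTTP", "myhost.com", "/projects", "World_Domination_Plan.doc") fills URL with the same string as the fragment above.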

The last step to link the file is to create a row in a table with a DATALINK column using the SQL INSERT statement. The string variable containing the URL must be cast into the DATALINK data type by using the DLVALUE built-in scalar function. Here is what an SQL INSERT statement might look like:

INSERT INTO MY.DOCS VALUES (:doc_id, DLVALUE(:URL))

If the INSERT is successful, the file is now linked. Depending on the options used to define the DATALINK column, the file might be backed up by DLFM (when using RECOVERY YES), and its access permissions and ownership may be changed. Although an application program can always depend on an sqlcode of “0” as an indication that the SQL INSERT statement was successful and that the file was linked, you can also retrieve the DATALINK value from the table with an SQL SELECT statement. If the DATALINK value is returned, DLFM has validated that the URL does indeed point to a valid, linked file (assuming that the DATALINK column was defined using the FILE LINK CONTROL option).

3.5.3 Reading a linked file

When you retrieve a DATALINK value from a table, the entire URL is returned, along with some leading characters, which indicate that what follows is a URL:

URL HTTP://myhost.com/projects/World_Domination_Plan.doc

To retrieve the URL without the leading characters, you can use the DLURLCOMPLETE scalar function. Here is a sample C code fragment to do this:

char my_URL[255];

EXEC SQL SELECT DLURLCOMPLETE(DATALINK_COLUMN_NAME) INTO :my_URL
    FROM MY.TABLE WHERE FILE_ID = '123';

The SELECT would return:

HTTP://myhost.com/projects/World_Domination_Plan.doc

Web-enabled applications may be able to use this URL “as is”, directing a browser to open the file using HTTP protocol. Many times, however, the application program needs to use standard file system APIs to open and read the file. In this case, you typically need only the path name and file name to access the file. You can use another scalar function, DLURLPATH, to retrieve the path name and file name. Here is a C code example of retrieving the path name and file name and opening the file for read access:

char thefile[255];
FILE *fp;

EXEC SQL SELECT DLURLPATH(DATALINK_COLUMN_NAME) INTO :thefile
    FROM MY.TABLE WHERE FILE_ID = '123';

fp = fopen(thefile, "r");

Reading files linked with READ PERMISSION DB

When you retrieve the URL or path name to a file that was linked using READ PERMISSION DB, an access token is returned along with the URL or path name. This access token must be used as part of the file name when reading the file, for example:

char my_URL[255];

EXEC SQL SELECT DLURLCOMPLETE(DATALINK_COLUMN_NAME) INTO :my_URL
    FROM MY.TABLE WHERE FILE_ID = '123';

The select would return:

HTTP://myhost.com/projects/04E2_CkKGxE;World_Domination_Plan.doc

The access token “04E2_CkKGxE” is the string of characters immediately preceding the file name and is delimited from the file name by a semicolon. When reading the file, the access token, semicolon, and the file name must all be used. Some operating systems have problems with the semicolon embedded as part of the file name, unless the entire file name is quoted:

ls -al "04E2_CkKGxE;World_Domination_Plan.doc"

When an application program retrieves a DATALINK value containing an access token, the access token is valid for a limited amount of time. The database configuration parameter DL_EXPINT determines the length of time for which the access token is valid. Any attempt to read a file with an expired access token is rejected, and it is necessary to access the table again to obtain a valid access token.
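An application sometimes needs the plain file name, for example to display it or to log it, while still opening the file under the full name that includes the token. The following C fragment is a minimal sketch that separates the two, assuming the layout described above: the token is the part of the last path component that precedes the semicolon. The function name split_access_token is illustrative and is not part of DB2.

```c
#include <stddef.h>
#include <string.h>

/* Splits the last component of a URL or path name of the form
   ".../token;filename" into its token and file name.
   Returns 0 on success, or -1 if the last component contains no
   semicolon (no token present) or a buffer is too small. */
int split_access_token(const char *path, char *token, size_t tokensize,
                       char *name, size_t namesize)
{
    if (!path || !token || !name)
        return -1;

    /* Only the last path component can carry the token. */
    const char *last = strrchr(path, '/');
    last = last ? last + 1 : path;

    const char *semi = strchr(last, ';');
    if (!semi)
        return -1;

    size_t tlen = (size_t)(semi - last);
    size_t nlen = strlen(semi + 1);
    if (tlen >= tokensize || nlen >= namesize)
        return -1;

    memcpy(token, last, tlen);
    token[tlen] = '\0';
    memcpy(name, semi + 1, nlen + 1);  /* includes the terminating '\0' */
    return 0;
}
```

The file itself must still be opened with the complete name returned by DB2, that is, the token, the semicolon, and the file name together.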

Key points to remember:

• The SQL SELECT statement is used to retrieve the complete URL or path name to the file.

• Once the application knows the location of the file, it can read it like any other file.

• Files linked with READ PERMISSION DB require a valid access token to read them.

3.5.4 Updating a linked file

If you are using the DATALINK option WRITE PERMISSION FS, you can update a linked file without having to do anything special with the DATALINK value. Because DLFM does not support recovery of linked files when using WRITE PERMISSION FS (see 3.4.3, “Choosing DATALINK options” on page 71, for an explanation why), you are free to change the file as you like. Anyone who has write permission on the file can rename, delete, or change the file outside the scope of database control.

To update a linked file when using WRITE PERMISSION BLOCKED, you must first unlink the file by setting the DATALINK value to NULL (if the column is nullable) or a zero-length URL, or by deleting the row that points to the file. When the file is unlinked, DLFM no longer controls access to the file, and you are free to modify it. After you change the file, you can relink it by updating the NULL DATALINK value or inserting a row with a DATALINK value pointing to the file, and it will once again be controlled by DLFM. One advantage to using this method is that the name of the file being linked does not need to change.

An alternative to unlinking and relinking the file is to make a copy of the linked file, modify the copy, and update the DATALINK value to point to the new file. This method has the advantage of using fewer SQL statements to do the job, but the disadvantage is that you require more disk space, because, at least temporarily, you have both the original file and a copy taking up space on the disk. Another disadvantage of this method is that the name of the linked file may need to change, and the application program would need to contain logic to generate a new file name.
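The logic to generate a new file name can be very small. The following C fragment is a minimal sketch of one possible naming scheme, in which a version number is inserted before the file extension. The scheme and the function name versioned_name are assumptions made for illustration; Data Links does not prescribe any naming convention.

```c
#include <stdio.h>
#include <string.h>

/* Builds a new file name from name (a file name without directory) by
   inserting ".v<version>" before the extension, or appending it when the
   name has no extension.
   Returns 0 on success, or -1 on a missing argument or a buffer that is
   too small. */
int versioned_name(char *out, size_t outsize, const char *name,
                   unsigned version)
{
    if (!out || !name)
        return -1;

    const char *dot = strrchr(name, '.');
    int n;
    if (dot && dot != name)
        n = snprintf(out, outsize, "%.*s.v%u%s",
                     (int)(dot - name), name, version, dot);
    else
        n = snprintf(out, outsize, "%s.v%u", name, version);

    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

With this scheme, the second version of World_Domination_Plan.doc would be named World_Domination_Plan.v2.doc, and the application would update the DATALINK value to point to that file.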

3.5.5 Unlinking a file

A file can be unlinked by either deleting the row whose DATALINK column points to the file, or by updating the row and setting the DATALINK value to NULL or to point to another file.

When the file is unlinked, the action taken depends on the ON UNLINK option used to define the DATALINK column. The ON UNLINK DELETE option causes the file to be physically removed from the file system. This option must be used with care. Make sure you really want the file to be deleted. The ON UNLINK RESTORE option causes the file to be restored to the original ownership and access permissions that were in place when the file was linked. Remember that when using READ PERMISSION DB, DLFM changes the owner of the file to dlfm and the access permissions to read-only.

3.5.6 Scalar functions used with the DATALINK data type

Because there is currently no host language support for the DATALINK data type, application programs must use the built-in scalar functions to build a DATALINK value and to extract the various components of a DATALINK value. These functions are described in 2.2.2, “Scalar functions for DATALINK data type” on page 24.

Scalar functions with DB2 Call Level Interface

DB2 Call Level Interface (CLI) provides two scalar functions for the creation and retrieval of DATALINK values:

• SQLBuildDataLink: This function is used to create a DATALINK value and is the CLI equivalent of the DLVALUE function described in 2.2.2, “Scalar functions for DATALINK data type” on page 24.

• SQLGetDataLinkAttr: This function is used to extract the various components which make up a DATALINK value, and is the functional equivalent of the remaining DATALINK scalar functions described in Chapter 2, “Technical architecture” on page 11.

For a complete description of the syntax, function arguments and usage of these functions, refer to DB2 UDB Call Level Interface Guide and Reference, SC09-2950. For a complete sample program using CLI that connects to a database, creates a table with a DATALINK column, inserts a row into the table, and then fetches the row, see Appendix B, “CLI Example,” in DB2 Data Links Manager Quick Beginnings, GC09-2966.

Scalar functions with JDBC

The Sun JDBC 3.0 specification provides a DATALINK type code, which is defined in the java.sql.Types class, as well as methods for storing and retrieving references to externally stored data (for example, URLs pointing to linked files). At the time this redbook was written, the implementation of these methods (setURL and getURL) was not yet finalized. Application programs using JDBC should treat a DATALINK value as a String and use the getString and setString methods. Note that Java applications can still use embedded SQL, which uses the scalar functions described in 2.2.2, “Scalar functions for DATALINK data type” on page 24.

3.5.7 Error handling

When application programmers start to use a new data type, they usually find a whole new set of error conditions to deal with. The DATALINK data type is no exception to this rule. It is important to understand what can go wrong so that your application program can handle the condition gracefully. This section briefly discusses the two most common errors that programmers are likely to encounter and suggests ways of dealing with the errors.

SQL0358N - Unable to access file...

This is probably the most common error that you will encounter as you are developing a program to use Data Links. This error message is accompanied by a reason code (use the db2 ? SQL0358N command from the DB2 command line processor for a complete list of reason codes). The most common causes of this error are:

• Attempt to link a file that does not exist
• Format of the URL is bad (for example, it does not begin with HTTP)
• File is already linked
• A linked file has been deleted

The cause of this error can usually be determined by looking closely at the DATALINK value and verifying that all of its components are valid:

• Is the protocol specified correct (HTTP:// or UNC:\)?

• Is the server name specified a valid Data Links server that is registered with the database?

• Is the path name to the file correct?

• Is the file name correct?

Tip: DB2 returns error codes to applications through the sqlcode portion of the SQL communications area (SQLCA). The sqlcode is of the form -nnnn or +nnnn, where nnnn is a 4-digit number. A text description of the error can be obtained by using the DB2 command line processor “?” command. When using the “?” command, the four-digit error number should be prefixed with the characters SQL and followed by the character N (for error messages; the sqlcode is negative), W (for warnings; sqlcode is positive), or I (for informational messages, sqlcode is positive). For example, if the application program receives an error code of -0358, enter:

db2 ? SQL0358N

For an error code of +0100, enter:

db2 ? SQL0100W
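An application can build the message identifier for the “?” command directly from the sqlcode, for example to include it in its own log output. The following C fragment is a minimal sketch of the rule described in the tip. It assumes that every positive sqlcode is reported with the W suffix; because informational messages (suffix I) also have a positive sqlcode, the result is only a first guess for positive values. The function name sqlcode_to_msgid is illustrative and is not part of DB2.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats sqlcode as a message identifier such as "SQL0358N".
   Negative values get the suffix N (error); positive values get the
   suffix W (warning), which is an assumption, since informational
   messages are positive too.
   Returns 0 on success, or -1 if sqlcode is 0 (no message) or the
   buffer is too small. out needs at least 9 bytes for 4-digit codes. */
int sqlcode_to_msgid(long sqlcode, char *out, size_t outsize)
{
    if (!out || sqlcode == 0)
        return -1;
    char suffix = (sqlcode < 0) ? 'N' : 'W';
    int n = snprintf(out, outsize, "SQL%04ld%c", labs(sqlcode), suffix);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

The resulting string can be passed to the command line processor as shown above, for example db2 ? SQL0358N for an sqlcode of -358.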

SQL0357N - Data Links Manager Not Available

This error occurs when attempting to access a DATALINK value in a table that has been defined with the INTEGRITY ALL option, and the Data Links server is down. When using INTEGRITY ALL, DB2 contacts DLFM to check the validity of the URL before returning the URL to the application program. If DB2 is unable to contact the File Manager for any reason, an sqlcode of -357 is returned.

The obvious way to correct this error is to start the File Manager. If this is the first time that a Data Links server is being used, check that it is correctly registered with the database using the command:

db2 list datalinks managers for database database_name

3.6 Using multiple file servers

A single DB2 database can work with multiple DLFM file servers. Perhaps you want the database to control files stored on Windows NT and on AIX (see Figure 3-3). You can accomplish this by registering both file servers with the DB2 database, and by registering the DB2 database with each of the file servers. Note that the Data Links file servers can reside on any combination of the supported platforms. There are, however, a few restrictions.

First, if you register a Data Links file server that resides in a DFS cell, it can be the only file server registered with the database.

Next, the maximum number of file servers you can register with a database is 16.

Each of the DLFMs registered with a database must be on the same version and release level as the database. That means that when you apply fixpaks to the database or upgrade to a new version of DB2, you must also update all of the DLFM servers.

Figure 3-3 Using multiple DLFM file servers

3.6.1 Supporting multiple links to the same file

If you attempt to insert a row with a DATALINK value pointing to a file that is already linked, you receive an error (SQL0358N - reason code 25). DLFM allows you to link a file once. All subsequent attempts to link the file will fail, unless the file is first unlinked.

What if the DB2 database has two DLFMs registered? Couldn’t you link the file to both of them?

The answer is no, you cannot. The reason for this is that any given file system can only be managed by one DLFM. The file system must reside on the same server on which DLFM is running, and there can be only one instance of DLFM running on any given server. Because of this, you would be unable to register a file system with more than one DLFM, and therefore unable to link a file to more than one file manager.

[Figure 3-3: tables in a DB2 UDB database hold DATALINK values that point to files on two file servers, one on Windows NT and one on AIX.]

3.7 Migrating existing applications to use Data Links

Existing application programs that access files or LOBs can easily be migrated to use files that are managed by Data Links. Perhaps the biggest hurdle will be for programmers working with files who have never worked with DB2 or another relational database management system. Programmers unfamiliar with SQL should begin by reading DB2 UDB SQL Getting Started, SC09-2973. Those who are familiar with SQL but have never built applications using DB2 should read DB2 Application Development Guide, SC09-2949, and DB2 Application Building Guide, SC09-2948. These and other DB2 related publications are available online in HTML or PDF format, free of charge, at: http://www.ibm.com/software/data/db2/library

3.7.1 Migrating an application that uses files

An application program that uses files usually has code that deals with the creation, modification, and deletion of those files. The program might prompt the user for information about which file name to open or create, or it might store or retrieve information about the file name and its attributes in a database. Using Data Links to manage files requires some initial preparation of the file server, perhaps some software installation and configuration on the client workstations, as well as changes in the application programs that access the files.

Changes on the file server

The first thing that must be done to implement Data Links control of the files is to install the Data Links software on the file server. DB2 Data Links Manager Quick Beginnings, GC09-2966, describes how to install and configure Data Links.

Next you must prepare the file system or drive where your files reside to be managed by Data Links. If the files are on a UNIX platform, you must convert the file system to a DLFS file system. If the files are on a Windows NT platform, they must be put on to an NTFS formatted drive that was specified as a Data Links managed drive during installation of Data Links on the Windows NT server. DB2 Data Links Manager Quick Beginnings, GC09-2966, provides instructions for doing this.

Changes on the client workstation

Any client workstation running applications that access DB2 databases must have the DB2 Client Application Enabler (CAE) installed. The CAE resolves the location and communication protocol(s) of a DB2 database when an application program issues a CONNECT command to connect to a database. Databases must be “cataloged” on the client to be accessible to the application. The Quick Beginnings manuals for UNIX (GC09-2970), Windows (GC09-2971), OS/2 (GC09-2968), and Linux (GC09-2972) each contain a section describing how to install and configure the DB2 CAE software. The CAE software for all supported clients is free and can be downloaded from: http://www.software.ibm.com/data/db2/udb

Changes to the application program

The biggest change to any application program migrating from file access to using Data Links is that the program must interact with DB2. This means connecting to a DB2 database and using SQL to store and retrieve data from tables in the database. Any program that accesses files managed by Data Links must store and retrieve file references (URLs) in a DB2 table with a DATALINK column.

For a file to be linked (managed by Data Links), the program must perform an SQL INSERT operation with a DATALINK value that points to the file. When an application program wants to open a linked file, it does so by performing an SQL SELECT statement on a table with a DATALINK column and retrieving the path name and file name. Once the path name and file name are known, the program can use the standard file system APIs to read it. Depending on the DATALINK options chosen, the file may need to be unlinked before it can be changed. Once a file is unlinked, the standard file system APIs can be used to modify the file.

Section 3.5, “Coding considerations” on page 75, describes how files are linked and unlinked, how to read linked files, how to modify linked files, and how to use the DB2 built-in scalar functions to create and manipulate DATALINK values.

3.7.2 Migrating an application that uses LOBs

Applications that use LOBs have many of the same considerations when migrating to use Data Links as do applications that use files. The Data Links software needs to be installed, and a file system or shared drive must be configured to hold the files. The DB2 database containing the LOB data must be registered with DLFM, and the Data Links File Server that will manage the files must be registered with the DB2 database. If an application using LOBs is running on client workstations, the DB2 Client Application Enabler must already be installed and configured on those workstations.

Three additional things must occur for a program to migrate from using LOBs to use files managed by Data Links. First, you must externalize the LOB data, that is, put the data which is stored in an LOB column into files. Second, you must create a new table with a DATALINK column instead of an LOB column. You can then populate the new table with data from the existing table and establish links to the files. Third, the programs must be changed to use the new DATALINK column.

Externalizing LOB data using the Export utility

There are a number of ways to put data contained in an LOB column into files. Perhaps the easiest way is to use the DB2 Export utility. The Export utility writes LOB data to a user-defined path, and names the files with a user-defined base file name followed by a sequence number. Here is an example of using the Export utility to put LOB column data into files:

db2 "export to my_table_data.del of del lobpath /datalinks/photos lobfile mylob1,mylob2 modified by lobsinfile select * from my.photo_table"

In this example, all of the non-LOB data from the table my.photo_table is written to the delimited ASCII file named my_table_data.del, while the data from the LOB column is written to files named mylob1.001, mylob1.002, etc. up to mylob1.999. If there are more than 999 LOBs, the Export utility uses the next base file name specified, mylob2, and generates file names of mylob2.001, mylob2.002, etc. The files containing the externalized LOB data are written to the directory /datalinks/photos. You must be sure to specify enough base file names in the EXPORT command to handle the number of non-NULL LOB values.
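The naming rule makes it possible to work out in advance how many base file names an export needs and which file a given LOB is expected to land in. The following C fragment is a minimal sketch of that arithmetic, assuming exactly the scheme described above (999 files per base name, a three-digit sequence number starting at 001). The names LOBS_PER_BASE, base_names_needed, and export_lob_file_name are illustrative; the files actually written are determined by the Export utility.

```c
#include <stdio.h>

/* Files generated per base file name, as described in the text. */
#define LOBS_PER_BASE 999

/* Number of base file names needed for lob_count non-NULL LOB values. */
unsigned long base_names_needed(unsigned long lob_count)
{
    return (lob_count + LOBS_PER_BASE - 1) / LOBS_PER_BASE;
}

/* Writes into out the file name expected for the n-th exported LOB
   (n starts at 1), given the base names in the order they appear in the
   lobfile clause.
   Returns 0 on success, or -1 if there are not enough base names or the
   buffer is too small. */
int export_lob_file_name(char *out, size_t outsize,
                         const char *const bases[], size_t nbases,
                         unsigned long n)
{
    if (!out || !bases || n == 0)
        return -1;

    size_t idx = (size_t)((n - 1) / LOBS_PER_BASE);    /* which base name */
    if (idx >= nbases || bases[idx] == NULL)
        return -1;

    unsigned long seq = (n - 1) % LOBS_PER_BASE + 1;   /* 1 to 999 */
    int len = snprintf(out, outsize, "%s.%03lu", bases[idx], seq);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```

With the base names mylob1 and mylob2, the 1000th LOB maps to mylob2.001, and a request for the 1999th LOB fails because a third base file name would be required.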

Linking the files

After exporting the data from a table with an LOB column and placing the LOB data in files, you are ready to build a new table with a DATALINK column in place of the LOB column. The easiest way to build the new table is to use the db2look utility to extract the table definition from the DB2 catalog, save it in a file, and modify the file by replacing the LOB column definition with a DATALINK column definition. Here is an example of using the db2look utility:

db2look -d SAMPLE -u MY -t PHOTO_TABLE -e -o table.def

This example connects to the SAMPLE database and extracts the DDL for the table MY.PHOTO_TABLE and places it in a file named table.def (see DB2 UDB Command Reference, SC09-2951, for a complete description of the db2look utility). The table.def file may appear as shown in Example 3-1.

Example 3-1 The table.def file

CREATE TABLE "MY "."PHOTO_TABLE" (
    "LOBNO" CHAR(6) NOT NULL ,
    "PHOTO_FORMAT" VARCHAR(10) NOT NULL ,
    "PICTURE" BLOB(102400) LOGGED NOT COMPACT)
    IN "USERSPACE1";

You need to change the definition of the column named PICTURE from BLOB to DATALINK. The modified table.def file may appear as shown in Example 3-2.

Example 3-2 Modified table.def file

CREATE TABLE "MY "."PHOTO_TABLE" (
    "LOBNO" CHAR(6) NOT NULL ,
    "PHOTO_FORMAT" VARCHAR(10) NOT NULL ,
    "PICTURE" DATALINK LINKTYPE URL FILE LINK CONTROL INTEGRITY ALL
        READ PERMISSION FS WRITE PERMISSION BLOCKED RECOVERY YES
        ON UNLINK RESTORE)
    IN "USERSPACE1" ;

Next you drop the old MY.PHOTO_TABLE and create a new table with the modified DDL. You are almost ready to populate the new table. But first you need to change the delimited ASCII file created by the Export utility. You need to have a full URL pointing to your files rather than just the file name. You need to change the file from this:

"000130","bitmap","mylob1.001"
"000130","gif","mylob1.002"
"000130","xwd","mylob1.003"
"000140","bitmap","mylob1.004"

to this:

"000130","bitmap","HTTP://MY.DLFM.SERVER.COM/datalinks/photos/mylob1.001"
"000130","gif","HTTP://MY.DLFM.SERVER.COM/datalinks/photos/mylob1.002"
"000130","xwd","HTTP://MY.DLFM.SERVER.COM/datalinks/photos/mylob1.003"
"000140","bitmap","HTTP://MY.DLFM.SERVER.COM/datalinks/photos/mylob1.004"

Remember that the value you supply for insertion into the DATALINK column must be in the form of a URL that points to the file.

You are now ready to populate the new, improved table with the data you exported from the original table by using either the DB2 Import utility or the Load utility:

db2 import from my_table_data.del of del insert into my.photo_table

or

db2 load from my_table_data.del of del insert into my.photo_table

For a complete discussion of using the import and load utilities, see 8.8, “Running the Import utility” on page 157, and 8.9, “Running the Load utility” on page 158.

If the volume of data in the LOB table is too large, you may need to run the Export utility multiple times, each time exporting a subset of the table. For example, if the table contains a million LOBs, Export would require more than 1000 base file names. An EXPORT command with 1000 base file names may be too large for DB2 to process. Instead, you may split the job into multiple EXPORT commands and supply a WHERE clause on the SELECT statements to limit the number of files created:

db2 "export to my_table_data1.del of del lobpath /datalinks/photos lobfile mylob1,mylob2 modified by lobsinfile select * from my.photo_table where LOBNO < '500000'"

and

db2 "export to my_table_data2.del of del lobpath /datalinks/photos lobfile mylob3,mylob4 modified by lobsinfile select * from my.photo_table where LOBNO >= '500000'"

Alternative to using EXPORT

If you want more control over the file names that are created than the Export utility gives you, you need to write code to extract the data stored in an LOB column. You can do this by using a file reference variable. The file reference variable represents a file, but does not contain the file data. A file reference variable can be used in an SQL SELECT statement to read data from an LOB column and place it in a file. Before invoking the SELECT statement, you need to set the attributes of the file reference variable. These include:

• The file name
• The file options (read, create, overwrite, etc.)

You create a file by placing a file name into the file reference variable, setting the file options to SQL_FILE_CREATE, and then executing a SELECT statement with the file reference variable. You need to do this for each row with a non-NULL LOB column. You can find examples of how to use file reference variables in DB2 UDB SQL Reference, SC09-2974.

There are two problems you must address. The first is how to name the files you are creating. The second problem is how to associate a row in the table with the LOB column to the external file.

Figure 3-4 shows an example of LOB data from the MANAGERS table being written to external files. In this example, the file names contain the key values from the MANAGERS table, ID. If you name your files with data from the primary key of the original table, you have a way to associate a row in the table to the external file. Why this is important will soon become apparent.

Figure 3-4 Externalizing LOB data

Creating a table with DATALINK column

Next, you create a table with attributes similar to those of your table containing LOBs, except that you use a DATALINK column instead of an LOB column. For example, you might create a table named MANAGER_PHOTOS with the ID, FNAME, and LNAME columns, and a DATALINK column named DL_PHOTO.

The last step is to populate the new table by reading rows from the LOB table and inserting the data into the new table and supplying a DATALINK value that points to the appropriate file. This is illustrated in Figure 3-5.

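Figure 3-4 content: each LOB_PHOTO value in the MANAGERS table is written to an external file named after the ID of its row.

MANAGERS Table                        External file

ID   FNAME  LNAME  LOB_PHOTO
B01  Bob    Smith  ......             file.B01
C72  Annie  Aston  ......             file.C72
RT3  Jim    James  ......             file.RT3
5G4  Sue    Sims   ......             file.5G4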

Figure 3-5 Moving LOB table data to DATALINK table

After you migrate your LOB data to files, create a new table structure using a DATALINK column, and link the files, you need to change all of the programs that accessed the LOB columns to now use the external files. Note that if the application programs run on client workstations, the file system or directory containing the linked files must be accessible to the client. See 3.4.1, “Application deployment considerations” on page 69, for a discussion of how to make the linked files accessible to clients.

Figure 3-5 content:

1. SELECT from the MANAGERS table

ID   FNAME  LNAME  LOB_PHOTO
B01  Bob    Smith  ......
C72  Annie  Aston  ......
RT3  Jim    James  ......
5G4  Sue    Sims   ......

2. INSERT into the MANAGER_PHOTOS table

ID   FNAME  LNAME  DL_PHOTO
B01  Bob    Smith  file.B01
C72  Annie  Aston  file.C72
RT3  Jim    James  file.RT3
5G4  Sue    Sims   file.5G4

Chapter 4. Planning Data Links deployment

This chapter discusses a number of different deployment options and the reasons why you may use them. It describes the most important file systems for the Data Links setup and the issues to consider when locating and sizing them.

Before using DB2 Data Links Manager, you need to consider these items:

• Data Links Manager can be installed on systems with:

  – DB2 UDB EE (AIX)
  – DB2 PE, WE, EE (Windows NT)
  – DB2 EE V7 (Solaris)

• Data Links Manager cannot be used with DB2 Enterprise Extended Edition.

• DATALINK columns cannot be part of a unique index, primary key, or foreign key.

• A DB2 UDB server with a table containing the DATALINK data type can connect to a DB2 Data Links Manager on Windows NT, AIX, or Solaris.


Important: The host chosen to be the Data Links File Manager must have a hostname that will not be changed in the future, because the hostname is stored throughout the DLFM system. The hostname must not contain the underscore character (“_”), which causes problems for DLFM.
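A candidate hostname can be screened before installation. The following is a small sketch that checks only the underscore rule stated above; the function name and the sample hostnames are hypothetical, and the check says nothing about whether the name will remain stable over time.

```python
def check_dlfm_hostname(hostname):
    """Return a list of problems; an empty list means no objection was found."""
    problems = []
    if not hostname:
        problems.append("hostname is empty")
    if "_" in hostname:
        problems.append("hostname contains an underscore")
    return problems


# Hypothetical candidate names for a Data Links File Manager host.
print(check_dlfm_hostname("dlfm_prod.example.com"))
print(check_dlfm_hostname("dlfm-prod.example.com"))
```

On the host itself, the standard library call socket.gethostname() can supply the value to check.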


4.1 Deployment options

There are a number of different ways that the Data Links File Manager can be deployed. The single server implementation is the easiest to start with. The most complex is multiple DB2 servers connecting to multiple Data Links File Managers. There are many other ways to deploy and use Data Links; some of the most common are discussed in the following sections.

4.1.1 Single server implementation

The single server implementation is the easiest to install and maintain (Figure 4-1). It consists of DB2 Universal Database and Data Links Manager installed on a single host machine. The single server implementation is commonly used on test and development systems and when only a single host machine is available.

Figure 4-1 Single server implementation

4.1.2 Single Universal Database and multiple DLFMs

A single Universal Database with one or more DLFMs on separate hosts appears to be the most common implementation for production systems (Figure 4-2). This option can provide better performance by locating the Data Linked files geographically closer to the client. Data Linked files are usually large, so they benefit from the closer network location: data moves over a local area network rather than a wide area network. The UDB database can be on Windows NT, Solaris, or AIX, and the DLFM can also be on Windows NT, Solaris, or AIX. This implementation can also provide increased availability.

[Figure 4-1 labels: Data Links File Manager (DLFM), DB2 Server, DB2 Client]


Figure 4-2 Single UDB and one to many DLFMs

4.1.3 Multiple Universal Databases and single DLFM

The multiple Universal Databases and single DLFM option may be desirable when a development database and a test database are on the same host and no other hardware is available on which to install another DLFM. This implementation is not recommended, mainly because it greatly complicates recovery. Figure 4-3 shows the layout.

[Figure 4-2 labels: DB2 UDB server in Seattle; table with DATALINK column; Data Links File Manager (DLFM) with file data in San Jose; Data Links File Manager (DLFM) with file data in Seattle; client application in San Jose; client application in Seattle; shared directory or NFS mount (shown twice)]


Figure 4-3 Multiple UDBs and a single DLFM

4.1.4 Multiple DLFMs on a single host

The Data Links File System Filter (DLFF) has been designed so that only one Data Links File Manager is allowed per host. Multiple DLFMs on a single host are not supported today. For most applications, this should not be a problem because there are so many other deployment solutions.

4.1.5 Multiple DB2s and multiple DLFMs

The most complex way to implement Data Links is by creating an environment where multiple DB2 databases connect to multiple DLFM hosts. Figure 4-4 illustrates this concept.

[Figure 4-3 labels: DB2 Client; shared directory or NFS mount; DB2 UDB host-2 with a table with DATALINK column; DB2 UDB host-3 with a table with DATALINK column; Data Links File Manager host-1 with file data]


Figure 4-4 Multiple DB2 and multiple DLFMs

4.2 File systems and sizing

This section discusses at least two very important planning items that pertain to file systems and sizing: the DLFM backup directory, and the file systems and directories where the Data Linked files will reside.

[Figure 4-4 labels: DB2 UDB host3, DB2 UDB host-4, and DB2 UDB host-5, each with a table with DATALINK column; Data Links File Manager host-1 and Data Links File Manager host-2, each with file data]


4.2.1 The DLFM backup (archive directory)

For a description of DLFM_BACKUP_DIR_NAME, refer to Chapter 4, “Choosing a backup method”, in DB2 Data Links Manager Quick Beginnings, GC09-2966.

Table 4-1 Parameters that can affect the size of the archive directory

| DB CFG parameter | Description |
|---|---|
| RECOVERY YES | When specified for a DATALINK column type, allows DB2 to support point-in-time recovery of Data Linked files. A copy of the file is placed on the archive server for recovery. |
| DL_NUM_COPIES | The number of additional copies to be made on the archive server when a file is linked (0 to 15). |
| NUM_DB_BACKUPS | Number of most recent database backups to retain. This triggers garbage collection, which can delete old files from the archive server. |
| REC_HIS_RETENTN | Number of days historical information on backups is retained. Can also influence when garbage collection is triggered. |

Note: When the default of disk copy is selected, the sizing of this directory can be important. If an initial load or import of data is being done on a table with a DATALINK column that is defined with RECOVERY YES, the backup file system or directory must have at least the same amount of space as all of the files to be linked. The space is required because a copy of each file that is inserted is placed in the DLFM_BACKUP_DIR_NAME directory.

The DB2 database configuration parameter DL_NUM_COPIES can also affect the size of the backup file system. The default of 0 has no effect, but if a number from 1 to 15 is chosen, the size required for the DLFM backup directory increases proportionately.

Another database configuration parameter that can affect the size of the DLFM backup directory is NUM_DB_BACKUPS. This parameter specifies the number of database backups to retain for a database. When the specified number is reached, any corresponding file backups linked through a DB2 Data Links Manager are removed from the archive server or backup directory. For more information, refer to Chapter 11, “Recovery” on page 201.
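The sizing guidance in this section can be turned into a rough first estimate for an initial load. The sketch below assumes disk copy as the backup method, one archived copy per linked file, and DL_NUM_COPIES additional copies of the same size stored in the same place; that reading of "increase proportionately" is an assumption, and the function name is hypothetical. It ignores DLFM_DB backup images and any later file updates, so treat the result as a lower bound.

```python
def estimate_backup_dir_bytes(file_sizes, dl_num_copies=0):
    """Estimate space needed in the DLFM backup directory for an initial load.

    file_sizes    -- sizes in bytes of the files to be linked with RECOVERY YES
    dl_num_copies -- value of the DL_NUM_COPIES parameter (0 to 15)
    """
    if not 0 <= dl_num_copies <= 15:
        raise ValueError("DL_NUM_COPIES must be between 0 and 15")
    total = sum(file_sizes)
    # One copy per linked file, plus the configured additional copies.
    return total * (1 + dl_num_copies)


# Three hypothetical files of 100, 200, and 300 bytes.
print(estimate_backup_dir_bytes([100, 200, 300]))
print(estimate_backup_dir_bytes([100, 200, 300], dl_num_copies=2))
```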


The database DLFM_DB, Data Links file system files, DLFM backup directory and the dlfm home directory should be placed on different file systems that do not share disks. The backup directory can also be the Tivoli Storage Manager archive server or XBSA (Net.Backup). For additional details on how to configure the archive server on NT, Solaris, and AIX, refer to Release Notes Version 7.2/Version 7.1 FixPack 3.

The DLFM backup directory can contain:

• Images of the DLFM_DB database

• Copies of linked files

• All updates when RECOVERY YES is specified for the DATALINK column and disk copy is the backup method

4.2.2 Data Links controlled file systems

You must carefully consider the number of files that will be placed in each directory.

Tip: We recommend that, on AIX, no more than 2000 to 3000 files be placed in a single directory. This helps the performance of the file system, especially when inserting files.

Application developers need to take this into consideration when designing the data manipulation processes for the Data Linked files. This can also be a consideration when initially populating Data Linked columns.

4.2.3 Using NFS and NIS

The Network File System (NFS) is commonly used for sharing files between hosts. The file system that contains the linked files is usually exported from the DLFM server and mounted by the clients using NFS. We recommend that all of the other file systems used for DLFM be local to the host on which the DLFM is installed. They include:

• The dlfm home directory
• The database DLFM_DB log files
• The tablespace containers for the DLFM_DB database
• The archive file directory for the DLFM backups

The Network Information Service (NIS) is used as a single point of control for UNIX user IDs, groups, and a number of other files in the /etc directory. Do not place the dlfm user ID, its UNIX group, or any of the associated /etc files under NIS control. These are best maintained as local files.


4.3 Planning the backup of the DLFM_DB database

Consider these points when planning the backup strategy for the database (DLFM_DB) that contains the metadata for all Data Linked files:

• Back up the DLFM_DB at the same time the DB2 Universal Database database is backed up.

• Make sure the user exit for archive logging is used for the DLFM_DB.

• Back up the file systems controlled by the Data Links Manager. They need to be unmounted, backed up (via the operating system), and then mounted again. See “Backing up a Journalized File System on AIX” in the DB2 Release Notes Version 7.2/Version 7.1 FixPack 3.

4.4 Performance tuning tips

A few tips can result in better performance of Data Links.

4.4.1 Optimum logging levels

The lower the logging level, the better the performance. We recommend that you keep logging levels at their minimum values unless you are debugging a problem. In a Data Links scenario, two types of logging are involved.

Logging by DB2

The recommended logging level for DB2 is 3 (LOG_ERROR). For debugging purposes, it can be changed to 4 (LOG_DEBUG). It can be updated by issuing the following DB2 command:

db2 update dbm cfg using loglevel <number>

Logging by DLFF (DLFSCM on DCE-DFS)

The logging by DLFF is tunable. You can even turn off the logging done by DLFF or DLFSCM. Refer to Appendix D, “Logging priorities for DLFF and DLFSCM” on page 331, to learn how to modify these logging levels.

4.4.2 Location of file servers

For high performance, locate the file servers near the applications. This reduces network traffic, which makes Data Links perform even better.


4.4.3 Number of files per directory

On AIX and Solaris, the suggested number of files per directory is 3000 or fewer.
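One way an application can stay within this guideline is to spread files across numbered subdirectories. The sketch below is an illustration only and is not part of Data Links: the function names, the d0000-style directory names, and the use of a SHA-256 hash are all assumptions. Because a hash does not distribute files perfectly evenly, the example sizes the layout for 2000 files per directory to leave headroom below 3000.

```python
import hashlib


def bucket_count(total_files, max_per_dir=2000):
    """Number of subdirectories needed to keep each one under the target."""
    # Ceiling division, with at least one directory.
    return max(1, -(-total_files // max_per_dir))


def bucket_for(filename, buckets):
    """Pick a stable subdirectory name for a file from a hash of its name."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return "d{:04d}".format(int(digest, 16) % buckets)


# Hypothetical load of 10000 files.
buckets = bucket_count(10000)
print(buckets)
print(bucket_for("file.B01", buckets))
```

The same file name always maps to the same subdirectory, so the path stored in a DATALINK value can be recomputed from the file name alone.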

4.4.4 Token algorithms

Data Links supports two token algorithms, MAC0 and MAC1. MAC1 is more complex and more secure, but it adds performance overhead. Therefore, we recommend that you use MAC0 unless security is a major concern.

Note: This does not mean that MAC0 is not safe. It’s just that MAC1 is safer!

4.4.5 DLFM backup, home, and log directories

All of the DLFM directories should preferably be local to the Data Links server and not remote (NFS mounted on UNIX or a shared drive on Windows). If these directories are remote, performance may be affected drastically.


Chapter 5. Data Links Manager administration

This chapter discusses a number of general administration commands for working with the Data Links Manager.


5.1 Identifying the tables and servers in Data Links

To identify the tables that contain columns with the DATALINK data type, you can issue a SELECT statement as shown in Figure 5-1.

Figure 5-1 Select from sysibm.syscolproperties

The Data Links File Managers that are defined to a UDB database can be found by using the command shown in Figure 5-2. In this example, you first issue the list db directory command to retrieve the names of all of the databases. You need the database names for the list datalinks managers command.

Figure 5-2 List databases and Data Links Managers


5.2 Checking for Data Links control over a file system

To find out which file systems are controlled by Data Links, you can use the commands shown in Figure 5-3. The mount command displays information for the currently mounted file systems. We are searching for the virtual file system (VFS) type of dlfs. If the mount command does not return any dlfs entries, the Data Linked files are not available. The following UNIX command shows whether any dlfs file systems are defined to the system in the /etc/filesystems file:

lsfs -v dlfs or lsfs|grep dlfs

The Data Link files must be defined and mounted to be available. The following command must also be successful for Data Links to be completely set up:

dlfm list registered prefixes

Before Data Links has control of a file system, the following actions must occur:

1. The utility /usr/lpp/db2_0n_0n/instance/dlfmfsmd creates a dlfs file system. The dlfmfsmd utility updates the /etc/filesystems and /etc/rc.dlfs files. Refer to DB2 Data Links Manager Quick Beginnings, GC09-2966, for the syntax. Verify the file system is of type dlfs by running:

lsfs -v dlfs

2. The file systems are mounted by running the command:

mount -v dlfs <filesystem name>

Use the mount command to verify.

3. The file system is defined to DLFM by:

dlfm add_prefix

Verify it with the command:

dlfm list registered prefixes
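The mount check in step 2 can be scripted. The following sketch filters mount output for lines whose whitespace-separated fields include the dlfs VFS type. The sample output is invented for illustration and only approximates the AIX column layout; the device and mount point names are hypothetical.

```python
def dlfs_mount_lines(mount_output):
    """Return the lines of mount output that list a file system of type dlfs."""
    return [
        line for line in mount_output.splitlines()
        if "dlfs" in line.split()
    ]


# Invented sample output; real column layout and values will differ.
SAMPLE = """\
  node    mounted        mounted over   vfs    date          options
          /dev/hd4       /              jfs    Oct 13 09:12  rw
          /dev/lv01      /dlfs1         dlfs   Oct 13 09:14  rw
"""

for line in dlfs_mount_lines(SAMPLE):
    print(line)
```

On a live system, the text to check can be captured with subprocess.run(["mount"], capture_output=True, text=True).stdout. An empty result means that no dlfs file system is currently mounted.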

Figure 5-3 The dlfs file systems


5.3 Other useful DLFM commands

Most of the commands used to administer a DB2 File Manager are quite simple to use. The question is often which command to use. Here is a list of the most frequently used commands with an explanation of their syntax and what they do. Note that, unless otherwise stated, all of these commands are run on the DLFM server using the DLFM administrator user ID.

• dlfm: Lists all of the available DLFM commands with a brief explanation of what they do. Alternative forms are dlfm help or dlfm ?

• dlfm add_db: Registers a database with DLFM. Three input parameters are required: database name, instance name, and nodename. Here’s an example:

dlfm add_db sample db2inst1 myhost.com

This command populates the DLFM.DFM_DBID table in the DLFM_DB database. All parameters are converted to uppercase before they are stored in the DFM_DBID table.

• dlfm add_prefix: Registers a dlfs file system with the DLFM, for example:

dlfm add_prefix <myfilesystem>

This command populates the DLFM.DFM_PRFX table in the DLFM_DB database. Note that the file system name is case sensitive.

• dlfm bind: Binds executables used by DLFM to the DLFM_DB database. This command also updates DB2 statistics for the DLFM_DB database.

• dlfm drop_dlm: Unregisters a DB2 database with DLFM. It requires the three input parameters database name, instance name, and nodename. Here’s an example:

dlfm drop_dlm SAMPLE db2inst1 myhost.com

• dlfm create: Creates all of the tables in the DLFM_DB database that are used by the File Manager. After the tables are created, the dlfm bind command is invoked to update DB2 statistics for the tables.

• dlfm create_db: Creates and configures the DLFM_DB database. Archive logging is turned on, and an offline backup of the database is performed.

• dlfm drop_db: Drops the DLFM_DB database.

• dlfm help: Lists all of the available DLFM commands with a brief explanation of what they do. An alternative form is dlfm ?

Important: The DB2 RUNSTATS utility should never be used to update statistics for DLFM_DB. Always use the dlfm bind command.


• dlfm list registered databases: Lists all databases registered with DLFM. It selects data from the DFM_DBID table in the DLFM_DB database. Also lists the instance name and hostname of each database.

• dlfm list registered prefixes: Lists all dlfs file systems that are registered with DLFM. Selects data from the DLFM.DFM_PRFX table.

• dlfm refresh key: Changes the key used for generating access control tokens for DATALINK files linked with READ PERMISSION DB. Outstanding tokens generated with the old key are invalidated. The Data Links Manager must be restarted and all connections to registered DB2 databases terminated for this to take effect. Can be used as a security mechanism to prevent hacking of access tokens.

• dlfm restart: Stops and starts the File Manager. Some commands, such as dlfm refresh key, take effect only after the File Manager has been restarted. If a change to DLFM is made and it does not appear to have taken effect, try the dlfm restart command.

• dlfm retrieve: Displays the status of all files managed by DLFM. This command presents an interactive dialog that prompts for hostname, database and instance name, and file system name. It also lists the status of all linked and unlinked files being tracked by DLFM that match the selection criteria.

Tip: The non-interactive retrieve_query command can be used instead of dlfm retrieve. This can be useful to capture the output of the command to a file. Consider this example:

retrieve_query -o <outfile> -h <hostname> -d <dbname> -i <instname> -p <prefix>

Here <outfile> is the name of the output file, <hostname> is the name of the host on which the DB2 database resides, <dbname> is the name of the DB2 database, <instname> is the name of the instance, and <prefix> is the name of the dlfs file system. Note that the file system name supplied must exactly match the output of the dlfm list registered prefixes command, that is, the file system name must end with a forward slash (/).

• dlfm see: Shows the DLFM processes running on the system. See Chapter 2, “Technical architecture” on page 11, for a description of each DLFM process.

• dlfm setup: Starts the database manager, creates the DLFM_DB database and the tables used by the File Manager, and stops the database manager. A file containing configuration options can be used as input.

• dlfm shutdown: Stops the File Manager and removes all Inter Process Communications (IPCs). This command tries to shut down DLFM cleanly, but if unable to do so, it kills the DLFM processes. This command can be useful for fixpak installations or version upgrades because it assures that all processes and IPCs have been terminated. Note that the dlfm stop command does not necessarily remove all IPCs.

• dlfm start: Starts DLFM and issues a message to check the db2diag.log file for an indication of success. The dlfm see command can also be used to verify that the DLFM processes are running.

• dlfm startdbm: Starts the database manager for instance dlfm. It’s the same as db2start.

• dlfm stop: Stops the File Manager. This ends all of the processes that are connected to the database DLFM_DB. The dlfm see command shows the processes that are stopped by dlfm stop.

• dlfm stopdbm: Stops the database manager for instance dlfm. It’s the same as db2stop.

Important: The dlfm stop command does not necessarily clean up the IPCs used by DLFM. If you are performing a version upgrade or fixpak installation, it is important to make sure that all IPCs have been removed. Use the dlfm shutdown command followed by ipcs | grep dlfm to verify that all IPCs have been removed.
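For the non-interactive retrieve_query command described in the Tip in this section, a small helper can assemble the argument list and take care of the trailing slash on the prefix. This is a sketch only: the flags are the ones shown in the Tip, while the function name and the sample values are hypothetical. Appending the slash automatically is a convenience that assumes the registered prefix differs from the supplied name only by that slash.

```python
def retrieve_query_args(outfile, hostname, dbname, instname, prefix):
    """Build the argument list for the retrieve_query command."""
    if not prefix.endswith("/"):
        # The prefix must match the registered form, which ends with "/".
        prefix += "/"
    return [
        "retrieve_query",
        "-o", outfile,
        "-h", hostname,
        "-d", dbname,
        "-i", instname,
        "-p", prefix,
    ]


# Hypothetical values for illustration.
args = retrieve_query_args("status.out", "myhost.com", "SAMPLE", "db2inst1", "/dlfs1")
print(" ".join(args))
```

The resulting list can be passed to subprocess.run(args, check=True) on the DLFM server under the DLFM administrator user ID.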


Chapter 6. Using Tivoli Storage Manager

This chapter discusses compatibility of DB2 Data Links with Tivoli Space Manager and the Backup-Archive Client. It is intended for people who are:

• Using Data Links and want to exploit the features of Tivoli Storage Manager
• Using Tivoli Storage Manager and plan to use Data Links
• New to both Tivoli Storage Manager and Data Links, and want to explore the benefits of having both, side by side

This chapter offers:

• An introduction to Tivoli Storage Manager (the entire product set)
• Data Links with the Backup-Archive Client
• Data Links with Tivoli Space Manager


6.1 Introduction to Tivoli Storage Manager

Tivoli Storage Manager provides a set of products for distributed data and storage management in an enterprise network environment. These products are highly business centric and application aware, and are considered among the most scalable, interoperable, and robust products in the industry. Tivoli Storage Manager supports a wide variety of platforms for mobile, small, and large systems, and delivers many data management functions. Tivoli Storage Manager V4.1 supports six server platforms: Windows NT, Windows 2000, IBM AIX, HP-UX, Sun Solaris, and IBM MVS OS/390. It has the following main products:

• Backup-Archive Client
• Tivoli Space Manager
• Tivoli Data Protection (TDP) for applications
• Tivoli Disaster Recovery Manager (DRM)

Backup-Archive Client

This client helps in maintaining copies of files that may be required in the future for recovery purposes. The Tivoli Storage Manager (TSM) server maintains a separate repository that keeps track of different versions, the timestamp (for point-in-time recovery), and the location of the backup image. The number of backup versions is controlled by server definitions. DB2 gives the user the option to asynchronously copy Data Linked files to disk (using the dlfm_copyd daemon) or to use the Backup-Archive Client of Tivoli Storage Manager to back up files to any secondary storage (which may be the same disk).

Tivoli Space Manager

This client transparently moves less frequently accessed data to lower cost storage media, giving the user the impression that the data is still on disk. Starting with DB2 V7.2, Data Links can be used with Tivoli Space Manager (or HSM) to give users the ability to manage and store Data Linked files in secondary storage.

Tivoli Data Protection (TDP) for applications

Tivoli Data Protection for Applications is a group of solutions integrated with Tivoli Storage Manager that protect data used by business applications. These are interface programs between a storage management API provided by the vendor application and the Tivoli Storage Manager data management API. Tivoli Data Protection is available for Lotus Notes, Lotus Domino, Lotus Domino for iSeries, MS Exchange, MS SQL Server, Informix, and Oracle.


Tivoli Disaster Recovery Manager (DRM)

Tivoli Disaster Recovery Manager assists with the technical steps that help in making data available to users after a widespread failure. It offers various options to configure, control, and automatically generate a disaster recovery plan containing the information, scripts, and procedures needed to automate restoration and to help ensure quick recovery of data after a disaster.

A typical configuration involving Backup-Archive and Tivoli Space Manager clients has the following components:

• Server

  – Provides storage management for client nodes
  – Maintains a database of information
  – Can be used in a network to allow you to manage them centrally and to balance storage resources

• Server Storage

  – Contains files that are backed up, archived, and migrated from client workstations
  – Consists of pools of random and sequential access media

• Administrative Client

  Provides a command line and Java-based administration interface to the server

• BA Client

  The Backup-Archive Client

• HSM Client

  Hierarchical Storage Manager: The client for the Tivoli Space Management product

The following sections describe some of the base concepts of Tivoli Storage Manager and then discuss how Data Links works with BA and HSM clients.

Note: Refer to Tivoli Storage Manager Version 3.7.3 and 4.1: Technical Guide, SG24-6110, to learn more about Tivoli Storage Manager.


6.1.1 Storage device concepts

Client data managed by Tivoli Storage Manager is stored in the Tivoli Storage Manager storage repository, which can consist of different storage devices, such as disk, tape, or optical devices. Tivoli Storage Manager controls this repository. To do this, it uses its own model of storage to view, classify, and control these storage devices, and to implement its storage management functionality.

The main difference between the storage management approach of Tivoli Storage Manager and other commonly used systems is that Tivoli Storage Manager storage management concentrates on managing data objects instead of managing and controlling backup tapes. Data objects can be files, directories, or raw logical volumes that are backed up from the client systems. They can be objects like tables or records from database applications or simply a block of data that a client system wants to store on the server storage.

To store these data objects on storage devices and to implement storage management functions, Tivoli Storage Manager has defined some logical entities to classify the available storage resources. The most important one is the storage pool logical entity.

Storage pool

A storage pool describes a storage resource for one single type of media, such as a disk partition or a set of tape cartridges. Storage pools are the place where data objects are stored. A storage pool is built up from one or more storage pool volumes. For example, in the case of a tape storage pool, a volume would be a single physical tape cartridge. To describe how Tivoli Storage Manager can access those physical volumes to place the data objects on them, Tivoli Storage Manager has another logical entity called a device class. A device class is connected to a storage pool and specifies how volumes of this storage pool can be accessed.

Storage hierarchy

Tivoli Storage Manager organizes storage pools in one or more hierarchical structures. This storage hierarchy can span multiple server instances and is used to implement management functions that migrate data objects automatically, and completely transparently to the client, from one storage hierarchy level to another (in other words, from one storage device to another). This function may be used, for example, to cache backup data (for performance reasons) on Tivoli Storage Manager server disk space before moving the data to tape cartridges. The actual location of all data objects is automatically tracked within the server database.


Tivoli Storage Manager has implemented additional storage management functions for moving data objects from one storage volume to another. Tivoli Storage Manager uses the progressive backup methodology to back up files to the Tivoli Storage Manager storage repository.

The reorganization of the data and storage media for fast recovery happens completely within the server. For this purpose, Tivoli Storage Manager has implemented functions to relocate data objects from one volume to another and to collocate data objects that belong together, either at the client-system level or at the data-group level.

Another important storage management function implemented within the Tivoli Storage Manager server is the ability to copy data objects asynchronously and to store them in different storage pools or on different storage devices, either locally at the same server system or remotely on another server system. It is especially important for disaster recovery reasons to have – in the event of losing any storage media or the whole storage repository – a second copy of data available somewhere in a secure place. This function is fully transparent to the client, and can be performed automatically within the Tivoli Storage Manager server.

Figure 6-1 gives an overview of TSM storage management. It shows how a data object on a TSM client can be migrated or recalled, or backed up or recovered, from a TSM server storage repository. Two device classes are shown, each with one or more storage pools, and each storage pool has one or more storage pool volumes. The figure also shows how a data object is moved in the storage hierarchy.


Figure 6-1 Storage management

6.1.2 Policy concepts

A data storage management environment consists of three basic types of resources: client systems, rules, and data. The client systems contain the data to be managed, and the rules specify how the management must occur. For example, in the case of backup, how many versions should be kept, where should they be stored, and so on.

Tivoli Storage Manager policies define the relationships between these three resources. Figure 6-2 illustrates this policy relationship. Depending on your actual needs for managing your enterprise data, these policies can be very simple or very complex.

[Figure 6-1 labels: TSM Client; Data Object; WAN, LAN, SAN; TSM Server; Storage Repository; Storage Hierarchy; Device Class (two shown); Storage Pool (three shown); Storage Pool Volume; operations Copy, Relocate, Migrate & Colocate]

Figure 6-2 Policy concepts

Tivoli Storage Manager has certain logical entities that group and organize the storage resources and define relationships between them. Client systems, or nodes in Tivoli Storage Manager terminology, are grouped together with other nodes with common storage management requirements, into a policy domain.

The policy domain links the nodes to a policy set, a collection of storage management rules for different storage management activities. A policy set consists of one or more management classes. A management class contains the rule descriptions called copy groups, and links these to the data objects to be managed. A copy group is the place where all the storage management parameters, such as number of stored copies, retention period, storage media, and so on, are defined. When the data is linked to particular rules, it is said to be bound to the management class that contains those rules.

Another way to look at the components that make up a policy is to consider them in the hierarchical fashion in which they are defined. That is to say, consider the policy domain containing the policy set, the policy set containing the management classes, and the management classes containing the copy groups and the storage management parameters.
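The containment hierarchy described above can be sketched as a small data model. This is only an illustration of how the pieces nest, not Tivoli Storage Manager's internal representation; the field names and the sample names (DLDOMAIN, dlfm_node, BACKUPPOOL) are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class CopyGroup:
    """Storage management parameters (field names are illustrative)."""
    versions_kept: int
    retention_days: int
    destination_pool: str


@dataclass
class ManagementClass:
    name: str
    # Keyed by activity, for example "backup" or "archive".
    copy_groups: Dict[str, CopyGroup] = field(default_factory=dict)


@dataclass
class PolicySet:
    name: str
    default_class: str
    classes: Dict[str, ManagementClass] = field(default_factory=dict)


@dataclass
class PolicyDomain:
    name: str
    policy_set: PolicySet
    nodes: Set[str] = field(default_factory=set)

    def bind(self, node: str, class_name: Optional[str] = None) -> ManagementClass:
        """Return the management class that data from `node` is bound to."""
        if node not in self.nodes:
            raise KeyError(f"node {node!r} is not in domain {self.name!r}")
        return self.policy_set.classes[class_name or self.policy_set.default_class]


# Hypothetical sample configuration: one domain, one policy set, one class.
standard = ManagementClass("STANDARD", {"backup": CopyGroup(2, 30, "BACKUPPOOL")})
domain = PolicyDomain("DLDOMAIN",
                      PolicySet("ACTIVE", "STANDARD", {"STANDARD": standard}),
                      {"dlfm_node"})
bound = domain.bind("dlfm_node")
print(bound.name, bound.copy_groups["backup"].versions_kept)
```

Reading the model from the outside in follows the same order as the paragraph above: the domain holds the policy set, the policy set holds the management classes, and each management class holds its copy groups.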

[Figure 6-2 diagram labels: Machines; Nodes; Policy Domain; Policy Set; Management Class (three); Copy Group (three); Rules (three); Data (three)]

6.1.3 Security concepts

Security is a vital aspect for Tivoli Storage Manager because all the data of an enterprise are stored and managed by the storage repository of Tivoli Storage Manager. To ensure that data can only be accessed from the owning client or an authorized party, Tivoli Storage Manager implements, for authentication purposes, a mutual suspicion algorithm, which is similar to the methods used by Kerberos authentication.

Whenever a client wants to communicate with the server, an authentication has to take place. This authentication contains both-sides verification, which means that the client has to authenticate itself to the server, and the server has to authenticate itself to the client.

To do this, all clients have a password, which is stored at the server side as well as at the client side. In the authentication dialog, these passwords are used to encrypt the communication. The passwords are not sent over the network, to prevent hackers from intercepting them. A communication session is established only if both sides are able to decrypt the dialog. When the communication ends, or when a time-out period passes without activity, the session is automatically terminated, and a new authentication is necessary.
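The idea of two-way verification with a shared password that never crosses the network can be illustrated with a generic challenge-response exchange. The sketch below is not the algorithm that Tivoli Storage Manager implements; it swaps in a keyed hash (HMAC) from the Python standard library purely to show the principle, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def prove(password: bytes, challenge: bytes) -> bytes:
    """Answer a challenge with a keyed digest; the password itself is never sent."""
    return hmac.new(password, challenge, hashlib.sha256).digest()


def mutual_authenticate(client_password: bytes, server_password: bytes) -> bool:
    """Return True only if each side proves that it knows the same password."""
    # Each side issues a fresh random challenge to the other.
    server_challenge = secrets.token_bytes(16)
    client_challenge = secrets.token_bytes(16)

    # The client answers the server's challenge; the server checks the
    # answer against the one computed from its stored copy of the password.
    client_answer = prove(client_password, server_challenge)
    client_ok = hmac.compare_digest(
        client_answer, prove(server_password, server_challenge))

    # The server answers the client's challenge; the client checks it
    # against the answer computed from its own copy of the password.
    server_answer = prove(server_password, client_challenge)
    server_ok = hmac.compare_digest(
        server_answer, prove(client_password, client_challenge))

    return client_ok and server_ok


print(mutual_authenticate(b"same-secret", b"same-secret"))
print(mutual_authenticate(b"same-secret", b"other-secret"))
```

Only the random challenges and the digests would travel between the two sides, so an eavesdropper does not see the password, and a session is accepted only when both checks succeed.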

6.1.4 Communication methods

Tivoli Storage Manager server supports the following methods for communication with clients:

■ Shared Memory (TCP/IP pre-requisite)
■ TCP/IP
■ HTTP (for a Web interface)
■ SNMP DPI
■ None (this option is selected to disallow any client from connecting to the server)

6.2 Data Links with the Backup-Archive Client

As an alternative to the disk backup, Tivoli Storage Manager can also be used to back up files that reside on a Data Links server.

Note: Refer to Tivoli Storage Management Concepts, SG24-4877, for more information.

Note: Disk copy is the default backup mechanism for backing up Data Linked files.

To use Tivoli Storage Manager as an archive server, follow these steps:

1. Install Tivoli Storage Manager on the Data Links server. For more information, refer to the Tivoli Storage Manager product documentation.

2. Register the Data Links server client application with the Tivoli Storage Manager server. For more information, refer to the Tivoli Storage Manager product documentation.

3. Add the following environment variables to the Data Links Manager administrator's db2profile or db2cshrc script files:

For Bash, Bourne, or Korn shell:

export DSMI_DIR=/usr/tivoli/tsm/client/ba/bin       # On AIX
export DSMI_DIR=/opt/tivoli/tsm/client/ba/bin       # On Solaris
export DSMI_CONFIG=$HOME/tsm/dsm.opt
export DSMI_LOG=$HOME/dldump
export PATH=$PATH:/usr/tivoli/tsm/client/ba/bin     # On AIX
export PATH=$PATH:/opt/tivoli/tsm/client/ba/bin     # On Solaris

For C shell:

setenv DSMI_DIR /usr/lpp/tsm/bin
setenv DSMI_CONFIG ${HOME}/tsm/dsm.opt
setenv DSMI_LOG ${HOME}/dldump
setenv PATH ${PATH}:/usr/tivoli/tsm/client/ba/bin   # On AIX
setenv PATH ${PATH}:/opt/tivoli/tsm/client/ba/bin   # On Solaris

4. Ensure that the dsm.sys TSM system options file is located in the /<base>/tivoli/tsm/client/ba/bin directory, where <base> is usr in AIX and opt in Solaris.

5. Ensure that the dsm.opt TSM user options file is located in the <INSTHOME>/tsm directory, where <INSTHOME> is the home directory of the Data Links Manager administrator.

6. Set the PASSWORDACCESS option to Generate in the /usr/lpp/tsm/bin/dsm.sys Tivoli Storage Manager system options file.

7. Register the TSM password with the Generate option before you start the Data Links File Manager for the first time. This way a password will not be needed when the Data Links File Manager initiates a connection to the TSM server. For more information, refer to Tivoli Storage Manager for AIX Administrator’s Guide, GC35-0403, at: http://www.tivoli.com/support/public/Prodman/public_manuals/storage_mgr/v4pubs/v1_pdf/aix/guide/anragd40.pdf

8. Set the DLFM_BACKUP_TARGET registry variable to TSM by issuing the command:

db2set -g DLFM_BACKUP_TARGET=TSM

This activates the Tivoli Storage Manager backup option. The value of the DLFM_BACKUP_DIR_NAME registry variable is ignored in this case.

9. Stop the Data Links File Manager by entering the command:

dlfm stop

10. Start the Data Links File Manager by entering the command:

dlfm start

Notes:

■ If the setting of the DLFM_BACKUP_TARGET registry variable is changed from TSM to LOCAL at run time, the files that were previously archived to TSM are not moved to the new archive location on disk. Only newly archived files are stored in the new location.

■ To override the default TSM management class, there is a new registry variable called DLFM_TSM_MGMTCLASS. If this registry variable is left unset, then the default TSM management class is used.

6.3 Data Links and Tivoli Space Manager

DB2 V7.2 Data Links allows users to keep their Data Linked files in the file systems being managed by the Hierarchical Storage Manager (HSM) client of the Tivoli Space Manager. This section discusses the following topics:

■ Overview of Tivoli Space Manager

■ Various tools, processes and interfaces available with the TSM server and HSM client

■ Data Links support for HSM: Overview and benefits

■ Known restrictions of using Data Links with Tivoli Space Manager

6.3.1 Overview of Tivoli Space Manager

Tivoli Space Manager maximizes the usage of existing storage resources by transparently migrating data off workstation and file server hard drives based on size and age criteria, leaving only a stub file. If and when the migrated data is accessed, Tivoli Space Manager transparently migrates the data back onto the local disk. In doing so, Tivoli Space Manager relieves users of the task of manually deleting and archiving data on their workstations. Tivoli Space Manager is a complementary product that is available on all Tivoli Storage Manager servers. Therefore, the supported HSM clients can send data to any Tivoli Storage Manager server that also has Tivoli Space Manager installed. Tivoli Storage Manager is a pre-requisite to implementing Tivoli Space Manager.

Tivoli Space Manager provides the basic functionality of space management by automatically migrating data based on file size, number of days since it was last accessed, or a combination of both. Once a file is migrated, Tivoli Space Manager automatically recalls it when it is accessed and restores it to its original location in the file system.

Figure 6-3 gives an overview of the functionality of Tivoli Space Manager.

Figure 6-3 Tivoli Space Manager overview

[Figure 6-3 diagram labels: User Application; File System Migrator (FSM); Physical File-System; dsmmonitord daemon (or dsmmigrate tool); dsmrecalld daemon (or dsmrecall tool); Storage Pool; flow steps (1) to (4)]

The flow in Figure 6-3 is explained here:

1. The users see more file system space than what is actually in the file system. This is due to the transparent migration and recall of the files. File System Migrator (FSM) VFS provides this illusion to the user.

2. The dsmmonitord daemon keeps track of the threshold values specified in /etc/adsm/SpaceMan/config/dsmmigfstab and the file systems registered with it. If any of the file system attributes (space usage, for example) crosses the threshold value, it starts migrating the files that are eligible for migration. Whether a file is eligible for migration depends on many factors, which include:

– The file size (should be greater than the stub file size)
– The Include-Exclude list (specified in the dsm.sys option file)
– The timestamp of the file

The file can also be migrated explicitly (selective migration) by using the dsmmigrate utility and passing the file name as an argument to it. This utility also requires that the file size be greater than the stub file size.

3. The migrated files are sent to the TSM server, which stores them in the storage pools depending on the policy sets.

4. When an application or user accesses a migrated file, FSM VFS checks whether there is any need to recall the file from the server. If it finds that the file has to be recalled, it triggers the recall operation. The file is finally recalled by the Recall daemon (dsmrecalld). This is known as the transparent recall. Files can also be recalled explicitly (known as selective recall) by using the dsmrecall tool.

Tivoli Space Manager maintains data integrity and security of data by working closely with the operating system. It provides a graphical user interface and commands that can be used to display information about files, including whether they have been migrated.

Migration

Files are migrated by Tivoli Space Manager from the original file system to storage devices connected to a Tivoli Storage Manager server. Each file is copied to the server, and a stub file is placed in the original file’s location. Using the facilities of storage management on the server, the file is placed on various storage devices such as disk and tape.

Tivoli Space Manager migrates only regular files on locally mounted file systems. It does not migrate character special files, block special files, first-in/first-out (FIFO) special files (named pipes), or directories.

There are two types of HSM migration:

■ Automatic

With automatic migration, Tivoli Space Manager monitors the amount of free space on your file systems. When it detects a shortage of free space, it migrates files off the local file system to the Tivoli Storage Manager server storage based on the space management options.

Tivoli Space Manager monitors free space in two ways:

– Threshold: Threshold migration maintains your local file systems at a set level of free space. At an interval specified in the options file, Tivoli Space Manager checks the file system space usage. If the space usage exceeds the high threshold, files are migrated to the server by moving the least-recently used (LRU) files first. When the file system space usage reaches the set low threshold, migration stops. Threshold migration can also be started manually.

– Demand: Tivoli Space Manager checks for an out-of-space condition on a file system every two seconds. If this condition is encountered, Tivoli Space Manager automatically starts migrating files until the low threshold is reached. As space is freed up, the process causing the out-of-space condition continues to run. You do not receive out-of-space error messages while this is happening.

■ Selective

You can tell Tivoli Space Manager to selectively migrate a file immediately to the server’s storage. As long as the file meets the space management options, it is migrated. The file does not need to meet age criteria, nor does the file system need to meet space threshold criteria.
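The threshold behavior described above (start when usage exceeds the high threshold, migrate least-recently used files first, stop at the low threshold) can be sketched as follows. This is a simplified model under stated assumptions, not the HSM client's implementation: thresholds are expressed as percentages of capacity, stub file sizes are ignored, and every file is treated as eligible. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResidentFile:
    name: str
    size: int         # space the file occupies on the local file system
    last_access: int  # larger value means accessed more recently


def threshold_migrate(files: List[ResidentFile], capacity: int,
                      high_pct: float, low_pct: float) -> List[str]:
    """Return the names of the files migrated, least recently used first."""
    used = sum(f.size for f in files)
    migrated: List[str] = []
    if used * 100 <= high_pct * capacity:
        return migrated                  # high threshold not exceeded
    for f in sorted(files, key=lambda f: f.last_access):
        if used * 100 <= low_pct * capacity:
            break                        # low threshold reached, stop
        used -= f.size                   # stub size is ignored in this sketch
        migrated.append(f.name)
    return migrated


files = [ResidentFile("a.dat", 30, 5),
         ResidentFile("b.dat", 40, 1),
         ResidentFile("c.dat", 25, 3)]
print(threshold_migrate(files, capacity=100, high_pct=90, low_pct=50))
```

In this sample, usage starts at 95 percent, so the two least recently used files are migrated before usage drops to the low threshold and the most recently used file stays resident.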

Note: Do not confuse HSM migration with storage pool migration. Storage pool migration is the process where client data (which could be backup, archive, or HSM data) moves through the Tivoli Storage Manager storage hierarchy, typically from disk to tape or optical. Storage pool migration happens entirely within the Tivoli Storage Manager server.

HSM migration is the process of moving data from an HSM client to the Tivoli Storage Manager where it will be stored in a storage pool. Once a file has been migrated (HSM) from a client to the server, it could subsequently be migrated (server) to another server storage pool.

Pre-migration

Migration can take a long time to free up significant amounts of space on the local file system. Files need to be selected and copied to the Tivoli Storage Manager server, which may involve a tape mount, and a stub file must be created in place of each original file. To speed up the migration process, Tivoli Space Manager can be told to implement a pre-migration policy.

After threshold or demand migration completes, Tivoli Space Manager continues to copy files from the local file system until the pre-migration percentage is reached. These copied files are not replaced with the stub file, but they are marked as pre-migrated.

The next time migration starts, the pre-migrated files are chosen as the first candidates to migrate. If the file has not changed since it was copied, the file is marked as migrated and the stub file is created in its place in the original file system. No copying of the file needs to happen, because the server already has a copy. In this manner, migration can free up space very quickly.
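The reason pre-migration speeds up later migrations can be seen in a small state model of a single file. This is a sketch of the resident, pre-migrated, and migrated states as the text describes them, not the HSM client's code; the class and its copy counter are hypothetical.

```python
from enum import Enum


class State(Enum):
    RESIDENT = "resident"
    PREMIGRATED = "pre-migrated"
    MIGRATED = "migrated"


class ManagedFile:
    def __init__(self, name: str):
        self.name = name
        self.state = State.RESIDENT
        self.copies_sent = 0  # how many times the data was copied to the server

    def premigrate(self) -> None:
        """Copy the data to the server but keep the local file in place."""
        if self.state is State.RESIDENT:
            self.copies_sent += 1
            self.state = State.PREMIGRATED

    def modify(self) -> None:
        """A local change makes the server copy out of date."""
        if self.state is State.PREMIGRATED:
            self.state = State.RESIDENT

    def migrate(self) -> None:
        """Replace the local file with a stub, copying first only if needed."""
        if self.state is State.RESIDENT:
            self.copies_sent += 1
        self.state = State.MIGRATED

    def recall(self) -> None:
        """After a recall, the data is on both sides, so the file is pre-migrated."""
        if self.state is State.MIGRATED:
            self.state = State.PREMIGRATED


unchanged = ManagedFile("report.dat")
unchanged.premigrate()
unchanged.migrate()        # no second copy: the server already has the data

changed = ManagedFile("draft.dat")
changed.premigrate()
changed.modify()
changed.migrate()          # the file changed, so it is copied again

print(unchanged.copies_sent, changed.copies_sent)
```

The unchanged file is copied once and the changed file twice, which is the saving that pre-migration provides when space must be freed quickly.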

Recall

Recall is the process of bringing back a migrated file from Tivoli Storage Manager to its original place on the local file system. A recall can be either transparent or selective:

■ Transparent

From a user or running process perspective, all the files in the local file system are actually available. Directory listings and other commands that do not require access to the entire file appear exactly as they would if the HSM client was not installed. When a migrated file is needed by an application or command, the operating system initiates a transparent recall for the file to the Tivoli Storage Manager server. The process is temporarily halted while the file is automatically copied from the server’s storage to the original file system location. Once the recall is complete, the halted process continues without requiring any user intervention. In fact, depending on how long it takes to recall the file, the user may not even be aware that HSM is used.

After a recall, the file contents are on both the original file system and on the server storage. This allows Tivoli Space Manager to mark the file as pre-migrated and eligible for migration unless the file is changed.

■ Selective

Transparent recall only recalls files automatically as they are accessed. If you or a process need to access a number of files, it may be more efficient to manually recall them prior to actually using them. This is done using selective recall. Tivoli Space Manager batches the recalled file list based on where the files are stored. It recalls the files stored on disk first and then recalls the files stored on sequential storage devices such as tape.

■ Advanced transparent recall

Advanced transparent recall is available only on AIX platforms. There are three recall modes: normal (which recalls a migrated file to its original file system), migrate-on-close, and read-without-recall:

– Migrate-on-close: When Tivoli Space Manager uses the migrate-on-close mode for recall, it copies the migrated file to the original file system, where it remains until the file is closed. When the file is closed and if it has not been modified, Tivoli Space Manager replaces the file with a stub and marks the file as migrated (since a copy of the file already exists on the server storage).

– Read-without-recall: When Tivoli Space Manager uses the read-without-recall mode, it does not copy the file back to the originating file system, but passes the data directly to the requesting process from the recall. This can only happen when the processes that access the file do not modify the file, or if the file is executable, the process does not execute the file. The file does not use any space on the original file system and remains migrated (unless the file is changed; then Tivoli Space Manager performs a normal recall).

Reconciliation

Tivoli Space Manager uses reconciliation to maintain synchronization of the local file system and Tivoli Storage Manager. Reconciliation builds a migration candidates list.

Reconciliation can be started manually or allowed to happen automatically at intervals set in the options file and prior to threshold migration if the migration candidate list is empty.

Note: Do not confuse the Tivoli reconciliation with the Data Links Reconcile utility.

■ Synchronization

Synchronization keeps the Tivoli Space Manager database in sync with the actual files on the original file system. It ensures that a valid file copy is kept for every stub file, that there are no database entries for resident (unmigrated) files on the original file system, and that there is a database entry for every pre-migrated file. It also updates status fields in the database. For example, if you recall a file, change it, and immediately migrate it, Tivoli Space Manager has two copies of the file in its storage: the most recent one, which is valid, and an obsolete one. Reconciliation removes the obsolete copy after its expiration interval has passed.

■ Building a new migration candidates list

Tivoli Space Manager uses the reconciliation process to build a prioritized list of files on the original file system that are eligible for automatic migration. The list is created based on the management class criteria and minimum file size. It is ordered according to the number of days since the file was last used, the file size, and the migration factors set in the options file. During threshold and demand migration, the list is used to select files to migrate in prioritized order. As the file is selected, it is checked again to ensure that it still meets the migration criteria. A new migration candidate list is created each time reconciliation runs. The list can also be created at any time.
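A prioritized candidates list of this kind can be sketched as a filter followed by a sort. The text does not give the exact ordering formula, so the weighted sum of age and size used here is an assumption made for illustration, as are the parameter names age_factor and size_factor and the sample file names.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    days_since_use: int
    size_kb: int


def build_candidate_list(files: List[Candidate], min_size_kb: int,
                         age_factor: float = 1.0,
                         size_factor: float = 1.0) -> List[str]:
    """Return eligible file names, highest migration priority first."""
    # Files at or below the minimum size are not eligible.
    eligible = [f for f in files if f.size_kb > min_size_kb]

    # Assumed scoring: older and larger files receive a higher score.
    def score(f: Candidate) -> float:
        return age_factor * f.days_since_use + size_factor * f.size_kb

    return [f.name for f in sorted(eligible, key=score, reverse=True)]


files = [Candidate("old_small", 90, 8),
         Candidate("new_big", 2, 500),
         Candidate("tiny", 400, 1),
         Candidate("old_big", 60, 400)]
print(build_candidate_list(files, min_size_kb=4))
print(build_candidate_list(files, min_size_kb=4, size_factor=0.0))
```

Changing the two factors shifts the list between favoring large files and favoring files that have not been used for a long time, which is the kind of tuning that the migration factors in the options file are described as controlling.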

Options

Options to control Tivoli Space Manager are set in the client options file (dsm.sys). These options set items such as which Tivoli Storage Manager server to use for the Tivoli Space Manager functions, space management options, migration options, excluded file lists, and assigning management classes to files.

Backup/restore and archive/retrieve

Tivoli Space Manager should not be considered as a replacement for backup. It should be viewed as a form of space extension of local disk storage. When a file is migrated to the HSM server, there is still only one copy of the file available, since the original is deleted on the client and replaced by the stub.

Also, Tivoli Space Manager maintains only the last copy of the file, giving no opportunity to store multiple versions. Therefore, the Tivoli Storage Manager backup-archive client must be used to back up or archive files, either before or after they are migrated by Tivoli Space Manager. You can specify that a file is not eligible for HSM migration unless a backup has been made first with the backup-archive client. If the file is migrated and the same Tivoli Storage Manager server destination is used for both backup and HSM, the server can copy the file from the migration storage pool to the backup destination without recalling the file.

Both files and stub files can be restored from a Tivoli Storage Manager backup. If you restore the entire file, it will become a normal resident file on the client, and the migrated copy will be deleted from the HSM pool during the next reconciliation. If you do not want to restore the actual file data, you can use options on the HSM client to restore only the stub file without re-creating the file contents. In this case, the file will still remain in its migrated state.

The Tivoli Storage Manager backup-archive client allows you to archive and retrieve copies of migrated files without performing a recall for the file first, provided that the same Tivoli Storage Manager server is used for both HSM and backup-archive. The file is simply copied from the HSM storage pool to the archive destination pool.

6.3.2 Tools, processes, and interfaces

The various Tivoli Storage Manager processes and utilities for the TSM server, the backup-archive client, and the HSM client are discussed here:

■ At the Tivoli Storage Manager server side:

– dsmserv: This is the main server process that can be started either in the background or in the foreground. The different modes in which dsmserv can be started are:

• CLI Mode: When started in this mode, dsmserv opens a command line interface. This mode is the default when dsmserv is started from a shell.

• Quiet Mode: This is the default mode when dsmserv is started at boot time. From a UNIX shell, dsmserv can be started in this mode by issuing the dsmserv quiet command.

– dsmadmc: The administrative command-line client is a program that runs on a file server, workstation, or mainframe that allows administrators to control and monitor the server through administrative commands. It can be started in any one of the following modes:

• Console mode: This mode is used to monitor TSM activities as they occur or to capture processing messages to an output file. For example, it is possible to monitor migration processes and clients logging on to TSM. No administrative command can be entered in this mode. To start the administrative client in this mode, enter the following command at the shell prompt:

dsmadmc -consolemode

• Mount mode: This mode is used to monitor removable media mount activities. No administrative commands can be entered in this mode. To start the dsmadmc client in this mode, enter the following command:

dsmadmc -mountmode

• Batch mode: This mode is used to enter a single administrative command. The administrative client session automatically ends when the command has been processed. To start an administrative client session in batch mode, enter the following command:

dsmadmc -id=<User Id> -password=<Passwd> <command>

• Interactive mode: This mode is used to enter a series of administrative commands. To start an administrative client session in interactive mode, a server session must be available. To start the administrative client in this mode, simply enter dsmadmc at the shell prompt without any parameters.

– Web interface: It is possible to do all the administration through a user friendly Web interface.

■ At the Backup-Archive Client of TSM:

– dsmc: This is a command line interface used for backup/restore and archive/retrieve operations. Users can query any file to see its backup or archive status.

– dsm: This is a GUI that provides all the basic functionalities provided by the dsmc client, as well as some extra functionalities like setting include-exclude options (discussed later).

■ At Tivoli Space Manager (or Hierarchical Storage Manager):

– dsmmonitord daemon: This daemon monitors file systems registered with HSM at a regular frequency.

– dsmrecalld daemon: This daemon takes care of both transparent and selective recalls. It spawns a child for every active recall operation.

– dsmmigfs: This is a tool used for registering the file systems with HSM. It puts an entry in the /etc/adsm/SpaceMan/config/dsmmigfstab file, which maintains the information for each file system registered with HSM.

– dsmmigrate: This utility is used to selectively migrate files to the storage pool guided by the TSM server policy to which this client is registered.

– dsmrecall: This tool is used for selective recall of migrated files.

– dsmls: This is a tool that displays information about the files in an HSM managed file system. It is similar to the ls UNIX command and returns the following information:

• Virtual size of the file (as it appears to the user)• Actual size of the file (in fact size of the stub, if the file is migrated)• State of the file (migrated, pre-migrated or resident)

– dsmdu: This utility reports the virtual space usage of the file system objects (files, directories, and subdirectories); that is, it takes the virtual size of the migrated files into account and not the actual stub file size. The UNIX du utility, on the other hand, gives the space usage of the file system objects based on the actual size.

6.3.3 Data Links support for HSM

Data Links now supports HSM on AIX. In a typical scenario, DLFF sits over FSM (the VFS/Vnode layer of HSM), which in turn layers over the native file system, JFS. Any file system request coming to this file system is first trapped by DLFF. After DLFF is done with its preprocessing, it calls the corresponding operation of the base file system (in this case, FSM). FSM also does its own preprocessing required for space management functionality and then calls the equivalent JFS file system operation. Figure 6-4 provides the entire picture of how and where the different components of Data Links and Tivoli Space Manager fit.
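The layering described above, where each layer preprocesses a request and then hands it to the layer beneath it, can be modeled with a small chain of objects. This is a conceptual sketch only; it does not reflect the AIX VFS/Vnode interfaces, and the Layer class and the sample path are hypothetical.

```python
from typing import List, Optional


class Layer:
    """One file system layer; `base` is the layer directly underneath."""

    def __init__(self, name: str, base: Optional["Layer"] = None):
        self.name = name
        self.base = base

    def open(self, path: str, trace: List[str]) -> List[str]:
        if self.base is None:
            # The bottom layer performs the real operation.
            trace.append(f"{self.name}: perform open on {path}")
        else:
            # Upper layers preprocess, then call the same operation below.
            trace.append(f"{self.name}: preprocess open on {path}")
            self.base.open(path, trace)
        return trace


jfs = Layer("JFS")
fsm = Layer("FSM", base=jfs)
dlff = Layer("DLFF", base=fsm)

for step in dlff.open("/myfilesystem/fcfile", []):
    print(step)
```

The trace lists DLFF first, then FSM, then JFS, matching the order in which a request passes through the stack.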

Figure 6-4 Data Links and Tivoli Space Manager

To use the functionality of both Data Links and Tivoli Space Manager, the file system should first be registered with HSM by using the following command:

dsmmigfs add <File System Name>

[Figure 6-4 diagram labels: DB2 Application; DB2 Server; db2agents; SQL Access Path; Standard File Access Protocol; Control Path for Data Links Integrity; Data Links Manager on File server; Data Links File Manager (DLFM); DLFM_DB (metadata repository); DLFF (Data Link Filesystem Filter); FSM (HSM's VFS); Native File System AIX - JFS; Storage; TSM Processes on File Server; TSM Server; Archive Server]

Then it should be DLFF-enabled by running the dlfmfsmd install script (under the /usr/lpp/db2_07_01/instance directory), and finally registered with DLFM by running the following command:

dlfm add_prefix <File System Name>

This happens after mounting the file system as DLFS.
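The registration sequence changes the file system's stanza in /etc/filesystems at two points, as this section describes. As a rough model of those two edits, the sketch below applies them to a stanza held in a dictionary. The attribute values are taken from the examples in this section; the function names are hypothetical, and the code only models the resulting text. It does not call dsmmigfs or dlfmfsmd and does not touch the real file.

```python
ORIGINAL = {
    "dev": "/dev/lv02",
    "vfs": "jfs",
    "log": "/dev/hd8",
    "mount": "true",
    "options": "rw",
    "account": "false",
}


def after_hsm_registration(stanza: dict) -> dict:
    """Model the change attributed to `dsmmigfs add`."""
    out = dict(stanza)
    out["mount"] = "false"
    out["adsmfsm"] = "true"
    return out


def after_dlff_enablement(stanza: dict) -> dict:
    """Model the change attributed to the dlfmfsmd script."""
    out = dict(stanza)
    out["vfs"] = "dlfs"
    out["options"] = stanza["options"] + ",Basefs=fsm"
    out.setdefault("nodename", "-")  # added only if not already there
    return out


def render(mount_point: str, stanza: dict) -> str:
    """Format a stanza in the general layout of /etc/filesystems."""
    lines = [f"{mount_point}:"]
    lines += [f"        {key} = {value}" for key, value in stanza.items()]
    return "\n".join(lines)


final = after_dlff_enablement(after_hsm_registration(ORIGINAL))
print(render("/myfilesystem", final))
```

Keeping the two steps as separate functions mirrors the order in which the file is modified: first by HSM registration, then by DLFF enablement.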

The /etc/filesystems file is modified twice in this process. Before registering the file system with either DLFF or HSM, the entry in /etc/filesystems for any JFS file system would typically look like the following example:

/myfilesystem:
        dev = /dev/lv02
        vfs = jfs
        log = /dev/hd8
        mount = true
        options = rw
        account = false

After the file system is registered with HSM (by using dsmmigfs add /myfilesystem), the /etc/filesystems entry is modified to:

/myfilesystem:
        dev = /dev/lv02
        vfs = jfs
        log = /dev/hd8
        mount = false
        options = rw
        account = false
        adsmfsm = true

Note: Observe that the mount option is modified to false and an extra option (“adsmfsm = true”) is added.

After running the dlfmfsmd script, this entry would look like the following example:

/myfilesystem:
        dev = /dev/lv02
        vfs = dlfs
        log = /dev/hd8
        mount = false
        options = rw,Basefs=fsm
        account = false
        adsmfsm = true
        nodename = -

Note: Observe that the vfs option is modified to dlfs, the options option is modified to “rw,Basefs=fsm”, and an extra option (if not already there) is added (“nodename = -”).

Note: On AIX, the dlfmfsmd script modifies the /etc/filesystems file to add or modify DLFS related options, corresponding to the file system name passed as an argument.

6.3.4 Current restrictions

There are some restrictions when using Data Links with Tivoli Space Manager. These restrictions are:

■ Selective migration (dsmmigrate) of a file that is Data Linked under READ PERMISSION DB control should be done by the super-user (root) only.

For example, suppose that an ordinary (non-root) user, amita, tries to migrate a READ PERMISSION DB file (fcfile), with or without a valid token. This results in an error because the dsmmigrate utility expects the user to be the owner of the file. Since the owner of the READ PERMISSION DB file is the DLFM admin user and not amita, the error message shown in Figure 6-5 is returned.

Figure 6-5 Selective Migration of READ PERMISSION DB file

■ The statfs or stat call on a file system that has FSM as well as DLFS (DLFF) mounted on it shows the file system type as FSM and not DLFS, although the DLFS VFS layers above the FSM VFS. This is done because the HSM Recall daemon (dsmrecalld) expects the file system type to be FSM, and it fails if it finds some other file system type (DLFS in this case).

Both DLFS and FSM are mounted (with DLFS on top) on the /dlfsfsmtest file system. The following “C” code (Figure 6-6) does a statfs on the file system name (passed as an argument) and tells its VFS type from the FSID field of the statfs structure.

Figure 6-6 dostatfs.c

Figure 6-7 shows the VFS number of DLFS and FSM.

Figure 6-7 VFS numbers of DLFS and FSM

The second entry in /etc/vfs corresponding to a VFS name is its VFS number (or type). Therefore, you see that DLFS has the VFS number of 7 and FSM has the VFS number of 15.
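As a rough illustration of reading that second field, the following shell function looks up the VFS number for a given VFS name. It is only a sketch: the function name vfs_number and its optional file argument are our own, and it assumes the usual /etc/vfs layout, in which comment lines start with # and control lines (such as %defaultvfs) start with %.

```shell
# vfs_number NAME [FILE]
# Print the VFS number (second field) recorded for VFS name NAME.
# FILE defaults to /etc/vfs. The function name and arguments are illustrative.
vfs_number() {
    awk -v name="$1" '
        /^[#%]/ { next }                          # skip comments and % control lines
        $1 == name { print $2; found = 1; exit }  # second field is the VFS number
        END { exit found ? 0 : 1 }                # non-zero status if NAME is not listed
    ' "${2:-/etc/vfs}"
}
```

With the values described above, running vfs_number dlfs on that system would be expected to print 7, and vfs_number fsm would be expected to print 15.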


/dlfsfsmtest has both DLFS and FSM mounted over it. Now, when dostatfs.c is compiled and run (with /dlfsfsmtest as an argument), it gives the output shown in Figure 6-8.

Figure 6-8 Result of dostatfs on /dlfsfsmtest

Note that the VFS number shown by dostatfs is of FSM and not of DLFS, although DLFS is mounted on top of FSM.

• The dsmls utility does not show any output if the file with the minimum inode number (in that particular directory) is Data Linked under READ PERMISSION DB control.

Figure 6-9 shows that the dsmls utility does not give any output, although there are three files (file, fcfile, and normalfile) in the directory. The reason is that the file with the minimum inode number in the directory (the file with inode number 6145) is a READ PERMISSION DB Data Linked file.

Figure 6-9 dsmls utility behavior


Chapter 7. High Availability support on AIX

This chapter describes how Data Links can be supported in a High Availability Cluster Multiprocessor (HACMP) environment on AIX. It discusses some common cluster configurations in a Data Links environment and specifies some of the key points required for Data Links to work in a high availability environment. Prior to reading this chapter, you should have some familiarity with the HACMP for AIX product.

In the context of HACMP, Data Links is an application that needs to be configured for high availability. In fact, Data Links consists of two sub-applications that need to be configured for high availability:

• Host DB2 Server: This is a piece of software that is essentially a DB2 instance to which database client applications can connect.

• Data Links Server: This is a piece of software that resides on the file server node with which the host DB2 server communicates to link and unlink files. The Data Links File Manager (DLFM) and the Data Links File System Filter (DLFS) are two different pieces of software under the Data Links server. However, as far as HACMP is concerned, they can be treated as one integrated application because both pieces need to run together on the same node.

To learn more about HACMP, refer to HACMP/ES Customization Examples, SG24-4498.


7.1 Introduction

The High Availability Cluster Multiprocessor environment is built on the concept of clustering. In a cluster, multiple server processors cooperate to provide a set of services or resources to other entities. HACMP defines relationships among cooperating processors where peer cluster nodes provide the services offered by a cluster node that becomes disabled.

The HACMP Cluster Manager runs on each cluster node, monitoring local hardware and software subsystems, tracking the state of the cluster peers, and triggering cluster events when the cluster status changes. A cluster event represents a change in a cluster's operational state that the HACMP Cluster Manager recognizes and to which it can respond.

Cluster nodes exchange “keep-alive” messages with peer nodes so that the Cluster Manager can track the availability of the nodes in the cluster. If a node stops sending keep-alive messages, the peer nodes drive the recovery process. The peer nodes take the necessary actions to get critical applications up and running and to ensure that data is not corrupted.

This relationship between nodes is the basis for a failover of services. A failover of services occurs when an HACMP cluster environment experiences a change that requires stopping services on one node and resuming those services on the standby or peer node.

The Data Links sub-applications described above (the host DB2 server and the Data Links server) can be configured in either of the two basic HACMP cluster configurations:

• Hot Standby: In this configuration, the host DB2 server and the Data Links server belong to two different HACMP clusters. Each cluster has one active node that runs the host DB2 server or Data Links server in normal mode and one standby node that takes over the functionality of the active node in case of failure. The standby node in each cluster is mostly dedicated to the failover operation of the active node and does not run any other major applications.

• Mutual Takeover: In this configuration, the host DB2 server node and the Data Links File Manager node back up each other and take over each other's functionality during failover. They both belong to the same HACMP cluster. This configuration is common to both the host DB2 and the Data Links File Manager applications.

7.2 HACMP cluster configuration for hot standby

Figure 7-1 shows the configuration for the Host DB2 cluster or the Data Links File Manager cluster. The cluster contains two nodes:


• Active Node-A

• Standby Node-B

Each node has its own local disk on the SCSI-0 adapter. Volume Group 1 (VG-1) is the shared resource group of disks and file systems. The disks in VG-1 are connected to separate SCSI adapters (for example, SCSI-1).

Figure 7-1 Host DB2 (or) Data Links File Manager cluster

The Active Node-A has priority 1, and the standby Node-B has priority 2 (less than the active node) for the shared resource group. In a cascading policy, active Node-A, whenever it is present in the cluster or rejoins the cluster, controls or takes over the shared resource Volume Group (VG-1) and acts as the host DB2 server. In case of a failure of active Node-A or its scheduled outage from the cluster, the shared resource Volume Group (VG-1) fails over to standby Node-B. The HACMP Cluster Manager detects the failure and then starts the applications (Host DB2 Server or Data Links server) on the standby Node-B. The sample scripts for starting and stopping the Host DB2 Server or Data Links server are given in 7.4, “The scripts” on page 142.
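The effect of the cascading priorities can be pictured with a small shell sketch. It only illustrates the rule described above and is not how the HACMP Cluster Manager is implemented; the function name cascade_owner and the "node priority state" input format are our own assumptions.

```shell
# cascade_owner
# Read lines of the form "node priority state" on standard input and print the
# node that is up and has the best (numerically lowest) priority.
# Returns non-zero if no node is up. Name and input format are illustrative.
cascade_owner() {
    awk '
        $3 == "up" && (best == "" || $2 + 0 < bestprio) {
            best = $1
            bestprio = $2 + 0
        }
        END {
            if (best == "") exit 1
            print best
        }
    '
}
```

Feeding it the two lines "Node-A 1 up" and "Node-B 2 up" selects Node-A. If Node-A is reported as down, Node-B is selected, and when Node-A is reported as up again, it is selected once more, which mirrors the cascading behavior.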

[Figure 7-1 diagram: Hot standby configuration for the Host DB2 (or) Data Links File Manager cluster. Active Node A (priority 1, cascading) and Standby Node B (priority 2, cascading) each have a local disk on SCSI-0 and a network adapter (active on Node A, standby on Node B) attached to the network. Both nodes connect through SCSI-1 to the shared disks of Resource Volume Group 1 (VG-1).]


Let us assume that the hostname of the active node is Node-A and its service IP address is IP-A. The association between Node-A and IP-A is registered in the DNS or in the /etc/hosts files on the client nodes. The standby node with hostname Node-B has IP-B as its IP address. The HACMP configuration should enable both IP address and hardware address takeover for the network adapters on Node-A and Node-B. When Node-A fails, HACMP makes Node-B’s network adapter release the IP address IP-B and take over the IP address IP-A. The network adapter on Node-B also assumes the hardware address (for example, the Ethernet address) of the network adapter of Node-A so that the client application nodes do not need to flush their ARP cache.
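On a client node that relies on /etc/hosts rather than DNS, the registered association can be checked with a small lookup such as the following. This is a minimal sketch: the function name hosts_lookup is our own, and it assumes the common hosts file layout of an address followed by one or more names, with # starting a comment.

```shell
# hosts_lookup NAME [FILE]
# Print the first address that FILE (default /etc/hosts) associates with NAME.
# Returns non-zero if NAME is not found. The function name is illustrative.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # drop comments before looking at fields
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}
```

Because the service address moves with the failover, a lookup of Node-A on the client should return the same address (IP-A) before and after Node-B takes over.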

7.2.1 Hot standby setup for a host DB2 server

The host DB2 server must have the following file systems under the shared resource Volume Group (VG-1). The file systems should be accessible under the same absolute path name on both the nodes:

• File system containing the DB2 instance home directory

• File system containing the host database directory

You must install the host DB2 server software on both the nodes with the same install options and parameters. Let us assume /home and /dbfs are the file systems under VG-1. /home/db2 and /dbfs/db2 are the instance home and database directories respectively. While installing software on Node-A, let us assume that VG-1 is attached to Node-A, so that the installation will create the instance home and database directories on /home and /dbfs respectively.

While installing the software on the standby Node-B, you should create two temporary file systems on the local disk of Node-B and mount them as /home and /dbfs. After installation on Node-B, unmount the /home and /dbfs and then delete these temporary file systems from the local disk. After failover, the standby Node-B will use the /home and /dbfs file systems from VG-1 and will find the instance home and database directories created during installation on Node-A.

The user can have multiple instances and database directories on the active Node-A, but all of them need to be set up as described above. That is, they need to be shared with standby Node-B in case of a failover.

The sample start/stop script provided in sqllib/samples/hacmp/rc.db2server.dls will be run by the Cluster Manager to start/stop the host DB2 instance. The script is also documented in 7.4, “The scripts” on page 142. The start script runs on Node-B in case of a failover and should set the hostname of Node-B to Node-A (this hostname setting is required for the Data Links application to work).


7.2.2 Hot standby setup for a Data Links server

The setup for a Data Links server in a hot standby mode is much the same as the one for a host DB2 server described in 7.2.1, “Hot standby setup for a host DB2 server” on page 134. Some of the differences are mentioned in 7.4, “The scripts” on page 142.

The following file systems should be part of the shared resource Volume Group VG-1:

• The file system that contains the home directory of the DLFM’s local DB2 instance user (which by default is dlfm)

• The file system that contains the DLFM’s meta data database (which by default is DLFM_DB), if the DLFM_DB database is created in a different file system from the default one (home directory of the dlfm instance user)

• The file system that contains the dlfm_backup directory if the local disk backup option is used

• All the dlfs file systems

You must install the Data Links File Manager and Data Links File System software on both nodes with the same options and parameters as mentioned above for the host DB2 server:

1. Install the software on the active node with VG-1 attached to this node. Once the installation is over on the active node, complete the DLFM administration work of registering the prefixes and host databases. This makes the active node ready for service.

2. Later, install the software on the standby node by creating temporary local file systems with the same path names as those in VG-1. This is just for the installation to succeed and for doing the other required setup on the standby node. After the installation is over, shut down the DLFS kernel extensions and the Data Links File Manager. Unmount and delete the temporary file systems. In case of failover, the HACMP Cluster Manager will start the Data Links File Manager, load the DLFS kernel extensions, and mount the dlfs file systems from the shared VG-1 on the standby node.

3. A sample script is provided in sqllib/samples/hacmp/rc.db2dls, which the Cluster Manager will run to start/stop the DLFM/DLFS. The script is also documented in 7.4, “The scripts” on page 142.

Attention: The sample scripts provided are not meant for direct use but need to be customized for the local environment (for example, nodenames, etc.).


4. The standby Node-B gets the IP address or hardware address of the active Node-A’s network adapter during failover.

7.3 HACMP cluster configuration for mutual takeover

The mutual takeover configuration for Data Links is where the host DB2 server and the Data Links server back up each other to provide high availability. The mutual takeover environment is shown in Figure 7-2.


Important: In case of a Data Links server node failover, the hostname of standby Node-B does not need to be set to the hostname of Node-A. This is because, in case of failover, the DNS or clients’ local /etc/hosts files still have the name Node-A associated to IP-A. Since IP-A is now taken over by the network adapter of Node-B, all the network requests for Node-A automatically go to Node-B.


Figure 7-2 Mutual takeover environment

7.3.1 Configuration

The configuration consists of:

• In this configuration, Node-A is an active node and Node-B is a standby node for the host DB2 server, and it is the reverse for the Data Links server.

• Resource Volume Group - 1 (VG-1): This is the host DB2 server shared resource group. It contains all the file systems containing the home directories of the DB2 instances and the database directories that are required to fail over to standby Node-B when the host DB2 active Node-A goes out of the HACMP cluster.

[Figure 7-2 diagram: Mutual takeover configuration between the host DB2 server and the Data Links File Manager. Host DB2 Node A has priority 1 for Resource Volume Group 1 (VG-1, holding /home/db2inst1 and the DB2 table data file systems) and priority 2 for Resource Volume Group 2 (VG-2, holding /home/dlfm, /dlff/files1, and /dlff/files2). Data Links File Manager Node B has the reverse priorities. Each node has a local disk on SCSI-0 (with /usr/lpp/db2_07_01, /var/db2/v71, /etc/vfs, and /etc/rc.dlfs), an active and a standby network adapter attached to the network, and SCSI-1 and SCSI-2 connections to the shared disks.]


The <instance_owner_home>/sqllib directory must exist on a shared disk and must have the same path on both Node-A and Node-B. The database and database logs must exist on a shared disk and have the same path on Node-A and Node-B. Each instance must have a unique path for both the database and the logs.

Node-A has priority 1, and Node-B has priority 2 for this shared resource group. Thus, in a cascading policy whenever Node-A rejoins the cluster, it gets the VG-1 back and takes over the functionality of the host DB2 server.

• Resource Volume Group - 2 (VG-2): This is the Data Links server resource group. It contains all the following file systems that are required to fail over to standby Node-A when the Data Links server active Node-B goes out of the HACMP cluster. Node-B has priority 1, and Node-A has priority 2 for this resource group. Therefore, in a cascading policy whenever Node-B rejoins the cluster, it gets the VG-2 back and takes over the functionality of the Data Links server.

– The file system containing the home directory of the DLFM’s local DB2 instance user (which by default is dlfm)

– The file system containing DLFM’s meta data database (which, by default, is DLFM_DB), if the DLFM_DB database is created in a different file system from the default one (under the home directory of the dlfm instance user)

– The file system containing the dlfm_backup directory if the local disk backup option is used

– All the dlfs file systems

• Each node should have two network adapters: One is active and the other is standby.

– In normal operation mode, the active adapter on Node-A is configured with the host DB2 service IP address (to which the DB2 client applications connect) and the standby adapter carries the boot IP address (IP-B1). In case of a Node-B failure, the Cluster Manager fails over the IP address and hardware address of the active adapter of Node-B to the standby adapter of Node-A.

– In normal operation mode, the active adapter on Node-B is configured with the Data Links File Manager service IP address (to which the host DB2 server connects) and the standby adapter carries the boot IP address (IP-B2). In case of a failure of Node-A, the Cluster Manager fails over the IP and hardware addresses of the active adapter of Node-A to the standby adapter of Node-B.

– The active adapters on both nodes should be configured with two network addresses: one with a boot address and the other with a service address. The boot address is used when a failed node reboots. Then, the Cluster Manager revokes the service address from the standby node and assigns it to the boot address. Therefore, the boot address is needed to avoid the network address conflict during the startup of a failed node.

• Both the host DB2 server software and the Data Links server software need to be installed and set up on both nodes (Node-A and Node-B) in the same way as described in 7.2.2, “Hot standby setup for a Data Links server” on page 135. The sample scripts can also be used by the HACMP Cluster Manager to start and stop the host DB2 server and Data Links server on the nodes during a failover.

• A certain number of files must be the same on both nodes (Node-A and Node-B):

– /var/db2/v71/default.env
– /var/db2/v71/default.profiles.reg
– /usr/lpp/db2_07_01/cfg/dlfs_cfg
– /usr/lpp/db2_07_01
– /etc/vfs
– /etc/rc.dlfs

Figure 7-3 shows the default.env file and the profiles.reg file. The default.env file contains the DB2 global variables for the node. The profiles.reg file contains all of the registered instances on the node. They must reflect the information for both nodes.
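One way to check that these files match is to compare checksum listings taken on each node. The following is a minimal sketch under stated assumptions: the function name diff_cksums is our own, and each input file is assumed to hold the output of the standard cksum command (checksum, size, path) collected on one node for paths that contain no spaces.

```shell
# diff_cksums FILE_A FILE_B
# FILE_A and FILE_B hold `cksum` output (checksum, size, path) collected on
# Node-A and Node-B. Print every path whose checksum or size differs, or that
# appears in only one listing. Returns non-zero if any path is printed.
# The function name and the way the listings are collected are illustrative.
diff_cksums() {
    awk -v fa="$1" '
        BEGIN {
            # Load the Node-A listing: path -> "checksum size".
            while ((getline line < fa) > 0) {
                split(line, f, " ")
                a[f[3]] = f[1] " " f[2]
            }
        }
        {
            # Compare each Node-B entry with the Node-A entry for the same path.
            seen[$3] = 1
            if (!($3 in a) || a[$3] != ($1 " " $2)) { print $3; bad = 1 }
        }
        END {
            # Report paths that were listed on Node-A only.
            for (p in a) if (!(p in seen)) { print p; bad = 1 }
            exit bad ? 1 : 0
        }
    ' "$2"
}
```

For example, the listing could be produced on each node with a command such as cksum /etc/vfs /etc/rc.dlfs /usr/lpp/db2_07_01/cfg/dlfs_cfg redirected to a file, and the two files then copied to one node for comparison.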

Important: When the host DB2 server Node-A goes out of cluster, the cluster manager starting the service on Node-B should set the hostname of Node-B as Node-A. This is a requirement for Data Links. Also since DNS or the /etc/hosts do not change the association between Node-B and its Service IP address, all the network requests to Node-B (despite its hostname change to Node-A) go to Node-B. Therefore, Data Links File Manager Service on Node-B is not affected by this hostname change.


Figure 7-3 The /var/db2 files show the global variables and instances

The dlfs_cfg file must reside on both nodes. The parameters of this file are explained in the Release Notes for Version 7.2/Version 7.1 FixPak 3 in the section “Minimize Logging for Data Links File System Filter (DLFF)”. This file must be the same on both nodes and does not fail over. Figure 7-4 shows the contents of the file.

Figure 7-4 The dlfs_cfg file must exist on both servers

Figure 7-5 shows the /etc/vfs file. It is required on both nodes so that the DLFF can be loaded by the strload command.


Figure 7-5 The contents of /etc/vfs

7.3.2 Sequence of events

Both instances, db2inst1 and dlfm, must be able to run simultaneously on either node. Resource Volume Group 1 contains all of the file systems for db2inst1, the instance home directories, and database directories. Resource Volume Group 2 contains the dlfm instance home directory, the DLFM_DB database, the dlfm_backup directory, and all of the dlfs file systems. The Resource Groups are the resources that will fail over. For more information on Resource Groups and HA, refer to HACMP Installation Guide, SC23-4278. Figure 7-2 on page 137 shows the Resource Groups.

When the failover is from Node-A (DB2 UDB) to Node-B (DLFM), you must perform the following steps:

1. On Node-A, run rc.db2server.dls with the stop option. This shuts down the DB2 instances on Node-A, using db2stop force, db2_kill, and killall.

2. Shut down the DB2 admin server if it exists.

3. On Node-B, mount the db2inst1 home directory, logs, and table data file systems (VG-1). This is part of the HA configuration.

4. On Node-B, run rc.db2server.dls with the start option. This sets uname and hostname to the Node-A hostname and starts the DB2 instance on Node-B.


When the failover is from Node-B (DLFM) to Node-A (DB2 UDB), you must perform the following process:

1. On Node-B, run rc.db2dls with the stop option. This shuts down the DLFM on Node-B using dlfm stop and dlfm shutdown.

2. Unmount the dlfs file systems.

3. On Node-A, mount the DLFM home directory, logs, table data, and archive file systems (VG-2). This is part of the HA configuration.

4. On Node-A, run rc.db2dls with the start option. It calls /etc/rc.dlfs, which loads the DLFF and mounts the Data Links files, runs dlfm shutdown, and runs dlfm start.

5. Force applications off the DB2 instance (if this is not done, you may see an SQL0357 error message with return codes rc=5 or rc=1).

7.4 The scripts

The sample scripts are located in the /instance home/sqllib/samples/hacmp directory. The sample scripts can be modified to make them more robust in terms of checking the required environment present on the node before starting the services on the node.

These configuration scripts should ensure that the following conditions are met whether in normal or failover state:

• Data Links file systems are mounted properly with the correct options/characteristics.

• DLFM processes are running.

• DB2 processes are running for all the applicable DB2 instances.

• The DB2 server's hostname is established as the hostname of whichever node is currently running the DB2 processes.
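Two of these conditions can be checked with small helper functions such as the following. This is a sketch only and is not part of the shipped samples: the function names has_process and host_is are our own, and it is an assumption that the DLFM daemons appear in the process list under names that begin with dlfm_.

```shell
# Hypothetical helper checks for a customized start script.

# has_process PREFIX
# Read command names (one per line) on standard input and succeed if at least
# one of them starts with PREFIX.
has_process() {
    awk -v p="$1" 'index($1, p) == 1 { found = 1 } END { exit found ? 0 : 1 }'
}

# host_is EXPECTED ACTUAL
# Succeed if the short form of ACTUAL (the text before the first dot) equals
# EXPECTED.
host_is() {
    [ "${2%%.*}" = "$1" ]
}
```

A customized script might call them as ps -e -o comm= | has_process dlfm_ to test for DLFM processes, and as host_is Node-A "$(hostname)" to test the hostname, and log a message or exit with a non-zero status when either check fails.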

The sample script rc.db2dls (Example 7-1) either stops or starts the Data Links File Manager. When it stops DLFM, it also unmounts the Data Links file systems. When it starts DLFM, it runs /etc/rc.dlfs, which loads the Data Links File System Filter and mounts the Data Links file systems. It is called by the HACMP Cluster Manager.

Disclaimer: Sample scripts provided with the Data Links product for HA support should not be used as-is, but need to be modified to reflect the appropriate customer environment (user names, node names, etc.).


Example 7-1 rc.db2dls sample script

#!/bin/ksh
#
# Licensed Materials - Property of IBM
#
# (C) COPYRIGHT International Business Machines Corp. 1990,1997
# All Rights Reserved
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
#
###################################################################################
#
# Name: rc.db2dls
#
# Description: Sample script to Start/Stop the Data Links File Manager Server.
#
# Arguments: $1 - instance: dlfm instance user (default "dlfm")
#            $2 - status: Either start or stop
#
# Returns: 0 success
#
###################################################################################

#
# Initialization of variables etc.
#
DB2user=$1
parm2=$2
typeset -u parm2          # fold the start/stop option to uppercase
HOST=`/bin/hostname -s`
PROGID=`echo $0 | sed 's%/usr/bin/%%g'`
lnndir=`lsuser -c -a home $DB2user | awk -F":" '!/#/ { print $2}'`
echo "\n`date`"

#
# STOP the Datalinks Manager and Unload the DLFS
#
if [[ "$parm2" = "STOP" ]]
then
    echo "$PROGID - $HOST: Going to stop db2 "
    date
    # The command is quoted so that its argument reaches dlfm, not the shell
    su - $DB2user -c "dlfm stop"
    su - $DB2user -c "dlfm shutdown"
    sleep 60
    #
    # Unmount your datalinks file systems and Unload the DLFS kernel extension
    # umount /dlfsmountpoint(s)

    # Exit
    exit 0
fi

#
# START the Datalinks Manager and Load the DLFS
#
if [[ "$parm2" = "START" ]]
then
    echo "$PROGID - $HOST: Starting db2 "
    #
    # Load the DLFS kernel extension. Unmount and Mount all the dlfs file systems.
    # Execute dlfmfsmd for each dlfs mount point. It will create/update /etc/rc.dlfs file.
    # /dlfm-home/sqllib/int/instance/dlfmfsmd /dlfsmountpoint(s)
    #
    # Execute the rc.dlfs file created by /dlfm-home/sqllib/int/instance/dlfmfsmd
    #
    /etc/rc.dlfs

    #
    # Shutdown and Restart the DLFM server.
    #
    su - $DB2user -c "dlfm shutdown"
    su - $DB2user -c "dlfm start"

    exit 0
else
    echo "$PROGID ERROR:: rc.db2dls $*"
    echo "$PROGID SYNTAX:: rc.db2dls [DB2_USER] [ start | stop ]"
    exit 1
fi

-----------------------------------------------------------------------------------------------

The sample script rc.db2server.dls (Example 7-2) stops or starts DB2 and sets the uname and hostname. The uname and hostname must be set to the hostname that is registered to DLFM with the dlfm add_db command. It is called by the HACMP Cluster Manager.


Example 7-2 rc.db2server.dls sample script

#!/bin/ksh
#
# Licensed Materials - Property of IBM
#
# (C) COPYRIGHT International Business Machines Corp. 1990,1997
# All Rights Reserved
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
#
##########################################################################
#
# Name: rc.db2server.dls
#
# Description: Script to Start/Stop the Host DB2 Server HACMP Cluster manager.
#
# Arguments: $1 - db2user: is the user of the db2 instance
#            $2 - parm2: [start | stop] : Start or Stop option.
#            $3 - parm3: [standby | active] : This is to indicate the node on which
#                 the script is running is active or standby node for DB2 Server
#
# Returns: 0 success
#
##########################################################################
# Initialization of variables etc.
# Change the Service_Host and Standby_Host with actual names,
Service_Host=Node-A     # Active Node for DB2 Server
Standby_Host=Node-B     # Standby node for DB2 Server

DB2user=$1
parm2=$2
parm3=$3

typeset -u parm2          # fold the start/stop option to uppercase
HOST=`/bin/hostname -s`
PROGID=`echo $0 | sed 's%/usr/bin/%%g'`
lnndir=`lsuser -c -a home $DB2user | awk -F":" '!/#/ { print $2}'`
echo "\n`date`"

#
# Stop the DB2 instance.
#
if [[ "$parm2" = "STOP" ]]
then
    echo "$PROGID - $HOST: Going to stop db2 "
    date
    # The command is quoted so that "force" reaches db2stop, not the shell
    su - $DB2user -c "$lnndir/sqllib/adm/db2stop force"
    date
    su - $DB2user -c $lnndir/sqllib/bin/db2_kill
    sleep 15
    su - $DB2user -c killall
    #
    # Set the uname and hostname back to the Standby_Host. Actually this must be done only
    # when script is run on Standby node to stop the DB2 server on Standby node.
    #
    if [[ "$parm3" = "standby" ]]
    then
        uname -S $Standby_Host
        hostname $Standby_Host
    fi

    # Exit
    exit 0
else
    #
    # Start the DB2 Instance.
    #
    if [[ "$parm2" = "START" ]]
    then
        #
        # Set the uname and hostname as DB2 Server's active node. Actually this setting of
        # hostname needs to be done only when script is run on Standby node during failover.
        #
        uname -S $Service_Host
        hostname $Service_Host
        date
        echo "$PROGID - $HOST: Starting db2 "
        su - $DB2user -c $lnndir/sqllib/adm/db2start
        exit 0
    else
        echo "$PROGID ERROR:: rc.db2server.dls $*"
        echo "$PROGID SYNTAX:: rc.db2server.dls [DB2_USER] [ start | stop ] [ standby | active ]"
        exit 1
    fi
fi

7.4.1 Additional considerations for DB2 Universal Database Version 6

If failover is being configured using DB2 Universal Database Version 6, there is an extra task that you need to perform. The files listed from the following command on the DLFM server must be copied onto the DB2 server:

ls -l /usr/lpp/db2_06_01/bin/dlfm_*


The permissions and links must be identical. This is not a problem on DB2 Universal Database Version 5 or Version 7 because the files are in the /dlfm instance/sqllib/adm directory and this fails over. Figure 7-6 shows the output of the ls command.

Figure 7-6 List of dlfm_ programs

7.4.2 Final considerations

We chose to remove /etc/rc.db2 from the /etc/inittab file because the cluster controls the startup of the system. We also found that we had to modify the dlfs file systems by using the /usr/lpp/db2_0n/0n/instance/dlfmfsmd script after the failover from the DLFM node to the DB2 node. The cluster would return these file systems to the jfs file system type.

To initially set up the DLFM environment for failover on Node-A, you can:

• Create a temporary file system for the dlfm instance /home/dlfm.

• Run the db2setup program to create the dlfm instance. This modifies /usr/lpp/db2_07_01/, /var/db2, and /etc/vfs.

• Unload the DLFF driver (DLFSDRV):

strload -uf /usr/lpp/db2_07_01/cfg/dlfs_cfg

Note: If a fixpack is installed for DB2 Universal Database Version 6, the fixpack installation process updates the /usr/lpp/db2_06_01/bin/dlfm_* files on the DLFM node, but not on the DB2 node. The files listed in Figure 7-6 need to be manually updated for failover to be successful after a fixpack upgrade.


• Unmount and delete the temporary file system. This removes the /home/dlfm/sqllib, /home/dlfm/dlfm_backup, and /home/dlfm directories.

Attention: We recommend that a skilled System Administrator be highly involved when setting up the failover environment. Extensive testing must be conducted to ensure all of the failover scenarios function correctly.


Chapter 8. Creating a new database

This chapter describes a method to create a new DB2 database from an existing DB2 database that uses Data Links. The export, import, load, dlfm_export, and dlfm_import utilities are discussed, as well as other administration commands for Data Links. The examples are from an AIX environment. The process we follow is similar for Windows NT and Sun Solaris with some minor differences. The dlfm_export and dlfm_import differences are documented in Chapter 5 “Moving Data Links Manager Data” in Data Movement Utilities Guide and Reference, SC09-2955.


8.1 Overview

Creating a new database from an existing database is a task that Database Administrators (DBAs) are continually asked to do. The request can arise for a number of reasons. For example, a test database is required that looks just like production, or a customer requires yesterday's backup to be restored to a new database so that the data can be reviewed. This task is relatively simple when Data Links is not involved. When Data Links is involved, the DLFM_DB database, which contains the Data Links database information, must be rebuilt. We show how to create the new test system using our existing host machine. We create a new instance and restore a backup to the new database. We use the existing Data Links File Manager and add a new file system to store the linked values. We use the following steps to create our new database with Data Links:

1. Run the Backup utility on the database.

2. Run the Export utility on the Data Links table data.

3. Capture the table DDL using db2look.

4. Run Restore on the database with the database manager configuration parameter DATALINKS set to NO.

5. Drop and recreate the table.

6. Copy files to be linked to a new directory on the DLFM server.

7. On the DLFM server, run dlfm add_db and dlfm add_prefix.

8. Run Import or Load to move the data into the table.

Figure 8-1 illustrates this process.


Figure 8-1 The steps used to create the new database

8.2 Backup

An offline backup is taken from a database that contains a table with the DATALINK data type. If an online backup is used, a copy of the logs must be available for the rollforward in the new database. The backup is used to restore the database and all of its contents to a new database. Figure 8-2 shows the backup command. This method is useful when you want to migrate the environment onto new hardware. For example, you are currently using DB2 Universal Database Version 5.2 and you want to migrate this environment to a Version 7.x environment on a new file system or machine that does not have DB2 Universal Database or DLFM installed.

[Figure 8-1 labels: DB2 databases dltest and dlrestor, each with a row holding the DATALINK value http://fileserv1/images/p10.bmp; DLFM server with /dldata/dldata2; backup file; export file export_resident.del; table DDL resident.ddl. Numbered steps: 1 db2 backup, 2 db2 export, 3 db2look, 4 db2 restore, 5 db2 drop table and create table, 6 copy datalink files, 7 dlfm add_db and dlfm add_prefix, 8 db2 import.]

Figure 8-2 Backup database command

8.3 EXPORT (dlfm_export)

We illustrate two types of exports: an export to an integrated exchange format (IXF) file and an export to a delimited ASCII format (DEL) file. We use the IXF file type if the Data Link files are moved to a new host machine and the path of the Data Link files does not need to change. The steps for exporting are:

1. Run the quiesce command to ensure that you have a consistent copy of the table and the corresponding files when you run the export command.

2. Run the export command to create an IXF file and a control file.

3. Copy the control file to the Data Links server.

4. Run dlfm_export on the Data Links server. It creates a tar file that contains the Data Linked files that will be needed for the import.

Figure 8-3 illustrates the quiesce command and an export command to an IXF file type.

Figure 8-3 Quiesce and export to the IXF file type

The export to an IXF file creates a control file called ununbium.almaden.ibm.com. The control file is used as input for the dlfm_export command. The control file is placed in /tmp/dlfm, and the contents are shown in Figure 8-4.


Figure 8-4 Contents of the export control file

The dlfm_export command is run using the control file created by export as its input. The command creates a tar file that contains all of the files that are listed in the control file. The command and its output are shown in Figure 8-5.

Figure 8-5 Sample dlfm_export

For our example in Figure 8-1, we use an existing Data Links File Manager, so we have to change the path of the Data Linked files to avoid duplicates. To change the path names, we must edit the export file. We therefore use the DEL file type, which is easier to edit than an IXF file. The following process is used:

1. Run export on the Data Link data to a file of type DEL.

2. Edit the file, changing the file system name from /dldata to /dldata2.

Figure 8-6 displays the export command using a DEL file.


Figure 8-6 Export using delimited output

Figure 8-7 displays the delimited export file before and after it was edited. The file system was changed from /dldata to /dldata2 because the files in /dldata already exist.

Figure 8-7 Delimited file before and after editing
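The path change can also be scripted rather than made by hand. The following is a minimal sketch, not the procedure we followed: it assumes a POSIX shell with sed, and that the string /dldata/ appears only inside the DATALINK URL values of the export file. The function name rewrite_prefix and the sample row are hypothetical; the real file layout is the one shown in Figure 8-7.

```shell
#!/bin/sh
# Hypothetical helper (not part of DB2): rewrite the DLFS prefix in a
# delimited (DEL) export file so that DATALINK URLs point at /dldata2.
# Assumption: "/dldata/" occurs only inside the DATALINK URL values.
rewrite_prefix() {
    # $1 = exported DEL file, $2 = edited copy to create
    sed 's|/dldata/|/dldata2/|g' "$1" > "$2"
}

# Illustration with a made-up row.
workdir=$(mktemp -d)
printf '%s\n' '1,"http://fileserv1/dldata/sys_pics/p10.bmp"' > "$workdir/before.del"
rewrite_prefix "$workdir/before.del" "$workdir/after.del"
cat "$workdir/after.del"
rm -rf "$workdir"
```

Writing to a new file keeps the original export available for comparison. Because the pattern includes the trailing slash, running the rewrite a second time on an already edited file leaves it unchanged.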

8.4 The db2look command

The DDL for the table with DATALINK columns must be captured so that it can be run on the new database. We use db2look to capture this DDL. You can find help for db2look by typing db2look -h. Figure 8-8 shows db2look and the DDL it created.


Figure 8-8 The db2look command and the output it produced

8.5 The restore command

When we restore the database to the new instance, we set the database manager configuration parameter DATALINKS=NO. This lets us easily drop and recreate the table with the DATALINK column. If we leave the DATALINKS parameter set to YES, the table is put in the Datalinks_Reconcile_Not_Possible (DRNP) state, which is difficult to clear up. The following steps pertain to the restore command:

1. Use the restore command to create the new database.

2. Make sure the database manager configuration parameter is set to DATALINKS=NO.

3. Drop and recreate the tables containing the DATALINK data type using the output from db2look in Figure 8-8.

4. Update the database manager configuration using DATALINKS=YES.

5. Stop and start DB2.

6. Use the list datalinks managers command to see if the Datalink Manager name and port number are correct. If the name or port number is incorrect, use drop datalinks manager, and add datalinks manager to correct the configuration.


Figure 8-9 shows the restore, get dbm cfg and list datalinks managers commands.

Figure 8-9 Restore command, get dbm cfg, and list datalinks managers

8.6 Copying the linked files

The dlfm_import utility can be used to extract the files from the archive created by the dlfm_export utility. The name of the archive is dlfm_files.tar. Run dlfm_import as root. The dlfm_import command extracts the files into the same directory name from which they were copied.

A sample dlfm_import is shown in Figure 8-10.

Note: Use caution with dlfm_import. If it is run on the same host as dlfm_export, the existing Data Link files will be overwritten.

Note: If dlfm_import is run on a different host, there is a way to make dlfm_import extract the files from the archive into a different path name. Symbolic links can be set up to do this. For our example, we could have used:

ln -s /dldata2 /dldata

The files would then be extracted into /dldata2 using the link /dldata.


Figure 8-10 Sample dlfm_import

For the example in Figure 8-1, we chose not to use dlfm_import. It would not allow us to change the path name for the Data Linked files because they are absolute path names in the tar file. We simply copied, as root, the files from one directory to another and changed the ownership to something other than dlfm.

cp /dldata/sys_pics/* /dldata2/sys_pics/
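Because the copy is done by hand, it can be worth confirming that every file arrived intact before it is linked. The following is a minimal sketch, assuming a POSIX shell; the function name copy_and_verify is hypothetical, and the change of ownership described above is left out because the owner depends on your site.

```shell
#!/bin/sh
# Hypothetical helper: copy every regular file from one directory to
# another and verify each copy byte for byte with cmp.
# Changing ownership (done as root in our scenario) is not shown here.
copy_and_verify() {
    # $1 = source directory, $2 = target directory (must already exist)
    for f in "$1"/*; do
        [ -f "$f" ] || continue          # skip subdirectories and empty globs
        cp "$f" "$2"/ || return 1
        cmp -s "$f" "$2/$(basename "$f")" || {
            echo "mismatch after copy: $f" >&2
            return 1
        }
    done
}

# Example call with the directories used in this chapter:
# copy_and_verify /dldata/sys_pics /dldata2/sys_pics
```

The function stops at the first file that fails to copy or compare, so a nonzero return status means the target directory should not be used as is.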

8.7 DLFM commands

We register our new instance and database with DLFM by using the dlfm add_db command. The dlfm add_prefix command is used to register our new dlfs file system with DLFM. Figure 8-11 shows the commands.

Figure 8-11 The dlfm add_db and dlfm add_prefix commands

8.8 Running the Import utility

The files have been copied and are now in a dlfs file system. The next step is to populate the table that contains the DATALINK column type. The Import utility can be used if the amount of data is not too large. The Import utility allows you to globally change the host name for the Data Link files by using the dl_url_replace_prefix clause. You can also use the dl_url_suffix clause, which appends the value associated with the clause to the path component of the URL part of the DATALINK value. For detailed information on the dl_url_suffix and dl_url_replace_prefix clauses, refer to DB2 UDB Command Reference, SC09-2951. In our sample, we modified the import file and changed the directory name from /dldata to /dldata2. Figure 8-12 shows a delimited file import.

Figure 8-12 Import delimited file with DATALINK column type

8.9 Running the Load utility

The Load utility can be used instead of the Import utility to populate the table with DATALINK columns. The Load utility should be used for a large number of rows. Another advantage of Load is that it does minimal logging.

There are some options that are not supported when loading a table with Data Links:

• The COPY option

• The REPLACE option

• The TERMINATE option

• The CPU PARALLELISM option (the value is forced to 1)

• The NONRECOVERABLE option should not be used when the DATALINK column is defined with FILE LINK CONTROL

Note: When loading a large number of rows that contain DATALINK values, you will not be able to back up the database until the asynchronous copies of each file have completed on the DLFM server. This only occurs when the corresponding DATALINK columns are defined with the RECOVERY YES option. The asynchronous copy operations can be a lengthy process.

A sample load command is shown in Figure 8-13.

Figure 8-13 The Load utility


Chapter 9. Data replication

This chapter discusses how the DB2 data replication feature, otherwise known as DataPropagator Relational (DPropR), can be used to copy externally managed files from one location to another. Because it is necessary to understand the basics of replication in order to understand replication of DATALINK columns and their linked files, much of the material covered in this chapter is not limited to Data Links. This chapter explores many of the components of DPropR, how they interact, and how to use the DB2 Control Center to set up a replication environment involving externally managed files.


9.1 Overview of DB2 replication

Replication is the process of automatically maintaining one or more copies of data so that it is kept synchronized with the original source data. As data is created, updated, or deleted at the source, the copy is also changed.

DB2 has long supported the replication of traditional data types. The ability to replicate DATALINK columns was introduced in DB2 DataPropagator Version 7 and is included with DB2 Universal Database Version 7. The replication of the external files, which are controlled by the DB2 Data Links File Manager (DLFM), is not done by the DataPropagator product itself, but by a user exit routine invoked by the Apply program (a component of DPropR).

Replication Guide and Reference Version 7, SC26-9920, describes in detail all of the components of replication and how to plan, set up, and administer a replication environment. This chapter is not intended to replace Replication Guide and Reference Version 7, SC26-9920, but to supplement it by discussing items related to Data Links.

Data replication can be configured in many different ways. The simplest configuration to implement is known as data distribution, where updates to a source table are replicated to one or more read-only target tables. Data distribution uses most of the available features of replication and is the configuration discussed in this chapter. Replication of DATALINK values and their associated files can also be accomplished when using other replication configurations, although these configurations are not discussed in detail in this chapter. For an in-depth discussion of all possible replication configurations, refer to Replication Guide and Reference Version 7, SC26-9920.

Here are some of the supported replication configurations:

• Data consolidation: Data from multiple source tables is replicated to a common target table.

• Update anywhere: Target tables are read/write. Replication is bidirectional, that is, changes to a target table are also replicated to the source table. Note that conflict detection is not supported for DATALINK columns.

• Occasionally connected: Data from a primary source is copied to a target table on demand.

9.2 Why replicate linked files

The primary benefit of replicating files is improved performance when accessing those files from multiple remote sites.

Suppose your company stores engineering drawings in files that are managed by Data Links, and the DB2 database and the Data Links server are located in Los Angeles. Engineers in Los Angeles can access the linked files quickly. Delays caused by moving the file data over the network to the engineers' workstations are minimal, because the physical distance from the Data Links server to the workstations is small. Engineers in New York, on the other hand, experience significant delays when accessing the same files, because the file data must travel long distances over the network. Performance for the New Yorkers could be improved if the files were replicated to a server located in New York. The closer users are to where the file physically resides, the less time they have to wait for the file to be transferred over the network. With replication, when a file is created or changed at the source, it is copied to the target server asynchronously. By the time the end user needs to access the new file, it already resides on the server local to that user.

9.3 Supported platforms

Tables with DATALINK columns can be replicated between DB2 databases on the following operating systems:

• AIX

• OS/400 (AS/400)

• Solaris

• Windows NT

This chapter discusses replication with both the source and target databases using DB2 on AIX. Although DPropR supports replication between many other platforms, remember that DATALINK columns cannot be replicated to platforms that do not support them. Here is a list of current restrictions:

• DATALINK columns cannot be replicated between DB2 databases on iSeries and AS/400 and DB2 databases on other platforms.

• Replication of the “COMMENT” attribute of a DATALINK value is not supported on the iSeries platform.

• Update-anywhere replication with DATALINK columns must use a conflict-detection level of NONE. DB2 does not check update conflicts for external files pointed to by DATALINK columns.

• DB2 always replicates the most current version of an external file pointed to by a DATALINK column.

• Target tables that are base-aggregate or change-aggregate tables do not support DATALINK columns.

9.4 Replication components

DataPropagator performs two primary functions:

• Collecting data that has been created or changed

• Copying that data to one or more target servers

These processes are commonly referred to as Change-capture and Apply. The DB2 Control Center can also be considered a component of replication because it can be used to set up and administer the replication environment. In 9.6, “Implementing replication with Data Links”, we use it to step through the tasks needed to set up replication.

An alternative to the DB2 Control Center is the DB2 DataJoiner Replication Administration (DJRA) tool. DJRA is required to set up and administer replication on many of the non-IBM databases. See Figure 9-1 for a list of platforms that require DJRA. You can find instructions for installing DJRA and using it to set up and administer a replication environment in Replication Guide and Reference Version 7, SC26-9920.

DataPropagator stores information about what data to capture and what data to replicate in a set of tables called the control tables. Some of the control tables are used by the change-capture process and some are used by the Apply process.

All of the replication components reside on a logical server. The term “server” as used here refers to a database rather than a physical machine. There are three types of logical servers:

• Source server: The database where the source tables and the control tables used by the Capture program reside.

• Target server: The database where the target tables reside.

• Control server: The database where the control tables used by the Apply program reside.

9.4.1 Change-capture

Change-capture (Figure 9-1) is the process of collecting data as it is created or modified and storing it for later retrieval by the Apply process. The table whose data is to be captured is called the replication source. Whenever a transaction issues an INSERT, UPDATE, or DELETE statement, the affected data is copied by the Capture program (or a Capture trigger on non-IBM databases) to a control table called the change-data (CD) table. Data is written to the CD table before it is committed. Information about which transactions have been committed is stored in another control table called the unit-of-work (UOW) table. The Apply program joins the UOW table and the CD table to ensure that only committed changes are replicated.

The Capture program runs on the same machine as the replication source. It collects the data to be replicated by reading the database log file and then copies that data to the CD table. The Capture program runs continuously on the source server and is normally started immediately after DB2 is started.

Figure 9-1 Change Capture

When you use the DB2 Control Center to define the replication source, the CD table is automatically created. Each table that is defined as a replication source has its own CD table.

When the replication source table contains a DATALINK column, the DATALINK URL is stored in the CD table, but the referenced file is not. The URL is used by an exit routine called by the Apply process, and the referenced file is copied at that time.

You may not want to capture all of the columns of your replication source table. When you define the replication source, you can restrict which columns are captured. This allows you to capture only the data that you want to be copied to the target table(s). Figure 9-2 shows the SOURCE.DEPARTMENT table in the SAMPLE database being defined as a replication source. Notice that we have chosen to capture only the data in the DEPTNO and DEPTNAME columns. The data in the MGRNO, ADMRDEPT, and LOCATION columns is not captured.

[Figure 9-1 labels: source database containing the replication source table and the change-data table; SQL INSERT, UPDATE, or DELETE; DB2 log; Capture program.]

Figure 9-2 Defining a replication source

9.4.2 Apply

Figure 9-3 shows the actions performed by the Apply program. The Apply program populates the target tables by either reading data directly from the source table (for an initial load or full refresh of the target table), or by reading the CD table (1). In most cases, the Apply program runs at the target server, but it can be run from any machine on the network that has access to the source server, the target server, and the control server. The data read by the Apply program is stored in a spill file (2) and then copied to the target table (3 and 4).

Note: One of the databases we use in our examples in this chapter is the SAMPLE database, which was created by a user ID of source. The tables in the SAMPLE database, therefore, have a schema name of SOURCE.


Figure 9-3 Apply program data flow

9.4.3 Subscription sets and subscription set members

The mapping of the replication source data to the target table, as well as many of the parameters governing how the data is to be replicated, are defined by subscription sets and subscription set members. A subscription set is used to define the source server, the target server, the frequency of replication, and so on. Subscription sets are processed in a single transaction. This assures that changes are applied to all of the target tables defined in all of the subscription set members or to none of them.

A subscription set member is used to identify the source table and target table, which columns are to be replicated, and, through the use of SQL predicates, which rows are to be replicated. Figure 9-4 shows the relationship between subscription sets and subscription set members.

[Figure 9-3 labels: source database containing the replication source table and the change-data table; Apply spill file; target database containing the target table; numbered flows 1 to 4.]

Figure 9-4 Subscription set and subscription set members

9.5 Data Links replication

Replication of DATALINK columns and their associated files requires a bit more work than replication of traditional data types. If you think about what the value contained in a DATALINK column represents, it is easy to see that you cannot just copy that value to another table. A DATALINK column contains the protocol used to access the file (HTTP or UNC), the name of the server where the file resides, and a fully qualified pathname to the file. If you replicate the column and its associated file to another server, the new DATALINK column needs to point to the target server and new pathname of the replicated file. This is illustrated in Figure 9-5.

Figure 9-5 DATALINK values before and after replication

Fortunately, DPropR provides a mechanism for changing the DATALINK value at the target server to point to the newly replicated file.

9.5.1 Capturing DATALINK values

The Capture program treats DATALINK columns no differently than any other data type. The DATALINK value is read from the log and written to the CD table, along with the data from other columns that were selected to be part of the replication source. Note that the file referenced by the DATALINK value is not copied to the CD table. Remember that the Capture program reads the log file to collect the change data, and the contents of the linked files are not written to the log.

9.5.2 How Apply handles DATALINK values

The Apply program has two additional tasks to perform when dealing with DATALINK values. First, it must map the source file reference into the target file system. This might mean changing the server-name portion of the DATALINK value, as well as the pathname and filename so that they point to the correct location on the target server. Second, the source file must be copied to the target file system. Both of these functions are performed by a user exit program, ASNDLCOPY, which is called by the Apply program.

[Figure 9-5 labels: source server with the source database, the replication source table, and file1, DATALINK value HTTP:source_server.ibm.com/source_files/file1; target server with the target database, the target table, and file2, DATALINK value HTTP:target_server.ibm.com/target_files/file2.]

File-reference mapping function

The Apply program invokes a user exit program, ASNDLCOPY, to perform both the file mapping and file copy functions. Before calling the exit routine, the Apply program reads the CD table (or the replication source table during initial load) and writes DATALINK column values to a file. This file is then read by ASNDLCOPY, which transforms the original file references by applying mapping definitions (stored in a file named ASNDLSRVMAP), and the modified file references, now pointing to the target server and pathname, are written to an output file. See Figure 9-6. Section 9.6, “Implementing replication with Data Links”, examines the structure of the mapping definition file, ASNDLSRVMAP.

Figure 9-6 File reference mapping
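The idea behind the mapping step can be sketched in a few lines of POSIX shell. This is an illustration only, not the ASNDLCOPY implementation, and it does not read ASNDLSRVMAP: the function name map_reference is hypothetical, and the source and target prefixes are passed as plain arguments. The sample values are modeled on Figure 9-5, with the file name left unchanged.

```shell
#!/bin/sh
# Illustration only: the idea behind file-reference mapping, not the
# ASNDLCOPY implementation. The real mapping definitions live in
# ASNDLSRVMAP; here the prefixes are passed as plain arguments.
map_reference() {
    # $1 = original file reference
    # $2 = source prefix (server and directory)
    # $3 = target prefix that replaces it
    case "$1" in
        "$2"*) printf '%s%s\n' "$3" "${1#"$2"}" ;;   # swap the matching prefix
        *)     printf '%s\n' "$1" ;;                 # no mapping applies
    esac
}

# Values modeled on Figure 9-5 (the file name is kept unchanged here).
map_reference \
    'HTTP:source_server.ibm.com/source_files/file1' \
    'HTTP:source_server.ibm.com/source_files/' \
    'HTTP:target_server.ibm.com/target_files/'
```

A reference that does not start with the source prefix is passed through untouched, which makes an unmapped value easy to spot in the output.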

Note: The ASNDLCOPY user exit program is a sample program supplied with DB2 and resides in the sqllib/bin directory. The C language source code for the program is in sqllib/samples/repl/ASNDLCOPY.smp and can be modified to meet any unique user requirements. The prolog section of the code describes the program usage, default options, calling conventions, etc.

[Figure 9-6 labels: source database containing the replication source table and the consistent change-data table; Apply; original file references; ASNDLCOPY; mapping definitions (ASNDLSRVMAP); modified file references.]

File copy function

After the file-reference mapping has completed, the ASNDLCOPY routine copies the files from the source server to the target server. For files linked with the DATALINK option READ PERMISSION FS, the user exit program uses the FTP protocol to physically transfer the files being replicated. ASNDLCOPY reads a file containing hostnames, port numbers, user IDs, and passwords. This file is named ASNDLUSER, and its contents are discussed in 9.6.7, “Configuration files used by ASNDLCOPY”.

This information is used to establish a connection to an FTP daemon. If the DATALINK column uses the READ PERMISSION FS option (or the NO LINK CONTROL option), and both file systems are NFS-mounted, the ASNDLCOPY user exit can be customized to use the UNIX cp command to copy the files instead of FTP.

File copy function with READ PERMISSION DB

If the files to be copied have been linked using READ PERMISSION DB, the ASNDLCOPY program can still use the FTP daemon to transfer files, but the user ID used to connect to FTP must have root access. This is usually not a desirable situation because the user ID and its password must be stored in a file that is accessible to ASNDLCOPY. Most system administrators do not allow this.

There is, however, an alternative. ASNDLCOPY can use the ASNDLCOPYD copy daemon instead of the FTP daemon to copy the files. ASNDLCOPYD is a sample file transfer program similar to FTP that is included with DB2. The C language source code for ASNDLCOPYD can be found in sqllib/samples/repl and an executable resides in sqllib/bin. ASNDLCOPYD provides a subset of FTP's functions for extracting file information (like file size, modification date, and time) and for extracting the contents of a linked file. The ASNDLCOPYD daemon can read files linked with the READ PERMISSION DB option because it runs with root authority. Its advantage over the FTP daemon is that it uses two configuration files to control who can connect to it and which directories can be accessed. These configuration files are discussed in 9.6.8, “Configuration files used by ASNDLCOPYD”.

For a complete description on how to set up and use ASNDLCOPYD, refer to Replication Guide and Reference Version 7, SC26-9920, or the prolog section of the sample program ASNDLCOPYD.smp in the sqllib/samples/repl directory.
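The choices described in this section can be restated as a small decision function. The following POSIX shell sketch only summarizes the text above: the function name choose_copy_method and its argument values are hypothetical and are not ASNDLCOPY options.

```shell
#!/bin/sh
# Hypothetical summary of the copy options described in this section.
choose_copy_method() {
    # $1 = read permission of the DATALINK column: FS or DB
    # $2 = yes if both file systems are NFS-mounted, otherwise no
    case "$1" in
        FS)
            if [ "$2" = yes ]; then
                echo "ftp, or cp after customizing ASNDLCOPY"
            else
                echo "ftp"
            fi ;;
        DB)
            # FTP would need a user ID with root access; the copy daemon avoids that.
            echo "ASNDLCOPYD" ;;
        *)
            echo "unknown read permission: $1" >&2
            return 1 ;;
    esac
}

choose_copy_method FS no
choose_copy_method DB no
```

For READ PERMISSION DB the sketch reports only the copy daemon, because the FTP route requires storing a root-capable user ID and password in a file.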


9.6 Implementing replication with Data Links

Before you can begin to set up replication of DATALINK columns and their files, you need to create a source table that contains the DATALINK data type and to populate it with a few rows. We do not go into the details of how to do this, but we describe the environment that we use for the remainder of this chapter.

9.6.1 Before we begin

In this scenario, we use the Control Center to set up the replication source, replication subscription set, and subscription set member. We assume that you are at least somewhat familiar with the DB2 Control Center. We do not describe, in detail, how to navigate the Control Center. Figure 9-7 shows the table used as the replication source: SOURCE.MANAGERS. The table was created in the SAMPLE database, in a DB2 instance named SOURCE, on hostname NAPA.ALMADEN.IBM.COM.

Figure 9-7 SOURCE.MANAGERS table

Figure 9-8 shows the data we will replicate. We inserted four rows into the SOURCE.MANAGERS table. The column named PICTURE is the DATALINK column. We linked four files residing on hostname UNUNBIUM.ALMADEN.IBM.COM in the file system /dldata in the directory /source/pictures. From this, you can infer that the DLFM server managing these files is located on hostname UNUNBIUM.ALMADEN.IBM.COM and that the file system /dldata is a DLFS file system managed by that DLFM.


Figure 9-8 SOURCE.MANAGERS table contents

The replication target is also on hostname NAPA.ALMADEN.IBM.COM, in the DB2 instance named TARGET, and our target database is COPYDB. We create the target table TARGET.MGR_COPY when we create the subscription set member. COPYDB also uses the DLFM server on UNUNBIUM as its DLFM server, and the replicated files are stored in the DLFS file system /dldata under a directory called /target/photos.

Figure 9-9 shows the state of the environment before starting replication. Note that no files are linked to the TARGET.MGR_COPY table in COPYDB, and no files reside in /dldata/target/photos.

Figure 9-9 Environment before replication

[Figure content: UNUNBIUM.ALMADEN.IBM.COM holds /dldata/source/pictures (cathy.bmp, rachel.bmp, sayanna.bmp, zack.bmp) and /dldata/target/photos (empty). NAPA.ALMADEN.IBM.COM holds database SAMPLE with table SOURCE.MANAGERS and database COPYDB with table TARGET.MGR_COPY.]

9.6.2 Defining the replication source

You define the replication source using the Control Center. First, you expand the object tree to see the list of tables in the SAMPLE database. Right-click the SOURCE.MANAGERS table, select Define as Replication Source, and select Custom. See Figure 9-10.

Figure 9-10 Defining a replication source

You are next presented with a dialog that allows you to define which columns will be replicated. In this example, we use all of the columns of the SOURCE.MANAGERS table, so we leave all of the columns selected and do not make any other changes. See Figure 9-11.

Figure 9-11 Selecting columns to be replicated

When you click OK, you are given the option to run the generated SQL or save it to a file. Here, save the SQL so you can examine it, and later run it to define the replication source. We save it on the C: drive in a directory called scripts as a file named replsrc.sql. See Figure 9-12.

Figure 9-12 Saving the replication source definition

If you look at the generated SQL (Figure 9-13), you see that the SOURCE.MANAGERS table is altered to capture changes. Next you see that the change data table was created and given a system generated name of RMRES2.CD20010514922768. The table owner is RMRES2 because that was the user ID under which we ran the DB2 Control Center when we generated the SQL.

Figure 9-13 SQL to define the replication source

Now you are ready to run the generated SQL and actually define the replication source. You do this by going back to the Control Center object tree, right-clicking Replication Sources, and selecting Run SQL Files (see Figure 9-14).

Figure 9-14 Defining the replication source by running an SQL file

After you locate and run the replsrc.sql file, the replication source is defined. You can verify this by clicking Replication Sources in the Control Center object tree. Figure 9-15 shows the replication source table SOURCE.MANAGERS.

Figure 9-15 Viewing the replication source

The first time that a replication source is defined for a database, the replication control tables are created. These tables have a schema name of ASN and table names beginning with IBMSNAP_. The generated SQL contains several INSERT statements that populate the replication control tables with information about the replication source you are defining. You are now ready to define the replication subscription set and subscription set member.

Important: If you are using DB2 Control Center Version 7.2.0.0 or earlier to define the replication source with a DATALINK column, a known defect in the Control Center causes it to incorrectly define the attributes of the column in the change data table that holds the DATALINK URL. This defect is resolved with APAR IY19523 (this APAR is scheduled to be included in DB2 Universal Database V7 Fixpak 4).

Therefore, it is necessary to change the definition of the CD table before you run the saved SQL file. The CREATE TABLE statement near the beginning of the file incorrectly defines the column that will capture the DATALINK URL as CHAR(254). You need to change this to VARCHAR(261). The column is defined correctly when using Data Joiner Replication Administration (DJRA).
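If you prefer to script this edit instead of changing the saved file by hand, the following Python sketch shows one way to do it. It is only an illustration and not part of DB2: the function name, the sample CREATE TABLE text, and the ID column definition are assumptions, and the real statement generated by the Control Center will differ in detail. Review the patched file before you run it.

```python
import re


def fix_datalink_cd_column(sql_text: str, column: str) -> str:
    """Redefine `column` from CHAR(254) to VARCHAR(261) in the saved SQL text.

    Hypothetical helper: it only rewrites the named column, so other
    CHAR columns in the change data table are left untouched.
    """
    pattern = re.compile(
        r'(?<![A-Za-z0-9_])("?%s"?\s+)CHAR\s*\(\s*254\s*\)' % re.escape(column),
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: m.group(1) + "VARCHAR(261)", sql_text)


# Assumed excerpt of replsrc.sql; only the CD table name comes from our scenario.
sample = (
    "CREATE TABLE RMRES2.CD20010514922768 "
    "(ID CHAR(6) NOT NULL, PICTURE CHAR(254));"
)
patched = fix_datalink_cd_column(sample, "PICTURE")
print(patched)
```

The column name is passed explicitly so that a CHAR(254) column that does not hold a DATALINK URL is never changed by accident.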

9.6.3 Defining the subscription set and subscription set member

Once a replication source is defined, you can define where you want the captured data to be copied. Do this by defining a replication subscription in the Control Center by right-clicking the name of the replication source and selecting Define Subscription. See Figure 9-16.

Figure 9-16 Defining the replication subscription

In Figure 9-17, you see that we have given our subscription a name of “MGRSUB”, and that the target server is COPYDB. We also gave the Apply qualifier the name “MGRSUB”. Notice that we selected the Create table check box. This causes the target table to be created at the target server when the subscription is defined.

Figure 9-17 Define replication subscription dialog

By default, the target table has the same creator and name as the source table. You can choose a different name for the target table by clicking the Change button. Figure 9-18 shows that we changed the creator and name to TARGET.MGR_COPY.

Figure 9-18 Changing the target table name

Next, we click the Advanced button and then the Target Columns tab. Select the Primary Key box next to the source column named “ID”. This causes the target table to be created with ID as the primary key. This dialog also allows you to rename or add columns to the target table definition (although we do not do so in this chapter). See Figure 9-19.

Figure 9-19 Selecting the primary key for the target

Perhaps you want to restrict which rows from the source table to replicate. By default, all of the rows in the source table are replicated to the target. You can change this by clicking the Rows tab in the Advanced subscription replication dialog. You can now enter SQL predicates that will be used by the Apply program to limit which rows are replicated. Figure 9-20 shows that we have only allowed rows with an ID greater than 000000 to be replicated.

Figure 9-20 Restricting replicated rows

When you click OK, you return to the Define replication subscription main dialog. Next you define how frequently you want the Apply program to run by selecting the Time-based check box. Figure 9-21 shows that we selected the Using relative timing radio button and then changed the replication frequency to once every minute by using the Minutes and Hours options.

Figure 9-21 Subscription timing

Clicking OK returns you once again to the Define replication subscription main dialog. Clicking OK one more time allows you to save the replication subscription to a file or to immediately run it. In either case, you need to specify where the subscription control information will be stored. This database is the control server. Here we chose COPYDB as the control server and then saved the subscription as a file named replsub.sql (see Figure 9-22).

Figure 9-22 Saving the replication subscription

The replication subscription is actually defined when running the file. You do this by right-clicking Replication Subscriptions from the Control Center object tree, selecting Run SQL Files..., and then selecting replsub.sql.

9.6.4 Configuring the source database

When using replication, DB2 needs to retain log files. The Capture program reads the log files to collect the data to be replicated. You can tell DB2 to retain the log files by setting the database configuration parameter LOGRETAIN to RECOVERY or CAPTURE. This only needs to be done on the source server. You can use the Control Center to accomplish this by right-clicking the database object, selecting Configure, and then clicking the Logs tab. Click the logretain parameter and select Recovery. When you click the OK button, a message appears that states that all applications must disconnect from the database for the parameter change to take effect. If the logretain parameter was previously set to “no”, it is necessary to take an offline backup of the database before the database becomes accessible again.

If you prefer to use the DB2 command line processor rather than the Control Center to make the change, the following commands accomplish the same thing:

db2 connect to sample
db2 update db cfg for sample using logretain recovery
db2 connect reset
db2 backup database sample to /backup_directory

Important: If you are using the DB2 Control Center Version 7.2.0.0 or earlier to define the subscription set, a known defect in the Control Center causes it to incorrectly populate one of the control tables used for replication. This defect is resolved with APAR IY19523 (this APAR is scheduled to be included in DB2 Universal Database V7 Fixpak 4).

Therefore, it is necessary to change the generated subscription definition file before you run it. For each column in the target table, there is an INSERT statement into the ASN.IBMSNAP_SUBS_COLS table. Locate the INSERT statement for the DATALINK column being replicated. The value supplied for COL_TYPE is incorrectly set to “A”. This needs to be changed to “D”. If the subscription set is already defined, this column can be updated with the DB2 command line processor by connecting to the database containing the subscription definition and running this UPDATE statement:

update ASN.IBMSNAP_SUBS_COLS set COL_TYPE='D' where TARGET_OWNER='<owner>' and TARGET_TABLE='<table>' and TARGET_NAME='<dlcolname>'

Here <owner> is the owner of the target table, <table> is the name of the target table, and <dlcolname> is the name of the DATALINK column in the target table.
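As a worked example for our scenario, the Python sketch below fills in the placeholders of the UPDATE statement. The helper name is hypothetical, and the values assume that the target table is TARGET.MGR_COPY and that the DATALINK column kept its source name, PICTURE, because we did not rename any target columns.

```python
def col_type_fix(owner: str, table: str, dlcolname: str) -> str:
    """Build the UPDATE statement that sets COL_TYPE to 'D' for one column."""

    def lit(value: str) -> str:
        # Quote a value as an SQL string literal, doubling embedded quotes.
        return "'" + value.replace("'", "''") + "'"

    return (
        "UPDATE ASN.IBMSNAP_SUBS_COLS SET COL_TYPE='D' "
        f"WHERE TARGET_OWNER={lit(owner)} AND TARGET_TABLE={lit(table)} "
        f"AND TARGET_NAME={lit(dlcolname)}"
    )


# Values taken from the scenario in this chapter (assumed unchanged column name).
statement = col_type_fix("TARGET", "MGR_COPY", "PICTURE")
print(statement)
```

The printed statement can be run from the DB2 command line processor after connecting to the control server, which is COPYDB in our scenario.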

9.6.5 Binding the Capture and Apply programs

DB2 DPropR can automatically bind the Capture and Apply programs to the source and target databases on the UNIX, Windows, and OS/2 operating systems. You can also use the DB2 command line processor to manually bind the programs. Here are the commands we used to bind the programs to the SAMPLE and COPYDB databases:

db2 connect to sample
cd sqllib/bnd
db2 bind @capture.lst isolation ur blocking all
db2 bind @applyur.lst isolation ur blocking all
db2 bind @applycs.lst isolation cs blocking all

db2 connect to copydb
db2 bind @applyur.lst isolation ur blocking all
db2 bind @applycs.lst isolation cs blocking all

Note that you need to bind both the Capture and Apply programs to the source database, but only the Apply program to the target database. This is because Capture has no need to access the target database, while Apply needs to read from the source database and write to the target database.

9.6.6 Creating the password file for the Apply program

Like any other application, the Apply program needs to connect to the databases that it will access. This connection is authenticated in the same way that any other connection is authenticated, which means that Apply must provide a user ID and a password. These are stored in a password file. The password file contains the name of the database, the user ID to be used for the connection, and a password. Our password file looks like this:

SERVER=SAMPLE USERID=source PWD=sourcepwd
SERVER=COPYDB USERID=target PWD=targetpwd

Note: The logretain database configuration parameter should only be set to CAPTURE when the DB2 log files will not be used for rollforward recovery. When the logretain parameter is set to CAPTURE, the Capture program calls the PRUNE LOGFILE command to delete log files when the Capture program completes.

The first line contains the name of the replication source database, SAMPLE. The user ID that is used to connect to the SAMPLE database is source, which happens to be the DB2 instance owner ID of our instance named source. The password of the user ID source is sourcepwd. The next line supplies the name of the target server and the user ID and password that will be used to access it.

Naming the password file

The password file must be named applyqual.pwd, where applyqual is the Apply qualifier defined in the subscription set. When we created the subscription set in 9.6.3, “Defining the subscription set and subscription set member”, we used MGRSUB as the Apply qualifier (see Figure 9-17). Therefore, the password file needs to be named MGRSUB.pwd. The password file must reside in the same directory from which the Apply program will be started.
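The naming rule and the line format can be summarized in a short Python sketch. The two helper functions are hypothetical and exist only to illustrate the convention. The values are the ones used in our scenario.

```python
from typing import Iterable, List, Tuple


def password_file_name(apply_qualifier: str) -> str:
    """The password file is named after the Apply qualifier."""
    return f"{apply_qualifier}.pwd"


def password_file_lines(connections: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Format one line per database: server name, user ID, and password."""
    return [f"SERVER={s} USERID={u} PWD={p}" for s, u, p in connections]


name = password_file_name("MGRSUB")
lines = password_file_lines(
    [("SAMPLE", "source", "sourcepwd"), ("COPYDB", "target", "targetpwd")]
)
print(name)
print("\n".join(lines))
```

Remember that the file itself must be created in the directory from which Apply is started.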

9.6.7 Configuration files used by ASNDLCOPY

The ASNDLCOPY user exit program needs two more files to replicate DATALINK columns and their linked files: ASNDLSRVMAP and ASNDLUSER.

ASNDLSRVMAP

ASNDLSRVMAP contains information necessary to transform the URL stored in the source DATALINK value into the URL that will eventually point to the copied file on the target server. Consider one of the DATALINK values stored in our replication source table SOURCE.MANAGERS:

HTTP://UNUNBIUM.ALMADEN.IBM.COM/dldata/source/pictures/cathy.bmp

We can break up this URL into four components:

► Protocol: HTTP
► Hostname: UNUNBIUM.ALMADEN.IBM.COM
► Pathname: /dldata/source/pictures
► Filename: cathy.bmp

Note: The server name supplied must match the name in the subscription set, and, on UNIX and Windows NT, the user ID and password values supplied are case sensitive.

Note: Because the password file contains sensitive information, you may want to place it in a directory that is accessible only to authorized individuals.

When we replicate this row to the target server, any of these four components may change. If we replicate to a target server running on the Windows NT operating system, the protocol might change from HTTP to UNC. In most cases, the hostname of the target server will be different from the source server. The pathname may change, and we may want to rename the copied file. The ASNDLSRVMAP file is used to define how to change the first three. The filename can be changed, but requires user written code to do so. This can be done by modifying the ASNDLCOPY user exit program.

Here is the format of the ASNDLSRVMAP file:

<source_server> <target_server> [<source_path> <target_path>]

The contents of the format are explained here:

► <source_server>: Contains the protocol name and hostname of the source server.

► <target_server>: Contains the protocol name and hostname of the target server.

► <source_path>: Is optional and contains the pathname of the source file.

► <target_path>: Is also optional and contains the pathname to the copied file on the target server.

Here is the content of a hypothetical ASNDLSRVMAP file:

HTTP://host1.com HTTP://host2.com /data /files

The ASNDLCOPY program uses this to change a source DATALINK value of HTTP://host1.com/data/groovin.mp3 to a target DATALINK value of HTTP://host2.com/files/groovin.mp3. If the source path and target path are not supplied, ASNDLCOPY copies the source file to a like-named path on the target server. The ASNDLSRVMAP file may contain multiple lines to define mapping for multiple source and target server pairs and multiple source and target pathname pairs.

The ASNDLSRVMAP file for our exercise looks like this:

HTTP://UNUNBIUM.ALMADEN.IBM.COM HTTP://UNUNBIUM.ALMADEN.IBM.COM /dldata/source/pictures /dldata/target/photos

Important: Note that the ASNDLSRVMAP file used for our exercise contains only one line (it may wrap onto two lines when displayed because of its length). Each source server and target server pair, together with any source pathname and target pathname, must reside on a single line. The ASNDLSRVMAP file needs to reside in the same directory as the password file, that is, the directory from which the Apply program will be started.
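To make the mapping concrete, here is a minimal Python sketch of the transformation that the ASNDLSRVMAP file describes. It is not the ASNDLCOPY source code. The class and function names are our own, and we assume a simple, case-sensitive prefix match in which the first matching line wins.

```python
from typing import List, NamedTuple, Optional


class MapEntry(NamedTuple):
    source_server: str
    target_server: str
    source_path: Optional[str] = None
    target_path: Optional[str] = None


def parse_srvmap(text: str) -> List[MapEntry]:
    """Parse ASNDLSRVMAP-style lines: two fields, or four with the path pair."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) in (2, 4):
            entries.append(MapEntry(*fields))
        elif fields:
            raise ValueError(f"expected 2 or 4 fields: {line!r}")
    return entries


def map_url(url: str, entries: List[MapEntry]) -> str:
    """Rewrite a source DATALINK URL into the corresponding target URL."""
    for e in entries:
        if not url.startswith(e.source_server + "/"):
            continue
        rest = url[len(e.source_server):]  # pathname and filename
        if e.source_path is None:
            return e.target_server + rest  # like-named path on the target
        src = e.source_path.rstrip("/")
        if rest.startswith(src + "/"):
            return e.target_server + e.target_path.rstrip("/") + rest[len(src):]
    raise LookupError(f"no mapping for {url}")


entries = parse_srvmap(
    "HTTP://UNUNBIUM.ALMADEN.IBM.COM HTTP://UNUNBIUM.ALMADEN.IBM.COM "
    "/dldata/source/pictures /dldata/target/photos\n"
)
print(map_url("HTTP://UNUNBIUM.ALMADEN.IBM.COM/dldata/source/pictures/cathy.bmp", entries))
```

With the single line from our exercise, the URL for cathy.bmp keeps its protocol, hostname, and filename, and only the pathname changes from /dldata/source/pictures to /dldata/target/photos.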

ASNDLUSER

The ASNDLUSER file contains the hostnames, port numbers, user IDs, and passwords used by ASNDLCOPY to physically copy the file being replicated from the source server to the target server. Here is the format of the ASNDLUSER file:

<source_hostname> <recv_port> <send_port> <userid> <passwd>
<target_hostname> <recv_port> <send_port> <userid> <passwd>

The first line identifies the hostname where the source file resides, the port number used by ASNDLCOPY to receive files, the port number used by ASNDLCOPY to send files, the user ID used to transfer files from the source, and the password of that user ID. The second line contains the same information for the target server.

Note that if the source hostname and the target hostname are the same, only one line is needed. Because we are using the FTP daemon to transfer files, both the receive port and the send port will use the standard FTP communication port number 21.

Here is the content of the ASNDLUSER file we will use for our exercise:

UNUNBIUM.ALMADEN.IBM.COM 21 21 target targetpwd

If we wanted to use the ASNDLCOPYD daemon instead of the standard FTP daemon (as is required to replicate files linked with the READ PERMISSION DB option), the receive port number specified in the ASNDLUSER file needs to match the port specified when starting ASNDLCOPYD. Typically we choose a port that we know is not being used. For example, if we started the ASNDLCOPYD daemon using port 9999, the ASNDLUSER file would need to look like this:

UNUNBIUM.ALMADEN.IBM.COM 9999 21 target targetpwd

This entry would cause ASNDLCOPY to connect to ASNDLCOPYD on port number 9999 with a user ID of target and a password of targetpwd.
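The following Python sketch reads entries in the ASNDLUSER format and looks up the settings for a hostname. It only illustrates the five-field layout described above. The names are hypothetical, and ASNDLCOPY itself may parse the file differently.

```python
from typing import Dict, NamedTuple


class CopyUser(NamedTuple):
    hostname: str
    recv_port: int
    send_port: int
    userid: str
    passwd: str


def parse_asndluser(text: str) -> Dict[str, CopyUser]:
    """Return one entry per hostname from ASNDLUSER-style lines."""
    users = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields: {line!r}")
        host, recv, send, userid, passwd = fields
        users[host] = CopyUser(host, int(recv), int(send), userid, passwd)
    return users


# The entry used when ASNDLCOPYD listens on port 9999.
users = parse_asndluser("UNUNBIUM.ALMADEN.IBM.COM 9999 21 target targetpwd\n")
entry = users["UNUNBIUM.ALMADEN.IBM.COM"]
print(entry.recv_port, entry.send_port, entry.userid)
```

Because the source and target hostnames are the same in our exercise, the file contains a single line and the lookup returns the same entry for both roles.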

Note: The ASNDLSRVMAP and ASNDLUSER files must reside in the directory from which the Apply program is started. Because the ASNDLUSER file contains sensitive password information, you may want to place it in a directory that is accessible only to authorized individuals.

9.6.8 Configuration files used by ASNDLCOPYD

The ASNDLUSERINFO and the <userid>.DIR configuration files enable the ASNDLCOPYD daemon to restrict which user IDs can transfer files and which directories are accessible by those user IDs. These files must reside in a common directory, and the ASNDLCOPYD daemon must be started using this directory name as an argument.

ASNDLUSERINFO

The configuration file called ASNDLUSERINFO contains a list of users who can log in to the ASNDLCOPYD daemon, and the login password for each user. The password stored in this file can be encrypted. Here is how an entry in the ASNDLUSERINFO file may appear:

db2inst99 22OAoPwDj0g

When the ASNDLCOPY user exit program connects to the ASNDLCOPYD daemon, ASNDLCOPY must supply a user ID and password. ASNDLCOPY gets this information from the ASNDLUSER configuration file. ASNDLCOPYD validates this user ID and password by looking in the ASNDLUSERINFO configuration file. Any attempt to connect to ASNDLCOPYD with a user ID or password that is not listed in the ASNDLUSERINFO file is rejected.

The ASNDLUSERINFO file is populated and updated by using the ASNDLCOPYD_CMD command.

Here is the syntax for the command:

ASNDLCOPYD_CMD [-d<config_dir>] {ADDUSER | PASSWD | RMUSER} [<arg0...>]

► <config_dir>: A directory where the ASNDLUSERINFO file will be stored. If no directory is supplied, the ASNDLUSERINFO file is created or updated in the current directory (that is, the directory from which the command was run).

► ADDUSER: Adds a user ID to the ASNDLUSERINFO file and prompts for a password. <arg0> must specify the user ID to be added.

► PASSWD: Is used to change the password of a user ID already in the ASNDLUSERINFO file. <arg0> must specify the user ID whose password is to be changed.

► RMUSER: Removes a user from the ASNDLUSERINFO file. <arg0> must specify the user ID to be removed.

<userid>.DIR

For each user listed in the ASNDLUSERINFO file, there must exist a file named <userid>.DIR (for example, db2inst99.DIR). This file contains a list of directories that are accessible by that user. The db2inst99.DIR file might look like this:

/datalinks/data/photos
/datalinks/data/mp3

When the ASNDLCOPY program connects to the ASNDLCOPYD daemon as user db2inst99, it has access to files in the /datalinks/data/photos directory (including any subdirectories) and /datalinks/data/mp3.
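The access rule can be pictured with the Python sketch below. It is an illustration only and not the ASNDLCOPYD implementation. The function names are ours, and we assume that every listed directory also grants access to its subdirectories and that paths are normalized before they are compared.

```python
import posixpath
from typing import List


def parse_dir_file(text: str) -> List[str]:
    """Read one directory per line from <userid>.DIR-style content."""
    return [posixpath.normpath(l.strip()) for l in text.splitlines() if l.strip()]


def is_accessible(path: str, allowed_dirs: List[str]) -> bool:
    """True if `path` is a listed directory or lies beneath one."""
    path = posixpath.normpath(path)  # collapses "." and ".." components
    for directory in allowed_dirs:
        if path == directory or path.startswith(directory.rstrip("/") + "/"):
            return True
    return False


allowed = parse_dir_file("/datalinks/data/photos\n/datalinks/data/mp3\n")
print(is_accessible("/datalinks/data/photos/2001/cathy.bmp", allowed))
print(is_accessible("/datalinks/data/photos2/cathy.bmp", allowed))
```

Comparing against the directory name plus a trailing slash keeps a sibling such as /datalinks/data/photos2 from matching /datalinks/data/photos by accident.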

The <userid>.DIR file is created and updated manually.

9.6.9 Starting and stopping the Capture and Apply programs

The Capture program usually runs on the source server. Capture can be started by issuing the following command:

asnccp <source_server> <options>

We start the Capture program here with this command:

asnccp SAMPLE COLD NOPRUNE

Using the COLD option causes Capture to clean up the CD table before it runs. Capture does not start collecting data from the replication source until the Apply program starts. Capture then does an initial load of the CD table from the replication source table.

The Apply program must be started from the directory containing the applyqual.pwd file. This directory should also contain the ASNDLSRVMAP and ASNDLUSER configuration files. Apply is started by specifying the Apply qualifier and the target database name:

asnapply <apply_qualifier> <target_database>

To start Apply, use this command:

asnapply MGRSUB COPYDB

To start the ASNDLCOPYD daemon, execute the ASNDLCOPYD command as root (or a user with superuser privilege on UNIX or administrator authority on Windows NT):

ASNDLCOPYD <port_number> [<config_directory>]

To stop the Capture program, use the asncmd command:

asncmd <source_server> STOP

To stop the Apply program, use the asnastop command:

asnastop <apply_qualifier>

To stop the ASNDLCOPYD daemon on UNIX, log on as a user with root authority and end the daemon process with the kill command:

kill -9 <process_id>

To stop the ASNDLCOPYD daemon on Windows NT, log on as a user with administrator authority, right-click the task bar, select Task Manager, select the asndlcopyd process on the Processes tab, and click the End Process button.

Chapter 10. The Reconcile utility

This chapter describes the Reconcile utility. Reconcile is a validation process that takes place between the DB2 Universal Database server that has tables with DATALINK columns and the DLFM server. It validates that the files referenced or linked by the DATALINK columns on the DB2 Universal Database server exist on the DLFM server, that the links can be re-established, or that the files are in the proper state.

For example, if you were to insert a row into a table that has a DATALINK data type column, the insert would complete successfully if the file referenced (assuming that the file is not already linked) in the insert statement exists on the DLFM server. The file that is referenced will then be considered linked and “reconciled” since the table with the DATALINK column on the DB2 Universal Database server is in sync with the DLFM server. The purpose of the Reconcile utility is to ensure that the relationship described in the example given is maintained.

10.1 Overview

The Reconcile utility is initiated from the DB2 Universal Database server, and it involves all the Data Links servers running the DB2 Data Links Manager that are referenced by the DATALINK column values. When the Reconcile utility is initiated on the DB2 Database server, it communicates with the DLFM servers that are referenced by DATALINK column values. If the DLFM server is not available when the Reconcile utility is initiated, the warning message shown in Figure 10-1 is returned on the DB2 database server.

Figure 10-1 Reconcile warning when DLFM server is not available

In this example, we initiated the Reconcile utility on the DB2 Universal Database server with the following command:

db2 reconcile resident dlreport recon_report

When the Reconcile utility completes, it generates two report files, one with an .exp extension and one with a .ulk extension. The .ulk file contains a list of files that were unlinked on the file server, and the .exp file contains a list of files that were in exception on the file server. If there were no exceptions or no files were unlinked, the corresponding report file is empty.

In addition, the Reconcile utility also provides an option for specifying an exception table. All exceptions that are encountered while the Reconcile utility runs are inserted into the exception table. In order for the exception table to be populated, it must have the same structure as the table that is being reconciled. This table can then be used by the Import or Load utilities. This is desirable because it can avoid the manual process of correcting exceptions by using the Reconcile report files (Figure 11-28 shows an example of creating an exception table). For more information on the Reconcile utility, refer to the RECONCILE command in DB2 UDB Command Reference, SC09-2951.

Note: When the Reconcile utility is initiated, it locks the table being reconciled with a “Z” or Super Exclusive lock. The table is locked until the Reconcile utility is complete. The “Z” lock prevents any access to the table. In our example, a snapshot for LOCKS on the table being reconciled showed the table being locked with a “Z” lock (Figure 10-2). This should be considered before you run the Reconcile utility, especially if the table has a large number of rows.

Figure 10-2 Extract of a lock snapshot for a table being reconciled

The Reconcile utility reconciles at a table level, that is, the Reconcile utility needs to be initiated for each table that has columns defined with the DATALINK data type. DB2 Universal Database provides a utility called db2_recon_aid, which provides a mechanism for checking and running reconcile on tables that are potentially inconsistent with the DATALINK file data on the DLFM file server.

The db2_recon_aid with the check option lists the tables that may need reconciliation. No reconciliation is performed. This is useful for determining which tables need to be reconciled in an environment where there are many tables in the database. In our example, we ran the db2_recon_aid with the check option on the database (Figure 10-3).

Figure 10-3 Output of the db2_recon_aid utility with the CHECK option

The execution of the db2_recon_aid utility without the check option initiates the Reconcile utility for each of the tables that may require reconciliation. For more information on the db2_recon_aid utility, refer to DB2 Data Links Manager Quick Beginnings, GC09-2966.

Note: You cannot specify individual DLFM file servers with the db2_recon_aid utility even though the utility provides the option.

10.2 When to run the Reconcile utility

You should run the Reconcile utility against tables that are in a Data Link Reconcile Pending (DRP) state to remove them from this state. To determine if a table is in a DRP state, you can examine the db2diag.log (Figure 10-4).

Figure 10-4 Extract of db2diag.log showing a table in DRP state

Alternatively, you can run the db2dart utility (Figure 10-5) against the database or tablespace.

Figure 10-5 Extract of a DB2DART report showing a table in DRP state

In some situations, such utilities as Import or Load may detect a problem with the metadata in a DLFM server. In these situations, DB2 Universal Database fails the utility and places the table with the DATALINK column in a Data Link_Reconcile_Not_Possible (DRNP) state. The Reconcile utility cannot be run against a table in this state. A table in the DRNP state must first be placed into a DRP state by using the SET INTEGRITY SQL statement.

Figure 10-6 shows a summary of when the Reconcile utility should be initiated as listed here:

► If the tables are in a Data Link Reconcile Pending state (DRP), run the Reconcile utility to move the table from a DRP state to a normal state. Sometimes DB2 Universal Database automatically places tables into a DRP state if an inconsistency is suspected.

Note: The SET INTEGRITY and SET CONSTRAINTS SQL statements are equivalent. SET INTEGRITY has replaced SET CONSTRAINTS, which is still available for compatibility with previous versions. For more information on the SET INTEGRITY statement, refer to DB2 UDB SQL Reference, SC09-2974.

► If the tables are in a Data Link Reconcile Not Possible state (DRNP), you can:

– Prevent access to a table with possibly inconsistent DATALINK column values by issuing the following two commands.

First, place the table into a normal state with the command:

SET INTEGRITY FOR tablename DATALINK RECONCILE PENDING IMMEDIATE UNCHECKED

Then place the table into a DRP state with the command:

SET INTEGRITY FOR tablename TO DATALINK RECONCILE PENDING

When a table is in DRP state, you can only issue the SELECT SQL statement against the table. The DRP state on the table prevents INSERT, UPDATE, and DELETE SQL statements against the table. Run the Reconcile utility to remove the table from the DRP state.

– Alternatively, you can update the DATALINK values of a table in DRNP state using either of the following methods:

• Using the SQL UPDATE statement, set the data location part of a DATALINK column value to a zero-length URL if the column is not nullable, or to NULL if the column is nullable.

• Restore the files on the appropriate Data Links servers. Then run an application that issues SELECT statements to read the DATALINK column values and issues UPDATE statements to update the DATALINK column with the same values. After the update operation completes, the files are marked as linked on the appropriate Data Links servers.

After the UPDATE SQL statements are completed, you can reset the DRNP state by issuing the following SQL statement to bring the table to a normal state:

SET INTEGRITY FOR tablename DATALINK RECONCILE PENDING IMMEDIATE UNCHECKED

Then place the table into a DRP state with the following SQL statement:

SET INTEGRITY FOR tablename TO DATALINK RECONCILE PENDING

Run the Reconcile utility.

Note: The Datalink_Reconcile_Not_Possible (DRNP) state must be on while the DATALINK column values are being updated. You cannot update a table in DRP state.
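The state changes described in this section can be sketched as a small state machine. This is only an illustrative model in Python, not DB2 code: the action labels RESET, TO_DRP, and RECONCILE are hypothetical names that stand in for the two SET INTEGRITY statements and the Reconcile utility, and only the transitions named in the text are modeled.

```python
from enum import Enum


class TableState(Enum):
    NORMAL = "normal"
    DRP = "Datalink_Reconcile_Pending"
    DRNP = "Datalink_Reconcile_Not_Possible"


# Hypothetical action labels standing in for the steps in the text:
#   RESET     = SET INTEGRITY ... DATALINK RECONCILE PENDING IMMEDIATE UNCHECKED
#   TO_DRP    = SET INTEGRITY ... TO DATALINK RECONCILE PENDING
#   RECONCILE = running the Reconcile utility
TRANSITIONS = {
    (TableState.DRNP, "RESET"): TableState.NORMAL,
    (TableState.NORMAL, "TO_DRP"): TableState.DRP,
    (TableState.DRP, "RECONCILE"): TableState.NORMAL,
}


def apply_action(state, action):
    """Return the next state, or raise if the text describes no such step."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action} is not described for a table in {state.name}")


def steps_to_normal(state):
    """List the actions that take a table from `state` back to NORMAL."""
    if state is TableState.DRNP:
        return ["RESET", "TO_DRP", "RECONCILE"]
    if state is TableState.DRP:
        return ["RECONCILE"]
    return []
```

For a table in DRNP state, any UPDATE of the DATALINK values would happen before the first of these three actions, because a table in DRP state cannot be updated.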


Figure 10-6 Determining when to run the Reconcile utility

[Figure 10-6 is a flowchart: Is the table in a DRP or DRNP state? DRP: run Reconcile to remove the DRP state. DRNP: either prevent access to the table or update it (update the DATALINK column values using SQL); in both cases, use SET INTEGRITY to reset DRNP to the normal state, use SET INTEGRITY to place the table in DRP state, and run Reconcile to remove the DRP state.]

10.3 Situations that require the Reconcile utility

The following situations may lead to DB2 Universal Database placing the tables in a DRP or DRNP state, and you may need to run the Reconcile utility:

■ The entire database is restored and rolled forward to a point in time. Because the entire database is rolled forward to a committed transaction, no tables will be in the check pending state (due to referential constraints or check constraints). All data in the database is brought to a consistent state. The DATALINK columns, however, may not be synchronized with the meta data in the DB2 Data Links Manager, and reconciliation is required.

In this situation, tables with DATALINK columns will already be in the Datalink_Reconcile_Pending state. You should run the Reconcile utility for each of these tables.

■ A particular Data Links server running the DB2 Data Links Manager lost track of its meta data. This can occur for different reasons:

– The Data Links server was cold started.

– The Data Links server meta data was restored to a back-level state.

In some situations, such as SQL UPDATEs and DELETEs, DB2 may be able to detect a problem with the meta data in a Data Links server. In these situations, DB2 fails the SQL statement. You would then put the table in the Datalink_Reconcile_Pending state by using the SET INTEGRITY statement (or the equivalent SET CONSTRAINTS statement) and run the Reconcile utility on that table.

■ A file system is not available (for example, because of a disk crash) and is not restored to the current state. In this situation, files may be missing.

An error like this is typically discovered by an application when it cannot access the file whose file reference it obtained from the database. You should put the table in the Datalink_Reconcile_Pending state and run the Reconcile utility on it. Some of the files may be restored from the archive server if their corresponding DATALINK columns had RECOVERY=YES. In any case, the Reconcile utility records the exceptions in the exception table or in the exception report. You can then restore those files or issue SQL UPDATEs to fix the column.

10.3.1 Reconcile algorithm

This section outlines the high-level steps taken by the Reconcile utility to remove a table from DRP state.

The algorithms presented are for tables that have DATALINK columns with the RECOVERY=YES option and for tables that have DATALINK columns with the RECOVERY=NO option. A table can have both types of columns.

Algorithm with RECOVERY=YES

The Reconcile utility performs the following process for columns with RECOVERY=YES.


■ If the file table entry does not exist for this file, then:

a. Retrieve the file from the archive.

• Retrieves only if the proper version of the file is not already in the file system.

• If the modification time of the file on the file system is later than that of the version required, then the existing file is renamed with the extension .MOD before the retrieval. (The DB2 side exception report file records that this was done, regardless of whether an exception table was specified.) If the .MOD file already exists, the file is not retrieved; this is an exception, and it is reported in the exception report and the exception table on the DB2 side.

• Retrieves the file from archive if the file is not found or if the modification time of the file on file system is less than the version required.

b. Do the "relink" processing.

■ If the file table entry does exist for this file, then:

– If the file is in a "linked" state as per the dfm_file table entry, then:

• Check if the file exists in the file system:

If YES, then check whether the file was modified or its size changed.

If YES, then call the retrieve daemon to retrieve the file from the archive (see the conditions under a. in the first bullet above).

If the file is not modified and its size is unchanged, then check whether inode, fsid, or cellid changed.

If YES, then update the dfm_file table entry with these new values.

• If file does not exist in file system, then call the retrieve daemon to retrieve the file from archive.

After retrieval of file from archive is successful, check if inode, fsid, or cellid of the retrieved file is different from the file table entries values.

If YES, then update the dfm_file table entry with values from the retrieved file.

– If the file is in the "unlinked" state as per the dfm_file table entry, then:

i. Retrieve file from archive (if needed, and based on the conditions mentioned in a under the first bullet).

ii. Bring back the entry to "link" state. If retrieve was successful, then if inode, fsid, or cellid of the retrieved file is different from the dfm_file table entry values, then during the "link" processing, update the dfm_file table entry with values of the retrieved file.
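The RECOVERY=YES steps above can be sketched as a decision function that returns the list of actions Reconcile would take for one file. This is a simplified model under stated assumptions, not the DLFM implementation: the class names, action labels, and the `required_mtime` parameter are hypothetical, "modified" is approximated by comparing modification times only (the file size check is omitted), and because the text does not say whether later steps still run after an exception, the sketch simply lists them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FsFile:
    mtime: int                    # modification time on the file system
    ids: Tuple[int, int, int]     # (inode, fsid, cellid)


@dataclass
class FileEntry:
    state: str                    # "linked" or "unlinked"
    ids: Tuple[int, int, int]     # values recorded in the dfm_file entry


def retrieve_steps(f: Optional[FsFile], required_mtime: int,
                   mod_exists: bool) -> List[str]:
    """Step a: decide whether the file must come back from the archive."""
    if f is None or f.mtime < required_mtime:
        return ["retrieve"]
    if f.mtime > required_mtime:
        if mod_exists:
            return ["exception"]          # .MOD already exists: no retrieval
        return ["rename_to_MOD", "retrieve"]
    return []                             # proper version is already present


def reconcile_recovery_yes(entry: Optional[FileEntry], f: Optional[FsFile],
                           required_mtime: int,
                           mod_exists: bool = False) -> List[str]:
    if entry is None:
        return retrieve_steps(f, required_mtime, mod_exists) + ["relink"]
    if entry.state == "linked":
        if f is None:
            return ["retrieve", "update_ids_if_changed"]
        if f.mtime != required_mtime:     # file was modified
            return retrieve_steps(f, required_mtime, mod_exists)
        return ["update_ids"] if f.ids != entry.ids else []
    # "unlinked" entry: retrieve if needed, then bring it back to "link"
    return retrieve_steps(f, required_mtime, mod_exists) + ["link"]
```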


Algorithm with RECOVERY=NO

The Reconcile utility performs the following process for columns with the RECOVERY=NO option.

■ If the file table entry does not exist for this file, then an exception is issued.

■ If the file table entry exists for this file, then:

– Check if file exists in the file system. If YES, then:

If Full Access applies and the file was modified or its size changed, then an exception is issued. Otherwise, if inode, fsid, or cellid of the file on the file system differs from the dfm_file table entry values, then update the dfm_file table entry with the new values.

If the file exists on the file system in the proper state and is in the "unlink" state as per the dfm_file table entry, then bring it back to the "link" state.

– If the file does not exist in file system, then an exception is issued.

– For the exception cases when the file table entry exists, bring the corresponding dfm_file table entries to an unlink state.
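The RECOVERY=NO steps can be sketched in the same way. Because no archived copy exists, the only outcomes are an exception, an update of the recorded identifiers, or a relink. The function name, parameters, and action labels below are hypothetical and are used only for illustration.

```python
from typing import List, Optional


def reconcile_recovery_no(entry_state: Optional[str], file_exists: bool,
                          full_access: bool = False, modified: bool = False,
                          ids_changed: bool = False) -> List[str]:
    """Return the actions for one file referenced by a RECOVERY=NO column.

    entry_state is None (no file table entry), "linked", or "unlinked".
    """
    if entry_state is None:
        return ["exception"]
    # Exception cases with an existing entry: the entry goes to "unlink".
    if not file_exists or (full_access and modified):
        return ["exception", "set_entry_unlinked"]
    steps = []
    if ids_changed:                       # inode, fsid, or cellid differs
        steps.append("update_entry_ids")
    if entry_state == "unlinked":         # file is fine, so relink it
        steps.append("set_entry_linked")
    return steps
```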


Chapter 11. Recovery

This chapter describes recovery in a Data Links File Manager (DLFM) environment. Recovery in a DLFM environment may be required on the DLFM server, the DB2 Universal Database server and the file system storing the referenced DATALINK files. The best recovery strategy is one that is tested. It is important to plan ahead for disaster recovery.

This chapter looks at the following topics:

■ Types of recovery
■ Backup and restore considerations
■ The database recovery history file
■ The garbage collection process
■ The Reconcile utility
■ Example recovery scenarios


11.1 Overview

A database can become unusable because of hardware or software failure (or both), and the different failure situations may require different recovery actions. You should have a strategy in place to protect your database against the possibility of these failure situations. When designing a strategy, you should also rehearse it. This will allow you to detect any shortcomings in the plan and to avoid problems if you have to recover the database.

In general, recovery takes place after a failure, but there are cases where recovery is needed to go back in time to remove changes that were made to the database. An example of this is a user requesting a recovery to a point in time before the latest updates were made to the database. While recovery in a DLFM environment is similar to a standard database server recovery, there are some additional considerations in the DLFM environment.

There are three types of recovery that can take place in a DB2 UDB Database and a DLFM environment:

■ Crash recovery
■ Version or Full Database recovery
■ Restore and Rollforward recovery

11.1.1 Crash recovery

The purpose of crash recovery is to bring the database back to a consistent state after a severe error or condition that causes the database manager to terminate abnormally. During the crash (for example, a power failure), there may have been database transactions that were not committed or were only partially completed. The database manager either commits or rolls back these transactions on the next CONNECT, ACTIVATE, or RESTART database command.

Crash recovery processes only the active transaction log files by performing a forward recovery followed by a backward or undo recovery before allowing access to the database. At the end of crash recovery, the database is in a consistent and usable state as before the crash occurred.

Note: You can think of active transaction logs as the logs used by DB2 Universal Database during normal transaction processing. The active logs are allocated by DB2 Universal Database when the first database connection is made. The number of active logs allocated is determined by the LOGPRIMARY database configuration parameter.


In a DLFM environment, the two-phase commit protocol is implemented between the DB2 database manager and the DLFM servers to process a COMMIT or ROLLBACK statement issued by an application. Let us first define the two-phase commit, and then discuss how DB2 uses it for transactions with the DLFM.

Two-phase commit

The two-phase commit protocol is used in distributed transactions. It makes sure that the outcome of a transaction is consistent across all the resources involved in the transaction. As the name suggests, the protocol operates in two distinct phases to ultimately commit or abort a transaction.

In a Data Links scenario, the DB2 database manager is the coordinating transaction manager, and the DLFMs are the resource managers. The two phases in this case are (see Figure 11-1):

■ Phase one: The DB2 database manager asks each DLFM involved in the transaction if it is ready to commit (1-a in Figure 11-1). If the DLFM is ready to commit the transaction, it puts the transaction in the PREPARED state and responds YES to the DB2 database manager. The transaction is committed only if all the DLFMs respond YES.

■ Phase two: If all the DLFMs respond YES, the DB2 database manager instructs them to commit the transaction (see 2-a in Figure 11-1). If at least one of the DLFMs responds NO or does not respond (due to network or machine failure), the DB2 database manager instructs all the DLFMs to roll back the transaction. Regardless of the instruction (commit or roll back), each DLFM that is up and running complies and notifies the database manager when it is done (see 2-b in Figure 11-1).


Figure 11-1 Two-phase commit

Three different situations may occur:

■ One of the DLFMs prepared the transaction and went down before sending its YES response to the database manager.

■ One of the DLFMs prepared the transaction and sent its YES response to the database manager, but its response could not reach the database manager due to, for instance, a network failure.

■ One of the DLFMs prepared the transaction, and the database manager sends a final commit after receiving the YES response sent by all DLFMs (including this one), and at this stage, the DLFM goes down.

Each of these three cases would result in leaving the transaction at the DLFM side in PREPARED state. Such transactions are known as in-doubt transactions.
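The two phases and the three in-doubt situations can be illustrated with a toy coordinator. This is a minimal sketch, not the DB2 or DLFM implementation: the `Dlfm` class, its constructor flags, and the state labels other than PREPARED are hypothetical and exist only to simulate a lost reply or an unreachable server.

```python
class Dlfm:
    """A toy resource manager; the flags are hypothetical test knobs."""

    def __init__(self, name, votes_yes=True, reply_arrives=True,
                 reachable_in_phase_two=True):
        self.name = name
        self.votes_yes = votes_yes
        self.reply_arrives = reply_arrives
        self.reachable_in_phase_two = reachable_in_phase_two
        self.state = "ACTIVE"

    def prepare(self):
        """Phase one: return True (YES), False (NO), or None (no response)."""
        if not self.votes_yes:
            return False
        self.state = "PREPARED"
        return True if self.reply_arrives else None

    def finish(self, decision):
        """Phase two: apply the coordinator's decision if it reaches us."""
        if self.reachable_in_phase_two:
            self.state = decision


def two_phase_commit(dlfms):
    """Run both phases; return the decision and the in-doubt DLFM names."""
    votes = [d.prepare() for d in dlfms]
    decision = "COMMITTED" if all(v is True for v in votes) else "ROLLED_BACK"
    for d in dlfms:
        d.finish(decision)
    in_doubt = [d.name for d in dlfms if d.state == "PREPARED"]
    return decision, in_doubt
```

In this model, a DLFM whose reply is lost and that cannot be reached in phase two stays in the PREPARED state even though the coordinator rolled the transaction back, which corresponds to the first two situations. A DLFM that voted YES but goes down before the final commit arrives corresponds to the third.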


Whenever the database manager determines that a failure has potentially created in-doubt transactions on a Data Links server, it marks the state of the Data Links server as needing crash recovery. It disallows any SQL requests involving the Data Links server, while it is in this state. SQL0357N with reason code “03” is returned to the application that made the SQL request.

While a Data Links server configured to a DB2 UDB Database is in a state needing crash recovery, the database manager disallows SQL requests involving that particular Data Links server. SQL requests involving other data in the database are still allowed. The database manager starts a process that asynchronously attempts to complete crash recovery on each Data Links server requiring recovery. When the process successfully completes the crash recovery, the state of the Data Links server is marked as available, allowing further SQL requests that involve it.
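The per-server blocking described above can be sketched as follows. This is an illustrative model only: the exception class and function names are hypothetical, and the real check is internal to the database manager. It shows that only requests involving a server that needs crash recovery are rejected, and that requests are allowed again once recovery completes.

```python
class DataLinksServerUnavailable(Exception):
    """Stands in for SQL0357N with reason code 03."""


def check_request(involved_servers, needs_crash_recovery):
    """Reject a request only if it touches a server awaiting crash recovery."""
    blocked = sorted(set(involved_servers) & set(needs_crash_recovery))
    if blocked:
        raise DataLinksServerUnavailable(
            "SQL0357N, reason code 03: " + ", ".join(blocked))
    return "allowed"


def complete_crash_recovery(server, needs_crash_recovery):
    """Mark a server as available again after its recovery succeeds."""
    needs_crash_recovery.discard(server)
```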

11.1.2 Version or full database recovery

Version recovery using the BACKUP command in conjunction with the RESTORE command puts the database in a state that was previously saved (Figure 11-2). You use this recovery method with non-recoverable databases (that is, databases for which you do not have archived logs).

You can also use this method with recoverable databases by using the WITHOUT ROLLING FORWARD option. For example, if a full offline database backup is taken and the LOGRETAIN database configuration parameter is set to YES, the database will be placed into ROLLFORWARD PENDING after a restore. A ROLLFORWARD command with the STOP option removes the database from the ROLLFORWARD PENDING state and allows access to the database.


Figure 11-2 Version or full database recovery

In a DLFM environment, note the following points when considering a version recovery:

■ The DB2 UDB Database server can be a recoverable or a non-recoverable database. A database becomes recoverable when the LOGRETAIN or USEREXIT database configuration parameter (or both) is turned on. A version recovery requires a full offline database backup.

■ The DLFM_DB database on the DLFM server is a recoverable database and requires a full offline backup for a version recovery. The WITHOUT ROLLING FORWARD option must be used when restoring the offline backup to prevent the database from going into a ROLLFORWARD PENDING state.

In the event that a restore is done without specifying the WITHOUT ROLLING FORWARD clause, a ROLLFORWARD STOP after the restore would make the database accessible.


11.1.3 Restore and rollforward recovery

Rollforward recovery builds on a restored database and allows you to restore a database to a particular time that is after the time that the database backup was taken (Figure 11-3). This point can be either the end of the logs, or a point between the time of the database backup and the end of the logs. The LOGRETAIN configuration parameter in the database configuration file must be set to YES to invoke log retention.
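The idea of rolling forward from a backup image can be shown with a toy model in which a database is a dictionary and each log record is a (timestamp, key, value) change. This is only a conceptual sketch; the data layout and the function name are assumptions and have nothing to do with the DB2 log format.

```python
def restore_and_rollforward(backup, log_records, to_time=None):
    """Restore a backup image, then replay later changes from the logs.

    backup:      {"taken_at": int, "data": dict}
    log_records: iterable of (timestamp, key, value) committed changes
    to_time:     None for end of logs, or a point in time to stop at
    """
    db = dict(backup["data"])                 # restore: start from the image
    for ts, key, value in sorted(log_records, key=lambda r: r[0]):
        if ts <= backup["taken_at"]:
            continue                          # already contained in the image
        if to_time is not None and ts > to_time:
            break                             # stop at the requested point
        db[key] = value
    return db
```

With no log records, the result is simply the backup image, which corresponds to version recovery.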

Figure 11-3 Rollforward recovery

The LOGRETAIN parameter indicates the use of log retention. The USEREXIT parameter indicates that a user exit program is used to archive and retrieve the log files. Log files are archived when the database manager closes the log file. They are retrieved when the ROLLFORWARD utility needs to use them to restore a database. If a user exit program is not used, then all the logs are kept in the current log path. This can lead to a file system full condition. We highly recommend you use a user exit program to archive log files to disk or Tivoli Storage Manager.

Important: We advise you not to copy log files manually from the log directory to free space on a file system. Some log files may be active and manually copying these files may corrupt them. Active log files are also required for crash recovery to complete.


After LOGRETAIN, or USEREXIT, or both of these parameters are enabled, you must make a full backup of the database. This state is indicated by the backup_pending flag parameter.

Log retention provides the following additional features over version recovery:

■ The ability to take online database and tablespace backups
■ Point-in-time recovery of databases and tablespaces
■ The ability to restore and rollforward to the end of the logs

11.2 DLFM backup considerations

Using files externally introduces a new level of complexity. Because there are many parts involved, you must take care of all involved components. You do not have only one file as a backup image, but several database backup images and backups of the involved file systems.

In a Data Links Manager installed environment, there are at least two databases. One is the database on DB2 database server, and the other is the database that is used by Data Links Manager to store the meta data (DLFM_DB database). These databases are kept in sync; every time an INSERT/UPDATE/DELETE on a DATALINK column is run, the DLFM_DB is updated too.

No linked file can be renamed, updated, or deleted (setting WRITE PERMISSION BLOCKED in DATALINK column) without control of the DB2 database server. When files are linked, that is, when an insert or update is made on the DB2 database server on a DATALINK column, the files (if marked with RECOVERY=YES) are asynchronously copied into the backup directory or into the hierarchical storage. The working file that remains in the original directory is now protected through DB2. All involved components (Database on DB2 server, Database on Data Links server, and files and backups) are in sync.

Note: Active logs are required for crash recovery. Archived logs can be used for database or tablespace recovery. You might use point in time recovery if an active or an archived log is not available.

In this situation, you could roll forward to the point where the log is missing. You might also roll forward to a point in time if a bad transaction was run against the database. In this situation, you would restore the database and then roll forward to just before the time that the bad transaction was run.


When the Backup utility runs, DB2 ensures that all files scheduled for copying are copied. At the beginning of the backup process, DB2 also ensures that all Data Links Managers that are specified in the DB2 configuration file are running. If a Data Links Manager has one or more linked files, it must be available until the backup operation completes. If a DB2 Data Links Manager becomes unavailable before the backup operation completes, the backup operation is declared as incomplete.

The description that follows only applies to files that are linked by DATALINK columns that have the RECOVERY parameter set to YES. (Files that are referenced by DATALINK columns for which RECOVERY=NO is specified are not backed up or copied to the archive server.)

Figure 11-4 illustrates the actions that are performed during an INSERT in a Data Links environment.

Figure 11-4 Asynchronous archive request

When an INSERT SQL statement is issued against a table with a DATALINK data type on the DB2 server, a DB2 agent attempts to process the request by communicating this request with the DLFM server (Figure 11-4).

On the DLFM server, a dlfm_child process receives the request, links the file, and makes a call to the COPY DAEMON (dlfm_copyd) process to make an asynchronous backup of the file referenced in the INSERT statement to the archive server. The archive server could be Tivoli Storage Manager, disk, or any XBSA type storage system.

Figure 11-5 illustrates the processing that takes place when the Backup utility is executed on the DB2 UDB database server.


Figure 11-5 Processing that takes place during a backup

When the database Backup utility runs on the DB2 server, DB2 ensures that all files scheduled for copying are copied. The database backup operation initiates an internal retry logic. After the retry logic iterations are completed, the backup fails if the data linked FILE backup is not complete.

When files are linked, the Data Links servers schedule them to be copied asynchronously to an archive server, such as ADSM, or to disk. When the Backup utility runs, DB2 ensures that all files scheduled for copying are copied. At the beginning of the Backup process, DB2 contacts all Data Links servers that are specified in the DB2 configuration file. If a Data Links server has one or more linked files and is not running, or stops running during the backup operation, the backup will not contain complete DATALINK information. The backup operation will complete successfully.

Before the Data Links server can be marked as available to the database again, the backup process for all outstanding backups must complete successfully. If a backup is initiated when there are already twice the value of num_db_backups outstanding backups waiting to be completed on the Data Links server, the backup operation will fail. That Data Links server must be restarted and the outstanding backups completed before additional backups are allowed.
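The two backup rules above (the limit of twice num_db_backups outstanding backups, and the effect of a Data Links server that is not running) can be summarized in a small sketch. The function name, the dictionary keys, and the returned strings are hypothetical; the sketch only restates the rules from the text and does not model the retry logic.

```python
def backup_outcome(servers, num_db_backups):
    """Predict the outcome of a database backup.

    servers: list of dicts with the keys 'running' (bool),
             'linked_files' (int), and 'outstanding_backups' (int).
    """
    limit = 2 * num_db_backups
    if any(s["outstanding_backups"] >= limit for s in servers):
        return "fails"
    if any(s["linked_files"] > 0 and not s["running"] for s in servers):
        return "completes without complete DATALINK information"
    return "completes"
```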

11.2.1 Environment backup considerations

Figure 11-6 outlines the components that are involved when backing up the entire environment.


Note: A successful backup operation can cause the Data Links servers to clean up (Garbage Collect) the archived versions of files on the archive server (either disk or TSM). The num_db_backups database configuration parameter specifies the number of DB2 database backups before archived versions of the files (that were unlinked) are removed.


1. Make sure that all Data Links servers are up and running (unless you have specified the NO LINK CONTROL option in the DATALINK definition).

2. Back up the databases on the DB2 database servers.

3. Back up the DLFM_DB databases on the Data Links servers.

4. Back up the file systems used by the Data Links Manager.

File systems need to be unmounted, backed up (via the operating system), and then mounted again.

5. Back up the DLFM Backup directory that holds images of the DLFM_DB database and copies of the linked files and all updates if the RECOVERY OPTION is set to YES in the DATALINK column to provide point-in-time rollforward recovery.

Figure 11-6 Environment backup considerations

11.3 DLFM restore considerations

The information that follows applies if you have a DATALINK column (or columns) that is defined with RECOVERY=YES option for a table. If a table has a DATALINK column defined with the RECOVERY=NO option, the table is put in the Datalink_Reconcile_Pending (DRP) state at the end of the restore operation.

Figure 11-7 shows the processing that takes place when the restore utility is executed on the DB2 UDB database server.


Figure 11-7 Processing that takes place during a restore

When the restore utility is invoked on the DB2 server, the fast reconcile routine is invoked if the WITHOUT DATALINK option is not specified and if there is no break in the log sequence (LS) or log chain. The dlfm_child process may have to call the RETRIEVE DAEMON (dlfm_retrieved) to retrieve the file from the archive server if the file to be LINKED is not available on the file system (Figure 11-7). Fast reconcile logic performs the following process:

■ All files that were linked after the backup image that was used for the database restore are marked as unlinked (because they are not recorded in the backup image as being linked).

■ All files that were unlinked after the backup image, but were linked before the backup image was taken, are marked as linked (because they are recorded in the backup image as being linked). If the file was subsequently linked to another table in another database, the restored table is put into the Datalink_Reconcile_Pending state (Figure 11-8).
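The two fast reconcile rules amount to comparing the set of files linked in the backup image with the set linked now. The following is a minimal sketch under that assumption; the function name, parameters, and result keys are hypothetical, and real fast reconcile works from the DLFM meta data rather than from in-memory sets.

```python
def fast_reconcile(linked_in_backup, linked_now, linked_elsewhere=frozenset()):
    """Compare link state in the backup image with the current link state.

    linked_elsewhere: files that were later linked to another table in
    another database, which the restored table cannot simply take back.
    """
    linked_in_backup, linked_now = set(linked_in_backup), set(linked_now)
    to_unlink = linked_now - linked_in_backup     # linked after the backup
    to_restore = linked_in_backup - linked_now    # unlinked after the backup
    conflicts = to_restore & set(linked_elsewhere)
    return {
        "unlink": sorted(to_unlink),
        "relink": sorted(to_restore - conflicts),
        "table_in_drp": bool(conflicts),
    }
```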

Figure 11-8 Restore with the WITHOUT DATALINK option

Note: The DLFM process model may change in future releases of DB2.


If you use the restore utility with the WITHOUT DATALINK option, all tables with DATALINK columns are placed in the Datalink_Reconcile_Pending state, and no reconciliation is performed with the Data Links servers during the Restore operation. This option can be used when the DLFM server is unavailable (Figure 11-9).

Figure 11-9 Restore without specifying the WITHOUT DATALINK option


11.4 Recovery history file

Every DB2 UDB Database and the Data Links File Manager Database (dlfm_db) has a history file that records historical administrative operations. A recovery history file is created with each database and is automatically updated. During a database migration, the history file is migrated as well.

The history file can be accessed by issuing the following command:

db2 list history all for <dbname>

The database history file is invaluable in a recovery scenario.

The history file is individually restorable from any backup image. If the current database is unusable or not available, and the associated recovery history file is damaged or deleted, an option on the RESTORE command allows only the recovery history file to be restored. The recovery history file can then be reviewed to provide information on which backup to use to restore the database.

For example, we restore the history file for our sample database with:

db2 restore database sample history file

11.4.1 Events recorded in the history file

The following events are recorded in the history file:

■ Backup
■ Restore
■ Rollforward
■ Load
■ Quiesce of a tablespace
■ Alter tablespace
■ Dropped table (when dropped table recovery enabled)
■ Reorganization of a table
■ Update of table statistics

Note: The size of the file is controlled by the REC_HIS_RETENTN database configuration parameter that specifies a retention period (in days) for the entries in the file. Even if the number for this parameter is set to 0, the most recent full database backup, plus its restore set, is kept. (The only way to remove this copy is to use the PRUNE with FORCE option.) The retention period has a default of 366 days. The period can be set to an indefinite number of days by using -1. In this case, explicit pruning of the file is required.
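The retention rules in the note can be sketched as a filter over history entries. This is a simplified model with assumptions: the entry layout and function name are hypothetical, ages are counted in whole days, an entry exactly as old as the retention period is kept, and the restore set that belongs to the most recent full backup is not modeled.

```python
def prune_history(entries, today, rec_his_retentn):
    """Return the history entries that survive the retention period.

    entries: list of dicts with the keys 'day' (int) and 'op' (str).
    rec_his_retentn: retention in days, or -1 for indefinite retention.
    """
    if rec_his_retentn == -1:
        return list(entries)                  # explicit pruning is required
    full = [e for e in entries if e["op"] == "full backup"]
    newest_full = max(full, key=lambda e: e["day"]) if full else None
    return [e for e in entries
            if today - e["day"] <= rec_his_retentn or e is newest_full]
```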


11.4.2 Data recorded in the history file

The following data is recorded in the history file:

• Object affected (database, tablespace, or table)
• Location and device type of output (backup image or load copy)
• The status of the backup: active, inactive, expired, or deleted
• Range of relevant log files
• Start and completion time of event
• Resulting SQLCA

11.5 Restoring an offline backup without rollforward

A database can be restored without rolling forward by using a backup that was created with the offline option (this is the default). We show the restore and what happens to the files that were linked after the backup was taken. The steps are:

1. Select data in the Data Link table.
2. List files on the DLFM server.
3. Insert a new row into the Data Link table.
4. Select data in the Data Link table.
5. List files on the DLFM server.
6. Restore to the backup taken before the new row was inserted.
7. Display a report of unlinked files from a fast reconcile.

We use a SELECT statement with a Data Link function to show the data in the table before inserting new data. The data shown in Figure 11-10 is the data that is on our backup image, for example:

db2 'select dlurlpathonly(picture) from db2inst1.resident'

Note: When an ONLINE backup is taken, the history file shows the EARLIEST LOG and the CURRENT LOG. These define the minimum range of logs required for the restore to complete. You have to rollforward through these logs to move the database out of the ROLLFORWARD PENDING state. Therefore, it is important to ensure that these logs are archived in a safe place and can be retrieved in the event of a recovery.


Figure 11-10 Selecting results prior to insert and restore

Figure 11-11 shows the files in the file system that is under the control of DLFF. Note the permissions before the insertion of a new row.

Figure 11-11 The ls results of the Data Link file system prior to insert

We insert a new row as shown in Figure 11-12 and display the contents of the table after the insert.


Figure 11-12 Inserting and selecting after a new link

In Figure 11-13, the file pic6.bmp is now linked and under the control of DLFM. The only way to access this file is as a root user or by using an access token generated by the DB2 database. Refer to 2.3.3, “How access tokens work” on page 32, for more information on access tokens. It seems like the dlfm user would be able to access the files because the permissions show read for dlfm. In fact, the DLFF will block access to the files from the dlfm user even though the files, when listed, show read permission.


Figure 11-13 List files after the link operation has completed

The next step is to restore the database from the backup. The file that was linked after the backup is unlinked and returned to its original state. If the DATALINK column had been defined with the ON UNLINK DELETE option, the file would have been deleted. In this example, we did not have to run the Reconcile utility because fast reconcile was run. Figure 11-14 shows the restore and the message that says pic6.bmp was unlinked.

Figure 11-14 Restore command and files that were unlinked

The following steps summarize the Restore to an offline backup:

1. An offline backup of the DB2 UDB database that has a table with a DATALINK column is taken (T1). See Figure 11-15.

2. A row is inserted into the table that has a DATALINK column (T2).

3. The backup image taken at time T1 is restored without rolling forward (T3).


4. Fast reconcile is run, and it unlinks the file for the row inserted at T2 on the DLFM server (T4). This occurs because the row inserted at T2 is not in the backup image taken at T1.
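The net effect of steps 1 through 4 can be pictured as a set difference. The sketch below is a toy model in Python, not how DB2 or DLFM implements fast reconcile; the function name is hypothetical, and the set of files linked at T1 is assumed for illustration (only pic6.bmp is taken from the example above).

```python
def fast_reconcile(linked_on_dlfm: set, linked_in_backup: set) -> set:
    """Return the files to unlink on the DLFM server: files that are
    linked there now but are not referenced by the restored backup image."""
    return linked_on_dlfm - linked_in_backup


# T1: links recorded in the offline backup image (assumed file names).
backup_t1 = {"pic1.bmp", "pic2.bmp"}

# T2: a row is inserted, which links pic6.bmp on the DLFM server.
dlfm_t2 = backup_t1 | {"pic6.bmp"}

# T3/T4: the T1 image is restored and fast reconcile compares both sides.
unlinked = fast_reconcile(dlfm_t2, backup_t1)
print(sorted(unlinked))
```

In this model only pic6.bmp is reported, which corresponds to the unlink message shown in Figure 11-14.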

Figure 11-15 Restore of an offline backup

11.6 Restoring and rolling forward to a point in time

The steps for a restore and rollforward to a point in time with Data Links are:

1. Find the most recent backup taken prior to the point in time to which we will restore.
2. Restore the database.
3. Rollforward to the backup time to obtain the minimum rollforward time.
4. Rollforward to minimum CUT time plus 5 minutes.
5. Reconcile.


Note: The point in time recovery must be to a Coordinated Universal Time (CUT).

Note: Make sure that the LOGRETAIN database configuration parameter has been set to ON and that RECOVERY is set to YES for the corresponding DATALINK column.


We use the db2 list history backup all for db dlrestor command to produce the results shown in Figure 11-16. The point in time we restore to is a time greater than the end time of the backup. The time we use for the rollforward is 2001-05-15-11.54.07.

Figure 11-16 List history to find backup and point in time

We restore the database using information from the list history command in Figure 11-16. This restore is to a backup and the rollforward is what will make it a point in time recovery. We leave the WITHOUT ROLLING FORWARD clause off the restore command. The type of backup we use for the restore is shown in Figure 11-16. It shows “F” under Type, which means offline. If the Type is “N”, this shows that the backup is an online backup. Restoring to an online backup always requires a rollforward command. In our example, we rollforward to a point in time just slightly after the backup was taken. A fast reconcile will not take place because it will not run when the database is in the rollforward pending state. We must run the rollforward command. Figure 11-17 shows the restore and its output.


Figure 11-17 Restore with rolling forward and rollforward pending status

Figure 11-18 shows a method to derive the CUT time to be used in the rollforward command. We actually use the rollforward command with the backup timestamp to come up with the minimum CUT time for the rollforward command. This may be easier than trying to figure out how many hours to add or subtract from the time zone you are in. In this case, CUT time is 8 hours greater than the backup time.

Figure 11-18 Rollforward to obtain minimum CUT time

Using the CUT time obtained in Figure 11-18, we add 5 minutes to the time for the rollforward. The user has requested the restore to be “right before noon” on 15 May 2001. The time we use is 2001-05-15-19.59.07. The rollforward in Figure 11-19 places the table in Datalink Reconcile Pending (DRP) status.
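The timestamp arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation: the function name is hypothetical, the 8 hour offset is the one observed in this example (it depends on your time zone), and we assume that 2001-05-15-11.54.07, the value quoted earlier, is the local timestamp being converted.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the rollforward examples: yyyy-mm-dd-hh.mm.ss
DB2_TS_FORMAT = "%Y-%m-%d-%H.%M.%S"


def rollforward_target(local_ts: str, cut_offset_hours: int,
                       margin_minutes: int = 5) -> str:
    """Convert a local timestamp to CUT and add a safety margin."""
    local = datetime.strptime(local_ts, DB2_TS_FORMAT)
    cut = local + timedelta(hours=cut_offset_hours, minutes=margin_minutes)
    return cut.strftime(DB2_TS_FORMAT)


# 11.54.07 local time + 8 hours (CUT) + 5 minutes
target = rollforward_target("2001-05-15-11.54.07", cut_offset_hours=8)
print(target)  # 2001-05-15-19.59.07
```

The result matches the rollforward time used in this example. Letting the rollforward command report the minimum CUT time, as shown in Figure 11-18, remains the safer way to obtain the starting value.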


Figure 11-19 Rollforward and log messages

Figure 11-20 shows the message received when a select is issued to retrieve data from the table that is in DRP.

Figure 11-20 Select statement with warning message

Now we run the reconcile command. The reconcile command removes the table from DRP status and makes the table fully accessible. Figure 11-21 shows the reconcile command and its output.


Figure 11-21 Reconcile command and log messages

To restore and rollforward to a point in time, the following steps were taken:

1. Restore a database backup that was taken earlier than T1 (Figure 11-22).

2. Rollforward the database to a point in time (T2). A scenario like this is useful if you want to recover to a point before a damaged log or before unwanted data was inserted into the database.

3. When the rollforward is complete, the table is placed into Datalink Reconcile Pending (DRP) state (T3). This occurs because there could be files linked on the DLFM server that need to be unlinked, or vice versa.

4. Run the reconcile utility to synchronize the DB2 Database with the DLFM database (T4).


Figure 11-22 Restore and rollforward to a point-in-time

11.7 Tablespace recovery

In the next example, we simulate losing the backup (archive) copies of the Data Link files and losing one of the Data Link files from the Data Link file system. This situation is unlikely, but we wanted to illustrate what the reconcile command does in this case. The steps we illustrate are:

1. Delete the files in dlfm_backup.
2. Delete one linked file from /dldata2/sys_pics.
3. Restore tablespace userspace1.
4. Rollforward the tablespace online.
5. Use db2dart to show the Data Link reconcile pending status.
6. Run reconcile.
7. Display the report.exp file.

In Figure 11-23, as the root user, we delete all of the files in the dlfm_backup directory. We also delete the file pic2.bmp, which is linked to the database. Without a backup of pic2.bmp in dlfm_backup, this data is not recoverable by DB2. We must take manual steps to recover.


Figure 11-23 Removing dlfm_backup files and removing a Data Linked file

We restore the tablespace that contains the table with Data Links. Notice that when we list the /tmp/dlreport file, no files were unlinked during the restore. Unlike the restore of an offline backup without rolling forward (see 11.5), this type of restore does not perform fast reconciliation. After the restore, we run the rollforward and see the message about the table being in the DRP/DRNP state. Figure 11-24 shows the restore and rollforward.

Figure 11-24 Tablespace restore and rollforward


An alternative to checking the db2diag.log for tables in DRP/DRNP status is to run the db2dart utility. We issue db2dart dlrestor after making sure there are no connections to the database. Figure 11-25 shows that table db2inst1.resident is in Datalink Reconcile Pending status.

Figure 11-25 Using db2dart to see the table status of DRP

Before we run the reconcile command, we select rows from the table and receive a warning message. We cannot use insert, update, or delete on the table at this point. To make the table usable, we must run reconcile. Figure 11-26 shows a SELECT statement and the DRP status.

Note: The tablespace is placed in the backup pending state after a rollforward to a point in time. We are allowed to run a database backup while the table is in the DRP state.


Figure 11-26 Selecting the data before reconcile is run

The Reconcile utility finds that there are two rows that are exceptions. The row that links pic2.bmp is an exception because the file was deleted from the file system and backup directory. Reconcile sets the DATALINK value to null. To recover this row, we must first restore the file /dldata2/sys_pics/pic2.bmp from the daily backup of this file system. Then we must do an update on the DATALINK column that will link the column again. The file pic1.bmp is also an exception. Even though the file exists, the permissions and time do not match with the DLFM meta data, and therefore, the DATALINK value is set to null.

Figure 11-27 shows the reconcile command and the output it produced.
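The two exception cases described above can be pictured with a small model. This is an illustrative sketch in Python only, not the Reconcile utility's actual logic: the reconcile function, the FileMeta class, the permission and time values, and the row for pic3.bmp are all hypothetical. The model also assumes that a file missing from the file system but present in the backup directory would not become an exception.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical subset of the meta data kept for a linked file."""
    mode: str
    mtime: str


def reconcile(rows: Dict[int, str],
              dlfm_meta: Dict[str, FileMeta],
              file_system: Dict[str, FileMeta],
              backup_dir: Dict[str, FileMeta]
              ) -> Tuple[Dict[int, Optional[str]], List[Tuple[int, str, str]]]:
    """Return the DATALINK values after reconcile plus an exception list."""
    result: Dict[int, Optional[str]] = {}
    exceptions: List[Tuple[int, str, str]] = []
    for row_id, path in rows.items():
        on_disk = file_system.get(path)
        if on_disk is None and path not in backup_dir:
            # File is gone from the file system and the backup directory.
            result[row_id] = None
            exceptions.append((row_id, path, "file missing, no backup copy"))
        elif on_disk is not None and on_disk != dlfm_meta[path]:
            # File exists, but permissions or time differ from the meta data.
            result[row_id] = None
            exceptions.append((row_id, path, "permissions or time mismatch"))
        else:
            # Assumption: the link is kept in every other case.
            result[row_id] = path
    return result, exceptions


base = "/dldata2/sys_pics/"
rows = {1: base + "pic1.bmp", 2: base + "pic2.bmp", 3: base + "pic3.bmp"}
linked = FileMeta(mode="r--r--r--", mtime="2001-05-15-10.00.00")
dlfm_meta = {path: linked for path in rows.values()}
file_system = {
    base + "pic1.bmp": FileMeta(mode="rw-r--r--", mtime="2001-05-16-09.00.00"),
    base + "pic3.bmp": linked,
}   # pic2.bmp was deleted from the file system
backup_dir: Dict[str, FileMeta] = {}   # dlfm_backup was emptied

new_rows, exceptions = reconcile(rows, dlfm_meta, file_system, backup_dir)
for row_id, path, reason in exceptions:
    print(row_id, path, reason)
```

In this model, the rows for pic1.bmp and pic2.bmp end up with a null DATALINK value, as in the example, while the hypothetical pic3.bmp row stays linked.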


Figure 11-27 Reconcile and the exceptions

Figure 11-28 The ddl to create the exception table for reconcile

Figure 11-29 shows a SELECT statement that displays information from the exception table. The documentation for the msg column can be found in DB2 UDB Command Reference, SC09-2951, under the Reconcile command.

Note: We suggest that you also use an exception table when you run reconcile. The exception table makes it easier to rebuild the data than the reconcile report does. Figure 11-28 illustrates a CREATE TABLE statement for the exception table.


Figure 11-29 Information from the exception table for the reconcile

Figure 11-30 shows the rows that have had the value set to null in the DATALINK column.


Figure 11-30 Selecting the data after reconcile has run

The steps to use for a tablespace recovery are:

1. A linked file on the Data Links File System (DLFS) is lost (T1) (Figure 11-31).

2. Restore a file system backup (T2) taken prior to T1 on the DLFM server.

3. Restore a tablespace backup to a point-in-time on the DB2 server (T3).

4. Run the Reconcile utility to synchronize the DB2 database with the DLFM database (T5).


Figure 11-31 Tablespace recovery scenario

11.8 Recovering the dlfm_db to a point in time

This section shows how to recover the dlfm_db to a point in time. We inadvertently deleted all of the rows from the DFM_FILE table in the dlfm_db. We must restore and rollforward to the point in time just prior to our delete and then reconcile all of the databases that are defined to the dlfm_db. This highlights one of the problems with defining multiple databases to one DLFM. Any files linked after the point in time to which we recover will be set to null by reconcile. Files that were unlinked after that point in time will appear as linked in the dlfm_db meta data until we run reconcile.

Figure 11-32 illustrates the restore command. We must issue dlfm stop before the restore command can work. DLFM maintains persistent connections to the database dlfm_db.


Figure 11-32 Restore command and dlfm stop

Figure 11-33 shows the rollforward command. After we rollforward, we must start DLFM. For this, we issue the dlfm start command.

Figure 11-33 Rollforward and messages

The next step is to reconcile all of the databases defined to dlfm. To find out what these are, we run the dlfm list registered databases command. Figure 11-34 shows this command and the output.


Figure 11-34 The list registered databases output

Using the list we obtained from Figure 11-34, we run reconcile for each registered database. As the instance owner, we run db2_recon_aid in check mode to find out which tables have DATALINK columns. Figure 11-35 shows db2_recon_aid with the -check option and also without the -check option that runs the reconcile utility.

Figure 11-35 The db2_recon_aid utility and output

The steps taken to recover the dlfm_db database to a point in time are:

1. At time T1 (Figure 11-36), the rows in the DFM_FILE table in the dlfm_db database are deleted.

2. At time T2, we restore a database backup of the dlfm_db.

3. At time T3, we rollforward the dlfm_db database to a point-in-time before the rows in the table were deleted at T1.

Note: In this recovery scenario, we run the Reconcile utility to make the meta data in dlfm_db reflect what is in the tables with Data Links. We did not restore the four databases that contained DATALINK columns. Any changes to those databases that were done between the time of the rollforward and the present time will not be reflected in the dlfm_db until reconcile is run.


4. At time T4, we run the db2_recon_aid or Reconcile utility on the DB2 UDB database servers that reference the DLFM server that was recovered at time T2.

Figure 11-36 DLFM_DB database point-in-time recovery


Chapter 12. Garbage collection

This chapter describes the garbage collection process in a Data Links File Manager (DLFM) environment.


12.1 Garbage collection

Garbage collection is a process by which DB2 monitors database backups in the database History File. It marks the backup as being active, inactive, or expired and reclaims expired database backups.

An active backup can be used to restore and rollforward through the current database logs to bring the database to the current state. An active backup is associated with the current log sequence and should be retained.

An inactive backup cannot be restored and rolled forward to reach the current state of the database because it requires a different set of log files. An inactive backup is associated with a previous log sequence or log chain and should be retained.

All database backups that are no longer needed are marked as “expired”. A backup is considered no longer needed when the number of more recent database backups has reached the value of the NUM_DB_BACKUPS database configuration parameter. For example, if you have NUM_DB_BACKUPS set to four and have taken four backups, the “oldest” backup will be marked as “expired” when you take the fifth backup (Figure 12-1).

Note: DB2 Universal Database maintains transaction log chains or Log Sequence (LS). A log chain represents a life of the database defined by a unique set of transaction logs. All of the log records in these logs have been applied to the database.

A new log chain or LSN is created by:

• A database rollforward to a point in time
• A database restore without rolling forward

After a new transaction log chain is created, there is a new version of the transaction log files. Since there can be more than one version of a log file, DB2 Universal Database must keep track of which log files belong to which chain. We recommend you make backup copies of log files before a point-in-time recovery or a restore without a rollforward recovery.


Figure 12-1 Expired database backups

For all backups that are marked as “expired”, the related linked file backups and the related meta data on the Data Links File Manager server can be deleted because they are no longer needed.

Each Data Links server has its own garbage collector. DB2 garbage collection monitors the number of DB2 database backups that are kept.

The DB2 garbage collector is invoked after each:

• Backup
• Restore
• Drop database or tablespace
• Drop table

When a database backup is taken on the DB2 UDB database server that has tables with DATALINK columns, the Garbage Collector daemon (the process that performs the garbage collection) on the DLFM server is invoked.

When a database backup and its related file backups are ready to be deleted (ready to be garbage collected), the DB2 garbage collection routine marks the history file entries for the database backup, all associated table space backups, and all associated load backup copies as “expired”. The routine also notifies all Data Links servers to delete all the associated files unlinked before this backup.

After every full database backup, the REC_HIS_RETENTN database configuration parameter is used to prune expired entries from the history file (that is, the entries are deleted from the history file). If a backup is pruned that is not expired, all Data Links servers are contacted to garbage collect the corresponding set of file backups.

Note: The DB2 garbage collector does not delete the physical database backups. The garbage collector deletes Data Link file backups and related meta data.


The PRUNE HISTORY command prunes only backups that are marked as expired from the history file unless the WITH FORCE OPTION is used. If a backup is pruned that is not expired, all Data Links servers are contacted to garbage collect the corresponding set of file backups. The PRUNE HISTORY command allows pruning of just backups (to include database, table space, load copy, and log). Entries marked as “expired” are pruned.

DB2 garbage collection is also invoked when a database backup is restored (with or without rolling forward). If an active database backup is restored, but it was not the most recent database backup recorded in the history file, any subsequent database backups that belong to the same log sequence are marked as inactive. If an inactive database backup is restored, any inactive database backups that belong to the current log sequence are changed back to active state. DB2 garbage collection then contacts all Data Links Servers to make the same status changes to the corresponding set of file backups.

12.2 Garbage collection scenario

The following diagrams give an example of how DB2 garbage collection works.

Assume that the current value of NUM_DB_BACKUPS is 4. That is, DB2 must retain up to four database backups associated with the current log chain or log sequence (Figure 12-2).

Figure 12-2 Four database backups are taken

• At time t1, take database backup BK1. Log sequence is LSN1.
• At time t2, take database backup BK2. Log sequence is LSN1.
• At time t3, take database backup BK3. Log sequence is LSN1.
• At time t4, take database backup BK4. Log sequence is LSN1. There are now four active database backups for log sequence LSN1.

A new log sequence or log chain is created when an active database backup is restored (Figure 12-3).


Figure 12-3 Active database backup being restored

• At time t5, restore active database backup BK2 and roll forward to a point before database backup BK3.

• This breaks the current log sequence LSN1 and starts log sequence LSN2.

• There are two active database backups associated with log sequence LSN2: BK1 and BK2.

• DB2 garbage collection marks database backups BK3 and BK4 as inactive (because they are in a previous log sequence).

All backups taken after a restore have the new log sequence (Figure 12-4).

Figure 12-4 Database backups taken with a new log sequence number

• At time t6, take database backup BK5. Log sequence is LSN2.
• At time t7, take database backup BK6. Log sequence is LSN2.
• There are now four active database backups for log sequence LSN2.


The DB2 garbage collector marks a backup as expired when the backup is older than the oldest active backup (Figure 12-5).

Figure 12-5 Backup (BK1) is marked as expired

• At time t8, take database backup BK7. Log sequence is LSN2.

• DB2 garbage collection marks database backup BK1 as expired (because it is older than the oldest active backup).

A new log sequence is created when an earlier backup is restored (Figure 12-6).

Figure 12-6 New log sequence created after restore of backup (BK6)

• At time t9, restore active database backup BK6 and roll forward to a point before database backup BK7.

• This breaks the current log sequence LSN2 and starts log sequence LSN3.

• There are three active database backups associated with log sequence LSN3: BK2, BK5, and BK6.


• DB2 garbage collection marks database backup BK7 as inactive (because it is not in the active chain and belongs to a previous log sequence).

Backup (BK2) is marked as expired by the DB2 garbage collector when two additional backups are taken (Figure 12-7).

Figure 12-7 Garbage collection marks backup BK2 as expired

• At time t10, take database backup BK8. Log sequence is LSN3.

• At time t11, take database backup BK9. Log sequence is LSN3.

• DB2 garbage collection marks database backup BK2 as expired.

• There are now four active database backups for log sequence LSN3: BK5, BK6, BK8, and BK9.

When a database backup falls out of the log sequence or active chain and becomes expired, all other backups taken before this database backup must become expired too (Figure 12-8).

Figure 12-8 All backups prior to and including BK5 are marked as expired


At time t12, take database backup BK10 on log sequence LSN3. DB2 garbage collection marks the following database backups as expired: BK5, BK3, and BK4. BK5 falls out of the log sequence or active chain and becomes expired, so all other backups taken before BK5 must become expired too.
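The sequence from t1 to t12 can be replayed with a small simulation. This is a simplified model written for illustration in Python, not the DB2 garbage collector: the BackupHistory class and its method names are hypothetical, backups are identified only by their number, and the model covers only the case in which an active backup is restored and rolled forward to a point before the next backup.

```python
from typing import Dict

NUM_DB_BACKUPS = 4   # value assumed in this scenario


class BackupHistory:
    """Toy model of the backup states: active, inactive, expired."""

    def __init__(self, keep: int) -> None:
        self.keep = keep
        self.status: Dict[int, str] = {}   # backup number -> state

    def backup(self) -> int:
        """Take a new backup; expire the oldest active one if over the limit."""
        number = len(self.status) + 1
        self.status[number] = "active"
        active = [n for n, s in self.status.items() if s == "active"]
        if len(active) > self.keep:
            oldest = active[0]
            # The oldest active backup and everything taken before it expire.
            for n in self.status:
                if n <= oldest:
                    self.status[n] = "expired"
        return number

    def restore_and_rollforward(self, restored: int) -> None:
        """Restore an active backup and roll forward to a point before the
        next backup: later active backups fall into the previous log sequence."""
        for n, s in self.status.items():
            if n > restored and s == "active":
                self.status[n] = "inactive"


history = BackupHistory(NUM_DB_BACKUPS)
for _ in range(4):                      # t1 to t4: BK1 to BK4
    history.backup()
history.restore_and_rollforward(2)      # t5: restore BK2
history.backup()                        # t6: BK5
history.backup()                        # t7: BK6
history.backup()                        # t8: BK7, BK1 expires
history.restore_and_rollforward(6)      # t9: restore BK6
history.backup()                        # t10: BK8
history.backup()                        # t11: BK9, BK2 expires
history.backup()                        # t12: BK10, BK5, BK3, and BK4 expire

print({f"BK{n}": s for n, s in history.status.items()})
```

At the end of the run, BK1 through BK5 are expired, BK7 is inactive, and BK6, BK8, BK9, and BK10 are active, which agrees with the states described for Figure 12-8.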

Inactive database backups may become active because the backups are retained (Figure 12-9).

Figure 12-9 Inactive databases may become active because they are retained

In Figure 12-9, the database backup BK10 was taken. But let’s consider a different scenario here.

At time t12, we are not taking a backup, but we restore inactive database backup BK4 and rollforward to a point in time past the end of database backup BK5. DB2 garbage collection will mark database backups BK3 and BK4 as active and database backups BK5, BK6, BK7, BK8, and BK9 as inactive. There will be two active database backups in the new log sequence LSN4: BK3 and BK4. Notice that only inactive backups may become active because they are retained.


Chapter 13. Migrating to DB2 UDB Version 7

This chapter describes the process of migrating existing DB2 Universal Database Version 5.x and DB2 Universal Database Version 6.x databases to DB2 Universal Database V7.x in a Data Links Manager environment. Moving from Version 7.1 to Version 7.2 on the database server or the Data Links Manager server is not considered a migration, but rather an upgrade.

Before you attempt a migration, consider the following points:

• Data Links File Manager can be migrated to the current release on the AIX and Windows NT platforms.

• The Solaris version of Data Links Manager has only been made generally available with Version 7.1.

• The Windows version of Data Links Manager has only been made generally available with Version 6.1.

• The DB2 database server must have exactly the same fixpack level as the Data Links File Manager (DLFM) components.


13.1 Migration options

There are two methods by which a migration can be performed:

• Migration of the UDB database server and Data Links server using the instance migration scripts (db2imigr) and the migrate database command

• Migration of the databases by using an offline DB2 database backup

13.1.1 DB2IMIGR and MIGRATE database commands

Migration to Version 7.x in a Data Links File Manager environment requires you to perform the following steps for each DB2 Universal Database server and Data Links Manager instance:

• Database instance and database migration on the DB2 UDB database server
• Data Links instance and database migration on the Data Links server

The steps in the following sections are required to perform a successful migration of the DB2 UDB database server and the Data Links File Manager (DLFM) server. In our example, we migrate a Version 5.x DB2 UDB database server and DLFM server to Version 7.2.

Migrating the DB2 UDB V5.x Database Server (AIX)

Perform the following steps to migrate a DB2 UDB Version 5.x database server to Version 7.2:

1. Log on to the DB2 UDB database server as the db2 instance owner.

2. To ensure that you are attached to the instance that contains the database that has tables with data link columns, issue the command:

db2 get instance

In our example, the db2 get instance command returns db2inst1.

Note: You can issue the db2ilist command to list the names of all the instances on the DB2 UDB database server. In our example, there are two instances, namely db2inst1 and target. We do not have to migrate the target instance at this time.

3. To ensure that there are no applications connected to the database that you want to migrate, issue the command:

db2 list applications

If all applications are disconnected from the database, the following warning should be returned:

SQL1611W No data was returned by the Database System Monitor SQLSTATE=00000

4. Issue the db2stop command and run the db2dart utility against the database to be migrated in inspection mode. On our system, we invoked the utility as follows:

db2dart sample /db

Here sample is the name of the database and /db is an argument to inspect the entire database.

The db2dart utility can be found in /instancehome/sqllib/adm. The db2dart utility generates a report file and an error file. The report file is generated in the path in which the db2dart utility is executed and has the dbname.rpt naming convention.

In our example, our report file is called SAMPLE.RPT. Once the database inspection is complete, open the report file with an editor and examine the contents. The bottom of the report should contain the entry shown in Figure 13-1 if no problems were found in the database.

Figure 13-1 DB2DART utility output reporting no errors

Note: The db2dart utility verifies that the architectural integrity of the database is correct. For example, this tool confirms that:

• The control information is correct.
• There are no discrepancies in the format of the data.
• The data pages are the correct size and contain the correct column types.
• Indexes are valid.

It is important that you run the db2dart utility against the database while there are no connections to the database.

5. Start the database manager instance with the db2start command.

6. Take an offline backup of the database that will be migrated. In our example, we take offline database backups to disk.

db2 backup database sample to /dbbackups

The migration process does not migrate database transaction logs.

7. Install the DB2 Universal Database EE Version 7.x software. In our example, we installed DB2 Universal Database Version 7.2. For details on installing on the AIX platform, refer to IBM DB2 UDB for UNIX Quick Beginnings, GC09-2970.

8. Log on as the DB2 UDB Version 5.x database instance owner and ensure that the database manager is stopped. Issue the db2stop command if necessary. Change to the /usr/lpp/db2_07_01/bin directory.

9. Execute the db2ckmig utility to verify that the database can be migrated. Do not execute the utility as root. Usage notes for the utility can be found by running db2ckmig with no arguments. We run db2ckmig on the sample database (Figure 13-2).

Figure 13-2 Verifying that the database can be migrated with the db2ckmig utility

The migrate.log file in our example is empty since the db2ckmig utility completed without any errors.

10. Log on as a user with root authority.

11. Execute the db2imigr utility to migrate the Version 5.x instance to a Version 7.x instance. The db2imigr utility can be found in the /usr/lpp/db2_07_01/instance directory. Usage notes for the utility can be determined by running the db2imigr utility without any arguments. We will run db2imigr on the db2inst1 instance (Figure 13-3).

db2imigr -u db2fenc1 db2inst1

Figure 13-3 Instance migration using the db2imigr utility

12. Log on as the instance owner on the DB2 UDB database server. Any attempt to connect to the database returns the SQL5035N error message (Figure 13-4).

Figure 13-4 Connecting to a database that requires migration

13. Migrate the database using the migrate database command. We run the migrate database command on our sample database:

db2 migrate database sample

A successful migration results in the message shown in Figure 13-5.

Figure 13-5 Successful migration of the database using the migrate command

14. Update the database manager configuration parameter to enable Data Links functionality:

db2 update database manager configuration using datalinks yes
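The instance-owner checks in steps 2 through 6 above can be collected into a small shell sketch. This is an illustration only, under stated assumptions: the helper names run, no_connections, and premigration_checks are ours and are not part of DB2, and with DRY_RUN left at its default of 1 the script prints the commands from the text instead of executing them. It performs no error handling; each result still has to be inspected as the steps describe.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) only print the command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds when the text passed in $1 (output captured from
# "db2 list applications") contains SQL1611W, the warning the text
# gives as the sign that no applications are connected.
no_connections() {
    case "$1" in
        *SQL1611W*) return 0 ;;
        *) return 1 ;;
    esac
}

# Steps 2 to 6, run as the instance owner.
# $1 = database name, $2 = backup directory.
premigration_checks() {
    db=$1
    backup_dir=$2
    run db2 get instance
    run db2 list applications
    run db2stop
    run db2dart "$db" /db
    run db2start
    run db2 backup database "$db" to "$backup_dir"
}

# Dry run with the names used in our example.
premigration_checks sample /dbbackups
```

The steps that require root authority (db2imigr) are deliberately left out, since they run under a different user ID.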

Note: In Version 5.x, the DATALINKS configuration parameter was exported as an environment variable. In Version 6.x and later, the DATALINKS configuration parameter is incorporated in the database manager configuration file.

Migrating the V5.x Data Links File Manager (AIX)

Complete the following steps to migrate the V5.x Data Links File Manager:

1. Log on to the Data Links File Manager Server as a user with root authority. Install the DB2 Data Links File Manager Version 7.x software. In our example, we install DB2 Universal Database Version 7.2. For details on installing on the AIX platform, refer to DB2 Data Links Manager Quick Beginnings, GC09-2966.

2. Log on as the Data Links Administrator on the Version 5.x Data Links File Manager instance.

3. To ensure that there are no applications connected to the Data Links File Manager database, issue the command:

db2 list applications

If all applications are disconnected from the database, the following warning should be returned:

SQL1611W No data was returned by the Database System Monitor SQLSTATE=00000

4. Issue the db2stop command and run the db2dart utility against the database to be migrated in inspection mode. On our system, we invoked the utility as:

db2dart dlfm_db /db

Here dlfm_db is the name of the database, and /db is an argument to inspect the entire database.

The db2dart utility can be found in /instancehome/sqllib/adm. The db2dart utility generates a report file and an error file. The report file is generated in the path in which the db2dart utility was executed and has the dbname.rpt naming convention. Ensure that there are no errors by examining the db2dart report.

5. Take an offline backup of the Data Links File Manager Database (dlfm_db). In our example, we take an offline backup to disk.

db2 backup db dlfm_db to /backups

6. Stop the Data Links File Manager with the dlfm_shutdown command.

7. Change to the /usr/lpp/db2_07_01/bin directory. Execute the db2ckmig utility to verify that the database can be migrated. Do not run the utility as root. Usage notes for the utility can be found by running db2ckmig with no arguments. We run db2ckmig on the dlfm_db database (Figure 13-6).

Figure 13-6 Verifying that the database can be migrated with the db2ckmig utility

8. Issue a dlfm_see command to ensure that the Data Links File Manager is stopped.

9. Log on as a user with root authority. Unmount the dlfs file system that is under the control of the Data Links File System Filter. In our example, we issue the umount /v5data command.

10. As a user with root authority, execute the db2imigr utility to migrate the Data Links File Manager Instance to Version 7.2 (Figure 13-7). The db2imigr utility can be found in the /usr/lpp/db2_07_01/instance directory.

Figure 13-7 Instance migration using the db2imigr utility

11. Log on as the Data Links File Manager Administrator and migrate the dlfm_db database:

db2 migrate database dlfm_db

12. Execute the db2dlmmg utility from the /usr/lpp/db2_07_01/adm/ directory to start the DLFM migration (Figure 13-8). The db2dlmmg utility:

– Binds the migration package.
– Backs up the DLFM_DB database.

Figure 13-8 Successful migration of the DLFM instance

13. Issue the db2set command to determine if the environment variables from Version 5.x were converted to registry variables in Version 7.2. In our example, we received the output shown in Figure 13-9.

Figure 13-9 Output of the db2set command

14. In our example, we set two additional registry variables that were not set by the migration utility:

db2set DLFM_BACKUP_TARGET=LOCAL
db2set FS_ENVIRONMENT=NATIVE

You may need to set these registry variables as well.

15. Log on as a user with root authority, and ensure that the Data Links File System Filter is loaded. In our example, we queried the driver by executing the strload command:

strload -q -f /usr/lpp/db2_07_01/cfg/dlfs_cfg

If the Data Links File System Filter is not loaded, load it by running strload with the same -f argument but without the -q flag. (The -u flag unloads the filter rather than loading it.)

16. As root, mount the file system that is to be under the control of the Data Links File System Filter. In our example, we issued:

mount -v dlfs /v5data

17. Log on as the Data Links File Manager Administrator and issue the command:

dlfm start

18. Verify that the Data Links File Manager is running with the dlfm see command. The migration of the Data Links File Manager is now complete.
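The registry check in steps 13 and 14 can be sketched as a small filter. The function name missing_registry_vars is ours, and the sketch assumes that db2set prints one NAME=value pair per line; the actual output in Figure 13-9 is not reproduced here, so treat that format as an assumption and adjust the pattern if your output differs.

```shell
#!/bin/sh
# Reads db2set output on stdin (assumed format: NAME=value, one per line)
# and prints each variable named in the arguments that is not set.
missing_registry_vars() {
    settings=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$settings" | grep -q "^${name}="; then
            echo "$name"
        fi
    done
}

# Stand-in input for illustration; on a real system, pipe db2set instead:
#   db2set | missing_registry_vars DLFM_BACKUP_TARGET FS_ENVIRONMENT
printf 'DLFM_BACKUP_TARGET=LOCAL\n' |
    missing_registry_vars DLFM_BACKUP_TARGET FS_ENVIRONMENT
```

With the stand-in input, only FS_ENVIRONMENT is reported, which is the variable you would then set with db2set as in step 14.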

13.1.2 Migrating the DB2 UDB V6.x database server

In our example, we migrate the DB2 UDB database server and the Data Links File Manager from Version 6.1 to Version 7.2. On the Windows platform, Data Links File Manager became generally available (GA) in Version 6.1.

Note: Version levels of DB2 Data Links and DB2 Universal Database can be any combination of Version 6.1 and Version 7.x. For example, DB2 Universal Database can be at Version 6.1 and Data Links Manager can be at Version 7.2.

While this configuration is supported, we recommend you have both the Data Links File Manager and the DB2 Universal Database at the same release level and fixpack level. The advantages of being on the same release and fixpack level are:

• Products at the same release provide the same functionality.

• Fixpacks are release dependent, and having the DB2 Universal Database and Data Links Manager at the same release makes upgrades easier.

• Support for an earlier release may be discontinued.

• Troubleshooting problems becomes easier if the DB2 Universal Database and the Data Links Manager are at the same release and fixpack level.

Complete these steps:

1. Log on to the DB2 UDB database server as the db2 instance owner.

2. To ensure that you are attached to the instance that contains the database that has tables with data link columns, issue the command:

db2 get instance

In our example, the db2 get instance command returns db2inst2.

Note: You can issue the db2ilist command to list the names of all the instances on the DB2 UDB database server. In our example, there are two instances, namely db2inst2 and target. We do not have to migrate the target instance at this time.

3. To ensure that there are no applications connected to the database that we want to migrate, issue the following command:

db2 list applications

If all applications are disconnected from the database, the following warning should be returned:

SQL1611W No data was returned by the Database System Monitor SQLSTATE=00000

4. Issue the db2stop command and run the db2dart utility against the database to be migrated in inspection mode. On our system, we invoked the utility as:

db2dart sample /db

Here sample is the name of the database, and /db is an argument to inspect the entire database.

The db2dart utility can be found in drive:\path\sqllib\bin. The db2dart utility generates a report file and an error file. The report file is generated in the path in which the db2dart utility was executed and has the dbname.rpt naming convention.

In our example, our report file is called SAMPLE.RPT. Once the database inspection is complete, open the report file with an editor and examine the contents. The bottom of the report should contain the entry shown in Figure 13-10 if no problems were found in the database.

Figure 13-10 DB2DART utility output reporting no errors

Note: The db2dart utility verifies that the architectural integrity of the database is correct. For example, this tool confirms that:

• The control information is correct.
• There are no discrepancies in the format of the data.
• The data pages are the correct size and contain the correct column types.
• Indexes are valid.

It is important that you run the db2dart utility against the database while there are no connections to the database.

5. Start the database manager instance with the db2start command.

6. Take an offline backup of the database that will be migrated. In our example, we take offline database backups to disk.

db2 backup database sample to c:\dbbackups

The migration process renames the current active logs with the *.MIG extension. For example, SQL00001.LOG is renamed to SQL00001.MIG.

7. Stop the DB2 database manager by issuing the command:

db2stop

8. End the DB2 license daemon by entering the command:

db2licd -end

9. Stop the administration server if installed by entering the command:

db2admin stop

10. Stop DB2 Services (Figure 13-11).

Figure 13-11 Stopping DB2 Services on Windows NT

11. Execute the db2ckmig utility from the Version 7.x CD-ROM to verify that the database can be migrated. The utility can be found in the drive:\db2\common directory. You can find the usage notes for the utility by running db2ckmig with no arguments. We run db2ckmig on the sample database (Figure 13-12).

Figure 13-12 Verifying that the database can be migrated with the db2ckmig utility

The migrate.log file in our example is empty since the db2ckmig utility completed without any errors.

12. Install the DB2 Universal Database EE Version 7.x software. In our example, we installed DB2 Universal Database Version 7.2. For details on installing on the Windows platform, refer to IBM DB2 UDB for Windows Quick Beginnings, GC09-2971.

13. Once the installation is complete, log in with a user that has SYSADM authority.

14. To verify that the database that will be migrated is cataloged, issue the command:

db2 list db directory

15. Migrate the database using the db2 migrate database command. In our example, we migrate the sample database:

db2 migrate database sample

Note: Windows allows only one version of DB2 to be installed on a machine. For example, if you have DB2 Version 6.x and install Version 7.x, Version 6 will be deleted during the installation.

Migrating the V6.x Data Links File Manager (Windows NT)

This section explains the process for migrating the V6.x Data Links File Manager on Windows NT:

1. Log on to the Data Links File Manager Server as a user with root authority. Install the DB2 Data Links File Manager Version 7.x software. In our example, we install DB2 Universal Database Version 7.2. For details on installing on the Windows NT platform, refer to DB2 Data Links Manager Quick Beginnings, GC09-2966.

2. Log on as the Data Links Administrator on the Version 6.x Data Links File Manager instance.

3. To ensure that there are no applications connected to the Data Links File Manager database, issue the command:

db2 list applications

If all applications are disconnected from the database, the following warning should be returned:

SQL1611W No data was returned by the Database System Monitor SQLSTATE=00000

4. Take an offline backup of the Data Links File Manager Database (dlfm_db). In our example, we take an offline backup to disk.

db2 backup db dlfm_db to /backups

5. Stop the Data Links File Manager with the dlfm_shutdown command.

6. Issue a dlfm_see command to ensure that the Data Links File Manager is stopped.

7. Log on as a user with root authority and execute the db2dlmmg utility from the /usr/lpp/db2_07_01/adm/ directory to start the DLFM migration. The db2dlmmg utility:

– Binds the migration package.
– Backs up the DLFM_DB database.
– Migrates the instance or database.

13.1.3 Migrating databases using an offline backup

The second method of migrating the DB2 UDB database server and the Data Links File Manager Server is by means of a database backup. DB2 Universal Database supports a restore of a backup taken from two releases prior to the latest release. For example, a Version 5.x and Version 6.x database backup can be used to restore into a Version 7.x instance. The steps described in this section apply to both the AIX and Windows platforms.

In our example, we restore a backup taken at DB2 Universal Database V5.2 into a DB2 Universal Database Version 7.2 instance. This functionality is made possible by the fact that the database engine on the instance that the database is being restored into migrates the database. The database has to be migrated during the restore before it can be used. The migration is done automatically when the database engine determines that the backup image being used to restore is from an earlier release of DB2 Universal Database.

While it is possible to restore a backup into a more current release of DB2 Universal Database, the converse is not possible. For example, you can restore a Version 5.x database into a Version 7.x instance but cannot restore a Version 7.x database into a Version 5.x or Version 6.x instance. One of the reasons for this limitation is the changes that take place in the System Catalog Tables during the migration to a later release.
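The restore rule described above can be expressed as a short check. This is a simplified sketch: the function name can_restore is ours, and it assumes that each major version number (5, 6, 7) counts as one release, which matches the example in the text but is not a statement about how DB2 itself decides.

```shell
#!/bin/sh
# $1 = major version of the backup image
# $2 = major version of the instance being restored into
# Assumption: one major version = one release, as in the example in the text.
can_restore() {
    backup=$1
    target=$2
    diff=$((target - backup))
    # Same release, or up to two releases older, is accepted;
    # a backup from a newer release is never accepted.
    [ "$diff" -ge 0 ] && [ "$diff" -le 2 ]
}

for v in 5 6 7; do
    if can_restore "$v" 7; then
        echo "Version $v backup into a Version 7 instance: supported"
    fi
done

if ! can_restore 7 5; then
    echo "Version 7 backup into a Version 5 instance: not supported"
fi
```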

The following steps outline the process of migrating the DB2 UDB database server and the Data Links File Manager database using backups taken at an earlier release. These steps assume that the DB2 UDB database server and the Data Links File Manager instances were already migrated using the db2imigr and the db2dlmmg utilities.

On the Data Links File Manager Server

Complete these steps:

1. Log on to the DB2 Data Links File Manager Server as the Data Links administrator.

2. To determine if there are any applications connected to the database that will be migrated, issue the command:

db2 list applications

Terminate any applications normally.

3. Take an offline backup of the database (you may need to issue a dlfm stop command). In our example, we make database backups to disk. You may back up to ADSM/TSM or a vendor device.

db2 backup database dlfm_db to /datalink/dlfm/dlfm_backup

4. Verify that the backup has completed successfully by examining the database history file. This step is important since the backup image will be used to create the database on Version 7.x. The database history file can be examined by issuing:

db2 list history all for <dbname>

The output should be similar to the example shown in Figure 13-13.

Figure 13-13 Extract of a recovery history file

Note: The migration of databases using database backups requires the database backups to be offline. Online backups require the database transaction logs to be rolled forward at the completion of the restore. Database transaction logs are not migrated and, as a result, cannot be used to roll forward after the restore in the new database instance.

5. Install the DB2 Universal Database Version 7.x code and create an instance.

6. Log on as the new DB2 Universal Database Version 7.x instance owner.

7. Restore the backup taken from step 3. In our example, we issued:

db2 restore db dlfm_db taken at 20010511100146 from /datalink/dlfm/dlfm_backup

8. The command returns an SQL2539W warning message since dlfm_db from Version 5.2 still exists on the system. We choose to overwrite the existing dlfm_db database (Figure 13-14).

Figure 13-14 Restoring into an existing database

9. Since we had LOGRETAIN in the database configuration file turned on when the backup was taken, in our example, we issued:

db2 rollforward db dlfm_db stop

Note: If the LOGRETAIN configuration parameter in the database configuration file was set to Yes when the offline database backup was taken, you must issue an additional command:

db2 rollforward db <dbname> stop

This is needed to remove the database from rollforward pending state.

10. Once the restore is completed, issue the command:

db2 list database directory

Notice the release level.

11. Start the DLFM server with the dlfm start command.

12. Verify that the DB2 UDB database is registered on the DLFM server:

dlfm list registered databases

13. Issue the command:

db2 connect to <database name>

14. Issue the command:

db2 list tables

At this point, the migration is complete on the DLFM server side.
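The restore sequence in steps 7 through 10 can be sketched as a dry-run script. The helper names run, valid_timestamp, and restore_from_backup are ours and are not part of DB2. The command wording follows the text, with the database name supplied explicitly, and the 14-digit check reflects the yyyymmddhhmmss form of the backup timestamp used in our example. With DRY_RUN left at its default of 1, nothing is executed.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 (the default) only print the command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# A backup timestamp is 14 digits (yyyymmddhhmmss), e.g. 20010511100146.
valid_timestamp() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -eq 14 ]
}

# $1 = database, $2 = backup timestamp, $3 = backup directory,
# $4 = "yes" if LOGRETAIN was on when the offline backup was taken.
restore_from_backup() {
    db=$1
    ts=$2
    dir=$3
    logretain=$4
    if ! valid_timestamp "$ts"; then
        echo "invalid backup timestamp: $ts" >&2
        return 1
    fi
    run db2 restore db "$db" taken at "$ts" from "$dir"
    if [ "$logretain" = "yes" ]; then
        # Takes the database out of rollforward pending state.
        run db2 rollforward db "$db" stop
    fi
    run db2 list database directory
}

# Dry run with the values from our example.
restore_from_backup dlfm_db 20010511100146 /datalink/dlfm/dlfm_backup yes
```

The SQL2539W prompt about overwriting an existing database still has to be answered interactively, so the sketch does not attempt to automate it.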

On the DB2 UDB database server

Complete the following steps on the DB2 UDB database server:

1. Log on to the DB2 UDB database server as the instance owner.

2. To determine if there are any applications connected to the database that will be migrated, issue the command:

db2 list applications

Terminate any applications normally.

3. Take an offline backup of the database. In our example, we make database backups to disk. You may back up to ADSM/TSM or a vendor device.

db2 backup database dltest to /dbbackup

Verify that the backup has completed successfully by examining the database history file. This step is important since the backup image will be used to create the database on Version 7.x. The database history file can be examined by issuing:

db2 list history all for <dbname>

4. Install the DB2 UDB Version 7.x code and create an instance.

5. Log on as the new DB2 UDB Version 7.x instance owner.

6. Verify that the DLFM server is registered:

db2 list datalinks managers for database dltest

7. Restore the backup taken from step 3. In our example, we issued:

db2 restore db dltest taken at 20010511100146 from /datalink/dlfm/dlfm_backup without datalink

The command returns an SQL2539W warning message since the dltest database from Version 5.2 still exists on the system. We choose to overwrite the existing dltest database.

8. Once the restore is completed, issue the following command:

db2 list database directory

Notice the release level. Since we had LOGRETAIN in the database configuration file turned on when the backup was taken, in our example, we issued:

db2 rollforward db dltest stop

See Figure 13-15.

Figure 13-15 Rollforward completing with a warning

9. Issue the command:

db2 connect to <database name>

10. Issue the command:

db2 list tables

11. Execute the db2_recon_aid utility with the check option to determine which tables may need to be reconciled. In our example, we issued:

db2_recon_aid -db dltest -check

12. Run the Reconcile utility. For more information on the Reconcile utility, see 10.1, “Overview” on page 192. At this point, the migration is complete.

Chapter 14. Moving a Data Links file system to a new disk

On rare occasions, you may need to migrate files under Data Links control from one storage disk to another. This chapter discusses the steps involved. It also offers some performance tips that can save a lot of time.

The following two scenarios are considered (on AIX and Solaris):

• Moving entire file systems to a different disk in the same machine. The current disk still remains connected to the machine.

• Replacing the existing disk with a new one, therefore, moving all the DLFS-enabled file systems to the new disk.

14.1 Migrating a DLFS-enabled file system (AIX)

Let us assume that you need to migrate a DLFS-enabled file system named /dlfsfs to a new disk configured on the machine, and that the logical volume containing the /dlfsfs file system is /dev/dlfslv. The following steps describe the entire process:

1. Stop the Data Links File Manager:

dlfm stop

2. Switch to the super user ID (root on AIX, Solaris).

3. Get the File System IDentifier (FSID) of the file system to be migrated (/dlfsfs). Here is an example to get the FSID of the file system.

Get the major and minor number of the device that is mounted on /dlfsfs:

ls -l /dev/dlfslv | awk '{print $5,$6}'

Let’s assume that /dev/dlfslv has:

– Major number = 10 (a in hex)
– Minor number = 9 (0009 in hex)

So, the above command would produce the following output:

10, 9

Now the FSID value = 000a0009 (Major and Minor number appended together) or 655369 (in decimal).

4. Unmount the file system to be migrated (/dlfsfs):

umount /dlfsfs

5. Use the dd command to copy the contents from the old logical volume to the new logical volume. This command preserves the inode values of the files, thereby minimizing the time required for a final reconcile on the DB2 UDB table that has the DATALINK column. The following two cases are possible:

– If you are going to replace the old disk with a new one:

i. Copy the old logical volume to a tape:

dd if=/dev/dlfslv of=/dev/rmt0 bs=512b

ii. Repeat the same procedure, if you want to migrate more than one DLFS-enabled file system.

iii. Replace the old disk with the new one. Configure it for its standard configuration. Create a new logical volume (/dev/newdlfslv) on the new disk. Note that the new logical volume must be at least as large as the old logical volume that was mounted on /dlfsfs.

Tip: We recommend that you keep the FSID of the new logical volume the same as that of the current logical volume, because this improves performance by reducing the time taken by the Reconcile utility.

Keeping the FSID same:

In AIX, the FSID is an integer whose first 16 bits represent the major number of the volume group and whose last 16 bits represent the minor number of the logical volume. To keep the FSID of the new logical volume the same as that of the old logical volume, the new disk should be part of the same volume group, and the minor number of the new logical volume should be maintained too. There is no way of explicitly specifying the minor number of the logical volume on AIX. The system assigns the lowest number available to the new logical volume under that major number (that is, the volume group).

Therefore, to keep the minor number the same, after the physical volume representing the new disk is added to the same volume group as the old disk, delete the old logical volume. This frees the minor number that corresponds to the old logical volume (from which the Data Links files have to be copied).

Now create the new logical volume (on the new physical volume but in the old volume group). This results in the system assigning the smallest free minor number, which should be the minor number of the old logical volume just freed.

Note: It is possible (although rare) that a smaller minor number is available under the same volume group. In that case, when the new logical volume is created, it is assigned this smaller minor number and not the minor number of the current logical volume. Such a hole can only exist if a previously defined logical volume was deleted.

iv. Copy the contents of the tape to the new logical volume (/dev/newdlfslv):

dd if=/dev/rmt0 of=/dev/newdlfslv bs=512b

– If both the disks are connected to the machine:

i. Create a new logical volume (/dev/newdlfslv) on the new disk. Note that the new logical volume must be at least as large as the old logical volume that was mounted on /dlfsfs.

ii. Directly copy the contents from the old logical volume to the new one:

dd if=/dev/dlfslv of=/dev/newdlfslv

6. Change the file system entry for /dlfsfs in the file /etc/filesystems. Change the device name from the old logical volume name (/dev/dlfslv) to the new logical volume name (/dev/newdlfslv). This keeps the file system mount point the same, so you don't have to change the file system name in the dlfm tables.

7. Mount the file system as DLFS enabled. This time the new logical device (/dev/newdlfslv) is mounted on /dlfsfs:

mount -v dlfs /dlfsfs

8. Get the FSID of the file system (/dlfsfs). This will have changed if the new logical volume has a different major or minor number than the old logical volume. Refer to step 3 to get the FSID of the file system.

9. If your new logical volume has a different major or minor number than the old one (refer to step 5), you need to change the FSID entry in the DFM_DIR table in the DLFM_DB database. Otherwise skip this step. The following commands serve this purpose:

db2 connect to dlfm_db
db2 "update dfm_dir set fsid=<newfsid> where fsid=<oldfsid>"
db2 commit

10. Start the Data Links File Manager (DLFM):

dlfm start
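The FSID arithmetic in step 3 and the minor-number assignment described under "Keeping the FSID same:" can be sketched in shell. The function names fsid_hex, fsid_dec, and lowest_free_minor are ours, chosen for illustration. The last function only models the rule stated in the text (the system hands out the lowest free minor number); starting the search at 0 is an assumption made for the sketch, not a statement about AIX internals.

```shell
#!/bin/sh
# FSID in hex: major and minor number, each padded to four hex digits.
fsid_hex() {
    printf '%04x%04x\n' "$1" "$2"
}

# FSID in decimal: major number in the upper 16 bits, minor in the lower 16.
fsid_dec() {
    echo $(( ($1 << 16) | $2 ))
}

# Prints the lowest minor number that is not among the arguments.
# Assumption for illustration: candidate numbers start at 0.
lowest_free_minor() {
    n=0
    while :; do
        free=1
        for used in "$@"; do
            if [ "$used" -eq "$n" ]; then
                free=0
            fi
        done
        if [ "$free" -eq 1 ]; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
}

# Values from step 3: major number 10, minor number 9.
fsid_hex 10 9
fsid_dec 10 9

# The old logical volume (minor 9) was deleted, but minor 4 is also free,
# so the new logical volume would get 4 and the FSID would change.
lowest_free_minor 0 1 2 3 5 6 7 8
```

For major number 10 and minor number 9, the two FSID functions print 000a0009 and 655369, matching the values in step 3. The last call prints 4, which is the hole case that makes step 9 necessary.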

14.2 Migrating a DLFS-enabled file system (Solaris)

Let's assume that you are going to migrate a DLFS-enabled file system named /dlfsfs residing on the disk slice /dev/dsk/c0t0d0s5 to a new disk configured on the machine. Here are the steps to do this:

1. Stop the Data Links File Manager:

dlfm stop

2. Switch to the super user ID (root on AIX, Solaris).

3. Get the FSID of the file system to be migrated (/dlfsfs):

df -g /dlfsfs | grep filesys | awk '{print $4}'

Note: For better performance, the FSID of the logical volume should be maintained. See “Keeping the FSID same:” in 14.1.

4. Unmount the file system to be migrated (/dlfsfs):

umount /dlfsfs

5. Run the dd command to copy the contents from the old disk slice to the new disk slice. Here are the steps to do it.

– If you are going to replace the old disk with a new one:

i. Copy the content in the old disk slice to a tape:

dd if=/dev/dsk/c0t0d0s5 of=/dev/rmt0 bs=512b

ii. Repeat the same procedure, if you want to migrate more than one DLFS-enabled file system.

iii. Replace the old disk with the new one. Configure it for its standard configuration. Configure a new disk slice (/dev/dsk/c0t8d0s5) on the new disk. Note that the new disk slice must be the same size as, or larger than, the old disk slice that was mounted on /dlfsfs.

iv. Copy the contents of the tape to the new disk slice (/dev/dsk/c0t8d0s5):

dd if=/dev/rmt0 of=/dev/dsk/c0t8d0s5 bs=512b

– If both the disks are connected to the machine:

Directly copy the contents from the old disk slice to the new one:

dd if=/dev/dsk/c0t0d0s5 of=/dev/dsk/c0t8d0s5

6. Change the file system entry for /dlfsfs in the file /etc/vfstab. Change the device name from the old disk slice name (/dev/dsk/c0t0d0s5) to the new disk slice name (/dev/dsk/c0t8d0s5). This keeps the file system mount point the same, so you do not have to change the file system name in the dlfm tables.

7. Mount the file system as DLFS enabled. This time the new disk slice (/dev/dsk/c0t8d0s5) is mounted on /dlfsfs:

mount /dlfsfs

8. Get the FSID of the file system (/dlfsfs). The FSID will have changed if the new disk slice has a different major and minor number than the old one. Refer to step 3 to get the FSID of the file system.

9. If your new disk slice has a different major and minor number than the old one (refer to step 5), you need to change the FSID entries in the dlfm tables (DFM_DIR and DFM_FILE). Otherwise, skip this step. The following commands serve this purpose:

db2 connect to dlfm_db
db2 "update dfm_dir set fsid=<newfsid> where fsid=<oldfsid>"
db2 "update dfm_file set fsid=<newfsid> where fsid=<oldfsid>"
db2 commit

Note: We recommend that you keep the major and minor numbers of the new disk slice the same as those of the old disk slice. This reduces the migration time considerably.

10. Start the Data Links File Manager:

dlfm start
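
Steps 8 and 9 lend themselves to a small script. The following is a minimal sketch, assuming the old and new FSIDs have already been captured with the command from step 3. The function name fsid_update_sql is hypothetical and is not part of DB2 or DLFM. It only prints the commands from step 9 so that they can be reviewed before being run against DLFM_DB, and it prints nothing when the FSID is unchanged.

```shell
# Hypothetical helper (not a DB2 or DLFM command): print the step 9
# commands only when the FSID actually changed.
fsid_update_sql() {
    old_fsid=$1
    new_fsid=$2
    if [ "$old_fsid" = "$new_fsid" ]; then
        # Same FSID: step 9 can be skipped, so emit nothing.
        return 0
    fi
    printf 'db2 connect to dlfm_db\n'
    # Table names are taken from step 9 of the Solaris procedure.
    for table in dfm_dir dfm_file; do
        printf 'db2 "update %s set fsid=%s where fsid=%s"\n' \
            "$table" "$new_fsid" "$old_fsid"
    done
    printf 'db2 commit\n'
}
```

For example, fsid_update_sql 8388632 8388645 (both FSID values are made up for illustration) prints the connect command, the two update commands, and the commit command from step 9.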

Chapter 15. Replacing or upgrading a machine

This chapter takes you through the steps that are required to replace or upgrade a machine that has DB2 Universal Database or Data Links File Manager installed.

15.1 Replacing or upgrading a DB2 machine

Once in a while, for performance reasons or because of new usage requirements, customers need to upgrade their DB2 system. Usually, this is done by either:

• Replacing pieces of hardware (for example, CPU or memory), in which case both the hostname and IP address remain unchanged

• Moving the DB2 server to another machine, in which case both the IP address and hostname change

The first scenario is straightforward because it does not involve any data movement. The second scenario, which involves moving DB2 data from one machine to another, is more complicated. It is even more complicated if the DB2 server is connected to several Data Links File Managers (DLFMs), because each Data Links File Manager stores some metadata about the location of the DB2 server. Currently, there is no external command or tool to update this metadata easily.

This section takes you through the procedure to replace the DB2 UDB database server that has connections to DLFMs.

15.1.1 Assumption

Replace or upgrade a DB2 UDB database server machine that has files linked to several DLFMs. The DLFMs will remain untouched, but the IP address or hostname of the new DB2 machine will be different. Assume the hostname of the old DB2 machine is OLDHOST and the hostname of the new machine is NEWHOST.

15.1.2 Steps to perform

Perform the following steps:

1. Make sure that there is no database activity.

2. Take an offline backup of the original DB2 UDB database.

3. Copy the database backup files to the new machine.

4. Copy the datalink.cfg file as well as the datalink.cfg.BAK file from the database directory of the original DB2 database. You can locate the database directory by using the command:

DB2 LIST DATABASE DIRECTORY

5. To drop the original database, issue the following command:

DB2 DROP DATABASE

6. For each DLFM, change the DB2 UDB database hostname registration at the DLFM side. Currently there is no external command for this, so you have to do this by directly modifying the DLFM_DB database:

db2 connect to DLFM_DB
db2 "update dfm_dbid set hostname = 'NEWHOST' where hostname = 'OLDHOST'"
db2 commit
db2 terminate

7. Issue a db2start on the new DB2 Universal Database machine.

8. Initiate a DB2 RESTORE from the DB2 backup file (keep all the instances and database names the same) and then issue:

DB2 ROLLFORWARD STOP

9. Issue the following SQL statement for all the tables having DATALINK columns:

"SET CONSTRAINT FOR 'table' DATALINK RECONCILE PENDING IMMEDIATE UNCHECKED"

10. For each table that has a DATALINK column, issue the following command to return the state back to normal:

DB2 RECONCILE

15.2 Replacing or upgrading a DLFM machine

The IP address of the new machine remains the same. (This scenario is somewhat similar to the disk crash recovery.)

15.2.1 Steps to perform

You need to perform the following steps:

1. Make sure there is no activity for any databases connected to this DLFM.

2. Connect to the DLFM_DB database and make sure there is nothing in the dfm_xnstate table. If the table is not empty, wait until it is.

3. Make an offline backup of each DB2 database that is connected to this DLFM.

4. Make an offline backup of DLFM_DB, and copy the backup file to the new machine.

5. Back up all the Data Links files under the DLFS file system, and copy the backup archive to the new machine.

6. If you are not using ADSM to back up the Data Links files, copy all the Data Links file archives under the DLFM_BACKUP_DIR_NAME directory to the new machine.

7. Stop the original DLFM.

8. In the new DLFM machine, restore DLFM_DB from the backup archive created in step 4.

9. In the new DLFM machine, restore all the Data Link files from the backup archive created in step 5.

10. If using ADSM, set up DLFM to connect to the same ADSM.

11. Start the new DLFM.

12. Run reconcile for each table that has a DATALINK value pointing to this new DLFM.

Chapter 16. Problem determination

This chapter provides a detailed description of problem determination. First it describes a methodology for problem solving. Then it discusses some solutions to the common problems.

16.1 Solving problems

This section describes the steps required and the information to collect in order to try and solve problems in a Data Links File Manager (DLFM) environment. The steps that are outlined apply both to the DB2 UDB database server and the DLFM server unless otherwise noted.

16.1.1 Problem solving process

The problem solving process involves the following steps:

1. The first step in any problem determination process is to understand what the problem really is. Recognizing that a certain condition exists in a problem requires understanding the environment in which the problem condition has occurred. It is important to try and differentiate between a product limitation and a product defect or problem early in the process. The problem may be caused by an error in a user application, or a bug in the DB2 UDB or DLFM server code.

2. Problem determination requires a “good” description of the problem. A good description of the problem usually indicates how well the problem is being understood. To determine what the problem is, you must fully describe the error conditions.

The problem description should include:

– All error codes/error conditions; include the reason code if applicable
– The actions that preceded the error
– A description of the problem

3. Determine if the problem can be reproduced or if it was a one-time occurrence. If the problem is reproducible, determine the steps that are required.

4. Identify the source or cause of the error.

– Is it a user error?

– Is the system working as designed? For example, a user did not understand the behavior of the system, but the system is working as it was intended.

– Is the system configuration supported? For example, the system was never intended to run with the hardware or software that was installed.

– Is it a DB2 UDB or DLFM server bug?

5. Provide a fix for the problem.

– If the problem is caused by any of the following reasons, an application or environment change may be required:

• User error
• The system is working as designed
• It is an unsupported environment or configuration

– If the problem is caused by a DB2 UDB or DLFM bug, a fix for the defect will be provided or a workaround developed.

16.1.2 Information needed to analyze a problem

This section describes the information you should gather based on the error conditions encountered. Different types of errors require different data to be collected. However, some data is collected for all error conditions.

Required information

The following information is required:

• The SQL error code that was returned, with the corresponding reason code (RC). For example, SQL0357N, RC = “03” is a possible error and reason code returned when the Reconcile utility was initiated. Provide a system error code if the error is not an SQL error.

• The approximate time of the error.

• Where the error was encountered, for example, on the DB2 UDB database server or the DLFM server.

• A “good” problem description.

• A description of the actions that preceded the error.

• The database manager configuration file for the DB2 UDB database server and the DLFM server. The following command can be used to collect this information:

db2 get dbm cfg

• The database configuration file for the DB2 UDB databases and the Data Links database (DLFM_DB). You can use the following command to collect this information:

db2 get db cfg for <dbname>

• The DB2 UDB server and DLFM server code level. This information can be collected by issuing the db2level command on each server.

• The DB2 diagnostic logs (DB2DIAG.LOG) from both the DB2 UDB server and the DLFM server. By default, the db2diag.log is located in the instance home directory under the sqllib/db2dump directory. The DB2DIAG.LOG file is the most important debugging information and must be collected for all DB2 UDB and DLFM error conditions.

– If the problem is reproducible, we recommend you set the diaglevel to 4 and recapture the information. The diaglevel is a database manager configuration parameter.

– On the DLFM server, the error level is set by the DLFM_LOG_LEVEL registry variable. The default value is ERROR. For problem determination purposes, the DLFM_LOG_LEVEL should be set to DEBUG.

To determine what the current DLFM_LOG_LEVEL is set to, issue the db2set command on the DLFM server.

– Collect any dump files mentioned in DB2DIAG.LOG. Dump files can be identified by files named as x.dmp, where x is the process ID that produced the dump.

– Collect any trap files in the DIAGPATH directory. Trap files are named x.trp, where x is the process ID that produced the trap.

– DIAGPATH is a database manager configuration parameter that points to the directory where diagnostic data is placed. We recommend that you collect all files in the DIAGPATH directory. To reduce the number of files that must be analyzed, you should clean up this directory on a regular basis.
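
Gathering the dump and trap files can be sketched as a short shell function. This is a minimal sketch, assuming the x.dmp and x.trp naming described above. The function name list_diag_files is hypothetical, and the directory to search is passed in rather than read from the DB2 configuration.

```shell
# Hypothetical helper: list dump (*.dmp) and trap (*.trp) files found
# under a diagnostic directory, sorted by name.
list_diag_files() {
    diag_dir=$1
    find "$diag_dir" -type f \( -name '*.dmp' -o -name '*.trp' \) | sort
}
```

With the default location mentioned earlier, a call might look like list_diag_files "$HOME/sqllib/db2dump" when run as the instance owner.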

Figure 16-1 shows a description of the type of information that is written to the db2diag.log file.

Note: DIAGLEVEL can be set to:

0 No logging
1 Severe errors
2 Severe and non-severe errors
3 Severe errors, non-severe errors, and warnings
4 Severe errors, non-severe errors, warnings, and informational messages

The default diaglevel is 3.

Figure 16-1 Extract of an entry written to the db2diag.log file

Figure 16-2 explains what each of the identified components in Figure 16-1 represents.

Figure 16-2 Information about each component in the db2diag.log file

16.1.3 DB2 Universal Database or DLFM ‘hang’ situations

The debugging of “hang” situations is more complicated. An example of a hang is a CONNECT to the database that does not return. DB2 Universal Database provides the db2_call_stack tool on UNIX platforms and the db2bddbg.exe (DB2 Backdoor Debugger) tool on Windows platforms to collect information for hang situations.

1998-03-23-14.59.01.303000 Instance:DB2 Node:000
PID:147(db2syscs.exe) TID:203 Appid:*LOCAL.DB2.980323195820
buffer_pool_services sqlbStartPools Probe:0 Database:SAMPLE

Starting the database

There is no error code in this case as it is an informational message.

1. Time/date information
2. Instance name
3. Partition number, even for non-EEE
4. Process ID and thread ID in Windows
5. Application ID
6. Component identifier
7. Function identifier
8. Unique error identifier (Probe ID)
9. Database name
10. Error description and/or error code
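
When many entries have to be scanned, the labeled fields can be pulled out with standard text tools. The following is a minimal sketch, assuming entries follow the Label:value layout of the extract in Figure 16-1. The function name diag_field is hypothetical, and the sample entry is modeled on that figure.

```shell
# Hypothetical helper: extract the value that follows "Label:" in a
# db2diag.log entry read from standard input (first match only).
diag_field() {
    label=$1
    sed -n "s/.*${label}:\([^ ]*\).*/\1/p" | head -n 1
}

# Sample entry modeled on the extract in Figure 16-1.
entry='1998-03-23-14.59.01.303000 Instance:DB2 Node:000
PID:147(db2syscs.exe) TID:203 Appid:*LOCAL.DB2.980323195820
buffer_pool_services sqlbStartPools Probe:0 Database:SAMPLE'

printf '%s\n' "$entry" | diag_field Database   # prints SAMPLE
```

The component and function identifiers carry no label, so this sketch does not extract them.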

Hangs in the UNIX environment

On UNIX platforms, the db2_call_stack tool can be found in the sqllib/bin directory. The db2_call_stack tool should be initiated by the db2 or dlfm administrator. The db2_call_stack tool does not resolve the hang, but rather provides more information by issuing a signal -36 on AIX (signal -21 on Solaris) against db2 or dlfm processes. The tool can be initiated on the DB2 UDB server or the DLFM server.

For example, let us assume that we suspect a hang on the DLFM server in a UNIX environment. You should perform these steps:

1. Log on to the DLFM server as the DLFM administrator.

2. Issue the db2_call_stack command. This generates trap files in the sqllib/db2dump directory. A trap file is generated for each DB2 process on the DLFM server.

3. Wait for 3 minutes and issue the db2_call_stack command again. You should repeat this step at least twice. You do this to determine if there are any changes on the stack calling chain dumped in the trap file. Figure 16-3 shows an example of what is dumped to a trap file. The trap files are required for problem determination purposes by the DB2 support team.

Figure 16-3 Extract of a trap file

4. The trap file shows the order in which functions were called. The most recent function call is on the top of the stack. The trap is analyzed by looking at changes on the top of stack after a number of iterations of dumping the stack with the db2_call_stack command. We recommend you dump the stack at least three times, two minutes apart.
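
The repeated stack dumps can be scripted. This is a minimal sketch: the function name repeat_dump is hypothetical, and the command to run is passed in as arguments, so nothing in the function is specific to DB2. The repeat count and the interval are parameters because the steps above mention both two-minute and three-minute waits.

```shell
# Hypothetical wrapper: run a command a fixed number of times, pausing
# between runs. Usage: repeat_dump COUNT INTERVAL_SECONDS COMMAND [ARGS...]
repeat_dump() {
    count=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop early if the command itself fails.
        "$@" || return 1
        # Sleep between iterations, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

Run as the DLFM administrator, repeat_dump 3 120 db2_call_stack would dump the stacks three times, two minutes apart.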

Hangs in the Windows environment

On Windows platforms, the db2bddbg tool can be found in the sqllib/bin directory. The db2bddbg tool should be initiated by the db2 or dlfm administrator. The db2bddbg tool does not resolve the hang, but rather provides more information. The tool can be initiated on the DB2 UDB server or the DLFM server.

For example, let us assume that we suspect a hang on the DLFM server in a Windows environment. You should perform the following steps:

1. Issue the following command:

db2set DB2_BDINFO

This sets up the debugger registry variable. The command displays three numbers, and the output should resemble the following format:

1904 1812223120 2011788013

2. Run the following command:

db2bddbg 1904 1812223120 db2bd db2ntDumpTid E:\work -1 stack.dmp

The first two arguments to the db2bddbg tool are the first two numbers from the output of the db2set DB2_BDINFO command. The next two arguments of db2bddbg are the internal debug DLL and function names.

E:\work is the directory for the output file, and stack.dmp is the name of the stack traceback file.

3. Run the following command again after two minutes:

db2bddbg 1904 1812223120 db2bd db2ntDumpTid E:\work -1 stack2.dmp

4. Send the stack.dmp and stack2.dmp files to DB2 support for analysis.

16.1.4 DB2 Universal Database or DLFM crash

A crash is a severe error or condition that causes the DB2 UDB Database Manager or DLFM to terminate abnormally. An example of a crash is a power failure. In the event of a crash of the DB2 UDB server or the DLFM server, a number of trap files are generated in the sqllib/db2dump directory or the directory specified by the DIAGPATH database manager configuration parameter. A trap file is generated for each DB2 process. The contents of the sqllib/db2dump directory should be sent to the DB2 support team for analysis of the crash.

The minimum amount of information to collect for problem determination purposes includes:

• The SQL code and any reason code or system error code
• A useful description of the problem
• A description of the actions preceding the error
• The database code level
• The database manager and database configuration parameters
• The DB2DIAG.LOG file
• The time of the error
• Any dump file listed in the DB2DIAG.LOG file
• Any trap file in the DIAGPATH

Collect as much information as possible, since it can reduce the time taken to resolve complex problems.

Additional information

In many cases, the DB2DIAG.LOG and its associated trap and dump files will be enough information to solve the problem. However, in some cases, additional data may be required:

• A DB2 Trace can be taken on the DB2 UDB and DLFM server. A trace is useful in determining the internal code path taken by DB2 UDB and the DLFM server that leads to the problem.

• A SYSLOG in the UNIX environment or an Event Log in the Windows environment can be collected.

What else can be done

In the event that the previous information does not provide the source of the problem, what else can be done?

• Additional debug code can be added that provides more information in the DB2DIAG.LOG file when the error occurs again.

• Additional trace points can be added that provide additional information about the function where the error is occurring and the data it is manipulating.

16.1.5 The DB2 Trace

This section describes capturing and analyzing a DB2 Trace. There are other forms of traces, such as Operating System Traces and Application Traces, which we do not discuss. The DB2 Trace can be used on both the DB2 UDB database server and the DLFM server. It may sometimes be necessary to take a trace concurrently on the DB2 UDB database server and the DLFM server. An example of such a situation is a communication problem between the DB2 UDB database and the DLFM server.

Note: Refer to Appendix D, “Logging priorities for DLFF and DLFSCM” on page 331, to learn how to change logging level of DLFF (or DLFSCM in DCE-DFS environment) to the required level.

Taking a DB2 Trace

A DB2 Trace may be required if the diagnostic data already collected does not give enough information about a problem. A DB2 Trace can be really useful if the problem being encountered is reproducible. Because the trace logs all actions being performed, along with parameter values at various steps in the process, keep the following points in mind:

• DB2 Trace does impact performance.

• The trace should be taken when there is minimum activity on the machine to prevent the capture of unnecessary information.

• DB2 Trace also initializes some variables. This sometimes prevents traps or segmentation violations from occurring while tracing.

DB2 Trace in memory

To perform a DB2 Trace in memory, follow these steps:

1. Turn on the trace with the command:

db2trc on -l 8000000 -e -1

The 8000000 represents an 8 MB memory buffer. This may be increased if the buffer is too small to capture the error in the trace.

The -e -1 indicates that the trace should continue after system errors.

2. Reproduce the error.

3. Dump the trace with the command:

db2trc dmp tracefile.name

The tracefile.name can be any name to store the trace output from memory to disk.

4. Turn off the trace with the command:

db2trc off

5. Since the trace is dumped in a binary format, it must be converted to ASCII with the following commands before it can be analyzed:

db2trc fmt tracefile.name tracefile.fmt
db2trc flw tracefile.name tracefile.flw

tracefile.name is the output file of the db2trc dmp command and is used here as the input file. tracefile.fmt and tracefile.flw are the ASCII output files of the trace FORMAT and trace FLOW. These are the two files that are analyzed to determine what caused the error.
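
The five steps above can be collected into one shell function. This is a minimal sketch, not a supported tool: the function name trace_in_memory and the DB2TRC variable are hypothetical, and the output file names are simply the dump file name with .fmt and .flw appended. Setting DB2TRC to echo previews the db2trc commands without running them.

```shell
# Hypothetical wrapper around the in-memory trace steps. DB2TRC can be
# overridden (for example with "echo") to preview the commands.
DB2TRC=${DB2TRC:-db2trc}

# Usage: trace_in_memory DUMP_FILE COMMAND [ARGS...]
trace_in_memory() {
    dump_file=$1
    shift
    # Step 1: turn the trace on with an 8 MB buffer.
    "$DB2TRC" on -l 8000000 -e -1 || return 1
    # Step 2: reproduce the error. The exit status is ignored on
    # purpose because the command is expected to fail.
    "$@"
    # Steps 3 and 4: dump the trace to disk, then turn it off.
    "$DB2TRC" dmp "$dump_file"
    "$DB2TRC" off
    # Step 5: convert the binary dump to ASCII.
    "$DB2TRC" fmt "$dump_file" "$dump_file.fmt"
    "$DB2TRC" flw "$dump_file" "$dump_file.flw"
}
```

A call such as trace_in_memory tracefile.name db2 connect to sample would trace a connection attempt, where sample stands in for the database name.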

DB2 Trace to a file

The following steps outline the process of taking a DB2 Trace directly to a file instead of to memory. This method is recommended in system hang situations when you cannot manually dump the trace. Tracing to a file degrades performance more than tracing to memory.

1. To turn on the trace, enter:

db2trc on -l 8000000 -e -1 -f tracefile.name

The 8000000 represents an 8 MB memory buffer. This may be increased if the buffer is too small to capture the error in the trace.

The -e -1 indicates that the trace should continue after system errors.

The -f tracefile.name indicates that the trace is written directly to the specified file instead of to memory.

2. Reproduce the error.

3. There is no need to dump the trace file since it is already on disk.

4. Turn the trace off with the command:

db2trc off

Information provided by the trace

The trace gives the following information:

• The trace records all functions called in the order of time in which they were called.

• The trace captures trace points. The types of trace points include:

– Function entry trace points
– Data trace points to record variable values at points within the function
– Exit trace points to record the function return codes
– Error trace points to record additional data in the event of an error

Figure 16-4 shows an illustration of a trace entry that is dumped to the trace format file that is produced by using the db2trc fmt command.

Figure 16-4 Extract of a trace entry in the formatted trace file

1 DB2 cei_entry oss 2 sqlo_init_GMT_timer_services (1.20.74.152)
pid 198; tid 197; cpid 0; time 1512414; trace_point 0
called_from 10047AE9

Figure 16-5 describes each of the components that are dumped to a formatted trace file for each trace point.

Figure 16-5 Information about each component in a formatted trace file

1. Sequence number
2. Instance name
3. Entry type, for example: function entry/exit/data
4. Component name
5. Function name
6. Internal ID
7. Process and thread information
8. Companion process ID
9. Time information
10. Unique trace point identifier
11. Address where the function was called

The most recent trace points are at the bottom of the trace. When analyzing a trace, start at the bottom and work upward to find the source of the problem.

You need to format or flow the trace prior to examination:

• To format the trace file, issue the following command:

db2trc fmt trace.file trace.fmt

• To flow the trace file, issue the following command:

db2trc flw trace.file trace.flw

Analyzing a DB2 Trace to resolve problems

This section demonstrates using the DB2 Trace utility through examples to resolve problems. In the first example, we took a trace on the DB2 UDB server.

The db2diag.log in this example provides enough information to resolve the problem, but for learning purposes, we use a trace to confirm the error. In the second example, we took a trace on the DLFM server. This helps demonstrate that the principles of problem determination are very similar on both servers.

Example

In this example, the scenario is as follows. A user is running an application in a DLFM environment and receives an SQL1036 error message when trying to use the application. The user approaches you, the Database Administrator (DBA), to resolve the problem.

You should use the following approach to resolve the problem:

1. Log on to the DB2 UDB server as the database instance owner. As the DBA, you know that the application communicates with the DB2 UDB database, which in turn passes the user request to the DLFM server.

2. Determine whether the problem lies with the DB2 UDB database or the DLFM server, rather than with the application, by using the DB2 CLP (Command Line Processor) instead of the application. Open a CLP window and try to connect to the database that the user is trying to use. You notice the error message, shown in Figure 16-6, on the connect.

Figure 16-6 An SQL1036 error message when connecting to the database

3. Try to connect to the Data Links File Manager database (dlfm_db) to confirm that this connection works. If it does, you can determine that the problem is on the DB2 UDB database server. It is important to narrow down the problem source as much as possible.

4. At the CLP, issue the following command to gather more information about the error message:

db2 ? sql1036

You notice that there are many possibilities for this error message to be returned.

5. Examine the db2diag.log (Figure 16-7). For problem determination purposes, you should have diaglevel in the database manager configuration file set to 4.

Figure 16-7 Extract of the DB2DIAG.LOG with the SQL1036 error message

6. The db2diag.log shows the first SQL1036 error occurring at 2001-05-30-15.29.57 on PID:12136 (Figure 16-7). The function sqlpgint reports the SQL1036 error. The first function to report an error is the sqlpgilt on PID:20914. From db2diag.log, you can tell that we are missing the SQL000001.LOG file, which is the first active log file.

7. Notice the ZRC=FFFFE60A error being dumped in the db2diag.log on PID 12136. This is an internal error message, on which you can find more information in Appendix A, “DB2 Internal Return Codes,” in DB2 UDB Troubleshooting Guide, GC09-2850. The E60A error maps to a “File Does Not Exist” message.

8. Since this problem is reproducible, we take a trace of the problem to confirm that the missing log file is indeed causing the I/O error.

9. Take a DB2 Trace to memory on the DB2 UDB database server as follows:

a. Turn the trace on with the command:

db2trc on -l 8000000 -e -1

Reproduce the SQL1036N error:

db2 connect to <database_name>

b. Dump the trace with the command:

db2trc dmp tracefile.name

c. Turn the trace off with the command:

db2trc off

d. Since the trace is dumped in a binary format, it must be converted to ASCII with the following commands before it can be analyzed:

db2trc fmt tracefile.name tracefile.fmt
db2trc flw tracefile.name tracefile.flw

Figure 16-8 Output of the DB2 Trace format command

10. Open the tracefile.flw, tracefile.fmt, and the db2diag.log files in three different windows.

The trace flow file shows the flow of control of processing by DB2 UDB. The vertical lines in the flow file help to match the start and finish of each function (Figure 16-9). The trace flow is a diagram of the execution path of the source code.

Each function has an entry, data, and exit point. For the purposes of our analysis, we focus on the entry and the exit points. The function exit points dump a return code. A return code of zero (rc=0) means the function completed without any error. A negative return code indicates an error condition. A positive return code is usually a warning.

When analyzing a trace flow, it is important to understand that error conditions are propagated to the calling functions, so a function can return an error only because an error was encountered by another function.
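
The sign rule for return codes can be sketched in shell. This is a minimal sketch, assuming a shell with 64-bit integer arithmetic such as bash. The function names hex_to_signed and classify_rc are hypothetical. The conversion treats the hexadecimal value as a 32-bit two's complement number, which matches the pair 0xffffe60a = -6646 shown in the trace flow extract in Figure 16-10.

```shell
# Hypothetical helper: interpret a 32-bit hexadecimal return code as a
# signed integer. Needs a shell with 64-bit arithmetic (such as bash).
hex_to_signed() {
    value=$(( $1 ))
    # Values with the top bit set are negative in two's complement.
    if [ "$value" -ge 2147483648 ]; then
        value=$(( value - 4294967296 ))
    fi
    printf '%s\n' "$value"
}

# Classify a signed return code using the rule described above.
classify_rc() {
    if [ "$1" -eq 0 ]; then
        echo "success"
    elif [ "$1" -lt 0 ]; then
        echo "error"
    else
        echo "warning (usually)"
    fi
}

hex_to_signed 0xffffe60a   # prints -6646
```

Passing the converted value to classify_rc reports it as an error, in line with the “File Does Not Exist” condition found in the db2diag.log.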

Note: When the trace is being formatted, look at the output to determine whether the trace is wrapped. You should try and capture a trace that is not wrapped. A trace that is wrapped usually indicates that the error has not been captured (Figure 16-8).

Figure 16-9 Function flow structure

11. Go to the bottom of the tracefile.flw and search for -1036. The trace flow shows the sequence in which functions were called and the corresponding return codes. If the error is found and the trace is not wrapped, this indicates that the trace has captured the error and can be analyzed to find the source of the problem.

12. Search for the PID = 12136 string in the trace flow. This pid is the process ID of the sqlpgint function that failed with a “File not found” error message in the db2diag.log (Figure 16-7). We recommend this approach of starting the trace analysis because it focuses on the functions that returned the error (Figure 16-10).

13. When you reach sequence number 1701, as in our example of the flow, you will notice that there were no serious errors. Continue following the rest of the functions for the identified pid and analyze any errors.

14. Analyze all the error codes that are being returned on PID = 12136. Some functions return errors that are not really serious. In general, an error is considered serious when the error code is propagated to all functions on a particular pid (Figure 16-10).

Tip: Most recent trace points are at the bottom of the trace. Start at the bottom and work upwards to find the source of the problem. Not all errors in the trace are real problems.

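Contents of Figure 16-9 (source code compared with the corresponding trace flow):

Source code:

begin function OpenFile()
   .....
   begin function FindFile()
      filefound=0;
   end function FindFile()
   .....
end function OpenFile()

Trace flow:

begin function OpenFile()
| ......
| begin function FindFile()
| filefound=0;
| end function FindFile()
| .......
end function OpenFile()

The figure labels the three kinds of trace point in the trace flow: Entry, Data, and Exit.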


Figure 16-10 Extract of the trace flow file

15. Follow the sequence numbers in ascending numerical order. If a sequence number suddenly jumps to another PID, follow it on the new PID. In our example, the sequence number 2296 is on another PID (Figure 16-11).

16. The sqloopenp function returns an FFFFE60A error. This is the first sign of a function returning an error. Notice that this error is propagated to other functions, which eventually leads to the SQL1036 error (I/O error).

17. Now that you have found the source of the FFFFE60A error, analyze the trace format to gather more details.

pid = 12136; tid = 1; node = 0;

1161 sqloset cei_entry
1162 sqloset cei_data ...
1163 sqloset cei_data ...
1164 sqloset cei_retcode 0
1165 sqleWaitUntilReactivated fnc_data ...
1166 sqleWaitUntilReactivated fnc_retcode 0
1167 sqleAgentActivationInit fnc_entry
1168 |sqloInstallEDUSignalHandler cei_entry
1169 |sqloInstallEDUSignalHandler cei_retcode 0
1170 |sqloInstallEDUSignalHandler cei_entry
1171 |sqloInstallEDUSignalHandler cei_retcode 0
..............
1692 | | | | | | | | | |sqlubrfg cei_entry
1693 | | | | | | | | | | |sqloppth cei_entry
1694 | | | | | | | | | | |sqloppth cei_retcode 0
1695 | | | | | | | | | | |sqloppth cei_entry
1696 | | | | | | | | | | |sqloppth cei_retcode 0
1697 | | | | | | | | | | |sqloopenp cei_entry
1698 | | | | | | | | | | |sqloopenp cei_data ...
1699 | | | | | | | | | | |sqloopenp cei_data ...
1700 | | | | | | | | | | |sqloopenp cei_errcode 0xffffe60a = -6646
1701 | | | | | | | | | |sqlubrfg cei_retcode 0
................

Callouts in the figure:

• Corresponds to PID in db2diag.log
• Entry to the function sqlubrfg, which calls function sqloppth
• Function sqlubrfg returns with no errors
• Not a serious error, since function sqlubrfg below returns no error


Figure 16-11 Extract of trace flow showing the SQL1036 error

18. The trace format file provides more details for each function. We know from the trace flow file that the function sqloopenp returned an error at sequence number 2296.

19. Open the trace format file if it is not already opened and search for sequence number 2296 (Figure 16-12).

20. Notice the rc = 0xffffe60a on the exit of the sqloopenp function. Find the entry point to the sqloopenp function so you can analyze the function. In our example, the entry point to sqloopenp is at sequence number 2293.

2158 | | | | | | | | | | | |sqlpinit cei_entry
2159 | | | | | | | | | | | | |sqlogmblk cei_entry
2160 | | | | | | | | | | | | |sqlogmblk cei_data ...
2161 | | | | | | | | | | | | | |MemHoldSeg cei_entry
2162 | | | | | | | | | | | | | |MemHoldSeg cei_retcode 0
2163 | | | | | | | | | | | | |sqlogmblk cei_data ...
2164 | | | | | | | | | | | | |sqlogmblk cei_retcode 0
2165 | | | | | | | | | | | | |sqloinca cei_entry
2166 | | | | | | | | | | | | |sqloinca cei_retcode 0
2167 | | | | | | | | | | | | |sqlpgint fnc_entry
2168 | | | | | | | | | | | | | |sqlpgolf fnc_entry
2169 | | | | | | | | | | | | | | |sqloopenp cei_entry
2170 | | | | | | | | | | | | | | |sqloopenp cei_data ...
2171 | | | | | | | | | | | | | | |sqloopenp cei_data ...
2172 | | | | | | | | | | | | | | |sqloopenp cei_retcode 0
.........
pid = 19130; tid =1; node = 0
.........
2296 | | | |sqloopenp cei_errcode 0xffffe60a = -6646
.........
2662 | | | | | | | | | | | | | | |sqlocloselog cei_entry
2663 | | | | | | | | | | | | | | | |get_libc_reen_buffer cei_entry
2664 | | | | | | | | | | | | | | | |get_libc_reen_buffer cei_data ...
2665 | | | | | | | | | | | | | | | |get_libc_reen_buffer cei_data ...
2666 | | | | | | | | | | | | | | | |get_libc_reen_buffer cei_retcode 0
2667 | | | | | | | | | | | | | | |sqlocloselog cei_retcode 0
2668 | | | | | | | | | | | | | |sqltfast2 cei_retcode 0
2669 | | | | | | | | | | | | |sqlpgint fnc_errcode 0xffffe60a = -6646
2670 | | | | | | | | | | | |sqlpinit cei_errcode 0xffffe60a = -6646
2671 | | | | | | | | | | |sqledint fnc_errcode 0xfffffbf4 = -1036

Callouts in the figure:

• Function entry for sqlpinit
• Function entry for sqlpgint
• Function exit for sqloopenp with error on PID 19130
• Function exit for sqlpgint with error
• Error propagated to calling functions
• SQL 1036


Figure 16-12 Trace format file

21. Sequence number 2294 gives valuable information about the problem. It is the data portion of the sqloopenp function and reports that the file SQL000001.LOG cannot be found in the directory path /datalink/db2inst1/NODE0000/SQL00001/SQLOGDIR/SQL000001.

This information is useful, since we get the name of the file that DB2 is trying to open as well as the path in which it is looking. The path to the file is not dumped in db2diag.log.

22. Log on to the DB2 server and go to the path that was dumped. Notice that the SQL000001.LOG file is missing, which explains the I/O error.

23. Since the missing log file is an active log file, any connection attempts to the database would fail. To resolve this problem, you need to restore from the latest backup and roll forward to a point in time before the missing log files.

This concludes our example of analyzing a trace taken on the DB2 UDB server. The methodology used here applies to analyzing traces taken on the DB2 UDB database server and the DLFM server.
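The manual analysis in this example (start at the bottom, list the error exits, and check whether each error is propagated to the caller) can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes flow records of the form shown in Figures 16-10 and 16-11, it does not follow a sequence across PIDs, and the names parse_flow, errors_bottom_up, and propagated_to_caller are hypothetical helpers, not DB2 tools.

```python
import re

# One trace-flow record: sequence number, nesting bars, function name,
# record type (for example cei_entry or fnc_errcode), and an optional value.
RECORD = re.compile(r"^(\d+)\s+((?:\|\s*)*)(\w+)\s+(\w+)(?:\s+(.*))?$")


def parse_flow(lines):
    """Return parsed records, skipping pid headers and elision dots."""
    records = []
    for line in lines:
        match = RECORD.match(line.strip())
        if match:
            seq, bars, func, kind, value = match.groups()
            records.append({"seq": int(seq), "depth": bars.count("|"),
                            "func": func, "kind": kind, "value": value or ""})
    return records


def errors_bottom_up(records):
    """List error exits starting from the most recent trace point."""
    return [r for r in reversed(records) if r["kind"].endswith("errcode")]


def propagated_to_caller(records, index):
    """True if the next shallower exit record after records[index] is also an error."""
    depth = records[index]["depth"]
    for later in records[index + 1:]:
        if later["depth"] < depth and later["kind"].endswith("code"):
            return later["kind"].endswith("errcode")
    return False
```

Applied to the extracts above, the error at sequence number 1700 would not count as propagated, because sqlubrfg exits with retcode 0 at 1701, whereas the error at 2669 is followed by error exits from sqlpinit and sqledint.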

16.2 Solutions to common problems

Many of the problems encountered when using DLFM tend to fall into one of three categories. First, there could be a problem with the DLFM file server. Maybe it won't start, or it cannot talk to the DB2 server. Second, there could be a problem with the DB2 server that is attempting to use Data Links. Third, there may be a problem with a file system being managed by DLFM. Maybe the file system cannot be mounted, or client workstations may not be able to access linked files. The following sections discuss each of these types of problems, what can cause them, and what can be done to fix them.

Before discussing the three problem categories, let’s briefly look at the resources that are available to aid in problem determination.

16.2.1 Available resources

A good explanation for many of the error conditions encountered can be found in one of the DB2 manuals. The question is always which one. Here’s a helpful guide:

• DB2 UDB Message Reference, GC09-2978: For error messages beginning with the characters DB2 or SQL

• DB2 Data Links Manager Quick Beginnings, GC09-2966, in Appendix A, “DB2 Data Links Manager Errors and User Responses”: For messages beginning with DLFM

• DB2 UDB Troubleshooting Guide, GC09-2850, Appendix A, “DB2 Internal Return Codes”: For interpretations for the four-byte, hexadecimal error codes written to the db2diag.log file; there is another chapter that covers the Data Links Manager
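The guide above amounts to a lookup by message prefix. A minimal sketch follows; the function name manual_for is hypothetical, and the rule that a four-byte code is written as eight hexadecimal digits (optionally prefixed with 0x) is an assumption about how the code appears in db2diag.log.

```python
import re


def manual_for(message_id):
    """Suggest which manual to consult for a message ID or internal return code."""
    msg = message_id.strip().upper()
    # Message prefixes are checked first, because an ID such as DB2xxxxE
    # could otherwise be mistaken for eight hexadecimal digits.
    if msg.startswith("DLFM"):
        return "DB2 Data Links Manager Quick Beginnings, GC09-2966, Appendix A"
    if msg.startswith(("DB2", "SQL")):
        return "DB2 UDB Message Reference, GC09-2978"
    if re.fullmatch(r"(0X)?[0-9A-F]{8}", msg):
        return "DB2 UDB Troubleshooting Guide, GC09-2850, Appendix A"
    return None


for example in ("SQL0357N", "DLFM101E", "FFFFE60A"):
    print(example, "->", manual_for(example))
```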

16.2.2 DLFM server problems

This section outlines common problems with the DLFM server. In some cases, it also outlines the symptoms, and, in all cases, offers solutions to correct the problem.

Problem: DLFM will not start

If any of the critical resources needed by DLFM are unavailable, startup may fail. A DLFM101E error message may be written to the db2diag.log file. This problem most commonly occurs when trying to start DLFM shortly after it is stopped. The dlfm stop command takes some time to clean up all Inter Process Communication resources (IPCs), and any attempt to run dlfm start before the cleanup is complete will fail. This problem can also occur if the database manager has a problem starting, if communication services cannot be started, or if the dlfsdrv device driver is not loaded.

Important: One of the most useful resources can be the db2diag.log file. This is probably the first place one should look for error messages. If no useful messages can be found in db2diag.log and the problem is reproducible, try increasing the database manager configuration parameter DIAGLEVEL to its maximum value of 4, issue a db2stop and db2start, recreate the problem, and check db2diag.log again. When doing this, be sure to reset the DIAGLEVEL parameter to its previous value afterward.

Symptoms

After running the dlfm start command on the Data Links server, one or more of the following conditions exist:

• The dlfm see command shows no processes running.

• The application program receives an SQL0357N error with reason code 03 when attempting to SELECT, INSERT, or UPDATE a DATALINK value.

• The db2diag.log on the DB2 server contains a message that indicates that DLFM is unreachable.

• The db2diag.log on the DB2 server contains a message indicating that restart recovery is pending or in progress for DLFM.

Solution

1. Log on as dlfm, issue the dlfm shutdown command, and retry the dlfm start command.

2. Check the DB2 registry variables using db2set -all, and validate that they are correct. Be sure that the port number specified in the DLFM_PORT registry variable is not being used by another process.

3. Validate that the dlfm instance can be started by logging on as dlfm and issuing the db2start command.

4. Verify that DLFM_DB is usable by connecting to it.

5. Check to see if the dlfsdrv device driver is loaded. Log on as root and run:

strload -qf /usr/lpp/db2_07_01/cfg/dlfs_cfg

This command should return: /usr/lpp/db2_07_01/bin/dlfsdrv: yes

If no is returned instead of yes, the dlfs device driver needs to be loaded.

The driver can be loaded by root by running:

strload -f /usr/lpp/db2_07_01/cfg/dlfs_cfg

6. Examine the db2diag.log file for additional messages.
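For step 2, one way to check whether a port is already taken is to try to bind to it. The following is a minimal sketch, not a DB2 utility: port_is_free is a hypothetical helper, and the port number in the usage lines is only a placeholder for whatever value db2set -all reports for DLFM_PORT. Run it while DLFM is stopped, since a running DLFM would itself hold the port.

```python
import socket


def port_is_free(port, host=""):
    """Return True if a TCP socket can be bound to the port on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
        return True


if __name__ == "__main__":
    example_port = 50100  # placeholder; substitute the DLFM_PORT value from db2set -all
    print(example_port, "free" if port_is_free(example_port) else "in use")
```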


Problem: DLFM does not automatically start after reboot

Solution

Follow these steps to correct the problem:

1. Validate that /etc/inittab has an entry to start db2 by running /etc/rc.db2.

2. Check to see that /etc/rc.db2 issues a dlfmstrt command to start dlfm.

3. Verify that all dlfs file systems are being mounted during startup. The mount option in /etc/filesystems for the dlfs file systems should be set to false. The file systems should be mounted by issuing the mount -v dlfs command. See the note box below.

4. Validate that /etc/rc.db2 or a script called by /etc/rc.db2 loads the dlfsdrv device driver.

Problem: Files can be written to a DLFS but not read

Symptoms

The db2diag.log on the DB2 UDB server has no errors recorded.

The db2diag.log on the DLFM server has messages such as “Dest not valid for upcall” and “Expired or invalid token errors”.

The application errors on reading a file persist even though the DL_EXPINT database configuration parameter on the DB2 UDB server is set to 600 seconds. The read errors occur within the 600 seconds.

Solution

Check the system times on the DLFM server and the DB2 UDB database server and ensure that they are synchronized. For example, a one-hour difference in times between the DLFM server and the DB2 UDB server will expire the token immediately even though the DL_EXPINT database configuration parameter is set to 5 minutes.
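The effect of clock skew can be shown with a small worked example. This is a deliberately simplified model and not DLFM's actual token algorithm: it assumes that the token carries the DB2 server's issue time, that DLFM compares that time with its own clock, and that the DLFM clock is the one running ahead. The function name token_accepted is hypothetical.

```python
def token_accepted(issue_time_db2, check_time_dlfm, dl_expint):
    """Simplified model: accept while the token's age on the DLFM clock is below DL_EXPINT."""
    return (check_time_dlfm - issue_time_db2) < dl_expint


issue_time = 1_000_000   # DB2 server clock when the token is generated (seconds)
real_delay = 2           # the application presents the token two seconds later
for skew in (0, 3600):   # DLFM clock in step, then one hour ahead
    check_time = issue_time + real_delay + skew
    print("skew", skew, "accepted", token_accepted(issue_time, check_time, dl_expint=300))
```

With synchronized clocks, the two-second-old token is accepted. With the DLFM clock one hour ahead, the same token appears to be 3602 seconds old and is rejected, even though DL_EXPINT is 300 seconds.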

Note: One common practice is to create an executable file called /etc/rc.dlfs and to call it from /etc/rc.db2 after the dlfmstrt command is executed. The file should contain the command to load the dlfsdrv device driver, mount all dlfs file systems, and export them.

The sample content of /etc/rc.dlfs is:

strload -f /usr/lpp/db2_07_01/cfg/dlfs_cfg
mount -v dlfs /datalinks_fs1
mount -v dlfs /datalinks_fs2
exportfs -a     (this assumes that the dlfs file systems are listed in /etc/exports)


Problem: Error mounting a DLFS file system, on Solaris

The following command is issued to try to mount a DLFS file system:

dmins0/opt/IBMdb2/V7.1/instance$ ./dlfmfsmd /dlfs

The following error is returned:

dlfs mount Error: Invalid argument
umount : warning: /dlfs not in mnttab
Explanation: An attempt to mount the specified file system has failed.
User Response: Verify that the file system is defined. Correct any errors from the mount command and try again.
DB1035E Failed to mount file system /dlfs

Solution

Ensure that the server is booted in 32-bit mode and not 64-bit mode. The Solaris command isainfo -v displays the mode in which the server was booted.

Note: If the server is booted in 64-bit kernel mode, the isainfo -v command would show both the architectures (32-bit sparc and 64-bit sparcv9).

16.2.3 DB2 server problems

This section outlines common problems with the DB2 server. In some cases, it also outlines the symptoms, and in all cases, offers solutions to correct the problem.

Problem: DB2 server cannot talk to DLFM

When a DB2 instance that uses Data Links is started, DB2 attempts to connect to the File Managers that are registered with it. If a File Manager is not running, an error is written to the db2diag.log file, but users of the database using Data Links do not see any errors until they try to access a DATALINK value. This problem results in a SQL0357N error.

Another communication-related error, SQL0368N, can be caused by the DB2 database not being registered with DLFM or being registered incorrectly. This can also be caused by the database manager configuration parameter DATALINKS being set to NO. DLFM will refuse a connection from a DB2 server that is not on the exact same release level and fixpak level as the DLFM server.


Symptoms

After running the dlfm start command on the Data Links server, the dlfm see command indicates that DLFM is up and running, but one or both of the following conditions still exist:

• The application program receives an SQL0357N error with reason code 03 when attempting to SELECT, INSERT, or UPDATE a DATALINK value.

• The db2diag.log on the DB2 server contains a message indicating that DLFM is unreachable.

Solution

Follow these steps:

1. Verify that the database, instance and hostname are registered correctly with DLFM. Log on as dlfm and issue the command:

dlfm list registered databases

2. Verify that the DLFM server is correctly registered with the DB2 database. Connect to the DB2 database and issue the command:

db2 list datalinks managers for database <dbname>

Check the hostname and port number that is registered.

3. Make sure the DATALINKS database manager configuration parameter is set to YES. On AIX, run the command:

db2 get dbm cfg | grep DATALINKS

4. Run the db2level command on the DB2 server and on the DLFM server to verify that both are running the same version and fixpak level of DB2.

5. Make sure the DB2COMM registry variable on the DB2 server includes the value “TCPIP”. On AIX, enter the command:

db2set -all | grep DB2COMM

6. On AIX, issue the following commands:

a. db2stop command on the DB2 server
b. iptrace command
c. db2start on the DB2 server
d. kill the iptrace process
e. ipreport command

This shows what DB2 is sending to DLFM when it tries to connect. Here is an example of using iptrace (see the man pages for options and details):

iptrace -b -d <dlfm_hostname> -s <db2_hostname> -P TCP <pathname/outfilename>
kill -9 <iptrace_process_id>
ipreport -r -n -s <pathname/outfilename> > <formatted_filename>
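Before resorting to a packet trace, a quick reachability test from the DB2 server can confirm whether anything is listening on the registered DLFM host and port. The sketch below is a generic TCP check, not a DB2 or DLFM command. The function name can_connect is hypothetical, and the host and port to pass in are the values shown by the list datalinks managers command in step 2.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A False result only shows that the TCP connection could not be opened; it does not say whether the cause is the network, the port registration, or DLFM itself, so the remaining steps still apply.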


16.2.4 File system problems

This section outlines common problems with the file system. Each problem is followed by its symptoms, in some cases, and by a solution.

Problem: Can’t mount the DLFS file system

The file system that contains the files that are managed by Data Links must be mounted as a dlfs file system. For the mount to succeed, the dlfsdrv device driver must be loaded and the file system must be defined as a dlfs file system in /etc/filesystems (on AIX).

Symptoms

The symptoms may include:

• The mount command does not show the file system as mounted.

• The mount command shows the file system as mounted, but not as a dlfs file system.

• The mount -v dlfs command fails with one of the following two messages:

dlfs mount Error: Function not implemented
dlfs mount helper: Mount Unsuccessful
Unmount the base file system

dlfs mount helper: Error in getting basefs type
dlfs mount helper: No base file system specified

Solution

Follow these steps:

1. As root, check whether the dlfsdrv device driver is loaded:

strload -qf /usr/lpp/db2_07_01/cfg/dlfs_cfg

This command should return:

/usr/lpp/db2_07_01/bin/dlfsdrv: yes

If no is returned instead of yes, the dlfs device driver needs to be loaded:

strload -f /usr/lpp/db2_07_01/cfg/dlfs_cfg

2. Check /etc/filesystems and verify the following settings for the file system:

vfs = dlfs
nodename = - (Make sure there are no trailing spaces after the dash)
mount = false
options = rw,Basefs=jfs

If the file system is defined as a journaled file system (jfs on AIX), convert it to a dlfs file system by running the dlfmfsmd command as root:

/usr/lpp/db2_07_01/instance/dlfmfsmd </file_system_name>

Here </file_system_name> is the name of a jfs file system to be converted to a dlfs file system.

3. Check /etc/vfs and verify that there is an entry that identifies the helper programs for a dlfs file system. This entry will look something like this:

dlfs 12 /usr/lpp/db2_07_01/bin/dlfs_mnthlp /usr/lpp/db2_07_01/bin/dlfs_fshelper
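The /etc/filesystems checks in step 2 can be sketched as a small script. This is an illustration only: it assumes that the stanza text for one file system has already been extracted, that attributes are written as name = value lines, and that lines beginning with an asterisk are comments. The function names parse_stanza and dlfs_stanza_problems are hypothetical.

```python
def parse_stanza(text):
    """Collect 'name = value' attribute lines from one stanza into a dict."""
    attrs = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: '*' starts a comment; the stanza header line has no '='.
        if not stripped or stripped.startswith("*") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Keep trailing whitespace so a stray blank after the dash can be detected.
        attrs[name.strip()] = value.lstrip()
    return attrs


def dlfs_stanza_problems(attrs):
    """Return a list of deviations from the settings listed in step 2."""
    problems = []
    if attrs.get("vfs", "").strip() != "dlfs":
        problems.append("vfs is not dlfs")
    nodename = attrs.get("nodename", "")
    if nodename.strip() != "-":
        problems.append("nodename is not -")
    elif nodename != "-":
        problems.append("nodename has trailing whitespace")
    if attrs.get("mount", "").strip() != "false":
        problems.append("mount is not false")
    if "Basefs=" not in attrs.get("options", ""):
        problems.append("options does not name a base file system")
    return problems
```

An empty list means the stanza matches the four settings; the trailing-whitespace check is kept separate because that mistake is easy to miss when reading the file by eye.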

Problem: Clients cannot access files in the DLFS file system

To read files in a file system that is managed by Data Links, clients may need several things. First, the file system must be accessible from the network. In other words, the file system needs to be NFS mounted on the clients. A prerequisite for this is that the file system has been exported on the Data Links server. Next, the clients may need read/write permission on the file system. If the files are linked using the READ PERMISSION DB option, clients need to read files with a valid access token.

Solution

Follow these steps:

1. Make sure the dlfs file system is mounted on the client. If it is suspected that the mount is stale, unmount the file system on the client, validate that the file system is mounted on the server, export the file system on the server, and then remount the file system on the client. Stale mounts can occur if the file system is unmounted and then remounted on the server without exporting the file system.

2. Check the access permissions of the dlfs file system on the server and on the client.

3. Check the READ PERMISSION option on the DATALINK column. Section 3.4.5, “Querying DATALINK options” on page 74, discusses how to do this. If it is READ PERMISSION DB, make sure the client application is using the access token generated by DB2. It may be necessary to run a DB2 event monitor that traces the SQL statement activity of the application program to see this. DB2 UDB SQL Reference, SC09-2974, discusses how to create a DB2 event monitor.

If the application uses the DLURLPATHONLY function to extract the pathname and filename from the DATALINK value, DB2 does not return an access token. Also, check the DL_EXPINT database configuration parameter. On AIX, this is:

db2 get db cfg for <dbname> | grep DL_EXPINT


This determines the length of time (in seconds) for which the generated access token will be valid. If the application does not use the token within that period of time, the token will be rejected by DLFM, and the application will not be allowed to read the file.

16.2.5 Frequently Asked Questions (FAQs)

This section answers some common questions asked about the DB2 UDB database and Data Links File Manager (DLFM) environment.

• What file backup products can I use to back up Data Linked files?

Data Links File Manager currently supports:

– Disk
– Tivoli Storage Manager
– Net Backup
– Legato
– Any Backup Services API (XBSA) compliant applications

• Can I integrate disk backup systems with Data Linked files?

No. File system backups of Data Linked files can be made for use in recovering from a disk crash, but file system backups should not be used for any other form of recovery. The DB2 UDB Backup and Restore utilities should be used for Data Links recovery purposes.

• Can I use HSM functionality with Data Linked files?

Yes, but only on the AIX platform.

• Can I mix and match DB2 versions between DLFMs and host databases?

No, a DB2 server and any DLFM servers that are registered with it must be on the exact same release level and fixpak level of DB2.

• Can a DATALINK column in a single table reference file systems on different operating system platforms?

Yes, any combination of the operating system platforms which support Data Links can be registered as a DLFM server. A maximum of 16 DLFM servers can be registered with a DB2 database, unless the DLFM server resides in a DCE/DFS environment, in which case, the limit is one. A DATALINK column in a single table can reference files that are managed by any of the DLFM servers that are registered with the DB2 database.

• What are the symptoms of temporary unavailability of the file system or the network in the DFS case while a workload is running?

– You will not be able to link files, since the two-phase commit processing will fail.
– You will not be able to change directories.


• Is integrity compromised if DLFM is unavailable or if the file system is not currently mounted as DLFS?

If the DLFM is unavailable and read permission is set to DB, integrity will not be compromised. If the file system is not mounted as DLFS, then integrity will be compromised.

• What are the restrictions in a DLFS environment?

The ability to rename directories is restricted.

• What happens when the maximum number of backup copies maintained is exceeded?

Backups that are marked as expired will be garbage collected.

• Is JDBC supported?

Yes, JDBC can be used in a Data Links environment.

• Can I use capabilities of Data Linked files in an NFS environment? Are there any integrity issues?

Yes, Data Linked files can be accessed through NFS. There are no integrity issues. However, caching at the NFS client may result in a user being able to access the READ PERMISSION DB file(s), even when the token has expired.

• What do I miss out on if I use LOBs instead of the DATALINK type? Are there functions if I still want to bring data into the database?

See 3.3, “Data Links versus LOBs” on page 67.

• Would the Data Links control over a file system restrict normal file system activity for ordinary files, or for ordinary file system user accesses into the file system?

All requests to access files that reside in a dlfs file system are intercepted by the dlfs helper programs to determine if the request will be allowed. Access to any file that is linked using READ PERMISSION FS will be allowed or disallowed based only on the file access permissions on the file. Access to any file that is linked using READ PERMISSION DB will be allowed only if a valid access token is supplied as part of the file name. For details about using an access token to read a file linked with READ PERMISSION DB, see 3.5.3, “Reading a linked file” on page 77.

• When does file backup of linked files occur, if recovery is set to yes?

File backups of files that are linked occur asynchronously. When a file is linked, DLFM records the attributes of the linked file (name, creator, access permissions, size, etc.) in the DLFM_DB database, and returns control to the program that issued the SQL INSERT or UPDATE statement. DLFM maintains a queue of files that need to be backed up, and performs the backup of these files as resources permit.


• How can we tell that the copy daemon is busy or doing something before we decide to run a backup?

– Use Operating System tools to see if there is any CPU activity with the copy daemon process (dlfm_copyd) on the DLFM server.

– Run the command:

retrieve query

• How can we tell that the Reconcile utility is not hung?

Use Operating System tools to see if there is any CPU activity with the Reconcile utility agent on the DB2 UDB Server.

• Can we run more than one DLFM on a server?

No, the DLFS is implemented using a kernel extension (see 4.1.4, “Multiple DLFMs on a single host” on page 94) that can communicate with a single DLFM, on one server.

• What kind of operations can continue if DLFM is down?

– Backup
– Restore
– Rollforward
– Reconcile

• Can I create a directory in a DLFS type file system if DLFM is down?

No, the Data Link File System Filter needs to communicate with the DLFM.


Appendix A. BNF specifications for DATALINK

This appendix provides information on the Backus Naur Form (BNF) specifications for DATALINKs.


A DATALINK value is an encapsulated value that contains a logical reference from the database to a file stored outside the database. The data-location attribute of this encapsulated value is a logical reference to a file in the form of a Uniform Resource Locator (URL).

The following conventions are used in the BNF specification:

• | is used to designate alternatives.

• [] are used around optional or repeated elements.

• “” are used to quote literals.

Elements may be preceded with [n]* to designate n or more repetitions of the following element; if n is not specified, the default is 0.

The BNF specification for DATALINK is explained here:

URL

url          httpurl | fileurl | uncurl | dfsurl | emptyurl

HTTP

httpurl      “http://” hostport [ “/” hpath ]
hpath        hsegment *[ “/” hsegment ]
hsegment     *[ uchar | “;” | “:” | “@” | “&” | “=” ]

Note that the search element from the original BNF in RFC1738 has been removed, because it is not an essential part of the file reference and does not make sense in DATALINK context.

FILE

fileurl      “file://” host “/” fpath
fpath        fsegment *[ “/” fsegment ]
fsegment     *[ uchar | “?” | “:” | “@” | “&” | “=” ]

Note that host is not optional and the “localhost” string does not have any special meaning, in contrast with RFC1738. This avoids confusing interpretations of “localhost” in client/server and DB2 EEE configurations.


UNC

uncurl       “unc:\\” hostname “\” sharename “\” uncpath
sharename    *uchar
uncpath      fsegment *[ “\” fsegment ]

Supports the commonly used UNC naming convention on Windows NT. This is not a standard scheme in RFC1738.

DFS

dfsurl       “dfs://.../” cellname “/” fpath
cellname     hostname

Supports the DFS naming scheme. This is not a standard scheme in RFC1738.

EMPTYURL

emptyurl     “”

Empty (zero-length) URLs are also supported for DATALINK values. These are useful to update DATALINK columns when reconcile exceptions are reported and non-nullable DATALINK columns are involved. A zero-length URL is used to update the column and cause unlink

hostport     host [ “:” port ]
host         hostname | hostnumber
hostname     *[ domainlabel “.” ] toplabel
domainlabel  alphadigit | alphadigit *[ alphadigit | “-” ] alphadigit
toplabel     alpha | alpha *[ alphadigit | “-” ] alphadigit
alphadigit   alpha | digit
hostnumber   digits “.” digits “.” digits “.” digits
port         digits


Miscellaneous definitions

Leading and trailing blank characters are trimmed by DB2 while parsing. Also, the scheme names ('HTTP', 'FILE', 'UNC', 'DFS') and host are case-insensitive. They are always stored in the database in uppercase.

lowalpha     “a” | “b” | “c” | “d” | “e” | “f” | “g” | “h” | “i” | “j” | “k” | “l” | “m” | “n” | “o” | “p” | “q” | “r” | “s” | “t” | “u” | “v” | “w” | “x” | “y” | “z”
hialpha      “A” | “B” | “C” | “D” | “E” | “F” | “G” | “H” | “I” | “J” | “K” | “L” | “M” | “N” | “O” | “P” | “Q” | “R” | “S” | “T” | “U” | “V” | “W” | “X” | “Y” | “Z”
alpha        lowalpha | hialpha
digit        “0” | “1” | “2” | “3” | “4” | “5” | “6” | “7” | “8” | “9”
safe         “$” | “-” | “_” | “.” | “+”
extra        “!” | “*” | “'” | “(” | “)” | “,”
hex          digit | “A” | “B” | “C” | “D” | “E” | “F” | “a” | “b” | “c” | “d” | “e” | “f”
escape       “%” hex hex
unreserved   alpha | digit | safe | extra
uchar        unreserved | escape
digits       1*digit
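To see how the rules in this appendix combine, the fileurl production can be transcribed into a regular expression. This is a sketch for illustration, not the parser that DB2 uses: the function name is_valid_fileurl is hypothetical, only the FILE scheme is covered, and the case-insensitive scheme match and trimming of blanks follow the notes in this appendix. The host names in the usage lines are made-up examples.

```python
import re

ALPHADIGIT = r"[A-Za-z0-9]"
# uchar = unreserved | escape, where unreserved = alpha | digit | safe | extra
UCHAR = r"(?:[A-Za-z0-9$\-_.+!*'(),]|%[0-9A-Fa-f]{2})"
FSEGMENT = rf"(?:{UCHAR}|[?:@&=])*"
FPATH = rf"{FSEGMENT}(?:/{FSEGMENT})*"
DOMAINLABEL = rf"{ALPHADIGIT}(?:[A-Za-z0-9-]*{ALPHADIGIT})?"
TOPLABEL = rf"[A-Za-z](?:[A-Za-z0-9-]*{ALPHADIGIT})?"
HOSTNAME = rf"(?:{DOMAINLABEL}\.)*{TOPLABEL}"
HOSTNUMBER = r"[0-9]+(?:\.[0-9]+){3}"
HOST = rf"(?:{HOSTNAME}|{HOSTNUMBER})"
# fileurl = "file://" host "/" fpath; the scheme name is case-insensitive.
FILEURL = re.compile(rf"(?i:file)://{HOST}/{FPATH}")


def is_valid_fileurl(value):
    """Check a string against the fileurl rule after trimming blanks."""
    return FILEURL.fullmatch(value.strip(" ")) is not None


print(is_valid_fileurl("file://dlfs.example.com/datalinks_fs1/img/a.jpg"))
print(is_valid_fileurl("file:///tmp/a"))
```

The second call is rejected because, unlike RFC1738, this grammar does not allow the host to be omitted.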


Appendix B. Overview of DCE-DFS on AIX

This appendix introduces Transarc’s Distributed Computing Environment-Distributed File Service (DCE-DFS) on IBM AIX. It also provides a high-level overview of the DCE-DFS concepts on AIX.


Note: Refer to Administering IBM DCE and DFS Version 2.1 for AIX and OS/2 Clients, SG24-4714, to learn more about the administration of DCE-DFS.


Distributed Computing Environment (DCE)The Distributed Computing Environment (DCE) is a cross-platform, comprehensive, integrated set of services that supports the development, use, and maintenance of distributed computing applications. The availability of a uniform set of distributed computing services gives applications an effective means to harness the power inherent in networks of computers that may otherwise be unused.

DCE has the following main services:

• Distributed File Service (DFS)
• Time Service
• Cell Directory Service
• Security Service
• Threads Service

Figure B-1 shows the layout of various services of DCE.

Figure B-1 DCE architecture

[Figure labels: Distributed Applications; Remote Procedure Call; DCE Time Service; Cell Directory Service; Security Service; Distributed File Service; Threads Service; Transport Services/Operating System]


DCE provides a communications environment that supports information flow from wherever it’s stored to wherever it’s needed, without exposing the network's complexity to the end-user, system administrator, or application developer.

DCE encompasses all of the facilities necessary for building distributed applications. It integrates all of these services into a single, logical structure that enables programmers and administrators to develop and manage distributed applications as easily as traditional, single-system programs.

In DCE, the cell is the basic unit of operation. A cell consists of one to several thousand systems that share an administratively independent installation of server and client machines, a unified DCE Cell Directory Service (CDS) naming environment, and a common authentication server and database. Multiple cells can exist at one geographical location. It is also possible for DFS machines at geographically distant locations to belong to the same cell. However, a machine can belong to only one cell at a time.

DCE is based on many formal and de facto standards, including:

• Internet TCP/IP protocols
• POSIX 1003.4a draft threads and POSIX 1003.6 draft ACLs
• CCITT X.500/ISO 9594 Directory Service
• Internet DNS and Network Time Protocols (NTP) standards
• X/Open Directory Service (XDS) and X/Open Object Management (XOM) application programming interfaces
• Internet GSS API

Distributed File Service (DFS)

The Distributed File Service is a DCE application that provides global file sharing. Access to files located anywhere in the interconnected DCE cells is transparent to the user. To the user, it appears as if the files were located on a local drive. DFS servers and clients may be heterogeneous computers running different operating systems.

DFS originates from the Transarc Corporation's implementation of the Andrew File System (AFS) from Carnegie Mellon University.

The DFS is built onto and integrated with all of the other DCE services. It has the following main features:

• Location transparency
• Uniform naming
• Good performance
• Security
• High availability
• File consistency control
• NFS interoperability

The DFS distributed file system is a high-performance, scalable, secure method for sharing remote files. DFS appears to the user as a local file system, providing access to files from anywhere in the network for any user, with the same file name used by all (that is, uniform file access). DFS includes many advanced features not found in traditional file systems. It includes scalability and security over wide area networks, which greatly enhance DFS performance and, at the same time, simplify administration.

DFS includes the following services:

• Distributed File System Client: The Distributed File System Client makes requests for file data from file servers and maintains caches of commonly requested information. Through sophisticated protocols, the client ensures that file updates made by multiple users are coordinated so that a single file image is seen by all users.

• Base File Service (BFS): The BFS distributed file system server provides file data from existing local file systems to DFS clients. Using BFS, an administrator can make existing data from a UNIX File System (UFS), JFS, Veritas, CD-ROMs, and other physical file systems available to DFS clients.

• Enhanced File Service (EFS): The EFS distributed file system server provides features that greatly increase the availability of information and further simplify the administration of DFS. The EFS delivers the ability to replicate, back up, and even move different parts of the DFS file system without interrupting service to the end user. Through the use of copy-on-write technology, EFS can maintain entire snapshots of backed-up file data for on-line access to previous versions of files. EFS enables the use of access control lists (ACLs) on files and directories stored in DFS for fine-grained control over access to data. The EFS also includes a high-performance, log-based physical file system for fast server restart.

• DCE NFS-to-DFS Secure Gateway: The NFS-to-DFS Secure Gateway provides uniform and secure access to DFS files from NFS clients. The gateway provides an easy migration path for the introduction of DFS into environments with widely installed NFS clients.

• DFS and the Web: DFS combines its replication, unique file names, security and scalability to meet the demands of growing Web sites. And DFS Web Secure is the ideal tool for protecting corporate security as you expand access to your enterprise via the Web.


DFS Name Space

In a distributed computing environment connecting many workstations, a user likely will have access to several different computers. For example, a user in New York might prepare a document for a meeting in Europe using an office computer, and later amend the document from a computer in Munich. For this reason, a distributed computing environment should support global file names. One mechanism that allows the name of a file to look the same on all computers is called a uniform name space. Without such a mechanism, users might have difficulty finding files as they move from computer to computer and might have to return to the workstation on which they created their files to make updates efficiently.

DCE-DFS solves this problem by providing an enforced uniform name space. It specifies a naming convention with which all installations must comply. DFS file access is consistent, regardless of which computer is being used or by whom. In addition, the DCE DFS naming system is designed to provide a global name space across all DFS installations. As a result, all DFS installations taken together appear as one worldwide file system.

Figure B-2 shows an example of a Cell Directory Service (CDS) entry in Domain Name Service (DNS) format.

Figure B-2 CDS entry in DNS format

DNS format: /.../almaden.ibm.com/fs/usr/ricardoh/games/tictactoe.exe
[Figure labels, in order: Cell root; CDS Entry into DFS; File System Directory; File name]

Table B-1 summarizes some of the common terms used in the DCE-DFS environment.

Note: The local cell can also be abbreviated to:

/:/usr/ricardoh/games/tictactoe.exe

The /: abbreviation represents /.../local_cell/fs.
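The name structure in Figure B-2 and the /: abbreviation can be sketched in a few lines of Python. This is a minimal illustration; the function names are hypothetical, and the way the path is divided among the figure labels (cell, fs junction, directory, file name) is our reading of the figure rather than something the DCE tools expose in this form.

```python
LOCAL_PREFIX = "/:"

def expand_local(path, local_cell):
    """Expand the /: abbreviation into /.../<local_cell>/fs."""
    if path == LOCAL_PREFIX or path.startswith(LOCAL_PREFIX + "/"):
        return "/.../" + local_cell + "/fs" + path[len(LOCAL_PREFIX):]
    return path

def split_global_name(path):
    """Split /.../<cell>/fs/<directory>/<file> into its labeled parts."""
    if not path.startswith("/.../"):
        raise ValueError("not a global DCE name: " + path)
    cell, junction, rest = path[len("/.../"):].split("/", 2)
    directory, _, filename = rest.rpartition("/")
    return {
        "cell": cell,                  # cell root is /.../<cell>
        "junction": junction,          # CDS entry into DFS
        "directory": "/" + directory,  # file system directory
        "file": filename,              # file name
    }
```

For example, expanding /:/usr/ricardoh/games/tictactoe.exe with the local cell almaden.ibm.com yields the DNS-format name shown in Figure B-2.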


Table B-1 Some commonly used terms in DCE-DFS environment

Term: Explanation

DMAPI: Data Management Application Programming Interface (DMAPI) is a user-level programming interface to logical extensions of the operating system. It supports data management applications that typically require intercepting file system operations in a manner that is transparent to file system applications.

DCE-Cell: A DCE-Cell consists of a collection of machines that fall within a single DCE administration domain.

DFS SMT: DFS Storage Management Toolkit (DFS SMT) is an implementation of DMAPI for the DFS Local File System (DCE LFS). It provides some extensions to DMAPI to handle certain DFS-specific aspects.

Aggregate: An aggregate is a logical unit of disk storage, similar to a disk partition. The DCE LFS aggregate is a logical volume that has been formatted as a DCE LFS physical file system by the DFS newaggr command. A DCE LFS aggregate can contain multiple DCE LFS filesets. A standard UFS exported into the DFS file space by a DFS File Server is referred to as an aggregate or a non-LFS aggregate, which can contain only one fileset.

Fileset: A fileset is a hierarchical grouping of files, managed as a single unit; this is the basic unit of data administration in DFS. DCE LFS supports multiple filesets within a single aggregate. When UFS is used with DFS, the entire file system is considered one fileset.

DCE LFS: The log-based, high-performance physical file system provided with DFS. The DCE LFS supports multiple filesets within a single aggregate, fileset replication, fast system restarts, and DCE access control lists.

Non-LFS: Non-LFS refers to OS native file systems (JFS on AIX).

DM-enabled: DM-enabled aggregates are aggregates that have been enabled with Data Management.

Events: Events are the foundations of DMAPI. In this paradigm, the operating system informs a DM application running in the user space when a particular event occurs (pertaining to the file system). Events may be:

• Synchronous: A token identifies the event message, and the DM application must respond to each event message to keep the system or calling applications from hanging.

• Asynchronous: No token is involved, and no response is required from the DM application; these events are mostly used for logging purposes.


Appendix C. VPM and Data Links

This appendix shows how IBM middleware (DB2 and Data Links) can provide data archive and restoration solutions on a large enterprise basis, specifically when working with IBM and Dassault Systèmes CATIA and VPM. It details how the systems work together and the different options that are available.

Data Links technology has been supported in VPM since the general availability (GA) of VPM 1.2. This technology support provides four primary capabilities:

• Logical data consistency: For example, an engineer cannot delete or rename a file that is referenced by its corresponding part description in the database.

• Transaction consistency: If a transaction is rolled back in the database, the link to the appropriate version of the file at this site is maintained.

• Security and access: Files controlled by Data Links can either be totally protected by the database, preventing unauthorized file system access, or opened to allow file system access.

• Synchronized backup and recovery: Using DB2 with Data Links ensures consistent backup and recovery of ENOVIAVPM metadata and the associated CATIA models. This makes the overall process more automatic and less database administrator (DBA)-intensive. In the past, administrative tasks were performed outside of the CATIA environment, requiring a separate backup strategy for external CATIA files, which introduced a large risk of inconsistencies between the database and related external files.


Installation overview

You need to follow these steps to assemble DB2, VPM, and Data Links:

1. Install the DB2 UDB server.
2. Install DB2 CAE for VPM Clients.
3. Create VPM DB.
4. Set up DB2 CAE Client communications.
5. Install VPM on a client and populate the UDB database.
6. Install DB2 Data Links.
7. Enable VPM for Data Links storage.

You can find instructions for installing and configuring DB2 Data Links in IBM DB2 Data Links Manager for AIX, Quick Beginnings, GC09-2837.

CATIA 422R1

This installation uses CATIA V4.22 R1, and the installed PTFs will vary. You should contact your local geography's CATIA level 1 support organization.

VPM 1.3

VPM 1.3 PTF1 is known by APAR HC64371 with PTF UB79557, UB79556, UB79561, UB79563, and UB79564 for AIX.

VPM 1.3 PTF2 is known by APAR HC67387 with PTF UB80888, UB80890, and UB80893 for AIX.

VPM 1.3 PTF3 is known by APAR HC69972, with PTF UB81606, UB81614, UB81599, UB81611, and UB81593 for AIX.

To create the VPM database on the UDB server, you must first create an empty database on the UDB server. Optionally, you can install VPM on the UDB server or simply mount the VPM code and administrator's file systems from an installed client workstation with NFS. After the empty database is created, the population of the VPM data structures can be performed on the server. It is not the intention of this document to describe how to install CATIA or VPM. We assume that we are starting with a working database and system.

A CATIA PTF is required to support more than one Data Links File Manager while VPM is in operation. By default, without APAR HC68395 or PTF UB81274 & UB81275, you can only work with one hardcoded (and declared) Data Links File Manager.


DB2 6.1 level used

For the purpose of this exercise, we used Fixpak 5 for DB2 6.1. This is delivered on PTF U472727 (AIX). Remember, after you install this PTF on a Universal Database server, you must update your database instance where your VPM database is located:

db2iupdt <instance name>

Then, rebind your VPM database. On a Data Links File Manager (DLFM), you must update the dlfm instance:

dlmupdt <instance name>

Then rebind it:

dlfm bind

We will start a DLFM installation from scratch. Also remember to install Fixpak 5 on the DB2 server, the Data Links File Manager nodes, and the clients.

Installing DB2 Data Links Manager 6.1 GA

Here we assume that the VPM client and the VPM database are on the same node, and the Data Links File Manager is on a separate physical node. For the purposes of DB2 and its Data Links File Managers, ports need to be defined in the /etc/services file. In our example, we use 50100, as suggested by DB2 Data Links Manager Quick Beginnings, GC09-2966. If you used the DB2 installer (db2setup) from the CD-ROM, these can be generated automatically.

Software levels, Fixpak 5

You need Fixpak 5 to run this installation. You can find Fixpak 5 for DB2 6.1 on the Web at: ftp://ftp.software.ibm.com/ps/products/db2/fixes/english-us/db2aixv61/

On this site, you will find the files that are listed in Figure C-1.


Figure C-1 DB2 V6 Fixpak 5

The file named bnd.tar is a collection of the client bind files.

Preliminary installation steps

Before you install the Data Links Manager software, there are a few steps that you must perform. You need to create new journaled file systems (JFS) to be used by the DB2 code, the dlfm administrator, and the DLFM backup directory/file system. Whatever your backup rules are, at a minimum the backup directory is used to store database backups of the DLFM_DB database that is created during the installation of Data Links. You also need to create a group and user that will be the DLFM administrator (and DLFM instance owner). The steps are outlined here:

1. When creating your file systems, use this list as a reference (your choices may be different depending on your installation).

Table C-1 Creating your file systems

File system name         File system size   Description
/home/dlfm               105 MB             dlfm instance home
/home/dlfm/dlfmbackup    65 MB              db backups
/usr/lpp/db2_06_01       as needed          actual DB2 s/w

Note: Do not mount these file systems yet.


2. Do not create your Group/User for the DLFM Instance Owner if using RDIST. The DB2 Installer (db2setup) does this for you in V 6.1.

3. Modify the file system stub points to match the new dlfm user ID. Then mount the file systems from step 1.

Data Links post-installation

From now on, we assume that Data Links has been successfully installed and that the installation has been verified.

The DB2 installer should have placed the following list of variables in the db2profile or local profile (.profile)/dtprofile for the dlfm administrator:

• DLFM_PORT=port_number
• DLFM_LOG_LEVEL=LOG_ERR
• DB2_RR_TO_RS=ON
• DB2_HASH_JOIN=ON
• DLFM_INSTALL_PATH=$HOME/sqllib/bin
• DB2INSTANCE=dladmin_username
• DLFM_BACKUP_DIR_NAME=$HOME/dlfmbackup

The following values were set for our example:

• DLFM_PORT=50100
• DLFM_LOG_LEVEL=LOG_ERR
• DB2_HASH_JOIN=ON
• DB2_RR_TO_RS=ON
• DLFM_INSTALL_PATH=$HOME/sqllib/bin
• DB2INSTANCE=dlfm
• DLFM_BACKUP_DIR_NAME=$HOME/dlfmbackup
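One way to confirm that a profile contains all seven variables is a small script such as the following Python sketch. It is not part of DB2 or Data Links: the function names are hypothetical, and it assumes the profile sets the variables with plain NAME=value lines, optionally prefixed by export.

```python
# The seven variables listed above.
REQUIRED = [
    "DLFM_PORT", "DLFM_LOG_LEVEL", "DB2_RR_TO_RS", "DB2_HASH_JOIN",
    "DLFM_INSTALL_PATH", "DB2INSTANCE", "DLFM_BACKUP_DIR_NAME",
]

def parse_profile(text):
    """Collect NAME=value assignments from profile text."""
    found = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].strip()
        name, sep, value = line.partition("=")
        if sep and name.isidentifier():
            found[name] = value
    return found

def missing_variables(text):
    """Return the required variables that the profile text does not set."""
    found = parse_profile(text)
    return [name for name in REQUIRED if name not in found]
```

An empty result from missing_variables means every variable in the list is assigned somewhere in the profile text; it does not check that the values are sensible.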

Time management: A very important aspect of the Data Links technology depends on the time synchronization that exists between the Data Links File Manager and the UDB database for which it is configured.

If the time difference exceeds the expiry time of the token, you will be unable to access the files stored in the Data Links File Managers. Time synchronization is also important for point-in-time recovery. Now is a good time to synchronize your machine time and time zone information. There is also an AIX daemon called timed that can broadcast a network time, which makes synchronization much easier.
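The effect of clock skew on token lifetime can be illustrated with a little arithmetic. The Python sketch below is a simplified model, not the actual validation logic of the Data Links File Manager: it assumes that expiry is judged against the file server's clock and that the file server's clock runs ahead of the database server's clock by clock_skew seconds. The function names are hypothetical.

```python
def seconds_remaining(dl_expint, clock_skew, elapsed=0):
    """Seconds of validity left, as judged by the file server's clock.

    dl_expint  -- token lifetime in seconds
    clock_skew -- file server clock minus database server clock, in seconds
    elapsed    -- real seconds that have passed since the token was issued
    """
    return dl_expint - clock_skew - elapsed

def token_usable(dl_expint, clock_skew, elapsed=0):
    """True while the modeled token still has time left."""
    return seconds_remaining(dl_expint, clock_skew, elapsed) > 0
```

Under this model, a 60-second token with a 90-second skew has already expired when it is issued, while a 5-second skew leaves 45 seconds after 10 seconds have passed.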


Refer to AIX Version 4.3 System Management Guide: Communications and Networks, SC23-4127. This book contains reference information on Advanced Interactive Executive (AIX) operating system commands. It also describes the tasks that each command performs, how commands can be modified, how they handle input and output, and who can run them. Plus it provides a master index for all six volumes.

Making Data Links work with VPM

You should complete the steps in the following sections to make Data Links work with VPM.

At the Data Links server (file server)

Starting on the Data Links server, follow these steps:

1. Create a JFS file system /test and register it with DLFS (using the dlfmfsmd script).

2. Register the /test file system with the DLFM by issuing the following command:

dlfm add_prefix /test

3. The VPM database vpmdb1 (residing at the DB2 server) should be registered with DLFM. If this database resides in the db2adm instance on a machine called ibm3 (the DB2 server), issue the following command:

dlfm add_db vpmdb1 db2adm ibm3

4. Start the DLFM by issuing the following command:

dlfm start

5. Create the directory called pictures on the file system /test, by entering the following command:

mkdir /test/pictures

6. Change the permissions of the pictures directory that you just created so that any user can create a file in that directory by entering the following command:

chmod 777 /test/pictures

7. Create a file called paulz.bmp in the /test/pictures directory, to be managed by the Data Links File Manager, by entering the following command:

echo "This is a picture of Paul Zikopoulos" > /test/pictures/paulz.bmp


At the DB2 server

Continue the process on the DB2 server by following these steps:

8. Log on to the system with a valid DB2 user ID that has System Administrative (SYSADM) authority on the DB2ADM instance that you created.

9. Run the db2profile or db2cshrc script as follows:

. <INSTHOME>/sqllib/db2profile (for Bash, Bourne or Korn shell)
source <INSTHOME>/sqllib/db2cshrc (for C shell)

Here, <INSTHOME> is the home directory of the instance owner (in this case, DB2ADM).

10. Start the DB2ADM instance by entering the command:

db2start

11. Register the Data Links server that will control the files that are linked by a DATALINK data type. In our example, the Data Links server is the machine called ibm4. Enter the following command:

db2 "add datalinks manager for database vpmdb1 using node ibm4 port 50100"

12. Connect to the VPMDB1 database by entering the following command:

db2 connect to vpmdb1

13. Create a table called EMPLOYEE in the VPMDB1 database, with a column defined with a DATALINK data type, by entering the following command:

db2 "create table employee (id int, fname varchar(30), lname varchar(30), picture datalink linktype url file link control integrity all read permission db write permission blocked recovery yes on unlink restore)"

Note: By default, any user that belongs to the primary group of the instance owner has SYSADM authority on an instance.


14. Insert an entry into the EMPLOYEE table that you created by entering the following command:

db2 "insert into employee values (001,'Paul','Zikopoulos', dlvalue('http://ibm4/test/pictures/paulz.bmp'))"

Back again at the Data Links server

Return to the Data Links server, and log on to the system as any user (except as a user with root authority, or as the DB2 Data Links Manager Administrator).

Verify that the paulz.bmp file is now controlled by the Data Links File Manager by entering the following command:

cat /test/pictures/paulz.bmp

If this file is being controlled by the Data Links File Manager, you receive the following error:

Cannot open /test/pictures/paulz.bmp.

VPM and Data Link tokens

This section explains how to force Data Links to generate tokens compatible with VPM and how to use them in conjunction with VPM.

Uppercase tokens

For VPM operations to succeed, you must change a database configuration parameter in the VPM database so that Data Links tokens are generated with all uppercase letters. To enable this, set the database configuration parameter DL_UPPER to YES.

Note: The DATALINK options used in the CREATE TABLE statement for the EMPLOYEE table give a read permission to all users, but block them from writing. If and when a file is unlinked from this table, the file is restored or deleted from the file system. This includes operations in VPM as simple as a CATIA File->Save (overwrite) or a New Model Revision. In our example, we used restore. When the unlink option is set to delete, the previous model copy is deleted. The delete option is desirable if you have implemented a backup strategy (Tivoli Storage Manager, for example) that can manage the archived or backed-up files as they are created. If you are not using Tivoli Storage Manager, and you want to maintain backup versions (on a daily, weekly, or other basis), set the option to RESTORE.


You can use the following commands to enable this setting. Log in as the DB2 administrator for the VPM database (on the DB2 server):

db2 "connect to vpmdb1"
db2 "update db cfg for vpmdb1 using DL_UPPER 'YES'"

Using the Data Links access token to access a file

The access token gives an application a way to secure access to a file, granting the right only to the users that request this access. In the case of VPM, models stored in a DLFS can be opened so that users can access the files through the directories, or only through database authorization. Database authorization is granted via VPM.

The database configuration parameter DL_EXPINT controls how long (in seconds) a Data Links token remains valid. The token is granted when the table is queried (with an SQL SELECT statement). The token is then valid for the length of time specified by DL_EXPINT. If the table is queried again, a new token is generated, which is valid for the same length of time. Each token expires when its own DL_EXPINT period ends.

In the case of a remote DLFS that is used to store documents in a wide area network (WAN), the token only needs to be set for as long as it takes for a valid file to be opened. If a CATIA model must be transferred over a telecommunications line, the token would only need to be valid for the amount of time it takes to start opening the model, not the entire time for the model to be read into memory.
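The timing rules described above can be sketched as follows. This Python fragment is a simplified model with hypothetical function names; it only illustrates that each token has its own validity window starting at the SELECT that produced it, and that the window has to cover the moment the file is opened, not the time it takes to read it. Times are in seconds on a single clock.

```python
DL_EXPINT = 60  # seconds; the default mentioned in this appendix

def expiry(issued_at, dl_expint=DL_EXPINT):
    """Time at which a token issued at issued_at stops being valid."""
    return issued_at + dl_expint

def can_open(issued_at, open_at, dl_expint=DL_EXPINT):
    """True if the file is opened inside the token's validity window."""
    return issued_at <= open_at < expiry(issued_at, dl_expint)

# Two SELECTs, 45 seconds apart, produce tokens with independent windows.
first_token, second_token = 0, 45
```

With these values the first token expires at 60 and the second at 105. A model opened at second 50 with the first token can still be read for several minutes afterward, because only the open has to fall inside the window.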

Changing the expiry token

By default, the access token that is returned is only valid for 60 seconds. This means that once you obtain a token, you only have 60 seconds to complete the remaining steps in this section (or edit any Data Links controlled file). You can change the default expiration time by changing the DL_EXPINT database configuration parameter.

To change the default expiration time for an access token to 10 minutes (the value is entered in seconds), enter the following commands on the database server:

db2 update db cfg for vpmdb1 using dl_expint 600
db2 terminate
db2 connect to vpmdb1

If you change a setting for any database configuration parameter, you must always reconnect to the database for the changes to take effect.


Obtaining an access token

Start the DB2ADM instance by entering the db2start command. Connect to the VPM database by entering the following command:

db2 connect to vpmdb1

Select the controlled file for update by issuing an SQL SELECT statement, such as:

db2 "select dlurlpath(picture) from employee where lname = 'Zikopoulos'"

This command returns the full path name with an access token of the form:

<controlled_filepath>/<access_token>;<controlled_filename>

Note the following explanation:

• <controlled_filepath>: The fully qualified path of the controlled file.

• <access_token>: An encrypted key assigned by the database manager.

• <controlled_filename>: The name of the file that is under the control of a Data Links File System Filter.

In our example, the path with the access token that you receive is similar to this example:

/test/pictures/HVJ5NXGC0WQ.I5KKB6;paulz.bmp

This key is used to read this file on the Data Links server.
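An application that receives such a path may need to separate its three parts. The following Python sketch shows one way to do so; the function name is hypothetical, and it assumes that the token itself contains neither a slash nor a semicolon, which holds for the example above but is not guaranteed by anything stated here.

```python
def split_tokenized_path(path):
    """Split <controlled_filepath>/<access_token>;<controlled_filename>."""
    directory, slash, last = path.rpartition("/")
    token, semicolon, filename = last.partition(";")
    if not slash or not semicolon or not token or not filename:
        raise ValueError("not a tokenized path: " + path)
    return directory, token, filename
```

Applied to /test/pictures/HVJ5NXGC0WQ.I5KKB6;paulz.bmp, this returns the directory /test/pictures, the token HVJ5NXGC0WQ.I5KKB6, and the file name paulz.bmp.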

For a complete urlpath for the object, issue the following command:

db2 "select dlurlcomplete(picture) from employee where lname = 'Zikopoulos'"

The system responds with a URL:

HTTP://HOSTNAME/DIRECTORY_NAME/Token_key;FILENAME

Verify that you can access the file that is under the control of the Data Links File Manager. In our example, enter the following command:

cat "/test/pictures/<token_key>;paulz.bmp"

Here, <token_key> is the encrypted key that you recorded in the previous step.

You should receive the following output from this command:

This is a picture of Paul Zikopoulos


Adapting VPM to work with Data Links

This section explains how to alter the VPM tables to make them compatible with Data Links.

Altering the VPM table CDM.INFO_LF

How you define the DATALINK column in CDM.INFO_LF affects the security and recovery of the CATIA model files. The following example guarantees referential integrity. It prevents unauthorized access by external applications (like CATIA File/Open). Also, as files are unlinked (re-written), the older versions are deleted from the Data Links file systems:

db2 alter table CDM.INFO_LF ADD CUR_DATALINK DATALINK LINKTYPE URL FILE LINK CONTROL INTEGRITY ALL READ PERMISSION DB WRITE PERMISSION BLOCKED RECOVERY YES ON UNLINK RESTORE

Read Permission DB specifies that in order to access the model file, authorization through VPM must be used. The option for UNLINK DELETE (or UNLINK RESTORE) alters the behavior of an unlinked file. With Data Links implemented, as a user unlinks (or deletes) a VPM model, the most recent previously written version in the DLFMBACKUP directory is restored to the Data Links file system. To allow for garbage collection of the unwanted versions of backup files, you should specify ON UNLINK DELETE.

DATALINK options for VPM

These are the DATALINK options that you should use when VPM is working with Data Links:

• INTEGRITY ALL

Any model referenced by a DATALINK column is under the control of the Database Manager and may not be deleted, renamed, or copied using standard file system commands.

• READ PERMISSION FS/DB

When set to DB, model files can be read only by VPM. When read permission DB is used, VPM must obtain an encrypted token from DB2 and use it to open the file. When set to FS, application access is granted to the Data Link file system, based on file system permissions.

• WRITE PERMISSION BLOCKED

When write permission blocked is used, DB2 does not allow linked files to be modified. To modify a file, VPM performs the following steps:


a. Makes a copy of the linked file
b. Makes changes to the copy
c. Unlinks the original file
d. Links the modified file

• RECOVERY YES

This option allows point-in-time recovery of VPM and model data. This means that you can restore a backup database image and then roll the logs forward to a point in time. You should use this option so that models can be recovered, if they were ever to be lost.

• ON UNLINK RESTORE/DELETE

For RESTORE, when VPM deletes a model, the (unlinked) file will be returned to its previous AIX owner and file permission set. For DELETE, the file is erased.

DATALINK column options in the database

If you want to see the current settings of the DATALINK column in the VPM database, you could use the following DB2 statements.

Log in as the DBA for the VPM database:

db2 "connect to VPMDB1"
db2 "select COLNAME,DL_FEATURES from SYSIBM.SYSCOLPROPERTIES"

You should receive a listing similar to this example:

COLNAME         DL_FEATURES
CUR_DATALINK    UFADBYD

Figure C-2 illustrates how the DL_FEATURES can be interpreted.
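The position-by-position reading of a DL_FEATURES value can also be done in code. The Python sketch below is an illustration with a hypothetical function name; it knows only the letter codes that appear in Figure C-2 for the value UFADBYD and reports any other letter as unknown rather than guessing its meaning. It also assumes that the first seven characters of the column carry the options.

```python
# Option name and known letter codes for each position, per Figure C-2.
POSITIONS = [
    ("Linktype", {"U": "URL"}),
    ("Link Control", {"F": "FILE"}),
    ("Integrity", {"A": "ALL"}),
    ("Read Permission", {"D": "DB"}),
    ("Write Permission", {"B": "Blocked"}),
    ("Recovery", {"Y": "Yes"}),
    ("On Unlink", {"D": "Delete"}),
]

def decode_dl_features(value):
    """Map the first seven letters of a DL_FEATURES value to option names."""
    value = value.strip()
    if len(value) < len(POSITIONS):
        raise ValueError("expected at least 7 characters: " + repr(value))
    return {
        name: codes.get(letter, "unknown (" + letter + ")")
        for (name, codes), letter in zip(POSITIONS, value)
    }
```

Decoding UFADBYD gives DB for Read Permission and Delete for On Unlink, which matches the interpretation in Figure C-2.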


Figure C-2 Interpreting DL_FEATURES values

DL_FEATURES column from the SYSIBM.SYSCOLPROPERTIES table: UFADBYD

Position   Option              Value
1          Linktype            U=URL
2          Link Control        F=FILE
3          Integrity           A=ALL
4          Read Permission     D=DB
5          Write Permission    B=Blocked
6          Recovery            Y=Yes
7          On Unlink           D=Delete

CATIA and VPM declarations

The following definitions must be added to the CATIA declaration series. This must be completed for both the VPM and CATIA sides.

catcdm.DBLFCAT_AUTHORIZATIONS = 'rw-rw-rw-' ;
catcdm.DBLFAIX_ALGO = 'DELETE_RR' ;
catcdm.DBLFAIX_OLD_SUFFIX = '' ;
catcdm.DATALINK_SERVER : set of STRING ;
catcdm.DATALINK_SERVER = 'dlfm_machine,/dlff_filesystem1' ;
catcdm.DATALINK_SERVER = 'dlfm_machine,/dlff_filesystem2' ;
catcdm.DBLFCAT_NOSHOW_PATH = 'TRUE' ;

Where 'dlfm_machine' is the nodename of your DLFM, and '/dlff_filesystem1' is the physical name of the DLFF file system. The declaration catcdm.DBLFAIX_ALGO is set in the delivered MECCDM.dcls.


The declaration catcdm.DBLF_OLD_SUFFIX is also set in the delivered MECCDM.dcls. The declaration catcdm.DBLFCAT_AUTHORIZATIONS is set in the delivered CATCDM.dcls. The declaration catcdm.DBLFCAT_NOSHOW_PATH is currently set to FALSE in the DCLS file CATCDM.dcls. You can modify these declarations, which are already in your DCLS set, as needed. Then add the remaining catcdm.DATALINK_SERVER declarations to either CATCDM.dcls or MECCDM.dcls.

Mounting the DLFS file system on the VPM clients

Each VPM client that needs to write to the DLFS (/test in our example) needs to NFS mount the DLFS file systems.

Use the following mount characteristics. Log on as root and issue the following command:

mount -o noac dlfmsrv:/test /test

Here dlfmsrv is the DLFM nodename and /test is the NFS exported file system.

Seeing how the files are manipulated by Data Links

Once you have stored your model in the Data Links storage environment, it is important to understand that, whether you are using the VPM driver DBLFCAT or DBLFAIX, you must let VPM name the models for you. The reason is that every time a CATIA model is saved with File->Save, or re-stored, a new file is generated in the Data Links storage area. If your users assign a model name themselves, it is impossible to re-save the model.

You can use the following process to write and update models in VPM, so that Data Links has control of them.

Writing a model

If the user uses the VPM method for Model Create, they see this screen (Figure C-3).


Figure C-3 Creating a model in VPM

1. Select the Create & Save icon (Figure C-4).


Figure C-4 Creating and saving a model

2. Fill in these attributes. Note that for the repository field, we selected the DBLFCAT driver, and for the directory, we specified our Data Links file system, which is mounted locally.

3. Clicking OK displays a confirmation warning (Figure C-5), such as one for a new part or model.

Figure C-5 Confirm Write


Then, the model is saved (Figure C-6).

Figure C-6 Saved model in VPM

4. Now, if you go to a window on your VPM client and change the current working directory to the NFS mounted Data Links File system, and run a list, you will see this file there, still available to you and other users logged into your client for read-only (Figure C-7).


Figure C-7 Read-Only file

This file stays in this format (not fully Data Linked), until either another model is written, or you re-save this same model.

5. Open your model in CATIA by selecting the model created a few steps earlier (Figure C-8).


Figure C-8 Opening a model in CATIA

6. Double-click the Model definition shown above. CATIA then opens and displays your model (Figure C-9).


Figure C-9 A model in CATIA

7. At this point, if you simply click File->Save, or press CTRL+S from your CATIA session, the model is re-filed.

8. You should then open an aixterm or any other type of window, and change directories to the Data Links file system. If you ask for a detailed list, you see output similar to the example in Figure C-10.


Figure C-10 File under Data Links control now

You can now see that DB2 Data Links has taken full control of your model. You should also notice that the model file name has changed. The reason is that Data Links Version 6.1 does not support update-in-place of linked files. This means that to change a linked file, VPM must perform the following tasks:

1. Make a copy of the linked file.
2. Make changes to the copy.
3. Unlink the original file.
4. Link the modified file.
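The four tasks above explain why each save produces a file with a new name. The following is an illustrative simulation only: LinkRegistry is a hypothetical in-memory stand-in for the DATALINK column, update_linked_file and demo are invented names, and no DB2, VPM, or Data Links interfaces are called. File permissions and backup copies are not modeled.

```python
import shutil
import tempfile
from pathlib import Path


class LinkRegistry:
    """Hypothetical stand-in for the DATALINK column: the set of linked files."""

    def __init__(self):
        self.linked = set()

    def link(self, path):
        self.linked.add(Path(path))

    def unlink(self, path):
        self.linked.remove(Path(path))


def update_linked_file(registry, original, new_name, modify):
    """Apply the four tasks and return the path of the newly linked file."""
    original = Path(original)
    if original not in registry.linked:
        raise ValueError(f"{original} is not linked")
    copy = original.with_name(new_name)
    shutil.copy2(original, copy)  # 1. Make a copy of the linked file.
    modify(copy)                  # 2. Make changes to the copy.
    registry.unlink(original)     # 3. Unlink the original file.
    registry.link(copy)           # 4. Link the modified file.
    return copy


def demo():
    """Run the workflow in a temporary directory; return (old name, new name, text)."""
    with tempfile.TemporaryDirectory() as workdir:
        registry = LinkRegistry()
        original = Path(workdir) / "model_v1"
        original.write_text("geometry A\n")
        registry.link(original)

        def add_feature(path):
            # Stand-in for the user's edit: append one line to the copy.
            with path.open("a") as handle:
                handle.write("geometry B\n")

        updated = update_linked_file(registry, original, "model_v2", add_feature)
        assert registry.linked == {updated}
        return original.name, updated.name, updated.read_text()
```

After the update, only the new file is linked, which mirrors the change of file name seen in the directory listing.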

Whenever a file is linked, a backup copy of this file is written to the directory pointed to by the DLFM_BACKUP_DIR_NAME variable on the DLFM server. This means that every time a model is saved, a copy of the new version is written to the backup directory.

If you want to see your prior file, log into your DLFM as root or dlfm, and change your current working directory to the directory identified by the variable DLFM_BACKUP_DIR_NAME. In this directory, you see backed up copies of the DLFM_DB database, and a subdirectory that includes the DLFM’s file system, in our case test (see Figure C-11).


Figure C-11 Backup directory

You see all of the backup copies that have been written to the DLFM. The file you are looking for should be found in the display shown in Figure C-12.

Figure C-12 Files backed up under the Backup directory


Model storage methods

VPM uses three different methods for writing models:

• DBLFCAT: This is a file system based method that writes models with a model read header inserted, so that the file can be opened from an AIX file system. This allows a CATIA user to perform a File/Open operation from within CATIA.

• DBLFAIX: This is another file system based method that writes models without a model header. These models are unreadable when accessed through a file system read operation.

• DBLFCDM: This method writes long field BLOBs into the CDM relational database.

With the Data Links PTFs applied, any or all of these write methods will work. If you want to restrict the use of any of these methods, the normal method is to customize your environment profiles, and use a User Exit called DMUSLF, or CheckLFBeforeWrite.

Note: If you attempt to delete this model from VPM, you will encounter an Abend S0004. This defect has been reported to Dassault Systèmes. The abend was encountered using the Data Links UNLINK options of both RESTORE and DELETE. The file should have been returned to the Data Links File Manager file system, with the original user's permissions and ownership in the case of RESTORE, and deleted in the case of DELETE.

Additional information

Throughout the deployment of CATIA or ENOVIA solutions, functional and technical architectures interact. They need to remain consistent as Digital Enterprise requirements evolve to require global teams, distributed databases, and ever-increasing integration of function to support digital mock-up and virtual product, process, resource, and model management.

In April 1999, IBM and Dassault Systèmes announced the establishment of the IBM/Dassault Systèmes International Competency Center (IDSICC) located at Dassault Systèmes headquarters in Suresnes, France. The IDSICC is staffed by a team of highly skilled developers from both IBM and Dassault Systèmes. All have extensive hands-on experience with implementing IBM and Dassault Systèmes Digital Enterprise solutions for customers.

The center's mission is to provide worldwide technical experience and comprehensive Digital Enterprise system recommendations to customers as well as to IBMers and IBM business partners to address every phase of development from research to implementation and production.

IBM is the only technology provider in CATIA/ENOVIA solutions to develop a competency center with this mission of total solution optimization. So IBM provides you with the most powerful combination available in the industry through the integration of IBM key e-business technologies and services to implement the Digital Enterprise vision.

Skilled engineers from each of these IBM labs are part of the core staff within the center working with customers to provide integration support and total enterprise solution implementation.


Appendix D. Logging priorities for DLFF and DLFSCM

SYSLOG is a system log file on UNIX platforms, in which all messages are logged by the kernel. The path of this file can be found in the /etc/syslog.conf file. Only the root user can edit this file and alter the path for SYSLOG. The DLFF in AIX, Solaris, and DCE-DFS environments logs messages in this file. There are two tunable priority levels for logging in DLFF:

• The message logging priority
• The module logging priority

On Windows, DLFF uses an event log mechanism to log messages. Instead of two logging levels, it has a single dynamically tunable parameter.

This appendix discusses how to modify the logging priorities (or levels) for DLFF in AIX, Solaris, and Windows environments, and for DLFSCM in a DCE-DFS environment.


Modifying the DLFF logging priorities on AIX

You can modify the logging level for the Data Links File System Filter (DLFF) by changing the dlfs_cfg file. The dlfs_cfg file is passed to the strload routine to load the driver and configuration parameters. The file is located in the /usr/lpp/db2_07_01/cfg directory. Through a symbolic link, the file can also be found in the /etc directory. The dlfs_cfg file has the following format:

d <driver name> <vfs number> <DLFM1 UID> <global message priority> <global module priority> - 0 1

Here:

• d: This parameter specifies that the driver is to be loaded.

• <driver name>: The driver name is the full path of the driver to be loaded. For instance, the full path for DB2 Version 7 is /usr/lpp/db2_07_01/bin/dlfsdrv. The name of the driver is dlfsdrv.

• <vfs number>: This is the vfs entry for DLFS in /etc/vfs.

• <DLFM1 UID>: This is the user ID of the owner of the READ PERMISSION DB files.

• <global message priority>: This is the global message priority.

• <global module priority>: This is the global module priority.

• 0 1: These are the minor numbers for creating non-clone nodes for this driver. The node names are created by appending the minor number to the cloned driver node name. No more than five minor numbers can be given (0 to 4).

A real-world example might look like this:

d /usr/lpp/db2_07_01/bin/dlfsdrv 14,208,255,-1 - 0 1

The messages that are logged depend on the settings for the global message priority and global module priority. To tune DLFF logging, you can change the value for these global priorities.

There are four message priority values you can use:

#define LOG_EMERGENCY 0x01
#define LOG_TRACING 0x02
#define LOG_ERROR 0x04
#define LOG_TROUBLESHOOT 0x08

These values can be added together, depending on the level of logging you want. You do not normally need to be concerned with the module logging priorities. DLFF calculates the final priority by performing a bit-wise AND operation on these two logging priority values. However, if you want to log a particular DLFF operation (which is generally not needed), you can set the module priority to one or more of the following values, added together:

#define LOG_LOOKUP_NORMAL 0x01
#define LOG_LOOKUP_TOKEN 0x02
#define LOG_OPEN 0x04
#define LOG_CLOSE 0x08
#define LOG_RENAME 0x10
#define LOG_REMOVE 0x20
#define LOG_MKDIR 0x40
#define LOG_GETATTRIBUTE 0x80
#define LOG_SETATTRIBUTE 0x100
#define LOG_CREATE 0x200
#define LOG_UPCALLMESSAGES 0x400
#define LOG_LOADUNLOAD 0x800
#define LOG_MOUNT 0x1000
#define LOG_UNMOUNT 0x2000
#define LOG_VFSROOT 0x4000
#define LOG_VFSSTAT 0x8000
#define LOG_VFSVGET 0x10000
#define LOG_IOCTL 0x20000
#define LOG_SUBROUTINE 0x40000
#define LOG_OTHERS 0x80000
#define LOG_INITIALIZE 0x100000
#define LOG_GETACL 0x200000
#define LOG_SETACL 0x400000
#define LOG_ACCESS 0x800000
#define LOG_INACTIVE 0x1000000
#define LOG_SYMLINK 0x2000000
#define LOG_HARDLINK 0x4000000

Consider this example. Suppose you want to log only the error and emergency messages for the open, mkdir, and remove DLFF operations. The values for the two logging priorities can be calculated as shown here:

Message logging priority = LOG_ERROR + LOG_EMERGENCY
                         = 0x04 + 0x01
                         = 0x05
                         = 5

Module logging priority = LOG_OPEN + LOG_MKDIR + LOG_REMOVE
                        = 0x04 + 0x40 + 0x20
                        = 0x64
                        = 100

The dlfs_cfg file for these logging priorities would look like this example:

d /usr/lpp/db2_07_01/bin/dlfsdrv 14,208,5,100 - 0 1

If you want to log every operation, set the module logging priority to "-1" (0xFFFFFFFF).

Modifying the DLFSCM logging priorities in DCE-DFS (on AIX)

Similar to DLFF on AIX, there is a configuration file, dlfscm_cfg, in the /usr/lpp/db2_07_01/cfg directory. Through a symbolic link, the file can also be found in the /etc directory. The dlfscm_cfg file has the following format:

d <driver name> <vfs number> <global message priority> <global module priority> - 0 1

Here:

• d: The d parameter specifies that the driver is to be loaded.

• <driver name>: The driver name is the full path of the driver to be loaded. For instance, the full path for DB2 Version 7 is /usr/lpp/db2_07_01/bin/dlfscmdrv. The name of the driver is dlfscmdrv.

• <vfs number>: This is the vfs entry of DLFSCM in /etc/vfs.

• <global message priority>: This is the global message priority.

• <global module priority>: This is the global module priority.

• 0 1: These are the minor numbers for creating non-clone nodes for this driver. The node names are created by appending the minor number to the cloned driver node name. No more than five minor numbers can be given (0 to 4).

A real-world example might look like this:

d /usr/lpp/db2_07_01/bin/dlfscmdrv 15,255,-1 - 0 1
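The priority arithmetic and the resulting configuration lines for dlfs_cfg and dlfscm_cfg can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the helper names combine and cfg_line are hypothetical, the comma-separated field layout is taken from the examples in this appendix rather than from a formal specification, and the flag values are the ones listed for DLFF on AIX.

```python
# Message priority flags, as listed in this appendix.
LOG_EMERGENCY = 0x01
LOG_ERROR = 0x04

# Three of the DLFF module priority flags for AIX, as listed in this appendix.
LOG_OPEN = 0x04
LOG_REMOVE = 0x20
LOG_MKDIR = 0x40


def combine(*flags):
    """OR the flags together. For distinct single-bit flags this equals their sum."""
    value = 0
    for flag in flags:
        value |= flag
    return value


def cfg_line(driver, numeric_fields, minors=(0, 1)):
    """Build one line in the layout used by the examples (an assumed layout):
    d <driver> <comma-separated numeric fields> - <minor numbers>."""
    if len(minors) > 5 or any(minor not in range(5) for minor in minors):
        raise ValueError("give at most five minor numbers, each from 0 to 4")
    fields = ",".join(str(field) for field in numeric_fields)
    return f"d {driver} {fields} - {' '.join(str(minor) for minor in minors)}"


message_priority = combine(LOG_ERROR, LOG_EMERGENCY)        # 0x05
module_priority = combine(LOG_OPEN, LOG_MKDIR, LOG_REMOVE)  # 0x64

# dlfs_cfg fields: vfs number, DLFM UID, message priority, module priority.
dlfs_line = cfg_line("/usr/lpp/db2_07_01/bin/dlfsdrv",
                     (14, 208, message_priority, module_priority))

# dlfscm_cfg has no DLFM UID field: vfs number, message priority, module priority.
dlfscm_line = cfg_line("/usr/lpp/db2_07_01/bin/dlfscmdrv", (15, 255, -1))

print(dlfs_line)
print(dlfscm_line)
```

The two printed lines match the dlfs_cfg worked example (message priority 5, module priority 100) and the dlfscm_cfg real-world example shown in this appendix.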

There are four message priority values you can use:

#define LOG_EMERGENCY 0x01
#define LOG_TRACING 0x02
#define LOG_ERROR 0x04
#define LOG_TROUBLESHOOT 0x08

These values can be added together, depending on the level of logging you want. You do not normally need to be concerned with the module logging priorities. The final priority is calculated by the DLFF by performing a bit-wise AND operation on these two logging priority values. However, if you want to log a particular DLFF operation (which is generally not needed), you can set the module priority to one or more of the following values, added together:

#define LOG_LOOKUP_NORMAL 0x01
#define LOG_LOOKUP_WITHTOKEN 0x02
#define LOG_OPEN 0x04
#define LOG_CLOSE 0x08
#define LOG_RENAME 0x10
#define LOG_LOCAL 0x20
#define LOG_MKDIR 0x40
#define LOG_GETATTRIBUTE 0x80
#define LOG_SETATTRIBUTE 0x100
#define LOG_CREATE 0x200
#define LOG_UPCALLMESSAGES 0x400
#define LOG_LOADUNLOAD 0x800
#define LOG_MOUNT 0x1000
#define LOG_UNMOUNT 0x2000
#define LOG_VFSROOT 0x4000
#define LOG_VFSSTAT 0x8000
#define LOG_VFSVGET 0x10000
#define LOG_IOCTL 0x20000
#define LOG_SUBROUTINE 0x40000
#define LOG_OTHERS 0x80000
#define LOG_INITIALIZE 0x100000
#define LOG_GETACL 0x200000
#define LOG_SETACL 0x400000
#define LOG_ACCESS 0x800000
#define LOG_INACTIVE 0x1000000
#define LOG_SYMLINK 0x2000000
#define LOG_RDWR 0x4000000
#define LOG_CHANGEVNOPS 0x8000000
#define LOG_IMPERSONATION 0x10000000

The values of these two logging priorities can be calculated in the same way as in the example for DLFF on AIX (page 334).

Modifying the DLFF logging priorities on Solaris

In Solaris, the logging priorities are defined in the /etc/system file. The entries (in /etc/system) corresponding to these two logging priorities look like this:

set dlfsdrv:glob_mod_pri=0x100800
set dlfsdrv:glob_mesg_pri=0xff

Here, the format is:

set <module name>:<system variable name>=<value>

To change the values for the two logging priorities, you should edit the /etc/system file. You may have to reboot the machine for the new logging values to take effect.
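The Solaris /etc/system entries carry the same kind of combined flag values as hexadecimal numbers. The sketch below splits a module priority into flag names and formats the two set lines. It rests on an assumption that the source does not confirm: that DLFF on Solaris uses the same flag values as the lists given for AIX and DCE-DFS in this appendix. The names MODULE_FLAGS, flag_names, and system_entries are hypothetical, and only a subset of the flags is included.

```python
# Assumption: these values are the ones listed for AIX and DCE-DFS in this
# appendix; the Solaris flags are not listed separately in the source.
MODULE_FLAGS = {
    "LOG_OPEN": 0x04,
    "LOG_CLOSE": 0x08,
    "LOG_LOADUNLOAD": 0x800,
    "LOG_MOUNT": 0x1000,
    "LOG_UNMOUNT": 0x2000,
    "LOG_INITIALIZE": 0x100000,
}


def flag_names(value, table=MODULE_FLAGS):
    """Return the names in `table` whose bit is set in `value`, lowest bit first."""
    ordered = sorted(table.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if value & bit]


def system_entries(module_priority, message_priority, module="dlfsdrv"):
    """Format the two /etc/system lines in the layout shown in this appendix."""
    return [
        f"set {module}:glob_mod_pri={module_priority:#x}",
        f"set {module}:glob_mesg_pri={message_priority:#x}",
    ]


print(flag_names(0x100800))
for line in system_entries(0x100800, 0xFF):
    print(line)
```

Under that assumption, the sample value 0x100800 corresponds to LOG_LOADUNLOAD plus LOG_INITIALIZE, which would log driver load, unload, and initialization activity.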


Modifying the DLFF logging level on Windows

Unlike on AIX and Solaris, the logging level (or priority) can be changed dynamically on Windows. Instead of two different variables (message and module priorities), there is only one variable on Windows. It has the following possible values:

0 Logs all messages (success messages also)
1 Logs basic information, warning, and error messages
2 Logs warning and error messages
3 Logs error messages only

The following two commands should be issued to modify the log level for DLFF:

dlff set loglevel <number>
dlff refreshtrace

For example, to modify the log level of DLFF to log only warning and error messages, enter the following commands:

dlff set loglevel 2
dlff refreshtrace
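Because only the values 0 to 3 are defined, a small helper can validate the level before the two commands are issued. This is a sketch only: the function name dlff_loglevel_commands is hypothetical, and it merely builds the command strings described above without running them.

```python
# Log levels and their meanings, as listed for DLFF on Windows.
LOG_LEVELS = {
    0: "all messages, including success messages",
    1: "basic information, warning, and error messages",
    2: "warning and error messages",
    3: "error messages only",
}


def dlff_loglevel_commands(level):
    """Return the two commands that set and activate the given log level."""
    if level not in LOG_LEVELS:
        raise ValueError(f"log level must be one of {sorted(LOG_LEVELS)}")
    return [f"dlff set loglevel {level}", "dlff refreshtrace"]


for command in dlff_loglevel_commands(2):
    print(command)
```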




Special notices

References in this publication to IBM products, programs or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only IBM's product, program, or service may be used. Any functionally equivalent program that does not infringe any of IBM's intellectual property rights may be used instead of the IBM product, program or service.

Information in this book was developed in conjunction with use of the equipment specified, and is limited in application to those specific hardware and software products and levels.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to the IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact IBM Corporation, Dept. 600A, Mail Drop 1329, Somers, NY 10589 USA.

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites.


The following terms are trademarks of other companies:

Tivoli, Manage. Anything. Anywhere.,The Power To Manage., Anything. Anywhere.,TME, NetView, Cross-Site, Tivoli Ready, Tivoli Certified, Planet Tivoli, and Tivoli Enterprise are trademarks or registered trademarks of Tivoli Systems Inc., an IBM company, in the United States, other countries, or both. In Denmark, Tivoli is a trademark licensed from Kjøbenhavns Sommer - Tivoli A/S.

C-bus is a trademark of Corollary, Inc. in the United States and/or other countries.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and/or other countries.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States and/or other countries.

PC Direct is a trademark of Ziff Communications Company in the United States and/or other countries and is used by IBM Corporation under license.

ActionMedia, LANDesk, MMX, Pentium and ProShare are trademarks of Intel Corporation in the United States and/or other countries.

UNIX is a registered trademark in the United States and other countries licensed exclusively through The Open Group.

SET, SET Secure Electronic Transaction, and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.

Other company, product, and service names may be trademarks or service marks of others.


Index

Symbols 93.DIR 188/etc/vfs 140? command 81

Aaccess control

READ PERMISSION DB 72READ PERMISSION FS 72

access to linked files 70access token 30, 32, 72, 78

expired 78ADSM (Adstar Distributed Storage Manager) 31Adstar Distributed Storage Manager (ADSM) 31advanced transparent recall 121aggregates 60AIX with DCE-DFS 301analyzing a problem 271application development 65application development tasks 69application program 85Apply program

binding with the Capture program 183File copy function 171File-reference mapping function 170handling DATALINK values 169password file 183starting, stopping 188

architecture, Data Links 11archive directory 96archive/retrieve 122ASNDLCOPY 169, 170, 171

configuration files used by 184ASNDLCOPYD 171

.DIR 188configuration files used by 187

ASNDLSRVMAP 170, 184ASNDLUSER 171, 186ASNDLUSERINFO 187asynchronous 35authentication 30automatic migration 119

© Copyright IBM Corp. 2001

Bbackup

active 236expired 236inactive 236

backup/restore 122Backup-Archive Client 108, 114Backus Naur Form (BNF) 23

specifications for DATALINK 297Binary Large Object (BLOB) 67BLOB (Binary Large Object) 67BNF (Backus Naur Form) 23

CCache Manager 62Capture program

binding with the Apply program 183Capture of DATALINK values 169starting, stopping 188

CATIA 422R1 308cell 58, 303change-data (CD) table 164Character Large Object (CLOB) 67Chown daemon 41Client Cache Manager 62client workstation 84CLOB (Character Large Object) 67commit processing 40communication method 114Coordinated Universal Time (CUT) 219Copy daemon 43copy groups 113crash 275crash recovery 202, 205creating a new file 76CURRENT LOG 215CUT time 221

Ddaemon process 34Data Joiner Replication Administration (DJRA) 164, 177

345

Page 368: Data Links - IBM Redbooks · Data Links: Managing Files Using DB2 December 2001 International Technical Support Organization SG24-6280-00

Data Linksaccess control 68advantages of 68applications 7architecture 11, 12Backup-Archive Client 114control over a file system 103controlled file system 97DCE-DFS 57deployment 91migrating existing applications 84on UNIX, Windows 33READ PERMISSION DB option 68suitable applications 66tables and servers 102Tivoli Space Manager 116transactional semantics 66versus LOBs 67VPM 307

Data Links File Manager (DLFM) 4, 13, 34, 58Data Links File Manager V5.x (AIX) 247Data Links File System Cache Manager (DLFS-CM) 16, 62Data Links File System Filter (DLFF) 5, 14, 43Data Links Manager administration 101Data Links server 12Data Links support for HSM 125Data Management Application Programming Inter-face (DMAPI) 59Data Manager Application (DMAPP) 14, 59data manager events 59data replication 161database configuration parameters 31DATALINK data type 6, 18

attributes 23Backus Naur Form specifications 297choosing options 71scalar functions 24, 80

DATALINK options 26changing 74choosing 71querying 74

Datalink Reconcile Pending (DRP) 194, 221DATALINK values 169Datalink_Reconcile_Not_Possible state (DRNP) 155, 194Datalink_Reconcile_Pending (DRP) 211DATALINKS parameter 31DataPropagator 162

DataPropagator Relational (DPropR) 161DB2 6.1 309DB2 agent 38DB2 Call Level Interface 80DB2 client 15DB2 Client Application Enabler 84DB2 Data Links Manager 6.1 GA 309DB2 database access 69db2 list history 214DB2 Logging Manager 14, 35DB2 replication 162DB2 server problems 290DB2 Trace 276

  analysis 279
  in memory 277
  information 278
  to a file 278

DB2 UDB server 15
DB2 UDB V5.x database server (AIX) 244
DB2 UDB V6.x database server 250
DB2 Universal Database

  crash 275
  hang situations 273

DB2 Universal Database server 15
db2_recon_aid 193, 233
db2dart 194, 226
db2diag.log 287
DB2IMIGR database command 244
db2look command 154
DB2OPTIONS 28
DBID 36
DCE (Distributed Computing Environment) 302
DCE cell 303
DCE-DFS 57

  on AIX 301
ddl 154
default.env 139
delete file 45
Delete Group daemon 40
device class 110
dfm_access 36
dfm_archive 37
dfm_backup 37
dfm_boot 36
dfm_dbid 36
dfm_dir 37
dfm_file 37
dfm_grp 36
dfm_prfx 36

dfm_rcfile 36
dfm_url 37
dfm_xnstate 37
DFS (Distributed File Service) 303
DFS Client Cache Manager 62
DFS Client Enabler 16, 57
DFS Client Enabler for Data Links 62
DFS Name Space 305
DIAGLEVEL 287
Distributed Computing Environment (DCE) 302
Distributed File Service (DFS) 303
Distributed Transaction Processing (DTP) 54
DJRA (Data Joiner Replication Administration) 164
DL_DROP_TIME parameter 31
DL_EXPINT 78
DL_EXPINT parameter 31
DL_FEATURES column 74
DL_NUM_COPIES 96
DL_NUM_COPIES parameter 31
DL_TOKEN parameter 32
DL_UPPER parameter 32
DLFF (Data Links File System Filter) 5, 14, 43
DLFF logging level, Windows 337
DLFF logging priority 331

  on AIX 332
  on Solaris 336

DLFM (Data Links File Manager) 4, 13, 34, 58
dlfm add_db 104, 157
dlfm add_prefix 104, 157
DLFM backup 96
dlfm bind 104
DLFM commands 104
DLFM crash 275
dlfm create 104
dlfm create_db 104
dlfm drop_db 104
dlfm drop_dlm 104
DLFM hang situations 273
dlfm help 104
dlfm list registered databases 105
dlfm list registered prefixes 105
DLFM process model 38
dlfm refresh key 105
dlfm restart 105
dlfm retrieve 105
dlfm see 105
DLFM server problems 287
dlfm setup 105
dlfm shutdown 105

dlfm start 106
dlfm startdbm 106
dlfm stop 106
dlfm stopdbm 106
dlfm_backup 224
DLFM_BACKUP_DIR_NAME 96
dlfm_child 209, 212
dlfm_copyd 209
DLFM_DB 14, 208
DLFM_DB database backup 98
dlfm_export 149, 152, 153
dlfm_import 149
dlfm_retrieved 212
DLFM101E 287
dlfmfsmd 103
DLFMs, single Universal Database 92
dlfs_cfg 140
DLFS-CM (Data Links File System Cache Manager) 16, 62
DLFSCM logging priority 331

  in DCE-DFS (on AIX) 334
dlurlpathonly 215
DMAPI (Data Management Application Programming Interface) 59
DMAPP (Data Manager Application) 14, 59
DMAPP process model 60
DMLFS 60, 62
DPropR (DataPropagator Relational) 161
DRP (Datalink Reconcile Pending) 194, 222
dsm 124
dsmadmc 123
dsmc 124
dsmdu 124
dsmls 124
dsmmigfs 124
dsmmigrate 124
dsmmonitord daemon 124
dsmrecall 124
dsmrecalld daemon 124
dsmserv 123
DTP (Distributed Transaction Processing) 54

E
EARLIEST LOG 215
embedded 44
error handling 81
events 60
exception table 228

export 152
Export utility 86, 87

F
failover of services 132
FAQs 294
fast reconcile 212, 218, 220
fast reconciliation 225
file linking 47, 86
file migration 118
file permissions 216
file server 84
file server location 98
File System Migrator (FSM) 12
file system problems 292
file system sizing 95
file unlinking 47
files per directory 99
fileset 63
Frequently Asked Questions (FAQ) 294
FSM (File System Migrator) 12

G
Garbage Collect 210
Garbage collection 236
Garbage Collector daemon 41, 51
ged 124
get dbm cfg 156

H
HACMP (High Availability Cluster Multiprocessor) 131
HACMP cluster configuration, hot standby 132
HACMP Cluster Manager 132
hangs 273

  in UNIX 274
  Windows 275

Hierarchical Storage Manager (HSM) 12
High Availability Cluster Multiprocessor (HACMP) 131
host database 15
host variable declaration 75
hostname 91
hot standby 132
HSM (Hierarchical Storage Manager) 12
HSM migration 119

I
Import utility 157
in-doubt transactions 204
INTEGRITY 27
Inter Process Communication (IPC) 105

  resources 287
IPC (Inter Process Communication) 105
ipcs | grep dlfm 106

J
JDBC scalar functions 80

K
Kerberos authentication 114

L
large object (LOB) 26, 67
least-recently used 119
LFS aggregate 60
Link Integrity+ 7
linked file

  enabling access to 70
  updating 79

linking a new file 76
linking files 47, 86
list datalinks managers 102, 156
list datalinks managers command 70
list db directory 102
list history backup 220
list registered databases 232
load 158
Load utility 87
LOB (large object) 26, 67
LOBs 84, 85

  BLOB, CLOB 67
  externalizing data 86
  externalizing LOB data 85

log retention 208
Log Sequence (LS) 236
logging levels 98
logging priority 331
LOGPRIMARY 202
LOGRETAIN 207
lsfs -v dlfs 103

M
management class 113

maximum length of DATALINK value 76
MIGRATE database command 244
migrate-on-close 121
Migrating 244
migrating

  Data Links File Manager (AIX) 247
  Data Links File Manager (Windows NT) 253
  database server (AIX) 244
  database server (Windows NT) 250
  existing applications to use Data Links 84
  using an offline backup 254

migration 118
db2 backup database 246
db2 get instance 244
db2_recon_aid 258
db2admin stop 252
db2ckmig 246
db2dart 245
db2dlmmg 249
db2ilist 244
db2imigr 244
db2licd -end 252
db2set 249
dlfm see 250
dlfm_see 248
dlfm_shutdown 248
mount 249
strload 249
umount 248

mount command 103
multiple DB2s and DLFMs 94
multiple DLFMs on single host 94
multiple file server restrictions 82
multiple links to the same file 83
multiple Universal Databases 93
multiple Universal Databases, single DLFM 93
mutual suspicion algorithm 114

N
needing crash recovery 205
Network File System (NFS) 97
Network Information Service (NIS) 97
NFS (Network File System) 97
NIS (Network Information Service) 97
NUM_DB_BACKUPS 96, 236
num_db_backups 210

O
offline backup for migration 254
OLTP (OnLine Transaction Processing) 54
ON UNLINK option 27, 74
OnLine Transaction Processing (OLTP) 54
open file 44
options file 122
out-of-space condition 119
ownership 47

P
performance tuning 98
permissions 216
point in time recovery 202
point-in-time 35
policy 112
policy domain 113
policy set 113
power failure 275
prefix 36, 50
pre-migration policy 120
prepare processing 40
prepare-to-commit 51
PRFX_ID 36
PRFX_NAME 36
problem analysis 271
problem determination 269
problem solving 270
profiles.reg 139
PRUNE HISTORY 238

Q
quiesce 152

R
rc.db2dls 142
rc.db2server.dls 141
READ PERMISSION DB 78
reading linked files 77, 78
read-without-recall 121
REC_HIS_RETENTN 96, 214, 237
recall process 120
reconcile 222

  exceptions 227
Reconcile utility 191
reconciliation 121, 191
RECOV_ID 35

RECOVERY 27
recovery at a point in time 202
RECOVERY option 73
RECOVERY YES 96
Redbooks Web site 341

  Contact us xx
referential integrity 30
rename directory 45
rename file 45

  processing 45
replication 21

  Apply program 166
  binding Capture, Apply programs 183
  Capture of DATALINK values 169
  Capture program 164
  CD table 169
  Change-capture 164
  change-data (CD) table 164
  changing the target table name 179
  components 164
  configurations 162
  conflict-detection 163
  control server 164, 181
  control tables 164, 177
  data 161
  data distribution 162
  Data Joiner Replication Administration tool 177
  DB2 Control Center 172
  defining the replication source 174
  DJRA 177
  logical servers 164
  LOGRETAIN parameter 182
  Occasionally connected 162
  READ PERMISSION DB 171, 186
  replicating DATALINK columns 161, 168, 172
  restrictions 163
  Source Server 164
  spill file 166
  subscription set members 167
  Subscription sets 167
  Supported Platforms 163
  Target Server 164
  Update anywhere 162
  using the DB2 Control Center 177, 182

replication source 164
restart 57
restore 156, 218

  without rolling forward 215
Retrieve daemon 43

retrieve_query 105
return code 282
rollback 53
ROLLFORWARD PENDING 205

S
scalar functions 21, 24, 80

  SQLBuildDataLink 80
  SQLGetDataLinkAttr 80
  with DB2 Call Level Interface 80
  with JDBC 80

security 30, 60, 114
segmentation violations 277
selective migration 119
selective recall 118, 121
servers 102
SET INTEGRITY 194, 195
setting permissions 45
single host with multiple DLFMs 94
single server implementation 92
single Universal Database, multiple DLFMs 92
sizing and file systems 95
SMT (Storage Management Toolkit) 59
spawns 57
SQL0357N 82
SQL0358N 81, 83
SQLBuildDataLink 80
SQLGetDataLinkAttr 80
stale mount 293
storage device 110
storage hierarchy 110
Storage Management Toolkit (SMT) 59
storage pool 110
storage pool migration 119
strload 140
subscription set 167
subscription set member 167
Super Exclusive lock 192
superuser privilege 41
SYSCOLPROPERTIES table 74
sysibm.syscolproperties 102
SYSIBM.SYSCOLPROPERTIES table 74

T
tables 102
Tivoli

  communication methods 114
  Data Links support for HSM 125

  policy 112
  security 114
  storage device 110

Tivoli Data Protection (TDP) 108
Tivoli Disaster Recovery Manager (DRM) 109
Tivoli Space Manager 108, 116

  archive/retrieve 122
  backup/restore 122
  file migration 118
  options file 122
  pre-migration policy 120
  recall process 120
  reconciliation 121

Tivoli Storage Manager 31, 107, 207
tools, processes, interfaces 123

token 13
token algorithm 99
token expiration 32
tokenized file name 30
transaction ID (XN_ID) 55
transaction support 53
transactional semantics 66
transparent recall 118, 120
traps 277
two-phase commit 51, 203

U
uniform name space 305
unit of work consistency 34
unit-of-work (UOW) table 165
UNIX, hangs 274
UNLINK DELETE 218
unlinking files 47, 79
Upcall daemon 32, 43
updating a linked file 79
USEREXIT 208

V
VFS (Virtual File System) 43
Virtual File System (VFS) 43
VNODE 43
VPM 1.3 308
VPM with DB2 Data Links 8

W
Windows, hangs 275
WRITE PERMISSION option 73

X
XA transaction 54
XN_ID 36, 55

SG24-6280-00 ISBN 0738423106

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION

BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.

For more information: ibm.com/redbooks

Data Links: Managing Files Using DB2

Understand the Data Links architecture, unleashed for the first time

Explore planning, migration, the Reconcile utility, and recovery

Learn about HSM and HACMP support

The amount of data that is stored digitally is growing rapidly. The file paradigm is very common for such data types as video, image, text, graphics, and engineering drawings because capture, edit, and delivery tools use the file paradigm for these data types. A large number of applications store, retrieve, and manipulate data in files.

Data Links, a new feature of DB2 Universal Database, extends the management umbrella of the relational database management system (RDBMS) to data stored in external operating system files, as if the data were stored directly in the database. Data Links provides several levels of control over external data, such as referential integrity, access control, coordinated backup and recovery, and transaction consistency.

This IBM Redbook explains how to deploy Data Links effectively in a complex environment. First, it describes the technical architecture of Data Links, developing applications in a Data Links environment, and planning a deployment of Data Links. Then, it covers administering a Data Links environment, setting up Tivoli Storage Manager as a backup server with Data Links, and implementing high-availability cluster multiprocessing (HACMP) with Data Links. It includes a full chapter on data replication and the replication of Data Linked files. It then describes the Reconcile utility and how the DB2 backup and recovery mechanism supports Data Links. This redbook concludes by providing some hints and tips for problem determination in a Data Links environment.
