citadel white paper


Page 1: Citadel White Paper

Data Logging with National Instruments Citadel 5

February 2006

Executive Summary

National Instruments Citadel is a proprietary, real-time logging database designed primarily to store single-point logging data. In contrast to relational databases, Citadel stores data in terms of data traces, which are compressed and optimized for real-time logging and historical data retrieval.

Citadel is an integral component of many National Instruments software products, such as the LabVIEW Datalogging and Supervisory Control (DSC) Module, Lookout, DIAdem, and VI Logger.

As a common data storage mechanism, Citadel allows these software products to easily exchange test and measurement data. For example, a researcher could use DIAdem to analyze data generated by the DSC Module or Lookout without having to modify, convert, or reprocess the original data.

Citadel offers several major benefits that will improve productivity while saving time and money. These benefits are:

Citadel is optimized for real-time logging and historical data retrieval. This results in increased application performance while saving valuable system resources, especially in large applications.

Citadel includes advanced data visualization and management components. You won’t have to develop custom data visualization or management tools for your application.

Citadel doesn’t require any programming or prior knowledge of database systems to use. This saves time and money when developing an application.

Citadel is inherently network-aware, allowing you to share data seamlessly among team members or between data terminals.

Citadel is ready to use out of the box, which greatly reduces start-up and system familiarization time.

You do not need to be familiar with the contents of this white paper to be successful with Citadel. However, if you are interested in learning about the technology behind Citadel or are planning a large project using Citadel, you may want to take a few minutes to review this paper.


Table of Contents

Executive Summary

Citadel Database Structure
    Supported Data Types
    Compression of Numeric Data
    Traces and Subtraces
    VI Logger Data
    Database Files
    Database Size Considerations
    Historical Alarm and Event Data

Citadel Operations
    Installing Citadel
    Creating, Attaching, or Detaching a Database
    Archiving Data
    Backing Up and Restoring a Database
    Retrieving Data from a Database
    Inserting Data into a Database

Networking and Security Considerations
    Networking
    Security

Citadel Database Structure

Citadel organizes data into a three-tiered hierarchy consisting of the originating computer, the process name, and the trace name. See Figure 1 for a graphical overview of a typical Citadel database.

Figure 1: Citadel Database Layout (figure callouts: Local Databases; Network Databases; VI Logger data is organized into tasks and data runs)

Notice that the Historical Data Viewer also gives you access to multiple databases on the same computer and provides access to remote databases.

Besides trace views, Citadel also includes waveform hierarchies, which contain data logged from VI Logger, and dataset hierarchies, which are created using the DataSet Marking I/O server in the DSC Module.

Supported Data Types

Though optimized for numerical data logging, Citadel supports a wide range of data types including numeric, discrete, bit array, waveform, string, and variant.

Once logged, data is available to other NI software such as DIAdem and can be retrieved programmatically using the Citadel API available in the DSC Module, through the Historical Data Viewer (HDV) component in Measurement and Automation Explorer (MAX), or through the Citadel ODBC interface. There is no C, C++, or Visual Basic API for retrieving Citadel data.

In summary, you can store the following types of data in Citadel:

Analog: Analog values are compressed based on the specified logging resolution.

Integer and bit-array: Discrete values are not compressed when logged and are returned as I32 values.

String: Citadel supports three string types: LabVIEW-style strings, null-terminated strings, and Unicode strings.

Variant: Citadel stores raw variant data, but optimizes storage by storing the metadata associated with a variant only once instead of with each logged value.

Waveform: VI Logger and DIAdem are the only two NI software products that can interact with waveform data.

Compression of Numeric Data

This section presents a simplified view of how Citadel compresses numeric data. It does not describe the full algorithm Citadel uses for compression, and understanding it is not required to use Citadel.

When queried for historical data, Citadel will always return a value within ± the logging resolution of the value measured at the time the data point was logged. The returned value may be closer to the measured value than the logging resolution requires, but it will never differ from it by more than the logging resolution.

Data is typically acquired from a data source such as a data acquisition board, analog input device, or PLC. In most cases measured values are passed through a series of deadbands before being sent to Citadel. Once in Citadel, the values are converted to deltas and compressed using the specified logging resolution. The example shown in Table 1 uses a logging resolution of 0.1, which means all values retrieved from the database will be within ±0.1 of the actual recorded value.

Citadel uses a complex compression algorithm to store data, but this compression can be approximated using the following equation.

Equation 1: Simplified Citadel Delta Calculation

where y is the set of recorded values, k is the set of logged values, and [the remaining definition is missing from the source].

The following table provides sample delta calculations, using equation 1.

Table 1: Example Citadel delta calculation

The logging resolution is a key parameter in how Citadel compresses numeric data. The logging resolution can be any real number but should be chosen in concert with the logging deadband. For instance, if you specified a logging resolution of 10 units and a logging deadband of 1 unit, Citadel would log any change of at least 1 unit, but due to compression within the Citadel database any value change under 10 could not be discerned from data retrieved from the database. To prevent this from happening you should always specify a logging resolution less than or equal to the logging deadband. You should also specify the logging deadband as a power of ten unless you have an application-specific reason to do otherwise. Specifying a logging resolution of zero (0) will always log data with full resolution.
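The interaction between the logging deadband and the logging resolution can be illustrated with a minimal Python sketch. This is not Citadel's algorithm: the functions `apply_deadband` and `quantize` are hypothetical, and rounding to the nearest multiple of the resolution is only an assumed stand-in for the real compression.

```python
def apply_deadband(samples, deadband):
    """Keep a sample only if it differs from the last kept one by >= deadband."""
    kept = []
    for value in samples:
        if not kept or abs(value - kept[-1]) >= deadband:
            kept.append(value)
    return kept


def quantize(values, resolution):
    """Round each value to the nearest multiple of resolution.

    A resolution of 0 is treated as "full resolution" (values pass through).
    """
    if resolution == 0:
        return list(values)
    return [round(v / resolution) * resolution for v in values]


samples = [20.0, 21.0, 22.0, 23.0, 24.0]

# Deadband 1, resolution 10: every change is logged, but the retrieved
# values all collapse onto the same multiple of 10.
coarse = quantize(apply_deadband(samples, deadband=1), resolution=10)

# Deadband 1, resolution 1: each logged change remains distinguishable.
fine = quantize(apply_deadband(samples, deadband=1), resolution=1)
```

In this sketch the coarse setting stores five points that cannot be told apart on retrieval, which is the situation the recommendation above is meant to avoid.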

For reasons explained later in this white paper, the number of values in a run will never exceed 100, i.e. n will always be less than or equal to 100.

Traces and Subtraces

Citadel databases are structured around traces. Traces are uniquely named and point to historical data associated with that trace. When working with a Citadel database, you interact with the database based on trace names. Internally to Citadel, each trace is in turn made up of subtraces, which contain a historical record of values for a single run of data. Subtraces are composed of data runs with constant resolution and logging parameters.

A variety of properties are associated with each trace, all of which you can view in the HDV. Right-click a trace in the HDV and choose Properties to display trace properties. The trace properties dialog box will look like the one shown in figure 2.


Figure 2: Trace properties dialog box

Data type indicates the most generic type associated with a trace. The most generic type indicates what sort of reader should be used to retrieve the majority of the trace’s contents.

Types from most to least generic are:

Binary (LabVIEW String)

Variant

Unicode String

ANSI String

BitArray

Logical

The startTime and endTime properties of the trace indicate the actual start and end times of the trace in Coordinated Universal Time (UTC). UTC avoids all conflicts associated with daylight saving time or varying time zones. Although Citadel stores timestamps in UTC, the times displayed in the Hyperbrowser are shown in the local time zone. If you expand the value column you will see that the current time zone name is appended to the end of the time string.

The lifespan property indicates how long trace data will remain in the Citadel database before Citadel reuses storage space taken up by old trace data. For instance, if a trace were configured with a lifespan of 10 days, only the 10 most recent days’ worth of data would remain in the database. After 10 days the oldest data would be overwritten. The lifespan of a trace is defined by the application that writes data to the database and can only be configured on a per-process basis, not a per-trace basis. A lifespan of 0 indicates an infinite lifespan.

The data lifespan is not a contract. Setting a lifespan signals to Citadel that data older than the lifespan can be deleted in order to save space, but the database does not guarantee that expired data will be deleted on any schedule. Lifespanning only occurs when the process that originally logged the data is actively logging new data to the database. For instance, if you copied data to a CD and, 15 years later, reattached your database to retrieve data, the data would not suddenly disappear due to the lifespanner.
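The behavior described above, where expired data is only discarded while new data is being logged, can be sketched as follows. The `Trace` class is hypothetical, and pruning on every write is an assumption made for brevity; Citadel itself makes no promise about when expired data is removed.

```python
from datetime import datetime, timedelta


class Trace:
    """Hypothetical in-memory trace; not part of any Citadel API."""

    def __init__(self, lifespan_days):
        self.lifespan_days = lifespan_days  # 0 means infinite lifespan
        self.points = []  # list of (timestamp, value)

    def log(self, timestamp, value):
        """Append a point, then discard points older than the lifespan."""
        self.points.append((timestamp, value))
        if self.lifespan_days > 0:
            cutoff = timestamp - timedelta(days=self.lifespan_days)
            self.points = [p for p in self.points if p[0] >= cutoff]

    def read(self):
        """Reading never discards data, no matter how old it is."""
        return list(self.points)


t0 = datetime(2006, 2, 1)
trace = Trace(lifespan_days=10)
trace.log(t0, 1.0)
trace.log(t0 + timedelta(days=5), 2.0)
trace.log(t0 + timedelta(days=12), 3.0)  # the day-0 point is now expired
remaining = [value for _, value in trace.read()]
```

Because only `log` prunes, a trace that is merely read years later keeps all of its points, which mirrors the CD example in the text.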

The pages property indicates how many 4 KB pages the trace is using and provides a rough estimate of how much disk space is used by the trace.

The subtraces property indicates how many subtraces a trace is composed of.

A subtrace is an arbitrary data stream containing a single type of data, plus associated metadata. Subtraces are indexed by time, making it possible to seek to a time position within a subtrace without reading through the entire stream. Because data points are indexed by time, all data in a subtrace data stream must have nondecreasing timestamps. If a back-in-time event occurs, or the type of the data changes, Citadel must create a new subtrace. Changing the type of the data means changing the base data type (double, bool, etc.) or changing the format with which data is serialized to disk. Examples of the latter include changing the compression algorithm by changing the value or time resolution, changing the type of Variable being logged to a Variant subtrace, or even changing the attributes that are set on the Variant. Setting attributes on Variants logged to Citadel is not recommended.

Though Citadel provides no direct interface to work with individual subtraces, it’s important to understand subtraces because they can have a significant impact on database performance. In particular, a trace that contains many subtraces will perform worse than a trace with fewer subtraces. In typical use cases Citadel will not need to create multiple subtraces.

If any local Citadel client, reader, writer, or the Citadel service fails to shut down cleanly while accessing a database, the database will be re-indexed on restart. When a database is re-indexed, we do not know if a subtrace may have been somehow damaged by the failure that triggered the re-indexing, so all existing subtraces are closed permanently, and new data will be logged into a fresh subtrace.

The following events can trigger the creation of a new subtrace:

The trace data type changed.

You log a “back-in-time” value.

The logging properties, such as logging resolution, changed.

The Citadel service terminated abnormally (power loss, system crash, etc.).
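The triggers listed above can be expressed as a small decision function. The sketch below is illustrative only: `SubtraceState` and `new_subtrace_reasons` are hypothetical names, and the real checks inside Citadel are not documented here.

```python
from dataclasses import dataclass


@dataclass
class SubtraceState:
    """Hypothetical summary of the subtrace currently being written."""
    data_type: str
    resolution: float
    last_timestamp: float
    closed_cleanly: bool = True


def new_subtrace_reasons(state, data_type, resolution, timestamp):
    """Return the reasons (possibly none) why a new subtrace would be needed."""
    reasons = []
    if data_type != state.data_type:
        reasons.append("data type changed")
    if timestamp < state.last_timestamp:
        reasons.append("back-in-time value")
    if resolution != state.resolution:
        reasons.append("logging properties changed")
    if not state.closed_cleanly:
        reasons.append("abnormal termination")
    return reasons


state = SubtraceState(data_type="double", resolution=0.1, last_timestamp=100.0)
ok = new_subtrace_reasons(state, "double", 0.1, 101.0)    # normal write
late = new_subtrace_reasons(state, "double", 0.1, 99.0)   # older timestamp
```

An empty list means the point can be appended to the current subtrace; any other result corresponds to one or more of the events in the list.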

Subtraces are made of data runs of n = 100 points. This applies mainly to analog traces where data is compressed based on delta values and the logging resolution. After every 100 points an absolute measurement value is logged which resets the compression algorithm. This prevents accumulation of error and reinforces data integrity within the database.
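A minimal Python sketch of this idea follows. It is a simplified stand-in, not the Citadel format: each run stores one absolute value followed by integer delta steps in units of the logging resolution, and the names `encode`, `decode`, and `RUN_LENGTH` are assumptions. The resolution must be greater than zero.

```python
RUN_LENGTH = 100  # points per run, as described in the text


def encode(values, resolution):
    """Split values into runs of (absolute first value, integer delta steps)."""
    runs = []
    for start in range(0, len(values), RUN_LENGTH):
        chunk = values[start:start + RUN_LENGTH]
        base = chunk[0]  # absolute value that resets the compression
        deltas = []
        previous = base
        for value in chunk[1:]:
            step = round((value - previous) / resolution)
            deltas.append(step)
            previous += step * resolution  # track the reconstructed value
        runs.append((base, deltas))
    return runs


def decode(runs, resolution):
    """Rebuild the value sequence from the runs produced by encode()."""
    values = []
    for base, deltas in runs:
        current = base
        values.append(current)
        for step in deltas:
            current += step * resolution
            values.append(current)
    return values


original = [i * 0.037 for i in range(250)]
runs = encode(original, resolution=0.1)
restored = decode(runs, resolution=0.1)
worst_error = max(abs(a - b) for a, b in zip(original, restored))
```

Computing each delta against the reconstructed value, and restarting from an absolute value at every run boundary, keeps the retrieval error bounded by the resolution instead of letting it accumulate.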

VI Logger Data

VI Logger stores waveform data to Citadel in groups of runs, which belong to VI Logger tasks. Each run corresponds to a single acquisition of data. A VI Logger trace consists of between one and three subtraces: a waveform subtrace containing data from hardware, a waveform subtrace containing calculated data generated by the VI Logger application, and a subtrace containing VI Logger event data. VI Logger subtraces contain raw waveform data, which are essentially arrays of uncompressed values. Waveform subtraces do not contain serialized timestamps because the timestamp of each waveform sample can be calculated from the waveform's start time and deltaT attributes. VI Logger uses uncompressed subtraces to maximize throughput to disk, at the cost of greatly increased disk and network usage. The VI Logger event subtrace uses the same value/timestamp format as single-point traces to log arbitrary-length Boolean arrays without compression.
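The timestamp reconstruction mentioned above is simple arithmetic: sample i occurs at the start time plus i times deltaT. A short Python sketch, assuming deltaT is expressed in seconds (the function name `waveform_timestamps` is hypothetical):

```python
from datetime import datetime, timedelta


def waveform_timestamps(start_time, delta_t, count):
    """Return the timestamp of each of `count` samples spaced delta_t seconds apart."""
    return [start_time + timedelta(seconds=i * delta_t) for i in range(count)]


stamps = waveform_timestamps(datetime(2006, 2, 1, 12, 0, 0), delta_t=0.5, count=4)
```

Only the start time and deltaT need to be stored, which is why a waveform subtrace can omit per-sample timestamps.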

Database Files

Citadel databases are stored in a special set of files on your hard drive. A Citadel database typically resides in a folder unique to that database. Though not required, you should avoid placing non-database files in a folder that contains a Citadel database. You can only have one Citadel database in a given folder. A typical database is composed of a set of files similar to those presented in Figure 3.

Figure 3: Typical Citadel database files

The number of CDPG and CDIB files will vary depending on the amount of data in a database. The nodetree.*, pid.cdih, and stridm.cdin files contain important information about the structure of the database. The mssql.* files contain historical alarm information and are automatically created the first time alarm or event data is written to the database.

Important: You should never modify, move, or delete a database file while the database is attached. Doing so will result in database corruption. If you modify or delete a database file while the database is detached, you may not be able to reattach the database and could lose some or all of the data in the database. If you move or copy a detached database, be sure to move or copy all database files.

The CDPG files contain trace data. Citadel stores data in a compressed format, so it is not possible to read and extract data from these files directly. You must use the Citadel API in LabVIEW DSC or the HDV to access trace data. See the Citadel Operations section for more information about retrieving data from a Citadel database.

Each of the 1,024 KB CDPG files contains a set of 4 KB pages. Each page contains data for a single subtrace. Citadel will attempt to use all 4 KB of space in a page before opening a new page. If a subtrace is terminated or a new subtrace is started before the end of a page, the remainder of the space in the page will go unused. Citadel will only reclaim the remaining unused space if all trace data in that page is removed or deleted.
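The page and file sizes above allow a rough disk-usage estimate from a trace's pages property. The Python sketch below assumes that every CDPG file is filled entirely with data pages, which ignores any file overhead; the helper names are hypothetical.

```python
PAGE_SIZE_KB = 4       # size of one page, from the text
FILE_SIZE_KB = 1024    # size of one CDPG file, from the text
PAGES_PER_FILE = FILE_SIZE_KB // PAGE_SIZE_KB  # 256 pages per file


def trace_size_kb(pages):
    """Approximate disk space used by a trace that occupies `pages` pages."""
    return pages * PAGE_SIZE_KB


def files_needed(total_pages):
    """Number of CDPG files needed to hold `total_pages` pages (rounded up)."""
    return -(-total_pages // PAGES_PER_FILE)


size_kb = trace_size_kb(300)   # a trace reporting 300 pages
cdpg_files = files_needed(300)
```

Since a partly filled page still occupies a full 4 KB, a trace split into many short subtraces will use more pages than its raw data volume suggests.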

Database Size Considerations

Citadel does not impose a size limit on databases. Still, there are several performance considerations that need to be taken into account when planning a data logging system.

Citadel database performance will degrade slightly as the database grows. This is due primarily to the physical limitations of hard disks and limitations inherent in the NTFS and FAT32 file systems.

Archiving large databases takes significantly longer than archiving smaller databases.

Database integrity is not related to database size.

Based on experience and interactions with a variety of users, a large database is typically in the range of 5-10 GB, a medium-sized database is 2-5 GB, and a small database is less than 2 GB. In many applications, database life spans range from 1 to 15 years.

Properly managing database size will greatly improve your ability to maintain the database over its expected life time. The following tips will help you plan a healthy database system:

Use an appropriate logging resolution and logging deadband.

Use an appropriate lifespan setting when configuring a process.

Plan for regular archiving operations to reduce the amount of data stored in the “operational” database. To achieve maximum performance you may want to store only the last 1 GB worth of data in the operational database. The actual amount of data you decide to keep in your operational database will depend on your application.

Historical Alarm and Event Data

As previously noted, Citadel is very efficient at storing numeric data and is not optimized for storing record-based data. Each distinct piece of alarm and event data is composed of several fields of data of varying types. Storing this data in the primary Citadel format would both lower its efficiency and provide a less than optimal mechanism for returning alarm data for viewing. To alleviate these concerns, National Instruments decided to store this data in an attached Microsoft SQL Server Desktop Engine 2000 (MSDE 2000) relational database. MSDE was chosen because it was freely redistributable, thereby having no impact on the pricing for the DSC Module. MSDE can also be upgraded to Microsoft SQL Server if necessary. National Instruments recommends that you treat this attached database as a part of Citadel and access it via the API provided with LabVIEW DSC.

The alarms and events portion of the Citadel database is limited to 2 GB, a restriction imposed by the MSDE 2000 database. Users who expect their alarms and events database to exceed 2 GB can purchase the workstation version of MSSQL 2000 from Microsoft, which will remove the 2 GB restriction.


Citadel Operations

Installing Citadel

The Citadel database component is automatically installed with the following National Instruments software:

LabVIEW Datalogging and Supervisory Control (DSC) Module

Lookout

VI Logger (MSDE component is not installed)

DIAdem (optional component)

Citadel is not an independent software component and cannot be installed as such. If you need to use Citadel on a remote terminal, the recommended option is to purchase a LabVIEW DSC Module Run time license for that terminal. The LabVIEW DSC Module Run time includes the Citadel database drivers.

The following other components are also installed with Citadel,

NI Logos (NI-PSP), Microsoft SQL Server 2000 Desktop Engine (MSDE 2000), and

If you intend to use the Historical Data Viewer component to view Citadel trace data you must also install Measurement and Automation Explorer (MAX). MAX is automatically installed with most National Instruments software.

Creating, Attaching, or Detaching a Database

Citadel databases are managed primarily through the Historical Data Viewer (HDV) component in MAX, NI’s software and hardware configuration tool. You can also use the LabVIEW DSC Module to programmatically manage Citadel databases.

To create a Citadel database you must specify a database name and a path to a folder on a local hard drive. Citadel will use the path as the default name if none is specified. Databases can also be created on remote PCs; refer to the Networking section for more information about using Citadel across a network. If you attempt to create a database in a folder that already contains an unattached Citadel database, that database will be attached. You can only have one Citadel database in a given folder.

Attaching a database allows you to open a connection to database files on disk. This is useful if you need to review data that was previously stored on backup media for archival purposes. You can only attach a database located on a local hard disk. You cannot create or attach a Citadel database from a mapped network drive.

Detaching a database is useful if you don’t need immediate access to historical data or would like to back up database files to a permanent storage medium such as a tape drive, DVD, or CD. Databases can be detached manually using the HDV or programmatically using the DSC Module historical API. Citadel releases control over the database files in the database directory once the database has been detached; this is the only safe state in which a user should manually move or delete database files. You must detach a database before you manually copy the database files or write them to permanent storage. If you fail to detach a database before manually copying database files, you may not get an accurate snapshot of the database files, and the copied database may be corrupt when you try to attach it.

When detaching a database you have the option to delete the database files. This option is useful if you no longer need the data in the database. Only local databases can be detached or deleted from a system. Citadel does not allow these operations for networked databases.

Archiving Data

Archiving data allows you to copy trace data from one location to another. You can archive trace data within a database or to another local or remote database.

An archive operation requires the user to specify four pieces of information: the source trace(s), the destination database, and the start and end times for the archive operation. Citadel will not archive data that has already been archived for a particular trace. For example, if you perform an archive operation for a time range from 1/1/2005 to “now” on a daily basis, Citadel will archive the initial range of data once, then it will only archive the latest day’s worth of data on subsequent archive requests.
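The incremental behavior in the example above can be sketched as a range calculation. How Citadel actually tracks archived data is not described in this paper; the per-trace `archived_until` marker and the function `range_to_archive` below are assumptions made purely for illustration.

```python
from datetime import datetime


def range_to_archive(start, end, archived_until):
    """Return the (start, end) range still to be copied, or None if nothing is left.

    archived_until is a hypothetical marker: the end time of the data already
    archived for this trace, or None if the trace has never been archived.
    """
    if archived_until is None or archived_until <= start:
        return (start, end)
    if archived_until >= end:
        return None
    return (archived_until, end)


start = datetime(2005, 1, 1)
first_run = range_to_archive(start, datetime(2005, 3, 1), None)
next_day = range_to_archive(start, datetime(2005, 3, 2), datetime(2005, 3, 1))
```

The first request copies the whole range; a request one day later, with the same start time, only covers the newest day.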

The lifespan of all archived trace data is automatically set to 0, or infinite. Unless the lifespan property is reset by logging data to the archived trace, the data will not be affected by a lifespan.

Citadel archives data by physically copying the trace pages from one database to another. Since Citadel must reconstruct a database page by page, performing an archive operation will take significantly longer than copying the raw files would.

Two common use cases for archiving include trace merging and destructive archiving. Merging allows you to take information from one trace and merge it into a trace with the same name. For example, assume that, for redundancy purposes, you have two computers, A and B, logging the same data source to a local database. All of your client applications reference the database on computer A when they retrieve data. If computer A goes down, you need a way to merge the redundant data from computer B into the same trace on computer A. This can be done using the following procedure:

1. Create a temporary database on computer A.

2. Archive the backup trace from computer B to this temporary database on computer A. The trace will have the same name and path as it did on computer B.

3. Rename the archived trace in the temporary database on computer A to have the same computer name as computer A.

4. Archive the data from the temporary database to the master database on computer A. This will create a new subtrace in the original trace that contains the redundant data.

Destructive archiving is used to move data from one database to another. This is a common process when backing up older data to a more permanent or off-line storage database.


Backing Up and Restoring a Database

You have several options when it comes to backing up Citadel databases. You must choose the method that best meets your system specifications.

The simplest option for backing up a Citadel database is to perform a regular archive operation from a local database to another database on a separate hard disk or on a networked Citadel database. You can perform the archive operation manually using the HDV, or you can write a utility application using the LabVIEW DSC module to perform the archive automatically and on a regular basis.

Some applications require that a physical hard-copy backup of the data be made at regular intervals. In this scenario the best option is to plan a manual or automated archive process that archives data on a daily or hourly basis to a secure location. When the archive database grows to an adequate size (650 MB for a CD or 4 GB for a DVD) you can detach the archive database, then write the database files to permanent storage. Using this method, each CD or DVD would contain data for a specified time period.

Restoring a Citadel database requires you to reverse every step of the backup process. For the second option, you must copy the database files from each CD or DVD back to a hard drive and attach each copy as a separate database. You can then reconstruct the entire master database by archiving all of these databases together.

Citadel does not contain any built-in mechanisms for automated archiving or database restoration.

Retrieving Data from a Database

Historical data can be retrieved from Citadel in a variety of ways.

The Historical Data Viewer (HDV) component in MAX contains a generic interface for viewing and exporting trace data. Using this interface you can view and export historical data to a text file as well as use cursors to perform a basic analysis of your trace data.


Figure 4: Historical Data Viewer

Trace views in the HDV can be configured to show multiple overlapping traces and can be configured in trace groups.

The LabVIEW DSC Module is the most efficient way to programmatically retrieve data from Citadel. The DSC Module includes an advanced API for retrieving historical trace and alarm/event data, which can be requested in interpolated or raw form. Raw data simply contains the actual values recorded in the database and their associated time stamps. See Table 1 for an overview of raw data. A request for interpolated data requires the user to specify a time range and time interval. The returned data will be an interpolated approximation of what the trace values were at each requested time. The user can specify either linear or stair-step approximation when requesting interpolated data.
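The difference between linear and stair-step interpolation can be shown with a short Python sketch that resamples raw (time, value) points at a fixed interval. This is not the DSC Module API; the function `resample` and its parameters are hypothetical, and times are plain numbers for simplicity.

```python
from bisect import bisect_right


def resample(raw, start, stop, interval, mode="linear"):
    """Resample raw (time, value) points, sorted by time, at a fixed interval.

    mode="linear" interpolates between neighboring raw points;
    mode="step" holds the last raw value (stair-step).
    """
    times = [t for t, _ in raw]
    result = []
    k = 0
    while start + k * interval <= stop:
        t = start + k * interval
        i = bisect_right(times, t) - 1  # last raw point at or before t
        if i < 0:
            value = None  # no raw data exists yet at this time
        elif mode == "linear" and i + 1 < len(raw):
            (t0, v0), (t1, v1) = raw[i], raw[i + 1]
            value = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        else:
            value = raw[i][1]
        result.append((t, value))
        k += 1
    return result


raw = [(0, 0.0), (10, 10.0), (20, 0.0)]
linear = resample(raw, 0, 20, 5, mode="linear")
stepped = resample(raw, 0, 20, 5, mode="step")
```

At time 5 the linear result is halfway between the neighboring raw values, while the stair-step result still reports the value logged at time 0.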

Finally, National Instruments DIAdem provides a direct interface to Citadel which allows users to perform advanced analysis and reporting directly on Citadel data. Visit ni.com/diadem for more information about this powerful analysis package.

The Citadel database also includes an ODBC driver, which enables you to retrieve data directly from third-party applications. The Citadel 5 ODBC driver is compliant with the SQL-92 and ODBC 2.5 standards. All clients that follow these standards can retrieve data from a Citadel 5 database.


ADO clients can now use the ODBC driver through the Microsoft OLE DB Provider for ODBC Drivers to access the Citadel 5 database.

Inserting Data into a Database

Inserting data directly into a Citadel trace is useful if you need to:

Log data faster than can be accomplished with a Shared Variable.

Record waveform-type data in a Citadel trace (data with highly accurate time stamps).

Implement a data redundancy system for LabVIEW Real-Time targets.

The LabVIEW DSC Module 8 contains a set of VIs that can be used to manually insert data into a Citadel trace. Data can be inserted point by point with either a user-specified or server-generated time stamp. Numeric, logical, string, bit-array, or variant data can all be written using this method. Benchmarks have demonstrated that single-point data can be written to a Citadel trace at approximately 80,000 updates per second from memory. Once written to the database, the data is immediately available to all local or remote Citadel clients.

Networking and Security Considerations

Networking

The Citadel 5 database is fully network aware and contains a host of built-in networking tools and enhancements. All database functions discussed in the Citadel Operations section of this white paper can be conducted across a network, except deleting or detaching a database.

Citadel also supports logging directly to a remote database. This is accomplished internally by first storing data in a local cache database, then sending the data to a central server when CPU and network bandwidth are available. In the case of a network failure, all data is stored locally in the cache database and is sent to the destination database when the network connection is restored.
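The cache-then-forward pattern described above can be sketched in a few lines of Python. This illustrates the general store-and-forward technique, not Citadel's implementation: the class `StoreAndForwardLogger` and the `send` callable are hypothetical, and the cache here lives in memory rather than in a local database.

```python
from collections import deque


class StoreAndForwardLogger:
    """Queue points locally and forward them in order when the link is up."""

    def __init__(self, send):
        # `send` is a hypothetical callable that delivers one point to the
        # remote database and raises ConnectionError if the network is down.
        self._send = send
        self._cache = deque()

    def log(self, point):
        """Cache the point first, then try to forward everything pending."""
        self._cache.append(point)
        self.flush()

    def flush(self):
        """Send cached points oldest first; stop at the first failure."""
        while self._cache:
            try:
                self._send(self._cache[0])
            except ConnectionError:
                return False  # keep the point cached for a later attempt
            self._cache.popleft()
        return True

    @property
    def pending(self):
        return len(self._cache)


received = []
online = {"up": False}


def send(point):
    if not online["up"]:
        raise ConnectionError("network unavailable")
    received.append(point)


logger = StoreAndForwardLogger(send)
logger.log(1)
logger.log(2)          # both points stay cached while the link is down
online["up"] = True
logger.log(3)          # the next write forwards the backlog in order
```

A point is removed from the cache only after it has been delivered, so an outage delays data but does not drop or reorder it.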

Security

When planning a networked data logging system it is important to consider that Citadel does not contain any built-in security features. Data must be secured by placing all PCs on the data logging network on an isolated subnet or an isolated network. Otherwise, any user with access to Citadel on the network will be able to view, modify, or copy potentially sensitive Citadel data. All Citadel terminals should also be password protected to prevent unauthorized access to data.
