Copyright © 2002 by the Institute of Electrical and Electronics Engineers. Reprinted from "2002 PROCEEDINGS Annual RELIABILITY and MAINTAINABILITY Symposium," Seattle, Washington, USA, January 28-31, 2002. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of ReliaSoft's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by sending a blank email message to [email protected]. By choosing to view this document, you agree to all provision of the copyright laws protecting it.

RF 2002RM-043: page 1 RF

Intellectual Capital: Utilizing the Web for Knowledge Management and Data Utilization in Reliability Engineering

Adamantios Mettas • ReliaSoft Corporation • Tucson

David Rock • ReliaSoft Corporation • Tucson

Key Words: QTMS, Web-based, FRACAS, Dashboard, Reliability Database, Automated Analysis, Information Systems, Knowledge Management, Lessons Learned, Reliability Reports, Process Improvement

SUMMARY & CONCLUSIONS

Intellectual capital is one of the most important assets in successful companies. An important part of this intellectual capital is the product quality and reliability intellectual capital, which comprises “lessons learned,” corrective actions taken, best practices, data regarding customer perception and customer feedback as it relates to different products, and, most importantly, accurate reliability metrics on deployed parts, components and systems. Managing this knowledge, experience and information should be a high priority in both large-scale and small-scale companies. Unfortunately, in many organizations, most of the required information and knowledge is spread across different divisions and geographical locations. Such information usually resides either on paper or in disparate databases, which makes its effective utilization difficult or impossible unless a centralized data collection and management system is in place.

This paper presents a description of the conception and implementation of three web-based systems for the management of reliability and quality information. These are:

1. A simple Internal Tracking System (ITS) developed to manage customer support and anomaly tracking data for a software development company.

2. The evolution of this concept into a more complex and powerful Quality Tracking and Management System (QTMS) capable of meeting the failure reporting, corrective action and customer support needs of both large and small companies with diverse product lines and product configurations.

3. A web-based Reliability/Quality Dashboard for automated analysis and presentation of the reliability and quality data captured by systems like the ITS and QTMS.

The three web-based systems presented in this paper were initially developed by and used by ReliaSoft Corporation with great success and user acceptance. This success led to subsequent modifications, adaptations and deployments of similar systems at other companies.

1. INTRODUCTION

Reliability plays a major role in a well-balanced business scheme, since it is involved in many areas of the enterprise. Utilization of product quality and reliability intellectual capital helps companies avoid expensive mistakes, allows for faster product development, provides better customer service, creates better issue/process management and results in more robust and reliable products. This product quality and reliability information can come from different sources within the organization, including product development and testing divisions, customer support centers, service centers, sales and marketing units, etc. Management and easy utilization of such information becomes essential, not only for the reliability engineers, but also for the design engineers, management, sales and marketing personnel and other entities within the organization. As an example, knowledge of a design attribute that is perceived by the customer to be undesirable can be obtained through analysis of customer feedback data and is of high importance to design engineers when revising the product design.

Building upon current software technology and the high levels of IT infrastructure in most enterprises, systems and processes can be created and/or adapted to enable the collection, categorization, analysis and presentation of enterprise-wide information to the entire organization. Web-based systems (intranet/extranet/Internet) designed for the collection, storage and dissemination of this information to engineering, management and other personnel provide an efficient way to accomplish this objective. With such a system, for example, a customer support center can collect and utilize the knowledge and lessons learned when assisting customers to better manage its process while at the same time making this information available to reliability and design engineers in the enterprise for further analysis.

The implementation of this type of web-based data management system can significantly increase the efficiency and financial strength of an enterprise, and it tends to have a high and measurable return on investment (ROI). With the right system in place, problems can be solved faster, mistakes of the past can be avoided, higher product reliability can be achieved, better maintenance practices can be developed, better customer support can be provided and product development time can be decreased. As a result, customer satisfaction and loyalty can be maintained and are likely to increase significantly.

In the next section of this paper, Section 2, we will look at the initial creation of a simple system, the Internal Tracking System (ITS), created by and for ReliaSoft Corporation, a software company, to better manage quality and reliability data and knowledge for its own products (in this case, software). Sections 3 and 4 present the evolution of this concept into a global Quality Tracking and Management System (QTMS) for use in large-scale enterprises with various products and data needs, followed by a description of a web-based automated data reporting and presentation component, the Reliability/Quality Dashboard.

2. INTERNAL TRACKING SYSTEM (ITS)

2.1 Background and Historical Information

ReliaSoft Corporation is a software company specializing in reliability engineering software tools and services. Because we believe in the adage of “practicing what we preach,” reliability and quality data regarding our products are of paramount importance to our company. With that in mind, we set out to create a system, for our internal use, that was ideally suited to our business needs and provided us with the desired flexibility. This system, the Internal Tracking System (ITS), combines a customer support and issue database with a research and development (R&D) database. Initially, ITS was developed for the needs of ReliaSoft’s customer support division. After its initial implementation, it was recognized that there was a big overlap between the customer support division’s needs and the R&D division’s needs. A complete system was then created in which information on every product “issue” resides in the same database, regardless of whether it originated from a support call, during internal product testing, during feedback sessions or through distributors. During the initial development of the system, we determined that such a system should be fully web-based (forgoing traditional client-server architecture) to allow for easy deployment, access and maintenance across the enterprise. (Appendix A of this paper presents some of the benefits of this approach.)

2.1.1 Customer Support

The initial ITS system was created out of the necessity to better manage and organize the customer support process and data collection for technical support calls related to our software products. Before the creation of the ITS system, customer support representatives were running into a fundamental and yet complex problem: information-sharing. As our customer base rapidly increased, so did the volume of support calls. Efficiency in the customer support process became very important at this point for three main reasons: (a) customer satisfaction; (b) better understanding of customer requirements; and (c) business costs. At the same time, it was observed that many of the calls were repetitive, i.e. multiple callers had the same question. However, knowledge about how to resolve particular issues was “polarized” among individual tech-support representatives and was not readily available to all support personnel, which led to longer support calls. Time was squandered on finding the solution to a problem that had already been resolved by another technical support representative or on looking for the right representative to answer a customer’s question.

To resolve the problem, we decided to create a customer support system to record data from all customer support calls, manage the status of particular issues and maintain history. At this point, the fundamental information that needed to be captured included: the purpose of call (i.e. customer’s question or problem), product information, platform and environment that the product was installed on, customer information, the problem resolution and the name of the tech-support representative. It was also crucial for this system to be easily accessible and available to all tech support personnel and to be fully searchable. This is why we implemented a web-based solution that was deployed through our intranet.
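As an illustration only, the fields listed above can be pictured as a single searchable record. The following Python sketch is not the ITS implementation; the class names, field names and the in-memory list standing in for the shared database are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SupportCall:
    """One logged support call; field names are illustrative, not the ITS schema."""
    purpose: str          # customer's question or problem
    product: str          # product information
    environment: str      # platform/environment the product is installed on
    customer: str         # customer information
    resolution: str       # how the problem was resolved
    representative: str   # tech-support representative who took the call
    logged_on: date = field(default_factory=date.today)


class CallLog:
    """In-memory stand-in (assumption) for the shared, searchable call database."""

    def __init__(self) -> None:
        self._calls: list[SupportCall] = []

    def record(self, call: SupportCall) -> None:
        self._calls.append(call)

    def search(self, keyword: str) -> list[SupportCall]:
        """Case-insensitive keyword match over the purpose and resolution text."""
        needle = keyword.lower()
        return [
            c for c in self._calls
            if needle in c.purpose.lower() or needle in c.resolution.lower()
        ]


# Hypothetical sample data
log = CallLog()
log.record(SupportCall(
    purpose="Cannot import data file",
    product="Product A 1.0",
    environment="Windows 2000",
    customer="Example Customer",
    resolution="Re-save the file as tab-delimited text before importing",
    representative="Rep 1",
))
log.record(SupportCall(
    purpose="License key rejected",
    product="Product A 1.0",
    environment="Windows NT 4.0",
    customer="Another Customer",
    resolution="Re-issued key",
    representative="Rep 2",
))

matches = log.search("import")
```

With a shared log of this kind, any representative can look up how a colleague resolved the same question instead of searching for the colleague.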

Once the first version of this system was implemented, it became apparent that the information captured was very important to the product development engineers as well as the tech-support personnel because the customer’s “voice” is essential in any new development or update of an existing product. The use of existing customer support information in new product designs led to the next phase of ITS development: expanding the system to incorporate an Anomaly Tracking System (ATS), which is the R&D database part of the ITS.

2.1.2 Anomaly Tracking System (ATS)

The anomaly tracking system was developed to improve the process of fault reporting, product improvement reporting, and overall task assignment and prioritization during the product development phase, thus improving the communication flow and subsequently improving product quality and expediting product release. Prior to the development of this system, faults and suggestions were reported on paper (or via e-mail) to the project manager. Thereafter, the reported fault was passed on to the engineer responsible for the component that the fault was associated with, and corrective action was taken. Once the action was completed, the paper was returned to the project manager who closed the incident. This generated a substantial paper trail, and many lessons learned were lost in the process. This was also a source of delays in the development, and consequently in the product release.

2.2 Overview of the ITS

All issues in the ITS require the same information to be entered by the user, regardless of whether they are entered as the result of a support call, found during internal testing, or obtained from other sources. The only difference is that, for customer-initiated issues, the tech-support representative enters customer contact information. Figure 1 illustrates an example of an ITS screen, where the tech-support representative or the engineer enters the details of an issue. Information on the product, the environment in which it is used, the problem, the resolution, and the priority of the issue for resolution are required to fully describe the “issue.” In addition, the issue is assigned to a category and assigned to a specific individual to take the required action. For example, in the case of software, a “Run Time Error” is a priority 1 issue. The issue is assigned to the appropriate project manager to ensure that the error is corrected in the software.

When accessing the ITS, each user has his/her own “inbox” window where the issues assigned to him/her appear. Furthermore, all cases can then be searched, viewed, prioritized and tracked through multiple interfaces, as shown in Figure 2.
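The issue record and per-user inbox described in this section might be sketched as follows. This is a minimal sketch and not the ITS code: the field names, the rule that a lower number means a more urgent priority, and the use of an empty resolution to mean "still open" are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    issue_id: int
    product: str
    environment: str
    problem: str
    category: str                            # e.g. "Run Time Error"
    priority: int                            # 1 = most urgent (assumed ordering)
    assignee: str
    resolution: str = ""                     # empty until the issue is resolved
    customer_contact: Optional[str] = None   # only for customer-initiated issues


def inbox(issues: list[Issue], user: str) -> list[Issue]:
    """Open issues assigned to `user`, most urgent first."""
    mine = [i for i in issues if i.assignee == user and not i.resolution]
    return sorted(mine, key=lambda i: (i.priority, i.issue_id))


# Hypothetical sample data
issues = [
    Issue(1, "Product A", "Windows 2000", "Typo in help file",
          "Documentation", 3, "alice"),
    Issue(2, "Product A", "Windows 2000", "Crash on opening a file",
          "Run Time Error", 1, "alice",
          customer_contact="customer@example.com"),
    Issue(3, "Product B", "Windows NT", "Plot legend overlaps axis",
          "Display", 2, "bob"),
    Issue(4, "Product A", "Windows 98", "Slow startup",
          "Performance", 2, "alice", resolution="Fixed in a later build"),
]

alice_inbox = inbox(issues, "alice")
```

Here the priority 1 “Run Time Error” appears at the top of the assignee's inbox, and the already resolved issue is left out.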

Figure 1: An illustration of a sample web-based ITS screen where the tech-support representative or the engineer enters the details of an issue.

2.3 ITS Benefits Overview

Since its implementation, the ITS system has been readily and easily adopted by all personnel and has had a tremendous positive impact on the company’s processes. This system is now an integral part of our product development and support process. Some of the realized benefits of this system include:

1. Offers the ability to report technical support problems from customers and search for previous solutions, therefore significantly reducing both the length and the cost of support calls.

2. Provides a “knowledge base” of the lessons learned from previous customer support calls that are used to troubleshoot new customer calls. This eliminates the “polarization” of knowledge among specific tech-support representatives and makes it easier to bring additional tech-support staff on when needed.

3. Problems are addressed more quickly, and based on their priority, which results in more reliable products and higher customer satisfaction.

4. Provides the ability for R&D to have a wider and larger testing base (including external testers and distributors) that reports issues in real time to the project manager and responsible engineer.

5. Eliminates the paper trail, therefore allowing for easy auditing of problems later on.

6. Allows the design engineers to better manage their time because they can decide which problems to address first.

Figure 2: Additional web-based interface screens utilized in the ITS system.

7. Allows for suggestions to be entered for future product updates and development, resulting in better products and products tailored to the customer’s needs.

8. The integrated search capability allows engineers to find previously reported problems and information on their resolution, resulting in expediting product release and the avoidance of duplicated effort.

9. Facilitates the improvement and enhancement of internal test protocols for new products to include issues found on prior products.

10. Reliability growth information (e.g. RG graphs of the product development stages) can now be easily retrieved, which results in better planning for testing and more realistic product release goals.

11. The web-based nature of the system allows authorized users to access and share information at any time and from anywhere there is access to the intranet/extranet/Internet.

3. QUALITY TRACKING AND MANAGEMENT SYSTEM (QTMS)

3.1 Background and Historical Information

After realizing the benefits of the Internal Tracking System (ITS), our R&D team set out to create a next-generation system with a much broader scope and applicability to multiple industries, including different types of original equipment manufacturers (OEMs). This system, the Quality Tracking and Management System (QTMS), incorporated all aspects of a complete failure reporting, analysis and corrective action system (FRACAS) process and also included a closed-loop system for problem resolution, tracking and reporting. In creating a larger and more general system that could be easily adopted for use by multiple OEMs and other industries, lessons learned during the development of the ITS were utilized. The first item that needed to be modified was the process of dealing with issues. In the ITS, each issue was dealt with individually by the responsible engineer. The drawback was that multiple issues can be due to the same underlying problem and require a single corrective action, and thus valuable time was wasted dealing with and resolving individual issues (i.e. even though the underlying problem was addressed, each issue related to that problem had to be resolved individually). Additionally, the system needed to (a) be expanded to allow its use for any type of product, including large systems/products (e.g. a car) with multiple component levels; (b) capture all vital reliability information (including accumulated age on each component, repair duration, etc.) for further reliability analysis, which required serialization capabilities and cross-linking with bills of materials for each product; (c) be a complete FRACAS; (d) act as a process manager; and (e) become a reliability and quality data warehouse and knowledge base (as more information is accumulated).
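Requirements (a) and (b) can be pictured as a serialized bill of materials in which each component carries its own accumulated age. The sketch below is only one way to model this and is not taken from QTMS; the class names, the part and serial numbers, and the use of operating hours as the age unit are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    """A serialized item in a multi-level bill of materials (illustrative)."""
    part_number: str
    serial_number: str
    age_hours: float = 0.0   # accumulated age of this specific unit
    children: list["Component"] = field(default_factory=list)

    def find(self, serial_number: str) -> Optional["Component"]:
        """Depth-first search of this item and its sub-components."""
        if self.serial_number == serial_number:
            return self
        for child in self.children:
            hit = child.find(serial_number)
            if hit is not None:
                return hit
        return None


@dataclass
class FailureRecord:
    serial_number: str
    age_at_failure_hours: float
    repair_hours: float


def report_failure(root: Component, serial_number: str,
                   repair_hours: float) -> FailureRecord:
    """Capture the failed component's own age, not the age of the whole system."""
    part = root.find(serial_number)
    if part is None:
        raise KeyError(f"serial number {serial_number!r} not in this unit's BOM")
    return FailureRecord(serial_number, part.age_hours, repair_hours)


# Hypothetical car: the alternator was replaced earlier, so it is younger
# than the vehicle it is installed in.
car = Component("CAR-100", "SN-0001", age_hours=1200.0, children=[
    Component("ENG-200", "SN-0457", age_hours=1200.0, children=[
        Component("ALT-310", "SN-9921", age_hours=350.0),
    ]),
])

record = report_failure(car, "SN-9921", repair_hours=1.5)
```

Tracking age per serial number matters for later life data analysis, because a replaced component does not share the age of the system that contains it.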

3.2 Building QTMS

3.2.1 Working with Problems Instead of Incidents

For any incident, one can say that there is an underlying reason or a problem that must be addressed in order to resolve the incident. Additionally, the relationship between incidents and problems is not necessarily one to one. In other words, multiple incidents can exist due to the same problem. In this case, requiring engineers to work with each incident can be counter-productive. A better approach is to group incidents into problems and then proceed with resolving the underlying problem. This became an integral part of the QTMS process. Figure 3 illustrates this approach.
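The many-to-one relationship between incidents and problems can be sketched in a few lines. This is a minimal sketch under assumed names (`Incident`, `PRR`, `assign`, `resolve`), not the QTMS data model; it shows only that resolving the problem resolves every incident grouped under it.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    incident_id: int
    description: str
    resolved: bool = False


@dataclass
class PRR:
    """Problem Resolution Report: one underlying problem, many incidents."""
    prr_id: int
    problem: str
    incidents: list[Incident] = field(default_factory=list)
    resolved: bool = False

    def assign(self, incident: Incident) -> None:
        self.incidents.append(incident)

    def resolve(self) -> None:
        """Resolving the problem resolves all incidents associated with it."""
        self.resolved = True
        for incident in self.incidents:
            incident.resolved = True


# Hypothetical sample data: two reports of the same underlying problem
# and one unrelated incident that is not assigned to this PRR.
dealer_report = Incident(1, "Unit will not power on (dealer)")
customer_report = Incident(2, "Unit will not power on (customer)")
unrelated = Incident(3, "Label misprinted")

prr = PRR(100, "Faulty power connector")
prr.assign(dealer_report)
prr.assign(customer_report)
prr.resolve()
```

The engineer acts once on the PRR, and both power-on incidents are closed together while the unrelated incident stays open.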

Figure 3: Overview of incident grouping in the QTMS process. The diagram shows three stages:

1. Incidents occur and are reported. Multiple incidents relating to problems are reported. These reported incidents can originate from multiple sources, including in-house testing, R&D, dealers, customers or even suppliers. Incidents are entered through a web interface directly into the QTMS system.

2. Incidents are assigned to problems. Incidents are then assigned to problems, since multiple incidents can be the same or be caused by the same problem. In QTMS jargon, incidents are assigned to PRRs (Problem Resolution Reports).

3. Problems are managed, tracked and resolved. Problems (PRRs) are then assigned, managed, tracked and resolved. Resolving a problem resolves all incidents associated with it.

Figure 4: A sample of an incident creation web page in QTMS.

[Flowchart: QTMS incident reporting and PRR process. Text recovered from the diagram boxes:

Reporting of incident (failure, malfunction, functionality, etc.): external incidents (field incident, customer complaint, dealer incident, etc.) and internal incidents (found in testing, R&D, etc.).

Use the QTMS web-based interface to log new incidents or track existing incidents.

The system automatically notifies the responsible engineer (or group) of the new incident, based on the group's assigned responsibilities and on what the incident relates to with respect to system/subsystem/component.

The responsible engineer manages incidents through PRRs. One or more incidents can be assigned to the same Problem Resolution Report (PRR), and engineering teams solve the underlying problem, which in turn solves the associated incident(s). Options: assign to new PRR; assign to existing PRR; close/disregard incident, or in already resolved PRR.

PRR process in system: teams manage open PRRs; categorize/prioritize PRR (Safety, High, Moderate, Low); manage PRR (Define, Contain, Correct, Prevent); request PRR closure. Each PRR owner manages the assigned problem resolution process.

Manage action items. PRR owner: create action items, assign action items, manage action items, review status. Action item assignee: review items assigned, create/update statuses, create results, associate related files, mark as completed.

Incident resolved.]

INCIDENTRESOLVED

PRR Process inSystem

Teams ManageOpen PRR's

Teams ManageOpen PRR's

INCIDENTRESOLVED

PRR Process inSystem

INCIDENTRESOLVED

PRR Process inSystem

Teams ManageOpen PRR's

Teams ManageOpen PRRs

RequestPRR

Closure

RequestPRR

Closure

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

RequestPRR

Closure

RequestPRR

Closure

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages assignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned to himCreate/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Each PRR Owner Manages AssignedProblem Resolution Process

Manage Action ItemsPRR Owner:Create Action ItemsAssign Action ItemsManage Action ItemsReview StatusAction Item Assignee:Review Items Assigned Create/Update StatusesCreate ResultsAssociate Related FilesMark it as completed

CategorizePrioritize

PRR

SAFETY

HIGH

MODER ATE

LOW

DEFINE

CONTAIN

CORRECT

PREVENT

ManagePRR

Figure 5: Graphical overview of the QTMS Process.

In the case of the QTMS, we also realized that incidents occur and are reported from multiple sources, including in-house testing, manufacturing, R&D, dealers, customers and even suppliers. To allow for this, an easily deployable and accessible interface for entering incidents was needed. The web-based technology developed and employed in the creation of the ITS system was enhanced and reused for the QTMS system. This allowed instant access to the incident creation page from any web-enabled device that had access and authorization. Furthermore, to provide more complete reliability information, additional data may be collected and entered for each failure incident, including the primary failures and collateral failures. Other data, such as parts replaced/repaired and repair duration, are also captured. This information is then used to automatically compute life distributions for each part, component and system, as well as appropriate distributions for repairs, to facilitate better system reliability, maintainability and availability analysis. Figure 4 shows a sample of an incident creation web page.¹

¹ Note that different interfaces may need to be utilized if accessing the system from hand-held devices.

Once all of the required information has been entered, the system initiates the process by notifying the engineer responsible for the part/system that caused the incident, as identified by the primary cause of the failure/incident. In turn, this engineer assigns the incident to an existing problem or creates a new problem for this particular incident. In QTMS jargon, “incidents” are assigned to “PRRs” (Problem Resolution Reports). Once the incident is assigned to a PRR, the process continues with the resolution of the problem. Figure 5 shows an overview of this process.
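The notification and assignment steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (`Incident`, `PRR`, `Tracker`), the component-to-engineer lookup table and the in-memory `outbox` are hypothetical and do not describe the actual QTMS implementation, which is web-based and database-backed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Incident:
    incident_id: int
    source: str       # e.g. "customer", "dealer", "in-house testing"
    component: str    # part/system identified as the primary cause
    prr_id: Optional[int] = None


@dataclass
class PRR:
    prr_id: int
    title: str
    owner: str
    incident_ids: List[int] = field(default_factory=list)


class Tracker:
    """Hypothetical in-memory stand-in for the incident/PRR store."""

    def __init__(self, responsibility: Dict[str, str]):
        self.responsibility = responsibility  # component -> responsible engineer
        self.prrs: Dict[int, PRR] = {}
        self.outbox: List[str] = []           # messages that would be e-mailed

    def log_incident(self, incident: Incident) -> str:
        """Look up the responsible engineer and queue a notification."""
        engineer = self.responsibility.get(incident.component, "unassigned-queue")
        self.outbox.append(f"To {engineer}: new incident {incident.incident_id}")
        return engineer

    def assign(self, incident: Incident, owner: str,
               prr_id: Optional[int] = None, title: str = "") -> PRR:
        """Attach the incident to an existing PRR, or open a new one."""
        if prr_id is None:
            prr_id = max(self.prrs, default=0) + 1
            self.prrs[prr_id] = PRR(prr_id, title, owner)
        prr = self.prrs[prr_id]  # raises KeyError for an unknown PRR number
        prr.incident_ids.append(incident.incident_id)
        incident.prr_id = prr_id
        return prr


tracker = Tracker({"fuel pump": "engineer_a"})
first = Incident(1, "customer", "fuel pump")
second = Incident(2, "dealer", "fuel pump")
engineer = tracker.log_incident(first)
tracker.log_incident(second)
prr = tracker.assign(first, owner=engineer, title="Fuel pump seal leak")
tracker.assign(second, owner=engineer, prr_id=prr.prr_id)
print(prr.incident_ids)
```

Both incidents end up under one PRR, so resolving the underlying problem resolves every incident attached to it.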

3.2.2 Resolving the Incidents by Resolving the Problems

Once incidents have been assigned to PRRs and the PRRs have been assigned to the engineer responsible for resolving the problem (ERRP), the system automatically notifies him/her of the assignment (usually via e-mail). Once notified, it is then the responsibility of the ERRP to utilize the problem resolution interface (PRI) of the system to resolve the issue. The main facets of the problem resolution process can be grouped into activities to: (a) describe the problem; (b) contain the problem; (c) correct the problem; and (d) close/resolve the problem.

The system’s PRI enables and assists the ERRP to accomplish this through an integrated interface that allows for task/action item² creation and assignment. These items are assigned to other resources (i.e. other engineers or groups) and can be prioritized, updated and monitored via QTMS interfaces. During this process, data on the actions taken become an integral part of the knowledge stored in the system.

² An example of a task could be “Find a temporary fix to the problem to get the customer’s product back up and running until we resolve the manufacturing issue.”

The PRR record for a problem that has been resolved will include: (a) a definition of the problem; (b) the identification of the root cause of the problem; (c) a description of the failure mode and part responsible; (d) a list of the effects caused by the problem; (e) a description of the resolution to the problem; (f) any additional documents and references utilized in resolving the problem; and (g) a description of the methods to prevent the problem from happening again in future product designs (i.e. a lessons learned knowledge base). The QTMS also includes the capability to allow a “closure committee” or “review board” to review and approve the closure of particular PRRs.
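One way to picture the resolved PRR record and the closure review is as a record type with a completeness check. The sketch below is hypothetical: the field names are invented for illustration, and treating item (f), the supporting documents, as optional is an assumption rather than something the QTMS is stated to do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PRRRecord:
    problem_definition: str = ""                          # (a)
    root_cause: str = ""                                  # (b)
    failure_mode_and_part: str = ""                       # (c)
    effects: List[str] = field(default_factory=list)      # (d)
    resolution: str = ""                                  # (e)
    references: List[str] = field(default_factory=list)   # (f) assumed optional
    prevention: str = ""                                  # (g) lessons learned
    approved_by_review_board: bool = False


# Fields assumed to be mandatory before closure can be requested.
REQUIRED = ("problem_definition", "root_cause", "failure_mode_and_part",
            "effects", "resolution", "prevention")


def missing_fields(record: PRRRecord) -> List[str]:
    """Names of required fields that are still empty."""
    return [name for name in REQUIRED if not getattr(record, name)]


def can_close(record: PRRRecord) -> bool:
    """A PRR closes only when complete and approved by the review board."""
    return not missing_fields(record) and record.approved_by_review_board


record = PRRRecord(
    problem_definition="Seal leaks after thermal cycling",
    root_cause="Seal material hardens at low temperature",
)
print(missing_fields(record))
print(can_close(record))
```

Listing the empty fields, rather than returning a bare yes/no, gives the PRR owner a concrete to-do list before requesting closure.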

3.3 Benefits of QTMS

QTMS provides a complete closed-loop system for the tracking and management of incidents/problems and their resolution, while capturing relevant data. The benefits of this system include the ITS benefits enumerated earlier, with the following additions:

1. Provides a more refined, closed-loop process capable of effectively dealing with large enterprises and complex products.

2. Supports a team approach for tracking and management of the problem resolution process.

3. Provides detailed data (including times-to-failure and times-to-repair data by part, component and system) that enables further reliability analysis, including advanced parametric analysis.

4. Allows engineers to ascertain feasibility and initial reliability of future designs based on the historical reliability data captured by the system.

5. Provides a view into customer usage profiles and use environments.

6. Provides insight into the problem and issue resolution process by maintaining metrics on that process (e.g. how quickly issues are dealt with).

7. Can be used as a knowledge sharing system, so that the user can find out if a problem already exists and the status of resolution efforts.
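As an illustration of the parametric analysis mentioned in item 3, the following sketch fits a two-parameter Weibull distribution to times-to-failure data. It is a minimal example under stated assumptions: the data are complete (no censored units), the fit uses median rank regression with Benard's approximation, and the function names are hypothetical. The paper does not state which estimation method the QTMS itself uses.

```python
import math
from typing import List, Tuple


def median_ranks(n: int) -> List[float]:
    """Benard's approximation to the median rank of the i-th of n failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def fit_weibull(times_to_failure: List[float]) -> Tuple[float, float]:
    """Return (beta, eta) by least squares on the linearized Weibull CDF:

        ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta)
    """
    times = sorted(times_to_failure)
    if len(times) < 2 or times[0] <= 0:
        raise ValueError("need at least two positive failure times")
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - f)) for f in median_ranks(len(times))]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("failure times must not all be identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                         # slope = shape parameter
    eta = math.exp(x_mean - y_mean / beta)   # from intercept = -beta * ln(eta)
    return beta, eta


def reliability(t: float, beta: float, eta: float) -> float:
    """Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


# Illustrative failure times in hours (not taken from the paper).
beta, eta = fit_weibull([16, 34, 53, 75, 93, 120])
print(f"beta={beta:.2f}, eta={eta:.1f}, R(50)={reliability(50, beta, eta):.3f}")
```

Field data usually contain units that have not yet failed, so a production analysis would need a method that handles censoring; this sketch deliberately leaves that out.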

4. ADDED PROCESS BENEFITS - THE WEB-BASED DASHBOARD COMPONENT

4.1 Background

Once the QTMS (or other similar data capture system) is successfully deployed, multiple drill-down reports and queries can be made available to most users (depending on their access or authorization level) for further reporting and analysis. That may be sufficient for the engineering groups of most organizations. However, presenting high-level information in a graphically rich, easy-to-use format on a real-time basis helps garner management buy-in and support, and provides an intuitive and consistent interface for turning data into information, and information into decisions. This can be accomplished via the use of the Reliability/Quality Dashboard (first introduced by ReliaSoft in 1994). Like the dashboard in a car, whose gauges and instruments empower the driver to make informed decisions and reach his/her destination successfully and safely, the Reliability/Quality Dashboard allows the enterprise to easily navigate through its quality and reliability issues. Implementing such a Dashboard system is by no means a simple task, and some lessons learned from such an implementation are discussed by J. Jauw and P. Vassiliou (Ref. 1).

The most challenging aspect of this Dashboard implementation is the coherent and consistent acquisition of the data required to drive analysis and reporting. However, with the implementation of a QTMS system, the most challenging data elements are already captured and stored during the problem-solving process, thus making the implementation of a Dashboard component a less formidable task.

4.2 About the Dashboard System

A Reliability/Quality Dashboard system can and should be polymorphic, allowing individual users to easily view analyzed results that interest them. Think of it as a personalized edition of USA Today, delivered via a web browser, detailing the reliability and quality metrics of products. One instance of the Dashboard, geared toward management personnel, may include information on current and projected sales,³ warranty costs, warranty costs per unit shipped, reliability metrics, technical hotline call numbers per issue, failure analysis results, reasons for returned products, etc. A generic example of such a web-based Dashboard implementation is shown in Figure 6. From an engineering perspective, the Dashboard may have a totally different look. For engineers, the general business information may be replaced with specific graphs and charts such as Weibull probability plots, reliability growth curves and other reliability and/or quality graphs.

Figure 6: An example of a web-based Dashboard.

³ In general, revenue and forecasting information may be required for some management-oriented systems. Such information may not be available in standard reliability databases such as the QTMS. In cases like this, links can be established with the systems containing such information.


4.3 Benefits of Dashboard System

The Dashboard system provides many benefits including the following:

1. Provides a single enterprise-wide system for access to crucial reliability and quality information.

2. Provides a timely view of vital information, thus decreasing reaction time for emerging problems and enabling decisions based on accurate information.

3. Can be configured to automatically monitor processes (utilizing automated calculations and updates, and through the use of control limits).

4. Increases senior management’s awareness of reliability information and practices.

5. Creates reports automatically in real time and in a consistent manner, allowing engineers to spend more time resolving issues instead of creating reports to identify them.
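The automated monitoring with control limits mentioned in item 3 can be illustrated with a simple check. This is a sketch under stated assumptions: limits are set at the baseline mean plus or minus three sample standard deviations, the metric is a weekly incident count, and the function names and numbers are hypothetical. Standard control charts typically estimate the spread differently (for example from moving ranges), and the paper does not specify how the Dashboard computes its limits.

```python
import statistics
from typing import List, Tuple


def control_limits(baseline: List[float], sigmas: float = 3.0) -> Tuple[float, float]:
    """Lower/upper limits from a baseline period (needs two or more points)."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    # A count cannot be negative, so the lower limit is floored at zero.
    return max(0.0, center - sigmas * spread), center + sigmas * spread


def out_of_control(baseline: List[float], recent: List[float],
                   sigmas: float = 3.0) -> List[Tuple[int, float]]:
    """Return (index, value) pairs in `recent` that fall outside the limits."""
    lower, upper = control_limits(baseline, sigmas)
    return [(i, v) for i, v in enumerate(recent) if v < lower or v > upper]


# Hypothetical weekly incident counts for one component.
baseline_weeks = [4, 5, 6, 5, 4, 6, 5, 5]
recent_weeks = [5, 6, 12, 4]
print(out_of_control(baseline_weeks, recent_weeks))
```

A Dashboard gauge could run such a check each time new incidents are logged and highlight only the flagged periods.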

5. APPLICATIONS AND SUCCESS STORIES

Web-based systems similar to those described in this paper have been implemented in some well-known enterprises. One such system is the Reliability Data System (RDS), which is the cornerstone of Plug Power's failure reporting, analysis and corrective action system (FRACAS) (Ref. 2).

Mr. Chris Smith (Ref. 2) explains: “The web-based tool enables effective documentation of all failure events occurring during system development, verification and field reliability testing. E-mail notification of incident report provides immediate feedback to design engineers, thus allowing them to investigate the event prior to losing valuable information. This has greatly improved our ability to determine failure causes. Incident reports are readily classified based on severity and frequency then assigned to a problem resolution report (PRR). The PRR is designed to guide engineering teams through a structured problem solving process, provide a means for them to document action items and results and serve as a repository for lessons learned. Custom query and reporting capabilities facilitate the management of the entire problem set which in turn grows our system and fleet reliability.

Additionally, RDS provides a means for Plug Power to accurately measure fleet reliability. The "Dashboard" component automates several reliability metrics based on the failure events, the system reliability model and system run time. It is used to estimate failure rates of subsystems and components which enables management to prioritize resources to improve system reliability. The automation of this data reduction effort has directly improved the productivity of our reliability engineering department.”

6. APPENDIX A - BENEFITS OF WEB-BASED INFORMATION SYSTEMS

Web-based information systems provide many benefits over client-server and stand-alone applications, including the following:

1. Uniform presentation of information, so everyone is “on the same page.”

2. Centralized systems for storing reliability information, instead of separate databases or even separate Excel spreadsheets.

3. N-tiered architecture, for ease in scalability and reliability.

4. Ease of deployment of system (no software to install on individual user computers).

5. Ease of maintainability of system (since a single change on the server automatically updates the system).

6. Everything runs in a web browser, so the system can be accessed wherever a connection to the corporate Intranet/Internet/Extranet is available.

7. Flexible security and permission models can be utilized.

ACKNOWLEDGMENTS

We give special thanks to Pantelis Vassiliou, Lisa Hacker and all other ReliaSoft personnel for their support and feedback on this paper. We also wish to thank Chris Smith for his input and contribution to the functional requirements for the QTMS system, as well as for his contribution to this paper.

REFERENCES

1. J. Jauw and P. Vassiliou, “Field Data is Reliability Information: Implementing an Automated Data Acquisition and Analysis System,” Proceedings of the Annual Reliability & Maintainability Symposium, 2000 Jan, pp 86-93.

2. C. Smith, Reliability Engineer, ASQ, CRE, Plug Power, Inc.

BIOGRAPHIES

Adamantios Mettas
ReliaSoft Corporation
Corporate R&D
115 S. Sherwood Village Dr.
Tucson, AZ 85710 USA
[email protected]

Mr. Mettas is the Senior Research Scientist at ReliaSoft Corporation. He fills a critical role in the advancement of ReliaSoft's theoretical research efforts and formulations in the subjects of Life Data Analysis, Accelerated Life Testing, and System Reliability and Maintainability. He has played a key role in the development of ReliaSoft's software including Weibull++, ALTA, and BlockSim, and has published numerous papers on various reliability methods. Mr. Mettas holds a BS Degree in Mechanical Engineering and an MS degree in Reliability Engineering from the University of Arizona.

David Rock
ReliaSoft Corporation
Corporate R&D
115 S. Sherwood Village Dr.
Tucson, AZ 85710 USA
[email protected]

Mr. Rock serves as the development team leader for ReliaSoft’s Enterprise Systems Division, which has implemented several enterprise systems for Fortune 500 companies. Mr. Rock has also served as the development team leader for ReliaSoft’s Weibull++ 5.0, ALTA, and Reliability Growth software packages. Mr. Rock holds a BS Degree in Aerospace Technology from Kent State University and an MS Degree in Mechanical Engineering from the University of Arizona.