Running head: Six Sigma in Big Data 1
Implementing Six Sigma in Big Data – Training Program for
Technical Consultant at PwC
Srinivas Pochincharla
Dr. Priscilla Berry
University of Florida
Authors:
Srinivas Pochincharla, 401 East Las Olas Boulevard, Suite 1800, Fort Lauderdale, Florida 33301
300 Madison Avenue #24, New York, NY-10017
Contents
Executive Summary
Introduction
What is Big Data
Current Difficulties
Solution
What is six sigma
Implementation of Six Sigma methodologies in Big Data
Conclusion
Reference
Executive Summary
As a consulting and advisory firm, PricewaterhouseCoopers (PwC) currently
provides data assurance solutions to clients. With the advent of big data,
PwC is also venturing into predictive analytics: analyzing very large
amounts of data using techniques that range from NoSQL databases to
proprietary solutions such as SAS. Big data is an abstract concept in which
extracting, analyzing and sorting data can help an organization predict
future trends and achieve profitability in a highly competitive market. The
problem with big data lies in its associated uncertainty, and one of the
many challenges is making the extraction process time efficient and error
free. The Six Sigma process can make extraction more efficient. Six Sigma is
a rigorous process in which the organization meticulously observes and
mitigates the errors and deviations occurring in its operations by applying
rules and strategies. Implementation of Six Sigma has resulted in estimated
savings of $427 billion for the Fortune 500 companies (Marx, 2007).
By combining the elements of Six Sigma with the predictive analytics
concepts of big data, PwC can minimize the uncertainty associated with the
data and streamline the process. In big data, categorization is difficult.
Using the process of Six Sigma will therefore make categorization easier, as
Six Sigma is statistical in nature. Furthermore, using Six Sigma will
also shorten cycle times, as time management for the teams working
on data extraction will improve, giving the firm a critical competitive
edge. Implementing Six Sigma across the teams working on big
data projects will result in higher client satisfaction, thus increasing revenue
for PwC.
Introduction
What is Big Data
The amount of data in our world has been expanding, and analyzing large
data sets, so-called big data, will become a key component of
competition, underpinning new waves of productivity growth, innovation, and
consumer surplus. Big Data is currently a $53.4 billion industry and is
growing exponentially, as shown in Figure 1 (Kelly, 2014).
The term big data usually refers to large amounts of data. A large volume
of data is useless unless information is extracted from it. Not only does
the extraction have to be meaningful, but it also has to be rapid. Extraction of
data usually depends on three factors:
1) Volume, how big the data is;
2) Velocity, how fast the data is growing;
3) Variety, what types of data are in the sample collected.
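The three factors above can be measured for a batch of incoming records. The following is a minimal sketch in Python, not a PwC tool: the function name `profile_batch`, the record layout (a timestamp paired with a payload), and the sample values are all assumptions made for illustration. Velocity is approximated here as records per second over the observed time span.

```python
from datetime import datetime


def profile_batch(records):
    """Summarize a batch of (timestamp, payload) records by the three factors."""
    timestamps = [ts for ts, _ in records]
    span_seconds = (max(timestamps) - min(timestamps)).total_seconds()
    return {
        # Volume: how many records are in the batch.
        "volume": len(records),
        # Velocity: records arriving per second over the observed time span.
        "velocity_per_sec": (
            len(records) / span_seconds if span_seconds else float("inf")
        ),
        # Variety: the distinct payload types present in the batch.
        "variety": sorted({type(payload).__name__ for _, payload in records}),
    }


# Hypothetical sample: four records of mixed types arriving over three seconds.
sample = [
    (datetime(2014, 1, 1, 0, 0, 0), {"sku": 1}),
    (datetime(2014, 1, 1, 0, 0, 1), "free-text review"),
    (datetime(2014, 1, 1, 0, 0, 2), 19.99),
    (datetime(2014, 1, 1, 0, 0, 3), {"sku": 2}),
]
profile = profile_batch(sample)
print(profile)
```

A profile like this only describes the batch; deciding how much volume, velocity, or variety an extraction process can tolerate remains a judgment for the team.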
Figure 1
An excellent example is the retail chain Target. By running data analytics
on its customers and tracking what they purchase, the retail giant is
able to predict what customers are planning to buy next and
consequently send them advertisements related to those products. The
predictions can be very accurate. In one famous incident,
a gentleman asked Target’s customer service to stop mailing him coupons
related to pregnancy. He found out later that his daughter was pregnant
and that Target had been mailing the coupons based on predictions drawn
from the purchase history of their household account (Goswami, 2014).
Current Difficulties
A significant problem associated with big data is relating the data.
Because the data arrives in large volumes and is generated spontaneously,
being unable to relate it slows the extraction of meaningful data, as
shown in Figure 2 (Taleb, 2013). The data has to be analyzed thoroughly by
professionals, and specialized statistical methods are used to approximate its
meaning.
PwC specializes in this field: its risk assurance branch has
a data assurance department that studies large amounts of data for
clients. Using different software tools and various control checks,
the department extracts the data for clients and ensures its accuracy. However, the
process is time consuming, because it requires effectively managed teams
and an emphasis on client priorities.
Another problem associated with big data is security. Recent
data breaches at Target and Sony compromised customers’
private information. Security is becoming an essential
element in driving customer satisfaction for companies. PwC provides IT
security services that investigate companies’ security loopholes and
make the data more secure by performing analyses.
Figure 2
Solution
What is six sigma
Six Sigma improves the quality of process outputs by identifying and
removing the causes of defects (errors) in business processes. Although it
may seem like a purely technical discipline, Six Sigma is applicable to all
kinds of industries and companies. It helps users develop products with
minimal errors and improves the efficiency of the process
involved.
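Defect reduction is usually quantified as defects per million opportunities (DPMO) and then converted to a sigma level. The sketch below shows that arithmetic in Python. The function names and the sample extraction job are hypothetical, and the 1.5-sigma shift is the customary Six Sigma convention rather than something stated in this paper.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo_value, shift=1.5):
    """Sigma level for a given DPMO, using the conventional 1.5-sigma shift."""
    process_yield = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(process_yield) + shift


# Hypothetical extraction job: 27 bad fields in 5,000 records of 12 fields each.
rate = dpmo(defects=27, units=5_000, opportunities_per_unit=12)
level = sigma_level(rate)
print(f"DPMO: {rate:.0f}, sigma level: {level:.2f}")
```

Under this convention, a process running at 3.4 DPMO corresponds to a sigma level of about 6, which is where the name comes from.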
Six Sigma follows two project improvement methodologies: DMAIC (Define,
Measure, Analyze, Improve and Control) and DMADV (Define, Measure,
Analyze, Design and Verify). Each methodology is composed of five
phases. Companies usually start by implementing the DMAIC methodology and
later, if the organizational culture permits, add DMADV to it. Only the DMAIC
methodology is dealt with in this article.
Implementation of Six Sigma methodologies in Big Data
Big data is becoming fundamental to the future of business. Six Sigma
is essentially statistics, as is big data. Processes, organization structures and
metrics were all designed to support the “zero defects” philosophy of Six
Sigma. Applying the Six Sigma process can effectively reduce human
error (Goswami, 2014).
Six Sigma can effectively be used to provide big data solutions through the
five-phase DMAIC process:
In the define phase, the voice of the customer (VOC) translates
customers’ core needs into technical requirements, turning intangible needs
into a tangible, usable form. The VOC is crucial because it yields the
critical-to-quality measures (CTQs). This phase is especially important in the
consulting industry, because all the other phases depend on it.
By critically understanding customer needs through the Six
Sigma process, the chance of errors in correlating the data can be significantly
reduced.
Figure 3
Failure mode and effects analysis (FMEA) in the measure phase can
analyze the potential failure modes for each of the measured fields.
Executives use the feedback from FMEA to predict disruptions and plan
actions in advance. This analysis is critical to the speed of data
extraction, because any bottleneck in data crunching can hinder
extracting the data efficiently and, most importantly, quickly.
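A common way to rank FMEA findings is the risk priority number (RPN), the product of severity, occurrence, and detection ratings. The Python sketch below illustrates that ranking for fields in an extraction pipeline. The `FailureMode` class, the field names, and every rating are hypothetical, and the 1 to 10 scales are a customary choice rather than one taken from this paper.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    field: str
    description: str
    severity: int    # 1 (negligible) to 10 (critical)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (always caught) to 10 (rarely caught)

    @property
    def rpn(self):
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for measured fields in an extraction pipeline.
modes = [
    FailureMode("customer_id", "duplicate keys after merge", 8, 3, 4),
    FailureMode("order_date", "mixed date formats", 5, 6, 2),
    FailureMode("amount", "currency not normalized", 7, 4, 7),
]

# Highest RPN first: these are the failure modes to address before the others.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.field}: {mode.description}")
```

In this sample the `amount` field ranks first with an RPN of 196, so it would be the first candidate for corrective action.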
The third phase (Analyze) decomposes the collected statistics
to offer practical solutions for the problems at hand. Experimentation in this phase
is a tool for effectively and efficiently analyzing the cause-and-effect
relationship between the measured fields and the CTQs.
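Before running an experiment, a team might screen which measured fields move together with a CTQ. The sketch below uses the Pearson correlation coefficient for that screening step. This is a swapped-in, simpler technique than a designed experiment, and correlation alone does not establish cause and effect. The variable names and measurements are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements from five extraction runs.
batch_size_gb = [1, 2, 3, 4, 5]
null_rate_pct = [4, 1, 3, 5, 2]
extraction_minutes = [12, 21, 33, 41, 52]  # the CTQ in this example

r_size = pearson(batch_size_gb, extraction_minutes)
r_nulls = pearson(null_rate_pct, extraction_minutes)
print(f"batch size vs. time: {r_size:.3f}")
print(f"null rate vs. time:  {r_nulls:.3f}")
```

A field with a coefficient near 1 or -1 is a candidate for a follow-up experiment; a field near 0, like the null rate in this sample, is a weaker candidate.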
The improve phase identifies the variations and develops control charts by
simulating the changes in data flow. These charts can be used for real-time
monitoring (Hartwig, 2012).
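Control limits for such a chart can be computed from baseline readings. The sketch below assumes an individuals (I-MR) chart, where the limits sit at the mean plus or minus 2.66 times the average moving range. The choice of chart type, the function name, and the baseline throughput values are assumptions for illustration.

```python
def control_limits(samples):
    """Lower limit, center line, and upper limit for an individuals chart."""
    center = sum(samples) / len(samples)
    # Moving range: absolute difference between consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for moving ranges of two readings.
    spread = 2.66 * mr_bar
    return center - spread, center, center + spread


# Hypothetical baseline: records processed per second during stable operation.
baseline = [100, 102, 98, 101, 99, 103, 97, 100]
lcl, center, ucl = control_limits(baseline)
print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
```

Readings that later fall outside the lower or upper limit suggest that the data flow has changed and is worth investigating.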
The control phase in Six Sigma monitors the variability in the changed
system. The control phase is critical in the sense that any vulnerabilities
related to the data need to be exposed in real time. This is extremely
difficult considering the volume of the data, and implementing Six Sigma
methodology in this phase can act as a safeguard for the data at
hand. Detecting a data breach in real time can help
companies employ better control schemes.
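Real-time monitoring of this kind can be sketched as a generator that checks each reading as it arrives. The Python example below flags readings outside fixed control limits and long runs on one side of the center line. The function name, the limits, the run length of 8, and the sample readings are all assumptions; a real deployment would choose its own rules and thresholds.

```python
def monitor(stream, lcl, center, ucl, run_length=8):
    """Yield (index, value, reason) for readings that signal a change."""
    side, run = 0, 0
    for i, value in enumerate(stream):
        # Rule 1: a single reading outside the control limits.
        if value < lcl or value > ucl:
            yield i, value, "outside control limits"
        # Rule 2: a long run of readings on the same side of the center line.
        current = (value > center) - (value < center)  # +1, -1, or 0
        if current != 0 and current == side:
            run += 1
        else:
            side = current
            run = 1 if current != 0 else 0
        if run == run_length:
            yield i, value, (
                f"{run_length} consecutive readings on one side of the center line"
            )


# Hypothetical readings and limits; the spike at index 3 should be flagged.
readings = [100, 101, 99, 140, 100, 102]
alerts = list(monitor(readings, lcl=90.88, center=100.0, ucl=109.12))
for index, value, reason in alerts:
    print(f"reading {index} = {value}: {reason}")
```

Because the function is a generator, it can consume an unbounded stream and raise each alert as soon as the offending reading arrives.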
Conclusion
With the advent of big data, industries are moving forward in a competitive
environment where predicting the future through historical analysis will
prove to be a major game changer. However, the technology is fairly new,
considering that Web 2.0, where users can actually interact over the Internet,
was conceived in the last decade. Vast amounts of data have to be managed
effectively, and to do that, research and progress are proceeding at a brisk
pace, with new fields of study such as predictive analytics, visual analysis,
and information systems helping to define the future.
Six Sigma has proven to be a very effective project management solution in
various Fortune 500 companies. In fact, major firms consider compensating
employees for being Six Sigma-certified associates. The statistical
improvements made with Six Sigma can prove to be critical for the
future of big data. Applying Six Sigma can improve not only the analysis of
the data but also its collection and, most importantly,
reduce cycle times in data extraction, which can prove to be the critical edge
that industries require in this competitive environment.
As a consulting firm for whom process improvement and risk assurance are
major components of revenue generation, it is extremely important for PwC
to apply older concepts to recent advancements to give our firm a
competitive edge in the consulting world, especially since competing
firms are already forming dedicated departments for big
data (EY, 2014). Timing is of the essence for innovation and progress
in this field, or PwC may risk losing clients to the competition.
Reference
Marx, M. (2007, Jan 11). Six Sigma saves the Fortune 500 $427 billion. Retrieved from http://www.isixsigma.com/community/blogs/six-sigma-saves-fortune-500-427-billion/
Kelly, J. (2014, Feb 12). Big data vendor revenue and market forecast 2013-2017. Retrieved from http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2013-2017
Goswami, B. (2014, Feb 14). Why Six Sigma learnings are relevant for big data. Retrieved from http://insights-on-business.com/electronics/why-six-sigma-learnings-are-relevant-for-big-data/
Taleb, N. (2013, Feb 8). Beware the big error of ‘big data’. Retrieved from http://www.wired.com/2013/02/big-data-means-big-errors-people/
DMAIC vs. DMADV. (n.d.). Retrieved from http://www.isixsigma.com/new-to-six-sigma/design-for-six-sigma-dfss/dmaic-versus-dmadv/
Six Sigma DMADV methodologies. (n.d.). Retrieved from http://www.villanovau.com/six-sigma-methodology-dmadv/
Hartwig, C. (2012, Apr 10). The parallels between big data and the advent of Six Sigma. Retrieved from http://www.katoka.com.au/2012/04/big-data-and-six-sigma/
EY. (2014, Apr 2). Corporate website detailing service offerings related to big data. Retrieved from http://www.ey.com/US/en/Services/Advisory/IT