Project Report RP12 - Deliverable 5.5
Project: Distributed Computing in Finance
Binghuan Lin
Email: [email protected]
Techila Technologies Ltd
Supervisor: Mr. Rainer Wehkamp & Prof. Juho Kanniainen
Acknowledgment: The research leading to this report has received funding from the European Union's Seventh Framework Programme FP7/2007-2013 under grant agreement No. 289032.
First of all, I am thankful to Mr. Rainer Wehkamp and Prof. Juho Kanniainen for their supervision and support. I am also grateful to colleagues from Techila Technologies' R&D team, in particular Tuomas Eerola, Teppo Tammisto, Kari Koskinen and Marko Koskinen, for their helpful comments on the reports.
I would like to thank the people who provided research support during my past 3 years. In particular, I would like to thank Erik Vynckier, Grigorios Papamanousakis, Terence Nahar and Jinzhe Yang for hosting my secondment to SWIP. I would like to thank Raf Wouters for hosting and supervising me during my visit to the National Bank of Belgium. I would like to thank Juha Kilponen and Jani Luoto of the Bank of Finland for research collaboration, and I would also like to thank Prof. Eduardo Trigo Martinez and Prof. Rafael Moreno Ruiz for hosting my visit to Malaga.
Practitioner's Guide on the Use of Cloud Computing in Finance
Binghuan Lin∗, Rainer Wehkamp†, Juho Kanniainen‡
December 1, 2015
∗Techila Technologies Ltd & Tampere University of Technology
†Techila Technologies Ltd
‡Tampere University of Technology
Contents
1 What is Cloud Computing?
  1.1 Why cloud computing and why now?
2 Background
  2.1 The taxonomy of parallel computing
  2.2 Glossary
3 Financial Applications of Cloud Computing
  3.1 Derivative Valuation and Pricing
  3.2 Risk Management and Reporting
  3.3 Quantitative Trading
  3.4 Credit Scoring
4 The Nature of Challenges
5 Implementation and Practices
  5.1 Implementation Example: Techila Middleware with MATLAB
  5.2 Computation Needs
  5.3 Solution Selection
  5.4 Algorithm Design
  5.5 Evaluation and Optimization
6 Case Studies
  6.1 Portfolio Backtesting
  6.2 Distributed Portfolio Optimization
  6.3 Distributed Particle Filter for Financial Parameter Estimation
7 Cloud Alpha: Economics of Cloud Computing
  7.1 Cost Analysis
  7.2 Risks
1. What is Cloud Computing?
It took mankind centuries to learn how to make use of electricity. In the early days, factories and corporations were powered by on-site, small-scale power plants. Maintaining such power plants was expensive due to the additional labor cost. Nowadays, with the help of large-scale power plants and efficient transmission networks, electricity powers modern industrial society for transportation, heating, lighting, communications and so on. Electricity is at everyone's disposal at a reasonable price.
Cloud computing shares many similarities with electricity. By connecting end-users via the Internet to data centers, where powerful computing hardware is located, cloud computing makes computation available to everyone. The core concept of cloud computing is resource sharing. To harvest the computing power, cloud computing is also an exercise in operations research for computing resource optimization. Such technology enables the processing of massively parallel computations using shared computing resources. The computing resources usually consist of large numbers of networked computing nodes. The word "cloud" is used to depict such networked computing resources.
The formal definition of cloud computing, given by the National Institute of Standards and Technology (NIST) of the U.S. Department of Commerce, is:
“a model for enabling ubiquitous, convenient, on-demand network access to a
shared pool of configurable computing resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly provisioned and released with
minimal management effort or service provider interaction."
Mell and Grance (2009), NIST
The following five essential characteristics differentiate cloud computing from other computing solutions, such as on-premise servers:
• On-demand self-service: a consumer can provision computing capacity as needed without interaction with the service provider.
• Broad network access: computing resources are available to the consumer through
the network and can be accessed from mobile phones, tablets, laptops and workstations.
• Resource pooling: Resources are dynamically assigned according to customers' needs.
• Rapid elasticity: Capacities can be reconfigured automatically to scale rapidly in
response to the changing demand.
• Measured service: The resource utilization is automatically controlled and optimized
by the cloud systems. The utilization is monitored, measured and reported.
1.1. Why cloud computing and why now?
The first reason is the increasing computing demand from industry, especially from the financial industry. Joseph, Conway, Dekate, and Cohen (2014) from International Data Corporation (IDC) reported the demand for high-performance computing (HPC) from 13 sectors. They predict an 8.7% yearly growth in HPC spending in the economics/financial sector from 2013 to 2018, which is among the top 3 of all 13 sectors studied, as shown in Fig. 1. The second reason is the pervasiveness of cloud computing. Cloud computing has deep roots dating back to utility computing in the 1960s:
“If computers of the kind I have advocated become the computers of the
future, then computing may someday be organized as a public utility just as the
telephone system is a public utility... The computer utility could become the
basis of a new and important industry.”
John McCarthy at MIT Centennial in 1961
The technology has developed since then. To provide an overview of these developments, Fig. 2 shows the advances in cloud-computing-related technology alongside the innovations in financial engineering. While the innovation of ever more complex models in financial engineering increases the demand for high-performance computing technology, the supply of high-performance computing technology increases with the development of technologies such as cloud computing. There is also evidence of increasing public awareness of cloud computing: Fig. 3 shows an increasing search trend for cloud computing in Google Trends.
Fig. 1. HPC Spending by Sector, 2013 vs. 2018. Data Source: IDC 2014
[Fig. 2 timeline content: financial engineering milestones — 1900 Theory of Speculation (Louis Bachelier); 1952 Portfolio Selection (Harry Markowitz); 1973 Black-Scholes-Merton (F. Black, M. Scholes, R. Merton); 1990-2000 stochastic volatility and local volatility models; 2000- jump diffusion models, Levy models, etc.; 2013 Basel III. Computing milestones — large-scale mainframe computers; time-sharing services and the IBM VM operating system; virtual private networks (VPN); 2006 Amazon introduces Elastic Compute Cloud; 2008 Microsoft Azure.]
Fig. 2. History of Financial Engineering and Cloud Computing
Fig. 3. Google search trend of cloud computing (since 2004)
The development and commercialization of cloud computing have been significantly boosted by increasing computation demands in the real world. According to Gartner's report (Smith, 2008):
“By 2012, 80 percent of Fortune 1000 enterprises will pay for some cloud com-
puting service and 30 percent of them will pay for cloud computing infrastructure.
Through 2010, more than 80 percent of enterprise use of cloud computing will
be devoted to very large data queries, short-term massively parallel workloads,
or IT use by startups with little to no IT infrastructure.”
Modern commercial cloud computing has also enabled a new level of industrial computing practice, especially in the financial industry, where the computing need is massive. Our focus is on how to utilize the enormous computing resources of cloud computing to address the massive computing challenges posed by the financial industry.
We start by introducing parallel computing problems and massively parallel computing tasks in the finance industry.
We then compare cloud computing with alternative solutions from the following aspects:
• Performance;
• Cost;
• Elasticity;
To improve understanding of the problems users may face in practice and how they can be solved, we also walk readers through a complete implementation procedure in the financial industry, with case studies and an implementation of the Techila® Middleware Solution.
2. Background
2.1. The taxonomy of parallel computing
Computer processors process instructions sequentially, so traditional computing problems are serial problems by design. The advent of multiprocessors has created a new type of computing problem: how to utilize the parallel structure?
A parallel computing problem, in contrast to a serial computing problem, is a computing problem that can be divided into sub-problems to be processed simultaneously.
Based on the dependency structure of the sub-problems, parallel problems can be further classified into embarrassingly parallel and non-embarrassingly parallel computing problems. If the processing of each sub-problem is independent of the other sub-problems, the problem is embarrassingly parallel; otherwise it is non-embarrassingly parallel.
Fig. 4 illustrates the structure of embarrassingly parallel and non-embarrassingly parallel problems: there is no communication between jobs in the embarrassingly parallel case (Fig. 4(a)), while communication is required in the non-embarrassingly parallel case (Fig. 4(b)).
By the nature of the underlying problem, a parallel problem can also be classified as data-parallel or task-parallel. While data parallelism focuses on distributing data across different processors, task parallelism focuses on distributing execution processes (subtasks) to different processors.
Another important aspect of parallel computing is whether the problem is scalable. A scalable problem has either a scalable problem size or scalable parallelism: either the solution time decreases with increasing parallelism, or the quality of the solution increases with the problem size. The elasticity of the computing architecture is key to the successful processing of scalable problems.
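One common way to quantify the first kind of scalability is Amdahl's law, which is not discussed in the original text but is the standard model here: the speedup from adding workers is limited by the fraction of work that must remain serial. A minimal Python sketch, with function and parameter names of our own choosing:

```python
def amdahl_speedup(serial_fraction, n_workers):
    """Idealized speedup of a program in which `serial_fraction` of the
    work cannot be parallelized and the rest is spread perfectly
    across `n_workers` processors (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)
```

For a job that is 95% parallelizable, 64 workers yield a speedup of only about 15x, which is why elasticity pays off most for problems whose serial fraction is small.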
Here we provide two examples from the finance industry:
(a) Embarrassingly parallel computing (b) Non-embarrassingly parallel computing
Fig. 4. Parallel Computing Structure
Example 2.1 (Monte Carlo Option Pricing). Monte Carlo simulation is a typical
embarrassingly parallel and task-parallel computing problem.
By the fundamental theorem of arbitrage pricing, the option price is equal to the expected payoff V discounted by a discount factor D. The expectation can be evaluated via the Monte Carlo method. The Monte Carlo estimator of the option price is given by:
$$C_0 = D \, \frac{1}{N} \sum_{\omega \in \text{sample set}} V(\omega)$$

where N is the number of sample paths.
1. The simulation of each price path is independent of the other paths. Thus, it is easy to process the simulation of different paths in parallel on different computing nodes;
2. It is task-parallel in the sense that the simulation of each path is a small task. However, it is not data-parallel, since there is no data set to be distributed to different computing nodes.
3. There is benefit in scaling the computation: the accuracy of the price estimator improves as N increases (the error converges at rate O(1/√N)).
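The estimator above can be sketched in a few lines. The following Python snippet is our illustration, not from the report; the Black-Scholes dynamics and all parameter values are assumptions. It prices a European call by simulating N independent terminal prices — exactly the per-path independence that makes the problem embarrassingly parallel:

```python
import math
import random

def mc_european_call(S0, K, r, sigma, T, n_paths, seed=0):
    """Monte Carlo estimator C0 = D * (1/N) * sum of payoffs V(omega),
    with D = exp(-r*T) and geometric Brownian motion paths."""
    rng = random.Random(seed)
    discount = math.exp(-r * T)            # discount factor D
    payoff_sum = 0.0
    for _ in range(n_paths):               # each path is independent
        z = rng.gauss(0.0, 1.0)
        ST = S0 * math.exp((r - 0.5 * sigma ** 2) * T
                           + sigma * math.sqrt(T) * z)
        payoff_sum += max(ST - K, 0.0)     # payoff V(omega)
    return discount * payoff_sum / n_paths
```

Because the loop body has no cross-iteration dependencies, the N paths can be split into jobs sent to different workers, with only the per-job payoff sums aggregated at the end.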
To illustrate the difference between task-parallel and data-parallel, we use the following
example.
Example 2.2 (Backtesting an Investment Strategy). Depending on how you implement the computation tasks, backtesting can be either task-parallel or data-parallel.
Suppose you need to backtest a basket of different investment strategies to identify the optimal strategies. The processing of each investment strategy is independent of the others and can be run simultaneously. Distributing the processing of different strategies to different computing nodes is an embarrassingly parallel and task-parallel implementation.
Input: A set of investment strategies, historical data sample
Output: PnL, portfolio attribution, risk exposures, etc.
for i ← 1 to number of strategies do
    PnL[i] ← ProfitAndLoss(strategy i, data sample);
    PA[i] ← PortfolioAttribution(PnL[i]);
    RE[i] ← RiskExposure(PnL[i]);
end
Algorithm 1: Task-Parallel Backtesting
Backtesting of a single strategy can also be implemented in a data-parallel way. By generating sub-samples from the test data set (for example by bootstrapping), the strategy can be processed on different sub-samples simultaneously. The results on the different sub-samples are then aggregated to generate the performance and risk report of the strategy.
Input: Investment strategy, sub data samples
Output: PnL, portfolio attribution, risk exposures, etc.
for i ← 1 to number of data samples do
    PnL[i] ← ProfitAndLoss(strategy, data sample i);
end
PnL_total ← Aggregate(PnL)
Algorithm 2: Data-Parallel Backtesting
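Both pseudocode patterns reduce to a parallel map. The following Python sketch is our illustration: the toy PnL function and the representation of a strategy as a position-sizing function are assumptions, and a thread pool stands in for remote workers:

```python
from concurrent.futures import ThreadPoolExecutor

def profit_and_loss(strategy, sample):
    """Toy PnL: apply the strategy's position (a function of the
    period return) to each return in the sample and sum."""
    return sum(strategy(r) * r for r in sample)

def backtest_task_parallel(strategies, sample, workers=4):
    """Task-parallel: one job per strategy, same data everywhere."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: profit_and_loss(s, sample),
                             strategies))

def backtest_data_parallel(strategy, subsamples, workers=4):
    """Data-parallel: one job per sub-sample, results aggregated."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda sub: profit_and_loss(strategy, sub),
                                subsamples))
    return sum(partial)  # aggregate(PnL)
```

The only structural difference between the two variants is what gets distributed: strategies in the first, data sub-samples in the second.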
2.2. Glossary
• Computing instance refers to a (virtual) server instance that is linked to a computing network to provide computing resources. To offer flexibility to their customers, cloud vendors offer different types of nodes that comprise various combinations of CPU, memory, storage and networking capacity.1
• Data Center A data center comprises a large number of computing nodes, the network connecting these nodes, and the necessary facilities to house the computer systems.
• Server-Worker Nodes The server-worker pattern is a typical mechanism for coordinating computation between different computing nodes. The server nodes assign computing tasks to worker nodes and collect results from them. Worker nodes receive instructions from server nodes, execute the computations and send the results back to the server.
• Middleware Middleware is software that "glues" software applications to computing hardware. In cloud computing, middleware is used to enable communication and the management of data.
• Job Scheduler and Resource Manager Software that optimizes the usage of computing resources based on the resources available and job priority. Commercial solutions usually package job scheduling and resource management with middleware.
• Virtualization Using computer resources to imitate other computer resources. With virtualization, users are not locked into specific operating systems, CPU architectures, etc. Thus, middleware and virtualization are particularly important to ensure the on-demand self-service of cloud computing.
• Cloud Bursting Cloud computing offers on-demand service. Cloud bursting refers to the dynamic deployment of an application into a cloud to handle peaks in demand.
• Elastic Computing Elastic computing is a computing service which has the ability
to scale resources to meet requirements.
• Public Cloud A public cloud is a cloud computing service that is available to the public and can be accessed through the Internet.
• Private Cloud A private cloud, in contrast to a public cloud, is not available to the public. The computing resources are dedicated to select users.
• Hybrid Cloud A hybrid cloud is a cloud computing service that combines different types of services, for example public and private. A hybrid cloud combines public and private clouds and allows workloads to move between them. This flexibility lets users optimize their allocations to reduce cost while still having direct control of their environments.
• Wall Clock Time (WCT) Wall clock time is the human perception of the passage
of time from the start to the completion of a task.
• CPU Time: The amount of time for which a central processing unit (CPU) was used
for processing instructions of a computer program or operating system, as opposed to,
1 We will provide a list of different computing nodes in Section 5.
for example, waiting for input/output (I/O) operations or entering low-power (idle)
mode.
• Workload In cloud computing, workload is measured by the amount of CPU time
and memory consumption.
• CPU Efficiency CPU efficiency is measured as the CPU time used for computation divided by the sum of the CPU time and the I/O time used for data transfer. CPU efficiency thus measures the overhead of parallelizing a computation: a low CPU efficiency generally indicates a high overhead.
• Acceleration Factor The acceleration factor is measured as the wall clock time of running the program locally on the end user's computer divided by the wall clock time of running it on the cloud. In the ideal case, the acceleration factor is linear in the number of cores used for computation.
• Total Cost of Ownership (TCO) TCO measures both the direct and indirect costs of deploying a solution. For cloud computing and alternative computing solutions, TCO includes the cost of hardware, software, operating expenses (such as infrastructure, electricity, outage costs, etc.) and long-term expenses (such as replacement, upgrade and scalability expenses, decommissioning, etc.).
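The last two performance metrics above are simple ratios; spelled out as code (a sketch of our own, with hypothetical function names):

```python
def cpu_efficiency(cpu_time, io_time):
    """CPU time used for computation divided by the sum of CPU time
    and I/O time; low values indicate high parallelization overhead."""
    return cpu_time / (cpu_time + io_time)

def acceleration_factor(local_wct, cloud_wct):
    """Local wall clock time divided by cloud wall clock time; in the
    ideal case this is linear in the number of cores."""
    return local_wct / cloud_wct
```

For instance, a run with 44 minutes of CPU time and 11 minutes of I/O time has a CPU efficiency of 0.8, and a job that takes 249 s locally but 55 s on the cloud has an acceleration factor of about 4.5 (numbers echoing the option-pricing benchmark quoted in Section 3.1).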
3. Financial Applications of Cloud Computing
3.1. Derivative Valuation and Pricing
One of the core businesses of the front office is derivative pricing. Even though the market for exotic derivatives has shrunk since the crisis of 2008-2009, the "exoticization" of vanilla products has increased the complexity of the valuation process. Moreover, numerical methods are needed with certain models, even for vanilla options, such as non-affine variance models, infinite-activity jump models, etc.
The valuation process usually requires high-performance numerical solutions, as well as a high-performance technology platform. The size of the book and the time-criticality require a platform that is suitable both for handling large data and for massive processing. A recent paper by Kanniainen, Lin, and Yang (2014) evaluates the computational performance of the Monte Carlo method for option pricing. With the aid of cloud computing and the Techila Middleware Solution2, the time required to value option contracts using Monte Carlo methods is comparable with other numerical methods:
2 For more information, please visit: http://www.techilatechnologies.com
... valuate once the 32,729 options in Sample A using the Heston-Nandi
(HN) model was 25 s with the HN quasi-closed-form solution and 249 s
with the Monte-Carlo methods. Moreover, with cloud-computing with the
Techila middleware on 173 Azure extra small virtual machines (173 × 1
GHz CPU, 768 MB RAM) and the task divided into 775 jobs according to
775 sample dates, the overall wall clock time was 55 s and the total CPU
time 44 min and 33 s. The Monte-Carlo running-times were approximately
the same for GJR and NGARCH. Substantially shorter wall clock times can
be recorded if more workers (virtual machines) are available on the cloud
or if the workers are larger (more efficient). Then the wall clock time differs
very little between the HN model with the quasi-closed-form solution on a
local computer and the HN model or some other GARCH model (such as
GJR or NGARCH) with the Monte-Carlo methods on a cloud computing
platform. Consequently, with modern acceleration technologies closed-form
solutions are no longer a critical requirement for option valuation.
Another key post-crisis trend is the proliferation of XVAs (Funding Valuation Adjustment (FVA), Credit Valuation Adjustment (CVA), etc.). The books of XVAs are usually huge, and the time constraints on the valuation are tight. An industry success story is the award-winning in-house system of Danske Bank: the combination of advanced numerical techniques and a modern computing platform allows real-time pricing of derivative counterparty risk.
3.2. Risk Management and Reporting
The financial crisis also reshaped the business of risk management in the financial industry. The implementation of Solvency II for insurance and Basel III for banking poses new challenges to financial computing.
First, the computations are highly resource-intensive. Second, the computation needs are dynamic rather than static. Risk reports (Solvency II, Basel III, etc.) are required at monthly or quarterly frequency, so computation needs are periodic, reaching their peak before the reporting deadline. Building and managing a dedicated data center to meet the computation need at its peak significantly increases cost; on the other hand, most of the computing resources will be wasted during the relatively less intensive periods.
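The peak-versus-average argument can be made concrete with a toy calculation. The sketch below is ours, not from the report; the demand profile and the per-core-hour price are made-up illustrative numbers:

```python
def provisioning_costs(hourly_core_demand, price_per_core_hour):
    """Cost of a dedicated cluster sized for peak demand versus
    paying on demand only for the capacity actually used."""
    peak = max(hourly_core_demand)
    dedicated = peak * len(hourly_core_demand) * price_per_core_hour
    on_demand = sum(hourly_core_demand) * price_per_core_hour
    return dedicated, on_demand

# A reporting-deadline spike: 1000 cores for 2 hours, idle otherwise.
demand = [0] * 22 + [1000, 1000]
dedicated, on_demand = provisioning_costs(demand, 0.05)
```

Here a cluster sized for the peak costs 1200 for the day while on-demand usage costs 100; the gap is exactly the waste incurred during the less intensive periods.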
Cloud computing has the advantage of being scalable, which allows it to meet the dynamic computation needs of the financial industry. Using Google search volume, we find an interesting pattern: the search volume for cloud computing increased rapidly after 2009, as shown in Fig. 3, and so have the search volumes for Solvency II and Basel III. We are not suggesting any causality between the increasing attention to cloud computing and that to risk regulation. However, such trends show that cloud computing gained popularity at the right time to serve as a potential solution for regulation-oriented computation needs.
The financial industry has started to embrace cloud solutions, especially when they are integrated to support the need for effective and timely risk management. IBM's survey on the implementation of cloud computing for Solvency II in the insurance industry points to a trend of adopting cloud computing as part of the implementation strategy for risk management. Of the 19 firms surveyed, 27% have either successfully implemented cloud solutions or are in the process of implementing them, and another 23.8% have started considering cloud solutions.
One of the key questions is whether a cloud solution is cost-efficient. Little (2011) from Moody's Analytics analyzes the potential use of the cloud for economic scenario generation and Solvency II in general, and concludes:
“Building a Solvency II platform on the cloud is a realistic and cost-effective
option, especially when scenario generation and Asset Liability Modeling are
both performed on a cloud.”
3.3. Quantitative Trading
The lowering barriers to market participation drive the development of more complicated trading strategies as well as a technology race. Quantitative trading, especially high-frequency trading, requires quick time-to-production as well as quick time-to-market.
Fast R&D of trading strategies and timely backtesting significantly shorten time-to-market. Firms that take advantage of cloud computing are generating alpha even before their trading strategies are actually deployed in the market.
The quick prototyping and backtesting of strategies require close-to-data computing as well as adaptability to heterogeneous developer tools, such as different end-user applications, data storage types and programming languages. Adaptability to different end-user software, in turn, has been one of the key features of mature cloud computing platforms.
3.4. Credit Scoring
Cloud computing is arguably the solution for big data problems in finance. One typical big data problem in the finance industry is credit scoring.
Credit scoring is the procedure by which lenders, such as banks and credit card companies, evaluate the potential risk posed by lending money to consumers. It has been widely treated as a classification problem in the machine learning literature (Hand and Henley (1997); West (2000); Baesens, Van Gestel, Viaene, Stepanova, Suykens, and Vanthienen (2003), and many others).
The large number of consumers and the variety of credit report formats create a big data problem. Solving the classification problem over a massive data set of consumer credit histories requires an efficient data storage and processing system.
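As a concrete (toy) instance of the classification view, a logistic scoring model maps an applicant's features to a default probability. The sketch below is our illustration; the function names, features and weights are made up and not taken from any cited paper:

```python
import math

def logistic_score(features, weights, bias):
    """Probability-of-default estimate from a logistic model:
    sigmoid of a weighted sum of applicant features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def classify(features, weights, bias, cutoff=0.5):
    """Accept/reject decision at a score cutoff."""
    if logistic_score(features, weights, bias) >= cutoff:
        return "reject"
    return "accept"
```

At scale, scoring is the embarrassingly parallel part (each applicant is independent), while training over the full credit history data set is the data-intensive part that demands efficient storage and processing.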
In summary, modern financial computing requires:
• Adaptability to heterogeneous end-user software;
• Processing of large data and close-to-data computing;
• Massive computing;
• Data security.
In the following chapters, we introduce in detail how modern cloud computing can help to solve these problems. There are also innovative cloud-supported business models based on the concept of sharing, such as cooperative quantitative strategy development platforms. As we focus on the massive-computation side of cloud computing for financial engineering, we refer readers who are interested in those platforms to Internet resources.
4. The Nature of Challenges
Integrating massive computing power into existing IT systems may face several challenges, and the finance industry poses certain specific requirements for cloud computing.
The system needs to be multi-tenant. It must support multiple users accessing the computing resources at the same time. Meanwhile, the system has to be smart enough to allocate computing resources based on the priority and needs of the tasks. These requirements arise from the heterogeneous and dynamic nature of financial computing: computations from different desks have different priorities and uneven demands for resources.
Compliance requirements and cybersecurity. The finance industry operates with public data/information as well as business-critical private data/information. Due to compliance requirements, a hybrid system needs to keep computation on sensitive data in-house while allowing the use of external computing resources for non-confidential data.
A unified platform for quality assurance. Large financial organizations have teams supporting local business operations across the world. A unified platform makes life easier for quantitative support and model validation/review teams by guaranteeing the consistency and coherency of data and models for users.
IT legacy. Maintaining a massive legacy codebase is a huge task for IT departments, and any change that requires a complete rewrite of the code is a nightmare. Thus, IT systems must be highly adaptable so that they integrate effortlessly with existing libraries.
5. Implementation and Practices
In the previous section, we reviewed the technical and non-technical challenges in integrating cloud computing into the finance industry. Luckily, with the development of commercial cloud computing services, the complexity of cloud computing is hidden behind user-friendly interfaces.
Aiming to ease the use of cloud computing, many software packages and computing frameworks have been developed during the past decade. To mention a few, popular computing frameworks include Hadoop+MapReduce, Apache Spark, etc. Middleware solutions such as Techila Middleware and Sun Grid Engine also help commercial users distribute computing tasks to computing nodes.
In this section, we use the example of Techila Middleware to show how the challenges of Section 4 can be handled by a cloud computing solution. We then walk readers through the procedure of implementing cloud computing in practice.
5.1. Implementation Example: Techila Middleware with MATLAB
The Techila Middleware Solution, developed by Techila Technologies Ltd3, is a commercial software solution aiming to provide user-friendly integration of cloud computing. The service structure of Techila® is shown in Fig. 5. The specific design of the service structure allows accessing both on-premise and external computing resources, ensuring that compliance requirements are met when necessary.
The system is multi-tenant: users can assign different priorities to computing jobs sent to the Techila system through a secured gateway. The jobs are scheduled according to the availability of resources and their priorities.
The solution hides the complexity of integration with heterogeneous end-user software behind a user-friendly interface. To illustrate this, we provide an example of using Techila with MATLAB. For more information about the programming languages and software that Techila supports, please refer to the company's website at www.techilatechnologies.com.
Before proceeding to use the Techila solution, the end user needs to perform some minor configuration. Readers are referred to Techila's online documents4 for more details.
3 For more information, please visit: http://www.techilatechnologies.com
4 Techila Online Documents
Fig. 5. Techila High Level Architecture
5.1.1. Easy Implementation: Code Example
When Techila is successfully installed, using cloud computing with existing code developed in MATLAB is straightforward.
Suppose the end user has code that contains a for-loop structure:
function result = local_loops(loops)

result = zeros(1, loops);
for counter = 1:loops
    result(counter) = counter * counter;
end
To parallelize the computation inside the for loop on cloud computing resources, the end
user only needs to make a minor change to the code:
function result = run_loops_dist(loops)
% the only change is from for-end to cloudfor-cloudend
result = zeros(1, loops);
cloudfor counter = 1:loops
    result(counter) = counter * counter;
cloudend
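For readers working outside MATLAB, the same for-to-parallel-for transformation can be expressed in other stacks. The Python sketch below is our analogy, not Techila's API; a thread pool from the standard library stands in for the cloud workers:

```python
from concurrent.futures import ThreadPoolExecutor

def square(counter):
    return counter * counter

def local_loops(loops):
    """Serial version: the original for-loop."""
    return [square(c) for c in range(1, loops + 1)]

def run_loops_dist(loops, workers=4):
    """Parallel version: the loop body is unchanged; only the loop
    construct becomes a parallel map, mirroring for -> cloudfor."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, range(1, loops + 1)))
```

As in the MATLAB example, the loop body stays untouched; only the looping construct changes, which is what makes the migration low-effort.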
5.2. Computation Needs
Before making a decision to adopt any high performance computing solution, a key step
is to understand your computational need and usage pattern. The best choice of solution
depends on the answers to the following questions:
• Do you have a computational bottleneck?
• Where is the computational bottleneck?
Do you have a computational bottleneck?
The question may seem trivial at first sight. A computational bottleneck exists when the current computing resources cannot process the computing tasks within the given time constraints. In many cases, however, the computational bottleneck arises from another dimension: the time required to upgrade computing resources to meet the increased demand.
The point we would like to emphasize is that the planning of computing needs to be forward-looking. While quants, researchers and developers are aware of the computational bottleneck, the decision whether to expand current IT resources usually rests with the IT department, and the procedure may take some time. Forward-looking planning is therefore critical to ensure an efficient and effective response to computational needs.
The scalability and elasticity of cloud computing may offer an alternative solution by providing computing resources on demand.
Where is your computational bottleneck?
A typical computational bottleneck in the finance industry poses one of the following types of challenges:
1. Massive computational time exceeds time constraints;
2. Massive memory consumption exceeds limited memory and storage;
3. Dynamic usage pattern meets non-scalable computing resource.
There are several solutions to a type 1 challenge. In a production scenario, e.g. when implementing a high-frequency trading algorithm, hardware accelerators such as FPGAs and GPUs may be better alternatives to cloud computing5. The reasons are:
1. Execution time is critical, and hardware acceleration may be the only way to accelerate the algorithm;
2. No frequent reconfiguration is needed, so the cost of configuring and adapting the algorithm for hardware acceleration is less than the profit gained from shorter execution time.
In an R&D scenario, by contrast, time-to-market is more important. The easy implementation and massively parallel capacity provided by cloud computing enable researchers and developers to quickly prototype and backtest algorithms and models.
Cloud computing is also an economical solution to type 2 and type 3 challenges. Distributed storage and memory built from relatively cheap hardware, compared with expensive local instances with adequate memory and storage, significantly reduce the required hardware investment. The elasticity of cloud computing provides users with on-demand service. In a type 3 challenge, an investment in computer instances sized to process computing demands at their peak is a waste of resources during periods when computing demands are less intensive.
5Although in practice, there are firms implementing their algorithms in the cloud to benefit from lower latency in connection to exchanges when colocation is not possible or too costly to implement.
5.3. Solution Selection
Both performance and cost should be taken into consideration when choosing a cloud vendor. Amazon Elastic Compute Cloud (AWS), Google Compute Engine (GCE) and Microsoft Azure (Azure) are three popular cloud computing platforms.
Vendors provide a variety of instance types. For example, the cloud instances used in
a recent benchmark by Techila include 4 different instances from AWS and Azure and 2
instances from GCE as listed in Table A.1 of Techila (2015). These instances are optimized
for different purposes: CPU, memory, I/O, cost, storage, etc.
Vendors adopt different pricing models, and the cost of using cloud computing is affected by the pricing model. Table 1 provides an overview of the pricing models adopted by vendors for data centers based in Europe. The table reports the price per instance (PPI), the price per CPU core (PPC) and the billing granularity for AWS, Azure and GCE6. Generally speaking, a minute-based (or even second-based) pricing model offers more flexibility for the utilization of cloud computing, while an hour-based pricing model makes distributing computations over short intervals (less than 1 hour) an unwise choice. However, the difference among pricing models is relatively insignificant when the computation is massive.
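To make the effect of billing granularity concrete, the following sketch (rates and runtimes are illustrative, not vendor quotes) compares the billed cost of a short job under hour-based and minute-based billing:

```python
import math

def billed_cost(ppi_usd_per_hour, runtime_minutes, granularity):
    """Billed cost of one instance, rounding the runtime up to the
    vendor's billing granularity (illustrative model, not a price quote)."""
    if granularity == "hour":
        return math.ceil(runtime_minutes / 60) * ppi_usd_per_hour
    if granularity == "minute":
        return math.ceil(runtime_minutes) * ppi_usd_per_hour / 60
    raise ValueError("unknown granularity: " + granularity)

# A 10-minute job: hour-based billing charges the full hour,
# minute-based billing charges only the minutes used.
print(billed_cost(1.2, 10, "hour"))    # 1.2 USD
print(billed_cost(1.2, 10, "minute"))  # 0.2 USD
```

For long-running, massive computations the two models converge, which matches the observation above that the pricing model matters most for short jobs.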
Cloud Platform  Instance Type           PPI (USD/h)  PPC (USD/h)  Billing Granularity
AWS             c4.4xlarge (win)        1.667        0.1041875    Hour
AWS             c4.4xlarge (linux)      1.003        0.0626875    Hour
AWS             c3.8xlarge (win)        3.008        0.094        Hour
AWS             c3.8xlarge (linux)      1.912        0.06         Hour
Azure           A11 (win)               3.5          0.219        Minute
Azure           A11 (linux)             2.39         0.149        Minute
Azure           D14 (win)               2.372        0.148        Minute
Azure           D14 (linux)             1.542        0.096        Minute
GCE             n1-standard-16 (win)    1.52         0.095        Minute
GCE             n1-standard-16 (linux)  0.88         0.055        Minute

Table 1: Pricing Model and Cost
Depending on the vendor, instance type and operating system, instances require varying amounts of time for configuration and deployment. According to Techila's benchmark report, Techila (2014), the differences are significant, as shown in Fig. 6 and Fig. 7. These are non-negligible factors for the elasticity of cloud computing. As a rule of thumb, the operating system, instance type and vendor should be chosen according to the end-user's version of software applications.
6GCE machine types are charged a minimum of 10 minutes. After 10 minutes, instances are charged in 1 minute increments, rounded up to the nearest minute.
Fig. 6. Configuration Time
Fig. 7. Deployment Time
We cite the test results from two recent Techila benchmark reports, Techila (2014, 2015), to give readers an impression of the cost of utilizing cloud computing. The test cases simulate real-world financial applications in many areas, including portfolio analytics, machine learning, option pricing, backtesting and model calibration. However, readers should be aware of the different nature of these applications, i.e. whether they are I/O-intensive, memory-intensive or CPU-intensive.
Fig. 8 summarizes the cost of computing for different vendors and instances versus the performance of computing.
Fig. 8. Cost vs. Performance
Table 2 provides the cost in the portfolio simulation case7. The simplified cost gives the cost per unit of computation after correcting for the difference in pricing models. For portfolio simulation, the simplified cost ranges from 0.58 USD (GCE with the n1-standard-16 instance on the Linux Debian 7 operating system) to 1.99 USD (Azure A11 on Windows Server 2012 R2). The difference is significant (about 4 times). However, if we take the pricing model into consideration, the real costs (that is, the billing from the vendor) differ even more: the cost of using AWS is more than 10 times the cost of using Azure or GCE, because AWS uses an hour-based pricing model. To reduce the cost of computation with AWS, users should allocate their computation in units of hours.
The report provides valuable insight into the effect of pricing models on the cost of cloud computing. Together with the benchmarks on instance performance, this should give readers some guidance on how to choose a cloud vendor.
7For more information on other user scenarios, please refer to Techila's latest "Cloud HPC in Finance" benchmark, Techila (2015).
Table 2: Cost of Cloud Computing. Case: Portfolio Simulation

Cloud Platform  Instance Type           PPC (USD/hour)  Cost (USD)  Simplified (USD)
AWS             c4.4xlarge (win)        0.104           26.672      0.926
AWS             c4.4xlarge (linux)      0.063           16.048      0.566
AWS             c3.8xlarge (win)        0.094           24.064      1.324
AWS             c3.8xlarge (linux)      0.060           15.296      0.701
Azure           A11 (win)               0.219           2.8         1.991
Azure           A11 (linux)             0.149           1.275       1.073
Azure           D14 (win)               0.148           1.898       1.550
Azure           D14 (linux)             0.096           1.234       0.829
GCE             n1-standard-16 (win)    0.095           4.053       1.486
GCE             n1-standard-16 (linux)  0.055           2.347       0.583
5.4. Algorithm Design
Designing an algorithm well suited to the specific problem can significantly boost the performance and the benefit obtained from cloud computing.
We use the following simple example to illustrate how algorithm design can change performance.
Example 5.1 (Distributed Matrix Multiplication). Let $M$ be a matrix of size $d \times n$ and $N$ a matrix of size $n \times d$. The matrix multiplication of $M$ and $N$, $G = MN$, can be done via two schemes.

• Scheme 1: Inner Product. The entry in the $i$-th row and $j$-th column of matrix $G$ is $G_{i,j} = \sum_{r=1}^{n} M_{i,r} N_{r,j}$.

% Scheme 1: via inner product
cloudfor i = 1:d
    cloudfor j = 1:d
        G(i,j) = M(i,:) * N(:,j);
    cloudend
cloudend

• Scheme 2: Outer Product. Alternatively, we can return the matrix $G$ as the sum of the outer products between the corresponding columns of $M$ and rows of $N$: $G = \sum_{r=1}^{n} M_{:,r} N_{r,:}$.

% Scheme 2: via outer product
cloudfor r = 1:n
    %cf:sum=G
    G = M(:,r) * N(r,:);
cloudend

The %cf:sum=G command is used to sum the return values from the worker nodes.
The two schemes differ in both storage complexity and computational complexity.
1. To send the data, scheme 1 requires distributed storage of O(nd²) while scheme 2 requires O(nd).
2. To return the results, scheme 1 requires local storage of O(1) per job and O(d²) in total, while scheme 2 requires local storage of O(d²) per job and O(nd²) in total.
3. Both schemes require a distributed computation of O(nd²). Scheme 1 requires a local computation of O(n) on each worker node while scheme 2 requires O(d²).
Depending on the relative values of n and d, one scheme outperforms the other. In terms of computation, the outer product scheme parallelizes the computation along the n dimension and is thus preferred when n is large, while the inner product scheme is preferred when d is large.
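The outer-product scheme translates directly into ordinary code. The plain-Python sketch below (function names are ours, and a sequential loop stands in for the cloudfor distribution of jobs) computes one partial sum of outer products per "job" and then performs the %cf:sum-style reduction:

```python
def outer_product_job(M, N, r_indices):
    """One worker's job: the partial sum of outer products
    M[:, r] * N[r, :] over its assigned slice of r."""
    d = len(M)
    G = [[0.0] * d for _ in range(d)]
    for r in r_indices:
        for i in range(d):
            for j in range(d):
                G[i][j] += M[i][r] * N[r][j]
    return G

def distributed_matmul(M, N, num_jobs=2):
    """Split r = 0..n-1 across jobs, run each job, then sum the partial
    results (the reduction that %cf:sum=G performs on the server)."""
    n, d = len(N), len(M)
    chunks = [range(k, n, num_jobs) for k in range(num_jobs)]
    partials = [outer_product_job(M, N, c) for c in chunks]  # one per worker
    return [[sum(p[i][j] for p in partials) for j in range(d)]
            for i in range(d)]

print(distributed_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19.0, 22.0], [43.0, 50.0]]
```

Each chunk touches only O(nd/num_jobs) input data but returns a full d×d partial result, matching the storage analysis above.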
5.5. Evaluation and Optimization
One of the characteristics of cloud computing is measured service: users can review the automatically generated reports to evaluate their usage of the cloud and optimize it.
Several rules of thumb:
1. Minimize data transfer and communication between jobs;
2. Distribute computation uniformly across nodes;
3. Optimize based on the number of jobs and the job size.
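Rule 3 can be made concrete with a toy timing model; the fixed per-job overhead and all numbers below are illustrative assumptions, not Techila measurements:

```python
import math

def wall_clock(total_steps, steps_per_job, workers,
               step_time, per_job_overhead):
    """Toy model: jobs run in rounds of `workers` parallel jobs; each job
    pays a fixed scheduling/transfer overhead plus its compute time."""
    jobs = math.ceil(total_steps / steps_per_job)
    job_time = steps_per_job * step_time + per_job_overhead
    rounds = math.ceil(jobs / workers)
    return rounds * job_time

# 220 steps on 20 workers, 10 s per step, 5 s overhead per job:
print(wall_clock(220, 1, 20, 10, 5))   # 165: many tiny jobs, overhead-heavy
print(wall_clock(220, 11, 20, 10, 5))  # 115: one larger job per worker
```

In this model, larger jobs amortize the per-job overhead, but jobs that are too large leave workers idle in the last round, which is exactly the trade-off the rule asks users to optimize.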
6. Case Studies
6.1. Portfolio Backtesting
In this subsection, we demonstrate how portfolio backtesting can be accelerated with distributed computing techniques, in particular the Techila Middleware solution.
Backtesting is widely used in the financial industry to estimate the performance of a trading strategy or a predictive model using historical data. Instead of gauging the performance over a forward time period, which may take many years, traders and portfolio managers can measure the effectiveness of their strategies and understand the potential risks by backtesting over a prior time period, using the datasets that are available today. Computer simulation of the strategy or model is the main part of the modern backtesting procedure and can be very time-consuming due to the computing issues that arise during the procedure. Thus it is necessary to seek acceleration using modern techniques and shorten time-to-market in the rapidly changing financial world.
6.1.1. Potential Computing Bottleneck
Backtesting requires three main components: historical datasets, a trading strategy/model and a performance measure. The following computing issues might arise for each of the components:
1. The datasets used for testing might be huge while the requested output (the performance measure) is relatively small. For example, consider a portfolio consisting of N assets. Its historical return series over the past T periods forms an N×T matrix, and the covariance matrix is of size N×N. For N = 75,000, the covariance matrix in double precision occupies about 45 GB of memory.
2. The simulation of the strategies/models can be computationally intensive. The intensity of computing increases with the complexity of the strategies/models, and complex logic branching operations can also be involved.
3. The evaluation of the performance measure can also be time-consuming. Measures that are based on a Monte Carlo approach require the simulation of thousands of paths or even more.
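Such sizes are easy to sanity-check: a dense N×N covariance matrix of doubles occupies N² × 8 bytes.

```python
def cov_matrix_gigabytes(n_assets, bytes_per_entry=8):
    """Memory needed for a dense n x n covariance matrix (GB)."""
    return n_assets ** 2 * bytes_per_entry / 1e9

print(cov_matrix_gigabytes(75_000))  # 45.0
```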
Cloud computing has a natural advantage in processing large data. In general, CPU threads have better performance than GPU threads in handling complex logic branching operations. Thus cloud computing seems to be a suitable technique for accelerating the backtesting procedure. To illustrate how to use cloud computing for backtesting, we ran experiments in the Microsoft Windows Azure cloud as well as on a local cluster. The results are presented in the following subsections.
6.1.2. Computing Environment and Architecture
By installing the Techila SDK on their computers, end users (traders/portfolio managers) can use Techila-enabled computing tools (Matlab, R, Python, Perl, C/C++, Java, FORTRAN, etc.) to access the computing resources managed by the Techila Server.
The Techila Server works as a resource manager as well as a job scheduler. Computational jobs are distributed through the Techila Server to Techila Workers, which are machines in the cloud (Azure, Google Compute Engine, Amazon EC2, etc.) or in a local cluster. When the computation on a worker node is finished, the requested results are sent back to the end-user through the Techila Server. In our experiment, we use the Techila environment on the Windows Azure cloud. The testing code is written in Matlab. The computational jobs (optimization and evaluation of each data set) are sent to the worker nodes (virtual machines in Azure).
When dealing with large data sets which exceed a single machine's capacity, there are two solutions: 1. the data sets can be stored in common storage that can be accessed by each of the workers (e.g. Blob storage on Azure); 2. under a suitable license, workers can directly access data sources such as Bloomberg, Thomson Reuters, etc.
6.1.3. Experiment Design and Test Result
The callback feature of Techila enables streaming of results as soon as a computational job finishes; it can also be used to monitor the intermediate results of a job. This enables us to update the visualization of the results whenever a job completes.
To visualize the results, we plot the time evolution of the efficient frontier over the backtesting period as a 3-D surface. We also plot the maximum Sharpe ratio portfolio as a 3-D line. In the mean-variance optimization framework of Markowitz, this portfolio is the tangency portfolio. Thus, we should expect the line to lie on the surface8, as shown in Fig. 9.
We first performed a small-scale test using 20 stocks. The results based on cloud computing are consistent with the results generated from a local run on a laptop.
Using weekly returns from 2000 to 2013, we performed several tests using different numbers of stocks and different lengths of the backtesting period.
We set the historical estimation window length to 60 weeks, and the strategy is re-estimated every 3 weeks. The weekly return data from 26-Feb-2001 to 07-Oct-2013 are thus separated into 220 windows. A straightforward way of distributing the computational load is to treat the backtesting for each window as an independent job.
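The window-splitting idea can be sketched as follows; the toy momentum "strategy" and all names are illustrative assumptions, and the sequential list comprehension stands in for one Techila job per window:

```python
def make_windows(n_obs, est_len=60, step=3):
    """Start/end indices of rolling estimation windows, re-estimated
    every `step` observations with `step` out-of-sample points held out."""
    windows = []
    start = 0
    while start + est_len + step <= n_obs:
        windows.append((start, start + est_len))
        start += step
    return windows

def backtest_window(returns, window, step=3):
    """One independent job: estimate on the window, evaluate on the next
    `step` observations (a stand-in for optimize-then-evaluate)."""
    start, end = window
    est = returns[start:end]                  # estimation sample
    hold = returns[end:end + step]            # out-of-sample evaluation
    signal = 1.0 if sum(est) > 0 else -1.0    # toy momentum rule
    return signal * sum(hold)

# 720 synthetic weekly observations yield 220 windows, as in the text.
returns = [0.01 * ((i % 7) - 3) for i in range(720)]
windows = make_windows(len(returns))
results = [backtest_window(returns, w) for w in windows]  # one job each
print(len(windows))  # 220
```

Because the jobs share no state, the list comprehension can be replaced one-for-one by a distributed loop without changing the results.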
8The visualization code is adapted from the Portfolio Demo by Bob Taylor at http://www.mathworks.com/matlabcentral/fileexchange/31290-using-matlab-to-optimize-portfolios-with-financial-toolbox
Fig. 9. Time evolution of efficient frontier
By default, Techila Middleware automatically distributes the computing project such that each job has sufficient length to reduce the overhead caused by data transfer. Users can also set the job length (iterations per job) using the job specification parameter.
We ran tests for 50, 100 and 500 stocks. When the number of stocks increases, the optimizer takes longer to find the portfolio that maximizes the Sharpe ratio; in fact, when the number of stocks is too large, the optimization problem may become ill-posed. However, the performance of the optimizer is not the concern of this report. Compared with simply setting stepsperworker=1, Techila's default setting significantly improved the CPU efficiency (CPU time/wall clock time), as shown in Table 3.
Table 3: CPU Efficiency in Portfolio Backtesting

NoS   NoJ (step = 1)  ACE (step = 1)  NoJ (Auto)  ACE (Auto)
50    220             88.14%          55          96.57%
100   220             90.49%          74          112.79%
500   220             114.13%         220         114.13%

NoS is the number of stocks. NoJ is the number of jobs. Auto refers to Techila's automated job distribution scheme. step = 1 refers to assigning 1 step to each job.
6.2. Distributed Portfolio Optimization
In this subsection, we demonstrate how a large-scale portfolio optimization problem can be solved with distributed computing techniques and a suitable algorithm design.
6.2.1. Challenges in Large Scale Portfolio Construction
Constructing an optimal portfolio consists of two steps. The first step is to form a belief about the future return distribution, which is essentially an inference and prediction problem. The second step is to find the optimal portfolio weights, which is an optimization problem dealing with the trade-off between portfolio risk and portfolio return.
On the one hand, this problem is a statistical challenge. Most portfolio optimization and risk minimization approaches require estimation of the covariance of the return series, or of its inverse. When using the sample variance as the expected variance, the estimation error can be large: to achieve a reasonable accuracy, as stated in DeMiguel, Garlappi, and Uppal (2009), an in-sample period of 3000 months is needed for a portfolio of 25 assets to beat the naive 1/N strategy. The problem becomes even more significant when the portfolio size is large. As noted in Fan, Lv, and Qi (2011), estimating the moments of a high-dimensional distribution is challenging; one crucial problem is the spurious correlation that arises with the curse of dimensionality.
On the other hand, the problem is also challenging numerically. First, when the degree of freedom is large, finding an optimum in a high-dimensional parameter space is almost impossible to achieve in reasonable time with general-purpose optimizers. Additionally, we need to take good care of the properties of the matrices in order to retain feasibility. It is also a data-intensive problem from a hardware perspective: suppose we are dealing with 75,000 assets (the data of the whole universe); the covariance matrix then has 2,812,537,500 free parameters, which means it takes more than 20 GB of memory in double precision. Last but not least, operations on a matrix of size M×N have a computational cost that grows linearly with the number of columns.
6.2.2. Algorithm Design for Large Scale Mean-Variance Optimization Problem
In the classical Markowitz mean-variance framework, the portfolio optimization problem is to minimize the variance for a given expected return $b = w^T\mu$. The optimum $w^*$ is a solution to:

\min_w\; w^T C w

s.t.

w^T\mu = b

w^T\mathbf{1}_N = 1

This optimization problem is equivalent to solving:

\min_w\; E[\,|\rho - w^T r_t|^2\,] \quad (1)

with the same restrictions, where $\rho = b$.
By replacing the expectation in equation (1) with its sample average, the problem can be considered as a least-squares regression.
Regularization methods are introduced to address the problems arising from estimation error via shrinkage, achieving stability, sparsity, or both. The regularization can be achieved by adding an $\ell_n$-penalty term $r(x)$ to the objective function:

r(x) = \lambda\,\|x\|_n \quad (2)

where $\lambda$ is a constant that scales the penalty term. When $n = 1$, the objective is a LASSO regression; when $n = 2$, it is a ridge regression.
In order to solve this penalized problem, and moreover to utilize the modern computing environment (computer clusters and clouds), we solve the problem using distributed optimization techniques, namely the alternating direction method of multipliers (ADMM) and block splitting. A detailed introduction to this optimizer can be found in Boyd, Parikh, Chu, Peleato, and Eckstein (2011).
Noticing that we can transform the constrained optimization problem into its consensus form:

\min\; \|b\mathbf{1}_T - Rw\|_2^2 + \lambda\|w\|_1 + I_C(w) \quad (3)

where $I_C$ is the indicator function,

I_C(w) = 0 \text{ if } w \in C, \qquad I_C(w) = \infty \text{ if } w \notin C \quad (4)

and $C$ is the constraint set $C = \{w \mid w^T\mu = b,\; w^T\mathbf{1}_N = 1\}$.
Here we rewrite the problem in ADMM form (denoting $B := b\mathbf{1}_T$):

w_1^{k+1} := \arg\min_w\big((1/2)\|Rw - B\|_2^2 + (\rho/2)\|w - z^k + \mu_1^k\|_2^2\big) \quad (5)

w_2^{k+1} := \Pi_C(z^k - \mu_2^k) \quad (6)

z^{k+1} := \tfrac{1}{2}\big(S_{\lambda/\rho}(w_1^{k+1} + \mu_1^k) + S_{\lambda/\rho}(w_2^{k+1} + \mu_2^k)\big) \quad (7)

\mu_1^{k+1} := \mu_1^k + w_1^{k+1} - z^{k+1} \quad (8)

\mu_2^{k+1} := \mu_2^k + w_2^{k+1} - z^{k+1} \quad (9)

where $S_{\lambda/\rho}$ denotes the soft-thresholding operator and $\Pi_C$ the Euclidean projection onto $C$. The update of $w_1$ is a Tikhonov-regularized least squares problem which has the analytical solution:

w_1^{k+1} := (R^T R + \rho I)^{-1}(R^T B + \rho(z^k - \mu_1^k))
Via block splitting, we can utilize a distributed computing environment and solve the problem in parallel on small data blocks $R_i$ and $B_i$. Block splitting supports two cases, allocation and consensus, depending on the data structure: if the number of assets is larger than the length of the return series, allocation is preferred; otherwise, consensus is the preferable alternative, which is also easier to implement since it is consistent with the previous decomposition. The block-wise update reads:

w_i^{k+1} := (R_i^T R_i + \rho I)^{-1}(R_i^T B_i + \rho(z^k - \mu^k))
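For intuition, the iteration (5)-(9) can be exercised on a toy single-machine lasso, dropping the projection step $\Pi_C$ (i.e. keeping only the $w_1$, $z$, $\mu_1$ updates). Everything below, including the plain-Python linear algebra, is an illustrative sketch rather than the distributed implementation:

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(A, b):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def soft_threshold(v, k):
    """The operator S_k applied elementwise."""
    return [max(abs(vi) - k, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def admm_lasso(R, B, lam=0.1, rho=1.0, iters=200):
    """min (1/2)||R w - B||^2 + lam ||w||_1 via the w/z/u ADMM updates."""
    m, n = len(R), len(R[0])
    RtR = [[sum(R[k][i] * R[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    for i in range(n):
        RtR[i][i] += rho                      # R^T R + rho I
    RtB = [sum(R[k][i] * B[k] for k in range(m)) for i in range(n)]
    z, u = [0.0] * n, [0.0] * n
    for _ in range(iters):
        w = solve(RtR, [RtB[i] + rho * (z[i] - u[i]) for i in range(n)])
        z = soft_threshold([w[i] + u[i] for i in range(n)], lam / rho)
        u = [u[i] + w[i] - z[i] for i in range(n)]
    return z
```

With R the 2×2 identity and B = (1, 0), the known lasso solution is w = (1 − λ, 0), which the iteration recovers; in the distributed version, only the formation of RᵀR and RᵀB changes, being computed block-wise as RᵢᵀRᵢ and RᵢᵀBᵢ on the workers.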
6.3. Distributed Particle Filter for Financial Parameter Estimation
In this subsection, we present how to use a particle filter to estimate the parameters of a jump-diffusion model and how to parallelize the computation. Part of this work is included in Yang, Lin, Luk, and Nahar (2014).
6.3.1. Jump Diffusion Model in Finance
To account for the large variations observed in financial markets, jump-diffusion models were introduced into the derivative pricing literature, for example in Bates (1996), Kou (2002) and Duffie, Pan, and Singleton (1999).
The model with stochastic volatility and jumps in both return and volatility (SVCJ) is particularly interesting. This model adds jump components to the popular Heston model. It is an affine jump-diffusion model that has a semi-closed-form solution for the option price.
Assume the joint dynamics (SVCJ) of the log-price $s_t$ and the stochastic variance $V_t$ are:

ds_t = (\mu - \tfrac{1}{2}V_t)\,dt + \sqrt{V_t}\,dW_t^s + d\Big(\sum_{j=1}^{N_t} Z_j^s\Big)

dV_t = \kappa(\theta - V_t)\,dt + \sigma\sqrt{V_t}\,dW_t^v + d\Big(\sum_{j=1}^{N_t} Z_j^v\Big)

where $N_t \sim \mathrm{Poi}(\lambda dt)$ describes the number of jumps, and $Z_j^s = \mu_s + \rho_s Z_j^v + \sigma_s\varepsilon_j$, with $\varepsilon_j \sim N(0,1)$ and $Z_j^v \sim \exp(\mu_v)$, are the jump sizes in the log-price process and the variance process, respectively. This is a rather generic representative of jump-diffusion models: when $\lambda = 0$, it reduces to the Heston model (SV) (ref); when $\mu_v = 0$, it is the Bates model (SVJ) (ref). Following the majority of the literature, we assume the two Brownian motions $W^s$ and $W^v$ are correlated with correlation coefficient $\rho$. This enables the model to capture the well-known leverage effect in financial time series.
6.3.2. Likelihood Estimator via Particle Filter
Estimating the likelihood of the model is challenging due to the unobserved latent states, i.e. the jumps and the stochastic volatility.
We use a particle filter to construct a likelihood estimator, so that we can use maximum likelihood estimation to learn the parameters of the model.
Using a simple Euler discretization scheme, it is easy to obtain a state-space representation of the jump-diffusion model as follows:

s_n = s_{n-1} + \mu\Delta t - \tfrac{1}{2}V_{n-1}\Delta t + \sqrt{(1-\rho^2)V_n\Delta t}\,\varepsilon_{1,n-1} + \rho\big(V_n - V_{n-1} - \kappa(\gamma - V_{n-1})\Delta t\big)/\sigma + \sum_{j=1}^{\Delta N_n} Z_j^s

V_n = V_{n-1} + \kappa(\gamma - V_{n-1})\Delta t + \sigma\sqrt{V_{n-1}\Delta t}\,\varepsilon_{2,n-1} + \sum_{j=1}^{\Delta N_n} Z_j^v

where $\Delta N_n = N_n - N_{n-1}$ is the total number of jumps generated between stages $n-1$ and $n$.
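For intuition, the discretized dynamics can be simulated directly. The sketch below uses the equivalent correlated-Brownian form of the same Euler step, a Knuth-style Poisson sampler, and purely illustrative parameter values:

```python
import math, random

def sample_poisson(rng, lam):
    """Knuth-style Poisson sampler; adequate for the small lam*dt here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_svcj(n_steps, dt, mu, kappa, gamma, sigma, rho, lam,
                  mu_s, rho_s, sigma_s, mu_v, s0=0.0, v0=0.04, seed=42):
    """One Euler path (s_n, V_n) of the discretized SVCJ dynamics,
    written with explicitly correlated Brownian increments."""
    rng = random.Random(seed)
    s, v = s0, v0
    path = [(s, v)]
    for _ in range(n_steps):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        dws = math.sqrt(dt) * e1
        dwv = math.sqrt(dt) * (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        # correlated jumps: Z^v ~ Exp(mean mu_v), Z^s = mu_s + rho_s Z^v + sigma_s eps
        zs_sum, zv_sum = 0.0, 0.0
        for _ in range(sample_poisson(rng, lam * dt)):
            zv = rng.expovariate(1.0 / mu_v)
            zs_sum += mu_s + rho_s * zv + sigma_s * rng.gauss(0.0, 1.0)
            zv_sum += zv
        vol = math.sqrt(max(v, 0.0))
        s = s + (mu - 0.5 * v) * dt + vol * dws + zs_sum
        v = max(v + kappa * (gamma - v) * dt + sigma * vol * dwv + zv_sum, 0.0)
        path.append((s, v))
    return path

# one year of daily steps with illustrative (not calibrated) parameters
path = simulate_svcj(252, 1.0 / 252, mu=0.05, kappa=3.0, gamma=0.04,
                     sigma=0.3, rho=-0.7, lam=35.0, mu_s=-0.02,
                     rho_s=-0.5, sigma_s=0.05, mu_v=0.02)
```

The `max(..., 0.0)` is a full-truncation fix to keep the Euler variance non-negative; it is a standard numerical device, not part of the model itself.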
We then use the auxiliary particle filter proposed by Pitt and Shephard (1999), extending the algorithm to the SVCJ model. The details of the algorithm are available in our paper, Yang et al. (2014). As a byproduct, the likelihood can be estimated from the weights of the particles.
6.3.3. Distributed Implementation
This computational bottleneck can potentially be relieved by parallel computing. Due to the necessary resampling and normalization steps, particle filtering is not an embarrassingly parallel problem. There are naturally two parallel schemes: external parallelization and internal parallelization.
The external scheme divides the particle swarm of $N$ particles into $M$ groups of $K$ particles, where $N = M \times K$, with the likelihood estimator:

\hat{L}_N(\theta) \approx \frac{1}{M}\sum_{m=1}^{M}\hat{L}_K^{(m)}(\theta) \quad (10)

Resampling at the group level, using a parallel resampling scheme (ref Miguez), is needed if the particle population degenerates at the group level.
The external scheme is rather simple and requires less implementation effort. In (ref bhl2), the scheme is applied for maximum likelihood estimation of the parameters of the Heston model (which can be viewed as a special case of SVCJ with λ = 0) using Techila Middleware. The test is performed on distributed computing systems, both a local computer cluster (TUTgrid) and the Windows Azure cloud. Since communication between computing nodes is comparatively costly, a proper choice of K reduces unnecessary resampling at the group level; in the optimal situation, embarrassing parallelism at the group level can be achieved. In the test, we take K = 500 and M = 20, for a total of 10,000 particles. The acceleration factor is approximately 14 times. The required K is expected to increase with the frequency of jumps.
The internal scheme is statistically more efficient than the external scheme. In fact, the internal scheme can be considered as an extreme case of the external scheme, with K = 1 and the maximum required number of resampling steps.
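The external scheme can be sketched end-to-end on a toy problem. The 1-D linear-Gaussian state-space model below is a stand-in for SVCJ, chosen purely to keep the sketch short; all names are ours, and each `run_group` call is an independent job that could run on its own worker:

```python
import math, random

def run_group(obs, K, phi=0.9, q=0.5, r=1.0, seed=0):
    """Bootstrap filter for x_n = phi*x_{n-1} + N(0, q), y_n = x_n + N(0, r).
    Returns the group's log-likelihood estimate."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(K)]
    loglik = 0.0
    for y in obs:
        xs = [phi * x + rng.gauss(0.0, math.sqrt(q)) for x in xs]  # propagate
        ws = [math.exp(-0.5 * (y - x) ** 2 / r) / math.sqrt(2 * math.pi * r)
              for x in xs]                                          # weight
        loglik += math.log(sum(ws) / K)
        xs = rng.choices(xs, weights=ws, k=K)  # the non-parallel-friendly step
    return loglik

def external_likelihood(obs, M=20, K=500):
    """Average the M independent group estimators, as in equation (10),
    combining them in likelihood space via log-sum-exp for stability."""
    logliks = [run_group(obs, K, seed=m) for m in range(M)]  # one job per group
    mx = max(logliks)
    return mx + math.log(sum(math.exp(l - mx) for l in logliks) / M)
```

Because resampling happens only inside each group, the M jobs never communicate, which is exactly the property that makes the external scheme attractive on a cluster or cloud.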
7. Cloud Alpha: Economics of Cloud Computing
In their review of cloud computing, Armbrust, Fox, Griffith, Joseph, Katz, Konwinski, Lee, Patterson, Rabkin, Stoica, et al. (2010) proposed a formula to evaluate the economic value of cloud computing by comparing it to alternative solutions9:

UserHours_{cloud} \times (revenue - Cost_{cloud}) \;\geq\; UserHours_{datacenter} \times \Big(revenue - \frac{Cost_{datacenter}}{Utilization}\Big)

The cost of cloud computing or of alternative solutions can be summarized in the total cost of ownership (TCO).
Cloud computing and alternative solutions may, however, carry different risks of IT failure. The effect on the risk measure should be taken into account when evaluating the potential benefit of cloud computing. Thus we derive the following formula for the benefit as the change in revenue plus the cost reduction and the benefit of risk control:

Benefit_{cloud} = \Delta(Revenue) - \Delta(TCO) - \gamma\,\Delta(Risk) \quad (11)

where $\Delta(Revenue) = Revenue_{cloud} - Revenue_{alternative}$ is the profit difference of the cloud vs. the alternative solution, $\Delta(TCO) = TCO_{cloud} - TCO_{alternative}$ is the negative of the cost reduction, $\Delta(Risk) = Risk_{cloud} - Risk_{alternative}$ measures the change in risk, and $\gamma$ is the risk premium.
The optimal choice of computing solution is simply the optimum of the following Markowitz-style objective:

\max_{s \in S}\; Revenue_s - TCO_s - \gamma\,Risk_s

where $S$ is the set of all feasible computing solutions.
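As a toy illustration of this objective (all numbers are invented), scoring each candidate solution and taking the argmax shows how the risk premium γ can flip the decision:

```python
def best_solution(solutions, gamma):
    """solutions: name -> (revenue, tco, risk); returns the name that
    maximizes revenue - TCO - gamma * risk."""
    def score(item):
        revenue, tco, risk = item[1]
        return revenue - tco - gamma * risk
    return max(solutions.items(), key=score)[0]

candidates = {"cloud": (100.0, 30.0, 5.0), "datacenter": (100.0, 40.0, 2.0)}
print(best_solution(candidates, gamma=1.0))   # cloud
print(best_solution(candidates, gamma=10.0))  # datacenter
```

With a low risk premium the cheaper cloud solution wins; a sufficiently risk-averse decision maker picks the lower-risk alternative, mirroring the Markowitz analogy.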
Quantitatively measuring revenue, cost and risk is a difficult task and is beyond the scope of this report. Thus, in the following subsections, we only provide a qualitative analysis of cost, revenue and risk to give some intuition of the economics of cloud computing.
7.1. Cost Analysis
Financial markets reduce transaction costs. As an example, asset managers issue ETFs and ETNs to investors, offering them a lower-cost route to diversification and to exposures to risks and markets that may be costly for an individual investor to access.
Cloud computing, by pooling computing resources, similarly offers clients a lower total cost of ownership (TCO) and access to up-to-date hardware.
9Here they compare cloud computing with a dedicated data center.
Cloud computing may offer cost reductions along the following dimensions. The first is the lower cost of hardware maintenance and upgrades. The second comes from the elasticity of cloud computing. The third is the lower cost of human resources.
7.2. Risks
The risk of IT system failure is non-negligible in the finance industry. The following two examples give some idea of the importance of having backup IT systems and highly reliable IT systems.
Example 7.1 (NYSE and Bloomberg). The New York Stock Exchange crashed at 11:32 am ET on July 8, 2015. The exchange was down for 3 hours and 38 minutes. According to NYSE (reference online document), this was due to a software update to the IT system.
Coincidentally, Bloomberg terminals had suffered a widespread outage on April 17, 2015, affecting more than 325,000 terminals worldwide.
IT failures can be costly. What would be the best way of managing risk for IT systems?
Cloud computing can be viewed as a form of insurance for IT. Since diversification is a widely accepted concept in the finance industry, cloud computing may be an easy way for the industry to diversify IT failure risk.
Distributed file systems, either in-house or in cloud vendors' data centers, protect data from hardware failures.
Cloud vendors also offer access to computing in data centers located in various locations around the world. Such a scheme provides a constant supply of computing resources in the case of catastrophic tail events, such as earthquakes, tsunamis, etc.
References
Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., Lee, G., Patter-
son, D., Rabkin, A., Stoica, I., et al., 2010. A view of cloud computing. Communications
of the ACM 53, 50–58.
Baesens, B., Van Gestel, T., Viaene, S., Stepanova, M., Suykens, J., Vanthienen, J., 2003.
Benchmarking state-of-the-art classification algorithms for credit scoring. Journal of the
Operational Research Society 54, 627–635.
Bates, D. S., 1996. Jumps and stochastic volatility: Exchange rate processes implicit in Deutsche Mark options. Review of Financial Studies 9, 69–107.
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J., 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning 3, 1–122.
DeMiguel, V., Garlappi, L., Uppal, R., 2009. Optimal versus naive diversification: How
inefficient is the 1/n portfolio strategy? Review of Financial Studies 22, 1915–1953.
Duffie, D., Pan, J., Singleton, K., 1999. Transform analysis and asset pricing for affine jump-
diffusions. Tech. rep., National Bureau of Economic Research.
Fan, J., Lv, J., Qi, L., 2011. Sparse high dimensional models in economics. Annual Review of Economics 3, 291.
Hand, D. J., Henley, W. E., 1997. Statistical classification methods in consumer credit
scoring: a review. Journal of the Royal Statistical Society. Series A (Statistics in Society)
pp. 523–541.
Joseph, E., Conway, S., Dekate, C., Cohen, L., 2014. IDC HPC update at ISC'14.
Kanniainen, J., Lin, B., Yang, H., 2014. Estimating and using GARCH models with VIX data for option valuation. Journal of Banking & Finance 43, 200–211.
Kou, S. G., 2002. A jump-diffusion model for option pricing. Management Science 48, 1086–1101.
Little, M., 2011. ESG and Solvency II in the cloud. Moody's Analytics Insights.
Mell, P., Grance, T., 2009. The NIST definition of cloud computing. National Institute of Standards and Technology 53, 50.
Pitt, M. K., Shephard, N., 1999. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association 94, 590–599.
Smith, D. M., 2008. Cloud computing scenario.
Techila, T., 2014. Cloud benchmark - round 1.
Techila, T., 2015. Cloud hpc in finance, cloud benchmark report with real-world use-cases.
West, D., 2000. Neural network credit scoring models. Computers & Operations Research
27, 1131–1152.
Yang, J., Lin, B., Luk, W., Nahar, T., 2014. Particle filtering-based maximum likelihood esti-
mation for financial parameter estimation. In: Field Programmable Logic and Applications
(FPL), 2014 24th International Conference on, IEEE, pp. 1–4.