PNR-Based No-Show Forecast
AGIFORS Res/Yield Management Study Group Annual Meeting, New York, 21-24 March 2000
Kai-Uwe Kalka, Project Manager Revenue Management
Klaus Weber, Senior Scientific Analyst Revenue Management
Chart 2
PNR-Based No-Show Forecast
Motivation
Knowledge Discovery in Databases (KDD)
Rule Generation Through Induction Trees
Integration in Revenue Management Process
Results
Conclusions, Perspective
Chart 3
Motivation I
Overbooking research has the longest history in revenue management. (McGill 99)
No-show and cancellation forecast = basis for overbooking
Two main approaches
Time series models
Causal models
Hybrid models
[Figure: booked vs. showed-up passengers over time, Apr 97 to Apr 99; vertical axis 0 to 60]
Chart 4
Motivation II
Common assumption
Historical no-show / cancellation behaviour will be repeated
Difference
Time series model analyzes forecast variable
Causal model analyzes explanatory variable
Better estimation for explanatory variable
Chart 5
Motivation III
Rich source of explanatory variables
Passenger Name Records (PNR)
PNRPRO digs up this treasure
booking class
# record updates
seat reservation
segment miles
# passengers
# booking changes
# name changes
booking office
booking time
electronic ticket
time to other flight
protection booking
status
stretcher
day of week
free travel
special meal
...
Chart 7
Knowledge Discovery in Databases (KDD) I
Common Definition (Fayyad et al 96)
Knowledge Discovery in Databases is the
nontrivial process of identifying
valid,
novel,
potentially useful, and
ultimately understandable patterns in data.
Chart 8
Knowledge Discovery in Databases (KDD) II
Principle of Parsimony / “Ockham's razor”
Frustra fit per plura quod potest fieri per pauciora.
It is futile to do with more what can be done with fewer.
Quando propositio verificatur pro rebus, si duae res sufficiunt ad eius veritatem, superfluum est ponere tertiam.
When a proposition comes out true for things, if two things suffice for its truth, it is superfluous to assume a third.
William of Ockham, ca. 1286-1347 (Hoffmann 97)
Chart 9
Knowledge Discovery in Databases (KDD) III
KDD Goals (Anand 98a)
classification
regression
discovery of associations
discovery of sequential patterns
temporal modelling
deviation detection
dependency modelling
clustering
characteristic rule discovery
Will this customer cancel his booking, will he be a no-show, or will he ultimately show up?
Chart 10
Knowledge Discovery in Databases (KDD) IV
What is what?
Data mining
Statistics
Machine learning
Data Warehousing
OLAP
(Wrobel 98, OLAP Council 95)
Chart 11
Knowledge Discovery in Databases (KDD) IVa
What is what?
Data mining: used synonymously with KDD in the commercial area; in the narrow sense, the part of the KDD process that covers search and evaluation of hypotheses
Statistics
Machine learning
Data Warehousing
OLAP
Chart 12
Knowledge Discovery in Databases (KDD) IVb
What is what?
Data mining
Statistics (traditional): certainty of a set hypothesis on the basis of given data
KDD: automatic computer-aided search for hypotheses
Machine learning
Data Warehousing
OLAP
Chart 13
Knowledge Discovery in Databases (KDD) IVc
What is what?
Data mining
Statistics
Machine learning: the difference is the size of the database; KDD emphasizes scalability up to very large databases
Data Warehousing
OLAP
Chart 14
Knowledge Discovery in Databases (KDD) IVd
What is what?
Data mining
Statistics
Machine learning
Data Warehousing: process which extracts data from different database systems, merges these data, and stores them appropriately for further analysis
OLAP
Chart 15
Knowledge Discovery in Databases (KDD) IVe
What is what?
Data mining
Statistics
Machine learning
Data Warehousing
OLAP: on-line analytical processing
OLAP functionality characterization: dynamic multi-dimensional analysis of consolidated enterprise data, supporting end user analytical and navigational activities
Chart 16
Knowledge Discovery in Databases (KDD) V
What is what?
Data mining
Statistics
Machine learning
Data Warehousing
OLAP
PNR-based no-show forecast with induction trees
Chart 18
Rule Generation through Induction Trees I
Induction tree? (Quinlan 93)
Induction: inductive learning
Discover novel patterns in data (KDD definition)
Opposite: deductive learning, i.e. analyze and modify knowledge in knowledge bases
Tree: looks like a tree
Leaves indicate classes
Nodes specify tests
Chart 19
Rule Generation through Induction Trees II
multi-dimensional point set
[Figure: scatter plot of PNRs in the class/miles plane, each point marked + or -]
PNR-table
ID    attr. 1 (class)    attr. 2 (miles)    target (no-show)
1     5                  18.5               +
2     2                  17.9               -
...
N     12                 32.6               +
Chart 20
Rule Generation through Induction Trees IIIa
Known facts: historical PNR data
Build a model of the dataset: induction tree
Quantify the model
R+ : p = 15/17 = 88%
R- : p = 20/25 = 80%
R- : p = 12/15 = 80%
R- : p = 25/31 = 81%
R+ : p = 23/27 = 85%
[Figure: class/miles scatter plot partitioned into five rectangular regions, labelled R+ or R- by majority class]
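The "quantify the model" step can be illustrated with a minimal Python sketch. It assumes that each region's probability is simply the share of its majority label among the PNRs that fall into it, which matches the fractions on the slide (for example 15/17 = 88%). The function name `quantify_region` and the toy label list are hypothetical.

```python
from collections import Counter

def quantify_region(labels):
    """Return (majority_label, share) for the target labels inside one region."""
    counts = Counter(labels)
    majority, hits = counts.most_common(1)[0]
    return majority, hits / len(labels)

# Toy region with 15 no-shows ("+") among 17 PNRs, as in the first line of the slide.
region = ["+"] * 15 + ["-"] * 2
label, p = quantify_region(region)
print(f"R{label} : p = {p:.0%}")
```

The same call on a region with 20 "-" labels out of 25 would give the R- share of 80%.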
Chart 21
Rule Generation through Induction Trees IIIb
Substitute the data by the model for fast predictions
Query the model: Is datum (class = 2, miles = 18.5) a no-show?
Model prediction: Datum (class = 2, miles = 18.5) is no-show with p = 88%
[Figure: the partitioned class/miles plane with region probabilities 88 %, 80 %, 80 %, 81 %, 85 %; the query point (class = 2, miles = 18.5) is marked +]
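Querying the model instead of the data can be sketched as a short Python function. It encodes only the two branches of the example tree that the deck spells out in full (class 1-6 with miles 17-32, and class 1-3 with miles 0-16); all other leaves return `None` because their labels are not recoverable from the slides. Treating the miles ranges as continuous intervals split at 17 is an assumption, and the function name `predict` is hypothetical.

```python
def predict(booking_class, miles):
    """Walk the known branches of the example tree and return (outcome, probability)."""
    if 1 <= booking_class <= 6:
        if 17 <= miles <= 32:
            return ("no-show", 0.88)
        if 0 <= miles < 17 and booking_class <= 3:
            return ("shows up", 0.80)
    return None  # leaf not spelled out in the slides

# The query from the slide: class = 2, miles = 18.5
print(predict(2, 18.5))
```

The lookup costs a handful of comparisons per query, whatever the size of the historical PNR table.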
Chart 22
Rule Generation through Induction Trees IV
[Figure: induction tree built step by step over the class/miles point set. The root (all data) splits into class 1-6 and class 7-12; class 1-6 splits into miles 0-16 and miles 17-32; miles 0-16 splits into class 1-3 and class 4-6; class 7-12 splits into miles 0-12 and miles 13-36. Leaves are labelled + or -.]
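How a single split such as "miles 0-16 / miles 17-32" might be chosen can be sketched in Python. The deck cites C4.5 (Quinlan 93), which uses the gain ratio; the simplified sketch below uses plain information gain on one numeric attribute, so it is an illustration rather than the method used in the study. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain) of the split 'value <= threshold' with the highest information gain."""
    base = entropy(labels)
    best = (None, 0.0)
    for t in sorted(set(values))[:-1]:  # the largest value would leave the right side empty
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if base - weighted > best[1]:
            best = (t, base - weighted)
    return best

# Hypothetical toy data: short segments show up ("-"), long ones are no-shows ("+").
miles = [5, 9, 14, 16, 18, 22, 27, 31]
no_show = ["-", "-", "-", "-", "+", "+", "+", "+"]
print(best_threshold(miles, no_show))
```

A full tree would apply this search recursively to every attribute in each resulting subset until the leaves are sufficiently pure.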
Chart 23
Rule Generation through Induction Trees V
Why rules?
Induction trees can be cumbersome, complex and inscrutable.
Each node has a specific context.
Individual subconcepts can be fragmented.
Corresponding rule set is as complex as the tree is.
Generate rules and simplify them
Chart 24
Rule Generation through Induction Trees VIa
From trees to rules: every path (root to leaf) gives one initial rule
[Figure: the example tree. Root (all data) splits into class 1-6 and class 7-12; class 1-6 splits into miles 0-16 and miles 17-32; miles 0-16 splits into class 1-3 and class 4-6; class 7-12 splits into miles 0-12 and miles 13-36. Leaves are labelled + or -.]
Chart 25
Rule Generation through Induction Trees VIb
From trees to rules
IF [class in range 1-6]
AND [miles in range 17-32]
THEN [pax is no-show (88 %)]

IF [class in range 1-6]
AND [miles in range 0-16]
AND [class in range 1-3]
THEN [pax shows up (80 %)]
[Figure: the two corresponding root-to-leaf paths highlighted in the example tree]
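Turning root-to-leaf paths into rules and simplifying them can be sketched in Python. The sketch covers only the two paths shown on this slide and applies the most basic simplification, merging repeated tests on the same attribute by intersecting their ranges. This is an assumption for illustration, since C4.5's actual rule simplification is more elaborate. The names `PATHS`, `simplify` and `to_rule` are hypothetical.

```python
# Each path: a list of (attribute, low, high) tests from root to leaf, plus the leaf outcome.
PATHS = [
    ([("class", 1, 6), ("miles", 17, 32)], "pax is no-show (88 %)"),
    ([("class", 1, 6), ("miles", 0, 16), ("class", 1, 3)], "pax shows up (80 %)"),
]

def simplify(tests):
    """Merge repeated tests on one attribute by intersecting their ranges."""
    merged = {}
    for attr, low, high in tests:
        if attr in merged:
            old_low, old_high = merged[attr]
            low, high = max(low, old_low), min(high, old_high)
        merged[attr] = (low, high)
    return [(attr, low, high) for attr, (low, high) in merged.items()]

def to_rule(tests, outcome):
    """Render a list of tests and an outcome as an IF ... THEN ... rule."""
    conditions = " AND ".join(f"[{a} in range {lo}-{hi}]" for a, lo, hi in tests)
    return f"IF {conditions} THEN [{outcome}]"

for tests, outcome in PATHS:
    print(to_rule(simplify(tests), outcome))
```

In the second rule the two class tests collapse into the single condition "class in range 1-3", which is the kind of redundancy the slide on "Why rules?" refers to.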
Chart 26
Rule Generation through Induction Trees VII
PNR data -> KDD -> Rule set
Rule 1: IF … THEN ...
Rule 2: IF … THEN ...
Rule 3: IF … THEN ...
Passenger attributes -> Rule set -> Passenger is no-show / cancels / shows up
Chart 28
Integration in Revenue Management Process I
Human Resource Identification
Problem Specification
Data Prospecting
Domain Knowledge Elicitation
Methodology Identification
Data Pre-Processing
Pattern Discovery
Knowledge Post-Processing
Refinement (permanent)
Knowledge Discovery FC-Engine
(Anand 98b)
Chart 29
Integration in Revenue Management Process IIa
Step 1: Get all customer data ... and feed the FC-Engine
[Diagram: Customer Data -> Knowledge Discovery FC-Engine]
Chart 30
Integration in Revenue Management Process IIb
Step 2: Generate different forecast models for the same problem
[Diagram: PNR data PA, PB, PJ, PK -> Models MA, MB, MJ, MK in the Knowledge Discovery FC-Engine]
Chart 31
Integration in Revenue Management Process IIc
Step 3: Rank predictions for all forecast models
Normalization and Ranking Process
[Diagram: Customer Data -> Models MA, MB, MJ, MK -> Probabilities A, B, J, K -> Normalization and Ranking Process, inside the Knowledge Discovery FC-Engine]
Chart 32
Integration in Revenue Management Process IId
Step 4: Link models for prediction application
[Diagram: linked pairs A-MA, B-MB, J-MJ, K-MK in the Knowledge Discovery FC-Engine]
Chart 33
Integration in Revenue Management Process IIe
Step 5: Use the best forecast mix for actual customer request and feed the Optimizer
[Diagram: PNR-data -> Knowledge Discovery FC-Engine (linked pairs A-MA, B-MB, J-MJ, K-MK) -> forecast -> Revenue Management Optimizer]
Chart 34
Integration in Revenue Management Process IIf
In the meantime, go to step 1 and refine (steps 1 2 3 4 5)
[Diagram: PNR-data -> Knowledge Discovery FC-Engine (linked pairs A-MA, B-MB, J-MJ, K-MK) -> forecast -> Revenue Management Optimizer]
Chart 36
Results I
All results based on Lufthansa real world data
Computation of no-show rate
based on PNRs (new method)
based on historical data (standard method)
based on combination of both methods
Comparison with respect to
flights
booking classes
compartments
Calculation of no-show rate errors
using mean absolute deviation (MAD)
using mean square error (MSE)
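The two error measures can be sketched in Python. The sketch assumes that MAD is taken as the mean absolute difference between forecast and actual no-show rates and MSE as the mean squared difference; the deck does not give the exact definitions used. The function names and the sample rates are hypothetical.

```python
def mean_absolute_deviation(forecast, actual):
    """Average absolute difference between forecast and actual values."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

def mean_square_error(forecast, actual):
    """Average squared difference between forecast and actual values."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)

# Hypothetical no-show rates (as fractions) for four flights.
forecast = [0.10, 0.12, 0.08, 0.15]
actual = [0.12, 0.10, 0.08, 0.11]

mad = mean_absolute_deviation(forecast, actual)
mse = mean_square_error(forecast, actual)
print(f"MAD = {mad:.4f}, MSE = {mse:.6f}")
```

MSE penalizes large individual misses more heavily than MAD, so reporting both shows whether a method's errors are evenly spread or dominated by a few outliers.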
Chart 37
Results IIa
Relative no-show rate error
Comp       87 % * Bkd    Standard    PNR-based    Combined
F          15.2 %        16.4 %      10.7 %       11.6 %
C          10.5 %        9.6 %       8 %          7.6 %
M          12 %          11.2 %      9.7 %        9.3 %
Average    12.5 %        12.4 %      9.5 %        9.5 %
Chart 38
Results IIb
Relative no-show rate error compared to standard method
Comp       87 % * Bkd    Standard    PNR-based    Combined
F          7 %           0 %         34.8 %       29.3 %
C          -9.3 %        0 %         16 %         20.8 %
M          -5.3 %        0 %         22.3 %       25.9 %
Average    -0.8 %        0 %         23.4 %       23.4 %
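The comparison with the standard method can be illustrated with a short Python sketch. It assumes the improvement is computed as (standard error - method error) / standard error, a formula the deck does not state. With the rounded relative no-show rate errors of the F and Average rows it reproduces the F and Average figures of this slide; other rows may rest on unrounded values. The function name is hypothetical.

```python
def relative_improvement(standard_error, method_error):
    """Percentage by which a method's error falls below the standard method's error."""
    return (standard_error - method_error) / standard_error * 100

# F compartment: standard 16.4 %, PNR-based 10.7 %, combined 11.6 %
f_pnr = round(relative_improvement(16.4, 10.7), 1)
f_combined = round(relative_improvement(16.4, 11.6), 1)

# Average row: standard 12.4 %, PNR-based 9.5 %, 87 % * Bkd 12.5 %
avg_pnr = round(relative_improvement(12.4, 9.5), 1)
avg_bkd = round(relative_improvement(12.4, 12.5), 1)

print(f_pnr, f_combined, avg_pnr, avg_bkd)
```

A positive value means the method has a smaller error than the standard method, a negative value a larger one.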
Chart 39
Results III
Significance of attributes with respect to no-show forecast
Influence of “day of week” is highly overrated
“It is futile to do with more what can be done with fewer.” (Ockham)
Chart 41
Conclusions, Perspective
Introduction of PNR-based no-show forecast
Induction Trees
Performance superior to standard methods
New insight into significance of attributes
Easy integration in existing revenue management process
Other methods, e.g. CHAID analysis, Logit model, Artificial Neural Networks
Transfer to cancellations forecast: first results
Introduction of fuzzy logic approaches: upcoming
Chart 42
References
(Anand 98a) Anand, S.S., Hughes, J.G.: Hybrid Data Mining Systems: The Next Generation. In: Wu, X., Kotagiri, R., Korb, K.B. (Eds.): Research and Development in Knowledge Discovery and Data Mining. Springer-Verlag, Berlin, 1998.
(Anand 98b) Anand, S.S., Patrick, A.R., Hughes, J.G., Bell, D.A.: A Data Mining methodology for cross-sales. Knowledge-Based Systems 10 (1998) pp. 449-461.
(Fayyad et al 96) Fayyad, U., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R.: Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press, Cambridge, 1996.
(Hoffmann 97) Hoffmann, R., Minkin, V.I., Carpenter, B.K.: Ockham’s Razor and Chemistry. HYLE - An International Journal for the Philosophy of Chemistry, http://www.uni-karlsruhe.de/~philosophie/hyle.html 3 (1997) pp. 3-28.
(McGill 99) McGill, J.I., van Ryzin, G.J.: Revenue Management: Research Overview and Prospects. Transportation Science 33 (1999) 2, pp. 233-256.
(OLAP Council 95) http://www.olapcouncil.org, glossary of terms: http://www.olapcouncil.org/research/glossaryly.htm
(Quinlan 93) Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, 1993.
(Wrobel 98) Wrobel, S.: Data Mining und Wissensentdeckung in Datenbanken. Künstliche Intelligenz 1 (1998) pp. 6-10.
Thank you for your attention! Any questions?