gdmc v11 presentation
![Page 1: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/1.jpg)
2017 IEEE CIG
Game Data Mining Competition (GDMC) (https://cilab.sejong.ac.kr/gdmc2017/)
KyungJoong Kim, Dumim Yoon and Jihoon Jeon
(Cognition & Intelligence Lab, Sejong University)
Sung-il Yang and SangKwang Lee
(Electronics and Telecommunications Research Institute)
EunJo Lee and Yoonjae Jang
(NCSOFT)
![Page 2: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/2.jpg)
Game Data Mining
• Understanding game players’ behaviors from data
• Especially, predict players’ churn/retention or purchase behaviors from game log data
• Few public datasets are available to researchers, which limits the growth of the field
![Page 3: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/3.jpg)
Game Data Mining Competition
• Access to large-scale game log data (about 100 GB) from Blade & Soul, a commercially successful MMORPG by NCSOFT, one of the biggest game companies in South Korea
• Predict game players’ churn (a binary classification problem) and survival time (a regression problem) from the massive game log data
![Page 4: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/4.jpg)
http://www.bladeandsoul.com/en/
![Page 5: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/5.jpg)
Competition Tracks
Track 1: Churn Prediction
In this track, participants will predict players’ churn or retention on the test datasets. The winner will be determined based on the average F1-Measure.
Track 2: Survival Analysis
In this track, participants will predict the survival time (the number of days) of game players on the test datasets. The winner will be determined based on the average Root Mean Squared Logarithmic Error (RMSLE).
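As a rough illustration of the two evaluation metrics, the sketch below computes F1 for binary churn labels and RMSLE for survival times. The label encoding (churn = 1) and the function names are assumptions made here for illustration; the competition's official scoring code is not shown in the slides.

```python
import math

def f1_score(y_true, y_pred):
    """F1 for binary labels, with churn encoded as 1 (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error for non-negative survival times (days)."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))
```

Higher F1 is better in Track 1, while lower RMSLE is better in Track 2.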
![Page 6: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/6.jpg)
GDMC 2017 Homepage
• Important Dates
• Problem Description
• Tutorial (with R)
• Data Description
• Rules
https://cilab.sejong.ac.kr/gdmc2017/
![Page 7: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/7.jpg)
GDMC 2017 Google Groups
https://groups.google.com/d/forum/gdmc2017
• Announcement
• Sample Log
• Log Schema
• Log Data Download
  • Training Data
  • Test Data without Label
• Question/Answer
[Chart: number of members by month. March: 0, April: 76, May: 106, June: 206, July: 255, August: 264]
![Page 8: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/8.jpg)
Test Server
http://web_cilab.sejong.ac.kr/gdmcServer/
• Test your predictions before the deadline
• 10% of test data used for this test server (not used in final rankings)
• For security reasons, submissions are limited to a maximum of 48 per day (30-minute wait after the last submission)
![Page 9: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/9.jpg)
Problem Description
![Page 10: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/10.jpg)
Prediction Targets
[Figure: users plotted by Expense and Loyalty, with regions labeled "Light Users or Malicious Users (Bots)" and "Prediction Targets"]
![Page 11: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/11.jpg)
Predictions about 3 Weeks from Now
[Figure: timeline along the Time axis showing User Data (Two Months), a gap of Three Weeks, and then the Churn/Retention decision]
![Page 12: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/12.jpg)
Churn/Retention
• A long-term inactive state is treated as churn
• How many weeks for the churn decision?
  • Five weeks
• Retention: logged into the game more than once during the five weeks
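The labeling rule above can be sketched as follows, combining the three-week gap from the previous slide with the five-week decision window. The function name, the `min_logins` parameter, and the reading of "more than once" as at least two logins are assumptions for illustration, not the organizers' actual labeling code.

```python
from datetime import date, timedelta

GAP = timedelta(weeks=3)     # gap between the observed user data and the decision window
WINDOW = timedelta(weeks=5)  # length of the churn-decision window

def is_churn(observation_end, login_dates, min_logins=2):
    """Return True if the player counts as churned.

    A player is retained when they log in at least `min_logins` times inside
    the five-week window that starts three weeks after `observation_end`.
    min_logins=2 is one reading of "more than once" (an assumption).
    """
    start = observation_end + GAP
    end = start + WINDOW
    logins_in_window = sum(1 for d in login_dates if start <= d < end)
    return logins_in_window < min_logins

# Example with a hypothetical player whose observed data ends on 2017-05-11.
retained = not is_churn(date(2017, 5, 11), [date(2017, 6, 2), date(2017, 6, 20)])
```

Keeping the gap and window as named constants makes it easy to check how sensitive the labels are to either choice.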
![Page 13: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/13.jpg)
Concept Drift
Subscription Model (Monthly Fixed Charge Payment) → Free-to-Play (Dec 2016~)
![Page 14: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/14.jpg)
Data Description
| Data Set | Time Period | Weeks | Number of Gamers | Data Size* |
| --- | --- | --- | --- | --- |
| Training | APR-1-2017 ~ MAY-11-2017 | 6 | 4000 (30% churn) | 48G (175m Events) |
| Test Set 1 | JULY-27-2016 ~ SEP-21-2016 | 8 | 3000 (30% churn) | 30G |
| Test Set 2 | DEC-14-2016 ~ FEB-08-2017 | 8 | 3000 (30% churn) | 30G |

* Uncompressed Size
![Page 15: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/15.jpg)
Log Data Sample
| Time | Event Type | Details (up to 72 columns) |
| --- | --- | --- |
| 2016-05-04 6:38:32 PM | Enter World | Login Type, Actor Data … |
| 2016-05-04 6:39:16 PM | Enter Zone | Enter Zone Reason, Zone Type … |
| 2016-05-04 6:39:36 PM | Lose Item | Item Type, Item Count, … |
| 2016-05-04 6:39:36 PM | Get Item | Item Type, Item Count, … |
| 2016-05-04 6:39:40 PM | Get Item | Item Type, Item Count, … |
| ⋮ | ⋮ | ⋮ |

82 Event Types (World, Zone, Item, Party, Quest, Guild)
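One simple way to turn such logs into model inputs is to count events of each type per player. The sketch below does this on a handful of hypothetical rows; the `player_id` field, the tuple layout, and the second player are illustrative assumptions, since the actual log schema is not reproduced in the slides.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (player_id, timestamp, event_type). The real logs carry
# up to 72 detail columns; player_id and this layout are assumptions.
rows = [
    ("p1", "2016-05-04 18:38:32", "Enter World"),
    ("p1", "2016-05-04 18:39:16", "Enter Zone"),
    ("p1", "2016-05-04 18:39:36", "Lose Item"),
    ("p1", "2016-05-04 18:39:36", "Get Item"),
    ("p1", "2016-05-04 18:39:40", "Get Item"),
    ("p2", "2016-05-04 19:00:00", "Enter World"),
]

def event_counts(rows):
    """Count how often each event type occurs for each player."""
    counts = defaultdict(Counter)
    for player_id, _timestamp, event_type in rows:
        counts[player_id][event_type] += 1
    return counts

features = event_counts(rows)
```

With 82 event types, this yields at most 82 count features per player, which could then be extended with per-week counts or detail-column aggregates.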
![Page 16: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/16.jpg)
Competition Results
Track 1 Churn Prediction
![Page 17: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/17.jpg)
Participants (13 Teams)
| Team name | Team members | Affiliation | Type | Country |
| --- | --- | --- | --- | --- |
| GoAlone | 1 | Yonsei University | Academia | South Korea |
| DTND | 3 | DTND | ? | South Korea |
| goedle.io | 2 | goedle.io GmbH | Industry | Germany |
| IISLABSKKU | 3 | Sungkyunkwan University | Academia | South Korea |
| leessang | 2 | Yonsei University | Academia | South Korea |
| TheCowKing | 2 | KAIST | Academia | South Korea |
| TripleS | 3 | - | ? | South Korea |
| UTU | 4 | University of Turku | Academia | Finland |
| YD | 6 | Silicon Studio | Industry | Japan |
| YK | 1 | Yonsei University | Academia | South Korea |
| suya | 1 | Yonsei University | Academia | South Korea |
| NoJam | 3 | Yonsei University | Academia | South Korea |
| MNDS | 3 | Yonsei University | Academia | South Korea |
![Page 18: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/18.jpg)
| Rank | Team | Test1 score | Test2 score | Total score |
| --- | --- | --- | --- | --- |
| 1 | YD (Japan) | 0.61008 | 0.63326 | 0.62145 |
| 2 | UTU (Finland) | 0.60326 | 0.60370 | 0.60348 |
| 3 | TripleS (Korea) | 0.57968 | 0.62459 | 0.60130 |
| 4 | TheCowKing | 0.59370 | 0.60718 | 0.60036 |
| 5 | goedleio | 0.57717 | 0.60095 | 0.58882 |
| 6 | MNDS | 0.55920 | 0.56205 | 0.56062 |
| 7 | DTND | 0.49937 | 0.58776 | 0.53997 |
| 8 | IISLABSKKU | 0.56643 | 0.48733 | 0.52391 |
| 9 | suya | 0.44460 | 0.40967 | 0.42642 |
| 10 | YK | 0.49099 | 0.33181 | 0.39600 |
| 11 | GoAlone | 0.42697 | 0.31019 | 0.35933 |
| 12 | NoJam | 0.30741 | 0.30930 | 0.30835 |
| 13 | Lessang | 0.29760 | 0.29202 | 0.29479 |
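The slides do not state how the Total score is derived from the two test scores. Checking a few rows suggests it is consistent with the harmonic mean of Test1 and Test2 rather than the arithmetic mean; the sketch below shows that check. This is an observation from the reported numbers, not a documented rule.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two positive scores."""
    return 2 * a * b / (a + b)

# (team, Test1, Test2, reported Total) copied from the results table.
reported = [
    ("YD", 0.61008, 0.63326, 0.62145),
    ("TripleS", 0.57968, 0.62459, 0.60130),
    ("DTND", 0.49937, 0.58776, 0.53997),
]

for team, test1, test2, total in reported:
    arithmetic = (test1 + test2) / 2
    harmonic = harmonic_mean(test1, test2)
    print(f"{team}: reported={total:.5f} harmonic={harmonic:.5f} arithmetic={arithmetic:.5f}")
```

The harmonic mean penalizes a large gap between the two test sets, which would favor models that hold up under the change in business model between them.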
![Page 19: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/19.jpg)
YD (Winner)
• Silicon Studio, Japan
• Team Members: Paul Bertens, Pei Pei Chen, Kexin Chen, Anna Guitart, Sovann Lay, Africa Perianez
• Find features that have similar distributions in the training set and the test sets.
• Test 1 : LSTM + DNN (implemented with Keras)
• Test 2 : Extra Tree Classifier (# of trees = 50)
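The slides do not say how the YD team compared feature distributions. One possible approach, sketched below under that caveat, is to compute a two-sample Kolmogorov-Smirnov statistic per feature and keep only features whose training and test distributions are close. The function names, the dictionary layout, and the 0.2 threshold are hypothetical choices for illustration.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap

def stable_features(train, test, threshold=0.2):
    """Keep features whose train/test KS statistic is below `threshold`.

    `train` and `test` map feature name -> list of values.
    The threshold is an arbitrary illustrative choice.
    """
    return [name for name in train if ks_statistic(train[name], test[name]) < threshold]

# Hypothetical features: one stable, one that shifts between train and test.
train = {"logins_per_week": [1, 2, 3, 4], "gold_spent": [1, 2, 3, 4]}
test = {"logins_per_week": [1, 2, 3, 4], "gold_spent": [10, 11, 12, 13]}
selected = stable_features(train, test)
```

Filtering this way trades some predictive signal for robustness when the test period differs from the training period.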
![Page 20: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/20.jpg)
[Figure: LSTM+DNN, from the YD team's document]
![Page 21: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/21.jpg)
| Rank | Team | Techniques |
| --- | --- | --- |
| 1 | YD | LSTM+DNN, Extra-Trees Classifier |
| 2 | UTU | Logistic Regression |
| 3 | TripleS | Random Forest |
| 4 | TheCowKing | LightGBM (Light Gradient Boosting Machine) |
| 5 | goedleio | Feed Forward Neural Network |
| 6 | MNDS | Deep Neural Network |
| 7 | DTND | Generalized Linear Model |
| 8 | IISLABSKKU | Tree Boosting |
| 9 | suya | Deep Neural Network |
| 10 | YK | Logistic Regression |
| 11 | GoAlone | Logistic Regression |
| 12 | NoJam | Decision Tree |
| 13 | Lessang | Deep Neural Network |

Legend: Neural Net, Tree Approach, Linear Models
![Page 22: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/22.jpg)
Competition Results
Track 2 Survival Analysis
![Page 23: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/23.jpg)
Participants (5 Teams)
| Team name | Team members | Affiliation | Country |
| --- | --- | --- | --- |
| DTND | 3 | DTND | South Korea |
| IISLABSKKU | 3 | Sungkyunkwan University | South Korea |
| TripleS | 3 | - | South Korea |
| UTU | 4 | University of Turku | Finland |
| YD | 6 | Silicon Studio | Japan |
![Page 24: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/24.jpg)
| Rank | Team | Test1 score | Test2 score | Total score |
| --- | --- | --- | --- | --- |
| 1 | YD (Japan) | 0.883248 | 0.616499 | 0.726151 |
| 2 | IISLABSKKU (Korea) | 1.034321 | 0.679214 | 0.819972 |
| 3 | UTU (Finland) | 0.927712 | 0.898471 | 0.912857 |
| 4 | TripleS | 0.958308 | 0.891106 | 0.923486 |
| 5 | DTND | 1.032688 | 0.930417 | 0.978888 |
![Page 25: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/25.jpg)
| Rank | Team | Techniques |
| --- | --- | --- |
| 1 | YD | Ensemble of Conditional Inference Trees (# of Trees = 900) |
| 2 | IISLABSKKU | Tree Boosting |
| 3 | UTU | Linear Regression |
| 4 | TripleS | Ensemble Tree Method |
| 5 | DTND | Generalized Linear Model |

Legend: Neural Net, Tree Approach, Linear Models
![Page 26: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/26.jpg)
Future Data Use
• Data Download Deadline
  • Active until the end of August; an extension of the deadline is under discussion
• Data Use for Academic Research
  • No restriction on data use for academic research (please acknowledge this competition and NCSOFT)
• Test Data Label
  • The test data labels will be released soon.
![Page 27: Gdmc v11 presentation](https://reader034.vdocuments.site/reader034/viewer/2022050614/5a6777d47f8b9a656a8b54ff/html5/thumbnails/27.jpg)
Q & A