Educational Question Routing in Online Student Communities

Jakub Macina, Slovak University of Technology
Ivan Srba, Slovak University of Technology
Joseph Jay Williams, Harvard University / National University of Singapore
Maria Bielikova, Slovak University of Technology

11th ACM Conference on Recommender Systems, Como, Italy, 27th-31st August 2017

2 / 31

Online Student Communities

• Massive Open Online Courses (MOOCs)

Dropout rate up to 94%

Community Question Answering (CQA)

Discussion forum

3 / 31

Challenge for MOOC Discussions

• Up to 50% of questions remain unanswered

• Course instructors are overloaded with many students to serve

• Low participation of students in question answering

1. Lurkers who are not contributing

2. Willing to participate but overloaded with many questions

4 / 31

Our Idea: Question Routing

• Recommendation of new questions to users who are suitable to answer them

• Well-known research task from CQA systems

What is the capital city of Italy?

5 / 31

Related Work

• Question routing in standard CQA

• Asker-oriented approaches

• Overloading small group of experts

• Based mainly on QA data

6 / 31

Related Work

• Question recommendation in MOOCs

• Constrained optimization framework (Yang et al., 2014)

• Any question beneficial to user

• With significant time-delay

Not appropriate for MOOCs

7 / 31

Educational Question Routing

8 / 31

Educational Question Routing Task

Given a new question $q$,

find an ordered list of users $u_1, \dots, u_n$ who are most suitable to answer question $q$

Opportunities

Data from MOOC course (grades, accomplished exercises)

Constraints

Appropriate knowledge

Willingness to answer

Working capacity

9 / 31

Goals of Educational Question Routing

• G1: Decrease information load of users by accurate recommendations

• G2: Engage a greater part of the community in the question answering

• G3: Increase the average number of contributions

10 / 31

Educational Question Routing Framework

11 / 31

Educational Question Routing Framework

User modeling phase

12 / 31

Educational Question Routing Framework

Routing phase

13 / 31

1. Construction of Question Profile

• Question text profile 𝜃𝑞

• Captures question’s content

• Text pre-processing

• Bag-of-words model (tf-idf weights)

• Metadata

• Asker, category, etc.
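
The bag-of-words profile with tf-idf weights could be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the tokenizer, the function names `tokenize` and `tfidf_profiles`, and the exact tf-idf variant (relative term frequency times log inverse document frequency) are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed pre-processing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_profiles(questions):
    """Return one sparse tf-idf vector (dict: term -> weight) per question text."""
    token_lists = [tokenize(q) for q in questions]
    n_docs = len(token_lists)
    # Document frequency: the number of questions in which each term occurs.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    profiles = []
    for tokens in token_lists:
        counts = Counter(tokens)
        profiles.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return profiles
```

Terms that occur in every question get weight zero under this variant, so the profile emphasizes the words that distinguish one question from the others.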

14 / 31

2. Construction of User Profile

• User text profile 𝜃𝑢

• Captures topics of questions the user previously answered (user’s interests)

$\theta_u = \sum_{q \in Q_u} (\theta_q + \theta_{a,q})$

• Metadata about previous user activities

• Quantity, quality and time distribution

• In CQA and MOOC
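
As an illustration of the user text profile, the sketch below sums the text vectors of each question a user answered and of the user's answer to it. It assumes sparse vectors stored as dicts (term -> weight); the function name `user_text_profile` and the input layout are hypothetical.

```python
from collections import Counter


def user_text_profile(answered):
    """Sum question and answer vectors over the questions a user answered.

    `answered` is a list of (theta_q, theta_a) pairs, each a dict term -> weight.
    """
    theta_u = Counter()
    for theta_q, theta_a in answered:
        theta_u.update(theta_q)  # add the question's text vector
        theta_u.update(theta_a)  # add the vector of the user's answer to it
    return dict(theta_u)
```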

15 / 31

3. Matching of Questions and Users

• Ranking of users given new question

• Ensemble of two classification tasks:

• Appropriate expertise to answer a new question

• Willingness to answer a new question

• Combination:

𝑃(𝑦 = 1) = 𝑃(𝑒𝑥𝑝𝑒𝑟𝑡𝑖𝑠𝑒 = 1) ∗ 𝑃(𝑤𝑖𝑙𝑙𝑖𝑛𝑔𝑛𝑒𝑠𝑠 = 1)
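
A minimal sketch of the combination step, assuming the two classifiers have already produced a probability of expertise and a probability of willingness for each candidate user. The function name `rank_users` and the input format are assumptions for the example.

```python
def rank_users(candidates, top_k=10):
    """Rank users by P(expertise = 1) * P(willingness = 1), highest first.

    `candidates` maps a user id to a (p_expertise, p_willingness) pair.
    """
    scores = {user: p_e * p_w for user, (p_e, p_w) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


ranking = rank_users({
    "alice": (0.9, 0.2),   # expert but unlikely to respond: 0.18
    "bob": (0.6, 0.7),     # 0.42
    "carol": (0.8, 0.5),   # 0.40
})
print(ranking)  # ['bob', 'carol', 'alice']
```

Because the score is a product, a user with high expertise but low willingness ranks below a user who is moderately strong on both.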

16 / 31

3. Matching of Questions and Users

• Features derived from text and metadata comparison between question and user profile

• Features for expertise classification (# of features = 11)

• Level of difficulty for a user to answer a new question - knowledge gap

• Portion of related lectures watched

• Grades

• Features for willingness classification (# of features = 14)

• Overall count of answers, questions and comments

• Amount of latest activity

• Response time on recommendations

17 / 31

4. Optimization

• Balancing routed questions by considering current student’s workload 𝐿𝑢

18 / 31

Experiments – CQA system

• Educational and organizational CQA system Askalot

• Open source, developed at Slovak University of Technology

• Builds on diversity in students’ knowledge and educational/organizational specifics

• University/MOOC variant

github.com/AskalotCQA/askalot

19 / 31

Experiments - MOOC

• QuCryptox Quantum Cryptography at edX

• Offered by Caltech and TU Delft

• 10 weeks (Sept. 2016 – Dec. 2016)

https://courses.edx.org/courses/course-v1:CaltechDelftX+QuCryptox+3T2016

20 / 31

Course Statistics

Metric                                               Quantity
Students enrolled in the course                      8115
Students started the course                          4618
Users participating in CQA (contributors + lurkers)  1098 (24%)
Users contributing in CQA                            377 (8%)
Questions                                            361
Answers                                              386
Comments                                             476

21 / 31

Evaluation Methodology

• Offline experiment

• Online experiment

• Very rare in context of CQA systems

• Ecologically valid

• Total impact on student community

• Baseline: non-educational asker-oriented question routing method with optimization

22 / 31

Offline Experiment Setup

• Standard ML pipeline including:

• Feature transformation

• Feature selection

• Chi square selection

• Model selection

• SVM, Random forest, Logistic regression

• Hyper-parameter tuning

23 / 31

Offline Experiment Results

• Comparison with actual answerers of a question

24 / 31

Online Experiment Setup

• A/B test during 7 weeks

• Stratified random assignment to three groups:

1. Educational (n=1306)

2. Baseline (n=1306)

3. Control (n=1306)

• Recommendation to top 10 users

• Constraint for workload 𝐿𝑢

• maximum 4 recommendations per 7 days

• Real-time profile updates, re-training each day
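
One way the workload constraint could be applied is sketched below. The slides state only the limits (top 10 users, at most 4 recommendations per 7 days); skipping a user at capacity and moving on to the next ranked user is an assumption of this example, as are the function name `route_question` and the `sent_log` structure.

```python
from datetime import datetime, timedelta


def route_question(ranked_users, sent_log, now, top_k=10, max_recs=4, window_days=7):
    """Pick the first `top_k` ranked users who still have capacity.

    `sent_log` maps a user id to the datetimes of earlier recommendations
    and is updated in place for every selected user.
    """
    window_start = now - timedelta(days=window_days)
    selected = []
    for user in ranked_users:
        recent = [t for t in sent_log.get(user, []) if t > window_start]
        if len(recent) < max_recs:  # user is below the workload limit
            selected.append(user)
            sent_log.setdefault(user, []).append(now)
        if len(selected) == top_k:
            break
    return selected
```

With these limits, 132 routed questions sent to 10 users in each of the two recommendation groups give 132 × 10 × 2 = 2640 recommendations, which matches the total reported in the online results.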

25 / 31

Online Experiment Setup

Notification

Dashboard

26 / 31

Online Experiment Results

• 132 new questions were routed to potential answerers

• Resulting in 2640 recommendations

27 / 31

G1: Accurate Recommendations Decreased Information Load

Metric      Our method  Baseline  Statistical significance
CTR         23.25%      18.29%    $\chi^2(1, N = 2640) = 10.03$, $p < 0.01$
Success@10  15.91%      10.61%    $\chi^2(1, N = 264) = 1.61$, $p = 0.20$
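
The Success@10 row can be checked with a Pearson chi-square test on a 2x2 table. The counts used here are not given on the slide; they are inferred from the percentages under the assumption that each of the 132 routed questions was evaluated once per group (21/132 = 15.91%, 14/132 = 10.61%). The function name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(success_a, n_a, success_b, n_b):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    observed = [[success_a, n_a - success_a], [success_b, n_b - success_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [success_a + success_b, total - success_a - success_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Inferred counts: 21 and 14 successful questions out of 132 per group.
chi2, p = chi_square_2x2(21, 132, 14, 132)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")  # chi2 = 1.61, p = 0.20
```

Under these inferred counts the statistic reproduces the reported value, which suggests the difference in Success@10 is not statistically significant at this sample size.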

28 / 31

G2: Greater Part of the Community Got Involved

Period  Our method       Baseline        Control
Before  7.60% (62/816)   8.99% (73/812)  9.12% (74/811)
During  13.16% (40/304)  9.35% (26/278)  8.72% (28/321)

Active CQA users / active MOOC users

29 / 31

G3: Average Number of Contributions Increased

[Figure: average number of contributions before the experiment and during the experiment (with recommendation)]

30 / 31

Possible Improvements

• Duplicate questions identification

• Question retrieval (another well-known task in CQA)

• Question type identification

• Some questions can be answered only by instructors

• Scalability

31 / 31

Educational Question Routing in Online Student Communities

1. Answerer-oriented question routing framework considering not only expertise, but also willingness and workload of answerers

2. Incorporating additional MOOC data beyond CQA activity

3. Effectiveness in the real world is demonstrated by an online experiment with more than 4600 MOOC students.

Code available at:

https://github.com/dmacjam/dp-analysis-evaluation