Recommendations for Open Online Education: An Algorithmic Study

Soude Fazeli1, Enayat Rajabi2, Leonardo Lezcano3, Hendrik Drachsler1, Peter Sloep1

1 Open University Netherlands, 2 Dalhousie University, 3 eBay Inc.

27.07.2016, ICALT 2016, Austin, Texas, USA


• Hendrik Drachsler, Associate Professor, Learning Technologies

• Research topics: Personalization, Recommender Systems, Learning Analytics, Mobile devices

• Application domains: Schools, HEI, Medical education

WhoAmI

2006 - 2009

@HDrachsler


Context of the study

• Goal: Personalization of learning (based on prior knowledge)
• Problem: Selection from a huge variety of possibilities (information overload)
• Solution: Recommender systems that point a target user to content of interest based on her user profile


Problem definition


Institutional Course RecSys vs. Open Education RecSys

• Institutional course RecSys: rich learner and course metadata
• Open education RecSys: sparse learner and course metadata


RQ: How to recommend courses to learners in open education platforms?


Research Question


1. Content-based
2. Collaborative filtering ✓


Recommender system algorithms

Our input data are mainly indirect (implicit) user ratings, so collaborative filtering is more relevant for us.


Drachsler, H., Verbert, K., Santos, O., and Manouselis, N. (2015). Recommender Systems for Learning. 2nd Handbook on Recommender Systems. Berlin: Springer.


• Memory-based
– Use statistical approaches to infer similarity between users based on the users’ data stored in memory
– k-Nearest Neighbour method (kNN, with neighbourhood size k)
– Similarity metrics: Pearson correlation, cosine similarity, and the Jaccard coefficient

• Model-based
– Use probabilistic approaches to create a model of users’ feedback
– Matrix factorization and Bayesian networks
– Are faster than memory-based algorithms
– Are more costly (required resources and maintenance)

In this study, we use both memory-based (both user-based and item-based) and model-based algorithms to test which one performs best on the Open U platform.
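As a rough illustration of the three similarity metrics named above, the following sketch computes each one in plain Python. The function names and the toy inputs are assumptions made for this example; they are not taken from the study's implementation.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient: shared items divided by all items of both users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(x, y) -> float:
    """Cosine similarity between two equally long numeric vectors."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0


def pearson(x, y) -> float:
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([p - mean_x for p in x], [q - mean_y for q in y])


# Hypothetical toy data: sets of course ids and binary activity vectors.
sim_sets = jaccard({"c1", "c2", "c3"}, {"c2", "c3", "c4"})
sim_vectors = cosine([1, 1, 0], [1, 0, 1])
sim_ratings = pearson([1, 2, 3], [2, 4, 6])
```

Jaccard works directly on sets of items a user touched, which is why it pairs naturally with implicit feedback, while Pearson needs graded values to be meaningful.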


Collaborative Filtering (CF) algorithms


H1: Item-based outperforms user-based approaches

H2: Model-based outperforms memory-based approaches


Hypotheses

Experiment


1. Dataset

• From the open education platform OpenU, a broad national online learning platform for lifelong learning
• Data collected: from March 2009 until September 2013
• Users: OpenU users are professionals from various domains

Dataset   Users   Learning objects   Transactions   Sparsity (%)
OpenU     3462    105                92,689         98.14



• Figure 1: Course completion in relation to the students’ activity

• Each blue X: the Percentage of Online Interactions (POI) for a given student and a given course, relative to the highest online interactions of a student in that course.

• Online interactions = student’s contributions to chat sessions and forum messages.

The course completion rate for OpenU students goes up dramatically with increases in students’ interactions (with course-mates and the academic staff).
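The POI definition above can be read as a simple normalisation. The sketch below assumes that "the highest online interactions of a student in that course" means the maximum interaction count over all students of the course; the function name and the counts are hypothetical.

```python
def percentage_of_online_interactions(counts):
    """Map each student to 100 * own interactions / highest count in the course.

    counts: dict of student id -> number of chat and forum contributions
    in a single course (assumed input shape).
    """
    top = max(counts.values(), default=0)
    if top == 0:
        return {student: 0.0 for student in counts}
    return {student: 100.0 * n / top for student, n in counts.items()}


# Hypothetical interaction counts for one course.
poi = percentage_of_online_interactions({"s1": 40, "s2": 10, "s3": 0})
```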


Experiment

1. Dataset


Experiment

2. Algorithms

2.1. Memory-based

• Most CF algorithms are based on kNN methods:

– Find like-minded users and introduce them as the target user’s nearest neighbours

• The appropriate similarity measure depends on whether the input data is:
– Explicit user feedback (e.g. 5-star ratings), or
– Implicit user feedback (e.g. views, downloads, clicks, etc.)

• Open U = implicit user feedback (activities) -> the Jaccard coefficient and cosine similarity are appropriate
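A minimal sketch of user-based kNN on implicit feedback, assuming each user is represented by the set of courses they interacted with and that candidate courses are scored by the summed Jaccard similarity of the neighbours who took them. The scoring rule, function names, and toy data are assumptions for illustration, not the configuration used in the study.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_user_based(target, interactions, k=2, top_n=2):
    """Recommend unseen items from the k users most similar to `target`."""
    seen = interactions[target]
    sims = [(jaccard(seen, items), user)
            for user, items in interactions.items() if user != target]
    neighbours = sorted(sims, reverse=True)[:k]  # k nearest neighbours
    scores = defaultdict(float)
    for sim, user in neighbours:
        for item in interactions[user] - seen:
            scores[item] += sim  # weight each candidate by neighbour similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical users and course ids.
interactions = {
    "ann": {"c1", "c2", "c3"},
    "bob": {"c1", "c2", "c3", "c4"},
    "cat": {"c3", "c5"},
    "dan": {"c6"},
}
recs = recommend_user_based("ann", interactions, k=2, top_n=2)
```

Here bob (similarity 0.75) and cat (similarity 0.25) become ann's neighbours, so c4 is ranked ahead of c5.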



Experiment

2. Algorithms

2.2. Model-based

• Bayesian Personalized Ranking (BPR) method proposed by Rendle et al.
– They applied their BPR to the state-of-the-art matrix factorization models to improve the learning process in the Bayesian model used (BPRMF).

• MostPopular approach
– Makes recommendations based on general popularity of items
– Items are weighted based on how often they have been seen in the past
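The MostPopular baseline can be sketched in a few lines. This version counts how many users interacted with each item and, as an assumption of this example, skips items the target user has already seen; names and data are hypothetical.

```python
from collections import Counter


def most_popular(interactions, target, top_n=5):
    """Rank items by how many users interacted with them."""
    counts = Counter(item for items in interactions.values() for item in items)
    seen = interactions.get(target, set())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Assumption: already-seen items are not recommended again.
    return [item for item, _ in ranked if item not in seen][:top_n]


# Hypothetical users and course ids.
interactions = {
    "ann": {"c1", "c2"},
    "bob": {"c2", "c3"},
    "cat": {"c2", "c3", "c4"},
}
popular_for_ann = most_popular(interactions, "ann", top_n=2)
```

Because the ranking ignores the target user's profile apart from the seen-item filter, it serves as a non-personalized reference point.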

S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “BPR: Bayesian Personalized Ranking from Implicit Feedback,” in UAI ’09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 2009, pp. 452–461.
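To give a feel for how BPR with matrix factorization learns from implicit feedback, here is a small pure-Python sketch of the pairwise update: for a sampled user u, an observed item i and an unobserved item j, the factors are nudged so that i scores higher than j. All hyperparameters (dim, lr, reg, steps), the function names, and the toy data are assumptions for illustration; this is not the implementation evaluated in the study.

```python
import math
import random


def train_bpr_mf(interactions, n_items, dim=4, lr=0.05, reg=0.01,
                 steps=5000, seed=0):
    """Learn user factors P and item factors Q with BPR-style SGD.

    interactions: list of sets; interactions[u] holds the item indices
    user u has interacted with (implicit feedback).
    """
    rng = random.Random(seed)
    n_users = len(interactions)
    P = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    # Only users with at least one seen and one unseen item yield triples.
    users = [u for u in range(n_users) if 0 < len(interactions[u]) < n_items]
    for _ in range(steps):
        u = rng.choice(users)
        i = rng.choice(sorted(interactions[u]))        # observed item
        j = rng.choice([x for x in range(n_items)
                        if x not in interactions[u]])  # unobserved item
        x_uij = sum(P[u][f] * (Q[i][f] - Q[j][f]) for f in range(dim))
        g = 1.0 / (1.0 + math.exp(x_uij))  # sigmoid(-x_uij)
        for f in range(dim):
            pu, qi, qj = P[u][f], Q[i][f], Q[j][f]
            P[u][f] += lr * (g * (qi - qj) - reg * pu)
            Q[i][f] += lr * (g * pu - reg * qi)
            Q[j][f] += lr * (-g * pu - reg * qj)
    return P, Q


def score(P, Q, u, i):
    """Predicted preference of user u for item i (dot product of factors)."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def training_auc(P, Q, interactions, n_items):
    """Share of (user, seen, unseen) triples ranked in the right order."""
    correct = total = 0
    for u, seen in enumerate(interactions):
        for i in seen:
            for j in range(n_items):
                if j not in seen:
                    total += 1
                    correct += score(P, Q, u, i) > score(P, Q, u, j)
    return correct / total


# Hypothetical toy data: two groups of users with mostly disjoint courses.
toy = [{0, 1}, {0, 1, 2}, {3, 4}, {3, 4, 5}]
P, Q = train_bpr_mf(toy, n_items=6)
auc = training_auc(P, Q, toy, n_items=6)
```

To recommend for a user, one would rank the unseen items by `score` and return the top n.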


Experiment

2. Algorithms

2.3. Graph-based

• Implicit networks: a graph
– Nodes: users; Edges: similarity relationships; Weights: similarity values

• Improve the process of finding nearest neighbors
– By invoking graph search algorithms
– Memory-based and user-based
– For more information, see our EC-TEL 2014 paper:

S. Fazeli, B. Loni, H. Drachsler, and P. Sloep, “Which Recommender system Can Best Fit Social Learning Platforms?,” in 9th European Conference on Technology Enhanced Learning, EC-TEL 2014, 2014, pp. 84–97.
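The general idea of an implicit user network can be sketched as follows: connect users whose similarity passes a threshold, then use a depth-limited breadth-first search to collect direct and indirect neighbours. This is only a generic illustration; the threshold, the use of Jaccard for the edge weights, and the search depth are assumptions, and the sketch does not reproduce the T-index approach described in the cited paper.

```python
from collections import deque


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_graph(interactions, threshold=0.2):
    """Nodes are users; an edge carries the similarity of two users."""
    graph = {user: {} for user in interactions}
    users = sorted(interactions)
    for idx, u in enumerate(users):
        for v in users[idx + 1:]:
            sim = jaccard(interactions[u], interactions[v])
            if sim >= threshold:
                graph[u][v] = sim
                graph[v][u] = sim
    return graph


def neighbours_within(graph, start, max_depth=2):
    """Breadth-first search returning each reachable user and its hop count."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if depth[user] == max_depth:
            continue
        for other in graph[user]:
            if other not in depth:
                depth[other] = depth[user] + 1
                queue.append(other)
    del depth[start]
    return depth


# Hypothetical users: ann and cat share no course but are linked through bob.
interactions = {
    "ann": {"c1", "c2"},
    "bob": {"c2", "c3"},
    "cat": {"c3", "c4"},
    "dan": {"c9"},
}
graph = build_similarity_graph(interactions)
reachable = neighbours_within(graph, "ann", max_depth=2)
```

The search reaches cat through bob even though ann and cat have no course in common, which is the kind of indirect neighbour a purely pairwise kNN lookup would miss on sparse data.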


Experiment

3. Settings

• Metrics
– Precision: the ratio of the number of relevant items recommended to the total number of recommended items
– Recall: the probability that a relevant item is recommended
– Both precision and recall range from 0 to 1.

• The number of courses in this experiment is 105, so the number of top-n items to be recommended is 5 (approx. 5% of the courses) and 10 (approx. 10% of the courses).

• For each memory-based CF algorithm, we evaluated six neighbourhood sizes (k={5,10,20,30,50,100}).
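Precision and recall for a top-n list can be computed as in the sketch below. The function name and the example lists are hypothetical; "relevant" here stands for the held-out items a user actually interacted with.

```python
def precision_recall_at_n(recommended, relevant, n):
    """Return (precision, recall) for the first n recommended items."""
    top = recommended[:n]
    hits = sum(1 for item in top if item in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical top-5 list and held-out relevant courses.
recommended = ["c1", "c2", "c3", "c4", "c5"]
relevant = {"c2", "c5", "c9"}
precision, recall = precision_recall_at_n(recommended, relevant, n=5)
```

With two of the five recommended courses being relevant, precision is 2/5 and recall is 2/3.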



1. Memory-based
• User-based with Jaccard (UB1)
• User-based with Cosine (UB2)
• Item-based with Jaccard (IB1)
• Item-based with Cosine (IB2)

2. Model-based
• MostPopular (MB1)
• Bayesian Personalized Ranking with Matrix Factorization (MB2)

3. Graph-based
• User-based with T-index (UB3)

Experiment

Which algorithm and parameters are best suited for the users of the Open U learning platform?

Experimental study

3. Results


Values for the highest-scoring neighbourhood size are in bold, the highest values among all are underlined

Discussions

H1: Item-based methods outperform user-based methods.

User-based CFs exceeded all expectations, contrary to what the recommender systems literature suggests.

• Item-based results were expected to trump the user-based ones, since the number of items (courses) is much smaller than the number of users in our dataset.

• User-based algorithms performed better on the Open U data than those that make use of similarities between items (courses).

• Therefore, we reject H1.


Discussions

H2: Matrix factorization methods outperform memory-based methods.

• The user-based recommenders (UB1, UB2, UB3), which are memory-based, widely outperform the model-based ones (MB1, MB2).

• We expected the matrix factorization (model-based) CFs to perform better, since they often achieve higher prediction accuracy, particularly when explicit user feedback is available (e.g. 5-star ratings).

• So we also reject H2.


• This study sought to find out how best to generate personalized recommendations from user activities within an open online course platform.

• The results show that user-based and memory-based methods perform better than item-based and model-based (factorization) methods.

• The UB1 algorithm (user-based with Jaccard) seems to be best suited to provide accurate recommendations to the users of our Open U platform.


Conclusion


Ongoing and Further work

1. Integrating the selected recommender algorithms in the OpenU platform to provide online recommendations.

2. Studying how the graph-based approach can help to improve the process of finding like-minded neighbours in terms of social network analysis (SNA)

3. User study
– To measure novelty and serendipity of the recommendations made for OpenU users.



soude.fazeli[at]ou[dot]nl

@SoudeFazeli