CAL'09 - Brighton - Patrick Wessa Designing Statistical Learning Environments with Educational Compendium Technology



Outline

● Technology
● Reproducible Computing
● Compendium
● Compendium Platform
● Applications
● Design of SLE
● Empirical Findings
● Building Guidelines
● Educational Research
● Educational Quality Control


Claerbout's principle*

● An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and that complete set of instructions that generated the figures.

*Source: Jan de Leeuw


My question

● If academic statisticians find it hard (if not impossible) to verify or review the results in empirical papers, how could we possibly expect students to learn from statistical results without the proper tools to easily review, verify, or challenge them?


Reproducible Computing

● http://www.freestatistics.org

● http://www.wessa.net

● Wessa, P., “A framework for statistical software development, maintenance, and publishing within an open-access business model”, Computational Statistics, Springer Verlag, 2008

● Wessa, P., “Reproducible Computing: a new Technology for Statistics Education and Educational Research”, IAENG Transactions on Engineering Technologies, Volume II, American Institute of Physics, 2009, forthcoming


Setting up the course


A framework for statistical software development, maintenance, and publishing within an open-access business model, 2008, Computational Statistics, Springer


Computations are “blogged”


Error messages


Weekly assignments

Learning Statistics based on the Compendium and Reproducible Computing, Proceedings of the World Congress on Engineering and Computer Science 2008, ISBN: 978-988-98671-0-2, UC Berkeley, San Francisco, USA


Snapshot of “Blogged” Computation

Reproduce or Reuse at wessa.net

Cite the computation as follows
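The idea of a "blogged" computation that can be reproduced, reused, and cited might be sketched as follows. This is only an illustration under stated assumptions, not the actual wessa.net or freestatistics.org implementation: the function names (`snapshot_id`, `archive`, `reproduce`), the in-memory `store`, and the `trimmed_mean` module are all hypothetical. The sketch stores the module name, parameters, and data together, and derives a stable identifier from them so the same computation can be cited and rerun.

```python
import hashlib
import json


def snapshot_id(module, parameters, data):
    """Derive a stable identifier from everything needed to rerun a computation."""
    payload = json.dumps(
        {"module": module, "parameters": parameters, "data": data},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def archive(store, module, parameters, data):
    """Save a snapshot in the (hypothetical) store and return its citable key."""
    key = snapshot_id(module, parameters, data)
    store[key] = {"module": module, "parameters": parameters, "data": data}
    return key


def reproduce(store, key, modules):
    """Rerun an archived computation from its snapshot alone."""
    snap = store[key]
    return modules[snap["module"]](snap["data"], **snap["parameters"])


def trimmed_mean(data, trim=0):
    """Hypothetical statistical module: mean after dropping `trim` values per tail."""
    kept = sorted(data)[trim:len(data) - trim]
    return sum(kept) / len(kept)


MODULES = {"trimmed_mean": trimmed_mean}
store = {}
key = archive(store, "trimmed_mean", {"trim": 1}, [2, 4, 4, 5, 100])
print("cite as:", key)
print("reproduced result:", reproduce(store, key, MODULES))
```

Because the identifier depends on the module, parameters, and data, a reader who changes any of them ("reuse") obtains a new snapshot with a new key, while the original stays verifiable.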


Social Interaction, Collaboration, Networking, ...


Social Networks (“co-opetition”)


Fraud detection


Feedback (Peer Review)

Submitting Peer Review (feedback) is a good learning activity – not a good grading procedure

Lectures

● 13 weeks (semester)

● Week 1: Introduction (explanation) + workshop assignment

● Week 2-12: Workshops + Peer Assessments

● Week 13: Final Exam (multiple choice)

● Grades received from Peers do NOT count => there is no penalty for making mistakes!!

● The quality of feedback messages is graded by the educator

[Figure: course timeline. Columns: Week 1, Week 2, Week 3, Week 4, ... Rows: lectures L1 to L5, workshops WS1 to WS5, reviews Rev 1 to Rev 4, ..., ending with the Exam.]

Problem: separate threads of discussion

[Figure: course timeline. Columns: Week 1, Week 2, Week 3, Week 4, ... Rows: lectures L1 to L5, workshops WS1 to WS5, reviews Rev 1 to Rev 4, ..., ending with the Exam.]

Connected threads of discussion

[Figure: Computation 1 to Computation 6, linked into connected threads of discussion.]

4 cohorts, 2 years

Year 0    Bachelor   Prep. Progr.
Female    58         53
Male      53         76

Year 1    Bachelor   Prep. Progr.
Female    41         45
Male      42         74


Double Hierarchical Structure


Integrated Design

● Statistical Computation = Core Object of Study
● Statistical Computation = Core IT Object

=>

● Communication (peer review) should be a function of the Computation
● Hierarchical Parent-Child relationships between computations are maintained & can be browsed
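A minimal sketch of this design, assuming a simple in-memory model: each computation keeps a link to its parent and children, and peer-review messages are attached to the computation itself rather than to a separate forum. The class `Computation` and its methods `derive`, `lineage`, and `thread` are hypothetical names chosen for illustration and are not taken from the Compendium Platform.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Computation:
    """A computation as the core object; reviews hang off the computation."""
    label: str
    parent: Optional["Computation"] = field(default=None, repr=False)
    children: List["Computation"] = field(default_factory=list)
    reviews: List[str] = field(default_factory=list)

    def derive(self, label: str) -> "Computation":
        """Create a child computation (for example, a reuse with changed parameters)."""
        child = Computation(label, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Browse upward: labels from the root computation down to this one."""
        path, node = [], self
        while node is not None:
            path.append(node.label)
            node = node.parent
        return list(reversed(path))

    def thread(self) -> List[Tuple[str, str]]:
        """Collect reviews of this computation and all descendants, depth-first."""
        messages = [(self.label, text) for text in self.reviews]
        for child in self.children:
            messages.extend(child.thread())
        return messages


original = Computation("Computation 1")
reuse = original.derive("Computation 2")
follow_up = reuse.derive("Computation 3")
original.reviews.append("Check the outlier in the data.")
follow_up.reviews.append("Outlier removed; conclusion changes.")

print(follow_up.lineage())
for label, text in original.thread():
    print(label, "->", text)
```

Because reviews are stored on the computations, walking the parent-child tree yields one connected thread of discussion instead of separate weekly threads.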


Predictive Performance

                             Y0 (U)    Y0 (C)    Y1 (U)    Y1 (C)    Y1*(C)
Correctly Classified (RT)    75.0 %    82.9 %    75.7 %    88.6 %    90.1 %
Correctly Classified (CV)    44.6 %    72.9 %    36.6 %    75.2 %    80.2 %
Kappa Statistic (RT)         0.6015    0.5914    0.6259    0.7183    0.7345
Kappa Statistic (CV)         0.1382    0.386     0.0201    0.3863    0.4757
Number of leaves             29        13        36        11        7
Size of tree                 57        25        71        21        13

Peer Review Moodle Compendium Platform


Overfitting problems

● In-sample

=== Confusion Matrix ===

  a  b  c  d   <-- classified as
 13  1  4  0 |  a = Excellent
  1 73  8  0 |  b = Fail
  2 18 57  0 |  c = Guess
  4  7  4 10 |  d = Pass

Correctly Classified Instances         153               75.7426 %
Incorrectly Classified Instances        49               24.2574 %
Kappa statistic                          0.6259

● Out-of-sample

=== Confusion Matrix ===

  a  b  c  d   <-- classified as
  1 14  2  1 |  a = Excellent
  9 41 26  6 |  b = Fail
  4 38 30  5 |  c = Guess
  2 14  7  2 |  d = Pass

Correctly Classified Instances          74               36.6337 %
Incorrectly Classified Instances       128               63.3663 %
Kappa statistic                          0.0201
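The accuracy and kappa figures on this slide can be recomputed from the two confusion matrices. The sketch below assumes the standard definition of Cohen's kappa (observed agreement corrected for the agreement expected by chance from the row and column totals); the function name `accuracy_and_kappa` is chosen here for illustration.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are the actual classes, columns the predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, given the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Class order: Excellent, Fail, Guess, Pass (as on the slide).
in_sample = [
    [13, 1, 4, 0],
    [1, 73, 8, 0],
    [2, 18, 57, 0],
    [4, 7, 4, 10],
]
out_of_sample = [
    [1, 14, 2, 1],
    [9, 41, 26, 6],
    [4, 38, 30, 5],
    [2, 14, 7, 2],
]

for name, matrix in [("In-sample", in_sample), ("Out-of-sample", out_of_sample)]:
    accuracy, kappa = accuracy_and_kappa(matrix)
    print(f"{name}: accuracy = {100 * accuracy:.4f} %, kappa = {kappa:.4f}")
# In-sample: accuracy = 75.7426 %, kappa = 0.6259
# Out-of-sample: accuracy = 36.6337 %, kappa = 0.0201
```

The drop in kappa from 0.6259 to 0.0201 shows that the out-of-sample predictions are barely better than chance, which is the overfitting problem the slide points to.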

Year 0 (Corrected)

[Figure: decision tree. Recoverable node labels: active, non-active, male, female, bachelor, prep. progr.]

Year 1 (Corrected)

[Figure: decision tree. Recoverable node labels: active, non-active, drop-out, female, male, prep. progr., bachelor.]