Vertical Search for Courses of UIUC
by Jessica Bell, Alexander Loeb, Sharon Paradesi, Michael Paul,
Jing Xia, Jie Zhang
Demo
http://greedy.cs.uiuc.edu/dssi/course/search.php
Goals of the project
- Construct a database of UIUC courses across all departments, ultimately creating a centralized knowledge base about each course.
- Augment the database by drawing relations between courses, both within and between departments, and by finding similarities with courses outside the University of Illinois.
Architecture
- Data sources: Course Catalog, Book Store, Webpages, Other Universities
- Tools: PHP script, JAVA script, AgentIDE, Heritrix, WEKA
- Database: Basic Course Info, Book Info, Course homepage, Keywords, Related Courses
- Query interface (PHP): query by Course Name, Instructor, Description, …
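The database contents listed on this slide could be laid out roughly as below. This is a minimal sketch, not the project's schema: `sqlite3` stands in for MySQL so the example runs on its own, and every table name, column name, and sample value is hypothetical.

```python
import sqlite3

# sqlite3 stands in for MySQL here; all table and column names are hypothetical.
SCHEMA = """
CREATE TABLE course (
    id          INTEGER PRIMARY KEY,
    code        TEXT NOT NULL,      -- e.g. "CS 125"
    name        TEXT NOT NULL,
    instructor  TEXT,
    description TEXT,
    homepage    TEXT
);
CREATE TABLE book (
    course_id INTEGER NOT NULL REFERENCES course(id),
    title     TEXT NOT NULL
);
CREATE TABLE keyword (
    course_id INTEGER NOT NULL REFERENCES course(id),
    word      TEXT NOT NULL,
    score     REAL
);
CREATE TABLE related_course (
    course_id  INTEGER NOT NULL REFERENCES course(id),
    related_id INTEGER NOT NULL REFERENCES course(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder row, for illustration only.
conn.execute(
    "INSERT INTO course (code, name, instructor) VALUES (?, ?, ?)",
    ("CS 125", "Example course name", "Example instructor"),
)

# A "query by instructor" lookup, as the front end might issue it.
rows = conn.execute(
    "SELECT code FROM course WHERE instructor LIKE ?", ("%instructor%",)
).fetchall()
```

Keeping books, keywords, and related courses in separate tables keyed by course lets each data source fill its own table independently.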
Tools used
- Web Crawling: Wget, AgentIDE and Heritrix
- Parsers: Python and Java
- Learning Tools: WEKA
- Website Design: PHP and MySQL
Tasks finished
- Data Mining: basic course information, similar course recommendation, prerequisite course list, recommended book information
- Learning: clustering, classification
Keywords
- Pull from course descriptions
- Remove uninformative/common words
Keywords (contd.)

| Word | Score | Word | Score |
|---|---|---|---|
| topics | 0.1328 | fruits | 0.6453 |
| their | 0.1352 | horticultural | 0.6453 |
| problems | 0.1370 | agricultural | 0.6454 |
| basic | 0.1373 | russian | 0.6478 |
| techniques | 0.1439 | doctorate | 0.6489 |
| students | 0.1457 | speaker | 0.6489 |
| is | 0.1494 | meteorological | 0.6492 |
| are | 0.1505 | anthropology | 0.6493 |
| analysis | 0.1531 | institute | 0.6498 |
| special | 0.1531 | reflective | 0.6498 |
| areas | 0.1556 | later | 0.6508 |
| graduate | 0.1563 | weather | 0.6513 |
| research | 0.1586 | protein | 0.6514 |
| be | 0.1586 | mobilization | 0.6514 |
| various | 0.1589 | authentic | 0.6514 |
| methods | 0.1600 | romance | 0.6514 |
| selected | 0.1618 | libraries | 0.6561 |
| current | 0.1625 | became | 0.6563 |
| advanced | 0.1651 | novelists | 0.6563 |
| that | 0.1651 | colonization | 0.6563 |
| concepts | 0.1668 | initiatives | 0.6563 |
| both | 0.1731 | revisit | 0.6563 |
| development | 0.1744 | churches | 0.6563 |
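The slides do not state how these scores were computed. As a minimal sketch of the general idea, assume a normalized inverse document frequency: words that occur in many course descriptions score low, and words that occur in few score high. This measure is an assumption, not the project's formula, and the function name `informativeness_scores`, the 0.5 cutoff, and the sample descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def informativeness_scores(descriptions):
    """Score each word by normalized inverse document frequency (assumed measure).

    A word found in every description scores 0.0; a word found in only
    one description scores 1.0.
    """
    n_docs = len(descriptions)
    doc_freq = Counter()
    for text in descriptions:
        # Count each word at most once per description.
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))
    if n_docs < 2:
        return {word: 1.0 for word in doc_freq}
    return {
        word: math.log(n_docs / df) / math.log(n_docs)
        for word, df in doc_freq.items()
    }


# Hypothetical course descriptions, for illustration only.
descriptions = [
    "Basic concepts and methods of horticultural research",
    "Advanced topics and methods of weather analysis",
    "Selected topics and current research of protein analysis",
]
scores = informativeness_scores(descriptions)

# Keep only words above an arbitrary cutoff as keywords.
keywords = {word for word, score in scores.items() if score > 0.5}
```

On this toy input, words shared by all three descriptions ("and", "of") score 0.0 and are dropped, while words unique to one description ("horticultural", "weather", "protein") score 1.0 and are kept.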
Search
- Search by name, instructor, or content
- Clean up search string
  - “cs125” becomes “CS 125”
  - “real-time” becomes “real time realtime”
- Split search string into individual words and query database for word matches
- Score and rank results by match frequencies and keyword informativeness scores
- Look at distribution of scores and display the top results
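The cleanup and ranking steps above can be sketched as follows. The two regular expressions generalize from the two examples on the slide and are assumptions, as is the scoring rule (match count times keyword score, with 1.0 for words that have no score). The names `clean_query` and `rank_courses` and the sample data are hypothetical, an in-memory dictionary stands in for the database, and the final cutoff based on the score distribution is not modeled.

```python
import re


def clean_query(query):
    """Normalize a raw search string (rules inferred from the two slide examples)."""
    # "cs125" -> "CS 125": uppercase a department code fused to a course number.
    query = re.sub(
        r"\b([A-Za-z]{2,4})\s*(\d{3})\b",
        lambda m: f"{m.group(1).upper()} {m.group(2)}",
        query,
    )
    # "real-time" -> "real time realtime": keep both the split and joined forms.
    query = re.sub(
        r"\b(\w+)-(\w+)\b",
        lambda m: f"{m.group(1)} {m.group(2)} {m.group(1)}{m.group(2)}",
        query,
    )
    return query


def rank_courses(query, courses, keyword_scores):
    """Rank courses by the sum over query words of (match count x keyword score)."""
    words = clean_query(query).lower().split()
    ranked = []
    for course_id, text in courses.items():
        tokens = text.lower().split()
        # Assumed rule: words without a known score get a weight of 1.0.
        score = sum(tokens.count(w) * keyword_scores.get(w, 1.0) for w in words)
        if score > 0:
            ranked.append((course_id, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical data, for illustration only.
courses = {
    "CS 125": "cs 125 introduction to computer science",
    "CS 423": "operating systems design real time systems",
}
keyword_scores = {"real": 0.4, "time": 0.3, "realtime": 0.6}

results = rank_courses("real-time", courses, keyword_scores)
```

Expanding a hyphenated term into its split and joined forms lets one query match descriptions that spell the term either way.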
Classification
- NBTree Classifier
- Training set: 34 instances
- Test set: 38 instances
- Attributes: 17
- Accuracy: 94.74%
- Precision: 0.947
- Recall: 0.947
- F-Measure: 0.947
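As a quick consistency check on these figures, 94.74% of 38 test instances corresponds to 36 correct predictions. The snippet below is only that arithmetic; the count of 36 is inferred from the reported numbers and is not stated on the slide.

```python
test_instances = 38         # from the slide
reported_accuracy = 0.9474  # from the slide (94.74%)

# Inferred, not stated on the slide: nearest whole number of correct predictions.
correct = round(reported_accuracy * test_instances)
recomputed = correct / test_instances

print(correct, f"{recomputed:.2%}")  # prints: 36 94.74%
```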
Clustering
- Cobweb Clustering Algorithm
- Instances: 20
- Attributes: 112
- Number of clusters: 17
- Incorrectly clustered instances: 7 (35%)