Post on 21-Dec-2015
Data Science and Analytics Curriculum development at Rensselaer
(and the Tetherless World Constellation)
NRC BigData Education Workshop
April 11-12, 2014, Washington DC
Peter Fox (RPI and WHOI/AOP&E) pfox@cs.rpi.edu, @taswegian
Tetherless World Constellation, http://tw.rpi.edu #twcrpi
Earth and Environmental Science, Computer Science, Cognitive Science, and IT and Web Science
tw.rpi.edu
Research Themes
Future Web
• Web Science
• Policy
• Social
Xinformatics
• Data Science
• Semantic eScience
• Data Frameworks
Semantic Foundations
• Knowledge Provenance
• Ontology Engineering Environments
• Inference, Trust
Hendler
Fox
McGuinness
Multiple depts/schools/programs ~ 35 (Post-doc, Staff, Grad, Ugrad)
Application Themes
Govt. Data
• Open
• Linked
• Apps
Env. Informatics
• Ecosystems
• Sea Ice
• Ocean imagery
• Carbon
Health Care/ Life Sciences
• Population Science
• Translational Med
• Health Records
Hendler/ Erickson
Fox
McGuinness
Platforms: Bio-nano tech center; Exp. Media and Perf. Arts Ctr.; Center for Comput. Innovation
Institute for Data Exploration and Applications http://idea.rpi.edu
http://tw.rpi.edu/web/Courses
[Figure: Data, Information, Knowledge continuum, with stages Creation/Gathering, Presentation/Organization and Integration/Conversation, set within Context and Experience. Courses mapped onto it: Data Science, Xinformatics, Semantic eScience, Web Science, GIS4Science, Data Analytics]
I teach and am involved:
• Data Science*, Xinformatics*, GIS for the Sciences*, Semantic eScience*, Data Analytics*, Semantic Technologies**
• School of Science
– ITWS and E&ES curriculum committees, SoS CC
– E&ES international student advisor
– Institute Faculty Fellow
• Institute-wide
– New Digital Humanities program
• Institute for Data Exploration and Applications
Data Science/ Xinformatics
Science has fully entered a new mode of operation. Data science is advancing the inductive conduct of science, driven by the greater volumes, complexity and heterogeneity of data being made available over the Internet. Data science combines aspects of data management, library science, computer science, and physical science using supporting cyberinfrastructure and information technology. As such it is changing the way all of these disciplines do both their individual and collaborative work. Data science is helping scientists face new global problems of a magnitude, complexity and interdisciplinary nature whose progress is presently limited by the lack of available tools and of a fully trained and agile workforce. At present, there is a lack of formal training in the key cognitive and skill areas that would enable graduates to become key participants in e-science collaborations. The need is to teach key methodologies in application areas based on real research experience and build a skill-set. At the heart of this new way of doing science, especially experimental and observational science but also increasingly computational science, is the generation of data.
In the last 2-3 years, informatics has attained greater visibility across a broad range of disciplines, especially in light of great successes in bio- and biomedical-informatics and significant challenges in the explosion of data and information resources. Xinformatics is intended to provide both the common informatics knowledge and an understanding of how it is implemented in specific disciplines, e.g. X = astro, geo, chem, etc. Informatics' theoretical basis arises from information science, cognitive science, social science, and library science as well as computer science. As such, it aggregates these studies and adds both the practice of information processing and the engineering of information systems. This course will introduce informatics and each of its components, and will ground the material that students learn in discipline areas through coursework and project assignments.
IT and Web Science
• First IT academic program in U.S.
• First web science degree program in U.S.
• BS in ITWS (20 concentrations) and MS in IT (10 concentrations)
• PhD in Multi-Disciplinary Sciences
• http://itws.rpi.edu
Technical Track Courses and Concentrations

Computer Engineering Track
1) ECSE-2610 Computer Components and Operations
2) ENGR-2350 Embedded Control
3) ECSE-2660 Computer Architecture, Networking and Operating Systems
Concentrations: Civil Engineering, Computer Hardware, Computer Networking (hardware focus), Mechanical/Aeronautical Eng.

Computer Science Track
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2300 Introduction to Algorithms
3) CSCI-2500 Computer Organization
Concentrations: Cognitive Science, Computer Networking (software focus), Information Security, Machine and Computational Learning

Information Systems Track
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2500 Computer Organization
3) Four credits from the following:
– CSCI-2220 Programming in Java (2 credits)
– CSCI-2961 Program in Python (2 credits)
– CSCI-2300 Introduction to Algorithms (4 credits)
– ITWS-49XX Web Systems Development II (4 credits)
Concentrations: Arts, Communication, Economics, Entrepreneurship, Finance, Management Information Systems, Medicine, Pre-law, Psychology, STS

Web Science Track
1) CSCI-2200 Foundations of Computer Science
2) CSCI-2500 Computer Organization
3) One of the following:
– CSCI-49XX Web Systems Development II
– Web/Data Course approved by ITWS Curriculum Committee
Concentrations: Data Science, Science Informatics, Web Technologies
CHANGES TO THE MASTER’S IN INFORMATION TECHNOLOGY PROGRAM
• In Spring 2013 the MS in IT core curriculum was revised to include Data Analytics.
• Networking core classes were replaced with Data Analytics core classes: Data Science, Database Mining, X-informatics, and Data Analytics (a new class offered in Spring 2014).
• The MS in IT program also added two new concentrations: Data Science and Analytics, and Information Dominance.
• The Information Dominance concentration was developed for a new Navy program that will equip a select group of 5-10 naval officers a year with the skills needed for military cyberspace operations. Two officers started in Fall 2013 and three began in Spring 2014.
MS in IT Required Core Courses
IT Core Area | Course Number | Course Title | Term(s) Offered
Database Systems | CSCI-4380 | Database Systems | Fall/Spring
Data Analytics | ITWS-6350 | Data Science | Fall
Software Design and Engineering | CSCI-4440 | Software Design and Documentation | Fall
 | ITWS-6400 | X-Informatics | Spring
Management of Technology* | ITWS-6300 | Business Issues for Engineers and Scientists (Professional Track Only) | Fall/Spring
Human Computer Interaction | COMM-6420 | Foundations of HCI Usability | Fall
 | COMM-696X | Human Media Interaction | Spring
* For the research track, replace ITWS-6300 Business Issues for Engineers and Scientists with one of the two semester courses ITWS-6980 Master’s Project or ITWS-6990 Master’s Thesis.
Advanced Core options for students who have previously completed a Core Course
IT Core Area | Course Number | Course Title | Term(s) Offered
Database Systems | CSCI-6390 | Database Mining | Fall
 | ITWS-6350 | Data Science | Fall
 | ITWS-696X | Semantic E-Science | Fall
Data Analytics | CSCI-6390 | Database Mining | Fall
 | ITWS-6400 | X-Informatics | Spring
 | ITWS-696X | Data Analytics | Spring
Software Design | CSCI-6500 | Distributed Computing Over the Internet | Fall
 | ECSE-6780 | Software Engineering II | Fall
 | ITWS-696X | Semantic E-Science | Fall
Management of Technology | MGMT-6080 | Networks, Innovation and Value Creation | Fall
 | MGMT-6140 | Information Systems for Management | Spring
Human Computer Interaction | COMM-6620 | Information Architecture | Spring
 | COMM-6770 | User-Centered Design | Fall
 | COMM-696X | Interactive Media Design | Summer
Two New MS in IT Concentrations

Concentration: Data Science and Analytics
Data and information analytics extends analysis (descriptive and predictive models to obtain knowledge from data) by using insight from analyses to recommend action or to guide and communicate decision-making. Thus, analytics is not so much concerned with individual analyses or analysis steps, but with an entire methodology. Key topics include advanced statistical computing theory, multivariate analysis, and applications from computer science courses such as data mining, machine learning, and change detection by uncovering unexpected patterns in data.
Select two or three of the following courses:
Course Number | Course Name | Term(s) Offered
ITWS-6350 | Data Science | Fall
ITWS-6400 | X-Informatics | Spring
ITWS-696X | Data Analytics | Spring
ITWS-696X | Semantic E-Science | Fall
ITWS-696X | Advanced Semantic Technologies* | Spring
If only two of the above were chosen, select one more of the following courses:
Course Number | Course Name | Term(s) Offered
COMM-6620 | Information Architecture | Spring
CSCI-4020 | Computer Algorithms | Spring
CSCI-4150 | Introduction to AI | Fall
CSCI-6390 | Database Mining | Fall
CSCI-4220 or CSCI-6220 | Network Programming or Parallel Algorithm Design | Spring
ISYE-4220 | Optimization Algorithms and Applications | Fall
ISYE-6180 | Knowledge Discovery with Data Mining | Spring
MGMT-696X | Technology Foundations for Business Analytics | Fall
MGMT-696X | Predictive Analytics Using Social Media | Spring

Concentration: Information Dominance
The Information Dominance concentration prepares students for careers designing, building, and managing secure information systems and networks. The concentration includes advanced study in encryption and network security, formal models and policies for access control in databases and application systems, secure coding techniques, and other related information assurance topics. The combination of coursework provides comprehensive coverage of issues and solutions for utilizing high assurance systems for tactical decision-making. It prepares students for careers ranging from secure information systems analyst, to information security engineer, to field information manager and chief information officer. It is also appropriate for all IT professionals who want to enhance their knowledge of how to use pervasive information in situational awareness, operations scenarios, and decision-making.
Select two or three of the following courses:
Course Number | Course Name | Term(s) Offered
ISYE-6180 | Knowledge Discovery with Data Mining | Spring
CSCI-6960 | Cryptography and Network Security I | Fall
ITWS-4370 | Information System Security | Spring
CSCI-4650 | Networking Laboratory I | Fall/Spring
MGMT-7760 | Risk Management | Fall
ISYE-4310 | Ethics of Modeling for Industrial Systems Engineering | Fall
If only two of the above were chosen, select one more of the following courses:
Course Number | Course Name | Term(s) Offered
CSCI-6390 | Database Mining | Fall
CSCI-6968 | Cryptography and Network Security II | Spring
CSCI-4660 | Networking Laboratory II | Fall/Spring
ECSE-6860 | Evaluation Methods for Decision Making | Fall
ISYE-6500 | Information and Decision Technologies for Industrial and Service Systems | Fall/Spring
CSCI-496X | Computational Analysis of Social Processes | Fall
Also at RPI
• Data Science Research Center and Data Science Education Center (dsrc.rpi.edu, 2009)
• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
– Over 45: research faculty, post-docs, grad students, staff, undergraduates…
• Data is one of the Rensselaer Plan’s five thrusts
• Other key faculty
– Fran Berman (Center for Digital Society and RDA)
– Bulent Yener (DSRC Director)
– Jim Hendler (IDEA Director)
More RPI Curricula
• Environmental Science with Geoinformatics concentration
• Bio, geo, chem, astro, materials - informatics
• GIS for Science
• Master of Science – Data Science?? (pending)
• Multi-disciplinary science program - PhD in Data and Web Science
• DATUM: Data in Undergraduate Math! (Bennett)
• Missing – intermediate statistics
• Graphs – significant potential here – must teach!
5-6 years in…
• Science and interdisciplinary from the start!
– Not a question of: do we train scientists to be technical/data people, or do we train technical people to learn the science
– It’s a skill/ course level approach that is needed
• We teach methodology and principles over technology *
• Data science must be a skill, and natural like using instruments, writing/using codes
• Team/ collaboration aspects are key **
• Foundations and theory must be taught ***
Challenging the “Heroic” Science Paradigm
This national and international has drawn attention to the need for a reassessment of priorities to recognize that, in the new data era, the burden of making data and information usable shifts from the user to the provider.