pg program in data science & data management systems

Post on 28-Dec-2021

PG PROGRAM IN DATA SCIENCE & DATA MANAGEMENT SYSTEMS

FOR RECENT GRADUATES AND EARLY CAREER PROFESSIONALS

In collaboration with: A program by:

INTRODUCTION

Every business, from agriculture to technology, generates large volumes of data in every process it runs.

In the words of Peter Drucker:

"What gets measured, gets managed."

The data generated by modern companies is an important asset that can be leveraged to make effective business decisions. Data Science enables businesses to draw meaningful insights from massive amounts of data. As organizations realise the importance of being data-driven in an increasingly digital world, there has been a sharp uptick in the demand for data scientists.

The curriculum and content for PGP-DSDMS are designed around learning by doing. Throughout the program, you will work on 5+ hands-on projects using the tools taught in the lectures. Alongside the projects, you will apply what you learn through assignments and quizzes.

The topics covered in this program fall into three areas: Data Science, Data Visualization, and Data Engineering, giving you well-rounded exposure to the most important Data Science concepts. The program is divided into 3 modules, each taking a comprehensive approach to the concepts covered, and the project at the end of each course gives you a holistic, problem-solving view of Data Science problems.

Over the course of the program, you will progress through the life cycle of a Data Science project. Starting with Data Preparation and Data Wrangling, you will learn how to build relationships between datasets and automate data transformations. You will then learn how a data pipeline can be created and run on a cloud service like AWS. You will also learn effective techniques for Data Visualization and Dashboarding using Tableau.

The final leg of the program lets you choose a specialization based on your area of interest. Elective A, Big Data Engineering, uses PySpark and Hadoop to process large volumes of data efficiently, while Elective B, Building Data Science Models, applies techniques such as regression, classification, and clustering to build solutions for business problems.

According to the U.S. Bureau of Labor Statistics, Data Science roles are expected to grow by more than 25% in the coming years. A recent survey by Analytics Insight projects more than 3 million new job openings in Data Science worldwide and puts the average Data Science salary at $110,000.

PROGRAM STRUCTURE*

The program will be delivered in an online format through recorded lectures and more than 45 hours of online, live-mentored learning sessions. The mentored learning sessions are conducted by industry experts, who will help you gain industry exposure. You will also work on projects to get a simulated experience of the real challenges faced by a data scientist.

• 6 MONTHS
• 5 HANDS-ON PROJECTS
• 1 INTEGRATIVE CAPSTONE PROJECT
• WEEKLY MENTORED LEARNING SESSIONS
• 18 WEEKLY MENTORED LEARNING SESSIONS
• 36+ ASSIGNMENTS AND QUIZZES
• 18 PRACTICE EXERCISES

*Refer to program fees for more information.

PROGRAM BENEFITS

• MOST SOUGHT-AFTER DATA ENGINEERING AND DATA ANALYTICS TOOLS
• MENTORED LEARNING SESSIONS WITH EXPERTS
• CASE STUDIES FOCUSING ON REAL WORLD BUSINESS SCENARIOS
• LAB SESSIONS AND PROJECTS FOR HANDS-ON EXPERIENCE
• CAPSTONE PROJECT TO CONSOLIDATE YOUR LEARNINGS THROUGHOUT THE PROGRAM
• CERTIFICATE FROM THE UNIVERSITY OF TEXAS AT AUSTIN

WHO IS THIS PROGRAM FOR?

The PG Program in Data Science and Data Management Systems is designed to build practical, hands-on skills across the tools and technologies that are in high demand and critical for early career professionals looking to land their first job in the Data Science domain. The program takes you through the entire Data Science value chain, which includes:

• Data Management
• Data Extraction
• Exploratory Data Analysis
• Data Visualization
• Leveraging Cloud Infrastructure
• Building Data Science Models
• Orchestrating Data Pipelines

Our learners in the DSDMS program are early career professionals, with more than 85% of the cohort having less than 2 years of experience. They come from varied backgrounds such as Information Technology, Banking, Pharma, Consulting and Research, with the drive to transform their careers in step with the rapidly growing Data Science industry.

Industry Trends

• $349 Billion global spending on Data Science in 2025. Source: IDC Spending Guide
• $110K average salary for Data Science roles. Source: Glassdoor
• 11.5 Million new jobs for Data Science professionals. Source: US Bureau of Labor Statistics
• 28% annual growth in Data Science jobs by 2026. Source: US Bureau of Labor Statistics

[Charts: Industry Growth; Hiring Companies. Source: Indeed]

YOUR PERSONAL CAREER SUCCESS TEAM

The PGP in Data Science and Data Management Systems is dedicated to ensuring the success of all participants, even beyond the lectures and curriculum. With this program, you will get access to GL Excelerate, a career support program exclusive to our PG Program learners.

• Career Sessions - Personal interactions with industry professionals and access to Career Workshops to gain valuable insights and guidance.

• Resume & LinkedIn Profile Review - An expert will review your resume and LinkedIn profile, and help build them so that you can achieve your career goals.

• Interview Preparation - Get an insider’s perspective to understand what recruiters look for when hiring for Data Engineering and Data Analytics roles.

Apart from this, you will also have access to a Career Workshop, where you will receive guidance on evaluating job opportunities, identifying your strengths and weaknesses and preparing your elevator pitch for prospective employers.

You will also get a chance to appear for a Mock Interview, where you will get an opportunity to understand the expectations of recruiters, and receive personalized feedback on your performance.

With these tools, the PG Program in Data Science & Data Management Systems enables you to take the right steps when it comes to your professional growth and career development.

CERTIFICATE

The University of Texas at Austin

Conferred to attest that

JOHN SMITH

has successfully completed the

Post Graduate Program in Data Science and Data Management Systems

June 2020

Gaylen Paulson, Associate Dean and Executive Director, Texas Executive Education

Kumar Muthuraman, Ph.D, Faculty Director, Data Science and Data Management Systems, Texas Executive Education

All certificate images are for illustrative purposes only. The actual certificate may be subject to change at the discretion of the university.

Hands-on practice sessions using Popular Industry Tools

and more..

COURSE CURRICULUM

MODULE 1
PYTHON & SQL FOR DATA MANAGEMENT (10 Weeks)

COURSE 0: PRE-WORK

You will learn all the essentials of Computer Science, Programming and Statistics to build a strong foundation before you start your learning journey.

ESSENTIALS OF COMPUTER SCIENCE
• Hardware
• OS
• Data Structures & Algorithms
• Programming

PYTHON FUNDAMENTALS
• Setup
• Variables
• Data Types
• Operators
• Functions
• Loops
• Object-Oriented Programming (OOP)

LINUX FUNDAMENTALS
• Basics of OS
• Protocols and Networking
• Basic Linux Commands

VERSION CONTROL
• Introduction to Git
• Features of Git
• Basic Commands
• GitHub

DESCRIPTIVE STATISTICS
• Measures of Central Tendency
• Measures of Dispersion

COURSE 1: PYTHON FOR DATA SYSTEMS (6 WEEKS)

In this course, you will learn how to connect to, extract and aggregate data from various data sources, clean it, perform Exploratory Data Analysis, and derive meaningful insights using Python.

DATA PREPARATION
• Data Connection and Data Read
• Data Formatting
• Missing Value Treatment
• Dataframe Operations

EXPLORATORY DATA ANALYSIS
• Graphs and Plots
• Univariate and Bivariate Analysis
• Correlation

WRANGLING UNSTRUCTURED DATA
• Web Scraping
• Data Cleaning
• Exception Handling

PROJECT 1

COURSE 2: SQL AND DATABASES (4 WEEKS)

In this course, you will learn how to query data in RDBMS and NoSQL databases, design schemas and relationships between tables, and automate data transformations in a database using Stored Procedures.

INTRODUCTION TO DBMS AND FUNDAMENTALS OF SQL
• Querying on SQL
• Functions
• Window Functions

DATA MODELING AND ARCHITECTURE
• ER Diagrams
• Schema Models
• Stored Procedures
• Views

NOSQL DATABASES
• File Formats and Comparison
• Introduction to MongoDB
• SQL Operations

PROJECT 2

MODULE 2
DATA ANALYTICS & AUTOMATION (9 WEEKS)

COURSE 3: DATA ANALYTICS ON CLOUD (6 WEEKS)

In this course, you will learn how to orchestrate a data pipeline and navigate the AWS cloud infrastructure, leveraging big data services to define and solve a business problem end to end: from gathering data requirements, to identifying drivers by formulating hypotheses, to presenting the insights in a Markdown document.

INTRODUCTION TO CLOUD INFRASTRUCTURE
• Cloud9 IDE
• Cluster Compute Services
• Storage and Databases

AIRFLOW FOR DATA PIPELINE MANAGEMENT - PART 1
• Data Orchestration
• DAG
• Code Architecture
• UI

FOUNDATIONS OF STATISTICS
• Inferential Statistics
• Distributions
• Sampling
• CLT
• A/B Testing

HYPOTHESIS TESTING
• Interpreting p-values
• Errors
• Parametric Tests: t-Test and Chi-Square Test

PROJECT 3

COURSE 4: DATA VISUALIZATION USING TABLEAU (3 WEEKS)

In this course, you will learn how to tell stories using data and create stunning dashboards with relevant visualizations to meet business needs using Tableau.

DATA, STORIES AND DASHBOARDING
• Visual Analytics
• Design Principles

TABLEAU - A BI TOOL
• Architecture
• Data Preparation
• Calculations
• Actions
• Performance Optimization

PROJECT 4

MODULE 3
SPECIALIZATION (8 WEEKS)

COURSE 5 (4 WEEKS)

ELECTIVE A
BIG DATA ENGINEERING

Learn how to navigate and build solutions on the cloud by leveraging the Hadoop ecosystem, and use PySpark to process huge volumes of data efficiently.

INTRO TO HADOOP AND BIG DATA ECOSYSTEM
• HDFS
• YARN
• Sqoop
• Hive Fundamentals

DATA PROCESSING USING SPARK
• Hadoop vs Spark
• Spark Architecture
• Launch Modes
• RDDs

DATAFRAMES WITH SPARK SQL
• Dataframes
• Resource Allocation
• Partitioning
• Persistence

SPARK JOB OPTIMIZATION
• Memory Management
• Dynamic Allocation
• Compression
• Shuffle

PROJECT 5

ELECTIVE B
BUILDING DATA SCIENCE MODELS

Learn how to apply industry-relevant Data Science techniques such as Regression, Classification, Clustering, Dimensionality Reduction, etc. to solve real-world problems.

SUPERVISED LEARNING PT. 1
• Supervised Learning (SL) vs Unsupervised Learning (USL)
• Regression
• Evaluation Metrics

SUPERVISED LEARNING PT. 2
• Classification
• Linear vs Logistic
• Decision Trees
• Confusion Matrix

UNSUPERVISED LEARNING
• K-means
• K-modes
• K-prototypes
• Elbow Curve
• Silhouette

MODEL TUNING
• Bias-Variance Trade-off
• Underfitting vs Overfitting
• K-fold Validation

PROJECT 5

CAPSTONE PROJECT (4 WEEKS)

A comprehensive project that draws on all the tools and techniques you have learnt in this program. With expert assistance, you will learn how to solve and manage real-world Data Science problems.

This program also introduces you to advanced data science topics, which can be learnt at your own pace. These topics will bolster your understanding of Data Science, and will give you a competitive edge when applying for jobs and appearing for interviews.

COURSE 6: MODEL DEPLOYMENT (SELF-PACED)

In this course, you will learn model deployment techniques and make your model scalable, robust, and reproducible.

1. Model Deployment: Flask, Amazon SageMaker

2. Containerization using Docker: Productionalization

3. Container Orchestration: Kubernetes

COURSE 7: CORE STATISTICS (SELF-PACED)

In this course, you will learn how to perform a variety of statistical tests and the math behind them.

1. Tests for Normality: Shapiro-Wilk Test, Anderson-Darling Test, D’Agostino’s K² Test

2. Parametric Tests: ANOVA, ANCOVA, Paired Student’s t-Test

3. Non-Parametric Tests: Mann-Whitney U Test, Wilcoxon Signed-Rank Test, Kruskal-Wallis H Test

4. Tests for Correlation: Pearson’s Correlation Test, Spearman’s Rank Correlation Test

PROGRAM FEES

PROGRAM FEE - USD 2500

The Post Graduate Program in Data Science and Data Management Systems has a series of modules designed to help you build your Data Science career.

Each module builds on the previous one and gives you a deeper understanding of the technologies prevalent in Data Science. You cannot begin an advanced module without completing the previous modules.

You can choose between 3 learning paths, each designed to kick-start your career in Data Science. Starting with Unit I - Data Management Fundamentals, the program gives you the option to begin your learning journey at just USD 750. Get in touch with your Program Advisor to learn more about the Units and how you can benefit from each of them.

You need to successfully complete all 3 modules and the Capstone Project at the end of Module 3 to receive a Certificate from McCombs School of Business.

Unit I
Data Management Fundamentals
USD 750
Duration - 8 Weeks
Access to Career Workshops and Webinars

OR

Unit II
Advanced Data Management Systems
USD 1500 USD 1750
Duration - 17 Weeks
Career Workshops + Resume & LinkedIn Review by Professionals

OR

Program
Data Science and Data Management Systems
USD 2500 USD 3000
Duration - 26 Weeks
Career Workshops + Resume & LinkedIn Review + 1:1 Career Sessions by Industry Experts

COURSE PROJECTS

Here are a few sample projects to give you a glimpse into the program:

MOVIELENS DATA EXPLORATION

Industry: Entertainment

Summary: The GroupLens Research Project is a research group in the Department of Computer Science and Engineering at the University of Minnesota. In this project, you will perform exploratory data analysis to understand the popularity trends of movie genres and derive patterns in movie viewership.

Tools & Concepts

PERSONAL LOAN CAMPAIGN

Industry: Banking

Summary: You will build a model that helps to identify potential customers of a bank who have a higher probability of purchasing a loan.

Tools & Concepts

CALL DROP ANALYSIS

Industry: Telecom

Summary: This project involves identifying the major reasons for call drops at a telecom company. A large volume of call record data is analysed using big data technologies to identify those reasons and recommend improvements to the company's services.

Tools & Concepts

Supervised Learning, etc.

TAXI DEMAND PREDICTION

Industry: Transportation

Summary: Understanding taxi supply and commuter demand, especially the imbalance between the two, directly helps to improve the quality of taxi service and eventually increase the efficiency of a city’s traffic system. As part of this project, you will use Python & Big Data tools to analyze the demand for taxis at specific times of the day and under specific weather conditions.

Tools & Concepts

E-PORTFOLIO

SHOWCASE YOUR SKILLS WITH AN E-PORTFOLIO

The E-Portfolio summarizes all the projects you will undertake and the tools you will learn during the program, helping you stand out from other applicants in the highly competitive Data Science industry.

View Sample E-Portfolio here

PROGRAM FACULTY

Kumar Muthuraman, H. Timothy (Tim) Harkins Centennial Professor; Faculty Director, Center for Research and Analytics, McCombs, University of Texas at Austin; M.S. & Ph.D., Stanford University

Dan Mitchell, Assistant Professor, McCombs School of Business; Ph.D., The University of Texas at Austin

Ashish Agarwal, Assistant Professor, McCombs School of Business; Ph.D., Tepper School of Business, Carnegie Mellon University

MENTORS

BECOME INDUSTRY-READY WITH LIVE MENTORSHIP

Along with strong theoretical foundations, hands-on learning goes a long way in preparing you to solve real-world business problems. As you work on real-life projects, you will receive personalised live mentorship every weekend from industry experts in the Data Engineering and Data Analytics domains.

Ali Soleymani - Lead Data Scientist at Task Resource Ltd.

Hossein Kalbasi - Data Engineer at Concured

Mohammad Amini - Data & Applied Scientist II at Microsoft

ADMISSION PROCESS

Admissions are conducted on a rolling basis, and the admission process closes once the requisite number of candidates has been enrolled in the program.

1. Fill in a simple online application form.

2. Wait for the admission committee & faculty panel to review your application.

3. If selected, you will receive a letter of admission for the upcoming cohort.

READY TO ADVANCE YOUR CAREER?

Have questions about the program or how it fits in with your career goals?

SPEAK TO A PROGRAM ADVISOR

+1 512 559 1644

Email: [email protected]

APPLY NOW