To predict the academic performance of an elementary school using Linear Regression in SAS


Upload: kamalika-some

Post on 14-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: To predict the academic performance of an elementary school using Linear Regression in SAS

Predict Academic Performance
(Using multiple regression in SAS)

Submitted by Team Gamma: Amrapalli Karan, Kamalika Some, Krishanu Mukherjee, Somenath Sit

Page 2: To predict the academic performance of an elementary school using Linear Regression in SAS

Data description

• The dataset has performance data for 370 students of a particular junior college.

• The attributes are spread across 6 different datasets. Together they contain attributes such as name, roll no., age, math score, regional language score, family support, and the educational background of the mother and father.

• The two subject scores, along with the number of classes absent and the number of failures, are stored in 2 separate datasets: one for mathematics and one for regional language.

• There are some missing values in all the datasets.

Page 3: To predict the academic performance of an elementary school using Linear Regression in SAS

Diagram of Datasets

• School_Math: Roll_No, Names, Math, failures_math, absences_math

• School_Regional: Roll_No, Names, Reg_Sub, failures_reg, absences_reg

• School_Family: Roll_No, Names, famsize, Pstatus, Medu, Fedu, Mjob, Fjob, famsup

• School_Other: Roll_No, Names, internet, famrel, health, schoolsup, traveltime, reason, nursery

• School_Behave: Roll_No, Names, romantic, activities, goout, Dalc, Walc, higher, freetime, studytime

• School_Basic: Roll_No, Names, sex, age, address, guardian
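Since every dataset carries Roll_No, the six tables can be joined on that key. The deck does this in SAS and does not show the code; the following is only a minimal Python sketch of the idea. The rows and values are hypothetical, and `merge_on_roll_no` is an illustrative helper name, not something from the slides.

```python
# Hypothetical miniature versions of three of the six datasets.
# Column names follow the diagram; the values are made up for illustration.
school_basic = [
    {"Roll_No": 1, "sex": "F", "age": 16},
    {"Roll_No": 2, "sex": "M", "age": 17},
]
school_math = [
    {"Roll_No": 1, "Math": 62, "absences_math": 3},
    {"Roll_No": 2, "Math": 48, "absences_math": 7},
]
school_regional = [
    {"Roll_No": 1, "Reg_Sub": 70, "absences_reg": 1},
    {"Roll_No": 2, "Reg_Sub": 55, "absences_reg": None},  # None marks a missing value
]


def merge_on_roll_no(*tables):
    """Combine rows from several tables that share the same Roll_No."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row["Roll_No"], {}).update(row)
    # One record per student, ordered by roll number.
    return [merged[key] for key in sorted(merged)]


students = merge_on_roll_no(school_basic, school_math, school_regional)
```

Each element of `students` now holds all attributes for one student, with missing values still marked so they can be imputed later.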

Page 4: To predict the academic performance of an elementary school using Linear Regression in SAS

Our Approach

• Data pre-processing:
  • Impute missing values and verify.
  • Check for outliers.

• Merge variables for two different subjects (Math and regional language) to single variables.

• Convert categorical variables to dichotomous variables.

• Check correlations to see whether the overall score (math + regional) is correlated with the other variables.
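The pre-processing steps listed above can be sketched roughly as follows. This is an illustration in Python rather than the team's SAS code. It assumes mean imputation (the slides do not say which imputation method was used), uses made-up records, and skips the outlier check. The derived names `Over_All`, `Absences` and `famsup_yes` are chosen here for illustration.

```python
from statistics import mean

# Hypothetical merged records; None marks a missing value.
students = [
    {"Math": 60, "Reg_Sub": 70, "absences_math": 2, "absences_reg": 1,
     "famsup": "yes", "studytime": 3},
    {"Math": 45, "Reg_Sub": 50, "absences_math": None, "absences_reg": 6,
     "famsup": "no", "studytime": 1},
    {"Math": 75, "Reg_Sub": 80, "absences_math": 0, "absences_reg": 0,
     "famsup": "yes", "studytime": 4},
    {"Math": 50, "Reg_Sub": 58, "absences_math": 4, "absences_reg": 4,
     "famsup": "no", "studytime": 2},
]


def impute_mean(rows, column):
    """Replace missing values in a numeric column with the column mean (assumed method)."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    for r in rows:
        if r[column] is None:
            r[column] = fill


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


impute_mean(students, "absences_math")
for s in students:
    s["Over_All"] = s["Math"] + s["Reg_Sub"]                 # merge the two subject scores
    s["Absences"] = s["absences_math"] + s["absences_reg"]   # merge the two absence counts
    s["famsup_yes"] = 1 if s["famsup"] == "yes" else 0       # categorical -> dichotomous

r_study = pearson([s["studytime"] for s in students],
                  [s["Over_All"] for s in students])
```

A correlation close to +1 or -1 flags a variable as a candidate predictor of `Over_All`; values near 0 suggest little linear relationship.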

Page 5: To predict the academic performance of an elementary school using Linear Regression in SAS

Our Approach

• Find attributes which have major impact on overall academic performance of the students.

• Using these attributes, build a model which will predict academic performance of the students for the next year.

[Diagram: Over_All linked to Major Attribute 1, Major Attribute 2, Major Attribute 3 and several minor attributes]

Page 6: To predict the academic performance of an elementary school using Linear Regression in SAS

Model Building

Multiple Linear Regression

Multiple linear regression is a statistical technique that models the relationship between several explanatory variables and a response variable.
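To make the technique concrete, here is a small Python sketch that fits a multiple linear regression by solving the normal equations. It is not the SAS procedure used in the deck; `fit_ols` is a hypothetical helper, and the toy data are generated exactly from y = 1 + 2*x1 + 3*x2 so the recovered coefficients can be checked by eye.

```python
def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ..., bp] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend a column of ones for the intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (not the students' data).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
coefficients = fit_ols(X, y)  # approximately [1.0, 2.0, 3.0]
```

In the deck's setting, the response would be Over_All and the columns of `X` would be the candidate attributes.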

Page 7: To predict the academic performance of an elementary school using Linear Regression in SAS

Impact of Factors

Our statistical analysis found the following 4 variables/factors significant to our model:

| Variable | Description | Parameter Estimate | Impact |
|---|---|---|---|
| StudyTime | Weekly study time | 8.26852 | Positive |
| Go_Out | Going out with friends | -5.42282 | Negative |
| Health | Current health status of student | 5.21360 | Positive |
| Absences | Absent from Mathematics and Regional Language classes | -0.43608 | Negative |

Page 8: To predict the academic performance of an elementary school using Linear Regression in SAS

Impact of Factors in second model

Our statistical analysis found the following 5 variables/factors significant to our second model, and together they are more impactful for the prediction:

| Variable | Description | Parameter Estimate | Impact |
|---|---|---|---|
| StudyTime | Weekly study time | 9.01755 | Positive |
| Go_Out | Going out with friends | -5.57394 | Negative |
| Daily_Alcohol | Workday alcohol consumption | -1.30313 | Negative |
| Absences | Absent from Mathematics and Regional Language classes | -0.36300 | Negative |
| Health | Current health status of student | 4.81052 | Positive |

Page 9: To predict the academic performance of an elementary school using Linear Regression in SAS

Model Verification

Page 10: To predict the academic performance of an elementary school using Linear Regression in SAS

R-Square & MAPE

For the first model:

R-Square 0.7792

Adj R-Sq 0.7768

Adjusted R-Square is more than 77%, which supports the model’s validity.

MAPE (Mean Absolute Percentage Error):

Every model has a Mean Absolute Percentage Error value, which shows how accurate the model's predictions are. As the rule of thumb used here, if this value is more than 10 we cannot trust the model; otherwise we can.

MAPE = Mean(ABS((Actual value - Predicted value) / Actual value)) * 100

For the first model, MAPE is 9.11.
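The MAPE formula translates directly into code. The sketch below is in Python rather than SAS, and the actual and predicted values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios) * 100


# Hypothetical Over_All scores and model predictions (not the team's data).
actual = [100.0, 200.0, 150.0, 120.0]
predicted = [90.0, 210.0, 150.0, 126.0]

error = mape(actual, predicted)
# Threshold of 10 as used on this slide.
trustworthy = error <= 10
```

Here the individual percentage errors are 10%, 5%, 0% and 5%, so the mean is 5%, which is under the threshold of 10.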

Page 11: To predict the academic performance of an elementary school using Linear Regression in SAS

R-Square & MAPE

For the second model:

R-Square 0.7832

Adj R-Sq 0.7802

Adjusted R-Square is more than 78%, which supports the model’s validity.

MAPE (Mean Absolute Percentage Error):

For the second model, MAPE is 9.07.

Both MAPE and Adjusted R-Square have improved compared with the first model (MAPE from 9.11 to 9.07, Adjusted R-Square from 0.7768 to 0.7802).
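Adjusted R-Square penalizes R-Square for the number of predictors, which is why it is the fairer number to compare between a 4-variable and a 5-variable model. A short Python sketch of the standard formula follows. It assumes all 370 students were used in both fits, which the slides do not state explicitly; under that assumption the formula gives values that round to the reported 0.7768 and 0.7802.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_STUDENTS = 370  # assumption: every student in the dataset was used in the fit

model1 = adjusted_r_squared(0.7792, N_STUDENTS, 4)  # first model, 4 predictors
model2 = adjusted_r_squared(0.7832, N_STUDENTS, 5)  # second model, 5 predictors

second_model_is_better = model2 > model1
```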

Page 12: To predict the academic performance of an elementary school using Linear Regression in SAS

Normality Check

• In an ideal regression model, the residuals should follow a normal distribution.

• This graph shows the percentage of residual values in each range.

• The graph verifies that our model's residuals approximately follow a normal distribution.

[Figure: Distribution of Residuals]
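The deck checks normality visually from the residual histogram. As a numeric complement (an addition here, not the team's method), skewness and excess kurtosis of the residuals can be computed; both are close to 0 for normally distributed data. The residuals below are hypothetical.

```python
from statistics import mean


def skewness_and_excess_kurtosis(residuals):
    """Moment-based skewness and excess kurtosis (both about 0 for a normal distribution)."""
    n = len(residuals)
    m = mean(residuals)
    m2 = sum((r - m) ** 2 for r in residuals) / n
    m3 = sum((r - m) ** 3 for r in residuals) / n
    m4 = sum((r - m) ** 4 for r in residuals) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


# Hypothetical residuals (actual minus predicted Over_All), symmetric around zero.
residuals = [-3.0, -1.5, -0.5, 0.0, 0.5, 1.5, 3.0]
skew, excess_kurt = skewness_and_excess_kurtosis(residuals)
```

A skewness far from 0 would indicate a lopsided residual distribution, which the histogram would also show as a long tail on one side.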

Page 13: To predict the academic performance of an elementary school using Linear Regression in SAS

Final Model

• Our final model (Model 2) is:

Over_All = 137.11759 + 9.01755 * (StudyTime) - 5.57394 * (Go_Out) - 1.30313 * (Daily_Alcohol) + 4.81052 * (Health) - 0.36300 * (Absences)

• We have made an Excel sheet (similar to the table below) in which the user (the school) enters values for the attributes below and gets the predicted academic performance in the ‘Result (Over_All)’ column.

Predict Academic Performance for next year (Over_All)

| Variable | Description | Parameter Estimate | Enter Your Data | Result (Over_All) |
|---|---|---|---|---|
| StudyTime | Weekly study time | 9.01755 | | Predicted result will show here |
| Go_Out | Going out with friends | -5.57394 | | |
| Daily_Alcohol | Workday alcohol consumption | -1.30313 | | |
| Absences | Absent from Mathematics and Regional Language classes | -0.36300 | | |
| Health | Current health status of student | 4.81052 | | |
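The calculation behind the Excel sheet can be sketched in a few lines of Python. The intercept and coefficients are the ones reported for the second model; the function name `predict_over_all` and the example input values are hypothetical, and the slides do not state the scale on which each attribute is measured.

```python
INTERCEPT = 137.11759
COEFFICIENTS = {
    "StudyTime": 9.01755,
    "Go_Out": -5.57394,
    "Daily_Alcohol": -1.30313,
    "Health": 4.81052,
    "Absences": -0.36300,
}


def predict_over_all(inputs):
    """Apply the final model to a dict holding one value per attribute."""
    missing = set(COEFFICIENTS) - set(inputs)
    if missing:
        raise ValueError(f"missing attributes: {sorted(missing)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * inputs[name] for name in COEFFICIENTS)


# Hypothetical input values; the real scales are not given in the slides.
example = {"StudyTime": 2, "Go_Out": 3, "Daily_Alcohol": 1, "Health": 4, "Absences": 5}
predicted = predict_over_all(example)
```

Raising an error on a missing attribute mirrors a spreadsheet that shows no result until every input cell is filled.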

Page 14: To predict the academic performance of an elementary school using Linear Regression in SAS

Factor-wise Justification and Recommendation

Page 15: To predict the academic performance of an elementary school using Linear Regression in SAS

Health

• This is the health score for the students.

• Students who are healthy, physically as well as mentally, are better able to concentrate on their studies.

• Improving a student's health can directly improve his/her academic performance.

• This variable has a positive impact on the prediction: if it is higher, students will be able to attend more classes and focus on their studies, and can expect better results at the end.

Page 16: To predict the academic performance of an elementary school using Linear Regression in SAS

Average Study Time

• This attribute identifies the study time of each student in the junior college.

• From the second model, we can infer that students who spend more time studying after school perform better.

• More study time is bound to affect the final performance of the student.

• This variable has a positive impact on the prediction: if it is higher, we can expect better academic performance.

Page 17: To predict the academic performance of an elementary school using Linear Regression in SAS

Going Out

• This is the percentage of students who like to spend their time with friends and regularly go out.

• Students who go out regularly spend less time on their studies.

• Students who go out regularly may not be able to give their 100% to their studies, which can directly impact their academic performance.

• This variable has a negative impact on the prediction: if it is lower, we can expect better academic performance.

Page 18: To predict the academic performance of an elementary school using Linear Regression in SAS

Daily Alcohol Consumption

• This is the percentage of daily alcohol consumed per student.

• Students who are regular drinkers may not be able to give their 100% in class, which will directly affect their performance.

• An increase in the percentage of daily alcohol consumption means a decrease in overall performance.

• This variable has a negative impact on the prediction: if it is lower, students will be able to concentrate more on their studies and can expect better results.

Page 19: To predict the academic performance of an elementary school using Linear Regression in SAS

Absence from Class

• This is the percentage of students who remain absent from classes.

• Students who are absent regularly may not be able to give their 100% in class, which will directly affect their performance.

• An increase in the percentage of absences means a decrease in overall performance.

• This variable has a negative impact on the prediction: if it is lower, students will be attending more classes and can expect better results.

Page 20: To predict the academic performance of an elementary school using Linear Regression in SAS

Recommendation

• Although ‘Going Out’ and ‘Daily alcohol consumption’ are major factors in the model, it is not easy for the school to track and control each student.

• Health is one factor that the junior college can influence. The college can organize health check-up camps at regular intervals, arrange for on-call doctors, and distribute a few essential and regular medicines, so that the students’ health is not impacted.

• If the school arranges after-class tutorials for weaker students, their study time will increase, which will positively impact their academic performance.

Page 21: To predict the academic performance of an elementary school using Linear Regression in SAS

Thank You