predicting hospital readmissions

Post on 20-Mar-2017


Category:

Data & Analytics


Predicting Hospital Readmissions

MANGT 665
Business Analytics & Data Mining

Prof. Bongsug Chae
Final Project

Derek Christensen
Dec 8, 2016

Telling a Story with Data* (Communicating effectively with analytics)

• Summary
• Recommendations
• Implications of Results
• Outline of Research Process

*Deloitte Review by Thomas H. Davenport

Problem Description & Introduction

• Background: Hospitals are penalized for patients who are re-admitted less than 30 days after they are released.
• Business Objectives: To reduce or eliminate the number of patients re-admitted less than 30 days after they are released.
• Success Criteria: Identification of factors that increase the likelihood of a patient returning within 30 days.
• Business Value: The average cost in 2011 for a hospital stay was $10,000.*

*http://www.beckershospitalreview.com/finance/11-statistics-on-average-hospital-costs-per-stay.html

Key Findings & Insights

• Random Forest = 99.23%

Final Analysis & Recommendation

Next Steps

- Analyze patients re-admitted close to the 30-day threshold, i.e. 31 to 45-60 days
- Weight the data
- Cross-reference between the 3 diagnoses
- Analyze the order of the 3 diagnoses
- Add more diagnoses
- Be more granular in the diagnoses
- ?

Dataset

• Description: The dataset contains over 56,000 HIPAA-compliant, de-identified records of hospital admissions.
• Source: Hack K-State 2016: Data Science For Social Good - https://zslie.github.io/
• Details: There are 50 columns: the Visit ID and the Patient ID, along with 48 factors.
• Factors: The factors have a varying number of attributes, ranging from 1 to 715, so there are ~5.27x10^41 possible combinations.
• Factors: Descriptions below.
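The combination count quoted above is presumably the product of the number of attributes per factor. A minimal sketch of that arithmetic follows; the list of counts is a made-up illustration (only the range 1 to 715 comes from the slide), and `count_combinations` and `as_scientific` are hypothetical helper names.

```python
import math


def count_combinations(levels_per_factor):
    """Multiply the number of distinct attributes of every factor."""
    if any(n < 1 for n in levels_per_factor):
        raise ValueError("every factor needs at least one attribute")
    return math.prod(levels_per_factor)


def as_scientific(n):
    """Format a count in scientific notation with two decimals."""
    return f"{n:.2e}"


# ASSUMPTION: illustrative attribute counts for three factors only,
# not the 48 real factors of the dataset.
example_levels = [2, 3, 715]
total = count_combinations(example_levels)
print(total, as_scientific(total))
```

Feeding in the real per-factor counts (one integer per factor) would reproduce the order of magnitude reported on the slide, if the figure was indeed computed this way.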

ETL

• Performed some data manipulation directly in Excel, including:
• Changed 'medical_specialy' to 'MED_SPEC_NUM'
• Changed the 3 'diag_x's to 'DIAG_CAT_X's and converted 858 unique diagnoses into 33 Diagnosis Categories
• Notes are in the Challenge_1_Training_Data_Conversion.xlsx file on the "Storage" page

Key Business Data Question Summary

• Of 56,000 hospital visits in this database:
• 6,285 were re-admitted < 30 days - these are the instances that need to be solved for
• 19,477 were also re-admitted, but after the 30-day threshold
• 30,238 were not re-admitted - there could also be some insight gleaned from why they DIDN'T have to be re-admitted
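As a quick check on the breakdown above, the three counts can be turned into shares of all visits. This sketch uses only the numbers on the slide; the dictionary keys are illustrative labels, not column values from the dataset.

```python
# Visit counts as reported on the slide; the key names are illustrative.
counts = {
    "re-admitted < 30 days": 6285,
    "re-admitted > 30 days": 19477,
    "not re-admitted": 30238,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {counts[label]:,} visits ({share:.2%})")
print(f"total: {total:,}")
```

The counts add up to exactly 56,000, and the target group (re-admitted in under 30 days) is the smallest of the three at roughly 11% of visits.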

Exploratory Analysis

• The correlations of the preliminary candidate factors changed when measured against readm2 instead of readmitted:
• number_emergency = 0.103321 ==> No longer shows significant correlation, now at 0.053
• number_inpatient = 0.233149 ==> Now the only one showing any significant correlation, at 0.162
• number_diagnoses = 0.103885 ==> No longer shows significant correlation, now at 0.045
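The slides do not say how these correlations were computed. A minimal sketch, assuming they are Pearson coefficients between a numeric factor and a 0/1 readmission flag, is shown here; the toy rows are invented for illustration and `pearson` is a hypothetical helper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# ASSUMPTION: made-up toy rows, NOT taken from the dataset.
# readm2 is treated here as a 0/1 flag, which the slides do not confirm.
number_inpatient = [0, 0, 1, 3, 0, 2, 0, 5]
readm2 = [0, 0, 0, 1, 0, 1, 0, 1]

r = pearson(number_inpatient, readm2)
print(f"number_inpatient vs readm2: r = {r:.3f}")
```

Running the same function once against each target column would give the before and after pairs listed above, under the stated assumption.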

Model Building

• Decision Tree – 88.97%
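One way to put the model scores in context is to compare them with a majority-class baseline. The sketch below assumes (the slides do not state this) that the percentages are classification accuracy on a binary "re-admitted in under 30 days" target and that the evaluation data has the same class balance as the 56,000 visits.

```python
# Class counts as reported in the deck.
readmitted_within_30 = 6285
readmitted_after_30 = 19477
not_readmitted = 30238
total = readmitted_within_30 + readmitted_after_30 + not_readmitted

# ASSUMPTION: binary target, positive class = re-admitted in under 30 days.
# A model that always predicts "not re-admitted in under 30 days" is right
# for every visit outside the positive class.
baseline_accuracy = (readmitted_after_30 + not_readmitted) / total

# Scores as reported in the deck, ASSUMED here to be accuracy.
reported = {"Decision Tree": 0.8897, "Random Forest": 0.9923}
lift = {name: acc - baseline_accuracy for name, acc in reported.items()}

print(f"majority-class baseline: {baseline_accuracy:.2%}")
for name, gain in lift.items():
    print(f"{name}: {reported[name]:.2%} ({gain:+.2%} vs baseline)")
```

Under these assumptions the baseline is about 88.78%, so the Decision Tree score sits only about 0.2 percentage points above always predicting the majority class, while the Random Forest score is roughly 10 points above it.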

Interesting questions about the data – what does it mean??

Questions
