gm533 final project paper

GM 533 Keller Graduate School of Management Date: 02/14/2010 Bryant/Smith Case 42: Hospital Charges

Upload: minu73

Post on 03-Mar-2015


Page 1: GM533 Final Project Paper

GM 533 Keller Graduate School of Management

Date: 02/14/2010

Bryant/Smith Case 42: Hospital Charges

EXECUTIVE SUMMARY  

This research addresses three questions: whether charges differ among physicians, whether the type of insurance a patient carries affects the charges, and whether charges are related to the length of stay, the payor and the physician. The data used to find a suitable model relating charges to payor, physician and number of days in the hospital come from the Bryant/Smith Case 42: Hospital Charges (Appendix 1). As described in the case, the hospital’s revenues are determined largely by the patients’ insurance coverage. The data cover normal deliveries of babies.

The null hypothesis states that there is no difference in mean charges between patients with managed care insurance and patients with commercial insurance. The alternative hypothesis states that patients with managed care insurance are charged more, on average, than patients with commercial insurance. A 95% confidence level was used for the hypothesis tests and the regression analysis.

Based on the hypothesis test comparing the two types of insurance, it is concluded that patients with managed care insurance are charged more than patients with commercial insurance.

A hypothesis test was also conducted to check whether charges differ among physicians. The test concluded that physician #2 has the highest mean charges among all the physicians.

Linear regression analysis was conducted to see whether CHRGS is related to DAYS, PHYS, and/or PAYOR. After running several models with different independent variables, it is concluded that there is a linear relationship between DAYS and CHRGS.

INTRODUCTION

The purpose of this report is to build a data model that answers the following questions: whether patients are charged more depending on their insurance, whether physicians differ in what their patients are charged, and whether the charges (CHRGS) are related to days (DAYS), physicians (PHYS) and/or payor (PAYOR).

The data used to prepare the report contain four variables: the length of the hospital stay in DAYS, the charges billed to the patient in CHRGS, the physician who treated the patient in PHYS, and the type of insurance the patient carries in PAYOR. The data were given in the Bryant/Smith Case 42: Hospital Charges.

ANALYSIS AND METHODOLOGY

The analysis uses descriptive statistics, hypothesis testing, confidence intervals and regression analysis (simple linear and multiple).


Descriptive statistics

As mentioned in the text, descriptive statistics is the science of describing the important aspects of a set of measurements (Bowerman). The measures of central tendency are the mean, the median and the mode. The measures of variation are the range, the variance and the standard deviation. As explained in the encyclopedia section of The Free Dictionary website, the arithmetic mean is found by adding the numbers and dividing the sum by the number of numbers in the list (Farlex, Inc). This is what is most often meant by an average. The median is the middle value in a list ordered from smallest to largest (Farlex, Inc). The mode is the most frequently occurring value on the list (Farlex, Inc).
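The three measures of central tendency can be illustrated with a minimal Python sketch. It uses only the first seven physician #2 rows of Appendix 1 as a toy sample, and Python's standard `statistics` module is an assumption of this illustration (the figures reported in this paper come from MegaStat).

```python
from statistics import mean, median, mode

# First seven rows of Appendix 1 (all physician #2), used only as a toy sample.
charges = [2607, 5063, 4903, 3418, 3604, 2324, 2953]
days = [2, 4, 4, 3, 3, 2, 2]

avg_charge = mean(charges)    # sum of the values divided by how many there are
mid_charge = median(charges)  # middle value once the list is sorted
typical_stay = mode(days)     # most frequently occurring value

print(f"mean charge   = {avg_charge:.2f}")
print(f"median charge = {mid_charge}")
print(f"mode of days  = {typical_stay}")
```

With an odd number of values the median is simply the middle entry of the sorted list; with an even number it is the average of the two middle entries.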

The hospital’s doctors are not employees of the hospital, but they control certain functions, such as prescribing medicines to the patients and deciding how long patients stay at the hospital for further treatment (Bryant-Smith, 2003).

I start the analysis by comparing the two types of insurance: managed care insurance and commercial insurance. The question asked in the case study is, “Do charges incurred by a patient depend on the type of insurance the patient has?” (Bryant-Smith, 2003). To answer this question, I began by running descriptive statistics with DAYS and CHRGS separated by insurance type, that is, patients who carry managed care insurance and patients who carry commercial insurance. There are 96 patients with commercial insurance and 193 patients with managed care insurance, so the data set covers 289 patients in total.

Comparison between Managed Care Insurance and Commercial Insurance

Let’s first look at managed care insurance. The mean of $2,966.44 is the average total charge for a hospital stay when the patient has managed care insurance (CHRGS is the total expense charged to the patient, not a daily rate). Similarly, the mean of 2.30 days is the average length of stay for patients carrying managed care insurance. For managed care insurance, the median is $2,789.00 for charges and 2 days for days. These are the middle values of the sorted charges and days for this insurance type. The mode is $2,840 for charges and 2 days for days.

Descriptive statistics

                                    DAYS           CHRGS
Count                                193             193
Mean                                2.30        2,966.44
Sample variance                     1.18    1,429,843.96
Sample standard deviation           1.09        1,195.76
Minimum                                1             929
Maximum                               14          14,898
Range                                 13          13,969
Population variance                 1.17    1,422,435.44
Population standard deviation       1.08        1,192.66

The measures of variation are the range, the variance and the standard deviation, and we now look at these values for managed care insurance. The range is the largest measurement minus the smallest measurement. By definition, the population variance is the average of the squared deviations of the individual population measurements from the population mean µ. The population standard deviation is the positive square root of the population variance.

For managed care insurance, the range is $13,969 for charges and 13 days for days. The population variance is 1,422,435.44 for charges and 1.17 for days. The population standard deviation is $1,192.66 for charges and 1.08 days for days. The range on its own is not very informative: because it is just the difference between the maximum and the minimum value, it does not represent the entire data set well.
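The definitions of range, variance and standard deviation can be sketched in Python as follows. The seven-value sample is again the first seven physician #2 charges from Appendix 1, chosen only for illustration. The last lines check that the sample variance reported for managed care charges, rescaled by (n - 1)/n, reproduces the reported population variance; this rescaling is a standard identity rather than something stated in the case.

```python
from statistics import pstdev, pvariance, variance

# Toy sample: first seven physician #2 charges from Appendix 1.
charges = [2607, 5063, 4903, 3418, 3604, 2324, 2953]

value_range = max(charges) - min(charges)  # largest minus smallest
pop_var = pvariance(charges)               # mean squared deviation (divide by n)
pop_sd = pstdev(charges)                   # positive square root of pop_var
sample_var = variance(charges)             # divides by n - 1 instead of n

# Managed care charges: sample variance and count as reported in the table.
reported_sample_var = 1_429_843.96
n = 193
implied_pop_var = reported_sample_var * (n - 1) / n

print(f"range = {value_range}, population sd = {pop_sd:.2f}")
print(f"implied population variance = {implied_pop_var:,.2f}")
```

The implied value agrees with the 1,422,435.44 shown in the descriptive statistics to within rounding.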

We are 95% confident that the mean charge for patients carrying managed care insurance is between $2,796.67 and $3,136.21. Similarly, we are 95% confident that the mean stay at the hospital for these patients is between 2.15 days and 2.45 days.
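A confidence interval of this kind is the sample mean plus or minus a multiplier times the standard error of the mean. The sketch below assumes a normal (z) multiplier because it needs only the Python standard library; the reported interval was evidently built with a slightly larger t multiplier, so the bounds differ by about a dollar.

```python
from math import sqrt
from statistics import NormalDist

# Managed care charges, as reported in the descriptive statistics.
n = 193
mean_chrgs = 2966.44
sample_sd = 1195.76

std_error = sample_sd / sqrt(n)        # standard error of the mean
z = NormalDist().inv_cdf(0.975)        # two-sided 95% multiplier (normal approximation)
half_width = z * std_error

lower = mean_chrgs - half_width
upper = mean_chrgs + half_width
print(f"standard error = {std_error:.2f}")
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```

The standard error matches the 86.07 reported in Appendix 2; only the multiplier is an approximation.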

Now let’s look at commercial insurance. The mean of $2,714.28 is the average total charge for a hospital stay when the patient has commercial insurance. Similarly, the mean of 2.02 days is the average length of stay for patients carrying commercial insurance. For commercial insurance, the median is $2,673.50 for charges and 2 days for days. These are the middle values of the sorted charges and days for this insurance type. For commercial insurance, no mode is available for charges because no value repeats, but the mode for days is 2 days.

As explained above for the range, variance and standard deviation, the range of charges for commercial insurance is $4,031, which is the difference between the maximum charge of $4,933 and the minimum charge of $902. Similarly, the range for days is 2 days, the difference between the maximum stay of 3 days and the minimum stay of 1 day. The population variance is 444,669.76 for charges and 0.33 for days. The population standard deviation is $666.84 for charges and 0.58 days for days.

Because the sample is large, we are 95% confident that the mean charge for patients carrying commercial insurance is between $2,578.46 and $2,850.10. Similarly, the mean stay is between 1.90 days and 2.14 days at the 95% confidence level.

Comparing the two groups, the descriptive statistics suggest that patients who carry managed care insurance are charged more than patients who carry commercial insurance. However, descriptive statistics alone are not enough to reach that conclusion, so I ran a hypothesis test to see whether the data support it.


Hypothesis Test

A hypothesis test is a statistical procedure that weighs the evidence for or against a hypothesized statement or claim. Here we compare two independent samples: patients with managed care insurance and patients with commercial insurance. I claim that managed care patients are charged more, and I ran a hypothesis test to see whether the data support this claim.

Let µ1 = mean charges for patients with managed care insurance and µ2 = mean charges for patients with commercial insurance

H0: µ1 = µ2 versus Ha: µ1 > µ2

The null hypothesis states that there is no difference between the mean charges for patients carrying managed care insurance and for patients carrying commercial insurance. The alternative hypothesis states that patients who carry managed care insurance are charged more, on average, than patients who carry commercial insurance.

I used a 2-sample t-test with unknown (and unequal) variances. This test compares the means of two independent samples whose variances differ. The managed care and commercial insurance samples have clearly different variances, so this test is suitable for determining whether managed care insurance is more expensive than commercial insurance.

I chose a significance level of alpha = 0.05 because I am running the tests at the 95% confidence level.

The test statistic is t = 2.30. The critical value is t.05 = 1.645.

Since t = 2.30 > t.05 = 1.645, we reject H0 at the 0.05 level of significance. That is, we reject the null hypothesis that there is no difference between the mean charges for patients with managed care insurance and for patients with commercial insurance.

In conclusion, the test supports the alternative hypothesis: on average, managed care insurance patients are charged more than commercial insurance patients.

At the significance level of alpha = 0.05, we have strong evidence that the null hypothesis of no difference between managed care insurance and commercial insurance is false.
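The test statistic can be reproduced from the summary figures in Appendix 4. The sketch below is a minimal Python version of the unequal-variance (Welch) t-test under two assumptions: it uses the standard deviations exactly as they appear in the MegaStat output, and it takes the critical value from the normal distribution, which is where 1.645 comes from. The helper name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, standard error) for a 2-sample t-test with unequal variances."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    std_error = sqrt(v1 + v2)
    t = (mean1 - mean2) / std_error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, std_error

# Managed care (group 1) versus commercial insurance (group 2), from Appendix 4.
t, df, std_error = welch_t(2966.44, 1192.66, 193, 2714.28, 666.84, 96)

critical = NormalDist().inv_cdf(0.95)  # one-tailed, alpha = 0.05
reject_h0 = t > critical

print(f"t = {t:.2f}, df = {int(df)}, standard error = {std_error:.2f}")
print(f"critical value = {critical:.3f}, reject H0: {reject_h0}")
```

With 283 degrees of freedom the t and normal critical values are nearly identical, which is why using 1.645 does not change the decision.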

Another question in the case study is whether the physicians differ in their charges. Looking at the data, one physician appears to have much higher charges than the others, and another appears to have the lowest charges.


The mean charge for physician #2 appears to be the highest among all the physicians. To check whether this impression is right, I conducted a 2-sample hypothesis test with unknown variances.

Let µ1 = mean charges for physician #2 and µ2 = mean charges for all other physicians

H0: µ1 = µ2 versus Ha: µ1 > µ2

The null hypothesis states that there is no difference between the mean charges of physician #2 and those of the other physicians. The alternative hypothesis states that physician #2 has higher mean charges than the other physicians.

The 2-sample test with unknown variances is conducted to decide between the two hypotheses.

The significance level chosen is alpha = 0.05.

Since t = 14.31 > t.05 = 1.645, we reject H0 at the significance level of alpha = 0.05.

In conclusion, we have strong evidence at the significance level of alpha = 0.05 that physician #2 has higher mean charges than the other physicians. One caution applies: the standard deviation of 2.70 entered for physician #2 (Appendix 6) is far too small for charges that range from $1,905 to $14,898, so the t statistic of 14.31 is likely overstated.
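Appendix 6 lists a standard deviation of 2.70 for physician #2, which is hard to reconcile with charges measured in thousands of dollars. As a check, the sketch below recomputes the mean and sample standard deviation directly from the 37 physician #2 rows in Appendix 1 and reruns the unequal-variance t statistic against the "all other physicians" summary from Appendix 5. It is an illustrative recalculation under those assumptions, not part of the original MegaStat output.

```python
from math import sqrt
from statistics import mean, stdev

# CHRGS for the 37 rows with PHYS = 2, copied from Appendix 1.
phys2_charges = [
    2607, 5063, 4903, 3418, 3604, 2324, 2953, 3709, 2138, 2681,
    3932, 3283, 3729, 3392, 14898, 3819, 4248, 1905, 2823, 2785,
    2921, 4933, 2804, 3287, 2048, 3617, 2219, 3381, 2310, 2907,
    2888, 2640, 3826, 2840, 3137, 2955, 2184,
]

# All other physicians: mean, sample standard deviation and count from Appendix 5.
other_mean, other_sd, other_n = 2793.58, 771.84, 252

phys2_n = len(phys2_charges)
phys2_mean = mean(phys2_charges)
phys2_sd = stdev(phys2_charges)  # sample standard deviation of the charges

std_error = sqrt(phys2_sd ** 2 / phys2_n + other_sd ** 2 / other_n)
t_recomputed = (phys2_mean - other_mean) / std_error

print(f"physician #2: n = {phys2_n}, mean = {phys2_mean:.2f}, sd = {phys2_sd:.2f}")
print(f"recomputed t = {t_recomputed:.2f}")
```

The recomputed mean agrees with the 3489.49 in Appendix 6, while the standard deviation comes out far larger than 2.70, so the recomputed t statistic is much smaller than 14.31. The single 14-day stay charged at $14,898 accounts for most of that spread.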

Linear Regression Analysis

Linear regression analysis is conducted to check whether charges depend on days, physicians or the payor. I chose CHRGS (charges) as the dependent variable and DAYS, PHYS and PAYOR as the candidate independent variables. I ran three models, each with a different independent variable. After running all three, I concluded that the best model is CHRGS vs. DAYS, because its correlation coefficient is R = 0.801 and its coefficient of determination is R2 = 0.641. In the other two models, R and R2 are both close to 0 (Appendices 10 and 11).

The regression equation is:

Y = 930.7042 + 884.2014 * DAYS

The slope is b1 = 884.2014. This means that each additional day of stay (DAYS) is associated with an expected increase in CHRGS (charges) of about $884.20.
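The fitted equation can be used to estimate charges for a given length of stay. The short Python sketch below simply evaluates the regression line; the function name `predicted_charges` is hypothetical, and the coefficients are the ones reported above.

```python
INTERCEPT = 930.7042  # b0, estimated charge at DAYS = 0
SLOPE = 884.2014      # b1, expected change in CHRGS per additional day

def predicted_charges(days: float) -> float:
    """Evaluate the fitted line CHRGS = b0 + b1 * DAYS."""
    return INTERCEPT + SLOPE * days

two_day_stay = predicted_charges(2)
extra_day = predicted_charges(3) - predicted_charges(2)

print(f"predicted charges for a 2-day stay: {two_day_stay:.2f}")
print(f"added charges for one more day:     {extra_day:.2f}")
```

Predictions are most trustworthy for stays of 1 to 3 days, where almost all of the data lie; the 14-day stay is a single observation.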

The coefficient of determination is R2 = 0.641. This means that 64.1% of the variation in CHRGS (charges) can be explained by variation in DAYS. Because a large percentage of the variation in CHRGS is explained by the independent variable DAYS, this model is a good fit for the data. The p-value for the F test is p = 7.50E-66. This means that we have extremely strong evidence of a linear relationship between DAYS and CHRGS (charges).


The hypotheses for the F test are: H0: β1 = 0 vs. Ha: β1 ≠ 0. If we reject H0, we have evidence of a linear relationship between the two variables, DAYS (x variable) and CHRGS (y variable). Thus a small p-value for the F test is evidence of a significant linear relationship between the variables. The correlation coefficient is R = 0.801, which indicates a strong positive correlation between DAYS and CHRGS. Because R is greater than 0, the correlation is positive, which means that longer stays (DAYS) go together with higher total charges (CHRGS). The correlation is strong because R = 0.801 is close to 1.
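R2, R and the F statistic all follow from the sums of squares in the ANOVA table of Appendix 8. The Python sketch below recomputes them; taking the positive square root for R assumes a positive slope, which holds here.

```python
from math import sqrt

# Sums of squares and degrees of freedom from the ANOVA table in Appendix 8.
ss_regression = 206_041_299.1584
ss_residual = 115_253_469.9142
ss_total = 321_294_769.0727
df_regression, df_residual = 1, 287

r_squared = ss_regression / ss_total  # share of variation in CHRGS explained by DAYS
r = sqrt(r_squared)                   # positive root because the slope is positive
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R2 = {r_squared:.3f}, R = {r:.3f}, F = {f_stat:.2f}")
```

With a single independent variable, F equals the square of the slope's t statistic (22.651 squared is about 513), so the F test and the t test on DAYS give the same p-value.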

CONCLUSION

Based on the hypothesis test comparing the two types of insurance, it is concluded that patients with managed care insurance are charged more than patients with commercial insurance.

A hypothesis test was also conducted to check whether charges differ among physicians. The test concluded that physician #2 has the highest mean charges among all the physicians.

Linear regression analysis was conducted to see whether CHRGS is related to DAYS, PHYS, and/or PAYOR. After running several models with different independent variables, it is concluded that there is a linear relationship between DAYS and CHRGS.

Finally, the impressions I formed by looking at the data were confirmed by the tests, to my satisfaction.


Appendix 1

Data from Bryant/Smith Case 42: Hospital Charges.

Variable   Meaning
DAYS       Number of days the patient stays in the hospital
CHRGS      Total expense charged to the patient
PHYS       Code identifying the physician
PAYOR      Type of insurance the patient carried (1 for managed care, 0 for commercial insurance)
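The table that follows prints two patient records side by side on each line. For anyone who wants to load it into a program, the sketch below shows one way to split such a line into records. The `Stay` type and the `parse_row` helper are hypothetical names introduced for this illustration.

```python
from typing import NamedTuple

class Stay(NamedTuple):
    days: int   # DAYS: length of stay
    chrgs: int  # CHRGS: total charges
    phys: int   # PHYS: physician code
    payor: int  # PAYOR: 1 = managed care, 0 = commercial insurance

def parse_row(line: str) -> list[Stay]:
    """Split one printed line into records of four values each."""
    values = [int(token) for token in line.split()]
    if not values or len(values) % 4 != 0:
        raise ValueError(f"expected a multiple of four values, got {len(values)}")
    return [Stay(*values[i:i + 4]) for i in range(0, len(values), 4)]

records = parse_row("2 2607 2 1 2 2439 4 1")  # first data line of the table
managed_care = [r for r in records if r.payor == 1]
print(records)
print(len(managed_care), "managed care record(s)")
```

The last line of the table holds a single record, which the same function handles because it accepts any multiple of four values.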

DAYS CHRGS PHYS PAYOR DAYS CHRGS PHYS PAYOR

2 2607 2 1 2 2439 4 1

4 5063 2 1 2 2609 4 0

4 4903 2 1 2 2357 4 1

3 3418 2 1 3 3503 4 1

3 3604 2 0 2 2026 4 0

2 2324 2 1 2 1854 4 1

2 2953 2 1 3 2644 4 1

3 3709 2 1 3 2500 4 1

2 2138 2 1 2 2798 4 1

2 2681 2 0 2 2629 4 1

3 3932 2 1 3 2697 4 0

3 3283 2 0 2 2308 4 0

2 3729 2 1 1 1663 4 1

3 3392 2 1 2 2222 4 1

14 14898 2 1 1 2741 4 0

3 3819 2 1 3 2891 4 0

3 4248 2 1 2 2898 4 1

2 1905 2 1 1 1924 4 1

2 2823 2 1 3 3480 6 1

3 2785 2 0 2 1874 6 1

2 2921 2 0 2 3906 6 1

3 4933 2 0 3 1254 6 1

2 2804 2 0 2 2137 6 1

3 3287 2 1 3 3430 6 1

1 2048 2 1 3 3041 6 1

3 3617 2 0 3 4146 6 1

2 2219 2 0 3 3059 6 1

1 3381 2 0 3 2864 6 1

2 2310 2 1 2 2403 6 1

3 2907 2 1 2 2979 6 1

2 2888 2 1 2 2683 6 1


2 2640 2 0 2 3034 6 1


3 3826 2 0 2 2436 6 1

2 2840 2 1 1 1960 6 0

2 3137 2 0 3 2753 6 1

1 2955 2 0 3 2209 6 0

1 2184 2 0 2 2378 6 1

2 3500 4 1 2 3230 6 1

3 3585 4 1 3 3636 6 1

3 3047 4 1 2 3279 6 0

3 4475 6 0 3 2873 7 1

2 2711 6 1 2 2797 7 0

1 2062 6 0 1 2167 7 0

2 2280 7 1 2 1701 10 0

2 2939 7 1 2 3165 10 0

2 2334 7 1 2 2358 10 0

2 2809 7 1 2 3306 10 0

2 2620 7 1 1 2139 10 1

2 3090 7 1 2 3355 10 1

3 3945 7 1 3 3375 10 1

2 2435 7 1 2 3173 10 0

2 1864 7 0 2 2900 10 1

2 2592 7 1 2 2838 10 0

2 2017 7 1 3 3160 10 1

2 2666 7 0 2 2343 10 1

3 2955 7 1 2 2285 10 1

2 3204 7 1 2 2117 10 1

3 2066 7 1 2 3112 10 1

1 1793 7 1 3 3394 10 1

2 2638 7 1 2 3663 10 1

1 1840 7 0 2 2528 10 1

2 3182 7 0 1 1806 10 1

2 2218 7 1 2 3115 10 0

2 2612 7 0 2 2780 10 0

1 2789 7 1 3 2724 10 0

2 1590 7 1 2 2263 10 0

3 3868 7 1 2 2550 10 1

1 2178 7 1 2 3194 10 0

2 2665 7 1 2 2954 10 0

2 2739 7 0 3 2798 10 1

2 2314 7 1 2 2176 10 0

1 2070 7 0 2 3202 10 1

2 2292 7 0 3 3779 10 1


2 2594 7 1 2 3458 10 0


1 902 7 0 2 3109 10 0

3 2935 7 1 2 1567 10 1

2 3032 7 0 1 2268 10 1

2 2289 7 0 2 2858 10 1

2 4206 7 0 2 2542 10 1

2 1973 7 1 3 3218 10 1

3 3454 10 1 2 2582 11 1

2 3848 10 1 3 4185 11 1

1 2886 10 1 3 4183 11 1

2 2752 10 0 3 3289 11 0

2 2602 10 0 2 2288 11 1

1 1899 10 0 3 2564 11 0

2 2334 10 1 2 2570 11 1

2 2692 10 1 2 2944 12 1

2 3693 10 1 2 2450 12 1

2 2582 10 1 2 2207 12 1

2 2123 10 0 2 3716 12 1

3 3708 10 0 2 2247 12 1

1 2078 10 0 3 4357 12 1

2 2067 10 0 2 3050 12 1

2 2840 10 0 2 2779 12 1

2 2563 10 1 2 2991 12 1

2 3064 10 1 2 2227 12 0

2 2157 11 1 2 3270 12 0

3 2647 11 1 3 4711 12 1

2 2649 11 0 1 2081 12 1

2 2745 11 1 2 2589 12 0

4 6340 11 1 2 3768 12 1

2 3757 11 1 1 2153 12 0

3 4510 11 1 2 2522 12 1

3 3492 11 1 1 2145 12 1

6 6710 11 1 2 3455 12 0

2 3183 11 1 3 3081 12 1

3 2494 11 1 2 2251 12 1

2 2352 11 0 2 2833 12 1

2 2449 11 0 2 3077 12 1

2 2473 11 0 2 2392 12 1

2 2534 11 1 3 2181 12 0

2 2158 11 0 2 2242 12 1

2 2408 11 1 3 2687 12 1

2 2468 11 1 2 2698 12 0


3 2133 11 0 2 1023 12 0


2 2310 11 0 2 2414 12 1

1 2220 11 0 2 4674 12 0

2 2566 11 1 2 3139 12 0

2 2450 11 0 2 2465 12 1

1 1947 12 0 3 4049 14 1

2 2396 13 1 3 3464 14 1

2 3073 13 1 2 3034 14 1

2 2564 13 1 2 2941 14 1

2 929 13 1 2 2668 14 1

3 3369 13 1 2 3001 14 1

3 2331 13 0 1 1791 14 1

2 2543 13 1 2 3118 14 1

2 4722 13 1 2 2761 14 1

3 5633 13 1 2 3094 14 0

2 2801 13 0 2 2367 14 1

4 5499 13 1 2 2827 14 0

2 3182 13 1 1 2179 14 1

2 3800 13 0 2 2347 14 1

2 2419 13 0 2 1848 14 1

1 3675 13 1 2 3128 14 0

2 2840 13 1 2 2604 14 1

2 3105 13 0 2 3017 14 1

2 2206 13 1 2 3050 14 1

2 2741 13 1 1 2068 14 1

2 2663 13 0 3 3480 14 1

1 1953 14 1 2 2312 14 1

2 2880 14 1 3 3639 14 1

2 2424 14 1 2 2534 14 0

2 2198 14 0


Appendix 2

Descriptive Statistics for Days and Charges of Patients Carrying Managed Care Insurance

Descriptive statistics

                                    DAYS           CHRGS
Count                                193             193
Mean                                2.30        2,966.44
Sample variance                     1.18    1,429,843.96
Sample standard deviation           1.09        1,195.76
Minimum                                1             929
Maximum                               14          14,898
Range                                 13          13,969

Population variance                 1.17    1,422,435.44
Population standard deviation       1.08        1,192.66

Standard error of the mean          0.08           86.07

Confidence interval 95% lower       2.15        2,796.67
Confidence interval 95% upper       2.45        3,136.21
Half-width                          0.15          169.77

Empirical rule
Mean - 1s                           1.21        1,770.68
Mean + 1s                           3.39        4,162.20
Percent in interval (68.26%)       88.1%           90.2%
Mean - 2s                           0.13          574.92
Mean + 2s                           4.47        5,357.96
Percent in interval (95.44%)       99.0%           97.4%
Mean - 3s                          -0.96         -620.84
Mean + 3s                           5.56        6,553.72
Percent in interval (99.73%)       99.0%           99.0%

Skewness                            6.77            5.62
Kurtosis                           70.34           51.70
Coefficient of variation (CV)     47.22%          40.31%

1st quartile                        2.00        2,367.00
Median                              2.00        2,789.00
3rd quartile                        3.00        3,287.00
Interquartile range                 1.00          920.00
Mode                                2.00        2,840.00


Appendix 3

Descriptive Statistics for Days and Charges of Patients Carrying Commercial Insurance

Descriptive statistics

                                    DAYS           CHRGS
Count                                 96              96
Mean                                2.02        2,714.28
Sample variance                     0.34      449,350.50
Sample standard deviation           0.58          670.34
Minimum                                1             902
Maximum                                3           4,933
Range                                  2           4,031

Population variance                 0.33      444,669.76
Population standard deviation       0.58          666.84

Standard error of the mean          0.06           68.42

Confidence interval 95% lower       1.90        2,578.46
Confidence interval 95% upper       2.14        2,850.10
Half-width                          0.12          135.82

Empirical rule
Mean - 1s                           1.44        2,043.95
Mean + 1s                           2.60        3,384.62
Percent in interval (68.26%)       66.7%           79.2%
Mean - 2s                           0.86        1,373.61
Mean + 2s                           3.18        4,054.95
Percent in interval (95.44%)      100.0%           93.8%
Mean - 3s                           0.28          703.27
Mean + 3s                           3.76        4,725.29
Percent in interval (99.73%)      100.0%           99.0%

Skewness                            0.00            0.60
Kurtosis                            0.07            1.62
Coefficient of variation (CV)     28.70%          24.70%

1st quartile                        2.00        2,219.75
Median                              2.00        2,673.50
3rd quartile                        2.00        3,118.25
Interquartile range                 0.00          898.50
Mode                                2.00            #N/A


Appendix 4

MegaStat Output Comparing Managed Care Insurance and Commercial Insurance

Hypothesis Test: Independent Groups (t-test, unequal variance)

   CHRGS-MC     CHRGS-CI
    2966.44      2714.28   mean
    1192.66       666.84   std. dev.
        193           96   n

          283   df
    252.16000   difference (CHRGS-MC - CHRGS-CI)
    109.55447   standard error of difference
            0   hypothesized difference

         2.30   t
        .0110   p-value (one-tailed, upper)

     36.51497   confidence interval 95.% lower
    467.80503   confidence interval 95.% upper
    215.64503   margin of error

F-test for equality of variance

    1422437.9   variance: CHRGS-MC
    444675.59   variance: CHRGS-CI
         3.20   F
     1.97E-09   p-value


Appendix 5

Descriptive Statistics for the Hypothesis Test Comparing Physician #2's High Charges with All Other Physicians

Descriptive statistics

                                      #1
Count                                252
Mean                            2,793.58
Sample variance               595,734.48
Sample standard deviation         771.84
Minimum                              902
Maximum                            6,710
Range                              5,808

1st quartile                    2,304.00
Median                          2,675.50
3rd quartile                    3,115.75
Interquartile range               811.75
Mode                            2,741.00

Appendix 6

2-Sample t-Test, Unknown Variances, for Comparing Physician #2 with All Other Physicians

Label   High CHRG PHYS 2   All Other PHYS
Mean             3489.49          2793.58
Sd                  2.70           771.84
N                     37              252

Label    PHYS 4    PHYS 6    PHYS 7   PHYS 10
mean    2611.19   2856.76   2559.30   2786.26
sd         2.19      2.36      1.98      2.06
n            21        25        40        54

Label   PHYS 11   PHYS 12   PHYS 13   PHYS 14
mean    3028.87   2787.24   3124.55   2742.00
sd         2.47      2.03      2.20      2.00
n            30        34        20        28


Appendix 7

2-Sample t-Test for Comparing the High Charges of Physician #2 with All Other Physicians

Hypothesis Test: Independent Groups (t-test, unequal variance)

   High CHRG PHYS 2   All Other PHYS
            3489.49          2793.58   mean
                2.7           771.84   std. dev.
                 37              252   n

          251   df
    695.91000   difference (High CHRG PHYS 2 - All Other PHYS)
     48.62338   standard error of difference
            0   hypothesized difference

        14.31   t
     1.12E-34   p-value (one-tailed, upper)

    600.14820   confidence interval 95.% lower
    791.67180   confidence interval 95.% upper
     95.76180   margin of error

F-test for equality of variance

  595736.9856   variance: All Other PHYS
         7.29   variance: High CHRG PHYS 2
     81719.75   F
     1.49E-81   p-value


Appendix 8

Regression Analysis with Independent Variable DAYS and Dependent Variable CHRGS

Regression Analysis

r²           0.641      n           289
r            0.801      k           1
Std. Error   633.703    Dep. Var.   CHRGS

ANOVA table

Source                     SS    df                  MS        F    p-value
Regression   206,041,299.1584     1    206,041,299.1584   513.08   7.50E-66
Residual     115,253,469.9142   287        401,580.0345
Total        321,294,769.0727   288

Regression output (with 95% confidence interval)

variables   coefficients   std. error   t (df=287)    p-value   95% lower    95% upper   std. coeff.
Intercept       930.7042      93.8922        9.912   4.13E-20    745.8997   1,115.5088         0.000
DAYS            884.2014      39.0355       22.651   7.50E-66    807.3691     961.0336         0.801

Appendix 9

Scatter plot for DAYS and CHRGS

[Figure: CHRGS vs. DAYS Regression Analysis. Scatter plot with DAYS on the horizontal axis (0 to 16) and CHRGS on the vertical axis (0 to 16,000), with the fitted line f(x) = 884.201381229977 x + 930.704217215483 and R² = 0.641284325148251.]

Appendix 10

Regression Analysis for Independent Variable PHYS and Dependent Variable CHRGS

Regression Analysis

r²           0.009      n           289
r            -0.092     k           1
Std. Error   1053.532   Dep. Var.   CHRGS

ANOVA table

Source                     SS    df                MS      F   p-value
Regression     2,744,807.3552     1    2,744,807.3552   2.47     .1169
Residual     318,549,961.7175   287    1,109,930.1802
Total        321,294,769.0727   288

Regression output (with 95% confidence interval)

variables   coefficients   std. error   t (df=287)    p-value    95% lower    95% upper   std. coeff.
Intercept     3,105.4358     154.6158       20.085   1.21E-56   2,801.1111   3,409.7604         0.000
PHYS            -25.5667      16.2580       -1.573      .1169     -57.5667       6.4333        -0.092

Appendix 11

Regression Analysis for Independent Variable PAYOR and Dependent Variable CHRGS

Regression Analysis

r²           0.013      n           289
r            0.113      k           1
Std. Error   1051.328   Dep. Var.   CHRGS

ANOVA table

Source                     SS    df                MS      F   p-value
Regression     4,076,432.1016     1    4,076,432.1016   3.69     .0558
Residual     317,218,336.9710   287    1,105,290.3727
Total        321,294,769.0727   288

Regression output (with 95% confidence interval)

variables   coefficients   std. error   t (df=287)    p-value    95% lower    95% upper   std. coeff.
Intercept     2,714.2813     107.3007       25.296   4.90E-75   2,503.0851   2,925.4774         0.000
PAYOR           252.1592     131.3025        1.920      .0558      -6.2787     510.5971         0.113

References

Bowerman. Essentials of Business Statistics. McGraw-Hill Irwin.

Bryant-Smith. (2003). Bryant-Smith: Practical Data Analysis Volume I. The McGraw-Hill Companies.

Farlex, Inc. (n.d.). Mean, median, and mode. Retrieved February 18, 2010, from The Free Dictionary: http://encyclopedia2.thefreedictionary.com/mean,+median,+and+mode
