
Toward Better Health Care Service: Statistical and Machine Learning Based Analysis of Swedish Patient Satisfaction Survey

YU WANG

Degree Project in Electrical Engineering, Second Cycle, 30 Credits
Stockholm, Sweden 2017

KTH Royal Institute of Technology
School of Electrical Engineering

Abstract

Patients, as customers of the health care service, have the right to evaluate the service they receive, and health care providers and professionals may take advantage of these evaluations to improve the health care service. To investigate the relationship between patients' overall satisfaction and their satisfaction with specific aspects, this study uses classical statistical and machine learning based methods to analyze Swedish national patient satisfaction survey data.

Statistical methods, including cross tabulation, the chi-square test, correlation matrices and linear regression, are used to identify relationships between features. It is found that patients' demographics have a significant association with overall satisfaction, and patients' responses in each dimension show a similar trend, which contributes to their overall satisfaction.

Machine learning classification approaches, including the Naïve Bayes classifier, logistic regression, tree-based models (decision tree, random forest, adaptive boosting decision tree), support vector machines and artificial neural networks, are used to build models that classify patients' overall satisfaction (positive or negative) based on the survey responses in each dimension and the patients' demographic information. These models all have relatively high accuracy (87.41%–89.85%) and could help to find the important features of the health care service and hence improve the quality of health care in Sweden.


Sammanfattning

Patienter som kunder av hälsovårdstjänsten har rätt att utvärdera den tjänst de fått, och vårdgivare och yrkesverksamma kan utnyttja dessa utvärderingar för att förbättra vården. För att undersöka förhållandet mellan patientens övergripande tillfredsställelse och tillfredsställelse med specifika aspekter använder den här studien klassisk statistisk och maskininlärningsbaserad metod för att analysera svenska nationella patientundersökningsdata.

Statistisk metod, inklusive tvärtabulering, chi-square test, korrelationsmatris och linjär regression, identifierar förhållandet mellan funktioner. Det är konstaterat att patienternas demografi har en betydande koppling till övergripande tillfredsställelse, och patienternas svar i varje dimension visar en liknande trend som bidrar till patientens övergripande tillfredsställelse.

Klassificeringsmetoder för maskininlärning, inklusive Naïve Bayes-klassificeraren, logistisk regression, trädbaserad modell (beslutsträd, slumpmässig skog, adaptivt boostat beslutsträd), stödvektormaskiner och konstgjorda neurala nätverk, används för att bygga modeller för att klassificera patientens övergripande tillfredsställelse (positiv eller negativ) baserat på undersökningsresponser i dimensionerna och patienters demografiinformation. Dessa modeller har alla relativt hög noggrannhet (87.41%–89.85%) och kan hjälpa till att hitta de viktigaste egenskaperna hos vården och därmed förbättra kvaliteten på vården i Sverige.


Acknowledgment

First and foremost, I would like to express my gratitude to my thesis advisor Saikat Chatterjee. As my advisor, Professor Chatterjee helped me through all of my research, offering constructive advice about experiment design, giving prompt and insightful feedback, and so forth. I would also like to thank the committee members, Professor Markus Flierl and Professor Anne Håkansson, for their precious time, valuable suggestions and comments.

In addition, I would like to thank all the staff at IC Quality, and I especially want to thank Nils Press, whose instructions contributed a great deal to my thesis writing.

Thanks also to all my classmates, who kindly supported each other over the past two years.

Finally, I would like to express my deepest love to my parents for their love, selfless support and encouragement throughout my studies.

All of the above help has contributed much to my study and research, and I express my gratitude again to all of them.


Contents

1 Introduction
  1.1 Motivation
  1.2 Problem statement
  1.3 This study

2 Background
  2.1 Patient satisfaction
  2.2 Swedish national patient survey
    2.2.1 Implementation
    2.2.2 Measurement scales
    2.2.3 Dimensions
  2.3 Survey Data

3 Related work
  3.1 Statistical based approaches
  3.2 Machine learning based approaches

4 Method
  4.1 Data pre-processing
    4.1.1 Missing data and imputation methods
    4.1.2 Weight calculation in dimension
    4.1.3 Balancing data
    4.1.4 Dummy coding
  4.2 Basic analysis
    4.2.1 Cross tabulation
    4.2.2 Pearson's chi-square test
    4.2.3 Correlation matrix
    4.2.4 Linear regression
  4.3 Machine learning classification techniques
    4.3.1 Naïve Bayes classifier
    4.3.2 Logistic regression
    4.3.3 Tree-based model
    4.3.4 Support vector machines (SVM)
    4.3.5 Artificial neural networks

5 Experiments and results
  5.1 Experiment design
    5.1.1 Data split
    5.1.2 Data pre-processing
    5.1.3 Cross validation
    5.1.4 Grid search
    5.1.5 Evaluate model
  5.2 Naïve Bayes classifier
  5.3 Logistic regression
  5.4 Tree-based model
    5.4.1 Decision tree
    5.4.2 Random forest
    5.4.3 Adaptive boosting decision tree
  5.5 Support vector machines (SVM)
  5.6 Artificial neural networks
  5.7 Model performance comparison

6 Conclusions and Future Work

A 2016 patient satisfaction survey

B Basic analysis results
  B.1 Chi-square test
    B.1.1 Education and overall satisfaction
    B.1.2 Gender and overall satisfaction
    B.1.3 Occupation and overall satisfaction
    B.1.4 Question X000031 and overall satisfaction
    B.1.5 Question X000032 and overall satisfaction
  B.2 Linear regression


Chapter 1

Introduction

Over the past decade, patient satisfaction has gained increasing attention, as it is an important and widely accepted measure of care efficiency [1]. The consumerism of today's society has led to a competitive health care environment; therefore, health care providers take patient perception into account when designing strategies for quality improvement.

The measurement of patient satisfaction can be described as the difference between perceived and expected satisfaction for each dimension. Patients' satisfaction is related to the extent to which their general health care needs are met. Patients today have a higher level of education and increasingly hope to learn more about their health conditions, and even to participate in planning their own health care and in decision-making [2]. Therefore, the quality of health care services cannot be judged by health care providers alone, based on their professional standards and assessments.

Moreover, patient involvement influences quality in health care [3]. On the one hand, patients' experiences help to point out areas for improvement that health care professionals had not previously recognized. On the other hand, patient involvement in quality improvement projects helps to fill existing gaps between organizational functions, supporting a view of care from a patient-process perspective. Thus, patient involvement contributes to an extended view of quality dimensions in health care.

Using a survey instrument to identify issues of concern to patients and to provide feedback to health care service providers is a market-driven approach that turns patient satisfaction surveys into a quality improvement tool for health care providers. For example, evaluation of patient satisfaction has been mandatory for all French hospitals since 1996 [4]. In Germany, measuring patient satisfaction has been required as an element of quality management reports since 2005 [5]. In Sweden, all county councils and regions have been involved in the National Patient Survey since 2009.

1.1 Motivation

Swedish health care providers show very good performance on medical outcomes. For instance, Sweden has high cancer survival rates compared with other Western countries [6] and low infant mortality rates compared with other European countries and the United States [7]. By contrast, compared with other Western countries, Swedish health care performs poorly in informing patients and in enabling them to participate and take on a more active role [8].

As mentioned, understanding the patient's value-creation process from his or her perspective is the focus for health care providers that want to enhance patients' perceived value. The national patient survey is a powerful tool for understanding patients' views.

The survey is implemented by the company IC Quality¹ on behalf of Sweden's county councils and regions, in cooperation and coordination with Sveriges Kommuner och Landsting (SKL). IC Quality is today in possession of enormous amounts of survey data about patient experience, and the data is constantly growing. However, the best stories in the survey data remain hidden behind rows and columns in tables that are too difficult or expensive to customize. Compared to computers, the human mind is weak at performing calculations but much stronger at recognizing patterns; the average person cannot derive meaning from such a voluminous data set unaided.

1.2 Problem statement

Patient satisfaction has proved a difficult concept to measure, and its validity and usefulness have been increasingly questioned. Classical statistical methods (descriptive statistics) are currently used to analyze the Swedish national patient survey data², and bar charts are produced to show the average positive response. However, no model-based approach is currently used.

The purpose of this work is to provide patients with better access to medical services and to help health care organizations in various medical management decisions, by using modern machine learning techniques to build classification models for the Swedish health care survey data that predict a patient's attitude toward the health care service (positive or negative).

With the models built, health care providers could easily find the factors that influence patients' satisfaction and thus narrow the gap between expectations (what patients want) and perceptions (what patients get) for the relevant service attributes.

1.3 This study

This thesis is composed of six parts. Chapter One is an introduction, which presents the research problem and the structure of the thesis. Chapter Two addresses the research background: the significance of patient satisfaction, the Swedish national patient survey and the data collected. Chapter Three reviews related work in this domain, in two parts: statistical based approaches and machine learning based approaches. Chapter Four focuses on the methodology of this research, which includes the data pre-processing methods, classical statistical techniques and machine learning techniques. Chapter Five presents the experiment design for this research, and the results and discussion of the outcomes. In Chapter Six, the research conclusion is drawn by summarizing the above assessment results; limitations and implications of this research are also discussed in this last part.

¹ The official site of IC Quality is https://icquality.se/
² The results of the 2015 survey can be found at https://patientenkat.se/sv/resultat/ta-del-av-resultat/


Chapter 2

Background

In this chapter, the background to the problem is addressed. The background consists of a theory section about patient satisfaction and sections introducing the Swedish national patient survey and the survey data.

2.1 Patient satisfaction

The key to a successful business is understanding one's customers: knowing what they want and how satisfied they are with one's product and service. Would it be appropriate to address the patient as a "customer"? The word customer is defined as "a person who purchases goods or services" [9]. Today the patient sees himself as a buyer of health services. Therefore, there is a need to recognize that every patient has certain rights, which puts a special emphasis on the delivery of quality health care.

In 1948, the World Health Organization (WHO) defined health as "a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity" [10]. The traditional approach to patient assessment is largely based on observer ratings by health professionals. Modern medicine is slowly beginning to recognize the importance of the patient's perspective in health care, and more investigations are needed to understand the importance of patient satisfaction [11].

Customer satisfaction is defined as "the degree of satisfaction provided by the goods or services of a company as measured by the number of repeat customers" [12]. Evaluating the extent to which patients are satisfied with health services is clinically relevant, as satisfied patients are more likely to take an active role in their own care [13], to comply with treatment [14], and to continue using medical care services and stay within a health provider. In addition, patients' views on health care are an important source for continually improving the care. Health professionals may take advantage of satisfaction surveys that identify potential areas for service improvement, and health expenditure may be optimized through patient-guided planning and evaluation [15].

However, customer satisfaction is not necessarily observed directly. In a social science context, analysis of such measures is done indirectly by employing proxy variables (also referred to as latent variables). To measure customer satisfaction, survey questionnaires are used in which respondents are asked to express their degree of satisfaction with regard to multiple aspects of the product or service. The survey contains a set of observed variables; some attributes are objective, related to the service's technical characteristics, and others are subjective, dealing with behaviours, feelings and psychological benefits.

2.2 Swedish national patient survey

All county councils and regions in Sweden have been involved in the National Patient Survey since 2009. The work is coordinated by the Swedish Association of Local Authorities and Regions. National joint surveys have been conducted every two years in primary care, somatic outpatient and inpatient care, emergency care, psychiatric outpatient and inpatient care, child outpatient and inpatient care, and child psychiatry. By repeating the measurements, they have continuously collected knowledge about patients' views on the care received [16].

2.2.1 Implementation

Implementation of the measurements takes four steps: planning, implementation, evaluation and joint work. First, the measurement period, participants and methodological choices are planned based on actual needs. Second, the supplier collects data from patients. Third, once the survey data has been collected, the process is evaluated and possible changes for future measurements are proposed. Finally, SKL is responsible for the guidelines for conducting the surveys and for managing historical results.

The survey is sent to people who have recently visited the health service, and they are asked to evaluate their recent visit. No results are traceable to individual persons, and the questionnaire is carried out in accordance with the Data Protection Act and the industry codes of conduct. As an example, the details of the implementation of the 2015 Swedish national patient survey are shown in Table 2.1 [16].

2.2.2 Measurement scales

The National Patient Survey uses a 5-point scale, called a Likert scale, where the minimum integer value represents 'complete dissatisfaction' or 'completely disagree', while the maximum value stands for 'complete satisfaction' or 'completely agree'. In addition to the 5-point scale, the entry 'Not applicable' is provided.

The advantage of a five-point response scale is that it allows the respondent to take a neutral position on an issue while providing the freedom to rate their experience as positive or negative. With this response scale, significant differences can be detected even in small populations and over time.

2.2.3 Dimensions

The National Patient Survey contains seven dimensions, and each question in the questionnaire belongs to one of them [16]. The definition of each dimension is given below.


Table 2.1: Implementation of the Swedish national patient survey in 2015

  Measurement: Primary care, autumn 2015
  Survey company: IC Quality
  Survey methodology: Postal questionnaire (mailing with digital response option and a reminder containing a postal questionnaire with digital response option)
  Sampling: All medical appointments at the health care unit in the selection period (some county councils/regions have also chosen to measure visits to the nurse)
  Selection period: September 2015
  Inclusion: Individual visits per care unit (health care/medical center), all ages, out-of-county (utomläns) patients
  Survey: Vårdbas and primary module (additional questions for some counties)
  Language: Swedish (digital and postal reply options), Spanish, French, English, Arabic, Farsi, Finnish, Somali (digital reply option)
  Time for response: October 12 to November 17
  Participating counties/regions: All 21

Emotional support

This dimension is intended to show whether the patient feels that staff are attentive and responsive to the patient's anxiety, concern, fear or pain, and are accessible and supportive in a manner the patient finds satisfactory.

Information and knowledge

This dimension is intended to show how well patients feel they are informed and communicated with, based on individual circumstances and in a proactive manner. This includes, for example, information about delays or waiting times, whether the patient is asked questions in an understandable way, and whether the patient is informed about the treatment, medication and warning signs that he or she should pay attention to. The dimension reflects the patient's perception of how well the parties involved keep them informed.

Participation and involvement

This dimension is intended to show whether the patient feels involved in and able to participate in his or her own care and in the decision-making.


Continuity and coordination

This dimension is intended to show the patient's experience of the health care system's capacity for continuity and coordination, that is, how well the individual's care is coordinated, both internally and externally. It includes how patients experience the staff's ability to cooperate with each other and in relation to the patient.

Availability

This dimension refers to the patient's experience of health care accessibility, both in terms of proximity and contact channels, and of staff availability for the patient as well as for families.

Respect and treatment

This dimension illuminates the patient's experience of the health care system's capacity for treatment tailored to individual needs and conditions, for example whether the reception is characterized by respect on the basis of equal worth, compassion and commitment. This dimension is related to the dimension Participation and involvement.

Overall satisfaction

This dimension shows the patient's experience of health care in terms of overall experience, perceived effectiveness, clinical outcomes, security, etc.

2.3 Survey Data

In 2016, 186,686 questionnaires¹ were sent out, with a response rate of 42%. The 78,575 respondents varied by gender (39.4% men), age (range 15-104 years), education and employment status. A basic analysis of respondents' demographics is presented in Figure 2.1, Table 2.2 and Table 2.3.

Table 2.2: Respondents by education

  Education                                         Frequency     %
  Post-secondary education, university or college       31463   40.0
  High school or equivalent                             23904   30.4
  Elementary school, folk school or equivalent          19485   24.8
  Others                                                 2480    3.2
  No completed education                                 1242    1.6

Along with the questions about the patients themselves, there are thirty-one questions about patient satisfaction in the seven dimensions, which are analyzed to help improve health care quality.

¹ The 2016 Swedish national patient satisfaction survey questionnaire can be found in Appendix A.


Figure 2.1: Age distribution of respondents.

Table 2.3: Respondents by employment status

  Employment status    Frequency     %
  Pensioner                41382   52.7
  Employee                 28557   36.3
  Student                   2793    3.6
  Other                     2526    3.2
  Unemployed                1816    3.3
  Missing                   1497    1.9


Chapter 3

Related work

This chapter introduces the related work in this domain, including statistical based approaches and machine learning based approaches.

3.1 Statistical based approaches

Statistical tools and techniques play a major role in many areas of science, such as biology, economics and physics. In particular, the science of survey sampling, survey data collection methodology, and the analysis of survey data dates back a little more than one hundred years.

Descriptive statistics have long been used as a classic tool for survey data analysis [17, 18]. Estimation of totals, means, variances, and distribution percentiles for survey variables may be the primary objective of an analysis plan, or possibly an exploratory step on the path to a more multivariate treatment of the survey data. Major reports of survey results can be filled with detailed tabulations of descriptive estimates [19]. Descriptive statistics are applied in [20] to help health care providers estimate their patients' satisfaction by summarizing measures of diagnostic accuracy, including sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).

During 1950–1990, analytical treatments of survey data expanded as new developments in statistical theory and methods were introduced, empirically tested, and refined. Important classes of methods introduced during this period included log-linear models and related methods for contingency tables, generalized linear models [21] (e.g., logistic regression), survival analysis models, general linear mixed models [22] (e.g., hierarchical linear models), structural equation models, and latent variable models. Many of these new statistical techniques applied the method of maximum likelihood to estimate model parameters and the standard errors of the estimates, assuming that the survey observations were independent observations from a known probability distribution (e.g., binomial, Poisson, normal).

Regression models still play an important role in modern survey analysis. Linear regression models are used to analyze cross-sectional survey data in [23] to examine the relationship between physician advice to quit smoking and patient care experiences. [24] estimated linear ordinary least squares regression equations, using national random telephone survey data, to test for direct effects of parenthood on measures of punitive attitudes toward juveniles and adults and overall. [25] performed multilevel regression analyses to identify factors associated with lack of preventive dental care among U.S. children and state-level factors that explain variation in preventive dental care access across states. Logistic regression is used in [26] to examine the relationship of alcohol outlet density (AOD) and neighborhood poverty with binge drinking and alcohol-related problems among drinkers in married and cohabiting relationships, and to assess whether these associations differ by sex. [27] estimated the association between residential mobility and non-medical use of prescription drugs (NMUPD), adjusting for potential confounders using logistic regression for survey data.

3.2 Machine learning based approaches

Machine learning is a modern analysis technique with the potential to help researchers in survey analysis. At the analytical stage, data analysis can be either confirmatory or exploratory. In statistical data analysis, t-tests and analysis of variance are examples of confirmatory analysis, and factor analysis is a common exploratory technique. By contrast, machine learning algorithms are predominantly exploratory [28].

Machine learning provides methods, techniques, and tools that can help solve diagnostic and prognostic problems in a variety of medical domains. Successful implementation of machine learning methods could support the work of medical experts and ultimately improve the efficiency and quality of medical care.

Medical diagnostic reasoning is a very important application area of machine learning. With machine learning, patient records and their accurate diagnoses are input into a computer program that executes a learning algorithm. The resulting classifier can subsequently be used to help physicians diagnose new patients. For example, Bayesian classifiers and backpropagation learning of neural networks have been used to improve the predictive power of the ischaemic heart disease (IHD) diagnostic process.

Another field of application is biomedical signal processing. Machine learning methods use sets of physiological signal data, which can be produced easily, and can help to model the nonlinear relationships that exist between these data and to extract parameters and features that can improve medical care. For instance, [29] introduces a machine learning based signal processing approach to stress recognition: K-nearest neighbor (KNN) and support vector machine (SVM) classifiers were used to classify stress into three levels based on collected biological data such as respiration, GSR (hand), GSR (foot), heart rate and EMG.

Nevertheless, the use of machine learning in the health care domain is not limited to clinical support. [30] applies a regression tree (Cubist) model for predicting the length of stay (LOS) of hospitalized patients, which could be an effective tool for health care providers to enable more efficient utilization of manpower and facilities in hospitals. The naïve Bayes algorithm is used in [31] to improve outcomes and decrease the cost of health care delivery by reducing preventable hospital readmission rates.

Furthermore, machine learning techniques also play a significant role in patient satisfaction assessment. Nonlinear decision trees were used to classify patient satisfaction based on survey response data from emergency departments in [32]. Logistic regression models were applied to survey data in [33] to identify process measures that significantly influence patient satisfaction in emergency departments. [34] analyzed telephone interview data and sociodemographics of hospital patients with logistic regression to investigate the main predictors of patient satisfaction in municipal emergency departments. Machine learning can also deal with unstructured, free-text information about the quality of health care: [35] predicted patient satisfaction from free-text descriptions on the NHS Choices website using sentiment analysis based on machine learning and natural language processing techniques.


Chapter 4

Method

This chapter describes the methods, including the data pre-processing and data analysis techniques, used to solve the proposed problem.

4.1 Data pre-processing

4.1.1 Missing data and imputation methods

Missing values are a common problem in many data sets and seem especially widespread in social and economic studies, including our patient satisfaction surveys, since patients may fail to express their satisfaction level concerning their experience with a specific condition because of lack of interest, unwillingness to criticize, or other reasons.

Missing data

There are two situations found in missing data: unit nonresponse and item nonresponse. When a selected patient does not provide any of the information being sought, unit nonresponse occurs. When a patient responds to some items but not to others, item nonresponse occurs. Figure 4.1 shows the missingness map (showing where missingness occurs) of the patient satisfaction survey data, with black representing missing data and grey representing observed data.
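The thesis does not state how the map was produced; as a rough illustration, a map like Figure 4.1 could be drawn from a pandas DataFrame of responses (the column names below are hypothetical):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Toy frame standing in for the survey responses; NaN marks item nonresponse.
survey = pd.DataFrame({"q1": [4, np.nan, 3, 5],
                       "q2": [np.nan, 5, 4, 4],
                       "q3": [2, 3, np.nan, 5]})

# Plot the boolean missingness mask: dark cells are missing, light cells observed.
plt.imshow(survey.isna(), aspect="auto", cmap="Greys", vmin=-0.3, vmax=1.0)
plt.xticks(range(survey.shape[1]), survey.columns)
plt.ylabel("respondent")
plt.title("Missingness map")
plt.show()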

There are three missing-data mechanisms as described in [36]:

• MCAR (Missing Completely At Random): the missingness is independent of any variable observed in the data set.

• MAR (Missing At Random): the missingness may depend on variables observed in the data set, but not on the missing values themselves.

• NMAR (Not Missing At Random, or NI, Non-Ignorable): the missingness depends on the missing values themselves, and not only on other observed variables.

The appropriate way to deal with missing data depends on why the data is missing. Only the first two types of missing data are considered for imputation, since under NMAR it is difficult to construct an imputation model based on unobserved data. When data are missing from the responses to a questionnaire, as pointed out by [37], it is more likely that the missingness mechanism is MAR than MCAR. Thus, we can use an imputation method on our survey data.

Figure 4.1: Missingness map of the patient satisfaction survey data

Imputation methods

According to [36], methods for analyzing incomplete data can be grouped into four main categories:

1. Procedures based on completely recorded units: analyze subsets of the data set without missing data, discarding incompletely recorded units and analyzing only the units with complete data.

2. Weighting procedures: deal with unit nonresponse by increasing the survey weights for responding units in an attempt to adjust for nonresponse as if it were part of the sample design.

3. Imputation-based procedures: fill in missing values with plausible values; the resulting completed data are then analyzed by standard methods as if there had never been any missing values.

4. Model-based procedures: define a model for the observed data, with inferences based on likelihood or Bayesian analysis.

A missing-data method is required to yield statistically valid answers for scientific estimands, including population means, variances, correlation coefficients, and regression coefficients. By a scientific estimand we mean a quantity of scientific interest that can be calculated for the population and whose value does not depend on the data collection design used to measure it. Only multiple imputation and model-based procedures can lead to valid inferences. Here we use a model-based procedure (K-nearest neighbor) to deal with the missing values in our survey data.

Imputation methods based on models generally consist of creating a predictive model to estimate values that will substitute for the missing items. These approaches model the missing data estimation based on information available in the data set. If the observed data contain useful information for predicting the missing values, an imputation procedure can make use of this information and maintain high precision.

K-nearest neighbor imputation

The K-nearest neighbor (KNN) imputation algorithm uses only cases similar to the incomplete pattern (item nonresponse). Given an incomplete pattern x, this method selects the K closest cases that are not missing values in the attributes to be imputed, such that they minimize some distance measure. An example of the basic nearest-neighbor idea is shown in Figure 4.2.

Figure 4.2: Example of KNN. The incomplete pattern, marked with a blue star, should be imputed. If k = 3 it is assigned to the first class (green rectangles) because there are 2 rectangles and only 1 triangle inside the inner circle. If k = 5 it is assigned to the second class because there are 6 triangles and only 4 rectangles.

The idea behind KNN imputation is to take advantage of positive correlations between rows. Once the K nearest neighbours have been found, a replacement value for the missing attribute value must be estimated. The type of data determines how the replacement value is calculated: for example, the mode is frequently selected for discrete data, while the mean is used for numerical data. Our survey data consists of Likert responses, where the magnitude of a value matters, so we instead use the median to estimate a replacement value. The algorithm is as follows [38]:

1. Divide the dataset into two groups: D_complete, containing the observations with complete item information, and D_missing, containing the observations in which at least one of the items is missing.


2. For each vector x in D_missing:

(a) Divide the missing observation vector into two parts: observed and missing.

(b) Calculate the distance between the missing observation and all the observation vectors from the complete group D_complete.

3. Calculate the weights of the k nearest observation vectors.

4. Use the K nearest neighbors to estimate the missing values.

5. Repeat the algorithm for every variable with missing values until all are filled in.

The use of a small value of k may represent a good compromise between performance and the need to preserve the original distribution of the data; here we apply 5-nearest neighbour imputation to the dataset.
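The thesis does not state which implementation was used; the following is a minimal sketch of median-based KNN imputation following the algorithm above, assuming the Likert responses sit in a pandas DataFrame (the column names are hypothetical):

import numpy as np
import pandas as pd

def knn_impute_median(df, k=5):
    # Split the data into D_complete (no missing items) and D_missing.
    data = df.copy()
    complete = data.dropna()
    for idx in data.index.difference(complete.index):
        row = data.loc[idx]
        observed = row.notna()                      # observed / missing parts of the row
        if not observed.any() or complete.empty:
            continue                                # nothing to match on (unit nonresponse)
        # Euclidean distance to every complete row, on the observed attributes only.
        diffs = complete.loc[:, observed].values - row[observed].values
        dist = np.sqrt((diffs ** 2).sum(axis=1))
        neighbors = complete.iloc[np.argsort(dist)[:k]]
        # Likert (ordinal) data: impute the median of the k nearest neighbours.
        data.loc[idx, ~observed] = neighbors.loc[:, ~observed].median().values
    return data

# Hypothetical usage on three Likert items, with 5 neighbours as in the thesis.
survey = pd.DataFrame({"q1": [4, 5, np.nan, 3, 4, 5],
                       "q2": [4, 4, 5, np.nan, 3, 5],
                       "q3": [5, 5, 4, 3, 3, 4]})
print(knn_impute_median(survey, k=5))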

4.1.2 Weight calculation in dimension

The patient satisfaction survey consists of seven dimensions, as mentioned in Section 2.2.3, and each dimension consists of several questions. These questions are weighted, and the weights determine each question's contribution to the score within its dimension. Based on the collected data, an algorithm based on principal component analysis (PCA) is applied to determine the weights. This work has already been done by IC Quality, and a detailed theoretical explanation is beyond the scope of this project.

Our focus is that, given the weights provided, our prediction target can be calculated within the dimension 'Overall satisfaction'. The calculations are done at the individual level, as shown below:

• Respondent A has answered all the questions included in the dimension 'Overall satisfaction', so his/her question weights equal the weights originally calculated by the algorithm:

  – weight for question 1 is 0.34
  – weight for question 2 is 0.23
  – weight for question 3 is 0.43

• Respondent B has only replied to two of the questions (for example, questions 1 and 3) in the dimension, so his/her weights are renormalized over the answered questions:

  – weight for question 1 is

    w_1 = \frac{0.34 \; (\text{initial weight})}{0.34 + 0.43 \; (\text{sum of the weights of questions 1 and 3})} \approx 0.44

  – weight for question 2 is 0
  – weight for question 3 is

    w_3 = \frac{0.43 \; (\text{initial weight})}{0.34 + 0.43 \; (\text{sum of the weights of questions 1 and 3})} \approx 0.56


Now each question answered by each patient has a specific weight within each dimension. The weighted score S_weighted in a dimension can then be calculated as

    S_{\text{weighted}} = \sum_{i=1}^{N} w_i s_i

where w_i is the weight for question i and s_i is the patient's response to question i.
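As a small illustration of this calculation (a sketch, not IC Quality's implementation; the question identifiers and weights follow the worked example above):

import numpy as np
import pandas as pd

# Initial question weights within the dimension 'Overall satisfaction'.
weights = pd.Series({"q1": 0.34, "q2": 0.23, "q3": 0.43})

def weighted_dimension_score(responses, w):
    # S_weighted = sum_i w_i * s_i, with weights renormalized over answered questions.
    answered = responses.notna()
    if not answered.any():
        return np.nan
    w_norm = w[answered] / w[answered].sum()       # e.g. 0.34 / (0.34 + 0.43) = 0.44 for q1
    return float((w_norm * responses[answered]).sum())

# Respondent B from the example: answered only questions 1 and 3.
respondent_b = pd.Series({"q1": 4, "q2": np.nan, "q3": 5})
print(weighted_dimension_score(respondent_b, weights))   # 0.44*4 + 0.56*5, roughly 4.56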

4.1.3 Balancing data

For classification problems, practical datasets are usually imbalanced; i.e., at least one of the classes constitutes only a very small minority of the data. In this context, the concern is how to correctly classify the "rare" class. However, the most commonly used classification algorithms do not work well for imbalanced data classification because they aim to minimize the overall error rate rather than paying special attention to the "rare" class.

There are three common approaches to the problem of imbalanced datasets [39]. The first is data-level methods, which modify the collection of samples to balance the class distribution of the dataset. The second is algorithm-level methods, which directly modify existing learning algorithms for cost-sensitive learning, assigning a high cost to misclassification of the minority class. The third is hybrid methods, which combine data-level and algorithm-level methods.

In this study, we use a data-level method to tackle the problem, by modifying the training set to make its class distribution balanced. Random under-sampling is performed to balance the class distribution by randomly removing majority class examples until the majority and minority class instances are balanced.
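A minimal sketch of this random under-sampling step, assuming the features and binary labels are held in pandas objects (the function and column names are illustrative, not the thesis's code):

import numpy as np
import pandas as pd

def random_undersample(X, y, seed=0):
    # Randomly drop majority-class rows until both classes have equal counts.
    rng = np.random.default_rng(seed)
    counts = y.value_counts()
    minority, majority = counts.idxmin(), counts.idxmax()
    keep_majority = rng.choice(y.index[y == majority].to_numpy(),
                               size=counts.min(), replace=False)
    keep = np.concatenate([y.index[y == minority].to_numpy(), keep_majority])
    return X.loc[keep], y.loc[keep]

# Hypothetical usage: positive responses heavily outnumber negative ones.
y = pd.Series([1] * 8 + [0] * 2)
X = pd.DataFrame({"q1": range(10)})
X_bal, y_bal = random_undersample(X, y)
print(y_bal.value_counts())    # both classes now have the minority-class count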

4.1.4 Dummy coding

Some analytical methods were initially designed for continuous variables, such as the support vector machine (SVM) used in this study. These methods can handle nominal variables (education, occupation, gender, etc. in our case) if they are coded appropriately. Here we use dummy coding, a bitwise representation of a discrete variable: for each nominal attribute, new attributes are created, and in every sample the new attribute corresponding to the actual nominal value of that sample gets the value 1 while all other new attributes get the value 0. Take "gender" (male or female) as an example: it is replaced by two dummies, "gender = male" and "gender = female". As shown in Table 4.1, "gender = male" is 1 if the "gender" variable is male and 0 otherwise; "gender = female" is 1 if the "gender" variable is female and 0 otherwise.

Table 4.1: Dummy coding for "gender"

  gender    gender = male    gender = female
  male      1                0
  female    0                1
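For illustration, pandas can produce exactly this coding; a sketch on a hypothetical column:

import pandas as pd

df = pd.DataFrame({"gender": ["male", "female", "female"]})
dummies = pd.get_dummies(df["gender"], prefix="gender", dtype=int)
print(dummies)
#    gender_female  gender_male
# 0              0            1
# 1              1            0
# 2              1            0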


4.2 Basic analysis

In this section, classical statistics are used to analyze the survey data.

4.2.1 Cross tabulation

Cross-tabulation (also called contingency table) analysis [40] is most often used to analyze categorical data. A cross-tabulation is a table that records the counts of respondents that have the specific characteristics described in the cells of the table. Here we use cross tabulation tables to analyze the relationship between overall satisfaction and the categorical variables in the patient satisfaction survey, i.e. gender, education, occupation and two questions that state facts rather than attitudes about the health care visit. The result is shown in Table 4.2.

Table 4.2: Cross tabulation of survey variables against overall satisfaction

                                                       Unsatisfied            Satisfied
  Variable          Value                            Count  Proportion     Count  Proportion
  Gender            Man                               5600     0.181       25391     0.819
                    Woman                            10389     0.218       37195     0.782
  Education         NA                                  510     0.206        1963     0.794
                    Elementary/folk school or equiv.  3144     0.161       16341     0.839
                    High school or equivalent         5061     0.212       18843     0.788
                    Post-secondary (university
                      or college)                     6935     0.220       24528     0.780
                    No completed education             336     0.271         906     0.729
  Occupation        NA                                  334     0.224        1156     0.776
                    Employee                          7177     0.251       21380     0.749
                    Unemployed                         518     0.285        1298     0.715
                    Student                            936     0.335        1857     0.665
                    Pensioner                         6259     0.151       35123     0.849
                    Other                              761     0.301        1765     0.699
  Question X000031  Yes                               4767     0.104       41104     0.896
                    No                                6858     0.489        7153     0.511
                    NA                                4119     0.236       13307     0.764
  Question X000032  Yes                               5428     0.137       34279     0.863
                    No                                7434     0.442        9373     0.558
                    NA                                2994     0.139       18477     0.861

  Note: Question X000031 concerns feeling pain during the visit to the health care center; Question X000032 concerns discussing the patient's improvement to health care.
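Tables like Table 4.2 can be produced directly with pandas; a sketch on a hypothetical respondent-level frame:

import pandas as pd

df = pd.DataFrame({"gender": ["Man", "Woman", "Woman", "Man", "Woman", "Man"],
                   "satisfied": [1, 1, 0, 1, 0, 1]})
counts = pd.crosstab(df["gender"], df["satisfied"])
row_proportions = pd.crosstab(df["gender"], df["satisfied"], normalize="index")
print(counts)
print(row_proportions.round(3))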

4.2.2 Pearson’s chi-square test

The Pearson’s chi-square test provides the mechanism to test the statisticalsignificance of the cross-tabulation table [40]. Chi-square tests whether or not

1Due to the space limitation, some of the variable names and values are in footnotes2Elementary school, folk school or equivalent3High school or equivalent4Post-secondary education, university or college5No completed education6Feel pain during visit to health care center7Discuss patients’ improvement to health care

17

Page 25: Toward Better Health Care Service: Statistical and Machine ...1136114/FULLTEXT01.pdf · Statistical and Machine Learning Based Analysis of Swedish Patient Satisfaction Survey YU WANG

the two variables are independent.If the variables are independent (have no relationship), then the results of

the statistical test will be “non-significant” which means that there is no rela-tionship between the variables. If the variables are related, then the results ofthe statistical test will be “statistically significant” which means that we canstate that there is some relationship between the variables.

The chi-square test is based upon the chi-square distribution. The chi-square statistic is computed as

    \chi^2 = \sum_{i=1}^{n} \frac{(f_0 - f_e)^2}{f_e}

where f_0 and f_e are the observed and expected frequencies for each possible outcome, respectively.

The contingency coefficient C, which ranges from 0 (no association) to 1 (the theoretical maximum possible association), provides the magnitude of the association between the attributes in the cross tabulation. The chi-square test establishes the significance of an association between two attributes, while the contingency coefficient indicates the extent of that association. The contingency coefficient can be calculated as

    C = \sqrt{\frac{\chi^2}{\chi^2 + N}}

where N is the sum of all frequencies in the contingency table. SPSS is used to perform Pearson's chi-square test. In SPSS, the p value is used to test the relationship between two variables: the chi-square is said to be significant at the 5% level if the p value is less than 0.05 and insignificant if it is greater than 0.05.

The results of the chi-square tests are shown in Appendix B; they imply that there is a significant association between overall satisfaction and patients' gender, education, occupation and visit facts.
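The thesis used SPSS for these tests; the same quantities could be computed with SciPy, shown here for the gender counts from Table 4.2:

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows = gender (man, woman), columns = (unsatisfied, satisfied).
observed = np.array([[5600, 25391],
                     [10389, 37195]])
chi2, p, dof, expected = chi2_contingency(observed)
n = observed.sum()
C = np.sqrt(chi2 / (chi2 + n))      # contingency coefficient
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.3g}, C = {C:.3f}")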

4.2.3 Correlation matrix

A correlation matrix is used here to find the relationships between the different responses in the survey data. The responses are on a Likert scale, a type of ordinal data, so the analysis should consider both the quantitative values and the ranking of the data; therefore both the Pearson and the Spearman correlation coefficients are used.

Pearson correlation matrix

The Pearson correlation matrix is a matrix of product moment correlation coefficients r. The product moment correlation coefficient is an index which provides the magnitude of the linear relationship between any two variables [40]. r is given by

    r = \frac{\sum (x - m_x)(y - m_y)}{\sqrt{\sum (x - m_x)^2 \sum (y - m_y)^2}}

where m_x and m_y are the means of the x and y variables.


In addition, r ranges from −1 to +1. A positive value of r means that higher scores on one variable tend to be paired with higher scores on the other, and lower scores with lower scores. A negative value of r means that higher scores on one variable tend to be paired with lower scores on the other, and vice versa.

Here we analyze the relationships between the question responses on the Likert scale. The result is shown in Figure 4.3.

Figure 4.3: Pearson correlation matrix of question responses in survey

One of the main limitations of the Pearson correlation coefficient is that it measures only the linear relationship between two variables. Strictly, it should therefore be computed only when the data are measured on an interval or ratio scale.

Spearman correlation matrix

The Spearman correlation method computes the correlation between the rank of x and the rank of y. The Spearman correlation between two variables is high when observations have a similar rank (i.e. relative position of the observations within the variable: 1st, 2nd, 3rd, etc.) on the two variables, and low when observations have dissimilar ranks. ρ is given by

    \rho = \frac{\sum (x' - m_{x'})(y' - m_{y'})}{\sqrt{\sum (x' - m_{x'})^2 \sum (y' - m_{y'})^2}}

where x' = rank(x), y' = rank(y), and m_{x'} and m_{y'} are their means.

Figure 4.4: Spearman correlation matrix of question responses in survey

Figure 4.3 and Figure 4.4 show the same trend of correlation in both the quantitative and the rank-based measurements, so we have some grounds to believe that these questions are strongly linked.
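Both matrices can be computed in one step each with pandas; a sketch on a hypothetical frame of Likert responses:

import pandas as pd

responses = pd.DataFrame({"q1": [4, 5, 3, 2, 5],
                          "q2": [4, 5, 2, 2, 4],
                          "q3": [3, 5, 3, 1, 5]})
pearson = responses.corr(method="pearson")     # linear association (cf. Figure 4.3)
spearman = responses.corr(method="spearman")   # rank association (cf. Figure 4.4)
print(pearson.round(2))
print(spearman.round(2))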

4.2.4 Linear regression

It is common in practice to use linear regression to analyze ordinal data, ignoring the categorical nature of the response variable and using standard parametric methods for continuous response variables. This approach assigns numerical scores to the ordered categories and then applies linear regression.


Multiple linear regression is used here to model the relationship between several explanatory variables (the questions in all dimensions except 'Overall satisfaction') and a response variable (a question in the dimension 'Overall satisfaction', or the weighted score calculated in that dimension according to Section 4.1.2) by fitting a linear equation to the observed data. Every value of the independent variables x is associated with a value of the dependent variable y. The regression equation is written as

    y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon

The equation describes how y changes with the explanatory variables x. \beta_0, \dots, \beta_n are the model parameters and \varepsilon is the error or noise term. \varepsilon is assumed to have a normal distribution with mean 0 and constant variance \sigma^2, i.e. \varepsilon \sim N(0, \sigma^2). Thus

    y \sim N(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n, \sigma^2)

and

    E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n

We also have categorical independent variables in our dataset, such as gender, education and occupation. These variables also influence the target y, as described in Section 4.2.2. However, linear regression cannot handle such values directly, so the dummy coding described in Section 4.1.4 is used to solve this problem.
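The thesis reports SPSS output for this model; an equivalent fit could be sketched with statsmodels, using dummy-coded categoricals (the column names and values below are hypothetical):

import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "q_information": [4, 5, 2, 3, 5, 4, 1, 5],
    "q_respect":     [5, 5, 3, 3, 4, 4, 2, 5],
    "gender":        ["male", "female", "female", "male", "female", "male", "male", "female"],
    "overall":       [4.6, 5.0, 2.4, 3.1, 4.8, 4.2, 1.9, 4.9],
})
# Dummy-code the categorical predictor (one level dropped to avoid collinearity
# with the intercept) and fit y = beta_0 + beta_1 x_1 + ... + eps by least squares.
X = pd.get_dummies(df[["q_information", "q_respect", "gender"]],
                   drop_first=True, dtype=float)
X = sm.add_constant(X)
model = sm.OLS(df["overall"], X).fit()
print(model.rsquared)      # counterpart of the R Square in Table 4.3
print(model.params)        # estimated coefficients (cf. Appendix B.2)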

The linear regression results are shown in Table 4.3 and Table 4.4. Detailed coefficients are given in Appendix B.2.

Table 4.3: Model summary of the linear regression

  R       R Square    Adjusted R Square    Std. Error of the Estimate
  .888    .789        .789                 .467688085

Table 4.4: ANOVA of the linear regression

                Sum of Squares       df    Mean Square           F    Sig.
  Regression         64365.681       40       1609.142    7356.678    .000
  Residual           17177.910    78534           .219
  Total              81543.592    78574

4.3 Machine learning classification techniques

The fundamental goal of machine learning algorithms is to discover meaningful relationships in a body of training data, presented as individual examples, and to produce a generalization of those relationships that can be used to interpret subsequently presented test data. As a result, several learning paradigms have arisen to deal with different conditions. When the training data contain explicit examples of what the correct output should be for given inputs, it is called supervised learning; when the training data contain no output information at all, it is called unsupervised learning; when the training data do not specify the target output but instead contain possible outputs together with a measure of how good each output is, it is called reinforcement learning, and so on.

In this study, we have our target output available and therefore focus on supervised learning. As stated above, a correct or target output is available; when the output variable is a numeric value, the problem is a regression problem, and when the output variable is a categorical value, the problem is a classification problem. Classification is the most commonly performed exploratory task in machine learning and is the focus here.

Here we apply classification techniques to classify a patient's attitude toward the health care service he or she received, and try to find the important factors affecting patient satisfaction. We transform the weighted score in the dimension 'Overall satisfaction', described in Section 2.2.3, into binomial labels (positive = 1 or negative = 0): when the score is greater than 3.5 we label it a positive response, and when it is smaller than 3.5, a negative response. Several machine learning algorithms are applied to this classification problem.

4.3.1 Naïve Bayes classifier

The naïve Bayes classifier is based on Bayes' rule [41], which states that the posterior probability of a random event can be determined once the relevant evidence is taken into account:

    P(B|A) = \frac{P(A|B)\, P(B)}{P(A)}

that is,

    \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}

A naïve Bayes classifier is a probabilistic model that naïvely assumes the features (x_1, x_2, ..., x_n) are independent given the class. It can be written as

    P(C | x_1, x_2, \dots, x_n) = \frac{P(C)\, P(x_1, x_2, \dots, x_n | C)}{P(x_1, x_2, \dots, x_n)}

where C is the class label. Since the features are assumed to be independent, P(x_1, x_2, ..., x_n | C) can be rewritten as a product of the component probabilities. Hence the posterior probability becomes

    P(C | x_1, x_2, \dots, x_n) \propto P(C)\, P(x_1, x_2, \dots, x_n | C)
                                \propto P(C)\, P(x_1|C)\, P(x_2|C) \cdots P(x_n|C)
                                \propto P(C) \prod_{i=1}^{n} P(x_i | C)

The naïve Bayes classifier then assigns a class to a test sample using the above equation with the maximum a posteriori (MAP) rule.
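The thesis does not name a specific implementation; a minimal sketch with scikit-learn's CategoricalNB on synthetic, survey-like data (the feature construction is purely illustrative):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB

# Synthetic stand-in: ten Likert items coded 0-4, with a binarized overall label.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(500, 10))
y = (X.mean(axis=1) > 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# CategoricalNB estimates P(x_i | C) per category and predicts by the MAP rule.
clf = CategoricalNB().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))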


4.3.2 Logistic regression

Logistic regression investigates the relationship between a categorical response variable and a set of explanatory variables [41]. The logistic regression model is a member of the generalized linear model (GLM) class, and it is an appropriate model for studying the relationship between a binary response variable y, representing a positive response (y = 1) or a negative response (y = 0), and a set of explanatory variables (x_1, x_2, ..., x_n).

Instead of approximating the 0 and 1 values directly, logistic regression buildsa linear model based on a transformed target variable. The ratio of event tononevent is taken into consideration which is called the ’odd ratio’. Odd ratiocan take any value between 0 and infinity and it is in ratio scale. This odd ratiocan be modeled as an exponential series as shown below.

P

1− P= eβ0+β1x1+β2x2+...+βnxn

Where P and (1−P ) are probabilities of events and nonevents. x1, x2, ..., xn areindependent variables. β0, β1, β2, ..., βn are the parameters. Taking log on boththe side convert a nonlinear equation into a linear equation as shown below.

log(P

1− P) = β0 + β1x1 + β2x2 + ...+ βnxn

or equivalently,

\[ P = \frac{\exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)}{1 + \exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)} \]

$\log(P/(1-P))$ is called the link function, and the parameters are estimated using the maximum likelihood method. Unlike linear regression, logistic regression can handle both numeric and categorical independent variables.
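A minimal sketch of this model with scikit-learn, on synthetic placeholder data rather than the survey itself, looks as follows:

```python
# Illustrative sketch: logistic regression models log(P/(1-P)) as a linear function of the features.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

clf = LogisticRegression(max_iter=1000)   # coefficients beta fitted by maximum likelihood
clf.fit(X, y)
print(clf.coef_, clf.intercept_)          # beta_1..beta_n and beta_0
print(clf.predict_proba(X[:5])[:, 1])     # estimated P(y = 1 | x)
```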

4.3.3 Tree-based model

Tree-based methods are data-driven tools based on sequential procedures that recursively partition the data [42]. They exploit tree graphs to provide visual representations of the rules underlying a given dataset, which gives simple ways of understanding and summarizing the main features of the data.

Decision tree

A decision tree is a kind of tree graph composed of nodes and edges (as Figure 4.5 shows). The leaves of the final tree are the base hypotheses. For example, the leaves of the classification example in Figure 4.5 represent different attitudes towards a service (satisfied or unsatisfied).

Decision trees are generally constructed from a dataset which contains a dependent variable $Y$ and $K$ ($K \geq 1$) predictors $X_1, \ldots, X_K$ for $n$ ($n \geq 1$) sample units: $D = \{(y_i, x_{i1}, \ldots, x_{iK});\ i = 1, \ldots, n\}$. The process of constructing a tree from the learning set is called tree building or tree growing. The procedure starts from the entire learning set, which is graphically represented as a tree composed of a single leaf, also called the root node. The learning set is partitioned into two or more subsets according to a given splitting rule; this produces a new tree with an increased size. Then, the


Figure 4.5: Decision tree example from [42]. The dependent variable considered in the example is the satisfaction of a customer with a given service provided by an international company (Satisfaction, with categories no and yes), and the predictors are the country in which the customer lives (Country, with categories Benelux, Germany, Israel and United Kingdom) and how many years the customer has used the service (Seniority).

partitioning procedure is recursively applied by splitting the leaves of the tree constructed in the previous step. Thus, tree growing is performed through a top-down procedure. The result of the recursive partition process is a sequence of nested trees having an increasing number of leaves.

For tree growing/construction it is of vital importance to choose a suitable goodness-of-split criterion, which is used to score all possible splitting rules for a given leaf node and select the best one, and to define criteria to stop the recursive partitioning. Many recursive partitioning methods have been proposed, named ID3, C4.5, CART, CHAID, QUEST, GUIDE, CRUISE, and CTREE. For example, ID3's splitting criterion is information gain, C4.5's splitting criterion is gain ratio, and CART's splitting criterion is the Gini index [43].
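A small sketch of a CART-style tree (an illustration only, not the tool or data used in this project) makes the learned splitting rules visible:

```python
# Illustrative sketch: a CART-style decision tree grown with the Gini index as splitting criterion.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)  # placeholder data

tree = DecisionTreeClassifier(criterion="gini", max_depth=4, min_samples_leaf=20, random_state=0)
tree.fit(X, y)
print(export_text(tree))   # the learned splitting rules, leaf by leaf
```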

Random forest

Decision tree bagging relies on the fact that single-tree methods can produce very different predictions depending on which observations are included. The idea of bagging is to conceptualize trees fit to different subsets of data as if they were independent draws of a random variable. With the help of bagging, we can reduce the variance in single-tree estimates of the response by fitting many trees and combining them as in the equation below, where $M$ is the number of trees and $\Theta_m$ is the set of parameters that define each tree $T_m$.


\[ f(X_i) = \sum_{m=1}^{M} T_m(X_i; \Theta_m) \]

The random forest algorithm proceeds as follows. First, it bootstraps (i.e. samples with replacement) m times to obtain m new samples of size N. Then m decision trees are trained on these m samples (in a random forest, each tree typically also considers only a random subset of the features at each split). When making a prediction with the random forest, each decision tree has one vote, and the result is decided by majority voting. An example of a random forest is shown in Figure 4.6.

Figure 4.6: Random forest example from [44].

When bagging is applied to single decision trees, it typically results in higher accuracy if applied to unpruned trees [45]. The reason is that unpruned trees are overfitted to their training data and therefore have low bias and high variance. Since bagging is a variance reduction technique, it can reduce this variance so that the final bagged model has both low bias and low variance and thus a lower total prediction error.
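The bagging-and-voting procedure can be sketched as follows (again on placeholder data; the forest size and depth are assumed values, not those tuned in the experiments):

```python
# Illustrative sketch: bagging many trees on bootstrap samples and predicting by majority vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)                       # each tree sees its own bootstrap sample
print(forest.predict(X[:5]))           # majority vote over the trees
print(forest.feature_importances_)     # which features drive the forest's decisions
```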

Adaptive boosting decision tree

Similar to bagging decision trees to construct random forests, tree boosting approaches also solve the problem by creating multiple trees. There are many variants of the idea of boosting; the most widely used method is called adaptive boosting. However, unlike the random forest, which aggregates decision trees uniformly, the adaptive boosting decision tree is a non-uniform aggregation of decision trees [46].

Every tree has its own weight α on the final prediction. The sample weights are initially equal, and at step m the cases misclassified at step m − 1 have their weights increased while those correctly classified have theirs decreased. Thus, as the iterations proceed, samples that are difficult to classify receive more attention.

\[ f(X_i) = \sum_{m=1}^{M} \alpha_m T_m(X_i; \Theta_m) \]

Moreover, the difference between the random forest and the adaptive boosting decision tree lies not only in the different weights given to the trees, but also in the different bootstrapping methods. The random forest can bootstrap its m samples in parallel (independently), whereas each subsequent sample in adaptive boosting focuses on the problematic observations of the previous decision tree model (i.e. the observations the previous tree could not handle). The bootstrapping, model training and model-weight calculation are performed sequentially, so that each new tree improves the predictive power of the boosted ensemble. Finally, the adaptive boosting decision tree aggregates the trees non-uniformly using the calculated model weights $\alpha_m$.
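A minimal adaptive boosting sketch (illustrative only; the default base learner here is a decision stump, which is an assumption and not necessarily the configuration used in the thesis):

```python
# Illustrative sketch: adaptive boosting of decision trees with per-tree weights alpha_m.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

boost = AdaBoostClassifier(n_estimators=100, random_state=0)  # default base learner: a shallow tree (stump)
boost.fit(X, y)                          # samples misclassified by earlier trees get higher weight
print(boost.estimator_weights_[:5])      # the alpha_m weights of the first trees
print(boost.predict(X[:5]))              # non-uniform (weighted) aggregation of the trees
```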

4.3.4 Support vector machines (SVM)

Support vector machines are a blend of linear modeling and instance-based learning [41]. They use linear models to implement nonlinear class boundaries by transforming the sample space into a new space.

Figure 4.7: Hyperplane example from [41].

Support vector machines select a small number of critical boundary instances called support vectors from each class and build a linear discriminant function that separates them as widely as possible, i.e. they find the maximum-margin hyperplane. For example, for a two-class dataset whose classes are linearly separable (as shown in Figure 4.7), there is a hyperplane in sample space that classifies all training samples correctly. The maximum-margin hyperplane is the one that gives the greatest separation between the classes.

The set of support vectors defines the maximum-margin hyperplane. The "plus-plane" and "minus-plane" are the planes on which the support vectors lie, and the distance between them is the margin M that we need to maximize. It is easy to see that

\[ M = \frac{2}{\lVert w \rVert} \]

Thus the problem can be transformed into


Figure 4.8: Decision boundary and Margin.

\[ \min_{w,b}\ \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.}\quad y_i(w^T x_i + b) \geq 1,\ i = 1, 2, \ldots, n \]

This is an optimization problem with a convex quadratic objective and only linear constraints; solving it yields the optimal margin classifier.

However, when the data cannot be separated linearly, kernel methods are introduced to help with the classification; commonly used kernels include polynomial, Gaussian (RBF) and sigmoid kernels.
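A kernelized SVM can be sketched as below (an illustration on placeholder data; the C and gamma values are example settings, not the tuned values reported later):

```python
# Illustrative sketch: a maximum-margin classifier with an RBF (Gaussian) kernel.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

svm = SVC(kernel="rbf", C=1.0, gamma="scale")  # the kernel implicitly maps data to a new space
svm.fit(X, y)
print(svm.n_support_)        # number of support vectors per class
print(svm.predict(X[:5]))
```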

4.3.5 Artificial neural networks

An artificial neural network (ANN), also called a neural network (NN), is a computational model inspired by the structure of the brain's biological nervous system, first proposed by McCulloch and Pitts [47] in 1943. Modern artificial neural networks are usually used to model complex relationships between inputs and outputs or to find patterns in data.

Artificial neural networks consist of a large number of simple processing elements called neurons, which are interconnected via channels called connections. The connections between the neurons are also called links, and every link has a weight parameter associated with it. Since the performance of a neural network does not degrade significantly if one of its links or neurons is faulty, the highly interconnected structure makes neural networks fault tolerant.

Each neuron receives stimuli from the neighboring neurons connected to it, processes the information received and produces an output. Neurons in the network are divided into different kinds:

• Input neurons: neurons that receive stimuli from outside the network.

• Output neurons: neurons that give outputs outside the network.

• Hidden neurons: neurons that receive stimuli from other neurons and give outputs as stimuli to other neurons in the network.


Based on the link pattern, artificial neural networks can be grouped into two types:

• Feedforward network: the network graph has no loops.

• Feedback network: loops occur because of feedback links.

Artificial neural networks come in different architectures, in which the neurons process information in different ways and the links are connected in different ways, namely multilayer perceptrons (MLP), radial basis function (RBF) networks, wavelet neural networks, recurrent networks and self-organizing maps (SOM). Here we discuss the most commonly used neural network, the multilayer perceptron (MLP), which is a feed-forward network trained by a back-propagation algorithm.

Neurons are grouped into layers in an MLP [48]. The first and last layers, which represent the inputs and outputs of the neural network, are called the input and output layers respectively; the remaining layers are called hidden layers. An MLP contains an input layer, one or more hidden layers, and an output layer; an example of an MLP is shown in Figure 4.9.

Figure 4.9: An example of MLP with three layers.

Suppose the total number of layers is $L$. The 1st layer is the input layer, the $L$th layer is the output layer, and layers 2 to $L-1$ are hidden layers. Let the number of neurons in the $l$th layer be $N_l$, $l = 1, 2, \ldots, L$. Let $w_{ij}^{l}$ represent the weight of the link between the $j$th neuron of the $(l-1)$th layer and the $i$th neuron of the $l$th layer ($w_{i0}^{l}$ is the bias for the $i$th neuron of the $l$th layer). Let $x_i$ denote the $i$th external input of the MLP, and $z_i^{l}$ the output of the $i$th neuron of the $l$th layer. A neuron processes information in the following way: each input is first multiplied by the corresponding weight parameter, and the resulting products are added to a weighted sum $\gamma = \sum_{j=1}^{N_{l-1}} w_{ij}^{l} z_j^{l-1}$; this $\gamma$ is then passed through an activation function


$\sigma(\cdot)$ (most commonly the sigmoid $\sigma(\gamma) = 1/(1+e^{-\gamma})$) to produce the output passed on to the neurons in the next layer, as shown in Figure 4.10.

Figure 4.10: Information processing by ith neuron of lth layer

Then the computation of the feedforward MLP with input $X = [x_1\, x_2\, \ldots\, x_n]^T$ and output $Y = [y_1\, y_2\, \ldots\, y_m]^T$ is

\[ z_i^{1} = x_i, \quad i = 1, 2, \ldots, n \]
\[ z_i^{l} = \sigma\!\left( \sum_{j=0}^{N_{l-1}} w_{ij}^{l} z_j^{l-1} \right), \quad i = 1, 2, \ldots, N_l \]
\[ y_i = z_i^{L}, \quad i = 1, 2, \ldots, m \]

The MLP aims to find the optimal set of weight parameters that most closely represents the relation between input and output. The training method used is back propagation. The first step is to initialize the weight parameters (to small random values). During training, the output values are compared with the correct answers to compute the value of some predefined error function, and the error is then fed back through the network. Using this error information, the algorithm adjusts the weights of each connection in order to reduce the value of the error function. The propagation and weight-update steps are repeated until the performance of the network is good enough.
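A compact MLP sketch (an illustration on placeholder data; the hidden-layer size and iteration count are assumptions, not the values used in the experiments):

```python
# Illustrative sketch: a feed-forward MLP with one hidden layer trained by back propagation.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

mlp = MLPClassifier(hidden_layer_sizes=(20,),   # one hidden layer with 20 neurons
                    activation="logistic",      # sigmoid activation, as in the text
                    max_iter=500,
                    random_state=0)
mlp.fit(X, y)                                   # weights updated from the back-propagated error
print(mlp.predict(X[:5]))
```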


Chapter 5

Experiments and results

This chapter contains the experiment design for the project and the results obtained from the experiments.

5.1 Experiment design

5.1.1 Data split

Instead of just working with all of the data, we need to partition the data into separate roles, a training dataset and a testing dataset. The training dataset is used to tune the parameters of an adaptive model, while the testing dataset is used to evaluate the performance of the model, i.e. how well it generalizes to new data. Since the testing dataset should never be seen before the testing phase, we apply the data split as the very first step, even before data pre-processing such as missing value imputation.

Partitioning is based on shuffled sampling: we build random subsets of the survey dataset. Here we split the dataset into an 80% training dataset and a 20% testing dataset. With this 80/20 split we obtain enough data both to train the model and to test it.
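A minimal sketch of such an 80/20 shuffled split (illustrative, with placeholder data rather than the survey responses):

```python
# Illustrative sketch: an 80/20 shuffled split into training and testing sets.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, shuffle=True, random_state=0)
print(X_train.shape, X_test.shape)   # e.g. (800, 10) (200, 10)
```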

5.1.2 Data pre-processing

Data pre-processing is performed after the data have been split into training and testing datasets. Missing data are imputed with the K-nearest neighbor algorithm (K = 5 in this experiment), and the target variable, the weighted score in the dimension 'Overall satisfaction', is calculated. The dataset is then under-sampled to balance the classes, and for the classification tasks the label transformation described earlier is also performed.
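The steps can be sketched as follows; the variable names, column layout and the synthetic data are assumptions for illustration only, not the project's actual data schema:

```python
# Illustrative sketch of the pre-processing steps: KNN imputation (K=5), label
# transformation at the 3.5 threshold, and random under-sampling of the majority class.
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
X = rng.uniform(1, 5, size=(1000, 8))                 # placeholder 1-5 survey answers
X[rng.random(X.shape) < 0.05] = np.nan                # inject some missing values
score = np.nanmean(X, axis=1)                         # stand-in for the weighted 'Overall satisfaction' score

X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)
y = (score > 3.5).astype(int)                         # positive = 1, negative = 0

# Under-sample the majority class so that both classes are equally represented.
pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
n = min(len(pos), len(neg))
keep = np.concatenate([rng.choice(pos, n, replace=False), rng.choice(neg, n, replace=False)])
X_bal, y_bal = X_imputed[keep], y[keep]
print(X_bal.shape, y_bal.mean())                      # balanced dataset, mean label = 0.5
```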

5.1.3 Cross validation

The bias and variance dilemma in machine learning is the common problem of trading off good generalization against over-training. A general technique to balance the bias and variance of the model is cross-validation.


Common methods include leave-one-out cross validation and n-fold cross validation. Since leave-one-out cross validation requires a large amount of computation and the widely used n-fold cross validation works well enough in practice, we use 10-fold cross validation here. The procedure of n-fold cross validation [49] is as follows:

1. Divide the dataset D into n disjoint subsets D1, D2,..., Dn.

2. For i = 1, 2,..., n:

(a) $D_{\text{testing}} = D_i$, $D_{\text{training}} = D \setminus D_i$

(b) Train the model on $D_{\text{training}}$ and use $D_{\text{testing}}$ to evaluate the performance $P_i$

3. Evaluate the performance of the model by $P_{\text{overall}} = \frac{1}{n} \sum_{i=1}^{n} P_i$

10-fold cross validation divides all the samples into ten groups of samples (folds) of equal size (if possible). For n (= 10) iterations, the prediction model is trained on nine (n − 1) folds, and the fold left out is used for testing. An illustration of the partitioning of the training data is shown in Figure 5.1.

Figure 5.1: 10-fold cross validation
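The procedure corresponds to a standard 10-fold loop; a minimal sketch on placeholder data (the estimator is only an example choice):

```python
# Illustrative sketch: 10-fold cross validation, averaging the per-fold accuracy P_i.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)  # placeholder data

scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10, scoring="accuracy")
print(scores)          # P_1 ... P_10, one score per held-out fold
print(scores.mean())   # P_overall
```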

5.1.4 Grid search

In machine learning, hyper-parameters are parameters that are not directly learned within the model; instead they are passed as arguments when constructing the model. Grid search exhaustively generates candidates from a grid of parameter values. For example, if 3 parameters are selected with 10 steps each, the total number of combinations is 1,000 (i.e. 10 × 10 × 10). Here we perform grid search for the models that require hyper-parameters.
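An exhaustive search can be sketched as below; the parameter grid shown is an example, not the exact grid used in the thesis:

```python
# Illustrative sketch: exhaustive grid search over hyper-parameters with 10-fold cross validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # placeholder data

param_grid = {"kernel": ["rbf", "poly"], "C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=10, scoring="accuracy")
search.fit(X, y)                    # tries every combination in the grid
print(search.best_params_, search.best_score_)
```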

5.1.5 Evaluate model

To determine which models best solve the problem, we need metrics to evaluate the models. For the classification models, we use the confusion matrix and the ROC chart to evaluate performance.


Confusion matrix

For a classifier $f$, denote the confusion matrix with respect to $f$ as $C(f)$. Then $C(f)$ is a square $l \times l$ matrix for a dataset with $l$ classes. Each element $c_{ij}(f)$ of the confusion matrix denotes the number of examples that actually have a class $i$ label and that the classifier $f$ assigns to class $j$. Thus, for a test set $T$ of examples and a classifier $f$, the confusion matrix $C(f)$ can be defined as

\[ C(f) = \Big\{ c_{ij}(f) = \sum_{x \in T} \big[(y = i) \wedge (f(x) = j)\big] \Big\} \]

where $x$ is a test example and $y$ is its corresponding label such that $y \in \{1, 2, \ldots, l\}$. For the binary classification case, which is also the case for our classification tasks, $l = 2$ and the confusion matrix is a $2 \times 2$ matrix with the two classes called "negative" and "positive".

Table 5.1: Confusion matrix for the binary classification case

                 Pred. Negative    Pred. Positive
Act. Negative    c11(f) or TN      c12(f) or FP      YN = TN + FP
Act. Positive    c21(f) or FN      c22(f) or TP      YP = FN + TP
                 fN = TN + FN      fP = FP + TP

As shown in Table 5.1, the confusion matrix contains four characteristic values: the numbers of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). TP and TN stand for the number of examples from the testing set that were correctly classified as positive and negative, respectively. Conversely, FN and FP stand for the positive and negative examples that were erroneously classified as negative and positive, respectively. YN and YP denote the number of examples that actually have a negative and a positive label, respectively, while fN and fP denote the number of examples to which the classifier f assigns a negative and a positive label, respectively. The main metrics defined from the matrix that will be examined in our case are as follows.

\[ \text{Precision (Positive Predictive Value)} = \frac{TP}{TP + FP} \]
\[ \text{Recall (True Positive Rate)} = \frac{TP}{TP + FN} \]
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
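These quantities can be computed directly from a fitted model's test-set predictions; a minimal sketch on placeholder data:

```python
# Illustrative sketch: confusion matrix, precision, recall and accuracy on a held-out test set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)   # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

y_pred = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict(X_test)
print(confusion_matrix(y_test, y_pred))          # rows: actual class, columns: predicted class
print(precision_score(y_test, y_pred))           # TP / (TP + FP)
print(recall_score(y_test, y_pred))              # TP / (TP + FN)
print(accuracy_score(y_test, y_pred))            # (TP + TN) / all
```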

ROC graph

Graphical performance measures are very useful as they enable visualization of the classifier performance over the full operating range [50]. The receiver operating characteristic (ROC) graph is a very powerful graphical tool for visualizing the performance of a learning algorithm over varying decision criteria.

An ROC curve is a plot in which the horizontal axis (the x axis) denotes the false-positive rate (FPR = FP/(FP + TN)) and the vertical axis (the y axis) denotes the true-positive rate (TPR = TP/(TP + FN)) of a classifier. The TPR is the sensitivity of the classifier, whereas the FPR is 1 − TNR (where TNR = TN/(FP + TN) is the true negative rate, or specificity). Hence, ROC analysis studies the relationship between the sensitivity and the specificity of the classifier.

The total area under the ROC curve is abbreviated as AUC. The random classifier cuts the ROC space in half, and hence AUC(f_random) = 0.5; a classifier with better performance than the random classifier should have an AUC greater than 0.5.

Figure 5.2: The ROC space from [50]

A threshold curve is also plotted in the graph; it refers to the confidence value of the prediction. If the confidence of an example being positive is greater than the threshold, the example is classified as positive; if the confidence is below the threshold, it is classified as negative.
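The curve and its area can be computed from the predicted positive-class probabilities; a minimal sketch on placeholder data:

```python
# Illustrative sketch: an ROC curve from predicted positive-class probabilities, and its AUC.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)   # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

proba = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, proba)   # FPR and TPR at every decision threshold
print(roc_auc_score(y_test, proba))               # area under the ROC curve
```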

5.2 Naïve Bayes classifier

A naïve Bayes classifier was trained to classify the patients' attitude towards their overall satisfaction. It is a very simple model which only requires a small amount of training data to estimate the parameters necessary for classification. Nevertheless, as shown in Table 5.2, it performs well.

The classifier's accuracy is 89.11%, with a 96.06% precision rate and a 90.10% recall rate. The ROC graph of the naïve Bayes classifier is shown in Figure 5.3, and its AUC is 0.949.


Table 5.2: Confusion matrix of the naïve Bayes classifier

              true false   true true   class precision
pred. false   2656         1246        68.07%
pred. true    465          11346       96.06%
class recall  85.10%       90.10%

Figure 5.3: The ROC graph of the naïve Bayes classifier

5.3 Logistic regression

Logistic regression is applied to the training dataset to train a classifier for patient overall satisfaction. Table 5.3 shows the confusion matrix of the logistic regression model.

Table 5.3: Confusion matrix of logistic regression

              true false   true true   class precision
pred. false   2718         1192        69.51%
pred. true    403          11400       96.59%
class recall  87.09%       90.53%

The classifier's accuracy is 89.85%, with a 96.59% precision rate and a 90.53% recall rate. The ROC graph of the logistic regression classifier is shown in Figure 5.4, and its AUC is 0.957.

5.4 Tree-based model

Tree-based methods, which have an intuitive structure, are useful tools for analyzing the relationship between a dependent variable and a set of predictors. Here we use decision tree, random forest (bagging of decision trees) and adaptive boosting decision tree (boosting of decision trees) to train classification models.

Figure 5.4: The ROC graph of logistic regression

5.4.1 Decision tree

The decision tree model is trained with grid search to find the optimal parameters. The tuned parameters include the maximal depth of the tree, the splitting criterion (gain ratio, information gain, Gini index, etc., which determines the type of decision tree), the minimal leaf size and the minimal size for a split. With the optimal parameters, the confusion matrix is shown in Table 5.4.

Table 5.4: Confusion matrix of the decision tree

              true false   true true   class precision
pred. false   2621         1433        64.65%
pred. true    500          11159       95.71%
class recall  83.98%       88.62%

The classifier's accuracy is 87.70%, with a 95.71% precision rate and an 88.62% recall rate. The ROC graph of the decision tree classifier is shown in Figure 5.5, and its AUC is 0.887.

5.4.2 Random forest

The random forest model is trained with grid search to find the optimal parameters. The tuned parameters include the minimal gain, the number of trees, the maximal depth of the trees and the minimal leaf size. With the optimal parameters, the confusion matrix is shown in Table 5.5.

The classifier's accuracy is 89.58%, with a 95.42% precision rate and a 91.38% recall rate. The ROC graph of the random forest classifier is shown in Figure 5.6,


Figure 5.5: The ROC graph of decision tree

Table 5.5: Confusion matrix of the random forest

              true false   true true   class precision
pred. false   2569         1086        70.29%
pred. true    552          11506       95.42%
class recall  82.31%       91.38%

and its AUC is 0.947.

Figure 5.6: The ROC graph of random forest


5.4.3 Adaptive boosting decision tree

The adaptive boosting decision tree model is trained with grid search to find the optimal parameters. The tuned parameters include the number of iterations, the maximal depth of the trees and the minimal gain. With the optimal parameters, the confusion matrix is shown in Table 5.6.

Table 5.6: Confusion matrix of the adaptive boosting decision tree

              true false   true true   class precision
pred. false   2708         1566        63.36%
pred. true    413          11026       96.39%
class recall  86.77%       87.56%

The classifier's accuracy is 87.41%, with a 96.39% precision rate and an 87.56% recall rate. The ROC graph of the adaptive boosting decision tree classifier is shown in Figure 5.7, and its AUC is 0.927.

Figure 5.7: The ROC graph of adaptive boosting decision tree

5.5 Support vector machines (SVM)

The support vector machine learns a classifier from the training dataset which separates the set of positive examples from the set of negative examples by introducing the maximum margin between the two sets. During this process, the training data are described as points in sample space in order to find the support vectors. The support vector machine model is trained with grid search to find the optimal parameters; the tuned parameters include the kernel type, the C value and the gamma value. With the optimal parameters, the confusion matrix is shown in Table 5.7.

The classifier's accuracy is 89.19%, with a 96.79% precision rate and an 89.48% recall rate. The ROC graph of the support vector machine classifier is shown in


Table 5.7: Confusion matrix of the support vector machine

              true false   true true   class precision
pred. false   2747         1325        67.46%
pred. true    374          11267       96.79%
class recall  88.02%       89.48%

Figure 5.8, and its AUC is 0.957.

Figure 5.8: The ROC graph of support vector machine

5.6 Artificial neural networks

An artificial neural network is a set of connected input/output units in which each connection has a weight associated with it. During the learning process, the network learns by updating the weights so as to be able to predict the correct class label of the input. The artificial neural network model is trained with grid search to find the optimal learning rate. With the optimal parameter, the confusion matrix is shown in Table 5.8.

Table 5.8: Confusion matrix of the artificial neural network

              true false   true true   class precision
pred. false   2726         1355        66.80%
pred. true    395          11237       96.60%
class recall  87.34%       89.24%

The classifier's accuracy is 88.86%, with a 96.60% precision rate and an 89.24% recall rate. The ROC graph of the artificial neural network classifier is shown in Figure 5.9, and its AUC is 0.952.

Figure 5.9: The ROC graph of artificial neural network

5.7 Model performance comparison

In this section, a summary of the model performance on the testing dataset is given, considering different comparison metrics. The performance metrics considered include accuracy, precision, recall and AUC. The results are shown in Table 5.9.

Table 5.9: Model performance comparison with different metrics

Model                              Accuracy (%)   Precision (%)   Recall (%)   AUC
Naïve Bayes                        89.11          96.06           90.10        0.949
Logistic regression                89.85          96.59           90.53        0.957
Decision tree                      87.70          95.71           88.62        0.887
Random forest                      89.58          95.42           91.38        0.947
Adaptive boosting decision tree    87.41          96.39           87.56        0.927
Support vector machine             89.19          96.79           89.48        0.957
Artificial neural networks         88.86          96.60           89.24        0.952

As the results show, logistic regression achieved the highest accuracy (89.85%), the support vector machine reached the highest precision (96.79%), the random forest achieved the highest recall (91.38%), and logistic regression as well as the support vector machine obtained the largest AUC (0.957).

However, the performance of all the models is relatively high and the gaps between them are small. The confusion matrices shown in the previous sections also imply that one classifier might be strong in one specific respect and weak in another. Thus, combining classifiers (also called ensemble methods in machine learning) might be considered. Ensemble methods combine classifiers by averaging or voting, and in general combining classifiers improves accuracy. In this case, logistic regression, random forest and support vector machine might be combined to obtain both high precision and high recall, and possibly higher accuracy.
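Such a combination could be sketched as a voting ensemble; this is a possible follow-up suggested by the comparison, not a model that was evaluated in the thesis, and the data below are placeholders:

```python
# Illustrative sketch: combining logistic regression, random forest and SVM by soft voting.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)   # placeholder data

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("svm", SVC(probability=True)),            # probability=True is required for soft voting
    ],
    voting="soft",                                 # average the predicted class probabilities
)
print(cross_val_score(ensemble, X, y, cv=5, scoring="accuracy").mean())
```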


Chapter 6

Conclusions and Future Work

As an attempt to analyze a patient satisfaction survey, this thesis has explored ways of analyzing patients' responses based on classical statistical methods and machine learning methods. The findings of the study can give some implications for health care providers, specialists and even patients, who are the target customers of the health care service.

The classical statistical methods, including cross tabulation, chi-square test, correlation matrix and linear regression, are used to find the relationships between features. The results show that patients' demographics influence overall satisfaction, meaning that patients with different gender, education and occupation differ in their satisfaction with the health care service. The patients' physical condition (i.e. whether they felt pain during the visit to the health care center) also affects their opinion. Moreover, the questions within the dimensions show positive correlations with each other, which indicates the consistency of patients' attitudes.

Machine learning classification methods are used to build models that classify patients' overall satisfaction based on their demographics and responses to the survey questions. Models including the naïve Bayes classifier, logistic regression, tree-based models (decision tree, random forest, adaptive boosting decision tree), support vector machines and artificial neural networks all perform well on the classification task. These classification models can help to find the patterns behind the survey data and thus improve the quality of the health care service.

Future work might be in two fields: data pre-processing (dealing with patients' subjective feelings and with outliers in the samples) and machine learning algorithm optimization (blending models and analyzing free text).

Patients' subjective feelings should be taken into consideration. Some of the questions in the survey are subjective, dealing with behaviours, feelings and psychological benefits. Hence, patients may have different expectations of health care service and respond differently to the same quality of service. We might need to find a way to "normalize" these subjective responses.

What is more, there is no suitable way to detect outliers in this data. Rejection of outliers prior to performing analysis has been regarded as an essential pre-processing step for almost as long as the methods have existed, but it is difficult to define outliers in ordinal variables. For multivariate data, moreover, examining each dimension by itself or in pairs does not work, because it is possible for some data points to be outliers in multivariate space but not in any of the original univariate dimensions.

For machine learning algorithm optimization, blending models might be considered. Free text describing satisfaction could also be analyzed. The data source of this study is the structured survey responses from patients. Another source of information about patient satisfaction is the last question of the survey, the open question: the information is contained in free text, not in a set of answers elicited for a specific set of questions. Free text is also not limited to the open question of the survey, but can be feedback from patients in blogs, newsgroups, feedback e-mails, etc. This kind of free-text information is becoming more and more pervasive and voluminous, and it constitutes spontaneous patient feedback. Natural language processing technology could be used to evaluate patient satisfaction from it.


Appendix A

2016 patient satisfaction survey

How was your most recent visit to <enhetsnamn>?

1. Are you satisfied with the ways in which you can get in contact with the health center (e.g. 1177 Vårdguiden, telephone, e-services, website or other)?
1–5: No, not at all – Yes, completely / Not applicable

2. Were you able to visit the health center within a reasonable time?
1–5: No, not at all – Yes, completely / Not applicable

3. Was it easy to get to the health center?
1–5: No, not at all – Yes, completely / Not applicable

4. During the visit, did the staff inform you about any delays?
1–5: No, not at all – Yes, completely / Not applicable

5. Do you get to see the same doctor at your visits to the health center?
1–5: No, never – Yes, always / Not applicable

6. Did you get to see the doctor you wanted to see?
Yes / No / Not applicable

7. Did you have the opportunity to ask the questions you wanted?
1–5: No, not at all – Yes, completely / Not applicable

8. If you asked the staff questions, did you get answers that you understood?
1–5: No, not at all – Yes, completely / Not applicable

Who answers the questionnaire?
The questionnaire is addressed to you who have recently visited a health center. The questionnaire is personal; your opinions cannot be replaced by someone else's. If you have difficulty answering the questionnaire yourself, you may ask a relative or legal guardian to help you; it is important that your answers are not influenced by the person helping you.

How to fill in the questionnaire:
Mark your answers with a cross, using a ballpoint pen. If you cross the wrong box, cover the whole box. The questions are answered on a five-point scale, where 1 is the most negative and 5 the most positive; you may also choose the option "Not applicable" if the question is not relevant to you. Fill in only ONE answer per question unless otherwise stated. Post the questionnaire no later than 27 November 2016.

9. If you asked the staff questions, did the staff answer with compassion and engagement?
1–5: No, not at all – Yes, completely / Not applicable

10. Did you feel that you were treated with respect and dignity regardless of: gender, transgender identity or expression, ethnicity, religion or other belief, disability, sexual orientation or age?
1–5: No, not at all – Yes, completely / Not applicable
(Feel free to elaborate on your answer on the last page under question 35.)

11. If the staff talked to each other about you, did you feel involved in the conversation?
1–5: No, not at all – Yes, completely / Not applicable

12. If you spoke with several members of staff during the visit, were they consistent in their communication?
1–5: No, not at all – Yes, completely / Not applicable

13. Did you feel that the staff cooperated well?
1–5: No, not at all – Yes, completely / Not applicable

14. Did the doctor treat you with compassion and care?
1–5: No, not at all – Yes, completely / Not applicable

15. Did the doctor take into account your own experiences of your illness/health condition?
1–5: No, not at all – Yes, completely / Not applicable

16. Did the doctor involve you in the decisions regarding your care/treatment?
1–5: No, not at all – Yes, completely / Not applicable

17. Are you involved in the decisions regarding your care/treatment to the extent that you wish?
1–5: No, not at all – Yes, completely / Not applicable
(Feel free to elaborate on your answer on the last page under question 35.)

18. Did you have the opportunity, if needed, to receive emotional support from the doctor (e.g. if you felt worry, fear, anxiety or similar)?
1–5: No, not at all – Yes, completely / Not applicable
(Feel free to elaborate on your answer on the last page under question 35.)

19. Did you receive sufficient information about medication and possible side effects?
1–5: No, not at all – Yes, completely / Not applicable

20. Did you receive sufficient information about the treatment?
1–5: No, not at all – Yes, completely / Not applicable

21. Did you receive sufficient information about warning signs to watch out for regarding your illness/health condition or your medication/treatment?
1–5: No, not at all – Yes, completely / Not applicable

22. Did the doctor explain the medication/treatment in a way that you understood?
1–5: No, not at all – Yes, completely / Not applicable

23. Did you and the doctor discuss what you yourself can do to improve your health?
Yes / No / Not applicable

24. Did you feel pain during the visit?
Yes / Yes, partly / No / Don't know

25. If you felt pain during the visit, did you quickly receive help with pain relief?
1–5: No, not at all – Yes, completely / Not applicable

26. If you felt discomfort regarding your illness/health condition or your medication/treatment, were you treated with compassion and care?
1–5: No, not at all – Yes, completely / Not applicable

27. If your family/next of kin wanted to talk to a doctor/nurse, did they have the opportunity to do so?
1–5: No, not at all – Yes, completely / Not applicable

28. Did the doctor give your family/next of kin the information they wanted?
1–5: No, not at all – Yes, completely / Not applicable

29. Do you consider that the staff at the health center coordinate your contacts with the care system to the extent you need?
1–5: No, not at all – Yes, completely / Not applicable

30. Do you consider that your current need for care has been met?
1–5: No, not at all – Yes, completely / Not applicable

31. Did you feel that the atmosphere at the health center was good?
1–5: No, not at all – Yes, completely / Not applicable

32. Would you recommend the health center to someone in your situation?
1–5: No, not at all – Yes, completely / Not applicable

About you

33. Please state your highest completed education (give only one answer):
Elementary school, folk school or equivalent / High school or equivalent / Post-secondary education, university or college / No completed education

34. Please state your main occupation (give only one answer):
Working / Unemployed / Studying / Pensioner / Other

35. Do you have any other comments, or would you like to elaborate on your answers? Your answer will be passed on in its entirety to the health care services. Please write within the box and print clearly.

Thank you for your participation! Please post the questionnaire no later than 27 November 2016.

Have you lost your reply envelope? Send the questionnaire postage-free to: K.G.M. Datadistribution AB, c/o Softronic, SVARSPOST 20667010 169 20 Solna.

If you have any questions, please contact IC Quality by e-mail: [email protected] or free of charge by telephone: 020 – 12 12 41, weekdays 8–17.

Appendix B

Basic analysis results

B.1 Chi-square test

B.1.1 Education and overall satisfaction

The chi-square test results for education and overall satisfaction are shown in Table B.1, Table B.2 and Table B.3.

Table B.1: Cross tabulation of education and overall satisfaction

                                    Overall satisfaction
Education                        Unsatisfied   Satisfied    Total
NA       Count                   510           1963         2473
         Expected Count          503.2         1969.8       2473
1 (a)    Count                   3144          16341        19485
         Expected Count          3964.6        15520.4      19485
2 (b)    Count                   5061          18843        23904
         Expected Count          4863.7        19040.3      23904
3 (c)    Count                   6935          24528        31463
         Expected Count          6401.8        25061.2      31463
4 (d)    Count                   336           906          1242
         Expected Count          252.7         989.3        1242
Total    Count                   15986         62581        78567
         Expected Count          15986         62581        78567

(a) Elementary school, folk school or equivalent
(b) High school or equivalent
(c) Post-secondary education, university or college
(d) No completed education

B.1.2 Gender and overall satisfaction

The chi-square test results for gender and overall satisfaction are shown in Table B.4, Table B.5 and Table B.6.


Table B.2: Chi-square test of education and overall satisfaction

                               Value        df   Asymptotic Significance (2-sided)
Pearson Chi-Square             313.625 a    4    .000
Likelihood Ratio               322.39       4    .000
Linear-by-Linear Association   216.665      1    .000
N of Valid Cases               78567

a 0 cells (0.0%) have an expected count less than 5. The minimum expected count is 252.71.

Table B.3: Symmetric measure of the chi-square test on education and overall satisfaction

                                              Value    Approximate Significance
Nominal by Nominal   Contingency Coefficient  .063     .000
N of Valid Cases                              78567

Table B.4: Cross tabulation of gender and overall satisfaction

                               Overall satisfaction
Gender                      Unsatisfied   Satisfied    Total
Man      Count              5600          25391        30991
         Expected Count     6306.3        24684.7      30991
Woman    Count              10389         37195        47584
         Expected Count     9682.7        37901.3      47584
Total    Count              15989         62586        78575
         Expected Count     15989         62586        78575

B.1.3 Occupation and overall satisfaction

The chi-square test results for occupation and overall satisfaction are shown in Table B.7, Table B.8 and Table B.9.

B.1.4 Question X000031 and overall satisfaction

The chi-square test results for question X000031 and overall satisfaction are shown in Table B.10, Table B.11 and Table B.12.

B.1.5 Question X000032 and overall satisfaction

The chi-square test results for question X000032 and overall satisfaction are shown in Table B.13, Table B.14 and Table B.15.

B.2 Linear regression


Table B.5: Chi-square test of gender and overall satisfaction

                               Value      df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             163.983 a  1    .000
Continuity Correction b        163.751    1    .000
Likelihood Ratio               165.797    1    .000
Fisher's Exact Test                                                    .000                   .000
Linear-by-Linear Association   163.981    1    .000
N of Valid Cases               78575

a 0 cells (0.0%) have an expected count less than 5. The minimum expected count is 6306.27.
b Computed only for a 2 × 2 table.

Table B.6: Symmetric measure of the chi-square test on gender and overall satisfaction

                                              Value    Approximate Significance
Nominal by Nominal   Contingency Coefficient  .046     .000
N of Valid Cases                              78575

Table B.7: Cross tabulation of occupation and overall satisfaction

                                    Overall satisfaction
Occupation                       Unsatisfied   Satisfied    Total
NA          Count                334           1156         1490
            Expected Count       303.2         1186.8       1490
Employee    Count                7177          21380        28557
            Expected Count       5810.3        22746.7      28557
Unemployed  Count                518           1298         1816
            Expected Count       369.5         1446.5       1816
Student     Count                936           1857         2793
            Expected Count       568.3         2224.7       2793
Pensioner   Count                6259          35123        41382
            Expected Count       8419.8        32962.2      41382
Other       Count                761           1765         2526
            Expected Count       514           2012         2526
Total       Count                15985         62579        78564
            Expected Count       15985         62579        78564


Table B.8: Chi-square test of occupation and overall satisfaction

                               Value        df   Asymptotic Significance (2-sided)
Pearson Chi-Square             1626.421 a   5    .000
Likelihood Ratio               1601.204     5    .000
Linear-by-Linear Association   749.681      1    .000
N of Valid Cases               78564

a 0 cells (0.0%) have an expected count less than 5. The minimum expected count is 303.16.

Table B.9: Symmetric measure of the chi-square test on occupation and overall satisfaction

                                              Value    Approximate Significance
Nominal by Nominal   Contingency Coefficient  .142     .000
N of Valid Cases                              78564

Table B.10: Cross tabulation of question X000031 and overall satisfaction

                                         Overall satisfaction
Question X000031                      Unsatisfied   Satisfied    Total
Yes        Count                      4767          41104        45871
           Expected Count             9341.8        36529.2      45871
No         Count                      6858          7153         14011
           Expected Count             2853.4        11157.6      14011
NA         Count                      4119          13307        17426
           Expected Count             3548.9        13877.1      17426
Total      Count                      15744         61564        77308
           Expected Count             15744         61564        77308

Table B.11: Chi-square test of question X000031 and overall satisfaction

                               Value        df   Asymptotic Significance (2-sided)
Pearson Chi-Square             9985.908 a   2    .000
Likelihood Ratio               9063.866     2    .000
Linear-by-Linear Association   3089.363     1    .000
N of Valid Cases               77308

a 0 cells (0.0%) have an expected count less than 5. The minimum expected count is 2853.38.

Table B.12: Symmetric measure of the chi-square test on question X000031 and overall satisfaction

                                              Value    Approximate Significance
Nominal by Nominal   Contingency Coefficient  .338     .000
N of Valid Cases                              77308


Table B.13: Cross tabulation of question X000032 and overall satisfaction

                                         Overall satisfaction
Question X000032                      Unsatisfied   Satisfied    Total
Yes        Count                      5428          34279        39707
           Expected Count             8073.3        31633.7      39707
No         Count                      7434          9373         16807
           Expected Count             3417.2        13389.8      16807
NA         Count                      2994          18477        21471
           Expected Count             4365.5        17105.5      21471
Total      Count                      15856         62129        77985
           Expected Count             15856         62129        77985

Table B.14: Chi-square test of question X000032 and overall satisfaction

                               Value        df   Asymptotic Significance (2-sided)
Pearson Chi-Square             7555.328 a   2    .000
Likelihood Ratio               6658.100     2    .000
Linear-by-Linear Association   175.991      1    .000
N of Valid Cases               77985

a 0 cells (0.0%) have an expected count less than 5. The minimum expected count is 3417.22.

Table B.15: Symmetric measure of the chi-square test on question X000032 and overall satisfaction

                                              Value    Approximate Significance
Nominal by Nominal   Contingency Coefficient  .297     .000
N of Valid Cases                              77985


Table B.16: Coefficients of linear regression (B and Std. Error: unstandardized coefficients; Beta: standardized coefficients)

Variable        B        Std. Error   Beta      t        Sig.
(Constant)     -0.171    0.166                  -1.029   0.303
education2      0.037    0.317        0.017      0.118   0.906
education3      0.027    0.317        0.013      0.086   0.931
education1      0.071    0.317        0.03       0.224   0.823
education0      0.05     0.317        0.009      0.157   0.875
education4      0.041    0.317        0.005      0.129   0.898
gender1         0.015    0.003        0.007      4.332   .000
occupation1    -0.093    0.27        -0.044     -0.343   0.732
occupation4    -0.046    0.27        -0.022     -0.169   0.866
occupation2    -0.071    0.27        -0.01      -0.263   0.792
occupation3    -0.123    0.27        -0.022     -0.454   0.65
occupation5    -0.11     0.27        -0.019     -0.408   0.683
occupation0    -0.079    0.27        -0.011     -0.294   0.769
X000001         0.016    0.003        0.015      5.822   .000
X000004         0.068    0.003        0.059     21.948   .000
X000005        -0.011    0.002       -0.012     -4.552   .000
X000006         0.000    0.003        0.000      0.15    0.881
X000007         0.122    0.003        0.112     37.698   .000
X000013        -0.005    0.002       -0.007     -2.177   0.029
X000014         0.037    0.003        0.04      11.418   .000
X000015         0.016    0.002        0.021      6.935   .000
X000017         0.045    0.002        0.059     22.412   .000
X000026         0.069    0.002        0.073     34.296   .000
X000027         0.095    0.002        0.1       45.807   .000
X000028         0.01     0.002        0.008      4.291   .000
X000029         0.013    0.001        0.021     10.725   .000
X000030         0.026    0.001        0.037     17.448   .000
X000031        -0.004    0.002       -0.004     -1.803   0.071
X000032        -0.004    0.002       -0.003     -2.03    0.042
X000034         0.203    0.002        0.236     85.297   .000
X000244         0.032    0.003        0.033      9.596   .000
X000245         0.044    0.003        0.042     12.62    .000
X000247        -0.012    0.003       -0.012     -3.752   .000
X000250         0.002    0.003        0.002      0.788   0.43
X000251         0.053    0.003        0.058     16.979   .000
X000252         0.013    0.003        0.015      4.372   .000
X000255         0.02     0.003        0.023      6.328   .000
X000256         0.044    0.003        0.049     14.305   .000
X000366         0.012    0.003        0.011      4.116   .000
X000369         0.042    0.003        0.041     13.884   .000
X000461         0.095    0.003        0.103     32.305   .000


Bibliography

[1] L. Lord and N. Gale, "Subjective experience or objective process: understanding the gap between values and practice for involving patients in designing patient-centred care," Journal of health organization and management, vol. 28, no. 6, pp. 714–730, 2014.

[2] A. Merkouris, J. Ifantopoulos, V. Lanara, and C. Lemonidou, “Patientsatisfaction: a key concept for evaluating and improving nursing services.,”Journal of Nursing Management, vol. 7, no. 1, pp. 19–28, 1999.

[3] S. Gustavsson, Patient involvement in quality improvement. Chalmers Uni-versity of Technology, 2016.

[4] L. Boyer, P. Francois, E. Doutre, G. Weil, and J. Labarere, "Perception and use of the results of patient satisfaction surveys by care providers in a French teaching hospital," International Journal for Quality in Health Care, vol. 18, no. 5, pp. 359–364, 2006.

[5] T. Schoenfelder, J. Klewer, and J. Kugler, "Determinants of patient satisfaction: a study among 39 hospitals in an in-patient setting in Germany," International Journal for Quality in Health Care, vol. 23, no. 5, pp. 503–509, 2011.

[6] M. Coleman, D. Forman, H. Bryant, J. Butler, B. Rachet, C. Maringe, U. Nur, E. Tracey, M. Coory, J. Hatcher, et al., "Cancer survival in Australia, Canada, Denmark, Norway, Sweden, and the UK, 1995–2007 (the International Cancer Benchmarking Partnership): an analysis of population-based cancer registry data," The Lancet, vol. 377, no. 9760, pp. 127–138, 2011.

[7] M. F. MacDorman, T. Matthews, A. D. Mohangoo, and J. Zeitlin, "International comparisons of infant mortality and related factors: United States and Europe, 2010," National Vital Statistics Reports: from the Centers for Disease Control and Prevention, National Center for Health Statistics, National Vital Statistics System, vol. 63, no. 5, pp. 1–6, 2014.

[8] Sveriges Kommuner och Landsting, Swedish health care from an international perspective—International comparison 2015. The Swedish Association of Local Authorities and Regions (SALAR), 2016.

[9] S. W. Brown, Patient satisfaction pays: Quality service for practice success.Jones & Bartlett Learning, 1993.


[10] W. H. Organization et al., Basic documents. World Health Organization,2014.

[11] B. Prakash, “Patient satisfaction,” Journal of Cutaneous and AestheticSurgery, vol. 3, no. 3, p. 151, 2010.

[12] BusinessDictionary.com, “customer satisfaction.” Accessed: 2017-03-30.

[13] A. Donabedian, “The quality of care: how can it be assessed?,” Jama,vol. 260, no. 12, pp. 1743–1748, 1988.

[14] B. Guldvog, “Can patient satisfaction improve health among patients withangina pectoris?,” International Journal for Quality in Health Care, vol. 11,no. 3, pp. 233–240, 1999.

[15] M. Asadi-Lari, M. Tamburini, and D. Gray, “Patients’ needs, satisfaction,and health related quality of life: towards a comprehensive model,” Healthand quality of life outcomes, vol. 2, no. 1, p. 32, 2004.

[16] Sveriges Kommuner och Landsting, “National patient survey.” Accessed:2017-04-02.

[17] R. P. Quinn and L. J. Shepard, “The 1972-73 quality of employment survey.descriptive statistics, with comparison data from the 1969-70 survey ofworking conditions.,” 1974.

[18] A. M. Tsang and N. E. Klepeis, “Descriptive statistics tables from a de-tailed analysis of the national human activity pattern survey (nhaps) data,”tech. rep., Lockheed Martin Environmental Systems and Technologies, LasVegas, NV (United States); Environmental Protection Agency, NationalExposure Research Lab., Las Vegas, NV (United States), 1996.

[19] S. G. Heeringa, B. T. West, and P. A. Berglund, Descriptive Analysis forContinuous Variables, ch. 5, pp. 117–147. CRC Press, 2010.

[20] L. M. Yarris, R. Fu, B. Frakes, N. Magaret, A. L. Adams, H. Brooks,and R. L. Norton, “How accurately can emergency department providersestimate patient satisfaction?,” Western Journal of Emergency Medicine,vol. 13, no. 4, 2012.

[21] P. McCullagh, “Generalized linear models,” European Journal of Opera-tional Research, vol. 16, no. 3, pp. 285–292, 1984.

[22] C. E. McCulloch and J. M. Neuhaus, Generalized linear mixed models.Wiley Online Library, 2001.

[23] E. Winpenny, M. N. Elliott, A. Haas, A. M. Haviland, N. Orr, W. G.Shadel, S. Ma, M. W. Friedberg, and P. D. Cleary, “Advice to quit smokingand ratings of health care among medicare beneficiaries aged 65+,” Healthservices research, vol. 52, no. 1, pp. 207–219, 2017.

[24] K. Welch, “Parental status and punitiveness: Moderating effects of genderand concern about crime,” Crime & Delinquency, vol. 57, no. 6, pp. 878–906, 2011.


[25] M. Lin, W. Sappenfield, L. Hernandez, C. Clark, J. Liu, J. Collins, andA. C. Carle, “Child-and state-level characteristics associated with preven-tive dental care access among us children 5–17 years of age,” Maternal andchild health journal, vol. 16, no. 2, pp. 320–329, 2012.

[26] C. M. McKinney, K. G. Chartier, R. Caetano, and T. R. Harris, “Alco-hol availability and neighborhood poverty and their relationship to bingedrinking and related problems among drinkers in committed relationships,”Journal of interpersonal violence, vol. 27, no. 13, pp. 2703–2727, 2012.

[27] M. E. Stabler, K. K. Gurka, and L. R. Lander, “Association betweenchildhood residential mobility and non-medical use of prescription drugsamong american youth,” Maternal and child health journal, vol. 19, no. 12,pp. 2646–2653, 2015.

[28] R. J. McQueen, G. Holmes, and L. Hunt, “User satisfaction with machinelearning as a data analysis method in agricultural research,” New ZealandJournal of Agricultural Research, vol. 41, no. 4, pp. 577–584, 1998.

[29] A. Ghaderi, J. Frounchi, and A. Farnam, “Machine learning-based signalprocessing using physiological signals for stress detection,” in BiomedicalEngineering (ICBME), 2015 22nd Iranian Conference on, pp. 93–98, IEEE,2015.

[30] L. Turgeman, J. H. May, and R. Sciulli, “Insights from a machine learn-ing model for predicting the hospital length of stay (los) at the time ofadmission,” Expert Systems with Applications, vol. 78, pp. 376–385, 2017.

[31] K. Shameer, K. W. Johnson, A. Yahi, R. Miotto, L. Li, D. Ricks, J. Jebakaran, P. Kovatch, P. P. Sengupta, A. Gelijns, et al., "Predictive modeling of hospital readmission rates using electronic medical record-wide machine learning: a case-study using Mount Sinai heart failure cohort," in Pacific Symposium on Biocomputing, vol. 22, p. 276, NIH Public Access, 2016.

[32] P. R. Yarnold, E. A. Michelson, D. A. Thompson, and S. L. Adams, “Pre-dicting patient satisfaction: a study of two emergency departments,” Jour-nal of behavioral medicine, vol. 21, no. 6, pp. 545–563, 1998.

[33] B. C. Sun, J. Adams, E. J. Orav, D. W. Rucker, T. A. Brennan, andH. R. Burstin, “Determinants of patient satisfaction and willingness toreturn with emergency care,” Annals of emergency medicine, vol. 35, no. 5,pp. 426–434, 2000.

[34] E. D. Boudreaux, R. D. Ary, C. V. Mandry, and B. McCabe, “Determinantsof patient satisfaction in a large, municipal ed: the role of demographic vari-ables, visit characteristics, and patient perceptions,” The American journalof emergency medicine, vol. 18, no. 4, pp. 394–400, 2000.

[35] F. Greaves, D. Ramirez-Cano, C. Millett, A. Darzi, and L. Donaldson, “Ma-chine learning and sentiment analysis of unstructured free-text informationabout patient experience online,” The Lancet, vol. 380, p. S10, 2012.


[36] R. J. Little and D. B. Rubin, Statistical analysis with missing data, ch. 5,pp. 3–23. John Wiley & Sons, 2014.

[37] Q. A. Raaijmakers, “Effectiveness of different missing data treatments insurveys with likert-type data: Introducing the relative mean substitutionapproach,” Educational and Psychological Measurement, vol. 59, no. 5,pp. 725–748, 1999.

[38] W. Ling and F. Dong-Mei, “Estimation of missing values using a weightedk-nearest neighbors algorithm,” in Environmental Science and InformationApplication Technology, 2009. ESIAT 2009. International Conference on,vol. 3, pp. 660–663, IEEE, 2009.

[39] H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Trans-actions on knowledge and data engineering, vol. 21, no. 9, pp. 1263–1284,2009.

[40] J. Verma, Data analysis in management with SPSS software. SpringerScience & Business Media, 2012.

[41] M. Hall, I. Witten, and E. Frank, Data mining: Practical machine learningtools and techniques. Morgan Kaufmann.

[42] R. S. Kenett and S. Salini, Tree-based methods and decision trees, vol. 117,ch. 15, pp. 283–308. John Wiley & Sons, 2011.

[43] S. Singh and P. Gupta, "Comparative study ID3, CART and C4.5 decision tree algorithm: a survey," 2014.

[44] C. Nguyen, Y. Wang, and H. N. Nguyen, "Random forest classifier combined with feature selection for breast cancer diagnosis and prognostic," 2013.

[45] Y. Ganjisaffar, R. Caruana, and C. V. Lopes, "Bagging gradient-boosted trees for high precision, low variance ranking models," in Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pp. 85–94, ACM, 2011.

[46] R. E. Banfield, L. O. Hall, K. W. Bowyer, and W. P. Kegelmeyer, "A comparison of decision tree ensemble creation techniques," IEEE transactions on pattern analysis and machine intelligence, vol. 29, no. 1, 2007.

[47] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The bulletin of mathematical biophysics, vol. 5, no. 4, pp. 115–133, 1943.

[48] Q.-J. Zhang and K. C. Gupta, Neural networks for RF and microwave design (Book + Neuromodeler Disk). Artech House, Inc., 2000.

[49] Z. Reitermanová, "Data splitting," in WDS, vol. 10, pp. 31–36, 2010.

[50] N. Japkowicz and M. Shah, Evaluating learning algorithms: a classification perspective. Cambridge University Press, 2011.


TRITA EE 2017:093

ISSN 1653-5146

www.kth.se