astha sharmapgcmrda229 project_report

Post on 11-Apr-2017

PROJECT REPORT

To determine the reason for lower Voter Turnout among the Urban Population in India

By:

Astha Sharma

Batch: 2013-14

Enrolment No: PGCMRDA229


1 Reason for lower voter turnout amidst Urban Population, in India

INTRODUCTION

The purpose of this project is to identify the key reasons why the majority of the urban population does not exercise its right to vote. The research tries to understand what would drive the urban population to vote and proposes solutions based on the findings.

In an ideal scenario, I would like to share the outcome of the project with the Election Commission or with organizations that would benefit from such information.


ACKNOWLEDGEMENT

I would like to thank Prof. Vina Vani for being my guide throughout the project and for her immense support and guidance. She gave me many opportunities to develop on my own during the project and corrected me whenever needed. She has tremendous knowledge of statistics and data analysis concepts.

I would like to thank Prof. Amit & Rohit, who have been of great help in teaching concepts and statistical techniques in class. Thank you for so kindly accommodating my doubts in your busy schedules.

My heartfelt gratitude goes to all my respondents, who spared time amidst their other commitments to fill in the questionnaire. Thank you for your valuable inputs and suggestions, without which this study would not have seen the light of day.

Also, a big thank you to Mr. Ashish Roy, Manager – Mu Sigma; without his support and practical experience in analytics, the analysis could not have been so methodical.


TABLE OF CONTENTS

I. ABSTRACT
II. BACKGROUND OF RESEARCH PROBLEM
2.1 INTRODUCTION
2.2 LITERATURE
2.3 REASONS FOR VOTING
2.4 11 REASONS WHY PEOPLE DON’T VOTE
2.5 ASSUMPTIONS
2.6 LIMITATIONS
III: EXPECTED CONTRIBUTION
3.1 RESEARCH QUESTION
3.2 HYPOTHESES
IV: SAMPLE, SCALES USED & INSTRUMENT OF DATA COLLECTION
4.1 SAMPLE
4.2 SCALE
4.3 INSTRUMENT OF DATA COLLECTION
V: RESEARCH DESIGN
VI: EXPLORATORY DATA ANALYSIS
6.1 SUMMARY
6.2 DESCRIPTIVE ANALYSIS
6.3 CHI-SQUARE TEST
6.3.1 SUMMARY: CHI-SQUARE
VII: FACTOR ANALYSIS
VIII: ONE WAY ANOVA
IX: MODEL 1
9.1 BINARY LOGISTIC REGRESSION
9.2 INTERPRETING COEFFICIENTS
9.3 ROC CURVE
9.4 CONCLUSION
X: MODEL 2
10.1 DISCRIMINANT ANALYSIS
10.2 MULTINOMIAL LOGISTIC REGRESSION
XI: CLASSIFICATION TREE
XII: SUMMARY AND CONCLUSION
XIII: REFERENCE SECTION


PART I: ABSTRACT

This report aims to understand the reasons for lower voter turnout among the urban population in India.

An online survey questionnaire was designed and circulated among the target population through social media and word of mouth. The questionnaire aimed to understand the psychological behaviour of people who voted versus people who did not vote in the last elections. It also captured the demographic profile of respondents, which helped in analysing behaviour patterns across the population.

373 valid responses were obtained. The data was skewed towards the Bangalore population, but it was divided on the basis of North and South India.

The data was analysed using statistical techniques such as Factor Analysis, Binary Logistic Regression, Discriminant Analysis and Multinomial Logistic Regression, depending on the hypothesis and the nature of the data.

The first model was built to find the factors affecting voting decisions. Its accuracy was 91.2%, and it successfully tested the various hypotheses.

The second model was built to understand the factors that differentiate people who did not vote in the last elections (196 respondents). They were divided into three groups: people who do not have a voter ID, people who have a voter ID, and people who have a voter ID issued in another state. The majority of people who did not vote in the last election had an ID card from another state and hence could not have voted. The factors that affect these groups, and hence their decision to vote, were analysed in the second model. The model had an accuracy of 76.5%; the smaller sample size may be one reason for the lower accuracy.

The results indicate that people who are motivated to vote generally believe in the system and think that a change in leadership would improve their living standards. On the other hand, people who did not vote look for convenience, such as voter ID registration at their workplaces (corporates) and polling in business parks.

Further, the second model highlighted that people who do not belong to the state they reside in would prefer voting to take place in more familiar surroundings such as business parks.


Overall, the research gave results that could be explained with real-life examples.

A further study with a larger and more varied sample could validate the accuracy of the models and might shed more light on the psychological and demographic profiles of the people.

PART II: BACKGROUND OF RESEARCH PROBLEM

2.1 INTRODUCTION

In the 2009 general election, the Indian electorate was estimated to total approximately 714

million individuals, out of whom around 415 million (58.12%) actually cast a vote.

In India, voter turnout among the urban population is lower than the turnout for the country as a whole. The statistics show that the percentage of voters turning out in metros is generally lower than that of the corresponding state. The table below shows data for the 2009 and 2014 Lok Sabha elections:

                                       VOTERS TURNOUT (2009)   VOTERS TURNOUT (2014)
STATE          METRO                   CITY        STATE       CITY        STATE
KARNATAKA      Bangalore Central       44.60%      58.80%      55.70%      67.28%
               Bangalore North         46.70%                  56.46%
               Bangalore Rural         57.90%                  68.00%
               Bangalore South         44.74%                  55.69%
DELHI          Chandni Chowk           55.20%      51.81%      67.54%      65.09%
               East Delhi              53.40%                  65.35%
               New Delhi               55.70%                  65.00%
               North East Delhi        52.40%                  67.12%
               North West Delhi        47.70%                  61.66%
               South Delhi             47.40%                  62.98%
               West Delhi              52.40%                  66.03%
MAHARASHTRA    Mumbai North            42.60%      50.50%      52.00%      61.70%
               Mumbai North Central    39.50%                  55.00%
               Mumbai North East       42.50%                  53.00%
               Mumbai North West       44.10%                  60.00%
               Mumbai South            40.40%                  54.00%
               Mumbai South Central    39.50%                  55.00%
               Pune                    40.70%                  58.75%
WEST BENGAL    Kolkata Uttar           64.30%      81.00%      60.07%      81.35%
               Kolkata Dakshin         67.00%                  65.90%
TAMIL NADU     Chennai Central         61.00%      73.10%      60.90%      73.00%
               Chennai North           64.90%                  64.63%
               Chennai South           62.70%                  57.86%
ORISSA         Bhubaneshwar            49.10%      65.30%      40.00%      70.00%

Source: www.indiavotes.com

In the 2009 polls, voters in an elite city like Bhubaneswar disappointed, with the worst voter turnout in all three assembly segments: Bhubaneswar (Madhya), Bhubaneswar (Uttar) and Bhubaneswar (Ekamra). In all these segments, less than 40% voter turnout was recorded.

The urban population in India has the maximum exposure to information in its various forms. It is the best informed about the condition of the economy and about how governments are performing. It has access to the data needed to make a rational decision. In spite of all this, the majority of urban Indians do not vote.

Urban residents also pay their taxes most honestly, and they are the major target customers for banks and investment firms. They are the key customers for any brand trying to establish itself in the market, and they have the highest disposable income.

When the government does not provide them with good facilities, they buy comfort. The urban population has inverters or generators for power backup and pays hefty amounts for water tankers. They buy good homes and can buy vegetables even when prices go up. They can talk about politics at length and know who the right candidate is; however, the majority of them do not step out to vote.

Voters Turnout: Historical Data (for reference)

Year  Voter Turnout  Total Vote    Registration  VAP Turnout  Voting Age Population  Population
2014  66.38%         NA            NA            NA           81,45,00,000           1,23,70,00,000
2009  58.17%         41,70,37,606  71,69,85,101  56.45%       73,87,73,666           1,15,68,97,766
2004  58.07%         38,99,48,330  67,14,87,930  60.91%       64,01,82,791           1,04,97,00,118
1999  59.99%         37,16,69,104  61,95,36,847  65.69%       56,57,80,483           98,68,56,301
1998  61.97%         37,54,41,739  60,58,80,192  67.45%       55,66,51,400           97,09,33,000
1996  57.94%         34,33,08,035  59,25,72,288  61.08%       56,20,28,100           95,25,90,000
1991  56.73%         28,27,00,942  49,83,63,801  57.23%       49,39,63,380           85,16,61,000
1989  61.98%         30,90,50,495  49,86,47,786  65.18%       47,41,43,040           81,74,88,000
1984  63.56%         24,12,46,887  37,95,40,608  64.61%       37,33,71,000           74,67,42,000
1980  56.92%         20,27,52,893  35,62,05,329  62.35%       32,51,62,040           66,35,96,000
1977  60.49%         19,42,63,915  32,11,74,327  64.67%       30,03,92,640           62,58,18,000
1971  55.25%         15,12,96,749  27,38,32,301  57.22%       26,43,93,600           55,08,20,000
1967  61.04%         15,27,24,611  25,02,07,401  63.11%       24,19,96,800           50,41,60,000
1962  55.42%         11,99,04,284  21,63,61,569  54.42%       22,03,24,090           44,96,41,000
1957  62.23%         12,05,13,915  19,36,52,179  61.15%       19,70,90,250           40,22,25,000
1952  61.17%         10,59,50,083  17,32,12,343  58.92%       17,98,30,000           36,70,00,000

Source: http://www.idea.int/vt/countryview.cfm?CountryCode=IN
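The two turnout columns in the historical table appear to use different denominators: registered voters for Voter Turnout and the voting-age population (VAP) for VAP Turnout. The following is a minimal Python sketch of that arithmetic for the 2009 row. The helper names `parse_indian_number` and `turnout_percent` are illustrative assumptions and not part of the original analysis.

```python
def parse_indian_number(text: str) -> int:
    """Parse a number written with Indian digit grouping, e.g. '41,70,37,606'."""
    return int(text.replace(",", ""))


def turnout_percent(votes: int, base: int) -> float:
    """Turnout as a percentage of the chosen base (registered voters or VAP)."""
    return 100.0 * votes / base


# 2009 row of the historical table (values copied from the table).
total_vote = parse_indian_number("41,70,37,606")
registration = parse_indian_number("71,69,85,101")
voting_age_population = parse_indian_number("73,87,73,666")

voter_turnout = turnout_percent(total_vote, registration)
vap_turnout = turnout_percent(total_vote, voting_age_population)

print(f"Voter turnout (registered base): {voter_turnout:.2f}%")
print(f"VAP turnout (voting-age base):   {vap_turnout:.2f}%")
```

Run on the 2009 row, the two results agree with the 58.17% and 56.45% reported in the table, which supports reading the columns this way.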

2.2 LITERATURE

No prior survey or research in India that investigates why people do not vote could be found.

Voter turnout is the percentage of eligible voters who cast a ballot in an election. (Who is

eligible varies by country, and should not be confused with the total adult population. For

example, some countries discriminate based on sex, race, and/or religion. Age and citizenship

are usually among the criteria.) After increasing for many decades, there has been a trend of

decreasing voter turnout in most established democracies since the 1960s. In general, low turnout

may be due to disenchantment, indifference, or contentment. Low turnout is often considered to

be undesirable, and there is much debate over the factors that affect turnout and how to increase

it. In spite of significant study into the issue, scholars are divided on reasons for the decline. Its

cause has been attributed to a wide array of economic, demographic, cultural, technological, and

institutional factors. There have been many efforts to increase turnout and encourage voting.

Source: http://en.wikipedia.org/wiki/Voter_turnout

2.3 REASONS FOR VOTING

The basic formula for determining whether someone will vote, on the questionable assumption that people act completely rationally, is

\[ PB + D > C \]

where

P is the probability that an individual's vote will affect the outcome of an election,

B is the perceived benefit that would be received if that person's favoured political party

or candidate were elected,

D originally stood for democracy or civic duty, but today represents any social or

personal gratification an individual gets from voting, and

C is the time, effort, and financial cost involved in voting.

Since P is virtually zero in most elections, PB is also near zero, and D is thus the most important

element in motivating people to vote. For a person to vote, these factors must outweigh C.
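The decision rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `will_vote` and all numeric values are made up on an arbitrary utility scale and do not come from the survey.

```python
def will_vote(p: float, b: float, d: float, c: float) -> bool:
    """Return True when PB + D > C, i.e. the benefits outweigh the cost of voting."""
    return p * b + d > c


# Illustrative, made-up values (arbitrary utility units).
p = 1e-7    # probability that a single vote decides the outcome
b = 1000.0  # perceived benefit if the favoured candidate or party wins
c = 1.0     # time, effort and financial cost of voting

# With P close to zero, PB is about 0.0001, so D decides the outcome.
low_duty = will_vote(p, b, d=0.5, c=c)   # 0.0001 + 0.5 is below the cost of 1.0
high_duty = will_vote(p, b, d=2.0, c=c)  # 0.0001 + 2.0 exceeds the cost of 1.0

print(low_duty, high_duty)
```

The example shows why D dominates: changing B by a large factor barely moves PB, while a modest change in D flips the decision.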

Riker and Ordeshook developed the modern understanding of D. They listed five major forms of

gratification that people receive for voting:

o Complying with the social obligation to vote;

o Affirming one's allegiance to the political system;

o Affirming a partisan preference (also known as expressive voting, or voting for a

candidate to express support, not to achieve any outcome);

o Affirming one's importance to the political system; and

o For those who find politics interesting and entertaining, researching and making a

decision.

Other political scientists have since added other motivators and questioned some of Riker and Ordeshook's assumptions. All of these concepts are inherently imprecise, making it difficult to discover exactly why people choose to vote.

Recently, several scholars have considered the possibility that B includes not only a personal interest in the outcome, but also a concern for the welfare of others in the society (or at least other members of one's favourite group or party).

In particular, experiments in which subject altruism was measured, using a dictator game,

showed that concern for the well-being of others is a major factor in predicting turnout and

political participation. Note that this motivation is distinct from D, because voters must think

others benefit from the outcome of the election, not their act of voting in and of itself.

Source: http://en.wikipedia.org/wiki/Voter_turnout


2.4 11 REASONS WHY PEOPLE DON’T VOTE

1. Many people think their vote does not count.

2. Many people have the excuse that they are too busy to vote.

3. Voting registration is a process that people can fear or feel intimidated by.

4. Apathy is probably the most common reason for not voting.

5. Some people say they do not vote, because the 'lines are too long'.

6. Some people say they do not vote because they do not like the two candidates that are on

the news every night.

7. Some people say that they cannot get to the polling place to vote.

8. If a person is traveling, it may be an excuse for not voting.

9. Some people do not vote for a third party, because they are told it will 'spoil' the vote for

one of the big two politicians.

10. Some people say that voting does not matter, because their ONE vote will not swing the

result one way or the other.

11. Some people believe all political candidates are bought off by corporations, so why

bother voting, because the votes have already been bought and sold.

Source: www.agreenroad.blogspot.in

2.5 ASSUMPTIONS

o Since the target population is the urban population of India, it is assumed that there will not be much variation in the attitude of people towards voting.

o Data for cities like Mumbai and Pune is combined, since it is assumed that people will have the same behavioural pattern due to similar demographics.

o Similarly, data from all the cities in a particular state is combined under the assumption that voting decisions will not vary within the state.

2.6 LIMITATIONS

o The sample size of 373 is small relative to the very large base population; hence the outcome may vary with a larger sample.

o The data is skewed towards the Bangalore population and hence may not be indicative of other metros.

o Since the sample size is small, the variable Annual Income is divided into 2 parts: income between 2-10 lacs and income >10 lacs.

o Of the 196 respondents who did not vote in the last election, only 33 do not have a registered voter’s ID card. This sub-sample is too small for analysis; hence the results may differ with a larger sample.

PART III: EXPECTED CONTRIBUTION

The Indian urban population comprises people who maintain a comfortable standard of living. The issues that the majority of the country faces are felt little in the lives of the urban population. This is not because they are given additional facilities, but because they pay a premium and buy those facilities.

A change of government adds little value to their day-to-day lives.

This research points out that when the government wants something to be done on a mandatory basis, it makes sure that all the facilities needed to comply are provided. For example, ever since Aadhaar cards were made mandatory, the government has facilitated Aadhaar card camps in various corporates. Similarly, there are agents who assist with PAN card applications and with filing IT returns. But there is NO such facility for registration of a voter’s ID.

The concern is that when something is important to the government, it is organised, promoted through various channels and seen through to completion. Why, then, is it not important for the government to encourage people to vote, especially the urban population, who are better informed and can take good decisions for the larger benefit of the country?

If people are not willing to make the extra effort to register and stand in long queues for a voter’s ID card, why does the government not encourage voter registration through channels like corporates?

The contribution I would like to make through this research is to point out the reasons why people do not vote and to bring out the point that a certain segment of society likes to be treated differently. When they pay higher taxes and bear a higher cost of living, why is their vote not made to feel important?


Registration of voter’s IDs can be driven through corporates and promoted as one of the activities of Corporate Social Responsibility. Also, if polling booths are set up in business parks, it becomes much easier to track the voting percentage, since all employees have unique identity cards. Existing technology can be used, and corporates can track voters. Further, people are more likely to feel encouraged to vote when they can walk to the polling booth with their friends and colleagues in familiar surroundings.

All of this is possible only if the government pledges to make sure everybody votes. This research would help quantify the apprehensions the urban population has about the voting system and, if used effectively, could help arrive at a solution to the problem.

3.1 RESEARCH QUESTION

What are the factors driving lower voter turnout among the urban population compared with the overall country average?

3.2 HYPOTHESES

The following hypotheses will be tested:

o The majority of the urban population does not vote because they do not trust the system.

o They do not vote because the process of voting is very tedious (standing in long queues).

o A majority of the population belongs to other states, holds voter’s IDs from the state of origin and hence cannot vote.

o They do not vote because they are not happy with the choice of politicians.

o They perceive going to vote as a threat to their safety.

o They are not comfortable with the location of polling booths.

o They do not think their vote will add any value to the whole process.

o They feel that things will not change drastically with a change of government, and hence they do not care to vote.


PART IV: SAMPLE, SCALES USED & INSTRUMENT OF DATA COLLECTION

4.1 SAMPLE

The urban population, mostly people working in corporates, is the target population for the study. Of 401 respondents, 373 valid responses, comprising both voters and non-voters, are considered for analysis.

The questionnaire was distributed so that most of the respondents belong to the urban population; this was done by circulating the questionnaire in corporates. It is assumed that people working in a similar environment will have similar requirements and expectations from the political system.

4.2 SCALE

The variables are measured using a five-point Likert scale and binary (Yes/No) answers. Below is the data dictionary defining the measurement of variables:

DEFINING VARIABLES

5 POINT LIKERT SCALE (Ordinal Variables)
Strongly disagree             1
Disagree                      2
Neither agree nor disagree    3
Agree                         4
Strongly agree                5
Not Applicable                6

DURATION OF STAY (Ordinal Variable)
< 5 years                     1
5-10 years                    2
>10 years                     3

ANNUAL INCOME (Ordinal Variable)
< 2 lacs p.a.                 1
2-6 lacs p.a.                 2
6-10 lacs p.a.                3
10-20 lacs p.a.               4
> 20 lacs p.a.                5

DECISION TREE (Nominal Variables)
Yes                           1
No                            0
NA/May Be                     3

The areas evaluated in this study are limited to the items that were included in the questionnaire.
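The data dictionary above can be expressed as plain lookup tables when coding raw survey answers. The sketch below is a hypothetical illustration in Python: the dictionary names and the `encode` helper are assumptions, and only the label-to-code pairs are taken from the data dictionary.

```python
# Label-to-code pairs copied from the data dictionary; the names are hypothetical.
LIKERT_SCALE = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
    "Not Applicable": 6,
}
DURATION_OF_STAY = {"< 5 years": 1, "5-10 years": 2, ">10 years": 3}
ANNUAL_INCOME = {
    "< 2 lacs p.a.": 1,
    "2-6 lacs p.a.": 2,
    "6-10 lacs p.a.": 3,
    "10-20 lacs p.a.": 4,
    "> 20 lacs p.a.": 5,
}
# Listed under "DECISION TREE (Nominal Variables)" in the data dictionary.
YES_NO = {"Yes": 1, "No": 0, "NA/May Be": 3}


def encode(responses, scale):
    """Map raw answer labels to numeric codes; an unknown label raises KeyError."""
    return [scale[label] for label in responses]


likert_codes = encode(["Agree", "Strongly disagree", "Not Applicable"], LIKERT_SCALE)
vote_codes = encode(["Yes", "No", "NA/May Be"], YES_NO)
print(likert_codes, vote_codes)
```

Note that "Not Applicable" is coded 6 on the Likert items, so it sits outside the 1 to 5 agreement range; how that code was handled in the analysis is not stated here.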

4.3 INSTRUMENT OF DATA COLLECTION

Primary data was collected through a structured questionnaire. The questionnaire aimed to capture the factors motivating people to vote or not to vote. Respondents were also asked about their preference for voter ID registration at their corporate and for voting to take place in a business park in or near their office. An online survey was created and circulated using social media such as Facebook and LinkedIn.

The questionnaire consisted of 28 unique questions, not all of which were targeted at the entire group. There are 2 major divisions of questions:

a) People who voted and people who did not vote in the last election

b) Among those who did not vote: people who are registered voters and people who are non-registered.

Below is the link to the questionnaire:

https://www.surveymonkey.com/s/XNTZNMB (can be accessed only once through a URL)

The table below lists all the questions. It also gives the name and type of each variable as used in the analysis:

Project_Data_Consolidated.xlsx

CONSOLIDATED DATA FOR ANALYSIS (POST PROCESSING)

SPSS DATA FOR EDA AND FIRST MODEL


Two models are developed using the data.

VOTING_RESPONSE is the dependent variable in the 1st model, where the analysis determines the factors affecting the decision to vote or not to vote.

REGISTERED_VOTER is the dependent variable in the 2nd model, where the analysis seeks to understand the factors that lead people not to vote.


PART V: RESEARCH DESIGN

5.1 RESEARCH DESIGN

Figure 1: Conceptual Framework for VOTERS TURNOUT (Primary Study of Data and other articles

written on factors driving Voter’s Turnout)

ANALYSIS OF DATA

PART VI: EXPLORATORY DATA ANALYSIS

Exploratory Data Analysis is done to understand and summarize the data

6.1. SUMMARY

The data is primarily divided into voters and non-voters. Out of 401 respondents, 187 voted in

the latest elections and 214 did not vote. Below is the summary of data collected:

RAW DATA DUMP

SurveySummary_06082014.xls

SURVEY SUMMARY

FACTORS DRIVING DECISION TO VOTE OR NOT (Figure 1)

Importance of Vote
- Better Policy
- Better Economy
- Better Leader

Safety and Ease
- Safety Concerns
- Voting System being Tedious

Belief in Voting System
- Trust in Politician
- Trust and Interest in Political system

Social Responsibility
- Friends and Peers Voting
- Awareness created by media


373 respondents completed the questionnaire and hence are considered for analysis. Below is the analysis:

The respondents are further divided into registered and non-registered voters. All who voted are considered registered voters; the registration question was asked only of those who did not vote in the last election. Below are the overall statistics for registered and non-registered voters:

6.2 DESCRIPTIVE ANALYSIS

Descriptive analysis is performed on all the demographic variables to describe the main features of the data collected.

voting_response, the dependent variable, records whether the respondent voted in the last election. Since the response is Yes/No, it is a nominal variable, and hence a Chi-Square test is used to determine the relationship of voting_response with demographic variables such as Age, Gender, Education Qualification, Annual Income, Occupation, Relationship Status, Duration of Stay in Same City, Belonging to the State of Residence and Duration of Stay Abroad.

I voted in the last elections (all 401 respondents)
Answer Options        Response Percent   Response Count
Yes                   46.6%              187
No                    53.4%              214
answered question                        401
skipped question                         1

I voted in the last elections (373 valid responses)
Answer Options        Response Percent   Response Count
Yes                   47.5%              177
No                    52.5%              196
answered question                        373
skipped question                         1

I am a registered voter
Answer Options        Response Percent   Response Count
Yes                   81.8%              305
No                    18.2%              68
answered question                        373


6.3 CHI-SQUARE TEST

Below are the Chi-square tests of all the demographic variables

AGE – Ordinal Variable (voting_response * Age)

Definition: 18 to 24 – 1, 25 to 34 – 2, 35 to 44 – 3, 45 to 54 – 4, 55 or older - 5

Ho – Age of a person does not determine his/her choice to vote

H1 – Decision to vote varies with the age of a person

Age
Answer Options        Response Percent   Response Count
18 to 24              22.9%              92
25 to 34              65.6%              263
35 to 44              9.2%               37
45 to 54              1.7%               7
55 or older           0.5%               2
answered question                        401
skipped question                         1

Chi-Square Tests (voting_response * Age)
                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             3.940a   4    .414
Likelihood Ratio               3.987    4    .408
Linear-by-Linear Association   3.377    1    .066
N of Valid Cases               373
a. 4 cells (40.0%) have expected count less than 5. The minimum expected count is .95.

With the significance value being more than 0.05, the voting response rate does not vary significantly across age groups.

The largest group of respondents is in the 25-34 age group, which accounts for about 66% of all respondents.

Null Hypothesis is accepted

Crosstab (voting_response * Age)
                                       Age
                                       1       2       3       4       5       Total
voting_response 0    Count             50      128     15      2       1       196
                     Expected Count    44.7    128.7   17.9    3.7     1.1     196.0
voting_response 1    Count             35      117     19      5       1       177
                     Expected Count    40.3    116.3   16.1    3.3     .9      177.0
Total                Count             85      245     34      7       2       373
                     Expected Count    85.0    245.0   34.0    7.0     2.0     373.0
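The Pearson chi-square value for Age can be recomputed from the crosstab counts. The sketch below uses only the Python standard library and is an illustration, not the SPSS procedure used in the report. The function names are assumptions, and the p-value helper relies on a closed form that holds only for even degrees of freedom (df = 4 here).

```python
import math


def chi_square_independence(observed):
    """Pearson chi-square statistic, df and expected counts for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Rows: voting_response 0 (did not vote) and 1 (voted); columns: Age codes 1 to 5.
age_crosstab = [
    [50, 128, 15, 2, 1],
    [35, 117, 19, 5, 1],
]

statistic, df, expected = chi_square_independence(age_crosstab)
p_value = chi2_sf_even_df(statistic, df)
small_cells = sum(e < 5 for row in expected for e in row)
min_expected = min(e for row in expected for e in row)

print(f"chi-square = {statistic:.3f}, df = {df}, p = {p_value:.3f}")
print(f"cells with expected count < 5: {small_cells}, minimum = {min_expected:.2f}")
```

The recomputed values agree with the SPSS output (3.940, df 4, significance .414, 4 cells with expected count below 5, minimum .95). With 40% of cells below an expected count of 5, the chi-square approximation should be read with some caution; merging the sparse age groups is one common remedy.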

GENDER – Nominal Variable (voting_response * Gender)

Of the total 373 respondents 115 are Female and 258 are Male

Definition: Male -1, Female - 2

Ho – Gender of a person does not drive his/her choice to vote

Gender
Answer Options        Response Percent   Response Count
Female                30.8%              115
Male                  69.2%              258
answered question                        373
skipped question                         1

Age: 128 respondents in the 25-34 age group did not vote in the last election and 117 respondents in the same age group voted. The trend is similar for all age groups.

Gender: 69.2% of the respondents are male and 30.8% are female.

H1 – Gender of a person drives his/her choice to vote

Chi-Square Tests (voting_response * Gender)
                               Value    df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                             (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             .016a    1    .898
Continuity Correctionb         .000     1    .987
Likelihood Ratio               .016     1    .898
Fisher's Exact Test                                        .911         .494
Linear-by-Linear Association   .016     1    .898
N of Valid Cases               373
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 54.57.
b. Computed only for a 2x2 table

Null Hypothesis is Accepted
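For a 2x2 table such as voting_response by Gender, the Pearson statistic and the continuity-corrected (Yates) statistic can be checked by hand. The sketch below is an illustration under one stated assumption: the female counts are not printed in the report, so they are derived by subtracting the reported male counts (135 non-voters, 123 voters) from the totals of 196 non-voters and 177 voters. Function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction: shrink |ad - bc| by N/2, not below zero.
        diff = max(0.0, diff - n / 2.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_sf_1df(x):
    """Upper-tail probability of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


# Male counts are reported in the text; female counts are derived (assumption).
male_no, male_yes = 135, 123
female_no = 196 - male_no    # non-voters who are female
female_yes = 177 - male_yes  # voters who are female

pearson = chi_square_2x2(female_no, male_no, female_yes, male_yes)
corrected = chi_square_2x2(female_no, male_no, female_yes, male_yes, yates=True)
p_pearson = chi2_sf_1df(pearson)
p_corrected = chi2_sf_1df(corrected)
# Smallest expected count: smallest row total * smallest column total / N.
min_expected = min(196, 177) * min(115, 258) / 373

print(f"Pearson = {pearson:.3f} (p = {p_pearson:.3f})")
print(f"Yates   = {corrected:.3f} (p = {p_corrected:.3f})")
print(f"minimum expected count = {min_expected:.2f}")
```

Under that assumption the figures line up with the SPSS table: Pearson .016 (significance .898), continuity correction .000 (significance .987) and a minimum expected count of 54.57.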

EDUCATION QUALIFICATION – Ordinal Variable (voting_response * Education_Qualification)

Definition: Graduate - 1, Professional Degrees – 2, Masters – 3, Doctorate – 4

Highest Education Qualification
Answer Options        Response Percent   Response Count
Doctorate             3.8%               14
Masters               53.6%              200
Professional Degree   16.9%              63
Graduate              25.7%              96
answered question                        373

Gender: 135 male respondents did not vote in the last election and 123 male respondents voted. The trend is similar for female respondents as well.

Education: 53.6% of the respondents hold a Masters degree and 25.7% are Graduates.

Gender: With the significance value being more than 0.05, gender does not significantly impact the choice to vote.

Ho – The Education Qualification of a person does not have an effect on his/her choice to vote

H1 – The Education Qualification of a person affects his/her choice to vote

Chi-Square Tests (voting_response * Education_Qualification)
                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             .371a    3    .946
Likelihood Ratio               .371     3    .946
Linear-by-Linear Association   .173     1    .677
N of Valid Cases               373
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.64.

OCCUPATION – Nominal Variable (voting_response * Occupation)

Definition: Student – 1, Home Maker – 2, Government Service – 3, Professional – 4, Self Employed – 5

Ho – Occupation of a person does not impact his/her choice to vote

H1 – Occupation of a person impacts his/her decision to vote


Occupation
Answer Options        Response Percent   Response Count
Self Employed         7.0%               26
Professional          80.4%              300
Government Service    1.6%               6
Home Maker            2.7%               10
Retired               0.0%               0
Student               8.3%               31
answered question                        373

Education: With the significance value being more than 0.05, the voting response rate does not vary significantly with the education qualification of a person.

Occupation: With the significance value being more than 0.05, the voting response rate does not vary significantly with the occupation of a person.

Occupation: 80.4% of the respondents are Professionals. This was also the key target segment for the analysis.

Chi-Square Tests (voting_response * Occupation)
                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             1.605a   4    .808
Likelihood Ratio               1.619    4    .805
Linear-by-Linear Association   .055     1    .814
N of Valid Cases               373
a. 3 cells (30.0%) have expected count less than 5. The minimum expected count is 2.85.

Null Hypothesis is Accepted

ANNUAL INCOME – Ordinal Variable (voting_response * Annual_Income)

Annual Income
Answer Options        Response Percent   Response Count
< 10 lac p.a.         75.6%              282
> 10 lac p.a.         24.4%              91
answered question                        373

Definition: < 10 lac p.a. – 0, > 10 lac p.a. – 1

Ho – The annual income of a person does not determine his choice to vote

H1 – The annual income of a person determines his choice to vote

Occupation: 144 voters and 156 non-voters are professionals. The trend is similar for respondents across the various occupations.

Annual Income: 75.6% of the respondents have an annual income of less than INR 10 lac; 24.4% have an income greater than INR 10 lac.

Null Hypothesis is Accepted

CITY OF RESIDENCE – Nominal Variable (voting_response * current_city_stay)

Definition: South India – 1, North India – 2, Others – 3

Ho – Geographical location of a person does not impact his/her decision to vote

H1 – Geographical location of a person impacts his/her decision to vote

Chi-Square Tests (voting_response * Annual_Income)

                               Value    df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                             (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             1.973a   1    .160
Continuity Correctionb         1.648    1    .199
Likelihood Ratio               1.971    1    .160
Fisher's Exact Test                                        .184         .100
Linear-by-Linear Association   1.968    1    .161
N of Valid Cases               373

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 43.18.

b. Computed only for a 2x2 table

City I am currently residing in

Answer Options      Response Percent   Response Count
South India         72.4%              270
North India         22.5%              84
Others              5.1%               19
answered question                      373

The significance value is more than 0.05 and hence indicates that annual income does not impact the decision to vote.

154 non-voters and 128 voters are in the income bracket of < 10 lacs p.a. There is no significant difference in the voting response w.r.t. Annual Income.

72.4% of the respondents are from South India and 22.5% of them are from North India. The data is skewed towards Southern India.


Chi-Square Tests

                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             .081a    2    .960
Likelihood Ratio               .081     2    .960
Linear-by-Linear Association   .041     1    .840
N of Valid Cases               373

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 9.02.

Null Hypothesis is Accepted

RELATIONSHIP STATUS – Nominal Variable (voting_response * Relationship_Status)

Definition: Single, never married – 1, Married – 2, Divorced – 3

Ho – Relationship status of a person does not impact his/her decision to vote

H1 – Relationship status of a person impacts his/her decision to vote

Relationship Status

Answer Options          Response Percent   Response Count
Single, never married   53.6%              200
Married                 44.5%              166
Divorced                1.9%               7
answered question                          373

53.6% of the respondents are single and 44.5% of them are married. The respondents are split almost equally across these two broad relationship categories.

The significance value is more than 0.05 and hence indicates that the city of residence does not play a significant role in the choice of voting.

143 non-voters and 127 voters belong to South India. 43 non-voters and 41 voters belong to North India. There is no significant difference in the pattern of voting between the geographic locations.


Chi-Square Tests

                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             4.289a   2    .117
Likelihood Ratio               4.295    2    .117
Linear-by-Linear Association   4.182    1    .041
N of Valid Cases               373

a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is 3.32.

Null Hypothesis is Accepted

DURATION OF STAY IN THE SAME CITY – Ordinal Variable (voting_response * Duration_Stay_Same_City)

Definition: <5 years - 1, 5-10 years – 2, >10 years – 3

Duration of stay in the State I am currently residing in

Answer Options      Response Percent   Response Count
< 5 years           45.0%              167
5-10 years          19.9%              74
>10 years           35.0%              130
answered question                      371

115 respondents who did not vote in the last elections are single, whereas 85 single respondents voted. The trend is reversed for married people: 88 voted and 78 did not vote.

The significance value is more than 0.05 and hence indicates that the relationship status does not play a significant role in the choice of voting.

45.0% of respondents have lived in the same city for less than 5 years, whereas 55% of the respondents have been in the same city for more than 5 years. This time frame should be sufficient for people to have a voter ID created.


Ho – Duration of Stay in the same city does not impact the choice to vote

H1 – Duration of Stay in the same city impacts the choice to vote

Chi-Square Tests

                               Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square             60.843a   3    .000
Likelihood Ratio               63.515    3    .000
Linear-by-Linear Association   55.298    1    .000
N of Valid Cases               373

a. 2 cells (25.0%) have expected count less than 5. The minimum expected count is .95.

Crosstab (voting_response * Duration_Stay_Same_City)

                                      Duration_Stay_Same_City
                                      0      1       2      3       Total
voting_response 0   Count             2      115     46     33      196
                    Expected Count    1.1    87.8    38.9   68.3    196.0
voting_response 1   Count             0      52      28     97      177
                    Expected Count    .9     79.2    35.1   61.7    177.0
Total               Count             2      167     74     130     373
                    Expected Count    2.0    167.0   74.0   130.0   373.0
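The Pearson Chi-Square value reported for this crosstab can be checked by hand from the observed counts. The sketch below is only an illustration in plain Python (the function name `pearson_chi_square` is a hypothetical helper, not part of the SPSS output): it derives the expected counts from the row and column totals and sums (observed - expected)^2 / expected over all cells.

```python
def pearson_chi_square(observed):
    """Return (statistic, degrees_of_freedom, expected_counts) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count of a cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Observed counts from the crosstab: rows are voting_response 0 and 1,
# columns are Duration_Stay_Same_City codes 0, 1, 2 and 3.
duration_crosstab = [
    [2, 115, 46, 33],
    [0, 52, 28, 97],
]

chi2, dof, expected = pearson_chi_square(duration_crosstab)
print(f"Pearson chi-square = {chi2:.3f}, df = {dof}")
print([[round(e, 1) for e in row] for row in expected])
```

The same helper can be applied to any of the other crosstabs in this part by replacing the list of observed counts.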

The significance value is less than 0.05 and hence indicates that the duration of stay is a significant factor driving votes.


The observed counts of voters and non-voters differ from the expected counts according to the duration of stay in the same city. Hence the variable has an impact on the voting decision.


Null Hypothesis is Rejected

I BELONG TO THE STATE I AM RESIDING IN – Nominal Variable

(voting_response * State_residing_flag)

Definition: Yes – 1, No – 2

Ho – Whether a person belongs to the state he/she is residing in does not impact the decision to vote

H1 – Whether a person belongs to the state he/she is residing in impacts the decision to vote

I belong to the state I am residing in

Answer Options      Response Percent   Response Count
Yes                 63.0%              235
No                  37.0%              138
answered question                      373

63.0% of the respondents belong to the state they are residing in, which can be a motivating factor to vote for the betterment of their state.

The significance value is less than 0.05 and hence indicates that whether people belong to the state they are currently living in impacts their decision to vote.


Crosstab (voting_response * State_residing_flag)

                                      State_residing_flag
                                      0       1       Total
voting_response 0   Count             164     32      196
                    Expected Count    123.5   72.5    196.0
voting_response 1   Count             71      106     177
                    Expected Count    111.5   65.5    177.0
Total               Count             235     138     373
                    Expected Count    235.0   138.0   373.0

Null Hypothesis is Rejected

Have been abroad, if yes, total duration of stay – Ordinal Variable

(voting_response*Abroad_stay)

Definition: 0-1 year – 1, 1-3 years – 2, 3-10 years – 3, >10 years – 4, Not been abroad – 5

Have been abroad, if yes, total duration of stay

Answer Options      Response Percent   Response Count
0-1 year            31.9%              119
1-3 years           7.0%               26
3-10 years          5.1%               19
>10 years           1.9%               7
Not been abroad     54.1%              202
answered question                      373

Of the 196 respondents who did not vote in the last election, 164 do not belong to the state they are residing in. Only 32 of the respondents who did not vote belong to the state.

The observed counts of voters and non-voters differ from the expected counts according to whether respondents belong to the state they are residing in. Hence the variable has an impact on the voting decision.

54.1% of the respondents have not been abroad and 31.9% of respondents have spent less than 1 year outside India. Exposure to life outside India can influence people's decision to make a difference.


Ho – The exposure to life outside India does not impact person’s decision to vote

H1 – The exposure to life outside India impacts the decision to vote

Chi-Square Tests

                               Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square             6.517a   4    .164
Likelihood Ratio               6.900    4    .141
Linear-by-Linear Association   2.813    1    .094
N of Valid Cases               373

a. 2 cells (20.0%) have expected count less than 5. The minimum expected count is 3.32.

Null Hypothesis is Accepted

SPSS OUTPUT CHI-SQUARE ANALYSIS

The significance value is more than 0.05 and hence indicates that having lived outside India does not have any significant bearing on people's choice of voting.

Exposure to a stay abroad has not influenced people's decision to vote. Only a small difference was seen between the actual and expected counts.

6.3.1 SUMMARY: CHI-SQUARE

Age, Gender, Education Qualification, Occupation and Relationship Status of a person are not influential in driving people's choice to vote.

Annual Income is not significant in driving people's choice to vote in the Chi-square test (Sig. = .160), although the counts show some differences between brackets.

o From the data, more people in the income bracket of 2-6 lac p.a. did not vote in the last election.

o The sample size is too small to comment on the higher brackets. Though it indicated that more people in the income bracket of 10-20 lac p.a. did not vote, the trend was not the same for people with income more than 20 lacs p.a. A larger sample size may or may not change this observation.

Duration of stay in the same city strongly influences a person's decision to vote. More people have voted if they have stayed in the city longer than 10 years. The trend is reversed for people who have stayed for less than 5 years.

More of the people who belong to the state they are residing in have voted than of those who do not belong to the state. This variable also influences the decision to vote.

PART VII: FACTOR ANALYSIS

The questionnaire was designed after considering various parameters which drive people to vote. Voting is a personal choice and every person attributes different importance to voting. Many factors drive the decision to vote and an effort was made to incorporate all important attributes in the questionnaire. Many questions might be highly interrelated and bring out the same characteristics of human psychology.

Hence, the analysis is used to identify the principal variables from the many variables selected for the study. This would help in providing the underlying construct of highly correlated variables.

It is assumed that the sample is homogeneous and hence the response of the people will not vary if the same questions are asked in any other circumstances.

Variables which would help in understanding the psychology of respondents, on their reasons to vote, are included for factor analysis. Other demographic variables will be directly used in the model building.


The following 9 variables are considered for Analysis:

1) Vote_Tedious_TimeConsuming

2) Opinion_Safety_Concern

3) Opinion_Politicians_choice

4) Opinion_Trust_Voting_System

5) Opinion_RegstrVoterID_Org

6) Opinion_Voting_Business_Park

7) Importance_Leader

8) Opinion_Vote_Important

9) Imapct_LifeStyle

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy        .680
Bartlett's Test of Sphericity   Approx. Chi-Square     581.581
                                df                     36
                                Sig.                   .000

Communalities

                               Initial   Extraction
Vote_Tedious_TimeConsuming     1.000     .569
Opinion_Safety_Concern         1.000     .601
Opinion_Politicians_choice     1.000     .826
Opinion_Trust_Voting_System    1.000     .483
Opinion_RegstrVoterID_Org      1.000     .641
Opinion_Voting_Business_Park   1.000     .713
Importance_Leader              1.000     .710
Opinion_Vote_Important         1.000     .779
Imapct_LifeStyle               1.000     .777

- The Kaiser-Meyer-Olkin value of .680 is higher than 0.5, which signifies that the sample is adequate to work on Factor analysis.

- Sig = 0.000 for Bartlett's Test of Sphericity indicates that the variables are correlated and that factor analysis is useful for reduction of the data.

All the variables are Ordinal, with values ranging from 1 to 5, 1 being Strongly Disagree and 5 being Strongly Agree.

- The communalities of all the variables except Opinion_Trust_Voting_System are above 0.5, which implies that more than 50% of the variance of each of these variables is explained by the extracted factors. Hence, these variables are considered for further analysis. The communality of 0.483 for Opinion_Trust_Voting_System is very close to 0.5, and hence that variable is also included in the analysis.


Extraction Method: Principal Component Analysis.

Total Variance Explained

            Initial Eigenvalues                Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Var.   Cumulative %   Total   % of Var.   Cumulative %      Total   % of Var.   Cumulative %
1           2.380   26.450      26.450         2.380   26.450      26.450            2.291   25.459      25.459
2           1.514   16.820      43.270         1.514   16.820      43.270            1.475   16.393      41.852
3           1.189   13.209      56.478         1.189   13.209      56.478            1.228   13.645      55.497
4           1.016   11.286      67.765         1.016   11.286      67.765            1.104   12.268      67.765
5           .788    8.758       76.523
6           .768    8.533       85.056
7           .639    7.095       92.151
8           .401    4.453       96.603
9           .306    3.397       100.000

Extraction Method: Principal Component Analysis.


The 4 factors with an Eigenvalue greater than 1 are considered. The 4 extracted factors explain 67.765% of the variance. The Eigenvalue of each component in the table is the variance of that component/factor.
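The "% of Variance" column follows directly from the Eigenvalues: with 9 standardized variables the total variance is 9, so each component's share is its Eigenvalue divided by 9. A minimal sketch in plain Python, using the rounded Eigenvalues from the table (so the results match the SPSS figures only to about two decimals):

```python
# Initial Eigenvalues copied from the Total Variance Explained table (rounded to 3 decimals).
eigenvalues = [2.380, 1.514, 1.189, 1.016, 0.788, 0.768, 0.639, 0.401, 0.306]

n_variables = len(eigenvalues)  # 9 standardized variables, so total variance is 9

# Kaiser criterion: keep components whose Eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance carried by each component, in percent.
pct_variance = [100 * ev / n_variables for ev in eigenvalues]
cumulative_retained = sum(pct_variance[: len(retained)])

print(f"components retained: {len(retained)}")
print(f"variance explained by retained components: {cumulative_retained:.2f}%")
```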


Rotated Component Matrixa

                               Component
                               1       2       3       4
Vote_Tedious_TimeConsuming     -.139   .739    .059    .012
Opinion_Safety_Concern         .079    .765    -.096   -.008
Opinion_Politicians_choice     .047    -.022   -.022   .907
Opinion_Trust_Voting_System    .094    -.524   .043    .444
Opinion_RegstrVoterID_Org      .024    -.206   .745    -.207
Opinion_Voting_Business_Park   -.061   .134    .810    .185
Importance_Leader              .840    -.073   -.001   .014
Opinion_Vote_Important         .877    -.056   -.016   .080
Imapct_LifeStyle               .881    .003    -.032   .018

Extraction Method: Principal Component Analysis.

Rotation Method: Varimax with Kaiser Normalization.

a. Rotation converged in 5 iterations.
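The extraction communalities reported earlier can be cross-checked against this matrix: for an orthogonal rotation such as Varimax, the communality of a variable equals the sum of its squared loadings across the retained components. The sketch below assumes only the rounded loadings printed above, so the results agree with the Communalities table to roughly two or three decimals.

```python
# Rotated loadings (components 1-4) copied from the Rotated Component Matrix.
rotated_loadings = {
    "Vote_Tedious_TimeConsuming": [-0.139, 0.739, 0.059, 0.012],
    "Opinion_Safety_Concern": [0.079, 0.765, -0.096, -0.008],
    "Opinion_Politicians_choice": [0.047, -0.022, -0.022, 0.907],
    "Opinion_Trust_Voting_System": [0.094, -0.524, 0.043, 0.444],
    "Opinion_RegstrVoterID_Org": [0.024, -0.206, 0.745, -0.207],
    "Opinion_Voting_Business_Park": [-0.061, 0.134, 0.810, 0.185],
    "Importance_Leader": [0.840, -0.073, -0.001, 0.014],
    "Opinion_Vote_Important": [0.877, -0.056, -0.016, 0.080],
    "Imapct_LifeStyle": [0.881, 0.003, -0.032, 0.018],
}

# Communality = sum of squared loadings over the retained components.
communalities = {
    name: sum(loading ** 2 for loading in loadings)
    for name, loadings in rotated_loadings.items()
}

for name, value in communalities.items():
    print(f"{name:30s} {value:.3f}")
```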

SPSS OUTPUT FACTOR ANALYSIS

The Scree Plot visually helps to identify the cut-off point of 4. The Eigenvalue of the 5th Factor is 0.788, which is below 1, and hence 4 is the right cut-off.


7.1 LABELLING THE FACTORS

Factor 1 – Importance_Voting_F1

Importance_Leader: It is important to me who gets elected

Opinion_Vote_Important: I feel my vote is important

Imapct_LifeStyle: There will be a direct impact on my day-to-day life from whosoever gets elected

Factor 2 – Security_Ease_F2

Vote_Tedious_TimeConsuming: The process of Voting is tedious and time consuming

Opinion_Safety_Concern: Safety is a concern when I go for voting

Factor 3 – Mobility_Comfort_F3

Opinion_RegstrVoterID_Org: Registration of Voters ID should be done at my corporate, just like other government documents

Opinion_Voting_Business_Park: Voting should be in business parks near work locations and can happen on a working day

Factor 4 – Belief_Voting_System_F4

Opinion_Politicians_choice: I am satisfied with the choice of politicians

Opinion_Trust_Voting_System: The voting system is trustworthy

PART VIII: ONE WAY ANOVA

One Way Analysis of Variance (ANOVA) is performed to determine whether the mean of each of the 4 independent Factors differs significantly with respect to the dependent variable. Though the significance of the factors will be tested during Model building, ANOVA is performed as a part of EDA.

H01 – there is no difference in the average response of people who consider voting important

H11 – there is a difference in the average response of people who consider voting important

H02 – there is no difference in the average response of people who have concerns with safety and ease related to voting

H12 – there is a difference in the average response of people who have concerns with safety and ease related to voting

H03 – there is no difference in the average response of people who look forward to comfort in terms of voters ID registration and location of polling booths

H13 – there is a difference in the average response of people who look forward to comfort in terms of voters ID registration and location of polling booths

H04 – there is no difference in the average response of people who believe in the voting system

H14 – there is a difference in the average response of people who believe in the voting system

SPSS OUTPUT ONE WAY ANOVA

SUMMARY – ONE WAY ANOVA

The One way ANOVA for the 4 factors shows there is a difference in the means of following

factors:

Factor 1 – Importance Voting

Factor 3 – Voting Comfort

Hence the null hypotheses H01 and H03 are Rejected.

All the factors are reused during model building.


PART IX: MODEL 1

9.1 BINARY LOGISTIC REGRESSION

Binary Logistic Regression model is used to predict the chances of a person opting to vote.

The predictor variables used are the 4 Factors created, along with all the demographic

variables. Since the independent variables are a mix of continuous and categorical variables,

Binary Logistic Model is the best method to predict the dependent variable.

Even though many of the demographic variables like Age, Gender etc. showed low significance in predicting Y as per the Chi-square tests, they are used in the model because they might become significant when interacting with other variables. This is done to rule out the possibility of a different outcome when the variables interact with each other in the model.

The model below was developed after multiple iterations, which can be seen in the attached output file.

Block 0: Beginning Block

We see that there are 373 cases used in the analysis.


Variables in the Equation

                    B       S.E.   Wald   df   Sig.   Exp(B)
Step 0   Constant   -.102   .104   .967   1    .325   .903

Block 1: Method = Enter

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      192.205a            .580                   .775

a. Estimation terminated at iteration number 7 because parameter estimates changed by less than .001.

The Block 0 output is for a model that includes only the intercept. Given the base rates of

the two decision options (196/373 = 52.5% decided not to vote, 47.5% decided to vote),

and no other information, the best strategy is to predict, for every case, that the subject will

decide not to vote. Using that strategy, we would be correct 52.5% of the time.

Under Variables in the Equation the intercept-only model is ln(odds) = -.102. If we exponentiate both sides of this expression we find that our predicted odds [Exp(B)] = .903. That is, the predicted odds of deciding to vote are .903. Since 177 subjects decided to vote and 196 decided not to vote, our observed odds are 177/196 = .903.
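The Step 0 row can be reproduced from the two counts alone. The sketch below is an illustration in plain Python, assuming the standard results for an intercept-only logistic model: the constant is the log of the observed odds, its standard error is sqrt(1/voted + 1/not_voted), and the Wald statistic is (B / S.E.)^2.

```python
import math

voted, not_voted = 177, 196  # counts reported for the dependent variable

odds = voted / not_voted                           # observed odds of voting
b0 = math.log(odds)                                # intercept B of the intercept-only model
se_b0 = math.sqrt(1 / voted + 1 / not_voted)       # standard error of B
wald = (b0 / se_b0) ** 2                           # Wald statistic, 1 degree of freedom
base_rate = not_voted / (voted + not_voted)        # accuracy of always predicting "did not vote"

print(f"B = {b0:.3f}, S.E. = {se_b0:.3f}, Wald = {wald:.3f}, Exp(B) = {odds:.3f}")
print(f"base-rate accuracy = {base_rate:.1%}")
```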

In the Model Summary we see that the -2 Log Likelihood statistic is 192.205. This statistic measures how poorly the model predicts the decisions: the smaller the statistic, the better the model.


The Classification Table shows us that this rule allows us to correctly classify 169 / 177 = 95.5% of the subjects where the predicted event (deciding to vote) was observed. This is known as the sensitivity (True Positive Rate) of prediction, P(correct | event did occur), that is, the percentage of occurrences correctly predicted. This rule allows us to correctly classify 171 / 196 = 87.2% of the subjects where the predicted event was not observed. This is known as the specificity (True Negative Rate) of prediction, P(correct | event did not occur), that is, the percentage of non-occurrences correctly predicted; the False Positive Rate is its complement, 1 - specificity = 12.8%. Overall our predictions were correct 340 out of 373 times, for an overall success rate of 91.2%.
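The rates quoted from the Classification Table follow from four cell counts. In the sketch below the true positives and true negatives are the figures given in the text (169 and 171); the false negatives and false positives are derived from the group sizes of 177 voters and 196 non-voters.

```python
# Cell counts of the classification table (event = deciding to vote).
true_positive = 169                     # voters predicted as voters
false_negative = 177 - true_positive    # voters predicted as non-voters
true_negative = 171                     # non-voters predicted as non-voters
false_positive = 196 - true_negative    # non-voters predicted as voters

sensitivity = true_positive / (true_positive + false_negative)   # True Positive Rate
specificity = true_negative / (true_negative + false_positive)   # True Negative Rate
false_positive_rate = 1 - specificity
total = true_positive + false_negative + true_negative + false_positive
accuracy = (true_positive + true_negative) / total

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"false positive rate = {false_positive_rate:.1%}")
print(f"overall accuracy = {accuracy:.1%}")
```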

9.2 Interpreting Coefficients

Ln[p/(1-p)] = a + b1X1 + b2X2 + b3X3 + b4X4

Annual Income, Belong to State Residing in, Importance of Voting and Comfort of Mobility (registering ID and voting location) are significant factors driving the decision to vote.

Although Annual Income came out to be insignificant in the Chi-square test, in interaction with other variables it became a significant factor driving the voting decision.

Each coefficient changes the odds by a multiplicative amount e^b, which is Exp(B). Every unit increase in X multiplies the odds by e^b.

Annual Income: e^(-0.937) = 0.392; (0.392 - 1 = -0.608). Odds of voting decrease by 60.8% for people with Annual Income less than 10 lacs p.a.

Belong to the state residing in: Odds of voting increase by 605.0% for people who belong to the state they are residing in.

Importance_Voting: e^(4.571) is about 96.6, so the odds of voting increase by about 9,564% for people who believe their vote is important, for whom it is important who gets elected, and who at the same time believe that it would improve their lifestyle.

Mobility Comfort: Odds of voting decrease by 34.8% for people who want voters ID registration to happen in the corporate and voting to happen in the Business Park.
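The percentage changes above are (Exp(B) - 1) x 100 for each coefficient. The sketch below applies that arithmetic to the four coefficients of the first model; the dictionary keys are illustrative labels rather than exact SPSS variable names.

```python
import math

# Coefficients (B) of the first model; keys are illustrative labels.
coefficients = {
    "Annual_Income": -0.937,
    "State_residing_flag": 1.953,
    "Importance_Voting_F1": 4.571,
    "Mobility_Comfort_F3": -0.428,
}

# Exp(B) is the factor by which the odds are multiplied per unit increase in X.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

# Percentage change in the odds per unit increase in X.
pct_change = {name: (ratio - 1) * 100 for name, ratio in odds_ratios.items()}

for name in coefficients:
    print(f"{name:22s} Exp(B) = {odds_ratios[name]:8.3f}  change in odds = {pct_change[name]:+9.1f}%")
```

A negative coefficient gives Exp(B) below 1 and therefore a decrease in the odds; a positive coefficient gives Exp(B) above 1 and an increase.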



THE FIRST MODEL:

Ln[p/(1-p)] = -1.586 + (-.937)*Annual Income + 1.953*State_residing_flag + 4.571*Importance_voting_F1 + (-.428)*Mobility_Comfort_F3
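To turn the fitted equation into a predicted probability, the log-odds are passed through the logistic function p = 1 / (1 + e^(-logit)). The sketch below is a hedged illustration: the function and argument names are hypothetical, the coding of each input is assumed to match the coding used when the model was fitted, and the two factor inputs are assumed to be the factor scores saved from the factor analysis.

```python
import math

INTERCEPT = -1.586
B_ANNUAL_INCOME = -0.937
B_STATE_RESIDING_FLAG = 1.953
B_IMPORTANCE_VOTING_F1 = 4.571
B_MOBILITY_COMFORT_F3 = -0.428


def predicted_probability(annual_income, state_residing_flag,
                          importance_voting_f1, mobility_comfort_f3):
    """Predicted probability of voting from the first model (hypothetical helper)."""
    logit = (
        INTERCEPT
        + B_ANNUAL_INCOME * annual_income
        + B_STATE_RESIDING_FLAG * state_residing_flag
        + B_IMPORTANCE_VOTING_F1 * importance_voting_f1
        + B_MOBILITY_COMFORT_F3 * mobility_comfort_f3
    )
    return 1 / (1 + math.exp(-logit))


# Illustrative inputs only, not taken from the survey data.
baseline = predicted_probability(0, 0, 0.0, 0.0)
with_state_flag = predicted_probability(0, 1, 0.0, 0.0)
print(f"all predictors at 0: p = {baseline:.3f}")
print(f"state flag set to 1: p = {with_state_flag:.3f}")
```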

SPSS OUTPUT FOR BINARY LOGISTIC MODEL AND ROC CURVE

TPR_FPR.xlsx

MANUAL CALCULATION FOR TPR AND FPR

9.3 ROC Curve

The Receiver Operating Characteristic (ROC) Curve is used for diagnostic test evaluation. The True Positive Rate (Sensitivity) is plotted as a function of the False Positive Rate (100 - Specificity) for the different cut-off points of the parameter. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. The area under the ROC curve (AUC) is a measure of how well a parameter can distinguish between two diagnostic groups.

The summary shows the number of respondents who voted in the last elections (177) and the number of respondents who did not vote in the last election (196).


Coordinates of the Curve

Test Result Variable(s): Predicted probability

Positive if Greater     Sensitivity   1 - Specificity   Specificity             Error =
Than or Equal To(a)                                     (1 - (1 - Specificity))   ABS(Sensitivity - Specificity)
.5819164                .910          .092              .908                    .001
.5942843                .904          .092              .908                    .004
.5757379                .910          .097              .903                    .007
.6012760                .898          .092              .908                    .010
.5747964                .915          .097              .903                    .012
.6063015                .898          .087              .913                    .015
.5730680                .915          .102              .898                    .017
.6120221                .893          .087              .913                    .021
.5712820                .915          .107              .893                    .022
.6145646                .887          .087              .913                    .026

The area under the curve (AUC) is a measure of the discriminating power of the test. The AUC is 0.961, which means that a randomly chosen voter receives a higher predicted probability than a randomly chosen non-voter about 96.1% of the time.

The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test is.

Cut-off point lies in this range


.9929000 .011 .000

.0000000 1.000 .990

.9938029 .006 .000

.0000000 1.000 .995

.0000000 1.000 1.000

1.0000000 .000 .000

a. The smallest cutoff value is the minimum observed test value minus 1, and the largest cutoff value is the maximum observed test value plus 1. All the other cutoff values are the averages of two consecutive ordered observed test values.

[Figure: Sensitivity and Specificity curves plotted against the cut-off probability]

.5693794 .915 .112 .888 .027

.6180047 .881 .087 .913 .032

.5677229 .921 .112 .888 .033

.6205378 .876 .087 .913 .038

.5638978 .927 .112 .888 .039

.6278912 .876 .082 .918 .043

.5589323 .932 .112 .888 .044

.6400436 .876 .077 .923 .048

.5502517 .932 .117 .883 .050

.6519254 .876 .071 .929 .053

Data deleted for ease of presentation. Please refer to the attached calculations.

The graph is plotted in Excel to show the cut-off probability for the model. The cut-off is the intersection point of the two curves, where Specificity - Sensitivity is zero.

The cut-off value that balances a high TPR against a low FPR is calculated.

The data is sorted to get the cut-off probability value below which a respondent is classified as not voting. This is calculated manually from the Coordinates of the Curve for better understanding. The cut-off probability is at the point where the absolute difference between specificity and sensitivity is zero or at its minimum.

Specificity = 1 - (1 - Specificity)

Error = ABS(Sensitivity - Specificity)

The cut-off probability for the model to classify 0 and 1 is 0.58.
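The manual sort described above can be sketched as follows. The rows are the first five coordinates from the table, with sensitivity and 1 - specificity rounded to three decimals as printed, so the computed differences can deviate slightly from the Error column, which was derived from unrounded values. The selected cut-off is the same.

```python
# (cut-off, sensitivity, 1 - specificity) rows copied from the Coordinates of the Curve.
roc_coordinates = [
    (0.5819164, 0.910, 0.092),
    (0.5942843, 0.904, 0.092),
    (0.5757379, 0.910, 0.097),
    (0.6012760, 0.898, 0.092),
    (0.5747964, 0.915, 0.097),
]


def balance_error(row):
    """Absolute gap between sensitivity and specificity for one cut-off."""
    _cutoff, sensitivity, one_minus_specificity = row
    specificity = 1 - one_minus_specificity
    return abs(sensitivity - specificity)


# The chosen cut-off is the one where sensitivity and specificity are closest.
best_cutoff, best_sensitivity, best_fpr = min(roc_coordinates, key=balance_error)
print(f"cut-off = {best_cutoff:.2f}, sensitivity = {best_sensitivity}, specificity = {1 - best_fpr:.3f}")
```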


9.4 CONCLUSION

People who reside in the state they belong to have higher chances of voting than the others. Further, those who feel their vote is important and think that a change in leader will impact their lifestyle have higher chances of voting than those who feel voting is not important.

People in the income bracket of 2-10 lacs per annum have a lower probability of voting than those in the higher income bracket. There is a possibility that they do not feel the importance of voting as it might not impact their lifestyle.

There is a requirement to encourage people to vote, as a lot of people do not feel their vote is important. People who feel that registration of voter ID should happen at their corporate and that voting should happen in the business park are less likely to vote otherwise.

This is a strong indication for the government to encourage voting by streamlining the process better.

The further analysis in the research focuses on factors that affect people's decision to vote. This would also help in verifying the outcome from the above model.

PART X: MODEL 2

INTRODUCTION

The analysis is divided into 2 parts. The first part concentrated on the factors driving the voting decision among the urban population in India. The second part of the analysis concentrates on people who did not vote in the last elections.

As per the data, there are 3 categories of people who did not vote in the last election:

1) People who are not registered voters (hence cannot vote)

2) People who are registered voters but did not vote

3) People who are registered voters but whose Voter's ID is of a state they are not currently residing in

A Discriminant analysis is performed to see if there is any commonality among the 3 groups.

SPSS INPUT FOR SECOND MODEL


10.1 DISCRIMINANT ANALYSIS

The objective of using Discriminant analysis is to model the variables and identify the principal discriminators that differentiate the behaviour of three groups of individuals: registered voters whose voter ID belongs to the state they reside in, registered voters whose voter ID belongs to a state they are not currently residing in, and people who are not registered voters.

The objective of Discriminant Analysis is to understand:

How the 3 groups differ with respect to the underlying variables

How people differ with respect to underlying demographic and psychographic dimensions

Given the information on various variables, which segment a person belongs to, and with what probability

Since there are many variables, the Stepwise method is used to select the best variables to use in the model.


The stepwise method starts with a model that doesn't include any of the predictors (step 0).

At each step, the predictor with the largest F to Enter value that exceeds the entry criterion (by default, 3.84) is added to the model.

Box's Test of Equality of Covariance Matrices

The ANOVA table shows that following variables are significant across groups:

i. Age

ii. Relationship Status

iii. City of Residence

iv. If the person Belongs to the State he/she is residing in

v. Duration of Stay in the same State

vi. People for whom it is important who gets elected

vii. People who think it would impact their lifestyle, if they vote

The significance value of .000 indicates that the covariance matrices are not homogeneous, which violates an assumption of DA.

When n is large, small deviations from homogeneity will be found significant, which is why Box's M is interpreted in conjunction with inspection of the log determinants. The log determinants are close to each other for groups 0 and 1, and hence the assumption of equality of covariance can hold good for these 2 groups.


Stepwise Statistics


The table shows the variables selected in the model after 38 iterations. The above 3 variables are considered for the analysis.

Tolerance is the proportion of a variable's variance not accounted for by the other independent variables in the equation. It is a measure of multicollinearity.

Tolerance = 1/VIF (the closer the value is to 1, the less collinear the variables are)

A variable with low Tolerance contributes little information to a model and can cause computational problems. The 3 variables have high tolerance, which signifies that they are not collinear and that each contributes distinct information to the model.
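The relation between Tolerance and VIF can be sketched as follows. Tolerance is 1 - R^2, where R^2 comes from regressing one predictor on all the other predictors, and VIF is its reciprocal. The R^2 values below are hypothetical and serve only to show the arithmetic; they are not taken from the SPSS output.

```python
def tolerance(r_squared):
    """Share of a predictor's variance not explained by the other predictors."""
    return 1 - r_squared


def variance_inflation_factor(r_squared):
    """VIF is the reciprocal of tolerance."""
    return 1 / tolerance(r_squared)


# Hypothetical R^2 values, for illustration only.
for r_squared in (0.05, 0.20, 0.90):
    print(
        f"R^2 = {r_squared:.2f}  tolerance = {tolerance(r_squared):.2f}  "
        f"VIF = {variance_inflation_factor(r_squared):.2f}"
    )
```

A tolerance near 1 (VIF near 1) indicates little collinearity; a tolerance near 0 indicates that the predictor is almost a linear combination of the others.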


Summary of Canonical Discriminant Functions

Nearly all the variance explained by the model is through the first Discriminant Function, so we can ignore the second function. For each set of functions, this tests the hypothesis that the means of the functions listed are equal across groups. The test of function 2 has a p value of .26, so this function contributes little to the model.

The Eigenvalue indicates the proportion of variance explained (between-groups sum of squares divided by within-groups sum of squares). A large Eigenvalue indicates a strong function.

The canonical correlation is the correlation between the discriminant scores and the levels of the dependent variable. A high correlation indicates a function that discriminates well. The present correlation of 0.599 is not extremely high (1.00 is perfect).

Function 1 is highly correlated with State_residing_flag and Age. Function 2 is highly correlated with Importance_Leader. Although State_residing_flag is correlated with both functions, its impact is higher on Function 1.


Discriminant Equation:

Function 1 = 0.219 + 2.806 State_residing_flag + 0.547 Age – 0.422 Importance_Leader

Function 2 = -3.719 + 1.304 State_residing_flag – 0.100 Age + 0.921 Importance_Leader
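As an illustration of how the two discriminant equations are applied, the sketch below scores a single respondent. The coefficients are copied from the equations above. The function name and the input values are hypothetical, because the coding of Age and Importance_Leader is not shown in this part of the report.

```python
# Coefficients copied from the two discriminant equations in the text.
COEFFICIENTS = {
    "function_1": {"constant": 0.219, "State_residing_flag": 2.806,
                   "Age": 0.547, "Importance_Leader": -0.422},
    "function_2": {"constant": -3.719, "State_residing_flag": 1.304,
                   "Age": -0.100, "Importance_Leader": 0.921},
}


def discriminant_scores(state_residing_flag: float, age: float,
                        importance_leader: float) -> dict:
    """Return the Function 1 and Function 2 scores for one respondent."""
    values = {
        "State_residing_flag": state_residing_flag,
        "Age": age,
        "Importance_Leader": importance_leader,
    }
    scores = {}
    for name, coef in COEFFICIENTS.items():
        scores[name] = coef["constant"] + sum(coef[k] * v for k, v in values.items())
    return scores


# Hypothetical coded answers (assumed coding, for illustration only).
print(discriminant_scores(state_residing_flag=1, age=2, importance_leader=3))
```

A respondent is assigned to the group whose centroid lies closest to the resulting pair of scores.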

Classification Statistics

Group centroids are the group means of the discriminant scores.

Registered voters with an ID of the same state and those with an ID of a different state have similar scores on Function 2; the centroids of these 2 groups are close together.

The coefficients in the discriminant equations are the actual regression coefficients of the variables and are used to form the linear discriminant functions.

SPSS OUTPUT DISCRIMINANT ANALYSIS

The Classification Results table summarizes the number and percentage of subjects classified correctly and incorrectly.

Overall % correctly classified = 61.2%
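The overall percentage is the share of cases on the diagonal of the classification table. The sketch below shows the arithmetic on a made-up 3x3 table; the counts are hypothetical and are not the ones behind the 61.2% figure, which are not reproduced in the text.

```python
def classification_summary(matrix):
    """Overall and per-group hit rates from a square table (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    per_group = [row[i] / sum(row) for i, row in enumerate(matrix)]
    return correct / total, per_group


# Hypothetical counts for groups 0, 1 and 2 (assumption, for illustration only).
hypothetical_counts = [
    [30, 5, 15],
    [6, 40, 4],
    [12, 8, 30],
]
overall, per_group = classification_summary(hypothetical_counts)
print(f"overall: {overall:.1%}")
for group, rate in enumerate(per_group):
    print(f"group {group}: {rate:.1%}")
```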

Discriminant scores plotted on Function 1 and Function 2 show that group 0 and group 2 are very close, whereas group 1 is slightly different.

Function 1 is highly correlated with Age and State_residing_flag, and Function 2 is correlated with Importance_Leader.

CONCLUSION

Group 1 (people who are registered voters and have an ID of the state they live in) scores high on Function 1 and is discriminated by age, along with belonging to the state they live in.

Groups 0 and 2 are, respectively, people who do not have a registered voter ID and people who have an ID of another state. The importance of who gets elected is a major discriminant for them, along with not belonging to the state they live in.

Hence, people who do not have a registered voter ID and those who have an ID of a different state are not impacted by who gets elected. This can be a factor driving their decision not to vote.

As further analysis, the factors driving the decision not to obtain a voter ID, and the extent of their effect, are studied using Multinomial Logistic Regression.

10.2 MULTINOMIAL LOGISTIC REGRESSION

The analysis is done to understand the factors driving registration for a voter ID card, along with the probability of a person voting. The dependent variable has 3 categories:

People without registered voters ID (0)

People with registered voters ID (1)

People with registered voters ID of different state (2)

H0: There is no difference between the 3 groups of people

H1: There is a significant difference in the 3 groups of people

Multinomial Logistic Regression is performed since the dependent variable has more than two categories. The category of people without a voter ID is taken as the reference category, since it differs from both the people who have an ID of the same state and those who have an ID of a different state.
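With category 0 as the reference, the model estimates one linear predictor (logit) for category 1 versus 0 and one for category 2 versus 0. The sketch below shows how two such logits translate into probabilities for all three categories. The function name and the logit values are hypothetical and serve only to illustrate the mechanics.

```python
import math


def category_probabilities(logit_1: float, logit_2: float) -> tuple:
    """P(0), P(1), P(2) when category 0 is the reference (its logit is fixed at 0)."""
    denominator = 1.0 + math.exp(logit_1) + math.exp(logit_2)
    return (
        1.0 / denominator,
        math.exp(logit_1) / denominator,
        math.exp(logit_2) / denominator,
    )


# Hypothetical logits for one respondent (assumption, not estimated values).
p0, p1, p2 = category_probabilities(logit_1=0.5, logit_2=-1.0)
print(f"P(0)={p0:.3f}  P(1)={p1:.3f}  P(2)={p2:.3f}  sum={p0 + p1 + p2:.3f}")
```

Each coefficient B therefore describes a change in the log-odds of a category relative to category 0, not relative to the other non-reference category.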

Pseudo R-Square

Cox and Snell .591

Nagelkerke .681

McFadden .440

p<0.05 means rejecting the null hypothesis that there is no difference between the ‘intercept only’ model and the populated model.

The Pearson and Deviance statistics both test how well the model fits the data (expected versus actual values); p<0.05 means that there is a significant difference between the two, i.e. the model is not a good fit.

According to the Pearson statistic the model is a bad fit, but the Deviance statistic suggests otherwise. This could be due to low frequencies in the crosstabs.

Variables which were not significant are removed for easy presentation and understanding.

Parameter Estimates (reference category: 0, people without a registered voter ID)

Lower Bound and Upper Bound give the 95% Confidence Interval for Exp(B).

Registered_Voter  Parameter                         B        Std. Error  Wald    df  Sig.  Exp(B)  Lower Bound  Upper Bound
1                 Intercept                         -15.589  3230.327    .000    1   .996
1                 [State_residing_flag=0]           -2.997   .974        9.473   1   .002  .050    .007         .337
1                 [Duration_Stay_Same_City=2]       2.667    1.146       5.419   1   .020  14.394  1.524        135.921
1                 [Opinion_Politicians_choice=2]    -4.019   1.572       6.535   1   .011  .018    .001         .392
1                 [Opinion_Trust_Voting_System=1]   -5.078   2.377       4.562   1   .033  .006    .000         .658
1                 [Opinion_Trust_Voting_System=2]   -3.203   1.760       3.312   1   .069  .041    .001         1.280
1                 [Opinion_Trust_Voting_System=4]   -2.856   1.635       3.051   1   .081  .058    .002         1.417
1                 [Opinion_RegstrVoterID_Org=3]     2.596    1.250       4.314   1   .038  13.408  1.157        155.322
1                 [Opinion_Voting_Business_Park=2]  3.978    1.502       7.015   1   .008  53.397  2.813        1013.610
2                 Intercept                         2.733    2.201       1.542   1   .214
2                 [State_residing_flag=0]           3.454    1.061       10.603  1   .001  31.639  3.956        253.066
2                 [Opinion_Safety_Concern=1]        -4.342   1.638       7.024   1   .008  .013    .001         .323
2                 [Opinion_Safety_Concern=2]        -3.415   1.547       4.875   1   .027  .033    .002         .681
2                 [Opinion_Safety_Concern=3]        -3.582   1.547       5.363   1   .021  .028    .001         .577
2                 [Opinion_Voting_Business_Park=2]  1.942    1.106       3.085   1   .079  6.972   .799         60.873
2                 [Opinion_Voting_Business_Park=4]  2.146    .688        9.724   1   .002  8.554   2.220        32.968
2                 [Importance_Leader=3]             -2.430   .973        6.239   1   .012  .088    .013         .593
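The Wald, Exp(B) and confidence-interval columns of the parameter estimates follow from B and its standard error. The sketch below recomputes them for the [State_residing_flag=0] row of category 1 (B = -2.997, Std. Error = .974) and agrees with the reported values up to rounding of the inputs. The function name is hypothetical, and the constant 1.96 is the usual normal approximation for a 95% interval.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def odds_ratio_summary(b: float, std_error: float) -> dict:
    """Wald statistic, odds ratio Exp(B) and its 95% interval from B and its standard error."""
    half_width = Z_95 * std_error
    return {
        "wald": (b / std_error) ** 2,
        "exp_b": math.exp(b),
        "lower": math.exp(b - half_width),
        "upper": math.exp(b + half_width),
    }


# B and Std. Error copied from the [State_residing_flag=0] row for category 1.
row = odds_ratio_summary(b=-2.997, std_error=0.974)
for name, value in row.items():
    print(f"{name}: {value:.3f}")
```

An Exp(B) below 1 means the odds of the category relative to the reference category fall for that level of the variable, and an interval that excludes 1 corresponds to significance at the 5% level.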

The pseudo R-square indicates approximately how much of the variance in the dependent variable is explained by the model; low values are normal in logistic regression. It is not a widely used measure for analysis.
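The three pseudo R-square measures reported earlier (Cox and Snell, Nagelkerke, McFadden) are all computed from the log-likelihoods of the intercept-only model and the fitted model. The sketch below uses the standard formulas. The log-likelihoods and the sample size are hypothetical inputs, chosen only so that the results land close to the reported values; they are not taken from the SPSS output.

```python
import math


def pseudo_r_squared(ll_null: float, ll_model: float, n: int) -> dict:
    """McFadden, Cox and Snell, and Nagelkerke pseudo R-square from two log-likelihoods."""
    mcfadden = 1.0 - ll_model / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # upper bound of Cox and Snell
    return {
        "mcfadden": mcfadden,
        "cox_snell": cox_snell,
        "nagelkerke": cox_snell / max_cox_snell,
    }


# Hypothetical log-likelihoods and sample size (assumptions, for illustration only).
result = pseudo_r_squared(ll_null=-101.6, ll_model=-56.9, n=100)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Nagelkerke rescales Cox and Snell so that the maximum attainable value is 1, which is why it is always the larger of the two.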

All the variables were included in the model and were then removed one by one to arrive at the list of variables which are most significant.

Variables such as Opinion_RegstrVoterID_Org and Importance_Leader have an overall significance level of more than 0.05 but are still included in the analysis, since they are significant at the 90% level. Further, they are also significant at some of their levels with respect to the dependent variable.

Interpreting Coefficients…

For the comparison between categories 0 and 1, the following variables are most significant:

0: People who are not registered voters

1: People who are registered voters and have an ID of the same state

i) State_residing_flag ii) Duration_Stay_Same_City iii) Opinion_Trust_Voting_System iv) Opinion_RegstrVoterID_Org v) Opinion_Voting_Business_Park and vi) Opinion_Politicians_choice

- The odds of having a registered voter ID of the same state decrease if people do not belong to the state they are residing in.

- The odds of having a registered voter ID of the same state decrease if people have been staying in the city for more than 5 years.

- The odds of having a registered voter ID of the same state decrease for people who do not trust the voting system.

- The odds of having a registered voter ID increase if the registration of IDs is done in corporate offices.

- People who do not want voting to happen in business parks have higher chances of having a registered ID of the same state.

Interpreting Coefficients…

For the comparison between categories 0 and 2, the following variables are most significant:

0: People who are not registered voters

2: People who are registered voters and have an ID of a different state

i) State_residing_flag ii) Opinion_Safety_Concern iii) Opinion_Voting_Business_Park and iv) Importance_Leader

- The odds of having a voter ID of another state increase if people do not belong to the state they are residing in.

- The odds of having a registered voter ID of another state decrease for people who have no concern about safety when they go to vote.

- People who have IDs of another state prefer voting to happen in business parks.

- The odds of having a voter ID card of another state decrease for people who are neutral about who gets elected.

SPSS OUTPUT MULTINOMIAL LOGISTIC REGRESSION

The Classification Table shows that the model correctly classifies 76.5% of the subjects, i.e. cases where the predicted category was the one observed.

In the above model, 2 binary equations are formed and the overall accuracy is calculated (76.5%).

The model correctly classifies people with a voter ID card of another state (category 2) in 85.4% of cases, which is better than for the other 2 categories (69.7% and 67.2%).

The small sample size can be one of the reasons for the lower accuracy of the model.

CONCLUSION

People who belong to the state they reside in have higher chances of having a voter ID of that state.

The model also leads to another conclusion: people who belong to the state they are residing in are comfortable going to the government-assigned polling booths to vote, whereas people who do not belong to the state would like voting to happen in the business park, near their work location.

People who have a registered voter ID of another state also have a negative perception of their safety when they go to vote.

This indicates that people who belong to another state would feel more comfortable in an environment familiar to them, whereas people who belong to the same state are ready to go out and vote.

PART XI: CLASSIFICATION TREE

The objective of building a classification tree is to understand how opinion divides on some key concerns related to the voting process. State_residing_flag (Do you belong to the state you are residing in) turned out to be a significant variable in the analysis.

State_residing_flag (0 or 1) is taken as the dependent variable, and opinions on safety, registration of voter IDs in corporate offices and voting in business parks are taken as independent variables. The analysis assesses how opinion varies with a person's origin.

0 – No, 1 – Yes

Of 373 valid responses, 235 do not belong to the state they are residing in and 138 belong to the same state.

The CRT classification tree formed 2 nodes on the basis of people's opinion on whether registration of voter IDs should happen in corporate offices. People who do not think registration should happen in corporate offices, or are neutral about it, formed the first node, and people who agree with the idea formed the second node.

The second node is further split according to how concerned people are about their safety when they go to vote.

Node 3 classifies people who do not have any concern for their safety when they go for voting

whereas node 4 groups people who have little to high level of concern when they go for voting.

NOTE: The next question has high significance to Node 2, which classifies people who believe the

registration of voters ID should happen in corporate.
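CRT trees choose splits by the reduction in node impurity, and Gini impurity is the usual measure for a categorical dependent variable. The sketch below computes it for the root node (235 versus 138) and for the first split. The child counts are derived from the figures in this part's conclusion (297 respondents agree, 198 of whom do not belong to the state), assuming those figures correspond to membership of Node 2. The function names are hypothetical.

```python
def gini(counts):
    """Gini impurity of a node from its class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_improvement(parent, children):
    """Impurity reduction: parent Gini minus the size-weighted Gini of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * gini(child) for child in children)
    return gini(parent) - weighted


root = [235, 138]    # [does not belong to state, belongs to state], from the text
node_1 = [37, 39]    # derived: 235 - 198 and 138 - 99 (assumes the counts map to Node 1)
node_2 = [198, 99]   # derived: 198 of the 297 who agree do not belong to the state

print(f"root Gini: {gini(root):.4f}")
print(f"improvement from first split: {split_improvement(root, [node_1, node_2]):.4f}")
```

The small improvement is in line with the modest overall accuracy reported for the tree.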

Model Summary, showing the dependent and independent variables. Maximum tree depth is 5. The results have considered two variables, i.e. Opinion_RegstrVoterID_Org and Opinion_Safety_Concern.

Model overall accuracy is 63.5%. Overall classification of 0 is more accurate than the classification of 1. This can be due to the lower sample size for 1, i.e. respondents who belong to the state they are residing in.

CONCLUSION

- Of the 373 respondents overall, 297 (79.6%) believe that registration of voter IDs should happen in corporate offices.

- Of the 235 respondents who do not belong to the state they are residing in, 198 feel that registration of voter IDs should happen in corporate offices.

- 165 respondents who do not belong to the state they are residing in have some degree of concern about their safety when they go to vote.

Overall, people who do not belong to the state they are residing in prefer registration of ID cards to happen in corporate offices and are also concerned about their safety. This is a fair conclusion, considering that people feel safe in an environment they are familiar with.

PART XII: SUMMARY AND CONCLUSION

The research brought into consideration many aspects of the perception of India's urban population towards the process of voting.

People living in metros settle in these cities for better-paying jobs and quality of life. However, they are sometimes less attached to the city they live in if they belong to another state. Differences of language and culture can be a reason for this.

The research drew attention to factors which drive the urban population to vote. People who belong to the state they reside in have higher chances of voting than people who do not belong to the state.

Further, the analysis also brought to light that people with an annual income of INR 5-10 lacs have a lower probability of voting than the segment earning more than INR 10 lacs. Also, people who have lived in the city for 5-10 years have a lower probability of voting than those who have stayed longer.

A connection can be drawn between the above outcomes. The urban middle class who do not belong to the state they are residing in and have lived in the city for less than 10 years are generally less attached to the state, since they are still settling into their day-to-day lives. They are people who believe in the voting system and recognise the importance of voting, but look for more familiar surroundings when it comes to registering for a voter ID or going to vote.

It was also noted that people who are not happy with the choice of politicians have a lower probability of voting, and people who think a change in leader will make a difference have a higher probability of voting.

To conclude, there are people who vote because the change will impact their lives, and then there are people who vote because of the social responsibility they attach to the cause. In between is a middle group whose day-to-day life is not impacted by changing governments. Though they know the importance of voting, they still require an extra incentive to take that step forward to make a difference for the country. These people, who are well educated and informed, look for familiar surroundings and easier processes. If the government encourages registration of voter IDs in corporate offices, more people can take advantage of the system. If voting happens in business parks, more people who look for safer and familiar surroundings can be encouraged to vote.

Metros in India are home to a large population. This segment can make a very well-informed decision. They have left the comforts of their homes to lead a better life. They earn their own living and overcome the obstacles of meeting the basic requirements of day-to-day life. This self-sufficient group needs encouragement to go out and vote, and this will happen if the government takes an interest and gives them surroundings they will be comfortable in.

Corporates can drive voting as a part of Corporate Social Responsibility. If we start today to set up the process and drive this change, there are chances that we will see higher voter turnout by the next elections.

SPSS OUTPUT FOR CLASSIFICATION TREES

FURTHER ANALYSIS

The current sample size is very small and is not representative of the population. A similar study can be done with a larger sample, spread across regions, to understand the demographic and psychological behaviour of people towards voting.

Registration of ID cards and concerns about safety can be analysed in greater detail.

People with higher earnings can be analysed, and the assumption that they have a higher sense of social responsibility can be verified.

