Self-Reported Health and Seasonal Variation

Tyler Chan and Colin McBride

December 6, 2016

Sociology 3032

Final Project

Introduction: Seasonal effects on health have long been a topic of interest in epidemiology and other health fields. Although the strength of the effect depends on the typical yearly weather of the place being studied, seasonal weather and daylight are associated with many diseases and other health concerns. With that in mind, we use the 2015 BRFSS data set to see whether individual self-rated health varies by season. We hope to find a pattern similar to the one reported in many of the articles we reviewed for this study. This relationship is important because it could affect the data collected by BRFSS and by other health surveys and studies conducted worldwide. It could also matter in other fields such as business: pharmaceutical companies may want to advertise specific products during the times of year when people are most likely to experience the symptoms those products treat. Finally, on a personal level, people should know that they and others may be more likely to feel that their health is poor during the winter and early spring, and so should be more understanding and less critical of individuals who say they feel under the weather or sick more often during this time of year. Overall, we think that an analysis of this type would benefit the health field as well as many other fields that depend on a person's (or a population's) well-being.

Literature Review: As we briefly mentioned in the introduction, we looked at several articles that established a relationship between disease and time of year in order to get some background before performing our data analysis. One article, by Roger Persson and his associates, examined variation in individual stress by time of year. A key finding was that “Post hoc comparisons showed that only stress ratings made at 14:00 (8 h after awakening) exhibited monthly variation (p = 0.001), with the four highest stress scores observed in January to April and the four lowest stress scores observed in July to October” (Persson et al., 2010). This suggests that people are more stressed during the winter and early spring months, which can in turn be associated with self-reported health. As we learned in class, there is a strong correlation between stress level and an individual's self-rated health; we discussed this in terms of a concept known as allostatic load. This concept describes the idea that as people are subjected to greater levels of chronic stress, they are also subject to greater strain on their bodies and minds, which we think could severely affect self-rated health. Additionally, Ming-Jui Hung and associates found that “In conclusion, among patients without obstructive CAD, CAS displayed seasonal variation with winter and spring peaks, followed by autumn and summer” (Hung et al., 2015). This is another indicator that poor health and the risk of certain conditions depend on the time of year. Although heart disease may not have a direct effect on self-rated health, individuals who are aware that they are more susceptible to disease and other unhealthy conditions during certain times of the year may unconsciously report lower health levels.

A third article, by Maurizio Abrignani and colleagues, concluded that “When examining cases of angina pectoris, fewer admissions occur during the summer, whereas the greatest number occurs during the spring” (Abrignani et al., 2011). Although the peak season differs slightly from the studies above, the seasonal difference in hospital admissions for angina pectoris supports the idea that seasonal health effects are real. Applied to our study, it may be that people feel the symptoms of angina pectoris more severely, more frequently, or both during the spring than in other seasons, and are therefore more likely to be admitted to a hospital. On the other hand, a counterpoint to the research we have described so far comes from Osvaldo Marrero, who notes that “In epidemiological seasonality studies, the amplitude is often small, so that any smoothing of the data is likely to erase the very variation that we are trying to detect” (Marrero 2013). Here the author is saying that in epidemiological studies the amplitude, or difference in health by season, is often very small, so much so that smoothing or otherwise manipulating the data could remove the small variation that actually occurs between seasons. This article is interesting because it relates specifically to epidemiology and also offers a counterpoint to the other research we reviewed as background for our research question. The last article we looked at was “Shining Light on Seasonal Depression” by Patricia Daukantas. She writes, “Lam estimated that, in northern populations, as many as 15 to 20 percent of adults notice some symptoms, such as problems with fatigue, oversleeping and overeating, during the winter, but a much smaller proportion have full-blown SAD” (Daukantas 2008). This article is particularly important to our study because it explains how, in certain populations, up to one fifth of adults experience symptoms during the winter (fatigue, oversleeping, overeating) that could greatly affect self-rated health. Taking these last two sources together, it seems that we could get a result ranging from almost no variation in self-rated health between seasons to a variation affecting up to 20 percent of respondents. This range shows why our research is worthwhile: there is no definitive, repeatedly confirmed answer to our question.

Research Question: Does self-reported health differ depending on the time of the year? If so, which seasons or months show the largest differences in self-reported health?

Hypothesis: People will report their general health as worse during the winter than during the spring, summer, and fall.

Methods Data: Using the variable “imonth”, we split the 2015 BRFSS data into twelve subsets, one per month. To subset all of the observations taken in January, for example, we wrote “January<-brfss2015.full[brfss2015.full$imonth=='01',]”. Once we had the twelve subsets, January through December, we combined them in groups of three to represent the seasons, which we called “spring”, “summer”, “fall”, and “winter”. The subset “spring” contains all observations taken in March, April, and May; “summer” contains those taken in June, July, and August; “fall” contains those taken in September, October, and November; and “winter” contains those taken in December, January, and February. To combine the monthly subsets into seasonal ones we used the “rbind” function. For example, to collect all the observations taken in the spring we typed “spring<-rbind(March,April,May)”. We then did the same for the other three seasons.
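As a compact illustration of the month-to-season grouping described above, the sketch below labels each row with a season through a lookup vector rather than building twelve monthly subsets. It is only a sketch under assumptions: the data frame "brfss_toy" and the helper "season_of_month" are hypothetical stand-ins, and only the column names "imonth" and "genhlth" and the two-digit month codes come from our analysis.

```r
# Toy stand-in for brfss2015.full (hypothetical values; the real data are not reproduced here).
brfss_toy <- data.frame(
  imonth  = c("01", "03", "04", "06", "09", "12", "02", "07"),
  genhlth = c(2, 3, 1, 4, 2, 5, 3, 2),
  stringsAsFactors = FALSE
)

# Lookup from two-digit month code to season, using the same grouping as in the text.
season_of_month <- c(
  "12" = "winter", "01" = "winter", "02" = "winter",
  "03" = "spring", "04" = "spring", "05" = "spring",
  "06" = "summer", "07" = "summer", "08" = "summer",
  "09" = "fall",   "10" = "fall",   "11" = "fall"
)

# Label every row with its season.
brfss_toy$season <- unname(season_of_month[brfss_toy$imonth])

# Selects the same rows that rbind(March, April, May) would collect.
spring <- brfss_toy[brfss_toy$season == "spring", ]

# Number of interviews per season in the toy data.
print(table(brfss_toy$season))
```

Keeping a single data frame with a season column selects the same rows as the rbind approach, and it makes later per-season summaries a one-line call.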

Before we could use “genhlth” to find the mean self-reported health of individuals by the season in which the interview took place, we realized that the recorded values of seven and nine would distort the average. Genhlth takes seven different values: 1, 2, 3, 4, 5, 7, and 9. A value of 1 means excellent health and 5 means poor health; the values 2, 3, and 4 represent very good, good, and fair health respectively. The value 7 was recorded for people who did not know or were not sure of their current general health. We excluded this value because not knowing your current general health does not fit on the scale between excellent and poor. We excluded the value 9 for the same reason: it records a refusal to answer. More importantly, leaving in the values 7 and 9 would inflate our mean and give us less accurate results. To remove these observations from the spring subset, we typed “spring<-spring[spring$genhlth<06,]”. After doing this for each season’s subset, we had a clean sample of people who rated their health as excellent, very good, good, fair, or poor. We then combined the seasonal subsets into a subset called “year” that consisted of every observation with a genhlth value of 1 to 5, by typing “year<-rbind(summer,spring,fall,winter)”. The resulting subset contained 440,211 of the 441,456 observations. Two of these rows had a blank genhlth value, so to find the mean self-reported health over the year we typed “mean(year$genhlth,na.rm = TRUE)”, which computes the mean after dropping the two blank values. We were then able to compare this mean with the means for each season. (figure 1)
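The cleaning and averaging steps above can be sketched on a small example. This is not the exact code we ran: it swaps in “%in%” for the “<06” comparison and “tapply” for the separate seasonal subsets, and the data frame "toy" holds made-up values. One reason for the swap is that “%in%” returns FALSE for a blank (NA) value, so blank rows are dropped at the filtering step, whereas a comparison such as “genhlth<06” returns NA for them and keeps them as all-NA rows that then require na.rm = TRUE.

```r
# Hypothetical responses: 7 = don't know, 9 = refused, NA = left blank.
toy <- data.frame(
  season  = c("winter", "winter", "spring", "spring", "summer", "fall", "fall", "winter"),
  genhlth = c(1, 7, 3, NA, 2, 9, 4, 3),
  stringsAsFactors = FALSE
)

# Keep only answers on the 1-5 excellent-to-poor scale.
# %in% yields FALSE for NA, so the blank row is removed here as well.
clean <- toy[toy$genhlth %in% 1:5, ]

# Overall mean and per-season means of the cleaned responses.
year_mean    <- mean(clean$genhlth)
season_means <- tapply(clean$genhlth, clean$season, mean)

print(year_mean)
print(season_means)
```

Because lower genhlth values indicate better health, a season whose mean falls below the overall mean is one in which respondents rated their health as better than average.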

Once we had the mean self-reported health by season, we thought it would also be interesting to find the mean self-reported health by region of the United States. To do this, we subsetted the data by state, covering all fifty states and the District of Columbia. Using the rbind function, we combined the states into their corresponding regions, giving four subsets called Northeast, West, South, and Midwest. Again, we removed the observations with genhlth values of seven and nine. Restricted to genhlth values of 1-5, these regions had 79,296, 104,533, 67,782, and 121,441 observations respectively. Finally, we combined the four regional subsets into one called “newyear”. We then took the mean genhlth of each region and compared it to the mean genhlth of “newyear”. (figure 3)
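One way to sketch the state-to-region grouping is with the “state.name” and “state.region” vectors that ship with base R. Several things here are assumptions rather than a record of what we did: the column "state_name" is hypothetical (BRFSS identifies states by code), the toy values are made up, the District of Columbia is placed in the South following the Census Bureau grouping, and base R's “North Central” label is renamed to Midwest.

```r
# Hypothetical rows with a state name and a cleaned genhlth value (1-5).
toy <- data.frame(
  state_name = c("Minnesota", "Texas", "New York", "California",
                 "District of Columbia", "Ohio"),
  genhlth    = c(2, 3, 2, 1, 3, 4),
  stringsAsFactors = FALSE
)

# Build a state -> region lookup from the vectors in base R's datasets package.
region_lookup <- setNames(as.character(state.region), state.name)
region_lookup[region_lookup == "North Central"] <- "Midwest"  # base R uses the older label
region_lookup["District of Columbia"] <- "South"              # DC is not in state.name (assumed South)

# Label each row with its region, then average genhlth within each region.
toy$region   <- unname(region_lookup[toy$state_name])
region_means <- tapply(toy$genhlth, toy$region, mean)

print(region_means)
```

The regional means can then be compared with the overall mean in the same way as the seasonal means.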

Results: After restricting the data to observations with genhlth values of 1-5, we computed the mean self-reported health over the year of 2015, which was 2.564259. This means that across the 440,209 people with a valid response, the average rating fell a little more than halfway between very good (2) and good (3). We then calculated the genhlth means for each season: 2.563962 for the spring, 2.570128 for the summer, 2.579947 for the fall, and 2.544152 for the winter.

To our surprise, the mean genhlth during the winter was lower than the year’s average, and lower values indicate better health. This means that during the months of December, January, and February, people on average actually reported better health than people interviewed in the spring, summer, and fall. In addition, as seen in the histograms (figure 2), the distribution of self-reported health values showed very little variation across seasons. Since these results went against our hypothesis and prior research, we decided to subset the data by another category to see whether its group means would differ more from the overall mean.

Instead of comparing the year’s average self-reported health to the averages for different seasons, we compared it to the average self-reported health of people in the four regions of the United States. After subsetting the data into the Northeast, Midwest, South, and West, we found that the average self-reported health in these regions was 2.475579, 2.538014, 2.578089, and 2.497379 respectively.

When subsetting the data into regions, we found that the regional means deviated from the “newyear” mean considerably more than the seasonal means deviated from the “year” mean.
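The comparison of spreads can be checked directly from the reported means. The sketch below is descriptive only, not a significance test: it computes the range and the unweighted standard deviation of the four seasonal means and of the four regional means. The helper name "spread" is ours for illustration.

```r
# Means reported in the Results section (lower = better self-reported health).
season_means <- c(spring = 2.563962, summer = 2.570128, fall = 2.579947, winter = 2.544152)
region_means <- c(Northeast = 2.475579, Midwest = 2.538014, South = 2.578089, West = 2.497379)

# Range (max minus min) and unweighted standard deviation of a set of group means.
spread <- function(m) c(range = max(m) - min(m), sd = sd(m))

season_spread <- spread(season_means)
region_spread <- spread(region_means)

print(season_spread)
print(region_spread)
```

The gap between the best and worst season is about 0.036 points on the 1-5 scale, while the gap between the best and worst region is about 0.103, nearly three times as large. Because the standard deviations treat each group mean equally regardless of how many respondents it contains, they should be read as a rough summary only.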

Discussion/Conclusion: After comparing the mean self-reported health for each season to the year’s mean, we found very little variation between the means. We also found almost no variation across seasons in the frequencies of the individual genhlth values. Finally, we found that people in the United States actually rated their health as better during the winter than during the spring, summer, and fall. This result contradicted our hypothesis and led us to consider what other factors might explain it.

We believe that we may have gotten a lower (that is, better) mean self-reported health during the winter than over the year for a number of reasons. The main factor we considered is that different regions of the United States are affected differently by the seasons. After combining the data into the four regions of the United States, we found much more variation between the regional means and the overall mean than we had found between the seasonal means and the overall mean. This may be because the southern and western states do not experience winters as cold as those in the Northeast and Midwest.

Another reason self-reported health may have been better in the winter than in the other seasons is the different environments that people live in throughout the United States. People in rural areas come into contact with fewer people, making it harder for them to contract contagious diseases. This is especially helpful during the winter, when people tend to stay indoors surrounded by friends and family more than in other seasons. That said, rural living also puts people farther from resources such as doctors, hospitals, and pharmacies, which may lead them to report worse health. People who live in urban areas, by contrast, are far more likely to come into contact with others who have contagious diseases, but more often than not they also have better access to medical institutions when they are getting sick.

Socioeconomic status is also a large factor in self-reported health. While people with higher socioeconomic status may have more access to medical resources, they may also be prone to more stress. Their jobs are often more stressful, which can cause them to report worse health than someone with a lower-paying job and little stress. If we had subsetted the observations by income group, we might have found more variation in means than we did when subsetting by season alone.

In our research we found that coronary heart disease varies by season. Earlier in the semester, we also learned that coronary heart disease affects white people more than other races. We believe that if we were to subset the BRFSS data into seasons and split each season by race, we might find more variation in mean self-reported health across races within each season than we found when comparing season alone to self-reported health.

While all of these factors could play into why our results contradicted our hypothesis, we believe region was the largest factor. If we compared regional self-reported health by season, we might find even more variation in means than we found between self-reported health and season alone.
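A region-by-season comparison of this kind could be sketched as a two-way table of means. We did not run this analysis; the data frame "toy" and its values are hypothetical, and the columns "region" and "season" are assumed to have been added beforehand.

```r
# Hypothetical cleaned rows, each already labeled with region and season.
toy <- data.frame(
  region  = c("South", "South", "West", "West", "South", "West"),
  season  = c("winter", "summer", "winter", "summer", "winter", "winter"),
  genhlth = c(3, 4, 2, 2, 5, 1),
  stringsAsFactors = FALSE
)

# Mean genhlth for every region/season combination (rows = region, columns = season).
by_region_season <- tapply(toy$genhlth, list(toy$region, toy$season), mean)

print(by_region_season)
```

Reading across a row of the resulting table shows how one region's mean changes by season, which is the comparison the paragraph above proposes.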

The data from our research can be helpful for several reasons. Because we know that self-reported health varies between regions of the United States, pharmaceutical companies can use this information to figure out where, and to which demographic, they should be marketing. Since we tested self-reported health against region, we were able to find that people living in the South and Midwest on average reported worse health than people living in the West and Northeast. Knowing this, pharmaceutical companies may find that they have a larger market for antidepressants in the Midwest and South than in the West and Northeast.

Although we did not find substantial variation in self-reported health between seasons, we did find variation between regions. If we had more time, we could split the regional self-reported health by season into even smaller demographic groups based on socioeconomic status, race, and living area. From this we could learn how money, race, and living area affect people living in different regions of the United States during the different seasons of the year. This would allow people in the health industry to better understand where and who is being affected by the different seasons.

References:

Abrignani, M. G., S. Corrao, G. B. Biondo, R. M. Lombardo, P. Di Girolamo, A. Braschi, A. Di Girolamo, and S. Novo. “Effects of Ambient Temperature, Humidity, and Other Meteorological Variables on Hospital Admissions for Angina Pectoris.” European Journal of Preventive Cardiology 19.3 (2011): 342-48. Web. 9 Nov. 2016.

Daukantas, Patricia. 2008. “Shining Light on Seasonal Depression.” Optics and Photonics News 19(3):38–43.

Hung, Ming-Jui, Kuang-Hung Hsu, Nen-Chung Chang, and Ming-Yow Hung. “Increased Numbers of Coronary Events in Winter and Spring Due to Coronary Artery Spasm.” Journal of the American College of Cardiology 65.18 (2015): 2047-048. Web. 9 Nov. 2016.

Marrero, Osvaldo. “Seasonal Variation in Epidemiology.” The College Mathematics Journal 44.5 (2013): 386-98. JSTOR. Web. 10 Nov. 2016.

Persson, Roger, Kai Österberg, Anne H. Garde, Åse M. Hansen, Palle Ørbæk, and Björn Karlson. “Seasonal Variation in Self-reported Arousal and Subjective Health Complaints.” Psychology, Health & Medicine 15.4 (2010): 434-44. Taylor and Francis Online. Web. 10 Nov. 2016.

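Tables and Figures:

Figure 1: Mean self-reported health (genhlth) by season compared with the “year” mean. (figure not reproduced)

Figure 2: Histograms of genhlth values by season. (figure not reproduced)

Figure 3: Mean genhlth by region compared with the “newyear” mean. (figure not reproduced)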
