Self-Reported Health and Seasonal Variation
Tyler Chan and Colin McBride
December 6, 2016
Sociology 3032
Final Project
Introduction: Seasonal effects on health have long been a topic of interest in
epidemiology and other health fields. Although the effect depends on the typical
yearly weather of the place being studied, the weather and daylight that come with
each season are correlated with many different diseases and other health concerns.
With that in mind, we use the 2015 BRFSS data set to see whether individual
self-rated health varies by season. In doing this we hope to find a correlation or
variation similar to the one reported in many of the articles we researched for
this study. The relationship between these two variables is important because it
could have a large effect on the data collected by BRFSS and by other health-related
surveys and studies conducted worldwide. It could also matter in other fields such
as business: pharmaceutical companies may want to advertise specific products during
the times of year when people are most likely to experience the symptoms those
products are meant to relieve. Finally, on a personal level, people should know that
they and others may be more susceptible to feeling that their health is poor during
the winter and early spring. People could then be more understanding and less
critical of individuals who say they feel under the weather or sick more often
during this time of the year. Overall, we think that a data analysis of this type
would be beneficial to the health field as well as to the many other fields that
depend on a person's (or population's) well-being.
Literature Review: As briefly mentioned in the introduction, we looked at several
articles that established a correlation between disease and time of year in order to
get some background before performing our data analysis. One of these, by Roger
Persson and his associates, looked at variation in individual stress by time of
year. A key finding of their study was that “Post hoc comparisons showed that only
stress ratings made at 14:00 (8 h after awakening) exhibited monthly variation (p =
0.001), with the four highest stress scores observed in January to April and the four
lowest stress scores observed in July to October” (Persson et al., 2010). This
suggests that people are more stressed during the winter and early spring months,
which can in turn be associated with self-reported health. As we learned in class,
there is a major correlation between stress level and an individual's self-rated
health; we discussed this in terms of a concept known as allostatic load. This
concept describes the idea that as people are subjected to greater levels of chronic
stress, they are also subject to greater strain on their bodies and minds, which we
think could severely affect one's self-rated health. Additionally, Ming-Jui Hung and
associates wrote in their study: “In conclusion, among patients without obstructive
CAD, CAS displayed seasonal variation with winter and spring peaks, followed by
autumn and summer.” (Hung et al., 2015) Here CAD is coronary artery disease and CAS
is coronary artery spasm. This is another indicator that poor health and the risk of
developing certain diseases depend on the time of year. Although heart disease may
not have a direct effect on individuals' self-rated health, if individuals are aware
that they are more susceptible to diseases and other unhealthy conditions during
certain times of the year, they may unconsciously report lower levels of health.
A third article, by Maurizio Abrignani and colleagues, concluded that “When
examining cases of angina pectoris, fewer admissions occur during the summer,
whereas the greatest number occurs during the spring” (Abrignani et al., 2011).
Although the peak season differs slightly from the studies above, the fact that
hospital admissions for angina pectoris differ by season supports the idea that
seasonal health effects are real. Applied to our study, it may be that people feel
the symptoms of angina pectoris more severely, more frequently, or both during the
spring than in other seasons, and are therefore more likely to be admitted to a
hospital. On the other hand, a counterpoint to the research described so far comes
from Osvaldo Marrero, who writes that “In epidemiological seasonality
studies, the amplitude is often small, so that any smoothing of the data is likely to erase
the very variation that we are trying to detect.” (Marrero 2013) Here the author is
stating that in epidemiological studies the amplitude, or the difference in health
between seasons, is often very small, so much so that if the data are “smoothed out”
or manipulated in any small way, the small variation that actually occurs between
seasons could disappear. This study is interesting because it relates specifically
to epidemiology and also runs counter to the other background research we did for
our research question. The last article we looked at is “Shining Light on Seasonal
Depression” by Patricia Daukantas. In her article she states, “Lam estimated that,
in northern populations, as many as 15 to 20
percent of adults notice some symptoms, such as problems with fatigue, oversleeping
and overeating, during the winter, but a much smaller proportion have full-blown
SAD.” (Daukantas 2008) This article is particularly important to our study because
it explains how, in certain populations, up to one fifth of adults experience
symptoms during the winter (fatigue, oversleeping, overeating) that could greatly
affect individual self-rated health. Looking at these last two sources in
particular, it seems to us that we could get a result ranging from almost no
variation in self-rated health between seasons to a variation of up to 20 percent.
That range shows why our research is important: there is no definitive answer to our
question that has been demonstrated time and time again.
Research Question: Does self-reported health differ depending on the time of the
year? If so, which seasons or months are most strongly associated with self-reported
health?
Hypothesis: People will report their general health as worse during the winter than
during the spring, summer, and fall.
Methods/Data: Using the variable “imonth”, we split the 2015 BRFSS data into
twelve subsets by month. To subset all of the observations taken in January, for
example, we wrote “January<-brfss2015.full[brfss2015.full$imonth=='01',]”. Once we
had the twelve subsets, January through December, we combined them in groups of
three to represent the seasons, which we called “spring”, “summer”, “fall”, and
“winter”. The subset “spring” contains the observations taken in March, April, and
May; “summer” contains those taken in June, July, and August; “fall” contains those
taken in September, October, and November; and “winter” contains those taken in
December, January, and February. To combine the monthly subsets into seasonal ones,
we used the “rbind” function. For example, to collect all the observations taken in
the spring we typed “spring<-rbind(March,April,May)”. We then did the same for the
other three seasons.
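The month-to-season grouping described above can also be expressed as a single lookup table. The authors worked in R with rbind; what follows is only a minimal Python sketch of the same grouping, assuming each record carries a two-character month code like the '01' used in the imonth filter. The helper names season_of and group_by_season and the sample records are hypothetical and are not part of the original analysis.

```python
# Map two-character interview-month codes to seasons, following the
# grouping in the text: Mar-May spring, Jun-Aug summer, Sep-Nov fall,
# Dec-Feb winter.
SEASON_BY_MONTH = {
    "03": "spring", "04": "spring", "05": "spring",
    "06": "summer", "07": "summer", "08": "summer",
    "09": "fall", "10": "fall", "11": "fall",
    "12": "winter", "01": "winter", "02": "winter",
}


def season_of(imonth):
    """Return the season for a month code such as '01' (hypothetical helper)."""
    return SEASON_BY_MONTH[imonth]


def group_by_season(records):
    """Group records (dicts with an 'imonth' key) into one list per season."""
    groups = {"spring": [], "summer": [], "fall": [], "winter": []}
    for record in records:
        groups[season_of(record["imonth"])].append(record)
    return groups


# Made-up records for illustration only, not BRFSS data.
sample = [
    {"imonth": "01", "genhlth": 2},
    {"imonth": "04", "genhlth": 3},
    {"imonth": "12", "genhlth": 1},
]
seasons = group_by_season(sample)
print({name: len(rows) for name, rows in seasons.items()})
```

A lookup table keeps the season definition in one place, so changing which months count as winter would not require rebuilding twelve monthly subsets.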
Before we could use “genhlth” to find the mean self-reported health of
individuals according to the season in which the interview took place, we realized
that the recorded values of seven and nine would give us a less accurate average.
Genhlth is a variable that can take seven different values: 1, 2, 3, 4, 5, 7, and 9.
A value of 1 means that respondents rate their health as excellent and 5 means that
they rate it as poor; the values 2, 3, and 4 represent very good, good, and fair
health respectively. The value seven was recorded for people who did not know or
were not sure of their current general health. We felt excluding this value was
necessary because not knowing your current general health does not fit on the scale
between excellent and poor. We excluded the value nine for the same reason: refusing
to answer does not fit on the excellent-to-poor scale either. More importantly,
because seven and nine are larger than any value on the scale, leaving them in would
inflate our mean and give us less accurate results. To remove these observations
from the subset spring, we typed “spring<-spring[spring$genhlth<06,]”. After doing
this for each season's subset, we had a clean sample of people who rated their
health as excellent, very good, good, fair, or poor. Once all of the seasons
contained only genhlth values of 1 to 5, we combined the seasonal subsets into a
subset called “year” that consisted of every observation with a genhlth value of 1
to 5. To do this, we typed “year<-rbind(summer,spring,fall,winter)”. The subset year
contained 440,211 of the 441,456 observations in the full data set. With this
subset, we were able to find the mean self-reported health of individuals over the
year of 2015. Two people out of 441,456 left their genhlth value blank, so to find
the yearly mean we typed “mean(year$genhlth,na.rm = TRUE)”, which computes the mean
after dropping the two observations with blank values. We could then compare this
mean with the mean for each season. (figure 1)
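To illustrate why the codes seven and nine have to be dropped before averaging, here is a minimal Python sketch of the cleaning step. It is not the authors' R code; it assumes blank answers are represented as None, and the name mean_genhlth and the sample responses are hypothetical.

```python
from statistics import mean

VALID_RATINGS = {1, 2, 3, 4, 5}  # 1 = excellent ... 5 = poor


def mean_genhlth(values):
    """Mean of ratings on the 1-5 scale, skipping 7, 9, and blanks (None)."""
    kept = [v for v in values if v in VALID_RATINGS]
    return mean(kept)


# Made-up responses for illustration only, not BRFSS data.
responses = [1, 2, 3, 7, 9, None, 4]

cleaned = mean_genhlth(responses)  # averages only 1, 2, 3, 4
uncleaned = mean([v for v in responses if v is not None])  # 7 and 9 left in
print(cleaned, uncleaned)
```

With the codes left in, the average is pulled toward the "poor" end of the scale even though neither code describes anyone's health. Note that mean raises an error if no valid ratings remain, so an empty subset would need separate handling.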
Once we had the mean self-reported health of individuals by season, we
thought it would also be interesting to find the mean self-reported health by region
of the United States. To do this, we subsetted the data by state, covering all fifty
states and the District of Columbia. Using the rbind function, we combined the
states into their corresponding regions, giving four subsets called Northeast, West,
South, and Midwest. Again, we removed the observations with genhlth values of seven
and nine. Once limited to observations with genhlth values of 1-5, these regions had
79,296, 104,533, 67,782, and 121,441 observations respectively. Finally, we combined
the four regional subsets into one called “newyear”. We then took the mean genhlth
of each region and compared it to the mean genhlth of “newyear”. (figure 3)
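The text does not say which states were assigned to which region. As one possible reading, the sketch below uses the U.S. Census Bureau's four-region scheme, which is an assumption on our part, and keys states by postal abbreviation rather than by whatever state code the data set uses. The names REGIONS, REGION_BY_STATE, and region_of are hypothetical.

```python
# Four-region grouping of the 50 states plus DC. This follows the U.S.
# Census Bureau regions, which is an assumption: the text does not say
# which grouping was used.
REGIONS = {
    "Northeast": "CT ME MA NH RI VT NJ NY PA".split(),
    "Midwest": "IL IN MI OH WI IA KS MN MO NE ND SD".split(),
    "South": "DE DC FL GA MD NC SC VA WV AL KY MS TN AR LA OK TX".split(),
    "West": "AZ CO ID MT NV NM UT WY AK CA HI OR WA".split(),
}

# Invert the table into a state -> region lookup.
REGION_BY_STATE = {
    state: region for region, states in REGIONS.items() for state in states
}


def region_of(state):
    """Return the region for a postal abbreviation (hypothetical helper)."""
    return REGION_BY_STATE[state]


print(len(REGION_BY_STATE))  # number of states covered, DC included
print(region_of("MN"), region_of("DC"))
```

Under this scheme the District of Columbia falls in the South; a different regional definition would shift some states and could change the regional means.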
Results: After limiting the data to observations with genhlth values of 1-5, we
computed the mean self-reported health of individuals over the year of 2015. Our mean
self-reported health over the year was 2.564259. This means that across the 440,209
people with a valid rating, the average rating fell a little more than halfway from
very good (2) to good (3). We then calculated the genhlth means for each of the
seasonal subsets: 2.563962 for the spring, 2.570128 for the summer, 2.579947 for the
fall, and 2.544152 for the winter.
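To make the size of these seasonal differences concrete, the following Python sketch subtracts the yearly mean from each seasonal mean reported above. The numbers are taken directly from the text; only the variable names are ours.

```python
YEAR_MEAN = 2.564259
SEASON_MEANS = {
    "spring": 2.563962,
    "summer": 2.570128,
    "fall": 2.579947,
    "winter": 2.544152,
}

# A positive difference means a worse-than-average rating, because
# 1 is excellent and 5 is poor on the genhlth scale.
differences = {s: round(m - YEAR_MEAN, 6) for s, m in SEASON_MEANS.items()}
for season, diff in differences.items():
    print(f"{season:>6}: {diff:+.6f}")

largest_gap = max(abs(d) for d in differences.values())
print(f"largest gap: {largest_gap:.6f} points on the 1-5 scale")
```

The largest gap belongs to winter, at about 0.02 points below the yearly mean on a five-point scale, and winter is the only season that falls clearly on the "better health" side.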
To our surprise, the mean genhlth during the winter was lower than
the year's average. Because lower genhlth values indicate better health, this means
that people interviewed during December, January, and February on average reported
better health than people interviewed in the spring, summer, and fall. In addition,
as seen in the histograms (figure 2), the distribution of self-reported health
values showed very little variation from season to season. Since these results went
against our hypothesis and prior research, we decided to subset the data by another
category to see whether its means would differ more from the year's genhlth mean.
Instead of comparing the year's average self-reported health to the average
self-reported health during different seasons, we decided to compare it to the
average self-reported health of people in the four regions of the United States.
After subsetting the data into the Northeast, Midwest, South, and West, we found that
the average self-reported health in these regions was 2.475579, 2.538014, 2.578089,
and 2.497379 respectively.
When subsetting the data into regions, we found that the regional means
varied much more around the “newyear” mean than the seasonal means
varied around the “year” mean.
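One way to put a number on this comparison is to compute the standard deviation of the four seasonal means and of the four regional means. The Python sketch below does so with the values reported in the text. As a simplification, it measures spread around the unweighted average of each set of four means rather than around the “year” or “newyear” mean, and it is a descriptive comparison, not a significance test.

```python
from statistics import pstdev

# Means reported in the text, in the order given there.
season_means = [2.563962, 2.570128, 2.579947, 2.544152]
region_means = [2.475579, 2.538014, 2.578089, 2.497379]

# Population standard deviation of each set of four means.
season_spread = pstdev(season_means)
region_spread = pstdev(region_means)

print(f"season spread: {season_spread:.4f}")
print(f"region spread: {region_spread:.4f}")
print(f"ratio: {region_spread / season_spread:.1f}")
```

On these figures the regional means are spread roughly three times as widely as the seasonal means, which agrees with the comparison described above.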
Discussion/Conclusion: After taking the mean self-reported health for each
season and comparing it to the year's mean self-reported health, we found very little
variation between the means. We also found almost no variation between seasons in
the frequencies of the genhlth values. Finally, we found that people in the United
States actually rate their health as better during the winter than during the
spring, summer, and fall. This result contradicted our hypothesis and caused us to
consider what other factors might explain it.
We believe that we may have gotten a lower (better) mean genhlth during
the winter than over the whole year for a number of reasons. The main factor we
considered was that different regions of the United States are affected differently
by the seasons. After combining the data into the four regions of the United States,
we found much more variation between the regional means and the yearly mean. This
may be because the southern and western states do not experience winters as cold as
those in the Northeast and Midwest.
Another reason we may have found self-reported health to be better in the winter
than in the other seasons is the different environments that people live in
throughout the United States. People in rural areas come into contact with fewer
people, making it harder for them to contract contagious diseases. This is
especially helpful during the winter, when people tend to stay indoors surrounded by
friends and family more than in other seasons. That being said, rural living also
puts people farther away from resources such as doctors, hospitals, and pharmacies,
which may cause them to report worse health. People who live in urban areas, by
contrast, are far more likely to come into contact with others who have contagious
diseases, but they usually also have better access to medical institutions when they
are getting sick.
Socioeconomic status is also a large factor in self-reported health. While
people with higher socioeconomic status may have more access to medical resources,
they may also be prone to more stress. Their jobs are often more stressful, which can
cause them to report worse health than someone with a lower-paying job and little
stress. If we had subsetted the data by income group, we might have found more
variation in the means than we did by subsetting by season alone.
In our research we found that coronary heart disease varies by season.
Earlier in the semester, we also learned that coronary heart disease
affects white people more than people of other races. We believe that if we were to
subset the BRFSS data into seasons and split each season by race, we might find
larger variation in mean self-reported health between races within each season than
we found when comparing season and self-reported health alone.
While all of these factors could very well play into why our results
contradicted our hypothesis, we believe region was the largest reason the results
did not come out the way we expected. If we compared regional self-reported
health by season, we might find even larger variation in the means than we found for
self-reported health and season alone.
The results of our research can be helpful for several reasons. Because
we know that self-reported health varies between regions of the United States,
pharmaceutical companies can use this information to decide where and to which
demographic they should be marketing. Since we tested self-reported health
against region, we were able to find that people living in the South and Midwest on
average reported worse health than people living in the West and
Northeast. Knowing this, pharmaceutical companies may find that they have a larger
market for antidepressants in the Midwest and South than in the West and
Northeast.
Although we did not find much variation in self-reported health between
seasons, we did find variation between regions. If we had more time, we
could split the regional self-reported health by season into even smaller
demographic groups by socioeconomic status, race, and living area. From this we could
learn how money, race, and living area generally affect people living in different
regions of the United States during the different seasons of the year. This would
allow people in the health industry to better understand who is affected by the
different seasons, and where.
References:
Abrignani, M. G., S. Corrao, G. B. Biondo, R. M. Lombardo, P. Di Girolamo, A. Braschi,
A. Di Girolamo, and S. Novo. “Effects of Ambient Temperature, Humidity, and
Other Meteorological Variables on Hospital Admissions for Angina Pectoris.”
European Journal of Preventive Cardiology 19.3 (2011): 342-48. Web. 9 Nov.
2016.
Daukantas, Patricia. “Shining Light on Seasonal Depression.” Optics and
Photonics News 19.3 (2008): 38-43.
Hung, Ming-Jui, Kuang-Hung Hsu, Nen-Chung Chang, and Ming-Yow Hung.
“Increased Numbers of Coronary Events in Winter and Spring Due to
Coronary Artery Spasm.” Journal of the American College of Cardiology 65.18
(2015): 2047-048. Web. 9 Nov. 2016.
Marrero, Osvaldo. “Seasonal Variation in Epidemiology.” The College Mathematics
Journal 44.5 (2013): 386-98. JSTOR. Web. 10 Nov. 2016.
Persson, Roger, Kai Österberg, Anne H. Garde, Åse M. Hansen, Palle Ørbæk, and
Björn Karlson. “Seasonal Variation in Self-reported Arousal and Subjective
Health Complaints.” Psychology, Health & Medicine 15.4 (2010): 434-44. Taylor
and Francis Online. Web. 10 Nov. 2016.
Tables and Figures:
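Figure 1: Mean self-reported health (genhlth) by season, compared with the yearly mean
Figure 2: Histograms of genhlth values by season
Figure 3: Mean genhlth by region, compared with the “newyear” mean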