research methods knowledge base_book_sampling

29
Sampling Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. Let's begin by covering some of the key terms in sampling like "population" and "sampling frame." Then, because some types of sampling rely upon quantitative models, we'll talk about some of the statistical terms used in sampling . Finally, we'll discuss the major distinction between probability and Nonprobability sampling methods and work through the major types in each. External Validity External validity is related to generalizing. That's the major thing you need to keep in mind. Recall that validity refers to the approximate truth of propositions, inferences, or conclusions. So, external validity refers to the approximate truth of conclusions the involve generalizations . Put in more pedestrian terms, external validity is the degree to which the conclusions in your study would hold for other persons in other places and at other times. In science there are two major approaches to how we provide evidence for a generalization. I'll call the first approach the Sampling Model. In the sampling model, you start by identifying the population you would like to generalize to. Then, you draw a fair sample from that population and conduct your research with the sample. Finally, because the sample is representative of the population, you can automatically generalize your results back to the population. There are several problems with this approach. First, perhaps you don't know at the time of your study who you might ultimately like to generalize to. Second, you may not be easily able to draw a fair or representative sample. Page 1 of 29

Upload: xshan-nnx

Post on 16-Aug-2015

10 views

Category:

Documents


0 download

DESCRIPTION

Research Methods Knowledge Base_Book_SAMPLING

TRANSCRIPT

SamplingSampling is the process of selecting units (e.g., people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. Let's begin by covering some of the key terms in sampling like population and sampling frame. !hen, because some types of sampling rely upon "uantitative models, we'll talk about some of the statistical terms usedin sampling. #inally, we'll discuss the ma$or distinction between probability and %onprobability sampling methods and work through the ma$or types in each. External Validity&'ternal validity is related to generalizing. !hat's the ma$or thing you need to keep in mind. (ecall that validity refers to the appro'imate truth of propositions, inferences, or conclusions. So, external validity refers to the appro'imate truth of conclusions the involve generalizations. )ut in more pedestrian terms, e'ternal validity is the degree to which the conclusions in your study would hold for other persons in other places and at other times.*n science there are two ma$or approaches to how we provide evidence for a generalization. *'ll call the first approach the Sampling Model. *n the sampling model, you start by identifying the population you would like to generalize to. !hen, you draw a fair sample from that population and conduct your research with the sample. #inally, because the sample is representative of the population, you can automatically generalize your results back to the population. !here are several problems with this approach. #irst, perhaps you don't know at the time of your study who you might ultimately like to generalize to. Second, you may not be easily able to draw a fair or representative sample. !hird, it's impossible to sample across all times that you might like to generalize to (like ne't year).)age + of +,*'ll call the second approach togeneralizing the Proximal Similarity Model. ')ro'imal' means 'nearby' and 'similarity'means... well, it means 'similarity'. !he term proximalsimilarity was suggested by -onald !. .ampbell as an appropriate relabeling of the term external validity (although he was the first to admit that it probably wouldn't catch on/). 0nder this model, we begin by thinking about different generalizability conte'ts and developing a theory about which conte'ts are morelike our study and which are less so. #or instance, we might imagine several settings that have people who are more similar to the people in our study or people who are less similar. !his also holds for times and places. 1hen we place different conte'ts in terms of their relative similarities, we can call this implicit theoretical a gradient of similarity. 2nce we have developed this pro'imal similarity framework, we are able to generalize. 3ow4 1e conclude that we can generalize the results of our study to other persons, placesor times that are more like (that is, more pro'imally similar) to our study. %otice that here, we can never generalize with certainty 55 it is always a "uestion of more or less similar. Threats to External Validity6 threat to e'ternal validity is an e'planation of how you might be wrong in making a generalization. #or instance, you conclude that the results of your study (which was done in a specific place, with certain types of people, and at a specific time) can be generalizedto another conte't (for instance, another place, with slightly different people, at a slightly later time). !here are three ma$or threats to e'ternal validity because there are three ways you could be wrong 55 people, places or times. 7our critics could come along, for e'ample, and argue that the results of your study are due to the unusual type of people who were in the study. 2r, they could argue that it might only work because of the unusual place you did the study in (perhaps you did your educational study in a college town with lots of high5achieving educationally5oriented kids). 2r, they might suggest thatyou did your study in a peculiar time. #or instance, if you did your smoking cessation study the week after the Surgeon 8eneral issues the well5publicized results of the latest smoking and cancer studies, you might get different results than if you had done it the week before.Improving External Validity3ow can we improve e'ternal validity4 2ne way, based on the sampling model, suggests that you do a good $ob of drawing a sample from a population. #or instance, you should use random selection, if possible, rather than a nonrandom procedure. 6nd, once selected,)age 9 of +,you should try to assure that the respondents participate in your study and that you keep your dropout rates low. 6 second approach would be to use the theory of pro'imal similarity more effectively. 3ow4 )erhaps you could do a better $ob of describing the ways your conte'ts and others differ, providing lots of data about the degree of similarity between various groups of people, places, and even times. 7ou might even be able to mapout the degree of pro'imal similarity among various conte'ts with a methodology like concept mapping. )erhaps the best approach to criticisms of generalizations is simply to show them that they're wrong 55 do your study in a variety of places, with different peopleand at different times. !hat is, your e'ternal validity (ability to generalize) will be stronger the more you replicate your study. Sampling Terminology6s with anything else in life you have to learn the language of an area if you're going to ever hope to use it. 3ere, * want to introduce several different terms for the ma$or groups that are involved in a sampling process and the role that each group plays in the logic of sampling.!he ma$or "uestion that motivates sampling in the first place is: 1ho do you want to generalize to4 2r should it be: !o whom do you want to generalize4 *n most social research we are interested in more than $ust the people who directly participate in our study. 1e would like to be able to talk in general terms and not be confined only to the people who are in our study. %ow, there are times when we aren't very concerned about generalizing. ;aybe we're $ust evaluating a program in a local agency and we don't care whether the program would work with other people in other places and at other times. *n that case, sampling and generalizing might not be of interest. *n other cases, we would really like to be able to generalize almost universally. 1hen psychologists do research, they are often interested in developing theories that would hold for all humans. to =.D>>G and that we can say with ,,E confidence that the population value is between =.BA? and =.D9?. !he real value (in this fictitious e'ample) was =.A9 and so we have correctly estimated that value with our sample. )age , of +,Probability Sampling6 probability sampling method is any method of sampling that utilizes some form of random selection. *n order to have a random selection method, you must set up some process or procedure that assures that the different units in your population have e"ual probabilities of being chosen. 3umans have long practiced various forms of random selection, such as picking a name out of a hat, or choosing the short straw. !hese days, wetend to use computers as the mechanism for generating random numbers as the basis for random selection.Some Definitions>>). %umber the list of names from + to +>>> and then use the ball machine to select the three digits that selects each person. !he obvious disadvantage here is that you need to get the ball machines. (1here do they make those things, anyway4 *s there a ball machine industry4).%either of these mechanical procedures is very feasible and, with the development of ine'pensive computers there is a much easier way. 3ere's a simple procedure that's especially useful if you have the names of the clients already on the computer. ;any computer programs can generate a series of random numbers. Let's assume you can copy and paste the list of client names into a column in an &I.&L spreadsheet. !hen, in the column right ne't to it paste the function F(6%-() which is &I.&L's way of putting a random number between > and + in the cells. !hen, sort both columns 55 the list of names and the random number 55 by the random numbers. !his rearranges the list in random order from the lowest to the highest random number. !hen, all you have to do is take the first hundred names in this sorted list. pretty simple. 7ou could probably accomplish the whole thing in under a minute.Simple random sampling is simple to accomplish and is easy to e'plain to others. E and ?E respectively). *f we $ust did a simple random sample of nF+>> with a sampling fraction of +>E, we would e'pect by chance alone that we would only get +> and ? persons from each of our two smaller groups. 6nd, by chance, we could get fewer than that/ *f we stratify, we can do better. #irst, let's determinehow many people we want to have in each group. Let's say we still want to take a sample of +>> from the population of +>>> clients over the past year. > people in it and that you want to take a sample of nF9>. !o use systematic sampling, the population must be listed in a random order. !he sampling fraction would be f F 9>H+>> F 9>E. in this case, the interval size, k, is e"ual to %Hn F +>>H9> F ?. %ow, select a random integer from + to ?. *n our e'ample, imagine that you chose @. %ow, to select the sample, start with the @th unit in the list and take every k5th unit (every ?th, because kF?). 7ou would be sampling units @, ,, +@, +,, and so on to +>> and you would wind up with 9> units in your sample.#or this to work, it is essential that the units in the population are randomly ordered, at least with respect to the characteristics you are measuring. 1hy would you ever want to use systematic random sampling4 #or one thing, it is fairly easy to do. 7ou only have to select a single random number to start things off. *t may also be more precise than simple random sampling. #inally, in some situations there is simply no easier way to do random sampling. #or instance, * once had to do a study that involved sampling from all the books in a library. 2nce selected, * would have to go to the shelf, locate the book, and record when it last circulated. * knew that * had a fairly good sampling frame in the form of the shelf list (which is a card catalog where the entries are arranged in the order they occur on the shelf). !o do a simple random sample, * could have estimated the total number of books and generated random numbers to draw the sampleG but how would * )age += of +,find book KA@,=9, easily if that is the number * selected4 * couldn't very well count the cards until * came to A@,=9,/ Stratifying wouldn't solve that problem either. #or instance, * could have stratified by card catalog drawer and drawn a simple random sample within each drawer. >,>>>. * decided that * wanted to take a sample of +>>> for a sampling fraction of +>>>H+>>,>>> F +E. !o get the sampling interval k, * divided %Hn F +>>,>>>H+>>> F +>>. !hen * selected a random integer between + and +>>. Let's say * got ?A. %e't * did a little side study to determine how thick a thousand cards are in the card catalog (taking into account the varying ages of the cards). Let's say that on average * found that two cards that were separated by +>> cards were about .A? inches apart in the catalog drawer. !hat information gave me everything * needed to draw the sample. * counted to the ?Ath by hand and recorded the book information. !hen, * took a compass. ((emember those from your high5school math class4 !hey're the funny little metal instruments with a sharp pin on one end and a pencil on the other that you used to draw circles in geometry class.) !hen * set the compass at .A?, stuck the pin end in at the ?Ath card and pointed with the pencil end to the ne't card (appro'imately +>> books away). *n this way, * appro'imated selecting the +?Ath, 9?Ath, =?Ath, and so on. * was able to accomplish the entire selection procedure in very little time using this systematic random sampling approach. *'d probably still be there counting cards if *'d tried another random sampling method. (2kay,so * have no life. * got compensated nicely, * don't mind saying, for coming up with this scheme.)&luster )*rea+ $andom Sampling!he problem with random sampling methods when we have to sample a population that's disbursed across a wide geographic region is that you will have to cover a lot of ground geographically in order to get to each of the units you sampled. *magine taking a simple random sample of all the residents of %ew 7ork State in order to conduct personal interviews. >, you will continue sampling until you get those percentages and then you will stop. So, if you've already got the @> women for your sample, but not the si'ty men, you will continue to sample men but even if legitimate women respondents come along, you will not sample them because you have already met your "uota. !he problem here (as in much purposive sampling) is that you have to decide the specific characteristics on which you will base the "uota. 1ill it be by gender, age, education race, religion, etc.4%onproportional /uota sampling is a bit less restrictive. *n this method, you specify the minimum number of sampled units you want in each category. here, you're not concerned with having numbers that match the proportions in the population. *nstead, you simply want to have enough to assure that you will be able to talk about even small groups in the population. !his method is the nonprobabilistic analogue of stratified random sampling in that it is typically used to assure that smaller groups are ade"uately represented in your sample. 3eterogeneity Sampling 1e sample for heterogeneity when we want to include all opinions or views, and we aren't concerned about representing these views proportionately. 6nother term for this is sampling for diversity. *n many brainstorming or nominal group processes (including concept mapping), )age +D of +,we would use some form of heterogeneity sampling because our primary interest is in getting broad spectrum of ideas, not identifying the averageor modal instance ones. *n effect, what we would like to be sampling is not people, but ideas. 1e imagine that there is a universe of all possible ideas relevant to some topic and that we want to sample this population, not the population of people who have the ideas. .learly, in order to get allof the ideas, and especially the outlier or unusual ones, we have to include a broad and diverse range of participants. 3eterogeneity sampling is, in this sense, almost the opposite of modal instance sampling. Snowball Sampling *n snowball sampling, you begin by identifying someone who meets the criteria for inclusion in your study. 7ou then ask them to recommend others who they may know who also meet the criteria. 6lthough this method would hardly lead to representative samples, there are times when it may be the best method available. Snowball sampling is especially useful when you are trying to reach populations that are inaccessible or hard to find. #or instance, if you are studying the homeless, you are not likely to be able to find good lists of homeless people within a specific geographical area. 3owever, if you go to that area and identify one or two,you may find that they know very well who the other homeless people in their vicinity are and how you can find them.)age +, of +,