Biostatistics 621: Statistical Methods I
Fall Semester 2007
Course Information
Instructor: T. Mark Beasley, PhD
Associate Professor of Biostatistics
Office: Ryals Room 309E
Phone: (205) 975-4957
Email: [email protected]
When: Tuesday/Thursday 11:00 – 12:15 PM
Where: Ryals Room 107
Office Hours: by appointment
Website: http://www.soph.uab.edu/Statgenetics/People/MBeasley/Courses/BST621.htm
Textbooks
Available at HUC Bookstore
Text: Biostatistics: A Foundation for Analysis in the Health Sciences by Wayne W. Daniel, published by John Wiley & Sons. ISBN 0-471-45654-3
Prerequisites
This is the first course in the basic applied statistical methods sequence for first-year graduate students in Biostatistics.
It may also be taken by other graduate students with a background in calculus and linear (matrix) algebra, and by those who will take more BST courses.
Evaluation
All material submitted for grading must be typed. No output will be accepted unless specifically requested.
Grading:
– Homework: 40%
– Midterm: 30%
– Final: 30%
Five points will be deducted each day for late homework, unless there are extenuating circumstances.
Objectives
BST 621 is an intermediate-level course in basic analysis methods. It introduces students to the elementary concepts, statistical models, and applications of:
• probability
• commonly used sampling distributions
• parametric and nonparametric one- and two-sample tests
• confidence intervals
• correlation and regression
• analysis of variance (ANOVA)
Introduction
What are statistics?
What is the practice of biostatistics?
Statistics are just numbers.
The practice of statistics involves measuring the variability of numbers to interpret results.
What can you do with statistics?
• Analyze data after an experiment has been carried out
• Make suggestions for how experiments can be designed
• Goals:
– Describe a population
– Estimate variation
– Predict
Types of statistics
• Theoretical Statistics – formulas and symbols; Derivation of Statistics; Mathematical Proof
• Applied Statistics
Making sense out of data!
Useful Definitions
Data: A collection of facts, not necessarily numeric, such as:
Age, gender, hair color, weight, temperature
Measurement Scales
• Measurement is defined as the assignment of numbers to objects or events according to a defined set of rules.
• Measurement scales: various sets of rules by which numbers are assigned
Types of scales
• Nominal: Naming observations or classifying them into groups where one group is not “better” or “higher” than another.
• Ordinal: Groups or classes that can be ranked according to some criterion – there is an order.
Scales (cont)
• Interval: Ordered measurements with a defined, measurable difference between values.
• Ratio: A scale with a true zero, so that equal ratios and intervals can be defined.
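The difference between interval and ratio scales can be seen with temperature. The sketch below is only an illustration: the two temperatures and the helper name `celsius_to_kelvin` are assumptions, not part of the course material. Celsius behaves as an interval scale and Kelvin as a ratio scale.

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
# Kelvin is a ratio scale: it has a true zero, so ratios are meaningful.

def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

a_c, b_c = 10.0, 20.0  # two hypothetical temperatures in Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

diff_c = b_c - a_c     # difference on the interval scale
diff_k = b_k - a_k     # the same difference on the ratio scale
ratio_c = b_c / a_c    # 2.0, but "twice as hot" is not a valid claim
ratio_k = b_k / a_k    # the physically meaningful ratio, close to 1

print(diff_c, round(diff_k, 6), ratio_c, round(ratio_k, 3))
```

The differences agree on both scales, while the ratios do not, which is why ratio statements require a true zero.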
Population: A well defined collection of objects, such as: students (at UAB, in engineering), paint colors (from 1 company, from multiple companies), etc.
Census vs. Sample
If you collect information on all of the objects in a population, that is a census.
If you collect information on some of the population, that is a sample.
Types of Sampling
simple random sampling – the simplest sampling procedure: select a subset of n objects from the population such that each object has an equal chance of being selected
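A minimal sketch of simple random sampling, assuming a hypothetical population of 1,000 numbered objects and a sample size of 50 (neither figure comes from the course). It uses `random.sample` from the Python standard library, which draws without replacement so that every object has the same chance of selection.

```python
import random

# Hypothetical population: ID numbers for 1,000 objects.
population = list(range(1, 1001))
n = 50

rng = random.Random(621)            # fixed seed only so the sketch is repeatable
sample = rng.sample(population, n)  # draw n distinct objects, each equally likely

print(len(sample), len(set(sample)))
```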
Sampling (cont)
stratified sampling – sampling a subset of n from each gender, each age group, or each school class
convenience sampling – when it isn’t possible to get a simple random sample, you sample what you have available to you
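The two schemes above might be sketched as follows. The population records, the three class labels, and the per-stratum sample size are all made up for illustration. The stratified draw takes a simple random sample within each class, while the convenience sample just takes whatever records come first.

```python
import random
from collections import defaultdict

rng = random.Random(621)  # fixed seed only so the sketch is repeatable

# Hypothetical population records: (id, school class).
classes = ["first-year", "second-year", "third-year"]
population = [(i, classes[i % 3]) for i in range(300)]

# Group the population into strata by school class.
strata = defaultdict(list)
for person_id, school_class in population:
    strata[school_class].append(person_id)

# Stratified sample: a simple random sample of size n within each stratum.
n = 10
stratified = {name: rng.sample(ids, n) for name, ids in strata.items()}

# Convenience sample: the first n records at hand, with no randomization.
convenience = [person_id for person_id, _ in population[:n]]

for name, ids in stratified.items():
    print(name, sorted(ids))
print("convenience", convenience)
```

The stratified sample guarantees that every class is represented, whereas the convenience sample carries no such guarantee and may not reflect the population.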
Variable: A measurement on an object that can change from one object to another. Usually denoted with lower case letters: x, y, z
Statistics
Descriptive Statistics: summary statistics, such as N, µ, σ², σ. Often depicted using plots, such as: histograms, box and scatter plots.
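As a small sketch of these summaries, the code below computes N, the mean, the variance, and the standard deviation with Python's standard `statistics` module. The eight data values are invented for illustration. Both the population formulas (dividing by N, matching µ, σ², σ) and the sample formulas (dividing by N − 1) are shown, since which one applies depends on whether the data are a census or a sample.

```python
import statistics

# Hypothetical measurements (illustrative values only).
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(x)                          # N: number of observations
mean = statistics.mean(x)           # arithmetic mean
pop_var = statistics.pvariance(x)   # population variance (divide by N)
pop_sd = statistics.pstdev(x)       # population standard deviation
samp_var = statistics.variance(x)   # sample variance (divide by N - 1)
samp_sd = statistics.stdev(x)       # sample standard deviation

print(n, mean, pop_var, pop_sd, round(samp_var, 3), round(samp_sd, 3))
```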
Inferential Statistics: using data to make generalizations to a population. Inference is a conclusion that patterns in the data are present in the population.
• Parameters: unknown coefficients (variables) in the model, such as the mean or standard deviation. Unless you have a census (all subjects in a population), these are never truly known – only estimated.
• Statistical Significance: A precise statistical term that does not equate to practical significance. It usually means that the data provide evidence that the estimated parameter is not the null value (assumed value).
• Model: an equation that predicts the response as a function of other variables.
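One simple instance of such a model is a straight line, y = b0 + b1·x, fitted by ordinary least squares. The sketch below assumes made-up data and hypothetical names (`b0`, `b1`, `predict`); the slope and intercept are the estimated parameters of the model.

```python
# Hypothetical data: predictor x and response y (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates for the line y = b0 + b1 * x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx          # estimated slope
b0 = y_bar - b1 * x_bar   # estimated intercept

def predict(new_x: float) -> float:
    """Predicted response from the fitted line."""
    return b0 + b1 * new_x

print(round(b0, 3), round(b1, 3), round(predict(6.0), 3))
```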
What is a Hypothesis?
The question you are trying to answer and the alternative (or opposite) of that question.
In statistics, the null hypothesis is usually the current standard or what you are trying to disprove. The alternative is what you are trying to show by statistically “rejecting” the null.
We will NEVER prove a hypothesis!
Case Study: CPR by Phone
In an urban setting, ~ 6% of out-of-hospital cardiac arrests survive to hospital discharge. Survival can increase if a bystander witnesses the arrest and administers cardiopulmonary resuscitation – but this happens < 50% of the time.
From the literature, when CPR is administered by a non-EMT, survival probability can be at least 9%. In the Seattle area¹, emergency response personnel instructed bystanders in CPR over the phone. They found that 29 of 278 CPR patients survived to discharge from the hospital (> 10%).
Question?
Does dispatcher-instructed bystander-administered CPR
improve the chances of survival?
Answering the Question with Data
1. Begin by writing down what you understand
2. Outline the data and form clear and succinct questions pertaining to what the data may imply (or what you would like to show)
3. Form a scientific question to determine if the results are random
4. Compare the data from each side of the question and decide what to believe
Statisticize the Steps
Phase 1: State the Question
Phase 2: Decide How to Answer the Question
Phase 3: Answer the Question
Phase 4: Communicate the Answer to the Question
Phase 1: State the Question
1. Evaluate and describe the data
2. Review the assumptions
3. State the question—in the form of hypotheses
Phase 2: Decide How to Answer the Question
4. Decide on a summary number—a statistic—that reflects the question
5. How could random variation affect that statistic?
6. State a decision rule, using the statistic, to answer the question
Phase 3: Answer the Question
7. Calculate the statistic
8. Make a statistical decision
9. State the substantive conclusion
Phase 4: Communicate the Answer to the Question
10. Document your understanding with text, tables, or figures
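To make the steps concrete, here is a hedged sketch that applies them to the CPR case study (29 survivors among 278 patients). Several choices are assumptions made for illustration and are not stated in the slides: the null value of 6% (taken from the baseline survival figure), the 0.05 significance level, and the use of a one-sided exact binomial test as the decision rule. The helper name `binomial_upper_tail` is also hypothetical.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """P(X >= k) when X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Data from the case study.
survivors, patients = 29, 278

# Assumptions for this sketch (not stated in the slides):
p0 = 0.06     # null survival probability, taken from the ~6% baseline
alpha = 0.05  # significance level for the decision rule

# Step 3: H0: p = p0 versus H1: p > p0.
# Steps 4 and 7: the statistic is the observed survival proportion.
p_hat = survivors / patients

# Step 5: under H0 the survivor count varies like a Binomial(n, p0) variable.
# Steps 6 and 8: decision rule "reject H0 if the p-value is below alpha".
p_value = binomial_upper_tail(survivors, patients, p0)
reject_null = p_value < alpha

# Step 9: state the substantive conclusion.
print(f"observed proportion = {p_hat:.3f}")
print(f"one-sided p-value   = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

The conclusion depends on the null value chosen: calling `binomial_upper_tail(29, 278, 0.09)`, with the 9% figure from the literature as the null, gives a much larger p-value, so the choice of comparison should be settled in Phase 1 before any calculation.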
How do we do this?
• We need to understand some basic principles about numbers, counting, and distributions
• We need to learn the best ways to display data and results in text, tables, and figures