![Page 1: SADC Course in Statistics Statistical concepts Module B2, Session3](https://reader035.vdocuments.site/reader035/viewer/2022081505/5515fb41550346d46f8b58ac/html5/thumbnails/1.jpg)
SADC Course in Statistics
Statistical concepts
Module B2, Session 3
Objectives
At the end of this session students will be able to:
• Define statistics
• Enter simple datasets once the data entry form is set up
• Recognise the type of each variable in a dataset
• Know some ways to summarise data of each main type
• Explain how statistical investigations deal with variability
• Differentiate between descriptive and inferential statistics
Activities
1. This introduction
2. Entry of the data from the CAST survey
3. Discussion/presentation on statistical concepts
   1. Using the data entered
   2. And other case studies
4. The statistical glossary
   1. For when you need to remind yourself about terminology
What is statistics - 1?
From RSS webpage:
1. Statistics changes numbers into information.
2. Statistics is the art and science of deciding:
  • what are the appropriate data to collect,
  • deciding how to collect them efficiently
  • and then using them to give information,
  • answer questions,
  • draw inferences
  • and make decisions.
What is statistics - 2?
3. Statistics is making decisions when there is uncertainty.
  • We have to make decisions all the time,
  • in everyday life,
  • and as part of our jobs.
  • Statistics helps us make better decisions.
4. Statistics is NOT just collecting a lot of numbers
  • It is collecting numbers for a purpose
What is statistics - 3?
From Wikipedia:
5. Statistics is a mathematical science pertaining to the
  • collection,
  • analysis,
  • interpretation or explanation
  • and presentation
  of data.
6. Statistics are used for making informed decisions
  • and misused for other reasons
  in all areas of business and government
What is statistics - 4?
From the book “Statistics: A guide to the unknown”:
7. Statistics is the science of learning from data.
Question 1 in the practical sheet
• From these 7 definitions – in the practical sheet
  • either choose the one you think is most appropriate
  • or make your own
a) A one-line definition
b) A longer definition
Data checking and entry – Question 2
• What can we learn from the data you collected?
• Work in pairs or small groups
• First check the data from the CAST survey
• Check each other's, not your own
  • Is it legible?
  • Can it be entered into the computer?
  • Is the response to the open-ended question clear?
  • Can the text be simplified?
  • If there are many points, ask the respondent to state which are the most important 2 or 3.
• Brief notes (as a report) to be made in the exercise sheet
  • to establish the data are ready for entry
Data entry into Excel
Just type the number. The label is automatic.
Data entry and checking – Question 3
• The data are now entered
• This can be a class exercise
  • on a single computer
• Data are entered by someone else
  • for each respondent (never by themselves)
• Then they must be checked
  • read them out
  • check by reading back
• Put the record number from the Excel form
  • on your original sheet
  • or add your names as another field in the Excel sheet
• Why might it be better to just have a number?
Data entry and checking
• You should now have completed question 3
  • On the practical sheet
• How long do you estimate it would take
  • For 1000 records to be entered?
Once the data are entered
• Remember: “Statistics is the science of learning from data.”
• To learn as much as possible
  • we must have confidence in the data
  • so they must be entered and checked well
• This is what we have done in the groups
• Now the data are ready for the analysis
• Before that, look at some other data sets
  • Look for the common points
  • That apply to all the sets
  • and look for differences
Types of data - 1
• The analysis depends on the type of data
• What are the types here?
• For questions 1 to 6
  • Your answer was one of 5 categories
  • e.g. 1: Strongly agree, 2: Agree, … 5: Strongly disagree
  • These categories have an ordering
  • from strongly agree to strongly disagree
• Data of this type are called
  • categorical
  • or factor
  • or qualitative
• With the ordering, they are sometimes called
  • ordered categorical data
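As a minimal sketch of how ordered categorical codes might be tabulated, the Python below maps each numeric code to its label and counts the answers in category order. Codes 1, 2 and 5 follow the slide; the labels used for codes 3 and 4 are assumptions, and the eight answers are invented for illustration.

```python
from collections import Counter

# Codes 1, 2 and 5 follow the slide; the labels for 3 and 4 are ASSUMED.
LABELS = {
    1: "Strongly agree",
    2: "Agree",
    3: "Neutral",            # assumed label
    4: "Disagree",           # assumed label
    5: "Strongly disagree",
}


def tabulate(codes):
    """Count responses per category, listed in category order (not by frequency)."""
    counts = Counter(codes)
    return [(LABELS[c], counts.get(c, 0)) for c in sorted(LABELS)]


# Hypothetical answers to one question from eight students.
answers = [1, 2, 2, 3, 2, 5, 1, 2]

for label, n in tabulate(answers):
    print(f"{label:18s} {n}")
```

Listing the categories in their natural order, including those with a count of zero, keeps the ordering visible in the table.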
Types of data - 2
• The last question in the survey
  • was a sentence or two that was written
• This is also an example of qualitative data
• It is an open-ended response
• These data can be reported – and reporting the sentences can be very useful
• So it is good if they are entered as they stand
• To summarise, perhaps the responses can be coded?
Coding open-ended questions – Question 4
• This is question 4 in the practical sheet
• Looking at the responses in your groups
  • Could you code them?
  • What different codes would you have?
  • How would you enter the codes?
• Might you lose anything by coding?
• For a quick analysis
  • Could you enter the complete texts
  • And analyse the other columns
  • And then code later?
• What might you lose by coding?
Coding and entering open-ended data
• Discuss the suggestions for the codes.
• If some points are made by many students then prepare a summary,
  • how many as a frequency
  • and as a percentage
• With the small number of responses
  • there is no need to enter them into the computer
• But discuss how it could be done
• It is an example of a multiple response question
  • because respondents may give no points
  • or more than one point
• If you ask for the most important observation
  • then it becomes a single qualitative response
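One possible way to enter and summarise a coded multiple response question is sketched below in Python. Each respondent has a list of codes, which may be empty or hold several codes. The code names and the five responses are hypothetical, invented only to show the frequency and percentage summary.

```python
from collections import Counter

# Hypothetical coded responses: one list of codes per respondent.
# The code names are invented for illustration.
responses = [
    ["useful", "slow_internet"],
    ["useful"],
    [],                                  # this respondent gave no points
    ["slow_internet", "more_examples"],
    ["useful", "more_examples"],
]


def multiple_response_table(responses):
    """Frequency and percentage of respondents who mentioned each code."""
    n = len(responses)
    # set(r) makes sure a respondent is counted once per code.
    counts = Counter(code for r in responses for code in set(r))
    return {code: (k, 100 * k / n) for code, k in counts.items()}


table = multiple_response_table(responses)
for code, (k, pct) in sorted(table.items()):
    print(f"{code:15s} {k} of {len(responses)} ({pct:.0f}%)")
```

Because the percentages are of respondents, not of points made, they can add up to more than 100%.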
Other data sets
• Zambia rainfall data
• Tanzania agriculture survey
• Look for the layout of the data
  • is it the same as for the simple CAST survey?
• Look for the types of data
• Which are the qualitative variables?
  • are they ordered?
• Which are the quantitative variables?
  • which of them are discrete?
  • and which are continuous?
  • have any been coded to become qualitative?
Annual climatic data from Zambia
Survey data from Tanzania - 1
Survey data from Tanzania - 2
Discussion – Question 5
• The layout of the data
  • Was always the same!
  • In a rectangle
• Each row is a record
  • There are as many records (rows of data)
  • as there were respondents, or students, or units
• Each column is a variable
  • Variables can be qualitative
  • or they can be quantitative
• Discuss which type they are
• For each data set
  • complete the tables in the practical sheet, question 5
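The rectangular layout can be sketched in Python as a list of records, each holding one value per variable. Only the column names Q0123, Q3462 and HHsize come from the slides; the three records and the simple type rule are assumptions made for illustration.

```python
# A hypothetical three-record extract. The values are invented; only the
# column names Q0123, Q3462 and HHsize are taken from the slides.
records = [
    {"Q0123": "Male",   "Q3462": 2, "HHsize": 5},
    {"Q0123": "Female", "Q3462": 0, "HHsize": 3},
    {"Q0123": "Male",   "Q3462": 1, "HHsize": 7},
]


def column(records, name):
    """Pull out one variable (column) across all records (rows)."""
    return [row[name] for row in records]


def variable_type(values):
    """Crude rule: numbers are quantitative, anything else is qualitative."""
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "qualitative"


print(len(records), "records")
for name in records[0]:
    print(name, "is", variable_type(column(records, name)))
```

The rule is deliberately crude: a categorical variable stored as codes 1 to 5, as in the CAST survey, would be reported as quantitative, so the type still has to be decided by someone who knows what the column means.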
Qualitative variables
• They are categorical
• They may be nominal (which implies there is no ordering)
  • Give some examples from the Tanzania survey
• They may be ordered – as in the CAST survey
  • Give an ordered example from the Tanzania survey
Examples of analysis – Tanzania survey – Question 6
• There are 3223 records,
  • but just take the 18 you can see in the figure
• Count the values for Q0123 – head of household
  • There were 6 Females and 12 Males
  • So 2/3 of the 18 households had a male head
  • That’s about 67%
  • but percentages are a bit misleading with so few numbers
• Now you give a similar summary for Q021
  • type of agricultural household
• And also Q3464
  • how often did the household have food problems
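The count and percentage summary for Q0123 can be checked with a short Python sketch. The list below is rebuilt from the counts quoted on the slide (6 Females, 12 Males); the order of the 18 values in the real figure is not known, and the function name is only illustrative.

```python
from collections import Counter

# Rebuilt from the counts on the slide (6 Females, 12 Males);
# the order of the values in the real figure is not known.
head_of_household = ["Female"] * 6 + ["Male"] * 12


def frequency_table(values):
    """Count and percentage (to 1 decimal place) for each category."""
    n = len(values)
    return {cat: (k, round(100 * k / n, 1)) for cat, k in Counter(values).items()}


for cat, (k, pct) in frequency_table(head_of_household).items():
    print(f"{cat:7s} {k:2d}  {pct}%")
```

The same function could be applied to the Q021 and Q3464 columns once their 18 values are typed in.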
Add a simple chart
• A simple chart can also be sketched
• Here is one by Excel
• But a sketch can be “by hand”
  • Excel will be used for these tasks from Session 4
Examples of analysis – CAST survey – Question 7
• Do a similar analysis of the CAST survey
• To make it quick
  • each group could initially process just one question
  • then report the results to the class
• Include a hand drawn chart
  • Sketch a simple bar chart
  • and include the numbers on the chart
  • as shown earlier
Quantitative variables – Question 8
• They may be discrete (whole numbers)
  • Give examples from the climatic data
  • And the Tanzania survey
• They may be (conceptually) continuous
  • Give examples from the data sets
• Also they may be coded into (ordered) categories
  • Give an example from the Tanzania survey
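Coding a quantitative variable into ordered categories might look like the Python sketch below, which groups household size into three bands. The cut-points and labels are hypothetical, chosen only for illustration; they are not taken from the Tanzania survey.

```python
import bisect

# Hypothetical cut-points and labels, chosen only for illustration;
# they are NOT taken from the Tanzania survey.
UPPER_BOUNDS = [3, 6]    # bands: 1-3, 4-6, 7 or more
LABELS = ["small (1-3)", "medium (4-6)", "large (7+)"]


def code_household_size(size):
    """Turn a household size (a count) into an ordered category."""
    return LABELS[bisect.bisect_left(UPPER_BOUNDS, size)]


for size in (1, 3, 4, 6, 7, 12):
    print(size, "->", code_household_size(size))
```

Coding loses detail: after coding, households of 7 and 12 people can no longer be told apart.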
Examples of analysis – Tanzania survey
• An analysis of the 18 values in Q3462
  – The number of times meat was eaten last week
  • minimum = 0
  • maximum = 5
  • adding the values: total = 31,
  • so the mean = 31/18, about 1.7 times per week
• Note: the mean does not have to be an integer
  • just because the individual values are whole numbers
• Repeat this analysis
  • for Q3463 – times fish eaten last week
  • and HHsize
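The arithmetic on this slide can be reproduced with a few lines of Python. The 18 values below are made up: they were chosen only so that the minimum, maximum and total match the figures quoted above, and they are not the actual Q3462 values from the survey.

```python
# Illustrative values only: 18 made-up counts chosen to match the
# minimum (0), maximum (5) and total (31) quoted on the slide.
# They are NOT the actual Q3462 values from the survey.
meat_per_week = [0, 1, 2, 1, 0, 3, 2, 1, 5, 2, 1, 0, 2, 3, 1, 2, 4, 1]


def summarise(values):
    """Basic summaries for a discrete quantitative variable."""
    total = sum(values)
    return {
        "n": len(values),
        "minimum": min(values),
        "maximum": max(values),
        "total": total,
        "mean": total / len(values),   # need not be a whole number
    }


summary = summarise(meat_per_week)
print(summary)
print("mean to 1 decimal place:", round(summary["mean"], 1))
```

The same function could be reused for Q3463 and HHsize once their values are entered.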
Data analysis
• As the layout of the data is always the same
  • Once you know how to analyse one data set
  • You will have the principles to analyse them all
  • And we have just done one analysis!
• You have seen that
  • The appropriate analysis depends on the type of data
• So what are the principles
  • of analysing (summarising) data
  • of the different types?
The methods of analysis
• How many?
  • are questions for qualitative variables
  • for example the CAST survey, the Tanzania survey
• You used summaries
  • Like counts, or proportions or percentages
• How large?
• How variable?
  • are questions for quantitative variables
  • for example the climatic data or the Tanzania survey
• We used summaries
  • Like averages, extremes and measures of spread
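For the “how large” and “how variable” questions, a rough Python sketch using the standard `statistics` module is given below. The five rainfall totals are hypothetical, invented for illustration, and the choice of range and standard deviation as the measures of spread is an assumption; the slides do not name specific measures.

```python
import statistics

# Hypothetical annual rainfall totals in mm, invented for illustration.
rainfall_mm = [820, 910, 760, 1005, 880]


def how_large_how_variable(values):
    """An average, the extremes, and two simple measures of spread."""
    return {
        "mean": statistics.mean(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),   # sample standard deviation
    }


for name, value in how_large_how_variable(rainfall_mm).items():
    print(f"{name:8s} {value:.1f}")
```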
A toolkit for analysis
• Different types of graph are also used
• Qualitative data
  • “how many”
• Quantitative data
  • how large
  • how variable
Statistics and variation
• In the CAST survey - why not just ask one student?
• In the climatic data - why not just use one year?
• In the agriculture survey - why not just use one household?
• Because there is variation between the responses
• Remember this definition?
  • “Statistics is making decisions when there is uncertainty.”
Variation is everywhere!
• In the book “Statistics: A guide to the unknown”
• “Variation is everywhere.
  • Individuals vary
  • Repeated measurements on the same individual vary
• The science of statistics
  • provides tools for dealing with variation”
• So statistics is concerned with making sense from data, when there is variation
Fighting the curse of variation
• To do good statistics you must
  • tame variation
  • fight the curse of variation
• You have 2 main strategies for overcoming variation
• 1. Take enough observations
  • In the Tanzania survey there were 3223 households just from this one region
• 2. Measure characteristics that explain variation
  • Variation itself is not necessarily the problem
  • Variation you do not understand is the problem
An example: explaining variation
• Take the CAST survey
• Add a new record for an imaginary student
  • Make it VERY DIFFERENT to the existing records
  • So if most students were positive about CAST
  • Then make this record very negative, etc
• You have added variation
• Now what could you (should you) have measured
  • to explain this variation?
What you could have measured
• This little survey only asked about CAST
• It did not ask about you, e.g.
  • male/female
  • experience
  • age
  • computer access
  • etc
• These measurements could help
  • to understand the difference with this new student
• The Tanzania survey also asked about
  • Education
  • Possessions, etc
• Why – to be able to understand/explain variation
Analysis and variation together
• For statistical analysis you have:
  • summarised columns of data
  • i.e. summarised individual variables
• You did this for qualitative and quantitative variables
• To fight the curse of variation
  • You take measurements
  • So you add to the rows of data
• That helps you to explain the variation
• That’s statistics for you!
  • You analyse the columns, i.e. the variables
  • And you understand variability by looking at the rows
Types of statistics
• Wikipedia says roughly:
• Statistical methods can be used to summarize
  • or describe a collection of data;
  • this is called descriptive statistics.
• In addition, patterns in the data may be modelled
  • and then used to draw inferences about the process or population being studied;
  • this is called inferential statistics.
• Both descriptive and inferential statistics
  • comprise applied statistics.
Descriptive and inferential statistics
• We have just done descriptive statistics
• We will only do descriptive statistics in this module
• The sample in the Tanzania agricultural survey
  • was 3223 households
• That’s just under 1% of the households in the region
  • See the column called WT – with values like 137
  • So each observation “represents” 137 households
• But with such a large sample
  • The inferences for the whole region
  • Will be quite precise
• So most of what we need now is descriptive tools
  • In the Higher level modules
  • we add ideas of inferential statistics
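The link between the weight of 137 and the “just under 1%” figure can be checked with a small Python sketch. It assumes, for simplicity, that every household has the same weight of 137; the slide only says the WT column has values like 137, so the real weights probably differ between households and the total below is only a rough figure.

```python
SAMPLE_SIZE = 3223      # households in the sample (from the slide)
TYPICAL_WEIGHT = 137    # "values like 137"; ASSUMED constant here


def sampling_fraction(weight):
    """If each sampled household represents `weight` households,
    the sample is roughly 1/weight of the population."""
    return 1 / weight


def estimated_total(sample_count, weight):
    """Rough number of households represented by `sample_count` records."""
    return sample_count * weight


print(f"sampling fraction: {100 * sampling_fraction(TYPICAL_WEIGHT):.2f}%")
print("households represented:", estimated_total(SAMPLE_SIZE, TYPICAL_WEIGHT))
```

Under this assumption the sampling fraction is about 0.73%, which agrees with “just under 1%”.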
Glossary of statistical terms
• Each subject becomes easier
  • when you understand the terms
• A glossary is supplied
  • Called the SSC Statistical Glossary
• It explains most of the terms
  • For the 3 levels of this course
• So some terms may be new to you now
• An example is on the next slide
  • You can print the glossary if you wish
  • But it is good to look on-line
  • Then all the terms in blue are links
  • So you can easily move about in the document
Example from the glossary
• Descriptive statistics
• If you have a large set of data, then descriptive statistics provides graphical (e.g. boxplots) and numerical (e.g. summary tables, means, quartiles) ways to make sense of the data.
• The branch of statistics devoted to the exploration, summary and presentation of data is called descriptive statistics.
• If you need to do more than descriptive summaries and presentations, the aim is to use the data to make inferences about some larger population.
• Inferential statistics is the branch of statistics devoted to making generalizations.
Learning objectives
• Define statistics
• Enter simple datasets once the data entry form is set up
• Recognise the type of each variable in a dataset
• Know some ways to summarise data of each main type
• Explain how statistical investigations deal with variability
• Differentiate between descriptive and inferential statistics
The end
• Next we move to the use of Excel
• To produce the tables and graphs
• So you can analyse all 3223 records – not just 18