STAT200 Introduction to Statistics
Assignment #1: Descriptive Statistics Data Analysis Plan
Before conducting any statistical analyses, researchers develop a plan for how they will analyze their
data to answer their research questions. The purpose of this assignment is to provide an experience
developing a descriptive statistics analysis plan. Note: This first assignment is a plan only; no statistics
will be calculated or graphs created. The second assignment will involve carrying out the plan, after
receiving feedback from your instructor.
Assignment Steps:
Step #1: Review the STAT200 data set file. (Note: This data set will be used for all three of this term’s
written assignments).
The data is a subsample from the US Department of Labor’s Consumer Expenditure Surveys (CE) and
provides information about the composition of households and their annual expenditures
(https://www.bls.gov/cex/). Detailed information on the sample and variables is included with the data
set file; please carefully review this information to familiarize yourself with the data (Note: This
information will be used in Assignment #2 to describe the dataset).
Step #2: Develop descriptive statistics data analysis plan.
➢ Task 1: Develop scenario. Imagine that you are head of a household and have to determine a
household budget plan based on the data available from the dataset. For instance, you are a 35
year old single parent with a high school diploma and one child.
➢ Task 2: Select variables for analysis that match the scenario developed in Task 1. The data set
provides information on household consumption; there are socioeconomic variables and
expenditures variables. The socioeconomic variable names start with “SE-” and the expenditure
variable names start with a “USD;” all expenditures are in US dollars. All students must use
income as one variable. Select two additional socioeconomic variables (one qualitative and one
quantitative) and two expenditures for your analysis that match the scenario you developed for
Task 1. For instance, using the example scenario of a 35 year old single parent with a high school diploma and one child, you could select “income,” “education,” and “number of children”
as socioeconomic variables and then pick two household expenditure items to show the
distribution of costs and compare that with your income. When selecting variables, think about
the following three questions:
o Why am I choosing these variables?
o What interests me about these variables?
o What do I think will be the outcome?
➢ Task 3: Determine appropriate measures of central tendency and dispersion for the selected
variables. For each quantitative variable, select at least one measure of central tendency and at
least one measure of dispersion (see the table below for the list of measures). For the qualitative
variable, select one measure of central tendency. When determining the measures of central
tendency and dispersion, think about what is appropriate given the level of measurement and
type of variable. We recommend referring to the text and information posted in our LEO classroom
to help with this task (Note: you will use this information to provide a rationale for your choice
of measures).
Measures of Central Tendency:
● Mean
● Mode
● Median
Measures of Dispersion:
● Range
● Sample Standard Deviation
● Variance
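As a rough illustration of what each listed measure computes, here is a minimal Python sketch using the standard `statistics` module. The values are hypothetical and are not taken from the STAT200 data set; the variable names are made up for this example. Nothing here needs to be calculated for Assignment #1, which remains a plan only.

```python
import statistics

# Hypothetical annual incomes in USD (illustrative only, NOT from the STAT200 data set)
incomes = [42000, 55000, 61000, 48000, 97000, 55000]

# Measures of central tendency
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
mode_income = statistics.mode(incomes)

# Measures of dispersion
income_range = max(incomes) - min(incomes)
sample_variance = statistics.variance(incomes)  # sample variance, divides by n - 1
sample_std_dev = statistics.stdev(incomes)      # square root of the sample variance

# Of the listed measures, only the mode is defined for qualitative (nominal) data
marital_status = ["Married", "Not Married", "Married", "Married"]
mode_status = statistics.mode(marital_status)

print(f"mean={mean_income:.2f} median={median_income} mode={mode_income}")
print(f"range={income_range} variance={sample_variance:.2f} std dev={sample_std_dev:.2f}")
print(f"most common marital status: {mode_status}")
```

The sketch uses `statistics.variance` and `statistics.stdev` (the sample versions) rather than `pvariance` and `pstdev`, since the data set is described as a sample of a larger population.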
➢ Task 4: Determine appropriate graph and/or table for each of the selected variables. Select
one graph or table for each variable (see the table below for the list of graphs and tables). When
determining the graphs and tables, think about what is appropriate given the level of
measurement and type of variable. We recommend referring to the text and information posted in
our LEO classroom to help with this task (Note: you will use this information to provide a
rationale for your choice of graphs and/or tables).
Types of Graphs:
● Pie Chart
● Bar Chart
● Histogram
● Box Plots (also known as Box-and-Whiskers Plot)
Types of Tables:
● Frequency Table
● Relative Frequency Table
● Grouped Frequency Table
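To make the difference between a frequency table and a relative frequency table concrete, the following is a small Python sketch for a qualitative variable. The responses are hypothetical, and `frequency_table` is a helper name invented for this example. The relative frequencies are also the shares a pie chart would display.

```python
from collections import Counter

# Hypothetical marital-status responses (illustrative only, NOT from the STAT200 data set)
responses = ["Married", "Not Married", "Married", "Married", "Not Married"]

def frequency_table(values):
    """Return {category: (count, relative frequency)} for qualitative data."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, count / total) for category, count in counts.items()}

table = frequency_table(responses)

# Print one row per category: name, frequency, relative frequency as a percentage
for category, (count, rel) in table.items():
    print(f"{category:<12} {count:>3} {rel:>6.0%}")
```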
Step #3: Complete the “Assignment #1: Descriptive Statistics Data Analysis Plan Template.”
Remember, you will not be conducting any statistical analysis, drawing any graphs, or compiling any
tables for the first assignment. Rather, you need to wait for feedback from your instructor on this
assignment and use that feedback to complete Assignment #2.
Here are the main sections for this assignment (i.e., completing the plan template):
✓ Identifying Information. Fill in information on name, class, instructor, and date.
✓ Scenario. In this section, briefly (2-3 sentences) describe the scenario you developed in Step #2,
Task 1.
✓ Complete Table 1: Variables Selected for the Analysis. Enter information on the variables selected
for analysis in Step #2, Task 2. For each selected variable, be sure to include its name as listed in
the data set, description, and variable type.
✓ Reason(s) for Selecting the Variables and Expected Outcome(s): In this section, for each
selected variable, please answer the following questions:
o Why did I choose this variable?
o What interests me about this variable?
o What do I think will be the outcome?
✓ Complete Table 2. Numerical Summaries of the Selected Variables. Enter information on
selected measures of central tendency and dispersion for each selected variable. Be sure to
briefly explain why you chose those measures. Note: The information for the required
variable, “Income,” has already been completed and can be used as a guide for completing
information on the remaining variables.
✓ Complete Table 3. Type of Graphs and/or Tables for Selected Variables. Enter information on
the selected graph and/or table for each selected variable. Be sure to briefly explain why you
chose those graphs and/or tables. Note: The information for the required variable, “Income,” has
already been completed and can be used as a guide for completing information on the
remaining variables.
Assignment Submission: Name the file that contains your completed “Assignment #1: Descriptive
Statistics Data Analysis Plan Template” using the following format: “Assignment1-StudentLastName.”
Then, submit the file via the Assignments area in the LEO classroom in the “Assignment #1: Descriptive
Statistics Data Analysis Plan” folder and wait for your instructor’s feedback.
STAT200 - Assignment #1: Descriptive Statistics Data Analysis Plan
Scenario:
I have selected the scenario of a 35 year old married parent who is employed and has two children. I have selected “Income”, “MaritalStatus”, and “FamilySize” as the socioeconomic variables and “Food” and “Meat” as the expenditure variables for analysis. I will use these variables to analyze the household's annual expenditures and compare them with the annual income of the head of the household.
Table 1. Variables Selected for the Analysis
Variable Name in the Data Set | Description (See the data dictionary for describing the variables.) | Type of Variable (Qualitative or Quantitative) |
Variable 1: “Income” | Annual household income in USD. | Quantitative |
Variable 2: “MaritalStatus” | Marital Status of Head of Household | Qualitative |
Variable 3: “FamilySize” | Total Number of People in Family (Both Adults and Children) | Quantitative |
Variable 4: “Food” | Total Amount of Annual Expenditure on Food | Quantitative |
Variable 5: “Meat” | Total Amount of Annual Expenditure on Meat | Quantitative |
Reason(s) for Selecting the Variables and Expected Outcome(s):
1. Variable 1: “Income” - I have selected this variable because it will help me in budgeting and in comparing the annual expenditure with what the family can afford each year.
2. Variable 2: “MaritalStatus” - I selected this variable because it tells me whether the head of the household is married or unmarried, and I will use it to determine whether marital status has any effect on the total expenditure of the household.
3. Variable 3: “FamilySize” - I have selected this variable because it tells me how many people the household's expenditure has to cover, and I will use it to analyze whether the size of the family has any effect on the annual expenditure of the household.
4. Variable 4: “Food” - I have selected this variable because it tells me how much the household spends on food, and I will use it to determine how much food contributes to the total annual expenditure of the household.
5. Variable 5: “Meat” - I have selected this variable because it tells me how much the household spends on meat, and I will use it to determine how much meat contributes to the total annual expenditure of the household.
Data Set Description:
The data to be used in the analysis is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures. It contains information from 30 households, for each of which a survey respondent provided the requested information; all of it is self-reported. The data set contains four socioeconomic variables (Income, MaritalStatus, FamilySize, and AgeHeadHousehold) and four expenditure variables (Food, Meat, Bake, and Fruits).
Proposed Data Analysis:
Measures of Central Tendency and Dispersion
Table 2. Numerical Summaries of the Selected Variables
Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate |
Variable 1: “Income” | ● Number of Observations ● Median ● Sample Standard Deviation | I am using median for two reasons: 1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency. 2. The variable is quantitative. I am using sample standard deviation for three reasons: 1. The data is a sample from a larger data set. 2. It is the most commonly used measure of dispersion. 3. The variable is quantitative. |
Variable 2: “MaritalStatus” | ● Number of Observations ● Mode | I will use mode because the data is qualitative. |
Variable 3: “FamilySize” | ● Number of Observations ● Mean ● Median ● Sample Standard Deviation ● Variance | ● I will use mean and median because the data is quantitative. ● I will use sample standard deviation and variance because the data set is a sample of a larger population, and the variable is quantitative. |
Variable 4: “Food” | ● Number of Observations ● Mean ● Median ● Sample Standard Deviation ● Variance | ● I will use mean and median because the data is quantitative. ● I will use sample standard deviation and variance because the data set is a sample of a larger population, and the variable is quantitative. |
Variable 5: “Meat” | ● Number of Observations ● Mean ● Median ● Sample Standard Deviation ● Variance | ● I will use mean and median because the data is quantitative. ● I will use sample standard deviation and variance because the data set is a sample of a larger population, and the variable is quantitative. |
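The rationale for preferring the median when outliers may be present can be checked with a short Python sketch. The incomes below are hypothetical (not from the STAT200 data set) and were chosen so that one household is far above the rest.

```python
import statistics

# Hypothetical household incomes in USD (illustrative only)
typical = [40000, 45000, 50000, 55000, 60000]
with_outlier = typical + [500000]  # add one very high-income household

# How far does each measure move when the outlier is added?
mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean moves by {mean_shift}, median moves by {median_shift}")
```

In this made-up sample the mean moves far more than the median, which is why the median is often the safer summary for skewed variables such as income.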
Graphs and/or Tables
Table 3. Type of Graphs and/or Tables for Selected Variables
Variable Name | Graph and/or Table | Rationale for Why Appropriate |
Variable 1: “Income” | Graph: Histogram. I will use a histogram to show the shape of the distribution of the data. | A histogram is one of the best plots for showing how quantitative data is distributed, including whether it is approximately normal or skewed. |
Variable 2: “MaritalStatus” | Graph: Pie Chart. I will use a pie chart because the data is qualitative. | A pie chart shows each category's share of the whole as a percentage. |
Variable 3: “FamilySize” | Graph: Histogram. I will use a histogram to show the shape of the distribution of the data. | A histogram can be used to show the shape of the distribution of quantitative data. |
Variable 4: “Food” | Graph: Histogram. I will use a histogram to show the shape of the distribution of the data. | A histogram can be used to show the shape of the distribution of quantitative data. |
Variable 5: “Meat” | Graph: Histogram. I will use a histogram to show the shape of the distribution of the data. | A histogram can be used to show the shape of the distribution of quantitative data. |
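A histogram is essentially a picture of a grouped frequency table. The following Python sketch shows one way to build such a table with equal-width bins and print it as a text histogram. The expenditures and the bin width of 2000 are hypothetical choices, and `grouped_frequency` is a helper name invented for this example.

```python
def grouped_frequency(values, bin_width):
    """Count values in equal-width bins [start, start + bin_width)."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # lower edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical annual food expenditures in whole USD (illustrative only)
food = [4200, 5100, 5800, 6100, 6400, 7300, 9900]
bins = grouped_frequency(food, 2000)

# One row per bin; the row of '#' marks is the bar of the histogram
for start, count in bins.items():
    print(f"{start:>5}-{start + 1999:<5} {'#' * count}")
```

Bins with no observations are simply omitted here; a fuller version would list them with a count of zero so that gaps in the distribution stay visible.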