Assignment #2: Descriptive Statistics Analysis and Writeup
Introduction:
Use the same scenario you submitted for the first assignment with modifications using your instructor’s feedback, if needed. Include Table 1: Variables Selected for the Analysis you used in Assignment #1 to show the variables you selected for analysis.
My scenario is a 27-year-old single working parent with a 4-year-old who attends pre-K at a weekday school. The annual income is $30,000. In addition to the income variable, I selected SE-Marital Status, SE-Family Size, USD-Meat, and USD-Fruits. I will use MS Excel for this analysis.
Table 1. Variables Selected for the Analysis
Variable Name in data set | Description | Type of Variable (Qualitative or Quantitative) |
Variable 1: “Income” | Annual household income in USD. | Quantitative |
Variable 2: SE-Marital Status | Marital status of head of household. | Qualitative |
Variable 3: SE-Family Size | Total number of people in family (both adults and children). | Quantitative |
Variable 4: USD-Meat | Total amount of annual expenditures on meat. | Quantitative |
Variable 5: USD-Fruits | Total amount of annual expenditures on fruit. | Quantitative |
Data Set Description and Method Used for Analysis:
Results:
Variable 1: Income
Numerical Summary.
Table 2. Descriptive Analysis for Variable 1
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: Income | 30 | Median = $96,791.50 | SD = $5,772.76 |
Graph and/or Table: Histogram of Income
Description of Findings.
Because outliers can distort the mean, the median is the most appropriate measure of central tendency for income. Most households earn between $94,867 and $101,367 per annum, and the median income lies in this group. The number of households falls as income rises, so few households are found at the top. The standard deviation is used for dispersion because the data is a sample from a larger population, the measure is widely used, and the variable is quantitative. The standard deviation of annual income was $5,772.76, which indicates a notable gap between the highest earners and the lowest earners.
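The reasoning for preferring the median can be illustrated with a minimal sketch. The income values below are hypothetical and are not taken from the assignment's data set; they only show how a single very high earner shifts the mean far more than the median.

```python
import statistics

# Hypothetical annual incomes in USD (NOT the assignment's data set).
incomes = [94_000, 95_500, 96_000, 97_500, 98_000]
# Same list with one unusually high earner added as an outlier.
with_outlier = incomes + [250_000]


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return statistics.mean(values), statistics.median(values)


base_mean, base_median = summarize(incomes)
out_mean, out_median = summarize(with_outlier)

print(f"without outlier: mean={base_mean:,.2f}, median={base_median:,.2f}")
print(f"with outlier:    mean={out_mean:,.2f}, median={out_median:,.2f}")
```

In this illustration the mean rises by more than $25,000 while the median moves by less than $1,000, which is why the median is the safer summary when outliers are suspected.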
Variable 2: Marital Status
Numerical Summary.
Table 3. Descriptive Analysis for Variable 2
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: Marital Status | 30 | Mode = Married and Not Married (15 each) | N/A |
Graph and/or Table.
Marital Status | Count |
Married | 15 |
Not Married | 15 |
Grand Total | 30 |
Description of Findings.
As the variable is qualitative, the mode is the most appropriate measure of central tendency for this analysis. There is an equal number of households headed by married and unmarried individuals in the sample.
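For a qualitative variable, the mode is the category (or categories) with the highest count rather than the count itself. A minimal sketch using the counts from the marital-status table shows that a tie produces two modes; the variable names are illustrative only.

```python
from collections import Counter

# Counts taken from the marital-status frequency table in this section.
counts = Counter({"Married": 15, "Not Married": 15})

# The mode is every category whose count equals the highest count.
top = max(counts.values())
modes = sorted(category for category, count in counts.items() if count == top)

print(f"mode(s): {modes}, each with {top} households")
```

Because both categories share the highest count, the sample is bimodal for this variable.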
Variable 3: Family Size
Numerical Summary.
Table 4. Descriptive Analysis for Variable 3
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: Family Size | 30 | Mean=3.13 | SD = 1.13 |
Graph and/or Table.
Family Size | Count |
1 | 2 |
2 | 8 |
3 | 7 |
4 | 10 |
5 | 3 |
Description of Findings.
Here, SE-FamilySize is a quantitative variable, so the mean and standard deviation are the most appropriate measures of central tendency and dispersion, respectively. The average family consists of about 3 individuals, and a third of the families consist of 4 people. The 3 largest households had 5 members, and the 2 smallest consisted of only one individual. The standard deviation was approximately one individual.
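As a cross-check on the Excel output, the mean and standard deviation can be recomputed directly from the frequency table. The sketch below is not the method used in the assignment; it reproduces the mean of 3.13, and the standard deviation comes out at roughly 1.14 with the sample formula (n - 1) and roughly 1.12 with the population formula (n), both close to the reported 1.13.

```python
import math

# Frequency table from this section: family size -> number of households.
freq = {1: 2, 2: 8, 3: 7, 4: 10, 5: 3}

n = sum(freq.values())

# Weighted mean: each family size counted once per household.
mean = sum(size * count for size, count in freq.items()) / n

# Sum of squared deviations from the mean, weighted by frequency.
ss = sum(count * (size - mean) ** 2 for size, count in freq.items())

sample_sd = math.sqrt(ss / (n - 1))  # divides by n - 1 (Excel's STDEV.S)
population_sd = math.sqrt(ss / n)    # divides by n (Excel's STDEV.P)

print(f"n={n}, mean={mean:.2f}")
print(f"sample SD={sample_sd:.2f}, population SD={population_sd:.2f}")
```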
Variable 4: Meat
Numerical Summary.
Table 5. Descriptive Analysis for Variable 4
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable 4: Meat | 30 | Median = $917.50 | SD = $221.05 |
Graph and/or Table.
Description of Findings.
Because outliers can affect the amount spent on meat, the median is the most appropriate measure of central tendency. The median household spent $917.50 on meat in 2016. Most households spent between $864 and $1,114 on meat; this group represents two-thirds of the households in the sample. Because the data is a sample, the sample standard deviation is used to describe the dispersion. The standard deviation of the amount spent was $221.05.
Variable 5: Fruits
Numerical Summary.
Table 6. Descriptive Analysis for Variable 5
Variable | n | Measure(s) of Central Tendency | Measure(s) of Dispersion |
Variable: Fruits | 30 | Median = $866 | SD = $190.49 |
Graph and/or Table.
Description of Findings.
As with meat, the amount spent on fruits might be affected by outliers, so the median is the most suitable measure of central tendency. As indicated in the histogram, most households spent between $744 and $1,164 on fruits, and only 2 households spent more than $1,164. The standard deviation was $190.49. The concentration of households in this range suggests that most households value spending on fruits.
Discussion and Conclusion.
Briefly discuss each variable in the same sequence as presented in the results. What has the highest expenditure? What variable has the lowest expenditure? If you were to recommend a place to save money, which expenditure would it be and why? Note: The section should be no more than 2 paragraphs.
As indicated above, the median household income was $96,791.50. It was lower than the mean household income, meaning that most households earned below the average. The histogram also shows that the highest earners were the fewest. The sample had equal numbers of households headed by married and unmarried individuals. A third of the households consisted of 4 individuals, and only two households consisted of one individual.
These households spent more on meat than on fruits: the median expenditure was $917.50 for meat and $866 for fruits. However, while 66.67% of households spent between $864 and $1,114 on meat, more than 93% spent between $744 and $1,164 on fruits, so spending on fruits was more concentrated in its main range. Thus, Meat had the highest expenditure and Fruits the lowest. To save money, I would recommend abandoning meat in favor of fruits, because expenditure on fruits is lower than expenditure on meat.
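The comparison between the two expenditure variables comes down to simple arithmetic, sketched below. The medians are taken from Tables 5 and 6. The household counts are assumptions inferred from the text rather than read from the data set: 20 of 30 for meat (from "two-thirds") and 28 of 30 for fruits (from the 2 households above $1,164, assuming none fell below $744).

```python
n = 30  # households in the sample

# Medians reported in the Results section (USD per year).
median_meat = 917.50
median_fruits = 866.00

# Difference between the two median expenditures.
gap = median_meat - median_fruits

# Assumed counts of households inside each variable's main spending range.
meat_in_range = 20        # inferred from "two-thirds" of 30
fruits_in_range = n - 2   # inferred from "only 2 households" above the range

meat_share = 100 * meat_in_range / n
fruits_share = 100 * fruits_in_range / n

print(f"median gap (meat - fruits): ${gap:.2f}")
print(f"meat share in range:   {meat_share:.2f}%")
print(f"fruits share in range: {fruits_share:.2f}%")
```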