# Business Data Analysis

**Business Data Analysis**

Income can have significant effect on people’s spending patterns. Research studies have revealed that consumer expenditure is influenced by various factors such as their income, gender, age and level of education. In order to investigate the relationship between food expenditure and take home pay in Australia, a researcher plans to survey a sample of individuals across the country. QUESTION 1 [2 marks] Briefly explain (using no more than 250 words in total for this Question 1) (a) What type of survey method the researcher could use and why? (b) What sampling method could the researcher use to select his/her sample and why? (c) What are the two main variables the researcher should consider collecting data for the purpose of the above analysis and why? Identify the data type(s) for the variables. (d) What kind of issues the researcher may face in this data collection? Suppose a researcher has collected data form a sample of 150 individuals using the sampling method you have proposed in (b). For each individual, the weekly take-home pay and weekly food expenditure were recorded. The data are stored in file FOODEXP.XLS which is available in the “Assessment > Computing Lab Assignment > Computing Lab Assignment” in the course website. Using this data set and EXCEL, answer the questions below. QUESTION 2 [4 marks] First, the researcher wishes to use graphical descriptive methods to present a summary of the data. (a) The researcher suggests using 8 class intervals such as 100-225, 225-350, 350- 475, ….., 975-1100 for one variable and class intervals 0-50, 50-100, 100-150, …. , 350-400 for the other variable. Explain how the researcher would have decided on the number of classes and the above class intervals. (b) Draw a histogram for each variable using appropriate BIN values from part (a) and comment on the shape of the two distributions. 1305AFE (Tri 1, 2017) Computing Lab Assignment Page 2 of 3 QUESTION 3 [6 marks] Second, the researcher wishes to use numerical descriptive measures to summarize the data. (a) Prepare a numerical summary report for the two variables weekly take-home pay and food expenditure by including summary measures such as mean, median, range, variance, standard deviation, smallest and largest values and the three quartiles, for each variable. (b) Compute a numerical summary measure to measure the strength and the direction of the linear relationship between the two variables. Interpret this value. (c) Construct a 90% confidence interval estimate for the population average of weekly take-home pay. (d) Test the hypothesis that the population weekly take-home pay is more than $500 (use a 5% level of significance). (e) Construct a 95% confidence interval estimate for the population average of weekly food expenditure. (f) Test the hypothesis that the population average weekly food expenditure is less than $200 (use a 5% level of significance). Notes: Use QUARTILE.EXC command to generate the three quartiles. For part c,d,e, and f, refer to the appropriate worksheets in Confidence Interval Estimation and Hypothesis Testing excel files provided under Assessment > Computing Lab Assignment > Computing Lab Assignment folder in the course website. Input the required information to the worksheet to obtain the outputs. QUESTION 4 [6 marks] The researcher considers using regression analysis to establish a linear relationship between the two variables food expenditure and weekly take-home pay. (a) What is the dependent variable and independent variable for this analysis? Why? (b) Use an appropriate plot to investigate the relationship between the two variables. On the same plot, fit a linear trend line. (c) Estimate a simple linear regression model and present the estimated linear equation. Diplay the regression summary table and interpret the intercept and slope coefficient estimates of the estimated linear model. (d) Interpret the value of the coefficient of determination, R-squared (R2 ). 1305AFE (Tri 1, 2017) Computing Lab Assignment Page 3 of 3 QUESTION 5 [2 marks] The researcher is interested in presenting the data on weekly take-home pay and food expenditure graphically. For this purpose, the researcher categorised data on take-home pay and food expenditure into 8 groups and calculated the frequencies as given in following frequency distribution tables. Frequency distribution tables Using the data in the above frequency distribution tables and using EXCEL, answer the following questions. (a) Which graphical technique or chart should be used if the researcher is interested in comparing the number of individuals in each weekly take-home pay category? Explain the reason for the selction of this graphical chart. Construct the chart and describe what you can observe about the number of individuals belong to each takehome pay category? (b) Which graphical technique or chart should be used if the researcher is inetersted in describing the proportion of the individuals in each food expenditure category? Explain the reason for the selction of this graphical chart. Construct the chart and describe what you can observe about the proportion of individuals belong to each food expenditure category.