Project 2

Regression Analysis

P 11_17

 

DataPro is a firm that provides data-processing services to a variety of industries.  For the past year, they have collected information on the number of contracts sold, average contract price, advertising expenditures and personal selling expenditures.  By using scatter plots, correlation analysis, and regression analysis we can determine the relationships between the dependent variable and the independent variables.

 

 

 

The table below contains the data that will be used for the remainder of the project

 

Month

Contracts Sold

Average Contract Price

Advertising Expenditures

Personal Selling Expenditures

1

2,000

$2,600

$19,800

$34,000

2

2,000

$2,700

$20,300

$32,000

3

3,000

$2,700

$29,800

$45,000

4

1,250

$2,900

$12,300

$26,000

5

1,750

$3,000

$17,400

$33,000

6

2,750

$3,200

$28,200

$40,000

7

2,500

$3,200

$18,400

$41,000

8

1,000

$3,200

$10,400

$26,000

9

1,500

$3,500

$15,300

$34,000

10

1,750

$3,600

$17,400

$35,000

11

2,250

$3,700

$23,500

$39,000

12

2,500

$3,800

$26,800

$43,000

 

 

Now that we have all of the data, we can make a scatterplot to show the relationship between the dependent variable and the dependent variables. 

 

 

 

 

Both of the scatterplots above show a strong relationship between the dependent variable and the causal (independent) variable.  In most cases, as the causal variable increases so does the dependent variable.  When there is a strong relationship between variables, the data points form a line (upward sloping in this case because they are positively related).   

 

 

In the scatterplot below the points don’t form a line like the scatterplots above. They don’t seem to show any sort of relationship between the variables.

 

Therefore, just by looking at the scatterplots, we can see which causal variables are more likely to have an effect on the dependent variable.

 

The scatterplots provide a great visual tool for seeing the relationships among variables, but they don’t give an exact measure of the relationship. Using the correlation we can get an exact measure of the relationship between the variables.

 

Using the data analysis feature of Excel can make the correlation table below.

 

 

Avg.Contract Price

Adv Expenditures

Pers. Selling Expenditures

Contracts Sold

Avg.Contract Price

1

 

 

 

Adv Expenditures

0.046122361

1

 

 

Pers. Selling Expenditures

0.250233614

0.880802646

1

 

Contracts Sold

-0.016001334

0.94120431

0.936825657

1

 

 

·        The dependent variable (Contracts Sold) has a low correlation (-.016) with Average Contract Price.  Therefore this independent variable has no effect on the number of contracts sold.  In most cases this would not be included in the regression analysis model, but I will still use in my example.

The other two independent variables (Advertising Expenditures and Personal Selling Expenditures) have high correlations (.941 and .937 respectively) with the dependent variable.

 

·        Next the correlation between independent variables needs to be examined.  Unlike the relationship between dependent and independent variables, the relationship between independent variables should be low. 

·        Average Contract Price has a low correlation with the other independent variables.

·        The correlation between Advertising Expenditures and Personal Selling Expenditures is high (.88).  Usually only one of these variables would be used in the model, since they have a high correlation.   Independents with high correlation usually are not used because of their redundancy, but I am going to use all of the independents because I would only be left with one variable, and would not be able to calculate multiple regression analysis.

 

 

Using Excel’s data analysis tool, regression analysis can be calculated

 

SUMMARY OUTPUT

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Regression Statistics

 

 

 

 

 

 

 

 

Multiple R

0.9847

 

 

 

 

 

 

 

R Square

0.9697

 

 

 

 

 

 

 

Adjusted R Square

0.9584

 

 

 

 

 

 

 

Standard Error

123.9224

 

 

 

 

 

 

 

Observations

12

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

ANOVA

 

 

 

 

 

 

 

 

 

df

SS

MS

F

Significance F

 

 

 

Regression

3

3934437.548

1311479.183

85.40

2.04E-06

 

 

 

Residual

8

122854.1191

15356.76489

 

 

 

 

 

Total

11

4057291.667

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Coefficients

Standard Error

t Stat

P-value

Lower 95%

Upper 95%

Lower 95.0%

Upper 95.0%

Intercept

-117.5135

333.5259

-0.3523

0.7337

-886.6260

651.5991

-886.6260

651.5991

Average_ContractPrice

-0.2965

0.1020

-2.9081

0.0196

-0.5316

-0.0614

-0.5316

-0.0614

Advertising Expenditures

0.0360

0.0139

2.5790

0.0327

0.0038

0.0681

0.0038

0.0681

Pers. Selling Expenditures

0.0662

0.0144

4.6065

0.0017

0.0331

0.0993

0.0331

0.0993

 

 

·        R Squared shows that almost 97% of the variance can be explained by the variables. 

·        The equation is Y=-117.51 -.2965Average Contract Price+.036Advertising Expenditures+.0662Personal Selling Expenditures