Business Statistics: Communicating With Numbers, Third Edition
By Sanjiv Jaggia and Alison Kelly
Contents:
PART ONE
Introduction
CHAPTER 1
STATISTICS AND DATA 2
The Relevance of Statistics 4
What Is Statistics? 5
The Need for Sampling 6
Cross-Sectional and Time Series Data 6
Structured and Unstructured Data 7
Big Data 8
Data on the Web 8
Variables and Scales of Measurement 10
The Nominal Scale 11
The Ordinal Scale 12
The Interval Scale 13
The Ratio Scale 14
Synopsis of Introductory Case 15
Conceptual Review 16
PART TWO
Descriptive Statistics
CHAPTER 2
TABULAR AND GRAPHICAL METHODS 18
Summarizing Qualitative Data 20
Pie Charts and Bar Charts 21
Cautionary Comments When Constructing or Interpreting Charts or Graphs 24
Using Excel to Construct a Pie Chart and a Bar Chart 24
A Pie Chart 24
A Bar Chart 25
Using R to Construct a Pie Chart 25
Summarizing Quantitative Data 28
Guidelines for Constructing a Frequency Distribution 29
Synopsis of Introductory Case 33
Histograms, Polygons, and Ogives 33
Using Excel to Construct a Histogram, a Polygon, and an Ogive 37
A Histogram Constructed from Raw Data 37
A Histogram Constructed from a Frequency Distribution 38
A Polygon 39
An Ogive 39
Using R to Construct a Histogram, a Polygon, and an Ogive 39
A Histogram 39
A Polygon 40
An Ogive 40
Stem-and-Leaf Diagrams 45
Scatterplots 47
Using Excel and R to Construct a Scatterplot 48
Using Excel 48
Using R 49
Writing with Statistics 50
Conceptual Review 52
Additional Exercises and Case Studies 53
Exercises 53
Case Studies 56
Appendix 2.1: Guidelines for Other Software Packages 58
CHAPTER 3
NUMERICAL DESCRIPTIVE MEASURES 62
Measures of Central Location 64
The Mean 64
The Median 66
The Mode 67
The Weighted Mean 68
Using Excel and R to Calculate Measures of Central Location 69
Using Excel’s Formula Option 69
Using Excel’s Data Analysis Toolpak Option 70
Using R 71
Note on Symmetry 72
Percentiles and Boxplots 74
Calculating the pth Percentile 75
Note on Calculating Percentiles 76
Constructing and Interpreting a Boxplot 76
Using R to Construct a Boxplot 77
The Geometric Mean 79
The Geometric Mean Return 79
Arithmetic Mean versus Geometric Mean 80
The Average Growth Rate 81
Measures of Dispersion 83
Range 84
The Mean Absolute Deviation 84
The Variance and the Standard Deviation 85
The Coefficient of Variation 86
Using Excel and R to Calculate Measures of Dispersion 87
Using Excel’s Formula Option 87
Using Excel’s Data Analysis Toolpak Option 87
Using R 87
Mean-Variance Analysis and the Sharpe Ratio 89
Synopsis of Introductory Case 91
Analysis of Relative Location 92
Chebyshev’s Theorem 92
The Empirical Rule 93
z-Scores 94
Summarizing Grouped Data 97
Measures of Association 100
Using Excel and R to Calculate Measures of Association 102
Using Excel 102
Using R 102
Writing with Statistics 104
Conceptual Review 105
Additional Exercises and Case Studies 107
Exercises 107
Case Studies 110
Appendix 3.1: Guidelines for Other Software Packages 112
PART THREE
Probability and Probability Distributions
CHAPTER 4
INTRODUCTION TO PROBABILITY 114
Fundamental Probability Concepts 116
Events 117
Assigning Probabilities 119
Probabilities Expressed as Odds 122
Rules of Probability 125
The Complement Rule 125
The Addition Rule 126
The Addition Rule for Mutually Exclusive Events 127
Conditional Probability 128
Independent and Dependent Events 130
The Multiplication Rule 131
The Multiplication Rule for Independent Events 131
Contingency Tables and Probabilities 135
Synopsis of Introductory Case 138
The Total Probability Rule and Bayes’ Theorem 140
The Total Probability Rule 140
Bayes’ Theorem 143
Counting Rules 147
Writing with Statistics 150
Conceptual Review 151
Additional Exercises and Case Studies 153
Exercises 153
Case Studies 157
CHAPTER 5
DISCRETE PROBABILITY DISTRIBUTIONS 160
Random Variables and Discrete Probability Distributions 162
The Discrete Probability Distribution 163
Expected Value, Variance, and Standard Deviation 167
Expected Value 168
Variance and Standard Deviation 168
Risk Neutrality and Risk Aversion 169
Portfolio Returns 172
Properties of Random Variables 172
Expected Return, Variance, and Standard Deviation for a Portfolio 173
The Binomial Distribution 176
Using Excel and R to Obtain Binomial Probabilities 181
The Poisson Distribution 184
Synopsis of Introductory Case 187
Using Excel and R to Obtain Poisson Probabilities 187
The Hypergeometric Distribution 190
Using Excel and R to Obtain Hypergeometric Probabilities 192
Writing with Statistics 194
Conceptual Review 196
Additional Exercises and Case Studies 197
Exercises 197
Case Studies 200
Appendix 5.1: Guidelines for Other Software Packages 201
CHAPTER 6
CONTINUOUS PROBABILITY DISTRIBUTIONS 204
Continuous Random Variables and the Uniform Distribution 206
The Continuous Uniform Distribution 207
The Normal Distribution 210
Characteristics of the Normal Distribution 211
The Standard Normal Distribution 212
Finding a Probability for a Given z Value 213
Finding a z Value for a Given Probability 215
The Transformation of Normal Random Variables 217
Synopsis of Introductory Case 221
A Note on the Normal Approximation of the Binomial Distribution 221
Using Excel and R for the Normal Distribution 221
Other Continuous Probability Distributions 226
The Exponential Distribution 226
The Lognormal Distribution 229
Using Excel and R for the Exponential and Lognormal Distributions 231
Writing with Statistics 235
Conceptual Review 236
Additional Exercises and Case Studies 238
Exercises 238
Case Studies 240
Appendix 6.1: Guidelines for Other Software Packages 242
PART FOUR
Basic Inference
CHAPTER 7
SAMPLING AND SAMPLING DISTRIBUTIONS 246
Sampling 248
Classic Case of a “Bad” Sample: The Literary Digest Debacle of 1936 248
Trump’s Stunning Victory in 2016 249
Sampling Methods 250
Using Excel and R to Generate a Simple Random Sample 252
The Sampling Distribution of the Sample Mean 253
The Expected Value and the Standard Error of the Sample Mean 254
Sampling from a Normal Population 255
The Central Limit Theorem 257
The Sampling Distribution of the Sample Proportion 260
The Expected Value and the Standard Error of the Sample Proportion 260
Synopsis of Introductory Case 264
The Finite Population Correction Factor 265
Statistical Quality Control 268
Control Charts 269
Using Excel and R to Create a Control Chart 272
Writing with Statistics 276
Conceptual Review 277
Additional Exercises and Case Studies 279
Exercises 279
Case Studies 282
Appendix 7.1: Derivation of the Mean and the Variance for X̄ and P̄ 283
Sample Mean, X̄ 283
Sample Proportion, P̄ 283
Appendix 7.2: Properties of Point Estimators 283
Appendix 7.3: Guidelines for Other Software Packages 285
CHAPTER 8
INTERVAL ESTIMATION 288
Confidence Interval for the Population Mean When σ Is Known 290
Constructing a Confidence Interval for μ When σ Is Known 291
The Width of a Confidence Interval 293
Using Excel and R to Construct a Confidence Interval for μ When σ Is Known 295
Confidence Interval for the Population Mean When σ Is Unknown 298
The t Distribution 298
Summary of the t_df Distribution 299
Locating t_df Values and Probabilities 300
Constructing a Confidence Interval for μ When σ Is Unknown 301
Using Excel and R to Construct a Confidence Interval for μ When σ Is Unknown 302
Confidence Interval for the Population Proportion 307
Selecting the Required Sample Size 310
Selecting n to Estimate μ 310
Selecting n to Estimate p 311
Synopsis of Introductory Case 312
Writing with Statistics 314
Conceptual Review 315
Additional Exercises and Case Studies 316
Exercises 316
Case Studies 319
Appendix 8.1: Guidelines for Other Software Packages 321
CHAPTER 9
HYPOTHESIS TESTING 322
Introduction to Hypothesis Testing 324
The Decision to “Reject” or “Not Reject” the Null Hypothesis 324
Defining the Null and the Alternative Hypotheses 325
Type I and Type II Errors 327
Hypothesis Test for the Population Mean When σ Is Known 330
The p-Value Approach 330
Confidence Intervals and Two-Tailed Hypothesis Tests 334
Using Excel and R to Test μ When σ Is Known 335
One Last Remark 336
Hypothesis Test for the Population Mean When σ Is Unknown 339
Using Excel and R to Test μ When σ Is Unknown 340
Synopsis of Introductory Case 342
Hypothesis Test for the Population Proportion 345
Writing with Statistics 348
Conceptual Review 350
Additional Exercises and Case Studies 351
Exercises 351
Case Studies 354
Appendix 9.1: The Critical Value Approach 356
Appendix 9.2: Guidelines for Other Software Packages 358
CHAPTER 10
STATISTICAL INFERENCE CONCERNING TWO POPULATIONS 360
Inference Concerning the Difference between Two Means 362
Confidence Interval for μ1 − μ2 362
Hypothesis Test for μ1 − μ2 364
Using Excel and R for Testing Hypotheses about μ1 − μ2 366
A Note on the Assumption of Normality 369
Inference Concerning Mean Differences 373
Recognizing a Matched-Pairs Experiment 374
Confidence Interval for μD 374
Hypothesis Test for μD 375
Using Excel and R for Testing Hypotheses about μD 377
One Last Note on the Matched-Pairs Experiment 379
Synopsis of Introductory Case 379
Inference Concerning the Difference between Two Proportions 382
Confidence Interval for p1 − p2 383
Hypothesis Test for p1 − p2 384
Writing with Statistics 389
Conceptual Review 390
Additional Exercises and Case Studies 392
Exercises 392
Case Studies 394
Appendix 10.1: Guidelines for Other Software Packages 396
CHAPTER 11
STATISTICAL INFERENCE CONCERNING VARIANCE 398
Inference Concerning the Population Variance 400
Sampling Distribution of S² 400
Finding χ²_df Values and Probabilities 401
Confidence Interval for the Population Variance 403
Hypothesis Test for the Population Variance 404
Note on Calculating the p-Value for a Two-Tailed Test Concerning σ² 405
Using Excel and R to Test σ² 405
Inference Concerning the Ratio of Two Population Variances 409
Sampling Distribution of S₁²/S₂² 409
Finding F(df1,df2) Values and Probabilities 410
Confidence Interval for the Ratio of Two Population Variances 412
Hypothesis Test for the Ratio of Two Population Variances 413
Using Excel and R to Test σ₁²/σ₂² 415
Synopsis of Introductory Case 416
Writing with Statistics 419
Conceptual Review 420
Additional Exercises and Case Studies 421
Exercises 421
Case Studies 423
Appendix 11.1: Guidelines for Other Software Packages 425
CHAPTER 12
CHI-SQUARE TESTS 426
Goodness-of-Fit Test for a Multinomial Experiment 428
Using R to Conduct a Goodness-of-Fit Test 432
Chi-Square Test for Independence 435
Calculating Expected Frequencies 436
Synopsis of Introductory Case 439
Using R to Conduct a Test for Independence 440
Chi-Square Tests for Normality 443
The Goodness-of-Fit Test for Normality 443
The Jarque-Bera Test 445
Using R to Conduct a Goodness-of-Fit Test for Normality and the Jarque-Bera Test 446
Writing with Statistics 450
Conceptual Review 451
Additional Exercises and Case Studies 453
Exercises 453
Case Studies 457
Appendix 12.1: Guidelines for Other Software Packages 458
PART FIVE
Advanced Inference
CHAPTER 13
ANALYSIS OF VARIANCE 460
One-Way ANOVA Test 462
Between-Treatments Estimate of σ²: MSTR 463
Within-Treatments Estimate of σ²: MSE 464
The One-Way ANOVA Table 466
Using Excel and R to Construct a One-Way ANOVA Table 466
Multiple Comparison Methods 471
Fisher’s Least Significant Difference (LSD) Method 472
Synopsis of Introductory Case 473
Tukey’s Honestly Significant Difference (HSD) Method 474
Using R to Construct Tukey Confidence Intervals for μ1 − μ2 476
Two-Way ANOVA Test: No Interaction 480
The Sum of Squares for Factor A, SSA 482
The Sum of Squares for Factor B, SSB 483
The Error Sum of Squares, SSE 483
Using Excel and R for a Two-Way ANOVA Test: No Interaction 484
Two-Way ANOVA Test: With Interaction 489
The Total Sum of Squares, SST 490
The Sum of Squares for Factor A, SSA, and the Sum of Squares for Factor B, SSB 490
The Sum of Squares for the Interaction of Factor A and Factor B, SSAB 491
The Error Sum of Squares, SSE 492
Using Excel and R for a Two-Way ANOVA Test: With Interaction 492
Writing with Statistics 497
Conceptual Review 498
Additional Exercises and Case Studies 499
Case Studies 504
Appendix 13.1: Guidelines for Other Software Packages 506
CHAPTER 14
REGRESSION ANALYSIS 508
Hypothesis Test for the Correlation Coefficient 510
Testing the Correlation Coefficient ρxy 511
Using Excel and R to Conduct a Hypothesis Test for ρxy 511
Limitations of Correlation Analysis 513
The Linear Regression Model 515
The Simple Linear Regression Model 516
Using Excel and R to Estimate a Simple Linear Regression Model 520
The Multiple Linear Regression Model 521
Using Excel and R to Estimate a Multiple Linear Regression Model 523
Goodness-of-Fit Measures 528
The Standard Error of the Estimate 529
The Coefficient of Determination, R² 530
The Adjusted R² 533
Synopsis of Introductory Case 534
Writing with Statistics 536
Conceptual Review 538
Additional Exercises and Case Studies 539
Case Studies 541
Appendix 14.1: Guidelines for Other Software Packages 543
CHAPTER 15
INFERENCE WITH REGRESSION MODELS 544
Tests of Significance 546
Test of Individual Significance 546
Using a Confidence Interval to Determine Individual Significance 548
A Test for a Nonzero Slope Coefficient 549
Test of Joint Significance 551
Reporting Regression Results 553
Synopsis of Introductory Case 554
A General Test of Linear Restrictions 558
Interval Estimates for the Response Variable 563
Model Assumptions and Common Violations 567
Common Violation 1: Nonlinear Patterns 569
Detection 569
Remedy 571
Common Violation 2: Multicollinearity 571
Detection 571
Remedy 572
Common Violation 3: Changing Variability 572
Detection 573
Remedy 574
Common Violation 4: Correlated Observations 574
Detection 574
Remedy 575
Common Violation 5: Excluded Variables 575
Remedy 575
Summary 576
Using Excel and R to Construct Residual Plots 576
Writing with Statistics 580
Conceptual Review 582
Additional Exercises and Case Studies 584
Exercises 584
Case Studies 586
Appendix 15.1: Guidelines for Other Software Packages 588
CHAPTER 16
REGRESSION MODELS FOR NONLINEAR RELATIONSHIPS 590
Polynomial Regression Models 592
Regression Models with Logarithms 601
A Log-Log Model 602
The Logarithmic Model 603
The Exponential Model 605
Comparing Linear and Log-Transformed Models 608
Using Excel and R to Compare Linear and Log-Transformed Models 609
Synopsis of Introductory Case 611
Writing with Statistics 614
Conceptual Review 616
Additional Exercises and Case Studies 617
Exercises 617
Case Studies 619
Appendix 16.1: Guidelines for Other Software Packages 621
CHAPTER 17
REGRESSION MODELS WITH DUMMY VARIABLES 624
Dummy Variables 626
Qualitative Explanatory Variable with Two Categories 626
Qualitative Explanatory Variable with Multiple Categories 629
Interactions with Dummy Variables 635
Synopsis of Introductory Case 639
Binary Choice Models 641
The Linear Probability Model 641
The Logit Model 643
Using R to Estimate a Logit Model 646
Writing with Statistics 649
Conceptual Review 650
Additional Exercises and Case Studies 651
Exercises 651
Case Studies 655
Appendix 17.1: Guidelines for Other Software Packages 657
PART SIX
Supplementary Topics
CHAPTER 18
TIME SERIES AND FORECASTING 658
Choosing a Forecasting Model 660
Model Selection Criteria 661
Smoothing Techniques 662
Moving Average Methods 662
Exponential Smoothing Methods 665
Using Excel and R for Moving Averages and Exponential Smoothing 667
Trend Forecasting Models 670
The Linear Trend 670
The Exponential Trend 671
Polynomial Trends 674
Trend and Seasonality 678
Decomposition Analysis 678
Extracting Seasonality 679
Extracting Trend 681
Forecasting with Decomposition Analysis 682
Seasonal Dummy Variables 683
Synopsis of Introductory Case 686
Causal Forecasting Methods 688
Lagged Regression Models 688
Using R to Estimate Lagged Regression Models 690
Writing with Statistics 692
Conceptual Review 694
Additional Exercises and Case Studies 695
Exercises 695
Case Studies 698
Appendix 18.1: Guidelines for Other Software Packages 700
CHAPTER 19
RETURNS, INDEX NUMBERS, AND INFLATION 702
Investment Return 704
The Adjusted Closing Price 705
Nominal versus Real Rates of Return 706
Index Numbers 708
Simple Price Indices 708
Unweighted Aggregate Price Index 710
Weighted Aggregate Price Index 711
Synopsis of Introductory Case 715
Using Price Indices to Deflate a Time Series 717
Inflation Rate 719
Writing with Statistics 722
Conceptual Review 723
Additional Exercises and Case Studies 724
Exercises 724
Case Studies 726
CHAPTER 20
NONPARAMETRIC TESTS 728
Testing a Population Median 730
The Wilcoxon Signed-Rank Test for a Population Median 730
Using a Normal Distribution Approximation for T 733
Using R to Test a Population Median 734
Testing Two Population Medians 736
The Wilcoxon Signed-Rank Test for a Matched-Pairs Sample 736
Using R to Test for Median Differences from a Matched-Pairs Sample 737
The Wilcoxon Rank-Sum Test for Independent Samples 738
Using R to Test for Median Differences from Independent Samples 740
Using a Normal Distribution Approximation for W 741
Testing Three or More Population Medians 744
The Kruskal-Wallis Test for Population Medians 744
Using R to Conduct a Kruskal-Wallis Test 746
The Spearman Rank Correlation Test 749
Using R to Test the Spearman Rank Correlation Coefficient 751
Using a Normal Distribution Approximation for rS 752
Summary of Parametric and Nonparametric Tests 752
Synopsis of Introductory Case 753
The Sign Test 755
Tests Based on Runs 759
The Method of Runs Above and Below the Median 760
Using R to Conduct the Runs Test 762
Writing with Statistics 764
Conceptual Review 765
Additional Exercises and Case Studies 767
Exercises 767
Case Studies 769
Appendix 20.1: Guidelines for Other Software Packages 771
APPENDIXES
Appendix A: Getting Started with R 774
Appendix B: Tables 781
Appendix C: Answers to Selected Even-Numbered Exercises 793
Glossary G-1
Index I-1