# Business Statistics: Communicating with Numbers, Fourth Edition, by Sanjiv Jaggia and Alison Kelly

CONTENTS

PART ONE

Introduction

CHAPTER 1

DATA AND DATA PREPARATION 2

1.1 Types of Data 4

Sample and Population Data 4

Cross-Sectional and Time Series Data 5

Structured and Unstructured Data 6

Big Data 7

Data on the Web 9

1.2 Variables and Scales of Measurement 10

The Measurement Scales 11

1.3 Data Preparation 15

Counting and Sorting 15

A Note on Handling Missing Values 20

Subsetting 20

A Note on Subsetting Based on Data Ranges 23

1.4 Writing with Data 26

Conceptual Review 28

PART TWO

Descriptive Statistics

CHAPTER 2

TABULAR AND GRAPHICAL METHODS 32

2.1 Methods to Visualize a Categorical Variable 34

A Frequency Distribution for a Categorical Variable 34

A Bar Chart 35

A Pie Chart 35

Cautionary Comments When Constructing or Interpreting Charts or Graphs 39

2.2 Methods to Visualize the Relationship Between Two Categorical Variables 42

A Contingency Table 42

A Stacked Column Chart 43

2.3 Methods to Visualize a Numerical Variable 48

A Frequency Distribution for a Numerical Variable 48

A Histogram 51

A Polygon 55

An Ogive 56

Using Excel and R to Construct a Polygon and an Ogive 57

2.4 More Data Visualization Methods 61

A Scatterplot 61

A Scatterplot with a Categorical Variable 63

A Line Chart 65

2.5 A Stem-and-Leaf Diagram 68

2.6 Writing with Data 70

Conceptual Review 72

Appendix 2.1: Guidelines for Other Software Packages 76

CHAPTER 3

NUMERICAL DESCRIPTIVE MEASURES 80

3.1 Measures of Central Location 82

The Mean 82

The Median 84

The Mode 84

Using Excel and R to Calculate Measures of Central Location 85

Note on Symmetry 88

Subsetted Means 89

The Weighted Mean 90

3.2 Percentiles and Boxplots 93

A Percentile 93

A Boxplot 94

3.3 The Geometric Mean 97

The Geometric Mean Return 97

Arithmetic Mean versus Geometric Mean 98

The Average Growth Rate 99

3.4 Measures of Dispersion 101

The Range 101

The Mean Absolute Deviation 102

The Variance and the Standard Deviation 103

The Coefficient of Variation 105

3.5 Mean-Variance Analysis and the Sharpe Ratio 107

3.6 Analysis of Relative Location 109

Chebyshev’s Theorem 109

The Empirical Rule 110

z-Scores 111

3.7 Measures of Association 114

3.8 Writing with Data 117

Conceptual Review 119

Appendix 3.1: Guidelines for Other Software Packages 123

PART THREE

Probability and Probability Distributions

CHAPTER 4

INTRODUCTION TO PROBABILITY 124

4.1 Fundamental Probability Concepts 126

Events 126

Assigning Probabilities 129

4.2 Rules of Probability 133

4.3 Contingency Tables and Probabilities 139

A Note on Independence with Empirical Probabilities 141

4.4 The Total Probability Rule and Bayes’ Theorem 144

The Total Probability Rule and Bayes’ Theorem 144

Extensions of the Total Probability Rule and Bayes’ Theorem 146

4.5 Counting Rules 149

4.6 Writing with Data 151

Conceptual Review 154

CHAPTER 5

DISCRETE PROBABILITY DISTRIBUTIONS 160

5.1 Random Variables and Discrete Probability Distributions 162

The Discrete Probability Distribution 162

5.2 Expected Value, Variance, and Standard Deviation 166

Summary Measures 167

Risk Neutrality and Risk Aversion 168

5.3 Portfolio Returns 171

Properties of Random Variables 171

Summary Measures for a Portfolio 172

5.4 The Binomial Distribution 175

Using Excel and R to Obtain Binomial Probabilities 180

5.5 The Poisson Distribution 183

Using Excel and R to Obtain Poisson Probabilities 186

5.6 The Hypergeometric Distribution 189

Using Excel and R to Obtain Hypergeometric Probabilities 191

5.7 Writing with Data 193

Case Study 193

Conceptual Review 195

Appendix 5.1: Guidelines for Other Software Packages 198

CHAPTER 6

CONTINUOUS PROBABILITY DISTRIBUTIONS 200

6.1 Continuous Random Variables and the Uniform Distribution 202

The Continuous Uniform Distribution 202

6.2 The Normal Distribution 206

Characteristics of the Normal Distribution 206

The Standard Normal Distribution 207

Finding a Probability for a Given z Value 208

Finding a z Value for a Given Probability 210

The Transformation of Normal Random Variables 212

Using R for the Normal Distribution 216

A Note on the Normal Approximation of the Binomial Distribution 217

6.3 Other Continuous Probability Distributions 221

The Exponential Distribution 221

Using R for the Exponential Distribution 224

The Lognormal Distribution 224

Using R for the Lognormal Distribution 226

6.4 Writing with Data 229

Conceptual Review 231

Appendix 6.1: Guidelines for Other Software Packages 235

PART FOUR

Basic Inference

CHAPTER 7

SAMPLING AND SAMPLING DISTRIBUTIONS 238

7.1 Sampling 240

Classic Case of a “Bad” Sample: The Literary Digest Debacle of 1936 240

Trump’s Stunning Victory in 2016 241

Sampling Methods 242

Using Excel and R to Generate a Simple Random Sample 244

7.2 The Sampling Distribution of the Sample Mean 245

The Expected Value and the Standard Error of the Sample Mean 246

Sampling from a Normal Population 247

The Central Limit Theorem 248

7.3 The Sampling Distribution of the Sample Proportion 252

The Expected Value and the Standard Error of the Sample Proportion 252

7.4 The Finite Population Correction Factor 257

7.5 Statistical Quality Control 259

Control Charts 260

Using Excel and R to Create a Control Chart 263

7.6 Writing with Data 267

Conceptual Review 269

Appendix 7.1: Derivation of the Mean and the Variance for X̄ and P̄ 273

Appendix 7.2: Properties of Point Estimators 274

Appendix 7.3: Guidelines for Other Software Packages 275

CHAPTER 8

INTERVAL ESTIMATION 278

8.1 Confidence Interval for the Population Mean When σ Is Known 280

Constructing a Confidence Interval for μ When σ Is Known 281

The Width of a Confidence Interval 283

Using Excel and R to Construct a Confidence Interval for μ When σ Is Known 285

8.2 Confidence Interval for the Population Mean When σ Is Unknown 288

The t Distribution 288

Summary of the tdf Distribution 289

Locating tdf Values and Probabilities 289

Constructing a Confidence Interval for μ When σ Is Unknown 291

Using Excel and R to Construct a Confidence Interval for μ When σ Is Unknown 292

8.3 Confidence Interval for the Population Proportion 295

8.4 Selecting the Required Sample Size 298

Selecting n to Estimate μ 299

Selecting n to Estimate p 299

8.5 Writing with Data 302

Conceptual Review 303

Appendix 8.1: Guidelines for Other Software Packages 307

CHAPTER 9

HYPOTHESIS TESTING 310

9.1 Introduction to Hypothesis Testing 312

The Decision to “Reject” or “Not Reject” the Null Hypothesis 312

Defining the Null and the Alternative Hypotheses 313

Type I and Type II Errors 315

9.2 Hypothesis Test for the Population Mean When σ Is Known 318

The p-Value Approach 318

Confidence Intervals and Two-Tailed Hypothesis Tests 322

One Last Remark 323

9.3 Hypothesis Test for the Population Mean When σ Is Unknown 325

Using Excel and R to Test μ When σ Is Unknown 326

9.4 Hypothesis Test for the Population Proportion 330

9.5 Writing with Data 334

Conceptual Review 336

Appendix 9.1: The Critical Value Approach 339

Appendix 9.2: Guidelines for Other Software Packages 342

CHAPTER 10

STATISTICAL INFERENCE CONCERNING TWO POPULATIONS 344

10.1 Inference Concerning the Difference Between Two Means 346

Confidence Interval for μ1 − μ2 346

Hypothesis Test for μ1 − μ2 348

Using Excel and R for Testing Hypotheses about μ1 − μ2 350

A Note on the Assumption of Normality 353

10.2 Inference Concerning Mean Differences 357

Recognizing a Matched-Pairs Experiment 357

Confidence Interval for μD 358

Hypothesis Test for μD 358

Using Excel and R for Testing Hypotheses about μD 361

One Last Note on the Matched-Pairs Experiment 362

10.3 Inference Concerning the Difference Between Two Proportions 366

Confidence Interval for p1 − p2 366

Hypothesis Test for p1 − p2 367

10.4 Writing with Data 372

Conceptual Review 374

Appendix 10.1: Guidelines for Other Software Packages 377

CHAPTER 11

STATISTICAL INFERENCE CONCERNING VARIANCE 380

11.1 Inference Concerning the Population Variance 382

Sampling Distribution of S² 382

Finding χ²df Values and Probabilities 383

Confidence Interval for the Population Variance 385

Hypothesis Test for the Population Variance 386

Note on Calculating the p-Value for a Two-Tailed Test Concerning σ² 387

Using Excel and R to Test σ² 387

11.2 Inference Concerning the Ratio of Two Population Variances 391

Sampling Distribution of S₁² ∕ S₂² 391

Finding F(df1, df2) Values and Probabilities 392

Confidence Interval for the Ratio of Two Population Variances 394

Hypothesis Test for the Ratio of Two Population Variances 395

Using Excel and R to Test σ₁² ∕ σ₂² 397

11.3 Writing with Data 401

Conceptual Review 403

Appendix 11.1: Guidelines for Other Software Packages 405

CHAPTER 12

CHI-SQUARE TESTS 408

12.1 Goodness-of-Fit Test for a Multinomial Experiment 410

Using R to Conduct a Goodness-of-Fit Test 414

12.2 Chi-Square Test for Independence 416

Calculating Expected Frequencies 417

Using R to Conduct a Test for Independence 421

12.3 Chi-Square Tests for Normality 423

The Goodness-of-Fit Test for Normality 423

The Jarque-Bera Test 426

12.4 Writing with Data 429

Conceptual Review 431

Appendix 12.1: Guidelines for Other Software Packages 435

PART FIVE

CHAPTER 13

ANALYSIS OF VARIANCE 438

13.1 One-Way ANOVA Test 440

Between-Treatments Estimate of σ²: MSTR 441

Within-Treatments Estimate of σ²: MSE 442

The One-Way ANOVA Table 444

Using Excel and R to Construct a One-Way ANOVA Table 444

13.2 Multiple Comparison Methods 449

Fisher’s Least Significant Difference (LSD) Method 449

Tukey’s Honestly Significant Difference (HSD) Method 450

Using R to Construct Tukey Confidence Intervals for μ1 − μ2 452

13.3 Two-Way ANOVA Test: No Interaction 456

The Sum of Squares for Factor A, SSA 458

The Sum of Squares for Factor B, SSB 459

The Error Sum of Squares, SSE 459

Using Excel and R for a Two-Way ANOVA Test: No Interaction 460

13.4 Two-Way ANOVA Test: With Interaction 465

The Total Sum of Squares, SST 466

The Sum of Squares for Factor A, SSA, and the Sum of Squares for Factor B, SSB 466

The Sum of Squares for the Interaction of Factor A and Factor B, SSAB 467

The Error Sum of Squares, SSE 468

Using Excel and R for a Two-Way ANOVA Test: With Interaction 468

13.5 Writing with Data 472

Conceptual Review 474

Appendix 13.1: Guidelines for Other Software Packages 479

CHAPTER 14

REGRESSION ANALYSIS 482

14.1 Hypothesis Test for the Correlation Coefficient 484

Testing the Correlation Coefficient ρxy 485

Using Excel and R to Conduct a Hypothesis Test for ρxy 485

14.2 The Linear Regression Model 488

The Simple Linear Regression Model 489

The Multiple Linear Regression Model 493

Using Excel and R to Estimate a Linear Regression Model 494

14.3 Goodness-of-Fit Measures 500

The Standard Error of the Estimate 501

The Coefficient of Determination, R² 502

A Cautionary Note Concerning Goodness-of-Fit Measures 505

14.4 Writing with Data 507

Conceptual Review 509

Appendix 14.1: Guidelines for Other Software Packages 512

CHAPTER 15

INFERENCE WITH REGRESSION MODELS 514

15.1 Tests of Significance 516

Test of Joint Significance 516

Test of Individual Significance 518

Using a Confidence Interval to Determine Individual Significance 520

A Test for a Nonzero Slope Coefficient 521

Reporting Regression Results 523

15.2 A General Test of Linear Restrictions 527

Using R to Conduct Partial F Tests 530

15.3 Interval Estimates for the Response Variable 532

Using R to Find Interval Estimates for the Response Variable 535

15.4 Model Assumptions and Common Violations 537

Residual Plots 537

Assumption 1. 538

Detecting Nonlinearities 538

Remedy 539

Assumption 2. 539

Detecting Multicollinearity 540

Remedy 541

Assumption 3. 541

Detecting Changing Variability 541

Remedy 542

Assumption 4. 543

Detecting Correlated Observations 543

Remedy 544

Assumption 5. 544

Remedy 544

Assumption 6. 545

Summary of Regression Modeling 545

Using Excel and R for Residual Plots, and R for Robust Standard Errors 545

15.5 Writing with Data 548

Conceptual Review 550

Appendix 15.1: Guidelines for Other Software Packages 553

CHAPTER 16

REGRESSION MODELS FOR NONLINEAR RELATIONSHIPS 556

16.1 Polynomial Regression Models 558

Using R to Estimate a Quadratic Regression Model 563

The Cubic Regression Model 564

16.2 Regression Models with Logarithms 567

A Log-Log Model 568

The Logarithmic Model 570

The Exponential Model 571

Using R to Estimate Log-Transformed Models 575

Comparing Linear and Log-Transformed Models 575

Using Excel and R to Compare Linear and Log-Transformed Models 576

A Cautionary Note Concerning Goodness-of-Fit Measures 577

16.3 Writing with Data 581

Conceptual Review 583

Appendix 16.1: Guidelines for Other Software Packages 585

CHAPTER 17

REGRESSION MODELS WITH DUMMY VARIABLES 588

17.1 Dummy Variables 590

A Categorical Explanatory Variable with Two Categories 590

Using Excel and R to Make Dummy Variables 592

Assessing Dummy Variable Models 592

A Categorical Explanatory Variable with Multiple Categories 593

17.2 Interactions with Dummy Variables 599

Using R to Estimate a Regression Model with a Dummy Variable and an Interaction Variable 602

17.3 The Linear Probability Model and the Logistic Regression Models 605

The Linear Probability Model 605

The Logistic Regression Model 606

Using R to Estimate a Logistic Regression Model 609

Accuracy of Binary Choice Models 609

Using R to Find the Accuracy Rate 611

17.4 Writing with Data 613

Conceptual Review 616

Appendix 17.1: Guidelines for Other Software Packages 620

PART SIX

Supplementary Topics

CHAPTER 18

FORECASTING WITH TIME SERIES DATA 622

18.1 The Forecasting Process for Time Series 624

Forecasting Methods 625

Model Selection Criteria 625

18.2 Simple Smoothing Techniques 626

The Moving Average Technique 627

The Simple Exponential Smoothing Technique 629

Using R for Exponential Smoothing 631

18.3 Linear Regression Models for Trend and Seasonality 633

The Linear Trend Model 633

The Linear Trend Model with Seasonality 635

Estimating a Linear Trend Model with Seasonality with R 637

A Note on Causal Models for Forecasting 637

18.4 Nonlinear Regression Models for Trend and Seasonality 639

The Exponential Trend Model 639

Using R to Forecast with an Exponential Trend Model 641

The Polynomial Trend Model 642

Nonlinear Trend Models with Seasonality 643

Using R to Forecast a Quadratic Trend Model with Seasons 645

18.5 Causal Forecasting Methods 647

Lagged Regression Models 648

Using R to Estimate Lagged Regression Models 650

18.6 Writing with Data 651

Conceptual Review 653

Appendix 18.1: Guidelines for Other Software Packages 656

CHAPTER 19

RETURNS, INDEX NUMBERS, AND INFLATION 658

19.1 Investment Return 660

Nominal versus Real Rates of Return 662

19.2 Index Numbers 664

A Simple Price Index 664

An Unweighted Aggregate Price Index 666

A Weighted Aggregate Price Index 667

19.3 Using Price Indices to Deflate a Time Series 672

Inflation Rate 674

19.4 Writing with Data 676

Conceptual Review 678

CHAPTER 20

NONPARAMETRIC TESTS 682

20.1 Testing a Population Median 684

The Wilcoxon Signed-Rank Test for a Population Median 684

Using a Normal Distribution Approximation for T 687

Using R to Test a Population Median 688

20.2 Testing Two Population Medians 690

The Wilcoxon Signed-Rank Test for a Matched-Pairs Sample 690

Using R to Test for Median Differences from a Matched-Pairs Sample 691

The Wilcoxon Rank-Sum Test for Independent Samples 691

Using R to Test for Median Differences from Independent Samples 694

Using a Normal Distribution Approximation for W 694

20.3 Testing Three or More Population Medians 697

The Kruskal-Wallis Test for Population Medians 697

Using R to Conduct a Kruskal-Wallis Test 699

20.4 The Spearman Rank Correlation Test 700

Using R to Conduct the Spearman Rank Correlation Test 702

Summary of Parametric and Nonparametric Tests 703

20.5 The Sign Test 705

20.6 Tests Based on Runs 709

The Method of Runs Above and Below the Median 710

Using R to Conduct the Runs Test 711

20.7 Writing with Data 713

Conceptual Review 715

Appendix 20.1: Guidelines for Other Software Packages 718

APPENDIXES

APPENDIX A Getting Started with R 721

APPENDIX B Tables 727

APPENDIX C Answers to Selected Even-Numbered Exercises 739

Glossary 755

Index 763
