# Business Analytics, Fifth Edition

Jeffrey D. Camm, Michael J. Fry, James J. Cochran & Jeffrey W. Ohlmann

Contents

Preface xxi

Chapter 1 Introduction to Business Analytics 1

1.1 Decision Making 3

1.3 A Categorization of Analytical Methods and Models 4

Descriptive Analytics 4

Predictive Analytics 5

Prescriptive Analytics 5

1.4 Big Data, the Cloud, and Artificial Intelligence 6

Volume 6

Velocity 6

Variety 7

Veracity 7

1.5 Business Analytics in Practice 9

Accounting Analytics 9

Financial Analytics 10

Human Resource (HR) Analytics 10

Marketing Analytics 10

Health Care Analytics 11

Supply Chain Analytics 11

Analytics for Government and Nonprofits 11

Sports Analytics 12

Web Analytics 12

1.6 Legal and Ethical Issues in the Use of Data and Analytics 12

Summary 15

Glossary 15

Problems 16

Available in the Cengage eBook:

Appendix: Getting Started with R and RStudio

Appendix: Basic Data Manipulation with R

Chapter 2 Descriptive Statistics 21

2.1 Overview of Using Data: Definitions and Goals 23

2.2 Types of Data 24

Population and Sample Data 24

Quantitative and Categorical Data 24

Cross-Sectional and Time Series Data 24

Sources of Data 25

2.3 Exploring Data in Excel 27

Sorting and Filtering Data in Excel 27

Conditional Formatting of Data in Excel 30

2.4 Creating Distributions from Data 32

Frequency Distributions for Categorical Data 32

Relative Frequency and Percent Frequency Distributions 33

Frequency Distributions for Quantitative Data 35

Histograms 37

Frequency Polygons 42

Cumulative Distributions 43

2.5 Measures of Location 46

Mean (Arithmetic Mean) 46

Median 47

Mode 48

Geometric Mean 48

2.6 Measures of Variability 51

Range 51

Variance 52

Standard Deviation 53

Coefficient of Variation 54

2.7 Analyzing Distributions 54

Percentiles 55

Quartiles 56

z-Scores 56

Empirical Rule 57

Identifying Outliers 59

Boxplots 59

2.8 Measures of Association Between Two Variables 62

Scatter Charts 62

Covariance 64

Correlation Coefficient 67

Summary 68

Glossary 69

Problems 70

Case Problem 1: Heavenly Chocolates Web Site Transactions 80

Case Problem 2: African Elephant Populations 81

Available in the Cengage eBook:

Appendix: Descriptive Statistics with R

Chapter 3 Data Visualization 83

3.1 Overview of Data Visualization 86

Preattentive Attributes 86

Data-Ink Ratio 89

3.2 Tables 92

Table Design Principles 93

Crosstabulation 94

PivotTables in Excel 97

3.3 Charts 101

Scatter Charts 101

Recommended Charts in Excel 103

Line Charts 104

Bar Charts and Column Charts 108

A Note on Pie Charts and Three-Dimensional Charts 112

Additional Visualizations for Multiple Variables: Bubble Chart, Scatter Chart Matrix, and Table Lens 112

PivotCharts in Excel 117

3.4 Specialized Data Visualizations 120

Heat Maps 120

Treemaps 121

Waterfall Charts 122

Stock Charts 124

Parallel-Coordinates Chart 126

3.5 Visualizing Geospatial Data 126

Choropleth Maps 127

Cartograms 129

3.6 Data Dashboards 131

Principles of Effective Data Dashboards 131

Applications of Data Dashboards 132

Summary 134

Glossary 134

Problems 136

Case Problem 1: Pelican Stores 149

Case Problem 2: Movie Theater Releases 150

Available in the Cengage eBook:

Appendix: Creating Tabular and Graphical Presentations with R

Appendix: Data Visualization with Tableau

Chapter 4 Data Wrangling: Data Management and Data Cleaning Strategies 151

4.1 Discovery 153

Accessing Data 153

The Format of the Raw Data 157

4.2 Structuring 158

Data Formatting 159

Arrangement of Data 159

Splitting a Single Field into Multiple Fields 161

Combining Multiple Fields into a Single Field 165

4.3 Cleaning 167

Missing Data 167

Identification of Erroneous Outliers, Other Erroneous Values, and Duplicate Records 170

4.4 Enriching 176

Subsetting Data 177

Supplementing Data 179

Enhancing Data 182

4.5 Validating and Publishing 186

Validating 186

Publishing 188

Summary 188

Glossary 189

Problems 190

Case Problem 1: Usman Solutions 197

Available in the Cengage eBook:

Appendix: Importing Delimited Files into R

Appendix: Working with Records in R

Appendix: Working with Fields in R

Appendix: Unstacking and Stacking Data in R

Chapter 5 Probability: An Introduction to Modeling Uncertainty 199

5.1 Events and Probabilities 201

5.2 Some Basic Relationships of Probability 202

Complement of an Event 202

5.3 Conditional Probability 205

Independent Events 210

Multiplication Law 210

Bayes’ Theorem 211

5.4 Random Variables 213

Discrete Random Variables 213

Continuous Random Variables 214

5.5 Discrete Probability Distributions 215

Custom Discrete Probability Distribution 215

Expected Value and Variance 217

Discrete Uniform Probability Distribution 220

Binomial Probability Distribution 221

Poisson Probability Distribution 224

5.6 Continuous Probability Distributions 227

Uniform Probability Distribution 227

Triangular Probability Distribution 229

Normal Probability Distribution 231

Exponential Probability Distribution 236

Summary 240

Glossary 240

Problems 242

Case Problem 1: Hamilton County Judges 254

Case Problem 2: McNeil’s Auto Mall 255

Case Problem 3: Gebhardt Electronics 256

Available in the Cengage eBook:

Appendix: Discrete Probability Distributions with R

Appendix: Continuous Probability Distributions with R

Chapter 6 Descriptive Data Mining 257

6.1 Dimension Reduction 259

Geometric Interpretation of Principal Component Analysis 259

Summarizing Protein Consumption for Maillard Riposte 262

6.2 Cluster Analysis 266

Measuring Distance Between Observations Consisting of Quantitative Variables 267

Measuring Distance Between Observations Consisting of Categorical Variables 269

k-Means Clustering 271

Hierarchical Clustering and Measuring Dissimilarity Between Clusters 275

Hierarchical Clustering versus k-Means Clustering 283

6.3 Association Rules 284

Evaluating Association Rules 286

6.4 Text Mining 287

Voice of the Customer at Triad Airlines 288

Preprocessing Text Data for Analysis 289

Movie Reviews 290

Computing Dissimilarity Between Documents 293

Word Clouds 294

Summary 295

Glossary 296

Problems 298

Case Problem 1: Big Ten Expansion 315

Case Problem 2: Know Thy Customer 316

Available in the Cengage eBook:

Appendix: Principal Component Analysis with R

Appendix: k-Means Clustering with R

Appendix: Hierarchical Clustering with R

Appendix: Association Rules with R

Appendix: Text Mining with R

Appendix: Principal Component Analysis with Orange

Appendix: k-Means Clustering with Orange

Appendix: Hierarchical Clustering with Orange

Appendix: Association Rules with Orange

Appendix: Text Mining with Orange

Chapter 7 Statistical Inference 319

7.1 Selecting a Sample 322

Sampling from a Finite Population 322

Sampling from an Infinite Population 323

7.2 Point Estimation 326

7.3 Sampling Distributions 328

Sampling Distribution of x̄ 331

Sampling Distribution of p̄ 336

7.4 Interval Estimation 339

Interval Estimation of the Population Mean 339

Interval Estimation of the Population Proportion 346

7.5 Hypothesis Tests 349

Developing Null and Alternative Hypotheses 349

Type I and Type II Errors 352

Hypothesis Test of the Population Mean 353

Hypothesis Test of the Population Proportion 364

7.6 Big Data, Statistical Inference, and Practical Significance 367

Sampling Error 367

Nonsampling Error 368

Big Data 369

Understanding What Big Data Is 370

Big Data and Sampling Error 371

Big Data and the Precision of Confidence Intervals 372

Implications of Big Data for Confidence Intervals 373

Big Data, Hypothesis Testing, and p Values 374

Implications of Big Data in Hypothesis Testing 376

Summary 376

Glossary 377

Problems 380

Case Problem 1: Young Professional Magazine 390

Case Problem 2: Quality Associates, Inc. 391

Available in the Cengage eBook:

Appendix: Random Sampling with R

Appendix: Interval Estimation with R

Appendix: Hypothesis Testing with R

Chapter 8 Linear Regression 393

8.1 Simple Linear Regression Model 395

Estimated Simple Linear Regression Equation 395

8.2 Least Squares Method 397

Least Squares Estimates of the Simple Linear Regression Parameters 399

Using Excel’s Chart Tools to Compute the Estimated Simple Linear Regression Equation 401

8.3 Assessing the Fit of the Simple Linear Regression Model 403

The Sums of Squares 403

The Coefficient of Determination 405

Using Excel’s Chart Tools to Compute the Coefficient of Determination 406

8.4 The Multiple Linear Regression Model 407

Estimated Multiple Linear Regression Equation 407

Least Squares Method and Multiple Linear Regression 408

Butler Trucking Company and Multiple Linear Regression 408

Using Excel’s Regression Tool to Develop the Estimated Multiple Linear Regression Equation 409

8.5 Inference and Linear Regression 412

Conditions Necessary for Valid Inference in the Least Squares Linear Regression Model 413

Testing Individual Linear Regression Parameters 417

Multicollinearity 421

8.6 Categorical Independent Variables 424

Butler Trucking Company and Rush Hour 424

Interpreting the Parameters 426

More Complex Categorical Variables 427

8.7 Modeling Nonlinear Relationships 429

Piecewise Linear Regression Models 434

Interaction Between Independent Variables 436

8.8 Model Fitting 441

Variable Selection Procedures 441

Overfitting 442

8.9 Big Data and Linear Regression 443

Inference and Very Large Samples 443

Model Selection 446

8.10 Prediction with Linear Regression 447

Summary 450

Glossary 450

Problems 452

Case Problem 1: Alumni Giving 466

Case Problem 2: Consumer Research, Inc. 468

Case Problem 3: Predicting Winnings for NASCAR Drivers 469

Available in the Cengage eBook:

Appendix: Simple Linear Regression with R

Appendix: Multiple Linear Regression with R

Appendix: Linear Regression Variable Selection Procedures with R

Chapter 9 Time Series Analysis and Forecasting 471

9.1 Time Series Patterns 474

Horizontal Pattern 474

Trend Pattern 476

Seasonal Pattern 477

Trend and Seasonal Pattern 478

Cyclical Pattern 481

Identifying Time Series Patterns 481

9.2 Forecast Accuracy 481

9.3 Moving Averages and Exponential Smoothing 485

Moving Averages 486

Exponential Smoothing 490

9.4 Using Linear Regression Analysis for Forecasting 494

Linear Trend Projection 494

Seasonality Without Trend 496

Seasonality with Trend 497

Using Linear Regression Analysis as a Causal Forecasting Method 500

Combining Causal Variables with Trend and Seasonality Effects 503

Considerations in Using Linear Regression in Forecasting 504

9.5 Determining the Best Forecasting Model to Use 504

Summary 505

Glossary 505

Problems 506

Case Problem 1: Forecasting Food and Beverage Sales 515

Case Problem 2: Forecasting Lost Sales 515

Appendix 9.1: Using the Excel Forecast Sheet 517

Available in the Cengage eBook:

Appendix: Forecasting with R

Chapter 10 Predictive Data Mining: Regression Tasks 523

10.1 Regression Performance Measures 524

10.2 Data Sampling, Preparation, and Partitioning 526

Static Holdout Method 526

k-Fold Cross-Validation 530

10.3 k-Nearest Neighbors Regression 535

10.4 Regression Trees 538

Constructing a Regression Tree 538

Generating Predictions with a Regression Tree 541

Ensemble Methods 543

10.5 Neural Network Regression 548

Structure of a Neural Network 548

How a Neural Network Learns 552

10.6 Feature Selection 555

Wrapper Methods 556

Filter Methods 556

Embedded Methods 557

Summary 558

Glossary 558

Problems 560

Case Problem: Housing Bubble 568

Available in the Cengage eBook:

Appendix: k-Nearest Neighbors Regression with R

Appendix: Individual Regression Trees with R

Appendix: Random Forests of Regression Trees with R

Appendix: Neural Network Regression with R

Appendix: Regularized Linear Regression with R

Appendix: k-Nearest Neighbors Regression with Orange

Appendix: Individual Regression Trees with Orange

Appendix: Random Forests of Regression Trees with Orange

Appendix: Neural Network Regression with Orange

Appendix: Regularized Linear Regression with Orange

Chapter 11 Predictive Data Mining: Classification Tasks 571

11.1 Data Sampling, Preparation, and Partitioning 573

Static Holdout Method 573

k-Fold Cross-Validation 574

Class Imbalanced Data 574

11.2 Performance Measures for Binary Classification 576

11.3 Classification with Logistic Regression 582

11.4 k-Nearest Neighbors Classification 587

11.5 Classification Trees 591

Constructing a Classification Tree 591

Generating Predictions with a Classification Tree 593

Ensemble Methods 594

11.6 Neural Network Classification 600

Structure of a Neural Network 601

How a Neural Network Learns 605

11.7 Feature Selection 609

Wrapper Methods 609

Filter Methods 610

Embedded Methods 610

Summary 612

Glossary 612

Problems 615

Case Problem: Grey Code Corporation 630

Available in the Cengage eBook:

Appendix: Classification via Logistic Regression with R

Appendix: k-Nearest Neighbors Classification with R

Appendix: Individual Classification Trees with R

Appendix: Random Forests of Classification Trees with R

Appendix: Neural Network Classification with R

Appendix: Classification via Logistic Regression with Orange

Appendix: k-Nearest Neighbors Classification with Orange

Appendix: Individual Classification Trees with Orange

Appendix: Random Forests of Classification Trees with Orange

Appendix: Neural Network Classification with Orange

12.1 Building Good Spreadsheet Models 635

Influence Diagrams 635

Building a Mathematical Model 635

Spreadsheet Design and Implementing the Model in a

12.2 What-If Analysis 640

Data Tables 640

Goal Seek 642

Scenario Manager 644

12.3 Some Useful Excel Functions for Modeling 649

SUM and SUMPRODUCT 650

IF and COUNTIF 651

XLOOKUP 654

Trace Precedents and Dependents 656

Show Formulas 656

Evaluate Formulas 658

Error Checking 658

Watch Window 659

12.5 Predictive and Prescriptive Spreadsheet Models 660

Summary 661

Glossary 661

Problems 662

Case Problem: Retirement Plan 670

Chapter 13 Monte Carlo Simulation 671

13.1 Risk Analysis for Sanotronics LLC 673

Base-Case Scenario 673

Worst-Case Scenario 674

Best-Case Scenario 674

Use of Probability Distributions to Represent Random Variables 676

Generating Values for Random Variables with Excel 677

Executing Simulation Trials with Excel 681

Measuring and Analyzing Simulation Output 682

13.2 Inventory Policy Analysis for Promus Corp 686

Generating Values for Promus Corp’s Demand 688

Executing Simulation Trials and Analyzing Output 691

13.3 Simulation Modeling for Land Shark Inc. 693

Spreadsheet Model for Land Shark 694

Generating Values for Land Shark’s Random Variables 696

Executing Simulation Trials and Analyzing Output 698

Generating Bid Amounts with Fitted Distributions 700

13.4 Simulation with Dependent Random Variables 709

Spreadsheet Model for Press Teag Worldwide 709

13.5 Simulation Considerations 714

Verification and Validation 714

Summary 715

Summary of Steps for Conducting a Simulation Analysis 715

Glossary 716

Problems 717

Case Problem 1: Four Corners 731

Case Problem 2: Ginsberg’s Jewelry Snowfall Promotion 732

Appendix 13.1: Common Probability Distributions for Simulation 734

Chapter 14 Linear Optimization Models 743

14.1 A Simple Maximization Problem 745

Problem Formulation 746

Mathematical Model for the Par, Inc. Problem 748

14.2 Solving the Par, Inc. Problem 749

The Geometry of the Par, Inc. Problem 749

Solving Linear Programs with Excel Solver 751

14.3 A Simple Minimization Problem 755

Problem Formulation 755

Solution for the M&D Chemicals Problem 755

14.4 Special Cases of Linear Program Outcomes 757

Alternative Optimal Solutions 758

Infeasibility 759

Unbounded 760

14.5 Sensitivity Analysis 762

Interpreting Excel Solver Sensitivity Report 762

14.6 General Linear Programming Notation and More Examples 764

Investment Portfolio Selection 765

Transportation Planning 768

Assigning Project Leaders to Clients 776

Diet Planning 779

14.7 Generating an Alternative Optimal Solution for a Linear Program 782

Summary 783

Glossary 784

Problems 785

Case Problem 1: Investment Strategy 801

Case Problem 2: Solutions Plus 802

Available in the Cengage eBook:

Appendix: Linear Programming with R

Chapter 15 Integer Linear Optimization Models 805

15.1 Types of Integer Linear Optimization Models 806

15.2 Eastborne Realty, an Example of Integer Optimization 807

The Geometry of Linear All-Integer Optimization 808

15.3 Solving Integer Optimization Problems with Excel Solver 810

A Cautionary Note About Sensitivity Analysis 813

15.4 Applications Involving Binary Variables 815

Capital Budgeting 815

Fixed Cost 816

Bank Location 820

Product Design and Market Share Optimization 822

15.5 Modeling Flexibility Provided by Binary Variables 825

Multiple-Choice and Mutually Exclusive Constraints 825

k Out of n Alternatives Constraint 826

Conditional and Corequisite Constraints 826

15.6 Generating Alternatives in Binary Optimization 827

Summary 829

Glossary 830

Problems 830

Case Problem 1: Applecore Children’s Clothing 845

Case Problem 2: Yeager National Bank 847

Available in the Cengage eBook:

Appendix: Integer Programming with R

Chapter 16 Nonlinear Optimization Models 849

16.1 A Production Application: Par, Inc. Revisited 850

An Unconstrained Problem 850

A Constrained Problem 851

Solving Nonlinear Optimization Models Using Excel Solver 853

Sensitivity Analysis and Shadow Prices in Nonlinear Models 855

16.2 Local and Global Optima 856

Overcoming Local Optima with Excel Solver 858

16.3 A Location Problem 860

16.4 Markowitz Portfolio Model 861

16.5 Adoption of a New Product: The Bass Forecasting Model 866

16.6 Heuristic Optimization Using Excel’s Evolutionary Method 869

Summary 877

Glossary 877

Problems 878

Case Problem: Portfolio Optimization with Transaction Costs 889

Available in the Cengage eBook:

Appendix: Nonlinear Programming with R

Chapter 17 Decision Analysis 893

17.1 Problem Formulation 895

Payoff Tables 896

Decision Trees 896

17.2 Decision Analysis Without Probabilities 897

Optimistic Approach 897

Conservative Approach 898

Minimax Regret Approach 898

17.3 Decision Analysis with Probabilities 900

Expected Value Approach 900

Risk Analysis 902

Sensitivity Analysis 903

17.4 Decision Analysis with Sample Information 904

Expected Value of Sample Information 909

Expected Value of Perfect Information 909

17.5 Computing Branch Probabilities with Bayes’ Theorem 910

17.6 Utility Theory 913

Utility and Decision Analysis 914

Utility Functions 918

Exponential Utility Function 921

Summary 923

Glossary 923

Problems 925

Case Problem 1: Property Purchase Strategy 939

Case Problem 2: Semiconductor Fabrication at Axeon Labs 941

Multi-Chapter Case Problems

Capital State University Game-Day Magazines 943

Hanover Inc. 945

Appendix A Basics of Excel 947

Appendix B Database Basics with Microsoft Access 959

Appendix C Solutions to Even-Numbered Problems (Cengage eBook)

Appendix D Microsoft Excel Online and Tools for Statistical Analysis (Cengage eBook)

References Available in the Cengage eBook

Index 997
