# Applied Linear Regression Models, 4th Edition PDF by Michael H Kutner, Christopher J Nachtsheim and John Neter

## Applied Linear Regression Models, Fourth Edition

By Michael H Kutner, Christopher J Nachtsheim and John Neter

Contents:

PART ONE

SIMPLE LINEAR REGRESSION 1

Chapter 1

Linear Regression with One Predictor Variable 2

1.1 Relations between Variables 2

Functional Relation between Two Variables 2

Statistical Relation between Two Variables 3

1.2 Regression Models and Their Uses 5

Historical Origins 5

Basic Concepts 5

Construction of Regression Models 7

Uses of Regression Analysis 8

Regression and Causality 8

Use of Computers 9

1.3 Simple Linear Regression Model with Distribution of Error Terms Unspecified 9

Formal Statement of Model 9

Important Features of Model 9

Meaning of Regression Parameters 11

Alternative Versions of Regression Model 12

1.4 Data for Regression Analysis 12

Observational Data 12

Experimental Data 13

Completely Randomized Design 13

1.5 Overview of Steps in Regression Analysis 13

1.6 Estimation of Regression Function 15

Method of Least Squares 15

Point Estimation of Mean Response 21

Residuals 22

Properties of Fitted Regression Line 23

1.7 Estimation of Error Terms Variance σ² 24

Point Estimator of σ² 24

1.8 Normal Error Regression Model 26

Model 26

Estimation of Parameters by Method of Maximum Likelihood 27

Cited References 33

Problems 33

Exercises 37

Projects 38

Chapter 2

Inferences in Regression and Correlation Analysis 40

2.1 Inferences Concerning β1 40

Sampling Distribution of b1 41

Sampling Distribution of (b1 − β1)/s{b1} 44

Confidence Interval for β1 45

Tests Concerning β1 47

2.2 Inferences Concerning β0 48

Sampling Distribution of b0 48

Sampling Distribution of (b0 − β0)/s{b0} 49

Confidence Interval for β0 49

2.3 Some Considerations on Making Inferences Concerning β0 and β1 50

Effects of Departures from Normality 50

Interpretation of Confidence Coefficient and Risks of Errors 50

Spacing of the X Levels 50

Power of Tests 50

2.4 Interval Estimation of E{Yh} 52

Sampling Distribution of Ŷh 52

Sampling Distribution of (Ŷh − E{Yh})/s{Ŷh} 54

Confidence Interval for E{Yh} 54

2.5 Prediction of New Observation 55

Prediction Interval for Yh(new) when Parameters Known 56

Prediction Interval for Yh(new) when Parameters Unknown 57

Prediction of Mean of m New Observations for Given Xh 60

2.6 Confidence Band for Regression Line 61

2.7 Analysis of Variance Approach to Regression Analysis 63

Partitioning of Total Sum of Squares 63

Breakdown of Degrees of Freedom 66

Mean Squares 66

Analysis of Variance Table 67

Expected Mean Squares 68

F Test of β1 = 0 versus β1 ≠ 0 69

2.8 General Linear Test Approach 72

Full Model 72

Reduced Model 72

Test Statistic 73

Summary 73

2.9 Descriptive Measures of Linear Association between X and Y 74

Coefficient of Determination 74

Limitations of R² 75

Coefficient of Correlation 76

2.10 Considerations in Applying Regression Analysis 77

2.11 Normal Correlation Models 78

Distinction between Regression and Correlation Model 78

Bivariate Normal Distribution 78

Conditional Inferences 80

Inferences on Correlation Coefficients 83

Spearman Rank Correlation Coefficient 87

Cited References 89

Problems 89

Exercises 97

Projects 98

Chapter 3

Diagnostics and Remedial Measures 100

3.1 Diagnostics for Predictor Variable 100

3.2 Residuals 102

Properties of Residuals 102

Semistudentized Residuals 103

Departures from Model to Be Studied by Residuals 103

3.3 Diagnostics for Residuals 103

Nonlinearity of Regression Function 104

Nonconstancy of Error Variance 107

Presence of Outliers 108

Nonindependence of Error Terms 108

Nonnormality of Error Terms 110

Omission of Important Predictor Variables 112

3.4 Overview of Tests Involving Residuals 114

Tests for Randomness 114

Tests for Constancy of Variance 115

Tests for Outliers 115

Tests for Normality 115

3.5 Correlation Test for Normality 115

3.6 Tests for Constancy of Error Variance 116

Brown-Forsythe Test 116

Breusch-Pagan Test 118

3.7 F Test for Lack of Fit 119

Assumptions 119

Notation 121

Full Model 121

Reduced Model 123

Test Statistic 123

ANOVA Table 124

3.8 Overview of Remedial Measures 127

Nonlinearity of Regression Function 128

Nonconstancy of Error Variance 128

Nonindependence of Error Terms 128

Nonnormality of Error Terms 128

Omission of Important Predictor Variables 129

Outlying Observations 129

3.9 Transformations 129

Transformations for Nonlinear Relation Only 129

Transformations for Nonnormality and Unequal Error Variances 132

Box-Cox Transformations 134

3.10 Exploration of Shape of Regression Function 137

Lowess Method 138

Use of Smoothed Curves to Confirm Fitted Regression Function 139

3.11 Case Example—Plutonium Measurement 141

Cited References 146

Problems 146

Exercises 151

Projects 152

Case Studies 153

Chapter 4

Simultaneous Inferences and Other Topics in Regression Analysis 154

4.1 Joint Estimation of β0 and β1 154

Need for Joint Estimation 154

Bonferroni Joint Confidence Intervals 155

4.2 Simultaneous Estimation of Mean Responses 157

Working-Hotelling Procedure 158

Bonferroni Procedure 159

4.3 Simultaneous Prediction Intervals for New Observations 160

4.4 Regression through Origin 161

Model 161

Inferences 161

Important Cautions for Using Regression through Origin 164

4.5 Effects of Measurement Errors 165

Measurement Errors in Y 165

Measurement Errors in X 165

Berkson Model 167

4.6 Inverse Predictions 168

4.7 Choice of X Levels 170

Cited References 172

Problems 172

Exercises 175

Projects 175

Chapter 5

Matrix Approach to Simple Linear Regression Analysis 176

5.1 Matrices 176

Definition of Matrix 176

Square Matrix 178

Vector 178

Transpose 178

Equality of Matrices 179

5.2 Matrix Addition and Subtraction 180

5.3 Matrix Multiplication 182

Multiplication of a Matrix by a Scalar 182

Multiplication of a Matrix by a Matrix 182

5.4 Special Types of Matrices 185

Symmetric Matrix 185

Diagonal Matrix 185

Vector and Matrix with All Elements Unity 187

Zero Vector 187

5.5 Linear Dependence and Rank of Matrix 188

Linear Dependence 188

Rank of Matrix 188

5.6 Inverse of a Matrix 189

Finding the Inverse 190

Uses of Inverse Matrix 192

5.7 Some Basic Results for Matrices 193

5.8 Random Vectors and Matrices 193

Expectation of Random Vector or Matrix 193

Variance-Covariance Matrix of Random Vector 194

Some Basic Results 196

Multivariate Normal Distribution 196

5.9 Simple Linear Regression Model in Matrix Terms 197

5.10 Least Squares Estimation of Regression Parameters 199

Normal Equations 199

Estimated Regression Coefficients 200

5.11 Fitted Values and Residuals 202

Fitted Values 202

Residuals 203

5.12 Analysis of Variance Results 204

Sums of Squares 204

Sums of Squares as Quadratic Forms 205

5.13 Inferences in Regression Analysis 206

Regression Coefficients 207

Mean Response 208

Prediction of New Observation 209

Cited Reference 209

Problems 209

Exercises 212

PART TWO

MULTIPLE LINEAR REGRESSION 213

Chapter 6

Multiple Regression I 214

6.1 Multiple Regression Models 214

Need for Several Predictor Variables 214

First-Order Model with Two Predictor Variables 215

First-Order Model with More than Two Predictor Variables 217

General Linear Regression Model 217

6.2 General Linear Regression Model in Matrix Terms 222

6.3 Estimation of Regression Coefficients 223

6.4 Fitted Values and Residuals 224

6.5 Analysis of Variance Results 225

Sums of Squares and Mean Squares 225

F Test for Regression Relation 226

Coefficient of Multiple Determination 226

Coefficient of Multiple Correlation 227

6.6 Inferences about Regression Parameters 227

Interval Estimation of βk 228

Tests for βk 228

Joint Inferences 228

6.7 Estimation of Mean Response and Prediction of New Observation 229

Interval Estimation of E{Yh} 229

Confidence Region for Regression Surface 229

Simultaneous Confidence Intervals for Several Mean Responses 230

Prediction of New Observation Yh(new) 230

Prediction of Mean of m New Observations at Xh 230

Predictions of g New Observations 231

6.8 Diagnostics and Remedial Measures 232

Scatter Plot Matrix 232

Three-Dimensional Scatter Plots 233

Correlation Test for Normality 234

Brown-Forsythe Test for Constancy of Error Variance 234

Breusch-Pagan Test for Constancy of Error Variance 234

F Test for Lack of Fit 235

Remedial Measures 236

6.9 An Example—Multiple Regression with Two Predictor Variables 236

Setting 236

Basic Calculations 237

Estimated Regression Function 240

Fitted Values and Residuals 241

Analysis of Appropriateness of Model 241

Analysis of Variance 243

Estimation of Regression Parameters 245

Estimation of Mean Response 245

Prediction Limits for New Observations 247

Cited Reference 248

Problems 248

Exercises 253

Projects 254

Chapter 7

Multiple Regression II 256

7.1 Extra Sums of Squares 256

Basic Ideas 256

Definitions 259

Decomposition of SSR into Extra Sums of Squares 260

ANOVA Table Containing Decomposition of SSR 261

7.2 Uses of Extra Sums of Squares in Tests for Regression Coefficients 263

Test whether a Single βk = 0 263

Test whether Several βk = 0 264

7.3 Summary of Tests Concerning Regression Coefficients 266

Test whether All βk = 0 266

Test whether a Single βk = 0 267

Test whether Some βk = 0 267

Other Tests 268

7.4 Coefficients of Partial Determination 268

Two Predictor Variables 269

General Case 269

Coefficients of Partial Correlation 270

7.5 Standardized Multiple Regression Model 271

Roundoff Errors in Normal Equations Calculations 271

Lack of Comparability in Regression Coefficients 272

Correlation Transformation 272

Standardized Regression Model 273

X′X Matrix for Transformed Variables 274

Estimated Standardized Regression Coefficients 275

7.6 Multicollinearity and Its Effects 278

Uncorrelated Predictor Variables 279

Nature of Problem when Predictor Variables Are Perfectly Correlated 281

Effects of Multicollinearity 283

Need for More Powerful Diagnostics for Multicollinearity 289

Cited Reference 289

Problems 289

Exercise 292

Projects 293

Chapter 8

Regression Models for Quantitative and Qualitative Predictors 294

8.1 Polynomial Regression Models 294

Uses of Polynomial Models 294

One Predictor Variable—Second Order 295

One Predictor Variable—Third Order 296

One Predictor Variable—Higher Orders 296

Two Predictor Variables—Second Order 297

Three Predictor Variables—Second Order 298

Implementation of Polynomial Regression Models 298

Case Example 300

Some Further Comments on Polynomial Regression 305

8.2 Interaction Regression Models 306

Interaction Effects 306

Interpretation of Interaction Regression Models with Linear Effects 306

Interpretation of Interaction Regression Models with Curvilinear Effects 309

Implementation of Interaction Regression Models 311

8.3 Qualitative Predictors 313

Qualitative Predictor with Two Classes 314

Interpretation of Regression Coefficients 315

Qualitative Predictor with More than Two Classes 318

Time Series Applications 319

8.4 Some Considerations in Using Indicator Variables 321

Indicator Variables versus Allocated Codes 321

Indicator Variables versus Quantitative Variables 322

Other Codings for Indicator Variables 323

8.5 Modeling Interactions between Quantitative and Qualitative Predictors 324

Meaning of Regression Coefficients 324

8.6 More Complex Models 327

More than One Qualitative Predictor Variable 328

Qualitative Predictor Variables Only 329

8.7 Comparison of Two or More Regression Functions 329

Soap Production Lines Example 330

Instrument Calibration Study Example 334

Cited Reference 335

Problems 335

Exercises 340

Projects 341

Case Study 342

Chapter 9

Building the Regression Model I: Model Selection and Validation 343

9.1 Overview of Model-Building Process 343

Data Collection 343

Data Preparation 346

Preliminary Model Investigation 346

Reduction of Explanatory Variables 347

Model Refinement and Selection 349

Model Validation 350

9.2 Surgical Unit Example 350

9.3 Criteria for Model Selection 353

R²p or SSEp Criterion 354

R²a,p or MSEp Criterion 355

Mallows’ Cp Criterion 357

AICp and SBCp Criteria 359

PRESSp Criterion 360

9.4 Automatic Search Procedures for Model Selection 361

“Best” Subsets Algorithm 361

Stepwise Regression Methods 364

Forward Stepwise Regression 364

Other Stepwise Procedures 367

9.5 Some Final Comments on Automatic Model Selection Procedures 368

9.6 Model Validation 369

Collection of New Data to Check Model 370

Comparison with Theory, Empirical Evidence, or Simulation Results 371

Data Splitting 372

Cited References 375

Problems 376

Exercise 380

Projects 381

Case Studies 382

Chapter 10

Building the Regression Model II: Diagnostics 384

10.2 Identifying Outlying Y Observations—Studentized Deleted Residuals 390

Outlying Cases 390

Residuals and Semistudentized Residuals 392

Hat Matrix 392

Studentized Residuals 394

Deleted Residuals 395

Studentized Deleted Residuals 396

10.3 Identifying Outlying X Observations—Hat Matrix Leverage Values 398

Use of Hat Matrix for Identifying Outlying X Observations 398

Use of Hat Matrix to Identify Hidden Extrapolation 400

10.4 Identifying Influential Cases—DFFITS, Cook’s Distance, and DFBETAS Measures 400

Influence on Single Fitted Value—DFFITS 401

Influence on All Fitted Values—Cook’s Distance 402

Influence on the Regression Coefficients—DFBETAS 404

Influence on Inferences 405

10.5 Multicollinearity Diagnostics—Variance Inflation Factor 406

Informal Diagnostics 407

Variance Inflation Factor 408

10.6 Surgical Unit Example—Continued 410

Cited References 414

Problems 414

Exercises 419

Projects 419

Case Studies 420

Chapter 11

Building the Regression Model III: Remedial Measures 421

11.1 Unequal Error Variances Remedial Measures—Weighted Least Squares 421

Error Variances Known 422

Error Variances Known up to Proportionality Constant 424

Error Variances Unknown 424

11.2 Multicollinearity Remedial Measures—Ridge Regression 431

Some Remedial Measures 431

Ridge Regression 432

11.3 Remedial Measures for Influential Cases—Robust Regression 437

Robust Regression 438

IRLS Robust Regression 439

11.4 Nonparametric Regression: Lowess Method and Regression Trees 449

Lowess Method 449

Regression Trees 453

11.5 Remedial Measures for Evaluating Precision in Nonstandard Situations—Bootstrapping 458

General Procedure 459

Bootstrap Sampling 459

Bootstrap Confidence Intervals 460

11.6 Case Example—MNDOT Traffic Estimation 464

Model Development 465

Weighted Least Squares Estimation 468

Cited References 471

Problems 472

Exercises 476

Projects 476

Case Studies 480

Chapter 12

Autocorrelation in Time Series Data 481

12.1 Problems of Autocorrelation 481

12.2 First-Order Autoregressive Error Model 484

Simple Linear Regression 484

Multiple Regression 484

Properties of Error Terms 485

12.3 Durbin-Watson Test for Autocorrelation 487

12.4 Remedial Measures for Autocorrelation 490

Use of Transformed Variables 490

Cochrane-Orcutt Procedure 492

Hildreth-Lu Procedure 495

First Differences Procedure 496

Comparison of Three Methods 498

12.5 Forecasting with Autocorrelated Error Terms 499

Cited References 502

Problems 502

Exercises 507

Projects 508

Case Studies 508

PART THREE

NONLINEAR REGRESSION 509

Chapter 13

Introduction to Nonlinear Regression and Neural Networks 510

13.1 Linear and Nonlinear Regression Models 510

Linear Regression Models 510

Nonlinear Regression Models 511

Estimation of Regression Parameters 514

13.2 Least Squares Estimation in Nonlinear Regression 515

Solution of Normal Equations 517

Direct Numerical Search—Gauss-Newton Method 518

Other Direct Search Procedures 525

13.3 Model Building and Diagnostics 526

13.4 Inferences about Nonlinear Regression Parameters 527

Estimate of Error Term Variance 527

Large-Sample Theory 528

When Is Large-Sample Theory Applicable? 528

Interval Estimation of a Single γk 531

Simultaneous Interval Estimation of Several γk 532

Test Concerning a Single γk 532

Test Concerning Several γk 533

13.5 Learning Curve Example 533

13.6 Introduction to Neural Network Modeling 537

Neural Network Model 537

Network Representation 540

Neural Network as Generalization of Linear Regression 541

Parameter Estimation: Penalized Least Squares 542

Example: Ischemic Heart Disease 543

Model Interpretation and Prediction 546

Some Final Comments on Neural Network Modeling 547

Cited References 547

Problems 548

Exercises 552

Projects 552

Case Studies 554

Chapter 14

Logistic Regression, Poisson Regression, and Generalized Linear Models 555

14.1 Regression Models with Binary Response Variable 555

Meaning of Response Function when Outcome Variable Is Binary 556

Special Problems when Response Variable Is Binary 557

14.2 Sigmoidal Response Functions for Binary Responses 559

Probit Mean Response Function 559

Logistic Mean Response Function 560

Complementary Log-Log Response Function 562

14.3 Simple Logistic Regression 563

Simple Logistic Regression Model 563

Maximum Likelihood Estimation 564

Interpretation of b1 567

Use of Probit and Complementary Log-Log Response Functions 568

Repeat Observations—Binomial Outcomes 568

14.4 Multiple Logistic Regression 570

Multiple Logistic Regression Model 570

Fitting of Model 571

Polynomial Logistic Regression 575

14.5 Inferences about Regression Parameters 577

Test Concerning a Single βk: Wald Test 578

Interval Estimation of a Single βk 579

Test whether Several βk = 0: Likelihood Ratio Test 580

14.6 Automatic Model Selection Methods 582

Model Selection Criteria 582

Best Subsets Procedures 583

Stepwise Model Selection 583

14.7 Tests for Goodness of Fit 586

Pearson Chi-Square Goodness of Fit Test 586

Deviance Goodness of Fit Test 588

Hosmer-Lemeshow Goodness of Fit Test 589

14.8 Logistic Regression Diagnostics 591

Logistic Regression Residuals 591

Diagnostic Residual Plots 594

Detection of Influential Observations 598

14.9 Inferences about Mean Response 602

Point Estimator 602

Interval Estimation 602

Simultaneous Confidence Intervals for Several Mean Responses 603

14.10 Prediction of a New Observation 604

Choice of Prediction Rule 604

Validation of Prediction Error Rate 607

14.11 Polytomous Logistic Regression for Nominal Response 608

Pregnancy Duration Data with Polytomous Response 609

J − 1 Baseline-Category Logits for Nominal Response 610

Maximum Likelihood Estimation 612

14.12 Polytomous Logistic Regression for Ordinal Response 614

14.13 Poisson Regression 618

Poisson Distribution 618

Poisson Regression Model 619

Maximum Likelihood Estimation 620

Model Development 620

Inferences 621

14.14 Generalized Linear Models 623

Cited References 624

Problems 625

Exercises 634

Projects 635

Case Studies 640

Appendix A

Some Basic Results in Probability and Statistics 641

Appendix B

Tables 659

Appendix C

Data Sets 677

Appendix D

Selected Bibliography 687

Index 695
