Principles of Managerial Statistics and Data Science

Hardcover

Pre-order

£98.50

*VAT

Roberto Rivera

ISBN: 978-1-119-48641-1 December 2019 600 Pages


Description

Introduces readers to the principles of managerial statistics and data science, with an emphasis on statistical literacy of business students   

Through a statistical perspective, this book introduces readers to the topic of data science, including Big Data, data analytics, and data wrangling. Chapters include multiple examples showing the application of the theoretical aspects presented. It features practice problems designed to ensure that readers understand the concepts and can apply them using real data. Over 100 open data sets used for examples and problems come from regions throughout the world, allowing the instructor to adapt the application to local data with which students can identify. Applications with these data sets include:

  • Assessing if searches during a police stop in San Diego are dependent on driver’s race
  • Visualizing the association between fat percentage and moisture percentage in Canadian cheese
  • Modeling taxi fares in Chicago using data from millions of rides
  • Analyzing mean sales per unit of legal marijuana products in Washington state

Topics covered in Principles of Managerial Statistics and Data Science include data visualization, descriptive measures, probability, probability distributions, mathematical expectation, confidence intervals, and hypothesis testing. Analysis of variance, simple linear regression, and multiple linear regression are also included. In addition, the book covers contingency tables, chi-square tests, nonparametric methods, and time series methods. The textbook:

  • Includes academic material usually covered in introductory Statistics courses, but with a data science twist and less emphasis on theory
  • Relies on Minitab to present how to perform tasks with a computer
  • Presents and motivates use of data that comes from open portals
  • Focuses on developing an intuition on how the procedures work
  • Exposes readers to the potential of Big Data and current failures in its use
  • Offers supplementary material, including a companion website that houses PowerPoint slides; an Instructor's Manual with tips, a model syllabus, and project ideas; R code to reproduce examples and case studies; and information about the open portal data
  • Features an appendix with solutions to some practice problems

Principles of Managerial Statistics and Data Science is a textbook for undergraduate and graduate students taking managerial Statistics courses, and a reference book for working business professionals.

Preface xxi

Acknowledgments xxv

Acronyms i

1 Statistics Suck; So Why Do I Need to Learn About It? 1

1.1 Introduction 1

Practice Problems 6

1.2 Data Based Decision Making: Some Applications 7

1.3 Statistics Defined 13

1.4 Use of Technology and the New Buzzwords: Data Science, Data Analytics, and Big Data. 17

1.4.1 A Quick Look at Data Science: Some Definitions 18

Chapter Problems 21

References 22

2 Concepts in Statistics 23

2.1 Introduction 23

Practice Problems 26

2.2 Type of Data 28

Practice Problems 31

2.3 Four Important Notions in Statistics 33

Practice Problems 37

2.4 Sampling methods 39

2.4.1 Probability Sampling 39

2.4.2 Nonprobability Sampling 42

Practice Problems 46

2.5 Data Management 47

2.5.1 A Quick Look at Data Science: Data Wrangling Baltimore Housing Variables 53

2.6 Proposing a Statistical Study 55

Chapter Problems 56

References 59

3 Data Visualization 61

3.1 Introduction 61

3.2 Visualization Methods for Categorical Variables 62

Practice Problems 68

3.3 Visualization Methods for Numerical Variables 74

Practice Problems 82

3.4 Visualizing Summaries of More Than Two Variables Simultaneously 86

3.4.1 A Quick Look at Data Science: Does Race Affect The Chances of a Driver Being Searched During a Vehicle Stop in San Diego? 95

Practice Problems 99

3.5 Novel Data Visualization 107

3.5.1 A Quick Look at Data Science: Visualizing Association Between Baltimore Housing Variables Over 14 Years 111

Chapter Problems 115

References 133

4 Descriptive Statistics 135

4.1 Introduction 135

4.2 Measures of Centrality 137

Practice Problems 152

4.3 Measures of Dispersion 157

Practice Problems 162

4.4 Percentiles 164

4.4.1 Quartiles 165

Practice Problems 173

4.5 Measuring the Association Between Two Variables 177

Practice Problems 183

4.6 Sample Proportion and Other Numerical Statistics 185

4.6.1 A Quick Look at Data Science: Murder Rates in Los Angeles 185

4.7 How to Use Descriptive Statistics 188

Chapter Problems 189

References 197

5 Introduction to Probability 199

5.1 Introduction 199

5.2 Preliminaries 200

Practice Problems 204

5.3 The Probability of an Event 206

Practice Problems 210

5.4 Rules and Properties of Probabilities 211

Practice Problems 217

5.5 Conditional Probability and Independent Events 219

Practice Problems 228

5.6 Empirical Probabilities 230

5.6.1 A Quick Look at Data Science: Missing People Reports in Boston by Day of Week 234

Practice Problems 235

5.7 Counting Outcomes 240

Practice Problems 243

Chapter Problems 244

References 249

6 Discrete Random Variables 251

6.1 Introduction 251

6.2 General Properties 252

6.2.1 A Quick Look at Data Science: Number of Stroke Emergency Calls in Manhattan 259

Practice Problems 261

6.3 Properties of Expected Value and Variance 264

Practice Problems 269

6.4 Bernoulli and Binomial Random Variables 271

Practice Problems 281

6.5 Poisson Distribution 283

Practice Problems 288

6.6 Optional: Other Useful Probability Distributions 291

Chapter Problems 293

References 299

7 Continuous Random Variables 301

7.1 Introduction 301

Practice Problems 304

7.2 The Uniform Probability Distribution 304

Practice Problems 310

7.3 The Normal Distribution 312

Practice Problems 325

7.4 Probabilities for Any Normally Distributed Random Variable 328

7.4.1 A Quick Look at Data Science: Normal Distribution, A Good Match for University of Puerto Rico SATs? 332

Practice Problems 334

7.5 Approximating the Binomial Distribution 339

Practice Problems 342

7.6 Exponential Distribution 343

Practice Problems 345

Chapter Problems 347

References 350

8 Properties of Sample Statistics 351

8.1 Introduction 351

8.2 Expected Value and Standard Deviation of X̄ 353

Practice Problems 356

8.3 Sampling Distribution of X̄ When Sample Comes From a Normal Distribution 358

Practice Problems 363

8.4 Central Limit Theorem 365

8.4.1 A Quick Look at Data Science: Bacteria at New York City Beaches 372

Practice Problems 375

8.5 Other Properties of Estimators 378

Chapter Problems 383

9 Interval Estimation for One Population Parameter 389

9.1 Introduction 389

9.2 Intuition of a Two Sided Confidence Interval 390

9.3 Confidence Interval for the Population Mean: σ Known 392

Practice Problems 401

9.4 Determining Sample Size for a Confidence Interval for μ 404

Practice Problems 405

9.5 Confidence Interval for the Population Mean: σ Unknown 406

Practice Problems 414

9.6 Confidence Interval for 417

Practice Problems 418

9.7 Determining Sample Size for Confidence Interval 420

Practice Problems 422

9.8 Optional: Confidence Interval for σ 423

9.8.1 A Quick Look at Data Science: A Confidence Interval for the Standard Deviation of Walking Scores in Baltimore 425

Chapter Problems 427

References 431

10 Hypothesis Testing For One Population 433

10.1 Introduction 433

10.2 Basics of Hypothesis Testing 436

10.3 Steps to Perform a Hypothesis Test 444

Practice Problems 445

10.4 Inference on the Population Mean: Known Standard Deviation 447

Practice Problems 463

10.5 Hypothesis Testing for the Mean (σ Unknown) 470

Practice Problems 476

10.6 Hypothesis Testing for the Population Proportion 480

10.6.1 A Quick Look at Data Science: Proportion of New York City High Schools with a Mean SAT Score of 1,498 or More 485

Practice Problems 487

10.7 Hypothesis Testing for the Population Variance 491

10.8 More on the P-value and Final Remarks 493

10.8.1 Misunderstanding the P-value 495

Chapter Problems 502

References 508

11 Statistical Inference to Compare Parameters from Two Populations 509

11.1 Introduction 509

11.2 Inference on Two Population Means 510

11.3 Inference on Two Population Means - Independent Samples, Variances Known 512

Practice Problems 520

11.4 Inference on Two Population Means When Two Independent Samples are Used - Unknown Variances 524

11.4.1 A Quick Look at Data Science: Suicide Rates Among Asian Men and Women in New York City 530

Practice Problems 533

11.5 Inference on Two Means Using Two Dependent Samples 537

Practice Problems 539

11.6 Inference on Two Population Proportions 541

Practice Problems 545

Chapter Problems 546

References 551

12 Analysis of Variance (ANOVA) 553

12.1 Introduction 553

Practice Problems 558

12.2 Analysis of Variance (ANOVA) for One Factor 559

Practice Problems 570

12.3 Multiple Comparisons 572

Practice Problems 577

12.4 Diagnostics of ANOVA Assumptions 578

12.4.1 A Quick Look at Data Science: Emergency Response Time for Cardiac Arrest in New York City 584

Practice Problems 588

12.5 ANOVA with Two Factors 589

Practice Problems 597

12.6 Extensions to ANOVA 600

Chapter Problems 605

References 610

13 Simple Linear Regression 611

13.1 Introduction 611

13.2 Basics of Simple Linear Regression 613

Practice Problems 617

13.3 Fitting the Simple Linear Regression Parameters 617

Practice Problems 623

13.4 Inference for Simple Linear Regression 625

Practice Problems 638

13.5 Estimating and Predicting the Response Variable 643

Practice Problems 647

13.6 A Binary X 649

Practice Problems 652

13.7 Model Diagnostics (Residual Analysis) 652

Practice Problems 662

13.8 What Correlation Doesn't Mean 665

13.8.1 A Quick Look at Data Science: Can Rate of College Educated People Help Predict the Rate of Narcotic Problems in Baltimore? 669

Chapter Problems 676

References 686

14 Multiple Linear Regression 687

14.1 Introduction 687

14.2 The Multiple Linear Regression Model 688

Practice Problems 692

14.3 Inference for Multiple Regression 695

Practice Problems 701

14.4 Multicollinearity and Other Modeling Aspects 705

Practice Problems 712

14.5 Variability Around the Regression Line: Residuals and Intervals 714

Practice Problems 717

14.6 Modifying Predictors 718

Practice Problems 719

14.7 General Linear Model 720

Practice Problems 728

14.8 Steps to Fit a Multiple Linear Regression Model 733

14.9 Other Regression Topics 735

14.9.1 A Quick Look at Data Science: Modeling Taxi Fares in Chicago 740

Chapter Problems 745

References 750

15 Inference on Association of Categorical Variables 753

15.1 Introduction 753

15.2 Association Between Two Categorical Variables 755

15.2.1 A Quick Look at Data Science: Affordability and Business Environment in Chattanooga 762

Practice Problems 766

Chapter Problems 770

References 770

16 Nonparametric Testing 773

16.1 Introduction 773

16.2 Sign Tests and Wilcoxon Sign-Rank Tests: One Sample & Matched Pairs Scenarios 774

Practice Problems 779

16.3 Wilcoxon Rank-Sum Test: Two Independent Samples 781

16.3.1 A Quick Look at Data Science: Austin Texas as a Place to Live; Do Men Rate It Higher Than Women? 783

Practice Problems 786

16.4 Kruskal-Wallis Test: More Than Two Samples 788

Practice Problems 791

16.5 Nonparametric Tests Versus Their Parametric Counterparts 792

Chapter Problems 794

References 794

17 Forecasting 797

17.1 Introduction 797

17.2 Time Series Components 798

Practice Problems 806

17.3 Simple Forecasting Models 807

Practice Problems 812

17.4 Forecasting When Data Has Trend, Seasonality 814

Practice Problems 822

17.5 Assessing Forecasts 826

17.5.1 A Quick Look at Data Science: Forecasting Tourism Jobs in Canada 830

17.5.2 A Quick Look at Data Science: Forecasting Retail Gross Sales of Marijuana in Denver 834

Chapter Problems 837

References 839

A Math Notation and Symbols 841

A.1 Summation 841

A.2 pth Power 842

A.3 Inequalities 842

A.4 Factorials 843

A.5 Exponential Function 844

A.6 Greek and Statistics Symbols 844

B 845

C 847

D Solutions to Odd Numbered Exercises 848

Index 885