
Beginning R: The Statistical Programming Language



Mark Gardener

ISBN: 978-1-118-22616-2, May 2012, 504 Pages


Conquer the complexities of this open source statistical language

R is fast becoming the de facto standard for statistical computing and analysis in science, business, engineering, and related fields. This book examines this complex language using simple statistical examples, showing how R operates in a user-friendly context. Both students and workers in fields that require extensive statistical analysis will find this book helpful as they learn to use R for simple summary statistics, hypothesis testing, creating graphs, regression, and much more. It covers formula notation, complex statistics, manipulating data and extracting components, and rudimentary programming.

  • R, the open source statistical language increasingly used to handle statistics and produce publication-quality graphs, is notoriously complex
  • This book makes R easier to understand through the use of simple statistical examples, teaching the necessary elements in the context in which R is actually used
  • Covers getting started with R and using it for simple summary statistics, hypothesis testing, and graphs
  • Shows how to use R for formula notation, complex statistics, manipulating data, extracting components, and regression
  • Provides beginning programming instruction for those who want to write their own scripts

Beginning R offers anyone who needs to perform statistical analysis the information necessary to use R with confidence.

Introduction xxi

Chapter 1: Introducing R: What It Is and How to Get It 1

Getting the Hang of R 2

The R Website 3

Downloading and Installing R from CRAN 3

Installing R on Your Windows Computer 4

Installing R on Your Macintosh Computer 7

Installing R on Your Linux Computer 7

Running the R Program 8

Finding Your Way with R 10

Getting Help via the CRAN Website and the Internet 10

The Help Command in R 10

Help for Windows Users 11

Help for Macintosh Users 11

Help for Linux Users 13

Help For All Users 13

Anatomy of a Help Item in R 14

Command Packages 16

Standard Command Packages 16

What Extra Packages Can Do for You 16

How to Get Extra Packages of R Commands 18

How to Install Extra Packages for Windows Users 18

How to Install Extra Packages for Macintosh Users 18

How to Install Extra Packages for Linux Users 19

Running and Manipulating Packages 20

Loading Packages 21

Windows-Specific Package Commands 21

Macintosh-Specific Package Commands 21

Removing or Unloading Packages 22

Summary 22

Chapter 2: Starting Out: Becoming Familiar with R 25

Some Simple Math 26

Use R Like a Calculator 26

Storing the Results of Calculations 29

Reading and Getting Data into R 30

Using the combine Command for Making Data 30

Entering Numerical Items as Data 30

Entering Text Items as Data 31

Using the scan Command for Making Data 32

Entering Text as Data 33

Using the Clipboard to Make Data 33

Reading a File of Data from a Disk 35

Reading Bigger Data Files 37

The read.csv() Command 37

Alternative Commands for Reading Data in R 39

Missing Values in Data Files 40

Viewing Named Objects 41

Viewing Previously Loaded Named-Objects 42

Viewing All Objects 42

Viewing Only Matching Names 42

Removing Objects from R 44

Types of Data Items 45

Number Data 45

Text Items 45

Converting Between Number and Text Data 46

The Structure of Data Items 47

Vector Items 48

Data Frames 48

Matrix Objects 49

List Objects 49

Examining Data Structure 49

Working with History Commands 51

Using History Files 52

Viewing the Previous Command History 52

Saving and Recalling Lists of Commands 52

Alternative History Commands in Macintosh OS 52

Editing History Files 53

Saving Your Work in R 54

Saving the Workspace on Exit 54

Saving Data Files to Disk 54

Save Named Objects 54

Save Everything 55

Reading Data Files from Disk 56

Saving Data to Disk as Text Files 57

Writing Vector Objects to Disk 58

Writing Matrix and Data Frame Objects to Disk 58

Writing List Objects to Disk 59

Converting List Objects to Data Frames 60

Summary 61

Chapter 3: Starting Out: Working With Objects 65

Manipulating Objects 65

Manipulating Vectors 66

Selecting and Displaying Parts of a Vector 66

Sorting and Rearranging a Vector 68

Returning Logical Values from a Vector 70

Manipulating Matrix and Data Frames 70

Selecting and Displaying Parts of a Matrix or Data Frame 71

Sorting and Rearranging a Matrix or Data Frame 74

Manipulating Lists 76

Viewing Objects within Objects 77

Looking Inside Complicated Data Objects 77

Opening Complicated Data Objects 78

Quick Looks at Complicated Data Objects 80

Viewing and Setting Names 82

Rotating Data Tables 86

Constructing Data Objects 86

Making Lists 87

Making Data Frames 88

Making Matrix Objects 89

Re-ordering Data Frames and Matrix Objects 92

Forms of Data Objects: Testing and Converting 96

Testing to See What Type of Object You Have 96

Converting from One Object Form to Another 97

Convert a Matrix to a Data Frame 97

Convert a Data Frame into a Matrix 98

Convert a Data Frame into a List 99

Convert a Matrix into a List 100

Convert a List to Something Else 100

Summary 104

Chapter 4: Data: Descriptive Statistics and Tabulation 107

Summary Commands 108

Summarizing Samples 110

Summary Statistics for Vectors 110

Summary Commands With Single Value Results 110

Summary Commands With Multiple Results 113

Cumulative Statistics 115

Simple Cumulative Commands 115

Complex Cumulative Commands 117

Summary Statistics for Data Frames 118

Generic Summary Commands for Data Frames 119

Special Row and Column Summary Commands 119

The apply() Command for Summaries on Rows or Columns 120

Summary Statistics for Matrix Objects 120

Summary Statistics for Lists 121

Summary Tables 122

Making Contingency Tables 123

Creating Contingency Tables from Vectors 123

Creating Contingency Tables from Complicated Data 123

Creating Custom Contingency Tables 126

Creating Contingency Tables from Matrix Objects 128

Selecting Parts of a Table Object 130

Converting an Object into a Table 132

Testing for Table Objects 133

Complex (Flat) Tables 134

Making “Flat” Contingency Tables 134

Making Selective “Flat” Contingency Tables 138

Testing “Flat” Table Objects 139

Summary Commands for Tables 139

Cross Tabulation 142

Testing Cross-Table (xtabs) Objects 144

A Better Class Test 144

Recreating Original Data from a Contingency Table 145

Switching Class 146

Summary 147

Chapter 5: Data: Distribution 151

Looking at the Distribution of Data 151

Stem and Leaf Plot 152

Histograms 154

Density Function 158

Using the Density Function to Draw a Graph 159

Adding Density Lines to Existing Graphs 160

Types of Data Distribution 161

The Normal Distribution 161

Other Distributions 164

Random Number Generation and Control 166

Random Numbers and Sampling 168

The Shapiro-Wilk Test for Normality 171

The Kolmogorov-Smirnov Test 172

Quantile-Quantile Plots 174

A Basic Normal Quantile-Quantile Plot 174

Adding a Straight Line to a QQ Plot 174

Plotting the Distribution of One Sample Against Another 175

Summary 177

Chapter 6: Simple Hypothesis Testing 181

Using the Student’s t-test 181

Two-Sample t-Test with Unequal Variance 182

Two-Sample t-Test with Equal Variance 183

One-Sample t-Testing 183

Using Directional Hypotheses 183

Formula Syntax and Subsetting Samples in the t-Test 184

The Wilcoxon U-Test (Mann-Whitney) 188

Two-Sample U-Test 189

One-Sample U-Test 189

Using Directional Hypotheses 189

Formula Syntax and Subsetting Samples in the U-test 190

Paired t- and U-Tests 193

Correlation and Covariance 196

Simple Correlation 197

Covariance 199

Significance Testing in Correlation Tests 199

Formula Syntax 200

Tests for Association 203

Multiple Categories: Chi-Squared Tests 204

Monte Carlo Simulation 205

Yates’ Correction for 2 × 2 Tables 206

Single Category: Goodness of Fit Tests 206

Summary 210

Chapter 7: Introduction to Graphical Analysis 215

Box-whisker Plots 215

Basic Boxplots 216

Customizing Boxplots 217

Horizontal Boxplots 218

Scatter Plots 222

Basic Scatter Plots 222

Adding Axis Labels 223

Plotting Symbols 223

Setting Axis Limits 224

Using Formula Syntax 225

Adding Lines of Best-Fit to Scatter Plots 225

Pairs Plots (Multiple Correlation Plots) 229

Line Charts 232

Line Charts Using Numeric Data 232

Line Charts Using Categorical Data 233

Pie Charts 236

Cleveland Dot Charts 239

Bar Charts 245

Single-Category Bar Charts 245

Multiple Category Bar Charts 250

Stacked Bar Charts 250

Grouped Bar Charts 250

Horizontal Bars 253

Bar Charts from Summary Data 253

Copy Graphics to Other Applications 256

Use Copy/Paste to Copy Graphs 257

Save a Graphic to Disk 257

Windows 257

Macintosh 258

Linux 258

Summary 259

Chapter 8: Formula Notation and Complex Statistics 263

Examples of Using Formula Syntax for Basic Tests 264

Formula Notation in Graphics 266

Analysis of Variance (ANOVA) 268

One-Way ANOVA 268

Stacking the Data before Running Analysis of Variance 269

Running aov() Commands 270

Simple Post-hoc Testing 271

Extracting Means from aov() Models 271

Two-Way ANOVA 273

More about Post-hoc Testing 275

Graphical Summary of ANOVA 277

Graphical Summary of Post-hoc Testing 278

Extracting Means and Summary Statistics 281

Model Tables 281

Table Commands 283

Interaction Plots 283

More Complex ANOVA Models 289

Other Options for aov() 290

Replications and Balance 290

Summary 292

Chapter 9: Manipulating Data and Extracting Components 295

Creating Data for Complex Analysis 295

Data Frames 296

Matrix Objects 299

Creating and Setting Factor Data 300

Making Replicate Treatment Factors 304

Adding Rows or Columns 306

Summarizing Data 312

Simple Column and Row Summaries 312

Complex Summary Functions 313

The rowsum() Command 314

The apply() Command 315

Using tapply() to Summarize Using a Grouping Variable 316

The aggregate() Command 319

Summary 323

Chapter 10: Regression (Linear Modeling) 327

Simple Linear Regression 328

Linear Model Results Objects 329

Coefficients 330

Fitted Values 330

Residuals 330

Formula 331

Best-Fit Line 331

Similarity between lm() and aov() 334

Multiple Regression 335

Formulae and Linear Models 335

Model Building 337

Adding Terms with Forward Stepwise Regression 337

Removing Terms with Backwards Deletion 339

Comparing Models 341

Curvilinear Regression 343

Logarithmic Regression 344

Polynomial Regression 345

Plotting Linear Models and Curve Fitting 347

Best-Fit Lines 348

Adding Line of Best-Fit with abline() 348

Calculating Lines with fitted() 348

Producing Smooth Curves using spline() 350

Confidence Intervals on Fitted Lines 351

Summarizing Regression Models 356

Diagnostic Plots 356

Summary of Fit 357

Summary 359

Chapter 11: More About Graphs 363

Adding Elements to Existing Plots 364

Error Bars 364

Using the segments() Command for Error Bars 364

Using the arrows() Command to Add Error Bars 368

Adding Legends to Graphs 368

Color Palettes 370

Placing a Legend on an Existing Plot 371

Adding Text to Graphs 372

Making Superscript and Subscript Axis Titles 373

Orienting the Axis Labels 375

Making Extra Space in the Margin for Labels 375

Setting Text and Label Sizes 375

Adding Text to the Plot Area 376

Adding Text in the Plot Margins 378

Creating Mathematical Expressions 379

Adding Points to an Existing Graph 382

Adding Various Sorts of Lines to Graphs 386

Adding Straight Lines as Gridlines or Best-Fit Lines 386

Making Curved Lines to Add to Graphs 388

Plotting Mathematical Expressions 390

Adding Short Segments of Lines to an Existing Plot 393

Adding Arrows to an Existing Graph 394

Matrix Plots (Multiple Series on One Graph) 396

Multiple Plots in One Window 399

Splitting the Plot Window into Equal Sections 399

Splitting the Plot Window into Unequal Sections 402

Exporting Graphs 405

Using Copy and Paste to Move a Graph 406

Saving a Graph to a File 406

Windows 406

Macintosh 406

Linux 406

Using the Device Driver to Save a Graph to Disk 407

PNG Device Driver 407

PDF Device Driver 407

Copying a Graph from Screen to Disk File 408

Making a New Graph Directly to a Disk File 408

Summary 410

Chapter 12: Writing Your Own Scripts: Beginning to Program 415

Copy and Paste Scripts 416

Make Your Own Help File as Plaintext 416

Using Annotations with the # Character 417

Creating Simple Functions 417

One-Line Functions 417

Using Default Values in Functions 418

Simple Customized Functions with Multiple Lines 419

Storing Customized Functions 420

Making Source Code 421

Displaying the Results of Customized Functions and Scripts 421

Displaying Messages as Part of Script Output 422

Simple Screen Text 422

Display a Message and Wait for User Intervention 424

Summary 428

Appendix: Answers to Exercises 433

Index 461