Wiley.com
E-book

Interactive Displays: Natural Human-Interface Technologies

ISBN: 978-1-118-70620-6
400 pages
July 2014

Description

How we interface and interact with computing, communications and entertainment devices is going through revolutionary changes, with natural user inputs based on touch, voice, and vision replacing or augmenting the use of traditional interfaces based on the keyboard, mouse, joysticks, etc. As a result, displays are morphing from one-way interface devices that merely show visual content to two-way interaction devices that provide more engaging and immersive experiences. This book provides in-depth coverage of the technologies, applications, and trends in the rapidly emerging field of interactive displays enabled by natural human-interfaces.

Key features:

  • Provides a definitive reference on all the touch technologies used in interactive displays, including their advantages, limitations, and future trends.
  • Covers the fundamentals and applications of speech input, processing and recognition techniques enabling voice-based interactions.
  • Offers a detailed review of the emerging vision-based sensing technologies, and user interactions using hand, body, and face gestures and eye gaze.
  • Discusses multi-modal natural user interface schemes which intuitively combine touch, voice, and vision for life-like interactions.
  • Examines the requirements and technology status towards realizing “true” 3D immersive and interactive displays.

Table of Contents

About the Author xiii

List of Contributors xv

Series Editor’s Foreword xvii

Preface xix

List of Acronyms xxi

1 Senses, Perception, and Natural Human-Interfaces for Interactive Displays 1
Achintya K. Bhowmik

1.1 Introduction 1

1.2 Human Senses and Perception 4

1.3 Human Interface Technologies 9

1.3.1 Legacy Input Devices 9

1.3.2 Touch-based Interactions 11

1.3.3 Voice-based Interactions 13

1.3.4 Vision-based Interactions 15

1.3.5 Multimodal Interactions 18

1.4 Towards “True” 3D Interactive Displays 20

1.5 Summary 23

References 24

2 Touch Sensing 27
Geoff Walker

2.1 Introduction 27

2.2 Introduction to Touch Technologies 28

2.2.1 Touchscreens 30

2.2.2 Classifying Touch Technologies by Size and Application 30

2.2.3 Classifying Touch Technologies by Materials and Structure 32

2.2.4 Classifying Touch Technologies by the Physical Quantity Being Measured 33

2.2.5 Classifying Touch Technologies by Their Sensing Capabilities 33

2.2.6 The Future of Touch Technologies 34

2.3 History of Touch Technologies 35

2.4 Capacitive Touch Technologies 35

2.4.1 Projected Capacitive (P-Cap) 35

2.4.2 Surface Capacitive 47

2.5 Resistive Touch Technologies 51

2.5.1 Analog Resistive 51

2.5.2 Digital Multi-touch Resistive (DMR) 57

2.5.3 Analog Multi-touch Resistive (AMR) 59

2.6 Acoustic Touch Technologies 61

2.6.1 Surface Acoustic Wave (SAW) 61

2.6.2 Acoustic Pulse Recognition (APR) 64

2.6.3 Dispersive Signal Technology (DST) 67

2.7 Optical Touch Technologies 68

2.7.1 Traditional Infrared 68

2.7.2 Multi-touch Infrared 73

2.7.3 Camera-based Optical 76

2.7.4 In-glass Optical (Planar Scatter Detection – PSD) 81

2.7.5 Vision-based Optical 82

2.8 Embedded Touch Technologies 86

2.8.1 On-cell Mutual-capacitive 89

2.8.2 Hybrid In-cell/On-cell Mutual-capacitive 90

2.8.3 In-cell Mutual-capacitive 91

2.8.4 In-cell Light Sensing 93

2.9 Other Touch Technologies 96

2.9.1 Force-sensing 96

2.9.2 Combinations of Touch Technologies 98

2.10 Summary 98

2.11 Appendix 100

References 101

3 Voice in the User Interface 107
Andrew Breen, Hung H. Bui, Richard Crouch, Kevin Farrell, Friedrich Faubel, Roberto Gemello, William F. Ganong III, Tim Haulick, Ronald M. Kaplan, Charles L. Ortiz, Peter F. Patel-Schneider, Holger Quast, Adwait Ratnaparkhi, Vlad Sejnoha, Jiaying Shen, Peter Stubley and Paul van Mulbregt

3.1 Introduction 107

3.2 Voice Recognition 110

3.2.1 Nature of Speech 110

3.2.2 Acoustic Model and Front-end 112

3.2.3 Aligning Speech to HMMs 113

3.2.4 Language Model 114

3.2.5 Search: Solving Crosswords at 1000 Words a Second 115

3.2.6 Training Acoustic and Language Models 116

3.2.7 Adapting Acoustic and Language Models for Speaker Dependent Recognition 116

3.2.8 Alternatives to the “Canonical” System 117

3.2.9 Performance 117

3.3 Deep Neural Networks for Voice Recognition 119

3.4 Hardware Optimization 122

3.4.1 Lower Power Wake-up Computation 122

3.4.2 Hardware Optimization for Specific Computations 123

3.5 Signal Enhancement Techniques for Robust Voice Recognition 123

3.5.1 Robust Voice Recognition 124

3.5.2 Single-channel Noise Suppression 124

3.5.3 Multi-channel Noise Suppression 125

3.5.4 Noise Cancellation 125

3.5.5 Acoustic Echo Cancellation 127

3.5.6 Beamforming 127

3.6 Voice Biometrics 128

3.6.1 Introduction 128

3.6.2 Existing Challenges to Voice Biometrics 129

3.6.3 New Areas of Research in Voice Biometrics 130

3.7 Speech Synthesis 130

3.8 Natural Language Understanding 134

3.8.1 Mixed Initiative Conversations 135

3.8.2 Limitations of Slot and Filler Technology 137

3.9 Multi-turn Dialog Management 141

3.10 Planning and Reasoning 144

3.10.1 Technical Challenges 144

3.10.2 Semantic Analysis and Discourse Representation 146

3.10.3 Pragmatics 147

3.10.4 Dialog Management as Collaboration 148

3.10.5 Planning and Re-planning 149

3.10.6 Knowledge Representation and Reasoning 149

3.10.7 Monitoring 150

3.10.8 Suggested Readings 151

3.11 Question Answering 151

3.11.1 Question Analysis 152

3.11.2 Find Relevant Information 152

3.11.3 Answers and Evidence 153

3.11.4 Presenting the Answer 153

3.12 Distributed Voice Interface Architecture 154

3.12.1 Distributed User Interfaces 154

3.12.2 Distributed Speech and Language Technology 155

3.13 Conclusion 157

Acknowledgements 158

References 158

4 Visual Sensing and Gesture Interactions 165
Achintya K. Bhowmik

4.1 Introduction 165

4.2 Imaging Technologies: 2D and 3D 167

4.3 Interacting with Gestures 170

4.4 Summary 177

References 178

5 Real-Time 3D Sensing With Structured Light Techniques 181
Tyler Bell, Nikolaus Karpinsky and Song Zhang

5.1 Introduction 181

5.2 Structured Pattern Codifications 183

5.2.1 2D Pseudo-random Codifications 183

5.2.2 Binary Structured Codifications 184

5.2.3 N-ary Codifications 187

5.2.4 Continuous Sinusoidal Phase Codifications 187

5.3 Structured Light System Calibration 191

5.4 Examples of 3D Sensing with DFP Techniques 193

5.5 Real-Time 3D Sensing Techniques 195

5.5.1 Fundamentals of Digital-light-processing (DLP) Technology 196

5.5.2 Real-Time 3D Data Acquisition 198

5.5.3 Real-Time 3D Data Processing and Visualization 199

5.5.4 Example of Real-Time 3D Sensing 200

5.6 Real-Time 3D Sensing for Human Computer Interaction Applications 201

5.6.1 Real-Time 3D Facial Expression Capture and its HCI Implications 201

5.6.2 Real-Time 3D Body Part Gesture Capture and its HCI Implications 202

5.6.3 Concluding Human Computer Interaction Implications 204

5.7 Some Recent Advancements 204

5.7.1 Real-Time 3D Sensing and Natural 2D Color Texture Capture 204

5.7.2 Superfast 3D Sensing 206

5.8 Summary 208

Acknowledgements 209

References 209

6 Real-Time Stereo 3D Imaging Techniques 215
Lazaros Nalpantidis

6.1 Introduction 215

6.2 Background 216

6.3 Structure of Stereo Correspondence Algorithms 219

6.3.1 Matching Cost Computation 220

6.3.2 Matching Cost Aggregation 221

6.4 Categorization of Characteristics 222

6.4.1 Depth Estimation Density 222

6.4.2 Optimization Strategy 224

6.5 Categorization of Implementation Platform 225

6.5.1 CPU-only Methods 225

6.5.2 GPU-accelerated Methods 226

6.5.3 Hardware Implementations (FPGAs, ASICs) 227

6.6 Conclusion 229

References 229

7 Time-of-Flight 3D-Imaging Techniques 233
Daniël Van Nieuwenhove

7.1 Introduction 233

7.2 Time-of-Flight 3D Sensing 233

7.3 Pulsed Time-of-Flight Method 235

7.4 Continuous Time-of-Flight Method 236

7.5 Calculations 236

7.6 Accuracy 239

7.7 Limitations and Improvements 240

7.7.1 TOF Challenges 240

7.7.2 Theoretical Limits 241

7.7.3 Distance Aliasing 242

7.7.4 Multi-path and Scattering 243

7.7.5 Power Budget and Optimization 243

7.8 Time-of-Flight Camera Components 244

7.9 Typical Values 244

7.9.1 Light Power Range 244

7.9.2 Background Light 245

7.10 Current State of the Art 247

7.11 Conclusion 247

References 248

8 Eye Gaze Tracking 251
Heiko Drewes

8.1 Introduction and Motivation 251

8.2 The Eyes 253

8.3 Eye Trackers 256

8.3.1 Types of Eye Trackers 256

8.3.2 Corneal Reflection Method 257

8.4 Objections and Obstacles 260

8.4.1 Human Aspects 260

8.4.2 Outdoor Use 261

8.4.3 Calibration 261

8.4.4 Accuracy 261

8.4.5 Midas Touch Problem 262

8.5 Eye Gaze Interaction Research 263

8.6 Gaze Pointing 264

8.6.1 Solving the Midas Touch Problem 264

8.6.2 Solving the Accuracy Issue 265

8.6.3 Comparison of Mouse and Gaze Pointing 266

8.6.4 Mouse and Gaze Coordination 267

8.6.5 Gaze Pointing Feedback 269

8.7 Gaze Gestures 270

8.7.1 The Concept of Gaze Gestures 270

8.7.2 Gesture Detection Algorithm 270

8.7.3 Human Ability to Perform Gaze Gestures 271

8.7.4 Gaze Gesture Alphabets 272

8.7.5 Gesture Separation from Natural Eye Movement 273

8.7.6 Applications for Gaze Gestures 274

8.8 Gaze as Context 275

8.8.1 Activity Recognition 275

8.8.2 Reading Detection 277

8.8.3 Attention Detection 279

8.8.4 Using Gaze Context 280

8.9 Outlook 280

References 281

9 Multimodal Input for Perceptual User Interfaces 285
Joseph J. LaViola Jr., Sarah Buchanan and Corey Pittman

9.1 Introduction 285

9.2 Multimodal Interaction Types 286

9.3 Multimodal Interfaces 287

9.3.1 Touch Input 287

9.3.2 3D Gesture 294

9.3.3 Eye Tracking and Gaze 299

9.3.4 Facial Expressions 300

9.3.5 Brain-computer Input 301

9.4 Multimodal Integration Strategies 303

9.4.1 Frame-based Integration 304

9.4.2 Unification-based Integration 304

9.4.3 Procedural Integration 305

9.4.4 Symbolic/Statistical Integration 305

9.5 Usability Issues with Multimodal Interaction 305

9.6 Conclusion 307

References 308

10 Multimodal Interaction in Biometrics: Technological and Usability Challenges 313
Norman Poh, Phillip A. Tresadern and Rita Wong

10.1 Introduction 313

10.1.1 Motivations for Identity Assurance 314

10.1.2 Biometrics

10.1.3 Application Characteristics of Multimodal Biometrics 314

10.1.4 2D and 3D Face Recognition 316

10.1.5 A Multimodal Case Study 317

10.1.6 Adaptation to Blind Subjects 318

10.1.7 Chapter Organization 320

10.2 Anatomy of the Mobile Biometry Platform 320

10.2.1 Face Analysis 320

10.2.2 Voice Analysis 323

10.2.3 Model Adaptation 325

10.2.4 Data Fusion 326

10.2.5 Mobile Platform Implementation 326

10.2.6 MoBio Database and Protocol 327

10.3 Case Study: Usability Study for the Visually Impaired 328

10.3.1 Impact of Head Pose Variations on Performance 329

10.3.2 User Interaction Module: Head Pose Quality Assessment 329

10.3.3 User-Interaction Module: Audio Feedback Mechanism 333

10.3.4 Usability Testing with the Visually Impaired 336

10.4 Discussions and Conclusions 338

Acknowledgements 339

References 339

11 Towards “True” 3D Interactive Displays 343
Jim Larimer, Philip J. Bos and Achintya K. Bhowmik

11.1 Introduction 343

11.2 The Origins of Biological Vision 346

11.3 Light Field Imaging 352

11.4 Towards “True” 3D Visual Displays 359

11.5 Interacting with Visual Content on a 3D Display 368

11.6 Summary 371

References 371

Index 375


Author Information

Achintya K. Bhowmik, Intel Corporation, USA
Dr. Achin Bhowmik is the director of perceptual computing technology and solutions at Intel Corporation, where his group is focused on developing next-generation computing solutions based on natural human-computer interaction and visual computing technologies and applications. He is a senior member of the IEEE as well as a program committee member of SID and IMID. He is associate editor of the Journal of the Society for Information Display, and was guest editor for two special volumes on "Advances in OLED Displays" and "Interactive Displays". Dr. Bhowmik is an Adjunct Professor at Kyung-Hee University, Seoul, Korea, teaching courses on digital imaging & display, digital image processing and optics of liquid crystal displays. He is on the board of directors for OpenCV, the organization behind the open source computer vision library.
