Big Data, Open Data and Data Development

Jean-Louis Monino, Soraya Sedkaoui

ISBN: 978-1-119-28521-2

Mar 2016, Wiley-ISTE

170 pages



The world has become digital, and technological advances have multiplied the channels for accessing, processing and disseminating data. New technologies have now reached a certain maturity. Data are available to everyone, anywhere on the planet. The number of Internet users in 2014 was 2.9 billion, or 41% of the world population. A need for knowledge is becoming apparent in order to understand this multitude of data. We must educate, inform and train the masses. The development of related technologies, such as the Internet, social networks and "cloud computing" (digital factories), has increased the available volumes of data. Today, each individual creates, consumes and uses digital information: more than 3.4 million e-mails are sent worldwide every second, or 107,000 billion annually, which amounts to 14,600 e-mails per person per year, although more than 70% are spam. Billions of pieces of content are shared on social networks such as Facebook, more than 2.46 million every minute. We spend more than 4.8 hours a day on the Internet using a computer, and 2.1 hours using a mobile. Data, this new ethereal manna from heaven, are produced in real time. They arrive in a continuous stream from a multitude of sources which are generally heterogeneous.

This accumulation of data of all types (audio, video, files, photos, etc.) generates new activities whose aim is to analyze this enormous mass of information. It is therefore necessary to adapt and to try new approaches, new methods, new knowledge and new ways of working, which bring new properties and new challenges, since SEO logic must be created and implemented. At company level, this mass of data is difficult to manage, and interpreting it is the primary challenge. This affects those whose job is to "manipulate" the mass, and it requires a specific infrastructure for creation, storage, processing, analysis and recovery. The biggest challenge lies in "the valuing of data" that are available in quantity, in diversity and at speed.

Acknowledgements vii

Foreword ix

Key Concepts xi

Introduction xix

Chapter 1. The Big Data Revolution 1

1.1. Understanding the Big Data universe 2

1.2. What changes have occurred in data analysis? 8

1.3. From Big Data to Smart Data: making data warehouses intelligent 12

1.4. High-quality information extraction and the emergence of a new profession: data scientists 16

1.5. Conclusion 21

Chapter 2. Open Data: A New Challenge 23

2.1. Why Open Data? 23

2.2. A universe of open and reusable data 28

2.3. Open Data and the Big Data universe 33

2.4. Data development and reuse 38

2.5. Conclusion 41

Chapter 3. Data Development Mechanisms 43

3.1. How do we develop data? 44

3.2. Data governance: a key factor for data valorization 54

3.3. CI: protection and valuation of digital assets 60

3.4. Techniques of data analysis: data mining/text mining 65

3.5. Conclusion 72

Chapter 4. Creating Value from Data Processing 73

4.1. Transforming the mass of data into innovation opportunities 74

4.2. Creation of value and analysis of open databases 82

4.3. Value creation of business assets in web data 87

4.4. Transformation of data into information or “DataViz” 94

4.5. Conclusion 100

Conclusion 101

Bibliography 109

Index 121