Description
Symbolic data analysis is a relatively new field that provides a range of methods for analyzing complex datasets. Standard statistical methods do not have the power or flexibility to make sense of very large datasets, and symbolic data analysis techniques have been developed to extract knowledge from such data. Symbolic data methods differ from data mining methods, for example, in that rather than identifying points of interest in the data, they allow the user to build models of the data and make predictions about future events.
This book is the result of the work of a pan-European project team led by Edwin Diday, following three years of work sponsored by EUROSTAT. It includes a full explanation of the new SODAS software developed as a result of this project. The software and methods described highlight the crossover between statistics and computer science, with a particular emphasis on data mining.
1. The state of the art in symbolic data analysis: overview and future (Edwin Diday).
PART I. DATABASES VERSUS SYMBOLIC OBJECTS.
2. Improved generation of symbolic objects from relational databases (Yves Lechevallier, Aicha El Golli and Georges Hébrail).
3. Exporting symbolic objects to databases (Donato Malerba, Floriana Esposito and Annalisa Appice).
4. A statistical metadata model for symbolic objects (Haralambos Papageorgiou and Maria Vardaki).
5. Editing symbolic data (Monique Noirhomme-Fraiture, Paula Brito, Anne de Baenst-Vandenbroucke and Adolphe Nahimana).
6. The normal symbolic form (Marc Csernel and Francisco de A.T. de Carvalho).
7. Visualization (Monique Noirhomme-Fraiture and Adolphe Nahimana).
PART II. UNSUPERVISED METHODS.
8. Dissimilarity and matching (Floriana Esposito, Donato Malerba and Annalisa Appice).
9. Unsupervised divisive classification (Jean-Paul Rasson, Jean-Yves Pirçon, Pascale Lallemand and Séverine Adans).
10. Hierarchical and pyramidal clustering (Paula Brito and Francisco de A.T. de Carvalho).
11. Clustering methods in symbolic data analysis (Francisco de A.T. de Carvalho, Yves Lechevallier and Rosanna Verde).
12. Visualizing symbolic data by Kohonen maps (Hans-Hermann Bock).
13. Validation of clustering structure: determination of the number of clusters (André Hardy).
14. Stability measures for assessing a partition and its clusters: application to symbolic data sets (Patrice Bertrand and Ghazi Bel Mufti).
15. Principal component analysis of symbolic data described by intervals (N. Carlo Lauro, Rosanna Verde and Antonio Irpino).
16. Generalized canonical analysis (N. Carlo Lauro, Rosanna Verde and Antonio Irpino).
PART III. SUPERVISED METHODS.
17. Bayesian decision trees (Jean-Paul Rasson, Pascale Lallemand and Séverine Adans).
18. Factor discriminant analysis (N. Carlo Lauro, Rosanna Verde and Antonio Irpino).
19. Symbolic linear regression methodology (Filipe Afonso, Lynne Billard, Edwin Diday and Mehdi Limam).
20. Multi-layer perceptrons and symbolic data (Fabrice Rossi and Brieuc Conan-Guez).
PART IV. APPLICATION AND THE SODAS SOFTWARE.
21. Application to the Finnish, Spanish and Portuguese data of the European Social Survey (Soile Mustjärvi and Seppo Laaksonen).
22. People’s life values and trust components in Europe: symbolic data analysis for 20-22 countries (Seppo Laaksonen).
23. Symbolic analysis of the Time Use Survey in the Basque country (Marta Mas and Haritz Olaeta).
24. SODAS2 software: overview and methodology (Anne de Baenst-Vandenbroucke and Yves Lechevallier).