
Core Data Analysis: Summarization, Correlation, and Visualization
Boris Mirkin
Summary
With in-depth descriptions of data analysis techniques for both summarization and correlation, the author's unconventional approach employs the concept of multivariate data summarization as an alternative to conventional machine-learning prediction methods.
This text examines the goals of data analysis with respect to enhancing knowledge, and identifies data summarization and correlation analysis as the core issues. Data summarization, both quantitative and categorical, is treated within the encoder-decoder paradigm, bringing forward a number of mathematically supported insights into the methods and the relations between them. Two chapters describe methods for categorical summarization (partitioning, divisive clustering, and separate cluster finding), and another explains the methods for quantitative summarization, Principal Component Analysis and PageRank.
Features:
* An in-depth presentation of K-means partitioning including a corresponding Pythagorean decomposition of the data scatter.
* Advice regarding such issues as clustering of categorical and mixed scale data, similarity and network data, interpretation aids, anomalous clusters, the number of clusters, etc.
* Thorough attention to data-driven modelling, with a number of mathematically stated relations between statistical and geometrical concepts, such as those between goodness-of-fit criteria for decision trees and data standardization, similarity and consensus clustering, and modularity clustering and uniform partitioning.
New edition highlights:
* Inclusion of ranking issues such as Google PageRank, linear stratification and tied rankings median, consensus clustering, semi-average clustering, one-cluster clustering
* Restructured to make the logic more straightforward and sections self-contained
Core Data Analysis: Summarization, Correlation, and Visualization is aimed at those who are eager to participate in developing the field, and it also appeals to novices and practitioners.
Boris Mirkin develops methods for clustering and interpretation of complex data within the "data recovery" perspective. Currently these approaches are being extended to the automation of text analysis problems, including the development and use of hierarchical ontologies. He has published a hundred refereed papers and a dozen books, of which the latest are "Clustering: A Data Recovery Approach" (Chapman and Hall/CRC Press, 2012) and the textbook "Introductory Data Analysis" (in Russian, URAIT Publishers, Moscow, 2016).
Technical specifications
PAPER | |
Publisher(s) | Springer |
Author(s) | Boris Mirkin |
Publication date | 17/04/2019 |
Number of pages | 524 |
EAN13 | 9783030002701 |