Data Analysis with Open Source Tools

540 pages, published 14/12/2010


Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.

Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve - rather than rely on tools to think for you.

  • Use graphics to describe data with one, two, or dozens of variables
  • Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
  • Mine data with computationally intensive methods such as simulation and clustering
  • Make your conclusions understandable through reports, dashboards, and other metrics programs
  • Understand financial calculations, including the time-value of money
  • Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
  • Become familiar with different open source programming environments for data analysis

"Finally, a concise reference for understanding how to conquer piles of data." - Austin King, Senior Web Developer, Mozilla

Philipp K. Janert

After previous careers in physics and software development, Philipp K. Janert currently provides consulting services for data analysis, algorithm development, and mathematical modeling. He has worked for small start-ups and in large corporate environments, both in the U.S. and overseas.

He prefers simple solutions that work to complicated ones that don't, and thinks that purpose is more important than process. Philipp is the author of "Gnuplot in Action - Understanding Data with Graphs" (Manning Publications), and has written for the O'Reilly Network, IBM developerWorks, and IEEE Software. He is named inventor on a handful of patents, and is an occasional contributor to CPAN. He holds a Ph.D. in theoretical physics from the University of Washington. Visit his company website at


  • Chapter 1 Introduction
  • Graphics: Looking at Data
  • Chapter 2 A Single Variable: Shape and Distribution
  • Chapter 3 Two Variables: Establishing Relationships
  • Chapter 4 Time As a Variable: Time-Series Analysis
  • Chapter 5 More Than Two Variables: Graphical Multivariate Analysis
  • Chapter 6 Intermezzo: A Data Analysis Session
  • Analytics: Modeling Data
  • Chapter 7 Guesstimation and the Back of the Envelope
  • Chapter 8 Models from Scaling Arguments
  • Chapter 9 Arguments from Probability Models
  • Chapter 10 What You Really Need to Know About Classical Statistics
  • Chapter 11 Intermezzo: Mythbusting - Bigfoot, Least Squares, and All That
  • Computation: Mining Data
  • Chapter 12 Simulations
  • Chapter 13 Finding Clusters
  • Chapter 14 Seeing the Forest for the Trees: Finding Important Attributes
  • Chapter 15 Intermezzo: When More Is Different
  • Applications: Using Data
  • Chapter 16 Reporting, Business Intelligence, and Dashboards
  • Chapter 18 Predictive Analytics
  • Chapter 19 Epilogue: Facts Are Not Reality
  • Appendix Programming Environments for Scientific Computation and Data Analysis
Technical details of the book "Data Analysis with Open Source Tools"

Publisher(s) O'Reilly
Author(s) Philipp K. Janert
Published 14/12/2010
Pages 540
Format 18 x 23.5
Cover Paperback
Interior Black and white
EAN13 9780596802356
ISBN13 978-0-596-80235-6


