Data visualization

From Wikipedia, the free encyclopedia

The research process from data to visualization.[1]

Data visualization is the study of the visual representation of data, defined as information which has been abstracted in some schematic form, including attributes or variables for the units of information.[2]

Overview

The main goal of data visualization is to communicate information clearly and effectively through graphical means. This does not mean that a visualization must look boring to be functional, or be extremely sophisticated to look beautiful. To convey ideas effectively, aesthetic form and functionality need to go hand in hand, providing insight into a sparse and complex data set by communicating its key aspects in a more intuitive way. Yet designers often fail to balance design and function, creating gorgeous data visualizations that fail to serve their main purpose: to communicate information.[3]

Data visualization is closely related to information graphics, information visualization, scientific visualization and statistical graphics. It is an active area of research, teaching and development. The term unites the established field of scientific visualization and the more recent field of information visualization.[4]

History

The field's origins are in the early days of computer graphics in the 1950s, when the first graphs and figures were generated by computers. A significant boost was given to the field with the appearance, in 1987, of the NSF report "Visualization in Scientific Computing" edited by Bruce H. McCormick, Thomas A. DeFanti and Maxine D. Brown. In this report the need for new computer-based visualization techniques was stressed. With the rapid increase of computing power, larger and more complex numerical models were developed, resulting in the generation of huge numerical data sets. Also, large data sets were generated by data acquisition devices such as medical scanners and microscopes, and data was collected in large databases containing text, numerical information and multimedia information. Advanced computer graphics techniques were needed to process and visualize these massive data sets.[4]

The phrase "Visualization in Scientific Computing", which later became "Scientific Visualization", was initially used to refer to visualization as part of the process of scientific computing: the use of computer modelling and simulation in scientific and engineering practice. More recently, visualization has also become concerned with data from other sources, including large and heterogeneous data collections found in business and finance, administration, digital media, and so on. A new research area called Information Visualization was launched in the early 1990s to support the analysis of abstract and heterogeneous data sets in many application areas. As a result, the phrase "Data Visualization" is gaining acceptance as a term that covers both the scientific and information visualization fields.[4]

Since then, data visualization has been an evolving concept whose boundaries are continually expanding and, as such, it is best defined in terms of loose generalizations. It refers to the more technologically advanced techniques that allow visual interpretation of data through the representation, modelling and display of solids, surfaces, properties and animations, involving the use of graphics, image processing, computer vision and user interfaces. It encompasses a much broader range of techniques than specific techniques such as solid modelling.[5]

Data visualization scope

There are different approaches to the scope of data visualization. One common focus is on information presentation. For example, Michael Friendly (2008) presumes two main parts of data visualization: statistical graphics and thematic cartography.[2] Along these lines, the article "Data Visualization: Modern Approaches" (2007) gives an overview of seven subjects of data visualization.[6]

These subjects are all closely related to graphic design and information representation.

On the other hand, from a computer science perspective, Frits H. Post (2002) categorized the field into a number of sub-fields:[4]

  • Visualization Algorithms and Techniques
  • Volume Visualization
  • Information Visualization
  • Multiresolution Methods
  • Modelling Techniques
  • Interaction Techniques and Architectures

The success of data visualization is due to the soundness of the basic idea behind it: the use of computer-generated images to gain insight and knowledge from data and its inherent patterns and relationships. A second premise is the use of the broad bandwidth of the human sensory system in steering and interpreting complex processes and simulations involving data sets from diverse scientific disciplines, as well as large collections of abstract data from many sources. These concepts have had a profound and widespread impact on the methodology of computational science and engineering, as well as on management and administration. Post, Nielson and Bonneau emphasize the interplay between various application areas and their specific problem-solving visualization techniques.[4]

Related fields

Data acquisition

Data acquisition is the sampling of the real world to generate data that can be manipulated by a computer. Sometimes abbreviated DAQ or DAS, data acquisition typically involves acquisition of signals and waveforms and processing the signals to obtain desired information. The components of data acquisition systems include appropriate sensors that convert any measurement parameter to an electrical signal, which is acquired by data acquisition hardware.
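The acquisition chain described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the sensor is a hypothetical 5 Hz sine-wave voltage source, and the names sensor_voltage, sample_signal and quantize are invented for illustration, as is the idealized 8-bit converter. Real data acquisition hardware also involves signal conditioning, filtering and timing concerns that are not modelled here.

```python
import math


def sensor_voltage(t):
    """Hypothetical sensor output in volts: a 5 Hz sine between -1 V and +1 V."""
    return math.sin(2 * math.pi * 5 * t)


def sample_signal(signal, sample_rate_hz, duration_s):
    """Read a continuous-time signal at evenly spaced instants."""
    n = int(sample_rate_hz * duration_s)
    return [signal(i / sample_rate_hz) for i in range(n)]


def quantize(samples, bits, v_min, v_max):
    """Map each voltage to an integer code, as an idealized converter would."""
    levels = 2 ** bits - 1
    codes = []
    for v in samples:
        clipped = min(max(v, v_min), v_max)  # keep within the input range
        codes.append(round((clipped - v_min) / (v_max - v_min) * levels))
    return codes


# Assumed settings: 100 samples per second for one second, 8-bit codes.
samples = sample_signal(sensor_voltage, sample_rate_hz=100, duration_s=1.0)
codes = quantize(samples, bits=8, v_min=-1.0, v_max=1.0)
```

The resulting list of integer codes is the kind of data "that can be manipulated by a computer" and later analysed or visualized.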

Data analysis

Data analysis is the process of looking at and summarizing data with the intent to extract useful information and develop conclusions. Data analysis is closely related to data mining, but data mining tends to focus on larger data sets, with less emphasis on making inference, and often uses data that was originally collected for a different purpose. In statistical applications, some people divide data analysis into descriptive statistics, exploratory data analysis and confirmatory data analysis, where the EDA focuses on discovering new features in the data, and CDA on confirming or falsifying existing hypotheses.
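The difference between descriptive statistics and exploratory data analysis can be sketched with Python's standard statistics module. The data values and the function names describe and flag_outliers are made up for illustration, and the interquartile-range rule used to flag unusual values is one common exploratory heuristic chosen here as an assumption, not a method prescribed by the sources cited in this article.

```python
import statistics


def describe(values):
    """Descriptive statistics: summarize the data without interpreting it."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def flag_outliers(values, k=1.5):
    """Exploratory step: flag values far outside the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Illustrative measurements with one suspicious value.
measurements = [12, 15, 14, 10, 13, 15, 11, 14, 48]
summary = describe(measurements)
unusual = flag_outliers(measurements)
```

A confirmatory analysis would go one step further and test a stated hypothesis about the data, for example whether the flagged value is a measurement error.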

Data governance

Data governance encompasses the people, processes and technology required to create a consistent, enterprise view of an organisation's data in order to:

  • Increase consistency & confidence in decision making
  • Decrease the risk of regulatory fines
  • Improve data security
  • Maximize the income generation potential of data
  • Designate accountability for information quality

Data management

Data management comprises all the academic disciplines related to managing data as a valuable resource. The official definition provided by DAMA is that "Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise." This definition is fairly broad and encompasses a number of professions which may not have direct technical contact with lower-level aspects of data management, such as relational database management.

Data mining

Data mining is the process of sorting through large amounts of data and picking out relevant information. It is usually used by business intelligence organizations and financial analysts, but is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods.

It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data"[7] and "the science of extracting useful information from large data sets or databases."[8] Data mining in relation to enterprise resource planning is the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making.[9]
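As a toy illustration of looking for patterns in transaction data, the sketch below counts how often pairs of items occur together and keeps the pairs that reach a minimum count. The transactions, the function name frequent_pairs and the min_support threshold are all invented for this example; pair counting is only one simple pattern-finding technique and is not claimed to be the method used by the cited authors.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Count co-occurring item pairs and keep those seen at least min_support times."""
    counts = Counter()
    for items in transactions:
        # Sort and de-duplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up transaction records.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "butter", "milk"],
]
pairs = frequent_pairs(transactions, min_support=3)
```

Here only the pair ("bread", "milk") appears in three transactions, so it is the only pattern reported.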

Data modeling

The data modeling process.

Data modeling in software engineering is the process of creating a data model by applying formal data model descriptions using data modeling techniques. Data modeling is a technique for defining business requirements for a database. It is sometimes called database modeling because a data model is eventually implemented in a database.[10]

The figure illustrates the way data models are developed and used today. A conceptual data model is developed based on the data requirements for the application that is being developed, perhaps in the context of an activity model. The data model will normally consist of entity types, attributes, relationships, integrity rules, and the definitions of those objects. This is then used as the starting point for interface or database design.[11]
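The ingredients of a data model listed above (entity types, attributes, relationships and integrity rules) can be sketched in a few lines of Python. The Customer and Order entities, their attributes and the check_references helper are hypothetical and chosen only for illustration; an actual conceptual model would normally be expressed in a modeling notation and implemented in a database rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Entity type with two attributes."""
    customer_id: int
    name: str


@dataclass
class Order:
    """Entity type related to Customer through customer_id."""
    order_id: int
    customer_id: int  # relationship: each order belongs to one customer
    quantity: int

    def __post_init__(self):
        # Integrity rule on a single attribute.
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


def check_references(customers, orders):
    """Integrity rule across entities: return orders with no matching customer."""
    known = {c.customer_id for c in customers}
    return [o for o in orders if o.customer_id not in known]


customers = [Customer(1, "Ada")]
orders = [Order(10, 1, 2), Order(11, 99, 1)]
dangling = check_references(customers, orders)
```

In this example the second order refers to a customer that does not exist, so it is returned as a violation of the cross-entity rule.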

References

  1. National Visualization and Analytics Center. Retrieved 1 July 2008.
  2. Michael Friendly (2008). "Milestones in the history of thematic cartography, statistical graphics, and data visualization".
  3. "Data Visualization and Infographics" in: Graphics, Monday Inspiration, January 14th, 2008.
  4. Frits H. Post, Gregory M. Nielson and Georges-Pierre Bonneau (2002). Data Visualization: The State of the Art.
  5. Paul Reilly, S. P. Q. Rahtz (eds.) (1992). Archaeology and the Information Age: A Global Perspective. p. 92.
  6. "Data Visualization: Modern Approaches" in: Graphics, August 2nd, 2007.
  7. W. Frawley, G. Piatetsky-Shapiro and C. Matheus (Fall 1992). "Knowledge Discovery in Databases: An Overview". AI Magazine: pp. 213–228. ISSN 0738-4602.
  8. D. Hand, H. Mannila, P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge, MA. ISBN 0-262-08290-X.
  9. Ellen Monk, Bret Wagner (2006). Concepts in Enterprise Resource Planning, Second Edition. Thomson Course Technology, Boston, MA. ISBN 0-619-21663-8.
  10. Jeffrey L. Whitten, Lonnie D. Bentley, Kevin C. Dittman (2004). Systems Analysis and Design Methods. 6th edition. ISBN 025619906X.
  11. Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).

Further reading

  • Chandrajit Bajaj, Bala Krishnamurthy (1999). Data Visualization Techniques.
  • William S. Cleveland (1993). Visualizing Data. Hobart Press.
  • William S. Cleveland (1994). The Elements of Graphing Data. Hobart Press.
  • Alexander N. Gorban, Balázs Kégl and Andrey Zinovyev (2007). Principal Manifolds for Data Visualization and Dimension Reduction.
  • John P. Lee and Georges G. Grinstein (eds.) (1994). Database Issues for Data Visualization: IEEE Visualization '93 Workshop, San Diego.
  • Peter R. Keller and Mary Keller (1993). Visual Cues: Practical Data Visualization.
  • Frits H. Post, Gregory M. Nielson and Georges-Pierre Bonneau (2002). Data Visualization: The State of the Art.
