FLOPS

From Wikipedia, the free encyclopedia

Computer performance
Name      FLOPS
megaflop  10^6
gigaflop  10^9
teraflop  10^12
petaflop  10^15
exaflop   10^18
zettaflop 10^21
yottaflop 10^24

In computing, FLOPS (or flops or flop/s) is an acronym for FLoating point Operations Per Second. FLOPS is a measure of a computer's performance, especially in fields of scientific computing that make heavy use of floating-point calculations; it is similar in spirit to instructions per second. Since the final S stands for "second", conservative speakers treat "FLOPS" as both the singular and plural of the term, although the singular "FLOP" is frequently encountered. Alternatively, the singular FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration", and a flop count is a count of these operations (e.g., the number required by a given algorithm or computer program). In this context, "flops" is simply a plural rather than a rate.
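As a rough illustration of these two senses (a sketch only; the helper names below are made up for this example), the snippet counts the flops a naive matrix multiply performs and measures an achieved FLOPS rate in pure Python:

```python
import time

def matmul_flop_count(n):
    # Each of the n*n output elements of a naive n x n matrix multiply
    # needs n multiplications and n - 1 additions:
    # n * n * (2n - 1) operations in total.
    return n * n * (2 * n - 1)

def measured_flops(n_ops=2_000_000):
    # Time a stream of multiply-adds and report the achieved rate in
    # floating-point operations per second. Pure Python is far slower
    # than what the same hardware reaches with optimized code.
    x, acc = 1.000001, 0.0
    start = time.perf_counter()
    for _ in range(n_ops // 2):
        acc += x * x  # one multiply + one add = 2 flops
    return n_ops / (time.perf_counter() - start)

print(matmul_flop_count(1000))  # 1,999,000,000 flops for a 1000x1000 multiply
print(f"{measured_flops():.3g} FLOPS")
```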

NEC's SX-9 supercomputer was the world's first vector processor to exceed 100 gigaFLOPS per single core. IBM's supercomputer dubbed Blue Gene/P is designed to eventually operate at three petaFLOPS.[1] However, the IBM Roadrunner is the first supercomputer to sustain one petaFLOPS.[2]

A basic calculator performs relatively few FLOPS. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for its response time to exceed that needed by the operator. A response time below 0.1 second in a calculation context is usually perceived as instantaneous by a human operator,[3] so a simple calculator with multiply and divide needs only about 10 FLOPS.


Measuring performance

In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.
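The way such a rating is derived can be sketched as follows (a simplified illustration, not the actual LINPACK code; the `linpack_rating` helper and the 8-second timing are hypothetical): the benchmark solves a dense n × n linear system, whose standard operation count is (2/3)n³ + 2n², and divides that count by the measured solve time.

```python
def linpack_rating(n, solve_seconds):
    # Standard LU-factorization operation count for solving a dense
    # n x n linear system: (2/3)n^3 + 2n^2 floating-point operations.
    flop_count = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flop_count / solve_seconds

# Hypothetical example: a 10,000 x 10,000 system solved in 8 seconds.
rate = linpack_rating(10_000, 8.0)
print(f"{rate / 1e9:.1f} GFLOPS")  # ~83.4 GFLOPS
```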

There are many factors in computer performance other than raw floating-point computation speed, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general only capable of a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS performance of every element of the system). Even when operating on large, highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both peak and sustained FLOPS performance.

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Measuring floating point operation speed, therefore, does not predict accurately how the processor will perform on just any problem. However, for many scientific jobs such as analysis of data, a FLOPS rating is effective.

Historically, the earliest reliably documented serious use of the Floating Point Operation as a metric appears to be AEC justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.

The terminology is currently so confusing that until April 24, 2006 U.S. export control was based upon measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second" or MTOPS. On that date, however, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP) in Weighted TeraFLOPS (WT).

Records

In November 2008, the latest upgrade to the Cray XT Jaguar supercomputer at the Department of Energy's (DOE's) Oak Ridge National Laboratory (ORNL) increased the system's computing power to a peak of 1.64 petaFLOPS (quadrillion mathematical calculations per second), making Jaguar the world's first petaFLOPS system dedicated to open research.

In June 2008, AMD released the ATI Radeon HD 4800 series, reported to be the first GPUs to reach the one-teraFLOPS scale. On August 12, 2008, AMD released the ATI Radeon HD 4870 X2 graphics card with two Radeon R770 GPUs totalling 2.4 teraFLOPS.

On May 25, 2008, an American military supercomputer built by IBM and named 'Roadrunner' reached the computing milestone of one petaFLOPS by processing more than 1.026 quadrillion calculations per second. It headed the June 2008[4] and November 2008[5] TOP500 lists of the most powerful supercomputers (excluding grid computers). The computer's name refers to the roadrunner, the state bird of New Mexico.[6]

On February 4, 2008, the NSF and the University of Texas opened full-scale research runs on Ranger, an AMD/Sun supercomputer that was the most powerful supercomputing system in the world for open science research, operating at a sustained speed of half a petaFLOPS.

On October 25, 2007, NEC Corporation of Japan issued a press release[7] announcing its SX series model SX-9, claiming it to be the world's fastest vector supercomputer with a peak processing performance of 839 teraFLOPS. The SX-9 features the first CPU capable of a peak vector performance of 102.4 gigaFLOPS per single core.

On June 26, 2007, IBM announced the second generation of its top supercomputer, dubbed Blue Gene/P and designed to continuously operate at speeds exceeding one petaFLOPS. When configured to do so, it can reach speeds in excess of three petaFLOPS.[8] In June 2007, Top500.org reported the fastest computer in the world to be the IBM Blue Gene/L supercomputer, with a measured peak of 596 TFLOPS.[9] The Cray XT4 took second place with 101.7 TFLOPS.

In June 2006, a new computer was announced by the Japanese research institute RIKEN, the MDGRAPE-3. The computer's performance tops out at one petaFLOPS, almost twice as fast as the Blue Gene/L, but MDGRAPE-3 is not a general-purpose computer, which is why it does not appear in the Top500.org list. It has special-purpose pipelines for simulating molecular dynamics.

Distributed computing uses the Internet to link personal computers to achieve a similar effect:

  • Folding@Home is, as of February 2009, sustaining over 4.9 PFLOPS,[10] the first computing project of any kind to cross the four-petaFLOPS milestone. This level of performance is primarily enabled by the cumulative effort of a vast array of PlayStation 3 and powerful GPU units.[11]
  • The entire BOINC network averages over 1.4 PFLOPS as of March 15, 2009.[12]
  • SETI@Home averages more than 528 TFLOPS.[13]
  • Einstein@Home sustains more than 150 TFLOPS.[14]
  • As of August 2008, GIMPS is sustaining 27 TFLOPS.[15]

Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.13 GHz. The 80-core chip can increase this result to 2 TFLOPS at 6.26 GHz, although the thermal dissipation at this frequency exceeds 190 watts[16].

As of 2008, the fastest PC processors (quad-core) perform over 70 GFLOPS (Intel Core i7 965 XE) in double precision.[17] GPUs are considerably more powerful: in the GeForce 200 series, for example, the nVidia GTX 280 reaches around 933 GFLOPS across 240 processing elements in single-precision calculations.[18] Note, however, that while GPUs are highly efficient at single-precision calculations, they are not as flexible as general-purpose CPUs for double-precision operations.

Future developments

In May 2008, a collaboration was announced between NASA, SGI and Intel to build a one-petaFLOPS computer in 2009, scaling up to 10 PFLOPS by 2012.[19]

Given the current speed of progress, supercomputers are projected to reach one exaFLOPS in 2019.[20] Erik P. DeBenedictis of Sandia National Laboratories theorizes that a zettaFLOPS computer would be required to accomplish full weather modeling, which could accurately cover a two-week time span.[21] Such systems might be built around 2030.[22]

Cost of computing

Hardware costs

The following list of example computers demonstrates how drastically performance has increased and price has decreased. The "cost per GFLOPS" is the cost of a set of hardware that would theoretically operate at one GFLOPS. During eras when no single computing platform could achieve one GFLOPS, the table lists the total cost of multiple instances of a fast computing platform whose combined speed sums to one GFLOPS. Otherwise, the least expensive computing platform able to achieve one GFLOPS is listed.

Date         | Approximate cost per GFLOPS | Technology | Comments
1961         | US$1,100,000,000,000 ($1.1 trillion), i.e. US$1,100 per FLOPS | About 17 million IBM 1620 units costing $64,000 each | The 1620's multiplication operation takes 17.7 ms.[23]
1984         | US$15,000,000 | Cray X-MP |
1997         | US$30,000 | Two 16-processor Beowulf clusters with Pentium Pro microprocessors[24] |
April 2000   | US$1,000 | Bunyip Beowulf cluster | Developed at the Australian National University; the first sub-US$1/MFLOPS computing technology. It won the Gordon Bell Prize in 2000.
May 2000     | US$640 | KLAT2 | Developed at the University of Kentucky.
August 2003  | US$82 | KASY0 | Also developed at the University of Kentucky.
March 2007   | US$0.42 | Ambric AM2045[25] |
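The 1961 entry can be reconstructed from the figures in the table, assuming each multiplication counts as one floating-point operation:

```python
# Reconstructing the 1961 cost-per-GFLOPS entry: an IBM 1620 multiply
# takes 17.7 ms, so one unit sustains about 1 / 0.0177 ~ 56.5 FLOPS,
# and reaching 1 GFLOPS requires many units in aggregate.
multiply_seconds = 0.0177
flops_per_unit = 1.0 / multiply_seconds           # ~56.5 FLOPS per machine
units_needed = 1e9 / flops_per_unit               # ~17.7 million units
total_cost = units_needed * 64_000                # at $64,000 per unit

print(f"{units_needed / 1e6:.1f} million units")  # 17.7 million units
print(f"${total_cost / 1e12:.2f} trillion")       # $1.13 trillion
```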


The trend toward a higher and higher number of transistors that can be placed inexpensively on an integrated circuit follows Moore's law. This trend explains the increasing speed and decreasing cost of computer processing.

Operation costs

In terms of energy cost, the most efficient TOP500 supercomputer, according to the Green500 list as of November 2008, runs at 536.24 MFLOPS per watt. This translates to an energy requirement of about 1.86 watts per GFLOPS; less efficient supercomputers require considerably more.
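The conversion behind that figure is straightforward:

```python
# Converting the Green500 efficiency figure to watts per GFLOPS:
# 536.24 MFLOPS per watt means each watt delivers 0.53624 GFLOPS,
# so one GFLOPS needs 1000 / 536.24 watts.
mflops_per_watt = 536.24
watts_per_gflops = 1000.0 / mflops_per_watt
print(f"{watts_per_gflops:.2f} W per GFLOPS")  # 1.86 W per GFLOPS
```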

Hardware costs for low-cost supercomputers may be less significant than energy costs when running continuously for several years. A PlayStation 3 (PS3) 40 GB (65 nm Cell) costs $399 and consumes 135 watts,[26] or $118 of electricity each year if operated 24 hours per day, conservatively assuming the U.S. national average residential electric rate of $0.10/kWh[27] (0.135 kW × 24 h × 365 d × 0.10 $/kWh = $118.26). At that rate, 3.5 years of electricity ($413) costs more than the PS3 itself. However, "extreme gamers" spend only about 45 hours per week gaming,[28] so in an "extreme" case only about 317 kWh are consumed annually, at a cost of $31.68. Therefore, a more realistic "extreme gamer" would require more than 12.5 years for total operating costs to exceed the original purchase price.
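The operating-cost arithmetic above can be reproduced directly (the 317 kWh figure in the text evidently rounds slightly differently from the 45 h/week × 52 weeks assumed here):

```python
# PS3 operating-cost arithmetic: 135 W draw, assumed U.S. average
# residential electricity rate of $0.10/kWh.
power_kw = 0.135
rate_per_kwh = 0.10

# Running 24 hours a day, every day of the year:
annual_cost_continuous = power_kw * 24 * 365 * rate_per_kwh   # $118.26

# "Extreme gamer" case: 45 hours per week instead of 24/7.
annual_kwh_gamer = power_kw * 45 * 52                         # ~316 kWh
annual_cost_gamer = annual_kwh_gamer * rate_per_kwh           # ~$31.59

# Years of gaming before electricity exceeds the $399 purchase price:
years_to_exceed_price = 399 / annual_cost_gamer               # ~12.6 years
print(f"${annual_cost_continuous:.2f}/yr continuous, "
      f"${annual_cost_gamer:.2f}/yr gaming, "
      f"{years_to_exceed_price:.1f} years to break even")
```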


References

  1. ^ IBM Press Release (2007-06-26). "IBM Triples Performance of World's Fastest, Most Energy-Efficient Supercomputer". IBM. http://www-03.ibm.com/press/us/en/pressrelease/21791.wss. Retrieved on 2008-01-30. 
  2. ^ "Military supercomputer sets record - CNET News.com". http://news.cnet.com/Military-supercomputer-sets-record/2100-1010_3-6241145.html?tag=nefd.top. 
  3. ^ "Response Times: The Three Important Limits". Jakob Nielsen. http://www.useit.com/papers/responsetime.html. Retrieved on 2008-06-11. 
  4. ^ Sharon Gaudin (2008-06-09). "IBM's Roadrunner smashes 4-minute mile of supercomputing". Computerworld. http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=hardware&articleId=9095318&taxonomyId=12&intsrc=kc_top. Retrieved on 2008-06-10. 
  5. ^ Austin ISC08
  6. ^ Fildes, Jonathan (2008-06-09). "Supercomputer sets petaflop pace". BBC News. http://news.bbc.co.uk/1/hi/technology/7443557.stm. Retrieved on 2008-07-08. 
  7. ^ "NEC Launches World's Fastest Vector Supercomputer, SX-9". NEC. 2007-10-25. http://www.nec.co.jp/press/en/0710/2501.html. Retrieved on 2008-07-08. 
  8. ^ "June 2008". TOP500. http://www.top500.org/lists/2008/06. Retrieved on 2008-07-08. 
  9. ^ "29th TOP500 List of World's Fastest Supercomputers Released". Top500.org. 2007-06-23. http://top500.org/news/2007/06/23/29th_top500_list_world_s_fastest_supercomputers_released. Retrieved on 2008-07-08. 
  10. ^ "Client statistics by OS". Folding@Home. 2009-01-22. http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats. Retrieved on 2009-01-23. 
  11. ^ Staff (November 6, 2008). "Sony Computer Entertainment's Support for Folding@home™ Project on PlayStation®3 Receives This Year's "Good Design Gold Award"". Sony Computer Entertainment Inc. http://www.scei.co.jp/corporate/release/081106de.html. Retrieved on December 11, 2008. 
  12. ^ "Credit overview". BOINC. http://www.boincstats.com/stats/project_graph.php?pr=bo. Retrieved on 2008-08-04. 
  13. ^ "SETI@Home Credit overview". BOINC. http://www.boincstats.com/stats/project_graph.php?pr=sah. Retrieved on 2008-08-04. 
  14. ^ "Server Status". Einstein@Home. http://einstein.phys.uwm.edu/server_status.php. Retrieved on 2008-07-08. 
  15. ^ Internet PrimeNet Server Parallel Technology for the Great Internet Mersenne Prime Search
  16. ^ http://www.bit-tech.net/hardware/2007/04/30/the_arrival_of_teraflop_computing/2
  17. ^ "Intel Core i7 Performance Preview". TECHGAGE. 2008-11-03. http://techgage.com/article/intel_core_i7_performance_preview/9. Retrieved on 2008-11-17. 
  18. ^ http://www.tomshardware.com/reviews/nvidia-gtx-280,1953-2.html
  19. ^ "NASA collaborates with Intel and SGI on forthcoming petaflops super computers". Heise online. 2008-05-09. http://www.heise.de/english/newsticker/news/107683. 
  20. ^ Thibodeau, Patrick (2008-06-10). "IBM breaks petaflop barrier". InfoWorld. http://www.infoworld.com/article/08/06/10/IBM_breaks_petaflop_barrier_1.html. 
  21. ^ DeBenedictis, Erik P. (2005). "Reversible logic for supercomputing". Proceedings of the 2nd conference on Computing frontiers. pp. 391–402. ISBN 1595930191. 
  22. ^ "IDF: Intel says Moore's Law holds until 2029". Heise Online. 2008-04-04. http://www.heise.de/english/newsticker/news/106017. 
  23. ^ IBM 1961 BRL Report
  24. ^ Loki and Hyglac
  25. ^ Halfhill, Tom R. (2006-10-10). "Ambric's New Parallel Processor". Microprocessor Report (Reed Electronics Group): 1–9. http://www.ambric.com/pdf/MPR_Ambric_Article_10-06_204101.pdf. Retrieved on 2008-07-08. 
  26. ^ Quilty-Harper, Conrad (2007-10-30). "40 GB PS3 features 65 nm chips, lower power consumption". Engadget. http://www.engadget.com/2007/10/30/40gb-ps3-features-65nm-chips-lower-power-consumption/. Retrieved on 2008-07-08. 
  27. ^ "Average Retail Price of Electricity to Ultimate Customers by End-Use Sector, by State". Energy Information Administration. 2008-06-10. http://www.eia.doe.gov/cneaf/electricity/epm/table5_6_a.html. Retrieved on 2008-07-08. 
  28. ^ "Extreme gamers spend an average of 45 hours per week playing video games". The NPD Group. 2008-08-11. http://www.npd.com/press/releases/press_080811.html. Retrieved on 2009-01-16. 

