List comprehension

From Wikipedia, the free encyclopedia

Jump to: navigation, search

A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. It follows the form of the mathematical set-builder notation (set comprehension.) as distinct from the use of map and filter functions.

Contents

[edit] Overview

Consider the following example in set builder notation.

S=\{\,2\cdot x\mid x \in \mathbb{N},\ x^2>3\,\}

This can be read, "S is the set of all 2 times x where x is an item in the set of natural numbers, for which x squared is greater than 3."

In this annotated version of the example:

S=\{\,\underbrace{2\cdot x}_{\color{Violet}\text{output function}}\mid \underbrace{x}_{\color{Violet}\text{variable}} \in \underbrace{\mathbb{N}}_{\color{Violet}\text{input set}},\ \underbrace{x^2>3}_{\color{Violet}\text{predicate}}\,\}
  • x is the variable representing members of an input set.
  • \mathbb{N} represents the input set, which in this example is the set of natural numbers
  • x2 > 3 is a predicate function acting as a filter on members of the input set.
  • 2\cdot x is an output function producing members of the new set from members of the input set that satisfy the predicate function.
  • {} brackets contain the expression
  • \mid , the vertical bar and the comma are separators.


A list comprehension has the same syntactic components to represent generation of a list in order from an input list or iterator:

  • A variable representing members of an input list.
  • An input list (or iterator).
  • An optional predicate expression.
  • And an output expression producing members of the output list from members of the input iterable that satisfy the predicate.

The order of generation of members of the output list is based on the order of items in the input.


In Haskell's list comprehension syntax, this set-builder construct would be written similarly, as:

s = [ 2*x | x <- [0..], x^2 > 3 ] 

Here, the list [0..] represents \mathbb{N}, x^2>3 represents the predicate, and 2*x represents the output expression.


List comprehensions give results in a defined order, (unlike the members of sets); and list comprehensions may generate the members of a list in order, rather than produce the entirety of the list thus allowing, for example, the previous Haskell definition of the members of an infinite list.

[edit] History

The SETL programming language (later 1960's) had a set formation construct, and the computer algebra system AXIOM (1973) has a similar construct that processes streams, but the first use of the term "comprehension" for such constructs was in Rod Burstall and John Darlington's description of their programming language NPL from 1977.

Comprehensions were proposed as a query notation for databases[1] and were implemented in the Kleisli database query language.[2]

[edit] Examples in different programming languages

The following provides a few examples of specific syntax used in programming languages. For a more comprehensive comparison, please see the main article in the Programming language comparison series.

Although the original example denotes an infinite list, few languages can express that, so in some of those cases we show how to take a subset of 0,1,...,100 rather than a subset of \mathbb{N}.


[edit] Clojure

An infinite lazy sequence:

(for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x))

This code produces the first 20 items of the above sequence:

(take 20 (for [x (iterate inc 0) :when (> (* x x) 3)] (* 2 x)))

Result:

(4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42)


[edit] C#

The infinite set of natural numbers can be made available as a static property with a getter:

public static IEnumerable<int> Naturals 
{
    get { for (int n = 0; ; n++) yield return n; } 
}

Then the set can be defined by a comprehension:

var s = from x in Naturals where x*x > 3 select x*2;

Latest .NET 3.5 Framework has Enumerable.Range method which allows to write like this:

var s = from x in Enumerable.Range(0, int.MaxValue) where x*x > 3 select x*2;

or equivalent code through chain methods:

var s = Enumerable.Range(0, int.MaxValue).Where(x => x*x > 3).Select(x => x*2);

[edit] Erlang

The same example in Erlang:

S = [2*X || X <- lists:seq(0,100), X*X > 3].


[edit] F#

The F# generator comprehension has the list comprehension syntax elements. Generator comprehensions can be used to generate Lists, Sequences (like lists but evaluated on-demand) and Arrays (not discussed here).

Generators are of the form [for x in collection do ... yield expr] for lists and seq {for x in collection do ... yield expr} for sequences. For example:

(* Int32.MaxValue used to indicate "infinite" *)
> seq { for x in 0..System.Int32.MaxValue do
          if x*x > 3 then yield 2*x } ;;
val it : seq<int> = seq [4; 6; 8; 10; ...]


[edit] Haskell

(Please refer to the main example in the overview).


[edit] JavaFX

Using the for statement and a boolean expression:

  var s = for (i in [1..100][n | n*n > 3]) { 2*i }


[edit] Javascript 1.8

Number.prototype.__iterator__=function(){for (let i=0; i<this; i++) yield i}
 
var s = [2*i for (i in 100) if (i*i>3)]

[edit] OCaml

OCaml Batteries Included has uniform comprehension syntax for lists, arrays, enumerations (like streams), lazy lists (like lists but evaluated on-demand), sets, hashtables, etc.

Comprehension are of the form [? expression | x <- enumeration ; condition; condition ; ...]

For instance,

#   [? 2 * x | x <- 0 -- max_int ; x * x > 3];;
- : int Enum.t = <abstr>

or, to compute a list,

#   [? List: 2 * x | x <- 0 -- 100 ; x * x > 3];;
- : int list = [2; 4; 6; 8; 10]

or, to compute a set,

#   [? PSet: 2 * x | x <- 0 -- 100 ; x * x > 3];;
- : int PSet.t = <abstr>

etc.


[edit] Python

The Python programming language has a corresponding syntax for expressing list comprehensions. The near-equivalent in Python to the example above is as follows:

S = [2*x for x in range(101) if x**2 > 3]

A generator expression may be used in Python 2.4 to achieve functional equivalence with S using a generator to iterate an infinite list:

from itertools import count
S = (2*x for x in count() if x**2 > 3)


[edit] Scala

Using the for-comprehension:

val s = for (x <- Stream.from(0); if x*x > 3) yield 2*x


[edit] Visual Prolog

S = [ 2*X || X = std::fromTo(1, 100), X^2 > 3 ]


[edit] Similar constructs

[edit] Monad comprehension

In Haskell, a monad comprehension is a generalization of the list comprehension to other monads in functional programming.


[edit] Set comprehension

Version 3 of the Python language introduces syntax for set comprehensions. Similar in form to list comprehensions, set comprehensions generate Python sets instead of lists.

>>> s = {v for v in 'ABCDABCD' if v not in 'CB'}
>>> print(s)
{'A', 'D'}
>>> type(s)
<class 'set'>
>>>


[edit] Dict comprehension

Version 3 of the Python language introduces syntax for dictionary comprehensions. Similar in form to list comprehensions, dictionary comprehensions generate Python dictionaries instead of lists.

>>> d = {k:v for k,v in zip(range(4), 'ABCD') if v not in 'CB'}
>>> print(d)
{0: 'A', 3: 'D'}
>>> type(d)
<class 'dict'>
>>>


[edit] Parallel list comprehension

The Glasgow Haskell Compiler has an extension called parallel list comprehension (also known as zip-comprehension) that permits multiple independent branches of qualifiers within the list comprehension syntax. Whereas qualifiers separated by commas are dependent, qualifier branches separated by pipes are evaluated in parallel.


[edit] XQuery and XPath

Like the original NPL use, these are fundamentally database access languages.

This makes the comprehension concept more important, because it is computationally infeasible to retrieve the entire list and operate on it (the initial 'entire list' may be an entire XML database).

In XPath, the expression:

/library/book//paragraph[@style='first-in-chapter']

is conceptually evaluated as a series of "steps" where each step produces a list and the next step applies a filter function to each element in the previous step's output. See: http://www.w3.org/TR/xpath#section-Location-Steps.

In XQuery, full XPath is available, but FLWOR statements are also used, which is a more powerful comprehension construct. See http://www.w3schools.com/XQuery/xquery_flwor.asp.

for $b in //book
where $b[@pages < 400]
order by $b//title
return
  <shortBook>
    <title>{$b//title}</title>
    <firstPara>{($book//paragraph)[1]}</firstPara>
  </shortBook>

Here the XPath //book is evaluated to create a sequence (aka list); the where clause is a functional "filter", the order by sorts the result, and the <shortBook>...</shortBook> XML snippet is actually an anonymous function that builds/transforms XML for each element in the sequence using the 'map' approach found in other functional languages.

So, in another functional language the above FLWOR statement may be implemented like this:

 map(
   newXML(shortBook, newXML(title, $1.title), newXML(firstPara, $1...))
   filter(
     lt($1.pages, 400), 
     xpath(//book)
   )
 )


[edit] See also


[edit] Notes and references

  • List Comprehension in The Free On-line Dictionary of Computing, Editor Denis Howe.
  • Phil Trinder [1] Comprehensions, a query notation for DBPLs, Proceedings of the third international workshop on Database programming languages : bulk types & persistent data: bulk types & persistent data, Nafplion, Greece, Pages 55-68, 1992.
  • Philip Wadler. Comprehending Monads. Proceedings of the 1990 ACM Conference on LISP and Functional Programming, Nice. 1990.
  • Limsoon Wong [2] The Functional Guts of the Kleisli Query System, International Conference on Functional Programming, Proceedings of the fifth ACM SIGPLAN international conference on Functional programming, Pages 1-10, 2000.

Haskell:

OCaml:

Python:

PostScript:

Common Lisp

Clojure

Axiom:

  • Axiom stream examples: [3]


[edit] External links

Personal tools