Generic multiset programming with discrimination-based joins and symbolic Cartesian products

Research output: Contribution to journalJournal articleResearchpeer-review

Standard

Generic multiset programming with discrimination-based joins and symbolic Cartesian products. / Henglein, Fritz; Larsen, Ken Friis.

In: Higher-Order and Symbolic Computation, Vol. 23, No. 3, 2010, p. 337-370.

Research output: Contribution to journalJournal articleResearchpeer-review

Harvard

Henglein, F & Larsen, KF 2010, 'Generic multiset programming with discrimination-based joins and symbolic Cartesian products', Higher-Order and Symbolic Computation, vol. 23, no. 3, pp. 337-370. https://doi.org/10.1007/s10990-011-9078-8

APA

Henglein, F., & Larsen, K. F. (2010). Generic multiset programming with discrimination-based joins and symbolic Cartesian products. Higher-Order and Symbolic Computation, 23(3), 337-370. https://doi.org/10.1007/s10990-011-9078-8

Vancouver

Henglein F, Larsen KF. Generic multiset programming with discrimination-based joins and symbolic Cartesian products. Higher-Order and Symbolic Computation. 2010;23(3):337-370. https://doi.org/10.1007/s10990-011-9078-8

Author

Henglein, Fritz ; Larsen, Ken Friis. / Generic multiset programming with discrimination-based joins and symbolic Cartesian products. In: Higher-Order and Symbolic Computation. 2010 ; Vol. 23, No. 3. pp. 337-370.

Bibtex

@article{a04b367df8514e0396a32fb925b9b0b1,
title = "Generic multiset programming with discrimination-based joins and symbolic Cartesian products",
abstract = "This paper presents GMP, a library for generic, SQL-style programming with multisets. It generalizes the querying core of SQL in a number of ways: Multisets may contain elements of arbitrary first-order data types, including references (pointers), recur- sive data types and nested multisets; it contains an expressive embedded domain-specific language for specifying user-definable equivalence and ordering relations, extending the built-in equality and inequality predicates; it admits mapping arbitrary functions over mul- tisets, not just projections; it supports user-defined predicates in selections; and it allows user-defined aggregation functions.Most significantly, it avoids many cases of asymptotically inefficient nested iteration through Cartesian products that occur in a straightforward stream-based implementation of multisets. It accomplishes this by employing two novel techniques: symbolic (term) repre- sentations of multisets, specifically for Cartesian products, for facilitating dynamic symbolic computation, which intersperses algebraic simplification steps with conventional data pro- cessing; and discrimination-based joins, a generic technique for computing equijoins based on equivalence discriminators, as an alternative to hash-based and sort-merge joins.Full source code for GMP in Haskell, which is based on generic top-down discrimina- tion (not included), is included for experimentation. We provide illustrative examples whose performance indicates that GMP, even without requisite algorithm and data structure engi- neering, is a realistic alternative to SQL even for SQL-expressible queries.",
author = "Fritz Henglein and Larsen, {Ken Friis}",
year = "2010",
doi = "10.1007/s10990-011-9078-8",
language = "English",
volume = "23",
pages = "337--370",
journal = "Higher-Order and Symbolic Computation",
issn = "1388-3690",
publisher = "Springer",
number = "3",

}

RIS

TY - JOUR

T1 - Generic multiset programming with discrimination-based joins and symbolic Cartesian products

AU - Henglein, Fritz

AU - Larsen, Ken Friis

PY - 2010

Y1 - 2010

N2 - This paper presents GMP, a library for generic, SQL-style programming with multisets. It generalizes the querying core of SQL in a number of ways: Multisets may contain elements of arbitrary first-order data types, including references (pointers), recur- sive data types and nested multisets; it contains an expressive embedded domain-specific language for specifying user-definable equivalence and ordering relations, extending the built-in equality and inequality predicates; it admits mapping arbitrary functions over mul- tisets, not just projections; it supports user-defined predicates in selections; and it allows user-defined aggregation functions.Most significantly, it avoids many cases of asymptotically inefficient nested iteration through Cartesian products that occur in a straightforward stream-based implementation of multisets. It accomplishes this by employing two novel techniques: symbolic (term) repre- sentations of multisets, specifically for Cartesian products, for facilitating dynamic symbolic computation, which intersperses algebraic simplification steps with conventional data pro- cessing; and discrimination-based joins, a generic technique for computing equijoins based on equivalence discriminators, as an alternative to hash-based and sort-merge joins.Full source code for GMP in Haskell, which is based on generic top-down discrimina- tion (not included), is included for experimentation. We provide illustrative examples whose performance indicates that GMP, even without requisite algorithm and data structure engi- neering, is a realistic alternative to SQL even for SQL-expressible queries.

AB - This paper presents GMP, a library for generic, SQL-style programming with multisets. It generalizes the querying core of SQL in a number of ways: Multisets may contain elements of arbitrary first-order data types, including references (pointers), recur- sive data types and nested multisets; it contains an expressive embedded domain-specific language for specifying user-definable equivalence and ordering relations, extending the built-in equality and inequality predicates; it admits mapping arbitrary functions over mul- tisets, not just projections; it supports user-defined predicates in selections; and it allows user-defined aggregation functions.Most significantly, it avoids many cases of asymptotically inefficient nested iteration through Cartesian products that occur in a straightforward stream-based implementation of multisets. It accomplishes this by employing two novel techniques: symbolic (term) repre- sentations of multisets, specifically for Cartesian products, for facilitating dynamic symbolic computation, which intersperses algebraic simplification steps with conventional data pro- cessing; and discrimination-based joins, a generic technique for computing equijoins based on equivalence discriminators, as an alternative to hash-based and sort-merge joins.Full source code for GMP in Haskell, which is based on generic top-down discrimina- tion (not included), is included for experimentation. We provide illustrative examples whose performance indicates that GMP, even without requisite algorithm and data structure engi- neering, is a realistic alternative to SQL even for SQL-expressible queries.

U2 - 10.1007/s10990-011-9078-8

DO - 10.1007/s10990-011-9078-8

M3 - Journal article

VL - 23

SP - 337

EP - 370

JO - Higher-Order and Symbolic Computation

JF - Higher-Order and Symbolic Computation

SN - 1388-3690

IS - 3

ER -

ID: 37559988