PhD thesis defense by Niels Bjørn Bugge Grathwohl – University of Copenhagen

PhD thesis defense by Niels Bjørn Bugge Grathwohl

November 4th, 2015 at 1.00 pm, Niels Bjørn Bugge Grathwohl will defend his PhD thesis.

Title

Parsing With Regular Expressions & Extensions to Kleene Algebra

Abstract

In the first part of this thesis, we investigate methods for regular expression parsing.

We present an O(mn) two-pass algorithm for greedy regular expression parsing in a semi-streaming fashion for expressions of size m and input of size n.
Pass one outputs a log of k bits per input symbol, where k is the number of alternatives and Kleene stars in the expression. This log is used in the second pass to produce a full parse tree.

The two-pass algorithm is extended to an O(2^{m log m}+mn)-time optimally streaming parsing algorithm: parts of the parse tree are output as soon as it is semantically possible to do so.
To be optimal, the algorithm performs a PSPACE-complete preprocessing step; for a fixed RE the running time is linear in the input size.

Finally, we present and implement a determinization procedure, omitting the preprocessing step, and a surface language, Kleenex, for expressing general string transductions.
We have implemented a compiler that translates Kleenex programs into efficient C code.
The resulting programs are essentially optimally streaming, run in worst-case linear time in the input size, and show consistent high performance in the 1 Gbps range on various use cases.

In the second part of this thesis, we study two extensions to Kleene algebra.

Chomsky algebra is an algebra with a structure similar to Kleene algebra, but with a generalized mu-operator for recursion instead of the Kleene star.
We show that the axioms of idempotent semirings along with continuity of the mu-operator completely axiomatize the equational theory of the context-free languages.

KAT+B! is an extension to Kleene algebra with tests (KAT) that adds mutable state.
We describe a test algebra B! for mutable tests and give a commutative coproduct between KATs.
Combining the axioms of B! with those of KAT and some commutativity conditions completely axiomatize the equational theory of an arbitrary KAT enriched with mutable state.

For an electronic copy of the thesis, please contact Jette Giovanni Møller

Assessment Committee

  • Chairman: Associate Professor Andrzej Filinski, Department of Computer Science, University of Copenhagen, DK
  • Member 1: Senior lecturer Alexandra Silva, Department of Computer Science, University College London, UK
  • Member 2: Assistant professor Nate Foster, Department of Computer Science, Cornell University, USA

Academic advisors

  • Professor Fritz Henglein, Department of Computer Science, University of Copenhagen, DK