Regular expression containment: coinductive axiomatization and computational interpretation

Publikation: Bidrag til tidsskriftKonferenceartikelForskningfagfællebedømt

We present a new sound and complete axiomatization of regular
expression containment. It consists of the conventional axiomatization
of concatenation, alternation, empty set and (the singleton set
containing) the empty string as an idempotent semiring, the fixedpoint
rule E = 1 + E E for Kleene-star, and a general coinduction
rule as the only additional rule.
Our axiomatization gives rise to a natural computational interpretation
of regular expressions as simple types that represent parse
trees, and of containment proofs as coercions. This gives the axiomatization
a Curry-Howard-style constructive interpretation: Containment
proofs do not only certify a language-theoretic containment,
but, under our computational interpretation, constructively
transform a membership proof of a string in one regular expression
into a membership proof of the same string in another regular
expression.
We show how to encode regular expression equivalence proofs
in Salomaa’s, Kozen’s and Grabmayer’s axiomatizations into our
containment system, which equips their axiomatizations with a
computational interpretation and implies completeness of our axiomatization.
To ensure its soundness, we require that the computational
interpretation of the coinduction rule be a hereditarily total
function. Hereditary totality can be considered the mother of syntactic
side conditions: it “explains” their soundness, yet cannot be
used as a conventional side condition in its own right since it turns
out to be undecidable.
We discuss application of regular expressions as types to bit
coding of strings and hint at other applications to the wide-spread
use of regular expressions for substring matching, where classical
automata-theoretic techniques are a priori inapplicable.
Neither regular expressions as types nor subtyping interpreted
coercively are novel per se. Somewhat surprisingly, this seems to
be the first investigation of a general proof-theoretic framework for
the latter in the context of the former, however.
OriginalsprogEngelsk
TidsskriftA C M / S I G P L A N Notices
Vol/bind46
Udgave nummer1
Sider (fra-til)385-398
Antal sider14
ISSN1523-2867
DOI
StatusUdgivet - 2011
Begivenhed38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages - Austin, USA
Varighed: 26 jan. 201128 jan. 2011
Konferencens nummer: 38

Konference

Konference38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
Nummer38
LandUSA
ByAustin
Periode26/01/201128/01/2011

ID: 37560029