LAIR: A Language for Automated Semantics-Aware Text Sanitization based on Frame Semantics

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

We present LAIR, a domain-specific language that lets users specify actions to be taken when specific semantic frames are encountered in a text, in particular rephrasing and redacting the textual content. While LAIR presupposes superficial knowledge of frames and frame semantics, it requires only limited prior programming experience. It contains neither scripting nor I/O primitives, has no general loop constructs, and is not Turing-complete. We have implemented a LAIR compiler and integrated it into a pipeline for automated redaction of web pages. We detail our experience with automated redaction of web pages for subjectively undesirable content; initial experiments suggest that a small language based on semantic recognition of undesirable terms can be highly useful as a supplement to traditional methods of text sanitization.
Original language: English
Title of host publication: Proceedings of the 3rd IEEE International Conference on Semantic Computing (ICSC 2009)
Number of pages: 6
Publisher: IEEE Computer Society Press
Publication date: 2009
Pages: 47-52
ISBN (Print): 978-0-7695-3800-6
Publication status: Published - 2009
Event: IEEE International Conference on Semantic Computing - Berkeley, United States
Duration: 14 Sep 2009 - 16 Sep 2009
Conference number: 3

Conference

Conference: IEEE International Conference on Semantic Computing
Number: 3
Country: United States
City: Berkeley
Period: 14/09/2009 - 16/09/2009

ID: 16239403