Data Management Systems (DMS)

The Data Management Systems (DMS) Lab conducts computer systems research in areas where new challenges in data management are emerging. Projects include the design of spatial databases, scalable data streaming, actor database systems and in-memory databases, graph analysis systems, and cloud computing deployments.

We are keen on validating our work experimentally: we love writing code, which is not to say that our love for the blackboard is in any way diminished. :-)

In our work, we usually draw on one or more of the following:

  • Abstractions & Languages
  • Combinatorial Optimization
  • Indexing & Data Structures
  • System Implementation & Design
  • Statistics & Prediction
  • Parallelism & Distribution

You can learn about our work in detail through our publications.

The group is leading the organization of the EDBT/ICDT 2020 Joint Conference in Copenhagen, Denmark. Please do not miss out on the opportunity to join us for this exciting event! 


Open Geodata Serving

In collaboration with the Danish Geodata Agency, we have explored new approaches to cook and serve geodata to the public on the Web. A main challenge in cartography is that producing high-quality maps over complex shapes requires the craft of human expertise. However, given the explosion in geospatial data, the pressure for high-productivity cartography tools is increasing at a fast pace. Our work has explored how to create a new class of declarative cartography tools. Our language CVL, the Cartographic Visualization Language, can be processed entirely within a spatial DBMS, opening up exciting opportunities for automatic optimization and scalability. Additionally, we have investigated how declarative cartography can be achieved efficiently inside the DBMS in the presence of fine-grained access control. In a separate line of work, we have also analyzed production logs of map-serving web services. These logs reveal strong spatial and temporal concentration patterns, which can be exploited for more efficient caching.

Behavioral Simulations and Computer Games

In collaboration with the Cornell Database Group, we have worked on a new scripting platform for games and agent-based simulations. Our recent work in this project has focused on iterated spatial join techniques optimized for main memory, as well as on communication optimizations for cloud environments, especially for mean latency and jitter. We have also explored techniques for automatic parallelization of large-scale behavioral simulations, as well as efficient checkpoint-recovery techniques for Massively Multiplayer Online Games (MMOs).

Multidimensional Indexing and Large Main Memories

We have also studied index structures for either read-intensive or write-intensive workloads. For the first class of workloads, we have studied experimentally, together with collaborators from Saarland University and ETH Zurich, the performance of one specific index structure, the Dwarf index. For the second class of workloads, we have studied how to answer queries over collections of moving objects, e.g., for vehicle tracking or spatial agent-based simulations. The problem is challenging because these applications have very high update rates that result from continuous movement. Our technique, MOVIES, is based on frequently rebuilding index snapshots in main memory. Using data partitioning over multiple nodes in a small cluster, we have scaled MOVIES up to 100 million moving objects over the road network of Germany, while keeping snapshot latencies below a few seconds.
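The snapshot-rebuilding idea can be illustrated with a minimal sketch. It is not the MOVIES implementation: the uniform grid, the class names SnapshotIndex and MovingObjectStore, and the cell_size parameter are all assumptions made for illustration. The point it shows is that updates only overwrite the latest position of an object, while queries read a frozen index that is periodically rebuilt from scratch and may therefore be slightly stale.

```python
from collections import defaultdict


class SnapshotIndex:
    """Read-only uniform-grid index over one batch of positions (hypothetical)."""

    def __init__(self, positions, cell_size):
        self.cell_size = cell_size
        self.positions = dict(positions)  # copy, so the snapshot stays frozen
        self.cells = defaultdict(list)
        for obj_id, (x, y) in self.positions.items():
            self.cells[self._cell(x, y)].append(obj_id)

    def _cell(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def range_query(self, x_min, y_min, x_max, y_max):
        # Visit only the cells overlapping the query rectangle,
        # then filter candidates by their exact position.
        cx_min, cy_min = self._cell(x_min, y_min)
        cx_max, cy_max = self._cell(x_max, y_max)
        hits = []
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for obj_id in self.cells.get((cx, cy), ()):
                    x, y = self.positions[obj_id]
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append(obj_id)
        return sorted(hits)


class MovingObjectStore:
    """Buffers position updates and answers queries from the last snapshot."""

    def __init__(self, cell_size=10.0):
        self.cell_size = cell_size
        self.latest = {}  # newest known position per object
        self.snapshot = SnapshotIndex({}, cell_size)

    def update(self, obj_id, x, y):
        # Cheap write path: no index maintenance on update.
        self.latest[obj_id] = (x, y)

    def rebuild(self):
        # Build a fresh index from scratch and swap it in.
        self.snapshot = SnapshotIndex(self.latest, self.cell_size)

    def range_query(self, x_min, y_min, x_max, y_max):
        # Queries see the state as of the last rebuild.
        return self.snapshot.range_query(x_min, y_min, x_max, y_max)
```

In this sketch, the time between two calls to rebuild() bounds how stale a query answer can be, which corresponds to the snapshot latency mentioned above.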

Dataspaces and Personal Information Management

In early work at the ETH Zurich Systems Group, we co-designed the iMeMex Dataspace Management System, a hybrid information integration architecture that allows users to transition from search to data integration in a pay-as-you-go fashion. Unlike a traditional relational DBMS, iMeMex does not take full control of the data, but offers services over one's complex personal dataspace. We have explored several interesting themes in the design of iMeMex, such as the definition of a unified data model for personal information, a novel technique based on mapping hints (called trails) to increase the level of integration of personal information over time, and search over graphs of user data created by view definitions.


People

Name                         Title                 Phone          E-mail
Liu, Yijian                  PhD Student                          E-mail
Nunes Laigner, Rodrigo       PhD Fellow            +4535327137    E-mail
Quan, Li                     PhD Fellow            +4535333027    E-mail
Vaz Salles, Marcos António   Associate Professor   +4523839958    E-mail
Zhou, Yongluan               Professor             +4529883168    E-mail