Master’s Thesis
Enhancing Read Query Execution on RDF Data with Schema-based Indexing
Completion
2026/08
Research Area
Intelligent Information Management
Students
Eshwari Kangutkar
Advisers
Christoph Göpfert M.Sc.
Description
The aim of this master’s thesis is to design and implement an efficient approach to RDF data indexing and query execution. The focus is on improving query performance compared to issuing SPARQL queries directly against the triplestore. The approach targets read queries on data that is structured according to predefined, known shapes modeled in the Shape Expressions Language (ShEx). The solution must build its search index on an established indexing platform, such as Apache Solr.
First, a requirements analysis must be conducted. This is followed by a review of the state of the art in RDF data indexing. Existing solutions need to be classified and evaluated against the identified requirements. Building on these findings, the objective of this thesis project is to develop solutions for the following tasks: 1) mapping shapes from ShEx schemas to an indexing schema, 2) ingesting data from a triplestore into the target index platform, using existing solutions for generating SPARQL queries from ShEx schemas, and 3) developing mechanisms that keep the index up to date after write operations on the knowledge graph. The conceptual feasibility of the proposed approach should be demonstrated by developing a prototype. The effectiveness of the solution must be assessed in a suitable evaluation, with a focus on read query scenarios.
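To illustrate what task 1 could look like, the following minimal Python sketch maps a shape to index field definitions. It is not the approach of the thesis, which is still to be designed. The `TripleConstraint` class is a hypothetical, heavily simplified stand-in for a parsed ShEx triple constraint, the `Shape_property` field naming scheme is an assumption, and the datatype table assumes the field types of Solr's default configset.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed mapping from XSD datatypes to field types of Solr's default configset.
# xsd:integer is unbounded, so mapping it to the 64-bit "plong" is a simplification.
XSD_TO_SOLR = {
    XSD + "string": "string",
    XSD + "integer": "plong",
    XSD + "decimal": "pdouble",
    XSD + "boolean": "boolean",
    XSD + "dateTime": "pdate",
}


@dataclass(frozen=True)
class TripleConstraint:
    """Hypothetical, simplified view of one ShEx triple constraint."""
    predicate: str            # full predicate IRI
    datatype: str             # full XSD datatype IRI
    max_count: Optional[int]  # None means unbounded, like '*' or '+' in ShEx


def local_name(iri: str) -> str:
    """Return the part of an IRI after the last '#' or '/'."""
    return iri.rstrip("/#").rsplit("#", 1)[-1].rsplit("/", 1)[-1]


def shape_to_solr_fields(shape_name: str, constraints: list) -> list:
    """Derive one index field definition per triple constraint of a shape."""
    fields = []
    for c in constraints:
        fields.append({
            "name": f"{shape_name}_{local_name(c.predicate)}",
            # Unknown datatypes fall back to an untokenized string field.
            "type": XSD_TO_SOLR.get(c.datatype, "string"),
            # A cardinality above one requires a multi-valued field.
            "multiValued": c.max_count is None or c.max_count > 1,
            "indexed": True,
            "stored": True,
        })
    return fields


person_fields = shape_to_solr_fields("Person", [
    TripleConstraint("http://schema.org/name", XSD + "string", 1),
    TripleConstraint("http://schema.org/email", XSD + "string", None),
])
```

Each resulting dictionary has the form of an `add-field` command body in Solr's Schema API, so a prototype could submit the definitions to the index platform with little extra work. Nested shapes, value sets, and shape references are deliberately left out of this sketch.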


