Extensions for Jena SPARQL Statement Handling

Jena provides the Query and UpdateRequest domain classes, however it lacks a unifying infrastructure. The goal of this module is to provide it.

Features

  • Interfaces and implementations for a uniform infrastructure of SPARQL statements and SPARQL parsers
  • Removal of unused prefixes
  • Utils to apply element / algebra operations on both queries and update requests.

Design

Sparql Statements

The main interface is SparqlStmt which has the specializations SparqlQueryStmt, SparqlUpdateStmt and SparqlStmtUnknown. Each SparqlStmt instance is constructed from a String or an Object representation. A SparqlStmt always has a string representation obtainable via toString. A SparqlStmt may be parsed whereas parsing of the original string may have failed with the cause retrievable via getParseException. If a SparqlStmt was parsed, toString will return the SPARQL serialization of the parsed object (Query or UpdateRequest). In that case, getOriginalString yields the string that was fed to the parser.

Retaining the original string is particularly useful is middleware scenarios: Parsing of a statement may fail due to undefined prefixes in the middleware, but forwarding the orginal string to another endpoint may still succeed.

public interface SparqlStmt {
    boolean isQuery();
    boolean isUpdateRequest();
    boolean isUnknown();

    boolean isParsed();

    SparqlStmtUpdate getAsUpdateStmt();
    SparqlStmtQuery getAsQueryStmt();

    QueryParseException getParseException();
    String getOriginalString();

    PrefixMapping getPrefixMapping();

    SparqlStmt clone();

    default Query getQuery();
    UpdateRequest getUpdateRequest();
}

SPARQL Parsers

A SparqlStmt parser is conceptually a Function<String, SparqlStmt>. Implementations of parser support configuration, namely prefix mapping, syntax and base IRI. Likewise, SparqlQueryParser and SparqlUpdateParser are Function<String, SparqlQueryStmt> and Function<String, SparqlUpdateStmt>, respectively.

Jena provides static methods in the QueryFactory and UpdateFactory classes for parsing SPARQL queries and update requests. These methods are wrapped SparqlQueryParserImpl and SparqlUpdateParser.

The SparqlStmtParserImpl implementation for convenience provides several static factory methods. Under the hood they create a SparqlQueryParser and SparqlUpdateParser. Parsing a statement first runs the query parser and if it fails, the update parser is invoked. Upon creation of parsers, the actAsClassifier flag can be set which controls whether to raise exceptions if both parsers fail or whether to yield SparqlStmt instances with isParser() returning false. If actAsClassifier is enabled and both parsers fail, the query string is classified depending on which parser consumed the most bytes from the input (based the line / column information of the raised parse exception). On this basis, an appropriate StmtQueryStmt or SparqlUpdateStmt instance is created. If parsing suceeded, SparqlStmt.isParsed will return true and the getQuery or getUpdateRequest will return the appropriate object.

SparqlStmtParser parser = SparqlStmtParserImpl.create();

// If you only need to deal with UpdateRequests, use can use SparqlUpdateParserImpl instead:
// SparqlUpdateParser parser = SparqlUpdateParserImpl.create();

SparqlStmt stmt = parser.apply("PRFIX foo: <http://foo.bar/baz/> INSERT DATA { <urn:s> <urn:p> <urn:o> }");
System.out.println("isParsed: " + stmt.isParsed());
System.out.println("UpdateRequest.toString(): " + stmt.getUpdateRequest());

// Remove the unused `foo:` prefix (in place transformation)
SparqlStmtUtils.optimizePrefixes(stmt);

Enhancing parser functionality

Prefix optimization can be added to any SparqlStmtParser using a wrapper function

SparqlStmtParser parser;
parser = SparqlStmtParser.create();
parser = SparqlStmtParser.wrapWithOptimizePrefixes(parser);

Namespace tracking inserts all seen namespaces to a PrefixMapping instance. The PrefixMapping may in turn be consulted by the parser:

PrefixMapping pm = new PrefixMappingImpl();
pm.setNsPrefixes(PrefixMapping.Extended);

SparqlStmt stmt;
SparqlStmt parser;
parser = SparqlStmtParserImpl.create(Syntax.syntaxARQ, pm, /* actAsClassifier= */ true);
parser = SparqlStmtParser.wrapWithNamespaceTracking(pm, parser);


stmt = parser.parse("SELECT * { ?s a eg:Foobar }");
System.out.println("parsed: " + stmt.isParsed());
// Printed 'false' because eg: is an unknown prefix

parser.parse("PREFIX eg: <http://www.example.org/> SELECT * { }");

parser.parse("SELECT * { ?s a eg:Foobar }");
System.out.println("parsed: " + stmt.isParsed());
// Printed 'true' because eg: is known be a prior statement

SPARQL Stmt Iterator

Using the machinary, files containing sequences of SPARQL statements can be processed in a similar fashion as .sql scripts that are comprised of a sequence of SQL satements. Given an InputStream and a SparqlStmtParser the utility function SparqlStmtIterator parse(InputStream in, Function<String, SparqlStmt> parser) bundles the two together into an iterator that reads SPARQL statements from that stream. In order to read from a file or classpath resource the class SparqlStmtMgr provides useful convenience methods:

PrefixMapping pm = new PrefixMappingImpl();
List<Query> SparqlStmtMgr.loadQueries("file.sparql", pm);