Query Engine

Queries describe the patterns you want to retrieve. The engine favors extreme simplicity and aims for predictable latency and skew resistance without any tuning. Every constraint implements the Constraint trait so operators, sub-languages and even alternative data sources can compose cleanly.

Queries as Schemas

You might notice that trible.space does not define a global ontology or schema beyond associating attributes with a ValueSchema or BlobSchema. This is deliberate. The semantic web taught us that per-value typing, while desirable, was awkward in RDF: literal datatypes are optional, custom types need globally scoped IRIs and there is no enforcement, so most data degenerates into untyped strings. Trying to regain structure through global ontologies and class hierarchies made schemas rigid and reasoning computationally infeasible. Real-world data often arrives with missing, duplicate or additional fields, which clashes with these global, class-based constraints.

Our approach is to be sympathetic to edge cases and have the system deal only with the data it declares itself capable of handling. These application-specific schema declarations are exactly the shapes and constraints expressed by our queries [1]. Data not conforming to these queries is simply ignored by definition, as a query only returns data satisfying its constraints. [2]
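To make the "query as schema" idea concrete, here is a minimal sketch in plain Rust. It does not use the engine's actual API: the `Fact` tuple is a simplified stand-in for a trible, and `sample_facts` and `query_people` are hypothetical names introduced only for illustration. The query asks for entities that have both a name and an age, and that shape is the only schema involved.

```rust
/// Simplified stand-in for a trible: (entity, attribute, value).
/// An assumption for illustration, not the library's data type.
type Fact = (u32, &'static str, &'static str);

/// Sample data with a missing field (2 and 4), an additional field (3)
/// and a duplicate field (5).
fn sample_facts() -> Vec<Fact> {
    vec![
        (1, "name", "Alice"),
        (1, "age", "30"),
        (2, "name", "Bob"), // no age
        (3, "name", "Carol"),
        (3, "age", "41"),
        (3, "email", "carol@example.org"), // not part of the query
        (4, "age", "7"), // no name
        (5, "name", "Dan"),
        (5, "name", "Daniel"), // second name for the same entity
        (5, "age", "52"),
    ]
}

/// Hypothetical query: every (entity, name, age) combination where the
/// entity has both attributes. Anything else is never returned.
fn query_people(facts: &[Fact]) -> Vec<(u32, &'static str, &'static str)> {
    let mut rows = Vec::new();
    for &(person, attr, name) in facts {
        if attr != "name" {
            continue;
        }
        for &(other, attr, age) in facts {
            if attr == "age" && other == person {
                rows.push((person, name, age));
            }
        }
    }
    rows
}
```

Calling `query_people(&sample_facts())` yields rows for entities 1, 3 and 5 only. Entities 2 and 4 are skipped because they do not match the shape, the extra `email` fact is never looked at, and the duplicate name on entity 5 simply produces two rows instead of a validation error.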

Join Strategy

The query engine uses the Atreides family of worst-case optimal join algorithms. These algorithms leverage cardinality estimates to guide a depth-first search over variable bindings, providing skew-resistant and predictable performance. For a detailed discussion, see the Atreides Join chapter.
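The following is a rough sketch of a depth-first search over variable bindings that always extends the variable with the fewest remaining candidates. It is not the Atreides implementation: the names `Edge`, `solve`, `join` and `triangle_query` are hypothetical, and exact candidate counts obtained by scanning stand in for the cheap cardinality estimates a real engine would rely on.

```rust
use std::collections::BTreeSet;

type Var = usize;
type Binding = Vec<Option<u32>>;

/// A binary relation over two distinct variables: (a, b) must be in `pairs`.
struct Edge {
    a: Var,
    b: Var,
    pairs: BTreeSet<(u32, u32)>,
}

impl Edge {
    /// Values `var` may still take under this constraint given `binding`,
    /// or `None` if the constraint does not mention `var`.
    fn candidates(&self, var: Var, binding: &Binding) -> Option<BTreeSet<u32>> {
        if var != self.a && var != self.b {
            return None;
        }
        Some(
            self.pairs
                .iter()
                .filter(|&&(x, y)| {
                    binding[self.a].map_or(true, |v| v == x)
                        && binding[self.b].map_or(true, |v| v == y)
                })
                .map(|&(x, y)| if var == self.a { x } else { y })
                .collect(),
        )
    }
}

/// Depth-first search: pick the unbound variable with the fewest candidates,
/// try each candidate, recurse, then undo the binding.
fn solve(edges: &[Edge], binding: &mut Binding, out: &mut Vec<Vec<u32>>) {
    let mut best: Option<(Var, BTreeSet<u32>)> = None;
    for var in 0..binding.len() {
        if binding[var].is_some() {
            continue;
        }
        // Intersect the candidates of every constraint that mentions `var`.
        let mut cands: Option<BTreeSet<u32>> = None;
        for e in edges {
            if let Some(c) = e.candidates(var, binding) {
                cands = Some(match cands {
                    None => c,
                    Some(prev) => prev.intersection(&c).copied().collect(),
                });
            }
        }
        // A variable mentioned by no constraint gets no candidates here.
        let cands = cands.unwrap_or_default();
        if best.as_ref().map_or(true, |(_, b)| cands.len() < b.len()) {
            best = Some((var, cands));
        }
    }
    match best {
        // Every variable is bound: record one result row.
        None => out.push(binding.iter().map(|&v| v.unwrap()).collect()),
        Some((var, cands)) => {
            for v in cands {
                binding[var] = Some(v);
                solve(edges, binding, out);
            }
            binding[var] = None;
        }
    }
}

/// Runs the search and returns the result rows in sorted order.
fn join(edges: &[Edge], num_vars: usize) -> Vec<Vec<u32>> {
    let mut out = Vec::new();
    solve(edges, &mut vec![None; num_vars], &mut out);
    out.sort();
    out
}

fn edge(a: Var, b: Var, pairs: &[(u32, u32)]) -> Edge {
    Edge { a, b, pairs: pairs.iter().copied().collect() }
}

/// Triangle query R(a, b), S(b, c), T(a, c) with variables a=0, b=1, c=2.
fn triangle_query() -> Vec<Edge> {
    vec![
        edge(0, 1, &[(1, 2), (1, 3), (2, 3)]),
        edge(1, 2, &[(2, 3), (3, 4)]),
        edge(0, 2, &[(1, 3), (2, 4), (1, 4)]),
    ]
}
```

Because each variable is checked against all constraints that mention it before the search descends, a branch with no consistent candidates is pruned immediately rather than after materializing an intermediate pairwise join. That is the property that makes this style of join resistant to skewed data.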

Query Languages

Instead of a single query language, the engine exposes small composable constraints that combine with logical operators such as "and" and "or". These constraints are simple yet flexible, enabling a wide variety of operators while still allowing the engine to explore the search space efficiently.
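As an illustration of how small constraints can compose through logical operators, consider the sketch below. It deliberately uses a toy trait, `ToyConstraint`, which is an assumption made for this example and not the crate's actual `Constraint` trait. The helpers `equals`, `less_than`, `and`, `or`, `solve2` and `example_query` are likewise hypothetical. Solving is done by brute-force enumeration, which a real engine would avoid.

```rust
/// One candidate assignment: variable `i` has the value `binding[i]`.
type Binding = [u32];

/// Toy interface, assumed for this sketch only.
trait ToyConstraint {
    fn holds(&self, binding: &Binding) -> bool;
}

struct Equals { var: usize, value: u32 }
struct LessThan { left: usize, right: usize }
struct And(Vec<Box<dyn ToyConstraint>>);
struct Or(Vec<Box<dyn ToyConstraint>>);

impl ToyConstraint for Equals {
    fn holds(&self, b: &Binding) -> bool { b[self.var] == self.value }
}
impl ToyConstraint for LessThan {
    fn holds(&self, b: &Binding) -> bool { b[self.left] < b[self.right] }
}
impl ToyConstraint for And {
    // A conjunction holds when every child holds.
    fn holds(&self, b: &Binding) -> bool { self.0.iter().all(|c| c.holds(b)) }
}
impl ToyConstraint for Or {
    // A disjunction holds when at least one child holds.
    fn holds(&self, b: &Binding) -> bool { self.0.iter().any(|c| c.holds(b)) }
}

// Small constructors so that nested queries read like expressions.
fn equals(var: usize, value: u32) -> Box<dyn ToyConstraint> {
    Box::new(Equals { var, value })
}
fn less_than(left: usize, right: usize) -> Box<dyn ToyConstraint> {
    Box::new(LessThan { left, right })
}
fn and(cs: Vec<Box<dyn ToyConstraint>>) -> Box<dyn ToyConstraint> {
    Box::new(And(cs))
}
fn or(cs: Vec<Box<dyn ToyConstraint>>) -> Box<dyn ToyConstraint> {
    Box::new(Or(cs))
}

/// Brute-force solver for two variables x (index 0) and y (index 1),
/// each ranging over 0..domain.
fn solve2(c: &dyn ToyConstraint, domain: u32) -> Vec<[u32; 2]> {
    let mut out = Vec::new();
    for x in 0..domain {
        for y in 0..domain {
            if c.holds(&[x, y]) {
                out.push([x, y]);
            }
        }
    }
    out
}

/// (x == 1 OR x == 2) AND x < y
fn example_query() -> Box<dyn ToyConstraint> {
    and(vec![or(vec![equals(0, 1), equals(0, 2)]), less_than(0, 1)])
}
```

Since `And` and `Or` implement the same trait as the leaves, they nest to any depth, and a new kind of leaf constraint only has to implement one method to take part in every existing combination.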

The query engine and data model are flexible enough to support many query styles, including graph, relational and document-oriented queries.

For example, the namespace module offers macros that generate constraints for a given trible pattern in a query-by-example style reminiscent of SPARQL or GraphQL but tailored to a document-graph data model. It would also be possible to layer a property-graph language like Cypher or a relational language like Datalog on top of the engine. [3]

Great care has been taken to ensure that query languages with different styles and semantics can coexist and even be mixed with other languages and data models within the same query. For practical examples of the current facilities, see the Query Language chapter.

[1] Note that this query-schema isomorphism does not necessarily hold in all databases or query languages; for example, it does not hold for SQL.

[2] In RDF terminology: We challenge the classical A-Box & T-Box dichotomy by replacing the T-Box with a "Q-Box", which is descriptive and open rather than prescriptive and closed. This Q-Box naturally evolves with new and changing requirements, contexts and applications.

[3] SQL would be a bit more challenging, as it is surprisingly imperative with its explicit JOINs and ORDER BYs and lacks clear declarative semantics. This makes it harder to implement on top of a constraint-based query engine tailored towards a more declarative and functional style.