This is the multi-page printable view of this section. Click here to print.
Essentials
- 1: SQL Introduction
- 2: Documents
- 3: Data Types
- 4: Expressions
1 - SQL Introduction
Tables
Though Genji stores its data in tables, there is no concept of rows or columns. In Genji:
- a table is a collection of documents
- a document is a collection of fields
- a field is a key-value pair
Each document is assigned a primary key, which is a unique identifier.
Schemas
Like relational databases, tables can have a schema to control and validate data. There are three kinds of schema that can be defined in Genji:
- strict schemas
- partial schemas
- no schema (or schemaless)
Strict schemas
This kind of schema is similar to the ones found in other databases: It defines all the fields of every document.
But unlike other databases, it has some really interesting properties that are taylored for documents:
- It is possible to define constraints on nested fields
- When inserting a document that contains fields that where not defined in the schema, these fields are ignored
Let’s see an example:
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
age INTEGER,
email TEXT NOT NULL UNIQUE
address (
zipCode TEXT
),
CHECK(age >= 0)
)
This will create a table users
with the following constraints:
- All documents must have a non-empty
id
field, whose type can be converted to an integer. This field will be used as the primary key of the table and will be stored as an integer. - All documents must have a non-empty
name
field that can be converted toTEXT
. - If a document has an
age
field, it will be converted to an integer. The field has to be positive. - All documents must have a non-empty email field that can be converted to
TEXT
and it must be unique across the whole table. - If a document has an
address
field and its value is a document with azipCode
field, then its value will be converted toTEXT
.
Inserting documents on strict tables can be done in multiple ways:
INSERT INTO users
(id, name, age, email, address)
VALUES (
1,
'Jim',
10,
'jim@host.com',
{
zipCode: '12345'
}
)
-- or
INSERT INTO users VALUES {
id: 1,
name: 'Jim',
age: 10,
email: 'jim@host.com',
address: {
zipCode: '12345'
}
}
Partial schemas
When dealing with data coming from different sources (APIs, files, streams, etc.), is might be useful to define constraints only on a set of fields, while storing the other fields as-is. These kind of schemas are called partial schemas.
Let’s create a table that can be used to store some Genji’s Github issues consumed from the Github API:
CREATE TABLE github_issues (
id INTEGER PRIMARY KEY,
title TEXT,
user (
id INTEGER UNIQUE
site_admin BOOLEAN,
...
),
...
)
By using the ...
notation, it is possible to tell Genji to insert all fields of incoming documents but to apply constraints only to certain fields. It also works on nested documents.
Here is how to use the Genji CLI to populate this table:
curl https://api.github.com/repos/genjidb/genji/issues | genji insert --db mydb -t github_issues
Tip
Partial schemas are also interesting when you want to create indexes or specific constraints on certain fieldsSchemaless tables
Unlike relational databases, it is also possible to define schemaless tables. Documents stored in schemaless tables can be completely different from one another.
CREATE TABLE teams;
-- equivalent to
CREATE TABLE teams (...);
Inserting documents can then be done with no constraints:
INSERT INTO teams (name, league) VALUES ('Real Madrid', 'Ligua');
INSERT INTO teams (name, league, members) VALUES ('PSG', 'Ligue 1', ['Messi']);
Warning
When using partial schemas or schemaless tables, any undeclared numeric field is considered a floating point decimal number.Dropping a table
CREATE TABLE
will return an error if the table already exists.
To remove a table and all of its content, use the DROP TABLE
command:
DROP TABLE users;
This will remove the users
table and all of its documents. If DROP TABLE
is called on a non-existing table, it will return an error.
Inserting documents
The INSERT
statement is used to add documents to tables. It was designed with two goals in mind:
- provide an API familiar to users used to other SQL databases
- provide an alternative API designed to simplify inserting complex documents
Here is the first form:
INSERT INTO countries (name, population) VALUES ('France', 67900000);
-- if the countries table has a strict or partial schema, field names can be omitted.
INSERT INTO countries VALUES ('France', 67900000);
Because Genji is a document database, this form also supports nested fields and arrays:
INSERT INTO bands (name, members, albums)
VALUES (
"Guns N' Roses",
["Axl Rose", "Slash", "Steven Adler", "Duff McKagan"],
[
{
name: "Appetite for Destruction",
releaseYear: "1987"
},
{
name: "G N' R Lies",
releaseYear: "1988"
}
]
);
The second form uses the document literal notation. It is useful when inserting complex documents that would make the first form too verbose:
INSERT INTO bands VALUES {
"name": "Guns N' Roses",
"members": ["Axl Rose", "Slash", "Steven Adler", "Duff McKagan"],
"albums": [
{
name: "Appetite for Destruction",
releaseYear: "1987"
},
{
name: "G N' R Lies",
releaseYear: "1988"
}
]
}
Tip
JSON documents can be used in the insert statement directlySelecting documents
Querying documents from a table can be achieved by using the SELECT
statement.
The output of SELECT
statements is a strean of documents that can be decoded and represented in any form.
Here is an example of selecting all documents of the users
table:
SELECT * FROM players;
Here is a json representation of the output
{
"name": "Rafael Nadal",
"age": 36,
"nationality": "Spain",
"career": {
"australia": 2,
"france": 14,
"wimbledon": 2,
"us": 4
},
"coach": [
"Francisco Roig",
"Carlos Moyá",
"Marc López"
]
}
{
"name": "Roger Federer",
"age": 40,
"nationality": "Switzerland",
"career": {
"australia": 6,
"france": 1,
"wimbledon": 8,
"us": 5
},
"coach": [
"Ivan Ljubičić",
"Severin Lüthi",
]
}
{
"name": "Andrew Barron Murray",
"coach": [
"Ivan Lendl"
]
}
Let’s break it down:
SELECT
: Run the SELECT command*
: This is the projection, it indicates how to build the documents returned by the result of the query. Here, we are using a special projection called the wildcard, which is a way to tell Genji to simply return all of the fields of each document.FROM players
: Indicates from which table we want to query the data.
Understanding projections
Now, let’s query only the name and age of each player:
SELECT name, age FROM players;
{
"name": "Rafael Nadal",
"age": 36
}
{
"name": "Roger Federer",
"age": 40
}
{
"name": "Andrew Barron Murray",
"age": null
}
The result contains three documents, all of them have a name
and age
fields.
A projection guarantees that all the documents returned by the query will contain the selected fields, even if the original documents don’t have that information. In our example, the Andrew Barron Murray
document doesn’t have an age
field, so its projected value is null
.
The only exception is for the *
wildcard, which projects all the fields of the original document.
Querying nested fields
Let’s determine how many French Open each of them has won in their career:
SELECT name, career.france FROM players;
{
"name": "Rafael Nadal",
"career.france": 14
}
{
"name": "Roger Federer",
"career.france": 1
}
{
"name": "Andrew Barron Murray",
"career.france": null
}
In this example, we used a path to select the career.france
field.
Let’s add the information about the first coach of each player:
SELECT name, career.france, coach[0] FROM players;
{
"name": "Rafael Nadal",
"career.france": 14,
"coach[0]": "Francisco Roig"
}
{
"name": "Roger Federer",
"career.france": 1,
"coach[0]": "Ivan Ljubičić"
}
{
"name": "Andrew Barron Murray",
"career.france": null,
"coach[0]": "Ivan Lendl"
}
coach[0]
is a notation that indicates to select the element at index 0
of the coach
array.
Filter documents
Until now, we always performed our queries on every document of the table.
Let’s only query those whose career
document is set.
SELECT name FROM players WHERE career IS NOT NULL;
{
"name": "Rafael Nadal"
}
{
"name": "Roger Federer"
}
This time, the result contains only two documents.
The WHERE
clause allows filtering the documents returned. To do that, it evaluates an expression on every document:
- if the result of the evaluation is truthy, the document is selected
- if the result of the evaluation is falsy, the document is filtered out
SELECT name, age FROM players WHERE age < 40;
{
"name": "Rafael Nadal",
"age": 36
}
In this example, only Rafael Nadal satisfies the query:
- Roger Federer’s age is 40 which is not
< 40
- Andrew Barron Murray’s age is
null
, which is also not< 40
Filtering on values in nested objects
We can filter on values in nested arrays using the IN
operator:
SELECT name, coach FROM players WHERE 'Ivan Ljubičić' IN coach;
{
"name": "Roger Federer",
"coach": ["Ivan Ljubičić", "Severin Lüthi"]
}
And values in nested documents:
SELECT name, career.wimbledon AS wimbledon FROM players WHERE career.wimbledon > 3;
{
"name": "Roger Federer",
"wimbledon": 8
}
Ordering results
The order in which documents are returned when reading a table is not guaranteed unless sorted explicitly.
To control the order of the results, we need to use the ORDER BY
clause
SELECT name, career.australia AS australia FROM players ORDER BY career.australia;
{
"name": "Andrew Barron Murray",
"australia": null
}
{
"name": "Rafael Nadal",
"australia": 2
}
{
"name": "Roger Federer",
"australia": 6
}
The order in which documents appear depends on three factors:
- the direction of the order
- the type of the field used for ordering
- the value of the field used for ordering
By default, the direction is ascending, from the smallest value to the highest.
When it comes to ordering, there is a hierarchy between types:
NULL
< BOOLEAN
< numbers < TEXT
or BLOB
In the example above, the career
field of Andrew Barron Murray doesn’t exist, so it is treated as null
, and then appears first in the result.
Then, Rafael Nadal and Roger Federer have an career.australia
field which is an INTEGER
, there are compared with each other and returned in ascending order.
The direction can be controlled by using ASC
or DESC
clauses.
SELECT name, career.australia AS australia FROM players ORDER BY career.australia ASC;
// returns the same results as above
SELECT name, career.australia AS australia FROM players ORDER BY career.australia DESC;
{
"name": "Roger Federer",
"australia": 6
}
{
"name": "Rafael Nadal",
"australia": 2
}
{
"name": "Andrew Barron Murray",
"australia": null
}
Using functions
Projections can also use functions to add more power to the queries.
Select the primary key
Every document has a primary key, which is a unique value.
When a document is inserted without an explicit primary key, an implicit docid
is created automatically. Implicit primary keys don’t appear in the results though, even when using SELECT *
.
To select them, we can use the pk()
function.
SELECT pk(), name FROM players;
{
"pk()": [
1
],
"name": "Rafael Nadal"
}
{
"pk()": [
2
],
"name": "Roger Federer"
}
{
"pk()": [
3
],
"name": "Andrew Barron Murray"
}
Because the primary key can be composite, pk()
always returns an array.
Here is an example with another table with a composite primary key
CREATE TABLE cars (
brand TEXT,
name TEXT,
year INTEGER,
PRIMARY KEY (brand, name)
);
INSERT INTO cars VALUES ('Ford', 'Mustang', 1965);
INSERT INTO cars VALUES ('Peugeot', '205', 1984);
SELECT pk(), name FROM cars;
{
"pk()": [
"Ford",
"Mustang"
],
"name": "Mustang"
}
{
"pk()": [
"Peugeot",
"205"
],
"name": "205"
}
Other SELECT clauses
The SELECT statement is very rich and can do much more. Read the SELECT reference for more information.
Updating documents
The UPDATE
statement makes it possible to update one or more documents in a table.
Consider a table users
with the following documents in it.
{
"name": "Thor",
"age": 1000
}
{
"name": "Hulk",
"group": "Avengers",
"age": 42
}
UPDATE users SET group = "Avengers"
Let’s break it down:
UPDATE users
runs theUPDATE
statement on theusers
tableSET
indicates the list of changes we want to performgroup = "Avengers"
sets thegroup
field of the document to the value “Avengers”
Without a WHERE
clause, this statement will run on all the documents of the table. Here is the state of the table after running this command:
{
"name": "Thor",
"age": 1000,
"group": "Avengers"
}
{
"name": "Hulk",
"age": 42,
"group": "Avengers"
}
The first document didn’t have a group
field before. It’s because the SET
clause actually sets fields in the document, regardless of their existence. This is a good way to add new fields to documents.
Since we can add or modify fields using the SET
clause, it is also possible to delete fields using the UNSET
clause:
UPDATE users UNSET age;
This will delete the age
field from all the documents. If the field doesn’t exist it does nothing.
To update only a subset of documents, we can use the WHERE
clause. In the following example, only the documents that satisfy the age = 2
condition will be updated.
UPDATE users UNSET group WHERE age = 2;
Deleting documents
Documents can be deleted using the DELETE
statement.
Let’s start with the simplest form:
DELETE FROM products;
This command deletes all the documents of the products
table.
To delete only a few documents, use the WHERE
clause:
DELETE FROM products WHERE sales_count < 1000;
For every document, the WHERE
clause evaluates any expression that follows, here sales_count < 1000
. If the result is truthy, the document gets deleted.
The DELETE
statement doesn’t return an error if no document matches the WHERE
clause, or if there aren’t any document in the table.
Using indexes
Indexes are created using the CREATE INDEX
statement.
-- create a named index
CREATE INDEX user_country_idx ON users(country);
-- enforce uniqueness
CREATE UNIQUE INDEX user_email_idx ON users(email);
-- index name can be ommitted, let the engine generate one for you
CREATE UNIQUE INDEX ON users(username);
-- it is possible to create indexes on nested fields
CREATE INDEX ON players(career.us);
-- composite indexes are also supported
CREATE INDEX ON players(career.us, name);
To delete indexes, use the DROP INDEX
statement
DROP INDEX user_email_idx;
2 - Documents
Genji stores records as documents. A document is an object that contains pairs that associate a string field to a value of any type. Genji SQL represents documents as JSON objects, though they support far more types.
Here is a JSON representation of the structure of a document:
{
field1: value1,
field2: value2,
field3: value3,
...
}
Example of a document using Genji SQL syntax:
{
name: "Nintendo Switch",
price: {
base: 379.99,
vat: 20,
total: base + base * vat / 100
},
brand: "Nintendo",
"top-selling-games": [
"Mario Odyssey",
"Zelda Breath of the Wild"
]
}
Each field name must be a string, but values can be of any type, including another document, an array or an expression.
Any JSON object is a valid document and can be inserted as-is.
Field names
Field names can be any string, with only one exception: they cannot be empty.
Paths
A path is a way to refer to fields of a document or elements of an array.
Given the following document:
{
"name": "Foo",
"address": {
"city": "Lyon",
"zipcode": "69001"
},
"friends": [
{
"name": "Bar",
"address": {
"city": "Paris",
"zipcode": "75001"
}
},
{
"name": "Baz",
"address": {
"city": "Ajaccio",
"zipcode": "20000"
},
"favorite game": "FF IX"
}
]
}
Accessing a top-level field can be achieved by simply referring to its name.
Example: name
will evaluate to "Foo"
.
To access a nested field, either concatenate the fields with the .
character, or use the []
notation
Examples: address.city
will evaluate to "Lyon"
Examples: address["city"]
will evaluate to "Lyon"
To access an element of an array, use the index of the element
Examples:
friends[0]
will evaluate to{"name": "Bar","address": {"city":"Paris","zipcode": "75001"}}
friends[1].name
will evaluate to"Baz"
friends[1]."favorite game"
will evaluate to"FF IX"
3 - Data Types
Genji provides a list of simple data types to store and manipulate data.
Name | Description | From | To |
---|---|---|---|
BOOL | Can be either true or false |
false |
true |
INTEGER | Signed integer which takes 1, 2, 4 or 8 bytes depending on the size of the stored number | -9223372036854775808 | 9223372036854775807 |
DOUBLE | 8 bytes decimal | -1.7976931348623157e+308 | 1.7976931348623157e+308 |
BLOB | Variable size BLOB of data | ||
TEXT | Variable size UTF-8 encoded string | ||
ARRAY | ARRAY of values of any type | ||
DOCUMENT | Object that contains pairs that associate a string field to a value of any type |
The case of NULL
In Genji, NULL is treated as both a value and a type. It represents the absence of data, and is returned in various cases:
- when selecting a field that doesn’t exists
- when selecting a field whose value is NULL
- as the result of the evaluation of an expression
Conversion
Whenever Genji needs to manipulate data of different types, depending on the situation it will rely on either:
- explicit conversion: The source type and destination type are clearly identified. Ex: When inserting data to field with a constraint or when doing a
CAST
. - implicit conversion: Two values of different types need to be compared or used by an operator during the evaluation of an expression
Explicit conversion
Explicit conversion is used when we want to convert a value of a source type into a target type. However, Genji types are not all compatible with one another, and when a user tries to convert them, Genji returns an error. Here is a table describing type compatibility.
Source type | Target type | Converted | Example |
---|---|---|---|
BOOL | INTEGER | yes, 1 if true , otherwise 0 |
CAST(true AS INTEGER) -> 1 |
BOOL | TEXT | yes, 'true' if true , otherwise 'false' |
CAST(true AS TEXT) -> 'true' |
INTEGER | BOOL | yes, false if zero, otherwise true |
CAST(10 AS BOOL) -> true |
INTEGER | DOUBLE | yes | CAST(10 AS DOUBLE) -> 10.0 |
INTEGER | TEXT | yes | CAST(10 AS TEXT) -> '10' |
DOUBLE | INTEGER | yes, cuts off the decimal part | CAST(10.5 AS DOUBLE) -> 10 |
DOUBLE | TEXT | yes | CAST(10.5 AS DOUBLE) -> '10.5' |
TEXT | BOOL | yes, if the content is a valid boolean | CAST('true' AS BOOL) -> true |
TEXT | INTEGER | yes, if the content is a valid integer | CAST('10' AS INTEGER) -> 10 |
TEXT | DOUBLE | yes, if the content is a valid decimal number | CAST('10.4' AS DOUBLE) -> 10.4 |
TEXT | BLOB | yes, if the content is a valid base64 value | CAST('aGVsbG8K' AS BLOB) -> 'aGVsbG8K' |
TEXT | ARRAY | yes, if the content is a valid json array | CAST('[1, 2, 3]' AS ARRAY) -> [1, 2, 3] |
TEXT | DOCUMENT | yes, if the content is a valid json object | CAST('{"a": 1}' AS DOCUMENT) -> {"a": 1} |
BLOB | TEXT | yes, the content will be encoded in base64 | |
ARRAY | TEXT | yes, the content will be encoded as a json array | CAST([1, 2, 3] AS TEXT) -> '[1, 2, 3]' |
DOCUMENT | TEXT | yes, the content will be encoded as a json object | CAST({a: 1} AS DOUBLE) -> '{"a": 1}' |
NULL | any type | yes, NULL |
CAST(NULL AS DOUBLE) -> NULL |
Implicit conversion
There is only one kind of implicit conversion: INTEGER
to DOUBLE
. This usually takes place during the evaluation of an expression involving INTEGER and DOUBLE values.
No other conversion is applied unless it’s explicit.
4 - Expressions
Expressions are components that can be evaluated to a value.
Example:
1 + 1 # expression
-> 2 # result
An expression can be found in two forms:
- unary: meaning it contains only one component
- binary: meaning it contains two expressions, or operands, and one operator. i.e.
<expr> <operator> <expr>
Example:
/* Unary expressions */
1
name
"foo"
/* Binary expressions */
age >= 18
1 AND 0
Here is a list of all supported expressions:
Diagram(
Choice(
0,
Link("literal-value"),
Link("parameter"),
Link("field-path"),
Sequence("NOT", Link("expr")),
Sequence(Link("expr"), Link("binary-operator", "#operators"), Link("expr")),
Sequence(
Link("expr"),
Optional("NOT"),
Choice(
0,
Link("IN", "#comparison-operators"),
Link("LIKE", "#comparison-operators")
),
Link("expr")
),
Sequence(
Link("expr"),
Link("IS", "#comparison-operators"),
Optional("NOT"),
Link("expr")
),
Sequence(
Link("expr"),
Optional("NOT"),
Link("BETWEEN", "#comparison-operators"),
Link("expr"),
"AND",
Link("expr")
),
Sequence(
Link("function", "#functions"),
"(",
OneOrMore(Link("expr"), ","),
")"
),
Sequence("CAST", "(", Link("expr"), "AS", "type-name", ")"),
Sequence("NEXT", "VALUE", "FOR", "sequence-name"),
Sequence("(", Link("expr"), ")")
)
);
Literal value
Any literal evaluates to the closest compatible type.
Diagram(
Choice(
0,
Link("integer-literal"),
Link("number-literal"),
Link("string-literal"),
Link("blob-literal"),
Link("bool-literal"),
Link("array-literal"),
Link("document-literal"),
"NULL"
)
);
Integer literal
Diagram(Sequence(Optional(Choice(0, "+", "-")), OneOrMore("digit")));
An integer literal is a sequence of characters that only contain digits. They may start with a +
or -
sign.
123456789
+100
-455
Integer literals are evaluated into the INTEGER
type.
If an integer is bigger than the maximum value of a 64 bit integer or smaller than the minimum 64 bit integer value, it will be evaluated as a DOUBLE
.
Number literal
Diagram(
Sequence(
Choice(
0,
Sequence(OneOrMore("digit")),
Sequence(OneOrMore("digit"), ".", OneOrMore("digit")),
Sequence(".", OneOrMore("digit"))
),
Optional(
Sequence(
Choice(0, "e", "E"),
Optional(Choice(0, "+", "-")),
OneOrMore("digit")
)
)
)
);
A number literal is a sequence of characters that contains three parts:
- a sequence of digits
- a decimal point (i.e.
.
) - a sequence of digits
They may start with a +
or a -
sign.
123.456
+3.14
-1.0
Number literals are evaluated to the DOUBLE
type.
String literal
Diagram(
Choice(
0,
Sequence('"', ZeroOrMore("unicode-character"), '"'),
Sequence("'", ZeroOrMore("unicode-character"), "'")
)
);
A string literal is a sequence of utf-8 encoded characters surrounded by double or single quotes. They may contain any unicode character or escaped single or double quotes (i.e \'
or \"
).
"l'école des fans"
'(╯ಠ_ಠ)╯︵ ┳━┳'
'foo \''
String literals are evaluated to the TEXT
type.
Blob literal
Diagram(
Choice(
0,
Sequence('"\\x', ZeroOrMore("hexadecimal-character"), '"'),
Sequence("'\\x", ZeroOrMore("hexadecimal-character"), "'")
)
);
A blob literal starts with \x
followed by a list of hexadecimal characters which is then decoded into raw bytes by Genji.
Blob literals are evaluated to the BLOB
type.
Bool literal
Diagram(Choice(0, "TRUE", "FALSE"));
A boolean literal is any sequence of character that is written as true
or false
, regardless of the case.
true
false
TRUE
FALSE
tRUe
FALse
Boolean literals are evaluated into the BOOL
type.
Array literal
Diagram(
Choice(
0,
Sequence("[", OneOrMore(Link("expr"), ","), "]"),
Sequence("(", OneOrMore(Link("expr"), ","), ")")
)
);
An array literal is any sequence of character that starts and ends with either:
[
and]
(
and)
and that contains a coma-separated list of expressions.
[1.5, "hello", 1 > 10, [true, -10], {foo: "bar"}]
Array literals are evaluated into the ARRAY
type.
Document literal
Diagram(
"{",
OneOrMore(
Sequence(Choice(0, "identifier", "string"), ":", Link("expr")),
","
),
"}"
);
A document is any sequence of character that starts and ends with {
and }
and that contains a list of pairs.
Each pair associates an identifier with an expression, both separated by a colon. Each pair must be separated by a coma.
{
foo: 1,
bar: "hello",
baz: true AND false,
"long field": {
a: 10
}
}
In a document, the identifiers are referred to as fields.
In the example above, the document has four top-level fields (foo
, bar
, baz
and long field
) and one nested field a
.
Note that any JSON object is a valid document.
Document literals are evaluated into the DOCUMENT
type.
Identifier
Diagram(
Choice(
0,
OneOrMore(Choice(0, "ascii-letter", "digit", "_")),
Sequence("`", OneOrMore("unicode-character"), "`")
)
);
Identifiers are a sequence of characters that refer to table names, field names and index names.
Identifiers may be unquoted or surrounded by backquotes. Depending on that, different rules may apply.
Unquoted identifiers | Identifiers surrounded by backquotes |
---|---|
Must begin with an uppercase or lowercase ASCII character or an underscore | May contain any unicode character, other than the new line character (i.e. \n ) |
May contain only ASCII letters, digits and underscore | May contain escaped " character (i.e. \" ) |
foo
_foo_123_
`頂きます (*`▽´)_旦~~`
`foo \` bar`
Field path
Diagram(
Sequence(
"identifier",
Optional(
OneOrMore(
Choice(
0,
Sequence(".", "identifier"),
Sequence("[", "integer-literal", "]"),
Sequence("[", "string-literal", "]")
),
","
),
"skip"
)
)
);
A field path is any sequence of characters that contains one or more identifiers separated by dots or square brackets.
foo
foo.bar[10]
foo["long field"][0].bat.`other long field`
Depending on the context, a single identifier with no dot or square bracket will be parsed as an identifier or as a field path.
Field paths are evaluated into the value they refer to. They are used to select a value from a document. Their type will depend on the type of the value extracted from the document.
Given the following document:
{
"recipes": 10,
"cooking-time": {
"eggs": [3, 6, 9]
},
}
Here are examples on how field paths are evaluated:
recipes
-> 10
`cooking-time`
-> {
"eggs": [
3,
6,
9
]
}
`cooking-time`.eggs[2]
-> 9
`cooking-time`.eggs[10]
-> NULL
Parameter
Diagram(Choice(0, "?", "$identifier"));
A parameters is an expressions used to represent a value passed when the query is evaluated. Genji supports two types of parameters:
- Positional parameters:
?
- Named parameters:
$
followed by an identifier
Functions
Diagram(
Sequence(
"identifier",
Optional(Sequence(".", "identifier"), "skip"),
"(",
ZeroOrMore(Link("expr"), ","),
")"
)
);
A function name is an expression that represent a builtin function. It can either represent a global function or a function within a package.
count()
typeof("hello")
math.atan2(1.1, 1.1)
String functions
Function -> Returns | Description |
---|---|
strings.LOWER(val: string) -> string |
Format val to lower-case |
strings.UPPER(val: string) -> string |
Format val to upper-case |
strings.TRIM(val: string) -> string |
Removes all spaces from each side of val |
strings.TRIM(val: string, trim: string) |
Removes trim characters from each side of val |
strings.LTRIM(val: string) -> string |
Removes all spaces from the left side of val |
strings.LTRIM(val: sting, trim: string) |
Removes trim characters from the left side of val |
strings.RTRIM(val: string) -> string |
Removes all spaces from the right side of val |
strings.RTRIM(val: string, trim: string) |
Removes trim characters from the right side of val |
Operators
Diagram(
Choice(
0,
"||",
"*",
"/",
"%",
"+",
"-",
"|",
"&",
"^",
">",
">=",
"<",
"<=",
"=",
"!=",
"IS",
"IN",
"LIKE",
"AND",
"OR"
)
);
Genji provides a list of operators that can be used to compute operations with expressions. Operators are binary expressions, meaning they always take exactly two operands. It is possible though to combine multiple operators to create an evaluation tree.
Logical operators
Logical operators are operators that return a boolean under certain conditions.
Name | Description |
---|---|
AND |
Evaluates to true if both operands are truthy |
OR |
Evaluates to true if either the left operand or the right are truthy |
An expression is truthy if it evaluates to a non zero-value of its type.
Comparison operators
These operators are used to compare values and evaluate to a boolean.
Name | Description |
---|---|
= | Evaluates to true if operands are equal, otherwise returns false |
!= | Evaluates to true if operands are not equal, otherwise returns false |
> | Evaluates to true if the left-side expression is greater than the right-side expression, otherwise returns false |
>= | Evaluates to true if the left-side expression is greater than or equal to the right-side expression, otherwise returns false |
< | Evaluates to true if the left-side expression is less than the right-side expression, otherwise returns false |
<= | Evaluates to true if the left-side expression is less than or equal to the right-side expression, otherwise returns false |
IN | Evaluates to true if the left-side expression is equal to one of the values of the right-side array |
NOT IN | Evaluates to false if the left-side expression is equal to one of the values of the right-side array |
IS | Has the same behaviour as = except that it returns true if both operands are NULL |
IS NOT | Has the same behaviour as != except that it supports comparing with NULL |
BETWEEN | Evaluates to true if the left-side expression is between the two boundaries |
Examples:
1 = 1
-> true
1 > 2.5
-> false
3 IN [1, 2, 3]
-> true
5 BETWEEN 2 AND 10
-> true
Conversion during comparison
Prior to comparison, an implicit conversion is operated for the operands to be of the same type.
Not all types can be compared together. When two incompatible types are compared, the comparison always returns false
,
except if one of the operands is NULL, in that case it returns NULL.
Example:
1 > "hello"
-> false
1 < "hello"
-> false
The comparison follows a list of rules that are executed in order:
- If one of the operands is NULL, return
NULL
. - If both operands are documents, use the Comparing documents rule
- If both operands are arrays, use the Comparing arrays rule
- If both operands are numbers (INTEGER or DOUBLE), cast the integer to DOUBLE then compare them together.
- If both operands have the same type, compare them together.
In any other case, return false
.
Comparing documents
The fields of each document are sorted, then they are compared one by one, until they are found not equal. The comparison is then determined by the result of the comparison between these two values. If both keys are equal, compare the values.
{a: 1, b: 2} = {b: 2, a: 1}
-> true
{} = {}
-> true
{a: 1, b: 3} > {a: 1, b: 2}
-> true
{a: 100} > {aa: 1}
-> false
Comparing arrays
Each elements of both arrays are compared one by one, index by index, until they are found not equal. The comparison is then determined by the result of the comparison between these two values.
[1, 2, 3] > [1, 1 + 1, 1]
-> true
Let’s break down the example above:
- Index 0:
1
and1
are equal, the comparison continues - Index 1:
2
and1 + 1
are equal, the comparison continues - Index 2:
3
is greater then1
, the comparison stops and the first array is considered greater than the second one
Two empty arrays are considered equal:
[] = []
-> true
The size of arrays doesn’t matter, unless all the elements of the smallest one are equal to the other one. In that case the biggest array is considered greater.
[3] > [1, 100000]
-> true
[1, 2] < [1, 2, 3]
-> true
Arithmetic operators
Name | Description |
---|---|
+ |
Adding two values |
- |
Substracting two values |
* |
Multiplying two values |
/ |
Dividing two values |
% |
Find the remainder after division of one number by another |
& |
Bitwise AND |
| |
Bitwise OR |
^ |
Bitwise XOR |
Arithmetic operations are supported only for the following types:
INTEGER
DOUBLE
Note that INTEGER
and DOUBLE
types can be calculated together, in that case INTEGER
values will be converted to DOUBLE
prior the operation.
Any usage of these operators on incompatible types will return NULL
.
3 + 3.5
-> 6.5
3 + '1'
-> NULL
The case of NULL
Any arithmetic operation with one of the operand being NULL
returns NULL
.
NULL + 1
-> NULL
5 * 10 - NULL
-> NULL
Division rules
The division obeys a few rules depending on the types of the operands:
- Dividing two integers, always results in an integer
- Dividing by zero, returns
NULL
Return type and overflow
The type of the result of an operation doesn’t necessarily match the type of the operands.
- The result of a DOUBLE operation will always return a DOUBLE
- The result of an INTEGER operation will return an INTEGER, unless the return value is bigger than the maximum value of 64-bit integer. In that case, the return type will be a DOUBLE
Other operators
Name | Description |
---|---|
|| |
Concatenation of two TEXT values |
Evaluation tree and precedence
When parsed, an expression is turned into an evaluation tree so it is possible to combine operators to form complex expressions. The order in which these expressions are executed depends on the priority of the operator.
Here is the list of operators ordered by ascending precedence. Operators with higher precedence are executed before the ones with lower precedence
OR
AND
=
,!=
,<
,<=
,>
,>=
+
,-
,|
,^
*
,/
,%
,&
||
Example:
3 + 4 * 2 > 10 AND 2 - 2 = false
-> true
This expression can be represented as the following tree:
.
└── AND
├── >
│ ├── +
│ │ ├── 3
│ │ └── *
│ │ ├── 4
│ │ └── 2
│ └── 10
└── -
├── 2
└── 2
The deepest branches will be executed first, recursively until reaching the root.