Functions that activate auto aggregation.
Data Setup
LOAD 'age';
SET search_path TO ag_catalog;
SELECT create_graph('graph_name');
SELECT * FROM cypher('graph_name', $$
CREATE (a:Person {name: 'A', age: 13}),
(b:Person {name: 'B', age: 33, eyes: "blue"}),
(c:Person {name: 'C', age: 44, eyes: "blue"}),
(d1:Person {name: 'D', eyes: "brown"}),
(d2:Person {name: 'D'}),
(a)-[:KNOWS]->(b),
(a)-[:KNOWS]->(c),
(a)-[:KNOWS]->(d1),
(b)-[:KNOWS]->(d2),
(c)-[:KNOWS]->(d2)
$$) as (a agtype);
min
min()
returns the minimum value in a set of values.
Syntax: min(expression)
Returns:
A property type, or a list, depending on the values returned by expression.
Arguments:
Name | Description |
expression | An expression returning a set containing any combination of property types and lists thereof. |
Considerations:
- Any null values are excluded from the calculation.
- In a mixed set, any string value is always considered to be lower than any numeric value, and any list is always considered to be lower than any string.
- Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
min(null)
returns null.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (v:Person)
RETURN min(v.age)
$$) as (min_age agtype);
The lowest of all the values in the property age is returned.
Result:
min_age |
13 |
1 row(s) returned |
Using min()
with Lists
Data Setup:
To clarify the following example, assume the next three commands are run first:
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:'d'})
$$) as (result agtype);
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:['a', 'b', 23]})
$$) as (result agtype);
SELECT * FROM cypher('graph_name', $$
CREATE (:min_test {val:[1, 'b', 23]})
$$) as (result agtype);
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (v:min_test)
RETURN min(v.val)
$$) as (min_val agtype);
The lowest of all the values in the set—in this case, the list [‘a’, ‘b’, 23]—is returned, as (i) the two lists are considered to be lower values than the string “d”, and (ii) the string “a” is considered to be a lower value than the numerical value 1.
Result:
min_age |
[“a”, “b”, 23] |
1 row(s) returned |
max
max()
returns the maximum value in a set of values.
Syntax: max(expression)
Returns:
A property type, or a list, depending on the values returned by expression.
Arguments:
Name | Description |
expression | An expression returning a set containing any combination of property types and lists thereof. |
Considerations:
- Any null values are excluded from the calculation.
- In a mixed set, any numeric value is always considered to be higher than any string value, and any string value is always considered to be higher than any list.
- Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
max(null)
returns null.
Query:
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN max(n.age)
$$) as (max_age agtype);
The highest of all the values in the property age is returned.
Result:
min_age |
44 |
1 row(s) returned |
stDev
stDev()
returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N - 1 as the denominator, and should be used when taking a sample of the population for an unbiased estimate. When the standard deviation of the entire population is being calculated, stDevP
should be used.
Syntax: stDev(expression)
Returns:
An agtype float.
Arguments:
Name | Description |
expression | An agtype number expression |
Considerations:
- Any null values are excluded from the calculation.
stDev(null)
returns 0.0 (zero).
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN stDev(n.age)
$$) as (stdev_age agtype);
The standard deviation of the values in the property age is returned.
Result:
stdev_age |
15.716233645501712 |
1 row(s) returned |
stDevP
stDevP()
returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N as the denominator, and should be used when calculating the standard deviation for an entire population. When the standard deviation of only a sample of the population is being calculated, stDev
should be used.
Syntax: stDevP(expression)
Returns:
An agtype float.
Arguments:
Name | Description |
expression | An agtype number expression |
Considerations:
- Any null values are excluded from the calculation.
stDevP(null)
returns 0.0 (zero).
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN stDevP(n.age)
$$) as (stdevp_age agtype);
The population standard deviation of the values in the property age is returned.
Result:
stdevp_age |
12.832251036613439 |
1 row(s) returned |
percentileCont
percentileCont()
returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a linear interpolation method, calculating a weighted average between two values if the desired percentile lies between them. For nearest values using a rounding method, see percentileDisc
.
Syntax: percentileCont(expression, percentile)
Returns:
An agtype float.
Arguments:
Name | Description |
expression | An agtype number expression |
percentile | An agtype number value between 0.0 and 1.0 |
Considerations:
- Any null values are excluded from the calculation.
percentileCont(null, percentile)
returns null.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN percentileCont(n.age, 0.4)
$$) as (percentile_cont_age agtype);
The 40th percentile of the values in the property age is returned, calculated with a weighted average. In this case, 0.4 is the median, or 40th percentile.
Result:
percentile_cont_age |
29.0 |
1 row(s) returned |
percentileDisc
percentileDisc()
returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a rounding method and calculates the nearest value to the percentile. For interpolated values, see percentileCont
.
Syntax: percentileDisc(expression, percentile)
Returns:
An agtype float.
Arguments:
Name | Description |
expression | An agtype number expression |
percentile | An agtype number value between 0.0 and 1.0 |
Considerations:
- Any null values are excluded from the calculation.
percentileDisc(null, percentile)
returns null.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN percentileDisc(n.age, 0.5)
$$) as (percentile_disc_age agtype);
The 50th percentile of the values in the property age is returned.
Result:
percentile_cont_age |
33.0 |
1 row(s) returned |
count
count()
returns the number of values or records, and appears in two variants:
count(*)
returns the number of matching recordscount(expr)
returns the number of non-null values returned by an expression.
Syntax: count(expression)
Returns:
An agtype integer.
Arguments:
Name | Description |
expression | An expression |
Considerations:
count(*)
includes records returning null.count(expr)
ignores null values.count(null)
returns 0 (zero).count(*)
can be used to return the number of nodes; for example, the number of nodes connected to some node n.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[]->(x)
RETURN n.age, count(*)
$$) as (age agtype, number_of_people agtype);
The age property of the start node n (with a name value of ‘A’) and the number of nodes related to n are returned.
Result:
age | number_of_people |
13 | 3 |
1 row(s) returned |
Using count(*)
can be used to group and count relationship types, returning the number of relationships of each type.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[r]->()
RETURN type(r), count(*)
$$) as (label agtype, count agtype);
The relationship type and the number of relationships with that type are returned.
Result:
label | count |
“KNOWS” | 3 |
1 row(s) returned |
Using count(expression)
to return the number of values
Instead of simply returning the number of records with count(*)
, it may be more useful to return the actual number of values returned by an expression.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n {name: 'A'})-[]->(x)
RETURN count(x)
$$) as (count agtype);
The number of nodes connected to the start node n is returned.
Result:
count |
3 |
1 row(s) returned |
Counting non-null values
count(expression)
can be used to return the number of non-null values returned by the expression.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN count(n.age)
$$) as (count agtype);
The number of nodes with the label Person
that have a non-null value for the age property is returned.
Result:
count |
3 |
1 row(s) returned |
Counting with and without duplicates
In this example we are trying to find all our friends of friends, and count them:
- The first aggregate function,
count(DISTINCT friend_of_friend)
, will only count afriend_of_friend
once, asDISTINCT
removes the duplicates. - The second aggregate function,
count(friend_of_friend)
, will consider the samefriend_of_friend
multiple times.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (me:Person)-[]->(friend:Person)-[]->(friend_of_friend:Person)
WHERE me.name = 'A'
RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)
$$) as (friend_of_friends_distinct agtype, friend_of_friends agtype);
Both B and C know D and thus D will get counted twice when not using DISTINCT
.
Result:
friend_of_friends_distinct | friend_of_friends |
1 | 2 |
1 row |
avg
avg()
returns the average of a set of numeric values.
Syntax: avg(expression)
Returns:
An agtype integer
Arguments:
Name | Description |
expression | An expression returning a set of numeric values. |
Considerations:
- Any null values are excluded from the calculation.
avg(null)
returns null.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN avg(n.age)
$$) as (avg_age agtype);
The average of all the values in the property age is returned.
Result:
avg_age |
30.0 |
1 row(s) returned |
sum
sum()
returns the sum of a set of numeric values.
Syntax: sum(expression)
Returns:
An agtype float
Arguments:
Name | Description |
expression | An expression returning a set of numeric values. |
Considerations:
- Any null values are excluded from the calculation.
sum(null)
returns null.
Query
SELECT *
FROM cypher('graph_name', $$
MATCH (n:Person)
RETURN sum(n.age)
$$) as (total_age agtype);
The sum of all the values in the property age is returned.
Result:
total_age |
90 |
1 row(s) returned |