Functions that activate auto aggregation.
Data Setup
LOAD 'age';SET search_path TO ag_catalog;SELECT create_graph('graph_name');SELECT * FROM cypher('graph_name', $$CREATE (a:Person {name: 'A', age: 13}),(b:Person {name: 'B', age: 33, eyes: "blue"}),(c:Person {name: 'C', age: 44, eyes: "blue"}),(d1:Person {name: 'D', eyes: "brown"}),(d2:Person {name: 'D'}),(a)-[:KNOWS]->(b),(a)-[:KNOWS]->(c),(a)-[:KNOWS]->(d1),(b)-[:KNOWS]->(d2),(c)-[:KNOWS]->(d2)$$) as (a agtype);
min
min() returns the minimum value in a set of values.
Syntax: min(expression)
Returns:
A property type, or a list, depending on the values returned by expression.
Arguments:
| Name | Description |
| expression | An expression returning a set containing any combination of property types and lists thereof. |
Considerations:
- Any null values are excluded from the calculation.
- In a mixed set, any string value is always considered to be lower than any numeric value, and any list is always considered to be lower than any string.
- Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
min(null)returns null.
Query
SELECT *FROM cypher('graph_name', $$MATCH (v:Person)RETURN min(v.age)$$) as (min_age agtype);
The lowest of all the values in the property age is returned.
Result:
| min_age |
| 13 |
| 1 row(s) returned |
Using min() with Lists
Data Setup:
To clarify the following example, assume the next three commands are run first:
SELECT * FROM cypher('graph_name', $$CREATE (:min_test {val:'d'})$$) as (result agtype);SELECT * FROM cypher('graph_name', $$CREATE (:min_test {val:['a', 'b', 23]})$$) as (result agtype);SELECT * FROM cypher('graph_name', $$CREATE (:min_test {val:[1, 'b', 23]})$$) as (result agtype);
Query
SELECT *FROM cypher('graph_name', $$MATCH (v:min_test)RETURN min(v.val)$$) as (min_val agtype);
The lowest of all the values in the set—in this case, the list [‘a’, ‘b’, 23]—is returned, as (i) the two lists are considered to be lower values than the string “d”, and (ii) the string “a” is considered to be a lower value than the numerical value 1.
Result:
| min_age |
| [“a”, “b”, 23] |
| 1 row(s) returned |
max
max() returns the maximum value in a set of values.
Syntax: max(expression)
Returns:
A property type, or a list, depending on the values returned by expression.
Arguments:
| Name | Description |
| expression | An expression returning a set containing any combination of property types and lists thereof. |
Considerations:
- Any null values are excluded from the calculation.
- In a mixed set, any numeric value is always considered to be higher than any string value, and any string value is always considered to be higher than any list.
- Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
max(null)returns null.
Query:
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN max(n.age)$$) as (max_age agtype);
The highest of all the values in the property age is returned.
Result:
| min_age |
| 44 |
| 1 row(s) returned |
stDev
stDev() returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N - 1 as the denominator, and should be used when taking a sample of the population for an unbiased estimate. When the standard deviation of the entire population is being calculated, stDevP should be used.
Syntax: stDev(expression)
Returns:
An agtype float.
Arguments:
| Name | Description |
| expression | An agtype number expression |
Considerations:
- Any null values are excluded from the calculation.
stDev(null)returns 0.0 (zero).
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN stDev(n.age)$$) as (stdev_age agtype);
The standard deviation of the values in the property age is returned.
Result:
| stdev_age |
| 15.716233645501712 |
| 1 row(s) returned |
stDevP
stDevP() returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N as the denominator, and should be used when calculating the standard deviation for an entire population. When the standard deviation of only a sample of the population is being calculated, stDev should be used.
Syntax: stDevP(expression)
Returns:
An agtype float.
Arguments:
| Name | Description |
| expression | An agtype number expression |
Considerations:
- Any null values are excluded from the calculation.
stDevP(null)returns 0.0 (zero).
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN stDevP(n.age)$$) as (stdevp_age agtype);
The population standard deviation of the values in the property age is returned.
Result:
| stdevp_age |
| 12.832251036613439 |
| 1 row(s) returned |
percentileCont
percentileCont() returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a linear interpolation method, calculating a weighted average between two values if the desired percentile lies between them. For nearest values using a rounding method, see percentileDisc.
Syntax: percentileCont(expression, percentile)
Returns:
An agtype float.
Arguments:
| Name | Description |
| expression | An agtype number expression |
| percentile | An agtype number value between 0.0 and 1.0 |
Considerations:
- Any null values are excluded from the calculation.
percentileCont(null, percentile)returns null.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN percentileCont(n.age, 0.4)$$) as (percentile_cont_age agtype);
The 40th percentile of the values in the property age is returned, calculated with a weighted average. In this case, 0.4 is the median, or 40th percentile.
Result:
| percentile_cont_age |
| 29.0 |
| 1 row(s) returned |
percentileDisc
percentileDisc() returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a rounding method and calculates the nearest value to the percentile. For interpolated values, see percentileCont.
Syntax: percentileDisc(expression, percentile)
Returns:
An agtype float.
Arguments:
| Name | Description |
| expression | An agtype number expression |
| percentile | An agtype number value between 0.0 and 1.0 |
Considerations:
- Any null values are excluded from the calculation.
percentileDisc(null, percentile)returns null.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN percentileDisc(n.age, 0.5)$$) as (percentile_disc_age agtype);
The 50th percentile of the values in the property age is returned.
Result:
| percentile_cont_age |
| 33.0 |
| 1 row(s) returned |
count
count() returns the number of values or records, and appears in two variants:
count(*)returns the number of matching recordscount(expr)returns the number of non-null values returned by an expression.
Syntax: count(expression)
Returns:
An agtype integer.
Arguments:
| Name | Description |
| expression | An expression |
Considerations:
count(*)includes records returning null.count(expr)ignores null values.count(null)returns 0 (zero).count(*)can be used to return the number of nodes; for example, the number of nodes connected to some node n.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n {name: 'A'})-[]->(x)RETURN n.age, count(*)$$) as (age agtype, number_of_people agtype);
The age property of the start node n (with a name value of ‘A’) and the number of nodes related to n are returned.
Result:
| age | number_of_people |
| 13 | 3 |
| 1 row(s) returned | |
Using count(*) can be used to group and count relationship types, returning the number of relationships of each type.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n {name: 'A'})-[r]->()RETURN type(r), count(*)$$) as (label agtype, count agtype);
The relationship type and the number of relationships with that type are returned.
Result:
| label | count |
| “KNOWS” | 3 |
| 1 row(s) returned | |
Using count(expression) to return the number of values
Instead of simply returning the number of records with count(*), it may be more useful to return the actual number of values returned by an expression.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n {name: 'A'})-[]->(x)RETURN count(x)$$) as (count agtype);
The number of nodes connected to the start node n is returned.
Result:
| count |
| 3 |
| 1 row(s) returned |
Counting non-null values
count(expression) can be used to return the number of non-null values returned by the expression.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN count(n.age)$$) as (count agtype);
The number of nodes with the label Person that have a non-null value for the age property is returned.
Result:
| count |
| 3 |
| 1 row(s) returned |
Counting with and without duplicates
In this example we are trying to find all our friends of friends, and count them:
- The first aggregate function,
count(DISTINCT friend_of_friend), will only count afriend_of_friendonce, asDISTINCTremoves the duplicates. - The second aggregate function,
count(friend_of_friend), will consider the samefriend_of_friendmultiple times.
Query
SELECT *FROM cypher('graph_name', $$MATCH (me:Person)-[]->(friend:Person)-[]->(friend_of_friend:Person)WHERE me.name = 'A'RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)$$) as (friend_of_friends_distinct agtype, friend_of_friends agtype);
Both B and C know D and thus D will get counted twice when not using DISTINCT.
Result:
| friend_of_friends_distinct | friend_of_friends |
| 1 | 2 |
| 1 row |
avg
avg() returns the average of a set of numeric values.
Syntax: avg(expression)
Returns:
An agtype integer
Arguments:
| Name | Description |
| expression | An expression returning a set of numeric values. |
Considerations:
- Any null values are excluded from the calculation.
avg(null)returns null.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN avg(n.age)$$) as (avg_age agtype);
The average of all the values in the property age is returned.
Result:
| avg_age |
| 30.0 |
| 1 row(s) returned |
sum
sum() returns the sum of a set of numeric values.
Syntax: sum(expression)
Returns:
An agtype float
Arguments:
| Name | Description |
| expression | An expression returning a set of numeric values. |
Considerations:
- Any null values are excluded from the calculation.
sum(null)returns null.
Query
SELECT *FROM cypher('graph_name', $$MATCH (n:Person)RETURN sum(n.age)$$) as (total_age agtype);
The sum of all the values in the property age is returned.
Result:
| total_age |
| 90 |
| 1 row(s) returned |
