Functions that activate auto aggregation.

Data Setup

  1. LOAD 'age';
  2. SET search_path TO ag_catalog;
  3. SELECT create_graph('graph_name');
  4. SELECT * FROM cypher('graph_name', $$
  5. CREATE (a:Person {name: 'A', age: 13}),
  6. (b:Person {name: 'B', age: 33, eyes: "blue"}),
  7. (c:Person {name: 'C', age: 44, eyes: "blue"}),
  8. (d1:Person {name: 'D', eyes: "brown"}),
  9. (d2:Person {name: 'D'}),
  10. (a)-[:KNOWS]->(b),
  11. (a)-[:KNOWS]->(c),
  12. (a)-[:KNOWS]->(d1),
  13. (b)-[:KNOWS]->(d2),
  14. (c)-[:KNOWS]->(d2)
  15. $$) as (a agtype);

min

min() returns the minimum value in a set of values.

Syntax: min(expression)

Returns:

  1. A property type, or a list, depending on the values returned by expression.

Arguments:

Name Description
expression An expression returning a set containing any combination of property types and lists thereof.

Considerations:

  • Any null values are excluded from the calculation.
  • In a mixed set, any string value is always considered to be lower than any numeric value, and any list is always considered to be lower than any string.
  • Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
  • min(null) returns null.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (v:Person)
  4. RETURN min(v.age)
  5. $$) as (min_age agtype);

The lowest of all the values in the property age is returned.

Result:

min_age
13
1 row(s) returned

Using min() with Lists

Data Setup:

To clarify the following example, assume the next three commands are run first:

  1. SELECT * FROM cypher('graph_name', $$
  2. CREATE (:min_test {val:'d'})
  3. $$) as (result agtype);
  4. SELECT * FROM cypher('graph_name', $$
  5. CREATE (:min_test {val:['a', 'b', 23]})
  6. $$) as (result agtype);
  7. SELECT * FROM cypher('graph_name', $$
  8. CREATE (:min_test {val:[1, 'b', 23]})
  9. $$) as (result agtype);

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (v:min_test)
  4. RETURN min(v.val)
  5. $$) as (min_val agtype);

The lowest of all the values in the set—in this case, the list [‘a’, ‘b’, 23]—is returned, as (i) the two lists are considered to be lower values than the string “d”, and (ii) the string “a” is considered to be a lower value than the numerical value 1.

Result:

min_age
[“a”, “b”, 23]
1 row(s) returned

max

max() returns the maximum value in a set of values.

Syntax: max(expression)

Returns:

  1. A property type, or a list, depending on the values returned by expression.

Arguments:

Name Description
expression An expression returning a set containing any combination of property types and lists thereof.

Considerations:

  • Any null values are excluded from the calculation.
  • In a mixed set, any numeric value is always considered to be higher than any string value, and any string value is always considered to be higher than any list.
  • Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
  • max(null) returns null.

Query:

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN max(n.age)
  5. $$) as (max_age agtype);

The highest of all the values in the property age is returned.

Result:

min_age
44
1 row(s) returned

stDev

stDev() returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N - 1 as the denominator, and should be used when taking a sample of the population for an unbiased estimate. When the standard deviation of the entire population is being calculated, stDevP should be used.

Syntax: stDev(expression)

Returns:

  1. An agtype float.

Arguments:

Name Description
expression An agtype number expression

Considerations:

  • Any null values are excluded from the calculation.
  • stDev(null) returns 0.0 (zero).

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN stDev(n.age)
  5. $$) as (stdev_age agtype);

The standard deviation of the values in the property age is returned.

Result:

stdev_age
15.716233645501712
1 row(s) returned

stDevP

stDevP() returns the standard deviation for the given value over a group. It uses a standard two-pass method, with N as the denominator, and should be used when calculating the standard deviation for an entire population. When the standard deviation of only a sample of the population is being calculated, stDev should be used.

Syntax: stDevP(expression)

Returns:

  1. An agtype float.

Arguments:

Name Description
expression An agtype number expression

Considerations:

  • Any null values are excluded from the calculation.
  • stDevP(null) returns 0.0 (zero).

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN stDevP(n.age)
  5. $$) as (stdevp_age agtype);

The population standard deviation of the values in the property age is returned.

Result:

stdevp_age
12.832251036613439
1 row(s) returned

percentileCont

percentileCont() returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a linear interpolation method, calculating a weighted average between two values if the desired percentile lies between them. For nearest values using a rounding method, see percentileDisc.

Syntax: percentileCont(expression, percentile)

Returns:

  1. An agtype float.

Arguments:

Name Description
expression An agtype number expression
percentile An agtype number value between 0.0 and 1.0

Considerations:

  • Any null values are excluded from the calculation.
  • percentileCont(null, percentile) returns null.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN percentileCont(n.age, 0.4)
  5. $$) as (percentile_cont_age agtype);

The 40th percentile of the values in the property age is returned, calculated with a weighted average. In this case, 0.4 is the median, or 40th percentile.

Result:

percentile_cont_age
29.0
1 row(s) returned

percentileDisc

percentileDisc() returns the percentile of the given value over a group, with a percentile from 0.0 to 1.0. It uses a rounding method and calculates the nearest value to the percentile. For interpolated values, see percentileCont.

Syntax: percentileDisc(expression, percentile)

Returns:

  1. An agtype float.

Arguments:

Name Description
expression An agtype number expression
percentile An agtype number value between 0.0 and 1.0

Considerations:

  • Any null values are excluded from the calculation.
  • percentileDisc(null, percentile) returns null.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN percentileDisc(n.age, 0.5)
  5. $$) as (percentile_disc_age agtype);

The 50th percentile of the values in the property age is returned.

Result:

percentile_cont_age
33.0
1 row(s) returned

count

count() returns the number of values or records, and appears in two variants:

  • count(*) returns the number of matching records
  • count(expr) returns the number of non-null values returned by an expression.

Syntax: count(expression)

Returns:

  1. An agtype integer.

Arguments:

Name Description
expression An expression

Considerations:

  • count(*) includes records returning null.
  • count(expr) ignores null values.
  • count(null) returns 0 (zero).
  • count(*) can be used to return the number of nodes; for example, the number of nodes connected to some node n.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n {name: 'A'})-[]->(x)
  4. RETURN n.age, count(*)
  5. $$) as (age agtype, number_of_people agtype);

The age property of the start node n (with a name value of ‘A’) and the number of nodes related to n are returned.

Result:

age number_of_people
13 3
1 row(s) returned

Using count(*) can be used to group and count relationship types, returning the number of relationships of each type.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n {name: 'A'})-[r]->()
  4. RETURN type(r), count(*)
  5. $$) as (label agtype, count agtype);

The relationship type and the number of relationships with that type are returned.

Result:

label count
“KNOWS” 3
1 row(s) returned

Using count(expression) to return the number of values

Instead of simply returning the number of records with count(*), it may be more useful to return the actual number of values returned by an expression.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n {name: 'A'})-[]->(x)
  4. RETURN count(x)
  5. $$) as (count agtype);

The number of nodes connected to the start node n is returned.

Result:

count
3
1 row(s) returned

Counting non-null values

count(expression) can be used to return the number of non-null values returned by the expression.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN count(n.age)
  5. $$) as (count agtype);

The number of nodes with the label Person that have a non-null value for the age property is returned.

Result:

count
3
1 row(s) returned

Counting with and without duplicates

In this example we are trying to find all our friends of friends, and count them:

  • The first aggregate function, count(DISTINCT friend_of_friend), will only count a friend_of_friend once, as DISTINCT removes the duplicates.
  • The second aggregate function, count(friend_of_friend), will consider the same friend_of_friend multiple times.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (me:Person)-[]->(friend:Person)-[]->(friend_of_friend:Person)
  4. WHERE me.name = 'A'
  5. RETURN count(DISTINCT friend_of_friend), count(friend_of_friend)
  6. $$) as (friend_of_friends_distinct agtype, friend_of_friends agtype);

Both B and C know D and thus D will get counted twice when not using DISTINCT.

Result:

friend_of_friends_distinct friend_of_friends
1 2
1 row

avg

avg() returns the average of a set of numeric values.

Syntax: avg(expression)

Returns:

  1. An agtype integer

Arguments:

Name Description
expression An expression returning a set of numeric values.

Considerations:

  • Any null values are excluded from the calculation.
  • avg(null) returns null.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN avg(n.age)
  5. $$) as (avg_age agtype);

The average of all the values in the property age is returned.

Result:

avg_age
30.0
1 row(s) returned

sum

sum() returns the sum of a set of numeric values.

Syntax: sum(expression)

Returns:

  1. An agtype float

Arguments:

Name Description
expression An expression returning a set of numeric values.

Considerations:

  • Any null values are excluded from the calculation.
  • sum(null) returns null.

Query

  1. SELECT *
  2. FROM cypher('graph_name', $$
  3. MATCH (n:Person)
  4. RETURN sum(n.age)
  5. $$) as (total_age agtype);

The sum of all the values in the property age is returned.

Result:

total_age
90
1 row(s) returned