âšď¸ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.3 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://dzone.com/articles/neo4jcypher-collect-and |
| Last Crawled | 2026-04-13 16:45:14 (8 days ago) |
| First Indexed | 2016-04-18 23:39:44 (10 years ago) |
| HTTP Status Code | 200 |
| Content | |
| Meta Title | Neo4j/Cypher: WITH, COLLECT, and EXTRACT |
| Meta Description | As I mentioned in my last post Iâm trying to get the hang of the WITH statement in neo4jâs cypher query language and I found another application when trying to... |
| Meta Canonical | null |
| Boilerpipe Text | As I
mentioned in my last post
Iâm trying to get the hang of the
WITH
statement in neo4jâs
cypher
query language and I found another application when trying to work out which opponents teams played on certain days.
I started out with a query which grouped the data set by day and showed the opponents that were played on that day:
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
RETURN DISTINCT day.name, COLLECT(TRIM(REPLACE(REPLACE(game.name, "Manchester United", ""), "vs", "")))
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| day.name | opponents |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton","Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City","Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
The way weâve got the opponents is a bit of a hack â the name of the
two teams is in the ânameâ property of a game node and weâve removed
âManchester Unitedâ and the word âvsâ to get the opponentâs name.
I thought itâd be cool if we could separate the games on each day
based on whether Manchester United were playing at home or away.
With
a lot of help from Wes Freeman
we ended up with the following query which does the job:
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as d, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH d, COLLECT([type(h),opp.name]) AS games
RETURN d,
EXTRACT(c in FILTER(x in games: HEAD(x) = "home_team") : HEAD(TAIL(c))) AS home,
EXTRACT(c in FILTER(x in games: HEAD(x) = "away_team") : HEAD(TAIL(c))) AS away
We use a similar approach with COLLECT as in the
previous post
whereby we have a collection of tuples describing whether Manchester United were at home or not and who they were playing.
A neat thing that Wes pointed out is that since there are only 2
teams per game weâre able to get the opponent node easily because itâs
the only other node that can match the âhome_team|away_teamâ
relationship since weâve already matched our team.
If we run the query just up to the last WITH we get the following result:
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | games |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | [["home_team","Liverpool"],["home_team","Everton"],["away_team","Southampton"],["away_team","Liverpool"],["away_team","Newcastle United"],["away_team","Chelsea"],["away_team","Manchester City"],["away_team","Swansea City"],["away_team","Tottenham Hotspur"]] |
| "Wednesday" | [["home_team","Southampton"],["home_team","West Ham United"],["home_team","Newcastle United"]] |
| "Monday" | [["away_team","Everton"]] |
| "Saturday" | [["home_team","Reading"],["home_team","Fulham"],["home_team","Wigan Athletic"],["home_team","Tottenham Hotspur"],["home_team","Stoke City"],["home_team","Arsenal"],["home_team","Queens Park Rangers"],["home_team","Sunderland"],["home_team","West Bromwich Albion"],["home_team","Norwich City"],["away_team","Reading"],["away_team","Aston Villa"],["away_team","Norwich City"],["away_team","Fulham"],["away_team","Queens Park Rangers"]] |
| "Tuesday" | [["away_team","Wigan Athletic"]] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
We then use the
FILTER
function to choose either the opponents Manchester United played at home or away and then we use the
EXTRACT
function to get the opponent from the tuple:
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | home |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | [] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City"] |
| "Tuesday" | [] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
+-----------------------------------------------------------------------------------------------------------------------------+
| d | away |
+-----------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | [] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------+
(I ran the query twice alternating between the last two lines so
that itâs readable here. In actual fact the away teams would be in a
column next to the home teams)
I thought it was quite interesting how many games Manchester United
play away on a Sunday â I think all of those games were probably
televised so I thought theyâd be more evenly split between home and away
matches. Adding televised matches is perhaps
another layer to add to the graph
.
Itâs probably more useful to summarise how many games were played on
each day at home and away rather than who theyâre against and we can use
the
REDUCE
function to do this:
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as dayName, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH dayName, COLLECT([type(h),opp.name]) AS games
RETURN dayName,
REDUCE(homeGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "home_team") : HEAD(TAIL(c))) : homeGames + 1) as home,
REDUCE(awayGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "away_team") : HEAD(TAIL(c))) : awayGames + 1) as away,
REDUCE(totalGames=0, game in games : totalGames + 1) as total
+-----------------------------------+
| dayName | home | away | total |
+-----------------------------------+
| "Sunday" | 2 | 7 | 9 |
| "Wednesday" | 3 | 0 | 3 |
| "Monday" | 0 | 1 | 1 |
| "Saturday" | 10 | 5 | 15 |
| "Tuesday" | 0 | 1 | 1 |
+-----------------------------------+
5 rows
An alternative way of writing the initial query would be the following which Michael Hunger suggested on the thread:
START team = node:teams('name:"Manchester United"')
MATCH p=team-[:home_team|away_team]-game-[:home_team|away_team]-(), game-[:on_day]-day
WITH day.name as dayName, COLLECT([LAST(p), HEAD(RELS(p))]) AS opponents
WITH dayName,
EXTRACT(y in FILTER(x in opponents: TYPE(HEAD(TAIL(x))) = "home_team") : HEAD(y)) AS home,
EXTRACT(y in FILTER(x in opponents : TYPE(HEAD(TAIL(x))) = "away_team") : HEAD(y)) AS away
RETURN dayName,
EXTRACT(team in home: team.name) AS homeOpponents,
EXTRACT(team in away: team.name) AS awayOpponents
ORDER BY dayName
Here we take a slightly different approach where we make use of
functions that we can apply to a matching path. We create a collection
of tuples where
LAST(p)
matches the opponent node and
HEAD(RELS(p))
matches the âhome_teamâ or âaway_teamâ relationship accordingly.
We then filter the collection to find the times that we played at
home and away. This is done by taking the second value from the tuple
and then calling
TYPE
on it which either returns
âhome_teamâ or âaway_teamâ. We then extract the first value from the
tuple which is the opponent node.
In the last part of the query we extract the name from the opponent nodes. |
| Markdown | [](https://dzone.com/)
Thanks for visiting DZone today,
![user avatar]()
[Edit Profile](https://dzone.com/articles/neo4jcypher-collect-and)
- [Manage Email Subscriptions](https://dzone.com/articles/neo4jcypher-collect-and)
- [How to Post to DZone](https://dzone.com/articles/how-to-submit-a-post-to-dzone?utm_source=DZone&utm_medium=user_dropdown&utm_campaign=how_to_post)
- [Article Submission Guidelines](https://dzone.com/articles/dzones-article-submission-guidelines)
[Sign Out](https://dzone.com/users/logout.html) [View Profile](https://dzone.com/articles/neo4jcypher-collect-and)
Post
-  [Post an Article](https://dzone.com/content/article/post.html)
- [Manage My Drafts](https://dzone.com/articles/neo4jcypher-collect-and)
Over 2 million developers have joined DZone.
[Log In](https://dzone.com/users/login.html) / [Join](https://dzone.com/static/registration.html)
[Refcards](https://dzone.com/refcardz) [Trend Reports](https://dzone.com/trendreports)
[Events](https://dzone.com/events) [Video Library](https://dzone.com/events/video-library)
[Refcards](https://dzone.com/refcardz)
[Trend Reports](https://dzone.com/trendreports)
Events
[View Events](https://dzone.com/events) [Video Library](https://dzone.com/events/video-library)
### Related
- [AI-Driven Automated Trading System](https://dzone.com/articles/ai-driven-automated-trading-system)
- [Applying Oracle 19c Release Update (RU): A Practical Guide from My DBA Experience](https://dzone.com/articles/applying-oracle-19c-release-update-ru)
- [Why Queues Donât Fix Scaling Problems](https://dzone.com/articles/queues-dont-absorb-load-they-delay-bankruptcy-1)
- [Migration from Lovable Cloud to Supabase](https://dzone.com/articles/migration-from-lovable-cloud-to-supabase-1)
### Trending
- [Stranger Things in Java: Enum Types](https://dzone.com/articles/stranger-things-in-java-enum-types)
- [Understanding Java Records From Java 16](https://dzone.com/articles/what-are-java-records)
- [Building Scalable Agentic Assistants: A Graph-Based Approach](https://dzone.com/articles/scalable-agentic-ai-assistants-graph)
- [How Spec-Driven Development Brings Structure to AI-Assisted Engineering and How We Put It to the Test](https://dzone.com/articles/how-spec-driven-development-brings-structure)
1. [DZone](https://dzone.com/)
2. [Data Engineering](https://dzone.com/data-engineering)
3. [Databases](https://dzone.com/databases)
4. Neo4j/Cypher: WITH, COLLECT, and EXTRACT
# Neo4j/Cypher: WITH, COLLECT, and EXTRACT
###
By

[Mark Needham](https://dzone.com/users/387289/markhneedham.html)
¡
Mar. 22, 13 ¡ Interview
Likes (0)
Comment
Save
[Tweet](https://dzone.com/articles/neo4jcypher-collect-and)
[Share](https://www.linkedin.com/sharing/share-offsite/?url=https://dzone.com/articles/neo4jcypher-collect-and)
17\.8K Views
Join the DZone community and get the full member experience.
[Join For Free](https://dzone.com/static/registration.html)
As I [mentioned in my last post](http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-getting-the-hang-of-the-with-statement/) Iâm trying to get the hang of the [WITH](http://docs.neo4j.org/chunked/milestone/query-with.html) statement in neo4jâs [cypher](http://docs.neo4j.org/chunked/milestone/cypher-query-lang.html) query language and I found another application when trying to work out which opponents teams played on certain days.
I started out with a query which grouped the data set by day and showed the opponents that were played on that day:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
RETURN DISTINCT day.name, COLLECT(TRIM(REPLACE(REPLACE(game.name, "Manchester United", ""), "vs", "")))
```
```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| day.name | opponents |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton","Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City","Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
```
The way weâve got the opponents is a bit of a hack â the name of the two teams is in the ânameâ property of a game node and weâve removed âManchester Unitedâ and the word âvsâ to get the opponentâs name.
I thought itâd be cool if we could separate the games on each day based on whether Manchester United were playing at home or away.
With [a lot of help from Wes Freeman](https://groups.google.com/forum/?fromgroups=#%21topic/neo4j/D4M1gXKwQ3U) we ended up with the following query which does the job:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as d, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH d, COLLECT([type(h),opp.name]) AS games
RETURN d,
EXTRACT(c in FILTER(x in games: HEAD(x) = "home_team") : HEAD(TAIL(c))) AS home,
EXTRACT(c in FILTER(x in games: HEAD(x) = "away_team") : HEAD(TAIL(c))) AS away
```
We use a similar approach with COLLECT as in the [previous post](http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-getting-the-hang-of-the-with-statement/) whereby we have a collection of tuples describing whether Manchester United were at home or not and who they were playing.
A neat thing that Wes pointed out is that since there are only 2 teams per game weâre able to get the opponent node easily because itâs the only other node that can match the âhome\_team\|away\_teamâ relationship since weâve already matched our team.
If we run the query just up to the last WITH we get the following result:
```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | games |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | [["home_team","Liverpool"],["home_team","Everton"],["away_team","Southampton"],["away_team","Liverpool"],["away_team","Newcastle United"],["away_team","Chelsea"],["away_team","Manchester City"],["away_team","Swansea City"],["away_team","Tottenham Hotspur"]] |
| "Wednesday" | [["home_team","Southampton"],["home_team","West Ham United"],["home_team","Newcastle United"]] |
| "Monday" | [["away_team","Everton"]] |
| "Saturday" | [["home_team","Reading"],["home_team","Fulham"],["home_team","Wigan Athletic"],["home_team","Tottenham Hotspur"],["home_team","Stoke City"],["home_team","Arsenal"],["home_team","Queens Park Rangers"],["home_team","Sunderland"],["home_team","West Bromwich Albion"],["home_team","Norwich City"],["away_team","Reading"],["away_team","Aston Villa"],["away_team","Norwich City"],["away_team","Fulham"],["away_team","Queens Park Rangers"]] |
| "Tuesday" | [["away_team","Wigan Athletic"]] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
```
We then use the [FILTER](http://docs.neo4j.org/chunked/milestone/query-function.html#functions-filter) function to choose either the opponents Manchester United played at home or away and then we use the [EXTRACT](http://docs.neo4j.org/chunked/milestone/query-function.html#functions-extract) function to get the opponent from the tuple:
```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | home |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | [] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City"] |
| "Tuesday" | [] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
+-----------------------------------------------------------------------------------------------------------------------------+
| d | away |
+-----------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | [] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------+
```
*(I ran the query twice alternating between the last two lines so that itâs readable here. In actual fact the away teams would be in a column next to the home teams)*
I thought it was quite interesting how many games Manchester United play away on a Sunday â I think all of those games were probably televised so I thought theyâd be more evenly split between home and away matches. Adding televised matches is perhaps [another layer to add to the graph](http://www.markhneedham.com/blog/2012/07/21/neo4j-embracing-the-sub-graph/).
Itâs probably more useful to summarise how many games were played on each day at home and away rather than who theyâre against and we can use the REDUCE function to do this:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as dayName, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH dayName, COLLECT([type(h),opp.name]) AS games
RETURN dayName,
REDUCE(homeGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "home_team") : HEAD(TAIL(c))) : homeGames + 1) as home,
REDUCE(awayGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "away_team") : HEAD(TAIL(c))) : awayGames + 1) as away,
REDUCE(totalGames=0, game in games : totalGames + 1) as total
```
```
+-----------------------------------+
| dayName | home | away | total |
+-----------------------------------+
| "Sunday" | 2 | 7 | 9 |
| "Wednesday" | 3 | 0 | 3 |
| "Monday" | 0 | 1 | 1 |
| "Saturday" | 10 | 5 | 15 |
| "Tuesday" | 0 | 1 | 1 |
+-----------------------------------+
5 rows
```
An alternative way of writing the initial query would be the following which Michael Hunger suggested on the thread:
```
START team = node:teams('name:"Manchester United"')
MATCH p=team-[:home_team|away_team]-game-[:home_team|away_team]-(), game-[:on_day]-day
WITH day.name as dayName, COLLECT([LAST(p), HEAD(RELS(p))]) AS opponents
WITH dayName,
EXTRACT(y in FILTER(x in opponents: TYPE(HEAD(TAIL(x))) = "home_team") : HEAD(y)) AS home,
EXTRACT(y in FILTER(x in opponents : TYPE(HEAD(TAIL(x))) = "away_team") : HEAD(y)) AS away
RETURN dayName,
EXTRACT(team in home: team.name) AS homeOpponents,
EXTRACT(team in away: team.name) AS awayOpponents
ORDER BY dayName
```
Here we take a slightly different approach where we make use of functions that we can apply to a matching path. We create a collection of tuples where LAST(p) matches the opponent node and HEAD(RELS(p)) matches the âhome\_teamâ or âaway\_teamâ relationship accordingly.
We then filter the collection to find the times that we played at home and away. This is done by taking the second value from the tuple and then calling TYPE on it which either returns âhome\_teamâ or âaway\_teamâ. We then extract the first value from the tuple which is the opponent node.
In the last part of the query we extract the name from the opponent nodes.
Extract Database
Published at DZone with permission of Mark Needham. [See the original article here.](http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-with-collect-extract/)
Opinions expressed by DZone contributors are their own.
### Related
- AI-Driven Automated Trading System
- Applying Oracle 19c Release Update (RU): A Practical Guide from My DBA Experience
- Why Queues Donât Fix Scaling Problems
- Migration from Lovable Cloud to Supabase
## Partner Resources
Ă
***
Comments
The likes didn't load as expected. Please refresh the page and try again.
ABOUT US
- [About DZone](https://dzone.com/pages/about)
- [Support and feedback](https://dzone.com/cdn-cgi/l/email-protection#53202623233c21271337293c3d367d303c3e)
- [Community research](https://dzone.com/pages/dzone-community-research)
ADVERTISE
- [Advertise with DZone](https://advertise.dzone.com/)
CONTRIBUTE ON DZONE
- [Article Submission Guidelines](https://dzone.com/articles/dzones-article-submission-guidelines)
- [Become a Contributor](https://dzone.com/pages/contribute)
- [Core Program](https://dzone.com/pages/core)
- [Visit the Writers' Zone](https://dzone.com/writers-zone)
LEGAL
- [Terms of Service](https://technologyadvice.com/terms-conditions/)
- [Privacy Policy](https://technologyadvice.com/privacy-policy/)
CONTACT US
- 3343 Perimeter Hill Drive
- Suite 215
- Nashville, TN 37211
- [\[email protected\]](https://dzone.com/cdn-cgi/l/email-protection#077472777768757347637d6869622964686a)
Let's be friends: |
| Readable Markdown | As I [mentioned in my last post](http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-getting-the-hang-of-the-with-statement/) Iâm trying to get the hang of the [WITH](http://docs.neo4j.org/chunked/milestone/query-with.html) statement in neo4jâs [cypher](http://docs.neo4j.org/chunked/milestone/cypher-query-lang.html) query language and I found another application when trying to work out which opponents teams played on certain days.
I started out with a query which grouped the data set by day and showed the opponents that were played on that day:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
RETURN DISTINCT day.name, COLLECT(TRIM(REPLACE(REPLACE(game.name, "Manchester United", ""), "vs", "")))
```
```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| day.name | opponents |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton","Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City","Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
```
The way weâve got the opponents is a bit of a hack â the name of the two teams is in the ânameâ property of a game node and weâve removed âManchester Unitedâ and the word âvsâ to get the opponentâs name.
I thought itâd be cool if we could separate the games on each day based on whether Manchester United were playing at home or away.
With [a lot of help from Wes Freeman](https://groups.google.com/forum/?fromgroups=#%21topic/neo4j/D4M1gXKwQ3U) we ended up with the following query which does the job:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as d, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH d, COLLECT([type(h),opp.name]) AS games
RETURN d,
EXTRACT(c in FILTER(x in games: HEAD(x) = "home_team") : HEAD(TAIL(c))) AS home,
EXTRACT(c in FILTER(x in games: HEAD(x) = "away_team") : HEAD(TAIL(c))) AS away
```
We use a similar approach with COLLECT as in the [previous post](http://www.markhneedham.com/blog/2013/03/20/neo4jcypher-getting-the-hang-of-the-with-statement/) whereby we have a collection of tuples describing whether Manchester United were at home or not and who they were playing.
A neat thing that Wes pointed out is that since there are only 2 teams per game weâre able to get the opponent node easily because itâs the only other node that can match the âhome\_team\|away\_teamâ relationship since weâve already matched our team.
If we run the query just up to the last WITH we get the following result:
```
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | games |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | [["home_team","Liverpool"],["home_team","Everton"],["away_team","Southampton"],["away_team","Liverpool"],["away_team","Newcastle United"],["away_team","Chelsea"],["away_team","Manchester City"],["away_team","Swansea City"],["away_team","Tottenham Hotspur"]] |
| "Wednesday" | [["home_team","Southampton"],["home_team","West Ham United"],["home_team","Newcastle United"]] |
| "Monday" | [["away_team","Everton"]] |
| "Saturday" | [["home_team","Reading"],["home_team","Fulham"],["home_team","Wigan Athletic"],["home_team","Tottenham Hotspur"],["home_team","Stoke City"],["home_team","Arsenal"],["home_team","Queens Park Rangers"],["home_team","Sunderland"],["home_team","West Bromwich Albion"],["home_team","Norwich City"],["away_team","Reading"],["away_team","Aston Villa"],["away_team","Norwich City"],["away_team","Fulham"],["away_team","Queens Park Rangers"]] |
| "Tuesday" | [["away_team","Wigan Athletic"]] |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
```
We then use the [FILTER](http://docs.neo4j.org/chunked/milestone/query-function.html#functions-filter) function to choose either the opponents Manchester United played at home or away and then we use the [EXTRACT](http://docs.neo4j.org/chunked/milestone/query-function.html#functions-extract) function to get the opponent from the tuple:
```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| d | home |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Liverpool","Everton"] |
| "Wednesday" | ["Southampton","West Ham United","Newcastle United"] |
| "Monday" | [] |
| "Saturday" | ["Reading","Fulham","Wigan Athletic","Tottenham Hotspur","Stoke City","Arsenal","Queens Park Rangers","Sunderland","West Bromwich Albion","Norwich City"] |
| "Tuesday" | [] |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows
+-----------------------------------------------------------------------------------------------------------------------------+
| d | away |
+-----------------------------------------------------------------------------------------------------------------------------+
| "Sunday" | ["Southampton","Liverpool","Newcastle United","Chelsea","Manchester City","Swansea City","Tottenham Hotspur"] |
| "Wednesday" | [] |
| "Monday" | ["Everton"] |
| "Saturday" | ["Reading","Aston Villa","Norwich City","Fulham","Queens Park Rangers"] |
| "Tuesday" | ["Wigan Athletic"] |
+-----------------------------------------------------------------------------------------------------------------------------+
```
*(I ran the query twice alternating between the last two lines so that itâs readable here. In actual fact the away teams would be in a column next to the home teams)*
I thought it was quite interesting how many games Manchester United play away on a Sunday â I think all of those games were probably televised so I thought theyâd be more evenly split between home and away matches. Adding televised matches is perhaps [another layer to add to the graph](http://www.markhneedham.com/blog/2012/07/21/neo4j-embracing-the-sub-graph/).
Itâs probably more useful to summarise how many games were played on each day at home and away rather than who theyâre against and we can use the REDUCE function to do this:
```
START team = node:teams('name:"Manchester United"')
MATCH team-[h:home_team|away_team]-game-[:on_day]-day
WITH day.name as dayName, game, team, h
MATCH team-[:home_team|away_team]-game-[:home_team|away_team]-opp
WITH dayName, COLLECT([type(h),opp.name]) AS games
RETURN dayName,
REDUCE(homeGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "home_team") : HEAD(TAIL(c))) : homeGames + 1) as home,
REDUCE(awayGames=0, game in EXTRACT(c in FILTER(x in games: head(x) = "away_team") : HEAD(TAIL(c))) : awayGames + 1) as away,
REDUCE(totalGames=0, game in games : totalGames + 1) as total
```
```
+-----------------------------------+
| dayName | home | away | total |
+-----------------------------------+
| "Sunday" | 2 | 7 | 9 |
| "Wednesday" | 3 | 0 | 3 |
| "Monday" | 0 | 1 | 1 |
| "Saturday" | 10 | 5 | 15 |
| "Tuesday" | 0 | 1 | 1 |
+-----------------------------------+
5 rows
```
An alternative way of writing the initial query would be the following which Michael Hunger suggested on the thread:
```
START team = node:teams('name:"Manchester United"')
MATCH p=team-[:home_team|away_team]-game-[:home_team|away_team]-(), game-[:on_day]-day
WITH day.name as dayName, COLLECT([LAST(p), HEAD(RELS(p))]) AS opponents
WITH dayName,
EXTRACT(y in FILTER(x in opponents: TYPE(HEAD(TAIL(x))) = "home_team") : HEAD(y)) AS home,
EXTRACT(y in FILTER(x in opponents : TYPE(HEAD(TAIL(x))) = "away_team") : HEAD(y)) AS away
RETURN dayName,
EXTRACT(team in home: team.name) AS homeOpponents,
EXTRACT(team in away: team.name) AS awayOpponents
ORDER BY dayName
```
Here we take a slightly different approach where we make use of functions that we can apply to a matching path. We create a collection of tuples where LAST(p) matches the opponent node and HEAD(RELS(p)) matches the âhome\_teamâ or âaway\_teamâ relationship accordingly.
We then filter the collection to find the times that we played at home and away. This is done by taking the second value from the tuple and then calling TYPE on it which either returns âhome\_teamâ or âaway\_teamâ. We then extract the first value from the tuple which is the opponent node.
In the last part of the query we extract the name from the opponent nodes. |
| ML Classification | |
| ML Categories | null |
| ML Page Types | null |
| ML Intent Types | null |
| Content Metadata | |
| Language | en |
| Author | Mark Needham |
| Publish Time | not set |
| Original Publish Time | 2016-04-18 23:39:44 (10 years ago) |
| Republished | No |
| Word Count (Total) | 1,304 |
| Word Count (Content) | 969 |
| Links | |
| External Links | 24 |
| Internal Links | 30 |
| Technical SEO | |
| Meta Nofollow | No |
| Meta Noarchive | No |
| JS Rendered | No |
| Redirect Target | null |
| Performance | |
| Download Time (ms) | 889 |
| TTFB (ms) | 887 |
| Download Size (bytes) | 30,471 |
| Shard | 102 (laksa) |
| Root Hash | 8443605588788139902 |
| Unparsed URL | com,dzone!/articles/neo4jcypher-collect-and s443 |