Make Your Joins Efficient with Ballerina Query Expressions
Are you new to Ballerina Query Expressions? No worries… 😀 Let me give a quick intro then!! Query expressions in Ballerina allow you to write queries for data processing using SQL-like syntax. With this language integrated query feature, Ballerina offers you more quality in querying your data much faster and accurately. Exciting isn't it?? 😉 You can read my previous article to get started with query expressions in Ballerina.
In this article, I’ll be taking you a step further and walking you through joins in query expressions in Ballerina.
`join` Clause
join
clause is one of the intermediate clauses which form the query pipeline in query expression. It is used to perform either inner equijoin
or left outer equijoin
. The language specification for using thejoin
clause is as follows.
join-clause := [outer] join typed-binding-pattern in expression join-on-condition
join-on-condition := on expression equals expression
The behavior of join
clause is similar to from
clause which is used to iterate any iterable value. It takes two source iterables as inputs. The elements in each source must contain a property that can be compared to a property in the other source. A comparison of each value in a specific iteration is carried out to identify the matching items based on the condition defined using join-on-condition
. The iteration is skipped for the non-matching items, which dissatisfies the on
condition.
The equals
keyword emphasizes that the outer source iterable is consumed by the left key and the inner source iterable is consumed by the right key. Therefore left side of the equals
has only the outer source in its scope and the right side of the equals
only has the inner source in its scope. The result of the query expression with join
depends on the type of join you are performing, whether it’s an inner or a left outer equijoin. The join
clause is implemented as a hash join which makes it more efficient in the data retrieval.
You can try out the samples in this blog post using the latest version of Ballerina, swan-lake-alpha1.
Inner Equijoin
An inner equijoin only returns values from joined data sources where a certain condition of equality is met.
Consider the following query to get the first name, last name, and department name of all people with department information.
The above query expression uses join
clause to match each Person
value with Department
using department id. If a Person
value has no matching Department
, that Person
value will not appear in the result set. The query constructs a list of DeptPerson
type members. The output of the above query would be,
[{"fname":"Melina","lname":"Kodel","dept":"HR"},{"fname":"Tom","lname":"Riddle","dept":"HR"},{"fname":"Tobi","lname":"Biden","dept":"Operations"}]
Left Outer Equijoin
The left outer equijoin returns all elements from the left data source and matched elements from the right data source where a certain condition of equality is met. The right side element would produce a null
if left side element has no matches. To perform left outer equijoin you need to add outer
keyword to your join
clause.
Consider the following example where you want to get names of all people and their department name if exists.
According to the above query expression each Person
value is joined with Department
based on the department id and each element in personList
is included in the result even if there are no matches in the deptList
. When Department
is empty for any Person
a null
value is produced. Therefore the result of the above query expression is,
[{"fname":"Melina","lname":"Kodel","dept":"HR"},{"fname":"Tom","lname":"Riddle","dept":"HR"},{"fname":"Tobi","lname":"Biden","dept":"Operations"},{"fname":"Emily","lname":"Blunt","dept":null}]
Join usage with streams and tables
Similar to lists, join
can be used with streams, and tables as well.
♦ ️Get details about people in each department and order them by department name when the iterable in join
clause is a stream
.
{"value":{"fname":"Melina","lname":"Kodel","dept":"HR"}}
{"value":{"fname":"Tom","lname":"Riddle","dept":"HR"}}
{"value":{"fname":"Tobi","lname":"Biden","dept":"Operations"}}
♦ Get details about people in each department and order them by department name when the iterable in join
clause is a table
.
{"value":{"id":1,"fname":"Melina","lname":"Kodel","deptId":1}}
{"value":{"id":4,"fname":"Tom","lname":"Riddle","deptId":1}}
{"value":{"id":2,"fname":"Tobi","lname":"Biden","deptId":2}}
Multiple joins
Any number of joins can be used in your query expression to perform multiple joins where each join
clause matches up the result of the previous join
clause to produce the final result.
Let’s take a look at the following example where you want to get the department name and the manager name for each person. In order to get the department name to which a person belongs, we need to join each person with the department. That result can be joined with manager details to get the manager name of each department.
The first join
clause matches Person
values with Department
based on the department id and produces a list of all people having department details. The second join
clause matches Manager
values with their Department
based on the department id so that a list of all people having department details with their manager details are produced. The output of the above query expression is,
[{“name”:”Melina”,”dept”:”HR”,”manager”:”James”},{“name”:”Tobi”,”dept”:”Operations”,”manager”:”Emma”},{“name”:”Tom”,”dept”:”HR”,”manager”:”James”}
Nested Query Expressions with Joins
Similar to SQL, a nested query expression or an inner query expression is a query expression within another query expression. An inner query expression can be embedded within a join
clause where its resultant value set can be utilized in the outer query expression.
The following example shows where a query expression embedded in the join
clause outputs a list
containing the first two managers and it is used to get the details of all people having matching manager details.
The output of the above query expression is,
[{"name":"Melina","manager":"James"},{"name":"Emily","manager":"Mike"},{"name":"Tom","manager":"James"}]
And yeah that’s it! Hope you got an understanding about working with joins in Ballerina query expressions! 😁 Don’t forget to go through the Ballerina website if you want to find out more exciting features. 😇
Till we meet again, Adiós!! 🙋🏽
References