Note: the Vietnamese version of this article is available at the link below.
https://duongnt.com/elasticsearch-dsl-kotlin-vie
One of the things I like most about Kotlin is how easy it is to develop a DSL (domain-specific language) with it. This is thanks to a few features such as lambdas with receivers, infix functions, and operator overloading (for example, the unary plus operator). Today, we will develop a toy DSL to write a SearchRequest to query data from an Elasticsearch cluster.
You can download the sample code in this article from the link below.
https://github.com/duongntbk/elasticsearch-dsl-demo
Prerequisites
Similar to the last article, we need a local Elasticsearch cluster to test our code. You can set one up by following this guide. We will also use the same test data, so I won’t copy it here.
Why do we need a DSL?
In a previous article, we explored how to write a SearchRequest with the Elasticsearch API Client library. With its robust interface, we can write any query we can think of. But because this library has to support all use cases, its syntax is quite complicated and verbose. As a reminder, here is how we write a request with Elasticsearch API Client.
SearchRequest.of { s -> s
    .index("footballer")
    .query { q -> q
        .bool { b -> b
            .must { m -> m
                .bool { b -> b
                    .should { s -> s
                        .term { t -> t
                            .field("position")
                            .value("rw")
                            .boost(2F)
                        }
                    }
                    .should { s -> s
                        .term { t -> t
                            .field("position")
                            .value("lw")
                        }
                    }
                }
            }
            .must { m -> m
                .bool { b -> b
                    .should { s -> s
                        .range { r -> r
                            .field("age")
                            .lte(JsonData.of(23))
                        }
                    }
                    .should { s -> s
                        .range { r -> r
                            .field("salary")
                            .lte(JsonData.of(200))
                            .boost(2F)
                        }
                    }
                }
            }
        }
    }
}
In this article, we will develop a DSL to simplify the code above. Here is a sneak peek of what we will come up with.
val query = and {
    +or {
        +(AGE lte 23)
        +((SALARY lte 200) boost 2F)
    }
    +or {
        +((POSITION eq "rw") boost 2F)
        +(POSITION eq "lw")
    }
}
The snippet above is not magic; it uses only plain-old Kotlin. In the next sections, we will go step-by-step until we reach that final version. But first, let’s take a detour and look at lambda with receiver, an important feature that enables concise DSL syntax.
A primer on lambda with receiver
First class functions in Kotlin
As we know, functions are first-class in Kotlin. This means a function can receive another function as an argument, as well as return a function to the caller. And wherever we can use a full-fledged function, we can also use a lambda with the same signature. Below is a simple function that receives a lambda with one StringBuilder argument. It invokes that lambda with a newly created StringBuilder object and returns the String value from that StringBuilder.
fun demo(func: (StringBuilder) -> Unit): String {
    val sb = StringBuilder()
    func(sb)
    return sb.toString()
}
Here is how we call that demo function with a lambda. Notice that we use the special variable it instead of naming the parameter with the -> syntax.
println(demo {
    it.append("a")
    it.append("b")
})
As expected, the code above prints the two characters ab to the console.
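For comparison, here is a small sketch that calls the same demo function twice: once with an explicitly named parameter (the -> syntax) and once with the implicit it. The name sb inside the lambda is just a choice made for this example.

```kotlin
fun demo(func: (StringBuilder) -> Unit): String {
    val sb = StringBuilder()
    func(sb)
    return sb.toString()
}

fun main() {
    // Explicitly named lambda parameter, using the -> syntax
    val explicit = demo { sb ->
        sb.append("a")
        sb.append("b")
    }

    // Implicit single parameter, available as `it`
    val implicit = demo {
        it.append("a")
        it.append("b")
    }

    println(explicit)             // ab
    println(explicit == implicit) // true
}
```

Both forms compile to the same thing; it is only available when the lambda has exactly one parameter.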
Lambda with receiver
Even though we can simplify the code above by using the special variable it, we still need to repeat that variable whenever we want to call a StringBuilder method. Instead, we can designate the first argument in the function parameter of demo as a receiver. This allows us to refer to its members directly without using a qualifier.
fun demo(func: StringBuilder.() -> Unit): String {
    val sb = StringBuilder()
    sb.func()
    return sb.toString()
}
println(demo {
    append("c")
    append("d")
})
The code above prints cd to the console, just as we expected.
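To make the receiver a bit more tangible, the sketch below shows that inside such a lambda, this refers to the StringBuilder, and that the standard library function buildString is built on the same idea. The variable names are only for illustration.

```kotlin
fun demo(func: StringBuilder.() -> Unit): String {
    val sb = StringBuilder()
    sb.func()
    return sb.toString()
}

fun main() {
    val viaThis = demo {
        this.append("c") // `this` is the StringBuilder receiver
        append("d")      // the qualifier can be dropped
    }

    // buildString from the Kotlin standard library also takes
    // a lambda with a StringBuilder receiver.
    val viaStdlib = buildString {
        append("c")
        append("d")
    }

    println(viaThis)   // cd
    println(viaStdlib) // cd
}
```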
Building our DSL
Create a BoolQuery with the keyword “or”
As a start, we want to create a BoolQuery that searches for documents that match any of its children with the syntax or {}. To do that, we need a helper class called OrOperator.
class OrOperator {
    // A list to store all child queries inside a BoolQuery
    private val children = mutableListOf<Query>()

    // Add new query to the children list
    fun add(childQuery: Query) {
        children.add(childQuery)
    }

    // Create a BoolQuery.Builder and add all its child queries with the `should` method.
    // Then build the query
    fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.should(children)
        return builder.build()._toQuery()
    }
}
Then we can write an or method that uses a lambda with receiver to add queries to a BoolQuery.Builder and build it.
fun or(
    addQuery: OrOperator.() -> Unit // addQuery is a lambda with receiver
): Query {
    val operator = OrOperator()
    operator.addQuery()
    return operator.build()
}
With it, we can write a query to find either a left-winger or a right-winger.
val positionTerm1 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val positionTerm2 = TermQuery.Builder().field("position").value("rw").build()._toQuery()
val query = or {
    add(positionTerm1)
    add(positionTerm2)
}
Honestly, this syntax is not that much more readable than the default one. But it is a good start, and we can still improve.
Add queries to the children list with a unary operator
The add function doesn’t look too cool; let’s use the + symbol instead. In this case, the plus operator has only one operand, so it is a unary operator, which Kotlin lets us overload with the unaryPlus operator function. Let’s add that to the OrOperator class.
operator fun Query.unaryPlus() {
    children.add(this)
}
Now we can update the DSL to this version.
val positionTerm1 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val positionTerm2 = TermQuery.Builder().field("position").value("rw").build()._toQuery()
val query = or {
    +positionTerm1
    +positionTerm2
}
Create a TermQuery with an infix function
The users of our DSL shouldn’t need to know about TermQuery or its syntax. Since we only need two values to define a TermQuery, wouldn’t it be cool if they could just write "position" eq "lw"? That is closer to natural language and is much more readable and concise. We can easily achieve this by defining an infix function with the name eq. And to avoid polluting the global namespace, we will put that function inside the OrOperator class as well.
infix fun String.eq(value: String): Query =
    TermQuery.Builder().field(this).value(value).build()._toQuery()
Our DSL now becomes this.
val query = or {
    // This syntax only works inside an "or {}" block
    +("position" eq "lw")
    +("position" eq "rw")
}
Unfortunately, we need a pair of parentheses here because otherwise the unary plus operator, which binds more tightly than an infix function call, would be applied to "position" first. But other than that, our DSL is looking great.
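The precedence rule is easy to observe in isolation. Below is a minimal sketch that does not touch Elasticsearch at all: Token, calls, and the two operators are hypothetical stand-ins invented for this demonstration, and each operator simply records when it is invoked.

```kotlin
// Records the order in which the operators are invoked
val calls = mutableListOf<String>()

// Hypothetical stand-in type, only used to observe evaluation order
class Token(val text: String)

operator fun Token.unaryPlus(): Token {
    calls += "unaryPlus($text)"
    return this
}

infix fun Token.eq(other: Token): Token {
    calls += "eq($text, ${other.text})"
    return Token("$text=${other.text}")
}

fun main() {
    // Without parentheses: parsed as (+Token("position")) eq Token("lw")
    val first = +Token("position") eq Token("lw")
    println(calls) // [unaryPlus(position), eq(position, lw)]

    calls.clear()

    // With parentheses: eq runs first, then unaryPlus on its result
    val second = +(Token("position") eq Token("lw"))
    println(calls) // [eq(position, lw), unaryPlus(position=lw)]
}
```

In our DSL the unparenthesized form does not even compile, because there is no unaryPlus defined for String inside OrOperator.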
Handle RangeQuery
RangeQuery is very similar to TermQuery. It does have a lot more operators, but in this article we only care about the lte operator. We can implement it inside the OrOperator class like this. Notice that the value type is now Int instead of String.
infix fun String.lte(value: Int): Query =
    RangeQuery.Builder().field(this).lte(JsonData.of(value)).build()._toQuery()
We can write a new query to find young or cheap players.
val query = or {
    +("age" lte 23)
    +("salary" lte 200)
}
Create a BoolQuery with the keyword “and”
Now we want to combine the two queries created by the new DSL inside a BoolQuery, and we want to find documents that match both conditions. As you may have guessed, we need an AndOperator class first. We could simply copy the OrOperator helper class and call the must method on the builder instead of the should method. But those two classes would be almost identical, so let’s move the common parts to an abstract class.
abstract class BaseOperator {
    protected val children = mutableListOf<Query>()

    operator fun Query.unaryPlus() {
        children.add(this)
    }

    infix fun String.eq(value: String): Query =
        TermQuery.Builder().field(this).value(value).build()._toQuery()

    infix fun String.lte(value: Int): Query =
        RangeQuery.Builder().field(this).lte(JsonData.of(value)).build()._toQuery()

    abstract fun build(): Query
}
Then, inside the AndOperator and OrOperator classes, we only need to override the build method.
class AndOperator: BaseOperator() {
    override fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.must(children)
        return builder.build()._toQuery()
    }
}

class OrOperator: BaseOperator() {
    override fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.should(children)
        return builder.build()._toQuery()
    }
}
And we need an and method to use the AndOperator class, similar to the or method and OrOperator.
fun and(
    addQuery: AndOperator.() -> Unit
): Query {
    val operator = AndOperator()
    operator.addQuery()
    return operator.build()
}
Use both “and” and “or” queries together
The next step is to write a combined query to search for either a left-winger or a right-winger, who is either young or cheap.
val secondQuery = or {
    +("position" eq "rw")
    +("position" eq "lw")
}
val firstQuery = or {
    +("age" lte 23)
    +("salary" lte 200)
}
val query = and {
    +firstQuery
    +secondQuery
}
Or even better, we don’t need to create temporary variables to hold the two or queries.
val query = and {
    +or {
        +("age" lte 23)
        +("salary" lte 200)
    }
    +or {
        +("position" eq "rw")
        +("position" eq "lw")
    }
}
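If you want to play with the nesting without pulling in the Elasticsearch client, here is a minimal, dependency-free sketch that follows the same shape as our DSL. Note that Node, Leaf, Group, BaseOp, AndOp, OrOp, and render are hypothetical names made up for this sketch; they stand in for Query and the operator classes above and are not part of the Elasticsearch client. Instead of building a real query, the sketch renders the tree as text so that we can see how the nested blocks compose.

```kotlin
// Hypothetical stand-ins for the client's Query type
sealed interface Node
data class Leaf(val text: String) : Node
data class Group(val kind: String, val children: List<Node>) : Node

abstract class BaseOp(private val kind: String) {
    private val children = mutableListOf<Node>()

    // Same operators as in the article, but producing stand-in nodes
    operator fun Node.unaryPlus() {
        children.add(this)
    }

    infix fun String.eq(value: String): Node = Leaf("$this == $value")

    infix fun String.lte(value: Int): Node = Leaf("$this <= $value")

    fun build(): Node = Group(kind, children.toList())
}

class AndOp : BaseOp("must")
class OrOp : BaseOp("should")

fun and(block: AndOp.() -> Unit): Node = AndOp().apply(block).build()
fun or(block: OrOp.() -> Unit): Node = OrOp().apply(block).build()

// Render the tree as a single line of text
fun render(node: Node): String = when (node) {
    is Leaf -> node.text
    is Group -> node.children.joinToString(
        prefix = "${node.kind}(",
        postfix = ")"
    ) { render(it) }
}

fun main() {
    val query = and {
        +or {
            +("age" lte 23)
            +("salary" lte 200)
        }
        +or {
            +("position" eq "rw")
            +("position" eq "lw")
        }
    }
    println(render(query))
    // must(should(age <= 23, salary <= 200), should(position == rw, position == lw))
}
```

Inside an inner or block, the unary plus resolves to the innermost receiver, which is why each condition ends up in the right group.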
Add query boosting to our DSL
The last step is to add query-boosting syntax. You might still remember that at the beginning of today’s article, we wrote a boost query like this: +((SALARY lte 200) boost 2F). The syntax to support boosting is actually very similar to the syntax to support TermQuery or RangeQuery. We define boost as an infix function on the Query type that takes a Float parameter. Let’s add it to the BaseOperator class.
infix fun Query.boost(factor: Float): Query =
    if (factor == 1F) {
        this // short-circuit in case the boost factor is 1
    } else {
        BoolQuery.Builder().must(this).boost(factor).build()._toQuery()
    }
Finally, our code now looks very similar to the code in the introduction section.
val query = and {
    +or {
        +("age" lte 23)
        +(("salary" lte 200) boost 2F)
    }
    +or {
        +(("position" eq "rw") boost 2F)
        +("position" eq "lw")
    }
}
DSL and type safety
What if we write something like or { +("batman" eq "fw") }? Clearly, we don’t have a field called batman in our index, so this query cannot work as intended. Shouldn’t our DSL be smart enough to prevent this? Fortunately, we don’t have to sacrifice type safety in our DSL. As a start, we can define all the supported fields as constants. Our code will then become this.
val query = and {
    +or {
        +(AGE lte 23)
        +((SALARY lte 200) boost 2F)
    }
    +or {
        +((POSITION eq "rw") boost 2F)
        +(POSITION eq "lw")
    }
}
Of course, this is not perfect, as people can still resort to writing queries without using our constants. But then that has to be a deliberate choice. And we can always modify our DSL to ensure type safety. Below is an idea; I’ll leave the implementation as an exercise for our readers.
- Define a TERM class that has just a name property of type String.
- Define a companion object in the TERM class with three properties, also of type TERM, named AGE/POSITION/SALARY.
- The properties of type TERM above are initialized with their names set to age/position/salary.
- Change the infix method eq to work on TERM instead of String.
  infix fun TERM.eq(value: String): Query = TermQuery.Builder().field(this.name).value(value).build()._toQuery()
- We can now write our query like this: +(TERM.POSITION eq "lw").
- If we import the companion properties of the class TERM, we can even write +(POSITION eq "lw").
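For readers who want something to compare their solution with, here is one possible sketch of the steps above. To keep it runnable without the Elasticsearch client, eq returns a hypothetical TermCondition data class instead of a real Query; in the actual DSL, the body would call TermQuery.Builder() as shown in the list. The private constructor is also my own assumption, added so that nobody can create a TERM for a field that does not exist.

```kotlin
// Hypothetical stand-in for the client's Query type
data class TermCondition(val field: String, val value: String)

// Only the constants in the companion object can be created,
// because the constructor is private (an assumption of this sketch).
class TERM private constructor(val name: String) {
    companion object {
        val AGE = TERM("age")
        val POSITION = TERM("position")
        val SALARY = TERM("salary")
    }
}

// eq now works on TERM instead of String
infix fun TERM.eq(value: String): TermCondition =
    TermCondition(this.name, value)

fun main() {
    val condition = TERM.POSITION eq "lw"
    println(condition) // TermCondition(field=position, value=lw)

    // "batman" eq "fw" no longer compiles:
    // eq is not defined for String anymore.
}
```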
Conclusion
A DSL is an excellent tool to bridge the gap between domain experts and developers. Imagine a scouting company with a huge database of footballers from all over the world. Using a DSL, any scout or coach can search this database to find their perfect transfer target without any programming knowledge.
On the developer’s side, a DSL helps write clear and concise code that is more readable and understandable. This helps us focus on the high-level logic without getting lost in implementation details.