Note: the Vietnamese version of this article is at the link below.

https://duongnt.com/elasticsearch-dsl-kotlin-vie

Kotlin DSL example with Elasticsearch

One of the things I like most about Kotlin is how easy it is to develop a DSL (domain-specific language) with it. This is thanks to a few features such as lambdas with receivers, infix functions, and unary operator functions. Today, we will develop a toy DSL for writing a SearchRequest that queries data from an Elasticsearch cluster.

You can download the sample code in this article from the link below.

https://github.com/duongntbk/elasticsearch-dsl-demo

Prerequisites

Similar to the last article, we need a local Elasticsearch cluster to test our code. You can set one up by following this guide. We will also use the same test data, so I won’t copy it here.

Why do we need a DSL?

In a previous article, we explored how to write a SearchRequest with the library Elasticsearch API Client. With its robust interface, we can write all the queries we can think of. But because this library has to support all use cases, its syntax is quite complicated and verbose. As a reminder, here is how we write a request with Elasticsearch API Client.

SearchRequest.of { s -> s
    .index("footballer")
    .query { q -> q
        .bool { b -> b
            .must { m -> m
                .bool { b -> b
                    .should { s -> s
                        .term { t -> t
                            .field("position")
                            .value("rw")
                            .boost(2F)
                        }
                    }
                    .should { s -> s
                        .term { t -> t
                            .field("position")
                            .value("lw")
                        }
                    }
                }
            }
            .must { m -> m
                .bool { b -> b
                    .should { s -> s
                        .range { r -> r
                            .field("age")
                            .lte(JsonData.of(23))
                        }
                    }
                    .should { s -> s
                        .range { r -> r
                            .field("salary")
                            .lte(JsonData.of(200))
                            .boost(2F)
                        }
                    }
                }
            }
        }
    }
}

In this article, we will develop a DSL to simplify the code above. This is a sneak peek of what we will come up with.

val query = and {
    +or {
        +(AGE lte 23)
        +((SALARY lte 200) boost 2F)
    }
    +or {
        +((POSITION eq "rw") boost 2F)
        +(POSITION eq "lw")
    }
}

The snippet above is not magic; it uses only plain old Kotlin. In the next sections, we will go step by step until we reach that final version. But first, let’s take a detour and look at lambda with receiver, an important feature that enables concise DSL syntax.

A primer on lambda with receiver

First class functions in Kotlin

As we know, functions are first-class in Kotlin. This means a function can receive a different function as an argument, as well as return a function to the caller. And wherever we can use a full-fledged function, we can also use a lambda with the same signature. Below is a simple function that receives a lambda with one StringBuilder argument. It invokes that lambda with a newly created StringBuilder object and returns the String value from that StringBuilder.

fun demo(func: (StringBuilder) -> Unit): String {
    val sb = StringBuilder()
    func(sb)
    return sb.toString()
}

Here is how we call that demo function with a lambda. Notice that we use the special variable it instead of using the -> syntax.

println(demo {
    it.append("a")
    it.append("b")
})

As we expected, the code above prints the two characters ab to the console.
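
The other half of first-class functions, returning a function, can be sketched in the same style. This is only an illustration: appender is a hypothetical helper that is not part of the sample code, and demo is repeated so that the snippet is self-contained.

```kotlin
// Same demo function as above: it receives a function and invokes it.
fun demo(func: (StringBuilder) -> Unit): String {
    val sb = StringBuilder()
    func(sb)
    return sb.toString()
}

// Hypothetical helper: returns a function that appends the given text.
fun appender(text: String): (StringBuilder) -> Unit = { sb -> sb.append(text) }

// A function value can be stored in a variable like any other value.
val appendA: (StringBuilder) -> Unit = appender("a")

// Passing the stored function instead of a lambda literal.
val result = demo(appendA) // "a"
```

Here demo does not care whether its argument is a lambda literal or a function value that was created somewhere else.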

Lambda with receiver

Even though we can simplify the code above by using the special variable it, we still need to repeat that variable whenever we want to call a StringBuilder method. Instead, we can change the function type of demo’s parameter so that the StringBuilder becomes the receiver of the lambda rather than its argument. This allows us to refer to its members directly without using a qualifier.

fun demo(func: StringBuilder.() -> Unit): String {
    val sb = StringBuilder()
    sb.func()
    return sb.toString()
}

println(demo {
    append("c")
    append("d")
})

The code above prints cd to the console just as we expected.
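
Two more details may help here, shown as a small sketch. Inside the lambda, the receiver is available as this, and the standard library function buildString is built on the same pattern. The demo function is repeated so that the snippet is self-contained.

```kotlin
// Same demo function as above, with StringBuilder as the receiver.
fun demo(func: StringBuilder.() -> Unit): String {
    val sb = StringBuilder()
    sb.func()
    return sb.toString()
}

// Inside the lambda, `this` is the StringBuilder; writing it out is optional.
val explicit = demo {
    this.append("c")
    append("d")
}

// The standard library's buildString also takes a StringBuilder.() -> Unit lambda.
val fromStdlib = buildString {
    append("c")
    append("d")
}
```

Both explicit and fromStdlib hold the string cd.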

Building our DSL

Create a BoolQuery with the keyword “or”

As a start, we want the syntax or {} to create a BoolQuery that matches documents satisfying any of its child queries. To do that, we need a helper class called OrOperator.

class OrOperator {
    // A list to store all child queries inside a BoolQuery
    private val children = mutableListOf<Query>()

    // Add new query to the children list
    fun add(childQuery: Query) {
        children.add(childQuery)
    }

    // Create a BoolQuery.Builder and add all its child queries with the `should` method.
    // Then build the query
    fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.should(children)
        return builder.build()._toQuery()
    }
}

Then we can write an or function that uses a lambda with receiver to collect queries in an OrOperator and build the BoolQuery from them.

fun or(
    addQuery: OrOperator.() -> Unit // addQuery is a lambda with receiver
): Query {
    val operator = OrOperator()
    operator.addQuery()
    return operator.build()
}

With it, we can write a query to find either a left-winger or a right-winger.

val positionTerm1 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val positionTerm2 = TermQuery.Builder().field("position").value("rw").build()._toQuery()
val query = or {
    add(positionTerm1)
    add(positionTerm2)
}

Honestly, this syntax is not that much more readable than the default one. But it is a good start, and we can still improve.

Add queries to the children list with a unary operator

The add function doesn’t look too cool, so let’s use the + symbol instead. In this case, the plus has only one operand, so it is the unary plus operator, which Kotlin lets us overload with an operator function named unaryPlus. Let’s add that to the OrOperator class.

operator fun Query.unaryPlus() {
    children.add(this)
}

Now we can update the DSL to this version.

val positionTerm1 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val positionTerm2 = TermQuery.Builder().field("position").value("rw").build()._toQuery()
val query = or {
    +positionTerm1
    +positionTerm2
}
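
To see the mechanism in isolation, here is a minimal sketch that runs without the Elasticsearch client. Collector and collect are hypothetical stand-ins for OrOperator and or, and they gather plain strings instead of Query objects.

```kotlin
// Stand-in for OrOperator that collects plain strings instead of Query objects.
class Collector {
    val items = mutableListOf<String>()

    // Member extension: usable only where a Collector is an implicit receiver.
    operator fun String.unaryPlus() {
        items.add(this)
    }
}

// Hypothetical entry point, shaped like the `or` function.
fun collect(block: Collector.() -> Unit): List<String> {
    val collector = Collector()
    collector.block()
    return collector.items
}

val collected = collect {
    +"lw"
    +"rw"
}
// Outside the block, +"lw" does not compile: no unaryPlus is in scope for String.
```

Because unaryPlus is declared inside the class, the + syntax exists only inside the block, which keeps the DSL from leaking into unrelated code.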

Create a TermQuery with an infix function

The users of our DSL shouldn’t need to know about TermQuery or its syntax. Since we only need two values to define a TermQuery, a field name and a value to match, wouldn’t it be cool if they could just write "position" eq "lw"? That is closer to natural language and is much more readable and concise. We can easily achieve this by defining an infix function named eq. And to avoid polluting the global namespace, we will put that function inside the OrOperator class as well.

infix fun String.eq(value: String): Query =
    TermQuery.Builder().field(this).value(value).build()._toQuery()

Our DSL now becomes this.

val query = or {
    // This syntax only works inside an "or {}" block
    +("position" eq "lw")
    +("position" eq "rw")
}

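Unfortunately, we need a pair of parentheses here. A prefix operator such as the unary plus has higher precedence than an infix function call, so without them the compiler would read the line as (+"position") eq "lw". But other than that, our DSL is looking great.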
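
The precedence rule is easier to see with numbers. The sketch below uses a hypothetical infix function pow, which is not part of our DSL, only to make the grouping visible.

```kotlin
// Hypothetical infix function, used only to make the grouping visible.
infix fun Int.pow(exponent: Int): Int {
    var result = 1
    repeat(exponent) { result *= this } // `this` is the Int receiver
    return result
}

// Prefix operators bind tighter than infix calls: this is (-2) pow 2.
val prefixFirst = -2 pow 2       // 4

// Parentheses make the infix call run first: -(2 pow 2).
val infixFirst = -(2 pow 2)      // -4

// Arithmetic operators also bind tighter: this is (1 + 2) pow 2.
val additiveFirst = 1 + 2 pow 2  // 9
```

The same grouping applies to our DSL, which is why the infix expression has to be wrapped in parentheses before the unary plus is applied.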

Handle RangeQuery

RangeQuery is very similar to TermQuery. It does have a lot more operators, but in this article we only care about the lte operator. We can implement it inside the OrOperator class like this. Notice that the value type is now Int instead of String.

infix fun String.lte(value: Int): Query =
    RangeQuery.Builder().field(this).lte(JsonData.of(value)).build()._toQuery()

We can write a new query to find young or cheap players.

val query = or {
    +("age" lte 23)
    +("salary" lte 200)
}

Create a BoolQuery with the keyword “and”

Now we want to combine the two queries created by the new DSL inside a BoolQuery. And we want to find documents that match both conditions. As you may have guessed, we need an AndOperator class first. We can simply copy the OrOperator helper class and call the must method on the builder instead of the should method. But those two classes will be almost identical. Let’s move the common parts to an abstract class.

abstract class BaseOperator {
    protected val children = mutableListOf<Query>()

    operator fun Query.unaryPlus() {
        children.add(this)
    }

    infix fun String.eq(value: String): Query =
        TermQuery.Builder().field(this).value(value).build()._toQuery()

    infix fun String.lte(value: Int): Query =
        RangeQuery.Builder().field(this).lte(JsonData.of(value)).build()._toQuery()

    abstract fun build(): Query
}

Then inside the AndOperator and OrOperator classes, we only need to correctly override the build method.

class AndOperator: BaseOperator() {
    override fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.must(children)
        return builder.build()._toQuery()
    }
}

class OrOperator: BaseOperator() {
    override fun build(): Query {
        val builder = BoolQuery.Builder()
        builder.should(children)
        return builder.build()._toQuery()
    }
}

And we need an and function to use the AndOperator class, similar to how the or function uses OrOperator.

fun and(
    addQuery: AndOperator.() -> Unit
): Query {
    val operator = AndOperator()
    operator.addQuery()
    return operator.build()
}

Use both “and” and “or” queries together

The next step is to write a combined query to search for either a left-winger or a right-winger, who is either young or cheap.

val firstQuery = or {
    +("age" lte 23)
    +("salary" lte 200)
}
val secondQuery = or {
    +("position" eq "rw")
    +("position" eq "lw")
}
val query = and {
    +firstQuery
    +secondQuery
}

Or even better, we don’t need to create temporary variables to hold the two or queries.

val query = and {
    +or {
        +("age" lte 23)
        +("salary" lte 200)
    }
    +or {
        +("position" eq "rw")
        +("position" eq "lw")
    }
}

Add query boosting to our DSL

The last step is to add query-boosting syntax. You might still remember that at the beginning of today’s article, we wrote a boosted query like this: +((SALARY lte 200) boost 2F). The syntax to support boosting is actually very similar to the syntax to support TermQuery or RangeQuery. We define boost as an infix function on the Query type that takes a Float parameter. Since the Query it receives has already been built, the function wraps it in a new BoolQuery with a single must clause and sets the boost on that wrapper. Let’s add it to the BaseOperator class.

infix fun Query.boost(factor: Float): Query = if (factor == 1F) {
    this // short-circuit in case the boost factor is 1
} else BoolQuery.Builder().must(this).boost(factor).build()._toQuery()

Finally, our code now looks very similar to the code in the introduction section.

val query = and {
    +or {
        +("age" lte 23)
        +(("salary" lte 200) boost 2F)
    }
    +or {
        +(("position" eq "rw") boost 2F)
        +("position" eq "lw")
    }
}
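
If you want to experiment with the shape of the DSL without a cluster or the client library, the sketch below may help. It is not the code from the sample repository: Node, Term, Lte, Bool, Boosted and render are hypothetical stand-ins for the Elasticsearch Query types, and Boosted only records the factor instead of wrapping the query in a BoolQuery.

```kotlin
// Stand-ins for the Elasticsearch Query type, so the sketch runs on its own.
sealed interface Node
data class Term(val field: String, val value: String) : Node
data class Lte(val field: String, val value: Int) : Node
data class Bool(val kind: String, val children: List<Node>) : Node
data class Boosted(val inner: Node, val factor: Float) : Node

abstract class BaseOperator {
    protected val children = mutableListOf<Node>()

    operator fun Node.unaryPlus() {
        children.add(this)
    }

    infix fun String.eq(value: String): Node = Term(this, value)
    infix fun String.lte(value: Int): Node = Lte(this, value)
    infix fun Node.boost(factor: Float): Node =
        if (factor == 1F) this else Boosted(this, factor)

    abstract fun build(): Node
}

class AndOperator : BaseOperator() {
    override fun build(): Node = Bool("must", children.toList())
}

class OrOperator : BaseOperator() {
    override fun build(): Node = Bool("should", children.toList())
}

fun and(block: AndOperator.() -> Unit): Node = AndOperator().apply(block).build()
fun or(block: OrOperator.() -> Unit): Node = OrOperator().apply(block).build()

// Hypothetical renderer that prints the tree in a compact form.
fun render(node: Node): String = when (node) {
    is Term -> "${node.field}=${node.value}"
    is Lte -> "${node.field}<=${node.value}"
    is Bool -> node.children.joinToString(", ", "${node.kind}(", ")") { render(it) }
    is Boosted -> "${render(node.inner)}^${node.factor}"
}

val tree = and {
    +or {
        // Inside this block, + adds to the innermost receiver, the OrOperator.
        +("age" lte 23)
        +(("salary" lte 200) boost 2F)
    }
    +or {
        +(("position" eq "rw") boost 2F)
        +("position" eq "lw")
    }
}
val rendered = render(tree)
// must(should(age<=23, salary<=200^2.0), should(position=rw^2.0, position=lw))
```

The rendered string makes the nesting visible: each + inside an or block adds to that block's own children list, and the outer and block receives only the two finished or queries.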

DSL and type safety

What if we write something like or { +("batman" eq "fw") }? Clearly, we don’t have a field called batman in our index, so this query cannot match anything useful. Shouldn’t our DSL be smart enough to prevent this? Fortunately, we don’t have to sacrifice type safety in our DSL. As a start, we can define all the supported fields as constants. Our code will then become this.

val query = and {
    +or {
        +(AGE lte 23)
        +((SALARY lte 200) boost 2F)
    }
    +or {
        +((POSITION eq "rw") boost 2F)
        +(POSITION eq "lw")
    }
}

Of course, this is not perfect, as people can still resort to writing queries without using our constants. But then that has to be a deliberate choice. And we can always modify our DSL to enforce type safety. Below is an idea; I’ll leave the implementation as an exercise for our readers.

  • Define a TERM class that has just a name property of type String.
  • Define a companion object in the TERM class with three properties also of type TERM named AGE/POSITION/SALARY.
  • The properties of type TERM above are initialized with their names set to age/position/salary.
  • Change the infix method eq to work on TERM instead of String.
    infix fun TERM.eq(value: String): Query =
        TermQuery.Builder().field(this.name).value(value).build()._toQuery()
    
  • We can now write our query like this: +(TERM.POSITION eq "lw").
  • If we import the companion properties of the TERM class, we can even write +(POSITION eq "lw").
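
For readers who want a starting point for that exercise, here is one possible sketch. It makes two assumptions that are not in the list above: the constructor of TERM is private, so that only the predefined fields can exist, and eq returns a plain String description as a stand-in for building a real TermQuery.

```kotlin
// One possible shape for the TERM idea. The private constructor is an extra
// assumption: it prevents callers from creating fields we did not define.
class TERM private constructor(val name: String) {
    companion object {
        val AGE = TERM("age")
        val POSITION = TERM("position")
        val SALARY = TERM("salary")
    }
}

// Stand-in for the real eq: describes the query instead of building a TermQuery.
infix fun TERM.eq(value: String): String = "term(${this.name}=$value)"

val described = TERM.POSITION eq "lw"
// "batman" eq "fw" no longer compiles, because eq is not defined on String.
```

In the real DSL, eq would stay inside BaseOperator and return a Query as shown in the list. To drop the TERM. prefix, the companion properties can be imported by name; assuming the class lives in a hypothetical package called dsl, that would be import dsl.TERM.Companion.POSITION.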

Conclusion

A DSL is an excellent tool to bridge the gap between domain experts and developers. Imagine a scouting company with a huge database of footballers all over the world. Using a DSL, any scout or coach can search this database to find their perfect transfer target without any programming knowledge.

On the developer’s side, a DSL helps write clear and concise code that is more readable and understandable. This helps us focus on the high-level logic without getting lost in implementation details.

A software developer from Vietnam who is currently living in Japan.
