Note: the Vietnamese version of this article is available at the link below.
https://duongnt.com/elasticsearch-api-client-kotlin-vie
The Elasticsearch Java API Client is now the recommended library to interact with Elasticsearch. It provides a fluent way for building requests and parsing responses, with strong typing and good integration with JSON object mappers. And thanks to the interoperability between Java and Kotlin, we can use it almost as is in Kotlin projects.
Today, we will use the Elasticsearch API Client and Kotlin to query data from the cluster we set up in the previous article. We will also explore the two main ways to create a request with this new API.
You can download the sample code in this article from the link below.
https://github.com/duongntbk/elasticsearchclient-demo
Prerequisites
Set up and populate an Elasticsearch cluster locally
Please follow this guide to set up an Elasticsearch cluster on your local machine. Also, we will use the exact same test data as in our previous article.
PUT footballer/_bulk
{ "create": { } }
{ "name": "Ronaldo","position":"fw", "age": 38, "salary": 4430}
{ "create": { } }
{ "name": "Messi","position":"fw", "age": 36, "salary": 1440}
{ "create": { } }
{ "name": "Sancho","position":"lw", "age": 23, "salary": 373}
{ "create": { } }
{ "name": "Antony","position":"lw", "age": 23, "salary": 200}
{ "create": { } }
{ "name": "Salah","position":"rw", "age": 30, "salary": 350}
{ "create": { } }
{ "name": "Vinicius Junior","position":"lw", "age": 22, "salary": 354}
{ "create": { } }
{ "name": "Mahrez","position":"rw", "age": 32, "salary": 160}
{ "create": { } }
{ "name": "Rashford","position":"fw", "age": 25, "salary": 247}
{ "create": { } }
{ "name": "Bukayo Saka","position":"rw", "age": 21, "salary": 70}
{ "create": { } }
{ "name": "Gnabry","position":"rw", "age": 27, "salary": 365}
Install the necessary dependencies
As can be seen in the build.gradle.kts file, we only need to add three libraries.
implementation("org.elasticsearch.client:elasticsearch-rest-client:8.8.1")
implementation("co.elastic.clients:elasticsearch-java:8.8.1")
implementation("com.fasterxml.jackson.core:jackson-databind:2.12.3")
Connect to the cluster
The code to create a connection to our cluster is in the ElasticsearchClientWrapper class. Below are some of the important parts.
First, we provide the CA fingerprint, which is used to verify the identity of the Elasticsearch cluster. We also need to specify a username and password here.
val sslContext = TransportUtils.sslContextFromCaFingerprint(fingerprint)
val credsProv = BasicCredentialsProvider()
credsProv.setCredentials(AuthScope.ANY, UsernamePasswordCredentials(login, password))
Our local cluster only accepts HTTPS connections, which is why we need to specify the https protocol here.
.builder(HttpHost("localhost", 9200, "https"))
As our certificate is self-signed, we need to bypass the hostname verification process. Needless to say, we should never do this in production.
.setSSLHostnameVerifier { _, _ -> true } // DANGER!!
Our wrapper class implements the Closeable interface. An interesting detail is that instead of closing the ElasticsearchClient object, we need to close the RestClientTransport object.
transport.close()
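To show how these fragments fit together, here is a minimal sketch of what such a wrapper could look like. It is not copied from the sample repository: the constructor parameters and the generic search method are assumptions made for illustration, and the hostname verifier bypass is only acceptable for a local test cluster.

```kotlin
import co.elastic.clients.elasticsearch.ElasticsearchClient
import co.elastic.clients.elasticsearch.core.SearchRequest
import co.elastic.clients.elasticsearch.core.SearchResponse
import co.elastic.clients.json.jackson.JacksonJsonpMapper
import co.elastic.clients.transport.TransportUtils
import co.elastic.clients.transport.rest_client.RestClientTransport
import org.apache.http.HttpHost
import org.apache.http.auth.AuthScope
import org.apache.http.auth.UsernamePasswordCredentials
import org.apache.http.impl.client.BasicCredentialsProvider
import org.elasticsearch.client.RestClient
import java.io.Closeable

// Sketch only: the constructor signature is an assumption, not the one from the sample code.
class ElasticsearchClientWrapper(
    fingerprint: String,
    login: String,
    password: String
) : Closeable {
    private val transport: RestClientTransport
    private val client: ElasticsearchClient

    init {
        // Trust the cluster's certificate based on its CA fingerprint.
        val sslContext = TransportUtils.sslContextFromCaFingerprint(fingerprint)
        val credsProv = BasicCredentialsProvider()
        credsProv.setCredentials(AuthScope.ANY, UsernamePasswordCredentials(login, password))

        val restClient = RestClient
            .builder(HttpHost("localhost", 9200, "https"))
            .setHttpClientConfigCallback { hc -> hc
                .setSSLContext(sslContext)
                .setDefaultCredentialsProvider(credsProv)
                .setSSLHostnameVerifier { _, _ -> true } // local testing only, never in production
            }
            .build()

        // The transport owns the low-level REST client and the JSON mapper.
        transport = RestClientTransport(restClient, JacksonJsonpMapper())
        client = ElasticsearchClient(transport)
    }

    // Forward the request to the underlying client.
    fun <T> search(request: SearchRequest, clazz: Class<T>): SearchResponse<T> =
        client.search(request, clazz)

    // Closing the transport also releases the underlying REST client.
    override fun close() = transport.close()
}
```

Because the class implements Closeable, it can be used with Kotlin's use function, for example ElasticsearchClientWrapper(fingerprint, login, password).use { wrapper -> ... }, so that the transport is closed even if a request throws.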
Use the wrapper to send a request and read the response
Send a request
As can be seen here, our wrapper simply forwards the request to the ElasticsearchClient object. The response will be in JSON format, so we also need to provide a POJO class to parse each document in the response. We will use the data class Footballer in our example.
val response = client.search(request, Footballer::class.java)
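The Footballer class itself is not shown in this article, so the following is only a guess at its shape, based on the fields in our test data. The default values are an assumption on my part: when every constructor parameter has a default, Kotlin also generates a parameterless constructor, which lets plain jackson-databind instantiate the class and fill it in through the setters.

```kotlin
// Hypothetical shape of the document class; the real one in the sample code may differ.
data class Footballer(
    var name: String = "",
    var position: String = "",
    var age: Int = 0,
    var salary: Int = 0
)

fun main() {
    // The no-argument form is what a JSON mapper would call before setting each field.
    val empty = Footballer()
    val rashford = Footballer(name = "Rashford", position = "fw", age = 25, salary = 247)
    println(empty)
    println(rashford)
}
```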
Note: the wrapper only accepts requests as a SearchRequest object. The ElasticsearchClient class also accepts a lambda to create the request, but that is out of scope for today’s article.
Read the response
The response is of type SearchResponse<TDocument>, or SearchResponse<Footballer> in our case. I wrote this simple function to print information retrieved from the response.
Print the total number of returned documents.
println("Hits: ${response.hits().total()?.value()}")
For each document, we print its name and score.
for (hit in response.hits().hits()) {
    println("Name: ${hit.source()?.name}, Score: ${hit.score()}")
}
Write a few simple queries
Using the provided DSL
The default method to create a SearchRequest is via a Domain Specific Language (DSL). Its syntax is very similar to the HTTP requests we sent via Kibana’s developer console in the last article.
For example, here is an HTTP request to find documents with name == Rashford.
GET /footballer/_search
{
  "query" : {
    "match" : { "name": "Rashford" }
  }
}
And here is how we write the same request for the Elasticsearch API client.
val request = SearchRequest.of { s -> s
    .index("footballer")
    .query { q -> q
        .match { t -> t
            .field("name")
            .query("Rashford")
        }
    }
}
This is the response from the Elasticsearch cluster, printed with the function printResults. It matches our expectations.
Hits: 1
Name: Rashford, Score: 2.1382177
Using QueryBuilder classes
The DSL is nice and all, but sometimes using the QueryBuilder classes is more straightforward, especially for people who are used to the old RestHighLevelClient. Fortunately, the new client also supports building queries with QueryBuilder classes.
We will build a SearchRequest object equivalent to the following HTTP request.
GET /footballer/_search
{
  "query" : {
    "bool" : {
      "should": [
        { "term": { "position": "lw" }},
        { "term": { "position": { "value": "rw", "boost": 2 }}}
      ]
    }
  }
}
As we can see, it is composed of two TermQuery objects grouped by a should operator under a BoolQuery, and the second TermQuery has a boost factor of 2. Let’s recreate it with QueryBuilder classes.
val term1 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val term2 = TermQuery.Builder().field("position").value("rw").boost(2F).build()._toQuery()
val boolQuery = BoolQuery.Builder()
    .should(term1, term2)
    .build()
    ._toQuery()
val request = SearchRequest.Builder()
    .index("footballer")
    .query(boolQuery)
    .build()
The request above should print the following result to the console. As we can see, the right wingers received a boost and are ranked above the left wingers.
Hits: 7
Name: Salah, Score: 1.7876358
Name: Mahrez, Score: 1.7876358
Name: Bukayo Saka, Score: 1.7876358
Name: Gnabry, Score: 1.7876358
Name: Sancho, Score: 1.1451323
Name: Antony, Score: 1.1451323
Name: Vinicius Junior, Score: 1.1451323
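These scores can be reproduced by hand, which makes the effect of the boost easier to see. The sketch below assumes the index uses the default BM25 similarity with k1 = 1.2 and b = 0.75, and that every position value is a single token. The function bm25 and its parameter names are made up for this illustration; they are not part of the Elasticsearch client.

```kotlin
import kotlin.math.ln

// Classic BM25 score of one term in one field, multiplied by the query boost.
// totalDocs: documents in the index; docsWithTerm: documents containing the term.
fun bm25(
    totalDocs: Int,
    docsWithTerm: Int,
    termFreq: Double = 1.0,
    fieldLen: Double = 1.0,
    avgFieldLen: Double = 1.0,
    boost: Double = 1.0,
    k1: Double = 1.2,
    b: Double = 0.75
): Double {
    // Rare terms get a higher inverse document frequency.
    val idf = ln(1.0 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5))
    // Term frequency, normalized by how long the field is compared to the average.
    val tf = termFreq * (k1 + 1) / (termFreq + k1 * (1 - b + b * fieldLen / avgFieldLen))
    return boost * idf * tf
}

fun main() {
    // 4 of our 10 players are right wingers, and that clause has a boost of 2.
    println("rw: ${bm25(totalDocs = 10, docsWithTerm = 4, boost = 2.0)}")
    // 3 of our 10 players are left wingers, with no boost.
    println("lw: ${bm25(totalDocs = 10, docsWithTerm = 3)}")
    // "Rashford" appears in 1 name; the 10 names contain 12 words, so the average length is 1.2.
    println("Rashford: ${bm25(totalDocs = 10, docsWithTerm = 1, avgFieldLen = 1.2)}")
}
```

Under these assumptions, the three values agree with the scores printed in this article to about six decimal places. Note that without the boost, the rw clause would score roughly 0.89, which is lower than the lw clause, because rw is the more common value in our data.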
Mixing both methods together
After reading all this, we might ask which approach is better. Should we use the DSL or the QueryBuilder classes? Well, only a Sith deals in absolutes. It turns out that mixing both methods together helps our code become less verbose and more readable.
Below is the final HTTP request we used in the previous article.
GET /footballer/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              { "term": { "position": { "value": "rw", "boost": 2 }}},
              { "term": { "position": "lw" }}
            ]
          }
        },
        {
          "bool": {
            "should": [
              { "range": { "age": { "lte": 23 }}},
              { "range": { "salary": {"lte": 200, "boost": 2 }}}
            ]
          }
        }
      ]
    }
  }
}
It is possible to create an equivalent SearchRequest object using only the DSL or only the QueryBuilder classes. But as we can see in those links, the DSL-only version is hard to read, and we cannot see the overall structure of the request in the QueryBuilder-only version.
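To give a feel for the problem, here is a rough sketch of what a DSL-only version might look like. It is reconstructed from the HTTP request above rather than copied from the sample code, and the function name buildDslOnlyRequest is made up for this example. The calls match version 8.8.1 of the client used in this article.

```kotlin
import co.elastic.clients.elasticsearch.core.SearchRequest
import co.elastic.clients.json.JsonData

// Sketch: the whole request, leaf queries included, written as nested lambdas.
fun buildDslOnlyRequest(): SearchRequest =
    SearchRequest.of { s -> s
        .index("footballer")
        .query { q -> q
            .bool { outer -> outer
                // First group: position is rw (boosted) or lw.
                .must { m -> m
                    .bool { pos -> pos
                        .should { sh -> sh.term { t -> t.field("position").value("rw").boost(2F) } }
                        .should { sh -> sh.term { t -> t.field("position").value("lw") } }
                    }
                }
                // Second group: age <= 23 or salary <= 200 (boosted).
                .must { m -> m
                    .bool { limits -> limits
                        .should { sh -> sh.range { r -> r.field("age").lte(JsonData.of(23)) } }
                        .should { sh -> sh.range { r -> r.field("salary").lte(JsonData.of(200)).boost(2F) } }
                    }
                }
            }
        }
    }
```

Even for a query this small, the leaf queries end up five or six lambdas deep, and the details of each term and range clause are mixed in with the structure of the request.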
Instead, we can have the best of both worlds by combining those methods together. First, we create the leaf nodes using QueryBuilder classes.
val positionTerm1 = TermQuery.Builder().field("position").value("rw").boost(2F).build()._toQuery()
val positionTerm2 = TermQuery.Builder().field("position").value("lw").build()._toQuery()
val ageRange = RangeQuery.Builder().field("age").lte(JsonData.of(23)).build()._toQuery()
val salaryRange = RangeQuery.Builder().field("salary").lte(JsonData.of(200)).boost(2F).build()._toQuery()
Then we use the DSL to create the overall structure of our request, substituting in the leaf nodes created above.
val request = SearchRequest.of { s -> s
    .index("footballer")
    .query { q -> q
        .bool { b -> b
            .must { m -> m
                .bool { b -> b
                    .should(positionTerm1)
                    .should(positionTerm2)
                }
            }
            .must { m -> m
                .bool { b -> b
                    .should(ageRange)
                    .should(salaryRange)
                }
            }
        }
    }
}
And the result is similar to what we got last time.
Hits: 5
Name: Bukayo Saka, Score: 4.787636
Name: Antony, Score: 4.145132
Name: Mahrez, Score: 3.7876358
Name: Sancho, Score: 2.1451323
Name: Vinicius Junior, Score: 2.1451323
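We can sanity-check these numbers with a bit of arithmetic. The sketch below assumes that a bool query adds up the scores of its matching clauses, and that each range clause contributes a constant score of 1.0 multiplied by its boost. The two term scores are copied from the result of the previous query. The Player class and the function names are hypothetical and exist only for this illustration.

```kotlin
// Hypothetical stand-in for a document; not the Footballer class from the sample code.
data class Player(val name: String, val position: String, val age: Int, val salary: Int)

val players = listOf(
    Player("Ronaldo", "fw", 38, 4430), Player("Messi", "fw", 36, 1440),
    Player("Sancho", "lw", 23, 373), Player("Antony", "lw", 23, 200),
    Player("Salah", "rw", 30, 350), Player("Vinicius Junior", "lw", 22, 354),
    Player("Mahrez", "rw", 32, 160), Player("Rashford", "fw", 25, 247),
    Player("Bukayo Saka", "rw", 21, 70), Player("Gnabry", "rw", 27, 365)
)

// Returns the expected score, or null when the player does not match the query.
fun expectedScore(p: Player): Double? {
    // First "must" group: term scores taken from the previous result.
    val positionScore = when (p.position) {
        "rw" -> 1.7876358 // already includes the boost of 2
        "lw" -> 1.1451323
        else -> 0.0
    }
    // Second "must" group: assumed constant scores, 1.0 times the clause boost.
    val rangeScore = (if (p.age <= 23) 1.0 else 0.0) + (if (p.salary <= 200) 2.0 else 0.0)
    // Each group has to match at least one of its "should" clauses.
    if (positionScore == 0.0 || rangeScore == 0.0) return null
    return positionScore + rangeScore
}

// Matching players, highest score first (ties keep their original order).
fun rankedHits(): List<Pair<String, Double>> =
    players.mapNotNull { p -> expectedScore(p)?.let { p.name to it } }
        .sortedByDescending { it.second }

fun main() {
    rankedHits().forEach { (name, score) -> println("Name: $name, Score: $score") }
}
```

Under these assumptions, the same five players come out in the same order. For example, Bukayo Saka matches the boosted rw term and both range clauses, which gives 1.7876358 + 1 + 2, in line with the 4.787636 reported by the cluster.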
Conclusion
The new API Client makes querying data from an Elasticsearch cluster a breeze. Although it might take some time to get used to, the DSL syntax can help us preserve the overall structure of a request. And the good old QueryBuilder classes help us keep our code readable as our queries grow in complexity.