Note: the Vietnamese version of this article is available at the link below.

https://duongnt.com/redis-raw-bytes-vie

Send raw bytes to Redis with RedisTemplate

Normally, when we use RedisTemplate to interact with Redis, it converts our data into its string representation and serializes that string as UTF-8. Although this approach is good enough in most cases, we can save some memory by sending the data to Redis as raw bytes. But is that worth doing? In today’s article, we will implement that solution and see what downsides it has.

You can download all sample code from the link below.

https://github.com/duongntbk/redis-raw-bytes-demo

Prerequisites

We need a Redis instance to run the test code in this article. To keep things simple, we will use Docker. Run the following command in your terminal, replacing <yourpassword> with a password of your choice.

docker run -d --name redis -p 6379:6379 redis:latest redis-server --requirepass <yourpassword>

Our problem

Let’s say we have to implement a caching solution using Redis, with the following requirements.

  • The key is a Long object with 10 ~ 11 digits on average.
  • The value is a Double object that has 17 decimal digits (the maximum allowed by Double type).
  • It should support pipelining, so that we can send multiple entries and set their expiration using just one connection.
  • We don’t want to manually serialize/deserialize data every time we interact with Redis.

Use RedisTemplate with built-in serializers

Setting up RedisTemplate bean

Here is how we set up a RedisTemplate<Long, Double>. The interesting part is below, where we use GenericToStringSerializer as our serializers.

redisTemplate.keySerializer = GenericToStringSerializer(Long::class.java)
redisTemplate.valueSerializer = GenericToStringSerializer(Double::class.java)

The class GenericToStringSerializer transforms the key and value into their string representations before performing the serialization.

We can then get an instance of the bean from the app context.

val redisTemplateSerialize = context.getBean(
    "redisTemplateSerialize", RedisTemplate::class.java
) as RedisTemplate<Long, Double>

Size of an entry when using the built-in serializer

Below is the code to send two entries to Redis.

redisTemplate.opsForValue().multiSet(
	mapOf(
		6359284517L to 0.5238106733071787,
		Long.MAX_VALUE to 0.6238106733071787,
	)
)

Here are the entries inside Redis. Both the keys and the values have been converted into Strings.

127.0.0.1:6379> get "6359284517"
"0.5238106733071787"
127.0.0.1:6379> get "9223372036854775807"
"0.6238106733071787"

Now let’s look at their size.

127.0.0.1:6379> memory usage "6359284517"
(integer) 80
127.0.0.1:6379> memory usage "9223372036854775807"
(integer) 88
127.0.0.1:6379>

As we can see, the size of 9223372036854775807 is bigger than the size of 6359284517. This is because the former has more digits, which results in a longer String. And those String objects take up more space than the original Long objects. Similarly, because our values have a lot of digits, their String form also takes up more space than the size of a Double object.
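To make the difference concrete, here is a rough, stdlib-only sketch of the payload a to-string serializer ends up sending. The helpers stringPayload and keyPayloadSizes are hypothetical stand-ins for the "convert to String, then encode as UTF-8" behavior described above; they are not the Spring class itself.

```kotlin
// Hypothetical stand-in for a to-string serializer: toString(), then UTF-8 bytes.
fun stringPayload(value: Any): ByteArray =
    value.toString().toByteArray(Charsets.UTF_8)

// Payload bytes for each example key, next to the fixed size of a raw Long.
fun keyPayloadSizes(): Map<String, Int> = mapOf(
    "6359284517 as String" to stringPayload(6359284517L).size,        // one byte per digit
    "Long.MAX_VALUE as String" to stringPayload(Long.MAX_VALUE).size, // one byte per digit
    "any Long as raw bytes" to Long.SIZE_BYTES,                       // fixed width
)
```

These are payload sizes only. The numbers reported by memory usage are larger because they also include what Redis allocates around each entry.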

Use custom serializers to send raw bytes to Redis

Set up a RedisTemplate with custom serializers

We need to replace the GenericToStringSerializer to stop RedisTemplate from converting our entries into String objects. Unfortunately, Spring Data Redis does not provide Long <-> ByteArray and Double <-> ByteArray serializers out of the box. But we can create them ourselves by implementing the RedisSerializer<T> interface. Such implementations can be found in the serializer folder of the demo repo.
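As a rough idea of what the bodies of such serializers could do, the sketch below converts with java.nio.ByteBuffer. The function names and the little-endian byte order are assumptions made for illustration; the implementations in the demo repo may differ. A RedisSerializer<Long> implementation could delegate its serialize and deserialize methods to the first pair of functions, and a RedisSerializer<Double> to the second.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Assumption: little-endian. Any order works as long as both directions agree.
val RAW_BYTE_ORDER: ByteOrder = ByteOrder.LITTLE_ENDIAN

fun longToBytes(value: Long): ByteArray =
    ByteBuffer.allocate(Long.SIZE_BYTES).order(RAW_BYTE_ORDER).putLong(value).array()

fun bytesToLong(bytes: ByteArray): Long {
    require(bytes.size == Long.SIZE_BYTES) { "expected 8 bytes, got ${bytes.size}" }
    return ByteBuffer.wrap(bytes).order(RAW_BYTE_ORDER).getLong()
}

fun doubleToBytes(value: Double): ByteArray =
    ByteBuffer.allocate(Double.SIZE_BYTES).order(RAW_BYTE_ORDER).putDouble(value).array()

fun bytesToDouble(bytes: ByteArray): Double {
    require(bytes.size == Double.SIZE_BYTES) { "expected 8 bytes, got ${bytes.size}" }
    return ByteBuffer.wrap(bytes).order(RAW_BYTE_ORDER).getDouble()
}
```

Whatever byte order we pick, every reader and writer of the cache has to use the same one, otherwise existing entries become unreadable.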

Then we can simply use those custom serializers to set up a new bean, as can be seen here.

redisTemplate.keySerializer = LongToByteArraySerializer()
redisTemplate.valueSerializer = DoubleToByteArraySerializer()

And we can get an instance of that bean from the app context.

val redisTemplateByteArray = context.getBean(
    "redisTemplateByteArray", RedisTemplate::class.java
) as RedisTemplate<Long, Double>

Size of an entry as raw bytes

The code using redisTemplateByteArray to send data to Redis is identical to the one in the previous section. But let’s look at the entries in Redis.

127.0.0.1:6379> keys *
1) "%\xfb\n{\x01\x00\x00\x00"
2) "\xff\xff\xff\xff\xff\xff\xff\x7f"
127.0.0.1:6379> get "%\xfb\n{\x01\x00\x00\x00"
"?\xe0\xc3\x0e\x99\xe4\xcde"
127.0.0.1:6379> get "\xff\xff\xff\xff\xff\xff\xff\x7f"
"?\xe3\xf6A\xcd\x18\x00\x98"

All the keys and values are now in ByteArray format. Now let’s check their size.

127.0.0.1:6379> memory usage "%\xfb\n{\x01\x00\x00\x00"
(integer) 72
127.0.0.1:6379> memory usage "\xff\xff\xff\xff\xff\xff\xff\x7f"
(integer) 72

As we can see, the size of both entries is now 72 bytes. This is a 10% reduction in our typical case, and an 18% reduction in the extreme case, where the key has the maximum number of digits.
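The percentages follow directly from the memory usage figures in this article (80 and 88 bytes before, 72 bytes after). A small sketch of the arithmetic, with a hypothetical helper name:

```kotlin
// Relative reduction, in percent, when an entry shrinks from `before` to `after` bytes.
fun reductionPercent(before: Int, after: Int): Double =
    (before - after) * 100.0 / before

val typicalCase = reductionPercent(before = 80, after = 72) // 8 bytes saved out of 80
val extremeCase = reductionPercent(before = 88, after = 72) // 16 bytes saved out of 88
```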

Use the connection to send raw bytes to Redis

A RedisTemplate object encapsulates the serialization process and is generally the recommended way to interact with Redis. However, it does not support batching commands before sending. To use batching, we need to use the underlying LettuceConnection.

The steps are below.

  • Create a bean of type RedisConnectionFactory.
  • Retrieve the factory from the app context.
  • Get a connection from the factory, use it to open a pipeline, send the data, then close the pipeline (the code is in the demo repo).

As we can see, we need to take care of the conversion from Long/Double to ByteArray by ourselves. However, the ability to batch multiple requests in one connection can be worth the trouble.
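The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the demo repo: the function sendPipelined, the toRawBytes helpers, the ttlSeconds parameter and the big-endian byte order are all assumptions. It only relies on the generic RedisConnection API of Spring Data Redis (openPipeline, stringCommands().setEx and closePipeline), and it needs Spring Data Redis on the classpath and a reachable Redis instance to run.

```kotlin
import java.nio.ByteBuffer
import org.springframework.data.redis.connection.RedisConnectionFactory

// Hypothetical helpers; big-endian is ByteBuffer's default byte order.
fun Long.toRawBytes(): ByteArray =
    ByteBuffer.allocate(Long.SIZE_BYTES).putLong(this).array()

fun Double.toRawBytes(): ByteArray =
    ByteBuffer.allocate(Double.SIZE_BYTES).putDouble(this).array()

// Sends every entry, together with its expiration, through one pipelined connection.
fun sendPipelined(
    factory: RedisConnectionFactory,
    entries: Map<Long, Double>,
    ttlSeconds: Long,
) {
    val connection = factory.connection
    try {
        connection.openPipeline()
        for ((key, value) in entries) {
            // SETEX stores the value and its time to live in a single command.
            connection.stringCommands().setEx(key.toRawBytes(), ttlSeconds, value.toRawBytes())
        }
        // Closing the pipeline executes the queued commands and collects their replies.
        connection.closePipeline()
    } finally {
        connection.close()
    }
}
```

With the factory retrieved from the app context, a call could look like sendPipelined(factory, mapOf(6359284517L to 0.5238106733071787), 60).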

The downsides

At 10 ~ 18%, the amount of memory we managed to save might seem lower than expected. After all, the string “6359284517” is 10 bytes, and the string “0.5238106733071787” is 18 bytes. A naive calculation would put the payload of a String-converted entry at 28 bytes, compared to 16 bytes for a Long and Double combination. However, Redis also needs to allocate memory for the data structure of the entry, as well as to store its expiration time (if any).

Another downside is that sending raw bytes to Redis actually costs more memory if the numbers are short (not many digits). This is because a UTF-8 String can be as small as 1 byte, while a Long or Double object always occupies 8 bytes, no matter how many digits it has.

Below is the size of the entry { 1: 1.0 }.

127.0.0.1:6379> memory usage "1"
(integer) 56
127.0.0.1:6379> memory usage "\x01\x00\x00\x00\x00\x00\x00\x00"
(integer) 72

Not only did we not save any memory, we even increased memory usage by almost 30% per entry. Unless we are sure that our keys and values always have many digits, perhaps we should stick to the default serializer.
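A quick way to sanity-check our own data before switching is to compare the payload sizes per entry. The sketch below is stdlib-only and uses hypothetical helper names. It counts payload bytes only and ignores whatever Redis allocates around an entry, so it is a first filter rather than a replacement for measuring with memory usage.

```kotlin
// Payload bytes when key and value are sent as UTF-8 strings.
fun stringPayloadBytes(key: Long, value: Double): Int =
    key.toString().toByteArray(Charsets.UTF_8).size +
        value.toString().toByteArray(Charsets.UTF_8).size

// Payload bytes when key and value are sent as raw bytes: fixed, regardless of digits.
val RAW_PAYLOAD_BYTES: Int = Long.SIZE_BYTES + Double.SIZE_BYTES

// True only when the raw form is strictly smaller than the string form.
fun rawBytesAreSmaller(key: Long, value: Double): Boolean =
    RAW_PAYLOAD_BYTES < stringPayloadBytes(key, value)
```

For the entry { 1: 1.0 } the string payload is far below the fixed 16 bytes of the raw form, which matches the direction of the measurement above.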

Can we use a pass-through serializer?

What if we use a serializer that does nothing and perform the conversion to/from ByteArray ourselves? Here is how we set it up. Note that the type of the template is now RedisTemplate<ByteArray, ByteArray>.

fun redisTemplate(): RedisTemplate<ByteArray, ByteArray> {
    //...

    redisTemplate.keySerializer = RedisSerializer.byteArray()
    redisTemplate.valueSerializer = RedisSerializer.byteArray()
    //...
}

And here is how we use it.

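import java.nio.ByteBuffer

// One possible conversion (an assumption: big-endian, ByteBuffer's default order).
val key = ByteBuffer.allocate(Long.SIZE_BYTES).putLong(6359284517L).array()
val value = ByteBuffer.allocate(Long.SIZE_BYTES).putLong(Long.MAX_VALUE).array()
redisTemplate.opsForValue().set(key, value)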

Functionally, this approach works just as well as the previous one using custom serializers. But I don’t think there is any reason to do this. After all, why litter our code base with byte conversion logic when we can encapsulate it inside a serializer class, especially since this approach doesn’t even let us batch commands with a pipeline?

Conclusion

Sending data as raw bytes to Redis can save us some memory. But we should perform detailed benchmarks before making this change, as it can backfire if our data doesn’t follow a very specific format.

A software developer from Vietnam who is currently living in Japan.
