Leapcell
10 Hidden Pitfalls of Using Redis Distributed Locks

In daily development, Redis distributed locks are often used to solve data read/write issues in concurrent requests. However, using Redis distributed locks comes with many pitfalls. This article will analyze and explain 10 pitfalls of Redis distributed locks.

1. Non-atomic operations (setnx + expire)

When it comes to implementing a Redis distributed lock, many developers immediately think of using the setnx + expire commands. That is, use setnx to acquire the lock, and if successful, then use expire to set an expiration time on the lock.

Pseudo-code:

if (jedis.setnx(lock_key, lock_value) == 1) { // Acquire lock
    jedis.expire(lock_key, timeout); // Set expiration time
    doBusiness(); // Business logic
}

This code has a major pitfall: setnx and expire are executed separately and are not atomic! If the process crashes or is restarted right after executing setnx but before expire, the lock will never expire. As a result, other threads will never be able to acquire the lock.
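The fix is to make acquisition and expiration a single atomic step, for example with Redis's `SET key value NX EX` (covered below) or a Lua script. To make the contract concrete, here is a minimal in-memory sketch (not real Redis, just an illustration): because the check, the set, and the deadline are written under one monitor, no client can ever observe the key set without an expiration.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal in-memory sketch (NOT real Redis) of an atomic "setnx + expire".
public class AtomicLockStore {
    private final Map<String, Long> deadlines = new HashMap<>(); // key -> expiry millis

    // Returns true iff the lock was acquired; value and TTL are written together,
    // so a crash can never leave a lock without an expiration.
    public synchronized boolean tryAcquire(String key, long ttlMillis, long nowMillis) {
        Long deadline = deadlines.get(key);
        if (deadline != null && deadline > nowMillis) {
            return false; // lock held and not yet expired
        }
        deadlines.put(key, nowMillis + ttlMillis); // set + expire as one step
        return true;
    }
}
```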

2. Overwritten by another client's request (setnx + value as expiration time)

To solve the issue of locks not being released due to exceptions, some suggest putting the expiration timestamp in the value of setnx. If lock acquisition fails, you can then check the stored value against the current system time to determine if the lock has expired. Pseudo-code implementation:

long expireTime = System.currentTimeMillis() + timeout; // Current time + timeout
String expireTimeStr = String.valueOf(expireTime); // Convert to string

// If the lock does not exist, return true
if (jedis.setnx(lock_key, expireTimeStr) == 1) {
    return true;
}

// If the lock exists, retrieve its expiration time
String oldExpireTimeStr = jedis.get(lock_key);

// If the stored expiration time is less than current time, it's expired
if (oldExpireTimeStr != null && Long.parseLong(oldExpireTimeStr) < System.currentTimeMillis()) {

    // Lock is expired; attempt to overwrite it with new expiration time
    String oldValueStr = jedis.getSet(lock_key, expireTimeStr);

    if (oldValueStr != null && oldValueStr.equals(oldExpireTimeStr)) {
        // In concurrent scenarios, only the thread whose set value matches the old value gets the lock
        return true;
    }
}

// Lock acquisition failed in all other cases
return false;

This approach also has a pitfall: when the lock expires and multiple clients call jedis.getSet() concurrently, only one of them sees the old value and "wins" the lock, but every getSet call overwrites the stored value. The losers therefore clobber the winner's expiration timestamp. The scheme also relies on all clients having synchronized clocks, which is a fragile assumption.
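A tiny in-memory stand-in for GETSET (again, not real Redis) makes the overwrite visible: client A wins because the old value it reads back matches the expired timestamp, but client B's losing call has already replaced A's expiration time.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for Redis GETSET, to show the overwrite problem.
public class GetSetDemo {
    private final Map<String, String> store = new HashMap<>();

    // Like GETSET: returns the old value, but ALWAYS writes the new one,
    // even for clients that go on to lose the race.
    public String getSet(String key, String value) {
        String old = store.get(key);
        store.put(key, value);
        return old;
    }

    public String get(String key) { return store.get(key); }
}
```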

3. Forgetting to set an expiration time

While reviewing code, I once saw a distributed lock implementation like this:

try {
  if (jedis.setnx(lockKey, lockValue) == 1) { // Acquire lock
     doBusiness(); // Business logic
     return true; // Lock acquired and business logic processed
  }
  return false; // Lock acquisition failed
} finally {
    unlock(lockKey); // Release lock
}

What’s wrong here? That’s right — the expiration time is missing. If the program crashes during execution and doesn’t reach the finally block, the lock won’t be deleted. This makes unlocking unreliable. Therefore, when using distributed locks, always set an expiration time.

4. Forgetting to release the lock after business processing

Many developers use Redis's set command with extended parameters to implement distributed locks.

Extended parameters of SET key value:

  • NX: Set only if the key does not exist, ensuring only the first client gets the lock.
  • EX seconds: Set expiration in seconds.
  • PX milliseconds: Set expiration in milliseconds.
  • XX: Set only if the key exists.

Some might write pseudo-code like this:

if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // Acquire lock
   doBusiness(); // Business logic
   return true; // Lock acquired and business logic processed
}
return false; // Lock acquisition failed

At first glance, this looks fine, but there's a problem — it forgets to release the lock! If you always wait for the expiration to release the lock, efficiency suffers. You should release the lock after business logic completes.

Correct usage:

try {
  if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // Acquire lock
     doBusiness(); // Business logic
     return true; // Lock acquired and business logic processed
  }
  return false; // Lock acquisition failed
} finally {
    unlock(lockKey); // Release lock
}

5. Thread B's lock gets released by Thread A

Consider the following pseudo-code:

try {
  if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // Acquire lock
     doBusiness(); // Business logic
     return true; // Lock acquired and business logic processed
  }
  return false; // Lock acquisition failed
} finally {
    unlock(lockKey); // Release lock
}

What’s the issue here?

In a concurrent scenario where threads A and B both attempt to acquire the lock, suppose Thread A gets the lock first (set to expire in 3 seconds). If its business logic is slow and takes more than 3 seconds, Redis will auto-expire the lock. Then Thread B acquires the lock and starts executing. If Thread A finishes its task and releases the lock afterward, it inadvertently releases Thread B's lock.

The correct approach is to add a unique request identifier (e.g., requestId) when acquiring the lock, and only release the lock if the identifier matches:

try {
  if ("OK".equals(jedis.set(lockKey, requestId, "NX", "PX", expireTime))) { // Acquire lock
     doBusiness(); // Business logic
     return true; // Lock acquired and business logic processed
  }
  return false; // Lock acquisition failed
} finally {
    if (requestId.equals(jedis.get(lockKey))) { // Check if it's the same requestId
      unlock(lockKey); // Release lock
    }
}

6. Releasing the lock is not atomic

Even the previous code has a flaw:

if (requestId.equals(jedis.get(lockKey))) { // Check if it's the same requestId
    unlock(lockKey); // Release lock
}

Because the check (get) and the release (del) are two separate operations, they are not atomic. If the lock has already expired by the time unlock(lockKey) is called, then the lock might have been acquired by another client. Releasing it now would remove someone else’s lock, which is dangerous.

This introduces a consistency issue — the check and deletion must be atomic. To ensure atomicity when releasing the lock, you can use Redis + Lua script, like this:

if redis.call('get', KEYS[1]) == ARGV[1] then
   return redis.call('del', KEYS[1])
else
   return 0
end
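Server-side, the whole script executes as one indivisible step. The in-memory sketch below (an analogue, not Redis itself) shows the contract the script provides: compare and delete happen atomically, with no window in between where another client's lock could be deleted. Here `ConcurrentHashMap.remove(key, value)` plays the role of the script.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogue of the compare-and-delete Lua script.
public class SafeUnlockStore {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    public void lock(String key, String requestId) { store.put(key, requestId); }

    // Deletes the key only if it still holds requestId; the check and the
    // delete are a single atomic operation, just like the script server-side.
    public boolean unlock(String key, String requestId) {
        return store.remove(key, requestId); // atomic remove(key, expectedValue)
    }
}
```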

7. Lock expires but the business logic hasn't finished

After acquiring the lock, if the lock expires due to timeout, Redis will automatically delete it. However, the business logic might not be finished yet, leading to premature release of the lock.

Some developers' first reaction is to set a longer expiration time, but that only shifts the problem. A better approach is to start a watchdog thread for the thread that has acquired the lock: it periodically checks whether the lock still exists and, if so, extends its expiration to prevent early release.

This issue has been addressed by the open-source framework Redisson.

Once a thread acquires the lock, Redisson starts a watchdog: a background task that renews the lock every 10 seconds by default (one third of its 30-second watchdog timeout). As long as the thread still holds the lock, the watchdog keeps extending its TTL. This is how Redisson solves the problem of premature lock expiration when business logic hasn't finished.
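The renewal idea can be sketched like this (all names here are made up for illustration; Redisson's real implementation differs). In a real setup the `renew` step would run on a `ScheduledExecutorService` at roughly TTL/3 intervals; the key point is that it extends the deadline only while the original holder still owns the lock.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of watchdog-style renewal (hypothetical names; NOT Redisson's API).
public class WatchdogSketch {
    static final ConcurrentHashMap<String, String> owners = new ConcurrentHashMap<>();
    static final ConcurrentHashMap<String, Long> deadlines = new ConcurrentHashMap<>();

    static void lock(String key, String owner, long nowMillis, long ttlMillis) {
        owners.put(key, owner);
        deadlines.put(key, nowMillis + ttlMillis);
    }

    // What the watchdog does on each tick: extend the TTL, but only if
    // this owner still holds the lock; otherwise stop renewing.
    static boolean renew(String key, String owner, long nowMillis, long ttlMillis) {
        if (!owner.equals(owners.get(key))) return false; // lock lost or taken over
        deadlines.put(key, nowMillis + ttlMillis);
        return true;
    }
}
```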

8. Redis distributed lock becomes ineffective when used with @Transactional

Take a look at this pseudo-code:

@Transactional
public void updateDB(int lockKey) {
  boolean lockFlag = redisLock.lock(lockKey);
  if (!lockFlag) {
    throw new RuntimeException("Please try again later");
  }
  doBusiness(); // Business logic
  redisLock.unlock(lockKey);
}

In this case, a Redis distributed lock is used within a transactional method. Once this method is executed:

  1. The transaction begins due to Spring’s AOP.
  2. The Redis lock is acquired.
  3. After business logic executes, the Redis lock is released.
  4. Only then is the transaction committed.

This causes a problem: the lock is released before the transaction is committed. Another thread may acquire the lock and execute its logic, reading stale data that hasn’t been committed yet by the first transaction.

Why does this happen?

Spring AOP starts the transaction before updateDB() runs. The Redis lock is then acquired inside the method. Once the method completes, the lock is released, but the transaction is still not committed.

Correct approach: acquire the lock before entering the transactional method, so that the transaction both begins and commits while the lock is still held. Other threads then cannot acquire the lock until the data has been committed.
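The correct ordering can be sketched with a plain wrapper (Spring specifics elided; the transaction is simulated by recording events) to show what we want: lock, begin, commit, and only then unlock.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "lock around the transaction": the lock is taken before the
// transaction begins and released only after it commits.
public class LockAroundTx {
    static List<String> events = new ArrayList<>();

    static void updateWithLock() {
        events.add("lock");          // acquire the distributed lock first
        try {
            events.add("begin tx");  // then start the transaction
            events.add("business");  // run the transactional business logic
            events.add("commit");    // commit while the lock is still held
        } finally {
            events.add("unlock");    // release only after the commit
        }
    }
}
```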

9. Reentrant locks

The Redis distributed locks we’ve discussed so far are non-reentrant.

Non-reentrancy means that if a thread already holds a lock and tries to acquire it again (within the same thread), it will block or fail. In other words, a thread can only acquire the same lock once.

This type of lock works for most business cases, but some scenarios do require reentrancy. When designing your distributed lock, consider whether your application requires a reentrant distributed lock.

To implement reentrant behavior in Redis, two problems need to be solved:

  • How to track which thread currently holds the lock.
  • How to maintain the count of how many times the lock has been acquired (reentrancy count).

To build a reentrant distributed lock, you can refer to the design of Java’s ReentrantLock. Alternatively, you can use Redisson, which natively supports reentrant locks.
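Those two problems can be sketched in-memory as follows (illustration only; a real Redis version would keep the owner and count in a hash and manipulate them from a Lua script, which is roughly what Redisson does): track who holds the lock plus how many times they have acquired it, and fully release only when the count returns to zero.

```java
// In-memory sketch of a reentrant lock: a holder plus an acquisition count.
public class ReentrantSketch {
    private String holder;
    private int count;

    public synchronized boolean lock(String owner) {
        if (holder == null) { holder = owner; count = 1; return true; }
        if (holder.equals(owner)) { count++; return true; } // reentrant acquire
        return false; // held by someone else
    }

    public synchronized boolean unlock(String owner) {
        if (!owner.equals(holder)) return false; // not the holder
        if (--count == 0) holder = null; // fully released only at count zero
        return true;
    }

    public synchronized boolean isLocked() { return holder != null; }
}
```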

10. Issues caused by Redis master-slave replication

When implementing a Redis distributed lock, beware of problems caused by Redis’s master-slave replication setup. Redis is often deployed as a cluster:

Imagine Thread A acquires a lock on the master node, but the lock key hasn’t yet been replicated to the slave nodes. If the master node goes down, one of the slaves may be promoted to master. Now, Thread B can acquire the same lock key, because the key doesn’t exist in the new master. But Thread A still believes it holds the lock. Now both threads think they have the lock — this breaks lock safety.

To solve this, Redis author antirez proposed a more advanced distributed locking algorithm called Redlock.

The core idea of Redlock:

Use multiple Redis master nodes to ensure high availability. These nodes are completely independent — no replication between them. The same locking logic (acquire/release) is applied on each master.

Suppose we have 5 Redis master nodes on separate servers. Redlock’s steps:

  1. Sequentially try to acquire the lock on all 5 master nodes.
  2. If any node is unreachable (e.g., network latency), skip it after a timeout.
  3. If lock acquisition succeeds on at least 3 out of 5 nodes, and the total time used is less than the lock’s TTL, the lock is considered successful.
  4. If acquisition fails, release all previously acquired locks.
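The acceptance rule in step 3 can be written as a pure function: count the nodes where acquisition succeeded, require a majority (N/2 + 1, i.e. 3 of 5), and require the total elapsed time to be under the lock's TTL. A minimal sketch of just that decision (node I/O omitted):

```java
// Sketch of Redlock's acceptance check (the decision only, not the node calls).
public class RedlockCheck {
    // acquired[i] is whether the lock call on master i succeeded.
    static boolean isAcquired(boolean[] acquired, long elapsedMillis, long ttlMillis) {
        int ok = 0;
        for (boolean b : acquired) if (b) ok++;
        int quorum = acquired.length / 2 + 1;             // majority, e.g. 3 of 5
        return ok >= quorum && elapsedMillis < ttlMillis; // must finish within the TTL
    }
}
```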
