What is a distributed lock? Three ways to implement distributed locks

In many scenarios, ensuring the eventual consistency of data requires technical solutions such as distributed transactions and distributed locks. So what exactly is a distributed lock, in which business scenarios is it used, and how can it be implemented?

1. Why use distributed locks

When developing an application, if multiple threads need synchronized access to a shared variable, we can protect it with an ordinary in-process lock, and it works without bugs.
Note that this is a stand-alone application. Later, as the business grows, it needs to be clustered: the application is deployed on several machines behind a load balancer, as shown in the following figure:

As can be seen from the above figure, variable A exists in the memory of three servers (variable A is typically a member variable of a class, that is, a stateful object). Without any control, each server allocates its own piece of memory for variable A. If three requests operate on this variable at the same time, the result is obviously wrong. Even if they are not sent at the same time, the three requests operate on data in three different memory areas; the copies of variable A are neither shared nor visible to each other, so the result is still wrong.
If this scenario does exist in our business, we need a way to solve this problem.
Under high concurrency, to ensure that a method or property is executed by only one thread at a time, a traditional monolithic application deployed on a single machine can use the language's concurrency facilities for mutual exclusion. However, once the single-machine system evolves into a distributed cluster, the system is multi-threaded, multi-process, and spread across different machines, so the single-machine locking strategy no longer works, and the application alone cannot provide a distributed lock. Solving this requires a cross-machine mutual exclusion mechanism to control access to shared resources, which is exactly the problem distributed locks address.
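The situation described above can be sketched in a few lines. This is only a model under stated assumptions: the hypothetical `Server` class stands in for one machine of the cluster, and threads stand in for the three requests routed by the load balancer. Each machine has its own copy of variable A and its own local lock, so the local lock protects nothing across machines.

```python
import threading


class Server:
    """Models one machine: its own copy of variable A and its own local lock."""

    def __init__(self, name):
        self.name = name
        self.a = 0                    # this machine's private copy of variable A
        self.lock = threading.Lock()  # only excludes threads inside this machine

    def handle_request(self):
        with self.lock:
            self.a += 1


# Three machines behind a load balancer, one request routed to each.
servers = [Server(f"server-{i}") for i in range(3)]
threads = [threading.Thread(target=s.handle_request) for s in servers]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Three requests were processed, yet every machine believes A == 1.
values = [s.a for s in servers]
print(values)  # prints [1, 1, 1]
```

No machine ever sees the value 3, which is why a mechanism outside the individual processes is needed.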

2. What conditions should a distributed lock have

Before analyzing the three implementations of distributed locks, let's first understand what conditions a distributed lock should meet:
1. In a distributed system environment, a method can only be executed by one thread of one machine at the same time;
2. Highly available lock acquisition and release;
3. High-performance lock acquisition and release;
4. Reentrant feature;
5. Lock expiration mechanism to prevent deadlock;
6. Non-blocking lock feature, that is, if the lock is not acquired, the call returns failure immediately.
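To make conditions 4 to 6 concrete, here is a minimal single-process model. It is not a distributed lock and it is not thread-safe; the class name `SketchLock`, its methods, and the owner strings are all hypothetical and only illustrate what "reentrant", "expiring", and "non-blocking" mean in terms of behavior.

```python
import time


class SketchLock:
    """Single-process model of conditions 4-6: reentrant, expiring, non-blocking."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be shown without sleeping
        self.owner = None
        self.count = 0          # reentrancy depth
        self.expires_at = 0.0

    def try_acquire(self, owner):
        now = self.clock()
        if self.owner is not None and now >= self.expires_at:
            self.owner, self.count = None, 0   # condition 5: a stale lock expires
        if self.owner is None or self.owner == owner:
            self.owner = owner                 # condition 4: the same owner may re-enter
            self.count += 1
            self.expires_at = now + self.ttl
            return True
        return False                           # condition 6: fail fast instead of blocking

    def release(self, owner):
        if self.owner != owner:
            return False
        self.count -= 1
        if self.count == 0:
            self.owner = None
        return True


now = [0.0]  # fake clock controlled by the example
lock = SketchLock(ttl_seconds=5, clock=lambda: now[0])
first = lock.try_acquire("machine-1/thread-1")         # acquired
again = lock.try_acquire("machine-1/thread-1")         # re-entered by the same owner
other = lock.try_acquire("machine-2/thread-7")         # rejected immediately
now[0] = 6.0                                           # the holder never released in time
after_expiry = lock.try_acquire("machine-2/thread-7")  # the stale lock is taken over
print(first, again, other, after_expiry)  # prints True True False True
```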

3. Three implementations of distributed locks

At present, most large-scale websites and applications are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that "no distributed system can satisfy Consistency, Availability and Partition tolerance at the same time; it can satisfy at most two of them." Therefore, many systems have to choose among these three at the beginning of their design. In the vast majority of Internet scenarios, strong consistency is sacrificed in exchange for high availability. The system often only needs to guarantee "eventual consistency", as long as the time needed to reach it is within the range acceptable to users.
In many scenarios, ensuring the eventual consistency of data requires technical solutions such as distributed transactions and distributed locks. Sometimes we need to ensure that a method can only be executed by one thread at a time. There are three common ways to do this:

Implement distributed locks based on databases;
Implement distributed locks based on caches (Redis, etc.);
Implement distributed locks based on ZooKeeper.

4. Implementation based on a database

The core idea of the database-based implementation is: create a table in the database that contains fields such as the method name, and create a unique index on the method name field. To execute a method, insert a row with that method name into the table. A successful insert means the lock is acquired; after execution completes, delete the row to release the lock.

(1) Create a table:

DROP TABLE IF EXISTS method_lock;
CREATE TABLE method_lock (
  {{EJS2}} int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  method_name varchar(64) NOT NULL COMMENT 'locked method name',
  {{EJS4}} varchar(255) NOT NULL COMMENT 'remarks',
  {{EJS5}} timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY ({{EJS6}}),
  UNIQUE KEY {{EJS7}} (method_name) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COMMENT='Method in lock';

(2) To execute a method, use the method name to insert data into the table:

Because method_name has a unique constraint, if multiple requests are submitted to the database at the same time, the database guarantees that only one insert succeeds. We can then consider that the thread whose insert succeeded has obtained the lock for the method and can execute the method body.

(3) The lock is acquired if the insertion is successful, and the corresponding row data is deleted after the execution is completed to release the lock:

delete from method_lock where method_name ='methodName';
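The insert-then-delete cycle in steps (1) to (3) can be tried out locally. The following is a minimal sketch that assumes an in-memory SQLite database in place of the shared database and a reduced table with only the method_name column; the helper names `try_lock` and `unlock` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the shared database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE method_lock (method_name TEXT NOT NULL UNIQUE)")


def try_lock(method_name):
    """Return True if the insert succeeded, i.e. the lock was acquired."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO method_lock (method_name) VALUES (?)", (method_name,)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the unique index rejected the row: someone else holds the lock


def unlock(method_name):
    """Release the lock by deleting the row."""
    with conn:
        conn.execute("DELETE FROM method_lock WHERE method_name = ?", (method_name,))


first = try_lock("methodName")   # the row is inserted, lock acquired
second = try_lock("methodName")  # duplicate key, lock refused
unlock("methodName")
third = try_lock("methodName")   # free again after the delete
print(first, second, third)  # prints True False True
```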

Note: This is just one database-based method; there are many other ways to use a database to implement distributed locks.
This database-based implementation is very simple, but measured against the conditions a distributed lock should meet, it has some problems that need to be solved and optimized:
1. Because it is based on a database, the availability and performance of the database directly affect the availability and performance of the distributed lock. Therefore, the database requires dual-machine deployment, data synchronization, and master-standby switching;
2. It is not reentrant, because the row exists until the lock is released, so the same thread cannot insert it again. Therefore, a new column must be added to the table to record the machine and thread that currently hold the lock. When acquiring the lock again, first check whether the machine and thread information in the table matches the current machine and thread; if it does, the lock is granted directly;
3. There is no lock expiration mechanism: if the server goes down after the row is inserted, the row is never deleted and the lock cannot be obtained after the service recovers. Therefore, a column must be added to the table to record the expiration time, and a scheduled task is needed to clear expired rows;
4. It is not a blocking lock: failure to obtain the lock returns immediately, so the acquisition logic must be optimized to retry in a loop;
5. Various problems will be encountered during implementation, and solving them makes the implementation more and more complicated; relying on the database also has a resource overhead, and performance needs to be considered.
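One possible way to address problems 2 to 4 is sketched below, again against an in-memory SQLite database. The `owner` and `expire_at` columns, the helper names, and the default timings are assumptions made for illustration, not part of the original table. For brevity, stale rows are cleared inside `try_lock` rather than by a separate scheduled task.

```python
import sqlite3
import time

# In-memory SQLite stands in for the shared database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE method_lock ("
    " method_name TEXT NOT NULL UNIQUE,"
    " owner TEXT NOT NULL,"       # hypothetical column: machine and thread holding the lock
    " expire_at REAL NOT NULL)"   # hypothetical column: when the row becomes stale
)


def try_lock(method_name, owner, ttl=10.0, now=None):
    """One non-blocking attempt; `now` is injectable so expiry can be shown without sleeping."""
    now = time.time() if now is None else now
    with conn:  # one transaction: commit on success, roll back on error
        # Problem 3: clear a stale row (the job of the scheduled cleanup task).
        conn.execute(
            "DELETE FROM method_lock WHERE method_name = ? AND expire_at <= ?",
            (method_name, now),
        )
        row = conn.execute(
            "SELECT owner FROM method_lock WHERE method_name = ?", (method_name,)
        ).fetchone()
        if row is not None:
            return row[0] == owner  # problem 2: the current holder may re-enter
        try:
            conn.execute(
                "INSERT INTO method_lock VALUES (?, ?, ?)",
                (method_name, owner, now + ttl),
            )
            return True
        except sqlite3.IntegrityError:
            return False  # another client inserted first


def lock_with_retry(method_name, owner, wait=0.05, interval=0.01):
    """Problem 4: emulate a blocking lock by retrying until the wait budget is spent."""
    deadline = time.monotonic() + wait
    while time.monotonic() < deadline:
        if try_lock(method_name, owner):
            return True
        time.sleep(interval)
    return False


a = try_lock("methodName", "machine-1/thread-1", now=100.0)        # acquired, stale at 110
a_again = try_lock("methodName", "machine-1/thread-1", now=101.0)  # same owner re-enters
b = try_lock("methodName", "machine-2/thread-9", now=102.0)        # refused
b_later = try_lock("methodName", "machine-2/thread-9", now=111.0)  # stale row cleared
print(a, a_again, b, b_later)  # prints True True False True
```

A complete reentrant lock would also record a hold count so that the row is deleted only after the last release; that is omitted here to keep the sketch short.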

5. Redis-based implementation

  1. Reasons for choosing Redis to implement distributed locks:
    (1) Redis has high performance;
    (2) Redis commands support this well, which makes the implementation convenient.
  2. Introduction to commands:
    (1) SETNX
    SETNX key val: if and only if the key does not exist, set the key to the string val and return 1; if the key exists, do nothing and return 0.
    (2) expire
    expire key timeout: set a timeout for the key, in seconds, after which the lock is automatically released to avoid deadlock.
    (3) delete
    delete key: delete the key (the underlying Redis command is DEL).
    These three commands are the main ones used when implementing distributed locks with Redis.
  3. Implementation ideas:
    (1) When acquiring a lock, use setnx to lock, and use the expire command to add a timeout to the lock; once the timeout elapses, the lock is automatically released. The value of the lock is a randomly generated UUID, which is checked when releasing the lock.
    (2) When acquiring the lock, a timeout for the acquisition itself is also set. If this time is exceeded, the attempt to acquire the lock is abandoned.
    (3) When releasing the lock, use the UUID to judge whether it is our own lock; if it is, execute delete to release the lock.
  4. Simple implementation code of distributed lock:

import time
import uuid

import redis

# Connect to Redis (the password value is a placeholder; use your own)
password = None
redis_client = redis.Redis(host="localhost",
                           port=6379,
                           password=password,
                           db=10)


# Acquire a lock
# lock_name: lock name
# acquire_time: how long (in seconds) the client waits to acquire the lock
# time_out: the timeout (in seconds) of the lock itself
def acquire_lock(lock_name, acquire_time=10, time_out=10):
    """Acquire a distributed lock"""
    identifier = str(uuid.uuid4())
    end = time.time() + acquire_time
    lock = "string:lock:" + lock_name
    while time.time() < end:
        if redis_client.setnx(lock, identifier):
            # Set a timeout for the lock so that a crashed process cannot block other processes forever
            redis_client.expire(lock, time_out)
            return identifier
        elif redis_client.ttl(lock) == -1:
            # The key exists but has no expiry (the holder crashed between setnx and expire)
            redis_client.expire(lock, time_out)
        time.sleep(0.001)
    return False


# Release a lock
def release_lock(lock_name, identifier):
    """Universal lock release function"""
    lock = "string:lock:" + lock_name
    pip = redis_client.pipeline(True)
    while True:
        try:
            # Watch the key so the transaction aborts if it changes before execute()
            pip.watch(lock)
            lock_value = pip.get(lock)
            if not lock_value:
                return True

            if lock_value.decode() == identifier:
                pip.multi()
                pip.delete(lock)
                pip.execute()
                return True
            pip.unwatch()
            break
        except redis.exceptions.WatchError:
            # The key changed between watch and execute; retry
            pass
    return False
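Because setnx and expire are two separate commands, a client that crashes between them leaves a lock without a timeout, which is what the ttl check in acquire_lock works around. As an alternative sketch, the SET command with its NX and EX options does both in one step, and a small Lua script makes the compare-and-delete on release atomic. The helper names `acquire_once` and `release` are hypothetical, and `client` is assumed to be a redis-py `redis.Redis` instance.

```python
import uuid

# Lua script run atomically by Redis: delete the key only if it still holds our token.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
end
return 0
"""


def acquire_once(client, lock_name, time_out=10):
    """Try once; return the token on success or None if the lock is already held."""
    token = str(uuid.uuid4())
    # SET key value NX EX seconds: create-if-absent and expiry in a single command.
    if client.set("string:lock:" + lock_name, token, nx=True, ex=time_out):
        return token
    return None


def release(client, lock_name, token):
    """Return True if our lock was deleted, False if it had expired or changed hands."""
    return client.eval(RELEASE_SCRIPT, 1, "string:lock:" + lock_name, token) == 1


# Usage, assuming a reachable Redis server:
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   token = acquire_once(client, "resource")
#   if token:
#       try:
#           ...  # critical section
#       finally:
#           release(client, "resource", token)
```

The retry loop from acquire_lock can be wrapped around `acquire_once` unchanged if a waiting period is wanted.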

  5. Test the distributed lock just implemented:

In the example, 50 threads simulate a flash sale (seckill) of a commodity, and the -= operator is used to decrease the stock. The orderliness of the output shows whether the code ran under the lock.

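from threading import Thread, current_thread

count = 50  # remaining stock (illustrative value)

def seckill():
    global count
    identifier = acquire_lock('resource')
    if not identifier:
        return  # gave up after waiting acquire_time seconds
    print(current_thread().name, "Acquired lock")
    count -= 1  # decrease the stock while holding the lock
    release_lock('resource', identifier)


for i in range(50):
    t = Thread(target=seckill)
    t.start()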

6. Implementation based on ZooKeeper

ZooKeeper is an open source component that provides consistency services for distributed applications. Internally it is a hierarchical, file-system-like directory tree, in which a name must be unique within the same directory. The steps to implement distributed locks based on ZooKeeper are as follows:
(1) Create a directory mylock;
(2) Thread A creates an ephemeral sequential node in the mylock directory when it wants to acquire the lock;
(3) Thread A fetches all child nodes in the mylock directory and looks for a sibling node smaller than its own. If none exists, its sequence number is the smallest and it obtains the lock;
(4) Thread B fetches all nodes, finds that its node is not the smallest, and sets a watch on the node immediately preceding its own;
(5) After thread A finishes processing, it deletes its own node; thread B receives the change event, checks whether it is now the smallest node, and if so, acquires the lock.
An Apache open source library, Curator, is recommended here. It is a ZooKeeper client, and the InterProcessMutex it provides is an implementation of a distributed lock. The acquire method acquires the lock, and the release method releases it.
Advantages: It offers high availability, reentrancy, and blocking locks, and it solves the deadlock problem.
Disadvantages: Because nodes must be created and deleted frequently, the performance is not as good as the Redis method.
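The node-ordering logic of steps (1) to (5) can be modeled without a ZooKeeper server. The sketch below is an in-memory simulation only: `LockDirectory` and its methods are hypothetical names, not part of ZooKeeper or Curator, and the watch is reduced to asking which node a waiter would observe.

```python
class LockDirectory:
    """In-memory stand-in for the mylock directory; not a ZooKeeper client."""

    def __init__(self):
        self.next_seq = 0
        self.children = []  # node names in creation order, e.g. "lock-0000000000"

    def create_sequential(self):
        """Step (2): create a node whose name carries the next sequence number."""
        name = f"lock-{self.next_seq:010d}"
        self.next_seq += 1
        self.children.append(name)  # sequence numbers only grow, so the list stays sorted
        return name

    def holds_lock(self, node):
        """Step (3): the node with no smaller sibling owns the lock."""
        return self.children[0] == node

    def node_to_watch(self, node):
        """Step (4): a waiter watches the node immediately preceding its own."""
        index = self.children.index(node)
        return self.children[index - 1] if index > 0 else None

    def delete(self, node):
        """Step (5): deleting the node releases the lock."""
        self.children.remove(node)


mylock = LockDirectory()
node_a = mylock.create_sequential()          # thread A
node_b = mylock.create_sequential()          # thread B
a_holds = mylock.holds_lock(node_a)          # A has the smallest node
b_holds_before = mylock.holds_lock(node_b)   # B must wait
b_watches = mylock.node_to_watch(node_b)     # B watches A's node
mylock.delete(node_a)                        # A finishes and deletes its node
b_holds_after = mylock.holds_lock(node_b)    # B is now the smallest
print(a_holds, b_holds_before, b_holds_after)  # prints True False True
```

Watching only the preceding node means that a release wakes exactly one waiter instead of all of them.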

7. Summary

None of the above three implementations is perfect for every occasion. Therefore, the most suitable one should be selected according to the application scenario.
In a distributed environment, it is sometimes important to lock resources, for example when many users rush to buy the same resource. In this case, distributed locks control access to the resource well.
Of course, in practice many factors need to be considered, such as the choice of lock timeout and lock acquisition time, which have a great impact on the achievable concurrency. The distributed locks implemented above are only simple implementations, mainly meant to illustrate the idea.

Reprinted in: /liuqingzheng/p/11080501.html
