Summary of RabbitMQ knowledge points

Message queue

MQ related concepts

What is MQ

MQ stands for Message Queue.

Message queue: a FIFO (first-in, first-out) queue that stores messages. It is a cross-process communication mechanism used to pass messages between upstream and downstream services.

Why use MQ?

  1. Traffic peak shaving
    For example, suppose a system can process up to 10,000 orders per second. This capacity is more than enough during normal hours, but at peak times (such as Double Eleven) 20,000 requests may arrive in one second, which is more than the system can handle. The traditional approach is to stop users from placing orders once the number of requests exceeds 10,000. With a message queue acting as a buffer, the orders that arrive within one second can be spread over a longer period for processing. This lengthens the response time for some users, but it is a much better experience than not being able to place an order at all.
  2. Application decoupling
    Take an e-commerce system as an example: the application contains an order system, an inventory system, a logistics system, and a payment system. If the order system calls the inventory, logistics, and payment systems directly, a failure in any one of them makes order placement fail. With a message queue between them, this problem is greatly reduced. For example, suppose the logistics system fails and takes a few minutes to repair. During this period, the messages destined for the logistics system are held in the message queue, and the user's order can still complete normally. Once the logistics system recovers, it continues processing the queued order messages, and the user never notices the outage.
  3. Asynchronous processing
    For example, there are two systems, A and B. A calls B, B takes a long time to execute, and A needs to know when B has finished. Traditionally there are two ways to handle this: A polls B's query API after a while, or A provides a callback interface that B calls when it finishes. Neither is very elegant. With a message queue, after A calls B it only needs to listen for the message that B has finished. When B finishes processing, it sends a message to MQ, and MQ forwards the message to service A.
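
The peak-shaving arithmetic from point 1 can be illustrated with a small toy model. This is only a sketch using an in-memory `deque` in place of a real broker; the function name `drain` and its parameters are made up for the illustration.

```python
from collections import deque


def drain(arrivals_per_second, capacity_per_second):
    """Return the backlog left in the queue at the end of each second."""
    queue = deque()
    backlog = []
    second = 0
    while second < len(arrivals_per_second) or queue:
        if second < len(arrivals_per_second):
            # Producers enqueue without waiting for the downstream system.
            queue.extend(range(arrivals_per_second[second]))
        # The downstream system takes at most its capacity each second.
        for _ in range(min(capacity_per_second, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
        second += 1
    return backlog


# 20,000 orders arrive in one second; the system handles 10,000 per second.
backlog = drain([20_000], 10_000)
print(backlog)  # [10000, 0]
```

No order is refused: half of the burst simply waits in the queue for one extra second.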

Classification of MQ

ActiveMQ:
Advantages: Single-machine throughput on the order of 10,000 messages per second, millisecond-level latency, and high availability. High availability is achieved with a master-slave architecture, and data is unlikely to be lost.
Disadvantages: The official community maintains it less and less, and it is rarely used in high-throughput scenarios.

Kafka:
Advantages: Built for big data, with high throughput. It is mostly used for real-time computing and log collection in the big data field.
Disadvantages: Community updates are slow.

RocketMQ:
Advantages: Implemented in Java, with single-machine throughput on the order of 100,000 messages per second and support for accumulating on the order of 1 billion messages.
Disadvantages: Few client languages are supported.

RabbitMQ:
Advantages: One of the mainstream message middleware products, built on AMQP (Advanced Message Queuing Protocol). It is implemented in Erlang, which gives it good concurrency; throughput is on the order of 10,000 messages per second; it supports many client languages; and the community is highly active.
Disadvantages: The commercial version is paid.

Choice of MQ

Kafka:

Suitable for Internet services that generate and collect large amounts of data, and recommended for large companies. If log collection is involved, Kafka is the first choice.

RocketMQ:

Used in the Internet finance field.

RabbitMQ:

Latency is low, the community is highly active, and it is the preferred choice for small and medium-sized companies.

RabbitMQ

Introduction

RabbitMQ is message middleware responsible for receiving, storing, and forwarding messages. It is similar to a parcel station in everyday life, which receives parcels, stores them, and forwards them.

Four concepts

Producer : produces the messages.
Exchange : receives messages from producers and pushes them to queues. It decides whether a message is pushed to one specific queue, pushed to multiple queues, or discarded.
Queue : stores messages.
Consumer : receives messages.

RabbitMQ core part

Six modes

  1. Simple Mode
  2. Working Mode
  3. Publish/Subscribe Mode
  4. Routing Mode
  5. Topic Mode
  6. Publish Confirmation Mode

How RabbitMQ works

Broker : the application that receives and distributes messages. RabbitMQ Server is a message broker.
Virtual host : designed for multi-tenancy and security. It groups the basic AMQP components into a virtual group, similar to the namespace concept in networking. When several users share one RabbitMQ server, it can be divided into multiple vhosts, and each user creates exchanges, queues, and so on in its own vhost.
Connection : the TCP connection between a publisher/consumer and the broker.
Channel : if a Connection were established every time RabbitMQ is accessed, the overhead of setting up TCP connections would be huge and efficiency low when message volume is large. A Channel is a logical connection established inside a Connection. If the application is multi-threaded, each thread usually creates its own channel for communication. Each AMQP method carries a channel id so that the client and the broker can identify the channel, so channels are completely isolated from one another. As a lightweight Connection, a Channel greatly reduces the operating-system overhead of establishing TCP connections.
Exchange : the first stop of a message inside the broker. According to its distribution rules, it matches the routing key against its lookup table and distributes the message to queues. Common types are direct (point-to-point), topic (publish-subscribe), and fanout (multicast).
Queue : where a message finally lands, waiting for a consumer to take it away.
Binding : the virtual connection between an exchange and a queue. A binding can contain a routing key. Binding information is stored in the exchange's lookup table and is the basis for distributing messages.

Simple mode

Features : One producer sends messages to one queue, and one consumer consumes messages from it. This is the simplest mode.

Working mode

This mode is called “Work Queues” in English, also known as task queues.

Round-robin message distribution

In this mode, one producer corresponds to one message queue, and one message queue corresponds to multiple worker threads (consumers). The worker threads take turns consuming messages (round-robin), and each message is consumed only once.

Suppose the producer sends the following messages to the queue:
(aa,bb,cc,dd,ee)
First consumption: one worker thread goes first, say worker thread 1, and it consumes aa
Second consumption: worker thread 1 has just had its turn, so worker thread 2 consumes bb
Third consumption: worker thread 3 consumes cc
Fourth consumption: worker thread 1 consumes dd
.....
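
The turn-taking above can be reproduced with a few lines of plain Python. This is a toy illustration rather than broker code; the worker names are placeholders.

```python
from itertools import cycle


def round_robin(messages, workers):
    """Hand out messages to workers in turn; each message goes to exactly one worker."""
    assignment = {worker: [] for worker in workers}
    for message, worker in zip(messages, cycle(workers)):
        assignment[worker].append(message)
    return assignment


result = round_robin(
    ["aa", "bb", "cc", "dd", "ee"],
    ["worker1", "worker2", "worker3"],
)
print(result)
# {'worker1': ['aa', 'dd'], 'worker2': ['bb', 'ee'], 'worker3': ['cc']}
```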

Message acknowledgment

A consumer may take a while to complete a task. If a consumer dies partway through a long task, the message it was processing is lost, because, without an acknowledgment mechanism, RabbitMQ marks a message for deletion as soon as it delivers it to the consumer.

To make sure messages are not lost in this way, RabbitMQ provides an acknowledgment mechanism: after the consumer has received and processed a message, it tells RabbitMQ that the message has been handled, and only then does RabbitMQ delete it.

Acknowledgments are divided into automatic and manual acknowledgments.

Automatic acknowledgment

Automatic acknowledgment: the message is considered successfully delivered as soon as it is sent. The drawback is that if the consumer's connection or channel is closed before the message is received, the message is lost.

Manual acknowledgment

A manual acknowledgment can be batched or not. Figure 1 is a batch acknowledgment, and Figure 2 is a non-batch acknowledgment.

The advantage of manual acknowledgment is that no messages are lost during consumption, and acknowledging in batches reduces network traffic.

Messages are automatically requeued

If a consumer loses its connection for some reason (its channel is closed, the connection is closed, or the TCP connection is lost) and therefore never sends an ACK, RabbitMQ knows that the message was not fully processed and re-queues it. If another consumer is available at that point, RabbitMQ quickly redelivers the message to it.
This ensures that no messages are lost even if a consumer occasionally dies.
For example, message 1 is first delivered to C1, but C1 loses its connection before acknowledging it. The message is then redelivered to another consumer, and it is still processed successfully only once.
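
The requeue behavior can be sketched with a toy in-memory broker. The class `Broker` and its methods are invented for this illustration and are not a RabbitMQ client API; the sketch assumes a requeued message goes back to the front of the queue.

```python
from collections import deque


class Broker:
    """Toy broker: a delivered message is deleted only once it is acked."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}  # consumer name -> messages delivered but not yet acked

    def deliver(self, consumer):
        message = self.ready.popleft()
        self.unacked.setdefault(consumer, []).append(message)
        return message

    def ack(self, consumer, message):
        # Only now is the message really gone from the broker.
        self.unacked[consumer].remove(message)

    def connection_lost(self, consumer):
        # Unacked messages go back to the front of the queue, in their old order.
        for message in reversed(self.unacked.pop(consumer, [])):
            self.ready.appendleft(message)


broker = Broker(["message 1", "message 2"])
first = broker.deliver("C1")        # C1 receives message 1 ...
broker.connection_lost("C1")        # ... but dies before acknowledging it
redelivered = broker.deliver("C2")  # the same message goes to C2
broker.ack("C2", redelivered)
print(redelivered, list(broker.ready))
```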

RabbitMQ persistence

Message persistence

The acknowledgment mechanism introduced earlier ensures that RabbitMQ deletes a message only after the consumer has acknowledged it. But how do we make sure that messages sent by the producer are not lost when the RabbitMQ service stops? By default, queues and messages are lost when RabbitMQ exits or crashes, so we need to mark both the queue and the messages as persistent (durable).

This approach still has a weakness: there is a short window in which RabbitMQ has accepted the message but has not yet written it to disk, because the message is still sitting in a cache. The persistence guarantee is therefore not strong, but it is enough for ordinary simple tasks. If you need a stronger guarantee, see the section on publish (release) confirmation later.

Unfair distribution

Earlier we said that RabbitMQ distributes messages round-robin, but this strategy is not always a good fit, because consumers do not all process messages at the same speed. If one consumer is fast and another is very slow, round-robin leaves the fast consumer idle while the slow one falls behind. Efficiency improves only if the fast consumer is given more of the work.

We can do this by setting the prefetch value, which defines the maximum number of unacknowledged messages allowed on a channel. Once a consumer reaches that limit, RabbitMQ stops delivering to it until it acknowledges a message, so giving fast consumers a larger prefetch value improves overall efficiency.
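
The effect of a prefetch limit can be seen in a small tick-based simulation. This is a simplified model, not RabbitMQ's actual scheduler: the function `dispatch`, the consumer names, and the processing times are all assumptions made for the example.

```python
def dispatch(messages, consumers, ticks_per_message, prefetch):
    """Deliver only to consumers whose unacked count is below their prefetch value."""
    pending = list(messages)
    in_flight = {c: [] for c in consumers}  # remaining work per unacked message
    handled = {c: 0 for c in consumers}
    while pending or any(in_flight.values()):
        # Delivery phase: fill every free prefetch slot.
        for c in consumers:
            while pending and len(in_flight[c]) < prefetch[c]:
                pending.pop(0)
                in_flight[c].append(ticks_per_message[c])
        # Work phase: each consumer spends one tick on its oldest message.
        for c in consumers:
            if in_flight[c]:
                in_flight[c][0] -= 1
                if in_flight[c][0] == 0:
                    in_flight[c].pop(0)  # the ack frees a prefetch slot
                    handled[c] += 1
    return handled


handled = dispatch(
    range(8),
    ["fast", "slow"],
    ticks_per_message={"fast": 1, "slow": 3},
    prefetch={"fast": 1, "slow": 1},
)
print(handled)  # {'fast': 6, 'slow': 2}
```

With strict round-robin each consumer would get four of the eight messages; with a prefetch of 1 the fast consumer picks up new work as soon as it acknowledges the previous message.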

Publish confirmation

The producer sets the channel to confirm mode. Once the channel is in confirm mode, every message published on it is assigned a unique ID (starting from 1). Once a message has been delivered to all matching queues, the broker sends a confirmation to the producer (containing the message's unique ID), so the producer knows that the message has reached its destination queues.
If the message and the queue are durable, the confirmation is sent only after the message has been written to disk. The confirmation that the broker sends back to the producer contains the message's sequence number.

The advantage of confirm mode is that it is asynchronous and guards against message loss.
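
On the producer side, the sequence numbers make it possible to keep track of which messages are still unconfirmed. The following is a sketch of such bookkeeping in plain Python; `ConfirmTracker` is a hypothetical helper, not part of any client library. The `multiple` flag mirrors the AMQP acknowledgment field that confirms every message up to and including the given sequence number.

```python
class ConfirmTracker:
    """Bookkeeping a producer might keep while its channel is in confirm mode."""

    def __init__(self):
        self.next_seq = 1      # IDs start from 1
        self.outstanding = {}  # sequence number -> message awaiting confirmation

    def publish(self, message):
        seq = self.next_seq
        self.outstanding[seq] = message
        self.next_seq += 1
        return seq

    def on_ack(self, seq, multiple=False):
        if multiple:
            # Confirms every outstanding message up to and including seq.
            for s in [s for s in self.outstanding if s <= seq]:
                del self.outstanding[s]
        else:
            self.outstanding.pop(seq, None)


tracker = ConfirmTracker()
for body in ["m1", "m2", "m3", "m4"]:
    tracker.publish(body)

tracker.on_ack(2, multiple=True)  # confirms m1 and m2 together
tracker.on_ack(4)                 # confirms m4 only
print(tracker.outstanding)        # {3: 'm3'}
```

Whatever remains in `outstanding` after a timeout is a candidate for republishing.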

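Publish confirmation comes in three forms: single confirmation, batch confirmation, and asynchronous confirmation.

Single confirmation: publish one message and wait for its confirmation before publishing the next.
Batch confirmation: publish a batch of messages, then wait for the whole batch to be confirmed.
Asynchronous confirmation: the producer just keeps publishing, and the broker confirms messages in turn by invoking the producer's callback function, which is what ensures reliability.

Their respective characteristics: single confirmation is simple but slow; batch confirmation is faster, but when a confirmation fails it is hard to tell which message in the batch was affected; asynchronous confirmation gives the best throughput but is more involved to implement.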

Exchange

What is an exchange

The core idea of the RabbitMQ messaging model is that messages produced by producers are never sent directly to queues. Instead, producers can only send messages to exchanges.

Types of exchanges:
direct, topic, headers, fanout

Nameless exchange

When we specify an empty string as the exchange name, the default exchange is used, which is the nameless exchange.

Temporary queue

When we need a queue but do not want to name it ourselves, we can create it with a random name, or let the server choose a random queue name for us. Once the consumer disconnects, the queue is deleted automatically.

Binding

Binding: a bridge between an exchange and a queue. It tells the exchange which queues it is bound to.

Fanout exchange (Fanout)

Fanout exchange: broadcasts every message it receives to all queues it knows about.

Direct exchange (Direct)

Direct exchange: a message goes only to the queues whose binding key matches the message's routing key.
In the picture above, exchange X is bound to two queues and its type is direct. Queue Q1 is bound with the key orange, and queue Q2 is bound with two keys: black and green.
With these bindings, a message published to the exchange with routing key orange is routed to queue Q1. Messages with routing key black or green are routed to queue Q2, and all other messages are discarded.
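
The routing rule for this example can be modelled in a few lines. This is a toy exchange written for illustration (the class `DirectExchange` and the key `purple` are made up), using the Q1/Q2 bindings described above.

```python
from collections import defaultdict


class DirectExchange:
    """Toy direct exchange: route by exact match of routing key and binding key."""

    def __init__(self):
        self.bindings = defaultdict(list)  # binding key -> queue names
        self.queues = defaultdict(list)    # queue name -> messages received

    def bind(self, queue, binding_key):
        self.bindings[binding_key].append(queue)

    def publish(self, routing_key, body):
        targets = self.bindings.get(routing_key, [])
        for queue in targets:
            self.queues[queue].append(body)
        return bool(targets)  # False means the message was discarded


x = DirectExchange()
x.bind("Q1", "orange")
x.bind("Q2", "black")
x.bind("Q2", "green")

x.publish("orange", "to Q1")
x.publish("black", "to Q2 (black)")
x.publish("green", "to Q2 (green)")
delivered = x.publish("purple", "nobody wants this")

print(dict(x.queues), delivered)
```

Binding several queues with the same key makes `publish` append to all of them, which is the fanout-like behavior of multiple bindings.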

Multiple bindings

Of course, if the exchange type is direct but several queues are bound to it with the same key, the exchange behaves much like fanout for that key, even though its type is direct. As shown above.

Topics

Disadvantage of the fanout exchange: you cannot choose which consumers receive a message; it is broadcast to all of them indiscriminately.
Disadvantage of the direct exchange: the routing_key must match exactly, so routing rules cannot be expressed flexibly.

Topic exchange: the routing_key of a message sent to a topic exchange cannot be arbitrary. It must be a list of words separated by dots. The words themselves can be anything.
Special characters in binding keys:
* (asterisk) can replace exactly one word
# (pound sign) can replace zero or more words

quick.orange.rabbit is received by queues Q1 and Q2
lazy.orange.elephant is received by queues Q1 and Q2
quick.orange.fox is received by queue Q1
lazy.brown.fox is received by queue Q2
lazy.pink.rabbit matches both bindings of Q2 but is received only once by queue Q2
quick.brown.fox does not match any binding, is not received by any queue, and is discarded
quick.orange.male.rabbit has four words, does not match any binding, and is discarded
lazy.orange.male.rabbit has four words but matches Q2
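
The matching rules can be checked with a short matcher. The bindings are not spelled out in the text, so the sketch assumes the ones from the official RabbitMQ topic tutorial, which these examples appear to follow: Q1 bound with `*.orange.*`, Q2 bound with `*.*.rabbit` and `lazy.#`. The function names are made up for the illustration.

```python
def topic_matches(pattern, routing_key):
    """Return True if a dot-separated routing key matches a topic binding pattern."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern, words):
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # '#' may absorb zero or more words.
        return any(_match(rest, words[i:]) for i in range(len(words) + 1))
    if not words:
        return False
    if head == "*" or head == words[0]:
        # '*' stands for exactly one word.
        return _match(rest, words[1:])
    return False


# Assumed bindings (not stated in the text above).
bindings = {"Q1": ["*.orange.*"], "Q2": ["*.*.rabbit", "lazy.#"]}


def route(routing_key):
    """A queue receives the message once, however many of its bindings match."""
    return sorted(
        queue
        for queue, patterns in bindings.items()
        if any(topic_matches(p, routing_key) for p in patterns)
    )


for key in ["quick.orange.rabbit", "lazy.pink.rabbit", "quick.brown.fox",
            "quick.orange.male.rabbit", "lazy.orange.male.rabbit"]:
    print(key, "->", route(key))
```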

Dead letter queue

What is a dead letter

Dead letter: a message that cannot be consumed.
Dead letter queue: the queue to which messages that cannot be consumed are routed.

Normally the producer delivers a message to the broker (or directly to a queue), and a consumer takes it from the queue and consumes it. Sometimes, however, a message in the queue cannot be consumed for a specific reason. Such a message becomes a dead letter, and where there are dead letters there is naturally a dead letter queue.

Application scenario: to make sure that order messages are not lost, use RabbitMQ's dead letter queue mechanism: when consuming a message fails, the message is put into the dead letter queue. Another example: a user places an order in a shop and clicks pay, and the order expires automatically if payment is not made within the specified time.

Cause of Dead Letters

  1. The message TTL expires
  2. The queue reaches its maximum length (the queue is full and cannot hold more messages)
  3. The message is rejected (basic.reject or basic.nack) and requeue=false
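
The three causes can be seen side by side in a toy model. This is only a sketch: the class and the reason labels are illustrative, time is passed in explicitly, and the overflow rule (the oldest message is dead-lettered when the queue is over its limit) is an assumption modelled on RabbitMQ's default drop-head behavior.

```python
from collections import deque


class QueueWithDeadLettering:
    """Toy queue that moves unconsumable messages to a dead letter list."""

    def __init__(self, max_length, dead_letters):
        self.max_length = max_length
        self.messages = deque()           # (body, expires_at) pairs
        self.dead_letters = dead_letters  # stands in for the dead letter queue

    def publish(self, body, now, ttl):
        self.messages.append((body, now + ttl))
        if len(self.messages) > self.max_length:
            # Cause 2: queue over its maximum length (assumed: oldest goes first).
            old, _ = self.messages.popleft()
            self.dead_letters.append((old, "maxlen"))

    def expire(self, now):
        # Cause 1: the TTL of the message at the head has expired.
        while self.messages and self.messages[0][1] <= now:
            body, _ = self.messages.popleft()
            self.dead_letters.append((body, "expired"))

    def reject(self, requeue):
        body, expires_at = self.messages.popleft()
        if requeue:
            self.messages.appendleft((body, expires_at))
        else:
            # Cause 3: rejected with requeue=false.
            self.dead_letters.append((body, "rejected"))


dlq = []
q = QueueWithDeadLettering(max_length=2, dead_letters=dlq)
q.publish("a", now=0, ttl=10)
q.publish("b", now=0, ttl=100)
q.expire(now=10)                 # "a" has expired
q.publish("c", now=10, ttl=100)
q.publish("d", now=10, ttl=100)  # three messages > max_length, "b" is pushed out
q.reject(requeue=False)          # "c" is rejected
print(dlq)
```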

Dead letter architecture diagram

Delay queue

What is a delay queue

One of the dead-letter cases in the previous section is a message whose TTL expires and which therefore enters the dead letter queue. Used on its own, this flow gives us a delay queue.

Delay queue: a queue that stores elements to be processed at a specified time. The producer sends a message to a queue as usual; after the message has waited in that queue for the delay time, it moves to the dead letter queue and is then consumed by the consumer.

Delay queues are used in many scenarios:
1. If an order is not paid within ten minutes, it is cancelled automatically.
2. If a newly created store has not uploaded a product within ten days, a reminder message is sent automatically.
3. If a user does not log in within three days of registering, a text message reminder is sent.
4. If a refund initiated by a user is not processed within three days, the relevant operators are notified.
5. After a meeting is scheduled, each participant is notified ten minutes before the scheduled time.

Queue TTL

What is TTL?

TTL is an attribute of a message or a queue in RabbitMQ. It specifies the maximum lifetime, in milliseconds, of a single message or of every message in the queue.

Architecture diagram

Create two queues QA and QB with TTLs of 10 s and 40 s respectively. Then create an exchange X and a dead letter exchange Y, both of type direct, and a dead letter queue QD. Their binding relationship is as follows:
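
As a rough sketch of how QA and QB might be declared, the optional queue arguments could look like the dictionaries below. The keys `x-message-ttl`, `x-dead-letter-exchange`, and `x-dead-letter-routing-key` are RabbitMQ's queue arguments for these features; the routing key `YD` and the helper function are hypothetical and chosen only for the example. In a client library these dictionaries would typically be passed as the arguments of the queue declaration.

```python
# Hypothetical names: the dead letter routing key "YD" is an assumption.
DEAD_LETTER_EXCHANGE = "Y"
DEAD_LETTER_ROUTING_KEY = "YD"


def delayed_queue_arguments(ttl_seconds):
    """Build the optional arguments for a queue whose expired messages go to Y."""
    return {
        "x-message-ttl": ttl_seconds * 1000,  # TTL is given in milliseconds
        "x-dead-letter-exchange": DEAD_LETTER_EXCHANGE,
        "x-dead-letter-routing-key": DEAD_LETTER_ROUTING_KEY,
    }


queue_arguments = {
    "QA": delayed_queue_arguments(10),
    "QB": delayed_queue_arguments(40),
}
print(queue_arguments)
```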

Problem with the above case

Every new delay requirement needs a new queue.

Delay queue optimization

To solve this problem, we adopt the following scheme.

Architecture diagram

A new queue QC is added here, with the binding relationship as follows. QC does not set a TTL itself; instead, the producer sets the TTL on each message.

Remaining problem

Suppose two messages are sent: the first should be consumed after 40 seconds and the second after 2 seconds. In practice, both messages are consumed after 40 seconds.

Reason: the queue is first-in, first-out, so the message that expires after 2 seconds must wait until the message ahead of it has been dequeued, and it ends up being consumed after 40 seconds.
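
This head-of-line blocking is easy to reproduce in a toy model in which expiry is only checked for the message at the head of the queue. The function name and the tuple layout are assumptions for the illustration.

```python
def dead_letter_times(messages):
    """messages: (body, publish_time, ttl) tuples in publish order.

    Returns the time at which each message leaves the queue when only the
    head of a FIFO queue can expire.
    """
    times = {}
    head_free_at = 0
    for body, published, ttl in messages:
        expires = published + ttl
        # A message cannot leave before everything ahead of it has left.
        leaves = max(expires, head_free_at)
        times[body] = leaves
        head_free_at = leaves
    return times


times = dead_letter_times([("first", 0, 40), ("second", 0, 2)])
print(times)  # {'first': 40, 'second': 40}
```

Swapping the publish order gives 2 and 40, which shows that the delay depends on queue position and not only on each message's own TTL.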

This problem can be solved by implementing the delay queue with a RabbitMQ plugin.

Advanced publish confirmation

The publish confirmation described earlier means that once a message sent by the producer has entered the matching queues, the broker sends a confirmation (including the message ID) back to the producer. But consider a more extreme case: what if the exchange or the queue is gone (for example, because RabbitMQ restarted) and the message is lost? The answer is the advanced version of publish confirmation.

Advanced publish confirmation:

Other knowledge points of RabbitMQ

Idempotency

The problem: in an order scenario, the user pays and the payment succeeds (the amount has already been deducted), but a network exception occurs at that moment. The user does not know the payment went through and clicks pay again, so the payment is made twice.

Solution: in a stand-alone system, we only need to put the operation into a transaction and roll it back if an error occurs.

The similar problem in MQ:
The consumer has consumed message 1 and is about to send the ack to MQ when the network is interrupted. MQ never receives the ack, assumes the message was not consumed, and redelivers it to the same consumer or to another one. The message is consumed twice.

Solution: make the MQ consumer idempotent. Give every message a globally unique ID (such as a timestamp or UUID), and each time a message is consumed, use the ID to check whether that message has already been consumed.
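
A minimal sketch of the ID check, assuming the set of processed IDs is kept in memory. In a real deployment the IDs would have to live in storage shared by all consumers; the class name, message layout, and balance figure are made up for the example.

```python
import uuid


class IdempotentConsumer:
    """Skips any message whose ID has already been processed."""

    def __init__(self):
        self.processed_ids = set()  # assumption: in-memory, single consumer
        self.balance = 100

    def handle(self, message):
        if message["id"] in self.processed_ids:
            return False  # duplicate delivery, nothing is deducted again
        self.balance -= message["amount"]
        self.processed_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
payment = {"id": str(uuid.uuid4()), "amount": 30}

first = consumer.handle(payment)   # processed
second = consumer.handle(payment)  # redelivered after a lost ack, ignored
print(first, second, consumer.balance)  # True False 70
```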

Priority queue

Order reminder scenario: after a customer places an order, we need to remind them to pay. We could simply put the reminders into a queue (implemented with a Redis List or RabbitMQ) and process them in order. But suppose there is a further requirement: reminders should not be sent in arrival order, but by priority, so that orders from large merchants are reminded first and small orders later. In that case a priority queue can be used.
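
The ordering a priority queue provides can be illustrated with Python's `heapq`. This is a toy in-process queue rather than RabbitMQ itself; the class name and the priority values are assumptions for the example.

```python
import heapq
from itertools import count


class PriorityReminderQueue:
    """Higher priority first; first-in, first-out within the same priority."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves arrival order

    def put(self, reminder, priority):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._order), reminder))

    def get(self):
        return heapq.heappop(self._heap)[2]


reminders = PriorityReminderQueue()
reminders.put("small order A", priority=1)
reminders.put("large merchant order", priority=10)
reminders.put("small order B", priority=1)

served = [reminders.get() for _ in range(3)]
print(served)  # ['large merchant order', 'small order A', 'small order B']
```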

Lazy queue

By default, when a producer sends a message to RabbitMQ, the queue keeps the message in memory whenever possible so that it can be delivered to consumers more quickly. Even persistent messages keep a copy in memory while being written to disk. When RabbitMQ needs to free memory, it pages messages from memory out to disk. This operation takes a long time and blocks the queue, so it cannot accept new messages in the meantime.

A lazy queue stores messages on disk whenever possible and loads them into memory only when a consumer is about to consume them. One of its main design goals is to support longer queues, that is, to store more messages.

RabbitMQ cluster

Reasons to use clusters

If a single RabbitMQ machine runs out of memory and crashes, loses power, or has a hardware failure such as a broken motherboard, messages are lost. A cluster can solve this problem.

Build a cluster

Prepare three servers and configure them as a cluster (node1, node2, node3…)

Mirror queue

Even with a cluster, a queue is still stored on a single node. If that node suddenly goes down, the queue is lost with it. Therefore we need to configure mirror queues.

Mirror queue: a queue can be mirrored to other broker nodes in the cluster. If one node in the cluster fails, the queue automatically switches to one of its mirrors on another node, which keeps the service available. Put simply: suppose there is a hello queue on node 1. When we configure mirroring with parameter 2, the hello queue exists on node 1 and has a backup on one other node. If node 1 goes down, the backup takes over on its node, and a new backup of it is created on another node that is still running, so the queue is not lost.

HAProxy + Keepalived for highly available load balancing

The problem: as mentioned earlier, when node 1 goes down, the queue's backup takes over on another node, so the queue does not disappear. But the producer does not know that node 1 is down. If node 1 is hardcoded in our code, messages are still sent to node 1.

To solve this problem, we add highly available load balancing, as shown below:
HAProxy implements load balancing:
HAProxy provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, supports virtual hosts, and is a free, fast, and reliable solution. Here it is responsible for distributing client traffic to the different nodes.

Keepalived implements dual-host (active-standby) hot standby:
Imagine that the HAProxy host configured above suddenly goes down or its network card fails. Even though nothing is wrong with the RabbitMQ cluster itself, every connection from external clients is cut off, and the result is a disaster. To keep the load balancing service reliable, it is important to introduce Keepalived, which uses its health check and resource takeover functions to provide high availability (dual-host hot standby) and failover.

Federation Exchange

The problem: (broker Beijing) and (broker Shenzhen) are far apart, so network latency has to be dealt with. Suppose a business in Shenzhen (Client Shenzhen) needs to send messages to exchangeA in Beijing. Because of the high network latency between (Client Shenzhen) and (broker Beijing), every message that (Client Shenzhen) sends to exchangeA is noticeably delayed.

Solution:
1. Deploy the Shenzhen business in the Beijing machine room. (But then the business is slow to reach its other services in Shenzhen, and it is impossible to deploy all services in one machine room.)
2. Use the Federation plugin, which solves this problem very well.
The federated exchange has a concept of upstream and downstream.
For example: suppose a service in Shenzhen needs to access the MQ in Beijing, and the latency is high. We configure Beijing as the upstream and Shenzhen as the downstream, and the upstream synchronizes its data to the downstream. The service in Shenzhen then only needs to access the MQ in Shenzhen, with low latency. The same works in the opposite direction.

Federation Queue

Federated queues can provide load balancing for a single queue across multiple broker nodes (or clusters). A federated queue can connect to one or more upstream queues and obtain messages from these upstream queues to meet the needs of local consumers to consume messages.

For federated queues, the principle is similar to that of federated exchanges, but it is further subdivided into the queue level.

Shovel

Like Federation, Shovel provides a data forwarding function. Shovel reliably and continuously pulls data from a queue on one broker (the source) and forwards it to an exchange on another broker (the destination). The source queue and the destination exchange can be on the same broker or on different brokers.
