[Linux] Network Foundation–Network Layer and Data Link Layer

Network foundation: a detailed explanation of the network layer and the data link layer

This article introduces the last two layers of the TCP/IP model covered in this series: the network layer and the data link layer.

🌴Network layer

🌿 Network layer concept

The network layer sits between the transport layer and the data link layer. The data link layer only moves data frames between two adjacent nodes; building on that service, the network layer manages communication across the whole network and carries data from the source, through a number of intermediate nodes, to the destination. In this way it provides the transport layer with the most basic end-to-end data transmission service.

🌵Main functions of the network layer

1. Handle packet-sending requests from the transport layer: on receiving a request, load the data into an IP datagram, fill in the header, select the path to the destination, and send the datagram out through the appropriate network interface.
2. Handle incoming datagrams: first check their validity, then make a routing decision. If the datagram has reached its destination, remove the header and deliver the rest to the upper-layer transport protocol; if it has not, forward it.
3. Deal with issues such as path selection, flow control, and congestion.

In simple terms, the network layer is responsible for determining an appropriate path in a complex network environment.

🌿Network layer protocol

Network layer protocols include:

  • IP (Internet Protocol)
  • ICMP (Internet Control Message Protocol)
  • ARP (Address Resolution Protocol)
  • RARP (Reverse Address Resolution Protocol)

IP is the core of the network layer: routing determines the next hop, and the datagram is then encapsulated and handed to the network interface layer. IP provides a connectionless datagram service. ICMP supplements IP and can send messages back to report whether the network is reachable; the ping command sends ICMP echo requests and tests the network through the echo replies.
ARP and RARP are not purely network layer protocols; they sit between the network layer and the data link layer.
The following sections introduce the IP protocol in detail and the ICMP protocol briefly. ARP is introduced in the data link layer part.

🌵IP Protocol

Before introducing the IP protocol, recall that an IP address consists of a network address plus a host address, which together identify one host on one network segment. The IP protocol therefore provides one capability: transferring data from host A to host B.

🍁Basic Concept

IP is a network layer protocol that provides an unreliable, connectionless service. It makes a best effort to send packets from the source node to the destination node, but gives no reliability guarantee.
In other words, having the ability to carry data from the source host to the destination host does not mean every packet arrives: the IP protocol completes its task with high probability, but it does not promise delivery.
[Other related concepts]

  • Host: A device with an IP address, but without routing control
  • Router: It is equipped with an IP address and can perform routing control. Since the router works at the network layer, the router itself includes the network layer and the link layer
  • Node: A collective term for hosts and routers

🍁 Protocol header format

Before looking at the IP protocol header format, consider the same two questions we ask of every protocol:

  1. How to ensure header and payload separation
  2. How to ensure that the payload can be successfully delivered to the upper layer

To solve these problems, the IP header must record at least the header length (to separate the header from the payload) and the upper-layer protocol type (to hand the payload to the right protocol), along with the address of the receiver. The following figure shows the format of the IP protocol header:
The IP protocol header contains the following:

  • 4-bit version: the version of the IP protocol; for IPv4 the value is 4.
  • 4-bit header length: the length of the IP header in 32-bit words, so the header length in bytes is this value * 4. Four bits can hold at most 15, so the maximum IP header length is 60 bytes.
  • 8-bit Type Of Service: a 3-bit priority field (deprecated), a 4-bit TOS field, and a 1-bit reserved field (must be set to 0). The four TOS bits stand for minimum delay, maximum throughput, maximum reliability, and minimum cost. They conflict with each other, so only one can be selected. For applications such as ssh/telnet, minimum delay matters most; for programs such as ftp, maximum throughput matters most.
  • 16-bit total length: the number of bytes occupied by the whole IP datagram (header plus payload).
  • 16-bit identifier (id): identifies a datagram sent by the host. If the datagram is fragmented because it exceeds the MTU of the data link layer, every fragment carries the same id.
  • 3-bit flag field: the first bit is reserved (not used now, but may be used in the future). The second bit, when set to 1, prohibits fragmentation; in that case, if the packet is longer than the MTU, the IP module discards it. The third bit means “more fragments”: it is 1 in every fragment except the last and 0 in the last fragment, so a 0 acts as the end marker.
  • 13-bit fragment offset: the offset of this fragment relative to the beginning of the original IP packet, that is, where the fragment belongs in the original packet. The actual byte offset is this value * 8. Therefore, every fragment except the last must carry a data length that is a multiple of 8 (otherwise the fragments would not line up).
  • 8-bit Time To Live (TTL): the maximum number of hops the datagram may take to reach the destination; the initial value is commonly 64. Each router it passes decrements TTL by 1, and if TTL reaches 0 before arrival the datagram is discarded. This field mainly prevents routing loops.
  • 8-bit protocol: the type of the upper-layer protocol (for example TCP or UDP).
  • 16-bit header checksum: a checksum over the header only (a 16-bit one's-complement sum, not a CRC), used to detect header corruption.
  • 32-bit source address and 32-bit destination address: the sender and the receiver.
  • Option field (variable length, up to 40 bytes): the first 20 bytes of the header are fixed; anything more that has to go into the header is placed in the option field.
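
The field layout above can be made concrete with a short sketch. The following Python code is only an illustration, assuming a 20-byte header without options; the function names (`internet_checksum`, `build_ipv4_header`, `parse_ipv4_header`) and the sample addresses are made up for this example and are not part of any library.

```python
import socket
import struct

# Fixed 20-byte IPv4 header: version/IHL, TOS, total length, id,
# flags + fragment offset, TTL, protocol, checksum, source, destination.
IPV4_FORMAT = "!BBHHHBBH4s4s"


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of all 16-bit words, then inverted."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src, dst, payload_len, ttl=64, proto=6, ident=1):
    """Build a header with no options (hypothetical helper for illustration)."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len, ident, 0, ttl, proto, 0,
              socket.inet_aton(src), socket.inet_aton(dst)]
    # The checksum is computed with the checksum field itself set to 0.
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)


def parse_ipv4_header(raw: bytes) -> dict:
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(IPV4_FORMAT, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,          # 32-bit words -> bytes
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "frag_offset": (flags_frag & 0x1FFF) * 8,    # units of 8 bytes
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

A receiver can verify a header by running `internet_checksum` over all 20 bytes including the stored checksum: an intact header yields 0.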

🍁Network Segmentation

As we said before, IP addresses are divided into network numbers and host numbers.

  • Network number: ensures that two interconnected network segments have different identifiers.
  • Host number: within the same network segment, hosts share the same network number but must have different host numbers.
  • A subnet is a group of hosts that share the same network number.
  • If a new host is added to the subnet, its network number is the same as that of the other hosts in the subnet, but its host number must differ from all the others.

With a reasonable host number and network number, it can be ensured that each host has a different IP address in the interconnected network.
The problem is that managing the IP addresses of a subnet by hand is quite troublesome.

  • A technology called DHCP can automatically assign IP addresses to new hosts in the subnet, avoiding the inconvenience of manual IP management.
  • Most routers include a DHCP function, so a router can also be regarded as a DHCP server.

In the past, a scheme for dividing network numbers and host numbers was proposed, and all IP addresses were divided into five categories, as shown in the following figure:

  • Class A 0.0.0.0 to 127.255.255.255
  • Class B 128.0.0.0 to 191.255.255.255
  • Class C 192.0.0.0 to 223.255.255.255
  • Class D 224.0.0.0 to 239.255.255.255
  • Class E 240.0.0.0 to 255.255.255.255

With the rapid development of the Internet, the limitations of this scheme quickly showed. Most organizations applied for class B network addresses, so class B addresses ran out quickly, while class A wasted a great many addresses:

  • For example, a class B network can in theory hold more than 65,000 hosts in one subnet, and a class A network holds even more.
  • In real network setups, however, a subnet never contains that many hosts, so a large number of IP addresses are wasted.

A new scheme called CIDR (Classless Inter-Domain Routing) was proposed for this situation:

  • An extra subnet mask is introduced to distinguish the network number from the host number;
  • The subnet mask is also a 32-bit value: a run of 1 bits followed by a run of 0 bits;
  • A bitwise AND of the IP address and the subnet mask gives the network number;
  • The split between network number and host number has nothing to do with whether the IP address is class A, class B, or class C.

Here are two examples of subnetting.

Example 1 of subnetting:

IP address            140.252.20.68    8C FC 14 44
Subnet mask           255.255.255.0    FF FF FF 00
Network number        140.252.20.0     8C FC 14 00
Subnet address range  140.252.20.0 ~ 140.252.20.255

The network number is the bitwise AND of the IP address and the subnet mask. The first three bytes of the mask are all 1s, so the first three bytes of every address in the subnet are the same as in the IP address; the last byte of the mask is all 0s, so the last 8 bits of a subnet address can be anything, giving the range 0~255.
Example 2 of subnetting:

IP address            140.252.20.68    8C FC 14 44
Subnet mask           255.255.255.240  FF FF FF F0
Network number        140.252.20.64    8C FC 14 40
Subnet address range  140.252.20.64 ~ 140.252.20.79

Here the last 4 bits of the subnet mask are 0, so the first 28 bits of every address in the subnet are the same as in the IP address, and the last 4 bits range from 0 to 15. The last byte of a subnet address therefore ranges from 64 to 79.

As these examples show, ANDing the IP address with the subnet mask gives the network number, and the host number running from all 0s to all 1s gives the address range of the subnet.
There is also a more concise notation for an IP address together with its subnet mask, such as 140.252.20.68/24: the IP address is 140.252.20.68 and the high 24 bits of the subnet mask are 1, that is, 255.255.255.0.
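
The two examples above can be checked with a few lines of Python. This is a minimal sketch using the standard `ipaddress` module; the helper names `network_number` and `describe` are chosen for this illustration only.

```python
import ipaddress


def network_number(ip: str, mask: str) -> str:
    """Bitwise AND of the address and the subnet mask, done by hand."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def describe(cidr: str) -> dict:
    """Same result from CIDR notation, letting the library do the work."""
    net = ipaddress.ip_interface(cidr).network
    return {
        "mask": str(net.netmask),
        "network": str(net.network_address),   # host bits all 0
        "last": str(net.broadcast_address),    # host bits all 1
    }


if __name__ == "__main__":
    print(network_number("140.252.20.68", "255.255.255.0"))
    print(describe("140.252.20.68/28"))
```

`describe("140.252.20.68/28")` corresponds to Example 2, because a /28 prefix is the mask 255.255.255.240.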

🌱 Special IP address

  • Setting all host bits of an IP address to 0 gives the network number, which represents the local area network itself;
  • Setting all host bits of an IP address to 1 gives the broadcast address, which is used to send a packet to every host on the same link;
  • Addresses of the form 127.* are used for local loopback testing, usually 127.0.0.1.

🌱 The number of IP addresses is limited

An IPv4 address is a 4-byte, 32-bit value, so there are only 2 to the 32nd power addresses, about 4.3 billion. TCP/IP requires every host to have an IP address.
Does this mean that only 4.3 billion hosts can access the network?
In fact, because some IP addresses are reserved for special purposes, the usable number is well below 4.3 billion. In addition, IP addresses are not assigned per host: each network card needs one or more IP addresses.
CIDR eases the shortage to a certain extent (it improves utilization and reduces waste, but the absolute upper limit of IP addresses does not increase), so it is still not enough. There are three further ways to address the problem:

  • Dynamic IP address assignment: IP addresses are assigned only to devices currently connected to the network, so a device with the same MAC address will not necessarily get the same IP address every time it connects;
  • NAT technology (introduced later);
  • IPv6: IPv6 is not a simple upgrade of IPv4. They are two separate protocols and are not compatible with each other. IPv6 uses 16 bytes (128 bits) to represent an IP address, but it has not yet fully replaced IPv4.

🌱Private IP address and public IP address

If an organization builds a local area network whose IP addresses are used only for communication inside that network and are not directly connected to the Internet, any IP address could be used in theory. RFC 1918, however, specifies the private IP addresses to be used for building local area networks:

  • 10.*, the first 8 bits are the network number, a total of 16,777,216 addresses
  • 172.16.* to 172.31.*, the first 12 bits are the network number, a total of 1,048,576 addresses
  • 192.168.*, the first 16 bits are the network number, a total of 65,536 addresses
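
As a quick check of the three ranges above, the sketch below writes them in CIDR form and tests whether an address falls inside any of them. The names `RFC1918_NETWORKS` and `is_rfc1918` are made up for this example. (The standard library attribute `is_private` is deliberately not used here, since it also covers other reserved ranges such as loopback.)

```python
import ipaddress

# The three private ranges, written as CIDR blocks.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """True if addr lies in one of the three private ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETWORKS)


if __name__ == "__main__":
    for net in RFC1918_NETWORKS:
        print(net, net.num_addresses)
```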

Addresses in these ranges are private IPs; the rest are called global IPs (or public IPs).
To understand public IPs, consider the three major carriers in China.
The three domestic carriers (China Telecom, China Mobile, China Unicom) build the domestic wide area network by providing the infrastructure: hardware (laying the basic equipment) and software (dividing up the network). The domestic wide area network is divided into sub-networks, and every Internet company belongs to one of them.

  • A router can be configured with two IP addresses: a WAN port IP and a LAN port IP (subnet IP). The LAN port IP faces the internal network and the WAN port IP faces the external network.
  • The hosts connected to the router's LAN port belong to that router's subnet.
  • Different routers often use the same subnet IP (commonly 192.168.1.1). IP addresses inside one subnet must not repeat, but addresses in different subnets may.
  • Each home router is in turn a node in the subnet of a carrier router. There may be many levels of carrier routers. The WAN port IP of the outermost carrier router is a public IP.
  • When a host in the subnet needs to communicate with the external network, the router replaces the source IP address in the IP header with its WAN port IP, and this happens level by level, so that the IP address in the final packet is a public IP. This technology is called NAT (Network Address Translation).
  • If we want our own server program to be reachable from the public network, we need to deploy it on a server with a public IP. Such a server can be purchased from Alibaba Cloud or Tencent Cloud.
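
The address replacement performed by NAT, as described in the list above, can be sketched as a small translation table. This is only a toy model and makes an assumption the article has not introduced yet: besides the IP address, the router also rewrites the source port so that replies can be mapped back to the right host (often called NAPT). The class name `Nat`, the starting port 40000, and the addresses are all hypothetical.

```python
class Nat:
    """Toy NAT table: maps (LAN ip, LAN port) to a port on the WAN address."""

    def __init__(self, wan_ip: str, first_port: int = 40000):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.out_map = {}    # (lan_ip, lan_port) -> wan_port
        self.back_map = {}   # wan_port -> (lan_ip, lan_port)

    def outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of a packet leaving the subnet."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.back_map[self.next_port] = key
            self.next_port += 1
        return self.wan_ip, self.out_map[key]

    def inbound(self, dst_port: int):
        """Find the LAN host for a reply; None if nobody asked for it."""
        return self.back_map.get(dst_port)
```

The `inbound` lookup returning `None` for unknown ports also shows why a host behind NAT cannot be reached from outside unless it sent something first.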

🍁 Routing

Routing refers to finding a route to an end point in a complex network structure.

Finding our way on a trip is itself a routing process. Suppose we are visiting Hangzhou, Zhejiang, and want to see the West Lake but do not know the way. We can ask a passerby for directions, and the answer may be one of the following:

  1. The passerby tells us which way to go next.
  2. The passerby does not know where the West Lake is, but the uncle in the security room nearby does, so we should ask him.
  3. The passerby tells us that this is the West Lake.
  4. The passerby does not know how to get to the West Lake and tells us not to ask him.

  • Routing is exactly this process of “asking for directions” hop by hop. A “hop” is one interval at the data link layer.
  • Specifically, in Ethernet, it is the span over which a frame travels from a source MAC address to a destination MAC address.

This is how IP packets find their way through the network hop by hop. In the first case, the lookup result says to send the packet to a specific next-hop router. In the second case, the lookup result says to send the packet to the default router. In the third case, the lookup shows the packet has reached the destination network, and it is then delivered to the destination host according to the host number. The fourth case does not occur in a network, because a router configured that way would serve no purpose.
The transmission of IP packets works the same way as asking for directions.

  • When an IP packet arrives at a router, the router first checks the destination IP;
  • The router decides whether the packet can be sent directly to the target host or must be sent to the next router;
  • This repeats until the packet reaches the target IP address.

To determine where the current data packet should be sent, it is necessary to maintain a routing table inside each node:

  • The routing table can be viewed using the route command;
  • If the destination IP matches an entry in the routing table, the packet is forwarded according to that entry;
  • The last row of the routing table is the default route, which consists mainly of a next hop address and a sending interface. When the destination address matches no other row, the packet is sent to that next hop address through the interface the default route specifies.

Suppose the network interface configuration and routing table of a host are as follows:

  • This host has two network interfaces: one is connected to the 192.168.10.0/24 network and the other to the 192.168.56.0/24 network;
  • In the routing table, Destination is the destination network address, Genmask is the subnet mask, Gateway is the next hop address, and Iface is the sending interface. The U flag in Flags means the entry is valid (entries can be disabled). The G flag means the next hop of the entry is the address of a router; an entry without the G flag describes a network directly connected to a local interface, which needs no forwarding by a router.

Forwarding process example 1: If the destination address of the packet to be sent is 192.168.56.3

  • First, AND the destination with the subnet mask 255.255.255.0 of the first row. The result is 192.168.56.0, which does not match the destination network address 192.168.10.0 of the first row.
  • Then AND it with the subnet mask of the second row. The result is again 192.168.56.0, which matches the destination network address 192.168.56.0 of the second row, so the packet is sent out through the eth1 interface.
  • Since 192.168.56.0/24 is the network directly connected to the eth1 interface, the packet can be sent straight to the destination host without being forwarded by a router.

Forwarding process example 2: If the destination address of the data packet to be sent is 202.10.1.2

  • The destination is compared with each of the earlier rows of the routing table in turn, and none of them matches.
  • Following the default route, the packet is sent out through the eth0 interface to the router at 192.168.10.1.
  • The next hop after that is decided by the 192.168.10.1 router according to its own routing table.
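
The two forwarding examples can be reproduced with a short lookup routine. The table below is reconstructed from the text of the examples, not copied from the original figure, so treat its contents as an assumption; `ROUTES` and `lookup` are names invented for this sketch. It scans the rows in order, as the article describes, whereas real kernels select the most specific (longest-prefix) match.

```python
import ipaddress

# (Destination, Genmask, Gateway, Iface); Gateway None = directly connected.
ROUTES = [
    ("192.168.10.0", "255.255.255.0", None, "eth0"),
    ("192.168.56.0", "255.255.255.0", None, "eth1"),
    ("0.0.0.0", "0.0.0.0", "192.168.10.1", "eth0"),   # default route
]


def to_int(addr: str) -> int:
    return int(ipaddress.IPv4Address(addr))


def lookup(dst: str):
    """Return (gateway, interface) of the first row that matches dst."""
    for destination, genmask, gateway, iface in ROUTES:
        if (to_int(dst) & to_int(genmask)) == to_int(destination):
            return gateway, iface
    return None


if __name__ == "__main__":
    print(lookup("192.168.56.3"))   # example 1 in the text
    print(lookup("202.10.1.2"))     # example 2 in the text
```

The default route matches everything because ANDing any address with the mask 0.0.0.0 gives 0.0.0.0, which is why it must come last in an ordered scan.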

Routing table generation: the routing table can be maintained manually by the network administrator (static routing) or generated automatically by algorithms (dynamic routing). Common approaches include distance-vector algorithms, link-state (LS) algorithms, and Dijkstra's algorithm. This article does not cover them in detail; interested readers can study them on their own.

🌵ICMP Protocol

ICMP is a network layer protocol. As mentioned above, the IP protocol does not provide reliable transmission, so when a packet is lost, IP itself cannot tell the transport layer that the packet was lost or why. Yet a newly built network usually needs a simple test to verify that it is reachable, and ICMP is the protocol that provides this function.

🍁 Main functions of ICMP protocol

  • Confirm whether the IP packet successfully reaches the destination address;
  • Notify the reason why the IP packet was dropped during transmission;
  • ICMP itself is carried in IP datagrams, but it does not provide transport layer functions, so it is still classified as a network layer protocol;
  • ICMP can only be used with IPv4. For IPv6, ICMPv6 is needed.

The following figure shows how the ICMP protocol works:
The ARP request that appears in the figure is introduced in detail in the data link layer part.

🍁ICMP message format (overview)

We will not go into the details of the ICMP message format here.
The type field divides ICMP messages roughly into two classes:

  • One class reports the cause of an error.
  • One class is used for diagnostic queries.

🍁ping command

  • Note that ping takes a domain name (or an IP address), not a URL; a domain name is resolved to an IP address through DNS.
  • The ping command not only verifies network connectivity, it also reports the response time and the TTL (Time To Live, the lifetime field in the IP packet).
  • The ping command first sends an ICMP Echo Request to the peer.
  • When the peer receives it, it returns an ICMP Echo Reply.
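
To show what an Echo Request looks like on the wire, here is a minimal sketch that builds and parses one in memory. It does not send anything (sending raw ICMP usually needs elevated privileges); the function names and the default payload are assumptions for this example. Type 8 is Echo Request and type 0 is Echo Reply.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, the same scheme the IP header uses."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes = b"ping") -> bytes:
    # Header: type, code, checksum, identifier, sequence number.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload


def parse_icmp(packet: bytes) -> dict:
    icmp_type, code, checksum, ident, seq = struct.unpack("!BBHHH", packet[:8])
    return {"type": icmp_type, "code": code, "id": ident,
            "seq": seq, "payload": packet[8:]}
```

Note that the 8-byte header contains an identifier and a sequence number but no port field.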

In an interview, the interviewer may ask: telnet uses port 23 and ssh uses port 22, so which port does ping use?
The answer is that ping is based on ICMP, which belongs to the network layer, whereas port numbers belong to the transport layer. The network layer has no concept of port numbers.

🍁traceroute command

🌴Data Link Layer

The data link layer is the last layer of the TCP/IP model introduced here. It handles transmission between two devices on the same data link and belongs to the bottom of the computer network system.
First, let's review how data flows in the network (ignoring the physical layer):
Since we are studying the data link layer, we only care about the link layer, so the flow of data from host B to host C in the figure above can be seen as data passing from left to right, as shown in the following figure:
So why have a data link layer?

🌿The meaning of the data link layer

A physical layer line consists of a transmission medium and communication devices, and errors are inevitable when a bit stream travels over the medium. A data link layer is therefore introduced above the physical layer, using methods such as error detection, error control, and flow control to provide a high-quality data transmission service to the network layer. In other words, thanks to the link layer, the network layer does not need to care which transmission medium and communication devices the physical layer uses.
The specific functions of the link layer are link management, frame synchronization, flow control, and error control. To achieve them it must separate data from control information and support transparent transmission and addressing.

🌿Ethernet

Ethernet is the most common type of computer network in the real world. It borrows the idea of a radio system in which multiple nodes send information over a shared medium: each node must acquire the cable or channel (sometimes called the ether) before it can transmit. Each node has a globally unique 48-bit address, the MAC address assigned by the manufacturer to the network card, so that all nodes on the Ethernet can identify each other. Because Ethernet is so ubiquitous, many manufacturers integrate the Ethernet card directly into the motherboard.

  • “Ethernet” is not a specific network but a technical standard; it covers the data link layer and part of the physical layer.
  • For example, it specifies the network topology, the access control method, and the transmission rate; Ethernet cabling is typically twisted pair; the transmission rate is 10M, 100M, 1000M, and so on;
  • Ethernet is currently the most widely used local area network technology; alongside Ethernet there are token ring networks, wireless LANs, and others.

🌵Ethernet Frame Format

The frame format of Ethernet is as follows:

  • The source address and destination address are the hardware addresses (also called MAC addresses) of the network cards. They are 48 bits long and are burned in when the network card leaves the factory;
  • The frame protocol type field has three values, corresponding to IP (0x0800), ARP (0x0806), and RARP (0x8035);
  • The end of the frame is a CRC check code.
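
The 14-byte Ethernet header described above (6-byte destination, 6-byte source, 2-byte type) can be unpacked with a few lines of Python. This is a sketch under one stated assumption: the frame passed in no longer carries the trailing CRC, which is the usual situation when software receives a frame from the network card. `parse_ethernet` and `format_mac` are names chosen for this example.

```python
import struct

ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the usual colon-separated hex form."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet(frame: bytes) -> dict:
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": ETHERTYPES.get(ethertype, hex(ethertype)),
        "payload": frame[14:],   # handed to IP, ARP or RARP
    }
```

The type field is what lets the receiver separate the header from the payload and deliver the payload to the right upper-layer protocol.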

🍁 Know the MAC address

The MAC address identifies the connected nodes at the data link layer.
It is 48 bits long, that is, 6 bytes, and is generally written as hexadecimal numbers separated by colons (for example: 08:00:27:03:fb:19).
The MAC address is fixed when the network card leaves the factory and normally cannot be changed. It is usually unique (the MAC address in a virtual machine is not a real hardware address and may conflict; some network cards also let the user configure the MAC address).

Simple comparison of IP address and MAC address:

  • IP addresses describe the start and end points of the entire process of data transmission.
  • The MAC address is the start and end point of each interval on this route.
    Take Tang Monk's journey to fetch the Buddhist scriptures as an example. He set out from the Tang Empire in the East and travelled to the Western Heaven. For the whole journey, the Tang Empire corresponds to the source IP address and the Western Heaven to the destination IP address. For a single leg of the trip, say from the Women's Kingdom to Xiaoleiyin Temple, the Women's Kingdom is the source MAC address and Xiaoleiyin Temple is the destination MAC address.

🍁 Meet MTU

MTU is the Maximum Transmission Unit: the largest payload, usually measured in bytes, that a data link can carry in a single frame. A sender must keep each packet within this size.
MTU is like the size limit on a parcel when sending express mail. The limit comes from the physical layer underlying each kind of data link.

  • The data part of an Ethernet frame is at least 46 bytes and at most 1500 bytes. An ARP packet is shorter than 46 bytes, so padding is appended after it;
  • The maximum of 1500 bytes is called the MTU of Ethernet; different network types and data link standards have different MTUs;
  • If a packet is routed from Ethernet onto a dial-up link and is longer than the MTU of the dial-up link, the packet must be fragmented.

🌱 The impact of MTU on IP protocol

Because of the MTU limit of the data link layer, larger IP packets have to be fragmented.

📘 A larger IP packet is divided into multiple small packets, and each small packet is labelled; the 16-bit identifier (id) in the IP header of every small packet is the same;
📘 In the 3-bit flag field of each small packet's IP header, the second bit is 0, indicating that fragmentation is allowed, and the third bit marks whether more fragments follow (1 for every small packet except the last, 0 for the last);
📘 When they reach the peer, these small packets are reassembled in order and delivered to the transport layer; if any one of them is lost, reassembly at the receiving end fails;
📘 The IP layer is not responsible for retransmitting the data; this is one reason the IP protocol is unreliable.
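
The fragmentation rules above can be sketched as a function that computes, for each fragment, the value of the 13-bit offset field, the data length, and the “more fragments” bit. The sketch assumes a 20-byte IP header without options; the function name `fragment` and its dictionary keys are invented for this illustration.

```python
def fragment(payload_len: int, mtu: int = 1500, header_len: int = 20) -> list:
    """Plan the fragments for an IP payload of payload_len bytes."""
    # Data per fragment must be a multiple of 8, because the offset
    # field counts in units of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len       # 0 only on the last fragment
        fragments.append({
            "offset_field": offset // 8,
            "data_len": size,
            "more_fragments": int(more),
        })
        offset += size
    return fragments


if __name__ == "__main__":
    for frag in fragment(4000):
        print(frag)
```

For a 4000-byte payload and an MTU of 1500, the plan has three fragments carrying 1480, 1480, and 1040 bytes, and only the last one has the “more fragments” bit cleared.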

🌱The effect of MTU on UDP protocol

The UDP protocol is unreliable: it only sends data and does not care whether the peer receives it.

  • Once the data carried by UDP exceeds 1472 bytes (1500 - 20 (IP header) - 8 (UDP header)), the datagram is split into multiple IP fragments at the network layer.
  • If any one of these fragments is lost, reassembly at the receiver's network layer fails. This means that once a UDP datagram is fragmented at the network layer, the probability of losing the entire datagram rises considerably.
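
The effect can be put into numbers with a small sketch. It assumes an Ethernet MTU of 1500, a 20-byte IP header, and, purely for illustration, that each fragment is lost independently with the same probability; the function names are made up for this example.

```python
MTU = 1500
IP_HEADER = 20
UDP_HEADER = 8


def udp_fragment_count(payload_len: int, mtu: int = MTU) -> int:
    """Number of IP fragments needed for one UDP datagram."""
    per_fragment = (mtu - IP_HEADER) // 8 * 8     # data bytes per fragment
    total = payload_len + UDP_HEADER              # UDP header travels as IP data
    return max(1, (total + per_fragment - 1) // per_fragment)


def delivery_probability(payload_len: int, loss_rate: float) -> float:
    """Chance the whole datagram arrives if every fragment must survive."""
    return (1.0 - loss_rate) ** udp_fragment_count(payload_len)


if __name__ == "__main__":
    for size in (1472, 1473, 8000):
        print(size, udp_fragment_count(size), delivery_probability(size, 0.01))
```

A payload of 1472 bytes fits in one fragment, while 1473 bytes already needs two, so a single extra byte is enough to make the datagram depend on two packets arriving instead of one.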

🌱 The impact of MTU on TCP

  • A TCP segment cannot be arbitrarily large either; it is also subject to the MTU. The maximum length of the data in a single TCP segment is called the MSS (Maximum Segment Size);
  • While establishing a connection, the two sides negotiate the MSS.
  • Ideally, the MSS is exactly the largest length at which the IP packet will not be fragmented (this length is still limited by the MTU of the data link layer).
  • When the two sides send SYN, each writes the MSS it can support into the TCP header. Once each side knows the other's MSS, the smaller value is chosen as the final MSS.
  • The MSS value is carried in the variable-length options area (up to 40 bytes) of the TCP header, as the option with kind=2.
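
The arithmetic behind the MSS can be sketched as follows, assuming 20-byte IP and TCP headers without other options and following the article's description that the smaller advertised value wins. The function names are hypothetical; the option layout (kind=2, length=4, 16-bit value) is the standard encoding of the MSS option.

```python
import struct

IP_HEADER = 20
TCP_HEADER = 20


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IP packet."""
    return mtu - IP_HEADER - TCP_HEADER


def mss_option(mss: int) -> bytes:
    """Encode the MSS as a TCP option: kind=2, length=4, 16-bit value."""
    return struct.pack("!BBH", 2, 4, mss)


def negotiate(local_mss: int, peer_mss: int) -> int:
    """Pick the smaller of the two advertised values."""
    return min(local_mss, peer_mss)


if __name__ == "__main__":
    ethernet = mss_for_mtu(1500)
    print(ethernet, mss_option(ethernet).hex())
```

For an Ethernet MTU of 1500 this gives an MSS of 1460 bytes.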

Check the hardware address and MTU:
Use the ifconfig command to view the ip address, mac address, and MTU.

🌿ARP Protocol

Although ARP is introduced here, as mentioned above it is not purely a data link layer protocol but one that sits between the data link layer and the network layer.

🌵The role of ARP protocol

ARP (Address Resolution Protocol) establishes the mapping between a host's IP address and its MAC address.

  • During network communication, the application program of the source host knows the IP address and port number of the destination host, but does not know the hardware address of the destination host;
  • The data packet is first received by the network card and then processed by the upper-layer protocol. If the hardware address of the received data packet does not match the local machine, it will be discarded directly;
  • Therefore, the hardware address of the destination host must be obtained before communication; the ARP protocol solves this problem.

🌵 ARP protocol workflow

  • The source host sends an ARP request asking “what is the hardware address of the host whose IP address is 172.20.1.2?” and broadcasts it on the local network segment (the destination hardware address in the Ethernet frame header is FF:FF:FF:FF:FF:FF, which means broadcast);
  • The destination host receives the broadcast ARP request, finds that the IP address matches its own, and sends an ARP reply to the source host with its own hardware address filled in;
  • Each host maintains an ARP cache table, which can be viewed with the arp -a command. Entries in the cache table have an expiration time (usually 20 minutes). If an entry is not used again within 20 minutes, it becomes invalid, and an ARP request is sent the next time the hardware address of that host is needed.
    [Thinking] Why is there an ARP cache table? And why do entries expire instead of staying valid forever?
    The ARP cache table exists so that when we send data again within a short period we do not have to ask for the destination host's hardware address every time. Why must entries expire? First, the mapping between an IP address and a MAC address can change, so a stale entry would send data to the wrong place. Second, because ARP requests are broadcast, hosts other than the intended one also receive them, and a wrong host can answer the request so that data is sent to it. This is “ARP spoofing”, which exploits a flaw in the design of the ARP protocol; one defence is to configure static IP-to-MAC entries. Clearing the ARP cache removes forged entries and may resolve problems such as being unable to access the Internet after an ARP attack. Finally, if the ARP cache were never cleared, it would occupy more and more memory over time.
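
The cache behaviour described above, where an entry expires if it is not used again within 20 minutes, can be modelled in a few lines. This is a toy sketch, not how any operating system implements its ARP table; the class `ArpCache` and its injectable `clock` parameter are inventions for this example (the clock makes the expiry easy to demonstrate without waiting).

```python
import time


class ArpCache:
    """Toy IP -> MAC cache whose entries expire when left unused."""

    def __init__(self, ttl_seconds: float = 20 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}            # ip -> (mac, time of last use)

    def learn(self, ip: str, mac: str) -> None:
        """Record the mapping taken from an ARP reply."""
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip: str):
        """Return the cached MAC, or None if an ARP request is needed."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        if self.clock() - last_used > self.ttl:
            del self.entries[ip]     # stale: force a fresh ARP request
            return None
        self.entries[ip] = (mac, self.clock())   # using it restarts the timer
        return mac
```

A `None` result from `lookup` is the point at which a real host would broadcast a new ARP request.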

🍁ARP datagram format

Thus, combined with the above-mentioned workflow, we can give the format of the ARP datagram:

  • Note that the source MAC address and the destination MAC address each appear twice: once in the Ethernet header and once in the ARP request itself. This is redundant when the link layer is Ethernet, but may be necessary when the link layer is another type of network.
  • The hardware type is the link-layer network type (1 is Ethernet); the protocol type is the type of address to be translated (0x0800 is an IP address); the hardware address length is 6 bytes for an Ethernet address;
  • The protocol address length is 4 bytes for an IP address; an op field of 1 indicates an ARP request, and an op field of 2 indicates an ARP reply.
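
To make the field layout concrete, here is a minimal Python sketch that packs and unpacks the 28-byte ARP payload for Ethernet and IPv4 using the values listed above. The function names and the sender address used in the usage note are assumptions made for illustration. The sketch builds only the ARP part, not the surrounding Ethernet frame, and does not send anything on the network.

```python
import socket
import struct

# hardware type, protocol type, hw addr len, proto addr len, op,
# sender MAC, sender IP, target MAC, target IP (network byte order)
ARP_FORMAT = "!HHBBH6s4s6s4s"


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw):
    return ":".join(f"{b:02x}" for b in raw)


def build_arp_request(sender_mac, sender_ip, target_ip):
    return struct.pack(
        ARP_FORMAT,
        1,                           # hardware type: Ethernet
        0x0800,                      # protocol type: IP
        6,                           # hardware address length
        4,                           # protocol address length
        1,                           # op: 1 = request, 2 = reply
        mac_to_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        b"\x00" * 6,                 # target MAC is what we are asking for
        socket.inet_aton(target_ip),
    )


def parse_arp(packet):
    htype, ptype, hlen, plen, op, smac, sip, tmac, tip = struct.unpack(
        ARP_FORMAT, packet
    )
    return {
        "hardware_type": htype,
        "protocol_type": ptype,
        "op": op,
        "sender_mac": bytes_to_mac(smac),
        "sender_ip": socket.inet_ntoa(sip),
        "target_mac": bytes_to_mac(tmac),
        "target_ip": socket.inet_ntoa(tip),
    }
```

For example, build_arp_request("00:11:22:33:44:55", "172.20.1.1", "172.20.1.2") produces the request for the host 172.20.1.2 from the workflow above (the sender values are made up), and parse_arp turns the bytes back into named fields.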

🌴Other important protocols or technologies

🌿DNS(Domain Name System)

DNS (Domain Name System) is a system that maps domain names to IP addresses.
A host's domain name has the form: host name. organization name. network name. top-level domain name. For example, in abc.de.edu.cn, abc is the host name, de is the organization name, edu is the network name, and cn is the top-level domain name.

🌵DNS Background

In TCP/IP, an IP address and a port number are used to identify a program on a host in the network. But IP addresses are not easy to remember, so people invented the hostname, which is a string, and used the hosts file to describe the relationship between hostnames and IP addresses.
Initially, the hosts file was managed by the Network Information Center (SRI-NIC).

  • If a new computer wanted to connect to the network, or a computer's IP changed, a request had to be sent to the information center to change the hosts file.
  • Other computers also had to download the new version of the hosts file regularly in order to reach hosts on the network correctly.

But this was too much trouble, so the DNS system was created:

  • Each organization's system administrators maintain the correspondence between the IP address and hostname of every host within their own system;
  • If a new computer is connected to the network, register this information in the database;
  • When the user enters the domain name, the DNS server will be automatically queried, and the DNS server will retrieve the database to obtain the corresponding IP address.

In fact, the hosts file is still kept on our computers today. During domain name resolution, the content of the hosts file is still searched first.
Note: DNS uses UDP for most queries and falls back to TCP for responses that are too large for UDP and for zone transfers. The default port for DNS is 53.
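
As a rough illustration of what a resolver sends to the DNS server, the following Python sketch builds the bytes of a standard query for an A record (the record type that maps a name to an IPv4 address). The function names and the query ID are arbitrary choices for this example, and the sketch only constructs the message; it does not contact any server.

```python
import struct


def encode_name(domain):
    """Encode 'www.baidu.com' as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in domain.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_dns_query(domain, query_id=0x1234):
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question: encoded name, QTYPE 1 (A record), QCLASS 1 (Internet).
    question = encode_name(domain) + struct.pack("!HH", 1, 1)
    return header + question
```

The resulting bytes could be sent in a single UDP datagram to port 53 of a DNS server, which is why a simple lookup needs no connection setup.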

🌵 Introduction to Domain Names

A domain name is a hierarchical name that identifies a host and the organization to which the host belongs.

www.baidu.com

  • com: the first-level (top-level) domain name. It indicates that this is a commercial domain name. At the same level there are also “net” (network provider), “org” (non-profit organization), “edu” (educational organization), etc.
  • baidu: the second-level domain name, here the company name.
  • www: just a convention. In the past, hosts were often named in a format such as ftp.xxx.xxx or www.xxx.xxx to indicate the protocol the host supported.
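
The hierarchy reads from right to left, which a short Python sketch can show. The helper name domain_levels is made up for this example.

```python
def domain_levels(name):
    """Return the labels of a domain name, highest level first."""
    labels = name.rstrip(".").split(".")
    return list(reversed(labels))


levels = domain_levels("www.baidu.com")
for depth, label in enumerate(levels, start=1):
    print(f"level {depth}: {label}")
```

For www.baidu.com, level 1 is com, level 2 is baidu, and level 3 is www.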

🍁Analyze DNS with dig tool

Install the dig tool:

yum install bind-utils

Use the dig command to view the domain name resolution process (for example, dig www.baidu.com). The output consists of the following parts:

  1. The first part shows the version number of the dig command.
  2. The second part contains the details returned by the server. The important field is status: NOERROR means the query succeeded.
  3. QUESTION SECTION shows the domain name being queried.
  4. ANSWER SECTION shows the query result. Here, www.baidu.com is first resolved to the alias www.a.shifen.com, and www.a.shifen.com is then resolved to two IP addresses.
  5. The last part contains statistics about the query, including the query time and the address of the DNS server.

🌿NAT(Network Address Translation) Technology

🌵NAT Technology Background

Earlier we discussed the shortage of IP addresses in the IPv4 protocol.
NAT technology is the main means of dealing with this shortage, and it is an important function of routers:

  • NAT converts a private IP into a global IP when communicating with the outside, and back again. In other words, it is a technique for translating between private IPs and global IPs;
  • Many schools, families, and companies assign a private IP to each terminal and set a global IP only on the router or on the servers that need one. A global IP must be unique, but a private IP does not have to be;
  • The same private IP can appear in different LANs without any conflict.
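
As a small aid, the sketch below checks whether an address falls in one of the three private IPv4 ranges reserved by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). The ranges are not listed in this section and are added here as background; the helper name is_private_ipv4 is made up for this example.

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_ipv4(ip):
    """Return True if `ip` lies in one of the RFC 1918 private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PRIVATE_NETWORKS)
```

With this helper, 10.0.0.10 is reported as private, while 202.244.174.37 is not.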

🌵NAT: IP Translation Process

  • For outgoing data, the NAT router replaces the source address 10.0.0.10 with the global IP 202.244.174.37;
  • When the NAT router receives data from outside, it replaces the destination IP 202.244.174.37 with 10.0.0.10 again;
  • Inside the NAT router there is an automatically generated table used for address translation;
  • When 10.0.0.10 sends data to 163.221.120.9 for the first time, the mapping entry in the table is generated.
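
The translation steps above can be simulated with a small Python sketch. It models packets as (source, destination) pairs and assumes a one-to-one mapping taken from a pool of global addresses; the class BasicNat and its method names are hypothetical and do not describe how a real router is implemented.

```python
class BasicNat:
    """Toy one-to-one NAT: each private IP is mapped to one global IP."""

    def __init__(self, global_pool):
        self.free = list(global_pool)   # global IPs not yet assigned
        self.out_map = {}               # private IP -> global IP
        self.in_map = {}                # global IP -> private IP

    def outbound(self, src, dst):
        """Rewrite the source address of a packet leaving the LAN."""
        if src not in self.out_map:
            if not self.free:
                raise RuntimeError("no free global address")
            # First packet from this host: create the table entry.
            global_ip = self.free.pop(0)
            self.out_map[src] = global_ip
            self.in_map[global_ip] = src
        return self.out_map[src], dst

    def inbound(self, src, dst):
        """Rewrite the destination address of a packet entering the LAN."""
        private_ip = self.in_map.get(dst)
        if private_ip is None:
            return None                 # no table entry: the packet is dropped
        return src, private_ip


nat = BasicNat(["202.244.174.37"])
print(nat.outbound("10.0.0.10", "163.221.120.9"))
print(nat.inbound("163.221.120.9", "202.244.174.37"))
```

Note that inbound returns None when no entry exists, which is the reason an outside host cannot start a conversation with an inside host on its own.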

🌵NAPT Technology

If multiple hosts in the same LAN access the same external server, the data returned by the server all has the same destination IP. So how does the NAT router determine which LAN host to forward each packet to?
This is the problem NAPT solves: it uses IP + port to establish the association.
This association is also maintained automatically by the NAT router. For example, in the case of TCP, the entry is generated when a connection is established and deleted after the connection is closed.
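
A minimal Python sketch of the IP + port association follows. The class Napt, the starting port number, and the second host 10.0.0.11 used in the usage note are assumptions made for illustration; real devices choose ports and expire entries in their own ways.

```python
import itertools


class Napt:
    """Toy NAPT table keyed on (private IP, private port)."""

    def __init__(self, global_ip, first_port=1025):
        self.global_ip = global_ip
        self.ports = itertools.count(first_port)   # next public port to hand out
        self.out_map = {}   # (private IP, private port) -> public port
        self.in_map = {}    # public port -> (private IP, private port)

    def outbound(self, src_ip, src_port):
        """Called when a connection is established from inside the LAN."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            public_port = next(self.ports)
            self.out_map[key] = public_port
            self.in_map[public_port] = key
        return self.global_ip, self.out_map[key]

    def inbound(self, dst_port):
        """Find the LAN host for a reply arriving on a public port."""
        return self.in_map.get(dst_port)

    def close(self, src_ip, src_port):
        """Called when the connection is closed: delete the entry."""
        public_port = self.out_map.pop((src_ip, src_port), None)
        if public_port is not None:
            del self.in_map[public_port]
```

If 10.0.0.10 and 10.0.0.11 both connect from local port 1025, they receive different public ports on the same global IP, so the replies can be told apart even though they share one destination IP.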

🌵Deficiencies of NAT Technology

Since NAT relies on this translation table, there are a number of limitations:

  • A connection cannot be established from outside the NAT to a host inside it. In other words, an internal client can connect to an external server, but an external host cannot directly reach an internal host, because no entry exists in the translation table until the internal host has initiated the connection.
  • Creating and destroying entries in the translation table both require additional overhead.
  • If the NAT device fails during communication, all TCP connections are broken, even if there is a hot backup.

🌵NAT and proxy server

Routers often also act as NAT devices, relaying traffic so that devices in one subnet can communicate with devices in other subnets.
A proxy server looks a bit like a NAT device. When a client sends a request to a proxy server, the proxy server forwards the request to the server the client actually wants to reach; after that server returns the result, the proxy server sends the result back to the client.
So what is the difference between NAT and a proxy server?

  • In terms of application, a NAT device is basic network equipment that addresses the shortage of IP addresses. Proxy servers are closer to specific applications: for example, proxy servers are used to get over the wall, and game accelerators also use proxy servers.
  • In terms of the underlying implementation, NAT works at the network layer and directly replaces the IP address. Proxy servers usually work at the application layer.
  • In terms of scope of use, NAT is generally deployed at the exit of the local area network, while a proxy server can be placed in the local area network, in the wide area network, or across networks.
  • In terms of deployment, NAT is generally integrated into hardware devices such as firewalls and routers, while a proxy server is a software program that is deployed on a server.

A proxy server is a widely used technology.

  • Getting over the wall: a proxy in the WAN.
  • Load balancing: a proxy in the LAN.

The proxy server is divided into forward proxy and reverse proxy:

Take a purchasing agent as an example. I want to buy an iPhone from abroad, so I ask my cousin who lives abroad to buy it in a local physical store and mail it to me. The store owner only sees my cousin buying the item, so here my cousin is a “forward proxy”.
Later, so many people asked my cousin to buy iPhones that she bought a large number of them and kept them at home. Whenever someone asked her to buy one, she mailed an iPhone directly from her stock. Here my cousin is a “reverse proxy”.

Forward proxies are used to forward requests (for example, to bypass anti-crawlers with the help of proxies).
Reverse proxies often act as a cache.
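
The “stock kept at home” idea can be sketched as a reverse proxy that caches responses. This is a deliberately simplified Python model: the class CachingReverseProxy and the backend function are hypothetical, there is no networking, and cached entries never expire.

```python
class CachingReverseProxy:
    """Toy reverse proxy: answers from its cache and asks the backend only on a miss."""

    def __init__(self, backend):
        self.backend = backend      # callable taking a path and returning a response
        self.cache = {}             # path -> cached response
        self.backend_calls = 0      # how often the real server was contacted

    def handle(self, path):
        if path not in self.cache:
            # Cache miss: forward the request to the real server and keep the result.
            self.backend_calls += 1
            self.cache[path] = self.backend(path)
        return self.cache[path]


def backend(path):
    """Stand-in for the real server behind the proxy."""
    return f"content of {path}"


proxy = CachingReverseProxy(backend)
proxy.handle("/index.html")
proxy.handle("/index.html")
print(proxy.backend_calls)
```

The second request for the same path is served from the cache, so the backend is contacted only once.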
