Notes on High-Performance Linux Server Programming

Chapter 1 TCP/IP Protocol Suite

TCP/IP protocol suite architecture and main protocols

The protocol family contains many protocols. The book covers only IP and TCP, the two with the most direct impact on network programming.

The OSI reference model has seven layers; the TCP/IP suite simplifies this to four. The layers communicate with each other through interfaces, which makes each layer easier to modify independently.

Application layer: handles application logic.

Presentation layer: defines the format and encryption of data.

Session layer: defines how to start, control, and end a session. This includes controlling and managing multiple bidirectional messages, so that the application can be notified when only part of a continuous message has completed and the data seen by the presentation layer stays continuous.

Transport layer: provides end-to-end communication for applications on two hosts. Unlike the hop-by-hop delivery used by the network layer, it only cares about the two endpoints and leaves the transfer process to the lower layers. This layer has two major protocols, TCP and UDP.

TCP (Transmission Control Protocol)

Provides a reliable, connection-oriented, stream-based service to the application layer.
Ensures delivery of data through timeout retransmission and data acknowledgment.
Has to store state for each connection: connection status, read and write buffers, and a number of timers.

UDP (User Datagram Protocol)

Provides an unreliable, connectionless, datagram-based service to the application layer.
The application generally has to handle data acknowledgment and timeout retransmission itself.
Keeps no connection state, so the destination address must be specified on every send. Each datagram has its own length.

Network layer: routes and forwards data packets. If a packet cannot be delivered to the destination directly, it is passed to the next hop (hop by hop), choosing the hop closest to the destination. The main protocols are IP (Internet Protocol) and ICMP (Internet Control Message Protocol). ICMP supplements IP and is used to check network connectivity. It has two kinds of messages: 1. error messages, which report status back; 2. query messages (the ping program uses these to determine whether the target can be reached).

Data link layer: the network driver that implements the network card interface. The driver hides the manufacturer's low-level details and only has to provide the specified interface to the upper layer. It has two protocols: ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol). The network layer addresses machines by IP address, but the data link layer uses physical addresses (usually MAC addresses), and ARP handles the conversion between them. ARP spoofing is probably related to this; not studied for now.

Encapsulation: an upper-layer protocol hands its data to the lower-layer protocol through encapsulation. Each layer adds its own header information as the data passes down. Data encapsulated by TCP becomes a TCP segment.

The kernel keeps a copy of the data in the TCP send buffer and deletes it only after it has been sent successfully and acknowledged.

Data encapsulated by UDP becomes a UDP datagram.

The kernel discards its copy as soon as the datagram has been sent.

After IP encapsulation the data becomes an IP datagram. Finally, the data link layer encapsulates it into a frame.

The maximum Ethernet frame is 1518 bytes; subtracting the 14-byte header and the 4-byte checksum at the end of the frame leaves the payload. MTU: the maximum transmission unit of a frame, generally 1500 bytes. MSS: the maximum data payload of a TCP segment, 1460 bytes = 1500 bytes - 20-byte IP header - 20-byte TCP header. The TCP header can also carry up to 40 bytes of options.
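The arithmetic above can be written down as a few constants. This is only a sketch under the sizes quoted in these notes (Ethernet frame, IPv4 header without options); the names `mtu` and `tcp_payload` are made up for illustration.

```cpp
#include <cassert>

// Sizes in bytes, as quoted in the notes above.
constexpr int kEthernetFrameMax = 1518;
constexpr int kEthernetHeader = 14;
constexpr int kEthernetChecksum = 4;
constexpr int kIpHeader = 20;   // IPv4 header without options
constexpr int kTcpHeader = 20;  // TCP header without options

// Payload a single Ethernet frame can carry.
constexpr int mtu() {
    return kEthernetFrameMax - kEthernetHeader - kEthernetChecksum;
}

// Application data that fits in one segment when the TCP header
// carries `tcp_options` bytes of options (0 to 40).
constexpr int tcp_payload(int tcp_options) {
    return mtu() - kIpHeader - kTcpHeader - tcp_options;
}

static_assert(mtu() == 1500, "MTU of a standard Ethernet frame");
static_assert(tcp_payload(0) == 1460, "MSS without TCP options");
```

With the full 40 bytes of options, the data carried per segment drops to `tcp_payload(40)`.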

ARP: the ARP protocol can translate any network-layer address into any physical address.

Chapter 2 Detailed explanation of IP protocol

The IP protocol is the core protocol of the TCP/IP protocol suite and one of the foundations of socket network programming. The IP protocol provides stateless, connectionless, and unreliable services for upper-layer protocols.

The maximum length of an IP datagram is 65535 (2^16 - 1) bytes, but it is also limited by the MTU.

When an IP datagram is longer than the MTU, it is fragmented for transmission. Fragmentation may happen at the sender or at a transit router, and a datagram may be fragmented more than once. Only on the final target machine are the fragments reassembled, by the IP module in the kernel.

routing mechanism

Given a target IP address, which routing table entry is matched? There are three steps.

Look for a host IP address in the routing table that exactly matches the destination IP address of the datagram. If found, use that routing entry; otherwise go to the next step.
Look for a network IP address in the routing table that has the same network ID as the destination IP address of the datagram. If found, use that routing entry; otherwise go to the next step.
Select the default route entry, which usually means the next hop is the gateway.
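The three steps can be sketched over a toy routing table. Everything here is hypothetical: the `Route` struct, the `ip` helper, and the `lookup` function are not kernel APIs, and picking the most specific network mask when several networks match is an assumption added for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified routing entry (addresses in host byte order).
struct Route {
    uint32_t dest;        // destination host or network
    uint32_t mask;        // 0xFFFFFFFF = host route, 0 = default route
    std::string gateway;  // label for the next hop
};

// Builds an IPv4 address from its four dotted-decimal parts.
constexpr uint32_t ip(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Returns the entry chosen by the three steps, or nullptr if nothing matches.
const Route* lookup(const std::vector<Route>& table, uint32_t dst) {
    const Route* network = nullptr;
    const Route* fallback = nullptr;
    for (const Route& r : table) {
        if ((dst & r.mask) != (r.dest & r.mask)) continue;
        if (r.mask == 0xFFFFFFFFu) return &r;  // step 1: exact host match
        if (r.mask == 0) {
            if (!fallback) fallback = &r;      // step 3: default route
        } else if (!network || r.mask > network->mask) {
            network = &r;                      // step 2: same network ID
        }
    }
    return network ? network : fallback;
}
```

A table holding a default route, a route for 192.168.1.0/24, and a host route for 192.168.1.7 resolves 192.168.1.7 to the host route, 192.168.1.9 to the network route, and 8.8.8.8 to the default route.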

Chapter 3 Detailed explanation of TCP protocol

TCP reads and writes both go through buffers, so there is no fixed correspondence between the number of read calls and the number of write calls.

UDP has no such buffering on the sending side. The receiver must read each datagram promptly or packets will be lost, and if the buffer the application passes to the read call is too small, the datagram is truncated.

ISN: the initial sequence number.
32-bit sequence number: in subsequent TCP segments, seq = ISN + the offset of the segment's first byte within the whole byte stream.
32-bit acknowledgment number: the sequence number of the received TCP segment + 1. It is sent as the response to the segment most recently received from the other side.

ACK flag: indicates whether the acknowledgment number is valid. A segment carrying the ACK flag is called an acknowledgment segment.
PSH flag: prompts the receiving application to read data from the TCP receive buffer immediately, making room for subsequent data.
RST flag: asks the other side to re-establish the connection. A segment carrying it is called a reset segment.
SYN flag: requests that a connection be established. A segment carrying it is called a synchronization segment.
FIN flag: informs the other side that the local end is closing the connection. A segment carrying it is called a finish segment.

16-bit window size: the window is the receive (advertised) window. It tells the other side how many more bytes the local TCP receive buffer can hold.
16-bit checksum: an important safeguard for reliable transmission. The sender fills it in and the receiver recomputes it to check whether the segment was damaged in transit. It is the 16-bit Internet checksum (a ones' complement sum, not a CRC) and covers both the TCP header and the data.
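A minimal sketch of the 16-bit Internet checksum (RFC 1071) that the TCP checksum field uses. The function name `internet_checksum` is made up for illustration. Real TCP also sums a pseudo-header built from the IP addresses, protocol, and length, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Ones' complement sum of 16-bit big-endian words, then inverted.
uint16_t internet_checksum(const uint8_t* data, size_t len) {
    uint32_t sum = 0;
    // Add the data two bytes at a time, high byte first.
    for (size_t i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t(data[i]) << 8) | data[i + 1];
    }
    // An odd trailing byte is padded with a zero low byte.
    if (len % 2 != 0) {
        sum += uint32_t(data[len - 1]) << 8;
    }
    // Fold the carries back into the low 16 bits.
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return uint16_t(~sum);
}
```

The receiver runs the same sum over the received bytes with the checksum field included; an undamaged segment folds to 0xFFFF before the final inversion.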

TCP connection establishment and closing

# Three-way handshake
# The client requests a connection: ISN = seq + 0 = 3683340920
# mss: maximum segment payload, 1460
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [S], seq 3683340920, win 64240,
options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

# The server accepts the connection
# ack = the seq sent by the client + 1
# The server sends its own seq at the same time
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [S.], seq 938535101, ack 3683340921, win 64240,
options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0

# A SYN segment carries no data bytes, but it still consumes one sequence number
# tcpdump normally prints ack as a relative value, i.e. 3683340921 - 3683340920 = 1
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [.], ack 938535102, win 4106, length 0


# The FIN flag asks to end the connection; it also consumes one sequence number
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [F.], seq 1, ack 1, win 4106, length 0

# The server acknowledges the close
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [.], ack 2, win 502, length 0

# The server sends its own close
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [F.], seq 1, ack 2, win 4105, length 0

# The client acknowledges
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [.], ack 2, win 503, length 0

Chapter 5 Linux Network Programming Basic API

The basic socket API lives in the sys/socket.h header file. A socket originally meant an (IP address, port) pair, which uniquely identifies one end of a TCP communication. The network information API lives in the netdb.h header file.

Host byte order and network byte order

Byte order is either big-endian or little-endian. Most PCs use little-endian byte order (the high-order byte is stored at the high address), so little-endian is also called host byte order.

To avoid confusion between machines with different byte orders, data is always transmitted in big-endian order (network byte order). Each host then decides, based on its own byte order, whether it needs to convert the data it receives.
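A small sketch of how a program can observe its own byte order and what htonl does to a value. The helper names `host_is_little_endian` and `first_wire_byte` are made up for illustration.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// True when the low-order byte of a 16-bit value is stored first in memory.
bool host_is_little_endian() {
    const uint16_t probe = 0x0102;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x02;
}

// Returns the first byte in memory after converting to network byte order.
// Network order is big-endian, so this is always the high-order byte.
unsigned char first_wire_byte(uint32_t host_value) {
    const uint32_t net = htonl(host_value);
    unsigned char bytes[sizeof net];
    std::memcpy(bytes, &net, sizeof net);
    return bytes[0];
}
```

On a little-endian PC htonl swaps the bytes; on a big-endian host it returns the value unchanged. In both cases `first_wire_byte(0x01020304)` is 0x01.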

API

basic connection

// Conversion between host byte order and network byte order
#include <netinet/in.h>
uint32_t htonl(uint32_t hostlong);  // host to network long
uint16_t htons(uint16_t hostshort); // host to network short

uint32_t ntohl(uint32_t netlong);   // network to host long
uint16_t ntohs(uint16_t netshort);  // network to host short

// IP address conversion functions
#include <arpa/inet.h>
// Converts a dotted-decimal IPv4 string into an IPv4 address in network byte order. Returns INADDR_NONE on failure
in_addr_t inet_addr(const char *strptr);

// Does the same job, but stores the result in the structure that inp points to. Returns 1 on success, 0 on failure
int inet_aton(const char *cp, struct in_addr *inp);

// Returns the address of a static buffer, so each call overwrites the previous result
char *inet_ntoa(struct in_addr in);

// src is a dotted-decimal IPv4 string or a hexadecimal IPv6 string; the result is stored in the memory at dst. af is the address family,
// AF_INET or AF_INET6. Returns 1 on success, 0 if src is not a valid address, -1 (with errno set) if af is not supported
int inet_pton(int af, const char *src, void *dst);
// Address family, address to convert, output buffer, buffer length (two constants are available: INET_ADDRSTRLEN, INET6_ADDRSTRLEN)
const char *inet_ntop(int af, const void *src, char *dst, socklen_t cnt);


// Create, name, and listen on a socket
#include <sys/types.h>
#include <sys/socket.h>
// domain: the protocol family to use, PF_INET or PF_INET6
// type: the service type, SOCK_STREAM (TCP) or SOCK_DGRAM (UDP)
// protocol: 0 for the default
// Returns the socket file descriptor on success (everything is a file on Linux), -1 on failure
int socket(int domain, int type, int protocol);

// socket: the socket file descriptor
// my_addr: the address information
// addrlen: the length of the socket address
// Returns 0 on success, -1 on failure
int bind(int socket, const struct sockaddr *my_addr, socklen_t addrlen);

// backlog: the maximum length of the queue
int listen(int socket, int backlog);
// Accepts a connection. Returns -1 on failure, the connected socket on success
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
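A short sketch of how the conversion functions fit together when preparing an address for bind or connect. The helper names `make_ipv4_address` and `format_address` are made up for illustration.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cassert>
#include <cstdint>
#include <string>

// Fills `out` with an IPv4 socket address.
// Returns false when `ip` is not a valid dotted-decimal string.
bool make_ipv4_address(const char* ip, uint16_t port, sockaddr_in& out) {
    out = sockaddr_in{};
    out.sin_family = AF_INET;
    out.sin_port = htons(port);  // the port is stored in network byte order
    return inet_pton(AF_INET, ip, &out.sin_addr) == 1;
}

// Converts the address back to "a.b.c.d:port" text, or "" on failure.
std::string format_address(const sockaddr_in& addr) {
    char text[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr.sin_addr, text, sizeof text) == nullptr) {
        return "";
    }
    return std::string(text) + ":" + std::to_string(ntohs(addr.sin_port));
}
```

The filled `sockaddr_in` is what gets cast to `sockaddr*` and passed to bind. Unlike inet_ntoa, inet_ntop writes into a caller-supplied buffer, so repeated calls do not overwrite each other.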

client

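// Initiate a connection
#include <sys/types.h>
#include <sys/socket.h>
// The third parameter is the length of the address
// Returns 0 on success, -1 on failure
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);

// Close a connection
#include <unistd.h>
// The parameter is the socket to close
// Does not necessarily close at once: it decrements the reference count of fd, and the connection is only closed when the count reaches 0 (needs checking)
int close(int fd);

// Close immediately
#include <sys/socket.h>
// The second parameter takes one of:
//	SHUT_RD   close the read half; all data in the socket receive buffer is discarded
//	SHUT_WR   close the write half; all data in the socket send buffer is sent before closing
//	SHUT_RDWR close both read and write
// Returns 0 on success, -1 on failure with errno set
int shutdown(int sockfd, int howto);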

Basic TCP

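#include <sys/socket.h>
#include <sys/types.h>

// Read data from sockfd
// buf: location of the read buffer
// len: size of the read buffer
// flags: has many possible values
// Returns the number of bytes read on success, which may be less than requested, so several reads may be needed.
// Returns 0 when the peer has closed the connection, -1 on error
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
// Send
ssize_t send(int sockfd, const void *buf, size_t len, int flags);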

option name	meaning	available for sending	available for receiving
MSG_CONFIRM	Instructs the link layer protocol to keep listening until a reply is received. (Only for SOCK_DGRAM and SOCK_RAW sockets)	Y	N
MSG_DONTROUTE	Sends the data directly to a host on the local LAN without consulting the routing table (the sender knows the target host is on the local network)	Y	N
MSG_DONTWAIT	Makes this one operation non-blocking	Y	Y
MSG_MORE	Tells the kernel that more data is coming, so it waits for that data to be written into the buffer and sends everything together. This reduces short segments and improves transmission efficiency.	Y	N
MSG_WAITALL	The read returns only after the requested number of bytes has been read.	N	Y
MSG_PEEK	Looks at the data in the receive buffer without removing it	N	Y
MSG_OOB	Sends or receives urgent (out-of-band) data	Y	Y
MSG_NOSIGNAL	Writing to a pipe or socket whose read end is closed does not raise the SIGPIPE signal.	Y	N
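Because recv may return fewer bytes than requested, callers that need an exact amount usually loop. A minimal sketch, assuming a blocking socket; the name `recv_all` is made up for illustration.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Reads until `len` bytes have arrived, the peer closes, or an error occurs.
// Returns the number of bytes read (less than `len` only if the peer closed),
// or -1 on error.
ssize_t recv_all(int sockfd, void* buf, size_t len) {
    char* out = static_cast<char*>(buf);
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sockfd, out + got, len - got, 0);
        if (n == 0) break;  // peer closed the connection
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        got += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(got);
}
```

Passing MSG_WAITALL to a single recv call gives similar behavior on a blocking socket, but the loop makes the short-read handling explicit.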

Basic UDP

#include <sys/types.h>
#include <sys/socket.h>
// UDP keeps no connection state, so the destination address has to be given on every send.
// recvfrom and sendto can also be used on connection-oriented (STREAM) sockets; in that case the peer socket address can be omitted
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen);

General read and write functions

#include <sys/socket.h>
ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

struct msghdr
{
	/* socket address: points to a socket address structure; must be NULL for TCP connections */
	void *msg_name;
	socklen_t msg_namelen;

	/* scattered memory blocks: for recvmsg, the data read is stored in these blocks, whose
	 * locations and lengths are given by the array msg_iov points to; this is called a scatter read.
	 * For sendmsg, the data in the msg_iovlen blocks is sent together; this is called a gather write.
	 */
	struct iovec *msg_iov;
	int msg_iovlen;           /* number of scattered memory blocks */
	void *msg_control;        /* start of the ancillary data */
	socklen_t msg_controllen; /* size of the ancillary data */
	int msg_flags;            /* copy of the flags argument, updated during the call */
};

struct iovec
{
	void *iov_base; /* start address of the memory block */
	size_t iov_len; /* length of the memory block */
};
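A minimal gather-write sketch with sendmsg: two separate buffers leave the process in one call. The function name `send_two_parts` is made up for illustration, and the socket is assumed to be connected, so msg_name stays NULL.

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cassert>
#include <cstring>

// Sends `header` followed by `body` with a single sendmsg call.
// Returns the number of bytes sent, or -1 on error.
ssize_t send_two_parts(int sockfd, const char* header, const char* body) {
    iovec parts[2];
    parts[0].iov_base = const_cast<char*>(header);  // sendmsg only reads from it
    parts[0].iov_len = std::strlen(header);
    parts[1].iov_base = const_cast<char*>(body);
    parts[1].iov_len = std::strlen(body);

    msghdr msg{};  // zero-initialized: no address, no ancillary data
    msg.msg_iov = parts;
    msg.msg_iovlen = 2;
    return sendmsg(sockfd, &msg, 0);
}
```

A typical use is sending a protocol header and a payload that live in different buffers without copying them into one first.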

Other APIs

#include <sys/socket.h>
// Tells whether sockfd is at the out-of-band mark, i.e. whether the next data to be read is out-of-band data.
// Returns 1 if so, 0 if not.
// In that case recv can be called with the MSG_OOB flag to receive the out-of-band data.
int sockatmark(int sockfd);

// getsockname: stores the local socket address of sockfd in the memory at address, and its length in address_len. Returns 0 on success, -1 on failure
// getpeername: the same, for the remote end
int getsockname(int sockfd, struct sockaddr *address, socklen_t *address_len);
int getpeername(int sockfd, struct sockaddr *address, socklen_t *address_len);

/* The following functions use the same header file */

// sockfd: the target socket; level: the protocol the option belongs to (IPv4, IPv6, TCP); option_name: the name of the option; then its value and length
// Returns 0 on success, -1 on failure
int getsockopt(int sockfd, int level, int option_name, void *option_value,
						socklen_t *restrict option_len);
int setsockopt(int sockfd, int level, int option_name, const void *option_value,
						socklen_t option_len);

SO_REUSEADDR	Reuse local address	With this option set, a socket can bind to an address immediately even if a previous socket bound to that address is still in the TIME_WAIT state.
SO_RCVBUF	TCP receive buffer size	The minimum value is 256 bytes. The system doubles the value you set; the extra half is used as spare buffer space to deal with congestion.
SO_SNDBUF	TCP send buffer size	The minimum value is 2048 bytes.
SO_RCVLOWAT	Receive low-water mark	The default is 1 byte. When the amount of readable data in the TCP receive buffer exceeds the low-water mark, the I/O multiplexing system call notifies the application that the socket can be read.
SO_SNDLOWAT	Send low-water mark	The default is 1 byte. Data can be written when the free space in the TCP send buffer exceeds the low-water mark.
SO_LINGER

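struct linger
{
	int l_onoff;  /* non-zero to enable, 0 to disable */
	int l_linger; /* linger time */
	/*
	* When l_onoff is 0 this option has no effect and close() closes the socket with its default behavior.
	* When l_onoff is non-zero and l_linger is 0, close() returns immediately; TCP discards the data left in the send buffer and sends a reset segment.
	* When l_onoff is non-zero and l_linger is greater than 0: on a blocking socket, close() waits for the TCP module to send the remaining
	* data and receive the acknowledgment before closing; on a non-blocking socket it returns immediately.
	*/
};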
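A sketch of how these options are set and read back through setsockopt and getsockopt at the SOL_SOCKET level. The helper names are made up for illustration.

```cpp
#include <sys/socket.h>
#include <cassert>

// Allows bind() to reuse an address that is still in TIME_WAIT.
bool enable_reuseaddr(int sockfd) {
    int on = 1;
    return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == 0;
}

// l_onoff != 0 and l_linger == 0: close() drops unsent data and sends RST.
bool set_abortive_close(int sockfd) {
    linger lg{};
    lg.l_onoff = 1;
    lg.l_linger = 0;
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) == 0;
}

// Reads an int-valued SOL_SOCKET option; returns -1 on failure.
int get_int_option(int sockfd, int option_name) {
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(sockfd, SOL_SOCKET, option_name, &value, &len) != 0) {
        return -1;
    }
    return value;
}
```

SO_REUSEADDR has to be set before bind is called. Reading SO_RCVBUF back with `get_int_option` after setting it is an easy way to see the doubling described in the table.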

Network Information API

#include <netdb.h>
// Look up a host by name
struct hostent *gethostbyname(const char *name);

// Get the complete host information from an IP address
// type is the IP address type, AF_INET or AF_INET6
struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);

struct hostent
{
  char *h_name;       /* Official name of host.  */
  char **h_aliases;   /* Alias list.  */
  int h_addrtype;     /* Host address type.  */
  int h_length;       /* Length of address.  */
  char **h_addr_list; /* List of addresses from name server.  */
};

#include <arpa/inet.h>
#include <netdb.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("invalid input\n");
        exit(1);
    }
    char *name = argv[1];

    struct hostent *hostptr{};

    hostptr = gethostbyname(name);
    if (hostptr == nullptr)
    {
        printf("the input is wrong or the host cannot be resolved\n");
        exit(1);
    }

    printf("Official name of hostptr: %s\n", hostptr->h_name);

    char **pptr;
    char addr_text[INET_ADDRSTRLEN]; // renamed so it does not shadow inet_addr()

    printf("Alias list:\n");
    for (pptr = hostptr->h_aliases; *pptr != nullptr; ++pptr)
    {
        printf("\t%s\n", *pptr);
    }

    switch (hostptr->h_addrtype)
    {
        case AF_INET:
        {
            printf("List of addresses from name server:\n");
            for (pptr = hostptr->h_addr_list; *pptr != nullptr; ++pptr)
            {
                printf("\t%s\n",
                        inet_ntop(hostptr->h_addrtype, *pptr, addr_text, sizeof(addr_text)));
            }
            break;
        }
        default:
        {
            printf("unknown address type\n");
            exit(1);
        }
    }
    return 0;
}

/*
./run baidu.com
Official name of hostptr: baidu.com
Alias list:
List of addresses from name server:
	39.156.69.79
	220.181.38.148
*/

The following two functions obtain service information by reading the /etc/services file. The description below is from Wikipedia.

The services file is a configuration file in the /etc directory of modern operating systems. It records the port number and protocol that correspond to each network service name. It serves two purposes:

Through the TCP/IP API functions (declared in netdb.h), a program can look up the mapping between network service name, port number, and protocol directly. For example, getservbyname("serve", "tcp") gets the port number, and getservbyport(htons(port), "tcp") gets the name of the service on that port and protocol.
If the user records every network service name, port, and protocol in use in this file, it shows at a glance which port numbers are taken by which service and which are free.

#include <netdb.h>
// Get the complete information for a service by name
struct servent *getservbyname(const char *name, const char *proto);

// Get the service information by port number
struct servent *getservbyport(int port, const char *proto);

struct servent
{
	char *s_name;     /* service name */
	char **s_aliases; /* list of aliases for the service */
	int s_port;       /* port number */
	char *s_proto;    /* service type, usually TCP or UDP */
};

#include <netdb.h>
// Uses gethostbyname and getservbyname internally
// hostname: a host name, or an IP address as a string (dotted decimal, or a hexadecimal string)
// service: a service name, or a decimal port number as a string
// hints: gives finer control over the output of getaddrinfo; may be NULL, which lets getaddrinfo return any useful result
// result: points to a linked list that stores the results of getaddrinfo
int getaddrinfo(const char *hostname, const char *service, const struct addrinfo *hints, struct addrinfo **result);

struct addrinfo
{
	int ai_flags;
	int ai_family;
	int ai_socktype; /* service type, SOCK_STREAM or SOCK_DGRAM */
	int ai_protocol;
	socklen_t ai_addrlen;
	char *ai_canonname; /* canonical name of the host */
	struct sockaddr *ai_addr; /* points to the socket address */
	struct addrinfo *ai_next; /* points to the next structure */
};

// The heap memory of the result list has to be released manually
void freeaddrinfo(struct addrinfo *res);

#include <netdb.h>
// host: buffer that receives the host name
// serv: buffer that receives the service name

int getnameinfo(const struct sockaddr *sockaddr, socklen_t addrlen, char *host, socklen_t hostlen, char *serv,
	socklen_t servlen, int flags);
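A sketch that ties getaddrinfo, getnameinfo, and freeaddrinfo together. To keep it independent of DNS and /etc/services, it uses the numeric-only flags, so the host must be an IP string and the service a port number. The function name `resolve_numeric` is made up for illustration.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <cassert>
#include <string>

// Parses a numeric host and port with getaddrinfo, then formats the first
// result with getnameinfo. Returns "host:port", or "" on failure.
std::string resolve_numeric(const char* host, const char* service) {
    addrinfo hints{};
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;  // no name lookups

    addrinfo* result = nullptr;
    if (getaddrinfo(host, service, &hints, &result) != 0) {
        return "";
    }

    char hostbuf[NI_MAXHOST];
    char servbuf[NI_MAXSERV];
    std::string text;
    if (getnameinfo(result->ai_addr, result->ai_addrlen,
                    hostbuf, sizeof hostbuf, servbuf, sizeof servbuf,
                    NI_NUMERICHOST | NI_NUMERICSERV) == 0) {
        text = std::string(hostbuf) + ":" + servbuf;
    }
    freeaddrinfo(result);  // the list is heap memory owned by the caller
    return text;
}
```

Dropping the numeric flags lets the same code accept host names and service names, at the cost of depending on the resolver configuration.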

Test use

telnet ip port # connect to this port on the server
netstat -nt | grep port # check the state of this port

Chapter 6 Advanced IO Functions

The advanced I/O functions that Linux provides are more powerful than the basic ones, but only under specific conditions, which naturally limits how often they are used.

A file descriptor is a non-negative integer: an index into the table of open files that the kernel maintains for each process. STDOUT_FILENO (value 1) is the file descriptor for standard output. After closing STDOUT_FILENO, dup returns the smallest available value (now 1), so standard output is redirected to the file referred to by the descriptor passed to dup.

Create file descriptor - pipe dup dup2 splice select

pipe function This function can be used to create a pipe to implement communication between processes.

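// Function definition
// The parameter is an array of two file descriptors: fd[0] is the read end, fd[1] the write end. The pipe is one-way
// Returns 0 on success and fills the array with a pair of open file descriptors
// Returns -1 on failure and sets errno
#include <unistd.h>
int pipe(int fd[2]);

// Two-way pipe
// The first parameter is the protocol family PF_UNIX (the book uses AF_UNIX); PF seems the better fit when naming a protocol family
#include <sys/types.h>
#include <sys/socket.h>
int socketpair(int domain, int type, int protocol, int fd[2]);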

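Example added after studying inter-process communication: the child process sends a string to the parent over a socketpair.

#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
    int fds[2];
    if (socketpair(PF_UNIX, SOCK_STREAM, 0, fds) == -1)
    {
        perror("socketpair");
        return 1;
    }
    pid_t pid = fork();
    if (pid == 0)
    {
        // child: write through fds[1]
        close(fds[0]);
        char a[] = "123";
        send(fds[1], a, strlen(a), 0);
        close(fds[1]);
    }
    else if (pid > 0)
    {
        // parent: read through fds[0], leaving room for the terminating '\0'
        close(fds[1]);
        char b[20]{};
        recv(fds[0], b, sizeof(b) - 1, 0);
        printf("%s\n", b);
        close(fds[0]);
        wait(nullptr);
    }
    return 0;
}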

dup and dup2 functions Copy an existing file descriptor

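#include <unistd.h>
// The returned file descriptor is always the smallest integer currently available in the process
int dup(int oldfd);
// newfd specifies the new file descriptor; if newfd is already open it is closed first
// If newfd == oldfd, newfd is not closed and is returned directly
int dup2(int oldfd, int newfd);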

The dup function creates a new file descriptor that refers to the same target as the original one. In the example below, STDOUT_FILENO is closed first, so the smallest value dup can return is STDOUT_FILENO, and standard output ends up in the file.

#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

int main()
{
    int filefd = open("/home/lsmg/1.txt", O_WRONLY);
    close(STDOUT_FILENO);
    dup(filefd); // returns 1, the lowest free descriptor
    printf("123\n");
    exit(0);
}

Read and write data - readv writev mmap munmap

readv/writev

#include <sys/uio.h>
// count: the length of vector, i.e. how many memory blocks there are
// Returns the number of bytes read or written on success, -1 on failure
ssize_t readv(int fd, const struct iovec *vector, int count);
ssize_t writev(int fd, const struct iovec *vector, int count);

struct iovec {
	void *iov_base; /* start address of the memory block */
	size_t iov_len; /* length of the memory block */
};

Usage example: this writes the in-memory representation of an int to a file. Viewing the file with hexdump shows 0000000 86a0 0001; 0x186a0 is 100000.

// 2020-01-07 16:52:11
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>

int main()
{
    int file = open("/home/lsmg/1.txt", O_WRONLY);
    if (file < 0)
    {
        return 1;
    }
    int temp = 100000;
    iovec temp_iovec{};
    temp_iovec.iov_base = &temp;
    temp_iovec.iov_len = sizeof(temp);
    writev(file, &temp_iovec, 1);
    close(file);
}

sendfile function

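#include <sys/sendfile.h>
// offset: where to start reading in the input; if NULL, reading starts at the current file offset of in_fd (the beginning for a freshly opened file)
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

// O_RDONLY: read-only mode
// O_WRONLY: write-only mode
// O_RDWR: read-write mode
#include <fcntl.h>
int open(const char *pathname, int flags);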

The stat structure can be filled in by fstat; it is essentially the file's ID card.

#include <sys/stat.h>
struct stat
{
    dev_t       st_dev;     /* ID of device containing file */
    ino_t       st_ino;     /* inode number */
    mode_t      st_mode;    /* protection (file type and mode) */
    nlink_t     st_nlink;   /* number of hard links */
    uid_t       st_uid;     /* user ID of owner */
    gid_t       st_gid;     /* group ID of owner */
    dev_t       st_rdev;    /* device ID (if special file) */
    off_t       st_size;    /* total size, in bytes */
    blksize_t   st_blksize; /* blocksize for filesystem I/O */
    blkcnt_t    st_blocks;  /* number of blocks allocated */
    time_t      st_atime;   /* time of last access */
    time_t      st_mtime;   /* time of last modification */
    time_t      st_ctime;   /* time of last status change */
};

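Functions that fill in the ID card

// The first function takes a file descriptor obtained from open
// The other two take the full path of the file
int fstat(int filedes, struct stat *buf);

// When the path is a symbolic link, lstat returns information about the link itself; stat returns information about the file the link points to
int stat(const char *path, struct stat *buf);
int lstat(const char *path, struct stat *buf);

/*
* ln -s source dist  creates a soft link, similar to a shortcut; also called a symbolic link
* ln source dist  creates a hard link: several different names for the same file, all pointing to the same file data block. As long as
* not every hard link has been deleted, the file stays accessible
* File data block: the real data of a file is a data block, and an opened `file` points to this block. In other words,
* a `file` itself is like a shortcut pointing to the area where the file's data lives.
*/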

mmap and munmap functions

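mmap requests a region of memory that can be used as shared memory for inter-process communication (files can also be mapped into it); munmap releases it.

#include <sys/mman.h>

// start: start address of the memory; if NULL the system picks an address. length: the length of the region
// prot: PROT_READ (readable), PROT_WRITE (writable), PROT_EXEC (executable), PROT_NONE (no access)
// flags: what happens after the memory is modified
// - MAP_SHARED: shared between processes; modifications are reflected in the mapped file
// - MAP_PRIVATE: private to the calling process; modifications are not reflected in the file
// - MAP_ANONYMOUS: not mapped from a file; contents are initialized to 0 and the last two parameters are ignored
// Returns a pointer to the region on success, MAP_FAILED ((void *)-1) on failure
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
// Returns 0 on success, -1 on failure
int munmap(void *start, size_t length);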
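A sketch of mmap used as shared memory between a parent and a child process. It assumes Linux, where MAP_SHARED | MAP_ANONYMOUS gives a region that survives fork as shared rather than copied. The function name `share_int_with_child` is made up for illustration.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// The child writes `value` into a shared anonymous mapping and the parent
// reads it back. Returns the value seen by the parent, or -1 on failure.
int share_int_with_child(int value) {
    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    int* shared = static_cast<int*>(mem);
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        *shared = value;  // visible to the parent because of MAP_SHARED
        _exit(0);
    }
    waitpid(pid, nullptr, 0);  // wait until the child has written
    int seen = *shared;
    munmap(mem, sizeof(int));
    return seen;
}
```

With MAP_PRIVATE instead of MAP_SHARED the child would write to its own copy and the parent would still read 0.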

splice function: moves data between two file descriptors as a zero-copy operation

#include <fcntl.h>
// fd_in: the input file descriptor; if it is a pipe, off_in must be NULL, otherwise off_in is the offset to start reading from
// len: the number of bytes to move; flags controls how the data is moved
// - SPLICE_F_NONBLOCK: non-blocking splice, still subject to the blocking mode of the file descriptors themselves
// - SPLICE_F_MORE: a hint to the kernel that later splice calls will read more data
ssize_t splice(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags);

// Echo server implemented with splice
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdio>
#include <cstdlib>

int main(int argc, char *argv[])
{
    if (argc <= 2)
    {
        printf("the parameters are wrong\n");
        exit(1);
    }
    char *ip = argv[1];

    int port = atoi(argv[2]);
    printf("the port is %d the ip is %s\n", port, ip);

    int sockfd = socket(PF_INET, SOCK_STREAM, 0);
    assert(sockfd >= 0);

    struct sockaddr_in address{};
    address.sin_family = AF_INET;
    address.sin_port = htons(port);
    inet_pton(AF_INET, ip, &address.sin_addr);

    int ret = bind(sockfd, (sockaddr *)&address, sizeof(address));
    assert(ret != -1);

    ret = listen(sockfd, 5);
    assert(ret != -1);

    int clientfd{};
    sockaddr_in client_address{};
    socklen_t client_addrlen = sizeof(client_address);

    clientfd = accept(sockfd, (sockaddr *)&client_address, &client_addrlen);
    if (clientfd < 0)
    {
        printf("accept error\n");
    }
    else
    {
        printf("a new connection from %s:%d success\n", inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));
        int fds[2];
        ret = pipe(fds);
        assert(ret != -1);
        // move the client's data into the pipe, then from the pipe back to the client
        ret = splice(clientfd, nullptr, fds[1], nullptr, 32768, SPLICE_F_MORE);
        assert(ret != -1);

        ret = splice(fds[0], nullptr, clientfd, nullptr, 32768, SPLICE_F_MORE);
        assert(ret != -1);

        close(clientfd);
    }
    close(sockfd);
    exit(0);
}

select function: select returns when a descriptor in its read set (the second parameter) becomes readable, or when the specified timeout expires.

On return, the set that the second parameter points to has been modified so that it contains only the readable descriptors. The set therefore has to be rebuilt before every call.

The return value is the number of ready descriptors. The caller walks over its descriptors and uses FD_ISSET to test whether each one is in the set, then checks whether it is listenfd. If so, it accepts the new connection. If not, the descriptor is an already accepted connection, and the caller checks whether there is data to read or the connection has been closed.

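#include <sys/select.h>
// maxfdp: the highest descriptor number to check plus 1; the upper limit is FD_SETSIZE
// fd_set: a set that can hold multiple file descriptors
// - FD_ZERO(&fd_set) clears the set; FD_SET(fd, &fd_set) adds fd; FD_CLR(fd, &fd_set) removes fd
// - FD_ISSET(fd, &fd_set) tests whether fd is in the set
// readfds: descriptors to watch for reading; select returns when one of them becomes readable
// writefds: descriptors to watch for writing; select returns when one of them becomes writable
// errorfds: descriptors to watch for errors
// timeout: NULL blocks indefinitely; 0 seconds and 0 microseconds makes the call non-blocking
// Return value: negative on error, 0 if the timeout expired with no change, positive if something changed
int select(int maxfdp, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);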

#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Prints the message and errno details, then exits, when r is true
#define exit_if(r, ...) \
    do \
    { \
        if (r) \
        { \
            printf(__VA_ARGS__); \
            printf("errno no: %d, error msg is %s\n", errno, strerror(errno)); \
            exit(1); \
        } \
    } while (0)

int main(int argc, char *argv[])
{
    int keyboard_fd = open("/dev/tty", O_RDONLY | O_NONBLOCK);
    exit_if(keyboard_fd < 0, "open keyboard fd error\n");
    fd_set readfd;
    char recv_buffer = 0;

    while (true)
    {
        // the set is modified by select, so rebuild it on every iteration
        FD_ZERO(&readfd);
        FD_SET(keyboard_fd, &readfd);

        timeval timeout{5, 0};

        int ret = select(keyboard_fd + 1, &readfd, nullptr, nullptr, &timeout);
        exit_if(ret == -1, "select error\n");
        if (ret > 0)
        {
            if (FD_ISSET(keyboard_fd, &readfd))
            {
                recv_buffer = 0;
                read(keyboard_fd, &recv_buffer, 1);
                if ('\n' == recv_buffer)
                {
                    continue;
                }
                if ('q' == recv_buffer)
                {
                    break;
                }
                printf("the input is %c\n", recv_buffer);
            }

        }
        if (ret == 0)
        {
            printf("timeout\n");
        }
    }
}

Chapter 7 Linux Server Program Specifications

Linux server programs generally run as background processes, also called daemons. A daemon has no controlling terminal, so it cannot receive user input by accident. Its parent process is usually the init process (the process with PID 1).
Linux server programs have a logging system that can at least write logs to files. Logs matter a great deal: troubleshooting and comparison both depend on them.
Linux server programs generally run under a dedicated non-root identity. For example, mysqld has its own account, mysql.
Linux server programs generally have their own configuration files instead of hard-coding every setting, which makes later changes easier.
Linux server programs usually write a PID file to the /var/run directory at startup to record the PID of the background process.
Linux server programs usually need to take system resources and limits into account and estimate how much load they can handle.

log

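sudo service rsyslog restart // start the daemon

#include <syslog.h>
// priority is the bitwise OR of the facility value (records where the log message comes from, LOG_USER by default) and the log level
// - 0 LOG_EMERG   /* system is unusable */
// - 1 LOG_ALERT   /* alert, action must be taken immediately */
// - 2 LOG_CRIT    /* critical conditions */
// - 3 LOG_ERR     /* error */
// - 4 LOG_WARNING /* warning */
// - 5 LOG_NOTICE  /* notice */
// - 6 LOG_INFO    /* information */
// - 7 LOG_DEBUG   /* debug */
void syslog(int priority, const char *message, ...);

// ident: a string placed after the timestamp in each log line, usually the program name
// logopt: configures the behavior of later syslog calls
// - 0x01 LOG_PID    /* include the program PID in log messages */
// - 0x02 LOG_CONS   /* print to the terminal if the message cannot be written to the log file */
// - 0x04 LOG_ODELAY /* delay opening the log until the first syslog call */
// - 0x08 LOG_NDELAY /* open the log without delay */
// facility: changes the default facility value used by syslog
void openlog(const char *ident, int logopt, int facility);

// maskpri has eight bits: 0000-0000
// Setting the lowest bit to 1 records level 0 logs
// Setting the lowest two bits to 1 records level 0 and level 1 logs
// It can be built with the LOG_MASK() macro: LOG_MASK(LOG_CRIT) sets the third-lowest bit to 1, so only LOG_CRIT is recorded
// Calling setlogmask(3) directly: the two lowest bits of 3 are both 1, so level 0 and level 1 logs are recorded
int setlogmask(int maskpri);

// Turn off logging
void closelog();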
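A short sketch of the logging calls used together. The helper names `log_up_to` and `log_startup`, the message text, and the choice of LOG_USER are made up for illustration. LOG_UPTO is the glibc macro that builds a mask covering every level up to the given one.

```cpp
#include <syslog.h>
#include <cassert>

// Records only messages at `max_level` or more severe.
// Returns the previous mask, as setlogmask does.
int log_up_to(int max_level) {
    return setlogmask(LOG_UPTO(max_level));
}

// Opens the log, writes two messages, and closes it again.
void log_startup(const char* ident, int port) {
    openlog(ident, LOG_PID | LOG_CONS, LOG_USER);
    syslog(LOG_INFO, "listening on port %d", port);
    // Dropped when the mask stops at LOG_INFO, because LOG_DEBUG is level 7.
    syslog(LOG_DEBUG, "debug details for port %d", port);
    closelog();
}
```

Calling `log_up_to(LOG_INFO)` before `log_startup("demo", 8000)` keeps the first message and filters the second.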

User information, switch users

UID: real user ID
EUID: effective user ID, used to make resource access easier
GID: real group ID
EGID: effective group ID

 #include <sys/types.h>
#include <unistd.h>

uid_t getuid ();
uid_t geteuid ();
gid_t getgid ();
gid_t getegid ();
int setuid ( uid_t uid );
int seteuid ( uid_t euid );
int setgid ( gid_t gid );
int setegid ( gid_t gid );

You can switch users with setuid and setgid. The root user's uid and gid are both 0.
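A sketch of how a server started as root might drop to an ordinary user. The function name `switch_to_user` and its exact rules (refusing root as a target, treating "already that user" as success) are assumptions made for this sketch.

```cpp
#include <sys/types.h>
#include <unistd.h>
#include <cassert>

// Switches the process to the given user and group.
// Returns true if the process now runs as that user.
bool switch_to_user(uid_t user_id, gid_t group_id) {
    // Root is never accepted as the target.
    if (user_id == 0 && group_id == 0) return false;

    const uid_t uid = getuid();
    const gid_t gid = getgid();
    // A non-root process can only "switch" to what it already is.
    if ((uid != 0 || gid != 0) && (uid != user_id || gid != group_id)) {
        return false;
    }
    if (uid != 0) return true;  // already the target user

    // Change the group first: after setuid the process could no longer do it.
    if (setgid(group_id) < 0 || setuid(user_id) < 0) return false;
    return true;
}
```

The order matters: once setuid has given up root, setgid would fail, so the group is changed first.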

Inter-process relationship

PGID: process group ID (every process on Linux belongs to a process group)

#include <unistd.h>
pid_t getpgid(pid_t pid); // returns the PGID of the group that pid belongs to on success, -1 on failure
int setpgid(pid_t pid, pid_t pgid);

Session: several related process groups form a session. Skipped.

Viewing process relationships: ps and less.

Resource limits and changing directories: skipped.

Chapter 8 High-Performance Server Program Framework

Server model-CS model

Advantage

Simple to implement.

Disadvantage

The server is the center of communication. If the number of requests is too large, responses become too slow.

Pattern diagram

The demo written does not use the fork function. It will be improved in the future.

Server framework IO model

I can probably understand this model, and I have studied Javaweb for half a year.

A socket is blocking by default when it is created. Passing the SOCK_NONBLOCK flag to socket() makes it non-blocking. A non-blocking call always returns immediately, even if the event has not occurred yet (for example, recv has received nothing). Both "the event has not occurred" and a real error return -1, so the two cases must be told apart by errno. When the event has not occurred, accept, send and recv set errno to EAGAIN ("try again") or EWOULDBLOCK ("operation would block"), and connect sets it to EINPROGRESS ("in progress").

Non-blocking I/O only improves performance when the call is made after the event has already occurred, so it is usually combined with an I/O notification mechanism such as I/O multiplexing or the SIGIO signal.

The commonly used I/O multiplexing functions select, poll and epoll_wait are explained in Chapter 9. Signals are explained in Chapter 10.
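The errno check described above can be sketched with a pipe, which behaves like a socket in this respect. The helper names and the `-2` "no data yet" return value are conventions invented for this example.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Put fd into non-blocking mode; returns the old flags, or -1 on error.
int set_nonblocking(int fd)
{
    int old_flags = fcntl(fd, F_GETFL);
    if (old_flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, old_flags | O_NONBLOCK) == -1)
        return -1;
    return old_flags;
}

// Returns the number of bytes read, 0 on end of file, -1 on a real error,
// or -2 (a value chosen for this sketch) when no data is available yet.
ssize_t try_read(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2; // the event has not occurred, this is not a failure
    return n;
}
```

On an empty pipe `try_read` returns -2 immediately instead of blocking; once the writer has sent data it returns the byte count, and after the writer closes it returns 0.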

Two efficient event processing modes and concurrency mode

Programs are either compute-intensive (heavy CPU use, little I/O) or I/O-intensive (the reverse). Concurrent programming lowers the efficiency of the former and raises that of the latter. It is implemented with either multiple processes or multiple threads.

Concurrency mode: the way tasks are coordinated between the I/O processing unit and multiple logic units. Servers have two main concurrency modes:

Half-sync/half-async mode
Leader/follower mode

Half-sync/half-async mode

In the I/O model, synchronous and asynchronous differ in which kind of I/O event the kernel reports to the application (ready events or completion events) and in who performs the I/O reads and writes (the application or the kernel).

In the concurrency mode, synchronous means the program executes strictly in the order of its code; threads that run this way are called synchronous threads. Asynchronous means execution is driven by system events such as interrupts and signals; threads that run this way are called asynchronous threads.

A server needs good real-time behavior and must handle many client requests at once, so it generally uses synchronous and asynchronous threads together, which is the half-sync/half-async mode.
Synchronous threads handle client logic: they process the request objects in the request queue.
Asynchronous threads handle I/O events: after receiving a client request, they wrap it in a request object and insert it into the request queue.

A variant of the half-sync/half-async mode is the half-sync/half-reactive mode.

In it the only asynchronous thread is the main thread, which monitors events on all sockets.
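The request queue between the two kinds of threads can be sketched as follows. This is only an illustration under simplifying assumptions: requests are plain integers, the calling thread stands in for the asynchronous I/O thread, and the names `RequestQueue` and `run_half_sync_half_async` are invented for the example.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical request queue shared by the I/O thread and the worker threads.
class RequestQueue {
public:
    void push(int request) {
        { std::lock_guard<std::mutex> lock(mutex_); queue_.push(request); }
        cond_.notify_one();
    }
    void close() {
        { std::lock_guard<std::mutex> lock(mutex_); closed_ = true; }
        cond_.notify_all();
    }
    // Blocks until a request is available; returns false once closed and drained.
    bool pop(int &request) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return closed_ || !queue_.empty(); });
        if (queue_.empty())
            return false;
        request = queue_.front();
        queue_.pop();
        return true;
    }
private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::queue<int> queue_;
    bool closed_ = false;
};

// The caller plays the asynchronous thread; the workers are the synchronous threads.
// Returns the sum of all processed request ids.
long run_half_sync_half_async(int num_workers, int num_requests) {
    RequestQueue queue;
    std::mutex sum_mutex;
    long sum = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i)
        workers.emplace_back([&] {
            int request;
            while (queue.pop(request)) { // the "business logic" runs sequentially here
                std::lock_guard<std::mutex> lock(sum_mutex);
                sum += request;
            }
        });
    for (int id = 1; id <= num_requests; ++id) // stands in for requests read from sockets
        queue.push(id);
    queue.close();
    for (auto &worker : workers)
        worker.join();
    return sum;
}
```

In a real server the loop that calls `push` would be driven by epoll_wait, and each queued object would carry the connection it came from.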

Leader/follower mode: omitted

Efficient programming method - finite state machine

// Finite state machine with independent states (pseudocode)
STATE_MACHINE(Package _pack) {
	PackageType _type = _pack.GetType();
	switch (_type) {
		case type_A:
			xxxx;
			break;
		case type_B:
			xxxx;
			break;
	}
}

// Finite state machine with state transitions (pseudocode)
STATE_MACHINE() {
	State cur_State = type_A;
	while (cur_State != type_C) {
		Package _pack = getNewPackage();
		switch (cur_State) {
			case type_A:
				process_package_state_A(_pack);
				cur_State = type_B;
				break;
			case type_B:
				xxxx;
				cur_State = type_C;
				break;
		}
	}
}
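To make the second pseudocode pattern concrete, here is a small runnable state machine. The scenario is an assumption made for illustration: it walks the lines of a heavily simplified HTTP-like request, and the states, the "GET " rule and the function names are all invented for this sketch.

```cpp
#include <string>
#include <vector>

enum class State { RequestLine, Headers, Done, Error };

// Feeds one line to the machine and returns the next state.
State step(State cur, const std::string &line, int &header_count)
{
    switch (cur) {
    case State::RequestLine:
        // Assumed rule for this sketch: a request line starts with "GET ".
        return line.rfind("GET ", 0) == 0 ? State::Headers : State::Error;
    case State::Headers:
        if (line.empty())
            return State::Done; // a blank line ends the headers
        if (line.find(':') == std::string::npos)
            return State::Error; // a header needs a "name: value" shape
        ++header_count;
        return State::Headers;
    default:
        return cur; // Done and Error are terminal states
    }
}

// Runs the machine over all lines and returns the state it stops in.
State parse(const std::vector<std::string> &lines, int &header_count)
{
    State cur = State::RequestLine;
    header_count = 0;
    for (const std::string &line : lines) {
        if (cur == State::Done || cur == State::Error)
            break;
        cur = step(cur, line, header_count);
    }
    return cur;
}
```

Ending in `State::Headers` means the input stopped before the blank line, which is how a server would detect that it has to read more data before continuing.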

It took me an hour to finally copy the 5,000-word code letter by letter @September 8, 2019 22:08:46@

Other suggestions for improving server performance: pools, data copying, context switches and locks

Pools: trade space for time. Examples are process pools and thread pools.

Data copying: high-performance servers should avoid unnecessary data copies.

Context switches and locks: keep the scope of locks as small as possible. Do not create too many worker processes; use dedicated business logic threads instead.

Chapter 9 I/O Multiplexing

I/O multiplexing lets a program monitor multiple file descriptors at the same time. It is needed when:

A client program must handle multiple sockets at the same time, for example with non-blocking connect
A client program must handle user input and network connections at the same time, for example a chat room program
A TCP server must handle the listening socket and connected sockets at the same time
A server must handle TCP and UDP requests at the same time, for example an echo server
A server must listen on multiple ports or handle multiple services at the same time, for example the xinetd server

Commonly used functions: select, poll, epoll

select

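#include <sys/select.h>
// nfds - the highest-numbered monitored file descriptor plus 1
// The next three parameters point to the descriptor sets for readable, writable and exceptional events
// timeout - how long select waits: a zeroed timeval makes the call non-blocking, NULL blocks until an event occurs
// Returns the total number of ready (readable, writable, exceptional) descriptors, 0 if none became ready before the timeout, -1 on failure
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

// Macros for manipulating an fd_set
FD_ZERO(fd_set *fdset);
FD_SET(int fd, fd_set *fdset);
FD_CLR(int fd, fd_set *fdset);
int FD_ISSET(int fd, fd_set *fdset);
// The timeval structure used for the timeout
struct timeval
{
	long tv_sec;  // seconds
	long tv_usec; // microseconds
};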

File descriptor ready conditions

A socket is readable when:

The number of bytes in its kernel receive buffer is greater than or equal to its low-water mark
The peer has closed the connection; a read on the socket then returns 0
A new connection request is pending on a listening socket
The socket has an unhandled error; getsockopt can be used to read and clear it

A socket is writable when:

The number of free bytes in its kernel send buffer is greater than or equal to its low-water mark
The write side of the socket has been closed; writing to such a socket triggers a SIGPIPE signal
A non-blocking connect on the socket has either succeeded or failed
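A small sketch of waiting for the first readable condition with select. It uses a pipe in place of a socket, and the helper name `wait_readable` is invented for this example.

```cpp
#include <sys/select.h>
#include <unistd.h>

// Returns 1 if fd becomes readable within timeout_ms, 0 on timeout, -1 on error.
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds); // select overwrites the sets, so rebuild them before every call
    FD_SET(fd, &readfds);

    struct timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    // The first argument is the highest monitored descriptor plus 1.
    int ready = select(fd + 1, &readfds, nullptr, nullptr, &timeout);
    if (ready <= 0)
        return ready;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

With `timeout_ms` set to 0 the call polls once and returns immediately, which matches the "zeroed timeval" case in the comments above.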

poll

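#include <poll.h>
// fds - array of pollfd structures naming the readable, writable and exceptional events we care about on each descriptor
// nfds - number of entries in fds
// timeout - in milliseconds; -1 blocks until an event occurs, 0 returns immediately
int poll(struct pollfd *fds, nfds_t nfds, int timeout);

struct pollfd
{
	int fd;
	short events;  // registered events: tells poll which events to monitor on fd
	short revents; // events that actually occurred, filled in by the kernel
};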

#ifndef _GNU_SOURCE
#define _GNU_SOURCE // POLLRDHUP is a GNU extension
#endif
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// If r is true, print the message and errno, then exit
#define exit_if(r, ...)                                                        \
    do {                                                                       \
        if (r) {                                                               \
            printf(__VA_ARGS__);                                               \
            printf("errno no: %d, error msg is %s\n", errno, strerror(errno)); \
            exit(1);                                                           \
        }                                                                      \
    } while (0)

struct client_info
{
    char ip_[INET_ADDRSTRLEN]; // own copy, because inet_ntoa returns a shared static buffer
    int port_;
};

int main()
{
    int port = 8001;

    struct sockaddr_in address;
    memset(&address, 0, sizeof(address));
    address.sin_family = AF_INET;
    address.sin_port = htons(port);
    address.sin_addr.s_addr = htonl(INADDR_ANY); // a 32-bit address needs htonl, not htons

    int listenfd = socket(PF_INET, SOCK_STREAM, 0);
    exit_if(listenfd < 0, "socket error\n");

    int ret = bind(listenfd, (struct sockaddr *)&address, sizeof(address));
    exit_if(ret == -1, "bind error\n");

    ret = listen(listenfd, 5);
    exit_if(ret == -1, "listen error\n");

    // Both arrays are indexed by file descriptor
    constexpr int MAX_FDS = 1024;
    static struct pollfd polls[MAX_FDS];
    static struct client_info clientsinfo[MAX_FDS];
    for (int i = 0; i < MAX_FDS; ++i)
    {
        polls[i].fd = -1; // poll ignores entries with a negative fd
    }

    polls[listenfd].fd = listenfd;
    polls[listenfd].events = POLLIN;

    while (true)
    {
        ret = poll(polls, MAX_FDS, -1);
        exit_if(ret == -1, "poll error\n");

        for (int i = 0; i < MAX_FDS; ++i)
        {
            int fd = polls[i].fd;
            short revents = polls[i].revents;
            if (fd < 0 || revents == 0)
            {
                continue;
            }

            if (fd == listenfd)
            {
                struct sockaddr_in client_address;
                socklen_t client_addresslen = sizeof(client_address);

                int clientfd = accept(listenfd, (struct sockaddr *)&client_address,
                                      &client_addresslen);
                exit_if(clientfd < 0, "accept error\n");
                if (clientfd >= MAX_FDS) // no slot left for this descriptor
                {
                    close(clientfd);
                    continue;
                }

                struct client_info *clientinfo = &clientsinfo[clientfd];
                inet_ntop(AF_INET, &client_address.sin_addr, clientinfo->ip_,
                          sizeof(clientinfo->ip_));
                clientinfo->port_ = ntohs(client_address.sin_port);
                printf("accept from %s:%d\n", clientinfo->ip_, clientinfo->port_);

                polls[clientfd].fd = clientfd;
                polls[clientfd].events = POLLIN | POLLRDHUP;
                continue;
            }

            bool closed = (revents & (POLLRDHUP | POLLHUP | POLLERR)) != 0;
            if (revents & POLLIN)
            {
                char buffer[1024];
                ssize_t n = read(fd, buffer, sizeof(buffer) - 1); // leave room for '\0'
                if (n > 0)
                {
                    buffer[n] = '\0';
                    printf("recv from %s:%d:\n%s\n", clientsinfo[fd].ip_,
                           clientsinfo[fd].port_, buffer);
                }
                else
                {
                    closed = true; // 0 means the peer closed, -1 means an error
                }
            }
            if (closed)
            {
                printf("close fd-%d from %s:%d\n", fd, clientsinfo[fd].ip_,
                       clientsinfo[fd].port_);
                close(fd);
                polls[i].fd = -1; // stop polling this slot
            }
        }
    }
}

epoll

epoll is a Linux-specific I/O multiplexing facility. Its implementation differs considerably from select and poll:

epoll uses a group of functions to do its job rather than a single one
epoll keeps the events the user cares about on each file descriptor in an event table inside the kernel
epoll therefore does not need the descriptor set or event set passed in again on every call

epoll_create() creates the extra file descriptor that identifies this kernel event table.
epoll_ctl() operates on the kernel event table.
epoll_wait() is the main function of the family. It returns the number of ready file descriptors on success and -1 on failure. When it detects events, it copies all ready events from the kernel event table (named by its first parameter, the descriptor returned by epoll_create) into the array pointed to by its second parameter, events.

The array parameters of select and poll both pass in the user-registered events and carry back the ready events detected by the kernel. The events array of epoll_wait, by contrast, is used only to output the ready events, which makes it much cheaper for the application to find the ready file descriptors.

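// Finding the ready file descriptors returned by poll
int ret = poll(fds, MAX_EVENT_NUMBER, -1);
// Every registered descriptor must be checked to find the ready ones
for (int i = 0; i < MAX_EVENT_NUMBER; ++i) {
	if (fds[i].revents & POLLIN) {
		int sockfd = fds[i].fd;
		// handle sockfd
	}
}

// Finding the ready file descriptors returned by epoll
int ret = epoll_wait(epoll_fd, events, MAX_EVENT_NUMBER, -1);
// Only the first ret entries need to be visited
for (int i = 0; i < ret; i++) {
	int sockfd = events[i].data.fd;
	// sockfd is guaranteed to be ready, so handle it directly
}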

LT and ET modes

LT (level trigger, the default working mode): in LT mode epoll is equivalent to a more efficient poll. epoll_wait keeps reporting an event on every call until the application handles it.

ET (edge trigger, epoll's high-efficiency working mode): registering the EPOLLET event on a file descriptor in the kernel event table makes epoll operate on that descriptor in ET mode. epoll_wait reports an event only once, regardless of whether the application has finished handling it.

ET mode

 -> 123456789-123456789-123456789
event trigger once
get 9bytes of content: 123456789
get 9bytes of content: -12345678
get 9bytes of content: 9-1234567
get 4bytes of content: 89
read later

LT mode

 -> 123456789-123456789-123456789
event trigger once
get 9bytes of contents: 123456789
event trigger once
get 9bytes of contents: -12345678
event trigger once
get 9bytes of contents: 9-1234567
event trigger once
get 4bytes of contents: 89

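In ET mode an event must be handled completely when it is reported (in practice, by reading in a loop until errno is EAGAIN or EWOULDBLOCK), because the same event will not be reported again. This cuts down on repeated notifications, which is why ET is epoll's high-efficiency mode. In LT mode the notification repeats as long as the event has not been handled.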
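The difference between the two modes can be observed with a short experiment. This is a sketch under simplifying assumptions: a pipe stands in for the socket, the data is deliberately left unread, and the function name `count_notifications` is invented for the example.

```cpp
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

// Writes once to a pipe, leaves the data unread, and counts how many of two
// consecutive epoll_wait calls report the read end.
// extra_flags is 0 for LT or EPOLLET for ET. Returns -1 on setup failure.
int count_notifications(uint32_t extra_flags)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    int epfd = epoll_create(1);
    int notifications = -1;

    struct epoll_event ev = {};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fds[0];
    if (epfd != -1 && epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
        write(fds[1], "123456789", 9) == 9)
    {
        notifications = 0;
        struct epoll_event out[1];
        for (int i = 0; i < 2; ++i)
            if (epoll_wait(epfd, out, 1, 100) == 1) // 100 ms timeout
                ++notifications;
    }
    if (epfd != -1)
        close(epfd);
    close(fds[0]);
    close(fds[1]);
    return notifications;
}
```

In LT mode both calls report the descriptor because the data is still waiting. In ET mode only the first call does, since no new data arrives between the two calls.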

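#include <sys/epoll.h>
// size is only a hint to the kernel about how large the event table should be
// The return value is passed as the first argument to every other epoll system call to name the kernel event table to access
int epoll_create(int size);

// epfd is the value returned by epoll_create
// op is the operation type
// - EPOLL_CTL_ADD register the events for fd in the event table
// - EPOLL_CTL_MOD modify the events registered for fd
// - EPOLL_CTL_DEL delete the events registered for fd
// fd is the file descriptor to operate on
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

struct epoll_event
{
	uint32_t events;   // epoll events
	epoll_data_t data; // user data, a union
};

typedef union epoll_data
{
	void *ptr; // ptr and fd cannot be used at the same time
	int fd;
	uint32_t u32;
	uint64_t u64;
} epoll_data_t;

// maxevents is the maximum number of events to return and must be greater than 0
// timeout of -1 blocks until an event occurs
// Returns the number of ready file descriptors on success, -1 on failure
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);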

Comparison of the three I/O multiplexing functions

What select, poll and epoll have in common:

All three can monitor multiple file descriptors at the same time, and wait up to the time given by the timeout parameter until an event occurs on one or more descriptors.
The return value is the number of ready file descriptors. A return value of 0 means no event occurred.

Advanced application of I/O multiplexing: non-blocking connect

connect can fail with errno set to EINPROGRESS. This happens when connect is called on a non-blocking socket and the connection cannot be established immediately. The connection attempt continues in the background, and select or poll can be used to wait for a writable event on that socket.

When select or poll returns, use getsockopt to read and clear the error on the socket. An error code of 0 means the connection was established successfully.

Chapter 10 Signals

API

Send signal API

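#include <sys/types.h>
#include <signal.h>

// pid > 0  send the signal to the process whose PID is pid
// pid = 0  send the signal to the processes in the caller's process group
// pid = -1 send the signal to every process except init, provided the sender has permission to signal the target
// pid < -1 send the signal to every member of the process group whose ID is -pid

// Error codes: EINVAL invalid signal, EPERM the caller lacks permission to signal any target process, ESRCH the target process (group) does not exist
int kill(pid_t pid, int sig);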

Receive signal API

#include <signal.h>
typedef void (*__sighandler_t)(int);

#include <bits/signum.h> // this header defines all signals available on Linux
// Use the default action for the signal
#define SIG_DFL ((__sighandler_t) 0)
// Ignore the signal
#define SIG_IGN ((__sighandler_t) 1)

Common signals

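SIGHUP  the controlling terminal hung up
SIGPIPE data was written to a pipe or socket connection whose read end is closed
SIGURG  urgent data arrived on a socket connection
SIGALRM the real-time timer set by alarm or setitimer expired
SIGCHLD the state of a child process changed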

signal function

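// Install a handler function for a signal
#include <signal.h>
// _handler is the handler function for sig
__sighandler_t signal(int sig, __sighandler_t _handler);

// The more robust interface: act is the new way to handle sig, oact receives the previous one
int sigaction(int sig, const struct sigaction *act, struct sigaction *oact);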

Overview

A signal is a message sent to a target process by a user, the system or another process to notify it of a state change or a system exception. Signals are generated under the following conditions:

For a foreground process, the user can send a signal by typing special terminal characters. For example, Ctrl+C usually sends the interrupt signal SIGINT
System exceptions, such as floating-point exceptions and illegal memory accesses
System state changes, such as the expiry of an alarm timer, which raises SIGALRM
Running the kill command or calling the kill function

A server program must handle (or at least ignore) some common signals so that it is not terminated unexpectedly.
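A minimal sketch of that setup using sigaction. The choice of signals (ignore SIGPIPE, catch SIGTERM) and the names `add_signal`, `on_signal` and `setup_server_signals` are assumptions made for this example. The `SA_RESTART` flag asks the kernel to restart system calls that the signal interrupted.

```cpp
#include <signal.h>
#include <string.h>

// Written by the handler; volatile sig_atomic_t is the type that is safe to set there.
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int sig)
{
    got_signal = sig; // only record the signal, do the real work in the main loop
}

// Installs handler for sig. Returns true on success.
bool add_signal(int sig, void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sa.sa_flags = SA_RESTART; // restart system calls interrupted by this signal
    sigfillset(&sa.sa_mask);  // block all other signals while the handler runs
    return sigaction(sig, &sa, nullptr) == 0;
}

// Hypothetical server setup: ignore SIGPIPE, handle SIGTERM.
bool setup_server_signals()
{
    return add_signal(SIGPIPE, SIG_IGN) && add_signal(SIGTERM, on_signal);
}
```

With SIGPIPE ignored, a write to a closed connection fails with errno set to EPIPE instead of killing the server.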

Interrupt system call?

Chapter 11 Timer

socket options `SO_RCVTIMEO` and `SO_SNDTIMEO`

Usage example: setting SO_SNDTIMEO on a socket gives connect a timeout.

// Uses the exit_if macro from the poll example above
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

int timeout_connect(const char *ip, const int port, const int sec)
{
    struct sockaddr_in address{};
    address.sin_family = AF_INET;
    address.sin_port = htons(port);
    address.sin_addr.s_addr = inet_addr(ip);

    int sockfd = socket(PF_INET, SOCK_STREAM, 0);
    exit_if(sockfd < 0, "socket error\n");

    struct timeval timeout{};
    timeout.tv_sec = sec;
    timeout.tv_usec = 0;
    socklen_t timeout_len = sizeof(timeout);

    int ret = setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &timeout, timeout_len);
    exit_if(ret == -1, "setsockopt error\n");

    ret = connect(sockfd, (struct sockaddr *)&address, sizeof(address));
    if (ret == -1)
    {
        // errno == EINPROGRESS means the connection still was not established
        // after sec seconds, so the socket option acted as a timer
        if (errno == EINPROGRESS)
        {
            printf("connecting timeout, process timeout logic\n");
        }
        else
        {
            printf("error occur when connecting to server\n");
        }
        close(sockfd);
        return -1;
    }
    return sockfd;
}

int main(int argc, char *argv[])
{
    exit_if(argc <= 2, "wrong number of parameters\n");
    const char *ip = argv[1];
    const int port = atoi(argv[2]);

    int sockfd = timeout_connect(ip, port, 10);
    if (sockfd < 0)
    {
        return 1;
    }
    close(sockfd);
    return 0;
}

SIGALRM signal - timer based on ascending linked list

Once the real-time timer set by alarm or setitimer expires, the SIGALRM signal is raised. The code that processes timed tasks from the signal handler is on GitHub. There is a lot of it, so it is not reproduced here.

The summary will go in a blog post and be linked from here.
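Since the full code lives elsewhere, here is only a rough sketch of the idea behind an ascending timer list. It is not the author's GitHub code: the `Timer` and `SortedTimerList` names are invented, and `std::list` replaces a hand-written linked list.

```cpp
#include <ctime>
#include <functional>
#include <list>

struct Timer {
    time_t expire;                  // absolute expiry time
    std::function<void()> callback; // task to run when the timer expires
};

class SortedTimerList {
public:
    // Inserts the timer so the list stays sorted by ascending expiry time: O(n).
    void add(const Timer &timer) {
        auto pos = timers_.begin();
        while (pos != timers_.end() && pos->expire <= timer.expire)
            ++pos;
        timers_.insert(pos, timer);
    }
    // Meant to be called each time SIGALRM has fired: runs and removes every
    // timer that has expired by now, and returns how many ran.
    int tick(time_t now) {
        int fired = 0;
        while (!timers_.empty() && timers_.front().expire <= now) {
            Timer timer = timers_.front(); // copy first, the callback may add timers
            timers_.pop_front();
            timer.callback();
            ++fired;
        }
        return fired;
    }
private:
    std::list<Timer> timers_;
};
```

Because the list is sorted, `tick` can stop at the first timer that has not expired. The cost is the linear-time insertion, which is the weakness that time wheels and time heaps address.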

Timeout parameter of IO multiplexing system call

High performance timer

# time wheel

# time heap

Chapter 12 High-Performance IO Framework Library

Another blog

Chapter 13 Multi-process Programming

exec series system calls

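#include <unistd.h>
// Declares the global environment variable array, which is defined elsewhere
extern char **environ;

// path is the full path of the executable; file is a file name that is searched for in the directories of PATH
// arg takes a variable argument list and argv takes an argument array; both are passed to the new program
// envp sets the environment variables of the new program; without it the global environ is used
// exec functions do not return unless an error occurs
// If no error occurs, the calling program is completely replaced by the new program

int execl(const char *path, const char *arg, ...);
int execlp(const char *file, const char *arg, ...);
int execle(const char *path, const char *arg, ..., char *const envp[]);
int execv(const char *path, char *const argv[]);
int execvp(const char *file, char *const argv[]);
int execve(const char *path, char *const argv[], char *const envp[]);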

fork system call - creation of process

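#include <sys/types.h>
#include <unistd.h>
// Each call returns twice: in the parent it returns the child's PID, in the child it returns 0
// The return value is used to tell the parent from the child
// Returns -1 on failure
pid_t fork(void);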

The fork function copies the current process and creates a new entry in the kernel process table. Many attributes of the new entry are the same as those of the original process:

Heap pointer
Stack pointer
Flag register values
The child's code is exactly the same as the parent's
The parent's data (heap data, stack data, static data) is copied as well, using copy-on-write: the copy only actually happens when the parent or the child writes to the data
After the child is created, the file descriptors opened by the parent are open in the child too by default, and the reference count of each such file descriptor is incremented by 1. The reference counts of the parent's root directory, current working directory and similar variables are incremented by 1 as well.

Some attributes differ:

The PPID of the new process (which identifies its parent) is set to the PID of the original process.
The set of pending signals is cleared (signals pending for the original process are not delivered to the new process; the installed signal handlers themselves are still inherited)

(Quoted from Wikipedia: reference counting is a memory management technique in programming languages. It means keeping a count of the references to a resource (an object, memory, disk space and so on) and releasing the resource when the count drops to zero.)

The child process is an exact duplicate of the parent process except for the following points (from the fork(2) manual page, with notes):

The child has its own unique process ID, and this PID does not match the ID of any existing process group (setpgid(2)) or session.
The child's parent process ID (PPID) is the same as the parent's process ID (PID).
The child does not inherit its parent's memory locks (mlock(2), mlockall(2)). Memory locks keep part of memory resident in RAM so that it is not moved to the swap partition.
Process resource utilizations (getrusage(2)) and CPU time counters (times(2)) are reset to zero in the child.
The child's set of pending signals is initially empty (sigpending(2)). Only the pending set is cleared; the signal handlers installed by the parent are inherited.
The child does not inherit semaphore adjustments from its parent (semop(2)), that is, the semadj values.
The child does not inherit process-associated record locks from its parent (fcntl(2)). (On the other hand, it does inherit fcntl(2) open file description locks and flock(2) locks from its parent.)
The child does not inherit timers from its parent (setitimer(2), alarm(2), timer_create(2)).
The child does not inherit outstanding asynchronous I/O operations from its parent (aio_read(3), aio_write(3)), nor does it inherit any asynchronous I/O contexts from its parent (see io_setup(2)).

Dealing with zombie processes - process management

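#include <sys/types.h>
#include <sys/wait.h>
// wait blocks the calling process until one of its child processes terminates. It returns the PID of the terminated child and stores the child's exit status in the memory pointed to by stat_loc. sys/wait.h defines macros that help interpret the status.
pid_t wait(int *stat_loc);

// waitpid waits only for the child specified by pid (-1 means any child, like wait)
// options: with WNOHANG, waitpid returns immediately instead of blocking
// If the target child has terminated, returns that child's PID
// With WNOHANG, returns 0 immediately if the target child has not terminated yet
// Returns -1 on failure
pid_t waitpid(pid_t pid, int *stat_loc, int options);

WIFEXITED(stat_val);   // nonzero if the child terminated normally
WEXITSTATUS(stat_val); // if WIFEXITED is nonzero, the child's exit code
WIFSIGNALED(stat_val); // nonzero if the child was terminated by an uncaught signal
WTERMSIG(stat_val);    // if WIFSIGNALED is nonzero, the number of that signal
WIFSTOPPED(stat_val);  // nonzero if the child is currently stopped
WSTOPSIG(stat_val);    // if WIFSTOPPED is nonzero, the number of the signal that stopped it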

In a multi-process program the parent generally needs to track the exit status of its children. So when a child terminates, the kernel does not release its process table entry immediately, so that the parent can still query the child's exit information later.

After a child terminates and before its parent reads its exit status, the child is said to be in the zombie state.
The book lists a second case: the parent terminates normally or abnormally while the child keeps running. The child's PPID is then set to 1 and the init process takes over the child. The book also calls the period after the parent has terminated and before the child exits the zombie state (such a process is more commonly called an orphan).

In both cases the exit information of the child has not been handled correctly by its parent, so the child stays in the zombie state and keeps occupying kernel resources.

waitpid can be made non-blocking with WNOHANG, but it is then best called only after the child it waits for has terminated. The SIGCHLD signal makes this possible: it is sent to the parent when a child terminates.

static void handle_child(int sig)
{
	pid_t pid;
	int stat;
	// Several children may have terminated, so reap until none is left
	while ((pid = waitpid(-1, &stat, WNOHANG)) > 0)
	{
		// clean up after the terminated child
	}
}
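To see fork, waitpid and the status macros working together, here is a small sketch. It uses a blocking waitpid for simplicity rather than the SIGCHLD handler above, and the function name `run_child_and_reap` is invented for the example.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that exits with exit_code, reaps it, and returns the decoded exit code.
// Returns -1 if fork or waitpid fails or the child did not exit normally.
int run_child_and_reap(int exit_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(exit_code); // child: fork returned 0

    // Parent: fork returned the child's PID.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) // blocks, then releases the zombie's table entry
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Between the child's `_exit` and the parent's `waitpid`, the child is exactly the zombie described above.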

Semaphore - process lock

Semaphore primitives: a semaphore supports only two operations, wait and signal. Because wait and signal already have special meanings in Linux, the operations are usually called P (from the Dutch passeren, to pass, like entering a critical section) and V (from vrijgeven, to release, like leaving a critical section). Suppose there is a semaphore SV (its value can be any natural number; the book only discusses binary semaphores). Its P and V operations mean:

P(SV): if the value of SV is greater than 0, decrement it by 1; if the value of SV is 0, suspend the calling process
V(SV): if other processes are suspended waiting for SV, wake one of them up; if not, add 1 to SV

Summary of how to use P and V

Use semget to obtain the identifier of the semaphore set.
Use semctl with SETVAL, passing a semun union whose val member holds the initial value, to initialize the semaphore.
Call semop with the identifier: sem_op = -1 performs the P (lock) operation and sem_op = 1 performs the V (unlock) operation.
When sem_op = -1, semval = 0 and IPC_NOWAIT is not specified, the P operation blocks until another process's sem_op = 1 raises semval to 1.

Create semaphore

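// semget system call
// Creates a new, globally unique semaphore set, or obtains an existing one
// key identifies a globally unique semaphore set, so different processes can obtain the same set
// num_sems is the number of semaphores in the set to create or obtain. It must be given when creating a set and may be 0 when obtaining one. It is usually 1
// sem_flags is a set of flags that controls permissions
// - OR it with IPC_CREAT to create a new semaphore set; no error is reported if the set already exists
// - IPC_CREAT | IPC_EXCL creates a set that must be new; if it already exists the call fails with errno = EEXIST
// Returns a non-negative integer, the identifier of the semaphore set, on success and -1 on failure
int semget(key_t key, int num_sems, int sem_flags);

int sem_id = semget((key_t)1234, 1, 0666 | IPC_CREAT);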

initialization

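// semctl system call
// sem_id is the semaphore set identifier returned by semget
// sem_num is the index of the semaphore within the set
// command is the command to run; some commands take a fourth argument, for which there is a recommended format (union semun)
// On success the return value depends on command; returns -1 and sets errno on failure
int semctl(int sem_id, int sem_num, int command, ...);

// The fourth argument: surprisingly, the caller has to declare this union itself
union semun
{
	int              val;    /* Value for SETVAL */
	struct semid_ds *buf;    /* Buffer for IPC_STAT, IPC_SET */
	unsigned short  *array;  /* Array for GETALL, SETALL */
	struct seminfo  *__buf;  /* Buffer for IPC_INFO
								(Linux-specific) */
};
// Initialize the semaphore
union semun sem_union;
sem_union.val = 1;
// The value 1 (val) could also be passed directly as the fourth argument here
if (semctl(sem_id, 0, SETVAL, sem_union) == -1)
{
	exit(EXIT_FAILURE);
}

// Remove the semaphore set
union semun sem_union{};
if (semctl(sem_id, 0, IPC_RMID, sem_union) == -1)
{
	exit(EXIT_FAILURE);
}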

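Some important kernel variables associated with each semaphore and used by semop

unsigned short semval;  // value of the semaphore
unsigned short semzcnt; // number of processes waiting for the value to become 0
unsigned short semncnt; // number of processes waiting for the value to increase
pid_t sempid;           // PID of the process that last performed semop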

Operating the semaphore is actually operating on the kernel variables above

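// sem_id is the semaphore set identifier returned by semget; it names the target semaphore set
// sem_ops points to an array of sembuf structures
// num_sem_ops is the number of operations to perform, that is, the number of elements in sem_ops
// Returns 0 on success, -1 with errno set on failure. On failure none of the operations in sem_ops[] is performed
int semop(int sem_id, struct sembuf *sem_ops, size_t num_sem_ops);

// sem_op < 0: the caller wants to acquire the semaphore
// semval -= abs(sem_op); the calling process needs write permission on the semaphore set
// If semval >= abs(sem_op), the operation succeeds and the caller acquires the semaphore immediately

// If semval < abs(sem_op) and IPC_NOWAIT is specified, semop returns an error immediately with errno = EAGAIN
// If IPC_NOWAIT is not specified, the process blocks until the semaphore is available and semncnt += 1. It wakes up in one of three cases
// 1 semval >= abs(sem_op) becomes true: semncnt -= 1, semval -= abs(sem_op), and semadj is updated if SEM_UNDO is set
// 2 the semaphore set is removed by a process: semop fails with errno = EIDRM (same as for sem_op = 0)
// 3 the call is interrupted by a signal: semop fails with errno = EINTR and semncnt is decremented by 1
bool P(int sem_id)
{
    struct sembuf sem_b;
    sem_b.sem_num = 0; // index of the semaphore in the set, usually 0
    sem_b.sem_op = -1; // P
	// IPC_NOWAIT: return immediately whether or not the operation succeeds
	// SEM_UNDO: undo the operation when the process exits; the kernel tracks it in the process's semadj variable
    sem_b.sem_flg = SEM_UNDO;
    return semop(sem_id, &sem_b, 1) != -1;
}


// sem_op > 0
// semval += sem_op; the calling process needs write permission on the semaphore set
// If SEM_UNDO is set, the kernel updates the process's semadj variable (which tracks the process's changes to the semaphore)
bool V(int sem_id)
{
    struct sembuf sem_b;
    sem_b.sem_num = 0;
    sem_b.sem_op = 1; // V
    sem_b.sem_flg = SEM_UNDO;
    return semop(sem_id, &sem_b, 1) != -1;
}


// -- sem_op = 0
// -- This is a "wait for zero" operation; the calling process needs read permission on the semaphore set
// -- If the semaphore's value is 0 the call returns immediately; otherwise semop fails or blocks until the value becomes 0
// -- If IPC_NOWAIT is set, semop returns an error immediately with errno = EAGAIN
// -- Otherwise semzcnt is incremented by 1 and the process sleeps until one of three things happens
// -- 1 semval becomes 0; the kernel then decrements semzcnt by 1
// -- 2 the semaphore set is removed by a process; semop fails with errno = EIDRM
// -- 3 the call is interrupted by a signal; semop fails with errno = EINTR and semzcnt is decremented by 1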
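The whole life cycle (create, initialize, P, V, remove) can be tried in a single process. This is a sketch under a few assumptions: it uses `IPC_PRIVATE` instead of a fixed key so it cannot collide with other programs, it passes `IPC_NOWAIT` so a failed P returns instead of blocking, and the names `sem_change` and `binary_semaphore_demo` are invented for the example.

```cpp
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

// The caller has to define this union for semctl.
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

// Applies sem_op (-1 for P, +1 for V) to semaphore 0 without blocking.
static bool sem_change(int sem_id, short sem_op)
{
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = sem_op;
    op.sem_flg = IPC_NOWAIT; // fail with EAGAIN instead of sleeping
    return semop(sem_id, &op, 1) == 0;
}

// Creates a private binary semaphore, exercises P and V, then removes it.
// Returns true if every step behaved as described in the comments above.
bool binary_semaphore_demo()
{
    int sem_id = semget(IPC_PRIVATE, 1, 0600);
    if (sem_id == -1)
        return false;

    union semun arg;
    arg.val = 1;
    bool ok = semctl(sem_id, 0, SETVAL, arg) == 0;
    ok = ok && sem_change(sem_id, -1);         // P: 1 -> 0
    ok = ok && semctl(sem_id, 0, GETVAL) == 0;
    ok = ok && !sem_change(sem_id, -1);        // a second P would have to wait
    ok = ok && sem_change(sem_id, 1);          // V: 0 -> 1
    ok = ok && semctl(sem_id, 0, GETVAL) == 1;

    semctl(sem_id, 0, IPC_RMID);               // always remove the set
    return ok;
}
```

Removing the set at the end matters because System V semaphores outlive the process that created them.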

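When semget creates a semaphore set, the kernel also creates and initializes the associated structure semid_ds.

struct semid_ds
{
	struct ipc_perm sem_perm;
	unsigned long int sem_nsems; // set to num_sems
	time_t sem_otime;            // set to 0
	time_t sem_ctime;            // set to the current system time
};
// Describes permissions
struct ipc_perm
{
	uid_t uid;   // owner's effective user ID, set by semget to the caller's effective user ID
	gid_t gid;   // owner's effective group ID, set by semget to the caller's effective group ID
	uid_t cuid;  // creator's effective user ID, set by semget to the caller's effective user ID
	gid_t cgid;  // creator's effective group ID, set by semget to the caller's effective group ID
	mode_t mode; // access permissions, set to the lowest 9 bits of sem_flags
};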

Shared memory - inter-process communication

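Shared memory is the most efficient IPC (inter-process communication) mechanism. You must synchronize the processes' access to it yourself, otherwise race conditions occur.

// key
// Same as for semget: identifies a globally unique shared memory segment
// size: size of the memory region in bytes
// shmflg
// IPC_CREAT create the segment if it does not exist, otherwise obtain the existing one
// IPC_CREAT | IPC_EXCL create the segment if it does not exist, fail if it does
// SHM_HUGETLB allocate the segment using "huge pages"
// SHM_NORESERVE do not reserve swap space for the segment; if physical memory runs out,
// - a write to the segment then triggers the SIGSEGV signal
// - Returns the segment identifier on success, -1 with errno set on failure
int shmget(key_t key, size_t size, int shmflg);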

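// shm_id
// The identifier returned by shmget
// shm_addr
// The address in the process's address space to attach at; also affected by the optional SHM_RND flag in shmflg
// If shm_addr is NULL the operating system chooses the address, which keeps the code portable
// If shm_addr is not NULL and SHM_RND is not set, the segment is attached at the given address
// If shm_addr is not NULL and SHM_RND is set: *not used here yet, left out for now*
// shmflg
// SHM_RDONLY attach read-only; without it the segment is attached read-write
// SHM_REMAP if the address shm_addr is already mapped, replace the mapping
// SHM_EXEC grant execute permission
// Returns the attached address on success, (void*)-1 with errno set on failure
void *shmat(int shm_id, const void *shm_addr, int shmflg);

// Attaches the shared memory to the process's address space. On success some fields of shmid_ds are updated
// - shm_nattch += 1
// - shm_lpid is updated
// - shm_atime is set to the current time

// Detaches the shared memory from the process's address space
// On success
// - shm_nattch -= 1
// - shm_lpid is updated and shm_dtime is set to the current time
// Returns 0 on success, -1 with errno set on failure
int shmdt(const void *shm_addr);

int shmctl(int shm_id, int command, struct shmid_ds *buf);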

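shmget also creates and initializes the associated shmid_ds structure.

struct shmid_ds
{
	struct ipc_perm shm_perm; // permissions
	size_t shm_segsz;         // segment size in bytes, set to size
	__time_t shm_atime;       // time of the last shmat call on this segment, set to 0
	__time_t shm_dtime;       // time of the last shmdt call on this segment, set to 0
	__time_t shm_ctime;       // time of the last shmctl call on this segment, set to the current time
	__pid_t shm_cpid;         // PID of the creator
	__pid_t shm_lpid;         // PID of the last process that called shmat or shmdt
	shmatt_t shm_nattch;      // number of processes currently attached to this segment
};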
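A short sketch of the System V calls in use, assuming a parent and a child that share one segment. `IPC_PRIVATE` is used instead of a fixed key, waiting for the child serves as the (very coarse) synchronization, and the function name `share_via_shm` is invented for this example.

```cpp
#include <cstring>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

// A child writes msg into a shared segment; the parent copies it into out.
// out_size must be greater than 0. Returns true on success.
bool share_via_shm(const char *msg, char *out, size_t out_size)
{
    const size_t kSize = 4096;
    int shm_id = shmget(IPC_PRIVATE, kSize, 0600);
    if (shm_id == -1)
        return false;
    char *mem = (char *)shmat(shm_id, nullptr, 0); // let the kernel pick the address
    if (mem == (char *)-1) {
        shmctl(shm_id, IPC_RMID, nullptr);
        return false;
    }

    pid_t pid = fork(); // the attachment is inherited by the child
    if (pid == 0) {
        strncpy(mem, msg, kSize - 1);
        mem[kSize - 1] = '\0';
        _exit(0);
    }
    // Waiting for the child is the synchronization in this sketch.
    bool ok = pid > 0 && waitpid(pid, nullptr, 0) == pid;
    if (ok) {
        strncpy(out, mem, out_size - 1);
        out[out_size - 1] = '\0';
    }
    shmdt(mem);
    shmctl(shm_id, IPC_RMID, nullptr); // mark the segment for removal
    return ok;
}
```

In a long-running program the two sides would coordinate with a semaphore rather than with waitpid.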

POSIX approach to shared memory

// ERROR_IF, BUFFER_SIZE and share_mem are defined elsewhere in the author's code
int shmfd = shm_open("/shm_name", O_CREAT | O_RDWR, 0666);
ERROR_IF(shmfd == -1, "shm open");

int ret = ftruncate(shmfd, BUFFER_SIZE);
ERROR_IF(ret == -1, "ftruncate");

share_mem = (char *)mmap(nullptr, BUFFER_SIZE,
		PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
ERROR_IF(share_mem == MAP_FAILED, "share_mem");
close(shmfd);

// Unmap the shared memory
munmap((void *)share_mem, BUFFER_SIZE);

Inter-process communication: pipes

A pipe can carry data between a parent and a child because both of its file descriptors (fd[0] and fd[1]) stay open in both processes after the fork call. One such pair of descriptors only guarantees data transfer in one direction between parent and child: one of the two processes must close fd[0] and the other must close fd[1].

For two-way transfer, use two pipes, or use socketpair to create a two-way pipe.
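The one-directional case can be sketched as follows, with the child as the writer and the parent as the reader. The function name `child_to_parent` is invented for this example, and a single read is enough only because the message is short.

```cpp
#include <cstring>
#include <sys/wait.h>
#include <unistd.h>

// The child writes msg into the pipe; the parent reads it into out.
// out_size must be greater than 0.
// Returns the number of bytes the parent read, or -1 on error.
ssize_t child_to_parent(const char *msg, char *out, size_t out_size)
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        close(fd[0]); // the child only writes
        ssize_t written = write(fd[1], msg, strlen(msg));
        _exit(written < 0 ? 1 : 0);
    }
    close(fd[1]); // the parent only reads
    ssize_t n = read(fd[0], out, out_size - 1);
    if (n >= 0)
        out[n] = '\0';
    close(fd[0]);
    waitpid(pid, nullptr, 0); // reap the child
    return n;
}
```

If the parent forgot to close fd[1], a later read would never see end of file, because the pipe would still have an open write end.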

message queue

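A message queue is a simple and effective way to transfer blocks of binary data between two processes. Each block has its own type, and the receiver can selectively receive data by type.

#include <sys/msg.h>
// Same usage as semget; returns the queue identifier on success
// msgflg is set and interpreted the same way as the sem_flags parameter of semget
int msgget(key_t key, int msgflg);

// msg_ptr points to the message to send; the message must be defined as shown below
// msg_sz is the length of mtext only!
// msgflg usually only supports IPC_NOWAIT, which sends the data in non-blocking mode
int msgsnd(int msqid, const void *msg_ptr, size_t msg_sz, int msgflg);
// By default msgsnd blocks if the message queue is full. With IPC_NOWAIT
// it returns immediately and sets errno = EAGAIN

// The system provides this structure, but there mtext has length 1 ...
struct msgbuf
{
	long mtype;      /* message type, a positive integer */
	char mtext[512]; /* message data */
};

// msgtype = 0 read the first message in the queue
// msgtype > 0 read the first message whose type is msgtype, unless MSG_EXCEPT is set
// msgtype < 0 read the first message of the lowest type that is less than or equal to abs(msgtype)

// IPC_NOWAIT if the queue has no message, msgrcv returns immediately and sets errno = ENOMSG
// MSG_EXCEPT if msgtype is greater than 0, receive the first message whose type is not msgtype
// MSG_NOERROR truncate the message data if it is longer than msg_sz
int msgrcv(int msqid, void *msg_ptr, size_t msg_sz, long int msgtype, int msgflg);
// A blocked call is interrupted when the message queue is removed (errno = EIDRM) or the process receives a signal (errno = EINTR)

int msgctl(int msqid, int command, struct msqid_ds *buf);

IPC_STAT copy the kernel data structure associated with the message queue into buf
IPC_SET  update the kernel data of the target with some members of buf
IPC_RMID remove the message queue immediately and wake up all processes waiting to read or write messages
IPC_INFO get the system-wide message queue resource configuration

MSG_INFO return information about the resources used by allocated message queues
MSG_STAT like IPC_STAT, but msqid is an index into the kernel's message queue array rather than an identifier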
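Selecting a message by type can be sketched in a single process. The message layout `text_msg`, the function name `send_and_pick` and the use of `IPC_PRIVATE` are assumptions made for this example.

```cpp
#include <cstring>
#include <sys/ipc.h>
#include <sys/msg.h>

// Hypothetical message layout for this sketch: a type followed by the data.
struct text_msg {
    long mtype;     // must be greater than 0
    char mtext[64];
};

// Sends two messages of types 1 and 2, then receives the one selected by want_type.
// out must hold at least 64 bytes. Returns true on success.
bool send_and_pick(long want_type, char *out)
{
    int msqid = msgget(IPC_PRIVATE, 0600);
    if (msqid == -1)
        return false;

    text_msg first = {1, "first"};
    text_msg second = {2, "second"};
    // The size argument covers mtext only, not the mtype field.
    bool ok = msgsnd(msqid, &first, sizeof(first.mtext), IPC_NOWAIT) == 0 &&
              msgsnd(msqid, &second, sizeof(second.mtext), IPC_NOWAIT) == 0;

    text_msg got;
    ok = ok && msgrcv(msqid, &got, sizeof(got.mtext), want_type, IPC_NOWAIT) >= 0;
    if (ok)
        strcpy(out, got.mtext);

    msgctl(msqid, IPC_RMID, nullptr); // remove the queue
    return ok;
}
```

Passing 2 as `want_type` skips over the message of type 1 even though it was queued first, while passing 0 returns the messages in queue order.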

Pass file descriptors between processes

IPC commands: use ipcs to view the globally unique keys used for inter-process communication

Chapter 14 Multithreaded Programming

Depending on where they run and who schedules them, threads are divided into two kinds. Kernel threads run in kernel space and are scheduled by the kernel. User threads run in user space and are scheduled by a thread library.

When a kernel thread gets the CPU, it loads and runs a user thread, so a kernel thread acts as a container in which user threads run.

There are three ways to implement threads:

Implemented entirely in user space: creating and scheduling threads needs no kernel intervention, so it is very fast. It uses no extra kernel resources and has little impact on the system. But the threads cannot run on multiple processors, because all of these user threads are implemented on top of a single kernel thread.
Scheduled entirely by the kernel: creating and scheduling threads is left to the kernel, and the thread library in user space does no management. Its advantages and disadvantages are exactly the opposite of the previous approach.
Two-level scheduling: a mix of the first two. It does not consume too many kernel resources, thread switching is fast, and it can make full use of multiple processors.

Thread creation and termination

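#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine)(void *), void *arg);
// returns 0 on success, an error code on failure
// thread         uniquely identifies the new thread
// attr           sets the attributes of the new thread; NULL means the default attributes
// start_routine  the function the new thread runs
// arg            the argument passed to that function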

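void pthread_exit(void *retval);
Makes sure the thread exits safely and cleanly; it is best to call it at the end of the thread function.
The `retval` argument passes the thread's exit information to whoever joins it
It does not return to the caller and never fails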

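int pthread_join(pthread_t thread, void **retval);
Call this function to join (reap) another thread, which must be joinable. The function blocks until the target thread terminates.
Returns 0 on success, an error code on failure
thread  identifier of the target thread
retval  exit information returned by the target thread

The error codes are as follows
`EDEADLK` a deadlock would occur: two threads call pthread_join on each other, or a thread joins itself
`EINVAL` the target thread is not joinable, or another thread is already joining it
`ESRCH` the target thread does not exist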

int pthread_cancel(pthread_t thread);
Terminates a thread abnormally, that is, cancels it
Returns 0 on success, an error code on failure
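
The create / exit / join cycle described above can be sketched as follows. `SquareTask`, `Square` and `SquareInThread` are hypothetical names chosen for this illustration; the point is only to show how `arg` travels into the thread and how `retval` travels back out through pthread_join.

```cpp
#include <pthread.h>

// Hypothetical argument block shared between the caller and the thread.
struct SquareTask
{
    int input;
    int output;
};

// Thread function: squares the input and hands the task back via pthread_exit.
void *Square(void *arg)
{
    SquareTask *task = static_cast<SquareTask *>(arg);
    task->output = task->input * task->input;
    // Has the same effect as `return task;` from the thread function.
    pthread_exit(task);
}

// Creates one thread, waits for it, and returns the value it computed.
// Returns -1 if the thread could not be created or joined.
int SquareInThread(int n)
{
    SquareTask task{n, 0};
    pthread_t tid;
    if (pthread_create(&tid, nullptr, Square, &task) != 0)
    {
        return -1;
    }
    void *retval = nullptr;
    // Blocks until the thread has terminated; retval receives its exit value.
    if (pthread_join(tid, &retval) != 0)
    {
        return -1;
    }
    return static_cast<SquareTask *>(retval)->output;
}
```

Because `task` lives on the caller's stack, the caller has to join the thread before returning, which is what this sketch does.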

Thread attribute settings

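A thread that receives a cancellation request can decide whether it may be cancelled and how.
// set the cancellation state (whether the thread may be cancelled)
int pthread_setcancelstate(int state, int *oldstate);
First argument
PTHREAD_CANCEL_ENABLE   the thread may be cancelled; this is the default state
PTHREAD_CANCEL_DISABLE  the thread may not be cancelled; a cancellation request it receives stays pending until
the thread allows cancellation again
Second argument  receives the previously set state

// set the cancellation type (how the thread is cancelled)
int pthread_setcanceltype(int type, int *oldtype);
First argument
PTHREAD_CANCEL_ASYNCHRONOUS  the thread may be cancelled at any time
PTHREAD_CANCEL_DEFERRED      the thread defers acting on the request until it calls one of the so-called cancellation point functions
It is best to set cancellation points with pthread_testcancel
Second argument
receives the previous cancellation type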
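
As a rough illustration of deferred cancellation, the sketch below starts a thread whose only cancellation point is pthread_testcancel, cancels it, and checks that pthread_join reports PTHREAD_CANCELED. The names `Spin` and `CancelWorker` are made up for this example.

```cpp
#include <pthread.h>

// Worker whose only cancellation point is the pthread_testcancel call.
void *Spin(void *)
{
    int old = 0;
    // These two values are the defaults; they are set explicitly for illustration.
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old);
    for (;;)
    {
        // A pending cancellation request is acted on here.
        pthread_testcancel();
    }
    return nullptr;  // never reached
}

// Starts the worker, cancels it, and reports whether it ended by cancellation.
bool CancelWorker()
{
    pthread_t tid;
    if (pthread_create(&tid, nullptr, Spin, nullptr) != 0)
    {
        return false;
    }
    if (pthread_cancel(tid) != 0)
    {
        return false;
    }
    void *retval = nullptr;
    if (pthread_join(tid, &retval) != 0)
    {
        return false;
    }
    // A cancelled thread reports PTHREAD_CANCELED as its exit value.
    return retval == PTHREAD_CANCELED;
}
```

If the worker set PTHREAD_CANCEL_DISABLE instead, the request would stay pending and the join in this sketch would never return.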

Detached threads

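// initialize a thread attribute object
int pthread_attr_init(pthread_attr_t *attr);
// destroy a thread attribute object; it must not be used again until it is re-initialized
int pthread_attr_destroy(pthread_attr_t *attr);

// values of detachstate
// - PTHREAD_CREATE_JOINABLE  the thread can be joined
// - PTHREAD_CREATE_DETACHED  the thread is no longer synchronized with the other threads in the process and becomes a detached thread
int pthread_attr_getdetachstate(const pthread_attr_t *attr, int *detachstate);
int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
// an existing thread can also be made detached directly
int pthread_detach(pthread_t thread);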
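
A minimal sketch of creating a thread that is detached from the start, using the attribute functions above. Since a detached thread cannot be joined, the sketch assumes a semaphore (`done`, a hypothetical name) for the thread to announce that it has finished; `Work` and `StartDetached` are illustrative names as well.

```cpp
#include <pthread.h>
#include <semaphore.h>

namespace
{
sem_t done;  // posted by the detached thread just before it finishes

void *Work(void *)
{
    sem_post(&done);
    return nullptr;
}
}  // namespace

// Creates a detached thread and returns the detach state read back from the
// attribute object, or -1 on failure.
int StartDetached()
{
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
    {
        return -1;
    }
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    int state = -1;
    pthread_attr_getdetachstate(&attr, &state);

    sem_init(&done, 0, 0);
    pthread_t tid;
    int rc = pthread_create(&tid, &attr, Work, nullptr);
    // The attribute object is no longer needed once the thread exists.
    pthread_attr_destroy(&attr);
    if (rc != 0)
    {
        sem_destroy(&done);
        return -1;
    }

    // A detached thread cannot be joined, so wait for its signal instead.
    sem_wait(&done);
    sem_destroy(&done);
    return state;
}
```

The same effect can be had after creation by calling pthread_detach on a joinable thread.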

Usage scenarios of thread synchronization mechanism

POSIX semaphore - you have to maintain the count yourself: user space keeps one count and the semaphore keeps its own, and with two counts it is easy to make mistakes.

Mutex lock - exclusive access to critical resources

Condition variables - wait for a certain condition to be met. When a certain shared data reaches a certain value, wake up the thread waiting for the shared data.

Read-write lock - multiple threads can read at the same time, nothing can write while reads are in progress, and only one thread can write at a time

Spin lock - repeatedly tries to acquire the lock in a while loop; suitable when the lock is held only briefly and fast switching is needed

POSIX semaphore

Multithreaded programs must also deal with thread synchronization. Although pthread_join() can be seen as a simple synchronization method, it cannot efficiently express more complex requirements, such as exclusive access to a shared resource or waking up a specific thread when some condition is met.

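#include <semaphore.h>
// initialize an unnamed semaphore
// pshared == 0 means the semaphore is local to the current process; otherwise it can be shared between processes
// value is the initial value of the semaphore
int sem_init(sem_t *sem, int pshared, unsigned int value);

// destroy the semaphore and release the system resources it holds
int sem_destroy(sem_t *sem);

// atomically decrement the semaphore by 1; if its value is 0, sem_wait blocks until the semaphore becomes non-zero
int sem_wait(sem_t *sem);

// same as sem_wait but never blocks: a non-zero semaphore is decremented, otherwise it returns -1 and sets errno=EAGAIN
int sem_trywait(sem_t *sem);

// atomically increment the semaphore by 1
int sem_post(sem_t *sem);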

Initializing a semaphore that has already been initialized leads to unpredictable results

Destroying a semaphore that other threads are waiting on also leads to unpredictable results.

Examples are as follows

#include <pthread.h>
#include <semaphore.h>
#include <cstdio>
#include <numeric>
#include <vector>

constexpr int kNumberMax = 10;
std::vector<int> number(kNumberMax);

constexpr int kThreadNum = 10;
sem_t sems[kThreadNum];
pthread_t threads[kThreadNum];

constexpr int kPrintTime = 1;

void *t(void *no)
{
    int start_sub = *static_cast<int *>(no);
    int sub = start_sub;
    int time = 0;
    while (++time <= kPrintTime)
    {
        // wait for this thread's turn, then release the next thread
        sem_wait(&sems[start_sub]);
        printf("%d\n", number[sub]);
        sem_post(&sems[(start_sub + 1) % kThreadNum]);
        // compute the index to print next time
        sub = (sub + kThreadNum) % kNumberMax;
    }
    pthread_exit(nullptr);
}

int main()
{
    std::iota(number.begin(), number.end(), 0);
    // only the first semaphore starts at 1, so thread 0 prints first
    sem_init(&sems[0], 0, 1);
    for (int i = 1; i < kThreadNum; ++i)
    {
        sem_init(&sems[i], 0, 0);
    }
    for (int i = 0; i < kThreadNum; ++i)
    {
        pthread_create(&threads[i], nullptr, t, &number[i]);
    }
    // wait for the last thread to finish
    pthread_join(threads[kThreadNum - 1], nullptr);
}

kThreadNum threads print [0, kNumberMax) in order. Each thread prints kPrintTime times. The main thread can exit only after the last thread has printed.

mutex lock

// initialize a mutex
// the first argument points to the target mutex; the second sets its attributes, nullptr means the defaults
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);

// destroy the target mutex
int pthread_mutex_destroy(pthread_mutex_t *mutex);

// lock the mutex (behaviour described for a normal mutex)
int pthread_mutex_lock(pthread_mutex_t *mutex);

// returns immediately (for a normal mutex): locks the mutex if it is unlocked, otherwise returns the error code EBUSY
int pthread_mutex_trylock(pthread_mutex_t *mutex);

// unlock; if other threads are waiting for this mutex, one of them acquires it
int pthread_mutex_unlock(pthread_mutex_t *mutex);

Destroying a locked mutex has unpredictable consequences. A mutex can also be initialized with the macro PTHREAD_MUTEX_INITIALIZER: pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

Mutex lock attribute settings

int pthread_mutexattr_init(pthread_mutexattr_t *attr);

int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);

// PTHREAD_PROCESS_SHARED   the mutex can be shared across processes
// PTHREAD_PROCESS_PRIVATE  the mutex is shared only by threads of the same process
int pthread_mutexattr_getpshared(const pthread_mutexattr_t *attr, int *pshared);
int pthread_mutexattr_setpshared(pthread_mutexattr_t *attr, int pshared);

// PTHREAD_MUTEX_NORMAL      normal mutex, the default type
// PTHREAD_MUTEX_ERRORCHECK  error-checking mutex
// PTHREAD_MUTEX_RECURSIVE   recursive (nested) mutex
// PTHREAD_MUTEX_DEFAULT     default mutex
int pthread_mutexattr_gettype(const pthread_mutexattr_t *attr, int *type);
int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);

PTHREAD_MUTEX_NORMAL After a thread locks it, the other threads requesting the lock form a waiting queue and acquire it in priority order once it is unlocked, which keeps resource allocation fair. If a thread locks a normal mutex that it has already locked, it deadlocks. Unlocking a normal mutex that is locked by another thread, or unlocking one that is already unlocked, has unpredictable consequences.

PTHREAD_MUTEX_ERRORCHECK If a thread locks an error-checking mutex that it has already locked, the lock operation returns EDEADLK. Unlocking an error-checking mutex that is locked by another thread, or unlocking one that is already unlocked, returns EPERM.

PTHREAD_MUTEX_RECURSIVE A thread may lock the mutex several times before releasing it without deadlocking. Before another thread can acquire the mutex, the current owner must perform the same number of unlock operations. Unlocking a recursive mutex that is locked by another thread, or unlocking one that is already unlocked, returns EPERM.

PTHREAD_MUTEX_DEFAULT The implementation may map this type to any of the three above. Locking a default mutex that is already locked, unlocking one that is locked by another thread, and unlocking one that is already unlocked all have unpredictable consequences.
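
The difference between the error-checking and recursive types can be observed from a single thread. The following is a small sketch under that assumption; the helper names (`InitTyped`, `RelockErrorCheck`, `UnlockUnlockedErrorCheck`, `RelockRecursive`) are invented for the illustration.

```cpp
#include <pthread.h>
#include <cerrno>

// Initializes `mutex` with the given type (one of the PTHREAD_MUTEX_* values).
int InitTyped(pthread_mutex_t *mutex, int type)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    int rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Locks an error-checking mutex twice from the same thread and returns the
// result of the second lock attempt.
int RelockErrorCheck()
{
    pthread_mutex_t m;
    InitTyped(&m, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_lock(&m);
    int second = pthread_mutex_lock(&m);  // reported as an error instead of deadlocking
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}

// Unlocks an error-checking mutex that is not locked and returns the result.
int UnlockUnlockedErrorCheck()
{
    pthread_mutex_t m;
    InitTyped(&m, PTHREAD_MUTEX_ERRORCHECK);
    int rc = pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}

// Locks a recursive mutex twice from the same thread; returns true if both
// locks and both matching unlocks succeed.
bool RelockRecursive()
{
    pthread_mutex_t m;
    InitTyped(&m, PTHREAD_MUTEX_RECURSIVE);
    bool ok = pthread_mutex_lock(&m) == 0;
    ok = ok && pthread_mutex_lock(&m) == 0;
    ok = ok && pthread_mutex_unlock(&m) == 0;
    ok = ok && pthread_mutex_unlock(&m) == 0;
    pthread_mutex_destroy(&m);
    return ok;
}
```

The normal and default types are deliberately left out: relocking them from the same thread would deadlock or be undefined, so there is nothing safe to observe.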

example

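#include <pthread.h>
#include <cstdio>

pthread_mutex_t mutex;
int count = 0;
void *t(void *a)
{
    // only one thread at a time may read and update count
    pthread_mutex_lock(&mutex);
    printf("%d\n", count);
    count++;
    pthread_mutex_unlock(&mutex);
    return nullptr;
}
int main()
{
    pthread_mutex_init(&mutex, nullptr);
    pthread_t thread[10];
    for (int i = 0; i < 10; ++i)
    {
        pthread_create(&thread[i], nullptr, t, nullptr);
    }
    for (int i = 0; i < 10; ++i)
    {
        // wait for every thread before destroying the mutex
        pthread_join(thread[i], nullptr);
    }
    pthread_mutex_destroy(&mutex);
}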

condition variable

int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *cond_attr);

// destroying a condition variable that is being waited on fails and returns EBUSY
int pthread_cond_destroy(pthread_cond_t *cond);

// wake up, broadcast-style, all threads waiting on the target condition variable
int pthread_cond_broadcast(pthread_cond_t *cond);

// wake up one thread waiting on the target condition variable
int pthread_cond_signal(pthread_cond_t *cond);

// wait on the target condition variable
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);

pthread_cond_t cond = PTHREAD_COND_INITIALIZER; initializes every field of the condition variable to 0

The second parameter of pthread_cond_wait is the mutex that protects the condition variable. It must already be locked when pthread_cond_wait is called, otherwise the consequences are unpredictable. When the function runs, it first places the calling thread on the condition variable's wait queue and then unlocks the mutex.

Between the moment pthread_cond_wait starts executing and the moment the calling thread is on the wait queue, pthread_cond_signal (or pthread_cond_broadcast) cannot modify the condition variable, so pthread_cond_wait does not miss any change to the target condition variable. When pthread_cond_wait returns, the mutex is locked again.

example

#include <pthread.h>
#include <unistd.h>
#include <cstdio>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
int good = 3;
int produce_count = 0;
int consume_count = 0;

void *Producer(void *arg)
{
    while (produce_count < 10)
    {
        pthread_mutex_lock(&mutex);
        good++;
        pthread_mutex_unlock(&mutex);

        produce_count++;
        printf("produce a good\n");
        // notify one waiting thread
        pthread_cond_signal(&cond);
        sleep(2);
    }
    pthread_exit(nullptr);
}

void *Consumer(void *arg)
{
    while (consume_count < 13)
    {
        // the mutex must be locked before it is passed to pthread_cond_wait
        pthread_mutex_lock(&mutex);
        if (good > 0)
        {
            good--;
            consume_count++;
            printf("consume a good, rest %d\n", good);
        }
        else
        {
            printf("good is 0\n");
            // wait for pthread_cond_signal; good is checked again on the next iteration
            pthread_cond_wait(&cond, &mutex);
        }
        pthread_mutex_unlock(&mutex);

        usleep(500 * 1000);
    }
    pthread_exit(nullptr);
}

int main()
{
    pthread_t producer, consumer;

    pthread_create(&consumer, nullptr, Consumer, nullptr);
    pthread_create(&producer, nullptr, Producer, nullptr);

    // join both threads so that neither can still use the mutex or the
    // condition variable when they are destroyed
    pthread_join(consumer, nullptr);
    pthread_join(producer, nullptr);
    pthread_mutex_destroy(&mutex);
    pthread_cond_destroy(&cond);
}

read-write lock
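
A minimal sketch of the POSIX read-write lock API (pthread_rwlock_*), assuming a single shared integer as the protected data. The names `shared_value`, `WriteValue`, `ReadValue` and `ReadersShareWritersDoNot` are hypothetical; the last function only demonstrates that a second read lock succeeds while a write lock is refused.

```cpp
#include <pthread.h>
#include <cerrno>

namespace
{
pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;
int shared_value = 0;  // hypothetical shared data guarded by rwlock
}  // namespace

// Writers take the lock exclusively.
void WriteValue(int v)
{
    pthread_rwlock_wrlock(&rwlock);
    shared_value = v;
    pthread_rwlock_unlock(&rwlock);
}

// Readers share the lock with other readers.
int ReadValue()
{
    pthread_rwlock_rdlock(&rwlock);
    int v = shared_value;
    pthread_rwlock_unlock(&rwlock);
    return v;
}

// Returns true if, while one read lock is held, a second read lock can be
// taken and a write lock cannot.
bool ReadersShareWritersDoNot()
{
    pthread_rwlock_rdlock(&rwlock);
    bool second_reader = pthread_rwlock_tryrdlock(&rwlock) == 0;
    if (second_reader)
    {
        pthread_rwlock_unlock(&rwlock);  // release the second read lock
    }
    bool writer_refused = pthread_rwlock_trywrlock(&rwlock) == EBUSY;
    pthread_rwlock_unlock(&rwlock);      // release the first read lock
    return second_reader && writer_refused;
}
```

This kind of lock tends to pay off when reads greatly outnumber writes; for short, evenly mixed critical sections a plain mutex is usually simpler.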

spin lock
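
A minimal sketch of a spin lock protecting a very short critical section, assuming the POSIX pthread_spin_* API. The thread count, the iteration count and the names `Add` and `CountWithSpinLock` are illustrative choices.

```cpp
#include <pthread.h>

namespace
{
pthread_spinlock_t spin;
long counter = 0;
constexpr int kThreads = 4;         // illustrative values
constexpr int kIncrements = 10000;

void *Add(void *)
{
    for (int i = 0; i < kIncrements; ++i)
    {
        pthread_spin_lock(&spin);    // busy-waits instead of sleeping
        ++counter;                   // the critical section is kept very short
        pthread_spin_unlock(&spin);
    }
    return nullptr;
}
}  // namespace

// Runs kThreads threads that each add kIncrements to the counter and returns
// the final value.
long CountWithSpinLock()
{
    counter = 0;
    pthread_spin_init(&spin, PTHREAD_PROCESS_PRIVATE);
    pthread_t threads[kThreads];
    int started = 0;
    for (int i = 0; i < kThreads; ++i)
    {
        if (pthread_create(&threads[started], nullptr, Add, nullptr) == 0)
        {
            ++started;
        }
    }
    // Join only the threads that were actually created.
    for (int i = 0; i < started; ++i)
    {
        pthread_join(threads[i], nullptr);
    }
    pthread_spin_destroy(&spin);
    return counter;
}
```

Because a waiting thread keeps the CPU busy, this only makes sense when the lock is held for a handful of instructions, as in the increment here.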

Thread synchronization wrapper class-multi-threaded environment

#include <pthread.h>
#include <semaphore.h>
#include <exception>

class Sem
{
public:
    Sem ()
    {
        if ( sem_init (&sem_, 0 , 0 ) != 0 )
        {
            throw std::exception ();
        }
    }
    ~Sem ()
    {
        sem_destroy (&sem_);
    }
    bool Wait ()
    {
        return sem_wait (&sem_) == 0 ;
    }
    bool Post ()
    {
        return sem_post (&sem_) == 0 ;
    }
private:
    sem_t sem_;
};

class Mutex
{
public:
    Mutex ()
    {
        if ( pthread_mutex_init (&mutex_, nullptr ) != 0 )
        {
            throw std::exception ();
        }

    }
    ~Mutex ()
    {
        pthread_mutex_destroy (&mutex_);
    }
    bool Lock ()
    {
        return pthread_mutex_lock (&mutex_) == 0 ;
    }
    bool Unlock ()
    {
        return pthread_mutex_unlock (&mutex_) == 0 ;
    }

private:
    pthread_mutex_t mutex_;
};

class Cond
{
public:
    Cond ()
    {
        if ( pthread_mutex_init (&mutex_, nullptr ) != 0 )
        {
            throw std::exception ();
        }
        if ( pthread_cond_init (&cond_, nullptr ) != 0 )
        {
            // I did not think of this at first: release the mutex that was already initialized
            pthread_mutex_destroy (&mutex_);
            throw std::exception ();
        }
    }
    ~Cond ()
    {
        pthread_mutex_destroy (&mutex_);
        pthread_cond_destroy (&cond_);
    }
    bool Wait ()
    {
        int ret = 0 ;
        pthread_mutex_lock (&mutex_);
        ret = pthread_cond_wait (&cond_, &mutex_);
        pthread_mutex_unlock (&mutex_);
        return ret == 0 ;
    }
    bool Signal ()
    {
        return pthread_cond_signal (&cond_) == 0 ;
    }
private:
    pthread_cond_t cond_;
    pthread_mutex_t mutex_;
};

Thread-safe or reentrant functions - functions can be called by multiple threads at the same time without race conditions

When a thread in a multithreaded program calls fork, the new process does not get all the threads of the parent process. The child process has only one thread, a copy of the thread that called fork.

However, the child process inherits the state of the parent's mutexes (and condition variables). If a mutex is locked, but not by the thread that called fork, the child process deadlocks when it tries to lock that mutex again.

#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

pthread_mutex_t mutex;
void *another(void *arg)
{
    printf("in child thread, lock the mutex\n");
    pthread_mutex_lock(&mutex);
    sleep(5);
    // Prepare can lock the mutex only after this unlock
    pthread_mutex_unlock(&mutex);
    pthread_exit(nullptr);
}
// called before fork creates the child process
void Prepare()
{
    // blocks until the thread running another() unlocks the mutex
    // fork does not create the child process until this function returns
    pthread_mutex_lock(&mutex);
}
// called in both the parent and the child after fork creates the child process and before fork returns
void Infork()
{
    pthread_mutex_unlock(&mutex);
}
int main()
{
    pthread_mutex_init(&mutex, nullptr);
    pthread_t id;
    pthread_create(&id, nullptr, another, nullptr);

    // give the other thread time to lock the mutex
    sleep(1);
    // pthread_atfork(Prepare, Infork, Infork);
    int pid = fork();
    if (pid < 0)
    {
        printf("emmm????\n");
        pthread_join(id, nullptr);
        pthread_mutex_destroy(&mutex);
        return 1;
    }
    else if (pid == 0)
    {
        printf("child process, want to get the lock\n");
        pthread_mutex_lock(&mutex);
        printf("i can't run to here, oops....\n");
        pthread_mutex_unlock(&mutex);
        exit(0);
    }
    else
    {
        printf("wait start\n");
        wait(nullptr);
        printf("wait over\n"); // not printed, because the child process never terminates
    }
    pthread_join(id, nullptr);
    pthread_mutex_destroy(&mutex);
    return 0;
}
// output with the pthread_atfork line commented out (deadlock)
// $ in child thread, lock the mutex
// $ wait start
// $ child process, want to get the lock

// output with the pthread_atfork line enabled
// $ in child thread, lock the mutex
// $ wait start
// $ child process, want to get the lock
// $ i can't run to here, oops....
// $ wait over

The original version deadlocks, but the new version (with the pthread_atfork line uncommented) runs normally.

 int pthread_atfork ( void (*__prepare) ( void ),
			   void (*__parent) ( void ),
			   void (*__child) ( void ));

The first handler runs before fork creates the child process. The second handler runs in the parent process after fork creates the child process and before fork returns. The third handler runs in the child process after fork creates the child process and before fork returns.

Chapter 15 Process Pool and Thread Pool

Thread pool and simple HTTP server

The thread pool, which had been a mystery to me for a long time, has finally been unveiled. So this is all a thread pool is, 23333

After writing the thread pool, I directly wrote the HTTP server in the book.

I found at least two problems with that server

Unable to send large files
Some requests cannot be responded to

Large files cannot be sent because the book uses writev to send the data. At first I thought the check for writev returning -1 was there to handle large files. Later I found that this check only covers the case where sending fails right at the start.

I happened to look at the code of a server a while ago https://github.com/Jigokubana/Notes-flamingo

I simply modified the sending part directly.

// write_sum_  total number of bytes to send
// write_idx_  number of bytes already sent

int temp = 0;
if (write_sum_ - write_idx_ == 0)
{
    Modfd(epollfd_, sockfd_, EPOLLIN);
    Init();
    return true;
}
while (true)
{
    temp = send(sockfd_, &*write_buff_.begin() + write_idx_, write_sum_ - write_idx_, 0);
    if (temp <= -1)
    {
        if (errno == EAGAIN)
        {
            // the send buffer is full: wait for the next EPOLLOUT event
            Modfd(epollfd_, sockfd_, EPOLLOUT);
            return true;
        }
        // any other error: stop here instead of adding -1 to write_idx_
        return false;
    }
    write_idx_ += temp;

    if (write_idx_ == write_sum_)
    {
        // the unbinding was moved elsewhere
        if (linger_)
        {
            Init();
            Modfd(epollfd_, sockfd_, EPOLLIN);
            return true;
        }
        else
        {
            Modfd(epollfd_, sockfd_, EPOLLIN);
            return false;
        }
    }
}
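
If one wanted to keep writev instead of switching to send, the usual missing piece is advancing the iovec array after a short write. The helper below is a sketch of that idea only; `AdvanceIov` is a hypothetical name and this is not the code from the book or from the server linked above.

```cpp
#include <sys/uio.h>
#include <cstddef>

// Advances an iovec array past `written` bytes, as reported by a short writev.
// Returns the index of the first element that still has data to send; that
// element's iov_base and iov_len are adjusted in place. Returns `count` when
// everything has been written.
int AdvanceIov(struct iovec *iov, int count, size_t written)
{
    int i = 0;
    // Skip every element that was written completely.
    while (i < count && written >= iov[i].iov_len)
    {
        written -= iov[i].iov_len;
        iov[i].iov_len = 0;
        ++i;
    }
    // Trim the element that was written only partially.
    if (i < count)
    {
        iov[i].iov_base = static_cast<char *>(iov[i].iov_base) + written;
        iov[i].iov_len -= written;
    }
    return i;
}
```

A caller would then retry with `writev(fd, iov + first, count - first)`, where `first` is the returned index, until it equals `count` or writev fails with EAGAIN.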

The second, stranger problem is that some requests get no reply under ab stress testing. I will solve this later, when I know more.
