The TCP/IP protocol family contains many protocols. This book covers only IP and TCP, which have the most direct impact on network programming.
The OSI reference model has seven layers. The simplified TCP/IP model has four layers that communicate with each other through interfaces, which makes it easier to modify each layer independently.
Application layer Responsible for handling application logic
Presentation layer Defines the format and encryption of data
Session layer: defines how to start, control, and end a session, including the control and management of multiple bidirectional messages, so that the application can be notified when only part of a continuous message has completed and the data seen by the presentation layer is continuous.
Transport layer: provides end-to-end communication for applications on two hosts. Unlike the network layer, which works hop by hop, it only cares about the source and destination endpoints and leaves the transfer process to the lower layers. There are two major protocols in this layer, TCP and UDP.
TCP (Transmission Control Protocol) provides a reliable, connection-oriented, stream-based service to the application layer, using timeout retransmission and data acknowledgment.
UDP (User Datagram Protocol) provides an unreliable, connectionless, datagram-based service to the application layer. The application must handle data acknowledgment and timeout retransmission itself. Each UDP datagram has its own length.
Network layer: routes and forwards data packets. If a packet cannot be delivered directly to the destination address, it is forwarded hop by hop, each time choosing the next hop closest to the destination. The protocols here are IP (Internet Protocol) and ICMP (Internet Control Message Protocol). ICMP supplements IP and is used to probe network connectivity. It has two kinds of messages: 1. Error messages, which report status. 2. Query messages (the ping program uses these to check whether the target can be reached).
Data link layer: implements the network driver for the network card interface. The driver hides the manufacturer's lower-layer changes and only needs to expose a fixed interface to the upper layer. There are two protocols here: ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol). The network layer addresses machines by IP address, but the data link layer uses physical addresses (usually MAC addresses), so ARP is used to convert between them. ARP spoofing is probably related to this; I am not studying it for now.
Encapsulation: data is passed from an upper-layer protocol to a lower-layer protocol through encapsulation. Each layer adds its own header information as the data moves down.
Data encapsulated by TCP becomes a TCP segment.
Data encapsulated by UDP becomes a UDP datagram.
After being encapsulated by IP, it becomes an IP datagram.
Finally, it is encapsulated by the data link layer and becomes a frame.
The maximum Ethernet frame is 1518 bytes, which includes a 14-byte header and a 4-byte checksum at the end of the frame. MTU: the maximum transmission unit, that is, the largest payload a frame can carry, generally 1500 bytes. MSS: the maximum data payload of a TCP segment, 1460 bytes = 1500 bytes - 20 (IP header) - 20 (TCP header). The TCP header can additionally carry up to 40 bytes of options.
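The arithmetic above can be checked with a few constants. This is only a sketch that assumes Ethernet II framing and IPv4/TCP headers without options; the constant and function names are made up for illustration.

```cpp
#include <cassert>

// Sizes in bytes, assuming Ethernet II framing, IPv4 and TCP without options.
constexpr int kEthernetMaxFrame = 1518;
constexpr int kEthernetHeader = 14;
constexpr int kEthernetChecksum = 4;
constexpr int kIpv4Header = 20;
constexpr int kTcpHeader = 20;

// MTU: what is left of the frame after the link-layer header and checksum.
constexpr int mtu() { return kEthernetMaxFrame - kEthernetHeader - kEthernetChecksum; }

// MSS: what is left of the MTU after the IP and TCP headers.
constexpr int mss() { return mtu() - kIpv4Header - kTcpHeader; }

static_assert(mtu() == 1500, "MTU of a standard Ethernet frame");
static_assert(mss() == 1460, "MSS when neither header carries options");
```

Any TCP options in use reduce the room left for data further, which is why the 1460 figure only holds for option-free headers.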
ARP The ARP protocol can realize the conversion of any network layer address to any physical address.
The IP protocol is the core protocol of the TCP/IP protocol suite and one of the foundations of socket network programming. The IP protocol provides stateless, connectionless, and unreliable services for upper-layer protocols.
The maximum length of an IP datagram is 65535 (2^16 - 1) bytes, but it is also limited by the MTU.
When the length of an IP datagram exceeds the MTU, it is fragmented for transmission. Fragmentation may occur at the sender or at a transit router, and a datagram may be fragmented multiple times. Only on the final target machine are the fragments reassembled by the IP module in the kernel.
routing mechanism
After the target IP address is given, which item in the routing table will be matched? There are three steps.
TCP reads and writes operate on buffers, so there is no fixed correspondence between the number of reads and the number of writes.
UDP has no such stream buffering: each send produces one datagram. The receiver must read each datagram promptly or packets will be lost, and if the receive buffer passed by the application is too small, the datagram is truncated.
ISN: the initial sequence number.
32-bit sequence number: the sequence number in each subsequent TCP segment is seq = ISN + the offset of the segment's first byte in the entire byte stream.
32-bit acknowledgment number: the next sequence number expected from the other side. For a SYN or FIN segment this is the sequence number of the received segment + 1. It is sent as the response to the segment received from the other side.
ACK flag: indicates whether the acknowledgment number is valid. A segment carrying the ACK flag is called an acknowledgment segment.
PSH flag: prompts the receiving application to read data from the TCP receive buffer promptly to make room for subsequent data.
RST flag: asks the other side to re-establish the connection. A segment carrying it is called a reset segment.
SYN flag: requests to establish a connection. A segment carrying it is called a synchronization segment.
FIN flag: informs the other side that the local end is closing the connection. A segment carrying it is called a finish segment.
16-bit window size: the window is the receiver's advertised window, which tells the other side how many more bytes the local TCP receive buffer can hold.
16-bit checksum: an important guarantee of reliable transmission. The sender fills it in and the receiver verifies it to detect corruption. It covers both the TCP header and the data. (The book describes this as a CRC check; the TCP checksum is actually a 16-bit ones' complement sum.)
TCP connection establishment and closing
# Three-way handshake
# The client sends a connection request, ISN = seq + 0 = 3683340920
# mss: maximum data payload, 1460
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [S], seq 3683340920, win 64240,
options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0
# The server accepts the client's connection
# ack = seq sent by the client + 1
# The server sends its own seq at the same time
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [S.], seq 938535101, ack 3683340921, win 64240,
options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
# This segment carries no data bytes, but because it is a synchronization segment it still consumes one sequence number
# This is how tcpdump displays it: ack is shown as a relative value, i.e. 3683340921 - 3683340920 = 1
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [.], ack 938535102, win 4106, length 0
# Contains the FIN flag, which requests closing the connection; it also consumes one sequence number
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [F.], seq 1, ack 1, win 4106, length 0
# The server acknowledges the close
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [.], ack 2, win 502, length 0
# The server sends its own close
IP ubuntu.8000 > 192.168.80.1.7467:
Flags [F.], seq 1, ack 2, win 4105, length 0
# The client acknowledges
IP 192.168.80.1.7467 > ubuntu.8000:
Flags [.], ack 2, win 503, length 0
The basic socket API is in the sys/socket.h header file. A socket originally meant an IP address and port pair, which uniquely identifies one end of a TCP connection. The network information API is in the netdb.h header file.
Byte order is divided into big-endian and little-endian. Since most PCs use little-endian byte order (the high-order byte is stored at the higher address), little-endian byte order is also called host byte order.
To prevent confusion between machines with different byte orders, data is always transmitted in big-endian byte order (network byte order). Each host then decides, based on its own byte order, whether to convert the data it receives.
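A small sketch of what the conversion does in memory. The two helper names are hypothetical; htonl comes from arpa/inet.h, and the probe value is arbitrary.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
bool host_is_little_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char bytes[2];
    std::memcpy(bytes, &probe, sizeof(probe));
    return bytes[0] == 0x02;
}

// Hypothetical helper: the byte that ends up first in memory (and so first on
// the wire) after converting a host value to network byte order.
unsigned char first_wire_byte(std::uint32_t host_value) {
    const std::uint32_t net = htonl(host_value);
    unsigned char bytes[4];
    std::memcpy(bytes, &net, sizeof(net));
    return bytes[0];
}
```

Whatever host_is_little_endian() reports, first_wire_byte(0x11223344) is 0x11, because network byte order always puts the most significant byte first.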
basic connection
// Conversion between host byte order and network byte order
#include <netinet/in.h>
unsigned long int htonl(unsigned long int hostlong); // host to network long
unsigned short int htons(unsigned short int hostshort); // host to network short
unsigned long int ntohl(unsigned long int netlong); // network to host long
unsigned short int ntohs(unsigned short int netshort); // network to host short
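// IP address conversion functions
#include <arpa/inet.h>
// Converts a dotted-decimal IPv4 string into an IPv4 address stored as an integer in network byte order. Returns INADDR_NONE on failure
in_addr_t inet_addr(const char *strptr);
// Same function, but the result is stored in the structure pointed to by inp. Returns 1 on success and 0 on failure
int inet_aton(const char *cp, struct in_addr *inp);
// Returns the address of a static buffer, so each call overwrites the previous result
char *inet_ntoa(struct in_addr in);
// src is a dotted-decimal IPv4 string or a hexadecimal IPv6 string; the result is stored in the memory at dst. af specifies the address family
// af can be AF_INET or AF_INET6. Returns 1 on success, 0 if src is not a valid address, and -1 if af is not supported
int inet_pton(int af, const char *src, void *dst);
// Address family, the address to convert, the output buffer, and its length (two constants are available: INET_ADDRSTRLEN, INET6_ADDRSTRLEN)
const char *inet_ntop(int af, const void *src, char *dst, socklen_t cnt);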
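As a rough illustration of the two newer functions, the sketch below parses an IPv4 string with inet_pton and formats it back with inet_ntop. The helper name ipv4_round_trip is made up for this note.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <string>

// Hypothetical helper: parses a dotted-decimal IPv4 string and formats it back.
// Returns an empty string when the input is not a valid IPv4 address.
std::string ipv4_round_trip(const char* text) {
    in_addr addr{};
    if (inet_pton(AF_INET, text, &addr) != 1) {
        return "";  // 0 means "not a valid address", -1 means unsupported family
    }
    char buf[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr, buf, sizeof(buf)) == nullptr) {
        return "";
    }
    return buf;
}
```

Unlike inet_ntoa, inet_ntop writes into a caller-supplied buffer, so two results can be kept side by side without one overwriting the other.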
// Create, name, and listen on a socket
#include <sys/types.h>
#include <sys/socket.h>
// domain specifies the protocol family: PF_INET, PF_INET6
// type specifies the service type: SOCK_STREAM (TCP), SOCK_DGRAM (UDP)
// protocol is set to the default 0
// Returns a socket file descriptor on success (everything is a file on Linux), -1 on failure
int socket(int domain, int type, int protocol);
// socket is the socket file descriptor
// my_addr is the address information
// addrlen is the length of the socket address
// Returns 0 on success, -1 on failure
int bind(int socket, const struct sockaddr *my_addr, socklen_t addrlen);
// backlog is the maximum length of the queue
int listen(int socket, int backlog);
// Accepts a connection. Returns -1 on failure and the connected socket on success
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
client
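// Initiate a connection
#include <sys/types.h>
#include <sys/socket.h>
// The third parameter is the length of the address
// Returns 0 on success, -1 on failure
int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
// Close a connection
#include <unistd.h>
// The parameter is the saved socket descriptor
// Does not close the connection immediately: it decrements the reference count of the socket, and the connection is only closed when the count reaches 0 (needs checking)
int close(int fd);
// Shut down immediately
#include <sys/socket.h>
// The second parameter is one of:
// SHUT_RD closes reading; all data in the socket receive buffer is discarded
// SHUT_WR closes writing; all data in the socket send buffer is sent before closing
// SHUT_RDWR closes both reading and writing
// Returns 0 on success, -1 on failure and sets errno
int shutdown(int sockfd, int howto);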
Basic TCP
#include <sys/socket.h>
#include <sys/types.h>
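// Read data from sockfd
// buf is the location of the read buffer
// len is the size of the read buffer
// flags takes many options
// On success returns the number of bytes read, which may be less than expected, so several reads may be needed. A return of 0 means the peer has closed the connection; -1 means an error
ssize_t recv(int sockfd, void *buf, size_t len, int flags);
// Send
ssize_t send(int sockfd, const void *buf, size_t len, int flags);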
Option name | Meaning | Available for sending | Available for receiving |
---|---|---|---|
MSG_CONFIRM | Tells the link layer protocol to keep listening for a reply from the peer until one arrives (only for SOCK_DGRAM and SOCK_RAW sockets) | Y | N |
MSG_DONTROUTE | Skips the routing table and sends the data directly to a host on the local LAN (the sender knows the target host is on the local network) | Y | N |
MSG_DONTWAIT | Makes this operation non-blocking | Y | Y |
MSG_MORE | Tells the kernel that more data is coming, so it waits for that data to be written into the buffer and sends it all together. This reduces short segments and improves transmission efficiency | Y | N |
MSG_WAITALL | The read returns only after the requested number of bytes has been read | N | Y |
MSG_PEEK | Looks at the data in the receive buffer without removing it | N | Y |
MSG_OOB | Sends or receives urgent (out-of-band) data | Y | Y |
MSG_NOSIGNAL | Writing to a pipe or socket whose read end is closed does not raise SIGPIPE | Y | N |
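To tie the calls above together, here is a single-process sketch of the whole TCP path: socket, bind, listen, connect, accept, send, recv, shutdown, and close, plus one MSG_PEEK read. It assumes the loopback interface is available and lets the kernel pick the port; the function name tcp_loopback_once is hypothetical, and a real server would of course run the two ends in different processes.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: sends `msg` over a loopback TCP connection and returns
// what the accepted side reads until the peer closes its write half.
std::string tcp_loopback_once(const std::string& msg) {
    int listenfd = socket(PF_INET, SOCK_STREAM, 0);
    assert(listenfd >= 0);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                       // 0: let the kernel pick a free port
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    int ret = bind(listenfd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    assert(ret == 0);
    ret = listen(listenfd, 5);
    assert(ret == 0);

    socklen_t len = sizeof(addr);
    ret = getsockname(listenfd, reinterpret_cast<sockaddr*>(&addr), &len);  // learn the port
    assert(ret == 0);

    int clientfd = socket(PF_INET, SOCK_STREAM, 0);
    assert(clientfd >= 0);
    ret = connect(clientfd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    assert(ret == 0);
    int connfd = accept(listenfd, nullptr, nullptr);
    assert(connfd >= 0);

    ssize_t sent = send(clientfd, msg.data(), msg.size(), 0);
    assert(sent == static_cast<ssize_t>(msg.size()));
    shutdown(clientfd, SHUT_WR);                    // sends FIN after the data

    char buf[64];
    ssize_t peeked = recv(connfd, buf, 1, MSG_PEEK);  // look without consuming
    assert(peeked >= 0);

    std::string received;
    ssize_t n;
    while ((n = recv(connfd, buf, sizeof(buf), 0)) > 0) {
        received.append(buf, static_cast<size_t>(n));  // recv may return partial data
    }

    close(connfd);
    close(clientfd);
    close(listenfd);
    return received;
}
```

The peeked byte is still returned by the following recv loop, which is the point of MSG_PEEK. The loop ends when recv returns 0, meaning the peer has closed its side.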
Basic UDP
#include <sys/types.h>
#include <sys/socket.h>
// Because UDP keeps no connection state, the destination address must be supplied every time data is sent.
// recvfrom and sendto can also be used on connection-oriented (STREAM) sockets, in which case the sender and receiver socket addresses can be omitted
ssize_t recvfrom(int sockfd, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen);
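A minimal single-process sketch of the two UDP calls, assuming the loopback interface is available. The helper name udp_loopback_once is made up; the receiving socket is bound to a kernel-chosen port and the sender addresses each datagram explicitly.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: sends one datagram to a loopback UDP socket and
// returns the payload that recvfrom delivers.
std::string udp_loopback_once(const std::string& msg) {
    int rx = socket(PF_INET, SOCK_DGRAM, 0);
    assert(rx >= 0);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);                        // kernel picks the port
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    int ret = bind(rx, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    assert(ret == 0);
    socklen_t len = sizeof(addr);
    ret = getsockname(rx, reinterpret_cast<sockaddr*>(&addr), &len);
    assert(ret == 0);

    int tx = socket(PF_INET, SOCK_DGRAM, 0);
    assert(tx >= 0);
    // No connect: the destination address travels with every sendto call.
    ssize_t sent = sendto(tx, msg.data(), msg.size(), 0,
                          reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    assert(sent == static_cast<ssize_t>(msg.size()));

    char buf[64];                                    // a longer datagram would be truncated
    sockaddr_in peer{};
    socklen_t peerlen = sizeof(peer);
    ssize_t n = recvfrom(rx, buf, sizeof(buf), 0,
                         reinterpret_cast<sockaddr*>(&peer), &peerlen);

    close(tx);
    close(rx);
    return n > 0 ? std::string(buf, static_cast<size_t>(n)) : std::string();
}
```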
General read and write functions
#include <sys/socket.h>
ssize_t recvmsg(int sockfd, struct msghdr *msg, int flags);
ssize_t sendmsg(int sockfd, struct msghdr *msg, int flags);
struct msghdr
{
/* socket address: points to a socket address structure; must be NULL for TCP connections */
void *msg_name;
socklen_t msg_namelen;
/* Scattered memory blocks: for recvmsg, the data read is stored in these blocks, whose
* locations and lengths are given by the array msg_iov points to; this is a scatter read.
* For sendmsg, the data in the msg_iovlen blocks is sent together; this is a gather write.
*/
struct iovec *msg_iov;
int msg_iovlen; /* number of scattered memory blocks */
void *msg_control; /* start of the ancillary data */
socklen_t msg_controllen; /* size of the ancillary data */
int msg_flags; /* copy of the flags argument, updated during the call */
};
struct iovec
{
void *iov_base; /* start address of the memory block */
size_t iov_len; /* length of the memory block */
};
Other APIs
#include <sys/socket.h>
// Checks whether sockfd is at the out-of-band mark, i.e. whether the next data to be read is out-of-band data.
// Returns 1 if so, 0 otherwise
// The caller can then use recv with the MSG_OOB flag to receive the out-of-band data.
int sockatmark(int sockfd);
// getsockname gets the local socket address of sockfd and stores it in the memory at address, with its length in address_len. Returns 0 on success, -1 on failure
// getpeername gets the remote address in the same way
int getsockname(int sockfd, struct sockaddr *address, socklen_t *address_len);
int getpeername(int sockfd, struct sockaddr *address, socklen_t *address_len);
/* The following functions use the same header file */
// sockfd is the target socket; level is the protocol the option belongs to (IPv4, IPv6, TCP); option_name is the name of the option, followed by its value and length
// Returns 0 on success, -1 on failure
int getsockopt(int sockfd, int level, int option_name, void *option_value,
socklen_t *restrict option_len);
int setsockopt(int sockfd, int level, int option_name, const void *option_value,
socklen_t option_len);
Option | Meaning | Notes |
---|---|---|
SO_REUSEADDR | Reuse local address | Once this option is set on a socket, the address it is bound to can be reused immediately to bind a new socket, even while the old socket is in the TIME_WAIT state. |
SO_RCVBUF | TCP receive buffer size | The minimum value is 256 bytes. The system doubles the value you set; the extra half serves as spare buffer space to deal with congestion. |
SO_SNDBUF | TCP send buffer size | The minimum value is 2048 bytes. |
SO_RCVLOWAT | Receive low-water mark | The default is 1 byte. When the amount of readable data in the TCP receive buffer exceeds the low-water mark, the I/O multiplexing system call notifies the application that the socket can be read. |
SO_SNDLOWAT | Send low-water mark | The default is 1 byte. Data can be written when the free space in the TCP send buffer exceeds the low-water mark. |
SO_LINGER | Linger on close | Configured with struct linger. |
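struct linger
{
int l_onoff; /* non-zero to enable, 0 to disable */
int l_linger; /* linger time */
/*
* When l_onoff is 0 this option has no effect, and close uses its default behavior to close the socket
* When l_onoff is non-zero and l_linger is 0, close returns immediately, TCP discards the data left in the send buffer and sends a reset segment
* When l_onoff is non-zero and l_linger is greater than 0: on a blocking socket, close waits until the TCP module has sent the remaining data
* and received the acknowledgment; on a non-blocking socket, close returns immediately
*/
};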
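A short sketch of setsockopt and getsockopt in use, under the assumption that the program runs on Linux (the doubling of SO_RCVBUF is Linux behavior). Both helper names are invented for this note.

```cpp
#include <cassert>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: enables SO_REUSEADDR on a fresh TCP socket and reports
// whether the kernel says the option is on afterwards.
bool reuseaddr_sticks() {
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;
    int on = 1;
    int value = 0;
    socklen_t len = sizeof(value);
    bool ok = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == 0 &&
              getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &value, &len) == 0 &&
              value != 0;
    close(fd);
    return ok;
}

// Hypothetical helper: asks for a receive buffer of `requested` bytes and
// returns the size the kernel reports back, or -1 on error.
int rcvbuf_after_request(int requested) {
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    int value = -1;
    socklen_t len = sizeof(value);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) != 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &value, &len) != 0) {
        value = -1;
    }
    close(fd);
    return value;
}
```

On a typical Linux system the value read back is larger than the value requested, which matches the note about doubling in the table above; the exact figure depends on kernel limits.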
Network Information API
#include <netdb.h>
// Look up an IP address by host name
struct hostent *gethostbyname(const char *name);
// Get the complete host information by IP address
// type is the IP address type: AF_INET or AF_INET6
struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);
struct hostent
{
char *h_name; /* Official name of host. */
char **h_aliases; /* Alias list. */
int h_addrtype; /* Host address type. */
int h_length; /* Length of address. */
char **h_addr_list; /* List of addresses from name server. */
};
int main ( int argc , char * argv [])
{
if ( argc != 2 )
{
printf("Invalid input\n");
exit ( 0 );
}
char * name = argv [ 1 ];
struct hostent * hostptr {};
hostptr = gethostbyname ( name );
if ( hostptr == nullptr )
{
printf("Invalid input or lookup failed\n");
exit ( 0 );
}
printf("Official name of hostptr: %s\n", hostptr->h_name);
char **pptr;
char inet_addr[INET_ADDRSTRLEN];
printf("Alias list:\n");
for (pptr = hostptr->h_aliases; *pptr != nullptr; ++pptr)
{
printf("\t%s\n", *pptr);
}
switch ( hostptr -> h_addrtype )
{
case AF_INET :
{
printf("List of addresses from name server:\n");
for (pptr = hostptr->h_addr_list; *pptr != nullptr; ++pptr)
{
printf("\t%s\n",
inet_ntop(hostptr->h_addrtype, *pptr, inet_addr, sizeof(inet_addr)));
}
break ;
}
default :
{
printf("unknown address type\n");
exit ( 0 );
}
}
return 0 ;
}
/*
./run baidu.com
Official name of hostptr: baidu.com
Alias list:
List of addresses from name server:
39.156.69.79
220.181.38.148
*/
The following two functions obtain service information by reading the /etc/services file. The description below is from Wikipedia.
The services file is a configuration file in the /etc directory of modern operating systems. It records the port number and protocol corresponding to each network service name.
#include <netdb.h>
// Get the complete information of a service by name
struct servent *getservbyname(const char *name, const char *proto);
// Get service information by port number
struct servent *getservbyport(int port, const char *proto);
struct servent
{
char *s_name; /* service name */
char **s_aliases; /* list of aliases of the service */
int s_port; /* port number */
char *s_proto; /* service type, usually TCP or UDP */
};
#include <netdb.h>
// Uses gethostbyname and getservbyname internally
// hostname receives the host name; it can also be an IP address as a string (dotted decimal, or a hexadecimal string)
// service receives the service name, or a decimal port number as a string
// hints gives finer control over the output of getaddrinfo; it can be NULL, which allows any useful result to be returned
// result points to a linked list that stores the results of getaddrinfo
int getaddrinfo(const char *hostname, const char *service, const struct addrinfo *hints, struct addrinfo **result);
struct addrinfo
{
int ai_flags;
int ai_family;
int ai_socktype; /* service type, SOCK_STREAM or SOCK_DGRAM */
int ai_protocol;
socklen_t ai_addrlen;
char *ai_canonname; /* canonical name of the host */
struct sockaddr *ai_addr; /* points to the socket address */
struct addrinfo *ai_next; /* points to the next structure */
};
// The heap memory must be released manually
void freeaddrinfo(struct addrinfo *res);
#include <netdb.h>
// host stores the returned host name
// serv stores the returned service name
int getnameinfo(const struct sockaddr *sockaddr, socklen_t addrlen, char *host, socklen_t hostlen, char *serv,
socklen_t servlen, int flags);
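The pair getaddrinfo/getnameinfo can be tried without any DNS or /etc/services lookup by restricting both to numeric input and output. This is only a sketch; the helper name resolve_numeric and the buffer sizes are assumptions made for the example.

```cpp
#include <cassert>
#include <netdb.h>
#include <string>
#include <sys/socket.h>

// Hypothetical helper: turns a numeric host and port into a socket address
// with getaddrinfo, then back into "host:port" text with getnameinfo.
// Returns an empty string if either step fails.
std::string resolve_numeric(const char* host, const char* port) {
    addrinfo hints{};
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;  // no name or service lookup

    addrinfo* result = nullptr;
    if (getaddrinfo(host, port, &hints, &result) != 0) {
        return "";
    }

    char hostbuf[64];
    char servbuf[16];
    int ret = getnameinfo(result->ai_addr, result->ai_addrlen,
                          hostbuf, sizeof(hostbuf), servbuf, sizeof(servbuf),
                          NI_NUMERICHOST | NI_NUMERICSERV);
    freeaddrinfo(result);  // the list is heap-allocated and must be released
    if (ret != 0) {
        return "";
    }
    return std::string(hostbuf) + ":" + servbuf;
}
```

Dropping the two AI_NUMERIC flags would let the same call resolve real host and service names, at the cost of depending on the resolver configuration of the machine.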
Test use
telnet ip port # connect to this port on the server
netstat -nt | grep port # check the status of this port
Linux provides advanced I/O functions that perform better under specific conditions, and those same conditions limit how often they are used. A file descriptor is a non-negative integer: an index into the table of open files that the kernel maintains for each process. STDOUT_FILENO (value 1) is standard output. After closing STDOUT_FILENO, dup returns the smallest available descriptor (here, 1), so standard output is redirected to the file referred to by the argument passed to dup.
pipe function: creates a pipe for communication between processes.
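// Function definition
// The parameter is an array of two file descriptors: fd[0] is the read end and fd[1] is the write end (one-way pipe)
// On success returns 0 and fills the array with a pair of open file descriptors
// On failure returns -1 and sets errno
#include <unistd.h>
int pipe(int fd[2]);
// Two-way pipe
// The first parameter is the protocol family PF_UNIX (the book uses AF_UNIX); PF_ feels like a better fit when naming a protocol family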
#include <sys/types.h>
#include <sys/socket.h>
int socketpair ( int domain , int type , int protocol , int fd [ 2 ]);
After studying the following content and understanding of inter-process communication, I will come back and add an example.
int main ()
{
int fds [ 2 ];
socketpair ( PF_UNIX , SOCK_STREAM , 0 , fds );
int pid = fork ();
if ( pid == 0 )
{
close ( fds [ 0 ]);
char a [] = "123" ;
send ( fds [ 1 ], a , strlen ( a ), 0 );
}
else if ( pid > 0 )
{
close ( fds [ 1 ]);
char b [ 20 ] {};
recv ( fds [ 0 ], b , 20 , 0 );
printf ( "%s" , b );
}
}
dup and dup2 functions: copy an existing file descriptor
#include <unistd.h>
// The returned file descriptor is always the smallest integer currently available
int dup(int oldfd);
// newfd specifies the new file descriptor; if newfd is already open, it is closed first
// If newfd == oldfd, newfd is not closed and is returned directly
int dup2(int oldfd, int newfd);
The dup function creates a new file descriptor. The new file descriptor and the original file descriptor both refer to the same target. Example added later: because STDOUT_FILENO is closed first, the smallest descriptor available to dup is STDOUT_FILENO, so standard output goes to the file.
int main ()
{
int filefd = open ( "/home/lsmg/1.txt" , O_WRONLY );
close ( STDOUT_FILENO );
dup ( filefd );
printf("123\n");
exit ( 0 );
}
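dup2 can be sketched in the same spirit without touching standard output. The helper below is hypothetical: it copies the write end of a pipe onto a descriptor number chosen by the caller, writes through the copy, and reads the data back from the pipe.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Hypothetical helper: copies the write end of a pipe onto descriptor number
// `newfd` with dup2, writes `msg` through the copy, and returns what the read
// end sees. `newfd` must not be a descriptor the program still needs, because
// dup2 silently closes it first.
std::string write_through_dup2(const std::string& msg, int newfd) {
    assert(newfd >= 0);
    int fds[2];
    if (pipe(fds) != 0) return "";

    int copy = dup2(fds[1], newfd);      // newfd now refers to the pipe's write end
    if (copy != newfd) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (fds[1] != newfd) close(fds[1]);  // the pipe stays open through the copy

    ssize_t written = write(copy, msg.data(), msg.size());
    close(copy);                         // last write end closed: the reader sees EOF
    if (written != static_cast<ssize_t>(msg.size())) {
        close(fds[0]);
        return "";
    }

    std::string out;
    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
        out.append(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    return out;
}
```

Passing STDOUT_FILENO as newfd would give the redirection described above in one call, without the separate close.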
readv/writev
#include <sys/uio.h>
// count is the length of vector, i.e. the number of memory blocks
// On success returns the number of bytes read or written; on failure returns -1
ssize_t readv(int fd, const struct iovec *vector, int count);
ssize_t writev(int fd, const struct iovec *vector, int count);
struct iovec {
void *iov_base; /* start address of the memory block */
size_t iov_len; /* length of the memory block */
};
Usage example added later. This example writes the in-memory representation of an int into a file. Viewing the file with hexdump shows 0000000 86a0 0001
You can see that 0x186a0
is 100000.
// 2020-01-07 16:52:11
int main ()
{
int file = open ( "/home/lsmg/1.txt" , O_WRONLY );
int temp = 100000 ;
iovec temp_iovec {};
temp_iovec . iov_base = & temp ;
temp_iovec . iov_len = sizeof ( temp );
writev ( file , & temp_iovec , 1 );
}
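The scatter and gather sides can be seen together with a pipe. This is a sketch with an invented helper name; it writes two separate buffers with one writev call and splits the same bytes across two buffers with one readv call.

```cpp
#include <cassert>
#include <string>
#include <sys/uio.h>
#include <unistd.h>
#include <utility>

// Hypothetical helper: sends `head` and `body` through a pipe with a single
// writev, then reads them back into two separate buffers with a single readv.
std::pair<std::string, std::string> split_with_readv(const std::string& head,
                                                     const std::string& body) {
    int fds[2];
    int ret = pipe(fds);
    assert(ret == 0);

    // Gather write: two buffers leave in one system call.
    iovec out[2];
    out[0].iov_base = const_cast<char*>(head.data());
    out[0].iov_len = head.size();
    out[1].iov_base = const_cast<char*>(body.data());
    out[1].iov_len = body.size();
    ssize_t written = writev(fds[1], out, 2);
    assert(written == static_cast<ssize_t>(head.size() + body.size()));

    // Scatter read: the byte stream is split across two buffers again.
    std::string head_in(head.size(), '\0');
    std::string body_in(body.size(), '\0');
    iovec in[2];
    in[0].iov_base = &head_in[0];
    in[0].iov_len = head_in.size();
    in[1].iov_base = &body_in[0];
    in[1].iov_len = body_in.size();
    ssize_t got = readv(fds[0], in, 2);
    assert(got == written);

    close(fds[0]);
    close(fds[1]);
    return {head_in, body_in};
}
```

A typical use on the sending side is an HTTP response whose header and file content live in different buffers: writev avoids copying them into one block first.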
sendfile function
#include <sys/sendfile.h>
// offset specifies where in the input to start reading; if it is NULL, reading starts from the current file offset of in_fd
ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);
O_RDONLY: read-only mode
O_WRONLY: write-only mode
O_RDWR: read-write mode
int open(const char *file_name, int flag);
The stat structure can be filled in with fstat; it is essentially the ID card of the file.
#include <sys/stat.h>
struct stat
{
dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* inode number */
mode_t st_mode; /* file type and permission bits */
nlink_t st_nlink; /* number of hard links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
dev_t st_rdev; /* device ID (if special file) */
off_t st_size; /* total size, in bytes */
blksize_t st_blksize; /* blocksize for filesystem I/O */
blkcnt_t st_blocks; /* number of blocks allocated */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last status change */
};
Functions that fill in the ID card
// The first function needs a file descriptor obtained from open
// The other two take the full path of the file
int fstat(int filedes, struct stat *buf);
// When the path is a symbolic link, lstat returns information about the link itself, while stat returns information about the file the link points to
int stat(const char *path, struct stat *buf);
int lstat(const char *path, struct stat *buf);
/*
* ln -s source dist creates a soft link, similar to a shortcut, also called a symbolic link
* ln source dist creates a hard link: several different names for the same file, all pointing to the same file data block; as long as
* not all hard links are deleted, the file can still be accessed
* File data block: the real data of a file is a file data block, and an opened `file` points to this data block. In other words,
* the `file` itself is like a shortcut that points to the area where the data is stored.
*/
mmap and munmap functions
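mmap creates a region of memory that processes can share for communication (a file can be mapped into it); munmap releases this memory.
#include <sys/mman.h>
// start is the starting address of the memory; if it is NULL the system picks an address. length is the length
// prot: PROT_READ (readable), PROT_WRITE (writable), PROT_EXEC (executable), PROT_NONE (not accessible)
// flags: what happens after the memory is modified
// - MAP_SHARED: memory is shared between processes, and changes are reflected in the mapped file
// - MAP_PRIVATE: private to the calling process, and changes to the memory are not reflected in the file
// - MAP_ANONYMOUS: not mapped from a file; the contents are initialized to 0 and the last two parameters are ignored
// Returns a pointer to the region on success, and MAP_FAILED, i.e. (void *)-1, on failure
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
// Returns 0 on success, -1 on failure
int munmap(void *start, size_t length);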
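A minimal sketch of an anonymous mapping, assuming Linux, where MAP_ANONYMOUS is available. The function name is made up; it checks that the fresh memory starts out as zero, writes to it, and releases it.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper: maps `length` bytes of anonymous memory, reads the
// first byte (expected to be 0), stores 42 in it, and returns the sum of the
// two reads. Returns -1 if the mapping fails.
int mmap_anonymous_demo(std::size_t length) {
    void* region = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) {   // failure is MAP_FAILED, not NULL
        return -1;
    }
    unsigned char* bytes = static_cast<unsigned char*>(region);
    int before = bytes[0];        // anonymous memory is zero-filled
    bytes[0] = 42;
    int after = bytes[0];
    int ret = munmap(region, length);
    assert(ret == 0);
    return before + after;
}
```

With MAP_SHARED instead of MAP_PRIVATE, a child created by fork after the mmap call would see writes made by the parent, which is how the region can serve for communication between processes.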
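splice function: moves data between two file descriptors as a zero-copy operation. At least one of the two descriptors must be a pipe.
#include <fcntl.h>
// fd_in is a file descriptor; if it is a pipe descriptor, off_in must be NULL, otherwise off_in is the offset where reading starts
// len is the length of the data to move; flags controls how the data is moved
// - SPLICE_F_NONBLOCK: non-blocking splice operation, though it is still subject to the blocking state of the file descriptors themselves
// - SPLICE_F_MORE: a hint to the kernel that subsequent splice calls will transfer more data
ssize_t splice(int fd_in, loff_t *off_in, int fd_out, loff_t *off_out, size_t len, unsigned int flags);
// An echo server implemented with splice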
int main ( int argc , char * argv [])
{
if ( argc <= 2 )
{
printf("the parameters are wrong\n");
exit ( errno );
}
char * ip = argv [ 1 ];
int port = atoi ( argv [ 2 ]);
printf("the port is %d the ip is %s\n", port, ip);
int sockfd = socket ( PF_INET , SOCK_STREAM , 0 );
assert ( sockfd >= 0 );
struct sockaddr_in address {};
address . sin_family = AF_INET ;
address . sin_port = htons ( port );
inet_pton ( AF_INET , ip , & address . sin_addr );
int ret = bind ( sockfd , ( sockaddr * ) & address , sizeof ( address ));
assert ( ret != -1 );
ret = listen ( sockfd , 5 );
int clientfd {};
sockaddr_in client_address {};
socklen_t client_addrlen = sizeof ( client_address );
clientfd = accept ( sockfd , ( sockaddr * ) & client_address , & client_addrlen );
if ( clientfd < 0 )
{
printf("accept error\n");
}
else
{
printf("a new connection from %s:%d success\n", inet_ntoa(client_address.sin_addr), ntohs(client_address.sin_port));
int fds [ 2 ];
pipe ( fds );
ret = splice ( clientfd , nullptr , fds [ 1 ], nullptr , 32768 , SPLICE_F_MORE );
assert ( ret != -1 );
ret = splice ( fds [ 0 ], nullptr , clientfd , nullptr , 32768 , SPLICE_F_MORE );
assert ( ret != -1 );
close ( clientfd );
}
close ( sockfd );
exit ( 0 );
}
select function: select returns when a descriptor in the set passed as the second parameter becomes readable, or when the specified timeout expires.
On return, the set pointed to by the second parameter has been modified to contain only the readable descriptors, so the set must be rebuilt before every call.
The return value is the number of ready descriptors. The caller walks over the descriptors it is watching and uses FD_ISSET to test each one. If the ready descriptor is listenfd, accept the new connection. Otherwise it is a previously accepted connection, and the caller checks whether there is data to read or the connection has been closed.
#include <sys/select.h>
// maxfdp: the largest descriptor in any of the sets plus 1, at most FD_SETSIZE
// fd_set is a set that can hold multiple file descriptors
// - FD_ZERO(&fd_set) clears the set - FD_SET(fd, &fd_set) adds fd - FD_CLR(fd, &fd_set) removes fd
// - FD_ISSET(fd, &fd_set) tests whether fd is in the set
// readfds: descriptors to watch for reading; select returns when one of them becomes readable
// writefds: descriptors to watch for writing; select returns when one of them becomes writable
// errorfds: descriptors to watch for errors
// timeout: NULL blocks indefinitely; 0 seconds and 0 microseconds makes the call non-blocking
// Return value: negative on error, 0 if the wait timed out with no change, positive if something changed
int select(int maxfdp, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
#define exit_if(r, ...) \
{ \
if (r) \
{ \
printf(__VA_ARGS__); \
printf("errno no: %d, error msg is %s\n", errno, strerror(errno)); \
exit(1); \
} \
}
int main ( int argc , char * argv [])
{
int keyboard_fd = open ( "/dev/tty" , O_RDONLY | O_NONBLOCK );
exit_if(keyboard_fd < 0, "open keyboard fd error\n");
fd_set readfd ;
char recv_buffer = 0 ;
while (true)
{
FD_ZERO ( & readfd );
FD_SET(keyboard_fd, &readfd); // watch the same descriptor that FD_ISSET tests below
timeval timeout { 5 , 0 };
int ret = select ( keyboard_fd + 1 , & readfd , nullptr , nullptr , & timeout );
exit_if(ret == -1, "select error\n");
if ( ret > 0 )
{
if ( FD_ISSET ( keyboard_fd , & readfd ))
{
recv_buffer = 0 ;
read ( keyboard_fd , & recv_buffer , 1 );
if ('\n' == recv_buffer)
{
continue ;
}
if ( 'q' == recv_buffer )
{
break ;
}
printf("the input is %c\n", recv_buffer);
}
}
if ( ret == 0 )
{
printf("timeout\n");
}
}
}
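sudo service rsyslog restart // start the daemon
#include <syslog.h>
// priority is the bitwise OR of a facility value (which records the source of the log message, LOG_USER by default) and a log level
// - 0 LOG_EMERG /* system is unusable */
// - 1 LOG_ALERT /* action must be taken immediately */
// - 2 LOG_CRIT /* critical conditions */
// - 3 LOG_ERR /* error */
// - 4 LOG_WARNING /* warning */
// - 5 LOG_NOTICE /* notice */
// - 6 LOG_INFO /* information */
// - 7 LOG_DEBUG /* debug */
void syslog(int priority, const char *message, ...);
// ident is placed after the timestamp in each log entry; it is usually the program name
// logopt configures the behavior of subsequent syslog calls
// - 0x01 LOG_PID /* include the program PID in log messages */
// - 0x02 LOG_CONS /* print to the console if the message cannot be written to the log file */
// - 0x04 LOG_ODELAY /* delay opening the log until the first call to syslog */
// - 0x08 LOG_NDELAY /* open the log without delay */
// facility changes the default facility value used by syslog
void openlog(const char *ident, int logopt, int facility);
// maskpri has eight bits: 0000-0000
// Setting the last bit to 1 records level 0 logs
// Setting the last two bits to 1 records level 0 and level 1 logs
// The LOG_MASK() macro builds the mask, e.g. LOG_MASK(LOG_CRIT) sets the third bit from the right, so only LOG_CRIT is recorded
// Calling setlogmask(3) directly: the last two binary digits of 3 are both 1, so levels 0 and 1 are recorded
int setlogmask(int maskpri);
// Close the log
void closelog();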
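The bit arithmetic behind setlogmask can be checked without writing any log entries. The helper mask_allows is hypothetical; LOG_MASK and LOG_UPTO are the macros from syslog.h.

```cpp
#include <cassert>
#include <syslog.h>

// Hypothetical helper: true when a message of level `priority` would pass a
// mask of the kind given to setlogmask.
bool mask_allows(int mask, int priority) {
    return (mask & LOG_MASK(priority)) != 0;
}
```

LOG_MASK(LOG_CRIT) is 1 << 2, so it selects only the critical level, while LOG_UPTO(LOG_ALERT) sets the two lowest bits and equals the literal 3 from the comment above. In practice LOG_UPTO is the convenient way to say "this level and everything more severe".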
UID: real user ID. EUID: effective user ID, which facilitates resource access. GID: real group ID. EGID: effective group ID.
#include <sys/types.h>
#include <unistd.h>
uid_t getuid ();
uid_t geteuid ();
gid_t getgid ();
gid_t getegid ();
int setuid ( uid_t uid );
int seteuid ( uid_t euid );
int setgid ( gid_t gid );
int setegid ( gid_t gid );
You can switch users with setuid and setgid. The uid and gid of the root user are both 0.
PGID: process group ID (every process on Linux belongs to a process group)
#include <unistd.h>
// Returns the PGID of the process pid on success, -1 on failure
pid_t getpgid(pid_t pid);
int setpgid(pid_t pid, pid_t pgid);
Session: several associated process groups form a session (skipped)
Inspect process relationships with ps and less
Resource limits and changing directories: skipped
Server model-CS model
advantage
Pattern diagram
The demo written does not use the fork function. It will be improved in the future.
Server framework IO model
I can probably understand this model, and I have studied Javaweb for half a year.
A socket is blocking by default when created, but it can be made non-blocking by passing the SOCK_NONBLOCK flag to socket(). A non-blocking call returns immediately even if the event has not occurred (for example, recv has received nothing). Both that case and a real error return -1, so they must be told apart by errno. When the event has not occurred, accept, send, and recv set errno to EAGAIN (try again) or EWOULDBLOCK (would block), and connect sets it to EINPROGRESS (in progress).
Non-blocking I/O only improves performance when it is called after the event has already occurred.
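The "event has not occurred" case is easy to reproduce. The sketch below assumes Linux, where SOCK_NONBLOCK can be combined with the socket type; the function name is invented for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: calls recv on a non-blocking socket that has no data
// waiting and returns the errno it produced, 0 if recv unexpectedly did not
// fail, or -1 if the socket pair could not be created.
int errno_of_empty_nonblocking_recv() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0, fds) != 0) {
        return -1;
    }
    char byte;
    ssize_t n = recv(fds[0], &byte, 1, 0);  // returns at once instead of waiting
    int err = (n == -1) ? errno : 0;
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

A caller that sees EAGAIN or EWOULDBLOCK should treat it as "nothing yet" and retry later, typically after an I/O multiplexing call reports the descriptor as ready.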
The commonly used I/O multiplexing functions select, poll, and epoll_wait are explained later in Chapter 9. Signals are explained in Chapter 10.
Two efficient event processing modes and concurrency mode
Programs are either compute-intensive (heavy CPU use, little I/O) or I/O-intensive (the reverse). Concurrent programming reduces efficiency for the former and improves it for the latter. Concurrent programming can be done with multiple processes or with multiple threads.
Concurrency mode: a way of coordinating tasks between I/O units and multiple logical units. Servers have two main concurrency modes.
Semi-synchronous/semi-asynchronous mode: in the I/O model, the difference between synchronous and asynchronous is which kind of I/O event the kernel reports to the application (a ready event or a completion event) and who performs the I/O reads and writes (the application or the kernel).
In concurrency modes, by contrast, synchronous means the program runs entirely in the order of its code; threads that run this way are called synchronous threads. Asynchronous means execution is driven by system events (interrupts, signals); threads that run this way are called asynchronous threads.
A server needs good real-time behavior and must handle multiple client requests at the same time, so it is generally implemented with both synchronous and asynchronous threads, that is, the semi-synchronous/semi-asynchronous mode. Synchronous threads handle client logic by processing the objects in the request queue. Asynchronous threads handle I/O events: after receiving a client request, they wrap it in a request object and insert it into the request queue.
A variant of the semi-synchronous/semi-asynchronous mode is the half-sync/half-reactive mode.
Asynchronous thread: the main thread, responsible for monitoring events on all sockets
Leader/follower model: skipped
Efficient programming method - finite state machine
// Finite state machine with independent states
STATE_MACHINE ( Package _pack ) {
PackageType _type = _pack . GetType ();
switch ( _type ) {
case type_A :
xxxx ;
break ;
case type_B :
xxxx ;
break ;
}
}
// Finite state machine with state transitions
STATE_MACHINE () {
State cur_State = type_A ;
while ( cur_State != type_C ) {
Package _pack = getNewPackage ();
switch ( cur_State ) {
case type_A :
process_package_state_A ( _pack );
cur_State = type_B ;
break ;
case type_B :
xxxx ;
cur_State = type_C ;
break ;
}
}
}
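The second pseudocode shape (a state variable that each case may advance) can be made concrete with a toy parser. Everything in this sketch is hypothetical: the states, the ParseResult struct, and the input format (a request line, header lines, then a blank line) are chosen only to show the pattern.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical states for a line-oriented request parser.
enum class State { ExpectRequestLine, ExpectHeader, Done };

// Hypothetical result type: what the machine collected along the way.
struct ParseResult {
    std::string request_line;
    int header_count = 0;
    bool complete = false;
};

// Feeds lines to the state machine one at a time. Each case handles the
// current line and decides which state comes next.
ParseResult parse_lines(const std::vector<std::string>& lines) {
    ParseResult result;
    State state = State::ExpectRequestLine;
    for (const std::string& line : lines) {
        switch (state) {
        case State::ExpectRequestLine:
            result.request_line = line;
            state = State::ExpectHeader;
            break;
        case State::ExpectHeader:
            if (line.empty()) {
                state = State::Done;      // a blank line ends the header section
            } else {
                ++result.header_count;
            }
            break;
        case State::Done:
            break;
        }
        if (state == State::Done) {
            break;                        // ignore anything after the blank line
        }
    }
    result.complete = (state == State::Done);
    return result;
}
```

Because the state survives between lines, the same function shape works when the input arrives in pieces across several recv calls: an incomplete request simply leaves the machine in an intermediate state.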
It took me an hour to finally copy the 5,000-word code letter by letter @September 8, 2019 22:08:46@
Pools: trade space for time. Process pools and thread pools.
Data copying: high-performance servers should avoid unnecessary copying.
Context switches and locks: reduce the scope of locks. Do not create too many worker processes; use dedicated business logic threads instead.
I/O multiplexing allows a program to monitor multiple file descriptors at the same time.
The commonly used functions are select, poll, and epoll.
# include < sys/select.h >
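// nfds: the number of file descriptors being monitored, usually the largest monitored descriptor plus 1
// The next three point to the sets of file descriptors for readable, writable, and exceptional events
// timeout: the select timeout. A zero timeout makes the call non-blocking; NULL blocks indefinitely
// On success returns the total number of ready (readable, writable, exceptional) file descriptors, 0 if there are none; returns -1 on failure
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
// Macros that operate on fd_set
FD_ZERO(fd_set *fdset);
FD_SET(int fd, fd_set *fdset);
FD_CLR(int fd, fd_set *fdset);
FD_ISSET(int fd, fd_set *fdset);
// The timeval structure for the timeout
struct timeval
{
long tv_sec; // seconds
long tv_usec; // microseconds
};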
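A compact way to watch select report readiness is to use a pipe in a single process. The helper name is hypothetical, and the zero timeout makes select return at once rather than wait.

```cpp
#include <cassert>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: creates a pipe, optionally writes one byte into it,
// and returns what select reports for the read end with a zero timeout.
int select_ready_count(bool write_first) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    if (write_first) {
        ssize_t written = write(fds[1], "x", 1);
        assert(written == 1);
    }

    fd_set readfds;
    FD_ZERO(&readfds);               // the set must be rebuilt before every call
    FD_SET(fds[0], &readfds);
    timeval timeout{0, 0};           // zero timeout: check and return immediately
    int ready = select(fds[0] + 1, &readfds, nullptr, nullptr, &timeout);
    if (ready > 0) {
        assert(FD_ISSET(fds[0], &readfds));
    }

    close(fds[0]);
    close(fds[1]);
    return ready;
}
```

The first argument is the read descriptor plus 1, not a count of descriptors, which is the detail that is easiest to get wrong.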
select
file descriptor ready condition
poll
# include < poll.h >
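// fds is an array of structures that specifies the readable, writable, and exceptional events we are interested in on each file descriptor
// nfds is the number of entries in fds to scan (half-open range)
// timeout is in milliseconds; -1 blocks indefinitely, 0 returns immediately
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
struct pollfd
{
int fd;
short events; // registered events: tells poll which events on fd to watch
short revents; // events that actually occurred
};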
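The same pipe experiment works for poll and shows how events and revents relate. The helper name is made up; the timeout of 0 makes poll return immediately.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: creates a pipe, optionally writes one byte into it,
// and returns the revents that poll reports for the read end (0 if nothing
// is ready or on error).
short poll_revents(bool write_first) {
    int fds[2];
    if (pipe(fds) != 0) return 0;
    if (write_first) {
        ssize_t written = write(fds[1], "x", 1);
        assert(written == 1);
    }

    pollfd pfd{};
    pfd.fd = fds[0];
    pfd.events = POLLIN;             // what we ask poll to watch for
    int ready = poll(&pfd, 1, 0);    // one entry, return immediately

    close(fds[0]);
    close(fds[1]);
    return ready > 0 ? pfd.revents : 0;  // what actually happened
}
```

Unlike select, poll leaves the events field untouched and reports results in revents, so the array does not have to be rebuilt before each call.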
#define exit_if(r, ...) \
{ \
if (r) \
{ \
printf(__VA_ARGS__); \
printf("errno no: %d, error msg is %s\n", errno, strerror(errno)); \
exit(1); \
} \
}
struct client_info
{
char *ip_;
int port_;
};
int main ( int argc, char * argv[])
{
int port = 8001 ;
char ip[] = " 127.0.0.1 " ;
struct sockaddr_in address;
address. sin_port = htons (port);
address. sin_family = AF_INET;
address.sin_addr.s_addr = htonl(INADDR_ANY);
int listenfd = socket(PF_INET, SOCK_STREAM, 0);
exit_if(listenfd < 0, "socket error\n");
int ret = bind(listenfd, (struct sockaddr *)&address, sizeof(address));
exit_if(ret == -1, "bind error\n");
ret = listen(listenfd, 5);
exit_if(ret == -1, "listen error\n");
constexpr int MAX_CLIENTS = 1024 ;
struct pollfd polls[MAX_CLIENTS] = {};
struct client_info clientsinfo[MAX_CLIENTS] = {};
polls[ 3 ]. fd = listenfd;
polls[ 3 ]. events = POLLIN | POLLRDHUP;
while ( true )
{
ret = poll(polls, MAX_CLIENTS, -1); // the array has exactly MAX_CLIENTS entries
exit_if(ret == -1, "poll error\n");
for (int i = 3; i < MAX_CLIENTS; ++i)
{
int fd = polls[i]. fd ;
if (polls[i]. revents & POLLRDHUP)
{
polls[i]. events = 0 ;
printf("close fd-%d from %s:%d\n", fd, clientsinfo[fd].ip_, clientsinfo[fd].port_);
}
if (polls[i]. revents & POLLIN)
{
if (fd == listenfd)
{
struct sockaddr_in client_address;
socklen_t client_addresslen = sizeof (client_address);
int clientfd = accept (listenfd, ( struct sockaddr *)&client_address,
&client_addresslen);
struct client_info *clientinfo = &clientsinfo[clientfd];
clientinfo-> ip_ = inet_ntoa (client_address. sin_addr );
clientinfo-> port_ = ntohs (client_address. sin_port );
exit_if(clientfd < 0, "accept error, from %s:%d\n", clientinfo->ip_,
clientinfo->port_);
printf("accept from %s:%d\n", clientinfo->ip_, clientinfo->port_);
polls[clientfd]. fd = clientfd;
polls[clientfd]. events = POLLIN | POLLRDHUP;
}
else
{
char buffer[ 1024 ];
memset (buffer, '