gfcptun: A fast and low-latency connection tunnel using GFCP over UDP.
Make available 65535 or more file descriptors per gfcptun process.
MTU of 9000-9702 is recommended for high-speed local links.
Suggested minimum sysctl tuning parameters for Linux UDP handling:
net.core.rmem_max=26214400 # Tune for BDP (bandwidth delay product)
net.core.rmem_default=26214400
net.core.wmem_max=26214400
net.core.wmem_default=26214400
net.core.netdev_max_backlog=2048 # (Adjust proportional to receive window)
-sockbuf 16777217
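The 26214400-byte (25 MiB) buffer values above are sized around a bandwidth-delay product. A minimal sketch of that arithmetic follows, assuming an illustrative 1 Gbit/s link with a 200 ms round-trip time. Both figures and the helper name `bdp_bytes` are assumptions for illustration, not values taken from gfcptun.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: the data in flight on a full pipe."""
    # bits/s * ms -> divide by 1000 for seconds and by 8 for bytes.
    return bandwidth_bits_per_s * rtt_ms // (8 * 1000)


# Assumed example link: 1 Gbit/s with a 200 ms round-trip time.
bdp = bdp_bytes(1_000_000_000, 200)
suggested_rmem_max = 26214400  # value from the sysctl list above

print(f"BDP: {bdp} bytes")
print(f"Covered by rmem_max: {bdp <= suggested_rmem_max}")
```

If the computed product exceeds the configured buffer maximum for a given link, the buffers would likely need to be raised accordingly.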
client -r "IN:4321" -l ":8765" -mode fast3 -nocomp -autoexpire 900 -sockbuf 33554434 -dscp 46
server -t "OUT:8765" -l ":4321" -mode fast3 -nocomp -sockbuf 33554434 -dscp 46
Application → Out (8765/TCP) → Internet → In (4321/UDP) → Server (8765/TCP)
-mode fast3 -ds 10 -ps 3, etc.
To tune, increase -rcvwnd on the client and -sndwnd on the server, in unison.
The minimum window size will dictate the maximum link throughput:
( 'Wnd' * ( 'MTU' / 'RTT' ) )
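As a worked example of the formula above, the sketch below takes the smaller of the two windows and multiplies it by MTU / RTT. It assumes the window is counted in packets, as the formula implies. The function name and the sample numbers are hypothetical.

```python
def max_throughput_bytes_per_s(sndwnd: int, rcvwnd: int, mtu_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput: Wnd * (MTU / RTT), with Wnd the smaller window."""
    wnd = min(sndwnd, rcvwnd)  # the minimum window is the limiting factor
    return wnd * mtu_bytes * 1000 / rtt_ms  # rtt_ms -> seconds


# Assumed sample values: -sndwnd 1024, -rcvwnd 512, -mtu 1400, 50 ms RTT.
limit = max_throughput_bytes_per_s(sndwnd=1024, rcvwnd=512, mtu_bytes=1400, rtt_ms=50)
print(f"{limit:.0f} bytes/s, about {limit * 8 / 1e6:.1f} Mbit/s")
```

In this example, raising only -sndwnd would not help, because the 512-packet receive window remains the limit. This is why the two windows are increased in unison.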
The MTU should be set with the -mtu parameter and must never exceed the MTU of the physical interface. For DC/high-speed local links with jumbo frames, an MTU of 9000-9702 is highly recommended.
Adjust the retransmission algorithm aggressiveness:
fast3 → fast2 → fast → normal → default
Raise -smuxbuf to 16MiB (or more). The actual value to use depends on link congestion as well as available contiguous system memory.
SMUXv2 can be used to limit per-stream memory usage. Enable with -smuxver 2, and then tune with -streambuf (size in bytes).
-smuxver 2 -streambuf 8388608 for an 8MiB buffer (per stream).
Start tuning by limiting the stream buffer on the receiving side of the link.
SMUXv2 configuration is not negotiated, so it must be set manually on both sides of the GFCP link.
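To get a feel for what a per-stream buffer implies, the sketch below multiplies the -streambuf size by a number of concurrent streams. This is only a rough upper bound under the assumption that every stream fills its buffer at the same time. The stream count and the helper name are hypothetical, and real memory use depends on the SMUX implementation.

```python
def worst_case_stream_memory(streams: int, streambuf_bytes: int) -> int:
    """Rough upper bound: every stream holds a completely full buffer."""
    return streams * streambuf_bytes


STREAMBUF = 8388608  # 8MiB, as in "-smuxver 2 -streambuf 8388608"

# Assumed load: 100 concurrent streams.
total = worst_case_stream_memory(100, STREAMBUF)
print(f"{total} bytes ({total // (1024 * 1024)} MiB)")
```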
GOGC runtime environment variable tuning recommendations:
10-20 for low-memory systems and embedded devices
120-150 (or higher) for dedicated servers
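The effect of these GOGC ranges can be approximated with the Go runtime's heap target rule, under which the next collection is triggered at roughly the live heap plus GOGC percent of it. The sketch below uses that simplified form (it ignores GC roots and any memory limit). The 64MiB live heap is an assumed figure and the function name is hypothetical.

```python
def heap_target_bytes(live_heap_bytes: int, gogc: int) -> int:
    """Simplified Go GC heap target: live heap * (1 + GOGC / 100)."""
    return live_heap_bytes * (100 + gogc) // 100


LIVE_HEAP = 64 * 1024 * 1024  # assumed 64MiB of live data

for gogc in (10, 20, 120, 150):
    target = heap_target_bytes(LIVE_HEAP, gogc)
    print(f"GOGC={gogc}: heap may grow to about {target / (1024 * 1024):.1f} MiB")
```

Lower values keep the heap close to the live data at the cost of more frequent collections, which suits low-memory systems. Higher values trade memory for less GC work, which suits dedicated servers.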
Notes regarding (GF)SMUX(v1/v2) tuning:
The buffer pool mechanism maintains a high watermark for in-flight objects from the pool to survive periodic runtime garbage collection.
Memory will be returned to the system by the Go runtime when idle. Variables that can be used for tuning this are -sndwnd, -rcvwnd, -ds, and -ps.
The -smuxbuf setting and GOMAXPROCS variable can be used to tune the balance between the concurrency limit and overall resource usage.
Increasing -smuxbuf will increase the practical concurrency limit. However, the -smuxbuf value is not linearly proportional to the maximum concurrency that can be handled, because the Go runtime's garbage collection is, for practical purposes, non-deterministic.
Only empirical testing can provide the feedback required for real-world link tuning and optimization.
Optional compression (using Snappy) is supported.
Compression saves bandwidth on redundant, low-entropy data, but will increase overhead (and CPU usage) in all other cases.
Compression is enabled by default: use -nocomp to disable.
Upon receiving a USR1 signal, detailed link information will be displayed.
-mode manual -nodelay 1 -interval 20 -resend 2 -nc 1