IdGenerator Download - IdGenerator Source code download

IdGenerator

JAVA source code

1.0.0

A numeric ID generator based on the snowflake algorithm.

Best practices

Performance questions come up often during use, so here are three best practices:

❄ If the required ID generation rate does not exceed 50,000/s, there is no need to modify any configuration parameters.

❄ If it exceeds 50,000/s but stays below 500,000/s, it is recommended to set SeqBitLength=10.

❄ If it exceeds 500,000/s and approaches 5,000,000/s, it is recommended to set SeqBitLength=12.

In summary, increasing SeqBitLength gives better performance, but the generated IDs become longer.
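
The three rules above can be written down as a small lookup. This is only a sketch: the class and method names are made up for illustration, and treating a rate of exactly 500,000/s as the highest tier is an assumption, since the rules do not cover that boundary.

```java
final class SeqBitLengthAdvisor {
    /** Suggests a SeqBitLength for a required rate, given in IDs per second. */
    static int recommend(long idsPerSecond) {
        if (idsPerSecond <= 50_000) {
            return 6;   // default value, no configuration change needed
        }
        if (idsPerSecond < 500_000) {
            return 10;
        }
        return 12;      // assumption: 500,000/s and above all map to 12
    }

    public static void main(String[] args) {
        System.out.println(recommend(20_000));
        System.out.println(recommend(100_000));
        System.out.println(recommend(2_000_000));
    }
}
```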

Algorithm introduction

❄ This is an optimized snowflake algorithm (snowflake drift) that generates shorter IDs, faster.

❄ Supports automatic scaling in container environments such as k8s (automatic registration of WorkerId), and can generate unique numeric IDs in a stand-alone or distributed environment.

❄ Natively supports C#/Java/Go/C/Rust/Python/Node.js/PHP (C extension)/SQL and other languages, and provides a thread-safe dynamic library for FFI calls.

❄ Compatible with all snowflake algorithms (number segment mode or classic mode, from large or small vendors), so you can upgrade or switch at any time in the future.

❄ This is the most comprehensive Snowflake ID generation tool in computer history. (As of August 2022)

Source of demand

❄ As an architect, you want to solve the problem of unique primary keys in a database, especially in a distributed system with multiple databases.

❄ You want the primary key of a table to use the least storage space, to index faster, and to make Select, Insert, and Update faster.

❄ When splitting databases and tables (or merging them), you want the primary key value to be usable directly and to reflect the order of business events.

❄ If such a primary key value is too long and exceeds the maximum value of the front-end js Number type, the Long type must be converted to a String type, which is frustrating.

❄ Although a Guid can be generated in increasing order, it takes up a lot of space and indexes slowly, so you don't want to use it.

❄ There may be more than 50 application instances, and each may need to handle up to 100,000 concurrent requests per second.

❄ The application is deployed in a container environment and must support horizontal replication and automatic scaling.

❄ You don't want to rely on the auto-increment operation of redis to obtain consecutive primary key IDs, because consecutive IDs pose a business data security risk.

❄ You want the system to operate for more than 100 years.

Traditional algorithm problems

The generated ID is too long.

The amount of instantaneous concurrency is not enough.

The clock rollback problem cannot be solved.

Generating IDs for a past time after the fact (backfilling) is not supported.

May rely on external storage systems.

New algorithm features

✔ Integer numbers, monotonically increasing over time (not necessarily continuous), shorter in length, and will not exceed the maximum value of the js Number type in 50 years. (default configuration)

✔ Faster, 2-5 times faster than the traditional snowflake algorithm, 500,000 can be generated in 0.1 seconds (based on 8th generation low-voltage i7).

✔ Supports clock rollback handling. For example, if the server time is set back by 1 second, this algorithm adapts automatically and still generates unique IDs for the affected time.

✔ Supports manual insertion of new IDs. When the business needs to generate new IDs for a historical time, the reserved bits of this algorithm can generate 5,000 IDs per second.

✔ Does not rely on any external cache or database. (The dynamic library that automatically registers WorkerId in the k8s environment relies on redis.)

✔ Basic functions work out of the box; no configuration files, database connections, etc. are required.

Performance data

(Parameters: 10-bit auto-incrementing sequence, maximum drift count of 1000)

Continuous requests	5,000	50,000	500,000
Traditional snowflake algorithm	0.0045s	0.053s	0.556s
Snowflake drift algorithm	0.0015s	0.012s	0.113s

❄ Peak performance: 5,000,000/s to 30,000,000/s. (All test data are based on an 8th-generation low-voltage i7.)

How to handle clock rollback

❄ When the system time is rolled back, the algorithm uses the reserved sequence numbers of past timestamps to generate new IDs.

❄ IDs generated during a rollback sort first by default; this can be adjusted so that they sort later.

❄ The clock may be rolled back as far as this algorithm's preset base time (the parameter is adjustable).
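
To make the idea concrete, here is a simplified sketch of how reserved sequence numbers can keep IDs unique when the clock moves backwards. It is an illustration under stated assumptions, not the library's actual implementation: the class name, the injected clock, and the choice to wait for the next millisecond when the sequence is exhausted (instead of drifting) are all made up for this example. It uses the default bit lengths and the reserved numbers 1-4 described in this document.

```java
import java.util.function.LongSupplier;

/** Simplified illustration only; names and structure are hypothetical. */
final class RollbackSketch {
    static final int WORKER_BITS = 6;                  // default WorkerIdBitLength
    static final int SEQ_BITS = 6;                     // default SeqBitLength
    static final int MIN_SEQ = 5;                      // 0-4 are reserved
    static final int MAX_SEQ = (1 << SEQ_BITS) - 1;

    private final LongSupplier clock;                  // milliseconds since base time
    private final long workerId;
    private long lastTick = -1;
    private long seq = MIN_SEQ;
    private long rollbackTick = 0;                     // 0 means "not in a rollback"
    private int rollbackIndex = 0;                     // cycles through 1..4

    RollbackSketch(LongSupplier clock, long workerId) {
        this.clock = clock;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTick) {                          // the clock was rolled back
            if (rollbackTick < 1) {                    // first ID of this rollback episode
                rollbackTick = lastTick - 1;
                rollbackIndex = rollbackIndex % 4 + 1;
            }
            // Use a past tick together with a reserved sequence number.
            return compose(rollbackTick--, rollbackIndex);
        }
        rollbackTick = 0;                              // the clock has caught up
        if (now == lastTick && seq > MAX_SEQ) {        // sequence exhausted: wait for next ms
            while ((now = clock.getAsLong()) <= lastTick) {
                Thread.onSpinWait();
            }
        }
        if (now > lastTick) {
            lastTick = now;
            seq = MIN_SEQ;
        }
        return compose(lastTick, seq++);
    }

    private long compose(long tick, long sequence) {
        return (tick << (WORKER_BITS + SEQ_BITS)) | (workerId << SEQ_BITS) | sequence;
    }

    public static void main(String[] args) {
        long[] now = {1_000};                          // fake clock for the demo
        RollbackSketch g = new RollbackSketch(() -> now[0], 1);
        System.out.println(g.nextId());                // normal ID at tick 1000
        now[0] = 990;                                  // simulate a rollback
        System.out.println(g.nextId());                // tick 999, reserved sequence 1
    }
}
```

Because normal IDs never use sequence numbers below 5, an ID issued during a rollback cannot collide with an ID that was issued normally at the same tick, and it sorts before them.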

ID composition description

The ID generated by this algorithm consists of 3 parts (following the snowflake algorithm's definition):
+------------------------------------------+-------------+--------------------+
| 1. Time difference relative to base time | 2. WorkerId | 3. Sequence number |
+------------------------------------------+-------------+--------------------+
Part 1, the time difference, is the system time at the moment the ID is generated minus BaseTime, in milliseconds.
Part 2, WorkerId, is a unique ID that distinguishes different machines or different applications. Its maximum value is limited by WorkerIdBitLength (default 6).
Part 3, the sequence number, distinguishes IDs generated within the same millisecond. Its maximum value is limited by SeqBitLength (default 6).
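
The three-part layout can be expressed with plain bit shifts. The following is a minimal sketch that assumes the parts are packed from high bits to low bits in the order shown; the class and method names are illustrative and do not come from the library.

```java
final class IdLayout {
    /** Packs the three parts into one long: time difference, WorkerId, sequence number. */
    static long compose(long timeDiffMs, long workerId, long seq, int workerBits, int seqBits) {
        return (timeDiffMs << (workerBits + seqBits)) | (workerId << seqBits) | seq;
    }

    static long timeDiff(long id, int workerBits, int seqBits) {
        return id >>> (workerBits + seqBits);
    }

    static long workerId(long id, int workerBits, int seqBits) {
        return (id >>> seqBits) & ((1L << workerBits) - 1);
    }

    static long seq(long id, int seqBits) {
        return id & ((1L << seqBits) - 1);
    }

    public static void main(String[] args) {
        long id = 129053495681099L;   // an ID produced with the default 6 + 6 bit configuration
        System.out.println("time diff (ms): " + timeDiff(id, 6, 6));  // 31507201094, about 1 year
        System.out.println("WorkerId:       " + workerId(id, 6, 6)); // 1
        System.out.println("sequence:       " + seq(id, 6));         // 11
    }
}
```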

ID example

❄ The ID generated by this algorithm is an integer (occupying at most 8 bytes). The following IDs were generated with the default configuration:

129053495681099        (after running 1 year, length: 15)
387750301904971        (after running 3 years, length: 15)
646093214093387        (after running 5 years, length: 15)
1292658282840139       (after running 10 years, length: 16)
9007199254740992       (after running 70 years, reaching the js Number maximum, length: 16)
165399880288699493     (after running 1000 years, equivalent to the ordinary snowflake algorithm running 1 year, length: 18)

❄ With this algorithm the ID value is 1%-10% of the js Number maximum, about one thousandth of the value produced by the ordinary snowflake algorithm, yet generation is faster than the ordinary snowflake algorithm.

❄ The maximum value of the js Number type is 9007199254740992 (2^53). This algorithm takes about 70 years to reach that value while maintaining concurrency performance (50,000+ per 0.01s) and supporting up to 64 WorkerIds (6 bits).

Length estimate

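❄ Each additional bit of WorkerIdBitLength or SeqBitLength doubles the numeric value of the generated ID (for the base length, see the previous section, "ID example"); each bit removed halves it.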

How long can it be used?

"How long it can be used" refers to the time it takes for the generated ID value to grow past the maximum value of long (signed 64 bits, 8 bytes).

❄ In the default configuration, IDs are available for 71,000 years without duplication.

❄ When supporting 1024 worker nodes, IDs are available for 4480 years without duplication.

❄ When supporting 4096 worker nodes, IDs are available for 1120 years without duplication.
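
These lifetimes follow from how many bits remain for the millisecond time difference once the WorkerId and sequence bits are taken out. The sketch below is a back-of-the-envelope calculation, assuming 365-day years, 63 usable bits for a signed long, and 53 bits for the js Number limit; the names are illustrative. Its results come out at roughly 71,400, 4,460 and 1,116 years, and about 70 years for the js Number case, so the figures quoted above appear to be rounded.

```java
final class IdLifetime {
    static final double MS_PER_YEAR = 365.0 * 24 * 60 * 60 * 1000;

    /**
     * Years until the time-difference part no longer fits.
     * valueBits: usable bits of the target type (63 for a signed long, 53 for js Number).
     */
    static double years(int valueBits, int workerIdBitLength, int seqBitLength) {
        int timeBits = valueBits - workerIdBitLength - seqBitLength;
        return Math.pow(2, timeBits) / MS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.printf("long, default config:  %.0f years%n", years(63, 6, 6));
        System.out.printf("long, 1024 workers:    %.0f years%n", years(63, 10, 6));
        System.out.printf("long, 4096 workers:    %.0f years%n", years(63, 12, 6));
        System.out.printf("js Number, default:    %.1f years%n", years(53, 6, 6));
    }
}
```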

Parameter settings

❄ WorkerIdBitLength, the machine code bit length, determines the maximum value of WorkerId. The default value is 6 and the value range is [1, 19]. In practice, some language implementations receive this parameter as an unsigned ushort (uint16), so the maximum is 16; with a signed short (int16), the maximum is 15.

❄ WorkerId, the machine code, is the most important parameter. It has no default value, must be globally unique (or unique within the same DataCenterId), and must be set programmatically. With the default WorkerIdBitLength the maximum value is 63; the theoretical maximum is 2^WorkerIdBitLength-1 (some language implementations may cap it at 65535 or 32767, for the same reason as the WorkerIdBitLength rule). It must not be the same on different machines or different application instances. You can set this value in the application's configuration or obtain it from an external service. For automatic registration of WorkerId, this algorithm provides a default implementation: a dynamic library that registers WorkerId through redis; see "ToolsAutoRegisterWorkerId" for details.

Special note: If a server deploys multiple independent services, you need to specify a different WorkerId for each service.

❄ SeqBitLength, the sequence bit length, determines the number of IDs that can be generated per millisecond. The default value is 6 and the value range is [3, 21] (at least 4 is recommended). If the request rate does not exceed 50,000/s, keep the default value of 6; if it exceeds 50,000/s but not 500,000/s, a value of 10 or greater is recommended, and so on. Rule: WorkerIdBitLength + SeqBitLength must not exceed 22.

❄ MinSeqNumber, the minimum sequence number. The default value is 5 and the value range is [5, MaxSeqNumber]. The first 5 sequence numbers of each millisecond (0-4) are reserved: 1-4 are reserved for clock rollback, and 0 is reserved for manually inserted values.

❄ MaxSeqNumber, the maximum sequence number. The setting range is [MinSeqNumber, 2^SeqBitLength-1]. The default value is 0, which means the actual maximum sequence number is the largest possible value (2^SeqBitLength-1); a non-zero value is itself the actual maximum. It generally does not need to be set, unless multiple machines share one WorkerId and generate IDs in segments (in which case the minimum sequence number must also be set correctly).

❄ BaseTime, the base time (also known as base point time, origin time, or epoch time). It has a default value (2020) and is a millisecond timestamp (an integer; in .NET it is a DateTime). When an ID is generated, the difference in milliseconds between the system time and the base time is used as the timestamp part of the ID. There is generally no need to set the base time. If you feel the default value is too old, you can reset it, but it is best not to change this value afterwards.
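
The ranges and rules for these parameters can be checked up front. The sketch below only encodes the constraints stated in this section; the class and method are hypothetical and are not the library's own validation code, which may check more or behave differently.

```java
final class OptionsCheck {
    /** Throws IllegalArgumentException if the values break a rule stated in this section. */
    static void validate(int workerIdBitLength, int seqBitLength, int workerId,
                         int minSeqNumber, int maxSeqNumber) {
        if (workerIdBitLength < 1 || workerIdBitLength > 19) {
            throw new IllegalArgumentException("WorkerIdBitLength must be in [1, 19]");
        }
        if (seqBitLength < 3 || seqBitLength > 21) {
            throw new IllegalArgumentException("SeqBitLength must be in [3, 21]");
        }
        if (workerIdBitLength + seqBitLength > 22) {
            throw new IllegalArgumentException("WorkerIdBitLength + SeqBitLength must not exceed 22");
        }
        int maxWorkerId = (1 << workerIdBitLength) - 1;
        if (workerId < 0 || workerId > maxWorkerId) {
            throw new IllegalArgumentException("WorkerId must be in [0, " + maxWorkerId + "]");
        }
        int seqLimit = (1 << seqBitLength) - 1;
        // MaxSeqNumber = 0 means "use the largest value the bit length allows".
        int effectiveMax = (maxSeqNumber == 0) ? seqLimit : maxSeqNumber;
        if (effectiveMax > seqLimit) {
            throw new IllegalArgumentException("MaxSeqNumber must not exceed " + seqLimit);
        }
        if (minSeqNumber < 5 || minSeqNumber > effectiveMax) {
            throw new IllegalArgumentException("MinSeqNumber must be in [5, " + effectiveMax + "]");
        }
    }

    public static void main(String[] args) {
        validate(6, 6, 63, 5, 0);        // the defaults with the largest allowed WorkerId
        try {
            validate(6, 6, 64, 5, 0);    // WorkerId 64 does not fit into 6 bits
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```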

The second version plans to add parameters:

❄ DataCenterId , data center ID (computer room ID, default 0), please make sure it is globally unique.

❄ DataCenterIdBitLength , data center ID length (default 0).

❄ TimestampType , timestamp type (0-milliseconds, 1-seconds), default 0.

General integration

1️⃣ Call in singleton mode. This algorithm uses a single thread to generate IDs, and calls from multiple parties will be mutually exclusive. Within the same application instance, the caller uses multi-threading (or parallel) to call this algorithm, which will not increase the ID output speed.

2️⃣ Specify a unique WorkerId. The global uniqueness of WorkerId must be ensured by an external system and assigned to the entry parameter of this algorithm.

3️⃣ Use different WorkerIds when deploying multiple instances on a single machine. Not all implementations support cross-process concurrent uniqueness. To be on the safe side, when deploying multiple application instances on the same host, please ensure that each WorkerId is unique.

4️⃣ Exception handling. The algorithm throws all exceptions to the caller; the external system should catch and handle them to avoid a larger system failure.

5️⃣ Carefully understand the definition of IdGeneratorOptions, which will be helpful for integrating and using this algorithm.

6️⃣ Use the snowflake drift algorithm. Although the code also contains the traditional snowflake algorithm, which you can enable at the entry point with Method=2, the snowflake drift algorithm (Method=1, the default) is still recommended because it scales better and performs better.

7️⃣ Do not modify the core algorithm. This algorithm has many internal parameters and complex logic. When you have not mastered the core logic, please do not modify the core code and use it in a production environment unless it has been verified through a large number of meticulous and scientific tests.

8️⃣ The configuration policies within the application domain are the same. When the system has been running for a period of time and the project needs to switch from programmatically specifying WorkerId to automatically registering WorkerId, please ensure that all instances in use in the same application domain adopt a consistent configuration strategy. This is not only for WorkerId, but also includes other configuration parameters.

9️⃣ Manage server time well. The snowflake algorithm relies on the system time. Do not manually adjust the operating system time by a large amount. If you must adjust it, make sure that the system time when the service starts again is later than the time when it was last shut down. (Note: small changes in system time caused by global or network time synchronization, including small rollbacks, have no impact on this algorithm.)
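
Points 1 and 4 could be combined in a thin wrapper like the one below. This is only a sketch: `IdProvider` and its methods are hypothetical, and a `LongSupplier` stands in for the real generator so that the example stays self-contained.

```java
import java.util.function.LongSupplier;

final class IdProvider {
    // Hypothetical stand-in for the real generator; any source of long IDs fits here.
    private static LongSupplier generator;

    /** Registers the single generator instance; a second call is rejected. */
    static synchronized void init(LongSupplier g) {
        if (generator != null) {
            throw new IllegalStateException("generator already initialized");
        }
        generator = g;
    }

    /** All callers share one generator; calls are mutually exclusive. */
    static synchronized long nextId() {
        if (generator == null) {
            throw new IllegalStateException("call init() first");
        }
        return generator.getAsLong();
    }

    public static void main(String[] args) {
        long[] counter = {0};
        init(() -> ++counter[0]);          // dummy generator, for illustration only
        try {
            System.out.println(nextId());
        } catch (RuntimeException e) {
            // Handle the failure here instead of letting it take down the caller.
            System.err.println("ID generation failed: " + e.getMessage());
        }
    }
}
```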

Configuration changes

Configuration changes refer to adjusting the operating parameters (IdGeneratorOptions object properties) after the system has been running for a period of time. Please note:

❄ 1. The first principle: BaseTime may only be moved earlier (further from the present), so that newly generated ID values are larger than the historical maximum, ensuring no time overlap and no duplicate IDs. [Adjusting BaseTime after the system is running is not recommended]

❄ 2. Increasing WorkerIdBitLength or SeqBitLength is allowed at any time, but decreasing either should be done with caution, because IDs generated in the future may then collide with IDs generated under the old configuration. [Any xxxBitLength value may be increased after the system is running]

❄ 3. If one of WorkerIdBitLength or SeqBitLength must be reduced, the sum of the two new xxxBitLength values must be greater than the sum of the old values. [Reducing any BitLength value after the system is running is not recommended]

❄ 4. This algorithm does not enforce the three rules above. Users should apply a configuration change only after confirming that the new configuration satisfies them.
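
Since the algorithm does not enforce these rules, a deployment script or startup check could verify them before a change is rolled out. The following sketch encodes rules 1 to 3 as stated; the class and method names are made up for illustration.

```java
final class ConfigChangeCheck {
    /**
     * Returns true if moving from the old settings to the new ones respects rules 1-3.
     * BaseTime values are millisecond timestamps.
     */
    static boolean isSafeChange(long oldBaseTime, int oldWorkerBits, int oldSeqBits,
                                long newBaseTime, int newWorkerBits, int newSeqBits) {
        if (newBaseTime > oldBaseTime) {
            return false;                                  // rule 1: BaseTime may only move earlier
        }
        boolean reduced = newWorkerBits < oldWorkerBits || newSeqBits < oldSeqBits;
        if (!reduced) {
            return true;                                   // rule 2: increases are always allowed
        }
        // rule 3: a reduction is acceptable only if the new sum exceeds the old sum
        return newWorkerBits + newSeqBits > oldWorkerBits + oldSeqBits;
    }

    public static void main(String[] args) {
        long base = 1_582_136_402_000L;                    // arbitrary example timestamp
        System.out.println(isSafeChange(base, 6, 6, base, 6, 10));   // true
        System.out.println(isSafeChange(base, 6, 6, base, 4, 6));    // false
        System.out.println(isSafeChange(base, 6, 6, base, 4, 10));   // true
    }
}
```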

Automatically register WorkerId

❄ The unique ID generator relies on WorkerId. When business services need to be replicated horizontally and identically (automatic scaling), each instance must be able to register a globally unique WorkerId automatically before it generates IDs.

❄ This algorithm provides an open source dynamic library (implemented in Go) that can automatically register WorkerId through redis in container environments such as k8s.

❄ Registering WorkerId through redis is not the only way. You can also develop a centralized configuration service from which each endpoint service obtains its unique WorkerId at startup.

❄ Of course, if your services do not need to scale automatically, you do not need automatic WorkerId registration; just assign each of them a globally unique value.

❄ There are other approaches as well, such as developing a centralized ID generation service that generates usable IDs (single or in batches) for each endpoint service.
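
The general idea of registering a WorkerId with a limited lifetime can be sketched as follows. Everything here is an assumption made for illustration: an in-memory map stands in for redis, and the claim, renew and expire logic is a generic lease pattern rather than a description of what the dynamic library does (see the flow chart and reghelper.go for that).

```java
import java.util.concurrent.ConcurrentHashMap;

final class WorkerIdRegistry {
    // Stand-in for redis: WorkerId -> lease expiry time in ms (hypothetical storage layout).
    private final ConcurrentHashMap<Integer, Long> leases = new ConcurrentHashMap<>();

    /** Claims the first free WorkerId in [min, max]; returns -1 if none is free. */
    int register(int min, int max, long nowMs, long lifeTimeMs) {
        for (int id = min; id <= max; id++) {
            long expiry = nowMs + lifeTimeMs;
            boolean[] claimed = {false};
            // compute() is atomic per key, so two callers cannot claim the same WorkerId.
            leases.compute(id, (key, oldExpiry) -> {
                if (oldExpiry == null || oldExpiry <= nowMs) {
                    claimed[0] = true;
                    return expiry;
                }
                return oldExpiry;
            });
            if (claimed[0]) {
                return id;
            }
        }
        return -1;
    }

    /** Extends the lease; a live instance would call this periodically. */
    void renew(int workerId, long nowMs, long lifeTimeMs) {
        leases.put(workerId, nowMs + lifeTimeMs);
    }

    void unregister(int workerId) {
        leases.remove(workerId);
    }

    public static void main(String[] args) {
        WorkerIdRegistry registry = new WorkerIdRegistry();
        long now = System.currentTimeMillis();
        System.out.println(registry.register(30, 63, now, 15_000));
        System.out.println(registry.register(30, 63, now, 15_000));
    }
}
```

An instance that stops renewing its lease eventually frees its WorkerId for reuse, which is what makes automatic scaling up and down workable.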

Automatic registration flow chart

Image link: https://github.com/yitter/IdGenerator/blob/master/Tools/AutoRegisterWorkerId/regprocess.jpg

Source code path: /Go/source/regworkerid/reghelper.go

Dynamic library download

Download link: https://github.com/yitter/IdGenerator/releases/download/v1.3.3/workeridgo_lib_v1.3.3.zip

Dynamic library interface definition

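// Registers a WorkerId; all records already registered by this machine are unregistered first
// server: Redis connection address. Standalone example: 127.0.0.1:6379; sentinel/cluster example: 127.0.0.1:26380,127.0.0.1:26381,127.0.0.1:26382
// password: Redis connection password
// db: Redis database index, example: 1
// sentinelMasterName: service name in Redis sentinel mode, example: mymaster; pass an empty string when not using sentinel mode
// minWorkerId: minimum WorkerId, example: 30
// maxWorkerId: maximum WorkerId, example: 63
// lifeTimeSeconds: WorkerId cache lifetime (seconds, a multiple of 3), recommended value 15
extern GoInt32 RegisterOne(char* server, char* password, GoInt32 db, char* sentinelMasterName, GoInt32 minWorkerId, GoInt32 maxWorkerId, GoInt32 lifeTimeSeconds);

// Unregisters the WorkerId registered by this machine
extern void UnRegister();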

Implemented languages

Language	GitHub
C#	View example
Java	View example
Go	View example
Rust	View example
Python	View example
C	View example
C (PHP extension)	View example
Delphi (Pascal)	View example
JavaScript	View example
TypeScript	View example
V	View example
D	View example