Exchangis is WeBank's open-source, lightweight data exchange platform. It quickly transfers data between different storage media and addresses the complexity and compatibility problems that arise during data exchange. It is designed as a microservice architecture in which upstream and downstream services are loosely coupled, which makes customized, highly extensible iterative development straightforward.
Exchangis supports data transmission between structured and unstructured heterogeneous data sources. On the application layer, it offers business features such as data permission control, high availability of node services, and multi-tenant resource isolation. On the data layer, it offers architectural features such as diversified transmission architectures, pluggable modules, and loosely coupled components.
The transmission and exchange capabilities of Exchangis rely on the transmission engines it aggregates underneath. At the top layer, it defines a unified parameter model for the various data sources. Each transmission engine maps that parameter model onto its own configuration and converts it into the engine's input model. Each newly aggregated engine adds a class of capabilities to Exchangis, and any enhancement to an engine improves Exchangis in turn. By default, Exchangis aggregates and enhances Alibaba's DataX transmission engine.
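The mapping from a unified parameter model to engine-specific input can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `TransferTask` class, its field names, the `ENGINE_MAPPERS` registry, and the helper functions are hypothetical and are not the platform's actual API. The DataX output is only a simplified approximation of a DataX job document, and the Sqoop output is a plain argument list.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class TransferTask:
    """Hypothetical unified parameter model; field names are illustrative."""
    source_type: str
    source_params: Dict[str, Any]
    sink_type: str
    sink_params: Dict[str, Any]
    parallelism: int = 1


def to_datax_job(task: TransferTask) -> Dict[str, Any]:
    """Map the unified model onto a simplified DataX-style job document."""
    return {
        "job": {
            "setting": {"speed": {"channel": task.parallelism}},
            "content": [{
                "reader": {"name": f"{task.source_type}reader",
                           "parameter": dict(task.source_params)},
                "writer": {"name": f"{task.sink_type}writer",
                           "parameter": dict(task.sink_params)},
            }],
        }
    }


def to_sqoop_args(task: TransferTask) -> List[str]:
    """Map the unified model onto a Sqoop import argument list.

    Assumes the source parameters carry 'jdbcUrl' and 'table' keys and the
    sink parameters carry a 'path' key (illustrative key names).
    """
    return [
        "sqoop", "import",
        "--connect", str(task.source_params["jdbcUrl"]),
        "--table", str(task.source_params["table"]),
        "--target-dir", str(task.sink_params["path"]),
        "--num-mappers", str(task.parallelism),
    ]


# Registry of engine mappers: aggregating a new engine means adding one entry.
ENGINE_MAPPERS: Dict[str, Callable[[TransferTask], Any]] = {
    "datax": to_datax_job,
    "sqoop": to_sqoop_args,
}


def build_engine_input(engine: str, task: TransferTask) -> Any:
    """Convert the unified model into the chosen engine's input model."""
    try:
        mapper = ENGINE_MAPPERS[engine]
    except KeyError:
        raise ValueError(f"unsupported engine: {engine}") from None
    return mapper(task)
```

In this sketch, the same `TransferTask` can be handed to either engine, and supporting another engine only requires registering one more mapper function.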
1. Data source management
Share your own data sources by binding them to projects;
Set external access permissions on a data source to control the inflow and outflow of data.
2. Multiple transmission engine support
Transmission engines can be scaled horizontally;
Currently, the offline batch engine DataX is fully aggregated, and the big data batch import/export engine Sqoop is partially aggregated.
3. Near real-time task management and control
Quickly capture transmission task logs, transmission rates, and related information, and terminate tasks in real time;
Dynamically rate-limit tasks based on bandwidth conditions.
4. Support unstructured transmission
The DataX framework has been modified to provide a separate fast channel for binary streams, which suits pure data synchronization scenarios that need no data conversion.
5. Task status self-check
Monitor long-running tasks and tasks in an abnormal state, release occupied resources promptly, and raise alarms.
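The task status self-check described in item 5 can be sketched as a periodic scan over task records. This is a minimal Python illustration under stated assumptions, not the platform's implementation: `TaskRecord`, the state names, `self_check`, and the `alarm` callback are all hypothetical. The sketch also assumes that long-running tasks only trigger an alarm, while tasks in an abnormal state additionally have their resources released.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical set of states treated as abnormal.
ABNORMAL_STATES = {"FAILED", "UNKNOWN"}


@dataclass
class TaskRecord:
    """Hypothetical task record; field and state names are illustrative."""
    task_id: str
    state: str            # e.g. "RUNNING", "SUCCESS", "FAILED", "UNKNOWN"
    started_at: float     # seconds, on the same clock as `now`
    resources_held: bool = True


def self_check(tasks: Iterable[TaskRecord], now: float, max_runtime_s: float,
               alarm: Callable[[str], None]) -> List[str]:
    """Scan tasks once and return the IDs of the tasks that were flagged."""
    flagged: List[str] = []
    for task in tasks:
        overdue = (task.state == "RUNNING"
                   and now - task.started_at > max_runtime_s)
        abnormal = task.state in ABNORMAL_STATES
        if not (overdue or abnormal):
            continue
        if abnormal and task.resources_held:
            # Stand-in for releasing slots, connections, or memory.
            task.resources_held = False
        reason = "running too long" if overdue else f"abnormal state {task.state}"
        alarm(f"task {task.task_id}: {reason}")
        flagged.append(task.task_id)
    return flagged
```

A scheduler could call `self_check` at a fixed interval, passing the current time and a callback that forwards the message to whatever alerting channel is in use.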