DataSphere Studio (DSS for short) is a one-stop data application development and management portal developed by WeBank.
DSS is built on a plug-in integration framework and the computing middleware Linkis, so it can easily connect to various upper-layer data application systems and make data development simple and easy to use.
Under a unified UI, DataSphere Studio offers a workflow-style, graphical drag-and-drop development experience that covers the full data application development process: data exchange, desensitization and cleaning, analysis and mining, quality inspection, visual display, scheduled execution, and data output.
DSS adopts a pluggable integration framework design, allowing users to simply and quickly replace various functional components that DSS has integrated, or add new functional components according to needs.
Thanks to the connection, reuse and simplification capabilities of the Linkis computing middleware, DSS inherits financial-grade execution and scheduling capabilities such as high concurrency, high availability, multi-tenant isolation, and resource management and control.
Core features
Main features of DSS
1. One-stop, full-process application development management interface
DSS is highly integrated. The systems integrated so far include:
1. Data development IDE tool - Scriptis
2. Data visualization tool - Visualis (a secondary development of CreditEase's open-source project Davinci)
3. Data quality management tool - Qualitis
4. Workflow scheduling tool - Azkaban
DSS's plug-in framework design allows users to quickly replace the web systems that DSS has integrated, for example replacing Scriptis with Zeppelin, or Azkaban with DolphinScheduler.
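The replacement idea can be pictured with a small Python sketch. It is only an illustration under stated assumptions: `ComponentRegistry`, its methods, and the role names `ide` and `scheduler` are hypothetical and are not part of the DSS code base, which is written in Java and Scala.

```python
class ComponentRegistry:
    """Hypothetical registry mapping a functional role to the system that fills it."""

    def __init__(self) -> None:
        self._by_role: dict[str, str] = {}

    def register(self, role: str, system: str) -> None:
        # Registering an already-known role replaces the previous system.
        self._by_role[role] = system

    def system_for(self, role: str) -> str:
        return self._by_role[role]


registry = ComponentRegistry()
registry.register("ide", "Scriptis")
registry.register("scheduler", "Azkaban")

# Swap in alternatives without touching any other role.
registry.register("ide", "Zeppelin")
registry.register("scheduler", "DolphinScheduler")

print(registry.system_for("ide"), registry.system_for("scheduler"))
```

The point of the sketch is that callers ask for a role rather than a specific product, so replacing one system does not affect the rest.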
2. A unique AppJoint design concept based on the Linkis computing middleware
AppJoint is the core concept that lets DSS integrate various upper-layer web systems quickly and easily.
AppJoint (application joint) defines a unified set of front-end and back-end access specifications, allowing external data application systems to connect quickly and easily and become part of DSS data application development.
DSS connects multiple AppJoints in series to form a workflow that supports both real-time execution and periodic scheduling. Users can develop an entire data application simply by dragging and dropping.
Because AppJoint is connected to Linkis, external data application systems gain capabilities such as resource management and control, concurrency limiting, and user resource management. Context information can also be shared across systems, which eliminates application silos.
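As a rough illustration of AppJoints chained in series with shared context, consider the following Python sketch. It does not reproduce the DSS access specification: the `AppJoint` protocol, `run_workflow`, the two node classes, and the table name are all hypothetical, and real AppJoints execute through Linkis rather than in-process.

```python
from typing import Protocol


class AppJoint(Protocol):
    """Hypothetical contract that every integrated system would implement."""

    name: str

    def execute(self, context: dict) -> dict:
        """Run the node and return values to merge into the shared context."""
        ...


class SqlNode:
    name = "sql"

    def execute(self, context: dict) -> dict:
        # Pretend a query ran and produced a result table.
        return {"result_table": "tmp.daily_report"}


class EmailNode:
    name = "email"

    def execute(self, context: dict) -> dict:
        # Reads what the upstream node shared instead of being configured separately.
        return {"email_body": f"Report from {context['result_table']}"}


def run_workflow(nodes: list[AppJoint]) -> dict:
    """Execute nodes in series, passing one shared context between them."""
    context: dict = {}
    for node in nodes:
        context.update(node.execute(context))
    return context


final_context = run_workflow([SqlNode(), EmailNode()])
print(final_context["email_body"])
```

Because every node follows the same contract, the workflow runner needs no knowledge of which system sits behind a node.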
3. Project-level management unit
With the Project as the management unit, DSS organizes and manages the business applications of each data application system, and defines a set of common standards for collaborative project development across data application systems.
4. Integrated data application components
By implementing multiple AppJoints, DSS has integrated a variety of upper-layer data application systems, which cover most of users' data development needs.
If necessary, users can easily integrate new data application systems to replace or enrich the DSS data application development process.
1. DSS scheduling capability - Azkaban AppJoint
Many user data applications need to run on a periodic schedule.
Existing open-source scheduling systems are poorly integrated with other upper-layer data application systems and are difficult to connect to them.
By implementing the Azkaban AppJoint, DSS allows users to publish an orchestrated workflow to Azkaban for periodic scheduling with one click.
DSS also defines a standard, general-purpose set of workflow parsing and publishing specifications for scheduling systems, so that other scheduling systems can connect to DSS at low cost.
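To make the idea of parsing a workflow for a scheduler more concrete, the sketch below flattens a workflow graph into job definitions in a valid execution order. It is an assumption-laden illustration, not the DSS specification or the Azkaban job format: the node names, the `to_scheduler_jobs` function, and the output dictionary layout are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each node maps to the set of nodes it depends on.
workflow = {
    "import_data": set(),
    "clean_data": {"import_data"},
    "quality_check": {"clean_data"},
    "dashboard": {"clean_data"},
    "send_email": {"quality_check", "dashboard"},
}


def to_scheduler_jobs(dag: dict[str, set[str]]) -> list[dict]:
    """Flatten a workflow DAG into job definitions in a valid execution order."""
    order = TopologicalSorter(dag).static_order()
    return [{"job": name, "dependencies": sorted(dag[name])} for name in order]


jobs = to_scheduler_jobs(workflow)
for job in jobs:
    print(job)
```

A scheduler-specific publisher would then translate each job definition into whatever format the target scheduling system expects.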
2. Data development - Scriptis AppJoint
What is Scriptis?
Scriptis is a web-based data analysis tool that supports writing SQL, PySpark, HiveQL and other scripts online and submitting them to Linkis for execution. It also supports enterprise-grade features such as UDFs, functions, resource management and control, and intelligent diagnosis.
The Scriptis AppJoint integrates the data development capabilities of Scriptis into DSS and allows the script types supported by Scriptis to take part in application development as nodes of a DSS workflow.
Currently, script node types such as HiveSQL, SparkSQL, PySpark, and Scala are supported.
3. Data visualization - Visualis AppJoint
What is Visualis?
Visualis is a data visualization BI tool built on Davinci, an open-source project from CreditEase. It provides financial-grade data visualization capabilities with respect to data security and permissions.
The Visualis AppJoint integrates the data visualization capabilities of Visualis into DSS and allows data screens and dashboards to be used as nodes in a DSS workflow and associated with the upstream data mart.
4. Data quality - Qualitis AppJoint
Qualitis AppJoint integrates data quality verification capabilities for DSS, integrates the data quality system into DSS workflow development, and verifies data integrity and correctness.
5. Data sending - Sender AppJoint
The Sender AppJoint integrates data sending capabilities into DSS and currently supports the SendEmail node type. The result sets of all other nodes can be sent via email.
For example, a SendEmail node can send a data screen directly as an email.
6. Signal node - Signal AppJoint
The EventChecker AppJoint is used to strengthen both the decoupling and the interconnection between business and processes.
DataChecker: a node that checks whether a database table partition exists.
EventSender: a node that sends messages across workflows and projects.
EventReceiver: a node that receives messages across workflows and projects.
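The interplay of these three node types can be sketched in Python as follows. This is a simplified model under stated assumptions: `MessageBus`, `data_checker`, the topic name, and the partition strings are hypothetical, and an in-memory dictionary stands in for whatever store the real nodes share.

```python
from collections import defaultdict
from typing import Optional


class MessageBus:
    """Hypothetical in-memory stand-in for the store that signal nodes share."""

    def __init__(self) -> None:
        self._messages = defaultdict(list)

    def send(self, topic: str, payload: str) -> None:
        # What an EventSender-style node would do.
        self._messages[topic].append(payload)

    def receive(self, topic: str) -> Optional[str]:
        # What an EventReceiver-style node would do: oldest message or None.
        queue = self._messages[topic]
        return queue.pop(0) if queue else None


def data_checker(existing_partitions: set[str], partition: str) -> bool:
    """DataChecker-style test: does the required partition exist?"""
    return partition in existing_partitions


bus = MessageBus()

# Workflow A: signal downstream only once the partition is present.
partitions = {"ods.orders/ds=2024-10-23"}
if data_checker(partitions, "ods.orders/ds=2024-10-23"):
    bus.send("orders_ready", "ds=2024-10-23")

# Workflow B (possibly in another project): wait for the signal.
signal = bus.receive("orders_ready")
print(signal)
```

The two workflows never reference each other directly; they only agree on a topic name, which is what keeps them decoupled.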
7. Function node
Empty node and sub-workflow node.
8. Node expansion
According to needs, users can simply and quickly replace various functional components that have been integrated by DSS, or add new functional components.
Usage scenarios
DataSphere Studio is suitable for the following scenarios
1. Scenarios where a big data platform is being built or has preliminary capabilities, but no data application tools are available.
2. Scenarios where big data basic platform capabilities are already available and there are only a few data application tools.
3. Scenarios where basic big data platform capabilities and a full set of data application tools are already available, but the tools are not yet connected, so users experience them as isolated and face high learning costs.
4. Scenarios where basic big data platform capabilities and a full set of data application tools are already available and some tools are connected, but no unified standard has been defined yet.
Additional Information
Version: 1.2.1
Type: Java source code
Update Time: 2024-10-23
Size: 61.2MB
Language: Simplified Chinese