This topic describes the features and basic use scenarios of DataWorks modules.
Data processing procedure
DataWorks is an end-to-end data development and governance platform. The data processing procedure includes the phases that are shown in the following figure.
DataWorks modules
| Feature directory | Module | Description |
| --- | --- | --- |
| Data integration | Data Integration | Data Integration is a stable, efficient, and scalable data synchronization service. |
| Data integration | Upload and Download | Upload and Download allows you to upload data from multiple data sources, such as CSV files on an on-premises machine and Object Storage Service (OSS) objects, to big data compute engines such as MaxCompute for processing and analysis. |
| Data development and O&M | Data Modeling | Data Modeling is the first step of end-to-end data governance and builds on the modeling methodology of the Alibaba data mid-end. It interprets the business data of an enterprise from a business perspective, and lets people across the enterprise quickly understand and share how business data is measured and interpreted, in compliance with data warehousing specifications. |
| Data development and O&M | Data Studio (new version, available when Participate in Public Preview of DataStudio of New Version is turned on) | Data Studio is an end-to-end big data development service that supports online development of data processing tasks for big data compute engines such as MaxCompute, E-MapReduce (EMR), Hologres, Realtime Compute for Apache Flink, and AnalyticDB. |
| Data development and O&M | Operation Center | Operation Center is a big data O&M and monitoring system. |
| Data governance | Data Map | Data Map is an enterprise-grade data management system. Based on the underlying unified metadata services, it lets you manage, sort, quickly search for, and gain an in-depth understanding of data objects. |
| Data governance | Data Quality | Data Quality is a unified data quality check system. It is deeply integrated with the task scheduling system of DataWorks to help you identify quality issues as early as possible and prevent them from escalating, so that the business receives reliable data. |
| Data governance | Data Asset Governance | Data Asset Governance is a unified asset governance system. It automatically identifies items to be governed based on accumulated rules, and provides governance and optimization solutions that cover pre-event issue prevention and post-event issue resolution in multiple governance fields. This helps you carry out data governance proactively and systematically. |
| Data governance | Security Center | Security Center is an end-to-end data security governance platform that covers classification of data assets, sensitive data identification, management of data-related authorization, masking of sensitive data, audit of access to sensitive data, and risk identification and response. Security Center helps you identify data security governance issues. |
| Data analysis and service | DataAnalysis | DataAnalysis provides lightweight analysis tools and data analysis capabilities, such as SQL query, workbook, visualized analysis, and intelligent data insight, and lets you easily connect to different types of data sources and compute engines. It is intended for data analysts and business operations personnel in business insight scenarios such as daily data acquisition, data query, and report analysis. |
| Data analysis and service | DataService Studio | DataService Studio is a flexible, lightweight, secure, and stable API construction system. It provides comprehensive data service and sharing capabilities for individuals, teams, and enterprises to help manage internal and external API services in a centralized manner. |
| More | Management Center | Management Center is a unified management interface that gives administrators access to key features such as workspace common configurations, data sources, computing resources, members and roles, and tenant configurations. You can use Management Center to manage and optimize resources, keep workspaces running smoothly, and adjust configurations based on your business requirements. |
| More | Approval Center | Approval Center allows you to manage sensitive behaviors and permissions on data, configure approval policies, and process requests. It helps enterprises meet approval requirements in internal compliance scenarios. |
| More | Migration Assistant | Migration Assistant is an end-to-end task migration system. You can use Migration Assistant to migrate the tasks of open source scheduling engines, such as Oozie, Azkaban, Airflow, and DolphinScheduler, to DataWorks, and to back up and restore data development outcomes in DataWorks. |
| More | Open Platform | Open Platform provides the OpenAPI, OpenEvent, and Extensions sub-modules, which help you quickly connect various application systems to DataWorks. |