Module 1: Introduction to data integration and Cloud Data Fusion
Contents:
- Data integration: what, why, challenges.
- Data integration tools used in industry.
- User personas.
- Introduction to Cloud Data Fusion.
- Data integration critical capabilities.
- Cloud Data Fusion UI components.
Objectives:
- Understand the need for data integration.
- List the situations where data integration can help businesses.
- List the available data integration platforms and tools.
- Identify the challenges with data integration.
- Understand the use of Cloud Data Fusion as a data integration platform.
- Create a Cloud Data Fusion instance (see the sketch after this list).
- Become familiar with the core framework and major components of Cloud Data Fusion.
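As a preview of the instance-creation objective, here is a minimal Python sketch against the documented Data Fusion v1 REST API; the project ID, region, and instance name are placeholders to substitute with your own values.

```python
# Minimal sketch: create a Cloud Data Fusion instance via the v1 REST API.
# PROJECT, REGION, and INSTANCE_ID are placeholders.
import google.auth
from google.auth.transport.requests import AuthorizedSession

PROJECT = "my-project"        # hypothetical project ID
REGION = "us-central1"
INSTANCE_ID = "my-instance"   # hypothetical instance name

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

url = (f"https://datafusion.googleapis.com/v1/projects/{PROJECT}"
       f"/locations/{REGION}/instances?instanceId={INSTANCE_ID}")
# "BASIC" edition; other documented values are "DEVELOPER" and "ENTERPRISE".
resp = session.post(url, json={"type": "BASIC"})
resp.raise_for_status()
print(resp.json()["name"])  # name of the long-running create operation
```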
Module 2: Building pipelines
Contents:
- Cloud Data Fusion architecture.
- Core concepts.
- Data pipelines and directed acyclic graphs (DAGs).
- Pipeline lifecycle.
- Designing pipelines in Pipeline Studio.
Objectives:
- Understand Cloud Data Fusion architecture.
- Define what a data pipeline is.
- Understand the DAG representation of a data pipeline.
- Learn to use Pipeline Studio and its components.
- Design a simple pipeline using Pipeline Studio.
- Deploy and execute a pipeline (see the sketch after this list).
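A minimal sketch of the deploy-and-execute objective, assuming a pipeline spec exported from Pipeline Studio (pipeline.json) and using the CDAP REST API that every Cloud Data Fusion instance exposes; the endpoint URL and pipeline name are placeholders.

```python
# Minimal sketch: deploy and start a batch pipeline through the CDAP REST API.
# CDAP_ENDPOINT is the instance's apiEndpoint (from the instances.get
# response); pipeline.json is assumed to be an exported pipeline spec.
import json
import google.auth
from google.auth.transport.requests import AuthorizedSession

CDAP_ENDPOINT = "https://EXAMPLE.datafusion.googleusercontent.com/api"  # placeholder
PIPELINE = "my_pipeline"  # hypothetical pipeline name

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

# Deploy: PUT the exported pipeline spec as a CDAP application.
with open("pipeline.json") as f:
    spec = json.load(f)
session.put(f"{CDAP_ENDPOINT}/v3/namespaces/default/apps/{PIPELINE}",
            json=spec).raise_for_status()

# Execute: batch pipelines run as the DataPipelineWorkflow program.
session.post(f"{CDAP_ENDPOINT}/v3/namespaces/default/apps/{PIPELINE}"
             "/workflows/DataPipelineWorkflow/start").raise_for_status()
```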
Module 3: Designing complex pipelines
Contents:
- Branching, merging, and joining.
- Actions and notifications.
- Error handling and macros.
- Pipeline configurations, scheduling, import, and export.
Objectives:
- Perform branching, merging, and join operations.
- Execute pipelines with runtime arguments using macros (see the sketch after this list).
- Work with error handlers.
- Run pre- and post-pipeline tasks with the help of actions and notifications.
- Schedule pipelines for execution.
- Import and export existing pipelines.
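A minimal sketch of running a deployed pipeline with runtime arguments, assuming a pipeline whose configuration references a macro such as ${input.path}; the endpoint, pipeline name, and argument names are placeholders. The start endpoint accepts runtime arguments as its JSON body, and those values resolve the macros at run time.

```python
# Minimal sketch: start a deployed pipeline with runtime arguments that
# resolve macros such as ${input.path} in the pipeline configuration.
import google.auth
from google.auth.transport.requests import AuthorizedSession

CDAP_ENDPOINT = "https://EXAMPLE.datafusion.googleusercontent.com/api"  # placeholder
PIPELINE = "my_pipeline"  # hypothetical pipeline name

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

runtime_args = {
    "input.path": "gs://my-bucket/input/2024-01-01/*.csv",  # fills ${input.path}
}
session.post(
    f"{CDAP_ENDPOINT}/v3/namespaces/default/apps/{PIPELINE}"
    "/workflows/DataPipelineWorkflow/start",
    json=runtime_args,
).raise_for_status()
```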
Module 4: Pipeline execution environment
Contents:
- Schedules and triggers.
- Execution environment: compute profiles and provisioners.
- Monitoring pipelines.
Objectives:
- Understand the composition of an execution environment.
- Configure a pipeline's execution environment and logging.
- Understand compute profiles and provisioners.
- Create a compute profile.
- Create pipeline alerts.
- Monitor a pipeline during execution (see the sketch after this list).
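A minimal sketch of running a pipeline against a named compute profile and monitoring the run through the CDAP runs endpoint; the endpoint, pipeline, and profile names are placeholders, and the system.profile.name runtime argument is one documented way to select a profile at run time.

```python
# Minimal sketch: run a pipeline on a named compute profile and poll its
# status via the /runs endpoint (STARTING, RUNNING, COMPLETED, FAILED, ...).
import time
import google.auth
from google.auth.transport.requests import AuthorizedSession

CDAP_ENDPOINT = "https://EXAMPLE.datafusion.googleusercontent.com/api"  # placeholder
PIPELINE = "my_pipeline"  # hypothetical pipeline name
BASE = (f"{CDAP_ENDPOINT}/v3/namespaces/default/apps/{PIPELINE}"
        "/workflows/DataPipelineWorkflow")

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

# "USER:" prefix assumes a user-scoped profile, per CDAP profile naming.
session.post(f"{BASE}/start",
             json={"system.profile.name": "USER:my-dataproc-profile"}
             ).raise_for_status()

# Poll the most recent run until it reaches a terminal state.
while True:
    runs = session.get(f"{BASE}/runs").json()
    status = runs[0]["status"] if runs else "STARTING"
    print(status)
    if status in ("COMPLETED", "FAILED", "KILLED"):
        break
    time.sleep(30)
```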
Module 5: Building transformations and preparing data with Wrangler
Contents:
- Wrangler.
- Directives.
- User-defined directives.
Objectives:
- Understand the use of Wrangler and its main components.
- Transform data using the Wrangler UI.
- Transform data using directives via the Wrangler CLI (see the recipe sketch after this list).
- Create and use user-defined directives.
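A small example of a Wrangler recipe, assembled here as a Python string for readability; the column names are hypothetical, and each line is a standard Wrangler directive of the kind entered in the Wrangler UI command line or stored in a Wrangler transform's directives field.

```python
# Minimal sketch: a Wrangler recipe built from standard directives.
# Column names (:body, :cust_name, :price, :customer) are hypothetical.
recipe = "\n".join([
    "parse-as-csv :body ',' true",              # split :body on commas; first row is header
    "drop :body",                               # remove the original unparsed column
    "rename :cust_name :customer",              # rename a column
    "set-type :price double",                   # cast a column to double
    "fill-null-or-empty :customer 'unknown'",   # default missing values
])
print(recipe)
```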
Module 6: Connectors and streaming pipelines
Contents:
- Connectors.
- The Cloud Data Loss Prevention (DLP) API.
- Reference architecture for streaming applications.
- Building streaming pipelines.
Objectives:
- Understand the data integration architecture.
- List various connectors.
- Use the Cloud Data Loss Prevention (DLP) API (see the sketch after this list).
- Understand the reference architecture of streaming pipelines.
- Build and execute a streaming pipeline.
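A minimal sketch of calling the DLP API directly with the google-cloud-dlp Python client, which mirrors the kind of inspection the Data Fusion DLP plugins configure for you; the project ID, infoTypes, and sample text are placeholders.

```python
# Minimal sketch: inspect a text value for sensitive data with the Cloud DLP
# API (google-cloud-dlp client library). Project and sample data are placeholders.
from google.cloud import dlp_v2

client = dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # hypothetical project

response = client.inspect_content(
    request={
        "parent": parent,
        "inspect_config": {
            "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
            "min_likelihood": "POSSIBLE",
        },
        "item": {"value": "Contact jane@example.com or 555-0100."},
    }
)
for finding in response.result.findings:
    print(finding.info_type.name, finding.likelihood)
```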
Module 7: Metadata and data lineage
Contents:
- Types of metadata: business, technical, and operational.
- Data lineage.
Objectives:
- List types of metadata.
- Differentiate between business, technical, and operational metadata.
- Understand what data lineage is.
- Understand the importance of maintaining data lineage.
- Differentiate between metadata and data lineage (see the sketch after this list).
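A minimal sketch of fetching a dataset's metadata and lineage, assuming the CDAP v3 metadata and lineage endpoints exposed by the instance; the endpoint URL and dataset name are placeholders.

```python
# Minimal sketch: read metadata and dataset-level lineage over the CDAP REST
# API. Endpoint and dataset name are placeholders.
import time
import google.auth
from google.auth.transport.requests import AuthorizedSession

CDAP_ENDPOINT = "https://EXAMPLE.datafusion.googleusercontent.com/api"  # placeholder
DATASET = "my_dataset"  # hypothetical dataset name

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"])
session = AuthorizedSession(credentials)

# Business/technical/operational metadata attached to the dataset.
meta = session.get(
    f"{CDAP_ENDPOINT}/v3/namespaces/default/datasets/{DATASET}/metadata").json()

# Lineage: which programs read or wrote the dataset in the last 24 hours.
end = int(time.time())
lineage = session.get(
    f"{CDAP_ENDPOINT}/v3/namespaces/default/datasets/{DATASET}/lineage",
    params={"start": end - 86400, "end": end}).json()
print(meta, lineage)
```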
Module 8: Summary