Detailed Course Outline
Installation Overview (Quick Start)
- Cloudera Management Console
- CDP Credentials
- CDP Control Plane Regions
- Register a CDP Environment
Cloudera Data Platform
- Industry Trends for Big Data
- The Challenge to Become Data-Driven
- The Enterprise Data Cloud
- CDP Overview
- CDP Form Factors
CDP Architecture
- Overview
- Key Concepts & Components
- CDP Runtime Overview
- Minimum Hardware
- Outbound Connections
Control Plane Overview
- Accessing and Managing an Environment
- Data Management Overview
- Management Console
- Dashboard
- Environments
- Data Lakes
- User Management
- Classic Clusters
- Data Hubs
- Data Catalog
- Replication Manager
- Observability
CDP CLI (Command Line Interface)
- CDP CLI (Command Line Interface)
- Installing CDP CLI / CLI Client Setup
- CLI Modules
- Generating an API Access Key / Configuring the CDP Client
- Logging in to the CDP CLI/SDK
- Configuring CLI Autocomplete / CLI Reference / Accessing CLI Help
- CDP API Overview / CDP SDK for Java Overview / CDP curl Overview
Managing CDP Access
- Management Console
- User Management
- Create Machine User
- User Permissions
- Sync Users
- Configure Groups
- Identity Providers
- Roles and Resource Roles
- Global Settings
- Audit Data Storage Credential
Data Hubs Overview
- Data Hubs
- Planning / Creating your Data Hub Cluster
- General Planning Considerations
- Configuring Nodes
- Managing Data Hub
- Choosing the Right Hardware
- Advanced Cluster Configuration
- Data Hub Types
- DataFlow
- Data Engineering
- Troubleshooting
Managing Data Hubs
- Best Practices for Data Hubs
- Sizing Data Hubs
- Cloudera Manager
- Data Hub Services
- Autoscaling / Data Hub Info
- Checking Cluster Health Status / Events and Alerts
- Host Maintenance
- Upgrading a Data Hub Cluster
- Monitoring / Monitoring Features
Data Services Overview
- Data Services Overview
- Data Services
- Planning Your Data Service Cluster
- Choosing the Right Hardware / Network Considerations
- Creating Data Services
- DataFlow
- Data Engineering
- Data Warehouse
- Operational Database
- Machine Learning
- Troubleshooting
DataFlow
- DataFlow Service Overview
- Data Ingest Overview
- Ingesting Data using File Transfer or REST Interfaces
- Ingesting Data Using NiFi
- Autoscaling
Data Engineering
- Data Engineering Service Overview
- Apache Spark/Flink/Kafka Streams Overview
- Autoscaling
Data Warehouse
- Data Warehouse Service Overview
- Adding and Managing a Database Catalog
- Adding and Tuning a Virtual Warehouse
- Querying a Data Warehouse
- Data Visualization
- Monitoring & Troubleshooting
Operational Database
- Operational Database Service Overview
- Apache HBase/Search Overview
- Autoscaling
Machine Learning
- Machine Learning Service Overview
- CML Engines
- Requirements for CML Workspaces
- Provisioning a CML Workspace
- CML Autoscaling
- Monitoring
Monitoring and Management
- Monitoring and Management in CDP Public Cloud
- Data Lake Cluster Monitoring and CDP Auditing
- Getting Started with Monitoring in CDP
- Monitoring with Cloudera Manager: Health Tests and Dashboards
- Monitoring Clusters, Services, Hosts, Roles, and Activities
- Troubleshooting Cluster Configuration and Operation
Data Management
- SDX - Security and Governance
- Security Concepts
- Access Cloud Storage
- Data Lake Security: SDX
- Apache Ranger
- CDP Authorization / Authentication
- Data Governance
- Apache Atlas
- Data Catalog
Observability
- Overview
- Support
- Observability Deployment Architecture
- Monitoring Capabilities
- Working with Alerts, Costs, and Reports