Common Use Cases

SQL Querying and Analytics

Data analysts and engineers query data directly from the browser using a distributed SQL engine that scales across large datasets without requiring data movement. Query results stay inside the cluster; nothing is cached or stored on Cogrion infrastructure.
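To illustrate the kind of aggregate query analysts run, here is a minimal sketch. Cogrion's actual client API is not shown on this page, so an in-memory SQLite database stands in for the SQL engine; the table name, columns, and data are invented for the example — only the query shape matters.

```python
import sqlite3

# Stand-in for the platform's SQL engine: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 120.0), ("emea", 80.0), ("apac", 50.0)],
)

# A typical analytics query: revenue per region, largest first.
rows = conn.execute(
    "SELECT region, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY revenue DESC"
).fetchall()
print(rows)  # [('emea', 200.0), ('apac', 50.0)]
```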

Teams build and share dashboards on top of saved queries. Dashboards support a wide range of chart types, cross-filtering, and scheduled refresh — designed for both technical users and business stakeholders who need to explore the same data in different ways.

Data Pipeline Orchestration

Data engineers define pipelines as directed acyclic graphs (DAGs), where tasks declare their dependencies and the platform handles scheduling, retry logic, and execution order. Pipelines run inside the tenant cluster with access to the cluster's storage and compute resources.
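The DAG model above can be sketched in a few lines. This is not the platform's SDK — the task names are invented, and Python's standard-library `graphlib` stands in for the scheduler to show how declared dependencies determine execution order, with a simple retry wrapper of the kind the platform applies automatically.

```python
from graphlib import TopologicalSorter

# Each task declares the tasks it depends on (names are illustrative).
tasks = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "report": ["load", "transform"],
}

# The scheduler derives a valid execution order from the dependencies.
order = list(TopologicalSorter(tasks).static_order())
print(order)

def run_with_retries(fn, attempts=3):
    """Re-run a failing task up to `attempts` times before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
```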

Monitoring is built in: graph and Gantt views show execution status in real time, task logs are accessible immediately after a run, and failures surface before downstream consumers are affected.

Data Discovery and Governance

Before data can be used, it needs to be understood. The catalog gives teams a central view of what tables and schemas exist, who owns them, and how they relate to one another. Lineage tracing shows how data moves from source to consumer, making it easier to assess the impact of upstream changes.
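Impact assessment over lineage reduces to a graph walk: starting from a changed table, collect everything downstream of it. The sketch below uses invented table names and a hand-built edge map, not the catalog's real API, to show the idea.

```python
from collections import deque

# Hypothetical lineage edges: a table -> the tables derived from it.
lineage = {
    "raw.events": ["staging.events_clean"],
    "staging.events_clean": ["marts.daily_active", "marts.funnel"],
    "marts.daily_active": ["dashboards.kpi"],
}

def downstream(table):
    """Every table affected if `table` changes (breadth-first walk)."""
    seen, queue = set(), deque([table])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

print(sorted(downstream("raw.events")))
```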

Access control integrates directly into the catalog. Permissions are assigned at the catalog, schema, and table level and enforced at query time — so discovery and access management stay in sync.
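One common way to resolve permissions granted at multiple levels is most-specific-wins: a table-level grant overrides a schema-level one, which overrides a catalog-level one. The snippet below is a sketch of that resolution rule under that assumption — the grant names and scopes are made up, and this is not Cogrion's enforcement code.

```python
# Hypothetical grants keyed by scope; the most specific scope wins.
grants = {
    ("analytics",): "read",
    ("analytics", "finance"): "none",
    ("analytics", "finance", "ledger"): "read",
}

def effective_permission(catalog, schema, table):
    """Check table, then schema, then catalog scope; first match wins."""
    for scope in [(catalog, schema, table), (catalog, schema), (catalog,)]:
        if scope in grants:
            return grants[scope]
    return "none"
```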

Machine Learning and AI

Data scientists and ML engineers work through the full model lifecycle on the platform. Experiments track training runs, hyperparameters, and metrics across iterations, making it straightforward to compare approaches and reproduce results. The feature store manages reusable feature groups with defined ownership and storage backends, so features built for one model can be shared across the team.
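At its core, experiment comparison is a query over logged runs: each run records its hyperparameters and metrics, and the best one is picked by metric. The run IDs, parameter names, and scores below are invented for illustration; the experiment-tracking API itself is not shown here.

```python
# Toy experiment log: each run records hyperparameters and a metric.
runs = [
    {"run_id": 1, "params": {"lr": 0.1},   "val_accuracy": 0.81},
    {"run_id": 2, "params": {"lr": 0.01},  "val_accuracy": 0.88},
    {"run_id": 3, "params": {"lr": 0.001}, "val_accuracy": 0.85},
]

# Compare iterations and pick the strongest run for promotion.
best = max(runs, key=lambda r: r["val_accuracy"])
print(best["run_id"], best["params"])  # 2 {'lr': 0.01}
```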

Trained models are registered and versioned in a model registry. Teams promote versions through staging and production stages, and deploy them as REST API endpoints through AI Gateway — which supports both internally hosted models and external model providers.
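The staging-to-production promotion flow can be modeled as a small state machine over model versions. The class below is a toy sketch of that flow — the model name, version numbers, and the rule that promoting a new production version archives the old one are assumptions for illustration, not the registry's actual behavior.

```python
class ModelRegistry:
    """Toy registry: (name, version) pairs, each carrying a stage."""

    def __init__(self):
        self.versions = {}  # (name, version) -> stage

    def register(self, name, version):
        self.versions[(name, version)] = "none"

    def promote(self, name, version, stage):
        assert stage in {"staging", "production"}
        if stage == "production":
            # Assumed rule: one production version per model at a time.
            for key, current in self.versions.items():
                if key[0] == name and current == "production":
                    self.versions[key] = "archived"
        self.versions[(name, version)] = stage

registry = ModelRegistry()
registry.register("churn", 1)
registry.promote("churn", 1, "production")
registry.register("churn", 2)
registry.promote("churn", 2, "staging")
registry.promote("churn", 2, "production")
print(registry.versions)
```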

For teams that want quick model selection without manual tuning, AutoML runs the search and training process automatically and surfaces evaluation metrics for review.
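Conceptually, that selection loop evaluates each candidate on held-out data and keeps the top scorer. The sketch below shows the loop with made-up candidate "models" (plain functions) and a toy accuracy scorer — AutoML's real search space and training procedure are not described on this page.

```python
# Toy AutoML-style selection: score each candidate, keep the best.
def evaluate(predict, data):
    """Fraction of (x, y) pairs the candidate predicts correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)

validation = [(0, 0), (1, 1), (2, 1), (3, 1)]

candidates = {
    "always_one": lambda x: 1,
    "threshold_2": lambda x: int(x >= 2),
    "identity_clip": lambda x: min(x, 1),
}

scores = {name: evaluate(fn, validation) for name, fn in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)  # identity_clip scores 1.0 on this toy set
```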

Notebook-Based Collaboration

Data teams do their day-to-day work in collaborative notebooks — combining code, outputs, and documentation in a shared environment. Notebooks are organized in workspaces with folder hierarchies, per-item sharing controls, and scheduling support for recurring executions.