Modeling Layer

A semantic layer acts as a translation layer between your raw data and business definitions. Zing lets you do data modeling and create a semantic layer easily to produce trustable, governed results.

Data modeling (semantic layer)


A semantic layer acts as a translation layer between your raw data and business definitions. It provides a business-friendly interface to your data warehouse, allowing you to define metrics, relationships, and access controls in a single place. Think of it as your organization’s single source of truth for metric definitions, and a way to ensure only certain metrics and columns are used for analysis to keep things focused for end users.

 

While Zing already offers capabilities to define metrics examples, create aliases, and define join relationships, a semantic layer goes beyond this to let you to specify:

  • Which fields you want to expose from your tables / views
  • Metrics calculations and descriptions
  • Dimensions which must co-occur with metrics (for instance, only show order_count by city but never by customer_name)
  • Table / view join relationships
  • Field and table aliases
  • Row level security requirements

Zing’s Implementation

We’ve taken a straightforward approach to semantic layer implementation, offering a JSON-based configuration that makes it easy to define your metrics and their relationships.

Importantly, Zing does not require you to have a data model to work, and a semantic model can be added later as your organization’s needs for consistency in metrics, guardrails, and governance increase.

Columns, Metrics and Relationships

A data model is defined at the data source level.

Columns and Tables

You can choose which tables/views, columns, and metrics are used to answer questions by defining a model. This allows you to put guardrails on natural language questions that users may ask within Zing, and ensures data users have a consistent, governed experience.

Specifying tables/views and columns is not required. If no tables/views are specified in your modeling layer, then by default, all tables/views within data source will be made available to answer questions.

If no columns are specified, then all columns within a given table/view will be made available.

Metrics

Metrics are defined by a:

  • A description describing the metric (and used as context to map user questions to this metric)
  • An expression which is the calculation for the metric
  • A Group bys section which lets you indicate is a metric can be aggregated by various columns (‘allowed’), must be aggregated by one or more columns (‘required’), or cannot be aggregated by one or more columns (‘forbidden’)
    • Allowed
    • Required
    • Forbidden
    • Note: that these group by attributes are optional and if they are not specified then a metric can be aggregated by any column in the columns part of the date model

A sample data model file for Zing is shown below:

{
  "tables": [
    {
    "name": "Orders",
      "columns": [
        {
          "name": "OrderID",
          "description": "Unique identifier for the order"
        },
        {
          "name": "CustomerID",
          "description": "Unique identifier for the customer"
        },
        {
          "name": "Item_Amount",
          "description": "Total revenue excluding taxes of all items in the order"
        },
        {
          "name": "Tax_Amount",
          "description": "Tax amount paid on the order"
        },
        {
          "name": "Total_Amount",
          "description": "Total order amount including tax"
        }
      ]
    },
    {
      "name": "Customers",
      "columns": [
        {
          "name": "CustomerID",
          "description": "Unique identifier for the customer"
        },
        {
          "name": "Name",
          "description": "Name of the customer"
        },
        {
          "name": "City",
          "description": "City of the customer"
        },
        {
          "name": "State",
          "description": "State of the customer"
        }
      ]
      }
  ],
  "metrics": [
    {
      "description": "Tax_Percentage",
      "expr": "sum(Tax_Amount)/sum(Item_Amount)",
      "groupBy": {
        "allowed": [
          "City",
          "State"
        ],
        "required": [],
        "forbidden":[]
      }
    },
    {
      "description": "Order_Amount",
      "expr": "sum(Total_Amount)",
      "groupBy": {
        "allowed": [
        ],
        "required": [],
        "forbidden": ["CustomerID", "Name"]
      }
    }
  ]
}

This would ensure that:

  • Order amount can’t be shown by Customer Name or Customer ID

  • Tax rate is only shown at a City or State level

  • Only the Orders and Customers tables/views are available to answer questions, and only with the specific columns outlined.

     

Getting Started

  1. Log in to the Zing Data Console, then click on ‘data sources’ and select the data source you’d like to create (or update) a modeling layer for
  2. Click the ‘semantic layer’ tab
  3. Upload a semantic model .json file in the format outlined above
  4. Any questions that a user asks with natural language, will adhere to the modeling layer you have uploaded.

    Questions asked with the visual editor, or using the SQL editor do not adhere to the data modeling layer you’ve uploaded because those interfaces are typically used for more customized or ad-hoc analysis.
  5. Note that users with an editor or admin role can optionally ask natural language questions which are not bound to the semantic layer by clicking on the options drop-down within the natural language composer and selecting ‘Expert mode’ instead of ‘Semantic model mode’