Interacting with the Metrics API

What is the Metrics API?

  • The Metrics API, Peak’s Semantic Layer, is an abstraction layer between the raw database and end users, allowing users to access data, metrics, and metadata in a user-friendly and understandable format.

  • For example, imagine we’ve built an inventory application. It might use anywhere from tens to hundreds of internal tables. Users may want to retrieve data from the application, but they shouldn’t need to understand the entire database structure to do so.

  • One option is to create dashboards for users. However, dashboards can limit flexibility. Users may want to group data by multiple columns or filter it based on specific fields, and accommodating all these variations in a static dashboard can be highly complex.

  • That’s where the Metrics API comes in. It allows us to define the models of our tables and associated metrics, enabling users to query these metrics with custom filters and groupings to obtain the data they need. Users don’t need to worry about the underlying database or tables, only the metrics and related data that are made available to them.

  • The Metrics API is Peak’s implementation of a Semantic Layer, providing the ability to manage and query metrics efficiently.

Metrics API

  • The Metrics API is built on top of an open-source tool called CubeJS. To define models, metrics, and more, we follow the CubeJS syntax.

  • Once we have the metrics defined in a collection of YAML files, we are ready to publish them via the Metrics API. Let’s see how the Metrics API can be used to manage and query these metrics.

  • Before diving into the functions and usage, let’s go over some key terms used by the Metrics API:

    • Collection

      • A collection is a group of related metrics and other resources that belong together.

      • Creating a collection alone does not make the metrics available to tenants. Metrics must be “published” using the collection before they become available for users to query.

      • A collection can be scoped as either PUBLIC (available to all tenants) or PRIVATE (available only to the tenant it is created on). By default, a collection is scoped as PRIVATE.

    • Namespace

      • A namespace allows us to group published metrics for a tenant to prevent conflicts.

      • A default namespace called default is automatically created for each tenant and is used if no other namespace is specified.

      • We recommend creating a new namespace for development and keeping all production metrics in the default namespace.

    • Cube

      • A cube represents a table of related data.

    • Measures

      • Measures are aggregations performed on columns, such as sums, counts, or percentages.

      • These are essentially the “metrics” of the system.

    • Dimensions

      • Dimensions are attributes used to filter or group data, such as status, type, or categories.

    • View

      • Views sit on top of cubes and define the interface that users interact with.

      • Views can specify which resources the user should be able to access, hiding complex resources and creating a clean, simplified interface for the user.

  • To interact with the Metrics API, peak-sdk provides a metrics module available in both the CLI and the SDK. It contains all the necessary functions for managing and querying metrics. You can also interact with the API directly—documentation for the API is available here.

  • In the following sections, we will be using the CLI, but you should be able to easily migrate these commands to the SDK if needed.
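
  • To make the terms above concrete, here is a minimal sketch in plain Python (not the Metrics API or CubeJS itself). It treats a measure as an aggregation over a column and a dimension as the attribute used to group the rows. The rows and the aggregate helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for the table behind a cube.
rows = [
    {"status": "shipped", "quantity": 3},
    {"status": "shipped", "quantity": 5},
    {"status": "pending", "quantity": 2},
]


def aggregate(rows, measure_column, dimension=None):
    """Sum `measure_column`, optionally grouped by `dimension`."""
    totals = defaultdict(int)
    for row in rows:
        # Without a dimension, every row falls into a single "all" bucket.
        key = row[dimension] if dimension else "all"
        totals[key] += row[measure_column]
    return dict(totals)


print(aggregate(rows, "quantity"))            # {'all': 10}
print(aggregate(rows, "quantity", "status"))  # {'shipped': 8, 'pending': 2}
```

  • The first call corresponds to querying a measure on its own; the second corresponds to querying the same measure grouped by a dimension.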

Creating the Metrics

  • Before the metrics can be published, they need to be created. Refer to the CubeJS documentation for detailed instructions on how to create metrics.

  • Once the metrics are defined, you should have a set of YAML files, each containing one or more cubes and views. Let’s consider the following three files as an example:

    # metrics/sales.yml
    
    cubes:
        - name: sales
          sql_table: STAGE.QP__SALE
          data_source: default
          public: false
    
          dimensions:
              - name: sale_id
                sql: SALE_ID
                type: string
                primaryKey: true
    
              - name: customer_id
                sql: CUSTOMER_ID
                type: string
    
              - name: product_id
                sql: PRODUCT_ID
                type: string
    
          joins:
              - name: products
                relationship: many_to_one
                sql: "{CUBE}.product_id = {products.product_id}"
    
          measures:
              - name: total_sales_quantity
                sql: QUANTITY
                type: sum
                title: "Total Sales Quantity"
    
              - name: total_sales_revenue
                sql: QUANTITY * SELLING_PRICE
                type: sum
                title: "Total Sales Revenue"
    
              - name: average_selling_price
                sql: SELLING_PRICE
                type: avg
                title: "Average Selling Price"
    
              - name: average_quantity_sold
                sql: QUANTITY
                type: avg
                title: "Average Quantity Sold"
    
    # metrics/products.yml
    
    cubes:
        - name: products
          sql_table: STAGE.QP__PRODUCT
          data_source: default
          public: false
    
          dimensions:
              - name: product_id
                sql: PRODUCT_ID
                type: string
                primaryKey: true
    
              - name: product_name
                sql: PRODUCT_NAME
                type: string
    
              - name: category
                sql: PRODUCT_CATEGORY
                type: string
    
              - name: sub_category
                sql: PRODUCT_SUBCATEGORY
                type: string
    
          measures:
              - name: product_count
                sql: PRODUCT_ID
                type: count
                title: "Total Products"
    
    # metrics/views.yml
    
    views:
        - name: pricing
          cubes:
              - join_path: products
                includes:
                    - product_id
                    - product_name
                    - category
                    - sub_category
                    - product_count
              - join_path: sales
                includes:
                    - sale_id
                    - total_sales_quantity
                    - total_sales_revenue
                    - average_selling_price
    
  • In this example, we are defining the metrics for an imaginary pricing application. We’ve created two cubes: sales and products.

  • Rather than exposing them as two separate cubes, we are exposing them via a view called pricing. This ensures that end users interact with a single pricing interface, which contains all the resources related to the pricing application.
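
  • The following sketch shows, in plain Python, how a view narrows what users see. It is a simplified model, not how CubeJS resolves views internally: the dictionaries are hand-written copies of the member names from the example files, and exposed_members and hidden_members are hypothetical helpers.

```python
# Member names defined on each cube in the example files.
cubes = {
    "products": ["product_id", "product_name", "category",
                 "sub_category", "product_count"],
    "sales": ["sale_id", "customer_id", "product_id",
              "total_sales_quantity", "total_sales_revenue",
              "average_selling_price", "average_quantity_sold"],
}

# Members the pricing view includes from each cube.
view = {
    "name": "pricing",
    "includes": {
        "products": ["product_id", "product_name", "category",
                     "sub_category", "product_count"],
        "sales": ["sale_id", "total_sales_quantity",
                  "total_sales_revenue", "average_selling_price"],
    },
}


def exposed_members(view, cubes):
    """Return the names users query, prefixed with the view name."""
    names = []
    for cube_name, members in view["includes"].items():
        for member in members:
            if member not in cubes[cube_name]:
                raise ValueError(f"{cube_name} has no member {member}")
            names.append(f"{view['name']}.{member}")
    return names


def hidden_members(view, cubes):
    """Return cube members that the view does not expose."""
    hidden = []
    for cube_name, members in cubes.items():
        included = set(view["includes"].get(cube_name, []))
        hidden.extend(f"{cube_name}.{m}" for m in members if m not in included)
    return hidden


print(exposed_members(view, cubes)[:2])  # ['pricing.product_id', 'pricing.product_name']
print(hidden_members(view, cubes))
# ['sales.customer_id', 'sales.product_id', 'sales.average_quantity_sold']
```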

Publishing the Metrics

  • Now that we have the YAML files ready, let’s see how we can publish them to a tenant.

  • The peak-sdk provides a publish function in the metrics module, allowing us to publish the metrics. Publishing metrics can be done in two ways:

    • Publish metrics directly using the created YAML files.

    • Create a collection and use it to publish the metrics.

Publish Metrics Directly

  • We can use the created YAML files to publish the metrics directly.

  • Doing so creates a collection with private scope and publishes the metrics to the specified namespace in a single step:

    peak metrics publish --artifact-path metrics --namespace sdk-example
    

    Response:

    {
        "artifactId": "3af5b853-4337-4b41-82bc-1b8b5a8440f6",
        "collectionId": "f371e152-4fe3-48d2-aa32-a3afaa4b8455",
        "publicationId": "960d04d7-0a62-4d90-89a6-d76127537b1d"
    }
    
  • The above command uses the metrics folder we created in the previous step to publish the metrics for the tenant. We are using a namespace called sdk-example to ensure that these metrics aren’t added to the default namespace but are kept separate.

  • The collection created by this command is always scoped as private and is therefore available only to the tenant that it is created on.
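
  • Because the publicationId is needed later to delete the metrics, it can be worth recording it when publishing. The sketch below parses the response shown above with the Python standard library. The parse_publish_response helper and the publications dictionary are hypothetical and are not part of peak-sdk.

```python
import json

# Response copied from the publish command above.
raw_response = """
{
    "artifactId": "3af5b853-4337-4b41-82bc-1b8b5a8440f6",
    "collectionId": "f371e152-4fe3-48d2-aa32-a3afaa4b8455",
    "publicationId": "960d04d7-0a62-4d90-89a6-d76127537b1d"
}
"""

EXPECTED_KEYS = ("artifactId", "collectionId", "publicationId")


def parse_publish_response(raw):
    """Parse the JSON response and check that the expected IDs are present."""
    payload = json.loads(raw)
    missing = [key for key in EXPECTED_KEYS if key not in payload]
    if missing:
        raise KeyError(f"publish response is missing: {', '.join(missing)}")
    return payload


ids = parse_publish_response(raw_response)

# Keep publication IDs per namespace so they can be deleted later.
publications = {"sdk-example": [ids["publicationId"]]}
print(publications)
```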

Using Collections

  • Collections are useful for creating a reusable group of unpublished metrics that can be published to a tenant as needed.

  • The best part about collections is that they can be made public, in which case they are available across all tenants. Once created, you can use the collection to publish the metrics to any tenant.

  • To create a collection, use the create-collection command:

    peak metrics create-collection \
        --name products \
        --scope public \
        --description "Metrics around the pricing app" \
        --artifact-path metrics
    

    Response:

    {
        "artifactId": "80669d62-2acd-410f-98e8-28fe4a622ced",
        "collectionId": "0bad55da-86d1-4e5a-b409-6190522e7b7f"
    }
    
  • The create-collection command returns a collectionId, which can then be used to publish the metrics:

    peak metrics publish --collection-id 0bad55da-86d1-4e5a-b409-6190522e7b7f --namespace sdk-example
    

    Response:

    {
        "artifactId": "80669d62-2acd-410f-98e8-28fe4a622ced",
        "collectionId": "0bad55da-86d1-4e5a-b409-6190522e7b7f",
        "publicationId": "af4506e4-741c-4bc5-b1ca-e4392c533295"
    }
    

One thing to note: you can run the publish command on a namespace multiple times and publish any number of metrics to that namespace. However, cube and view names must be unique within a namespace; otherwise, a conflict error will occur.
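
A simple pre-publish check can catch such conflicts early. The sketch below is plain Python and assumes that cube and view names share a single pool within a namespace. The find_conflicts helper and the name lists are hypothetical; in practice the published names would come from the list command.

```python
def find_conflicts(published_names, new_names):
    """Return names that already exist in the namespace, sorted."""
    return sorted(set(published_names) & set(new_names))


# Names already published to the namespace (from the example files).
published = ["sales", "products", "pricing"]

# Hypothetical names from a second set of files we are about to publish.
incoming = ["inventory", "pricing"]

print(find_conflicts(published, incoming))  # ['pricing']
```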

Querying the Metrics

  • Now that we have published a set of metrics, it’s time to query the values of these metrics. The Metrics API not only allows us to retrieve metric values but also enables us to apply filters and group the data by specific columns as needed.

Listing Resources

  • Before we get into querying, let’s first view a list of all the metrics that are available for a tenant. This can be done easily by running the list command:

    peak metrics list --namespace sdk-example
    

    The response will look something like this:

    {
        "data": [
            {
                "name": "pricing",
                "type": "view",
                "public": true,
                "measures": [
                    {
                        "name": "pricing.product_count",
                        "type": "number",
                        "aggType": "count",
                        "public": true
                    },
                    {
                        "name": "pricing.total_sales_quantity",
                        "type": "number",
                        "aggType": "sum",
                        "public": true
                    },
                    {
                        "name": "pricing.total_sales_revenue",
                        "type": "number",
                        "aggType": "sum",
                        "public": true
                    },
                    {
                        "name": "pricing.average_selling_price",
                        "type": "number",
                        "aggType": "avg",
                        "public": true
                    }
                ],
                "dimensions": [
                    {
                        "name": "pricing.product_id",
                        "type": "string",
                        "public": true,
                        "primaryKey": false
                    },
                    {
                        "name": "pricing.product_name",
                        "type": "string",
                        "public": true,
                        "primaryKey": false
                    },
                    {
                        "name": "pricing.category",
                        "type": "string",
                        "public": true,
                        "primaryKey": false
                    },
                    {
                        "name": "pricing.sub_category",
                        "type": "string",
                        "public": true,
                        "primaryKey": false
                    },
                    {
                        "name": "pricing.sale_id",
                        "type": "string",
                        "public": true,
                        "primaryKey": false
                    }
                ],
                "segments": []
            }
        ],
        "pageCount": 1,
        "pageNumber": 1,
        "pageSize": 25,
        "totalCount": 1
    }
    
  • The list function can also return only specific resource types, such as measures or dimensions. Simply pass the --type parameter to filter the results:

    peak metrics list --namespace sdk-example --type measure
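
  • If the full list response has been saved, the member names can also be pulled out of it client-side. The sketch below uses only the Python standard library and a trimmed copy of the response shown earlier. The member_names helper is hypothetical, and the shape of the --type response itself is not assumed here.

```python
import json

# Trimmed copy of the list response shown earlier.
list_response = json.loads("""
{
    "data": [
        {
            "name": "pricing",
            "type": "view",
            "measures": [
                {"name": "pricing.product_count", "aggType": "count"},
                {"name": "pricing.total_sales_revenue", "aggType": "sum"}
            ],
            "dimensions": [
                {"name": "pricing.product_name", "type": "string"},
                {"name": "pricing.category", "type": "string"}
            ]
        }
    ]
}
""")


def member_names(response, kind):
    """Collect member names of one kind ("measures" or "dimensions")."""
    return [
        member["name"]
        for resource in response["data"]
        for member in resource.get(kind, [])
    ]


print(member_names(list_response, "measures"))
# ['pricing.product_count', 'pricing.total_sales_revenue']
```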
    

Querying Metrics

  • Once we have gone through the list of available resources and decided which metric we want to query, we can run the query command to get the required value.

  • Continuing the example above, let’s say we want to query the total sales revenue. We can run the following command:

    peak metrics query --measures "pricing.total_sales_revenue" --namespace sdk-example
    

    Response:

    [{ "pricing.total_sales_revenue": 47434594.4457526 }]
    
  • Now, let’s say we want to get the total revenue by product. To do so, we just need to pass the required field in the --dimensions parameter:

    peak metrics query --measures "pricing.total_sales_revenue" --dimensions "pricing.product_name" --namespace sdk-example
    

    Response:

    [
        {
            "pricing.product_name": "P49",
            "pricing.total_sales_revenue": 1327089.02611895
        },
        {
            "pricing.product_name": "P41",
            "pricing.total_sales_revenue": 1154901.18124684
        }
    ]
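
  • Since the response is a plain list of rows keyed by member name, it is straightforward to post-process. As a small sketch, the snippet below ranks the two rows from the response above by revenue. The top_products helper is hypothetical and not part of peak-sdk.

```python
# Rows copied from the example response above.
rows = [
    {"pricing.product_name": "P49", "pricing.total_sales_revenue": 1327089.02611895},
    {"pricing.product_name": "P41", "pricing.total_sales_revenue": 1154901.18124684},
]


def top_products(rows, n=1, measure="pricing.total_sales_revenue"):
    """Return the names of the `n` products with the highest `measure`."""
    ranked = sorted(rows, key=lambda row: row[measure], reverse=True)
    return [row["pricing.product_name"] for row in ranked[:n]]


print(top_products(rows))  # ['P49']
```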
    
  • Another requirement arrives, and now we want to get the total sales quantity for the P41 product. Again, it is super easy to do so. This time, let’s define the query in a YAML file and pass that file to the command rather than passing every option on the command line:

    # query.yml
    
    namespace: sdk-example
    measures:
        - pricing.total_sales_quantity
    filters:
        - dimension: pricing.product_name
          operator: equals
          values:
              - "P41"
    
    peak metrics query query.yml
    

    Response:

    [{ "pricing.total_sales_quantity": 6820 }]
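
  • Before running a query, it can help to check that every member it references is actually available. The sketch below does this in plain Python. The query dictionary mirrors query.yml, the available sets are a hand-written subset of the list output, and unknown_members is a hypothetical helper that only checks the keys used in this guide.

```python
# Subset of the member names returned by the list command.
available = {
    "measures": {"pricing.total_sales_quantity", "pricing.total_sales_revenue"},
    "dimensions": {"pricing.product_name", "pricing.category"},
}

# Same content as query.yml, written as a Python dictionary.
query = {
    "namespace": "sdk-example",
    "measures": ["pricing.total_sales_quantity"],
    "filters": [
        {"dimension": "pricing.product_name", "operator": "equals", "values": ["P41"]},
    ],
}


def unknown_members(query, available):
    """Return a list of problems for members that are not available."""
    problems = []
    for measure in query.get("measures", []):
        if measure not in available["measures"]:
            problems.append(f"unknown measure: {measure}")
    for dimension in query.get("dimensions", []):
        if dimension not in available["dimensions"]:
            problems.append(f"unknown dimension: {dimension}")
    for condition in query.get("filters", []):
        if condition["dimension"] not in available["dimensions"]:
            problems.append(f"unknown filter dimension: {condition['dimension']}")
    return problems


print(unknown_members(query, available))  # []
```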
    

Cleaning Things Up

  • Once some metrics are no longer needed, we can delete them. To delete the metrics, you will need a publicationId.

  • If you paid attention to the response from the publish function, you might have noticed that it returns a key called publicationId. A publicationId identifies a group of metrics that were published together.

  • But don’t worry if you didn’t note down the publicationId from the publish command response. It is also returned in the response of the list function.

  • To delete metrics, you need to pass in the publicationId, and this will delete all the metrics associated with that specific publication:

    peak metrics delete --publication-id 960d04d7-0a62-4d90-89a6-d76127537b1d
    
  • The above command will delete all metrics published with the given publicationId. If the namespace becomes empty after deleting the metrics, the namespace itself will also be deleted. If you publish new metrics to that namespace, it will be automatically re-created.

  • You may also want to delete a collection. To do so, run the following command:

    peak metrics delete-collection 0bad55da-86d1-4e5a-b409-6190522e7b7f
    
  • Currently, it’s not possible to delete a single metric from a publication—you must delete the entire publication. However, this may change in future SDK releases, so stay updated.
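
  • The namespace behaviour described above can be pictured with a small in-memory model. This is only an illustration of the rules stated in this section, not the actual implementation: the NamespaceRegistry class and the publication IDs pub-1 and pub-2 are hypothetical.

```python
class NamespaceRegistry:
    """Toy model: a namespace exists only while it holds a publication."""

    def __init__(self):
        self._publications = {}  # namespace -> set of publication IDs

    def publish(self, namespace, publication_id):
        # Publishing to a missing namespace (re-)creates it.
        self._publications.setdefault(namespace, set()).add(publication_id)

    def delete(self, publication_id):
        # Deleting the last publication also removes the namespace.
        for namespace, ids in list(self._publications.items()):
            if publication_id in ids:
                ids.remove(publication_id)
                if not ids:
                    del self._publications[namespace]
                return namespace
        raise KeyError(publication_id)

    def namespaces(self):
        return sorted(self._publications)


registry = NamespaceRegistry()
registry.publish("sdk-example", "pub-1")
registry.delete("pub-1")
print(registry.namespaces())  # []

registry.publish("sdk-example", "pub-2")
print(registry.namespaces())  # ['sdk-example']
```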

Best Practices

  • It is recommended to create a folder called metrics and store all your cube files in that folder. This folder can then be referenced when creating a collection or publishing metrics.

  • We suggest making all cubes private and creating views to expose resources to the user. In the example above, all cubes are private (public: false), and only a single view, pricing, is exposed:

    • This provides users with a much simpler interface for querying the metrics and keeps things cleaner. In the example above, users can access all the required measures and dimensions from the pricing view (pricing.<measure_name>), which is simpler than having to reference different cube names (<cube_name>.<measure_name>).

    • This approach also offers cleaner segregation between resources. For example, if metrics for two different applications are published (e.g., pricing and inventory apps), the metrics remain separate—the pricing metrics are accessible in a view called pricing, while the inventory metrics are in a view called inventory.

  • During development, it is recommended to create a separate namespace to test things out before publishing metrics to the default namespace.
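
  • These recommendations can be turned into a small check that runs before publishing. The sketch below is plain Python and assumes the YAML files have already been parsed and merged into one dictionary (YAML parsing is not shown). It also assumes that a cube without an explicit public key is treated as public. The lint_model helper is hypothetical.

```python
def lint_model(model):
    """Return warnings for cubes that are public and for missing views."""
    warnings = []
    for cube in model.get("cubes", []):
        # Assumption: a cube with no explicit "public" key counts as public.
        if cube.get("public", True):
            warnings.append(f"cube '{cube['name']}' is public; consider public: false")
    if not model.get("views"):
        warnings.append("no views defined; users have no view to query")
    return warnings


# Reduced version of the example model from this guide.
good_model = {
    "cubes": [
        {"name": "sales", "public": False},
        {"name": "products", "public": False},
    ],
    "views": [{"name": "pricing"}],
}

# Hypothetical model that ignores both recommendations.
risky_model = {"cubes": [{"name": "inventory"}]}

print(lint_model(good_model))  # []
for warning in lint_model(risky_model):
    print(warning)
```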