3 - Verify Deployments with Prometheus

Updated 2 months ago by Michael Cretzman

The following procedure describes how to add Prometheus as a verification step in a Harness workflow. For more information about workflows, see Add a Workflow.

Once you run a deployment and Prometheus performs verification, Harness machine-learning verification analysis assesses the risk level of the deployment.

Add the verification provider to your workflow only after you have run at least one successful deployment. Harness uses that deployment to obtain the names of the hosts, pods, or containers where your service is deployed.

Deployment Verification Setup

To verify your deployment with Prometheus, do the following:

  1. Ensure that you have added Prometheus as a verification provider.
  2. In your workflow, under Verify Service, click Add Verification, and then click Prometheus. The Prometheus dialog appears.

The Prometheus workflow dialog has the following fields.

Field

Description

Prometheus Server

Select the server you added when setting up the Prometheus verification provider.

Metric to Monitor

Every time series is uniquely identified by its metric name and a set of key-value pairs, also known as labels. For more information, see Data Model from Prometheus. A metric requires the following parameters:

  • Metric Name: The name of the metric defined in Prometheus.
  • Metric Type: The type of metric (Response Time, Error, Throughput, or Value).
  • Group Name: The service or request context to which the metric relates. For example, Login.
  • Query: The API query required to retrieve the metric value. This query must include a placeholder for hostname, $hostName.

For Query, you can simply copy your query from Prometheus and paste it into Harness, and then replace the actual hostname in the query with $hostName.

For example, suppose the query string in Prometheus is:

container_cpu_usage_seconds_total{pod_name="prometheus-deployment-7c878596ff-r8qrt",namespace="harness"}

When you paste that string into the Query field in Harness, you replace the pod_name value with $hostName:

container_cpu_usage_seconds_total{pod_name="$hostName",namespace="harness"}
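If you convert many queries, the substitution can be scripted. The following is a minimal Python sketch and is not part of Harness. The function name to_harness_query is hypothetical, and the sketch assumes the host-identifying label uses a plain equality matcher (=) with a double-quoted value.

```python
import re


def to_harness_query(query: str, label: str) -> str:
    """Replace the value of `label` in a Prometheus query with $hostName.

    Only the first `label="..."` equality matcher is rewritten.
    Raises ValueError if the label is not found in that form.
    """
    # Match label="value", capturing the text before and after the value.
    pattern = re.compile(r'(\b' + re.escape(label) + r'\s*=\s*")[^"]*(")')
    # "$" has no special meaning in a Python re replacement string,
    # so $hostName is inserted literally.
    new_query, count = pattern.subn(r'\g<1>$hostName\g<2>', query, count=1)
    if count == 0:
        raise ValueError(f'label {label!r} with a quoted value not found')
    return new_query


if __name__ == "__main__":
    original = (
        'container_cpu_usage_seconds_total'
        '{pod_name="prometheus-deployment-7c878596ff-r8qrt",namespace="harness"}'
    )
    print(to_harness_query(original, "pod_name"))
```

Other label matchers, such as namespace="harness" in this example, are left untouched.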

Expression for Host/Container Name

The expression entered here should resolve to a host/container name in your deployment environment. By default, the expression is ${host.hostName}. If you begin typing the expression into the field, the field provides expression assistance.
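To illustrate what the $hostName placeholder stands for, the sketch below fills it with concrete names and builds the matching request URL for the Prometheus HTTP API instant-query endpoint (/api/v1/query). It does not show how Harness queries Prometheus internally. The server URL, pod names, and function names are assumptions made for illustration.

```python
from urllib.parse import urlencode

PLACEHOLDER = "$hostName"


def render_query(template: str, host_name: str) -> str:
    """Substitute one resolved host/container name into a query template."""
    if PLACEHOLDER not in template:
        raise ValueError(f"query must contain the {PLACEHOLDER} placeholder")
    return template.replace(PLACEHOLDER, host_name)


def instant_query_url(server_url: str, query: str) -> str:
    """Build a URL for the Prometheus instant-query endpoint."""
    return f"{server_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


if __name__ == "__main__":
    template = 'container_cpu_usage_seconds_total{pod_name="$hostName",namespace="harness"}'
    # Hypothetical names that the host expression might resolve to.
    for name in ["pod-a", "pod-b"]:
        query = render_query(template, name)
        print(instant_query_url("http://prometheus.example.com:9090", query))
```

Pasting a rendered query into the Prometheus expression browser is a quick way to confirm that the label you replaced really identifies a single host, pod, or container.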

The following settings are common to all verification provider dialogs in workflows.

Field

Description

Analysis Time duration

Set the duration for the verification step. If a verification step exceeds the value, the workflow Failure Strategy is triggered. For example, if the Failure Strategy is Ignore, then the verification state is marked Failed but the workflow execution continues.

See CV Strategies, Tuning, and Best Practices.

Baseline for Risk Analysis

Canary Analysis - Harness will compare the metrics received for the nodes deployed in each phase with metrics received for the rest of the nodes in the application. For example, if this phase deploys to 25% of your nodes, the metrics received from Prometheus during this deployment for these nodes will be compared with metrics received for the other 75% during the defined period of time.

Previous Analysis - Harness will compare the metrics received for the nodes deployed in each phase with metrics received for all the nodes during the previous deployment. For example, if this phase deploys V1.2 to node A, the metrics received from Prometheus during this deployment will be compared to the metrics for nodes A, B, and C during the previous deployment (V1.1). Previous Analysis is best used when you have predictable load, such as in a QA environment.

See CV Strategies, Tuning, and Best Practices.
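The difference between the two baselines comes down to which nodes supply the comparison data. The following Python sketch restates the two examples above as set operations. It is an illustration only, not the Harness implementation, and every name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparisonGroups:
    test_nodes: frozenset       # nodes deployed in the current phase
    baseline_nodes: frozenset   # nodes whose metrics serve as the baseline
    baseline_source: str        # "current deployment" or "previous deployment"


def canary_groups(all_nodes, phase_nodes) -> ComparisonGroups:
    """Canary Analysis: compare the phase's nodes with the rest of the nodes."""
    all_nodes, phase_nodes = frozenset(all_nodes), frozenset(phase_nodes)
    if not phase_nodes <= all_nodes:
        raise ValueError("phase nodes must be a subset of all nodes")
    return ComparisonGroups(phase_nodes, all_nodes - phase_nodes, "current deployment")


def previous_groups(phase_nodes, previous_nodes) -> ComparisonGroups:
    """Previous Analysis: compare the phase's nodes with all nodes
    from the previous deployment."""
    return ComparisonGroups(
        frozenset(phase_nodes), frozenset(previous_nodes), "previous deployment"
    )


if __name__ == "__main__":
    # One of four nodes (25%) is deployed in this phase.
    print(canary_groups({"A", "B", "C", "D"}, {"A"}))
    # V1.2 goes to node A; the baseline is nodes A, B, and C running V1.1.
    print(previous_groups({"A"}, {"A", "B", "C"}))
```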

Algorithm Sensitivity

See CV Strategies, Tuning, and Best Practices.

Execute with previous steps

Check this checkbox to run this verification step in parallel with the previous steps in Verify Service.

Failure Criteria

Specify the sensitivity of the failure criteria. When the criteria are met, the workflow Failure Strategy is triggered.

Include instances from previous phases

If you are using this verification step in a multi-phase deployment, select this checkbox to include instances used in previous phases when collecting data. Do not apply this setting to the first phase in a multi-phase deployment.

Wait interval before execution

Set how long the deployment process should wait before executing the verification step.

Verification Results

When Harness deploys a new application or service to the target environment defined in the workflow, it will immediately connect to the Prometheus Server and build a model of what it is observing.

Next, Harness compares this model with previous deployment models to identify anomalies or regressions. If necessary, Harness rolls back to the previous working version automatically. For more information, see Rollback Steps.

Here is an example of a typical, successful deployment verified using Jenkins and Prometheus.

Under Prometheus, you can see that all Prometheus metrics have been validated by the Harness machine-learning algorithms. Green indicates that no anomalies or regressions were identified and that the deployment is operating within its normal range.

To see an overview of the verification UI elements, see Continuous Verification Tools.
