ELK Elasticsearch Verification

Updated 2 days ago by Michael Cretzman

The following sections describe how Harness integrates ELK Elasticsearch into Harness Continuous Verification to monitor your live, production services and verify your deployments:

ELK and Harness

Harness Continuous Verification integrates with ELK to verify your deployments and live production applications using the following Harness features:

  • 24/7 Service Guard - Monitors your live, production applications.
  • Deployment Verification - Monitors your application deployments, and performs automatic rollback according to your criteria.

This document describes how to set up these Harness Continuous Verification features and use Harness unsupervised machine-learning functionality to monitor your deployments and production applications.

Setup Preview

You set up ELK and Harness in the following way:

  1. ELK - Monitor your application using ELK. In this article, we assume that you are using ELK to monitor your application already.
  2. Verification Provider Setup - In Harness, you connect Harness to your ELK account, adding ELK as a Harness Verification Provider.
  3. Harness Application - Create a Harness Application with a Service and an Environment. We do not cover Application set up in this article. See Application Checklist.
  4. 24/7 Service Guard Setup - In the Environment, set up 24/7 Service Guard to monitor your live, production application.
  5. Verify Deployments:
    1. Add a Workflow to your Harness Application and deploy your microservice or application to the service infrastructure in your Environment.
    2. After you have run a successful deployment, you then add verification steps to the Workflow using your Verification Provider.
    3. Harness uses unsupervised machine-learning and Elasticsearch analytics to analyze your future deployments, discovering events that might be causing your deployments to fail. Then you can use this information to set rollback criteria and improve your deployments.

Verification Provider Setup

The first step in using Elasticsearch with Harness is to set up an Elasticsearch Verification Provider in Harness.

A Harness Verification Provider is a connection to monitoring tools such as Elasticsearch. Once Harness is connected, you can use Harness 24/7 Service Guard and Deployment Verification with your Elasticsearch data and analysis.

To add Elasticsearch as a Harness Verification Provider, do the following:

  1. In Harness, click Setup.
  2. Click Connectors, and then click Verification Providers.
  3. Click Add Verification Provider, and select ELK. The Add ELK Verification Provider dialog for your provider appears.
  4. Complete the following fields of the Add ELK Verification Provider dialog.

  • Connector type - Select the server type for the connection. You can choose whether to access your ELK server directly using the Elasticsearch Server URL (recommended), or using the Kibana Server URL.
  • URL - Enter the URL of the server. The format is http(s)://server:port/. The default port is 9200.
  • Username and Password - Enter the credentials to authenticate with the server.
  • Token - Some systems provide Elasticsearch as a service and use access tokens. If you have token-based authentication, provide the authentication header that is passed when making the HTTP request. Header: APITokenKey. Example: x-api-key (varies by system). Value: APITokenValue. Example: kdsc3h3hd8wngdfujr23e23e2.
  • Display Name - Enter a display name for the provider. If you are going to use multiple providers of the same type, ensure you give each provider a different name.
  • Usage Scope - If you want to restrict the use of the provider to specific applications and environments: in Usage Scope, click the drop-down under Applications, and click the name of the application; then, in Environments, click the name of the environment.

  5. When you have set up the dialog, click TEST.
  6. Once the test is completed, click SUBMIT to add the Verification Provider.
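For token-based access, the Token field's header/value pair is sent as a plain HTTP header on every request Harness makes. As a sketch only, here is the equivalent request built in Python; the server URL, header name, and token value are the placeholder examples from the dialog description above, not real credentials:

```python
import urllib.request

# Placeholder values from the dialog example above -- substitute your own
# server URL, header name, and token.
ELK_URL = "https://elk.example.com:9200/_search"
API_TOKEN_KEY = "x-api-key"                     # header name; varies by system
API_TOKEN_VALUE = "kdsc3h3hd8wngdfujr23e23e2"   # example token value

# Build (but do not send) the request: the token is passed as an
# ordinary HTTP header, exactly as in the dialog's Header/Value fields.
req = urllib.request.Request(ELK_URL, headers={API_TOKEN_KEY: API_TOKEN_VALUE})

# urllib stores header names in capitalized form internally.
print(req.get_header("X-api-key"))
```

If your provider rejects the request, compare the header name it documents (x-api-key here is only an example) with what you entered in the Header field.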

If you receive an error, see Troubleshooting.
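A common source of test failures is a malformed URL in the dialog. The URL field expects http(s)://server:port/, with 9200 as the default Elasticsearch port. A small sketch for sanity-checking a URL before pasting it in (the hostnames are illustrative):

```python
from urllib.parse import urlparse

def elk_port(url: str) -> int:
    """Return the port an ELK URL targets, defaulting to 9200 (the
    standard Elasticsearch HTTP port) when none is specified."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got {url!r}")
    return parsed.port or 9200

print(elk_port("https://elk.example.com:9200/"))  # explicit port
print(elk_port("http://elk.example.com/"))        # falls back to 9200
```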

24/7 Service Guard Setup

Harness 24/7 Service Guard monitors your live applications, catching problems that surface minutes or hours following deployment. For more information, see 24/7 Service Guard.

You can add your Elasticsearch monitoring to Harness 24/7 Service Guard in your Harness Application Environment. For a setup overview, see Setup Preview.

This section assumes you have a Harness Application set up and containing a Service and Environment. For steps on setting up a Harness Application, see Application Checklist.

To set up 24/7 Service Guard for Elasticsearch, do the following:

  1. Ensure that you have added ELK Elasticsearch as a Harness Verification Provider, as described in Verification Provider Setup.
  2. In your Harness Application, ensure that you have added a Service, as described in Services. For 24/7 Service Guard, you do not need to add an Artifact Source to the Service, or configure its settings. You simply need to create a Service and name it. It will represent your application for 24/7 Service Guard.
  3. In your Harness Application, click Environments.
  4. In Environments, ensure that you have added an Environment for the Service you added. For steps on adding an Environment, see Environments.
  5. Click the Environment for your running microservice. Typically, the Environment Type is Production.
  6. In the Environment page, locate 24/7 Service Guard.
  7. In 24/7 Service Guard, click Add Service Verification, and then click ELK. The ELK dialog appears.
  8. Fill out the dialog. The dialog has the following fields.
For 24/7 Service Guard, the queries you define to collect logs are specific to the application or service you want monitored. Verification is performed at the application/service level, unlike Workflows, where verification is performed at the host/node/pod level.

  • Display Name - The name that will identify this service on the Continuous Verification dashboard. Use a name that indicates the environment and monitoring tool, such as ELK.
  • Service - The Harness Service to monitor with 24/7 Service Guard.
  • ELK Server - Select the ELK Verification Provider to use.
  • Search Keywords - Enter search keywords for your query, such as error or exception.
  • Query Type - Select TERM to find documents that contain the exact term specified in the inverted index. MATCH queries accept text, numerics, and dates, analyze them, and construct a query. If you want the query analyzed, use MATCH.
  • Index - Enter the index to search. This field is automatically populated from the index templates, if available.
  • Message Field - Enter the field by which the messages are usually indexed. Typically, a log field. To find the field in Kibana and enter it in Harness, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate a log for the host/container/pod ELK is monitoring. For example, in the following Kubernetes results in Kibana, the messages are indexed under the log field.
    4. In Harness, in the ELK dialog, next to Message Field, click Guide From Example. The Message Field popover appears.
    5. In the JSON response, click the name of the label that maps to the log in your Kibana results. Using our Kubernetes example, you would click the log label. The label is added to the Message Field.
  • Timestamp Field - Enter the timestamp field in the Elasticsearch record, such as @timestamp.
  • Timestamp Format - Enter the format for the timestamp field in the Elasticsearch record. Use Kibana to determine the format: in Kibana, use the Filter feature in Discover to construct your timestamp range. Format examples:
    Timestamp: 2018-08-24T21:40:20.123Z. Format: yyyy-MM-dd'T'HH:mm:ss.SSSX
    Timestamp: 2018-08-30T21:57:23+00:00. Format: yyyy-MM-dd'T'HH:mm:ssXXX
    For more information, see Date Math from Elastic.
  • Algorithm Sensitivity - Select the algorithm sensitivity.
  • Enable 24/7 Service Guard - Select the checkbox to enable 24/7 Service Guard.
  • Baseline - Select the baseline time unit for monitoring. For example, if you select For 4 hours, Harness collects the logs for the last 4 hours as the baseline for comparisons with future logs. If you select Custom Range, you can enter a Start Time and End Time.
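The TERM/MATCH choice in Query Type maps directly onto Elasticsearch query DSL. As a sketch, here are the two query bodies for the keyword error; the field name log is an assumption carried over from the Kubernetes example:

```python
import json

KEYWORD = "error"
FIELD = "log"   # assumed message field, from the Kubernetes example

# TERM: looks up the exact term in the inverted index; the keyword is
# not analyzed, so case and tokenization must already match the index.
term_query = {"query": {"term": {FIELD: KEYWORD}}}

# MATCH: the keyword is analyzed (tokenized, lowercased) before the
# lookup, so it behaves like a full-text search on an analyzed field.
match_query = {"query": {"match": {FIELD: KEYWORD}}}

print(json.dumps(term_query))
print(json.dumps(match_query))
```

In practice this means TERM can silently miss documents on an analyzed text field (the indexed tokens are lowercased), which is why the dialog recommends MATCH when you want the query analyzed.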

When you are finished, the dialog will look something like this:

  9. Click TEST. Harness verifies the settings you entered.
  10. Click SUBMIT. The ELK 24/7 Service Guard is configured.

To see the running 24/7 Service Guard analysis, click Continuous Verification.

The 24/7 Service Guard dashboard displays the production verification results.

For information on using the dashboard, see Using 24/7 Service Guard.
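The Timestamp Format field takes Java-style date patterns such as yyyy-MM-dd'T'HH:mm:ss.SSSX. If you want to double-check that a sample timestamp from your logs actually matches the pattern you plan to enter, you can parse it locally. This sketch uses Python's rough equivalent of that first example pattern; the sample value is the one from the format examples above:

```python
from datetime import datetime

sample = "2018-08-24T21:40:20.123Z"   # example timestamp from above

# Rough Python equivalent of the Java-style pattern
# yyyy-MM-dd'T'HH:mm:ss.SSSX: %f parses the fractional seconds and
# %z accepts the trailing "Z" (Python 3.7+).
parsed = datetime.strptime(sample, "%Y-%m-%dT%H:%M:%S.%f%z")

print(parsed.isoformat())
```

If strptime raises a ValueError on your real log timestamps, the pattern you were about to enter in the dialog does not match them either.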

Verify Deployments

Harness can analyze your Elasticsearch data to verify, roll back, and improve deployments. To apply this analysis to your deployments, you set up Elasticsearch as a verification step in a Harness Workflow.

This section covers how to set up Elasticsearch in a Harness Workflow, and provides a summary of Harness verification results.

In order to obtain the names of the host(s), pod(s), or container(s) where your service is deployed, the verification provider should be added to your workflow after you have run at least one successful deployment.

Deployment Verification Setup

To add an ELK verification step to your Workflow, do the following:

  1. Ensure that you have added ELK Elasticsearch as a Verification Provider, as described in Verification Provider Setup.
  2. In your Workflow, under Verify Service, click Add Verification, and then click ELK. The ELK dialog appears.

To configure the ELK dialog fields, do the following:

  1. In Elastic Search Server, select the server you added when you set up the ELK verification provider, as described above.
  2. In Search Keywords, enter search keywords for your query, such as error or exception. The keywords are searched against the logs identified in the Message field of the dialog (see below).

    For an advanced query, enter an Elasticsearch JSON query. You can use JSON to create complex queries beyond keywords. The following example looks for the substring error in the field log:

    {"regexp":{"log": {"value":"error"}}}
  3. In Query Type, select TERM to find documents that contain the exact term specified in the inverted index. MATCH queries accept text, numerics, and dates, analyze them, and construct a query. If you want the query analyzed, then use MATCH.
  4. In Index, enter the index to search. This field is automatically populated from the index templates, if available.
    If there are no index templates, or if you do not have administrator privileges with ELK, enter the index manually.
    1. To locate indices, in Kibana, click Management.
    2. Click Index Patterns. The Index Patterns page appears.
    3. Copy the name of one of the Index patterns.
    4. In Harness, in the ELK dialog, paste the name of the Index pattern into Indices.
  5. In Host Name Field, enter the field name used in the ELK logs that refers to the host/pod/container ELK is monitoring.

    To find the hostname in Kibana and enter it in Harness, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate the host name of the host/container/pod that ELK is monitoring. For example, when using Kubernetes, the pod name field kubernetes.pod_name is used.
    4. In Harness, in the ELK dialog, next to Host Name Field, click Guide From Example. The Host Name Field popover appears.
    5. In the JSON response, click on the name of the label that maps to the host/container/pod in your log search results. Using our Kubernetes example, under pod, you would click the first name label.

      The Host Name Field is filled with the JSON label for the hostname.
  6. In Message Field, enter the field by which the messages are usually indexed. Typically, a log field.

    To find the field in Kibana and enter it in Harness, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate a log for the host/container/pod ELK is monitoring. For example, in the following Kubernetes results in Kibana, the messages are indexed under the log field.
    4. In Harness, in the ELK dialog, next to Message Field, click Guide From Example. The Message Field popover appears.
    5. In the JSON response, click on the name of the label that maps to the log in your Kibana results. Using our Kubernetes example, you would click the log label.

      The label is added to the Message Field.
  7. In Expression for Host/Container name, add an expression that evaluates to the host name value for the field you entered in the Host Name Field above. The default expression is ${host.hostName}.
    In order to obtain the names of the host where your service is deployed, the verification provider should be added to your workflow after you have run at least one successful deployment.
    To ensure that you pick the right field when using Guide From Example, you can use a host name from the ELK log messages as a guide.

    To use Guide From Example for a host name expression, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate the name of the host/container/pod ELK is monitoring. For example, when using Kubernetes, the pod name field kubernetes.pod_name displays the value you need.

      The expression that you provide in Expression for Host/Container Name should evaluate to the name here, although the suffixes can differ.
    4. In Harness, in your workflow ELK dialog, click Guide From Example. The Expression for Host Name popover appears.

      The dialog shows the service, environment, and service infrastructure used for this workflow.
    5. In Host, click the name of the host to use when testing verification. The hostname will be similar to the hostname you used for the Host Name Field, as described earlier in this procedure. The suffixes can be different.
    6. Click SUBMIT. The JSON for the host appears. Look for the host section.

      You want to use a name label in the host section. Do not use a host name label outside of that section.
    7. To identify which label to use to build the expression, compare the host/pod/container name in the JSON with the hostname you use when configuring Host Name Field.
    8. In the Expression for Host Name popover, click the name label to select the expression. Click back in the main dialog to close the Guide From Example. The expression is added to the Expression for Host/Container name field.

      For example, if you clicked the name label, the expression ${host.name} is added to the Expression for Host/Container name field.
  8. In Timestamp format, enter the format for the timestamp field in the Elasticsearch record. Use Kibana to determine the format.

    In Kibana, use the Filter feature in Discover to construct your timestamp range:
    Format Examples:

    Timestamp: 2018-08-24T21:40:20.123Z. Format: yyyy-MM-dd'T'HH:mm:ss.SSSX

    Timestamp: 2018-08-30T21:57:23+00:00. Format: yyyy-MM-dd'T'HH:mm:ssXXX

    For more information, see Date Math from Elastic.
  9. At the bottom of the ELK dialog, click TEST.

    A new Expression for Host Name popover appears.

    In Host, select the same host you selected last time, and then click RUN. Verification for the host is found.

If you receive an error, it is likely because you selected the wrong label in Expression for Host/Container name or Host Name Field.

  10. Next, click Analysis Details. The Analysis Details appear.

The following settings are common to all verification provider dialogs in workflows.

  • Analysis Period - Set the duration for the verification step. If a verification step exceeds the value, the workflow Failure Strategy is triggered. For example, if the Failure Strategy is Ignore, then the verification state is marked Failed but the workflow execution continues.
  • Baseline for Risk Analysis - Select one of the following:
    • Previous Analysis - Select Previous Analysis to have this verification use the previous analysis for a baseline comparison.
    • Canary Analysis - If your workflow is a Canary workflow type, you can select Canary Analysis to have this verification compare old versions of nodes to new versions of nodes in real time.
    • Predictive Analysis - The Predictive Analysis option instructs Harness to take previous logs over the length of time specified in Baseline for Predictive Analysis, set those logs as a baseline analysis, and then compare that baseline with future logs for the length of time in Analysis Time duration. Harness then analyzes these past and future logs to see if there are anomalies or unknown and unexpected frequencies that were potentially triggered during deployment.
    For Canary Analysis and Previous Analysis, analysis happens at the host/node/pod level. For Predictive Analysis, data collection happens at the host/node/pod level, but analysis happens at the application or service level. Consequently, for data collection, provide a query that targets the logs for the host using fields such as SOURCE_HOST in Field name for Host/Container.
  • Algorithm Sensitivity - Select the sensitivity that will result in the most useful results for your analysis.
  • Execute with previous steps - Check this checkbox to run this verification step in parallel with the previous steps in Verify Service.
  • Include instances from previous phases - If you are using this verification step in a multi-phase deployment, select this checkbox to include instances used in previous phases when collecting data. Do not apply this setting to the first phase in a multi-phase deployment.

When you are finished, click SUBMIT. The ELK verification step is added to your workflow.
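Several of the steps above (Host Name Field, Message Field, Expression for Host/Container name) come down to picking the right JSON label out of a log document. As an illustration only, here is how the dotted field name kubernetes.pod_name from the Kubernetes example resolves against a log document; the document shape and pod name below are made up for the sketch, not an exact Elasticsearch response:

```python
# Illustrative log document; the field names mirror the Kubernetes
# example above, but the shape and values are invented for this sketch.
doc = {
    "log": "ERROR: connection refused",
    "kubernetes": {"pod_name": "harness-example-7d4f9b-bdpq7",
                   "namespace_name": "default"},
    "@timestamp": "2018-08-24T21:40:20.123Z",
}

def field(source: dict, dotted_path: str):
    """Resolve a dotted field name such as 'kubernetes.pod_name'."""
    value = source
    for part in dotted_path.split("."):
        value = value[part]
    return value

# The Host Name Field (kubernetes.pod_name in the Kubernetes example)
# must resolve to the same pod name that the Expression for
# Host/Container name (e.g. ${host.hostName}) evaluates to,
# though suffixes can differ.
print(field(doc, "kubernetes.pod_name"))
```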

Verification Results

Once you have deployed your workflow (or pipeline) using the ELK verification step, you can automatically verify cloud application and infrastructure performance across your deployment.

Workflow Verification

To see the results of Harness machine-learning evaluation of your ELK verification, in your workflow or pipeline deployment you can expand the Verify Service step and then click the ELK step.

Continuous Verification

You can also see the evaluation in the Continuous Verification dashboard. The workflow verification view is for the DevOps user who developed the workflow. The Continuous Verification dashboard is where all future deployments are displayed for developers and others interested in deployment analysis.

To learn about the verification analysis features, see the following sections.

Deployments

Deployment info
See the verification analysis for each deployment, with information on its service, environment, pipeline, and workflows.

Verification phases and providers
See the verification phases for each verification provider. Click each provider for logs and analysis.

Verification timeline
See when each deployment and verification was performed.

Transaction Analysis

Execution details
See the details of verification execution. Total is the total time the verification step took, and Analysis duration is how long the analysis took.

Risk level analysis
Get an overall risk level and view the cluster chart to see events.

Transaction-level summary
See a summary of each transaction with the query string, error values comparison, and a risk analysis summary.

Execution Analysis

Event type
Filter cluster chart events by Unknown Event, Unexpected Frequency, Anticipated Event, Baseline Event, and Ignore Event.

Cluster chart
View the chart to see how the selected events contrast. Click each event to see its log details.

Event Management

Event-level analysis
See the threat level for each event captured.

Tune event capture
Remove events from analysis at the service, workflow, execution, or overall level.

Event distribution
Click the chart icon to see an event distribution including the measured data, baseline data, and event frequency.

Troubleshooting

The following are resolutions to common configuration problems.

Workflow Step Test Error

When you click TEST in the ELK workflow dialog Expression for Host Name popover, you should get provider information:

The following error message can occur when testing the ELK verification step in your workflow:

ELK_CONFIGURATION_ERROR: Error while saving ELK configuration. No node with name ${hostName} found reporting to ELK
Cause

The expression in the Expression for Host/Container name field is incorrect. Typically, this occurs when the wrong hostName label is selected to create the expression in the Expression for Host/Container name field.

Solution

Follow the steps in Verify Deployments again to select the correct expression. Ensure that the name label selected is under the host section of the JSON.

SocketTimeoutException

When you add an ELK verification provider and click SUBMIT, you might see the following error.

Cause

The Harness delegate does not have a valid connection to the ELK server.

Solution

On the same server or instance where the Harness delegate is running, run one of the following cURL commands to verify whether the delegate can connect to the ELK server.

If you do not have a username and password for the ELK server:

curl -i -X POST "url/*/_search?size=1" -H 'Content-Type: application/json' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If you have a username and password, use this command:

curl -i -X POST "url/*/_search?size=1" -H 'Content-Type: application/json' -H 'Authorization: Basic <Base64 encoded username:password>' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If you have token-based authentication, use this command:

curl -i -X POST "url/*/_search?size=1" -H 'Content-Type: application/json' -H 'tokenKey: tokenValue' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If the cURL command cannot connect, it will fail.

If the cURL command can connect, it will return an HTTP 200, along with the JSON response.

If the cURL command is successful, but you still see the SocketTimeoutException error in the ELK dialog, contact Harness Support (support@harness.io).

It is possible that the response from the ELK server is just taking very long.
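The Authorization header in the username/password cURL command is standard HTTP Basic authentication: the literal word Basic followed by the Base64 encoding of username:password. A sketch of producing that value; the credentials below are placeholders, not defaults you should keep:

```python
import base64

# Placeholder credentials -- substitute your ELK username and password.
username, password = "elastic", "changeme"

# HTTP Basic auth: base64-encode "username:password" and prefix "Basic ".
token = base64.b64encode(f"{username}:{password}".encode()).decode()
auth_header = f"Authorization: Basic {token}"

print(auth_header)
```

You can paste the printed header directly into the -H flag of the cURL command above.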

Next Steps

