ELK Elasticsearch Verification

Updated 2 months ago by Michael Cretzman

Harness supports the ELK Elasticsearch analytics engine as a deployment verification provider, applying machine learning to analyze your deployment logs and discover areas for improvement.

Verification Setup Overview

You set up Elasticsearch and Harness in the following way:

  1. Using Elasticsearch, you monitor your microservice or application.
  2. You connect Harness to your Elasticsearch account by adding ELK Elasticsearch as a Harness Verification Provider.
  3. After you have built and run a successful deployment of your microservice or application in Harness, you then add ELK verification steps to your Harness deployment workflow.
  4. Harness uses Elasticsearch to verify your future microservice/application deployments.
  5. Harness Continuous Verification uses unsupervised machine-learning to analyze your deployments and Elasticsearch analytics, discovering events that might be causing your deployments to fail. Then you can use this information to improve your deployments.

Exceptions with Elasticsearch via Kibana

Harness Analysis of Elasticsearch Verification

Intended Audience

  • Developers
  • DevOps

Before You Begin

Add ELK to Verification Providers

The following procedure connects Harness to your ELK Elasticsearch account.

To add Elasticsearch as a verification provider, do the following:

  1. Click Setup.
  2. Click Connectors.
  3. Click Verification Providers.
  4. Click Add Verification Provider, and select ELK. The Add ELK Verification Provider dialog for your provider appears.

The Add ELK Verification Provider dialog has the following fields.

Connector type: Select the server type for the connection. You can access your ELK server directly using the Elasticsearch Server URL (recommended), or through the Kibana Server URL.

URL: Enter the URL of the server, in the format http(s)://server:port/. The default port is 9200.

Username and Password: Enter the credentials used to authenticate with the server.

Token: Some systems provide Elasticsearch as a service and use access tokens. If you use token-based authentication, provide the authentication header that is passed when making the HTTP request. Header: APITokenKey, for example x-api-key (varies by system). Value: APITokenValue, for example kdsc3h3hd8wngdfujr23e23e2.

Display Name: Enter a display name for the provider. If you use multiple providers of the same type, give each provider a unique name.

Usage Scope: To restrict the use of the provider to specific applications and environments, click the drop-down under Applications and select the name of the application, then click the name of the environment under Environments.
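The token-based option above amounts to sending one custom header with each request. As a quick sanity check, you can express it as a cURL request. This sketch only prints the command rather than sending it; the URL, header key, and token value are the placeholder examples from the fields above, and /_cluster/health is a standard Elasticsearch endpoint:

```shell
# Sketch only: prints the request instead of sending it. The URL, header
# key, and token value below are placeholders; substitute your own.
ELK_URL="https://server:9200"
TOKEN_KEY="x-api-key"
TOKEN_VALUE="kdsc3h3hd8wngdfujr23e23e2"
echo "curl -i -H '$TOKEN_KEY: $TOKEN_VALUE' '$ELK_URL/_cluster/health'"
```

Running the printed command against your actual server should return an HTTP 200 response if the token is accepted.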

If you receive an error, see Troubleshooting.

Verify with ELK

The following procedure adds an ELK Elasticsearch verification step to a workflow.

To obtain the names of the host(s), pod(s), or container(s) where your service is deployed, the verification step should be added to your workflow after you have run at least one successful deployment.

To verify your deployment with ELK, do the following:

  1. Ensure that you have added ELK Elasticsearch as a verification provider, as described above.
  2. In your workflow, under Verify Service, click Add Verification, and then click ELK. The ELK dialog appears.

To configure the ELK dialog fields, do the following:

  1. In Elastic Search Server, select the server you added when you set up the ELK verification provider, as described above.
  2. In Search Keywords, enter search keywords for your query, such as error or exception.

    Click Validate Search Keywords to confirm syntax. The keywords are searched against the logs identified in the Message field of the dialog (see below).
  3. In Advanced Query, enter an Elasticsearch JSON query. You can use this field to create complex queries beyond keywords. The following example looks for the substring error in the field log:

    {"regexp":{"log": {"value":"error"}}}
  4. In Query Type, select TERM to find documents that contain the exact term as stored in the inverted index. MATCH queries accept text, numbers, and dates, analyze them, and construct a query from the result. If you want the query text analyzed, use MATCH.
  5. In Indices, enter the index to search. This field is automatically populated from the index templates, if available.



    If there are no index templates, or if you do not have administrator privileges with ELK, enter the index manually.
    1. To locate indices, in Kibana, click Management.
    2. Click Index Patterns. The Index Patterns page appears.
    3. Copy the name of one of the Index patterns.
    4. In Harness, in the ELK dialog, paste the name of the Index pattern into Indices.
  6. In Host Name Field, enter the field name used in the ELK logs that refers to the host/pod/container ELK is monitoring.

    To find the hostname in Kibana and enter it in Harness, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate the name of the host/container/pod that ELK is monitoring. For example, when using Kubernetes, the pod name field kubernetes.pod_name is used.
    4. In Harness, in the ELK dialog, next to Host Name Field, click Guide From Example. The Host Name Field popover appears.
    5. In the JSON response, click on the name of the label that maps to the host/container/pod in your log search results. Using our Kubernetes example, under pod, you would click the first name label.

      The Host Name Field is filled with the JSON label for the hostname.
  7. In Message Field, enter the field by which the messages are usually indexed, typically a log field.

    To find the field in Kibana and enter it in Harness, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate a log for the host/container/pod that ELK is monitoring. For example, in Kubernetes results in Kibana, the messages are indexed under the log field.
    4. In Harness, in the ELK dialog, next to Message Field, click Guide From Example. The Message Field popover appears.
    5. In the JSON response, click on the name of the label that maps to the log in your Kibana results. Using our Kubernetes example, you would click the log label.

      The label is added to the Message Field.
  8. In Expression for Host/Container name, add an expression that evaluates to the host name value for the field you entered in the Host Name Field above. The default expression is ${host.hostName}.
    To obtain the names of the hosts where your service is deployed, the verification step should be added to your workflow after you have run at least one successful deployment.
    To ensure that you pick the right field when using Guide From Example, you can use a host name from the ELK log messages as a guide.

    To use Guide From Example for a host name expression, do the following:
    1. In Kibana, click Discover.
    2. In the search field, search for error or exception.
    3. In the results, locate the name of the host/container/pod ELK is monitoring. For example, when using Kubernetes, the pod name field kubernetes.pod_name displays the value you need.

      The expression that you provide in Expression for Host/Container Name should evaluate to the name here, although the suffixes can differ.
    4. In Harness, in your workflow ELK dialog, click Guide From Example. The Expression for Host Name popover appears.

      The dialog shows the service, environment, and service infrastructure used for this workflow.
    5. In Host, click the name of the host to use when testing verification. The hostname will be similar to the hostname you used for the Host Name Field, as described earlier in this procedure. The suffixes can be different.
    6. Click SUBMIT. The JSON for the host appears. Look for the host section.

      You want to use a name label in the host section. Do not use a host name label outside of that section.
    7. To identify which label to use to build the expression, compare the host/pod/container name in the JSON with the hostname you use when configuring Host Name Field.
    8. In the Expression for Host Name popover, click the name label to select the expression. Click back in the main dialog to close the Guide From Example. The expression is added to the Expression for Host/Container name field.

      For example, if you clicked the name label, the expression ${host.name} is added to the Expression for Host/Container name field.
  9. In Timestamp format, enter the format for the timestamp field in the Elasticsearch record. Use Kibana to determine the format.

    In Kibana, use the Filter feature in Discover to construct your timestamp range and inspect the timestamp format.



    Format Examples:

    Timestamp: 2018-08-24T21:40:20.123Z. Format: yyyy-MM-dd'T'HH:mm:ss.SSSX

    Timestamp: 2018-08-30T21:57:23+00:00. Format: yyyy-MM-dd'T'HH:mm:ssXXX

    For more information, see Date Math from Elastic.
  10. At the bottom of the ELK dialog, click TEST.

    A new Expression for Host Name popover appears.

    In Host, select the same host you selected last time, and then click RUN. Verification for the host is found.

If you receive an error, it is likely because you selected the wrong label in Expression for Host/Container name or Host Name Field.
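For reference, the Query Type and Advanced Query settings above correspond to raw Elasticsearch query bodies. The following sketch simply prints the two shapes behind TERM and MATCH; the field name log is a placeholder for your own Message Field:

```shell
# Sketch: the raw Elasticsearch query bodies behind the Query Type setting.
# The field name "log" is a placeholder for your Message Field.
TERM_QUERY='{"query":{"term":{"log":{"value":"error"}}}}'
MATCH_QUERY='{"query":{"match":{"log":{"query":"error"}}}}'
# TERM looks up the exact token in the inverted index; MATCH analyzes the
# query text first, the same way the indexed text was analyzed.
echo "$TERM_QUERY"
echo "$MATCH_QUERY"
```

The Advanced Query example earlier ({"regexp":{"log":{"value":"error"}}}) is likewise an inner query clause of this form.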

The following settings are common to all verification provider dialogs in workflows.

Analysis Time duration: Set the duration for the verification step. If the verification step exceeds this value, the workflow's Failure Strategy is triggered. For example, if the Failure Strategy is Ignore, the verification step is marked Failed but the workflow execution continues.

Baseline for Risk Analysis: Select Previous Analysis to use the previous analysis as the baseline for comparison. If your workflow is a Canary workflow type, you can select Canary Analysis to compare old versions of nodes to new versions of nodes in real time.

Execute with previous steps: Check this checkbox to run this verification step in parallel with the previous steps in Verify Service.

Failure Criteria: Specify the sensitivity of the failure criteria. When the criteria are met, the workflow's Failure Strategy is triggered.

Include instances from previous phases: If you are using this verification step in a multi-phase deployment, select this checkbox to include instances used in previous phases when collecting data. Do not apply this setting to the first phase of a multi-phase deployment.

Wait interval before execution: Set how long the deployment process should wait before executing the verification step.

When you are finished, click SUBMIT. The ELK verification step is added to your workflow.

Verification Results

Once you have deployed your workflow (or pipeline) using the ELK verification step, you can automatically verify cloud application and infrastructure performance across your deployment.

Workflow Verification

To see the results of the Harness machine-learning evaluation of your ELK verification, in your workflow or pipeline deployment, expand the Verify Service step and then click the ELK step.

Continuous Verification

You can also see the evaluation in the Continuous Verification dashboard. The workflow verification view is for the DevOps user who developed the workflow. The Continuous Verification dashboard is where all future deployments are displayed for developers and others interested in deployment analysis.

To learn about the verification analysis features, see the following sections.

Deployments

Deployment info
See the verification analysis for each deployment, with information on its service, environment, pipeline, and workflows.

Verification phases and providers
See the verification phases for each verification provider. Click each provider for logs and analysis.

Verification timeline
See when each deployment and verification was performed.

Transaction Analysis

Execution details
See the details of verification execution. Total is the total time the verification step took, and Analysis duration is how long the analysis took.

Risk level analysis
Get an overall risk level and view the cluster chart to see events.

Transaction-level summary
See a summary of each transaction with the query string, error values comparison, and a risk analysis summary.

Execution Analysis

Event type
Filter cluster chart events by Unknown Event, Unexpected Frequency, Anticipated Event, Baseline Event, and Ignore Event.

Cluster chart
View the chart to see how the selected events contrast. Click each event to see its log details.

Event Management

Event-level analysis
See the threat level for each event captured.

Tune event capture
Remove events from analysis at the service, workflow, execution, or overall level.

Event distribution
Click the chart icon to see an event distribution including the measured data, baseline data, and event frequency.

Troubleshooting

The following are resolutions to common configuration problems.

Workflow Step Test Error

When you click TEST in the ELK workflow dialog's Expression for Host Name popover, you should get provider information.

The following error message can occur when testing the ELK verification step in your workflow:

ELK_CONFIGURATION_ERROR: Error while saving ELK configuration. No node with name ${hostName} found reporting to ELK
Cause

The expression in the Expression for Host/Container name field is incorrect. Typically, this occurs when the wrong hostName label is selected to create the expression in the Expression for Host/Container name field.

Solution

Follow the steps in Verify with ELK again to select the correct expression. Ensure that the name label selected is under the host section of the JSON.

SocketTimeoutException

When you add an ELK verification provider and click SUBMIT, you might see the following error.

Cause

The Harness delegate does not have a valid connection to the ELK server.

Solution

On the same server or instance where the Harness delegate is running, run one of the following cURL commands to verify whether the delegate can connect to the ELK server.

If you do not have a username and password for the ELK server:

curl -i -X POST 'url/*/_search?size=1' -H 'Content-Type: application/json' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If you have a username and password, use this command:

curl -i -X POST 'url/*/_search?size=1' -H 'Content-Type: application/json' -H 'Authorization: Basic <Base64-encoded username:password>' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If you have token-based authentication, use this command:

curl -i -X POST 'url/*/_search?size=1' -H 'Content-Type: application/json' -H 'tokenKey: tokenValue' -d '{"size":1,"query":{"match_all":{}},"sort":{"@timestamp":"desc"}}'

If the cURL command cannot connect, it will fail.

If the cURL command can connect, it will return an HTTP 200 along with the JSON results.

If the cURL command is successful, but you still see the SocketTimeoutException error in the ELK dialog, contact Harness Support (support@harness.io).

It is possible that the response from the ELK server is simply taking very long.
