Rehydration of Self-Hosted Archives

Rehydrate logs from self-hosted archive storage in the Edge Delta web application.

Rehydration Recap

Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.

Rehydration from Self-Hosted Storage

The Edge Delta agent archives logs on ClickHouse instances that are owned by Edge Delta, and data is rehydrated from these archives. However, your environment might be configured to archive data in your own ClickHouse storage. To rehydrate data stored in your own archive, install the Edge Delta rehydration poller and handler. These rehydration components can be deployed to any Kubernetes (K8s) cluster. When rehydration of self-hosted storage is configured, the Edge Delta backend does not handle any raw data; it only stores rehydration metadata such as the input, filters, destination, and status.

This page covers rehydration from self-hosted storage while other Edge Delta components, such as the web UI, are delivered via Edge Delta SaaS. This arrangement is known as a hybrid deployment.

Architecture Overview

The following components are self-hosted:

ClickHouse

ClickHouse consists of 2 different kinds of nodes:

Keeper Nodes: Keeper nodes are responsible for replicating data across different ClickHouse nodes and ensuring data consistency. In other words, they act as the distributed transaction coordinator for the ClickHouse cluster. Keeper nodes also maintain the metadata about the cluster’s data distribution and replication status.
Main Nodes: Main nodes are responsible for processing and querying data on blob storage such as S3 or GCS. Main nodes can be thought of as the workhorses of the ClickHouse cluster.

In the Kubernetes deployment of ClickHouse, keeper and main nodes are deployed as separate Kubernetes StatefulSets, which ensures that each node has a unique, persistent identity and that the nodes can be scaled up or down as needed.

ClickHouse is deployed using the Altinity/clickhouse-operator, which provides the following:

Customized storage provisioning (VolumeClaim templates)
Customized pod templates
Customized service templates for endpoints
ClickHouse configuration management
ClickHouse users management
ClickHouse cluster scaling including automatic schema propagation
ClickHouse version upgrades
Exporting ClickHouse metrics to Prometheus

Edge Delta Handler

The Handler is the Edge Delta component responsible for running rehydration. It uses ClickHouse as a query engine. Streaming to the destination is done in chunks.

Edge Delta Poller

The Poller is the Edge Delta component that integrates with the Edge Delta backend API to manage rehydration states. This component triggers, invokes and coordinates rehydrations. Authentication is done using an api-token secret.

Prerequisites

Resource Requirements

At a minimum, three m5a.4xlarge instances (16 vCPU / 64 GiB RAM each) are required.

Component	CPU	RAM
Handler	7000m	25000Mi
Poller	200m	200Mi
ClickHouse	6000m	28000Mi

API token for Rehydration

Create an API token with the following permissions:

Resource	Instances	Operation
Rehydrations	All current and future	write
Integrations	All current and future	write

1. Click Admin - My Organization.
2. Click API Tokens.
3. Click Create Token.
4. Provide a descriptive token name and click Add Permissions.
5. Select a resource type the token should grant access to.
6. Select which resources of that type the token should grant access to.
7. Select the operations that the token should allow for that resource.
8. Click Create Permission.
9. Repeat Steps 5-8 for all resource permissions the token should allow.
10. Click Token Details.
11. Review the token name and token permissions, then click Create.
12. Copy the private key for the token and keep it safe.
13. Click Close.

Get your Organization ID

Click Admin - My Organization.
Copy the Organization ID.

Deploy ClickHouse and the Edge Delta Rehydration Components

Remove all existing resources.

kubectl patch ClickHouseInstallation rh -p '{"metadata":{"finalizers":[]}}' --type=merge -n edgedelta-rehydration

kubectl delete namespace edgedelta-rehydration
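
On a cluster where nothing has been installed yet, the patch command above fails because the ClickHouseInstallation resource does not exist. The following is a minimal sketch of a guarded cleanup that uses the same resource and namespace names as the commands above. The function name cleanup_rehydration is hypothetical and is not part of any Edge Delta tooling.

```shell
NAMESPACE=edgedelta-rehydration

# cleanup_rehydration is a hypothetical helper name, not part of any Edge Delta tooling.
cleanup_rehydration() {
  # Skip everything when the namespace was never created.
  if ! kubectl get namespace "$NAMESPACE" >/dev/null 2>&1; then
    echo "namespace $NAMESPACE not found, nothing to remove"
    return 0
  fi

  # Clear finalizers only if the ClickHouseInstallation named rh exists.
  # Finalizers left on a resource can keep the namespace deletion from completing.
  if kubectl get ClickHouseInstallation rh -n "$NAMESPACE" >/dev/null 2>&1; then
    kubectl patch ClickHouseInstallation rh -p '{"metadata":{"finalizers":[]}}' --type=merge -n "$NAMESPACE"
  fi

  kubectl delete namespace "$NAMESPACE"
}
```

Calling cleanup_rehydration with no arguments runs the cleanup against the current kubectl context.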

Get the ClickHouse yaml

wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/655a8c098fe817010e6feb577e04c796e34b014e/ed-ch-operator.yml

Get the Edge Delta Handler yaml

wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/93ed146153e80e52055103ab41d5abab5a75b4c2/rh-k8s.yml

Convert the API token to base64 and add it to line 9 of rh-k8s.yml to create the secret.

ed-api-token: "<API TOKEN IN BASE64>"
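
One way to produce the base64 value is sketched below. The token value shown is a hypothetical placeholder; substitute the private key you copied when creating the API token. The variable names are illustrative only.

```shell
# Hypothetical placeholder; replace with the private key copied when the token was created.
ED_API_TOKEN='example-token-value'

# printf avoids the trailing newline that echo would add to the encoded value.
# tr removes any line wrapping that base64 applies to long input.
ED_API_TOKEN_B64=$(printf '%s' "$ED_API_TOKEN" | base64 | tr -d '\n')

# Print the line in the form expected on line 9 of the manifest.
echo "ed-api-token: \"${ED_API_TOKEN_B64}\""
```

A stray newline in the encoded token is a common cause of authentication failures, which is why printf is used here rather than echo.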

Add the Organization ID to line 140.

value: "<ORG_ID>"

Add any required secrets, such as a Splunk API key, as environment variables after line 89.

- name: ED_SPLUNK_API_KEY
  value: "<SPLUNK API KEY>"
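
The line numbers in these steps refer to the file as downloaded, so it can help to print those lines before editing anything. The sketch below assumes rh-k8s.yml is in the current directory. The helper name show_line is hypothetical.

```shell
# show_line is a hypothetical helper: it prints line N of FILE, prefixed with the line number.
show_line() {
  awk -v n="$2" 'NR == n { printf "%d: %s\n", NR, $0 }' "$1"
}

# Print the three lines referenced in the steps above, if the manifest has been downloaded.
if [ -f rh-k8s.yml ]; then
  for n in 9 89 140; do
    show_line rh-k8s.yml "$n"
  done
fi
```

If the printed lines do not look like the secret, the environment variable list, and the Organization ID value, the downloaded file may differ from the revision these steps were written against.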

Create the edgedelta-rehydration namespace.

kubectl create namespace edgedelta-rehydration

Apply ed-ch-operator.yml and wait for the resources to enter the running state.

kubectl apply -f ed-ch-operator.yml

Apply rh-k8s.yml and wait for the resources to enter the running state.

kubectl apply -f rh-k8s.yml

There should now be running pods for chi-rh-default-0-0-0, clickhouse-operator, rehydration-handler, and rehydration-poller.
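
The check can be scripted rather than done by eye. The sketch below assumes all four pods run in the edgedelta-rehydration namespace; if the operator is deployed to a different namespace, check it separately. The function name wait_for_rehydration_pods and the 300s default timeout are illustrative choices, not values from Edge Delta.

```shell
NAMESPACE=edgedelta-rehydration

# wait_for_rehydration_pods is a hypothetical helper name.
# The optional first argument is a timeout; 300s is an arbitrary default.
wait_for_rehydration_pods() {
  timeout="${1:-300s}"
  # Block until every pod in the namespace reports the Ready condition.
  kubectl wait --for=condition=Ready pods --all -n "$NAMESPACE" --timeout="$timeout" || return 1
  # List the pods so the expected names can be confirmed.
  kubectl get pods -n "$NAMESPACE"
}
```

For example, wait_for_rehydration_pods 600s waits up to ten minutes and returns a non-zero status if any pod is still not ready.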

Configure Rehydration in Edge Delta

Click Pipelines and select Rehydrations.

If this setting is hidden for your organization, please contact Edge Delta.

Click Settings.
Set the Is On-Prem option to True.
Configure other settings as required:

Setting	Description
Maximum Rehydration Size	The maximum size allowed for a single rehydration. When you create a rehydration, if the size shown by the Analyze button is greater than this value, the Create button is disabled.
Maximum Concurrent Rehydration Count	The maximum number of rehydration requests accepted for an organization. This is not the number of rehydrations that run concurrently, which is defined using the `ED_MAX_INFLIGHT` environment variable in the Handler.
Maximum Concurrent Rehydration Per User	Maximum concurrent rehydration requests accepted for an individual user.
Maximum Rehydration Time Range	Maximum time range that is allowed when creating a new rehydration.

Click Save.

Environment Variables

The following environment variables can be configured in the Handler yaml:

Environment Variable	Description
ED_REHYDRATION_PUSH_BATCH_SIZE	Batch size in which to send logs to the destination.
ED_REMOTE_TOKEN_FILE	If it isn’t possible to mount a token as an environment variable, you can provide a plain-text file that contains the Edge Delta token for accessing the API.
ED_API_ENDPOINT	The Edge Delta API Endpoint: `https://api.edgedelta.com`.
ED_MAX_INFLIGHT	The maximum number of rehydrations the organization can run at the same time.
GOGC	The aggressiveness of the Go garbage collector. A smaller number is more aggressive, which may mitigate out-of-memory issues.
GOMEMLIMIT	The maximum memory allocation for the application, which should be approximately 90% of the limits.memory for the pod.
ED_REHYDRATION_CONCURRENCY_LIMIT	The number of streams processed concurrently. The default is 4; it can be increased to 16 if there is a large memory allocation such as 128000Mi. Bear in mind that each concurrent stream is pre-allocated its own decompression and scan buffer memory, so large buffers across many concurrent streams can quickly exhaust the available memory.
CLICKHOUSE_RH_CONNECTION	The ClickHouse internal endpoint: `Endpoint=rh-ch-srv.edgedelta-rehydration.svc.cluster.local:9000,Database=default,Username=default,Password=`
ED_CH_CHUNK_SIZE_MB	The chunk size in MB. Rehydration groups archived files into chunks and queries the files in each chunk together, which helps both rehydration scaling and ClickHouse performance.
ED_CH_CHUNK_MAX_NUM_FILES	The maximum number of files in a chunk. Each chunk needs to satisfy both this environment variable and `ED_CH_CHUNK_SIZE_MB`.
ED_REHYDRATION_SYNC_MODE	Whether to push to the destination synchronously. It is set to `false` so that communication with the destination is asynchronous.
ED_REHYDRATION_PUSH_CONCURRENCY	Concurrent number of workers to push logs to the destinations.
ED_REHYDRATION_BUFFER_PATH	Path to streamer buffer. Useful when the working directory does not have read/write permission. Requires the given path to pre-exist.
Other integration variables such as ED_SPLUNK_API_KEY	Environment variables that are used in Integration configurations can be set here.
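
As a worked example of the GOMEMLIMIT guidance in the table above, the sketch below computes 90% of a memory limit. It assumes the Handler pod's limits.memory is 25000Mi, which is the Handler RAM figure from the resource requirements table; use the value from your own manifest. The variable names are illustrative only.

```shell
# Assumption: the Handler pod's limits.memory is 25000Mi (the RAM figure in the resource table).
HANDLER_MEMORY_LIMIT_MI=25000

# Integer arithmetic: 90% of the limit, in MiB.
GOMEMLIMIT_MI=$(( HANDLER_MEMORY_LIMIT_MI * 90 / 100 ))

# The Go runtime accepts a MiB suffix for GOMEMLIMIT.
echo "GOMEMLIMIT=${GOMEMLIMIT_MI}MiB"
```

With a 25000Mi limit this prints GOMEMLIMIT=22500MiB.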

The following environment variables can be configured in the Poller yaml:

Environment Variable	Description
ED_REHYDRATION_POLL_INTERVAL	The default polling interval is 60s.
ED_ORG_ID	The Organization ID in Edge Delta.
ED_REHYDRATE_DIRECT_ENDPOINT	The rehydration handler’s internal endpoint: `http://rehydration-handler-service.edgedelta-rehydration.svc.cluster.local:8080`
ED_DISABLE_OPENFAAS	This option is set to `1` to use ClickHouse instead of OpenFaaS.
ED_REMOTE_TOKEN_FILE	If it isn’t possible to mount a token as an environment variable, you can provide a plain-text file that contains the Edge Delta token for accessing the API.
ED_API_ENDPOINT	The Edge Delta API Endpoint: `https://api.edgedelta.com`.
ED_AGENT_TAGS_FILTER	The tags that the Poller fetches rehydrations with.
ED_RH_MULTI_POLLER_ENABLE	Whether to use multiple rehydration poller instances for high availability. The default is 0.
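
Besides editing the yaml and re-applying it, an environment variable can be changed on a running workload with kubectl set env. The sketch below assumes the Poller runs as a Deployment named rehydration-poller, which is inferred from the pod name and not confirmed here. The helper name set_poller_env is hypothetical.

```shell
NAMESPACE=edgedelta-rehydration
# Assumption: the Poller is a Deployment named rehydration-poller (inferred from the pod name).
POLLER_DEPLOYMENT=rehydration-poller

# set_poller_env is a hypothetical helper; it takes one or more NAME=VALUE pairs.
set_poller_env() {
  kubectl set env "deployment/${POLLER_DEPLOYMENT}" -n "$NAMESPACE" "$@"
}
```

For example, set_poller_env ED_REHYDRATION_POLL_INTERVAL=30s would shorten the polling interval, assuming the variable accepts the same duration format as the 60s default. Changes made this way are overwritten the next time rh-k8s.yml is applied, so record them in the yaml as well.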

Trigger Rehydration

Use the Rehydration page of the Edge Delta App to trigger rehydration for a specific time range, source filter, and keyword.