Rehydration of Self-Hosted Archives

Rehydrate logs from self-hosted archive storage using the Edge Delta web application.

Rehydration Recap

Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.

Rehydration from Self-Hosted Storage

By default, the Edge Delta agent archives logs on ClickHouse instances that are owned by Edge Delta, and rehydration reads from those archives. However, your environment might be configured to archive data in your own ClickHouse storage. To rehydrate data from your own archive, install the Edge Delta rehydration Poller and Handler. These components can be deployed to any Kubernetes cluster. When rehydration from self-hosted storage is configured, the Edge Delta backend does not handle any raw data; it only stores rehydration metadata such as input, filters, destination, and status.

This page covers rehydration from self-hosted storage in a hybrid deployment, where other Edge Delta components such as the web UI are delivered via Edge Delta SaaS. For more information on full on-premises deployments, where all Edge Delta components are self-hosted, see here.

Architecture Overview

The following components are self-hosted:

ClickHouse

A ClickHouse cluster consists of two kinds of nodes:

  • Keeper Nodes: Keeper nodes coordinate data replication across ClickHouse nodes and ensure data consistency. In other words, they act as the distributed transaction coordinator for the ClickHouse cluster. Keeper nodes also maintain the metadata about the cluster’s data distribution and replication status.
  • Main Nodes: Main nodes process and query data held in blob storage such as S3 or GCS. They can be thought of as the workhorses of the ClickHouse cluster.

In the Kubernetes deployment of ClickHouse, keeper and main nodes are deployed as separate Kubernetes StatefulSets, which ensures that each node has a unique, persistent identity and that the nodes can be scaled up or down as needed.

ClickHouse is deployed using the Altinity/clickhouse-operator, which provides the following:

  • Customized storage provisioning (VolumeClaim templates)
  • Customized pod templates
  • Customized service templates for endpoints
  • ClickHouse configuration management
  • ClickHouse users management
  • ClickHouse cluster scaling including automatic schema propagation
  • ClickHouse version upgrades
  • Exporting ClickHouse metrics to Prometheus

Edge Delta Handler

The Handler is the Edge Delta component responsible for running rehydration. It uses ClickHouse as a query engine. Streaming to the destination is done in chunks.

Edge Delta Poller

The Poller is the Edge Delta component that integrates with the Edge Delta backend API to manage rehydration states. This component triggers, invokes and coordinates rehydrations. Authentication is done using an api-token secret.

Prerequisites

Resource Requirements

Three instances of m5a.4xlarge (16 vCPU / 64 GiB RAM) are required at a minimum.

Component    CPU     RAM
Handler      7000m   25000Mi
Poller       200m    200Mi
ClickHouse   6000m   28000Mi

API token for Rehydration

Create an API token with the following permissions:

Resource       Instances                Operation
Rehydrations   All current and future   write
Integrations   All current and future   write

  1. Click Admin - My Organization.
  2. Click API Tokens.
  3. Click Create Token.
  4. Provide a descriptive token name and click Add Permissions.
  5. Select a resource type the token should grant access to.
  6. Select which resources of that type the token should grant access to.
  7. Select the operations that the token should allow for that resource.
  8. Click Create Permission.
  9. Repeat Steps 5-8 for all resource permissions the token should allow.
  10. Click Token Details.
  11. Review the token name and token permissions then click Create.
  12. Copy the private key for the token and keep it safe.
  13. Click Close.

Get your Organization ID

  1. Click Admin - My Organization.
  2. Copy the Organization ID.

Deploy ClickHouse and the Edge Delta Rehydration Components

  1. Remove any existing rehydration resources.
     kubectl patch ClickHouseInstallation rh -p '{"metadata":{"finalizers":[]}}' --type=merge -n edgedelta-rehydration
     kubectl delete namespace edgedelta-rehydration
  2. Get the ClickHouse YAML.
     wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/655a8c098fe817010e6feb577e04c796e34b014e/ed-ch-operator.yml
  3. Get the Edge Delta Handler YAML.
     wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/93ed146153e80e52055103ab41d5abab5a75b4c2/rh-k8s.yml
  4. Convert the API token to base64 and add it to line 9 to create the secret.
     ed-api-token: "<API TOKEN IN BASE64>"
  5. Add the Organization ID to line 140.
     value: "<ORG_ID>"
  6. Add any required secrets, such as a Splunk API key, as environment variables after line 89.
     - name: ED_SPLUNK_API_KEY
       value: "<SPLUNK API KEY>"
  7. Create the edgedelta-rehydration namespace.
     kubectl create namespace edgedelta-rehydration
  8. Apply ed-ch-operator.yml and wait for the resources to enter the Running state.
     kubectl apply -f ed-ch-operator.yml
  9. Apply rh-k8s.yml and wait for the resources to enter the Running state.
     kubectl apply -f rh-k8s.yml

There should now be running pods for chi-rh-default-0-0-0, clickhouse-operator, rehydration-handler, and rehydration-poller.
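
The step that converts the API token to base64 can be sketched as follows. This is a minimal example rather than an official Edge Delta procedure: the helper name encode_token and the placeholder value are assumptions made for illustration. It uses printf instead of echo so that no trailing newline is encoded into the secret, and removes the line wrapping that some base64 implementations add to long output.

```shell
#!/usr/bin/env bash
# Encode a token as base64 for pasting into the ed-api-token secret field.
# encode_token is a hypothetical helper name used only in this sketch.
encode_token() {
  # printf '%s' avoids the trailing newline that echo would append.
  # tr -d '\n' strips the line wrapping that GNU base64 inserts in long output.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a placeholder value (not a real token).
encode_token "my-token"
```

Paste the resulting string in place of <API TOKEN IN BASE64>. If the pods fail to authenticate, a stray newline or space inside the encoded value is one thing worth ruling out.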

Configure Rehydration in Edge Delta

  1. Click Pipelines and select Rehydrations.

If this setting is hidden for your organization, please contact Edge Delta.

  2. Click Settings.
  3. Set the Is On-Prem option to True.
  4. Configure other settings as required:

Setting                                   Description
Maximum Rehydration Size                  The maximum size allowed for a rehydration. When you create a rehydration, the Create button is disabled if the size reported by the Analyze button exceeds this limit.
Maximum Concurrent Rehydration Count      The maximum number of rehydration requests accepted for an organization. This is not the number of rehydrations that run concurrently, which is set using the ED_MAX_INFLIGHT environment variable in the Handler.
Maximum Concurrent Rehydration Per User   The maximum number of concurrent rehydration requests accepted for an individual user.
Maximum Rehydration Time Range            The maximum time range allowed when creating a new rehydration.

  5. Click Save.

Environment Variables

The following environment variables can be configured in the Handler YAML:

Environment Variable               Description
ED_REHYDRATION_PUSH_BATCH_SIZE     The batch size in which logs are sent to the destination.
ED_REMOTE_TOKEN_FILE               If it isn’t possible to provide the token as an environment variable, you can point to a plain-text file that contains the Edge Delta token for accessing the API.
ED_API_ENDPOINT                    The Edge Delta API endpoint: https://api.edgedelta.com
ED_MAX_INFLIGHT                    The maximum number of rehydrations the organization can run at the same time.
GOGC                               The aggressiveness of the Go garbage collector. A smaller number is more aggressive, which may mitigate out-of-memory issues.
GOMEMLIMIT                         The maximum memory allocation for the application, which should be approximately 90% of the limits.memory for the pod.
ED_REHYDRATION_CONCURRENCY_LIMIT   The number of streams processed concurrently. The default is 4. It can be increased to 16 if the pod has a large memory allocation, such as 128000Mi. Each concurrent stream is pre-allocated its own decompression and scan buffers, so large buffers across many concurrent streams can quickly exhaust the available memory.
CLICKHOUSE_RH_CONNECTION           The ClickHouse internal endpoint: Endpoint=rh-ch-srv.edgedelta-rehydration.svc.cluster.local:9000,Database=default,Username=default,Password=
ED_CH_CHUNK_SIZE_MB                The chunk size in MB. Rehydration groups files into chunks and queries the files in each chunk, which helps both rehydration scaling and ClickHouse performance.
ED_CH_CHUNK_MAX_NUM_FILES          The maximum number of files in a chunk. Each chunk must satisfy both this environment variable and ED_CH_CHUNK_SIZE_MB.
ED_REHYDRATION_SYNC_MODE           Whether to push to the destination synchronously. This is set to false so that communication with the destination is asynchronous.
ED_REHYDRATION_PUSH_CONCURRENCY    The number of concurrent workers that push logs to the destinations.
ED_REHYDRATION_BUFFER_PATH         The path to the streamer buffer. Useful when the working directory does not have read/write permission. The given path must already exist.

Environment variables that are used in integration configurations, such as ED_SPLUNK_API_KEY, can also be set here.
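
The GOMEMLIMIT guidance (approximately 90% of the pod's limits.memory) can be worked through with a small calculation. This is a sketch under stated assumptions: the helper name gomemlimit_from_mi is hypothetical, and the example assumes a limits.memory of 25000Mi, borrowing the Handler RAM figure from the resource table. Note that Kubernetes writes mebibytes as Mi, whereas the Go runtime expects the MiB suffix.

```shell
#!/usr/bin/env bash
# Derive a GOMEMLIMIT value as 90% of a pod memory limit expressed in Mi.
# gomemlimit_from_mi is a hypothetical helper name used only in this sketch.
gomemlimit_from_mi() {
  # Integer arithmetic, rounded down. The output uses Go's "MiB" suffix
  # rather than the Kubernetes "Mi" suffix.
  echo "$(( $1 * 90 / 100 ))MiB"
}

# Example: an assumed limits.memory of 25000Mi.
gomemlimit_from_mi 25000
```

For a 25000Mi limit this gives 22500MiB, leaving roughly 10% headroom for memory that the Go runtime limit does not account for.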

The following environment variables can be configured in the Poller YAML:

Environment Variable           Description
ED_REHYDRATION_POLL_INTERVAL   The polling interval. The default is 60s.
ED_ORG_ID                      The Organization ID in Edge Delta.
ED_REHYDRATE_DIRECT_ENDPOINT   The rehydration handler’s internal endpoint: http://rehydration-handler-service.edgedelta-rehydration.svc.cluster.local:8080
ED_DISABLE_OPENFAAS            Set to 1 to use ClickHouse instead of OpenFaaS.
ED_REMOTE_TOKEN_FILE           If it isn’t possible to provide the token as an environment variable, you can point to a plain-text file that contains the Edge Delta token for accessing the API.
ED_API_ENDPOINT                The Edge Delta API endpoint: https://api.edgedelta.com
ED_AGENT_TAGS_FILTER           The tags that the Poller fetches rehydrations with.
ED_RH_MULTI_POLLER_ENABLE      Whether to use multiple rehydration Poller instances for high availability. The default is 0.

Trigger Rehydration

Use the Rehydration page of the Edge Delta App to trigger rehydration for a specific time range, source filter, and keyword.