Rehydration of Self-Hosted Archives
Rehydration Recap
Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.
Rehydration from Self-Hosted Storage
By default, the Edge Delta agent archives logs on ClickHouse instances that are owned by Edge Delta, and it rehydrates data from these archives. However, your environment might be configured to archive data in your own ClickHouse storage. You can configure rehydration to read from your own archive by installing the Edge Delta rehydration poller and handler. These rehydration components can be deployed to any K8s cluster. When rehydration of self-hosted storage is configured, the Edge Delta backend does not handle any raw data. It only stores rehydration metadata such as input, filters, destination, and status.
This page covers a hybrid deployment, in which storage is self-hosted while other Edge Delta components, such as the web UI, are delivered via Edge Delta SaaS. For more information on full on-premises deployments, where all Edge Delta components are self-hosted, see here.
Architecture Overview
The following components are self-hosted:
ClickHouse
ClickHouse consists of 2 different kinds of nodes:
- Keeper Nodes: Keeper nodes are responsible for replicating data across different ClickHouse nodes and ensuring data consistency. In other words, they act as the distributed transaction coordinator for the ClickHouse cluster. Keeper nodes also maintain the metadata about the cluster’s data distribution and replication status.
- Main Nodes: Main nodes are responsible for processing and querying data in blob storage such as S3 or GCS. Main nodes can be thought of as the workhorses of the ClickHouse cluster. In the Kubernetes deployment of ClickHouse, keeper and main nodes are deployed as separate Kubernetes StatefulSets, which gives each node a unique, persistent identity and allows the nodes to be scaled up or down as needed.
ClickHouse is deployed using the Altinity/clickhouse-operator. The clickhouse-operator provides the following:
- Customized storage provisioning (VolumeClaim templates)
- Customized pod templates
- Customized service templates for endpoints
- ClickHouse configuration management
- ClickHouse users management
- ClickHouse cluster scaling including automatic schema propagation
- ClickHouse version upgrades
- Exporting ClickHouse metrics to Prometheus
Edge Delta Handler
The Handler is the Edge Delta component responsible for running rehydration. It uses ClickHouse as a query engine. Streaming to the destination is done in chunks.
Edge Delta Poller
The Poller is the Edge Delta component that integrates with the Edge Delta backend API to manage rehydration states. This component triggers, invokes and coordinates rehydrations. Authentication is done using an api-token secret.
Prerequisites
Resource Requirements
At a minimum, three m5a.4xlarge instances (16 vCPU, 64 GiB RAM each) are required.
Component | CPU | RAM |
---|---|---|
Handler | 7000m | 25000Mi |
Poller | 200m | 200Mi |
ClickHouse | 6000m | 28000Mi |
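The Environment Variables section of this page recommends setting GOMEMLIMIT to approximately 90% of the pod's limits.memory. The sketch below shows that arithmetic for the Handler figure in the table above. It assumes the 25000Mi figure is used as the memory limit, and the function name `gomemlimit_for` is hypothetical. The `MiB` suffix is one of the unit suffixes the Go runtime accepts for GOMEMLIMIT.

```shell
#!/bin/sh
# Prints a GOMEMLIMIT value equal to 90% of a memory limit given in Mi.
# Assumption: the input is the pod's limits.memory expressed in Mi.
gomemlimit_for() {
  limit_mi=$1
  echo "$(( limit_mi * 90 / 100 ))MiB"
}

gomemlimit_for 25000   # prints 22500MiB
```

The printed value could then be set as the GOMEMLIMIT environment variable in the Handler yaml.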
API token for Rehydration
Create an API token with the following permissions:
Resource | Instances | Operation |
---|---|---|
Rehydrations | All current and future | write |
Integrations | All current and future | write |
1. Click Admin - My Organization.
2. Click API Tokens.
3. Click Create Token.
4. Provide a descriptive token name and click Add Permissions.
5. Select a resource type the token should grant access to.
6. Select which resources of that type the token should grant access to.
7. Select the operations that the token should allow for that resource.
8. Click Create Permission.
9. Repeat steps 5-8 for all resource permissions the token should allow.
10. Click Token Details.
11. Review the token name and token permissions, then click Create.
12. Copy the private key for the token and keep it safe.
13. Click Close.
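The deployment steps later on this page ask for this token in base64 form. A minimal sketch of the encoding follows. The function name `encode_token` and the placeholder value are hypothetical. `printf '%s'` is used so that no trailing newline is encoded, and `tr -d '\n'` removes any line wrapping that `base64` adds for long inputs.

```shell
#!/bin/sh
# Encodes a token as single-line base64.
# printf '%s' avoids encoding a trailing newline along with the token.
encode_token() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder value; substitute the private key copied in the steps above.
encode_token "example-token"
```

The output is the value to paste in place of `<API TOKEN IN BASE64>`.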
Get your Organization ID
- Click Admin - My Organization.
- Copy the Organization ID.
Deploy ClickHouse and the Edge Delta Rehydration Components
- Remove all existing resources.
kubectl patch ClickHouseInstallation rh -p '{"metadata":{"finalizers":[]}}' --type=merge -n edgedelta-rehydration
kubectl delete namespace edgedelta-rehydration
- Get the ClickHouse yaml
wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/655a8c098fe817010e6feb577e04c796e34b014e/ed-ch-operator.yml
- Get the Edge Delta Handler yaml
wget https://gist.githubusercontent.com/aliozcan/b7465051d2bcbe700605fa0be306200e/raw/93ed146153e80e52055103ab41d5abab5a75b4c2/rh-k8s.yml
- Convert the API token to base64 and add it to line 9 to create the secret.
ed-api-token: "<API TOKEN IN BASE64>"
- Add the Organization ID to line 140
value: "<ORG_ID>"
- Add any required secrets, such as a Splunk API key, as environment variables after line 89.
- name: ED_SPLUNK_API_KEY
value: "<SPLUNK API KEY>"
- Create the edgedelta-rehydration namespace
kubectl create namespace edgedelta-rehydration
- Apply ed-ch-operator.yml and wait for the resources to enter the running state.
kubectl apply -f ed-ch-operator.yml
- Apply rh-k8s.yml and wait for the resources to enter the running state.
kubectl apply -f rh-k8s.yml
There should now be running pods for chi-rh-default-0-0-0, clickhouse-operator, rehydration-handler, and rehydration-poller.
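One way to confirm this is to scan the pod listing for the four names. The sketch below reads `kubectl get pods --no-headers` output on stdin (columns NAME, READY, STATUS, and so on) and prints any expected name that has no pod in the Running state. The helper name `missing_pods` is hypothetical, and the usage comment assumes all four pods live in the edgedelta-rehydration namespace, which may not hold for the operator in every installation.

```shell
#!/bin/sh
# Expected pod name prefixes, taken from the sentence above.
EXPECTED="chi-rh-default-0-0-0 clickhouse-operator rehydration-handler rehydration-poller"

# Reads a headerless pod listing on stdin and prints each expected
# prefix for which no pod is in the Running state.
missing_pods() {
  awk -v expected="$EXPECTED" '
    $3 == "Running" { running[$1] = 1 }
    END {
      n = split(expected, want, " ")
      for (i = 1; i <= n; i++) {
        found = 0
        for (name in running)
          if (index(name, want[i]) == 1) found = 1
        if (!found) print want[i]
      }
    }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pods -n edgedelta-rehydration --no-headers | missing_pods
```

Empty output means every expected pod was found in the Running state.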
Configure Rehydration in Edge Delta
- Click Data Pipeline and select Rehydrations.
If this setting is hidden for your organization, please contact Edge Delta.
- Click Settings.
- Set the Is On-Prem option to True.
- Configure other settings as required:
Setting | Description |
---|---|
Maximum Rehydration Size | Upper limit on the size of a rehydration. When creating a rehydration, if the size reported by the Analyze button exceeds this value, the Create button is disabled. |
Maximum Concurrent Rehydration Count | Maximum number of rehydration requests accepted for an organization. This is not the number of rehydrations that run concurrently; that is set with the ED_MAX_INFLIGHT environment variable in the Handler. |
Maximum Concurrent Rehydration Per User | Maximum concurrent rehydration requests accepted for an individual user. |
Maximum Rehydration Time Range | Maximum time range that is allowed when creating a new rehydration. |
- Click Save.
Environment Variables
The following environment variables can be configured in the Handler yaml:
Environment Variable | Description |
---|---|
ED_REHYDRATION_PUSH_BATCH_SIZE | Batch size in which to send logs to the destination. |
ED_REMOTE_TOKEN_FILE | If it isn’t possible to mount a token as an environment variable, you can provide a text-only file that contains the Edge Delta token for accessing the API. |
ED_API_ENDPOINT | The Edge Delta API endpoint: https://api.edgedelta.com |
ED_MAX_INFLIGHT | The maximum number of rehydrations the organization can run at the same time. |
GOGC | The aggressiveness of the Go garbage collector. A smaller number is more aggressive, which may mitigate out-of-memory issues. |
GOMEMLIMIT | The maximum memory allocation for the application, which should be approximately 90% of the limits.memory for the pod. |
ED_REHYDRATION_CONCURRENCY_LIMIT | The number of streams processed concurrently. The default is 4; this can be increased to 16 if there is a large memory allocation such as 128000Mi. Bear in mind that each concurrent stream pre-allocates its decompression and scan buffers, so large buffers across many concurrent streams can quickly exhaust the memory allocation. |
CLICKHOUSE_RH_CONNECTION | The ClickHouse internal endpoint: Endpoint=rh-ch-srv.edgedelta-rehydration.svc.cluster.local:9000,Database=default,Username=default,Password= |
ED_CH_CHUNK_SIZE_MB | The chunk size in MB. Rehydration groups files into chunks and queries the files in each chunk together. Chunking helps rehydration scale and improves ClickHouse performance. |
ED_CH_CHUNK_MAX_NUM_FILES | The maximum number of files in a chunk. Each chunk needs to satisfy both this environment variable and ED_CH_CHUNK_SIZE_MB. |
ED_REHYDRATION_SYNC_MODE | Whether to push to the destination synchronously. This is set to false because communication with the destination should be asynchronous. |
ED_REHYDRATION_PUSH_CONCURRENCY | Concurrent number of workers to push logs to the destinations. |
ED_REHYDRATION_BUFFER_PATH | Path to streamer buffer. Useful when the working directory does not have read/write permission. Requires the given path to pre-exist. |
Other integration variables such as ED_SPLUNK_API_KEY | Environment variables that are used in Integration configurations can be set here. |
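To illustrate how ED_CH_CHUNK_SIZE_MB and ED_CH_CHUNK_MAX_NUM_FILES can both constrain a chunk, the sketch below groups a file list greedily and starts a new chunk whenever adding a file would exceed either limit. This is an illustration only, not the Handler's actual implementation. The function name `plan_chunks`, the file names, the sizes, and the limits of 100 MB and 2 files are all hypothetical.

```shell
#!/bin/sh
# Reads "name size_mb" lines on stdin and prints "chunk_number name".
# $1 = maximum chunk size in MB, $2 = maximum files per chunk.
# Greedy grouping: a new chunk starts when either limit would be exceeded.
plan_chunks() {
  awk -v max_mb="$1" -v max_files="$2" '
    BEGIN { chunk = 1 }
    {
      if (count > 0 && (size + $2 > max_mb || count + 1 > max_files)) {
        chunk++; size = 0; count = 0
      }
      size += $2; count++
      print chunk, $1
    }'
}

printf '%s\n' "a.gz 40" "b.gz 50" "c.gz 30" "d.gz 10" "e.gz 5" | plan_chunks 100 2
```

With these inputs, a.gz and b.gz share chunk 1, c.gz starts chunk 2 because the size limit would be exceeded, and e.gz starts chunk 3 because chunk 2 already holds two files.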
The following environment variables can be configured in the Poller yaml:
Environment Variable | Description |
---|---|
ED_REHYDRATION_POLL_INTERVAL | The polling interval. The default is 60s. |
ED_ORG_ID | The Organization ID in Edge Delta. |
ED_REHYDRATE_DIRECT_ENDPOINT | The rehydration handler’s internal endpoint: http://rehydration-handler-service.edgedelta-rehydration.svc.cluster.local:8080 |
ED_DISABLE_OPENFAAS | This option is set to 1 to use ClickHouse instead of OpenFaaS. |
ED_REMOTE_TOKEN_FILE | If it isn’t possible to mount a token as an environment variable, you can provide a text-only file that contains the Edge Delta token for accessing the API. |
ED_API_ENDPOINT | The Edge Delta API endpoint: https://api.edgedelta.com |
ED_AGENT_TAGS_FILTER | The tags that the Poller fetches rehydrations with. |
ED_RH_MULTI_POLLER_ENABLE | Whether to use multiple rehydration poller instances for high availability. The default is 0. |
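As a rough illustration of the ED_REMOTE_TOKEN_FILE option listed for both the Handler and the Poller, the sketch below writes a token to a text-only file and points the variable at it. It assumes the file holds just the token with no trailing newline, which this page does not specify. The function name `write_token_file` and the temporary path are hypothetical. In a cluster, the file would more likely be mounted into the pod than created by hand.

```shell
#!/bin/sh
# Writes a token to a plain-text file that only the owner can read.
write_token_file() {
  printf '%s' "$1" > "$2"
  chmod 600 "$2"
}

token_file=$(mktemp)                            # hypothetical location for this sketch
write_token_file "example-token" "$token_file"  # placeholder token value
export ED_REMOTE_TOKEN_FILE="$token_file"
```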
Trigger Rehydration
Use the Rehydration page of the Edge Delta App to trigger rehydration for a specific time range, source filter, and keyword.