Self-Hosted Rehydrations

Configure self-hosted rehydrations in the Edge Delta web application.

Rehydration Recap

Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.

Self-Hosted Rehydration

The Edge Delta agent archives logs in customer-specific S3 buckets that are owned by Edge Delta, and it rehydrates data from these archives. However, your environment might be configured to archive data in your own data storage. In that case, you can configure self-hosted rehydration to rehydrate data from your own storage. Rehydration components can be deployed to any Kubernetes cluster. When self-hosted rehydration is configured, the Edge Delta backend does not handle any raw data; it only stores rehydration metadata such as the input, filters, destination, and status.

Deploying Self-Hosted Rehydration

There are two methods of deploying Edge Delta’s rehydration components in your environment:

  • Using OpenFaaS, or
  • Without OpenFaaS.

Deploy Self-Hosted Rehydration using OpenFaaS

Deployment Overview

To deploy self-hosted rehydration using OpenFaaS, you install the OpenFaaS Helm chart while passing in an Edge Delta values.yaml file. You then create an API token in the Edge Delta web app, save it as a Kubernetes secret, and enable rehydration in the web app. With port forwarding temporarily enabled on the Edge Delta rehydration gateway service, you deploy the Edge Delta rehydration function handler. Finally, you download the Edge Delta rehydration poller and configure it with your organization’s ID before deploying it. The following section explains these steps in detail.

Deployment Steps

  1. Install the following tools:
  • kubectl
  • Helm 3.8.x or older. Helm 3.9.x is not recommended.
  • faas
  2. Create a namespace for the Edge Delta rehydration components:
kubectl create namespace edgedelta-rehydration
  3. Add and update the OpenFaaS Helm repo, then install the OpenFaaS Helm chart while passing in the Edge Delta values file:
helm repo add openfaas https://openfaas.github.io/faas-netes
helm repo update
helm upgrade openfaas --wait --install openfaas/openfaas --namespace edgedelta-rehydration -f https://raw.githubusercontent.com/edgedelta/k8s/master/on-prem-rehydration/on-prem-rehydration-helm-values.yml
  4. Create an Edge Delta token in the web app:
    4.1 Click Management - My Organization.
    4.2 Select API Tokens and click Create Token.
    4.3 Name the token and select Add Permissions.
    4.4 Configure the following permissions and click Add to Token to add each one to the token:
      • Rehydration - All Current and Future Rehydrations - Write
      • Integration - All Current and Future Rehydrations - Read
    4.5 Select Token Details, confirm the permissions configuration, and click Create.
    4.6 Copy the token key and click Close.
  5. Save the token in a new Kubernetes secret. In this example, the secret and key are both named ed-api-token, and the token key is 123456789:
kubectl create secret generic ed-api-token \
  --namespace=edgedelta-rehydration \
  --from-literal=ed-api-token="123456789"
  6. Enable rehydrations in the Edge Delta web app:
    6.1 Click Data Pipeline and select Rehydrations.
    6.2 Click Settings.
    6.3 Set the Is On-Prem option to True and click Save.

If this setting is hidden for your organization, please contact Edge Delta.

  7. Temporarily enable port forwarding to connect to OpenFaaS:
kubectl port-forward -n edgedelta-rehydration svc/gateway 8080:8080
  8. In another terminal, deploy the rehydration function handler:
faas deploy -f https://raw.githubusercontent.com/edgedelta/k8s/master/on-prem-rehydration/on-prem-rehydration-function.yml
  9. Stop port forwarding with Control+C in the terminal you used in step 7.
  10. Download the Edge Delta rehydration poller YAML to a local file, /tmp/rehydration-poller.yml:
curl https://raw.githubusercontent.com/edgedelta/k8s/master/on-prem-rehydration/on-prem-rehydration-poller.yml -o /tmp/rehydration-poller.yml
  11. In the Edge Delta web app, click Management - My Organization and copy your organization ID.
  12. Edit /tmp/rehydration-poller.yml and specify your organization ID as the value for ED_ORG_ID.
  13. Deploy the customized rehydration poller:
kubectl apply -f /tmp/rehydration-poller.yml
  14. Review the logs to verify a successful connection to the OpenFaaS gateway:
kubectl logs deployment/rehydration-poller -n edgedelta-rehydration
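
The Helm requirement listed with the tools at the start of these steps (3.8.x or older) can be checked before the chart is installed. The following is a minimal shell sketch, not part of the Edge Delta tooling: helm_version_ok is a hypothetical helper name, and it assumes a version string in the form printed by helm version --short, such as v3.8.2+g6e3701e.

```shell
#!/bin/sh
# helm_version_ok VERSION
# Succeeds when VERSION (for example "v3.8.2+g6e3701e") is Helm 3.8.x or older.
# Hypothetical helper for illustration; not part of Helm or Edge Delta.
helm_version_ok() {
    version=${1#v}          # drop the leading "v"
    major=${version%%.*}    # text before the first dot
    rest=${version#*.}      # text after the first dot
    minor=${rest%%.*}       # text before the next dot
    [ "$major" -lt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -le 8 ]; }
}

# Example use (requires Helm on the PATH):
#   helm_version_ok "$(helm version --short)" || echo "Helm 3.9 or newer detected" >&2
```

The check only compares the major and minor numbers, which is enough to separate 3.8.x from 3.9.x.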

When you trigger a rehydration, requests are now processed by the rehydration components installed on your Kubernetes cluster.
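
Setting ED_ORG_ID in the downloaded poller file can also be scripted instead of edited by hand. The sketch below is an illustration under a stated assumption: it expects the file to use the usual Kubernetes env layout, where a "name: ED_ORG_ID" line is followed by a "value:" line. That layout is not confirmed by this page, so inspect the file first. set_org_id is a hypothetical helper name.

```shell
#!/bin/sh
# set_org_id FILE ORG_ID
# Prints FILE with the "value:" line that follows "name: ED_ORG_ID" replaced.
# Assumed layout (verify against the real file before relying on this):
#     - name: ED_ORG_ID
#       value: "..."
set_org_id() {
    awk -v id="$2" '
        # Replace the first value: line seen after the ED_ORG_ID name line.
        found && /^[ \t]*value:/ {
            sub(/value:.*/, "value: \"" id "\"")
            found = 0
        }
        # Any other env entry ends the search.
        /name:/ && !/ED_ORG_ID/ { found = 0 }
        /name:[ \t]*ED_ORG_ID[ \t]*$/ { found = 1 }
        { print }
    ' "$1"
}

# Example use (writes a customized copy and leaves the download untouched):
#   set_org_id /tmp/rehydration-poller.yml "your-org-id" > /tmp/rehydration-poller.custom.yml
```

Writing to a separate file keeps the original download available for comparison before you run kubectl apply.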

Customize Rehydration Resource Allocations

The default resource allocations for the rehydration components should be suitable for most use cases. In some instances, you may need to change resource settings to cater for specific data characteristics, such as very long log lines or a large volume of data that must be rehydrated quickly. The following parameters can be adjusted or added in the on-prem-rehydration-function YAML before it is deployed in step 8:

Each parameter is shown as a YAML fragment, followed by its description:
functions
  rehydrate
    limits
      memory: 32000Mi
    requests
      memory: 32000Mi
Allocate memory to the pod.
functions
  rehydrate
    environment
      ED_REHYDRATION_MEMORY_THRESHOLD: 28G
Define the maximum memory allocation for the application, which should be approximately 90% of the limits.memory for the pod.
functions
  rehydrate
    environment
      GOGC: 20
Aggressiveness of the Go garbage collector. A smaller number is more aggressive, which may mitigate out-of-memory issues.
functions
  rehydrate
    environment
      ED_REHYDRATION_CONCURRENCY_LIMIT: 4
The number of streams processed concurrently. The default is 4; this can be increased to 16 if there is a large memory allocation, such as 128000Mi. Bear in mind that each concurrent stream is pre-allocated its own decompression and scan buffers, so large buffers across many concurrent streams can quickly exhaust the memory allocation.
functions
  rehydrate
    environment
      ED_DECOMPRESS_BUFFER_SIZE: 1G
The decompression buffer memory size per concurrent rehydration stream.
functions
  rehydrate
    environment
      ED_SCAN_BUFFER_SIZE: 1M
The scan buffer memory size per concurrent rehydration stream for hash decoding. This can be increased when individual log lines are very long (more than 10^6 characters per line).
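
The relationship between these settings is simple arithmetic: the memory threshold is about 90% of the pod limit, and the buffers are pre-allocated once per concurrent stream. The following shell sketch illustrates it with the default values from the table. The function names are hypothetical, and treating 1G as 1024 Mi and 1M as 1 Mi is an assumption about how the sizes are interpreted.

```shell
#!/bin/sh
# All sizes are in Mi. Assumption: 1G is treated as 1024 Mi and 1M as 1 Mi.

# memory_threshold_mi LIMIT_MI: roughly 90% of the pod memory limit.
memory_threshold_mi() {
    echo $(( $1 * 90 / 100 ))
}

# buffer_budget_mi STREAMS DECOMPRESS_MI SCAN_MI:
# memory pre-allocated for buffers across all concurrent streams.
buffer_budget_mi() {
    echo $(( $1 * ($2 + $3) ))
}

memory_threshold_mi 32000     # prints 28800
buffer_budget_mi 4 1024 1     # prints 4100
buffer_budget_mi 16 1024 1    # prints 16400
```

With the defaults, the buffers take a small share of the threshold. Raising both the concurrency and the buffer sizes multiplies the two, which is why they should be tuned together.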

Deploy Self-Hosted Rehydration without OpenFaaS

You can use kubectl to deploy self-hosted rehydration without OpenFaaS if your version of Helm does not install it correctly, or if your cluster is older than Kubernetes v1.16. OpenFaaS depends on the apiextensions.k8s.io/v1 API, which is available only in Kubernetes v1.16 and later.

Deployment Overview

To deploy self-hosted rehydration without OpenFaaS, you create an API token in the Edge Delta web app, save it as a Kubernetes secret, and enable rehydration in the web app. You then deploy the Edge Delta rehydration handler. Finally, you download the Edge Delta rehydration poller and configure it with your organization ID before deploying it. The following section explains these steps in detail.

Deployment Steps

  1. Install kubectl or ensure it is installed.
  2. Create the edgedelta-rehydration namespace:
kubectl create namespace edgedelta-rehydration
  3. Create an Edge Delta token in the web app:
    3.1 Click Management - My Organization.
    3.2 Select API Tokens and click Create Token.
    3.3 Name the token and select Add Permissions.
    3.4 Configure the following permissions and click Add to Token to add each one to the token:
      • Rehydration - All Current and Future Rehydrations - Write
      • Integration - All Current and Future Rehydrations - Read
    3.5 Select Token Details, confirm the permissions configuration, and click Create.
    3.6 Copy the token key and click Close.
  4. Save the token in a new Kubernetes secret. In this example, the secret and key are both named ed-api-token, and the token key is 123456789:
kubectl create secret generic ed-api-token \
  --namespace=edgedelta-rehydration \
  --from-literal=ed-api-token="123456789"
  5. Enable rehydrations in the Edge Delta web app:
    5.1 Click Data Pipeline and select Rehydrations.
    5.2 Click Settings.
    5.3 Set the Is On-Prem option to True and click Save.

If this setting is hidden for your organization, please contact Edge Delta.

  6. Deploy the rehydration handler:
kubectl apply -f https://raw.githubusercontent.com/edgedelta/k8s/master/on-prem-rehydration/on-prem-rehydration-handler-faasless.yml
  7. Download the rehydration poller YAML to a local file, /tmp/rehydration-poller-faasless.yml:
curl https://raw.githubusercontent.com/edgedelta/k8s/master/on-prem-rehydration/on-prem-rehydration-poller-faasless.yml -o /tmp/rehydration-poller-faasless.yml
  8. In the Edge Delta web app, click Management - My Organization and copy your organization ID.
  9. Edit /tmp/rehydration-poller-faasless.yml and specify your organization ID as the value for ED_ORG_ID.
  10. Deploy the customized rehydration poller:
kubectl apply -f /tmp/rehydration-poller-faasless.yml
  11. Review the logs to verify that the poller has no errors:
kubectl logs deployment/rehydration-poller -n edgedelta-rehydration

When you trigger a rehydration, requests are now processed by the rehydration components installed on your Kubernetes cluster.
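
Reviewing the poller logs for errors can be reduced to a quick count. The sketch below is an illustration only: count_error_lines is a hypothetical helper, and the keywords it searches for are an assumption, because the poller's log format is not described on this page.

```shell
#!/bin/sh
# count_error_lines: reads log text on stdin and prints how many lines
# contain "error" or "fatal" (case-insensitive).
# The keywords are an assumption about the poller's log format.
count_error_lines() {
    grep -Eic 'error|fatal'
}

# Example use (requires access to the cluster):
#   kubectl logs deployment/rehydration-poller -n edgedelta-rehydration | count_error_lines
```

A count of 0 suggests a clean start, but it does not replace reading the log when a rehydration fails.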

Customize Rehydration Resource Allocations

The default resource allocations for the rehydration components should be suitable for most use cases. In some instances, you may need to change resource settings to cater for specific data characteristics, such as very long log lines or a large volume of data that must be rehydrated quickly. The following parameters can be adjusted or added in the on-prem-rehydration-handler-faasless YAML before it is deployed in step 6:

Each parameter is shown as a YAML fragment, followed by its description:
spec
  template
    spec
      - name: handler
        resources:
          limits:
            memory: 32000Mi
          requests:
            memory: 32000Mi
Allocate memory to the pod.
spec
  template
    spec
      - name: handler
        env:
          - name: ED_REHYDRATION_MEMORY_THRESHOLD
            value: 28G
Define the maximum memory allocation for the application, which should be approximately 90% of the limits.memory for the pod.
spec
  template
    spec
      - name: handler
        env:
          - name: GOGC
            value: "20"
Aggressiveness of the Go garbage collector. A smaller number is more aggressive, which may mitigate out-of-memory issues.
spec
  template
    spec
      - name: handler
        env:
          - name: ED_REHYDRATION_CONCURRENCY_LIMIT
            value: "4"
The number of streams processed concurrently. The default is 4; this can be increased to 16 if there is a large memory allocation, such as 128000Mi. Bear in mind that each concurrent stream is pre-allocated its own decompression and scan buffers, so large buffers across many concurrent streams can quickly exhaust the memory allocation.
spec
  template
    spec
      - name: handler
        env:
          - name: ED_DECOMPRESS_BUFFER_SIZE
            value: 1G
The decompression buffer memory size per concurrent rehydration stream.
spec
  template
    spec
      - name: handler
        env:
          - name: ED_SCAN_BUFFER_SIZE
            value: 1M
The scan buffer memory size per concurrent rehydration stream for hash decoding. This can be increased when individual log lines are very long (more than 10^6 characters per line).
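
Because the buffers are pre-allocated per stream, the buffer sizes put an upper bound on the concurrency limit for a given memory threshold. The following shell sketch computes that bound. max_streams is a hypothetical helper, sizes are in Mi, and treating 1G as 1024 Mi is an assumption. The example uses a threshold of 115200 Mi, which is 90% of a 128000Mi limit.

```shell
#!/bin/sh
# max_streams THRESHOLD_MI DECOMPRESS_MI SCAN_MI
# Prints the largest stream count whose pre-allocated buffers fit in THRESHOLD_MI.
# All sizes are in Mi; treating 1G as 1024 Mi is an assumption.
max_streams() {
    echo $(( $1 / ($2 + $3) ))
}

max_streams 115200 1024 1     # prints 112
max_streams 115200 4096 16    # prints 28
```

This bound covers the buffers only. The handler also needs memory for the data it processes, so a practical concurrency limit, such as the value of 16 mentioned above, sits well below it.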

Trigger Rehydration

Use the Rehydrations page of the Edge Delta web app to trigger a rehydration for a specific time range, source filter, and keyword.