Troubleshooting Edge Delta Rehydrations

Troubleshooting rehydrations in the Edge Delta web application.

Rehydration Recap

Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.

Troubleshooting Rehydration

Rehydration Failed

If a rehydration status is Failed, the cause of the failure is shown in the Error section of the Rehydration Detail page. Known issues include the following:

  • Large archive files can cause the rehydrate function to time out while waiting for sufficient memory to become available. This results in the following error message: err: failed to block until sufficient memory is available.
  • Multiple agents archiving to the rehydration source in different formats can cause errors. This results in one of the following error messages: failed to decompress or failed to decode.
  • In rare cases, a nil pointer exception causes the failure. Check the rehydration handler logs to confirm this.
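
As a rough aid for triage, the known issues above can be matched against the text copied from the Error section. The sketch below is illustrative only: the function name classify_rehydration_error and the wording of its output are assumptions made for this example, not part of Edge Delta. Only the quoted error strings come from the list above.

```shell
# classify_rehydration_error: map an Error-section message to a known issue.
# The function name and output wording are hypothetical; only the matched
# error strings are taken from the known-issues list.
classify_rehydration_error() {
  case "$1" in
    *"failed to block until sufficient memory is available"*)
      echo "memory timeout: archive files may be too large" ;;
    *"failed to decompress"*|*"failed to decode"*)
      echo "format mismatch: agents may be archiving in different formats" ;;
    *)
      echo "unknown: inspect the rehydration handler logs" ;;
  esac
}

# Example call with a message pasted from the Rehydration Detail page:
classify_rehydration_error "err: failed to block until sufficient memory is available"
```

Any message that does not match a known string falls through to the last branch, which points back to the rehydration handler logs.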

Rehydration Is Stuck In-Progress

If the rehydration status is In-Progress and the completion percentage on the Rehydration Detail page has not advanced for some time, the rehydration is most likely stuck and should be retried. For a self-hosted rehydration that runs with the rehydration poller, the poller re-invokes stuck rehydrations after a default interval of 15 minutes, or after the interval set by ED_REHYDRATION_REPROCESS_WAIT_INTERVAL.

For self-hosted rehydrations, the logs of the rehydrate function can be inspected to find out what caused the rehydration to get stuck:

OpenFaaS: kubectl logs svc/rehydrate -n openfaas-fn
Non-OpenFaaS: kubectl logs svc/rehydration-handler -n edgedelta-rehydration

Self-hosted Rehydration Doesn’t Start

If you are unable to create a rehydration in a self-hosted environment, check whether the rehydrate pod is up and running:

OpenFaaS: kubectl get pod -n openfaas-fn
Non-OpenFaaS: kubectl get pod -n edgedelta-rehydration

If the pod whose name starts with rehydrate- (or rehydration-handler for non-OpenFaaS) shows 0/x in the READY column, the rehydrate pod was unable to initialize. Use one of the following commands, depending on whether you are using OpenFaaS, to see why the pod cannot come up:

kubectl describe pod {pod_name} -n openfaas-fn 

or

kubectl describe pod {pod_name} -n edgedelta-rehydration 

This is usually caused by insufficient resources on the nodes to schedule the pod.
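
The READY check described in this section can also be scripted. The following is a minimal sketch, assuming the default kubectl get pod column layout (NAME, READY, STATUS, RESTARTS, AGE); the helper name not_ready_rehydrate_pods is made up for this example.

```shell
# not_ready_rehydrate_pods: read `kubectl get pod` output on stdin and print
# the names of rehydrate pods whose READY column starts with 0/.
# The helper name is hypothetical; the name prefixes come from the text above.
not_ready_rehydrate_pods() {
  awk 'NR > 1 && $1 ~ /^(rehydrate-|rehydration-handler)/ && $2 ~ /^0\// { print $1 }'
}

# Usage against a live cluster (pick the namespace for your setup):
# kubectl get pod -n openfaas-fn | not_ready_rehydrate_pods
# kubectl get pod -n edgedelta-rehydration | not_ready_rehydrate_pods
```

Each name it prints can be passed to kubectl describe pod in the matching namespace.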

New Self-Hosted Rehydrations Not Detected

If the poller or rehydration function in a self-hosted environment does not pick up new rehydrations, multiple pollers may be running in different clusters. Whichever poller polls the SaaS backend first after a new rehydration is created picks it up and invokes it on its connected rehydration function handler. As a result, the poller or rehydrator logs you are tailing may show no processing activity even though the rehydration status shows progress or an error state. Running multiple pollers is not supported, and it can also cause the same rehydration to be invoked concurrently on different rehydration handlers.
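
One way to check for this situation is to count how many clusters run a poller. The sketch below makes two assumptions: that poller pods have "poller" somewhere in their name, and that every cluster is reachable as a kubectl context. The helper name count_poller_clusters is hypothetical; adjust the match to your deployment's actual pod names.

```shell
# count_poller_clusters: read lines of "<context> <pod-name>" on stdin and
# print how many distinct contexts run a pod whose name contains "poller".
# The "poller" name match is an assumption, not a documented naming rule.
count_poller_clusters() {
  awk '$2 ~ /poller/ { seen[$1] = 1 } END { n = 0; for (c in seen) n++; print n }'
}

# Build the input from every kubectl context you have access to:
# for ctx in $(kubectl config get-contexts -o name); do
#   kubectl --context "$ctx" get pods -A --no-headers \
#     -o custom-columns=NAME:.metadata.name | awk -v c="$ctx" '{ print c, $1 }'
# done | count_poller_clusters
```

A result greater than 1 suggests that more than one cluster is polling for the same rehydrations.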