Troubleshooting Rehydrations
Rehydration Recap
Rehydration is the process of pushing, or rehydrating, archived data to a streaming destination. You may want to do this to investigate an incident.
Troubleshooting Rehydration
Rehydration Failed
If a rehydration status is Failed, the cause of failure can be found in the Rehydration Detail page in the Error section. Known issues include the following:
- Large archive files can cause the rehydrate function to time out while waiting for sufficient memory to become available. This results in the following error message:
err: failed to block until sufficient memory is available.
- Multiple agents archiving to the rehydration source in different formats can cause errors. This results in one of the following error messages:
failed to decompress
failed to decode
- In rare cases, a nil pointer exception causes the failure. To confirm this, inspect the rehydration handler logs.
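The known issues above can be turned into a quick triage lookup. The following is a minimal sketch, not part of the product: the function name `classify_rehydration_error` and its output labels are hypothetical, and it matches only the error strings listed above.

```shell
# Hypothetical helper: map an error message copied from the Error section
# of the Rehydration Detail page to one of the known issues listed above.
classify_rehydration_error() {
  local msg="$1"
  case "$msg" in
    *"failed to block until sufficient memory is available"*)
      echo "memory-timeout: archive files may be too large" ;;
    *"failed to decompress"*|*"failed to decode"*)
      echo "format-mismatch: agents may be archiving in different formats" ;;
    *)
      # Anything else, including the rare nil pointer case, needs the logs.
      echo "unknown: inspect the rehydration handler logs" ;;
  esac
}
```

For example, `classify_rehydration_error "failed to decode"` reports the format mismatch case, while any unlisted message falls through to the log inspection hint.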
Rehydration Is Stuck in In-Progress
If the rehydration status is In-Progress and the completion percentage in the Rehydration Detail page has not advanced for a while, the rehydration is most likely stuck and should be retried. For self-hosted rehydrations used with the rehydration poller, the poller re-invokes stuck rehydrations after a default interval of 15 minutes, or after the interval set by ED_REHYDRATION_REPROCESS_WAIT_INTERVAL.
For self-hosted rehydrations, the logs of the rehydrate function can be inspected to find out what caused the rehydration to get stuck:
For OpenFaaS: kubectl logs svc/rehydrate -n openfaas-fn
For non-OpenFaaS: kubectl logs svc/rehydration-handler -n edgedelta-rehydration
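When the log output is long, it can help to narrow it to lines that look related to a failure. The sketch below is an assumption-laden filter, not a documented log format: the function name `suspicious_log_lines` is hypothetical, and the patterns are guesses based on the error messages listed earlier plus generic terms such as "panic" and "nil pointer".

```shell
# Hypothetical filter: reads log text on stdin and keeps lines that match
# failure-related patterns. The patterns are assumptions, not a documented
# log schema for the rehydrate function.
suspicious_log_lines() {
  # `|| true` keeps the exit status zero when nothing matches.
  grep -iE 'panic|nil pointer|failed to (decompress|decode|block)|timed out|timeout' || true
}
```

For example, the output of the `kubectl logs` command for your environment can be piped into it: `kubectl logs svc/rehydrate -n openfaas-fn | suspicious_log_lines`.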
Self-hosted Rehydration Doesn’t Start
If you are unable to create a rehydration in a self-hosted environment, check whether the rehydrate pod is up and running.
For OpenFaaS: kubectl get pod -n openfaas-fn
For non-OpenFaaS: kubectl get pod -n edgedelta-rehydration
If the pod with the name prefix rehydrate- (or rehydration-handler for non-OpenFaaS) shows 0/x in the READY column, the rehydrate pod was unable to initialize. Use one of the following commands, depending on whether you are using OpenFaaS, to show why the pod cannot come up:
kubectl describe pod {pod_name} -n openfaas-fn
or
kubectl describe pod {pod_name} -n edgedelta-rehydration
This is usually caused by insufficient resources on the nodes to run the pod.
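The READY check described above can be scripted. This is a minimal sketch that assumes the default `kubectl get pod` table layout, with the pod name in the first column and READY in the second. The function name `not_ready_rehydrate_pods` is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pod` table output on stdin and
# prints the names of rehydrate pods that have zero ready containers.
not_ready_rehydrate_pods() {
  awk '
    $1 == "NAME" { next }                       # skip the header row
    $1 ~ /^rehydrate-/ || $1 ~ /^rehydration-handler/ {
      split($2, ready, "/")                     # READY looks like 0/1
      if (ready[1] == "0") print $1
    }'
}
```

For example, `kubectl get pod -n openfaas-fn | not_ready_rehydrate_pods` lists the pods to pass to `kubectl describe pod`.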
New Self-Hosted Rehydrations not Detected
If the poller or rehydration function in a self-hosted environment does not pick up new rehydrations, multiple pollers may be running in different clusters. Whichever poller polls the SaaS backend first after a new rehydration is created picks it up and invokes it on its connected rehydration function handler. As a result, the poller or rehydrator logs you are tailing may show no processing activity even though the rehydration status shows progress or an error state. Running multiple pollers is not supported, and it can also lead to the same rehydration being invoked concurrently across different rehydration handlers.