Define an In-Cluster Destination for Edge Delta

Installing the Edge Delta Pipeline with an in-cluster destination.

Overview

You might need to install Edge Delta with a special configuration to suit your environment, for example when a data destination runs inside the same Kubernetes cluster as the Pipeline.

In-Cluster Data Destinations

In-cluster data destinations can be set up for various reasons, each with distinct advantages and disadvantages.

One major advantage of in-cluster data destinations is reduced network latency. Keeping data within the same Kubernetes cluster can significantly lower latency compared to transmitting data outside the cluster, which is particularly beneficial for services requiring quick response times. There is also a security benefit: data that stays in-cluster is less exposed to vulnerabilities that could arise from external data transmission.

Cost efficiency is another advantage, as in-cluster destinations may reduce data transfer costs, especially since many cloud providers charge for data leaving the cloud environment or crossing regional boundaries. The configuration can also be simpler; using in-cluster service endpoints allows for streamlined configuration through Kubernetes-native DNS resolution, which the platform manages. Moreover, Kubernetes environments inherently support resilience and high availability. In-cluster resources can utilize these capabilities, potentially leading to decreased downtime.

On the other hand, there are notable disadvantages to consider. Scalability may be limited by the cluster’s capacity, and expanding a cluster to meet increased demand can be more complex compared to employing managed services designed for high scalability. There’s also a management overhead since managing in-cluster resources requires more operational focus than using providers’ managed services, which handle some of the complexities internally.

Resource contention is another potential downside, as in-cluster destinations might compete for resources with other applications running in the cluster, impacting overall performance. From a disaster recovery perspective, if all components, including data destinations, reside within a single cluster, issues affecting the cluster could disrupt both application functionality and data access. Furthermore, managing persistent storage within a cluster can be challenging and might require additional tools and configuration that a managed service would otherwise handle for you.

Prepare for an In-Cluster Data Destination

If you need to configure an output destination that resides within your Kubernetes cluster, you must set a resolvable service endpoint in your Pipeline configuration. In-cluster service addresses follow the Kubernetes DNS pattern service.namespace.svc.cluster-domain, followed by the service port. For example, if your cluster-domain.example cluster has an Elasticsearch service named elasticsearch-master in the elasticsearch namespace listening on port 9200, then you need to specify the Elastic output address as http://elasticsearch-master.elasticsearch.svc.cluster-domain.example:9200. To learn more, see this article.
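
As a rough illustration of how such an address is assembled, the following Python sketch builds the endpoint from its parts. The helper name in_cluster_url is hypothetical and not part of Edge Delta or Kubernetes. The default cluster domain of cluster.local is an assumption based on the common Kubernetes default, so check the value your cluster actually uses.

```python
def in_cluster_url(service, namespace, port,
                   cluster_domain="cluster.local", scheme="http"):
    """Build the fully qualified in-cluster URL for a Kubernetes Service.

    Hypothetical helper for illustration only. The address follows the
    Kubernetes DNS pattern <service>.<namespace>.svc.<cluster-domain>.
    The cluster_domain default is an assumption; clusters can override it.
    """
    return f"{scheme}://{service}.{namespace}.svc.{cluster_domain}:{port}"


# Values taken from the example in the text above.
address = in_cluster_url(
    service="elasticsearch-master",
    namespace="elasticsearch",
    port=9200,
    cluster_domain="cluster-domain.example",
)
print(address)
# prints http://elasticsearch-master.elasticsearch.svc.cluster-domain.example:9200
```

The resulting string is the value you would enter as the output address in the Pipeline configuration. Using the fully qualified name rather than the short service name avoids depending on the DNS search path of the namespace that the Edge Delta agent runs in.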