Troubleshooting Splunk Integration

Comprehensive troubleshooting guide for all Edge Delta Splunk integrations including TCP, HEC, and source nodes.

Overview

This guide covers troubleshooting for the Edge Delta Splunk integrations: the Splunk TCP (S2S) destination, the Splunk HEC destination, the Splunk TCP source, and the Splunk HEC source.

Quick Diagnostic Checklist

Before diving into specific issues, verify:

For All Integrations:

  • Correct node type for your use case (source vs destination)
  • Network connectivity between Edge Delta and Splunk/forwarders
  • Firewall rules allowing required ports
  • Edge Delta agent is running and pipeline is deployed

For Destinations (Sending TO Splunk):

  • Splunk service is running and accessible
  • Proper authentication (tokens for HEC, certificates for TCP)
  • Target index exists in Splunk
  • Network path from Edge Delta to Splunk is open

For Sources (Receiving FROM Splunk):

  • Edge Delta source node is configured and listening on correct port
  • Linux port binding permissions granted (Ubuntu 24.04+)
  • Splunk forwarder has enableOldS2SProtocol = true (for TCP)
  • Forwarder protocol settings match Edge Delta requirements
  • Network path from forwarders to Edge Delta is open

Connection Issues

Splunk TCP Destination (S2S)

Symptoms

  • “Connection refused” errors on port 9997
  • “Unable to establish TCP connection”
  • Intermittent connection drops

Diagnosis and Solutions

1. Test Network Connectivity

Test the connection to your Splunk indexer using standard networking tools:

telnet splunk-indexer.example.com 9997
nc -zv splunk-indexer.example.com 9997
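Where telnet or nc is not installed, the same reachability check can be sketched with the Python standard library. The function name can_connect is illustrative and not part of Edge Delta or Splunk. A successful connect only shows that the port is reachable, not that the S2S or TLS negotiation will succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (placeholder host from this guide):
# can_connect("splunk-indexer.example.com", 9997)
```

A False result points back to the checklist items on firewalls and network paths, while True means the problem is more likely in Splunk or TLS configuration.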

2. Verify Splunk Configuration

On the Splunk indexer, check the inputs.conf file to ensure the TCP input is properly configured:

grep -A 5 "splunktcp" "$SPLUNK_HOME/etc/system/local/inputs.conf"

The expected configuration should show:

[splunktcp://9997]
disabled = 0

3. Check TLS Configuration

Ensure your Edge Delta TLS configuration is correct. Verify that certificate files exist, have proper permissions, and are not expired:

nodes:
- name: splunk_tcp
  type: splunk_tcp_output
  host: splunk.example.com
  port: 9997
  tls:
    enabled: true
    ca_file: /path/to/ca.pem
    crt_file: /path/to/client.pem
    key_file: /path/to/client.key

4. Validate Certificate

Check certificate expiration dates:

openssl x509 -in /path/to/client.pem -noout -dates

Verify the certificate chain is valid:

openssl verify -CAfile /path/to/ca.pem /path/to/client.pem
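To turn the expiration date into a number that a monitoring job can alert on, the following sketch converts the notAfter line printed by openssl into days remaining. The function name days_until_expiry is hypothetical, and the input format assumed here is the one produced by openssl x509 -noout -enddate.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(enddate_line: str, now: Optional[float] = None) -> int:
    """Convert an `openssl x509 -noout -enddate` line to whole days remaining.

    Expects input such as 'notAfter=Jan  5 09:34:43 2018 GMT'.
    A negative result means the certificate has already expired.
    """
    # Drop the 'notAfter=' prefix if it is present.
    cert_time = enddate_line.strip().split("=", 1)[-1]
    expires_at = ssl.cert_time_to_seconds(cert_time)
    current = time.time() if now is None else now
    return int((expires_at - current) // 86400)
```

Feeding it the output of openssl x509 -in /path/to/client.pem -noout -enddate gives a value you can compare against a renewal threshold of your choosing.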

Splunk HEC Output

Symptoms

  • HTTP 400/401/403 errors
  • “Invalid token” messages
  • SSL/TLS handshake failures

Diagnosis and Solutions

1. Test HEC Endpoint

Test HEC connectivity by checking the health endpoint (replace with your values):

curl -k https://splunk-hec.example.com:8088/services/collector/health

Test data submission with your HEC token:

curl -k -H "Authorization: Splunk YOUR-TOKEN" \
  https://splunk-hec.example.com:8088/services/collector \
  -d '{"event": "test"}'
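The curl submission above can also be expressed as a short Python sketch, which is convenient when the test needs to run from a host without curl. The helper name build_hec_request is an assumption for illustration; the URI and token are the placeholders used in this guide. The function only builds the request so that the header and body can be inspected before anything is sent.

```python
import json
import urllib.request


def build_hec_request(hec_uri: str, token: str, event: object) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Splunk HEC endpoint."""
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        hec_uri,
        data=body,
        method="POST",
        headers={
            # HEC expects the literal scheme "Splunk" followed by the token.
            "Authorization": f"Splunk {token.strip()}",
            "Content-Type": "application/json",
        },
    )


# To send it, pass the request to urllib.request.urlopen(). Unlike `curl -k`,
# urlopen verifies the server certificate by default.
# req = build_hec_request(
#     "https://splunk-hec.example.com:8088/services/collector", "YOUR-TOKEN", "test")
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode())
```

Because certificate verification stays on, a TLS failure here that does not occur with curl -k suggests a certificate trust problem rather than a token problem.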

2. Verify Token Configuration

Ensure your Edge Delta configuration has a valid HEC token:

nodes:
- name: splunk_hec
  type: splunk_output
  hec_uri: https://splunk-hec.example.com:8088/services/collector
  token: your-hec-token

3. Check Splunk HEC Settings

  • Navigate to Settings → Data Inputs → HTTP Event Collector
  • Verify token is enabled and not expired
  • Check source type and index settings
  • Ensure “Enable SSL” matches your configuration

Splunk TCP Source (Receiving from Forwarders)

Symptoms

  • Edge Delta not receiving data from Universal Forwarders
  • “Bind: address already in use” errors
  • “Connection refused” errors from forwarders
  • “unexpected EOF” or “failed to read signature” errors in Edge Delta logs
  • Authentication failures
  • Port binding permission errors

Diagnosis and Solutions

1. Verify Port Availability

Check if the required port is available and not in use by another service:

sudo netstat -tulpn | grep 9997
sudo lsof -i :9997

If the port shows as not bound or “Connection refused” appears despite the port being configured, proceed to step 5 for Linux permission issues.
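To tell a port conflict apart from a permission problem, a bind attempt can be scripted. This is a minimal sketch; check_bind is a hypothetical helper, and the result is only meaningful when it runs as the same user as the Edge Delta agent and while the agent is stopped (otherwise "in_use" is the expected answer).

```python
import errno
import socket


def check_bind(port: int, host: str = "0.0.0.0") -> str:
    """Try to bind host:port and report 'free', 'in_use', or 'permission_denied'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return "free"
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return "in_use"
        if exc.errno in (errno.EACCES, errno.EPERM):
            return "permission_denied"
        raise
    finally:
        # Always release the socket so the check does not hold the port.
        sock.close()
```

For example, check_bind(9997) returning "permission_denied" points to the Linux capability fix in step 5, while "in_use" points to a conflicting service.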

2. Verify Edge Delta is Listening

Check Edge Delta logs to confirm the TCP source started successfully:

tail -f /var/log/edgedelta/edgedelta.log | grep -i "splunk_tcp"

Look for messages indicating the listener started on the configured port.

3. Configure Universal Forwarder Output with Required Protocol Settings

On the Universal Forwarder, verify the outputs.conf configuration includes all required settings:

cat $SPLUNK_HOME/etc/system/local/outputs.conf

The configuration must include these settings for Edge Delta compatibility:

# REQUIRED configuration for Edge Delta
[tcpout]
defaultGroup = edge_delta
disabled = false
# CRITICAL: Enable legacy S2S protocol
enableOldS2SProtocol = true

[tcpout:edge_delta]
server = edge-delta-agent.example.com:9997
useACK = false
# Send cooked data (parsed events)
sendCookedData = true
# Use protocol level 0 for Edge Delta
negotiateProtocolLevel = 0
# Disable compression
compressed = false

Critical: The enableOldS2SProtocol = true setting is mandatory. Edge Delta uses S2S protocol level 0, which Universal Forwarders reject by default, so without this setting the forwarder will not connect to Edge Delta.
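When many forwarders need checking, the required settings can be verified with a script instead of by eye. The sketch below parses outputs.conf text with Python's configparser, which handles simple files like the example above but is not a full Splunk .conf parser. The helper name missing_settings and the REQUIRED table are assumptions for illustration, and the stanza name tcpout:edge_delta is taken from this guide's example group name.

```python
import configparser

# Settings this guide lists as required, keyed by stanza name.
REQUIRED = {
    "tcpout": {"enableOldS2SProtocol": "true"},
    "tcpout:edge_delta": {
        "sendCookedData": "true",
        "negotiateProtocolLevel": "0",
    },
}


def missing_settings(conf_text: str, required: dict = REQUIRED) -> list:
    """Return 'stanza/key' entries that are absent or have a different value."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep Splunk's case-sensitive setting names
    parser.read_string(conf_text)
    problems = []
    for stanza, settings in required.items():
        for key, expected in settings.items():
            actual = parser.get(stanza, key, fallback=None)
            if actual is None or actual.strip().lower() != expected:
                problems.append(f"{stanza}/{key}")
    return problems
```

An empty list means the three protocol settings are present with the values shown in this guide. The comparison is deliberately strict, so an equivalent spelling of a boolean would be flagged for manual review.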

After modifying outputs.conf, restart the Splunk Universal Forwarder:

$SPLUNK_HOME/bin/splunk restart

4. Edge Delta Configuration

Configure Edge Delta to listen on all interfaces for incoming Splunk forwarder connections:

nodes:
- name: splunk_tcp_input
  type: splunk_tcp_input
  listen: 0.0.0.0
  port: 9997
  max_connections: 200

5. Linux Port Binding Permissions

On some Linux distributions (particularly Ubuntu 24.04 and newer), the Edge Delta agent may lack permission to bind to ports, even above the privileged range (1024+). This manifests as:

  • “Connection refused” errors in Splunk forwarder logs
  • “address already in use” or “bind: permission denied” in Edge Delta logs
  • Port not showing as bound in netstat output

Solution: Grant the Edge Delta agent the network binding capability:

# Find the Edge Delta agent binary path
which edgedelta

# Grant network binding capability (more secure than running as root)
sudo setcap 'cap_net_bind_service=+ep' /path/to/edgedelta

# Restart the Edge Delta agent
sudo systemctl restart edgedelta

# Verify the port is now bound
sudo netstat -tulpn | grep 9997

Note: If the edgedelta binary is updated or reinstalled, you may need to reapply this capability.

6. Test Connection from Forwarder

From the Universal Forwarder host, test network connectivity to Edge Delta:

telnet edge-delta-agent.example.com 9997
# or
nc -zv edge-delta-agent.example.com 9997

If this fails, check:

  • Firewall rules on Edge Delta host
  • Network security groups (cloud environments)
  • Network routing between hosts

7. Verify Forwarder is Sending Data

Check the Splunk Universal Forwarder logs for errors:

tail -f $SPLUNK_HOME/var/log/splunk/splunkd.log | grep -i "tcpout\|edge_delta"

Look for:

  • Connection errors
  • Protocol negotiation failures
  • “Applying quarantine” messages (indicates repeated failures)

8. Check for Protocol Compatibility Errors

The following errors indicate that the forwarder is not using the correct protocol settings:

  • failed to read signature from: <ip>:<port>, err: unexpected EOF
  • failed to parse protocol
  • protocol negotiation failed

If any of them appear, verify:

  1. enableOldS2SProtocol = true is in the [tcpout] section
  2. negotiateProtocolLevel = 0 is in the target group stanza
  3. sendCookedData = true is set
  4. The forwarder has been restarted after configuration changes

Splunk HEC Source (Receiving via HEC Protocol)

Symptoms

  • HTTP clients cannot connect to Edge Delta HEC endpoint
  • Invalid token errors
  • Port binding issues
  • Data not appearing after successful HTTP requests

Diagnosis and Solutions

1. Verify Edge Delta HEC Source Configuration

Ensure the HEC source is properly configured in your Edge Delta pipeline:

nodes:
- name: splunk_hec_receiver
  type: splunk_hec_input
  port: 8088
  token: your-secure-token

2. Test HEC Endpoint Connectivity

Test the HEC endpoint from a client:

# Test basic connectivity
curl -k http://<edge-delta-host>:8088/services/collector/health

# Test with token authentication
curl -k -H "Authorization: Splunk <your-token>" \
  http://<edge-delta-host>:8088/services/collector \
  -d '{"event": "test message"}'

3. Verify Token Configuration

Ensure the token in client requests matches the Edge Delta configuration:

  • Check for leading/trailing whitespace in token
  • Verify the token is not wrapped in stray quotation marks
  • Confirm the Authorization header format: Authorization: Splunk <token>
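The checks in this list can be sketched as a small validator for the header value a client is about to send. The function name extract_hec_token is hypothetical, and treating the scheme as case-sensitive is an assumption made here for strictness, not documented Edge Delta behavior.

```python
def extract_hec_token(header_value: str) -> str:
    """Return the token from a 'Splunk <token>' Authorization header value.

    Raises ValueError when the scheme is wrong or the token looks malformed.
    """
    scheme, _, token = header_value.strip().partition(" ")
    if scheme != "Splunk":
        raise ValueError(f"expected scheme 'Splunk', got {scheme!r}")
    token = token.strip()
    if not token:
        raise ValueError("token is empty")
    if token[0] in "\"'" or token[-1] in "\"'":
        raise ValueError("token is wrapped in quotes")
    if any(ch.isspace() for ch in token):
        raise ValueError("token contains whitespace")
    return token
```

Comparing the returned value with the token in the Edge Delta configuration makes hidden whitespace or quoting differences visible.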

4. Check Port Availability

Similar to TCP source, the HEC source may have port binding issues on certain Linux distributions:

sudo netstat -tulpn | grep 8088

If the port is not bound, apply the same network capability fix as TCP sources:

sudo setcap 'cap_net_bind_service=+ep' /path/to/edgedelta
sudo systemctl restart edgedelta

5. Verify Firewall Rules

Ensure the firewall allows incoming HTTP/HTTPS traffic on the HEC port:

# For firewalld
sudo firewall-cmd --list-ports
sudo firewall-cmd --permanent --add-port=8088/tcp
sudo firewall-cmd --reload

# For ufw
sudo ufw status
sudo ufw allow 8088/tcp

6. Check Edge Delta Logs

Monitor logs for HEC-specific errors:

tail -f /var/log/edgedelta/edgedelta.log | grep -i "hec\|splunk_hec"

7. Validate Data Format

Ensure the data sent to HEC is properly formatted. Valid HEC event format:

{
  "event": "Your log message or structured data",
  "time": 1234567890,
  "host": "hostname",
  "source": "source_name",
  "sourcetype": "sourcetype_name",
  "index": "index_name"
}
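Before sending, a client can run a quick structural check on each payload. The following is a minimal sketch based only on the fields shown in the example above; hec_event_problems is a hypothetical helper and does not reproduce every rule that Splunk or Edge Delta applies.

```python
import json

# Metadata keys from the example above that are expected to be strings.
STRING_KEYS = ("host", "source", "sourcetype", "index")


def hec_event_problems(payload: str) -> list:
    """Return a list of problems found in a single HEC event payload."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    if "event" not in data or data["event"] in ("", None):
        problems.append('missing or empty "event"')
    if "time" in data:
        t = data["time"]
        try:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(t, bool):
                raise ValueError("boolean is not a timestamp")
            float(t)
        except (TypeError, ValueError):
            problems.append('"time" should be epoch seconds')
    for key in STRING_KEYS:
        if key in data and not isinstance(data[key], str):
            problems.append(f'"{key}" should be a string')
    return problems
```

An empty list means the payload passed these basic checks; anything else names the field to fix before retrying the request.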

Data Flow Issues

Data Not Appearing in Splunk

Common Causes and Solutions

1. Index Configuration

Verify the index exists in Splunk:

index=your_index | head 1

List the available indexes to confirm the index name:

| rest /services/data/indexes | table title

2. Source Type Mismatch

Ensure the source type in your Edge Delta configuration matches Splunk’s expectations:

nodes:
- name: splunk_output
  type: splunk_output
  index: main
  source_type: _json

3. Time Zone and Timestamp Issues

  • Verify timestamp format matches Splunk’s expectations
  • Check time zone settings on both Edge Delta and Splunk
  • Use timestamp extraction if needed
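A common source of time zone drift is converting a local timestamp to epoch seconds without its UTC offset. The sketch below shows one way to normalize ISO 8601 timestamps before they are sent; to_epoch_seconds is an illustrative helper, and treating offset-less timestamps as UTC is an assumption you may need to change.

```python
from datetime import datetime, timezone


def to_epoch_seconds(iso_timestamp: str) -> float:
    """Convert an ISO 8601 timestamp to Unix epoch seconds.

    Timestamps without a UTC offset are treated as UTC here; that default is
    an assumption, so adjust it if your sources log in local time.
    """
    parsed = datetime.fromisoformat(iso_timestamp)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()
```

Two timestamps that describe the same instant in different zones, such as 2009-02-13T23:31:30+00:00 and 2009-02-14T00:31:30+01:00, produce the same epoch value, which is what Splunk needs to place the event correctly.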

4. Data Format Problems

For JSON data, ensure proper formatting. Edge Delta will automatically format the data for Splunk:

nodes:
- name: splunk_tcp
  type: splunk_tcp_output
  host: splunk.example.com
  port: 9997
  index: json_index

Partial Data or Missing Fields

1. Field Extraction Issues

  • Check Splunk props.conf for field extraction rules
  • Verify JSON structure if using structured data
  • Test with sample data in Splunk’s search interface

2. Data Truncation

Increase buffer sizes if you’re experiencing truncation with large events:

nodes:
- name: splunk_tcp
  type: splunk_tcp_output
  buffer_max_bytesize: "500MB"

Performance Optimization

Slow Data Transmission

TCP Destination Optimization

Optimize TCP destination performance by using a load balancer, increasing worker count, extending retry periods, and enlarging buffers:

nodes:
- name: high_performance_splunk
  type: splunk_tcp_output
  host: splunk-lb.example.com
  port: 9997
  parallel_worker_count: 20
  buffer_ttl: "1h"
  buffer_max_bytesize: "1GB"

HEC Output Optimization

Optimize HEC output by batching events and adjusting timeouts based on your network conditions:

nodes:
- name: optimized_hec
  type: splunk_output
  hec_uri: https://splunk-hec.example.com:8088/services/collector
  token: your-token
  parallel_worker_count: 15
  batch_size: 1000
  timeout: "30s"
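To illustrate why batching reduces request overhead, the sketch below groups events into HEC request bodies. Splunk HEC accepts several event objects stacked back to back in one request body. The helper hec_batches is hypothetical; it shows the general idea only and is not Edge Delta's internal implementation, and it assumes the batch size counts events.

```python
import json
from typing import Iterable, Iterator, List


def hec_batches(events: Iterable[object], batch_size: int) -> Iterator[bytes]:
    """Yield request bodies that each carry up to batch_size HEC events."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending: List[str] = []
    for event in events:
        pending.append(json.dumps({"event": event}))
        if len(pending) == batch_size:
            yield "".join(pending).encode("utf-8")
            pending = []
    if pending:
        # Flush the final, possibly smaller, batch.
        yield "".join(pending).encode("utf-8")
```

With a batch size of 1000, one thousand events travel in a single HTTP request instead of one thousand, at the cost of a larger body and a longer wait before the first event is sent.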

Resource Usage Issues

1. Monitor Edge Delta Agent Resources

Check CPU and memory usage of the Edge Delta agent:

top -p "$(pgrep -d, edgedelta)"
ps aux | grep '[e]dgedelta'

2. Adjust Worker Counts

Balance worker counts with available resources. Start conservative and increase gradually based on performance metrics:

parallel_worker_count: 10

3. Implement Buffering Strategy

buffer_path: "/var/log/edgedelta/buffer"
buffer_max_bytesize: "500MB"
buffer_ttl: "30m"
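A rough way to size these values is to estimate how long a Splunk outage the buffer can absorb at your ingest rate. The sketch below is back-of-the-envelope arithmetic, not Edge Delta behavior: the helper names are hypothetical, decimal units (1MB = 1,000,000 bytes) are an assumption, and it assumes buffered data is kept no longer than buffer_ttl.

```python
import re

# Decimal units are an assumption; check how your agent version interprets sizes.
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9}
DURATIONS = {"s": 1, "m": 60, "h": 3600}


def parse_size(text: str) -> int:
    """Convert a size such as '500MB' to bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)\s*", text, flags=re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2).upper()]


def parse_duration(text: str) -> int:
    """Convert a duration such as '30m' to seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([smh])\s*", text)
    if not match:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * DURATIONS[match.group(2)]


def outage_coverage_seconds(max_bytesize: str, ttl: str, bytes_per_second: float) -> float:
    """Estimate how long an outage the buffer can absorb at a steady rate."""
    fill_time = parse_size(max_bytesize) / bytes_per_second
    # Assumed: data older than the TTL is not retained, so the TTL caps coverage.
    return min(fill_time, parse_duration(ttl))
```

With the values above and a steady 1,000,000 bytes per second, the buffer fills in 500 seconds, well before the 30 minute TTL, so the size is the limiting factor; at 100,000 bytes per second the TTL becomes the limit instead.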

Migration Issues

Universal Forwarder to Edge Delta

Common Migration Problems

1. Data Duplication During Transition

  • Use different indexes during migration
  • Implement phased rollout
  • Monitor for duplicate events

2. Configuration Differences

Map Universal Forwarder settings to Edge Delta configuration. The UF outputs.conf translates to Edge Delta’s splunk_tcp_output:

nodes:
- name: uf_replacement
  type: splunk_tcp_output
  host: same-as-uf-target.com
  port: 9997
  index: same_index

3. Authentication Migration

  • Convert from Splunk certificates to Edge Delta TLS config
  • Update firewall rules for new source IPs
  • Test authentication before full migration

Error Messages Reference

Error Message | Likely Cause | Solution
“Connection refused” | Service not running, wrong port, or permission issue | Verify service status, port config, and Linux capabilities (setcap)
“Invalid HEC token” | Token expired or incorrect | Regenerate token in Splunk HEC settings or Edge Delta config
“SSL certificate problem” | Certificate mismatch or expired | Update certificates, check CA chain
“Index not found” | Index doesn’t exist or no permissions | Create index or adjust permissions
“Timeout waiting for response” | Network latency or Splunk overloaded | Increase timeout, check Splunk performance
“Address already in use” | Port conflict or permission denied | Change port, stop conflicting service, or grant cap_net_bind_service
“Authentication failed” | Wrong credentials or method | Verify authentication configuration
“Buffer overflow” | Data volume exceeds buffer | Increase buffer_max_bytesize
“unexpected EOF” | S2S protocol mismatch | Add enableOldS2SProtocol=true to forwarder config
“failed to read signature” | S2S protocol level incompatible | Set negotiateProtocolLevel=0 and sendCookedData=true
“bind: permission denied” | Linux capability missing | Run: sudo setcap 'cap_net_bind_service=+ep' /path/to/edgedelta
“Applying quarantine to ip” | Repeated connection failures from forwarder | Fix forwarder config (protocol settings) and restart forwarder
“failed to start tailer” | TLS misconfiguration on source | Check TLS cert/key paths and validity
“Invalid event format” (HEC) | Malformed JSON in HEC request | Validate JSON format matches HEC specification
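The lookup in the table above can be applied to a log file automatically. This is a minimal sketch covering a handful of the entries; KNOWN_ERRORS and suggest_fixes are illustrative names, matching is by case-insensitive substring, and only the first matching entry is reported for each line.

```python
# Substring -> suggested fix, condensed from the error reference table.
KNOWN_ERRORS = {
    "unexpected eof": "Add enableOldS2SProtocol=true to the forwarder config",
    "failed to read signature": "Set negotiateProtocolLevel=0 and sendCookedData=true",
    "bind: permission denied": "Grant cap_net_bind_service to the edgedelta binary",
    "address already in use": "Change port, stop the conflicting service, or grant cap_net_bind_service",
    "connection refused": "Verify service status, port config, and Linux capabilities",
    "applying quarantine": "Fix forwarder protocol settings and restart the forwarder",
}


def suggest_fixes(log_lines):
    """Yield (line, suggestion) for each log line that matches a known error."""
    for line in log_lines:
        lowered = line.lower()
        for needle, fix in KNOWN_ERRORS.items():
            if needle in lowered:
                yield line.rstrip("\n"), fix
                break
```

Passing an open file handle for /var/log/edgedelta/edgedelta.log as log_lines yields one suggestion per matching line, which is a quick first pass before reading the logs in detail.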

Advanced Debugging

Enable Debug Logging

1. Edge Delta Agent Debug Mode

agent:
  log_level: debug

2. Monitor Logs

Watch Edge Delta logs for Splunk-related messages:

tail -f /var/log/edgedelta/edgedelta.log | grep -i splunk

Filter logs for errors and failures:

grep -i "error\|fail\|refuse" /var/log/edgedelta/edgedelta.log

Packet Capture for Network Issues

Capture network traffic to Splunk for detailed analysis:

sudo tcpdump -i any -w splunk_traffic.pcap host splunk.example.com and port 9997

Analyze the captured traffic with Wireshark or tcpdump:

tcpdump -r splunk_traffic.pcap -nn

Test Data Flow

Testing Outbound Connections (Destinations)

Create a test pipeline to verify data flow to Splunk:

nodes:
- name: test_generator
  type: memory_input

- name: test_splunk
  type: splunk_tcp_output
  host: splunk.example.com
  port: 9997
  index: test_index

Testing Inbound Connections (Sources)

For Splunk TCP Source, test with a minimal forwarder configuration:

# From the forwarder, check if Edge Delta is listening
telnet <edge-delta-host> 9997

# Check forwarder logs for connection attempts
tail -f $SPLUNK_HOME/var/log/splunk/splunkd.log | grep "tcpout\|edge_delta"

For Splunk HEC Source, test with curl:

# Test health endpoint
curl -k http://<edge-delta-host>:8088/services/collector/health

# Send test event
curl -k -H "Authorization: Splunk <token>" \
  http://<edge-delta-host>:8088/services/collector \
  -d '{"event": "test from curl", "sourcetype": "manual"}'

Use Edge Delta’s Live Capture feature to verify incoming data:

  1. Navigate to the Edge Delta web interface
  2. Select your agent
  3. Open Live Capture
  4. Send test data and verify it appears in the capture

Best Practices

Connection Management

  1. Use connection pooling appropriately
  2. Implement retry logic with backoff
  3. Monitor connection health metrics
  4. Use load balancers for high availability

Data Integrity

  1. Enable buffering for reliability
  2. Monitor for data loss indicators
  3. Implement checksums if needed
  4. Validate data in Splunk regularly

Security

  1. Always use TLS/SSL in production
  2. Rotate tokens and certificates regularly
  3. Restrict network access with firewalls
  4. Audit authentication logs

Performance

  1. Start with conservative worker counts
  2. Monitor and adjust based on metrics
  3. Use appropriate batch sizes
  4. Consider data volume and velocity

Getting Help

If issues persist:

  1. Collect Diagnostic Information:

    • Edge Delta configuration (sanitized)
    • Error messages from logs
    • Splunk version and configuration
    • Network topology diagram
  2. Check Splunk Logs:

    Query Splunk’s internal logs for errors:

    index=_internal source=*splunkd.log* ERROR
    
  3. Contact Support with:

    • Diagnostic information
    • Steps to reproduce
    • Expected vs actual behavior
    • Any workarounds attempted
