CMCD Pack

This is a Common Media Client Data (CMCD) pack that processes log data to extract and report metrics based on CMCD filters.

Edge Delta Pipeline Pack for CMCD

Overview

The Edge Delta CMCD pack processes log data to extract and report metrics based on Common Media Client Data (CMCD) parameters. It includes parsing, routing, and metrics conversion tailored for media content delivery.

This pack assumes that incoming data is tab-delimited (\t).

Pack Description

1. Data Ingestion

The data flow starts at the compound_input node, which is the entry point into the pack for incoming logs.

2. Field Extraction

Logs move to a Grok node. This node uses a pattern to extract fields such as timestamp, c-ip, time-to-first-byte, and various CMCD fields, structuring them as attributes.

  - name: grok
    type: grok
    custom_pattern: ^%{NUMBER:timestamp}\t%{NOTSPACE:c-ip}\t%{NOTSPACE:time-to-first-byte}\t%{NOTSPACE:sc-status}\t%{NOTSPACE:sc-bytes}\t%{NOTSPACE:cs-method}\t%{NOTSPACE:cs-protocol}\t%{NOTSPACE:cs-host}\t%{NOTSPACE:cs-uri-stem}\t%{NOTSPACE:cs-bytes}\t%{NOTSPACE:x-edge-location}\t%{NOTSPACE:x-edge-request-id}\t%{NOTSPACE:x-host-header}\t%{NOTSPACE:time-taken}\t%{NOTSPACE:cs-protocol-version}\t%{NOTSPACE:c-ip-version}\t%{NOTSPACE:cs-user-agent}\t%{NOTSPACE:cs-referer}\t%{NOTSPACE:cs-cookie}\t%{NOTSPACE:cs-uri-query}\t%{NOTSPACE:x-edge-response-result-type}\t%{NOTSPACE:x-forwarded-for}\t%{NOTSPACE:ssl-protocol}\t%{NOTSPACE:ssl-cipher}\t%{NOTSPACE:x-edge-result-type}\t%{NOTSPACE:fle-encrypted-fields}\t%{NOTSPACE:fle-status}\t%{NOTSPACE:sc-content-type}\t%{NOTSPACE:sc-content-len}\t%{NOTSPACE:sc-range-start}\t%{NOTSPACE:sc-range-end}\t%{NOTSPACE:c-port}\t%{NOTSPACE:x-edge-detailed-result-type}\t%{NOTSPACE:c-country}\t%{NOTSPACE:cs-accept-encoding}\t%{NOTSPACE:cs-accept}\t%{NOTSPACE:cache-behavior-path-pattern}\t%{NOTSPACE:cs-headers}\t%{NOTSPACE:cs-header-names}\t%{NOTSPACE:cs-headers-count}\t%{NOTSPACE:primary-distribution-id}\t%{NOTSPACE:primary-distribution-dns-name}\t%{NOTSPACE:origin-fbl}\t%{NOTSPACE:origin-lbl}\t%{NOTSPACE:asn}\t%{NOTSPACE:cmcd-encoded-bitrate}\t%{NOTSPACE:cmcd-buffer-length}\t%{NOTSPACE:cmcd-buffer-starvation}\t%{NOTSPACE:cmcd-content-id}\t%{NOTSPACE:cmcd-object-duration}\t%{NOTSPACE:cmcd-deadline}\t%{NOTSPACE:cmcd-measured-throughput}\t%{NOTSPACE:cmcd-next-object-request}\t%{NOTSPACE:cmcd-next-range-request}\t%{NOTSPACE:cmcd-object-type}\t%{NOTSPACE:cmcd-playback-rate}\t%{NOTSPACE:cmcd-requested-maximum-throughput}\t%{NOTSPACE:cmcd-streaming-format}\t%{NOTSPACE:cmcd-session-id}\t%{NOTSPACE:cmcd-stream-type}\t%{NOTSPACE:cmcd-startup}\t%{NOTSPACE:cmcd-top-bitrate}\t%{NOTSPACE:cmcd-version}
  

Structuring the extracted data makes it easier to query and monitor specific attributes related to media streaming, such as buffer length and session information.
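As a rough illustration of what the extraction produces, the Python sketch below splits one tab-delimited line into a dictionary keyed by the field names from the Grok pattern above. It is not how the Grok node is implemented: parse_line is a hypothetical helper, the sample record is synthetic, and the sketch only checks the field count rather than validating each value the way NUMBER and NOTSPACE would.

```python
# Field names in the same order as the pack's Grok pattern.
FIELD_NAMES = """
timestamp c-ip time-to-first-byte sc-status sc-bytes cs-method cs-protocol
cs-host cs-uri-stem cs-bytes x-edge-location x-edge-request-id x-host-header
time-taken cs-protocol-version c-ip-version cs-user-agent cs-referer cs-cookie
cs-uri-query x-edge-response-result-type x-forwarded-for ssl-protocol
ssl-cipher x-edge-result-type fle-encrypted-fields fle-status sc-content-type
sc-content-len sc-range-start sc-range-end c-port x-edge-detailed-result-type
c-country cs-accept-encoding cs-accept cache-behavior-path-pattern cs-headers
cs-header-names cs-headers-count primary-distribution-id
primary-distribution-dns-name origin-fbl origin-lbl asn cmcd-encoded-bitrate
cmcd-buffer-length cmcd-buffer-starvation cmcd-content-id cmcd-object-duration
cmcd-deadline cmcd-measured-throughput cmcd-next-object-request
cmcd-next-range-request cmcd-object-type cmcd-playback-rate
cmcd-requested-maximum-throughput cmcd-streaming-format cmcd-session-id
cmcd-stream-type cmcd-startup cmcd-top-bitrate cmcd-version
""".split()


def parse_line(line):
    """Split one tab-delimited log line into a dict of attributes.

    Returns None when the field count does not match, which loosely mirrors
    a Grok pattern that fails to match.
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELD_NAMES):
        return None
    return dict(zip(FIELD_NAMES, values))


# Synthetic record: every field is the "-" placeholder except a few values.
sample = ["-"] * len(FIELD_NAMES)
sample[FIELD_NAMES.index("timestamp")] = "1730040399.707"
sample[FIELD_NAMES.index("cmcd-buffer-length")] = "4700"
sample[FIELD_NAMES.index("cmcd-streaming-format")] = "h"

attributes = parse_line("\t".join(sample))
print(attributes["cmcd-buffer-length"], attributes["cmcd-streaming-format"])
```

A line with the wrong number of fields yields None here, which is a convenient way to spot input that is not in the expected tab-delimited layout before it reaches the pack.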

3. CMCD Evaluation

Logs are then processed by cmcd_eval, a Route node. It checks whether the cmcd-buffer-starvation attribute is the "-" placeholder, which the pack treats as a sign that the log carries no CMCD data, and sends matching logs to the not_cmcd path.

  - name: cmcd_eval
    type: route
    paths:
      - path: not_cmcd
        condition: IsMatch(attributes["cmcd-buffer-starvation"], "-")
        exit_if_matched: true
    expression_type: ottl

This node filters out logs that carry no CMCD data so that metrics are extracted only from relevant entries. Logs that do not match the not_cmcd condition flow to the default unmatched route path.
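The routing behavior described above can be sketched in Python as follows. This is a simplified model rather than Edge Delta's implementation: the route function is hypothetical, the condition is reduced to an exact string comparison instead of an OTTL expression, and it assumes that exit_if_matched stops evaluation of later paths and that a log matching no path goes to a path named unmatched.

```python
def route(attributes, paths):
    """Return the names of the paths a log is sent to.

    `paths` is a list of (name, predicate, exit_if_matched) tuples that are
    checked in order. A log that matches no path goes to "unmatched".
    """
    matched = []
    for name, predicate, exit_if_matched in paths:
        if predicate(attributes):
            matched.append(name)
            if exit_if_matched:
                break  # do not evaluate any later paths
    return matched or ["unmatched"]


# Stand-in for the cmcd_eval condition: buffer starvation is the "-" placeholder.
cmcd_eval_paths = [
    ("not_cmcd", lambda a: a.get("cmcd-buffer-starvation") == "-", True),
]

with_cmcd = {"cmcd-buffer-starvation": "0"}
without_cmcd = {"cmcd-buffer-starvation": "-"}

result_with = route(with_cmcd, cmcd_eval_paths)
result_without = route(without_cmcd, cmcd_eval_paths)
print(result_with, result_without)
```

In this model the log with a reported starvation value lands on unmatched, while the log carrying only the placeholder lands on not_cmcd.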

4. Session ID Metrics

Logs continue to the cmcd_session_id node, a Log to Metric node. It creates a metric that is grouped by the cmcd-session-id attribute.

  - name: cmcd_session_id
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd_session_id
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["cmcd-session-id"]

Tracking session IDs helps monitor the distinct user sessions in your streaming environment, offering insight into user engagement.
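To make the idea of a per-session metric over a one-minute interval concrete, here is a minimal Python sketch. It is only an approximation of what a Log to Metric node does: count_by_session is a hypothetical helper, the session IDs are made up, and bucketing by the record's own timestamp is an assumption made for illustration.

```python
from collections import Counter


def count_by_session(records, interval_seconds=60):
    """Count log records per (interval start, session ID)."""
    counts = Counter()
    for timestamp, attributes in records:
        # Align the timestamp to the start of its interval.
        bucket = int(timestamp // interval_seconds) * interval_seconds
        counts[(bucket, attributes["cmcd-session-id"])] += 1
    return counts


# Synthetic records: (epoch seconds, attributes).
records = [
    (1730040399.7, {"cmcd-session-id": "session-a"}),
    (1730040401.2, {"cmcd-session-id": "session-a"}),
    (1730040405.0, {"cmcd-session-id": "session-b"}),
    (1730040465.0, {"cmcd-session-id": "session-a"}),
]

counts = count_by_session(records)
for (bucket, session_id), count in sorted(counts.items()):
    print(bucket, session_id, count)
```

Counting the distinct session IDs within one bucket then gives the number of active sessions for that minute.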

5. Buffer Starvation Evaluation

The bs_eval node is also a Route node. It routes logs whose cmcd-buffer-starvation attribute is 1, which indicates buffer starvation.

  - name: bs_eval
    type: route
    paths:
      - path: cmcd_bs true
        condition: IsMatch(attributes["cmcd-buffer-starvation"], "1")
        exit_if_matched: false
    expression_type: ottl

Identifying logs with buffer starvation helps diagnose media delivery issues impacting user experience. Unmatched logs flow to the default unmatched route path.

6. Buffer Starvation Metrics

Logs proceed to the cmcd_bufferstarvation node, a Log to Metric node that counts occurrences of buffer starvation.

  - name: cmcd_bufferstarvation
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd_buffer_starvation
    enabled_stats:
      - count

Capturing buffer starvation events helps identify patterns of deteriorating user experience, which is useful for predictive maintenance.

7. Bitrate Metrics

The cmcd_bitrate node, another Log to Metric node, creates metrics including average, minimum, and maximum encoded bitrate.

  - name: cmcd_bitrate
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd_enc_bitrate
    enabled_stats:
      - avg
      - max
      - min
    dimension_groups:
      - field_numeric_dimension: item["attributes"]["cmcd-encoded-bitrate"]

Monitoring bitrates enables quality-of-service analysis and helps determine whether streams are delivered efficiently.
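The avg, min, and max statistics amount to simple aggregation over the numeric attribute within an interval. The Python sketch below shows that arithmetic on synthetic values. The numeric_stats helper is hypothetical, and skipping the "-" placeholder is an assumption about how missing values should be treated, not a statement about how the node handles them.

```python
def numeric_stats(attribute_dicts, field):
    """Return avg, min, and max of `field`, skipping non-numeric values."""
    values = []
    for attributes in attribute_dicts:
        try:
            values.append(float(attributes[field]))
        except (KeyError, ValueError):
            continue  # for example "-" when the player did not report the key
    if not values:
        return None
    return {
        "avg": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


# Synthetic logs for one interval.
logs = [
    {"cmcd-encoded-bitrate": "3363"},
    {"cmcd-encoded-bitrate": "1200"},
    {"cmcd-encoded-bitrate": "-"},
]

stats = numeric_stats(logs, "cmcd-encoded-bitrate")
print(stats)
```

With these three logs only two values are numeric, so the average is taken over 3363 and 1200.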

8. Detailed Request Evaluation

The x-edge-request node, a Route node, checks whether x-edge-request-id is present (that is, not the "-" placeholder) and directs matching logs onward for further processing.

  - name: x-edge-request
    type: route
    paths:
      - path: x-edge-request_true
        condition: Not(IsMatch(attributes["x-edge-request-id"], "-"))
        exit_if_matched: false
    expression_type: ottl

Filtering by request ID helps home in on log details corresponding to specific content delivery requests. Unmatched logs flow to the default unmatched route path.
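One detail worth verifying when adapting this condition: in upstream OpenTelemetry OTTL, IsMatch treats its second argument as a regular expression that may match anywhere in the value. Assuming Edge Delta follows the same semantics, the pattern "-" matches any value that contains a hyphen, not only the bare placeholder. The Python sketch below models that assumption with re.search. The is_match helper is hypothetical, the request ID is the one from the sample input, and the anchored pattern ^-$ is shown only as a possible alternative, not as the pack's configuration.

```python
import re


def is_match(value, pattern):
    """Model of OTTL IsMatch, assumed here to be an unanchored regex search."""
    return re.search(pattern, value) is not None


placeholder = "-"
request_id = "SZfy1f-OVQWrQpKCcYdhX9O_iFWcFXpwnPBhYSWvxsihQTpJq-PZxw=="

# Unanchored pattern: both values contain a hyphen, so both match.
unanchored_placeholder = is_match(placeholder, "-")
unanchored_request_id = is_match(request_id, "-")

# Anchored pattern: only the bare placeholder matches.
anchored_placeholder = is_match(placeholder, "^-$")
anchored_request_id = is_match(request_id, "^-$")

print(unanchored_placeholder, unanchored_request_id)
print(anchored_placeholder, anchored_request_id)
```

Under this assumption a request ID that happens to contain a hyphen is treated the same as a missing one, so it is worth testing the route against representative IDs before relying on it.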

9. Measured Throughput Metrics

The cmcd_mt node, another Log to Metric node, tracks metrics for CMCD measured throughput.

  - name: cmcd_mt
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd-mt
    enabled_stats:
      - avg
      - min
      - max
    dimension_groups:
      - field_numeric_dimension: item["attributes"]["cmcd-measured-throughput"]

These metrics give insight into the actual data rate at which content was delivered, which assists in capacity planning and network optimizations.

10. Streaming Format Evaluation

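The sf_eval node, a Route node, checks whether streaming format details are present. Logs whose cmcd-streaming-format attribute is the "-" placeholder match the streaming_format_eval path.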

  - name: sf_eval
    type: route
    paths:
      - path: streaming_format_eval
        condition: IsMatch(attributes["cmcd-streaming-format"], "-")
        exit_if_matched: true

Ensuring streaming format details are captured aids in recognizing media delivery method discrepancies and troubleshooting playback issues. Unmatched logs flow to the default unmatched route path.

11. Streaming Format Plays Metrics

Next, data is processed by the cmcd_streaming_format_plays node, a Log to Metric node, which tracks the number of plays for each streaming format and session ID.

  - name: cmcd_streaming_format_plays
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd_streaming_format_plays
    enabled_stats:
      - count
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["cmcd-streaming-format"]
          - item["attributes"]["cmcd-session-id"]

Tracking plays by format contributes to understanding user preferences and optimizing media strategy.

12. Buffer Length Metrics

Logs then move to the cmcd_buff_len node, a Log to Metric node that reports minimum, maximum, and average buffer length, grouped by client country (c-country).

  - name: cmcd_buff_len
    type: log_to_metric
    pattern: .*
    interval: 1m0s
    skip_empty_intervals: false
    only_report_nonzeros: false
    metric_name: cmcd_buff_len
    enabled_stats:
      - avg
      - min
      - max
    dimension_groups:
      - field_dimensions:
          - item["attributes"]["c-country"]
        field_numeric_dimension: item["attributes"]["cmcd-buffer-length"]

Understanding buffer lengths can guide efforts in adapting delivery networks to improve streaming quality, particularly in geographically diverse user bases.
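Grouping a numeric attribute by a dimension, as this node does with c-country, can be pictured with the short Python sketch below. It is an illustration only: buffer_length_by_country is a hypothetical helper, the records are synthetic, and dropping records whose buffer length is the "-" placeholder is an assumption.

```python
from collections import defaultdict


def buffer_length_by_country(attribute_dicts):
    """Return avg, min, and max buffer length for each c-country value."""
    grouped = defaultdict(list)
    for attributes in attribute_dicts:
        raw = attributes.get("cmcd-buffer-length", "-")
        if raw == "-":
            continue  # no buffer length reported for this request
        grouped[attributes.get("c-country", "-")].append(float(raw))
    return {
        country: {
            "avg": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
        for country, values in grouped.items()
    }


# Synthetic logs for one interval.
logs = [
    {"c-country": "US", "cmcd-buffer-length": "4700"},
    {"c-country": "US", "cmcd-buffer-length": "2300"},
    {"c-country": "DE", "cmcd-buffer-length": "9000"},
    {"c-country": "DE", "cmcd-buffer-length": "-"},
]

by_country = buffer_length_by_country(logs)
for country, stats in sorted(by_country.items()):
    print(country, stats)
```

Comparing the per-country minimums is a quick way to see where players run closest to an empty buffer.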

13. Output of Processed Logs

Processed logs are captured by the processed-logs node, a compound_output node. It emits logs once they have been filtered and their metrics generated, so that they can be stored for later analytics.

14. Output of Metrics

Generated metrics are output through the metrics node, another compound_output node, which collects all computed metrics for reporting and visualization.

15. Archive

Finally, unmatched logs are routed out of the pack through the archive node, a compound_output node, preserving them for long-term access and compliance tracking.

Sample Input

1730040399.707	2600:387:15:4410::8	0.522	200	816166	GET	https	d144vxocammxq8.cloudfront.net	/out/v1/bf647bec26dc4ce3b7820f8a562c979b/index_1_7741736.ts?m=1725925771&CMCD=bl%253D4700%252Cbr%253D3363%252Ccid%253D%2522wpoc%2522%252Cd%253D2001.9999999999998%252Cdl%253D4700%252Cmtp%253D1100%252Cnor%253D%2522https%253A%252F%252Fd144vxocammxq8.cloudfront.net%252Fout%252Fv1%252Fbf647bec26dc4ce3b7820f8a562c979b%252Findex_1_7741737.ts%253Fm%253D1725925771%2522%252Cot%253Dv%252Csf%253Dh%252Csid%253D%252243803f9c-c975-4972-b671-e3d35b53de3f%2522%252Cst%253Dl%252Csu%252Ctb%253D3363.219	327	DFW57-P9	SZfy1f-OVQWrQpKCcYdhX9O_iFWcFXpwnPBhYSWvxsihQTpJq-PZxw==	d144vxocammxq8.cloudfront.net	0.733	HTTP/2.0	IPv6	Mozilla/5.0%20(Linux;%20Android%2010;%20K)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/130.0.0.0%20Mobile%20Safari/537.36	https://clientpoc.video-dev.clientstoryline.com/	-	m=1725925771&CMCD=bl%253D4700%252Cbr%253D3363%252Ccid%253D%2522wpoc%2522%252Cd%253D2001.9999999999998%252Cdl%253D4700%252Cmtp%253D1100%252Cnor%253D%2522https%253A%252F%252Fd144vxocammxq8.cloudfront.net%252Fout%252Fv1%252Fbf647bec26dc4ce3b7820f8a562c979b%252Findex_1_7741737.ts%253Fm%253D1725925771%2522%252Cot%253Dv%252Csf%253Dh%252Csid%253D%252243803f9c-c975-4972-b671-e3d35b53de3f%2522%252Cst%253Dl%252Csu%252Ctb%253D3363.219	Miss	-	TLSv1.3	TLS_AES_128_GCM_SHA256	Miss	-	-	video/MP2T	814792	-	-	57106	Miss	US	gzip,%20deflate,%20br,%20zstd	*/*	*	host:d144vxocammxq8.cloudfront.net%0Asec-ch-ua-platform:%22Android%22%0Auser-agent:Mozilla/5.0%20(Linux;%20Android%2010;%20K)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/130.0.0.0%20Mobile%20Safari/537.36%0Asec-ch-ua:%22Chromium%22;v=%22130%22,%20%22Google%20Chrome%22;v=%22130%22,%20%22Not?A_Brand%22;v=%2299%22%0Asec-ch-ua-mobile:?1%0Aaccept:*/*%0Aorigin:https://clientpoc.video-dev.clientstoryline.com%0Asec-fetch-site:cross-site%0Asec-fetch-mode:cors%0Asec-fetch-dest:empty%0Areferer:https://clientpoc.video-dev.clientstoryline.com/%0Aaccept-encoding:gzip,%20deflate,%20br,%20zstd%0Aaccept-language:en-US,en;q=0.9,et;q=0.8%0Apriority:u=1,%20i%0A	host%0Asec-ch-ua-platform%0Auser-agent%0Asec-ch-ua%0Asec-ch-ua-mobile%0Aaccept%0Aorigin%0Asec-fetch-site%0Asec-fetch-mode%0Asec-fetch-dest%0Areferer%0Aaccept-encoding%0Aaccept-language%0Apriority%0A	14	E26BR2KTLMWONZ	d144vxocammxq8.cloudfront.net	0.493	0.704	7018	3363	4700	0	wpoc	-	4700	1100	-	-	v	-	-	h	43803f9c-c975-4972-b671-e3d35b53de3f	l	1	-	1