Example implementation

This guide walks you through instrumenting your application to send API uptime and latency data to slaOS using the direct ingestion API. It provides examples in JSON and Python and covers practices for efficient, reliable data submission.

Building SLAs that track API uptime and latency

To make things more concrete, the examples below send data to slaOS in order to build an Uptime SLA and a Latency SLA, drawing on logs that your application emits directly to slaOS (without the use of any integrations).

Sample log message
{
    "event": "request_finished",
    "request_route_name": "/v0/eth/operators/{operator_id}/apr",
    "request_headers": {
        "host": "api.rated.network",
        "user-agent": "undici",
        "accept": "*/*",
        "accept-encoding": "br, gzip, deflate",
        "accept-language": "*",
        "authorization": "Bearer eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.******.***Y48gQ",
        "content-type": "application/json",
        "sec-fetch-mode": "cors",
        "x-envoy-expected-rq-timeout-ms": "120000",
        "x-envoy-external-address": "45.88.222.59",
        "x-forwarded-for": "45.88.222.59",
        "x-forwarded-proto": "https",
        "x-rated-network": "mainnet",
        "x-request-id": "0cc0a428-a3bc-435c-9bcb-d0aa57511185"
    },
    "request_query_params": {
        "window": "7d",
        "idType": "withdrawalAddress"
    },
    "peer_ip": "40.86.210.59",
    "status_code": 200,
    "request_method": "GET",
    "cache_hit": true,
    "request_path": "/v0/eth/operators/0xcd615270ab3a7a3a262a4e49935d002278c76b78/apr",
    "content_length": 244,
    "request_id": "0cc0a428-a3bc-435c-9bcb-d0aa5751176",
    "user": {
        "id": "bb856d6a365946459ad04816fb70aj6d",
        "org": null
    },
    "org": {
        "id": "531ccc57c17a4b418236931e96fb8047",
        "company_name": "ACME Corporation",
        "pricing_tier": "growth"
    },
    "token": {
        "id": "4fc60598816b4db78644c511d2a8l9c7",
        "expires_at": "2024-12-30T16:35:24.602333+00:00"
    },
    "took": 0.03523564338684082,
    "level": "info",
    "logger": "rated.api",
    "timestamp": 1719931770.556692
}

Uptime SLA

  • The SLI to extract here is the status_code of each log event with the request_finished event type.

  • The SLO counts each non-5xx status_code as a good outcome for uptime. The number of good outcomes is divided by the total number of events to produce the measured objective.

  • The SLA for each customer sets a threshold on the SLO, e.g. >= 99.5% uptime.
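slaOS performs this calculation once the SLI, SLO and SLA are configured. Purely to illustrate the arithmetic described above, here is a minimal sketch; the function name `uptime_ratio` and the sample status codes are hypothetical and not part of slaOS.

```python
def uptime_ratio(status_codes):
    """Share of events whose status code is not 5xx (illustrative only)."""
    if not status_codes:
        raise ValueError("need at least one event")
    good = sum(1 for code in status_codes if not 500 <= code <= 599)
    return good / len(status_codes)


# Hypothetical sample: 199 non-5xx responses (including one 404) and one 503.
codes = [200] * 198 + [404, 503]
ratio = uptime_ratio(codes)
meets_sla = ratio >= 0.995  # the >= 99.5% threshold from the SLA bullet
```

Note that a 404 still counts as a good outcome here, because only 5xx responses are treated as downtime.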

The configuration of SLIs, SLOs and SLAs happens within the slaOS UI. This step is solely focused on getting the right data, in the right format, into slaOS.

Latency SLA

  • The SLI to extract here is the took value (the request duration, in seconds) of each log event with the request_finished event type.

  • The SLO counts each request with a took value under 0.2 seconds (200ms) as a good outcome for latency. The number of good outcomes is divided by the total number of events.

  • The SLA for each customer sets a threshold on the SLO, e.g. > 95% of requests return with a latency of < 0.2 seconds.
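As with uptime, slaOS does this computation for you. The sketch below only illustrates the ratio behind the latency SLO, assuming took is expressed in seconds; `latency_good_ratio` and the sample values are hypothetical.

```python
def latency_good_ratio(took_values, threshold_s=0.2):
    """Share of requests that finished in under threshold_s seconds."""
    if not took_values:
        raise ValueError("need at least one event")
    good = sum(1 for took in took_values if took < threshold_s)
    return good / len(took_values)


# Hypothetical sample: three requests under 0.2 s, two at or above it.
took_values = [0.035, 0.120, 0.199, 0.2, 0.450]
ratio = latency_good_ratio(took_values)
meets_sla = ratio > 0.95  # the > 95% threshold from the SLA bullet
```

A request that takes exactly 0.2 seconds is not a good outcome, since the objective uses a strict "under 200ms" comparison.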

The configuration of SLIs, SLOs and SLAs happens within the slaOS UI. This step is solely focused on getting the right data, in the right format, into slaOS.

Define a payload structure

For each log event, we can extract the status_code, took, the organization ID (org.id in the sample log) and timestamp. We can structure our event payload like this:

{
  "organization_id": "531ccc57c17a4b418236935e96fb8049",
  "timestamp": "2024-07-02T12:34:56Z",
  "values": {
    "took": 0.03523564338684082,
    "status_code": 200
  },
  "key": "rated_api"
}
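One way to produce this payload from a log record is sketched below. It assumes that the log's org.id maps to organization_id and that the log's timestamp is a Unix epoch in seconds (UTC), which is then rendered in the ISO 8601 form used in the payload above. The helper name `build_event` is hypothetical.

```python
from datetime import datetime, timezone


def build_event(log, key="rated_api"):
    """Map a request_finished log record to the event payload shape."""
    # Assumption: log["timestamp"] is seconds since the Unix epoch, in UTC.
    ts = datetime.fromtimestamp(log["timestamp"], tz=timezone.utc)
    return {
        "organization_id": log["org"]["id"],
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "values": {
            "took": log["took"],
            "status_code": log["status_code"],
        },
        "key": key,
    }


# Only the fields used by the payload, taken from the sample log message.
log = {
    "event": "request_finished",
    "status_code": 200,
    "org": {"id": "531ccc57c17a4b418236931e96fb8047"},
    "took": 0.03523564338684082,
    "timestamp": 1719931770.556692,
}
event = build_event(log)
```

Headers, tokens and user details from the log are deliberately left out, so only the fields needed for the two SLAs are sent.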

Pushing events with a worker

Now that an event payload structure has been chosen, we can use a worker to deliver these events from your application to slaOS using the direct ingestion API.

SlaOSWorker Implementation

Here's a sample implementation of a worker that sends events to slaOS and retries on failure:

import hashlib
import json
import time

import requests

class SlaOSWorker:
    def __init__(self, host, ingestion_id, ingestion_key, max_retries=3, retry_delay=5):
        self.ingestion_url = f"https://{host}/v1/ingest/{ingestion_id}/{ingestion_key}"
        self.max_retries = max_retries
        self.retry_delay = retry_delay

    def add_event(self, event):
        event = self._add_idempotency_key(event)
        self._send_event(event)

    def _add_idempotency_key(self, event):
        # Keep a caller-supplied key; otherwise derive one so that a resent
        # event is not processed twice.
        if 'idempotency_key' not in event:
            # sha256 is stable across processes, unlike the built-in hash() on strings
            values_json = json.dumps(event['values'], sort_keys=True)
            digest = hashlib.sha256(values_json.encode()).hexdigest()
            event['idempotency_key'] = f"{event['organization_id']}:{event['timestamp']}:{digest}"
        return event

    def _send_event(self, event):
        # Try up to max_retries times, pausing retry_delay seconds between attempts
        for attempt in range(1, self.max_retries + 1):
            try:
                response = requests.post(
                    self.ingestion_url,
                    json=event,
                    headers={'Content-Type': 'application/json'},
                    timeout=10  # avoid hanging forever on an unresponsive host
                )
                response.raise_for_status()
                print("Successfully sent event to slaOS")
                return
            except requests.RequestException as e:
                print(f"Error sending event to slaOS: {e} (attempt {attempt}/{self.max_retries})")
                if attempt < self.max_retries:
                    time.sleep(self.retry_delay)
        print(f"Failed to send event after {self.max_retries} attempts.")

The idempotency_key can be added to each event to prevent duplicate processing of the same event. You can pass a key generated on your end. If we don't receive an idempotency_key, we generate a unique string based on the organization_id, timestamp, and the hashed contents of the values. This identifier is crucial in avoiding the reprocessing of identical events.
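Because _send_event blocks the calling thread while it posts and retries, you may not want to call add_event directly on a request path. One possible approach, sketched here as an assumption rather than as part of slaOS, is to hand events to a background thread through a bounded queue. The class name `BackgroundSender` and its methods are hypothetical; it works with any object that exposes add_event.

```python
import queue
import threading


class BackgroundSender:
    """Hypothetical wrapper that calls worker.add_event from a background thread."""

    def __init__(self, worker, max_queue=1000):
        self._worker = worker
        self._queue = queue.Queue(maxsize=max_queue)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, event):
        """Enqueue without blocking; returns False if the queue is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                if event is None:  # sentinel pushed by close()
                    return
                self._worker.add_event(event)
            except Exception as exc:
                # Keep the thread alive if a single event fails.
                print(f"Dropping event after error: {exc}")
            finally:
                self._queue.task_done()

    def close(self):
        """Deliver everything already queued, then stop the thread."""
        self._queue.put(None)
        self._thread.join()
```

With this wrapper you would create `sender = BackgroundSender(worker)` once, call `sender.submit(event)` wherever events occur, and call `sender.close()` at shutdown. Events are dropped when the queue is full, which trades completeness for never stalling the application.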

Usage in Your Application

To integrate this worker into your application:

  1. Initialize the Worker: Begin by creating an instance of SlaOSWorker with your specific slaOS host, ingestion ID, and ingestion key.

  2. Send Events: Whenever an event occurs in your application that you want to send to slaOS, simply call worker.add_event(event_data) with the relevant event data.

worker = SlaOSWorker(
    host="your-slaos-host.com",
    ingestion_id="your-ingestion-id",
    ingestion_key="your-ingestion-key"
)

# In your application code, add events as they occur
worker.add_event({
    "organization_id": "531ccc57c17a4b418236935e96fb8049",
    "timestamp": "2024-07-02T12:34:56Z",
    "values": {
        "took": 0.03523564338684082,
        "status_code": 200
    },
    "key": "rated_api"
})
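To show where such a call might sit in practice, the sketch below times a request handler and emits one event per request. It is illustrative only: `handle_and_record` is a hypothetical helper, `handler` stands for any callable that returns an HTTP status code, and `send` would typically be `worker.add_event`.

```python
import time
from datetime import datetime, timezone


def handle_and_record(handler, organization_id, send, key="rated_api"):
    """Run handler(), then send an event with its duration and status code."""
    started = time.perf_counter()
    status_code = 500  # reported if the handler raises before returning
    try:
        status_code = handler()
        return status_code
    finally:
        send({
            "organization_id": organization_id,
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "values": {
                "took": time.perf_counter() - started,  # seconds
                "status_code": status_code,
            },
            "key": key,
        })


# Example with a list standing in for worker.add_event.
recorded = []
handle_and_record(lambda: 200, "531ccc57c17a4b418236935e96fb8049", recorded.append)
```

Sending from the finally block means failed requests are reported too, which matters for the Uptime SLA.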
