# Filters

This guide explains how slaOS applies filters to parse and process log data. We'll cover key concepts and provide practical examples to illustrate how log parsing works in slaOS.

## Log formats

slaOS supports both structured (JSON) and unstructured (raw text) log formats.

{% hint style="info" %}
Unstructured logs are parsed with regular expressions (regex) to extract the relevant fields.
{% endhint %}
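To make the regex approach concrete, here is a minimal sketch using only Python's standard `re` module. The raw log line layout, the `LINE_RE` pattern, and the `extract_fields` helper are assumptions made for illustration; they are not part of slaOS or its configuration format.

```python
import re

# Hypothetical raw-text log line; this layout is an assumption for illustration.
raw_line = "2024-08-15T10:30:45Z cust_12345 POST /v1/process 200 120ms"

# Named groups give each extracted value a field name.
LINE_RE = re.compile(
    r"(?P<timestamp>\S+) (?P<customer_id>\S+) (?P<method>[A-Z]+) "
    r"(?P<path>\S+) (?P<status_code>\d{3}) (?P<latency>\d+)ms"
)


def extract_fields(line: str) -> dict:
    """Return the named groups of a matching line as a dict of strings."""
    match = LINE_RE.fullmatch(line)
    if match is None:
        raise ValueError(f"line does not match the expected layout: {line!r}")
    return match.groupdict()


fields = extract_fields(raw_line)
print(fields)
```

Every value extracted this way is still a string, so a type conversion step is needed before the values can be used as numbers or timestamps.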

## **Field Types**

slaOS supports the following field types:

* `timestamp`: For date and time information
* `integer`: For whole numbers
* `float`: For decimal numbers
* `string`: For text data

## Example: Parsing API Key Metric Usage Logs

In this example, we'll parse JSON logs that contain an application's API usage metrics using slaOS. Imagine you're building an SLA (Service Level Agreement) for an API that tracks critical metrics such as units consumed, latency, and API paths per customer ID.

For simplicity, we'll assume the logs are in a JSON-compatible format. Here's an example of such a log entry:

```json
{
  "timestamp": "2024-08-15T10:30:45Z",
  "customer": {
    "id": "cust_12345"
  },
  "api_call": {
    "latency": 120,
    "credits_used": 5,
    "path": "/v1/process",
    "method": "POST"
  },
  "response": {
    "status_code": 200
  }
}
```

### **Step 1: Define the log pattern**

To parse your API key usage logs, define a pattern that lists the fields you want to track, based on your log structure. The pattern is a list of field definitions; the full definition for the log structure above follows the field descriptions.

Each field definition consists of:

* **Key**: The name of the field in the parsed output
* **Field Type**: One of the supported field types (`timestamp`, `integer`, `float`, or `string`)
* **Format** (*optional*): For timestamp fields, specifies how to parse the date/time string
* **Path**: The path to the field in the nested JSON structure

```json
{
  "version": 1,
  "log_format": "json_dict",
  "fields": [
    {
      "key": "timestamp",
      "field_type": "timestamp",
      "format": "%Y-%m-%dT%H:%M:%SZ",
      "path": "timestamp"
    },
    {
      "key": "customer_id",
      "field_type": "string",
      "path": "customer.id"
    },
    {
      "key": "latency",
      "field_type": "integer",
      "path": "api_call.latency"
    },
    {
      "key": "credits_used",
      "field_type": "integer",
      "path": "api_call.credits_used"
    },
    {
      "key": "path",
      "field_type": "string",
      "path": "api_call.path"
    },
    {
      "key": "method",
      "field_type": "string",
      "path": "api_call.method"
    },
    {
      "key": "api_status_code",
      "field_type": "integer",
      "path": "response.status_code"
    }
  ]
}
```

This pattern provides slaOS with clear instructions on how to interpret and extract the necessary data from your JSON logs, ensuring that each critical metric is accurately captured.
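
The dotted `path` values in the pattern can be read as a walk through the nested JSON object, one key per segment. The sketch below shows that idea in plain Python; `resolve_path` is a hypothetical helper written for illustration, and slaOS may resolve paths differently.

```python
from functools import reduce

log_entry = {
    "timestamp": "2024-08-15T10:30:45Z",
    "customer": {"id": "cust_12345"},
    "api_call": {"latency": 120, "credits_used": 5, "path": "/v1/process", "method": "POST"},
    "response": {"status_code": 200},
}


def resolve_path(entry: dict, path: str):
    """Follow a dotted path such as 'api_call.latency' through nested dicts."""
    return reduce(lambda node, key: node[key], path.split("."), entry)


print(resolve_path(log_entry, "customer.id"))
print(resolve_path(log_entry, "api_call.latency"))
print(resolve_path(log_entry, "response.status_code"))
```

A missing segment raises `KeyError` in this sketch, which makes a mismatch between the pattern and the log structure visible immediately.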

### **Step 2: Set Up the Parser**

Once you’ve defined the log pattern, the next step is to set up the parser in slaOS by providing this pattern definition. While the exact setup process may vary depending on your specific integration, the underlying concept remains the same: you’re telling slaOS, "This is how my logs are structured, and here’s how to interpret each field."

<details>

<summary>Adding patterns using the <code>rated-parser</code> Python library</summary>

```python
from rated_parser import LogParser

log_parser = LogParser()

# Define your log pattern
log_pattern = {
    "version": 1,
    "log_format": "json_dict",
    "fields": [
        {"key": "timestamp", "field_type": "timestamp", "format": "%Y-%m-%dT%H:%M:%SZ", "path": "timestamp"},
        {"key": "customer_id", "field_type": "string", "path": "customer.id"},
        {"key": "latency", "field_type": "integer", "path": "api_call.latency"},
        {"key": "credits_used", "field_type": "integer", "path": "api_call.credits_used"},
        {"key": "path", "field_type": "string", "path": "api_call.path"},
        {"key": "method", "field_type": "string", "path": "api_call.method"},
        {"key": "api_status_code", "field_type": "integer", "path": "response.status_code"}
    ]
}

# Configure the parser with the defined pattern
log_parser.add_pattern(log_pattern)
```

</details>

### **Step 3: Parse the Log Entry**

After setting up the parser, slaOS processes your log entries according to the defined pattern, converting and extracting each field into a structured format. Here’s how the parsed output might look:

```json
{
  "timestamp": "2024-08-15 10:30:45",
  "customer_id": "cust_12345",
  "latency": 120,
  "credits_used": 5,
  "path": "/v1/process",
  "method": "POST",
  "api_status_code": 200
}
```

Notice how the timestamp has been standardized, and all fields have been extracted based on their specified paths and types. This structured output is now ready for further analysis, reporting, or integration into your monitoring tools.

<details>

<summary>Parsing logs using the <code>rated-parser</code> Python library</summary>

```python
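# Reuses the `log_parser` configured with the pattern in Step 2.
# Example log entry (same structure as the sample log above)
log_entry = {
    "timestamp": "2024-08-15T10:30:45Z",
    "customer": {"id": "cust_12345"},
    "api_call": {"latency": 120, "credits_used": 5, "path": "/v1/process", "method": "POST"},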
    "response": {"status_code": 200}
}

# Parse the log entry
parsed_log = log_parser.parse_log(log_entry, version=1)
print(parsed_log)
```

</details>

## Notes on DateTime Formats

When defining timestamp fields, it's crucial to use the correct `datetime` format string that matches the format of your log timestamps. These format strings tell the system exactly how to interpret the date and time information in your logs. Here are some examples of datetime format strings that can be used to accurately parse different timestamp formats:

| Format String              | Example Timestamp                 |
| -------------------------- | --------------------------------- |
| `%Y-%m-%dT%H:%M:%S.%fZ`    | `2023-07-25T14:30:45.678901Z`     |
| `%d/%b/%Y:%H:%M:%S %z`     | `25/Jul/2023:14:30:45 +0000`      |
| `%a %b %d %H:%M:%S %Y`     | `Tue Jul 25 14:30:45 2023`        |
| `%A, %d-%b-%y %H:%M:%S %Z` | `Tuesday, 25-Jul-23 14:30:45 UTC` |
| `%Y%m%d%H%M%S`             | `20230725143045`                  |

Each of these format strings is designed to match specific timestamp layouts, allowing the system to correctly parse and convert the raw timestamp data into a standardized format for processing and analysis.
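
The format strings in the table use the same directives as Python's `datetime.strptime`, so one way to check a format against a sample timestamp before adding it to a pattern is to try it locally. This is a minimal sketch that assumes an English locale for day and month names; it exercises only the standard library, not slaOS itself.

```python
from datetime import datetime

# (format string, example timestamp) pairs taken from the table above.
SAMPLES = [
    ("%Y-%m-%dT%H:%M:%S.%fZ", "2023-07-25T14:30:45.678901Z"),
    ("%d/%b/%Y:%H:%M:%S %z", "25/Jul/2023:14:30:45 +0000"),
    ("%a %b %d %H:%M:%S %Y", "Tue Jul 25 14:30:45 2023"),
    ("%A, %d-%b-%y %H:%M:%S %Z", "Tuesday, 25-Jul-23 14:30:45 UTC"),
    ("%Y%m%d%H%M%S", "20230725143045"),
]

# strptime raises ValueError if a timestamp does not match its format.
parsed = [datetime.strptime(text, fmt) for fmt, text in SAMPLES]

for (fmt, text), value in zip(SAMPLES, parsed):
    print(f"{text} -> {value.isoformat()}")
```

Note that a literal `Z` in a format string is matched as plain text and does not attach a timezone to the result; in this sketch only the `%z` directive produces a timezone-aware value.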


