
Data hashing & transformation

Overview

The rated-parser library can encrypt, hash, and transform individual fields as your data is processed, so sensitive values are protected before they are stored. This guide explains how to configure each option and when to use it in support of GDPR compliance.

Field Processing Options

Basic Field Definition

Every field in your metrics is defined by a key that maps to the corresponding value in your data. For example:

{
  "user_email": "john@example.com",
  "request_count": 150,
  "response_time_ms": 250
}

Privacy Protection Options

1. Encryption

Use encryption when you need to retrieve the original value later (e.g., for debugging or customer support).

Example Use Cases:

  • User identifiers

  • Email addresses

  • IP addresses

  • Session IDs

{
  "version": 1,
  "fields": [
    {
      "key": "user_email",
      "encryption": true
    }
  ]
}

When processed, the email becomes an encrypted string that can only be decrypted with your encryption key:

{
  "user_email": "AES256.cbc.f7d9a1b2..."
}

2. Hashing

Use hashing when you need to track metrics without storing the original value. Hashed values cannot be reversed.

Our implementation uses:

  • Algorithm: SHA-256

  • Encoding: UTF-8

  • Output Format: Hexadecimal digest (64 characters)

These specifications ensure consistent hash generation across different systems. The code implementation is:

from hashlib import sha256

def hash_value(value):
    # str() accepts non-string values; encode() defaults to UTF-8
    return sha256(str(value).encode()).hexdigest()

Example Use Cases:

  • Organization IDs for analytics

  • Device IDs for unique user counting

  • Transaction IDs for deduplication

{
  "version": 1,
  "fields": [
    {
      "key": "organization_id",
      "hash": true
    }
  ]
}

Results in:

{
  "organization_id": "sha256.8f4e8d9c..."
}
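The following is a short usage sketch of the hash_value function shown earlier, with the hashlib import it relies on. It checks the properties listed above: the digest is 64 hexadecimal characters long and the same input always produces the same digest. The sample inputs are illustrative only.

```python
from hashlib import sha256


def hash_value(value):
    # Same logic as the function shown earlier: stringify, UTF-8 encode, SHA-256.
    return sha256(str(value).encode()).hexdigest()


digest = hash_value("org_456")

# A SHA-256 hex digest is always 64 characters long.
print(len(digest))  # 64

# Hashing is deterministic: the same input gives the same digest.
print(hash_value("org_456") == digest)  # True

# Because of str(), the integer 150 and the string "150" hash identically.
print(hash_value(150) == hash_value("150"))  # True
```

One consequence of the str() call is worth keeping in mind: values of different types that share a string form produce the same hash, which is usually what you want for IDs that arrive sometimes as numbers and sometimes as strings.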

Data Transformations

1. Expression Transformations

Use expressions when you need to modify values using simple mathematical or string operations.

Example Use Cases:

  • Converting units (bytes to MB, seconds to milliseconds)

  • Normalizing string formats

  • Basic calculations

{
  "version": 1,
  "fields": [
    {
      "key": "memory_usage",
      "transformation": "value / (1024 * 1024)",
      "transformation_type": "expression"
    }
  ]
}

This converts memory usage from bytes to MB (here 1 MB = 1024 * 1024 bytes):

Input:  { "memory_usage": 1048576 }
Output: { "memory_usage": 1.0 }
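To make the idea of an expression transformation concrete, here is a minimal sketch of how an expression such as "value / (1024 * 1024)" could be evaluated with the field value bound to the name value. This is not rated-parser's actual implementation; the function name evaluate_expression and the set of allowed operators are assumptions made for illustration.

```python
import ast
import operator

# Arithmetic operators this sketch allows; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_expression(expression, value):
    """Evaluate an arithmetic expression in which `value` is the field value.

    Hypothetical helper, not part of rated-parser.
    """

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "value":
            return value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Function calls, attribute access, other names, etc. end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


print(evaluate_expression("value / (1024 * 1024)", 1048576))  # 1.0
```

Walking the parsed syntax tree and allowing only known node types, rather than calling eval, means an expression such as "__import__('os')" is rejected instead of executed.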

2. Function Transformations

Use predefined functions for more complex transformations.

Example Use Cases:

  • Duration string parsing

  • HTTP status code categorization

  • String normalization

{
  "version": 1,
  "fields": [
    {
      "key": "duration",
      "transformation": "duration_to_ms",
      "transformation_type": "function"
    }
  ]
}

This converts duration strings to milliseconds:

Input:  { "duration": "1.5s" }
Output: { "duration": 1500.0 }
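As an illustration of what a function transformation does, the sketch below shows one way a duration parser could work. It reuses the name duration_to_ms from the configuration above, but the body is a hypothetical stand-in: the set of supported units (ms, s, m, h) and the error behavior are assumptions, and the library's own function may differ.

```python
import re

# Milliseconds per unit. The supported units are an assumption of this sketch.
_UNIT_TO_MS = {"ms": 1.0, "s": 1000.0, "m": 60_000.0, "h": 3_600_000.0}

# A non-negative decimal number followed by one of the units above.
_DURATION_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h)\s*")


def duration_to_ms(text):
    """Convert a duration string such as "1.5s" into milliseconds."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_TO_MS[unit]


print(duration_to_ms("1.5s"))  # 1500.0
```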

Built-in Safety Features

  1. Field Protection:

    • Cannot combine encryption and hashing on the same field

    • Automatic validation of transformation expressions

    • Protection against injection attacks

  2. Transformation Safety:

    • Restricted to safe mathematical operations

    • Limited to approved string methods

    • No access to system functions or dangerous operations
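The first rule above (encryption and hashing cannot be combined on one field) can be checked before any data is processed. The sketch below shows one possible pre-flight check over a configuration dictionary; validate_fields is a hypothetical helper written for this guide, not a rated-parser function, and the library performs its own validation.

```python
def validate_fields(config):
    """Return a list of problems found in a field-processing config.

    Hypothetical helper: it only checks that no field sets both
    "encryption" and "hash".
    """
    problems = []
    for field in config.get("fields", []):
        key = field.get("key", "<missing key>")
        if field.get("encryption") and field.get("hash"):
            problems.append(f"{key}: encryption and hash cannot be combined")
    return problems


bad_config = {
    "version": 1,
    "fields": [
        {"key": "user_email", "encryption": True, "hash": True},
        {"key": "organization_id", "hash": True},
    ],
}

print(validate_fields(bad_config))
# ['user_email: encryption and hash cannot be combined']
```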

Example Implementation

Here's a complete example showing different types of field processing:

{
  "version": 1,
  "fields": [
    {
      "key": "user_id",
      "encryption": true
    },
    {
      "key": "organization_id",
      "hash": true
    },
    {
      "key": "response_time",
      "transformation": "value * 1000",
      "transformation_type": "expression"
    },
    {
      "key": "status_code",
      "transformation": "status_class",
      "transformation_type": "function"
    }
  ]
}

Input data:

{
  "user_id": "user_123",
  "organization_id": "org_456",
  "response_time": 0.45,
  "status_code": 404
}

Output data:

{
  "user_id": "AES256.cbc.a1b2c3...",
  "organization_id": "sha256.d4e5f6...",
  "response_time": 450.0,
  "status_code": "4xx"
}

The processed data can now be stored or analyzed without exposing the original user and organization identifiers.
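The status_code and response_time results in the example output can be reproduced with a few lines of Python. The sketch below reuses the name status_class from the configuration, but the implementation is an assumption for illustration (including the accepted range of 100 to 599); encryption and hashing are left out because they depend on key handling and output formatting not described here.

```python
def status_class(status_code):
    """Map an HTTP status code to its class, e.g. 404 -> "4xx"."""
    code = int(status_code)
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {status_code!r}")
    return f"{code // 100}xx"


record = {"response_time": 0.45, "status_code": 404}

processed = {
    # Equivalent of the "value * 1000" expression: seconds to milliseconds.
    "response_time": record["response_time"] * 1000,
    # Equivalent of the "status_class" function transformation.
    "status_code": status_class(record["status_code"]),
}

print(processed["status_code"])  # 4xx
```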


Last updated 6 months ago