Validate JSON in Kafka Without Migrating to Schema Registry

Block malformed JSON before it breaks Kafka pipelines. Conduktor Trust validates schema, types, and fields—no Schema Registry migration needed.

Chuck Larrieu Casias · September 4, 2025

Kafka accepts any data you throw at it. That's the point. Fraud detection, recommendation engines, real-time analytics: Kafka doesn't care about your payload structure.

This is also Kafka's biggest problem. A single malformed message can break dashboards, crash downstream jobs, and corrupt ML model training.

Why Schema Registry Falls Short for JSON

Schema Registry defines message structure and manages data contracts between producers and consumers. It tracks schema evolution and provides an API for schema management.

But it doesn't solve everything:

  • Partial adoption. Many organizations have Schema Registry on some topics but not others. Governance ends up fragmented.
  • Migration is a breaking change. Moving a topic to Schema Registry requires all downstream consumers to adopt the schema registry deserializer. That coordination rarely happens.
  • JSON has no built-in schema. Avro and Protobuf enforce structure. JSON has syntax rules but no schema enforcement across messages.
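
The last point is easy to demonstrate. As a minimal illustration (plain Python, nothing Kafka-specific), the standard json module parses any syntactically well-formed payload, however wrong its structure:

import json

# Syntactically valid JSON, but missing productId and with price as a string.
payload = '{"productName": "duck", "price": "twenty-two"}'

record = json.loads(payload)  # parses without complaint
print(record["productName"])  # -> duck
# Nothing in the format itself flags the missing field or the wrong type;
# the problem only surfaces in whatever consumes the record downstream.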

JSON Is Easy to Use and Hard to Trust

JSON is everywhere because it's flexible and human-readable. Engineers and non-technical people can skim fields and understand the data.

The same flexibility creates problems:

  • Loose contracts. Stricter formats trade flexibility for guarantees; JSON sits at the loose end of that spectrum, so nothing constrains what producers actually send.
  • Assumed structure. JSON Schema can describe the expected shape, but nothing enforces it at produce time. Missing fields, wrong types, and business rule violations slip through.

Here's a JSON Schema for a product catalog:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/product.schema.json",
  "title": "Product",
  "description": "A product from Acme's catalog",
  "type": "object",
  "$comment": "setting additionalProperties to false to ensure no extra fields are allowed",
  "additionalProperties": false,
  "properties": {
    "productId": {
      "description": "The unique identifier for a product",
      "type": "integer"
    },
    "productName": {
      "description": "Name of the product",
      "type": "string"
    },
    "price": {
      "description": "The price of the product",
      "type": "number",
      "exclusiveMinimum": 0
    },
    "tags": {
      "description": "Tags for the product",
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "uniqueItems": true
    }
  },
  "required": ["productId", "productName", "price"]
}

Your order system expects productId as an integer, productName as a string, and price as a positive number.

Valid against the schema:

{ "productId": 1234, "productName": "duck", "price": 22.22 }
{"productId": 1234, "productName": "duck", "price": 22.22, "tags": ["very_cool"]}

Invalid against the schema:

{ "productName": "duck", "price": 22.22 }  // missing required field
{ "productId": 1234, "productName": "duck", "price": -5 }  // invalid price
{"productId": 1234, "productName": "duck", "price": 22.22, "favorite_company": "Conduktor"}  // extra field

These poison pills crash applications, cause ML model drift, and corrupt analytics.
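
To see exactly which rule each bad message breaks, here's a minimal sketch using Python's jsonschema package. It's purely illustrative, not how Conduktor Trust performs validation, but it applies the same product schema shown above:

import json
from jsonschema import Draft202012Validator

# The product schema from above, trimmed to the parts exercised here.
schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "productId": {"type": "integer"},
        "productName": {"type": "string"},
        "price": {"type": "number", "exclusiveMinimum": 0},
        "tags": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 1,
            "uniqueItems": True,
        },
    },
    "required": ["productId", "productName", "price"],
}

validator = Draft202012Validator(schema)

messages = [
    '{"productId": 1234, "productName": "duck", "price": 22.22}',
    '{"productName": "duck", "price": 22.22}',
    '{"productId": 1234, "productName": "duck", "price": -5}',
    '{"productId": 1234, "productName": "duck", "price": 22.22, "favorite_company": "Conduktor"}',
]

for raw in messages:
    errors = list(validator.iter_errors(json.loads(raw)))
    if errors:
        # Each error names the violated rule, e.g. "'productId' is a required property".
        print("REJECT:", [error.message for error in errors])
    else:
        print("ACCEPT:", raw)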

Misrouted Data Breaks Pipelines

Kafka accepts whatever you send to whatever topic you specify. A developer types "user-signps" instead of "user-signups" and, if topic auto-creation is enabled (the default in many clusters), Kafka creates the topic. No warning. No validation.

Kafka has ACLs, but they only control which applications can access which topics. They can't validate that the right data goes to the right place.

The result:

  • Missing data. Correct consumers never receive messages they need.
  • Garbage accumulation. Topics fill with irrelevant data that can't be removed until retention expires.
  • Application failures. Consumers receiving unexpected formats throw deserialization errors or crash.
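
Here's what that last failure mode looks like in practice: a consumer that deserializes values as JSON dies as soon as a non-JSON payload lands on the topic. A minimal sketch with the kafka-python client (topic name and bootstrap address are placeholders):

import json
from kafka import KafkaConsumer

# If any record on the topic isn't valid JSON, json.loads raises
# json.JSONDecodeError inside the deserializer and the consumer loop dies.
consumer = KafkaConsumer(
    "user-signups",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    print(message.value)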

Conduktor Trust Validates JSON Before It Reaches Kafka

JSON Schema Rules act as a gatekeeper. Every JSON message gets validated against a predefined schema before reaching your topics.

Conduktor Trust's Kafka proxy intercepts messages and validates field names, types, and structure. It can also enforce field-level quality rules (e.g., productId must be a positive integer).

For messages that fail validation, you choose the response:

  • Log the violation for observability without blocking
  • Mark the message with a header for downstream handling (see the consumer sketch after this list)
  • Block the message from entering the topic
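
As a sketch of the header option, a downstream consumer can route flagged records to a dead-letter topic instead of processing them. The header name below is hypothetical; substitute whatever header your validation layer actually sets. Again with kafka-python:

import json
from kafka import KafkaConsumer, KafkaProducer

# Hypothetical header name; use the one your validation layer writes.
VIOLATION_HEADER = "validation-violation"

consumer = KafkaConsumer(
    "product-catalog",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)
producer = KafkaProducer(bootstrap_servers="localhost:9092")

for message in consumer:
    # Kafka record headers arrive as a list of (key, bytes) tuples.
    header_keys = {key for key, _ in (message.headers or [])}
    if VIOLATION_HEADER in header_keys:
        # Park the flagged record in a dead-letter topic for later inspection.
        producer.send("product-catalog-dlq", json.dumps(message.value).encode("utf-8"))
        continue
    print("processing:", message.value)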

Trust doesn't replace Schema Registry. It doesn't handle schema management, track evolution, ensure compatibility, or provide an API. Trust works with or without Schema Registry. Either way, you can use CEL to enforce semantic data quality rules and validate that records contain valid schema IDs.

JSON Schema Rules give teams structure enforcement and pipeline protection without requiring a full Schema Registry migration. It's the middle ground between ad hoc JSON and full schema governance.

Stop Bad JSON at the Source

Conduktor Trust and JSON Schema Rules let teams use JSON with confidence. No more hidden poison pills, misrouted data, or downstream cascades.

Start your free demo of JSON Schema Rules in Conduktor Trust today.