Manage Schemas with the Redpanda Operator
Use the Schema resource to declaratively create and manage schemas as part of a Redpanda deployment in Kubernetes. Each Schema resource maps to a schema in your Redpanda cluster, allowing you to define data structures, compatibility, and schema evolution in a declarative way.
Prerequisites
Ensure you have the following:
-
Kubectl: Ensure the kubectl command-line tool is installed and configured to communicate with your cluster.
-
Redpanda cluster: Ensure you have at least version v2.3.0-24.3.1 of the Redpanda Operator and a Redpanda resource deployed and accessible.
Create a schema
-
Define a schema using the Schema resource. Here’s a basic example configuration that defines an Avro schema:
schema.yaml
apiVersion: cluster.redpanda.com/v1alpha2 kind: Schema metadata: name: example-schema namespace: <namespace> spec: cluster: clusterRef: name: <cluster-name> schemaType: avro compatibilityLevel: Backward text: | { "type": "record", "name": "ExampleRecord", "fields": [ { "type": "string", "name": "field1" }, { "type": "int", "name": "field2" } ] }
yamlReplace the following placeholders:
-
<namespace>
: The namespace in which to deploy the Schema resource. The Schema resource must be deployed in the same namespace as the Redpanda resource defined inclusterRef.name
. -
<cluster-name>
: The name of the Redpanda resource that defines the Redpanda cluster to which you want to upload the schema.
-
-
Apply the manifest:
kubectl apply -f schema.yaml --namespace <namespace>
bashWhen the manifest is applied, the schema will be created in your Redpanda cluster.
-
Check the status of the Schema resource using the following command:
kubectl get schema example-schema --namespace <namespace>
bash -
Create an alias to simplify running
rpk
commands on your cluster:alias internal-rpk="kubectl --namespace <namespace> exec -i -t <pod-name> -c redpanda -- rpk"
bashReplace
<pod-name>
with the name of a Pod that’s running Redpanda. -
Verify that the schema was created in Redpanda:
internal-rpk registry subject list
bashYou should see
example-schema
in the output.
Schema examples
These examples demonstrate how to define schemas in Avro, Protobuf, and JSON Schema formats.
Create an Avro schema
avro-schema.yaml
# This manifest creates an Avro schema named "customer-profile" in the "basic" cluster.
# The schema defines a record with fields for customer ID, name, and age.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: customer-profile
spec:
cluster:
clusterRef:
name: basic
schemaType: avro
compatibilityLevel: Backward
text: |
{
"type": "record",
"name": "CustomerProfile",
"fields": [
{ "type": "string", "name": "customer_id" },
{ "type": "string", "name": "name" },
{ "type": "int", "name": "age" }
]
}
Create a Protobuf schema
proto-schema.yaml
# This manifest creates a Protobuf schema named "product-catalog" in the "basic" cluster.
# The schema defines a message "Product" with fields for product ID, name, price, and category.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: product-catalog
spec:
cluster:
clusterRef:
name: basic
schemaType: protobuf
compatibilityLevel: Backward
text: |
syntax = "proto3";
message Product {
int32 product_id = 1;
string product_name = 2;
double price = 3;
string category = 4;
}
Create a JSON schema
json-schema.yaml
# This manifest creates a JSON schema named "order-event" in the "basic" cluster.
# The schema requires an "order_id" (string) and a "total" (number) field, with no additional properties allowed.
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: order-event
spec:
cluster:
clusterRef:
name: basic
schemaType: json
compatibilityLevel: None
text: |
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"order_id": { "type": "string" },
"total": { "type": "number" }
},
"required": ["order_id", "total"],
"additionalProperties": false
}
Configuration
The Schema resource in Redpanda offers various options to customize and control schema behavior. This section covers schema compatibility, schema references, and schema types, providing a detailed guide on using each of these features to maintain data integrity, manage dependencies, and facilitate schema evolution.
You can find all configuration options for the Schema resource in the CRD reference.
schema.yaml
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: <subject-name>
namespace: <namespace>
spec:
cluster:
clusterRef:
name: <cluster-name>
schemaType: avro
compatibilityLevel: Backward
references: []
text: |
{
"type": "record",
"name": "test",
"fields": [
{ "type": "string", "name": "field1" },
{ "type": "int", "name": "field2" }
]
}
1 | Subject name: The name of the subject for the schema. When data formats are updated, a new version of the schema can be registered under the same subject, enabling backward and forward compatibility. |
2 | Namespace: The namespace in which to deploy the Schema resource. The Schema resource must be deployed in the same namespace as the Redpanda resource defined in clusterRef.name . |
3 | Cluster name: The name of the Redpanda resource that defines the Redpanda cluster to which you want to upload the schema. |
4 | Compatibility level: Defines the compatibility level for the schema. Options are Backward (default), BackwardTransitive , Forward , ForwardTransitive Full , FullTransitive , or None . See Choose a compatibility mode. |
5 | Schema type: Specifies the type of the schema. Options are avro (default) or protobuf . For JSON Schema, include "$schema": in the text to indicate the JSON Schema draft version. See Choose a schema type. |
6 | References: Any references you want to add to other schemas. If no references are needed, this can be an empty list (default). See Use schema references. |
7 | Schema body: The body of the schema, which defines the data structure. |
Choose a schema type
Redpanda’s Schema Registry supports the following schema types:
-
Avro: A widely used serialization format in event-driven architectures.
-
Protobuf: Popular for defining data structures in gRPC APIs and efficient data serialization.
-
JSON Schema: Dynamic, schema-based validation for JSON documents.
If no type is specified, Redpanda defaults to Avro.
Choose a compatibility mode
Compatibility modes determine how schema versions within a subject can evolve without breaking existing data consumers. Redpanda supports the following compatibility levels:
-
None
: Disables compatibility checks, allowing any schema change. -
Backward
: Consumers using the new schema (for example, version 10) can read data from producers using the previous schema (for example, version 9). -
BackwardTransitive
: Enforces backward compatibility across all versions, not just the latest. -
Forward
: Consumers using the previous schema (for example, version 9) can read data from producers using the new schema (for example, version 10). -
ForwardTransitive
: Ensures forward compatibility across all schema versions. -
Full
: Combines backward and forward compatibility, requiring that changes maintain compatibility in both directions. A new schema and the previous schema (for example, versions 10 and 9) are both backward and forward-compatible with each other. -
FullTransitive
: Enforces full compatibility across all schema versions.
For example, to set full compatibility, configure the Schema resource with:
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: fully-compatible-schema
namespace: redpanda
spec:
cluster:
clusterRef:
name: basic
schemaType: avro
compatibilityLevel: Full
text: |
{
"type": "record",
"name": "ExampleRecord",
"fields": [
{ "type": "string", "name": "field1" },
{ "type": "int", "name": "field2" }
]
}
Compatibility settings are essential for maintaining data consistency, especially when updating schemas over time.
Use schema references
For complex data structures, you can define schema references that allow one schema to reference another, enabling modular and reusable schema components. Schema references are helpful for shared data structures across topics like product information or user profiles, reducing redundancy.
This feature is supported for Avro and Protobuf schemas. |
Define a schema reference using the references
field. The reference includes the name, subject, and version of the referenced schema:
apiVersion: cluster.redpanda.com/v1alpha2
kind: Schema
metadata:
name: order-schema
namespace: redpanda
spec:
cluster:
clusterRef:
name: basic
references:
- name: product-schema
subject: product
version: 1
text: |
{
"type": "record",
"name": "Order",
"fields": [
{ "name": "product", "type": "Product" }
]
}
Update a schema
To update a schema, modify the Schema resource and apply the changes:
kubectl apply -f <manfiest-filename>.yaml --namespace <namespace>
Check schema version
Ensure the schema has been versioned by running:
kubectl get schema <subject-name> --namespace <namespace>
You can also check specific versions of the schema:
internal-rpk registry schema get --id 1
internal-rpk registry schema get --id 2
Delete a schema
To delete a schema, use the following command:
kubectl delete schema <subject-name> --namespace redpanda
Verify that the schema was deleted by checking the Redpanda Schema Registry:
internal-rpk registry subject list