Flatten JSON Messages
This example uses Redpanda data transforms to read JSON messages from an input topic and flatten them, joining nested keys with a customizable delimiter.
For example, with the default delimiter (.), the following input:
{
  "content": {
    "id": 123,
    "name": {
      "first": "Dave",
      "middle": null,
      "last": "Voutila"
    },
    "data": [1, "fish", 2, "fish"]
  }
}
is flattened to:
{
  "content.id": 123,
  "content.name.first": "Dave",
  "content.name.middle": null,
  "content.name.last": "Voutila",
  "content.data": [1, "fish", 2, "fish"]
}
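The flattening rule in the example above can be sketched in a few lines of Go. This is a minimal illustration built on the standard library's encoding/json, not the lab's actual implementation: the names flatten and FlattenJSON are hypothetical, and the sketch assumes each message is a single JSON object. Arrays and scalar values are copied unchanged, as in the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// flatten copies every leaf of src into dst, joining nested object keys
// with delim. Arrays and scalars are copied as-is. An empty nested object
// contributes no keys. (Hypothetical helper, not the lab's code.)
func flatten(prefix string, src map[string]any, delim string, dst map[string]any) {
	for key, value := range src {
		path := key
		if prefix != "" {
			path = prefix + delim + key
		}
		if nested, ok := value.(map[string]any); ok {
			flatten(path, nested, delim, dst)
			continue
		}
		dst[path] = value
	}
}

// FlattenJSON decodes one JSON object, flattens it, and encodes the result.
func FlattenJSON(input []byte, delim string) ([]byte, error) {
	var doc map[string]any
	if err := json.Unmarshal(input, &doc); err != nil {
		return nil, err
	}
	flat := make(map[string]any)
	flatten("", doc, delim, flat)
	return json.Marshal(flat)
}

func main() {
	in := []byte(`{"content":{"id":123,"name":{"first":"Dave","middle":null,"last":"Voutila"},"data":[1,"fish",2,"fish"]}}`)
	out, err := FlattenJSON(in, ".")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because encoding/json writes map keys in sorted order, the keys in this sketch's output may appear in a different order than in the example above.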
Prerequisites
You must have the following:
- At least version 1.20 of Go installed on your host machine.
- rpk installed on your host machine.
- Docker and Docker Compose installed on your host machine.
Limitations
- Arrays of objects are currently untested.
- Providing a series of objects as input, rather than an array, may result in a series of flattened objects as output.
- Due to how JSON treats floating point values, values such as 1.0 that can be converted to an integer will lose the decimal point. For example, 1.0 becomes 1.
Run the lab
- Clone this repository:
git clone https://github.com/redpanda-data/redpanda-labs.git
- Change into the data-transforms/go/flatten/ directory:
cd redpanda-labs/data-transforms/go/flatten
- Set the REDPANDA_VERSION environment variable to at least version 23.3.1, the release that introduced data transforms. For all available versions, see the GitHub releases. For example:
export REDPANDA_VERSION=24.3.8
- Set the REDPANDA_CONSOLE_VERSION environment variable to the version of Redpanda Console that you want to run. For all available versions, see the GitHub releases. For example:
export REDPANDA_CONSOLE_VERSION=2.8.4
- Start Redpanda in Docker by running the following command:
docker compose up -d --wait
- Set up your rpk profile:
rpk profile create flatten --from-profile profile.yml
- Create the required topics src and sink:
rpk topic create src sink
- Build and deploy the transforms function:
rpk transform build
rpk transform deploy --input-topic=src --output-topic=sink
  This example accepts the following environment variables:
  - RP_FLATTEN_DELIM: The delimiter to use when flattening the JSON fields. Defaults to a period (.).
  For example:
  rpk transform deploy --var "RP_FLATTEN_DELIM=<delimiter>"
- Produce a JSON message to the source topic:
rpk topic produce src
- Paste the following into the prompt and press Ctrl+C to exit:
{"message": "success", "timestamp": 1707743943, "iss_position": {"latitude": "-28.5723", "longitude": "-149.4612"}}
- Consume the sink topic to see the flattened result:
rpk topic consume sink --num 1
Example output:
{
  "topic": "sink",
  "value": "{\n \"message\": \"success\" \"timestamp\": 1.707743943e+09 \"iss_position.latitude\": \"-28.5723\",\n \"iss_position.longitude\": \"-149.4612\"\n}\n",
  "timestamp": 1707744765541,
  "partition": 0,
  "offset": 0
}
You can also see this in Redpanda Console.
Clean up
To shut down and delete the containers along with all your cluster data:
docker compose down -v