Benchmark Redpanda
Learn how to measure the performance of a Redpanda cluster deployed on AWS EC2 instances with the Linux Foundation’s OpenMessaging Benchmark. Run the same tests and workloads that Redpanda uses to demonstrate significantly better performance than Apache Kafka.
About OpenMessaging Benchmark
The Linux Foundation’s OpenMessaging Benchmark (OMB) Framework is an open-source, cloud-based benchmark framework that supports several messaging systems, including Kafka, and is configurable for workloads representing real-world use cases.
Redpanda Data provides a fork of OMB on GitHub with some updates:
-
Fixed coalescing of asynchronous consumer offset requests in the OMB Kafka driver.
-
Support for Kafka 3.2.0 clients.
OMB workloads
An OMB workload is a benchmark configuration that sets the producers, consumers, topics, and messages used by a test, as well as the production rate and duration of each test. An OMB workload is specified in a YAML configuration file.
Example workload configuration file
The content of an OMB workload configuration file, copied from Redpanda Data’s fork of OMB:
name: 1 topic / 1 partition / 1Kb
topics: 1
partitionsPerTopic: 1
keyDistributor: "NO_KEY"
messageSize: 1024
payloadFile: "payload/payload-1Kb.data"
subscriptionsPerTopic: 1
consumerPerSubscription: 1
producersPerTopic: 1
producerRate: 50000
consumerBacklogSizeGB: 0
testDurationMinutes: 15
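As a quick sanity check, the byte throughput implied by a workload can be estimated from producerRate and messageSize. The sketch below assumes that producerRate counts messages per second across all producers, which is consistent with the 500 MBps example later in this guide. The function names are illustrative and are not part of OMB.

```python
def target_throughput_mb_per_sec(producer_rate: int, message_size_bytes: int) -> float:
    """Target publish throughput in MB/s, using 1 MB = 1,000,000 bytes.

    Assumption: producer_rate is the total number of messages per second.
    """
    return producer_rate * message_size_bytes / 1_000_000


def total_messages(producer_rate: int, test_duration_minutes: int) -> int:
    """Number of messages published if the target rate is sustained for the whole test."""
    return producer_rate * 60 * test_duration_minutes


if __name__ == "__main__":
    # Values taken from the example workload above.
    print(target_throughput_mb_per_sec(50_000, 1024))  # 51.2
    print(total_messages(50_000, 15))                  # 45000000
```

Under that assumption, the example workload targets roughly 51 MB/s and about 45 million messages over its 15-minute run.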
The keyDistributor property configures how keys are distributed and assigned to messages.
- NO_KEY sets null for all keys.
- KEY_ROUND_ROBIN cycles through a finite set of keys in round-robin fashion.
- RANDOM_NANO returns random keys based on System.nanoTime().
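To make the three strategies concrete, here is a minimal Python sketch of how such key distributors could behave. It is not the OMB implementation (the OMB driver is written in Java). The key pool, the function names, and the way RANDOM_NANO reduces a clock reading to an index into a fixed pool are assumptions made for illustration, and time.perf_counter_ns() stands in for System.nanoTime().

```python
import itertools
import time
from typing import Iterator, Optional, Sequence


def no_key() -> Iterator[Optional[str]]:
    """NO_KEY: every message gets a null key."""
    return itertools.repeat(None)


def key_round_robin(keys: Sequence[str]) -> Iterator[str]:
    """KEY_ROUND_ROBIN: cycle through a finite key set in order."""
    return itertools.cycle(keys)


def random_nano(keys: Sequence[str]) -> Iterator[str]:
    """RANDOM_NANO: choose a key from a nanosecond clock reading.

    Assumption: the clock value is reduced modulo the pool size to pick a key.
    """
    while True:
        yield keys[time.perf_counter_ns() % len(keys)]


if __name__ == "__main__":
    pool = [f"key-{i}" for i in range(3)]  # hypothetical key pool
    print(list(itertools.islice(no_key(), 3)))
    print(list(itertools.islice(key_round_robin(pool), 5)))
    print(next(random_nano(pool)) in pool)
```

With no key, the client decides how messages are spread across partitions. A fixed key set pins each key to one partition, which matters when a workload is meant to exercise per-key ordering.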
Set up benchmark
Running OMB with Redpanda requires setting up your local environment to provision and start a Redpanda cluster in AWS.
-
Install CLI tools.
-
Ansible (v2.11 or higher)
-
Python 3 and pip
-
A terminal multiplexer, such as tmux or screen, that supports detachable sessions.
Redpanda Data recommends running the benchmark executable in a detachable session so that the benchmark continues to run in the background after you disconnect.
-
Clone the Redpanda Data fork of OMB.
git clone https://github.com/redpanda-data/openmessaging-benchmark
The repository contains a directory for the Redpanda driver, openmessaging-benchmark/driver-redpanda. Subsequent steps read and configure files in that directory.
-
Customize the openmessaging-benchmark/driver-redpanda/pom.xml file with your Kafka client version if necessary (currently 3.3.1):
...
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-clients</artifactId>
  <version>3.3.1</version>
</dependency>
...
-
From the repository root directory, build the benchmark client.
cd openmessaging-benchmark
mvn clean install -Dlicense.skip=true
-
From the Redpanda driver directory, install the Ansible roles required for deploying Redpanda.
cd driver-redpanda/deploy
ansible-galaxy install -r requirements.yaml
-
Configure AWS credentials and SSH keys.
-
Provision a Redpanda cluster to deploy on AWS with Terraform.
-
Customize the openmessaging-benchmark/driver-redpanda/deploy/terraform.tfvars Terraform configuration file for your environment.
Default Terraform configuration for Redpanda on AWS
The default contents of openmessaging-benchmark/driver-redpanda/deploy/terraform.tfvars:
public_key_path = "~/.ssh/redpanda_aws.pub"
region          = "us-west-2"
az              = "us-west-2a"
ami             = "ami-0d31d7c9fc9503726"
profile         = "default"
instance_types = {
  "redpanda"   = "i3en.6xlarge"
  "client"     = "m5n.8xlarge"
  "prometheus" = "c5.2xlarge"
}
num_instances = {
  "client"     = 4
  "redpanda"   = 3
  "prometheus" = 1
}
-
From the Redpanda driver deployment directory, initialize the Terraform deployment of Redpanda on AWS.
cd driver-redpanda/deploy
terraform init
terraform apply -auto-approve
The terraform apply command prompts you for an owner name (var.owner) that is used to tag all the cloud resources that will be created. Once the installation is complete, you will see a confirmation message listing the resources that have been installed.
-
-
Run the Ansible playbook to install and start the Redpanda cluster.
Redpanda can run with or without TLS and SASL enabled.
-
To run Redpanda without TLS and SASL:
ansible-playbook deploy.yaml
-
To run Redpanda with TLS and SASL:
ansible-playbook deploy.yaml -e "tls_enabled=true sasl_enabled=true"
If the path to your SSH private key isn’t ~/.ssh/redpanda_aws, add the --private-key flag to your Ansible command.
ansible-playbook deploy.yaml --private-key=<private-key-path>
Beginning with Ansible 2.14, references to args: warn within Ansible tasks cause a fatal error and halt the execution of the playbook. You may find instances of this in the components installed by ansible-galaxy, particularly in the cloudalchemy.grafana task in dashboards.yml. To resolve this issue, remove the warn line from the YAML file.
-
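If the playbook halts on args: warn, a small script can help locate the offending lines before you edit them by hand. The following is a sketch only: it assumes that the roles installed by ansible-galaxy live under ~/.ansible/roles (adjust the path if your roles are installed elsewhere), and find_warn_lines is a hypothetical helper name. It reports matches and does not modify any file.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches a YAML mapping key named "warn" at any indentation, e.g. "    warn: false".
WARN_LINE = re.compile(r"^\s*warn\s*:")


def find_warn_lines(roles_dir: Path) -> List[Tuple[Path, int, str]]:
    """Return (file, line number, stripped line) for each 'warn:' key in YAML files."""
    hits = []
    for path in sorted(roles_dir.rglob("*")):
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if WARN_LINE.match(line):
                hits.append((path, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Assumption: default location of roles installed by ansible-galaxy.
    roles = Path.home() / ".ansible" / "roles"
    if roles.is_dir():
        for path, number, text in find_warn_lines(roles):
            print(f"{path}:{number}: {text}")
```

Review each reported line before removing it, because the pattern matches any key named warn, not only those nested under args.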
Run benchmark
Connect to the benchmark’s client and run the benchmark with a custom workload.
-
Connect with SSH to the benchmark client, with its IP address retrieved from the client_ssh_host output of Terraform.
ssh -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host)
-
On the client, navigate to the /opt/benchmark directory.
cd /opt/benchmark
-
Create a workload configuration file. For example, create a .yaml file with one topic, 144 partitions, 500 MBps producer rate, four producers, and four consumers:
cat > workloads/1-topic-144-partitions-500mb-4p-4c.yaml << EOF
name: 500mb/sec rate; 4 producers 4 consumers; 1 topic with 144 partitions
topics: 1
partitionsPerTopic: 144
messageSize: 1024
useRandomizedPayloads: true
randomBytesRatio: 0.5
randomizedPayloadPoolSize: 1000
subscriptionsPerTopic: 1
consumerPerSubscription: 4
producersPerTopic: 4
producerRate: 500000
consumerBacklogSizeGB: 0
testDurationMinutes: 30
EOF
Alternatively, you can use an existing workload file from the Redpanda repo, in openmessaging-benchmark/driver-redpanda/deploy/workloads/.
Workloads from Redpanda vs. Kafka comparison
The workloads from the Redpanda vs. Kafka benchmark comparison are listed in the chart in that comparison.
-
Create or reuse a client configuration file. This file configures the Redpanda producer and consumer clients, as well as topics.
The rest of the guide uses the openmessaging-benchmark/driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml configuration file.
Client configuration from Redpanda vs. Kafka comparison
The client configuration from the Redpanda vs. Kafka benchmark comparison is available in the code listing in that comparison:
topicConfig: |
  min.insync.replicas=2
  flush.messages=1
  flush.ms=0
producerConfig: |
  acks=all
  linger.ms=1
  batch.size=131072
consumerConfig: |
  auto.offset.reset=earliest
  enable.auto.commit=false
  auto.commit.interval.ms=0
  max.partition.fetch.bytes=131072
Configure reset=false and manually delete the generated topic after the benchmark completes. Otherwise, when reset=true, the benchmark can fail because it erroneously tries to delete the _schemas topic.
-
Run the benchmark with your workload and client configuration.
sudo bin/benchmark -d \
  driver-redpanda/redpanda-ack-all-group-linger-1ms.yaml \
  workloads/1-topic-144-partitions-500mb-4p-4c.yaml
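If you plan to compare several variations of a workload (for example, different partition counts), generating the workload files from a template avoids copy-and-paste mistakes. The following Python sketch renders the same fields used in the example workload above. render_workload and its parameters are hypothetical helpers, not part of OMB, and the name value is written unquoted, so avoid characters that YAML treats specially (such as ": " or " #").

```python
def render_workload(name: str, partitions: int, producer_rate: int,
                    producers: int, consumers: int,
                    message_size: int = 1024, duration_minutes: int = 30) -> str:
    """Render an OMB workload file using the fields from the example in this guide."""
    fields = {
        "name": name,  # written unquoted; keep it free of YAML-special sequences
        "topics": 1,
        "partitionsPerTopic": partitions,
        "messageSize": message_size,
        "useRandomizedPayloads": "true",
        "randomBytesRatio": 0.5,
        "randomizedPayloadPoolSize": 1000,
        "subscriptionsPerTopic": 1,
        "consumerPerSubscription": consumers,
        "producersPerTopic": producers,
        "producerRate": producer_rate,
        "consumerBacklogSizeGB": 0,
        "testDurationMinutes": duration_minutes,
    }
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


if __name__ == "__main__":
    # Print workloads for a small, hypothetical partition-count sweep.
    for partitions in (48, 144, 288):
        print(render_workload(f"1 topic with {partitions} partitions",
                              partitions, 500_000, producers=4, consumers=4))
```

Write each rendered string to a file under workloads/ on the client, then pass that file to bin/benchmark as in the step above.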
View benchmark results
After a run completes, the benchmark generates results as *.json files in /opt/benchmark.
Redpanda provides a Python script, generate_charts.py, to generate charts of benchmark results. To run the script:
-
Copy the results from the client to your local machine.
exit; # back to your local machine
mkdir ~/results
scp -i ~/.ssh/redpanda_aws ubuntu@$(terraform output --raw client_ssh_host):/opt/benchmark/*.json ~/results/
-
From the bin directory of the repository, install the prerequisite packages for the Python script.
cd ../../bin # openmessaging-benchmark/bin
python3 -m pip install -r requirements.txt
-
To list all options, run the script with the -h flag.
./generate_charts.py -h
-
To generate charts from your ~/results/ directory, first create an ~/output directory, then run the script with --results and --output options set accordingly.
mkdir ~/output
./generate_charts.py --results ~/results --output ~/output
-
In ~/output, verify that the generated charts are in an HTML page with charts for throughput, publish latency, end-to-end latency, publish rate, and consume rate.
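For a quick look at the raw numbers without opening the charts, the result files can also be inspected directly. The sketch below makes no assumption about the field names in the result JSON, which may differ between OMB versions. It reports the mean and maximum of every top-level field that holds a list of numbers. summarize_result is a hypothetical helper, and ~/results is the directory created in the steps above.

```python
import json
import statistics
from pathlib import Path
from typing import Dict, Tuple


def summarize_result(path: Path) -> Dict[str, Tuple[float, float]]:
    """Return {field: (mean, max)} for each top-level list of numbers in a result file."""
    data = json.loads(path.read_text(encoding="utf-8"))
    summary: Dict[str, Tuple[float, float]] = {}
    if not isinstance(data, dict):
        return summary
    for field, value in data.items():
        is_numeric_list = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value)
        )
        if is_numeric_list:
            summary[field] = (statistics.fmean(value), max(value))
    return summary


if __name__ == "__main__":
    results_dir = Path.home() / "results"
    if results_dir.is_dir():
        for result_file in sorted(results_dir.glob("*.json")):
            print(result_file.name)
            for field, (mean, peak) in summarize_result(result_file).items():
                print(f"  {field}: mean={mean:.2f} max={peak:.2f}")
```

This is only a convenience for eyeballing a run. The generate_charts.py script remains the supported way to visualize results.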
Tear down benchmark
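When done running the benchmark, tear down the Redpanda cluster. Run the command from the driver-redpanda/deploy directory, where the Terraform deployment was initialized.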
terraform destroy -auto-approve