Init BemiDB
exAspArk committed Nov 5, 2024
0 parents commit 3f62e96
Showing 33 changed files with 5,604 additions and 0 deletions.
17 changes: 17 additions & 0 deletions .env.sample
@@ -0,0 +1,17 @@
BEMIDB_PORT=54321
BEMIDB_DATABASE=bemidb
BEMIDB_LOG_LEVEL=DEBUG

# Local storage
BEMIDB_STORAGE_TYPE=LOCAL
BEMIDB_ICEBERG_PATH=../iceberg

# S3 storage
# BEMIDB_STORAGE_TYPE=AWS_S3
# BEMIDB_ICEBERG_PATH=iceberg
# BEMIDB_AWS_REGION=us-west-1
# BEMIDB_AWS_S3_BUCKET=us-west-1-bemidb-iceberg-test
# BEMIDB_AWS_ACCESS_KEY_ID=[REPLACE_ME]
# BEMIDB_AWS_SECRET_ACCESS_KEY=[REPLACE_ME]

PG_DATABASE_URL=postgres://username:password@localhost:5432/database_name
26 changes: 26 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,26 @@
name: Build

on:
  push:
    branches: ['**']

jobs:
  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set Up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.23.1'

      - name: Install Dependencies
        run: go get .
        working-directory: ./src

      - name: Run Tests
        run: go test -v ./...
        working-directory: ./src
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
/iceberg
/iceberg-test
.env
661 changes: 661 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

26 changes: 26 additions & 0 deletions Makefile
@@ -0,0 +1,26 @@
sh:
	devbox shell

install:
	devbox run "cd src && go mod tidy"

up:
	devbox run --env-file .env "cd src && go run ."

build:
	devbox run "go build -C src -o ../bemidb"

sync:
	devbox run --env-file .env "cd src && go run . sync"

test:
	devbox run "cd src && go test ./..."

debug:
	devbox run "cd src && dlv test github.com/BemiHQ/BemiDB"

lint:
	devbox run "cd src && go fmt"

outdated:
	devbox run "cd src && go list -u -m -f '{{if and .Update (not .Indirect)}}{{.}}{{end}}' all"
164 changes: 164 additions & 0 deletions README.md
@@ -0,0 +1,164 @@
# BemiDB

BemiDB is a Postgres read replica optimized for analytics.
It consists of a single binary that seamlessly connects to a Postgres database, replicates the data in a compressed columnar format,
and allows running complex queries using the Postgres-compatible analytical query engine.

## Contents

- [Highlights](#highlights)
- [Quickstart](#quickstart)
- [Configuration](#configuration)
  - [Local disk storage](#local-disk-storage)
  - [S3 block storage](#s3-block-storage)
- [Architecture](#architecture)
- [Roadmap](#roadmap)
- [Development](#development)
- [License](#license)

## Highlights

- **Single Binary**: consists of a single binary that can be run on any machine.
- **Postgres Replication**: automatically syncs data from Postgres databases.
- **Query Engine**: embeds a query engine optimized for analytical workloads.
- **Compressed Data**: uses an open columnar format for tables with compression.
- **Scalable Storage**: storage is separated from compute and can natively work on S3.
- **Postgres-Compatible**: integrates with any services and tools in the Postgres ecosystem.
- **Open-Source**: released under an OSI-approved license.

## Quickstart

Install BemiDB:

```sh
curl -sSL https://api.bemidb.com/install.sh | bash
```

Sync data from a Postgres database:

```sh
bemidb sync --pg-database-url postgres://postgres:postgres@localhost:5432/dbname
```

Run the BemiDB database:

```sh
bemidb start
```

Run Postgres queries on top of the BemiDB database:

```sh
# List all tables
psql postgres://localhost:54321/bemidb -c "SELECT * FROM information_schema.tables"

# Query a table
psql postgres://localhost:54321/bemidb -c "SELECT COUNT(*) FROM [table_name]"
```
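
The connection URL in the queries above combines the port and the database name. As a minimal sketch, the helper below builds that URL from `BEMIDB_PORT` and `BEMIDB_DATABASE` (the variable names used in `.env.sample`) and falls back to `54321` and `bemidb` when they are unset. The `bemidb_url` function is a hypothetical convenience for your own shell, not something BemiDB provides.

```sh
# Hypothetical helper (not part of BemiDB): prints the connection URL,
# using BEMIDB_PORT / BEMIDB_DATABASE if set in the current shell and
# the defaults from this README (54321, bemidb) otherwise.
bemidb_url() {
  printf 'postgres://localhost:%s/%s\n' \
    "${BEMIDB_PORT:-54321}" "${BEMIDB_DATABASE:-bemidb}"
}

# Usage (requires psql and a running BemiDB):
#   psql "$(bemidb_url)" -c "SELECT * FROM information_schema.tables"
```

Keeping the URL in one place avoids editing every `psql` command if you start BemiDB on a different port or with a different database name.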

## Configuration

### Local disk storage

By default, BemiDB stores data on the local disk.
Here is an example of running BemiDB with default settings and storing data in a local `iceberg` directory:

```sh
# Data is stored under $PWD/iceberg/*
bemidb start \
  --port 54321 \
  --database bemidb \
  --storage-type LOCAL \
  --iceberg-path ./iceberg \
  --init-sql ./init.sql \
  --log-level INFO
```

### S3 block storage

BemiDB natively supports S3 storage. You can specify the S3 settings using the following flags:

```sh
# Data is stored under s3://[AWS_S3_BUCKET]/iceberg/*
bemidb start \
  --port 54321 \
  --database bemidb \
  --storage-type AWS_S3 \
  --iceberg-path iceberg \
  --aws-region us-east-1 \
  --aws-s3-bucket [AWS_S3_BUCKET] \
  --aws-access-key-id [AWS_ACCESS_KEY_ID] \
  --aws-secret-access-key [AWS_SECRET_ACCESS_KEY]
```
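
The two storage types map `--iceberg-path` to different locations: `$PWD/iceberg/*` on local disk and `s3://[AWS_S3_BUCKET]/iceberg/*` on S3. The sketch below spells out that mapping so you can check where data should appear before running a sync. The `iceberg_location` function is hypothetical and only mirrors the two examples above; treating absolute local paths as-is is an assumption, not documented BemiDB behavior.

```sh
# Hypothetical helper (not part of BemiDB): prints the location implied
# by a storage type and an iceberg path.
# Arguments: <storage-type> <iceberg-path> [<s3-bucket>]
iceberg_location() {
  case "$1" in
    LOCAL)
      case "$2" in
        /*) printf '%s\n' "$2" ;;                 # absolute path kept as-is (assumption)
        *)  printf '%s/%s\n' "$PWD" "${2#./}" ;;  # relative path resolved against $PWD
      esac
      ;;
    AWS_S3)
      printf 's3://%s/%s\n' "$3" "$2"
      ;;
    *)
      echo "unknown storage type: $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   iceberg_location LOCAL ./iceberg
#   iceberg_location AWS_S3 iceberg my-bucket
```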

Here is the minimal IAM policy required for BemiDB to work with S3:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::[AWS_S3_BUCKET]",
        "arn:aws:s3:::[AWS_S3_BUCKET]/*"
      ]
    }
  ]
}
```
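
The policy needs the bucket name in two places: the bucket ARN itself (for `s3:ListBucket`) and the `/*` object ARN (for the object-level actions). As a rough sketch, a small shell function can fill in both from a single argument. The `bemidb_iam_policy` name is hypothetical, and the optional `Sid` field is omitted here for brevity.

```sh
# Hypothetical helper (not part of BemiDB): prints the policy above for
# the bucket name passed as the first argument.
bemidb_iam_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket", "s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::$1", "arn:aws:s3:::$1/*"]
    }
  ]
}
EOF
}

# Usage:
#   bemidb_iam_policy my-bucket > policy.json
```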

## Architecture

BemiDB consists of the following main components:

- **Database Server**: implements the [Postgres protocol](https://www.postgresql.org/docs/current/protocol.html) to enable Postgres compatibility.
- **Query Engine**: embeds the [DuckDB](https://duckdb.org/) query engine to run analytical queries.
- **Storage Layer**: uses the [Iceberg](https://iceberg.apache.org/) table format to store data in compressed columnar Parquet files.
- **Postgres Connector**: connects to a Postgres database to sync tables' schema and data.

<img src="/img/architecture.png" alt="Architecture" width="720px">

## Roadmap

- [ ] Native support for complex data structures like JSON and arrays.
- [ ] Incremental data synchronization into Iceberg tables.
- [ ] Direct Postgres-compatible write operations.
- [ ] Real-time replication from Postgres using CDC.
- [ ] TLS and authentication support for Postgres connections.
- [ ] Iceberg table compaction and partitioning.
- [ ] Cache layer for frequently accessed data.
- [ ] Support for materialized views.

## Development

We develop BemiDB using [Devbox](https://www.jetify.com/devbox) to ensure a consistent development environment without relying on Docker.

To start developing BemiDB and run tests, follow these steps:

```sh
cp .env.sample .env
make install
make test
```

To run BemiDB locally, use the following command:

```sh
make up
```

To sync data from a Postgres database, use the following command:

```sh
make sync
```

## License

Distributed under the terms of the [AGPL-3.0 License](/LICENSE). If you need to modify and distribute the code, please release it to contribute back to the open-source community.
14 changes: 14 additions & 0 deletions devbox.json
@@ -0,0 +1,14 @@
{
  "$schema": "https://raw.githubusercontent.com/jetify-com/devbox/0.13.1/.schema/devbox.schema.json",
  "packages": [
    "go@latest"
  ],
  "shell": {
    "init_hook": [],
    "scripts": {
      "test": [
        "echo \"Error: no test specified\" && exit 1"
      ]
    }
  }
}
53 changes: 53 additions & 0 deletions devbox.lock
@@ -0,0 +1,53 @@
{
  "lockfile_version": "1",
  "packages": {
    "go@latest": {
      "last_modified": "2024-09-10T15:01:03Z",
      "resolved": "github:NixOS/nixpkgs/5ed627539ac84809c78b2dd6d26a5cebeb5ae269#go_1_23",
      "source": "devbox-search",
      "version": "1.23.1",
      "systems": {
        "aarch64-darwin": {
          "outputs": [
            {
              "name": "out",
              "path": "/nix/store/nvaay1c4banbccyvv6ba1gzyqpypjmfq-go-1.23.1",
              "default": true
            }
          ],
          "store_path": "/nix/store/nvaay1c4banbccyvv6ba1gzyqpypjmfq-go-1.23.1"
        },
        "aarch64-linux": {
          "outputs": [
            {
              "name": "out",
              "path": "/nix/store/9ylsay11jb3p6yarkmlz0fin76cdypwa-go-1.23.1",
              "default": true
            }
          ],
          "store_path": "/nix/store/9ylsay11jb3p6yarkmlz0fin76cdypwa-go-1.23.1"
        },
        "x86_64-darwin": {
          "outputs": [
            {
              "name": "out",
              "path": "/nix/store/zkg5xhyx2rs03dq0qp14nqlx9ff1y5c5-go-1.23.1",
              "default": true
            }
          ],
          "store_path": "/nix/store/zkg5xhyx2rs03dq0qp14nqlx9ff1y5c5-go-1.23.1"
        },
        "x86_64-linux": {
          "outputs": [
            {
              "name": "out",
              "path": "/nix/store/mi0ybwsm6pmxzv9hsm6bcbqaq1pkf8wh-go-1.23.1",
              "default": true
            }
          ],
          "store_path": "/nix/store/mi0ybwsm6pmxzv9hsm6bcbqaq1pkf8wh-go-1.23.1"
        }
      }
    }
  }
}
Binary file added img/architecture.png