Comparison with Other Systems

JDBIN is not a "copy" of any single existing system. It has traits from several systems, but its architectural starting point is different.

Short answer

JDBIN is not a new Parquet and not a new DuckDB. It borrows ideas such as segments, indexes, immutable structure, and selective reads, but combines them around Cloudflare R2, HTTP byte-range fetches, and a Worker-based query engine.

1. Parquet

Parquet is a file format.

CSV
  -> Parquet file
  -> Spark / DuckDB / Athena / BigQuery

Parquet includes things like:

columnar storage
metadata
pages
compression
statistics

But it is not a query engine. It only says: "this is how the data is stored."

JDBIN differs because it includes not only a format, but also a Worker-based query path.
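Parquet's metadata lives in a footer at the end of the file, so a reader can locate it with a small ranged read of the file tail. The sketch below shows that calculation in TypeScript. The `ByteRange` type and the `footerRange` function are hypothetical names chosen for this example; only the tail layout (a 4-byte little-endian length followed by the magic bytes "PAR1") comes from the Parquet format itself.

```typescript
// Inclusive byte range, as used in an HTTP Range header.
interface ByteRange {
  start: number;
  end: number;
}

// Given the total file size and the last 8 bytes of a Parquet file,
// compute where the footer metadata sits. The tail is a 4-byte
// little-endian metadata length followed by the magic bytes "PAR1".
function footerRange(fileSize: number, tail: Uint8Array): ByteRange {
  if (tail.byteLength !== 8) throw new Error("expected the last 8 bytes");
  const magic = String.fromCharCode(tail[4], tail[5], tail[6], tail[7]);
  if (magic !== "PAR1") throw new Error("missing PAR1 magic");
  const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength);
  const metadataLength = view.getUint32(0, true);
  // The file also starts with a 4-byte magic, so the footer cannot begin before offset 4.
  const start = fileSize - 8 - metadataLength;
  if (start < 4) throw new Error("footer length exceeds file size");
  return { start, end: fileSize - 9 };
}

// Example: a 1000-byte file whose tail declares 100 bytes of metadata.
const tail = new Uint8Array([100, 0, 0, 0, 0x50, 0x41, 0x52, 0x31]);
const footer = footerRange(1000, tail);
console.log(footer);
```

A reader still needs an engine to interpret the footer and decide which pages to fetch next, which is the part Parquet itself does not provide.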

2. Apache ORC

ORC follows a very similar idea to Parquet.

It adds things like:

bloom filters
indexes
statistics
predicate pushdown

But it is still not a query engine by itself.
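The statistics listed above are what make predicate pushdown possible: a reader can skip any chunk whose minimum and maximum values cannot satisfy the filter. A minimal TypeScript sketch of that idea follows. The `ChunkStats` shape and the `pruneChunks` function are hypothetical and do not reflect the actual ORC or Parquet APIs.

```typescript
// Hypothetical per-chunk statistics, loosely modeled on the min/max
// values that columnar formats keep for each column chunk or stripe.
interface ChunkStats {
  id: number;
  min: number;
  max: number;
}

// Keep only the chunks whose [min, max] interval overlaps the requested
// [lo, hi] interval. Every other chunk can be skipped without reading it.
function pruneChunks(chunks: ChunkStats[], lo: number, hi: number): ChunkStats[] {
  return chunks.filter((c) => c.max >= lo && c.min <= hi);
}

const chunks: ChunkStats[] = [
  { id: 0, min: 1, max: 100 },
  { id: 1, min: 101, max: 200 },
  { id: 2, min: 201, max: 300 },
];

// A filter such as "value BETWEEN 150 AND 220" only needs chunks 1 and 2.
const survivors = pruneChunks(chunks, 150, 220).map((c) => c.id);
console.log(survivors);
```

The format supplies the statistics; deciding which filter to apply and when to apply it is left to whichever engine reads the file.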

3. DuckDB storage engine

DuckDB is a complete database engine that runs in-process.

SQL
  -> DuckDB parser
  -> optimizer
  -> execution engine
  -> storage engine
  -> Parquet / disk

DuckDB includes:

a SQL parser
an optimizer
an execution engine
a storage engine

The JDBIN path looks more like this:

Worker
  -> planner
  -> range reader
  -> JDBIN
  -> R2

The key difference is that, within its current scope, JDBIN does not have a general SQL parser. The documentation instead emphasizes a Worker-side query planner that issues only the byte-range fetches a query actually needs from R2.
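One way to picture a planner that fetches only what it needs is as a function that turns the selected segments into a short list of HTTP Range headers. The sketch below merges adjacent ranges so they can be fetched in one request. The `ByteRange` type, the `coalesce` and `toRangeHeader` functions, and the `maxGap` parameter are assumptions made for this example; the JDBIN documentation does not specify how its planner batches reads.

```typescript
// Inclusive byte range within one object.
interface ByteRange {
  start: number;
  end: number;
}

// Merge ranges that overlap or sit within maxGap bytes of each other,
// so that neighboring segments are fetched with a single request.
function coalesce(ranges: ByteRange[], maxGap: number): ByteRange[] {
  const sorted = [...ranges].sort((a, b) => a.start - b.start);
  const merged: ByteRange[] = [];
  for (const r of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && r.start - last.end - 1 <= maxGap) {
      last.end = Math.max(last.end, r.end);
    } else {
      merged.push({ start: r.start, end: r.end }); // copy, so inputs stay untouched
    }
  }
  return merged;
}

// Format a range as the value of an HTTP Range request header.
function toRangeHeader(r: ByteRange): string {
  return `bytes=${r.start}-${r.end}`;
}

const plan = coalesce(
  [
    { start: 4096, end: 8191 },
    { start: 0, end: 1023 },
    { start: 1024, end: 2047 },
  ],
  0,
);
const headers = plan.map(toRangeHeader);
console.log(headers);
```

With `maxGap` set to 0, only directly adjacent or overlapping ranges are merged. A larger value trades a few wasted bytes for fewer round trips.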

4. ClickHouse MergeTree

This is one of the closest comparisons.

MergeTree uses things like:

parts
indexes
marks
granules

so that only a small fraction of the stored data needs to be read.

JDBIN follows a similar idea:

Worker
  -> planner
  -> range read
  -> segment
  -> string pool
  -> manifest

The major difference is that MergeTree typically works on local disk inside its own database process, whereas JDBIN targets object storage, here specifically R2.
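The marks-and-granules idea can be sketched as a sparse index: one entry per granule records the first key and the granule's byte offset, and a binary search finds the single granule that may contain a given key. The TypeScript below is an illustration only. `Mark` and `findGranule` are hypothetical names and reflect neither ClickHouse internals nor the JDBIN layout.

```typescript
// One mark per granule: the first key stored in it and where it begins.
interface Mark {
  firstKey: number;
  offset: number; // byte offset of the granule
}

// Return the index of the last mark whose firstKey <= key, or -1 when the
// key sorts before the first granule. Marks must be sorted by firstKey.
function findGranule(marks: Mark[], key: number): number {
  let lo = 0;
  let hi = marks.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (marks[mid].firstKey <= key) {
      found = mid; // candidate; keep searching to the right
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}

const marks: Mark[] = [
  { firstKey: 0, offset: 0 },
  { firstKey: 100, offset: 8192 },
  { firstKey: 200, offset: 16384 },
];

// Key 150 can only live in the granule that starts at key 100.
const granule = findGranule(marks, 150);
console.log(granule);
```

Only the chosen granule has to be read. In an object-storage setting that read becomes one ranged request starting at the mark's offset.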

5. SSTable

SSTable (sorted string table), the on-disk file format used by LSM-tree stores such as LevelDB and RocksDB, is a useful comparison.

Its model includes:

sorted keys
immutable structure
index
lookup path

When new data is written, a new SSTable is created instead of overwriting the old one.

The JDBIN write path:

delta
  -> manifest
  -> active pointer
  -> new view

is similar in the sense that a new view is built without directly overwriting old data.
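The write path above can be sketched as a manifest swap: a delta is published by writing a new manifest and then moving the active pointer, while older manifests stay untouched. The `Manifest` and `Store` shapes, the `publishDelta` function, and the object keys below are hypothetical. The JDBIN documentation names the steps but this example does not claim to reproduce its actual data structures.

```typescript
// Hypothetical manifest: an immutable list of object keys that together
// form one consistent view of the dataset.
interface Manifest {
  version: number;
  segments: readonly string[];
}

// Hypothetical store: every manifest published so far, plus a pointer
// naming the one that readers should use.
interface Store {
  manifests: Map<number, Manifest>;
  active: number;
}

// Publish a delta by writing a NEW manifest and then moving the pointer.
// Existing manifests and segments are never modified.
function publishDelta(store: Store, deltaKey: string): Manifest {
  const current = store.manifests.get(store.active);
  if (current === undefined) throw new Error("active manifest missing");
  const next: Manifest = {
    version: current.version + 1,
    segments: [...current.segments, deltaKey],
  };
  store.manifests.set(next.version, next);
  store.active = next.version; // the pointer swap is the only change readers observe
  return next;
}

const v1: Manifest = { version: 1, segments: ["base.seg"] };
const store: Store = { manifests: new Map([[1, v1]]), active: 1 };
const v2 = publishDelta(store, "delta-0001.seg");
console.log(store.active, v2.segments, v1.segments);
```

A reader that resolved the pointer before the swap keeps a complete, unchanged view, which is the same property SSTable-based stores get from never rewriting a table in place.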

6. Lucene

Lucene is not a database. It is a search indexing library.

It has things like:

an inverted index
segments
immutable segments

JDBIN is not an inverted index, but the segment-oriented thinking is comparable.

The real architectural difference

According to the documentation, the JDBIN architecture looks like this:

Client
  -> Cloudflare Worker
  -> query planner
  -> range reader
  -> R2 object
  -> JDBIN

The Worker fetches only the required byte ranges from R2 without a separate database server.

Most traditional systems look more like this:

Disk
  -> database process
  -> query engine
  -> client

In other words, the database process owns the disk and the query engine.

The JDBIN model instead looks like this:

Object Storage (R2)
  -> HTTP Range GET
  -> Cloudflare Worker
  -> planner
  -> JSON

That is a different architectural starting point.
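The flow from object storage to JSON can be illustrated end to end with an in-memory stand-in. In the sketch below a `Uint8Array` plays the role of the R2 object and `rangeGet` plays the role of the HTTP Range GET. The segment directory, the record layout (little-endian 32-bit integers), and all function names are invented for this example and say nothing about the real JDBIN binary format.

```typescript
// Hypothetical directory entry: where a segment lives inside the object.
interface SegmentEntry {
  offset: number; // byte offset of the segment within the object
  length: number; // segment length in bytes
}

// Stand-in for an HTTP Range GET: returns only the requested bytes.
// In a real Worker this would be a ranged read against R2.
function rangeGet(object: Uint8Array, offset: number, length: number): Uint8Array {
  return object.subarray(offset, offset + length);
}

// Decode a segment as a sequence of little-endian uint32 values.
// This record layout is made up for the example.
function decodeSegment(bytes: Uint8Array): number[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const values: number[] = [];
  for (let i = 0; i + 4 <= bytes.byteLength; i += 4) {
    values.push(view.getUint32(i, true));
  }
  return values;
}

// Build a 16-byte object holding two segments of two values each.
const object = new Uint8Array(16);
const writer = new DataView(object.buffer);
[10, 20, 30, 40].forEach((v, i) => writer.setUint32(i * 4, v, true));

const first: SegmentEntry = { offset: 0, length: 8 };
const second: SegmentEntry = { offset: 8, length: 8 };

// "Query": read only the second segment and return it as JSON.
const bytes = rangeGet(object, second.offset, second.length);
const json = JSON.stringify({ values: decodeSegment(bytes) });
console.log(json);
```

The first segment is never touched, and no process other than the function handling the request takes part in answering the query.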

Summary

Property                            Parquet          DuckDB           ClickHouse  SSTable  JDBIN
Binary format                       Yes              Yes              Yes         Yes      Yes
Own query planner                   No               Yes              Yes         Partly   Yes
Object storage native               No               No               No          No       Yes
Byte-range reads as main mechanism  Partly           No               Partly      No       Yes
Separate DB server required         Practically yes  No (in-process)  Yes         Yes      No

Safe closing formulation

JDBIN is not a new Parquet and not a new DuckDB. It combines segment, index, and immutable ideas into an object-storage-native model where the Cloudflare Worker acts as the query engine and R2 acts as the canonical storage layer.
