git-meta blog

publishing metadata to the right places

Sun, 31 May 2026 00:00:00 GMT

One of the awkward things about metadata is that not all of it wants the same audience.

Some metadata is obviously public. Review status, ownership hints, generated documentation notes, compatibility information, provenance for a generated patch. If the repository is public, that metadata can probably travel with it.

Some metadata is useful only inside a company. Internal ticket IDs, security scan details, deployment approvals, customer names, private model evaluations, or links to systems that only exist behind the VPN.

And some metadata should not leave your laptop at all. Cursor positions, local scratch state, experiments, personal agent notes, temporary import markers.

If all of that lives in one pile, pushing metadata becomes scary. So git-meta has serialize filters.

The default: publish normal keys

By default, metadata serializes to the main local metadata ref:

refs/meta/local/main

That ref can then be pushed to a remote metadata ref, normally something like:

refs/meta/main

So if you set ordinary keys:

git meta set project review:policy required
git meta set path:src/api.rs owner platform-team
git meta set commit:abc123 provenance:generator codegen-v2

then git meta serialize writes those keys to the normal serialized tree.

That is the happy path. If a key is not special and no filter matches it, it goes to main.

Local-only keys

There is one hard rule that does not need configuration:

local:*

Anything under local: is never serialized, no matter which target it is attached to.

For example:

git meta set project local:last-viewed-branch sc/experiment
git meta set path:src/api.rs local:cursor-line 142

Those values stay in your local SQLite database. They are not written to any Git tree, even if they are attached to a commit, branch, path, or project, and even if you add a filter that tries to route them somewhere.

Use this namespace for personal state, caches, UI hints, and anything that would be noise or a leak if another clone saw it.

Excluding keys

Sometimes a key is not purely local, but you still do not want it serialized. Add an exclude rule.

Filter rules are set members on the project target:

git meta set:add project meta:filter "exclude draft:**"

That means any key under draft: is skipped during serialization:

git meta set project draft:summary "not ready yet"
git meta set project draft:review-notes "too spicy"

The ** wildcard means zero or more key segments, so draft:title and draft:review:notes both match.

A plain * matches exactly one segment:

git meta set:add project meta:filter "exclude scratch:*"

That matches scratch:one, but not scratch:one:two.
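To make the two wildcards concrete, here is a minimal Python sketch of segment-based matching. It is not git-meta's actual matcher: the function names are made up for illustration, and it assumes keys and patterns are split on ":" and compared segment by segment. One consequence of reading ** as "zero or more segments" in this sketch is that draft:** also matches a bare draft key; whether the real tool behaves that way is an assumption.

```python
def key_matches(pattern: str, key: str) -> bool:
    """Match a colon-separated key against a filter pattern.

    '*' matches exactly one segment; '**' matches zero or more segments.
    Hypothetical helper, not part of git-meta.
    """
    return _match(pattern.split(":"), key.split(":"))


def _match(pat: list, segs: list) -> bool:
    if not pat:
        # Pattern exhausted: only a match if the key is exhausted too.
        return not segs
    head, rest = pat[0], pat[1:]
    if head == "**":
        # Let the wildcard consume 0, 1, 2, ... key segments.
        return any(_match(rest, segs[i:]) for i in range(len(segs) + 1))
    if not segs:
        return False
    if head == "*" or head == segs[0]:
        return _match(rest, segs[1:])
    return False
```

With this sketch, key_matches("scratch:*", "scratch:one:two") is false because * has already used up its single segment, while key_matches("draft:**", "draft:review:notes") is true.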

Routing keys to another ref

The more interesting filter is route.

A route rule keeps matching keys out of main and writes them to another local destination ref:

git meta set:add project meta:filter "route company:** company"

Now this key:

git meta set commit:abc123 company:jira PROJ-1234

serializes to:

refs/meta/local/company

instead of:

refs/meta/local/main

That gives you two separate publishable streams:

main for public metadata
company for internal metadata

The serialized trees use the same exchange format; they just live under different refs.

Shared filters vs personal filters

There are two places to put filter rules:

meta:filter
local:meta:filter

Use meta:filter for project policy. These rules are themselves shared metadata, so everyone can agree that company:** goes to the company-only destination.

Use local:meta:filter for your own publishing preferences. Since it is under local:, it is not serialized.

For example, a company might share this rule:

git meta set:add project meta:filter "route company:** company"

And you might locally add:

git meta set:add project local:meta:filter "route me:** mine"

Now company values go to the company destination, your personal values go to your private destination, and normal values still go to public main.

A public plus company workflow

Imagine an open source project maintained by a company. The public repository should get useful metadata:

git meta set commit:abc123 review:status approved
git meta set commit:abc123 provenance:agent codex
git meta set path:src/billing.rs owner billing-team

But internal systems should stay internal:

git meta set commit:abc123 company:jira BILL-2841
git meta set commit:abc123 company:security-scan scan-99812
git meta set path:src/billing.rs company:service-id billing-prod

Set the shared filter once:

git meta set:add project meta:filter "route company:** company"

Then serialize:

git meta serialize

You now have at least two local metadata refs:

refs/meta/local/main
refs/meta/local/company

The public ref can be pushed to the public host. The company ref can be pushed to an internal metadata remote or internal ref. The exact remote setup is host-tool policy, but the important bit is that the data is already separated before publishing.

Conceptually:

# public metadata
git push public refs/meta/local/main:refs/meta/main

# internal metadata
git push company refs/meta/local/company:refs/meta/main

Now public consumers can materialize review status, owners, and provenance without seeing internal Jira IDs. Company users can materialize the internal stream too.

Multiple destinations

A route can name more than one destination with commas:

git meta set:add project meta:filter "route audit:** company,audit"

That writes matching keys to both destination refs. This is useful when one class of metadata should go to a broad internal audience and also to a narrower archive or compliance stream.
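Putting the pieces so far together, the routing decision can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the real implementation: Rule, parse_rule, matches, and destination_refs are hypothetical names, the rule grammar is inferred from the examples above, and "first matching rule wins" is an assumption because the post does not define precedence between rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str              # "exclude" or "route"
    pattern: str
    destinations: tuple = () # empty for exclude rules


def parse_rule(text: str) -> Rule:
    """Parse 'exclude <pattern>' or 'route <pattern> <dest>[,<dest>...]'."""
    parts = text.split()
    if len(parts) == 2 and parts[0] == "exclude":
        return Rule("exclude", parts[1])
    if len(parts) == 3 and parts[0] == "route":
        dests = tuple(d for d in parts[2].split(",") if d)
        if dests:
            return Rule("route", parts[1], dests)
    raise ValueError(f"unrecognized filter rule: {text!r}")


def matches(pattern: str, key: str) -> bool:
    """'*' matches one segment, '**' matches zero or more segments."""
    def go(pat, segs):
        if not pat:
            return not segs
        if pat[0] == "**":
            return any(go(pat[1:], segs[i:]) for i in range(len(segs) + 1))
        return bool(segs) and pat[0] in ("*", segs[0]) and go(pat[1:], segs[1:])
    return go(pattern.split(":"), key.split(":"))


def destination_refs(key: str, rules: list) -> list:
    """Return the local refs a key would serialize to (empty list = skipped)."""
    if key == "local" or key.startswith("local:"):
        return []  # hard rule: local:* never serializes, whatever the filters say
    for rule in rules:  # ASSUMPTION: first matching rule wins
        if matches(rule.pattern, key):
            # Exclude rules have no destinations, so this yields [].
            return [f"refs/meta/local/{d}" for d in rule.destinations]
    return ["refs/meta/local/main"]  # no filter matched: the happy path
```

In this sketch the shared meta:filter rules and the personal local:meta:filter rules would simply be parsed and concatenated into one list before calling destination_refs; the order in which they are combined is also an assumption.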

Conflict rules are intentionally boring

Filters decide where a key is serialized. They do not change the meaning of the key, the target, or the merge rules for the value.

Strings are still strings. Lists are still append-friendly lists. Sets are still unique-member sets. A routed ref is just another serialized git-meta tree.

That is the design goal: keep policy about where metadata goes separate from semantics about what metadata means.

A simple naming pattern

The practical advice is to pick obvious top-level namespaces:

review:*        public review metadata
provenance:*    public generation/provenance metadata
owner           public ownership hint
company:*       company-only metadata
me:*            personal metadata routed by local preference
local:*         never serialized at all

Then the filters are easy to understand:

git meta set:add project meta:filter "route company:** company"
git meta set:add project local:meta:filter "route me:** mine"

The result is boring in the good way: public metadata can be published publicly, company metadata can stay inside the company, and local metadata can remain local.

That is the point of serialize filters. Not secrecy magic, not access control, not encryption. Just a clean way to avoid mixing audiences before the data ever gets pushed.

dealing with large values

Fri, 29 May 2026 00:00:00 GMT

Metadata starts out cute and tiny, right up until somebody decides that the value should be an entire agent transcript, a generated report, or some giant attestation blob.

That's when your nice little key/value store starts looking like a junk drawer.

Most git-meta values are boring in the best possible way. A model name. A review status. A URL. A short string that says something useful about a commit, branch, path, or project. Those values belong directly in SQLite because SQLite is fast, local, and extremely good at answering questions like "what is the current value for this target and key?"

But not everything should live inline forever.

SQLite is the index, Git is the object store

The trick is that git-meta already has two useful storage systems available.

SQLite is the local working database. It tracks targets, keys, value types, authors, timestamps, and the current state of the world. That's the stuff you want to query quickly without walking Git history or inflating a bunch of objects for no reason.

Git, on the other hand, is very good at storing content-addressed blobs and moving them around with fetch and push. So when a value gets large enough, git-meta can write the actual payload as a Git blob and store the blob id in SQLite instead.

SQLite keeps the pointer. Git keeps the big thing.

The cutoff

By default, the cutoff is 1 KiB. If a string value is larger than that, git-meta stores it as a Git blob reference instead of stuffing the whole thing into the SQLite row.

This applies whether the value came from the command line:

git meta set commit:HEAD agent:transcript "...some very large string..."

or from a file:

git meta set commit:HEAD agent:transcript -F transcript.jsonl

Originally, this only happened for -F/--file payloads. That was too cute. Large is large, regardless of whether it came from a file or from an inline argument, so the check now happens on the resulting value itself.

You can change the size

The cutoff is not hardcoded policy anymore. You can set it with metadata too, which is pleasingly recursive.

There are two keys:

meta:sqlite:object-max-size sets the shared project default.
local:meta:sqlite:object-max-size sets your local override and wins if both are present.

Both accept plain byte counts or friendly sizes like 4k, 64k, or 1m.

Set it low if you want SQLite to stay lean. Set it high if you want more values inline for easier inspection. Set it to 0 if you want every non-empty string value to become a Git blob reference.
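The cutoff logic is small enough to sketch in Python. Treat this as a rough model rather than the real code: parse_size, effective_cutoff, and stored_as_blob are hypothetical names, the k and m suffixes are assumed to be binary multiples (1024 and 1024 * 1024), and the value size is assumed to be measured in UTF-8 bytes.

```python
from typing import Optional

DEFAULT_CUTOFF = 1024  # 1 KiB, the default stated in the post

# ASSUMPTION: suffixes are binary multiples; the post does not say.
_UNITS = {"k": 1024, "m": 1024 * 1024}


def parse_size(text: str) -> int:
    """Parse a plain byte count or a friendly size such as '4k' or '1m'."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def effective_cutoff(shared: Optional[str], local: Optional[str]) -> int:
    """The local override wins over the shared default, then the built-in default."""
    if local is not None:
        return parse_size(local)
    if shared is not None:
        return parse_size(shared)
    return DEFAULT_CUTOFF


def stored_as_blob(value: str, cutoff: int) -> bool:
    """True when the value is strictly larger than the cutoff."""
    # ASSUMPTION: size is counted in UTF-8 bytes, not characters.
    return len(value.encode("utf-8")) > cutoff
```

Because the comparison is strictly "larger than", a cutoff of 0 sends every non-empty string to a blob and leaves the empty string inline, which matches the behavior described above.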

Please do not set it to something ridiculous and then act surprised when the database becomes ridiculous. Computers are obedient, not wise.

Reading it back

The blob reference is supposed to be an implementation detail.

When you run git meta get, you should get the value back, not a random-looking object id that makes you go spelunking through .git/objects. Internally, git-meta remembers whether a stored value is inline text or a Git reference, then resolves it on read when needed.

That distinction also matters during sync and materialization. Remote metadata can arrive with blob-backed values, and the local store needs to preserve that shape instead of accidentally turning an object id into the user-visible value.
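Here is a toy Python model of that split, with dictionaries standing in for SQLite and for Git's object store. The Stored and Store names and the "inline" and "git-ref" tags are invented for this sketch and are not git-meta's schema. The one part that is real Git behavior is git_blob_id, which computes the object id Git assigns to a blob in a SHA-1 repository (SHA-256 repositories use a different hash).

```python
import hashlib
from dataclasses import dataclass


def git_blob_id(payload: bytes) -> str:
    """Object id Git gives a blob with this content (SHA-1 repositories)."""
    header = b"blob %d\x00" % len(payload)
    return hashlib.sha1(header + payload).hexdigest()


@dataclass(frozen=True)
class Stored:
    kind: str  # "inline" or "git-ref" (hypothetical tags)
    data: str  # the text itself, or the blob id pointing at it


class Store:
    def __init__(self, cutoff: int = 1024):
        self.cutoff = cutoff
        self.rows = {}   # (target, key) -> Stored; stands in for SQLite
        self.blobs = {}  # blob id -> bytes; stands in for Git's object store

    def set(self, target: str, key: str, value: str) -> None:
        raw = value.encode("utf-8")
        if len(raw) > self.cutoff:
            # Large value: keep the payload in the object store, the pointer in the row.
            oid = git_blob_id(raw)
            self.blobs[oid] = raw
            self.rows[(target, key)] = Stored("git-ref", oid)
        else:
            self.rows[(target, key)] = Stored("inline", value)

    def get(self, target: str, key: str) -> str:
        row = self.rows[(target, key)]
        if row.kind == "git-ref":
            # Resolve the pointer so callers never see the object id.
            return self.blobs[row.data].decode("utf-8")
        return row.data
```

The point of the kind tag is the one made above: without it, a reader of the row could not tell a 40-character string the user actually stored from an object id that needs resolving.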

Why bother?

Because small metadata and large metadata want different tradeoffs.

Small values should be cheap, local, and queryable. Large values should not bloat the hot path just because we wanted to attach something useful to a commit. Splitting the two lets git-meta keep SQLite as the fast index and Git as the durable blob store.

That's the whole idea: use the boring parts for what they're good at.

introducing git-meta

Fri, 29 May 2026 00:00:00 GMT

The git-meta project is basically another stab at git notes - a way to attach arbitrary metadata to things in Git without needing to rewrite the objects themselves.

If you're familiar with git notes, you know that it lets you attach a single blob of data to a single Git object (normally a commit) and use Git push/fetch commands to move notes around. Notes are somewhat rarely used because of a number of shortcomings, such as:

complexity around merging from two contributors
the limitation of one value per commit
the inability to attach data to other things like paths or the project as a whole
scalability issues
and more!

The git-meta project was started to address this. With git-meta, you can:

attach values to various types of targets in the project (branches, paths, commits, change-ids)
have namespaced key/value pairs (e.g. agent:model on a commit)
have rich value types (strings, sets or lists)
merge multi-user metadata easily
use multiple sharing targets
scale to many millions of keys easily
and more!

We created this specification, a Rust library, and a reference CLI implementation so that everyone can attach arbitrary metadata to various parts of their existing Git codebases, and manage it, with minimal difficulty.

Simple example

Here's how it works. You can install it with Cargo:

cargo install git-meta-cli

Then you have the git-meta CLI tool that can manage everything. Attach a new arbitrary value on a commit with the git meta set command:

❯ git meta set commit:314e7f0fa7 agent:model "claude-opus-4-8[1m]"

That will look up the commit, expand it to the full SHA and place a value under the key agent:model attached to that commit. You could also set agent:provider or any other key/value combination.

Then you can get the value back out with git meta get:

❯ git meta get commit:314e7f0fa7 agent:model
agent:model  claude-opus-4-8[1m]

You probably get the idea. You can also assign sets or lists of values to a key if that makes more sense for the data type.

You can easily set up a meta remote (someplace to push the data to) with git meta setup, which defaults to the same repository as your code, but under a hidden refs/meta/main reference. So it can, for example, be on GitHub with your code but just not visible.

Then just type git meta sync to push new values you've created and pull down new values from other users on your team.

Try it out

The project is still early, but the core direction is in place: make Git metadata portable, local-first, and tool-friendly. Try it out and give us feedback!