Documentation
How Hyperlabel scores and labels hypercert activity records.
How it works
1. Hyperlabel uses Tap, Bluesky's official sync tool, to monitor the AT Protocol network for org.hypercerts.claim.activity records. Tap automatically discovers repos, backfills historical records from each PDS, and streams live events with cryptographic verification. This means records created before the labeler started are still scored.
2. When a record is detected, Hyperlabel normalizes and scores it, then writes the scored result to the activity log. The ⟳ Detected tier is reserved for incomplete or stale ingests and should not appear during a healthy scoring path.
3. The scoring engine evaluates the record against 9 quality criteria worth 100 points in total. Test signals are checked first: any record that looks like placeholder data is flagged immediately, regardless of its numeric score.
4. A signed AT Protocol label is applied to the activity record URI based on the score tier. If the record already had another active Hyperlabel quality label, the old label is negated before the new one is applied.
5. When an activity record is deleted, Hyperlabel negates any active quality labels for that record and removes it from the dashboard database.
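The negate-then-apply behaviour in steps 4 and 5 can be sketched as a small planning function. This is an illustration only, not Hyperlabel's actual code: the `Label` class and `plan_label_events` are hypothetical names, the field names follow the AT Protocol label object (`src`, `uri`, `val`, `neg`), and the label values are assumed to match the tier identifiers used by the dashboard API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Label:
    src: str            # DID of the labeler that issues the label
    uri: str            # AT URI of the activity record being labeled
    val: str            # label value (assumed to equal the tier identifier)
    neg: bool = False   # True means "negate an earlier label with this value"


def plan_label_events(src: str, uri: str,
                      active_val: Optional[str],
                      new_val: Optional[str]) -> List[Label]:
    """Return the labels to emit, in order, for one record.

    active_val is the currently active quality label (None if there is none).
    new_val is the freshly computed tier, or None when the record was deleted.
    """
    events: List[Label] = []
    if active_val is not None and active_val != new_val:
        # Negate the old label first so that two tiers are never active together.
        events.append(Label(src, uri, active_val, neg=True))
    if new_val is not None and new_val != active_val:
        events.append(Label(src, uri, new_val))
    return events
```

Passing `new_val=None` models a deletion: only the negation is emitted. A re-score that lands in the same tier emits nothing, which is a design choice of this sketch rather than documented behaviour.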
Scoring criteria
Each record is evaluated on 9 criteria for a maximum of 100 points.
| Criterion | Description | Max pts |
|---|---|---|
| Title quality | Meaningful, descriptive title | 15 |
| Summary quality | Clear short description | 15 |
| Description quality | Detailed description with sufficient length | 20 |
| Image | Has an attached image | 10 |
| Work scope | Defines work scope tags | 10 |
| Contributors | Lists contributors with weights and details | 15 |
| Locations | Has geographic locations | 5 |
| Date range | Specifies start and end dates | 5 |
| Rights | Defines usage rights | 5 |
| Total | | 100 |
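As a rough illustration of how the table adds up, the sketch below stores the maximum points per criterion and sums weighted sub-scores. How Hyperlabel grades each criterion internally is not documented here, so the per-criterion fractions (0.0 to 1.0), the dictionary keys, and the `total_score` helper are all assumptions.

```python
# Maximum points per criterion, copied from the table above.
# The dictionary keys are hypothetical names chosen for this sketch.
MAX_POINTS = {
    "title": 15,
    "summary": 15,
    "description": 20,
    "image": 10,
    "work_scope": 10,
    "contributors": 15,
    "locations": 5,
    "date_range": 5,
    "rights": 5,
}


def total_score(fractions: dict) -> int:
    """Sum the weighted criteria.

    fractions maps a criterion name to how well it is satisfied (0.0 to 1.0).
    Missing criteria count as 0; out-of-range values are clamped.
    """
    total = 0.0
    for name, max_pts in MAX_POINTS.items():
        frac = min(1.0, max(0.0, fractions.get(name, 0.0)))
        total += frac * max_pts
    return round(total)
```

For example, a record with only a fully satisfactory title and an attached image would score `total_score({"title": 1.0, "image": 1.0})`, which is 25 before penalties.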
Penalties
Points are deducted when low-quality patterns are detected.
| Penalty | Trigger | Deduction |
|---|---|---|
| Repetition | High line or word repetition in description/summary (e.g. song lyrics, copypasta) | −5 to −15 |
| Duplicate fields | Summary identical to description (lazy copy-paste) | −20 |
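A minimal sketch of the two penalties follows. The deduction sizes come from the table, but the repetition measure (share of repeated words) and the 0.5 / 0.65 / 0.8 cut-offs are invented for illustration; the real triggers may be computed quite differently.

```python
def repetition_ratio(text: str) -> float:
    """Share of words that repeat an earlier word (0.0 means all unique)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def penalty_points(summary: str, description: str) -> int:
    """Return the total deduction as a positive number of points."""
    deduction = 0

    # Duplicate fields: summary identical to description costs 20 points.
    # Assumption: two empty fields are not treated as duplicates.
    if summary.strip() and summary.strip() == description.strip():
        deduction += 20

    # Repetition: 5 to 15 points depending on how repetitive the text is.
    # The thresholds below are hypothetical.
    ratio = max(repetition_ratio(summary), repetition_ratio(description))
    if ratio >= 0.8:
        deduction += 15
    elif ratio >= 0.65:
        deduction += 10
    elif ratio >= 0.5:
        deduction += 5
    return deduction
```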
Quality tiers
Scores map to four tiers. Test signals override the numeric score and always produce a “Likely Test” label.
- high-quality: Well-documented record with comprehensive activity details.
- standard: Adequate record with basic activity information filled in.
- draft: Minimal record that appears to be a work in progress.
- likely-test (⚠ Likely Test): Contains test or placeholder data (e.g. "Test", "asdf", lorem ipsum, repeated characters).
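The override rule can be sketched as follows. The score thresholds for each tier are not stated on this page, so the sketch takes them as required arguments instead of guessing; `tier_for`, `high_min`, and `standard_min` are hypothetical names, and the tier strings are the filter values accepted by the dashboard API.

```python
def tier_for(score: int, has_test_signal: bool, *,
             high_min: int, standard_min: int) -> str:
    """Map a numeric score to a tier identifier.

    high_min and standard_min are the lowest scores that still qualify for
    the "high-quality" and "standard" tiers. Their real values are not
    documented here, so the caller must supply them.
    """
    # Test signals are checked first and override any numeric score.
    if has_test_signal:
        return "likely-test"
    if score >= high_min:
        return "high-quality"
    if score >= standard_min:
        return "standard"
    return "draft"
```

With made-up thresholds of 70 and 40, `tier_for(95, True, high_min=70, standard_min=40)` still yields `"likely-test"`, because the test signal wins over the score.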
Test signal patterns
Records are automatically flagged as ⚠ Likely Test when the title, summary, or a short placeholder-like description matches patterns such as: test, E2E, Hyperindex update, update-burst, backfill test, fixture data, dummy activity, smoke, asdf, lorem ipsum, untitled, aaaa…, or when the title is identical to the summary and shorter than 50 characters.
Operators can also configure TEST_PDS_HOSTS so actors hosted on known development PDSes are forced into ⚠ Likely Test after background PDS resolution.
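A pattern check along these lines could be written with a single regular expression. This is a sketch, not the labeler's real matcher: the whole-word matching, the "four or more repeated letters" rule standing in for aaaa…, and the `looks_like_test` helper are assumptions, and the description and TEST_PDS_HOSTS checks are left out.

```python
import re

# Keywords from the list above, matched as whole words, plus any letter
# repeated four or more times in a row (for example "aaaa").
TEST_PATTERN = re.compile(
    r"\b(test|e2e|hyperindex update|update-burst|backfill test|fixture data|"
    r"dummy activity|smoke|asdf|lorem ipsum|untitled)\b"
    r"|([a-z])\2{3,}",
    re.IGNORECASE,
)


def looks_like_test(title: str, summary: str) -> bool:
    """Return True when the title or summary carries a test signal."""
    if TEST_PATTERN.search(title) or TEST_PATTERN.search(summary):
        return True
    # A short title that is simply repeated as the summary is also flagged.
    stripped = title.strip()
    return bool(stripped) and stripped == summary.strip() and len(stripped) < 50
```

Whole-word matching keeps a title such as "Latest harvest report" from being flagged just because it contains the letters "test"; whether Hyperlabel makes the same trade-off is not documented here.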
🤖 ML classification
When HF_TOKEN is set, each record is asynchronously classified by a HuggingFace zero-shot model (facebook/bart-large-mnli). The model classifies the combined title, summary, and description text into one of four categories:
- meaningful project description: legitimate content
- test or placeholder data: test/junk content
- song lyrics or copypasta: copied or irrelevant text
- spam or gibberish: nonsensical content
If the model's winning class is anything other than meaningful project description, the record receives an HF test signal and is automatically downgraded to ⚠ Likely Test.
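The decision rule can be sketched independently of how the model is called. A zero-shot classifier typically returns parallel lists of labels and scores; the helpers below pick the winner and apply the downgrade rule. The function names and the newline-joined input text are assumptions. `classify_locally` runs the same model through the `transformers` pipeline, which is swapped in here for self-containment; the labeler itself presumably calls a hosted endpoint, since it depends on HF_TOKEN.

```python
CANDIDATE_LABELS = [
    "meaningful project description",
    "test or placeholder data",
    "song lyrics or copypasta",
    "spam or gibberish",
]


def combine_text(title: str, summary: str, description: str) -> str:
    """Join the non-empty fields with newlines (the separator is an assumption)."""
    parts = (title, summary, description)
    return "\n".join(p.strip() for p in parts if p and p.strip())


def winning_label(labels: list, scores: list) -> str:
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]


def is_hf_test_signal(labels: list, scores: list) -> bool:
    """True when the winning class is anything but a meaningful description."""
    return winning_label(labels, scores) != "meaningful project description"


def classify_locally(text: str):
    """Run the model locally; needs the transformers package and a model download."""
    from transformers import pipeline  # imported lazily, it is a heavy dependency

    classifier = pipeline("zero-shot-classification",
                          model="facebook/bart-large-mnli")
    result = classifier(text, candidate_labels=CANDIDATE_LABELS)
    return result["labels"], result["scores"]
```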
API endpoints
The labeler exposes a small REST API for the dashboard as well as standard AT Protocol labeler XRPC endpoints, including WebSocket label subscription.
Dashboard statistics — total counts, tier breakdown, 24h/7d activity.
curl https://activitylabeler.hypercerts.dev/api/stats
Recent activities with pagination and optional tier filtering. Valid tier values: all, pending, high-quality, standard, draft, likely-test.
curl "https://activitylabeler.hypercerts.dev/api/recent?limit=20&offset=0&tier=high-quality"
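For scripted access, the same request can be built programmatically. The sketch below only constructs URLs, because the response format of /api/recent is not described here; `recent_url` and `page_urls` are hypothetical helper names.

```python
from urllib.parse import urlencode

BASE = "https://activitylabeler.hypercerts.dev"
VALID_TIERS = {"all", "pending", "high-quality", "standard", "draft", "likely-test"}


def recent_url(limit: int = 20, offset: int = 0, tier: str = "all") -> str:
    """Build the /api/recent URL for one page of results."""
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    query = urlencode({"limit": limit, "offset": offset, "tier": tier})
    return f"{BASE}/api/recent?{query}"


def page_urls(pages: int, limit: int = 20, tier: str = "all") -> list:
    """URLs for the first `pages` pages, advancing offset by `limit` each time."""
    return [recent_url(limit, page * limit, tier) for page in range(pages)]
```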
Query AT Protocol labels via the standard labeler endpoint. Supports the uriPatterns and sources query parameters.
curl "https://activitylabeler.hypercerts.dev/xrpc/com.atproto.label.queryLabels?uriPatterns=at://did:plc:*/org.hypercerts.claim.activity/*"
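A small client for this endpoint might look like the sketch below. The parameter names (`uriPatterns`, `sources`, `limit`, `cursor`) and the `labels` / `cursor` response fields come from the com.atproto.label.queryLabels lexicon; the helper names are hypothetical, and the cursor loop assumes this labeler implements pagination as the lexicon describes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://activitylabeler.hypercerts.dev"


def query_labels_url(uri_patterns, sources=None, limit=50, cursor=None) -> str:
    """Build a queryLabels URL; list-valued parameters are repeated."""
    params = [("uriPatterns", p) for p in uri_patterns]
    params += [("sources", s) for s in (sources or [])]
    params.append(("limit", limit))
    if cursor:
        params.append(("cursor", cursor))
    return f"{BASE}/xrpc/com.atproto.label.queryLabels?{urlencode(params)}"


def fetch_all_labels(uri_patterns, sources=None) -> list:
    """Follow the cursor until the server stops returning labels."""
    labels, cursor = [], None
    while True:
        url = query_labels_url(uri_patterns, sources, cursor=cursor)
        with urlopen(url) as resp:
            page = json.load(resp)
        labels.extend(page.get("labels", []))
        cursor = page.get("cursor")
        if not cursor or not page.get("labels"):
            return labels
```

Passing the labeler DID in `sources` restricts results to labels issued by Hyperlabel, which is useful when the same client also talks to other labelers.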
Subscribe to signed AT Protocol label events over WebSocket. Use wss:// on HTTPS hosts; local WebSocket proxying requires the Caddy-fronted service.
websocat "wss://activitylabeler.hypercerts.dev/xrpc/com.atproto.label.subscribeLabels"
AT Protocol integration
Hyperlabel is a fully compliant AT Protocol labeler. Any app that supports the labeler protocol can subscribe to or query its labels.
Labeler DID
did:plc:5rw6of6lry7ihmyhm323ycwn
Handle
einstein.climateai.org
- Labels are served via the standard com.atproto.label.queryLabels XRPC endpoint and can be queried by any AT Protocol client.
- Each label is signed with secp256k1 and includes: source DID, target record URI, label value, timestamp, and a cryptographic signature.
- Apps can subscribe with com.atproto.label.subscribeLabels over WebSocket at wss://activitylabeler.hypercerts.dev/xrpc/com.atproto.label.subscribeLabels to receive quality signals for hypercert activity records.
- Only one quality label is active per record URI at a time. When a record is updated and re-scored, the previous label is negated before the new one is applied.
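On the consuming side, an app has to apply negations to work out which label is currently active for a record. The sketch below folds a list of already decoded label objects, in the order they were received, into the active set per URI. It assumes all labels come from a single labeler and skips signature verification and WebSocket frame decoding; `active_labels` is a hypothetical helper.

```python
def active_labels(events: list) -> dict:
    """Fold label events into {record URI: set of active label values}.

    Each event is a dict with at least "uri" and "val"; "neg" is optional
    and, when true, cancels an earlier label with the same value.
    """
    active = {}
    for ev in events:
        vals = active.setdefault(ev["uri"], set())
        if ev.get("neg", False):
            vals.discard(ev["val"])   # negation removes the earlier label
        else:
            vals.add(ev["val"])
    # Drop records whose labels have all been negated.
    return {uri: vals for uri, vals in active.items() if vals}
```

Because the labeler negates the old label before applying a new one, each set should hold at most one quality label once the stream has been fully applied.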