News
Ideas
-
Summaries run through a single Ollama backend (cube/qwen3 via
OLLAMA_URL). A single backend means summarization stalls whenever that host is down, and it can't take advantage of fox (gemma4:26b), which produced tighter summaries at comparable latency in a 4-article comparison. This makes fox the primary summarizer with cube as an automatic fallback.Solution
fox is served by llama-swap and speaks only the OpenAI-compatible
/v1/chat/completionsAPI — it 404s on Ollama's/api/chat— so a config-only switch isn't possible. This adds anAnalyzerinterface with two implementations (OllamaClient,OpenAIClient) sharing the Norwegian prompt and JSON-extraction logic, plus aFallbackAnalyzerthat tries the primary first and retries the fallback on any error (but not on context cancellation / shutdown).To avoid a config trap, the existing
OLLAMA_URL/OLLAMA_MODELkeep configuring the fallback backend — so the deployed unit, which already setsOLLAMA_URL=cube, needs no change. A newAI_URL/AI_MODEL/AI_API(defaulthttp://fox:11434,gemma4:26b,openai) configures the primary. The Ollama client timeout is raised 30s→60s to tolerate cold model loads (~48s observed).This re-ports work originally written against an orphaned repo line (old PR #4, preserved on
orbit-main-archive) onto the canonical codebase.Verification
go build+go vetclean; all AI/config tests pass. The 4 pre-existing failures on this branch (snapshot/desk OIDC tests) are unchanged frommain— not introduced here. The OpenAI/Ollama client paths were validated live against fox and cube earlier.Deploy note
Not for deploy until approved. On deploy, summaries switch cube→fox automatically (fox is the default primary); cube remains the fallback via the unit's existing
OLLAMA_URL. No unit change required.Closes #5
History
-
deploy.sh built with
go build .from whatever working tree it ran in, so production depended on one host's checkout (servo). That is precisely how the real source ended up trapped off-git. This makes git the source of truth for deploys.Solution
deploy.sh now clones the canonical remote at a given ref (default
main) into a temp dir, builds linux/amd64 there, deploys, and cleans up. Because it builds from a fresh clone, only committed-and-pushed code can reach production; local/unpushed trees are never deployed. The script is host-independent — runnable from any machine with git, Go, and SSH — and prints the exact deployed SHA.Usage:
./scripts/deploy.sh [host] [git-ref](defaults:fismen,main). The host-managed unit (incl. the OIDC drop-in) is left intact; the embedded unit is only written on a fresh install.Verification
Validated the clone+build path from git end-to-end (no deploy): clones
origin/main, builds the 21.4 MB linux/amd64 binary. Syntax-checked.Follow-ups
- Forgejo Actions CI/CD (push/tag → auto-deploy) can build on this; needs a runner + SSH secrets.
Closes #7
-
AI summaries went through a single Ollama backend (cube, qwen3:14b). We want fox (gemma4:26b) as the primary summarizer — its output is tighter and at least as fast in a 4-article comparison — but a single backend means summaries stall whenever that host is down. This makes fox primary and keeps cube as an automatic fallback.
Solution
The two hosts speak different protocols: cube is Ollama-native (
/api/chat), while fox is served by llama-swap and only exposes the OpenAI-compatible/v1/chat/completions(it 404s on/api/chat). So a config-only switch isn't possible. This introduces anAnalyzerinterface with two implementations —OllamaClientand a newOpenAIClient— sharing the Norwegian prompt and JSON-extraction logic so swapping models never changes what we ask for. AFallbackAnalyzerwraps the two: every article tries the primary first and retries on the fallback on any error, so a recovered primary is used again immediately with no cooldown bookkeeping. It does not fall back when the context is cancelled (shutdown).Backends are configured independently via
AI_*andAI_FALLBACK_*env vars (URL, model, protocol, optional Bearer key); defaults are fox-primary / cube-fallback. The HTTP timeout is raised from 30s to 60s — the comparison showed cube cold-loads taking ~48s, which the old timeout would have killed. Verified end-to-end against the live fox and cube backends.Known cuts
- Per-request retry only — no circuit-breaker/cooldown, so a hung (not refused) fox costs its full timeout before each fallback. Refused connections fail fast.
- The summary prompt is unchanged.
Follow-ups
- Confirm fox/cube are reachable from the deploy host (fismen) over tailscale before this goes live; an older deploy comment noted Ollama wasn't reachable there yet.
Closes #3
-
forge is the unified git-forge CLI that replaces tea locally.
forge pr mergehits the same Forgejo API endpointtea pr mergedid, so the merge-via-Forgejo-API constraint is preserved.
-
Summary
- Initialize design/ directory via orbit with the news design system (tokens, fonts, components, previews)
- Add Forgejo webhook for issue/PR sync to orbit
- Apply conventions from arne/conventions (CLAUDE.md, docs/conventions/)
- Add M↓ markdown button to candidate eyebrows and article view
- Add prototype header with orbit navigation to all preview pages
- Rename Breaking to Kandidater across all desk previews
Test plan
- [ ] Open design/index.html — verify all component sections render (colors, typography, buttons, badges, candidates, clusters, etc.)
- [ ] Open design/preview/desk.html — verify M↓ button appears in candidate eyebrows
- [ ] Open design/preview/desk-article.html — verify M↓ has no underline
- [ ] Verify prototype header links back to orbit from all preview pages
- [ ] Verify dark mode toggle works across all pages