Building a Workshop: Series Summary

Summary of summary ;)

Knowledge work is increasingly going to be about thinking, deciding, and instructing AI to do the work. Humans set direction; agents execute.
An agent's performance depends on the information it is given. Even the best agents can't do good work in standard knowledge bases because the information they need is buried in files alongside unrelated info, conflicting & stale data, and undocumented preferences, decisions and assumptions. Smarter agents will not make this problem go away.
As we give more ownership to agents and lessen oversight, this problem will magnify. With smarter models, serious agent mistakes are going to come from confident decisions on bad info.
This will become a serious blocker for enterprise adoption, where individual users can't intuitively fact-check answers (i.e., when doing team or cross-department work), and where data and information are vast, structure is poor, and decisions can be extremely serious.
We need to build environments designed for agents to natively operate in. For example, agents are new each time - they have no memory and no intuition for what's true or stale. The environment has to do that work for them: onboarding on arrival, a log of decisions & reasoning, explicit flags on assumptions and staleness.
They also benefit from structure more than humans. We can derive inspiration from coding agents here. Codebases are already organized the way agents need them — single source of truth, clear data and file structures, low ambiguity of which functions to use.
The optimal workspace will have to be maintained by the agents themselves. Humans are terrible at knowledge management and will never keep a base clean at the level an agent needs. Current agents can do this work — they should own it.
I built an early version and I use it every day. It's called Workshop, open source on GitHub.
Read the full summary below, or discuss the posts with ChatGPT/Claude:

Read these three blog posts — posts 1 and 2 are the real argument, post 3 is implementation. The author also included a summary at the link below showing what he thought was most important across all three. Start with the summary to get his frame, then give me: the central thesis in 2-3 sentences, and the 2-3 conceptual ideas you find most interesting or most debatable. Don't worry about the specific details of how he built his GitHub project — focus on the ideas. Then let's discuss.

Summary: https://2514coombs.com/blog/workshop-summary
Post 1: https://2514coombs.com/blog/smarter-elves
Post 2: https://2514coombs.com/blog/non-human-collaborator
Post 3: https://2514coombs.com/blog/system-in-practice

Workshop: Summary of Thoughts

the problems

Users working with AI should be doing only 3 things: think, decide, and instruct.
Chatbots and cowork-style tools (hereafter “Cowork”) create friction by requiring users to manually manage their chats: actively adding context, uploading files, and clarifying preferences every session.
This manual context management both creates friction and degrades quality. Users forget key details, don't bother to repeat personal preferences, or forgo adding more files so they don't kill the context.
This gets worse as you use it more. Power-using Cowork makes files and information pile up fast — resulting in unmanageable file messes, inconsistent and stale information, and agents needing to suck in everything as context to do anything useful.
And the agent is set up to fail too. It arrives in this mess without onboarding, clarity on preferences, awareness of previous sessions & decisions, or any sense of what's stale or what matters.
Smarter models will still hit this barrier & will make mistakes of context & information, not intelligence. This is a huge problem as we ask them to take on more ownership and increasingly stop checking the details of their work. It's critical to solve before we hit large-scale enterprise implementation.
We therefore need to re-think our approach to information & context. We need to make current, accurate information (i.e., instructions, project context, research & data) accessible to the agent in the fastest and most efficient way possible. We need to expand from the small view of memory (basic info injected into prompts) and the blunt tool of research (go look at everything ever) into smart information management. This will need a better class of knowledge base.

the thesis

The industry is building smarter models, but the agents are hampered by bad environments. A smarter model doesn't fix a messy file system, doesn't know what happened last session, doesn't maintain its own workspace. If we solve this, it will massively improve the quality of results.
Knowledge bases need this more than codebases because they lack the systemic rigor & enforced best practices. Codebases have decades of discipline. Knowledge bases are messy file dumps.
Humans are incredibly bad at managing information, so agents should manage everything (file organization, staleness, context, orientation) autonomously.
The future of knowledge work is people talking to agents. The agents manage the system, pull information, do research & analysis, and create deliverables. Current models can do this already, but the environment & lack of agent autonomy are structurally limiting this relationship.
This is actually urgent, not just good UI. As people scale their work with agents they will increasingly not give detailed oversight, relying on agents being smart and correct. The problem is that current environments are almost designed for agents to make mistakes.
In current knowledge bases, agents operate on stale data, make decisions without rationale history, and act on assumptions nobody tracks, so the likelihood of serious mistakes compounds. A highly structured, agent-managed knowledge base environment is the most effective mitigation.

the elf metaphor

If agents are doing the work, the environment should be optimized for how they operate, so we need to design systems for agents, not for humans.
Think of agents as elves in Santa's Workshop. Every session, a new brilliant elf enters with zero knowledge of previous work. If the workshop is a pile of unlabeled, overlapping, stale, messy files on the floor, the elf has to read everything, knows nothing, and is basically set up to fail.
The goal: design the workshop so the elf understands the situation, aligns to the goal, and starts correct work in the fastest, lowest-context-load method possible. A map of the workshop; a diary of what previous elves have done; a directory for books in the library; a list of ongoing projects & decisions; a clear description of which styles and materials the gifts should be made from.
For real agents, this means instruction sets, maps & pathways through the workspace, summarized information they can read fast, and interface paradigms that should be invisible to the human user but allow the agent to hit the ground running.
For clarity - there are two parts to get right: 1. The actual file system, organized to be functional & fast so the agent doesn't drown its context in irrelevant info. 2. The supporting apparatus (instruction sets, orientation protocols, maintenance rules) that allows the agent to navigate it.

knowledge bases should have the discipline of codebases

Codebases are really well suited for agents: a single-source-of-truth information system with clear filenames, structure, no irrelevant info, low contradiction/overlap etc.
Knowledge bases should look more like them. Today they are still chaotic information dumps - unstructured at the folder level (many random files) and at the file level (random information interspersed).
In a perfect world, knowledge bases are a single source of truth (SSoT) & even structured by epistemic stage, e.g., Research (data & frozen evidence) → Analysis (interpretation, cites sources) → Plan (commitments, tracked decisions).
We should treat knowledge data as foundational building blocks that can be found, dedup'ed, updated & version controlled.
The system should handle this autonomously. When upstream data changes, the system traces the cascade through everything downstream — like a compiler tracing type dependencies.
Files should not be random collections of info - they're specifically for narrative & analysis. They need explicit dependency and linkage systems between them to allow for update & cascade management.
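To make the cascade idea concrete, here is a minimal sketch, assuming each file declares which upstream files it cites. The file names, the `DEPENDS_ON` map, and the `downstream_of` helper are hypothetical illustrations, not Workshop's actual format. When one file changes, the sketch walks the links and returns everything downstream that should be flagged for review.

```python
from collections import deque

# Hypothetical dependency map: each file lists the upstream files it cites.
DEPENDS_ON = {
    "research/market_data.md": [],
    "analysis/pricing.md": ["research/market_data.md"],
    "analysis/competitors.md": ["research/market_data.md"],
    "plan/launch.md": ["analysis/pricing.md", "analysis/competitors.md"],
    "plan/hiring.md": [],
}


def downstream_of(changed, depends_on):
    """Return every file that directly or indirectly depends on `changed`."""
    # Invert the map: upstream file -> files that cite it.
    dependents = {name: [] for name in depends_on}
    for name, upstreams in depends_on.items():
        for upstream in upstreams:
            dependents.setdefault(upstream, []).append(name)

    # Breadth-first walk so each affected file is flagged exactly once.
    flagged = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                flagged.append(child)
                queue.append(child)
    return flagged


# A change to the research data flags both analyses and the plan built on them.
stale = downstream_of("research/market_data.md", DEPENDS_ON)
print(stale)
```

In this toy map, `plan/hiring.md` is never flagged because nothing links it to the changed file, which is the point: the agent only re-reads what the change can actually affect.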

designing for agents as users

With the file system structured, agents need the apparatus to navigate & manage it. Designing for agents is a real UX discipline — agents have specific needs around orientation, attention, uncertainty, and continuity that are different from humans but just as concrete.
Orientation: Agents are born completely fresh, so they need a single entry point that loads everything they need to work on this project — identity, project state, previous session results, open decisions, urgencies, user preferences. This should be independent of the human user - it should be automatic on chat-start.
Bounded attention: Knowledge bases can easily exceed context windows. Without designed loading structures, agents either drown in context — wasting their best reasoning capacity on orientation instead of work — or skip files and miss critical dependencies. Files should have tiered loading: lightweight summaries by default, full depth on demand. Distilled file versions, token-budget-aware architecture, expertise tracking (what does the user already know, so the agent doesn't waste context explaining it). This is UX for an AI's attention window.
Structured doubt: Agents don't have a vague memory of a decision from last week. They don't have intuition about what feels off. They are born anew in your project every session, with zero sense of what is real, important, or true — and they can't load everything at once to find out. This is a fundamental handicap, and the system needs to compensate for it. The answer is explicitly modeling for uncertainty: update dates on every file, dependency & citation trackers, clear confidence levels on assumptions, and automatic cascade flagging when upstream data changes.
Self-maintenance: Agent sessions are ephemeral but the work is continuous. Without maintenance protocols, entropy accumulates between sessions and the compounding promise breaks. Every session should leave the workspace slightly better — agents should actively update the workshop while completing tasks so the next agent also inherits a clean, current environment.
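A rough sketch of how tiered loading and staleness flags could fit together, assuming every file carries a short summary, an update date, and a confidence label. All names here (`FileEntry`, `build_context`, the 30-day threshold, the 4-characters-per-token estimate) are assumptions for illustration, not how Workshop does it. Summaries load by default, full text loads only when requested and affordable, and anything that doesn't fit is reported rather than silently dropped.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileEntry:
    path: str
    summary: str      # lightweight tier, loaded by default
    full_text: str    # full depth, loaded only on demand
    updated: date     # last update date, used for the staleness flag
    confidence: str   # e.g. "confirmed" or "assumption"


def estimate_tokens(text):
    """Very rough estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


def build_context(entries, wanted_in_full, budget, today, stale_after_days=30):
    """Assemble context lines within a token budget.

    Returns (lines, tokens_used, skipped_paths).
    """
    lines, used, skipped = [], 0, []
    for entry in entries:
        age = (today - entry.updated).days
        flags = [entry.confidence]
        if age > stale_after_days:
            flags.append(f"STALE: {age} days old")
        header = f"[{entry.path}] ({', '.join(flags)})"

        # Default to the summary; upgrade to full text only if it still fits.
        body = entry.summary
        if entry.path in wanted_in_full:
            if used + estimate_tokens(header + entry.full_text) <= budget:
                body = entry.full_text

        cost = estimate_tokens(header + body)
        if used + cost > budget:
            skipped.append(entry.path)  # tell the agent what it has not seen
            continue
        lines.append(f"{header} {body}")
        used += cost
    return lines, used, skipped


entries = [
    FileEntry("plan/launch.md", "Launch in Q3.", "Launch in Q3. " * 50,
              date(2025, 1, 1), "confirmed"),
    FileEntry("research/market_data.md", "Market is growing.", "Long data " * 50,
              date(2025, 3, 1), "assumption"),
]
lines, used, skipped = build_context(
    entries, {"plan/launch.md"}, budget=300, today=date(2025, 3, 10)
)
for line in lines:
    print(line[:80])
```

The design choice worth noting is the `skipped` list: an agent that knows which files it did not load can ask for them, which is safer than assuming it has seen everything.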

constitutions, not prompts

In 2023 we thought of agents as prompt -> outcome. In 2024/5 that turned to job descriptions (think Genspark here). Increasingly we should think about the human-AI relationship in terms of constitutions - agreements that clarify intent, ideas, and work rather than specific deterministic instructions.
Related: Knowledge work isn't deterministic, so you can't hardcode multi-project information management. This means that natural language, not code, is the right format for the agent relationship. A good agent system should run entirely on markdown with zero (0) code. The agent can write code to achieve tasks, but the user works with the agent through discussion & MD files.
Related: I don't actually write these myself. Talk to Claude and have it write things. Test, discuss, iterate. Need to move fast here :)

system-level governance and safety

The idea of agent evals has lessened in popularity in the last year or two, mostly because A) it's a bit hard and annoying, and B) models are actually just really good. Models do what they're told; they're just generally given wrong, insufficient or unclear instructions.
In multi-agent systems this means we need to address performance at the system level. How does your constitution manage information flow & order clarity? Does the agent know enough about you to understand your needs? Does it have decision logs, dependency graphs, session histories? In 2026 this is where failure happens.
Note: we should still test agents (obviously lol) but think more broadly about why an agent might be failing, and definitely don't immediately turn to rules or restricting agent ownership. It's generally a you problem, not an agent problem.
This reframes safety too. Good environments don't constrain agents — they give agents the context and structure to act correctly in the first place. A well-designed environment makes a more autonomous agent safer, not riskier, because the guardrails are clear to both humans and agents.
Important note: Agent deployment is accelerating, but the environments agents are being deployed into haven't caught up. Deploying agents into organizational chaos — no instructions, no filing system, no shared decisions log — means elves running around a workshop, picking up random books and being constantly called dumb over the speakers as they try to build the gifts. As we give more control and have less oversight, this is how you get really bad results really quickly. Not because the agents are dumb, but because we have failed to give them an environment to thrive in.

from individual to organizational scale

Everything above applies to a single person's workspace. The real impact is at organizational scale.
The problem is bigger: take it from an ex-consultant - enterprise file systems are at best abysmal and embarrassing messes.
The reward is also bigger: When agents maintain living knowledge infrastructure at team and company scale — maintaining connections between procurement data, financial models, strategic plans — managers get access to the full analytical depth of the entire organization. Not just what they personally know or can find.
From consulting experience: in every organization, someone called Bill in Finance has an Excel file that is red-alert critical for the whole company. Not only does no one know that, no one connects that file to the one in Procurement to identify where the real opportunity, value, and information of the company lie. This information is hidden, but we're also limited by our human biological RAM - we can't make connections and identify patterns at this sort of scale.
With a well-managed organization-level knowledge system, an agent surfaces how a supply chain change affects your forecast, because it maintains the dependency graph across departments that no human team could hold simultaneously & it can analyze it at extreme scale.

conclusion

Humans should be doing 3 things with AI: think, decide, instruct. Hand everything off to agents to manage - they can handle it.
In order for agents to take on this work we need to think about their environment. Smarter elves are good. Smarter elves in better Workshops are much better.
Knowledge bases are crazy bad. Learn from codebases and expand from there.
We need to design our systems for agents, not humans. Like your worst junior analyst, they won't ask for help and they will silently screw up. As they gain more ownership we need to make sure they are set up for success in environments and with information systems designed for them.
This is a critical blocker for enterprise adoption. We must identify best practices at the personal level and then expand to the enterprise level, where complexity and risk are much higher. Need to do this now.
I did a basic version of this as a form of design fiction that I now use every day. It's called Workshop and it's found on GitHub here.
It's fully MIT-licensed. Run, play, just give me feedback where possible so I can get that external validation we all crave.
DM me at the socials below or above. If you're in SF I'd love to get coffee!

Source posts: Smarter Elves · Non-Human Collaborator · System in Practice