<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://www.stephen-cresswell.com/feed.xml" rel="self" type="application/atom+xml" /><link href="http://www.stephen-cresswell.com/" rel="alternate" type="text/html" /><updated>2026-02-16T08:47:03+00:00</updated><id>http://www.stephen-cresswell.com/feed.xml</id><title type="html">Signal Over Noise</title><author><name>Stephen Cresswell</name></author><entry><title type="html">More Experiences of Vibe Coding</title><link href="http://www.stephen-cresswell.com/2026/02/15/More-Experiences-Of-Vibe-Coding.html" rel="alternate" type="text/html" title="More Experiences of Vibe Coding" /><published>2026-02-15T00:00:00+00:00</published><updated>2026-02-15T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2026/02/15/More%20Experiences%20Of%20Vibe%20Coding</id><content type="html" xml:base="http://www.stephen-cresswell.com/2026/02/15/More-Experiences-Of-Vibe-Coding.html"><![CDATA[<p>One of the outstanding questions from my <a href="/2026/01/01/Why-Are-Experiences-of-Vibe-Coding-so-Polarised.html">previous post</a> was whether code quality mattered to AI. I am now more convinced that it does. What I have observed in extended use is this: unless guided carefully, Claude produces more code than necessary, with weaker abstractions and noticeable duplication. As the codebase grows, the problem compounds. A bug fixed in one place introduces a bug somewhere else. Fix that, and you either recreate the original defect or produce a new one. It becomes a kind of Dr. Strange vs Dormammu time loop, where you are trapped in an endless cycle of regression; or, alternatively, a maddening game of whack-a-mole.</p>

<figure style="float: right; margin: 0 0 1em 2em; max-width: 400px;">
  <img src="/images/dormammu.png" alt="Dormammu Time Loop" style="width: 100%;" />
  <figcaption style="text-align: center;">Trapped in an endless cycle of regression</figcaption>
</figure>

<p>The pattern is not surprising. Claude is highly influenced by the existing codebase. If the surface area is large, inconsistent, or structurally weak, the model will amplify those characteristics. Conversely, if the surface area is small, cohesive and internally consistent, the model’s output improves markedly.</p>

<p>For clean code, three principles dominate.</p>

<p><strong>Strong Domain Models</strong><br />
The core concepts of the system should be explicit, named, and represented directly in code. Behaviour should live with the concepts it belongs to. When the model is coherent, both humans and AI can extend it predictably. When the domain is implicit or smeared across utilities and controllers, every change becomes guesswork.</p>

<p><strong>Encapsulation</strong><br />
Keep data and behaviour together and expose only meaningful operations that reflect the language of the domain. Do not provide mutators (setters) and avoid casual accessors (getters); every accessor that leaks state invites distributed behaviour, breaking the domain model. AI systems are particularly sensitive to porous boundaries, because they will happily reimplement logic.</p>

<p><strong>Minimal Conditional Logic</strong><br />
All but the briefest branching structures are a code smell. They signal decisions being made too late, missing domain concepts, missing polymorphism, or collapsed responsibilities. Nested conditionals increase cognitive load and expand the probability space the model must reason over. Move decisions back to the caller, which already knows the intent, or replace conditionals with richer, polymorphic domain types. This reduces complexity at the point of execution and keeps behaviour aligned with explicit intent rather than inferred state.</p>
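
<p>As a minimal sketch of how encapsulation and polymorphism can remove a branch, consider the TypeScript below. The domain is hypothetical: <code>Parcel</code>, <code>DeliveryOption</code> and the pricing figures are invented for illustration and do not come from any codebase discussed here.</p>

```typescript
// Hypothetical domain: every name and number here is illustrative only.
interface DeliveryOption {
  cost(weightKg: number): number;
}

class StandardDelivery implements DeliveryOption {
  cost(weightKg: number): number {
    return 3 + weightKg * 0.5;
  }
}

class ExpressDelivery implements DeliveryOption {
  cost(weightKg: number): number {
    return 8 + weightKg * 1.5;
  }
}

class Parcel {
  // State stays private: no setter, and no getter that leaks the weight.
  private readonly weightKg: number;

  constructor({ weightKg }: { weightKg: number }) {
    this.weightKg = weightKg;
  }

  // The caller already knows the intent, so it passes the decision in
  // as a type instead of a flag that Parcel would have to branch on.
  priceFor(delivery: DeliveryOption): number {
    return delivery.cost(this.weightKg);
  }
}

const parcel = new Parcel({ weightKg: 4 });
const standardPrice = parcel.priceFor(new StandardDelivery()); // 3 + 4 * 0.5
const expressPrice = parcel.priceFor(new ExpressDelivery()); // 8 + 4 * 1.5
```

<p>Adding a new delivery type means adding a class, not editing a conditional, and there is only one place where the pricing for each option can live.</p>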

<h2>When Claude got everything right</h2>

<p>That said, it would be misleading to suggest the experience is uniformly negative. Recently I built a small utility application for testing single sign-on, using a single prompt, and Claude produced exactly what I wanted. The prompt was:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Create a very simple web application for testing authentication with azure.
It should let me sign in, and when I click a button, makes a request to a 
backend service using the bearer token. The backend service is 
../azure-ad-jwt-debugger. Use the msal library as appropriate.
</code></pre></div></div>

<p>And Claude delivered precisely that. Below is the generated interface…</p>

<figure style="text-align: center; margin: 2em 0;">
  <img src="/images/azure-ad-jwt-debugger.png" alt="Azure AD JWT Debugger" style="display: block; margin: 0 auto;" />
  <figcaption>The Azure AD JWT debugger interface, generated in a single prompt</figcaption>
</figure>

<p>It signs in with MSAL, retrieves the token, and calls the backend with the bearer token exactly as requested. Interestingly, it did not just implement the minimum described in the prompt. It also added:</p>

<ul>
  <li>A sign out button</li>
  <li>A structured JSON response display panel</li>
</ul>

<p>Neither of these was explicitly requested; both were appreciated. You can view the source code <a href="https://github.com/cressie176/azure-ad-jwt-debugger-web">here</a>. I have not reviewed it since, in this case, I do not envisage needing to modify it.</p>

<p>In conclusion, when the intent is crisp and the domain small, vibe coding may work extremely well. When the system grows and architectural trade-offs become material, discipline becomes essential. That is not an argument against generative AI. It is an argument for treating it as a powerful amplifier. Code quality matters more, not less, in the age of generative AI. Until models evolve in ways that reason more structurally about long-term design consequences, sustainable AI-assisted development depends on disciplined architecture. If the underlying design is coherent, it accelerates you. If it is messy, it accelerates the mess.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Generative AI" /><category term="Claude Code" /><category term="Clean Code" /><category term="Vibe Coding" /><summary type="html"><![CDATA[Further experience with Claude has strengthened my view that code quality matters more, not less, in the age of vibe coding. Left unguided, it amplifies duplication and weak abstractions; tightly constrained, it can be impressively precise. Generative AI is an amplifier. It accelerates coherence, and it accelerates mess.]]></summary></entry><entry><title type="html">Why Are Experiences Of Vibe Coding So Polarised?</title><link href="http://www.stephen-cresswell.com/2026/01/01/Why-Are-Experiences-Of-Vibe-Coding-So-Polarised.html" rel="alternate" type="text/html" title="Why Are Experiences Of Vibe Coding So Polarised?" /><published>2026-01-01T00:00:00+00:00</published><updated>2026-01-01T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2026/01/01/Why-Are-Experiences-Of-Vibe-Coding-So-Polarised</id><content type="html" xml:base="http://www.stephen-cresswell.com/2026/01/01/Why-Are-Experiences-Of-Vibe-Coding-So-Polarised.html"><![CDATA[<h2 id="tldr">TL;DR</h2>
<p>This post explores why experiences with Generative AI assisted software development vary so dramatically. While some developers report order of magnitude productivity gains, others encounter architectural drift, excessive code, and serious operational risk. This gap is not primarily caused by the tools themselves, but by differences in goals, constraints, and methods of use.</p>

<p>Treating Generative AI as a meta tool that operates on intent rather than code mechanics, the post makes explicit its own optimisation goals: negligible operational debt and highly malleable, clean code. A controlled experiment using a small but realistic URL shortener service tests three ways of working with Claude Code.</p>

<p>True <a href="https://x.com/karpathy/status/1886192184808149383">vibe coding</a>, where you <em>“forget the code even exists”</em>, proved unreliable. A guided approach using technical prompts, explicit implementation notes, rules, and manual review produced a clean, low debt system in just over an hour, delivering an estimated eight to sixteen times improvement over manual development. However, these gains should be interpreted cautiously, as coding represents only a fraction of a software engineer’s working time. In contrast, an unattended approach without guidance completed faster but generated substantially more code, significant technical and operational debt, and multiple design regressions.</p>

<p>The conclusion is that Generative AI can deliver dramatic gains, but only when projects are already established or properly bootstrapped, and the agent is strongly constrained. Left unattended, it reliably drifts towards verbosity and accidental complexity. The real question for organisations is therefore not whether to adopt Generative AI, but how to use it in a way that aligns with their long term engineering goals.</p>

<h2 id="introduction">Introduction</h2>
<p>Jason Gorman’s recent <a href="https://codemanship.wordpress.com/2025/11/25/the-future-of-software-development-is-software-developers/">post</a> on the future of software development is causing quite a stir. The comments on the accompanying Hacker News <a href="https://news.ycombinator.com/item?id=46424233">discussion</a> span an extraordinary range of experience with vibe coding, particularly when using Claude Code. Some describe dramatic productivity gains and rapid delivery of complex systems. Others report dangerously misleading output, architectural drift, and large amounts of unusable code. Both sides speak with confidence, often dismissing the other as naïve, reckless, or simply doing it wrong. The discussion reflects a broader pattern which has been bothering me for some time.</p>

<p>I have been a professional software engineer for thirty years, and strong disagreements are nothing new. Functional versus object oriented programming, static versus dynamic typing, competing approaches to testing. What is different this time is the nature of the tool itself. Generative AI does not primarily operate at the level of a language, framework, or architectural paradigm. It operates at the level of intent. You describe the outcome you want and it produces an implementation that attempts to achieve it. For straightforward applications, where the architectural trade offs are limited or inconsequential, this can be highly effective. However, once those trade offs become material, around performance, evolvability, correctness or operational risk, the model is still making implicit architectural decisions without properly accounting for their long term impact. It is improving quickly, but it is not yet a replacement for professional judgement.</p>

<p>It does not matter that Generative AI is unusually non-deterministic. For it to be useful to anyone, it still has to produce desired outcomes with sufficient consistency. That makes the scale and intensity of the reported disagreement surprising. Vastly different experiences are unlikely to be explained by an inherent defect in the tool itself. Allowing for <a href="https://thedecisionlab.com/biases">bias</a> and differing <a href="https://keirsey.com/temperament-overview/">personality temperaments</a> does not adequately explain the size of the gap this time. The same tool is being described, with equal confidence, as producing dangerous, unmaintainable AI slop on the one hand, and delivering twenty times productivity on the other.</p>

<p>This gap is pushing organisations to make extreme decisions. Some, driven by unrealistic expectations, are rushing too quickly into AI adoption, creating unnecessary anxiety and disruption. Others are avoiding it entirely, missing out on potential benefits and frustrating their internal AI champions. The gap needs to be understood and closed.</p>

<h2 id="what-i-want-from-generative-ai">What I Want From Generative AI</h2>

<p>One possible explanation for the difference in experience is that people want different outcomes. If that is the case, then disagreement about Generative AI performance is inevitable. Before comparing tools or techniques, the goals themselves need to be explicit. I want Generative AI to rapidly create applications with the following characteristics:</p>

<h3 id="1-negligible-operational-debt">1. Negligible Operational Debt</h3>
<p>Operational debt is distinct from technical debt. Technical debt only incurs a cost when change is required and is often an explicit trade-off. Operational debt creates ongoing risk and <a href="https://blog.while-true-do.io/devops-4-types-of-work/">unplanned work</a> and, in the worst cases, can consume an entire team’s capacity through incidents and urgent remediation.</p>

<h3 id="2-unparalleled-malleability">2. Unparalleled Malleability</h3>
<p><a href="https://www.oreilly.com/library/view/clean-code-a/9780136083238/">Clean Code</a> is malleable. When we write clean code, we make a comparatively small sacrifice now to reserve the ability to change rapidly in the future. The best way to achieve clean code is to write as little code as possible. This is done by creating a good domain model. When the domain model is good, the code vanishes. As Linus Torvalds is <a href="https://read.engineerscodex.com/p/good-programmers-worry-about-data">reported</a> to have said:</p>

<blockquote>
  <p>“Bad programmers worry about the code. Good programmers worry about data structures and their relationships.”</p>
</blockquote>

<p>A good domain model requires good encapsulation. Behaviour is contained in one place, colocated with the associated data. Accessors are few, mutators fewer still. Conditional logic is pushed to the boundaries of the application and eliminated internally through polymorphism. The code clearly expresses intent.</p>

<p>I have a nagging suspicion that my attachment to clean code may be what Scott Adams calls <a href="https://en.wikipedia.org/wiki/Loserthink">Loserthink</a>, and no longer needed in the world of Generative AI. Until that suspicion is proven, I am sticking with it though. The risk of a future filled with vast quantities of even less malleable code than we already have is too great.</p>

<h2 id="my-experience-of-vibe-coding-so-far">My Experience Of Vibe Coding (So Far)</h2>

<h3 id="the-good">The Good</h3>

<p>It is often stated that Generative AI is effective at mechanical or rote tasks, or comparable to the output of a junior engineer, but this is not where it offers the most value. The area where I have found my Generative AI tool of choice, <a href="https://www.claude.com/product/claude-code">Claude Code</a>, to be most effective is where I have strong high-level judgement about what needs to be achieved, but lack the detailed, low-level knowledge to implement it without time, research, and mistakes. In those cases, the limiting factor is not judgement, but execution.</p>

<p>CI/CD pipelines leveraging Docker provide a good example. I know what I want a pipeline to do, how the stages should fit together, and what correct behaviour looks like. Implementing that by hand usually involves reading documentation, iterating on syntax, and discovering edge cases through failure. Claude is much faster at cycling through that execution loop than I am, especially after installing the <a href="https://cli.github.com">GitHub CLI</a>, which is an absolute game changer for working with <a href="https://github.com/features/actions">GitHub Actions</a>!</p>

<p>In summary, Generative AI adds the most value when it operates below my level of judgement but above my knowledge or ability to acquire it. It accelerates low-level execution without being asked to make high-level decisions it is poorly suited to make.</p>

<h3 id="the-bad">The Bad</h3>

<p>At the same time, Claude has ignored explicit instructions, made false assumptions, implemented changes that were not requested, and drifted away from the intended structure, particularly early in a codebase when there is less existing context. It has also misdiagnosed problems and disappeared down rabbit holes without ever fixing them, or worse, unilaterally decided that they could be safely ignored.</p>

<p>What stands out is how sensitive the results are to relatively small changes in how the tool is used. Claude is not a compiler. The results are not deterministic. Small differences in context, ordering, or phrasing can lead to materially different outcomes, even when the intent appears unchanged. Overall, the outcomes are still positive, but I have yet to achieve the 20x improvement reported by some. Either those reports are grossly overstated, or those making them have found ways to circumvent these issues and make Claude perform consistently well. If the latter is true, I want to learn and adopt their methods, but what are they?</p>

<h2 id="what-we-need-is-an-experiment">What We Need Is An Experiment</h2>

<p>To move this discussion forward, we need something more concrete than confident anecdote. When the same tool is reported to produce both dangerous, unmaintainable systems and dramatic productivity gains, opinion alone cannot tell us whether the difference lies in the tool itself, the goals being optimised for, or the way it is being used. The only way to separate those factors is to make the goals explicit and then test, in a controlled way, whether a particular method of using Generative AI can reliably produce outcomes aligned with them. What is needed is a repeatable test case against which different methods can be evaluated. It must be small enough to run repeatedly, representative of real world software, and sufficiently complex to expose meaningful trade offs and failure modes. A URL shortening service meets those criteria. Here are the stories:</p>

<table>
  <thead>
    <tr>
      <th>Story</th>
      <th>Title</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/1">EPIC</a></td>
      <td>URL Shortener Service</td>
      <td>Build a complete URL shortener service with persistence, automatic expiry, and scheduled maintenance.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/2">1</a></td>
      <td>Project Initialisation</td>
      <td>Create the project structure and infrastructure.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/3">2</a></td>
      <td>Shorten URL</td>
      <td>Shortens the given URL.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/4">3</a></td>
      <td>Get URL</td>
      <td>Returns a URL for the given short key.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/5">4</a></td>
      <td>URL Redirection</td>
      <td>Redirect requests for a short key to the URL.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/6">5</a></td>
      <td>Improve Duplicate Key Handling</td>
      <td>Detect and handle the extremely rare case where the same short key is generated for different URLs.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/7">6</a></td>
      <td>Expire Redirects</td>
      <td>Automatically expire the redirects when they have not been accessed for a configurable period of time.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/8">7</a></td>
      <td>Delete Expired Redirects</td>
      <td>Automatically delete expired redirects.</td>
    </tr>
    <tr>
      <td><a href="https://github.com/cressie176/shorty/issues/9">8</a></td>
      <td>Schedule Database Maintenance</td>
      <td>Schedule daily database maintenance to maintain PostgreSQL query planner statistics.</td>
    </tr>
  </tbody>
</table>

<h3 id="hypothesis">Hypothesis</h3>

<p>The vast variation of experience comes from how Generative AI is being used, not from a fundamental weakness in the tool itself.</p>

<h3 id="apparatus">Apparatus</h3>

<ul>
  <li><strong>Machine:</strong> MacBook M1 Pro with 32GB RAM</li>
  <li><strong>Operating system:</strong> macOS 15.6.1 (24G90)</li>
  <li><strong>Model:</strong> Claude Sonnet 4.5 via AWS Bedrock</li>
  <li><strong>Context window:</strong> 1M tokens</li>
  <li><strong>Claude Code version:</strong> 2.0.76</li>
  <li><strong>Editor:</strong> <a href="https://zed.dev">zed</a> (barely used)</li>
  <li><strong>Generative AI execution:</strong> Separate terminal window running in plan mode, with safe GitHub and Bash commands pre-allowed</li>
  <li><strong>CLI tooling:</strong> <a href="https://cli.github.com">GitHub CLI</a> and other <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/macos-tools/commands/install-macos-tools.md">macos-tools</a></li>
  <li><strong>Claude Marketplace:</strong> <a href="https://github.com/cressie176/cressie176-claude-marketplace">cressie176-claude-marketplace</a></li>
  <li><strong>Node.js Templates:</strong> <a href="https://github.com/cressie176/node-templates">node-templates</a></li>
  <li><strong>Stories:</strong> <a href="https://github.com/cressie176/shorty/issues">shorty/issues</a></li>
</ul>

<h3 id="method-1-prompt-bootstrapping-implementation-notes-and-manual-accepts-abandoned">Method 1: Prompt Bootstrapping, Implementation Notes and Manual Accepts (Abandoned)</h3>
<p>The initial approach was to use pre-written stories, marketplace skills and interactive prompts to fully implement the URL shortening service. The intention was to encode structure, constraints, and best practices entirely through instructions. Unfortunately, this proved unreliable, particularly while the codebase was in its infancy. Even when instructions were repeated and made increasingly explicit, Claude would occasionally ignore them or drift away from the intended structure. Continuing in this direction wasted both time and tokens. I needed a way to bootstrap the application without Claude.</p>

<h3 id="method-2-template-bootstrapping-implementation-notes-and-manual-accepts">Method 2: Template Bootstrapping, Implementation Notes and Manual Accepts</h3>
<p>For my second attempt, I still worked from pre-written stories, marketplace skills and interactive prompts, but my infrastructure story instructed Claude to bootstrap the service from a custom <a href="https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-template-repository">GitHub template repository</a>. A base service template establishes the core structure, with additional templates layered on top for concerns such as PostgreSQL or other infrastructure. This allows common practices to be shared while still supporting different combinations.</p>

<p>Traditional automation struggles here. As the number of layers increases, reliably merging templates becomes difficult, particularly where cross-cutting concerns are involved. This is where Generative AI proved useful. Each layer includes a “Wiring.md” file describing how it should be integrated into the base. Claude can read and apply these instructions in a way that would be awkward to achieve with scripts. The bootstrap process also made it possible to provide Claude.md files in the templates, but this introduced merge problems as templates were combined. Using <a href="https://code.claude.com/docs/en/memory">rules</a> proved more effective, as each template can include its own rules separately.</p>

<p>Even with templates and rules in place, Generative AI still required technical direction at the story level. Each story therefore includes explicit Implementation Notes recommending an approach, along with reminders to review the relevant rules and skills. These could have been entered interactively, but this would have made the experiment less repeatable and less transparent. Templates capture structural decisions, rules capture behavioural constraints, and stories capture local trade-offs and more nuanced technical decisions. Together, they allow experience to be shared through the artefact itself, rather than through constant explanation or review.</p>

<p>Claude was instructed to install the required templates before implementing any functional stories. With that foundation in place, apart from the occasional minor course correction, a single prompt was all that was required.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>╭─── Claude Code v2.0.76 ────────────────────────────────────────────────────────────────────────────╮
│                                                    │ Tips for getting started                      │
│                    Welcome back!                   │ Run /init to create a CLAUDE.md file          │
│                                                    │ ──────────────────────────────────────────────│
│                    * ▗ ▗   ▖ ▖ *                   │ Recent activity                               │
│                   *             *                  │ No recent activity                            │
│                    *   ▘▘ ▝▝   *                   │                                               │
│                                                    │                                               │
│ arn:aws:bedrock:eu-west-1:808… · API Usage Billing │                                               │
│          ~/Development/cressie176/shorty           │                                               │
╰────────────────────────────────────────────────────────────────────────────────────────────────────╯

  /model to try Opus 4.5. Note: you may need to request access from your cloud provider

──────────────────────────────────────────────────────────────────────────────────────────────────────
&gt; Implement the URL Shortener epic https://github.com/cressie176/shorty/issues/1 one story at a time 
──────────────────────────────────────────────────────────────────────────────────────────────────────
  ? for shortcuts
</code></pre></div></div>
<p>One deliberate adjustment was made around test-driven development. I allowed Claude to generate tests and production code for a story in a single pass, rather than enforcing a strict red, green, refactor cycle. My assumption was that the model does not benefit from incremental test feedback in the same way a human does, although this remains an open question.</p>

<p>Almost all of my interaction during this phase took place using Claude via the terminal. I reviewed diffs and observed Claude’s workflow there, intervening only when necessary. I didn’t avoid editing in an IDE, I just never felt the need. I only switched to zed once Claude had completed each story, in order to review the changes as a coherent logical unit and to scan through the tests.</p>

<p>I reran the experiment multiple times from the same starting point and received broadly similar results.</p>

<h3 id="results">Results</h3>

<p>The code can be found here: <a href="https://github.com/cressie176/shorty/">shorty</a></p>

<table>
  <thead>
    <tr>
      <th>Story</th>
      <th>Description</th>
      <th>Status</th>
      <th>Time (hh:mm)</th>
      <th>Interventions</th>
      <th>Commit</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td>Project Initialisation</td>
      <td>✅</td>
      <td>00:08</td>
      <td>0</td>
      <td><a href="https://github.com/cressie176/shorty/commit/2f37d746d84039a2f58e239fa5bae90760db194b">2f37d74</a></td>
    </tr>
    <tr>
      <td>2</td>
      <td>Shorten URL</td>
      <td>✅</td>
      <td>00:21</td>
      <td>7</td>
      <td><a href="https://github.com/cressie176/shorty/commit/b83c51f">b83c51f</a></td>
    </tr>
    <tr>
      <td>3</td>
      <td>Get URL</td>
      <td>✅</td>
      <td>00:10</td>
      <td>4</td>
      <td><a href="https://github.com/cressie176/shorty/commit/ea2f83796b255a7c1130ae4baa92ac097deba1e0">ea2f837</a></td>
    </tr>
    <tr>
      <td>4</td>
      <td>URL Redirection</td>
      <td>✅</td>
      <td>00:07</td>
      <td>3</td>
      <td><a href="https://github.com/cressie176/shorty/commit/9ebc8e37754125c7f6d5feb7c464643a1d24e232">9ebc8e3</a></td>
    </tr>
    <tr>
      <td>5</td>
      <td>Improve Duplicate Key Handling</td>
      <td>✅</td>
      <td>00:10</td>
      <td>3</td>
      <td><a href="https://github.com/cressie176/shorty/commit/9da1d50236d8eae8bd378246d7acd14294859d22">9da1d50</a></td>
    </tr>
    <tr>
      <td>6</td>
      <td>Expire Redirects</td>
      <td>✅</td>
      <td>00:06</td>
      <td>0</td>
      <td><a href="https://github.com/cressie176/shorty/commit/aafa1ba1d980758b303e2749df01f711e73f69eb">aafa1ba</a></td>
    </tr>
    <tr>
      <td>7</td>
      <td>Delete Expired Redirects</td>
      <td>✅</td>
      <td>00:03</td>
      <td>0</td>
      <td><a href="https://github.com/cressie176/shorty/commit/64199839424353f7bd002400afbb28631326607d">6419983</a></td>
    </tr>
    <tr>
      <td>8</td>
      <td>Schedule Database Maintenance</td>
      <td>✅</td>
      <td>00:02</td>
      <td>0</td>
      <td><a href="https://github.com/cressie176/shorty/commit/4ed283719e72528998c7d6cccf3ffdafffae5339">4ed2837</a></td>
    </tr>
    <tr>
      <td> </td>
      <td>Total</td>
      <td> </td>
      <td>01:07</td>
      <td>17</td>
      <td> </td>
    </tr>
  </tbody>
</table>

<h4 id="tool-rejections-deduped">Tool Rejections (deduped)</h4>
<ol>
  <li>Don’t duplicate config in tests</li>
  <li>Prefer single-line if statements</li>
  <li>Don’t use try-catch for testing errors, use throws/rejects</li>
  <li>Pass the full redirect config not just the key</li>
  <li>Use object parameters in constructors</li>
  <li>Don’t be lazy with assertions (use eq not ok/match when you know the full string)</li>
  <li>Inject the full error message into the JSX template</li>
  <li>Suppress expected error logs in tests</li>
  <li>Destructure { rows } instead of result.rows</li>
</ol>
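
<p>Two of these rejections (3 and 6) are easier to see in code. The sketch below assumes the tests use the built-in <code>node:assert</code> module, with <code>strictEqual</code> aliased to <code>eq</code>; the <code>parseKey</code> function is hypothetical and is not taken from the shorty codebase. For promise-returning code, <code>rejects</code> from the same module plays the role that <code>throws</code> plays here.</p>

```typescript
import { strictEqual as eq, throws } from "node:assert";

// Hypothetical function under test; not part of the shorty codebase.
function parseKey(input: string): string {
  if (!/^[a-z0-9]{6}$/.test(input)) throw new Error(`Invalid key: ${input}`);
  return input;
}

// Rejected style: a try-catch passes silently when nothing is thrown.
// try { parseKey("nope!"); } catch (err) { ok(err); }

// Preferred style: throws fails the test if no error is raised, and the
// whole message is compared because the full string is known in advance.
throws(() => parseKey("nope!"), { message: "Invalid key: nope!" });

// Exact equality rather than a loose truthiness or pattern check.
eq(parseKey("abc123"), "abc123");
```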

<p>Using this approach, Claude correctly implemented the URL shortener service in one hour and seven minutes, with minimal intervention or further prompting. The architectural drift and disobedience seen earlier largely disappeared once the environment was properly bootstrapped.</p>

<p>There were some minor style problems. Claude is overly fond of blank lines, and despite it being mentioned in the implementation plan, it missed that an object’s toJSON() method will be called automatically by Hono’s JSON serialiser. Slightly more serious was that Claude omitted running database migrations from each integration test. The tests still passed because migrations were run from the template Postgres.test.ts, but this may not have run first, potentially causing intermittent failures. I have since updated the node-pg template to run migrations automatically from within the Postgres class to avoid this in future. Overall, these are minor niggles, and the code satisfied my goals of minimal operational debt and cleanliness.</p>
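
<p>The <code>toJSON()</code> point can be illustrated with plain <code>JSON.stringify</code>, which is assumed here to be what the framework's JSON helper relies on. The <code>Redirect</code> class and its fields are hypothetical stand-ins, not the code from the repository.</p>

```typescript
// Hypothetical Redirect class; the field names are illustrative only.
class Redirect {
  private readonly key: string;
  private readonly url: string;

  constructor({ key, url }: { key: string; url: string }) {
    this.key = key;
    this.url = url;
  }

  // JSON.stringify looks for a toJSON method and serialises its return value.
  toJSON(): { key: string; url: string } {
    return { key: this.key, url: this.url };
  }
}

const redirect = new Redirect({ key: "abc123", url: "https://example.com/" });

// Redundant: calling toJSON() by hand before serialising.
const explicit = JSON.stringify(redirect.toJSON());

// Equivalent: the serialiser calls toJSON() automatically.
const implicit = JSON.stringify(redirect);
```

<p>Both calls produce the same string, so the explicit call only adds noise at the call site.</p>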

<p>I estimate it would have taken me 1 to 2 working days to produce an equivalent codebase from the same templates without AI. This would suggest that Claude achieved an 8x to 16x improvement. However, the implementation time does not reflect the full cost. These results required repeated iteration on both the stories and the skills. Significant effort went into refining story structure, clarifying implementation notes, and adjusting skills so that Claude behaved consistently.</p>

<h3 id="method-3-templated-bootstraping-no-implementation-notes-and-automatic-accepts">Method 3: Templated Bootstrapping, No Implementation Notes and Automatic Accepts</h3>

<p>As another experiment, I tried a deliberately unattended approach without the benefit of implementation notes. The code can be found here: <a href="https://github.com/cressie176/shorty/tree/claude-unattended">claude-unattended</a>
<br /></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>╭─── Claude Code v2.0.76 ────────────────────────────────────────────────────────────────────────────╮
│                                                    │ Tips for getting started                      │
│                    Welcome back!                   │ Run /init to create a CLAUDE.md file          │
│                                                    │ ──────────────────────────────────────────────│
│                    * ▗ ▗   ▖ ▖ *                   │ Recent activity                               │
│                   *             *                  │ No recent activity                            │
│                    *   ▘▘ ▝▝   *                   │                                               │
│                                                    │                                               │
│ arn:aws:bedrock:eu-west-1:808… · API Usage Billing │                                               │
│          ~/Development/cressie176/shorty           │                                               │
╰────────────────────────────────────────────────────────────────────────────────────────────────────╯

  /model to try Opus 4.5. Note: you may need to request access from your cloud provider

──────────────────────────────────────────────────────────────────────────────────────────────────────
&gt; I want you to implement https://github.com/cressie176/shorty/issues/1 story by story.
  I want you to completely ignore the implementation notes within the GitHub issues for everyting
  EXCEPT the Project Initialisation story. You MUST NOT even read them otherwise they will influence 
  you. To avoid reading them pipe the output from the gh issue command into something to remove 
  everything after the "Implementation Notes" title so that it is completely unavailable to you. 
  When you think a story is done, build, lint, test and commit.
──────────────────────────────────────────────────────────────────────────────────────────────────────
  ? for shortcuts
</code></pre></div></div>

<h3 id="results-1">Results</h3>

<p>Claude completed all eight stories in 30 minutes and 34 seconds. On the surface, this looks impressive. In reality the code was a hot mess.
<br />
<br /></p>

<table>
  <thead>
    <tr>
      <th>Area</th>
      <th>Metric</th>
      <th>Method 2 (Guided)</th>
      <th>Method 3 (Unattended)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Domain</td>
      <td>Files</td>
      <td>1</td>
      <td>1</td>
    </tr>
    <tr>
      <td> </td>
      <td>TypeScript lines</td>
      <td>55</td>
      <td>6</td>
    </tr>
    <tr>
      <td> </td>
      <td>SQL lines</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>TSX lines</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>Comments</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>Services</td>
      <td>Files</td>
      <td>4</td>
      <td>4</td>
    </tr>
    <tr>
      <td> </td>
      <td>TypeScript lines</td>
      <td>47</td>
      <td>199</td>
    </tr>
    <tr>
      <td> </td>
      <td>SQL lines</td>
      <td>7</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>TSX lines</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>Comments</td>
      <td>0</td>
      <td>7</td>
    </tr>
    <tr>
      <td>Routes</td>
      <td>Files</td>
      <td>3</td>
      <td>3</td>
    </tr>
    <tr>
      <td> </td>
      <td>TypeScript lines</td>
      <td>31</td>
      <td>66</td>
    </tr>
    <tr>
      <td> </td>
      <td>SQL lines</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>TSX lines</td>
      <td>18</td>
      <td>0</td>
    </tr>
    <tr>
      <td> </td>
      <td>Comments</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>Overall</td>
      <td>Total files</td>
      <td>8</td>
      <td>8</td>
    </tr>
    <tr>
      <td> </td>
      <td>Total comments</td>
      <td>0</td>
      <td>7</td>
    </tr>
    <tr>
      <td> </td>
      <td><strong>Total lines (excl blanks)</strong></td>
      <td><strong>158</strong></td>
      <td><strong>271</strong></td>
    </tr>
  </tbody>
</table>

<p>The guided version produced 158 lines of code with no comments. The unattended version produced 271 lines with seven (pointless) comments. <strong>That’s 72% more code and 104% more TypeScript for the same behaviour!</strong></p>
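<p>For anyone who wants to check the arithmetic, the percentages can be reproduced from the table. This is only a small sketch: the <code>Counts</code> shape and the function names are mine, invented for illustration, and the figures are copied from the table above.</p>

```typescript
// Line counts copied from the table above, summed across Domain, Services and Routes.
interface Counts {
  typescript: number;
  sql: number;
  tsx: number;
}

const guided: Counts = { typescript: 55 + 47 + 31, sql: 7, tsx: 18 };
const unattended: Counts = { typescript: 6 + 199 + 66, sql: 0, tsx: 0 };

function total(counts: Counts): number {
  return counts.typescript + counts.sql + counts.tsx;
}

// Percentage increase over a baseline, rounded to the nearest whole percent.
function percentMore(baseline: number, value: number): number {
  return Math.round(((value - baseline) / baseline) * 100);
}

console.log(total(guided), total(unattended)); // 158 271
console.log(percentMore(total(guided), total(unattended))); // 72
console.log(percentMore(guided.typescript, unattended.typescript)); // 104
```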

<p>Furthermore, the unattended implementation introduced substantial technical and operational debt:</p>

<ul>
  <li>Unnecessary new services for key generation and expiry management.</li>
  <li>Redirects retrieved from the API were not expired.</li>
  <li>Consecutive database queries were performed using separate client calls, so they were not part of the same transaction, in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/typescript-service-cookbook/skills/typescript-service-cookbook/SKILL.md#unit-of-work-pattern-with-asynclocalstorage">Unit of Work</a> pattern.</li>
  <li>Errors polluted the logs during test runs in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/typescript-tdd-cookbook/skills/typescript-tdd-cookbook/SKILL.md#suppressing-expected-error-logs">Suppress Expected Error in Logs</a> guidance.</li>
  <li>It implemented an explicit rude word filter rather than removing vowels, increasing maintenance burden.</li>
  <li>setInterval was used for the maintenance tasks without unref(), which can prevent the process from exiting and does not scale horizontally.</li>
  <li>Functions were far larger, with SQL and JSX inlined, making the code harder to follow, in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/typescript-clean-code-cookbook/skills/typescript-clean-code-cookbook/SKILL.md#function-design">Function Design</a> guidance.</li>
  <li>PostgreSQL was used poorly, with application code compensating for weak queries, e.g.
    <ul>
      <li>Expiry logic was checked in application code rather than expressed directly in a query.</li>
      <li>Duplicate URLs were inserted rather than being upserted.</li>
      <li>Scheduling was done in application code in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/postgresql-cookbook/skills/postgresql-cookbook/SKILL.md#scheduled-deletion-of-old-records-with-pg_cron">pg_cron</a> guidance.</li>
    </ul>
  </li>
  <li>The codebase was littered with low value comments that narrated rather than clarified, in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/typescript-clean-code-cookbook/skills/typescript-clean-code-cookbook/SKILL.md#comments">Comments</a> guidance.</li>
  <li>Tests used monkey patching instead of dependency injection to fake behaviour.</li>
  <li>Fetch was used directly in tests, increasing duplication, reducing readability and making them brittle, in disregard of the <a href="https://github.com/cressie176/cressie176-claude-marketplace/blob/main/plugins/typescript-tdd-cookbook/skills/typescript-tdd-cookbook/SKILL.md#test-client-pattern">Test Client</a> pattern.</li>
</ul>
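<p>To illustrate the rude word point from the list above: one reading of "removing vowels" is to draw keys from an alphabet that contains no vowels, so there is nothing to filter and no word list to maintain. The sketch below is a minimal illustration under that assumption. The alphabet, the <code>generateKey</code> name and the injectable random source are all hypothetical and are not taken from the shorty repository.</p>

```typescript
// Hypothetical sketch; names and alphabet are illustrative, not from the shorty repository.
// Consonants and digits only. With no vowels available, keys cannot spell most English words.
// Note that 'y' and look-alike digits (0, 1, 3, 4) are kept here; a stricter alphabet could drop them.
const VOWEL_FREE_ALPHABET = 'bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ0123456789';

// Returns a float in [0, 1), like Math.random. Injectable so that tests can be deterministic.
type RandomSource = () => number;

// Math.random is not cryptographically secure; it is used here only to keep the sketch short.
function generateKey(length: number, random: RandomSource = Math.random): string {
  let key = '';
  for (let i = 0; i < length; i++) {
    const index = Math.floor(random() * VOWEL_FREE_ALPHABET.length);
    key += VOWEL_FREE_ALPHABET.charAt(index);
  }
  return key;
}

console.log(generateKey(8));
```

<p>Because the constraint lives in the alphabet rather than in a block list, there is no list of offensive words to curate as language changes, which is the maintenance burden referred to above.</p>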

<h2 id="conclusion">Conclusion</h2>

<p><strong>Method 1 (Prompt Bootstrapping, Implementation Notes, and Manual Accepts)</strong> showed that Claude is ineffective when asked to bootstrap a non-trivial system from scratch using prompts alone. Even with detailed stories and marketplace skills, progress was slow and fragile, with frequent drift making the process an uphill and unnecessary battle. The results suggest that Claude is far more productive when working on an existing or bootstrapped codebase, and that it appears to weight existing artefacts more heavily than abstract guidance from prompts or skills. This limitation motivated the move to a templated bootstrap approach.</p>

<p><strong>Method 2 (Template Bootstrapping, Implementation Notes, and Manual Accepts)</strong> demonstrates that Claude can produce high-quality code far more rapidly than even an experienced software engineer, but only when projects are already established or properly bootstrapped, and the agent is guided by effective prompts, whether through marketplace skills, embedded directly in user stories or provided interactively. While marketplace skills represent a long-term investment that can be reused, stories do not. Incorporating detailed implementation notes into each story was necessary for this experiment, but it also made those stories brittle. Real world success will therefore depend either on becoming very good at writing implementation notes up front, or on the engineers driving Claude being skilled enough to provide sufficiently strong just-in-time guidance. Where that is the case, stories that previously took days can be delivered in hours, shifting the primary bottleneck from implementation to story writing. It may be that AI-empowered teams will need to be significantly smaller, narrower in focus, and able to move rapidly between applications and domains to circumvent this bottleneck.</p>

<p>It is also important to put these productivity gains into context. Software engineers do not spend anything close to 100% of their time writing code. Design discussions, team ceremonies, research, troubleshooting, supporting others, administrative work, and training all consume significant portions of a typical working day. Even a dramatic improvement in coding effectiveness therefore does not translate directly into an equivalent improvement in overall productivity.</p>

<p>In contrast, <strong>Method 3 (Template Bootstrapping, No Implementation Notes, and Automatic Accepts)</strong> demonstrates that Claude is not yet something that can be left to operate unattended while still producing consistently good outcomes. Without strong constraints, it reliably drifts towards verbosity, duplication, and accidental complexity, even when the resulting system is functionally correct. That gap between apparent success and long-term maintainability likely explains much of the current scepticism and pushback.</p>

<p>After all of this, I am still left pondering the following open questions:</p>

<ol>
  <li>Were my goals the right ones (particularly my need for Clean Code)?</li>
  <li>Was my method the most effective way to use Generative AI, and specifically Claude Code?</li>
  <li>Do other goals matter more in different contexts?</li>
</ol>

<p>If you are getting better results from vibe coding, what are you optimising for, and how does your approach support that? If you are getting poor results how does your approach differ from mine? I’d love to know.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Generative AI" /><category term="Claude Code" /><category term="Clean Code" /><category term="Vibe Coding" /><summary type="html"><![CDATA[A controlled experiment reveals why developers report vastly different experiences with AI coding tools - and what it takes to achieve reliable, high-quality results.]]></summary></entry><entry><title type="html">Addendum: Open Source Contributions in Hiring</title><link href="http://www.stephen-cresswell.com/2025/01/21/addendum-open-source-contributions-in-hiring.html" rel="alternate" type="text/html" title="Addendum: Open Source Contributions in Hiring" /><published>2025-01-21T00:00:00+00:00</published><updated>2025-01-21T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2025/01/21/addendum-open-source-contributions-in-hiring</id><content type="html" xml:base="http://www.stephen-cresswell.com/2025/01/21/addendum-open-source-contributions-in-hiring.html"><![CDATA[<p>After publishing my recent <a href="https://www.stephen-cresswell.com/2025/01/07/in-defence-of-coding-tests.html">article</a> defending coding tests, I read a <a href="https://www.linkedin.com/posts/james-mahy_stop-using-github-contributions-in-hiring-activity-7280527334883962880-esvT">post</a> by James Mahy, a genuinely kind, and exceptionally talented developer whom I ironically had the good fortune to interview some years ago. In it, he raises some interesting points, suggesting that GitHub contributions do not always reflect motivation, skill, or tenacity, and may even be misleading. My own experience leads me to a different conclusion.</p>

<p>I have been contributing to open source for almost two decades. It has taught me a great deal about real-world problem-solving, communication, and collaboration. Furthermore, reviewing a candidate’s contributions, whether personal projects, pull requests, or issues, can reveal insights into their creativity, communication skills, and technical mindset. This applies equally to junior developers because even small projects or issues can demonstrate qualities like passion, curiosity, intelligence, diligence, empathy, humility, and respect.</p>

<p>That said, I agree with James that open source contributions should not be a requirement. Plenty of excellent developers do not engage in open source for various reasons. But if someone has made meaningful contributions, I want to hear about them. For me, it is less about the green squares and more about the stories behind them; what interests they have, what challenges they faced, and what they learned along the way.</p>

<p>Open source is just one piece of the puzzle, but it is often a fascinating one. I believe it can be a valuable part of the hiring conversation, no matter what the candidate’s level of experience.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Software Engineering" /><category term="Hiring" /><category term="Interviews" /><category term="Coding Tests" /><category term="Open Source" /><summary type="html"><![CDATA[Why open source contributions can reveal valuable insights about candidates - not as a requirement, but as a window into passion, problem-solving, and communication skills.]]></summary></entry><entry><title type="html">In Defence Of Coding Tests</title><link href="http://www.stephen-cresswell.com/2025/01/07/in-defence-of-coding-tests.html" rel="alternate" type="text/html" title="In Defence Of Coding Tests" /><published>2025-01-07T00:00:00+00:00</published><updated>2025-01-07T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2025/01/07/in-defence-of-coding-tests</id><content type="html" xml:base="http://www.stephen-cresswell.com/2025/01/07/in-defence-of-coding-tests.html"><![CDATA[<p>Hiring software engineers is one of the most challenging and critical responsibilities for any team. Having founded a consultancy and led hiring for other companies, I know first-hand how much is at stake.</p>

<p>A wrong hire can delay work, frustrate teams, and cost both significant time and money. It is not good for the candidate either. Being in the wrong role causes anxiety, damages confidence, and leaves a blemish on their career history. Furthermore, dealing with someone who underperforms is disruptive, demoralising and exhausting for their manager, their colleagues, and the person themselves. Conversely, a good hiring process means that successful candidates get to work on interesting projects with great colleagues and vice versa. This is why it is in everyone’s interest that the process is effective.</p>

<p>However, the methods used in hiring have faced increasing scrutiny, particularly coding tests, whether take-home or live. While these methods are not perfect, dismissing them outright risks missing valuable insights. Let me explore some of these debates and how thoughtful hiring processes can strike a balance.</p>

<h3 id="common-criticisms-of-coding-tests">Common Criticisms of Coding Tests</h3>

<p>Common criticisms of coding tests include:</p>

<ul>
  <li>
    <p><strong>Anxiety:</strong> Many candidates experience stress caused by live coding exercises, which can lead to underperformance during interviews.</p>
  </li>
  <li>
    <p><strong>Time constraints:</strong> Take-home projects can be burdensome, especially for those with limited time due to family or other commitments. An ex-colleague even refused to interview with companies that set take-home tests.</p>
  </li>
  <li>
    <p><strong>Unrealistic tests:</strong> Many coding tests focus on niche skills like recursion or bit masking that are rarely used in day-to-day work. These tests often fail to measure a candidate’s ability to perform in real-world scenarios, leaving a gap in assessing their practical effectiveness.</p>
  </li>
</ul>

<h3 id="why-a-live-coding-test-is-valuable">Why a Live Coding Test Is Valuable</h3>

<p>One of the most revealing parts of a coding interview is observing candidates using their toolchain (computer, IDE, command line, etc). An experienced software engineer develops a kind of muscle memory, navigating their environment with ease and efficiency. Whether it is memorised shortcuts, efficient workflows, favourite packages, or thoughtful organisation, these details strongly correlate with hands-on experience. They demonstrate not only a candidate’s technical skill but also their approach to problem-solving and familiarity with their craft. It is not about perfection; it is about the habits and fluency that only come from practice.</p>

<p>A live coding test allows me to observe these behaviours in action. It shows how candidates approach a problem, how they structure their code, and how they troubleshoot unexpected issues. It also provides insight into their communication skills and how they handle feedback, both of which are critical in collaborative environments.</p>

<p>In some cases, it can reveal red flags about a candidate. For instance, I have encountered several candidates who ignored feedback, which raised concerns about their adaptability and willingness to engage constructively. In another example, I asked a candidate to adapt their in-memory solution to one using persistence with a datastore of their choice. To my surprise, they responded with exasperation, raising serious concerns about their suitability for a collaborative, team-oriented culture.</p>

<h3 id="coding-tests-a-flexible-approach">Coding Tests: A Flexible Approach</h3>

<p>My preferred approach addresses the concerns around live and take-home tests by offering flexibility. I use tests that do not require company domain knowledge but do demonstrate frequently used software engineering skills. These tests are tailored to the experience level of the role, ensuring they are both relevant and fair. Candidates are encouraged to use whatever resources they have available, including Google, Stack Overflow, and AI tools. I always attempt the test myself to ensure its appropriateness and feasibility. Additionally, at least one existing employee of the appropriate level is asked to attempt it as well. I also try to put the candidate at ease before starting the live aspect of the test, rather than jumping straight in, and I advise them that it is not necessary to finish.</p>

<p>I provide candidates with the coding test beforehand. At a minimum, I ask them to read the exercise and come prepared with their environment ready to go. If they prefer, they can complete the test in advance and walk through their solution during the interview, although I will ask them to make changes so I can observe them work. This allows:</p>

<ul>
  <li>
    <p>Time-poor candidates to do the minimum and focus on live problem-solving,</p>
  </li>
  <li>
    <p>Candidates who feel anxious to prepare more thoroughly in advance.</p>
  </li>
</ul>

<p>This approach gives candidates control over how they prepare and ensures I can still evaluate problem-solving skills, adaptability, and hands-on coding ability. Additionally, I have offered a second chance to candidates who suffered badly due to nerves but who performed well otherwise.</p>

<h3 id="final-thoughts">Final Thoughts</h3>

<p>To paraphrase Warren Buffett, at the heart of any good hiring process is the search for intelligence, energy, and integrity. Depending on the level of role, knowledge and wisdom become increasingly important too. No hiring process will ever be perfect at evaluating these criteria, but thoughtful adjustments, like offering flexible coding tests and observing how candidates use their tools, can make the process more effective. The goal is to use these tools not as rigid gatekeepers but as opportunities to learn about candidates and identify those who will thrive in your team.</p>

<h3 id="recommended-reading">Recommended Reading</h3>
<ul>
  <li><a href="https://blog.codinghorror.com/why-cant-programmers-program/">Why Can’t Programmers.. Program?</a></li>
  <li><a href="https://dannorth.net/interviewing-for-evidence/">Interviewing for Evidence</a></li>
</ul>]]></content><author><name>Stephen Cresswell</name></author><category term="Software Engineering" /><category term="Hiring" /><category term="Interviews" /><category term="Coding Tests" /><summary type="html"><![CDATA[Why live coding tests remain valuable in hiring, and how to make them fair and effective for both candidates and employers.]]></summary></entry><entry><title type="html">Introducing Filby: An Open Source Library for Managing Shared Reference Data</title><link href="http://www.stephen-cresswell.com/2024/12/29/introducing-filby.html" rel="alternate" type="text/html" title="Introducing Filby: An Open Source Library for Managing Shared Reference Data" /><published>2024-12-29T00:00:00+00:00</published><updated>2024-12-29T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/12/29/introducing-filby</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/12/29/introducing-filby.html"><![CDATA[<p>Shared reference data can be a tricky aspect of software development, especially in distributed or microservice-based architectures. Enter <strong>Filby</strong>, an open-source library that simplifies the management of such data, ensuring consistency, reliability, and flexibility. Here, we’ll explore how Filby works, its key benefits, and why it might be the solution you’ve been searching for.</p>

<h4 id="the-problem-managing-shared-reference-data">The Problem: Managing Shared Reference Data</h4>

<p>Most applications rely on reference data—information that changes infrequently but must remain consistent across the system. However, managing this data across distributed systems introduces challenges:</p>

<table>
  <thead>
    <tr>
      <th>Challenge</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Consistency</td>
      <td>Whenever we duplicate our reference data, we increase the likelihood of inconsistency. Even if we have one authoritative source of truth, we may cache the reference data in multiple systems, resulting in temporary inconsistency unless cache updates are synchronised. Given the reference data is slow moving, a short period of inconsistency may be acceptable.</td>
    </tr>
    <tr>
      <td>Load Times</td>
      <td>Some reference data sets may be too large to desirably load over a network connection for web and mobile applications. Therefore we should discourage accidentally including large data sets into a client bundle, or requesting large data sets over a network.</td>
    </tr>
    <tr>
      <td>Reliability</td>
      <td>Requesting data sets over a network may fail, especially when mobile. Bundling local copies of reference data into the application (providing they are not too large) will alleviate this, but increase the potential for stale data.</td>
    </tr>
    <tr>
      <td>Stale Data</td>
      <td>Even though reference data is slow moving, it will still change occasionally. Therefore we need a strategy for refreshing reference data.</td>
    </tr>
    <tr>
      <td>Temporality</td>
      <td>When reference data changes, the previous values may still be required for historic comparisons. Therefore all reference data should have an effective date. Effective dates can also be used to synchronise updates by including future records when the values are known in advance. This comes at the cost of increased size, and there may still be some inconsistency due to clock drift and cache expiry times.</td>
    </tr>
    <tr>
      <td>Evolution</td>
      <td>Both reference data and our understanding of the application domain evolve over time. We will at some point need to make backwards incompatible changes to our reference data, and will need to do so without breaking client applications. This suggests a versioning and validation mechanism. The issue of temporality compounds the challenge of evolution, since we may need to retrospectively add data to historic records. In some cases this data will not be known.</td>
    </tr>
    <tr>
      <td>Local Testing</td>
      <td>Applications may be tested locally, and therefore any solution should work well on a local development machine.</td>
    </tr>
  </tbody>
</table>

<h4 id="meet-filby-version-control-for-reference-data">Meet Filby: Version Control for Reference Data</h4>

<p>Think of Filby as <strong>version control for reference data</strong>. Just as source control systems like Git track changes to source code, Filby manages reference data changes. Like checking out a commit, your applications can use Filby to retrieve reference data for a given change set id. They can also inspect the changelog to find which change set was in effect at a given point in time, and subscribe to reference data update notifications.</p>
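<p>To make the "which change set was in effect" idea concrete, here is a minimal sketch of the lookup over a changelog. It is not Filby's API: the <code>ChangeSet</code> shape and the <code>changeSetInEffect</code> function are hypothetical and exist only to illustrate the concept of selecting by effective date.</p>

```typescript
// Hypothetical shapes for illustration only; these are not Filby's actual types or API.
interface ChangeSet {
  id: number;
  effective: Date;
}

// Returns the change set in effect at the given instant: the one with the latest
// effective date that is not after that instant, or undefined if none applies yet.
function changeSetInEffect(changelog: ChangeSet[], at: Date): ChangeSet | undefined {
  let current: ChangeSet | undefined;
  for (const changeSet of changelog) {
    const started = changeSet.effective.getTime() <= at.getTime();
    const later = current === undefined || changeSet.effective.getTime() > current.effective.getTime();
    if (started && later) current = changeSet;
  }
  return current;
}

const changelog: ChangeSet[] = [
  { id: 1, effective: new Date('2024-01-01T00:00:00Z') },
  { id: 2, effective: new Date('2024-06-01T00:00:00Z') },
  { id: 3, effective: new Date('2025-01-01T00:00:00Z') }, // deployed ahead of time
];

const inEffect = changeSetInEffect(changelog, new Date('2024-08-15T00:00:00Z'));
console.log(inEffect?.id); // 2
```

<p>The third entry shows why effective dates are useful: a change set can be deployed in advance and simply is not selected until its effective date arrives.</p>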

<h4 id="key-features-and-benefits">Key Features and Benefits</h4>

<p>Filby offers a structured approach to managing shared, temporal reference data, delivering several advantages:</p>

<ol>
  <li><strong>Safe, Predictable Updates</strong>: Deploy new reference data ahead of time and activate it when needed.</li>
  <li><strong>Consistency Across Systems</strong>: The change set mechanism ensures consistency, even with distributed systems.</li>
  <li><strong>Historic and Future Data Access</strong>: Retrieve past data or schedule future changes.</li>
  <li><strong>Customisable Projections</strong>: Tailor views of reference data for specific clients or systems.</li>
  <li><strong>Version Control</strong>: Support backward-incompatible changes with versioned projections.</li>
  <li><strong>Caching Made Easy</strong>: Change sets should never change, making them highly cacheable.</li>
  <li><strong>Local Development Support</strong>: Test applications locally using HTTP mocking libraries.</li>
</ol>

<h4 id="filbys-core-concepts">Filby’s Core Concepts</h4>

<p>To understand how Filby works, let’s break down its main components:</p>

<ul>
  <li><strong>Projections</strong>: Versioned views of reference data, tailored to specific use cases.</li>
  <li><strong>Entities</strong>: The individual pieces of reference data.</li>
  <li><strong>Data Frames</strong>: Snapshots of entities at specific points in time, grouped by change sets.</li>
  <li><strong>Change Sets</strong>: Logical bundles of updates with a common effective date.</li>
  <li><strong>Notifications</strong>: Update events for maintaining synced data across systems.</li>
  <li><strong>Hooks</strong>: Custom event handlers triggered by data changes.</li>
</ul>

<pre>
┌───────────────┐
│               │
│               │ announces changes via
│  Projection   │────────────────────────┐
│               │                        │
│               │                        │
└───────────────┘                        │
        │ depends on                     │
        │                                │
        │                                │
        │                                │
       ╱│╲                              ╱│╲
┌───────────────┐                ┌───────────────┐                 ┌──────────────┐
│               │                │               │                 │              │
│               │                │               │╲ delivered via  │              │
│    Entity     │                │ Notification  │─────────────────│     Hook     │
│               │                │               │╱                │              │
│               │                │               │                 │              │
└───────────────┘                └───────────────┘                 └──────────────┘
        │ aggregates                    ╲│╱ is raised by
        │                                │
        │                                │
        │                                │
       ╱│╲                               │
┌───────────────┐                ┌───────────────┐
│               │                │               │
│               │╲ is grouped by │               │
│  Data Frame   │────────────────│  Change Set   │
│               │╱               │               │
│               │                │               │
└───────────────┘                └───────────────┘
</pre>

<h4 id="a-real-world-use-case">A Real-World Use Case</h4>

<p>Imagine a system managing holiday park data. With Filby:</p>

<ol>
  <li>Define entities like “Park” and “Season” in JSON or YAML files.</li>
  <li>Create a changelog for the Park and Season data updates in JSON, YAML, CSV or SQL.</li>
  <li>Tailor projections to specific needs, such as a Mobile App requiring a compact view of park details.</li>
  <li>Use Filby’s API to retrieve park data at any point in time.</li>
  <li>Add “hooks” to notify other systems that new data and/or projections are available.</li>
</ol>

<pre>
               Change
               Hook       Invalidate Cache
           ┌─────────┐   ┌──────────────┐
           │         │   │              │
           │         ▼   │              ▼
┌────┬───────────┬──────────┐       ┌────────┐   GET /changelog/parks/v1        ┌──────────┐
│    │           │          │◀──────│        │◀─────────────────────────────────│          │
│    │           │ RESTful  │       │        │                                  │  Mobile  │
│ DB │   Filby   │   API    │       │  CDN   │                                  │   App    │
│    │           │          │       │        │   GET /projection/parks/v1?id=29 │          │
│    │           │          │◀──────│        │◀─────────────────────────────────│          │
└────┴───────────┴──────────┘       └────────┘                                  └──────────┘
           ▲
           │
           │
┌─────────────────────────────────────────────┐
│                                             │
│               Reference Data                │
│                 Change Sets                 │
│                                             │
└─────────────────────────────────────────────┘
     ▲           ▲           ▲           ▲
     │           │           │           │
     │           │           │           │
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│         │ │         │ │         │ │         │
│   CSV   │ │  YAML   │ │  JSON   │ │   SQL   │
│         │ │         │ │         │ │         │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
</pre>
<p>In the above scenario, the first of the two API calls, <code class="language-plaintext highlighter-rouge">/api/changelog</code>, lists the change set ids and effective dates for the specified projection. The second API call, <code class="language-plaintext highlighter-rouge">/api/projection/v1/parks</code>, requests the park data at a specific point in time. The requests are routed via a CDN for caching. When the data behind the projections changes, a hook causes the cache to be invalidated.
However, it’s completely up to you how the projections are accessed - you could build a RESTful API to expose them over HTTP as above, bundle them in a client side JavaScript module or export them as a set of <a href="https://avro.apache.org/">Apache AVRO</a> files to S3.</p>

<h3 id="conclusion">Conclusion</h3>

<p>Filby transforms how you manage shared, temporal reference data, combining the best practices of source control with runtime flexibility, no matter how the data is consumed. It isn’t a turnkey system though: you need to model and provide the data, then write the code that calls the Filby API to retrieve the reference data at a point in time and expose it to the outside world.</p>

<p>Ready to explore Filby? <a href="https://github.com/acuminous/filby">Start here</a>.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Temporal" /><category term="Versioned" /><category term="Reference Data" /><category term="PostgreSQL" /><summary type="html"><![CDATA[An open-source library that simplifies managing versioned reference data across distributed systems with consistency, reliability, and flexibility.]]></summary></entry><entry><title type="html">Freedom vs. Frameworks</title><link href="http://www.stephen-cresswell.com/2024/12/25/freedom-vs-frameworks.html" rel="alternate" type="text/html" title="Freedom vs. Frameworks" /><published>2024-12-25T00:00:00+00:00</published><updated>2024-12-25T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/12/25/freedom-vs-frameworks</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/12/25/freedom-vs-frameworks.html"><![CDATA[<p>In his book <em>Drive</em>, Dan Pink highlights autonomy as one of the critical motivators for high-performing teams. Autonomy empowers engineers to innovate, solve problems creatively, and own their work. But with great autonomy comes great responsibility — and sometimes, significant challenges. One such challenge is the tendency for engineers to gravitate towards shiny new toys, often prioritising popularity over suitability. I explored this phenomenon in my blog post <a href="https://www.stephen-cresswell.com/2024/04/17/prisma-and-the-naivety-of-crowds.html">“Prisma and the Naivety of Crowds”</a>, where I discussed how crowdsourcing trends can sometimes overshadow thoughtful, context-specific decision-making.</p>

<p>Guidelines are necessary to strike a balance between autonomy and consistency, ensuring that choices align with organisational goals. But guidelines alone are insufficient. As Jeff Bezos famously said, “Good intentions don’t work. Good mechanisms do.” To address this, I’m excited to introduce <a href="https://github.com/acuminous/eslint-plugin-tech-radar">eslint-plugin-tech-radar</a>, a robust mechanism to help engineering teams enforce dependency guidelines at scale.</p>

<h3 id="introducing-eslint-plugin-tech-radar">Introducing eslint-plugin-tech-radar</h3>

<p>A traditional <a href="https://www.thoughtworks.com/radar/byor">Tech Radar</a> provides a visual framework for evaluating tools and technologies based on maturity and strategic alignment. However, it doesn’t prevent engineers from inadvertently or deliberately installing prohibited modules. Mechanisms like private npm registries or post-installation scans have significant drawbacks—blocking transitive dependencies or acting too late in the pipeline.</p>

<p><strong>eslint-plugin-tech-radar</strong> solves these problems by integrating directly into your development workflow. It provides:</p>

<ul>
  <li><strong>Proactive validation:</strong> Dependencies are checked against your organisation’s Tech Radar during development, pre-commit, or CI/CD stages.</li>
  <li><strong>Shared configuration:</strong> Centralised rules ensure consistent enforcement across repositories.</li>
  <li><strong>Flexibility:</strong> Teams can override or adapt rules on a per-repository basis using familiar ESLint escape hatches.</li>
  <li><strong>Version tracking:</strong> A built-in “latest” rule ensures your shared configuration stays up to date.</li>
</ul>

<p>By leveraging eslint-plugin-tech-radar, teams can adhere to established guidelines without sacrificing agility.</p>

<h3 id="getting-started">Getting Started</h3>

<h4 id="step-1-define-your-tech-radar">Step 1: Define Your Tech Radar</h4>

<p>Start by building your Tech Radar. For example:</p>

<pre><code class="language-csv">name,ring,quadrant,isNew,description
prisma,hold,backend,FALSE,Persistence
winston,hold,backend,FALSE,Logging
bunyan,hold,backend,FALSE,Logging
@pgtyped/query,assess,backend,TRUE,Persistence
orchid-orm,trial,backend,FALSE,Persistence
pino,adopt,backend,FALSE,Logging
sequelize,adopt,backend,FALSE,Persistence
</code></pre>

<p>Export this CSV as a JSON configuration:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx <span class="nt">--package</span> eslint-plugin-tech-radar <span class="nt">--</span> export-tech-radar <span class="se">\</span>
  <span class="nt">--input</span> radar.csv <span class="se">\</span>
  <span class="nt">--documentation</span> https://github.com/your-organisation/tech-radar <span class="se">\</span>
  <span class="nt">--output</span> radar.json
</code></pre></div></div>
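
<p>Conceptually, the export groups package names by ring. The sketch below illustrates that idea in plain JavaScript. It is not the plugin’s implementation: the <code class="language-plaintext highlighter-rouge">groupByRing</code> function is hypothetical, and it assumes a simple CSV with a header row and no quoted fields.</p>

<pre><code class="language-javascript">// Hypothetical helper: groups package names by ring from simple CSV text.
// Assumes a header row and no quoted or escaped fields.
function groupByRing(csv) {
  const lines = csv.trim().split('\n');
  const header = lines[0].split(',');
  const nameIndex = header.indexOf('name');
  const ringIndex = header.indexOf('ring');
  const rings = {};
  lines.slice(1).forEach(function (line) {
    const fields = line.split(',');
    const ring = fields[ringIndex];
    // Append the package name to the list for its ring
    rings[ring] = (rings[ring] || []).concat(fields[nameIndex]);
  });
  return rings;
}

const radar = groupByRing([
  'name,ring,quadrant,isNew,description',
  'prisma,hold,backend,FALSE,Persistence',
  'pino,adopt,backend,FALSE,Logging',
  'sequelize,adopt,backend,FALSE,Persistence',
].join('\n'));

console.log(radar);
</code></pre>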

<h4 id="step-2-create-a-shared-eslint-configuration">Step 2: Create a Shared ESLint Configuration</h4>

<p>Create a shared ESLint configuration that uses your Tech Radar:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"extends"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"eslint:recommended"</span><span class="p">],</span><span class="w">
  </span><span class="nl">"plugins"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"tech-radar"</span><span class="p">],</span><span class="w">
  </span><span class="nl">"rules"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"tech-radar/adherence"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
      </span><span class="s2">"error"</span><span class="p">,</span><span class="w">
      </span><span class="p">{</span><span class="w">
        </span><span class="nl">"hold"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"prisma"</span><span class="p">,</span><span class="w"> </span><span class="s2">"winston"</span><span class="p">,</span><span class="w"> </span><span class="s2">"bunyan"</span><span class="p">],</span><span class="w">
        </span><span class="nl">"assess"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"@pgtyped/query"</span><span class="p">],</span><span class="w">
        </span><span class="nl">"trial"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"orchid-orm"</span><span class="p">],</span><span class="w">
        </span><span class="nl">"adopt"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"pino"</span><span class="p">,</span><span class="w"> </span><span class="s2">"sequelize"</span><span class="p">],</span><span class="w">
        </span><span class="nl">"ignore"</span><span class="p">:</span><span class="w"> </span><span class="p">[],</span><span class="w">
        </span><span class="nl">"documentation"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://github.com/your-organisation/tech-radar"</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">],</span><span class="w">
    </span><span class="nl">"tech-radar/latest"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
      </span><span class="s2">"error"</span><span class="p">,</span><span class="w">
      </span><span class="p">{</span><span class="w">
        </span><span class="nl">"packages"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"eslint-config-your-organisation"</span><span class="p">]</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">]</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h4 id="step-3-enforce-rules-across-repositories">Step 3: Enforce Rules Across Repositories</h4>

<p>Include the shared configuration in your application’s ESLint setup:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"extends"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"eslint-config-your-organisation"</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Run ESLint as part of your CI/CD pipeline or pre-commit hooks to ensure compliance.</p>

<h3 id="real-world-examples">Real-World Examples</h3>

<p>If your <code class="language-plaintext highlighter-rouge">package.json</code> includes a dependency on <code class="language-plaintext highlighter-rouge">prisma</code> (a “hold” package), ESLint will flag it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; eslint .

~/your-application/package.json
  1:1  error  Package 'prisma' is discouraged. See https://github.com/your-organisation/tech-radar for more details  tech-radar/adherence

✖ 1 problem (1 error, 0 warnings)
</code></pre></div></div>
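
<p>The idea behind this check can be sketched as a comparison between a manifest’s dependencies and the “hold” ring. The following is an illustration only, not the plugin’s source, and the <code class="language-plaintext highlighter-rouge">findDiscouraged</code> function is hypothetical.</p>

<pre><code class="language-javascript">// Hypothetical helper: lists the dependencies that appear in the "hold" ring.
function findDiscouraged(manifest, hold) {
  // Merge runtime and development dependencies into a single set of names
  const names = Object.keys(
    Object.assign({}, manifest.dependencies, manifest.devDependencies)
  );
  return names.filter(function (name) {
    return hold.includes(name);
  });
}

const manifest = {
  dependencies: { prisma: '^5.0.0', pino: '^9.0.0' },
  devDependencies: { eslint: '^8.0.0' },
};

console.log(findDiscouraged(manifest, ['prisma', 'winston', 'bunyan']));
</code></pre>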

<p>If your shared configuration is outdated, the “latest” rule will catch it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; eslint .

~/your-application/package.json
  1:1  error  Package 'eslint-config-your-organisation' must be version 1.0.2.  tech-radar/latest

✖ 1 problem (1 error, 0 warnings)
</code></pre></div></div>

<h3 id="additional-tips">Additional Tips</h3>

<h4 id="block-installations-using-a-preinstall-script">Block Installations Using a Preinstall Script</h4>

<p>To prevent undesirable dependencies from being installed, run ESLint during <code class="language-plaintext highlighter-rouge">npm install</code>:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"scripts"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"preinstall"</span><span class="p">:</span><span class="w"> </span><span class="s2">"eslint ."</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h4 id="encourage-healthy-discussion">Encourage Healthy Discussion</h4>

<p>Changes to the Tech Radar should be accompanied by documented discussions in pull requests or issues. This fosters transparency and ensures alignment.</p>

<h3 id="conclusion">Conclusion</h3>

<p>Autonomy is essential, but so is alignment. eslint-plugin-tech-radar bridges the gap, providing a scalable mechanism for enforcing dependency guidelines. By embedding this tool in your workflow, you can maintain the freedom engineers need to innovate while ensuring their choices align with organisational objectives.</p>

<p>Ready to give it a try? Check out the <a href="https://github.com/acuminous/eslint-plugin-tech-radar">GitHub repository</a> for more details and start building a better, more consistent codebase today.</p>

<hr />
<p><br /></p>

<h4 id="recommended-reading">Recommended Reading</h4>
<ul>
  <li><a href="https://www.danpink.com/books/drive/">Drive</a></li>
</ul>]]></content><author><name>Stephen Cresswell</name></author><category term="Autonomy" /><category term="Control" /><category term="Engineering" /><category term="Leadership" /><category term="Management" /><category term="Tech Radar" /><summary type="html"><![CDATA[Balancing engineering autonomy with organisational standards through eslint-plugin-tech-radar - a mechanism that enforces dependency guidelines without sacrificing developer freedom.]]></summary></entry><entry><title type="html">Turn The Team Around!</title><link href="http://www.stephen-cresswell.com/2024/12/24/turn-the-team-around.html" rel="alternate" type="text/html" title="Turn The Team Around!" /><published>2024-12-24T00:00:00+00:00</published><updated>2024-12-24T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/12/24/turn-the-team-around</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/12/24/turn-the-team-around.html"><![CDATA[<p>Targets can be a double-edged sword. When poorly conceived, they encourage behaviours that prioritise metrics over real success. For instance, setting a goal like “close 100 tickets this quarter” might push a team to focus on easy wins rather than meaningful work. However, I once encountered a target that broke this mould and completely shifted my perspective. It came when I was challenged to “make Portal a destination team”.</p>

<p>At the time, the School Portal team had a poor reputation. Engineers were reluctant to join, citing a disjointed codebase and a team culture that, while professional, felt siloed and under immense pressure. David’s target wasn’t about deliverables; it was about creating a team environment where people wanted to be. Success would be measured by whether engineers were eager to join us.</p>

<h3 id="taking-stock">Taking Stock</h3>

<p>When I first joined the Portal team, I took a close look at the challenges we faced. The problems were easy to spot but significant in scale. The four engineers on the team were working in isolation, each handling months-long initiatives. This approach had been the norm for over a year, leading to knowledge silos. There was no slack. If something broke, only the original developer could fix it, and this would derail whatever feature they were currently working on. The product manager was constantly juggling stakeholder expectations amidst these disruptions, and rapidly losing their confidence.</p>

<p>The quality of the work also left much to be desired. Without collaboration or shared accountability, individual coding styles went unchecked. For example, I know my tendency to overcomplicate solutions needs balancing input from others, but here, such feedback loops were nonexistent. The user experience was no better. Customers frequently reported “bugs” that were actually just confusing designs that violated basic UX principles.</p>

<p>An earlier attempt to fix the codebase had backfired. The team had embarked on a large-scale refactor, but without breaking it into manageable, independent deliverables. They didn’t finish a quarter of the work before their three-month grace period elapsed, leaving the code in an even worse state than before. This was compounded by architectural issues which resulted in frequent timeouts and race conditions.</p>

<p>The planning board was another headache. Hundreds of tickets sat there: bugs, technical improvements, features. It was impossible to prioritise effectively. Meanwhile, bugs were piling up, creating a flood of failure demand (work caused by defects or inefficiencies, as opposed to value demand which directly serves customer needs). One bug in particular highlighted the problem. Over the course of a year, an issue with embedded videos was reported five times by different customers. Each time, it was investigated, deemed low priority, and closed without being fixed. This resulted in a poor customer experience and wasted time not only for our team but also for the company’s helpdesk. Each report required a day of investigation, yet fixing the issue outright would have taken less than two days.</p>

<p>What struck me most, though, was the team’s attitude. They openly disparaged the codebase. It reminded me of a story from “Turn the Ship Around!” by L. David Marquet, where the submarine crew were ashamed of their vessel until they rediscovered pride in their work. The Portal team needed that same shift in mindset.</p>

<h3 id="turning-the-tide">Turning the Tide</h3>

<p>The first challenge was morale. Without hope for change, there would be no progress. Luckily, I had an ace up my sleeve: John Hustler, a colleague and friend. John’s thoughtful approach and willingness to go where he was needed most made him the perfect addition. When he joined, it gave me the support I needed to align the team.</p>

<p>To foster a sense of pride, I introduced a small but symbolic initiative. We created a “Show Portal Some Love” ticket. Every time someone made an improvement, no matter how small, they earned a ❤️. The team may have thought it a bit childish, but my hope was that it gave them permission to make things better, while celebrating small wins and lightening the mood. Slowly but surely, they started to see the Portal as something worth investing in.</p>

<p>The next step was tackling waste. I reviewed every ticket on our bloated planning board, moving low-priority ones to a separate board and closing those that weren’t worth pursuing. The issues I moved weren’t high priority from the product perspective, but they generated a lot of noise or failure demand. I broke them down into clear, actionable steps and, with the help of a borrowed contractor, chipped away at them. Quick wins, like simple UX improvements, reduced customer confusion and lowered the volume of reported issues. We also removed unreliable, nice-to-have features, such as a tool for exporting CVs as a single PDF. This feature failed one time in ten, often crashing the system or timing out. Fixing it would have taken weeks, but removing it took minutes, eliminating a major source of frustration for both users and the team.</p>

<p>With fewer interruptions, the team could finally focus. The improved flow created a virtuous cycle: more capacity led to more fixes and a higher standard of work, which led to even greater capacity. Over time, we gained enough breathing room to address deeper issues, like architectural flaws.</p>

<p>Meanwhile, I worked with the product manager and Chief Product Officer to reduce the team’s workload. We dropped one of the large deliverables, freeing up an engineer to focus on improvements. This decision, though difficult, paid off as the team regained momentum.</p>

<h3 id="a-new-way-of-working">A New Way of Working</h3>

<p>Once the immediate crises were under control, I shifted our approach. Instead of each engineer tackling separate features, we began working together on a single epic. Our first shared effort was integrating social media into the Portal jobs platform. The team was initially sceptical. They worried it would slow them down, but the results spoke for themselves.</p>

<p>Working together meant we could deliver incrementally. Instead of waiting three months to finish three separate features, we completed one feature in the first month, another in the second, and the final one in the third. Not only did this approach deliver value faster, but it also fostered collaboration and shared ownership.</p>

<h3 id="the-results">The Results</h3>

<p>Within six months, the transformation was evident. The backlog was cleared, the main board had fewer than 25 tickets, and morale was at an all-time high. Six months later, the team had re-architected the application, eliminating performance bottlenecks and race conditions while continuing to deliver product features. We showcased our work at departmental demos, and, for the first time, other engineers were expressing interest in joining the Portal team.</p>

<p>By focusing on collaboration, reducing waste, and building pride, we had turned the team around. The journey wasn’t about individual heroics but about creating an environment where the team could thrive. It’s proof that with the right focus, any team can become a destination team.</p>

<hr />
<p><br /></p>
<h4 id="recommended-reading">Recommended Reading</h4>
<ul>
  <li><a href="https://davidmarquet.com/turn-the-ship-around-book/">Turn the Ship Around!</a></li>
  <li><a href="https://www.nngroup.com/books/design-everyday-things-revised/">The Design of Everyday Things</a></li>
</ul>]]></content><author><name>Stephen Cresswell</name></author><category term="Team" /><category term="Rescue" /><category term="Motivation" /><category term="Slack" /><category term="Waste" /><category term="Agile" /><category term="Targets" /><summary type="html"><![CDATA[How a single unconventional target - 'make Portal a destination team' - transformed a struggling, siloed team into one where engineers wanted to work.]]></summary></entry><entry><title type="html">NPM search is broken</title><link href="http://www.stephen-cresswell.com/2024/12/23/npm-search-is-broken.html" rel="alternate" type="text/html" title="NPM search is broken" /><published>2024-12-23T00:00:00+00:00</published><updated>2024-12-23T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/12/23/npm-search-is-broken</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/12/23/npm-search-is-broken.html"><![CDATA[<p>As an active user of npm and an author/maintainer of several libraries, I’ve recently encountered significant problems with npm’s new search experience. While I appreciate the effort the npm team has put into improving search, the current implementation has introduced serious issues that negatively impact discoverability and usefulness.</p>

<h3 id="whats-the-issue">What’s the Issue?</h3>
<p>The new search algorithm prioritises objective sorting criteria like relevance, download counts, dependency counts, and publication date. However, these changes have led to surprising and often unhelpful results in real-world usage. Here are some key concerns:</p>

<ul>
  <li>
    <p><strong>Irrelevant Results</strong>: Searching for “RabbitMQ” now returns results like <a href="https://www.npmjs.com/package/vasync">vasync</a> and <a href="https://www.npmjs.com/package/slugid">slugid</a>, which are unrelated to RabbitMQ. This behaviour was not observed with the previous search implementation.</p>
  </li>
  <li>
    <p><strong>Misranked Relevant Packages</strong>: Established and actively maintained libraries have been pushed far down the rankings for relevant keywords, making them difficult to find.</p>
  </li>
  <li>
    <p><strong>Prominence of Obscure or Stale Packages</strong>: In searches like “hierarchical configuration,” packages that are outdated, rarely downloaded, or both dominate the results, displacing widely used and current libraries.</p>
  </li>
</ul>

<h3 id="what-might-have-gone-wrong">What might have gone wrong?</h3>
<p>From my observations, it appears the new search algorithm may disregard or de-emphasise package metadata such as “keywords” in favour of the contents of the package’s README. The problem with prioritising the README is that it inadvertently boosts irrelevant libraries. For example, many libraries include a section in their README for migrating from previous versions. This causes them to rank highly for searches like “database migration” even when they have nothing to do with database migration. Similarly, many libraries include a section titled “Config” or “Configuration” to explain how to set up the package. This means irrelevant libraries frequently rank highly when searching for “configuration”.</p>

<h3 id="demonstrating-the-problem">Demonstrating the problem</h3>
<p>To see just how poorly the new search performs for these examples, I searched for “hierarchical configuration” and reviewed the top 10 results…</p>

<h4 id="default-search-order">‘Default’ search order</h4>
<table>
<tr><th>Rank</th><th>Library</th><th>Assessment</th><th>Relevant</th><th>Current</th><th>Popular</th></tr>
<tr><td>1</td><td>node-env-configuration</td><td>A hierarchical configuration library with 4K downloads, last published 8 years ago</td><td>✅</td><td>❌</td><td>❌</td></tr>
<tr><td>2</td><td>ngx-access</td><td>An access control library for Angular with 78 downloads, last published 4 years ago</td><td>❌</td><td>❌</td><td>❌</td></tr>
<tr><td>3</td><td>turing-config</td><td>A hierarchical configuration library with 30 downloads, last published 7 years ago</td><td>✅</td><td>❌</td><td>❌</td></tr>
<tr><td>4</td><td>typeconf</td><td>A hierarchical configuration library with 21 downloads, last published 7 years ago </td><td>✅</td><td>❌</td><td>❌</td></tr>
<tr><td>5</td><td>config</td><td>A hierarchical configuration library with 6M downloads, last published 5 months ago</td><td>✅</td><td>✅</td><td>✅</td></tr>
<tr><td>6</td><td>d3-hierarchy</td><td>A library containing layout algorithms for hierarchical data with 17.6M downloads, last published 3 years ago</td><td>❌</td><td>❌</td><td>✅</td></tr>
<tr><td>7</td><td>config-core</td><td>A hierarchical configuration library with 20 downloads, last published 4 years ago</td><td>✅</td><td>❌</td><td>❌</td></tr>
<tr><td>8</td><td>@ehosick/config-core</td><td>A republish / duplicate of (7)</td><td>✅</td><td>❌</td><td>❌</td></tr>
<tr><td>9</td><td>nconf</td><td>A hierarchical configuration library with 3.5M downloads, last published 1 year ago</td><td>✅</td><td>✅</td><td>✅</td></tr>
<tr><td>10</td><td>fconf</td><td>A hierarchical configuration library with 11 downloads, last published 5 years ago</td><td>✅</td><td>❌</td><td>❌</td></tr>
</table>

<p>Only 8 of the above results are configuration libraries, 6 have fewer than 100 downloads and 7 haven’t been published within 3 years. Only <a href="https://www.npmjs.com/package/config">config</a> and <a href="https://www.npmjs.com/package/nconf">nconf</a> arguably justify a top 10 spot. More relevant, current and popular libraries like <a href="https://www.npmjs.com/package/cosmiconfig">cosmiconfig</a> (61M) and <a href="https://www.npmjs.com/package/dotenv">dotenv</a> (42M) do not even feature on the first page.</p>

<h4 id="most-downloaded-this-week-search-order">‘Most Downloaded This Week’ search order</h4>
<table>
<tr><th>Rank</th><th>Library</th><th>Assessment</th><th>Relevant</th><th>Current</th><th>Popular</th></tr>
<tr><td>1</td><td>commander</td><td>A cli library with 165M downloads, last published 7 months ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>2</td><td>execa</td><td>A process execution library with 87M downloads, last published 1 month ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>3</td><td>schema-utils</td><td>A webpack validation library with 80M downloads, last published 1 year ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>4</td><td>strip-json-comments</td><td>A library to remove comments from JSON files with 62M downloads, last published 1 year ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>5</td><td>cosmiconfig</td><td>A hierarchical configuration library with 61M downloads, last published 1 year ago</td><td>✅</td><td>✅</td><td>✅</td></tr>
<tr><td>6</td><td>eslint</td><td>A linting library with 45M downloads, last published 1 week ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>7</td><td>dotenv</td><td>A hierarchical configuration library with 42M downloads, last published 3 days ago</td><td>✅</td><td>✅</td><td>✅</td></tr>
<tr><td>8</td><td>diff-sequences</td><td>A library for comparing sequences with 41M downloads, last published 1 year ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>9</td><td>@eslint/eslintrc</td><td>A linting library with 36M downloads, last published 1 month ago</td><td>❌</td><td>✅</td><td>✅</td></tr>
<tr><td>10</td><td>css-select</td><td>A CSS selector compiler with 32M downloads, last published 3 years ago</td><td>❌</td><td>❌</td><td>✅</td></tr>
</table>

<p>While all of the results are popular and all but one are current, only 2 are configuration libraries.</p>

<h3 id="conclusion">Conclusion</h3>
<p>The new npm search algorithm returns either</p>

<ul>
  <li>mostly relevant, but unpopular and unmaintained libraries, or</li>
  <li>mostly irrelevant, but popular and well maintained libraries</li>
</ul>

<p>Neither option is useful.</p>

<p>I understand that balancing relevance, popularity, and recency is complex. Furthermore, improving tools like npm search is no small feat, especially given the wide-ranging needs of its user base. I deeply appreciate the work that maintainers do to improve npm’s tools. However, the current implementation has clear flaws that undermine its usability.</p>

<p>One of the npm project’s authors/maintainers announced the changes via <a href="https://github.com/orgs/community/discussions/144952">this discussion</a>, and invited feedback. I offered the above analysis three weeks ago, but it was initially misunderstood and, after clarification, has yet to be adequately acknowledged. If you’ve encountered similar issues or have ideas for improvement, your feedback could help improve npm’s search experience. Please join <a href="https://github.com/orgs/community/discussions/144952">the discussion</a> on GitHub.</p>

<p>Thank you.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="npm" /><category term="search" /><summary type="html"><![CDATA[NPM's new search algorithm returns either relevant but unmaintained packages, or popular but irrelevant ones. Neither is useful.]]></summary></entry><entry><title type="html">Reminders as Code</title><link href="http://www.stephen-cresswell.com/2024/12/23/reminders-as-code.html" rel="alternate" type="text/html" title="Reminders as Code" /><published>2024-12-23T00:00:00+00:00</published><updated>2024-12-23T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/12/23/reminders-as-code</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/12/23/reminders-as-code.html"><![CDATA[<p>In the fast-paced world of software engineering, missing a critical api-key renewal can cause significant disruption. In one event I was made aware of, the secret key for a third-party payment provider expired after three years. Since the provider did not send a warning, and the individual who created the key forgot to set a reminder, the company lost its ability to process payments, leading to a brief but avoidable outage and a scramble to restore operations. Another example involves Active Directory client secrets. These tokens have a maximum two-year expiry and provide only a small orange triangle as a warning. Such subtle cues can easily be overlooked, leading to avoidable downtime. These incidents demonstrate the necessity of a robust and reliable reminder system.</p>

<p>Many organisations attempt to manage reminders through software calendars like Outlook. However, the user interfaces for such applications are not designed for managing distant events, and if the creator of a reminder leaves the organisation, it may be lost or become read-only.</p>

<p>SaaS platforms that provide automated reminders are another option, but they come with their own challenges. Often, reminders are not a first-class feature and the automation required to set them up is complex. Furthermore, they may have a per-user licensing model or fall within the domain of an IT operations team, making them prohibitively expensive or inaccessible. Finally, these services may not be available years into the future, and may lack an export feature.</p>

<p>Ultimately, you cannot rely on third parties to notify you about critical events, on engineers to set reminders, or on organisations to provide systems that make it convenient to do so.</p>

<h3 id="reminders-as-code">Reminders as Code</h3>

<p>Reminders as Code offers an open, convenient and long-term solution to this problem: storing reminders in a text file committed to a source code repository. By doing so you gain the following advantages:</p>

<ul>
  <li>
    <p><strong>Transparency</strong>: The reminders are transparent, versioned and, provided good source control management practices are in place, accompanied by a meaningful commit message, so their purpose can be understood after the original committer has left.</p>
  </li>
  <li>
    <p><strong>Automation</strong>: All good issue tracking systems expose a programmatic API, so with a suitable script, reminders can be created as issues at the appropriate time.</p>
  </li>
  <li>
    <p><strong>Notification</strong>: Issue tracking systems typically have a wide range of user-configurable notification mechanisms, ensuring the reminders do not go unnoticed.</p>
  </li>
  <li>
    <p><strong>Longevity</strong>: Because reminders are stored in an open format, they can easily be moved if you migrate to a new issue tracking system.</p>
  </li>
  <li>
    <p><strong>Scalability</strong>: The approach scales across teams and projects without relying on individual ownership or third-party platforms.</p>
  </li>
  <li>
    <p><strong>Accessibility</strong>: Anyone with access to the source control system and issue tracking system can manage and receive notification of reminders.</p>
  </li>
</ul>

<h3 id="introducing-knuff">Introducing Knuff</h3>

<p><a href="https://github.com/acuminous/knuff">Knuff</a> is an open-source reminders as code implementation. Reminders are specified as JSON or YAML and stored in plain text. A script processes the reminders file, identifies those that are due, and posts them to the issue tracking system of choice (e.g., GitHub) via a ‘driver’. The script must be run by a scheduler (such as that provided by GitHub Actions) at least daily.</p>
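
<p>To make the processing step concrete, here is a minimal sketch of the idea rather than Knuff’s actual implementation. The reminder shape (<code>id</code>, <code>due</code>), the function names and the driver methods (<code>hasReminder</code>, <code>createReminder</code>) are all assumptions made for illustration, and the in-memory driver simply stands in for a real issue tracker.</p>

```javascript
// Hypothetical sketch only: these names are NOT Knuff's API.
// A reminder here is { id, title, due }, where `due` is an ISO 8601 timestamp.

// Return the reminders whose due time is at or before `now`.
function findDueReminders(reminders, now = new Date()) {
  return reminders.filter((reminder) => new Date(reminder.due).getTime() <= now.getTime());
}

// Post each due reminder through the driver, skipping any that already exist,
// so that repeated scheduled runs do not create duplicate issues.
async function processReminders(reminders, driver, now = new Date()) {
  const dueReminders = findDueReminders(reminders, now);
  let created = 0;
  for (const reminder of dueReminders) {
    if (await driver.hasReminder(reminder.id)) continue;
    await driver.createReminder(reminder);
    created += 1;
  }
  return { due: dueReminders.length, created };
}

// An in-memory stand-in for an issue tracker driver (assumed interface).
function createInMemoryDriver() {
  const issues = new Map();
  return {
    issues,
    async hasReminder(id) {
      return issues.has(id);
    },
    async createReminder(reminder) {
      issues.set(reminder.id, reminder);
    },
  };
}
```

<p>The duplicate check is what makes it safe to run the script more often than strictly necessary: a second run on the same day finds the issue already present and does nothing.</p>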

<h4 id="example-reminders-file">Example Reminders File</h4>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  <span class="c1"># The reminder title</span>
  <span class="na">title</span><span class="pi">:</span> <span class="s1">'</span><span class="s">Update</span><span class="nv"> </span><span class="s">API</span><span class="nv"> </span><span class="s">Key'</span>

  <span class="c1"># The reminder body</span>
  <span class="na">body</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">The API Key expires on the 14th of July 2025.</span>
    <span class="s">- [ ] Regenerate the API Key</span>
    <span class="s">- [ ] Update AWS Secrets Manager</span>
    <span class="s">- [ ] Redeploy the website</span>
    <span class="s">- [ ] Update the reminder for next year</span>

  <span class="c1"># Optional description that can be useful to track the background behind the reminder</span>
  <span class="na">description</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">The API key expires yearly.</span>

  <span class="c1"># Schedule a single reminder for 1st July 2025 (See RRULE / RFC5545 format)</span>
  <span class="na">schedule</span><span class="pi">:</span> <span class="pi">|</span>
    <span class="s">DTSTART;TZID=Europe/London:20250701T080000</span>
    <span class="s">RRULE:FREQ=DAILY;COUNT=1</span>

  <span class="c1"># A set of labels / tags</span>
  <span class="na">labels</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s1">'</span><span class="s">reminder'</span>

  <span class="c1"># A list of repositories to post the reminders to</span>
  <span class="na">repositories</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="s1">'</span><span class="s">acuminous/reminders'</span>
</code></pre></div></div>

<h4 id="example-script">Example Script</h4>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="nx">fs</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">node:fs</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">yaml</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">yaml</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">Octokit</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@octokit/rest</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">Knuff</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@acuminous/knuff</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">GitHubDriver</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@acuminous/knuff-github-driver</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">GITHUB_TOKEN</span> <span class="o">=</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">GITHUB_TOKEN</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">PATH_TO_REMINDERS</span> <span class="o">=</span> <span class="nx">process</span><span class="p">.</span><span class="nx">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">||</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">PATH_TO_REMINDERS</span> <span class="o">||</span> <span class="dl">'</span><span class="s1">reminders.yaml</span><span class="dl">'</span><span class="p">;</span>

<span class="kd">const</span> <span class="nx">config</span> <span class="o">=</span> <span class="p">{</span>
  <span class="na">repositories</span><span class="p">:</span> <span class="p">{</span>
    <span class="dl">'</span><span class="s1">acuminous/reminders</span><span class="dl">'</span><span class="p">:</span> <span class="p">{</span>
      <span class="na">owner</span><span class="p">:</span> <span class="dl">'</span><span class="s1">acuminous</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">name</span><span class="p">:</span> <span class="dl">'</span><span class="s1">reminders</span><span class="dl">'</span><span class="p">,</span>
      <span class="na">driver</span><span class="p">:</span> <span class="dl">'</span><span class="s1">github</span><span class="dl">'</span><span class="p">,</span>
    <span class="p">},</span>
  <span class="p">},</span>
<span class="p">};</span>

<span class="kd">const</span> <span class="nx">octokit</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Octokit</span><span class="p">({</span> <span class="na">auth</span><span class="p">:</span> <span class="nx">GITHUB_TOKEN</span> <span class="p">});</span>
<span class="kd">const</span> <span class="nx">drivers</span> <span class="o">=</span> <span class="p">{</span> <span class="na">github</span><span class="p">:</span> <span class="k">new</span> <span class="nx">GitHubDriver</span><span class="p">(</span><span class="nx">octokit</span><span class="p">)</span> <span class="p">};</span>
<span class="kd">const</span> <span class="nx">knuff</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">Knuff</span><span class="p">(</span><span class="nx">config</span><span class="p">,</span> <span class="nx">drivers</span><span class="p">).</span><span class="nx">on</span><span class="p">(</span><span class="dl">'</span><span class="s1">error</span><span class="dl">'</span><span class="p">,</span> <span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">reminders</span> <span class="o">=</span> <span class="nx">yaml</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">fs</span><span class="p">.</span><span class="nx">readFileSync</span><span class="p">(</span><span class="nx">PATH_TO_REMINDERS</span><span class="p">,</span> <span class="dl">'</span><span class="s1">utf8</span><span class="dl">'</span><span class="p">));</span>

<span class="nx">knuff</span><span class="p">.</span><span class="nx">process</span><span class="p">(</span><span class="nx">reminders</span><span class="p">).</span><span class="nx">then</span><span class="p">((</span><span class="nx">stats</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`Successfully processed </span><span class="p">${</span><span class="nx">stats</span><span class="p">.</span><span class="nx">reminders</span><span class="p">}</span><span class="s2"> reminders`</span><span class="p">);</span>
<span class="p">}).</span><span class="k">catch</span><span class="p">((</span><span class="nx">error</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">(</span><span class="nx">error</span><span class="p">);</span>
  <span class="nx">process</span><span class="p">.</span><span class="nx">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="p">});</span>
</code></pre></div></div>

<h4 id="example-github-action">Example GitHub Action</h4>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">name</span><span class="pi">:</span> <span class="s">Check Reminders</span>

<span class="na">on</span><span class="pi">:</span>
  <span class="na">workflow_dispatch</span><span class="pi">:</span> <span class="c1"># Allows manual triggering of the workflow</span>
  <span class="na">schedule</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">cron</span><span class="pi">:</span> <span class="s1">'</span><span class="s">0</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*</span><span class="nv"> </span><span class="s">*'</span> <span class="c1"># Runs at the start of every hour</span>

<span class="na">jobs</span><span class="pi">:</span>
  <span class="na">run-reminder</span><span class="pi">:</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>

    <span class="na">steps</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Checkout repository</span>
        <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>

      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Install dependencies with npm ci</span>
        <span class="na">run</span><span class="pi">:</span> <span class="s">npm ci</span>

      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Run Knuff</span>
        <span class="na">run</span><span class="pi">:</span> <span class="s">node index.js reminders.yaml</span>
        <span class="na">env</span><span class="pi">:</span>
          <span class="na">GITHUB_TOKEN</span><span class="pi">:</span> <span class="s">${{ secrets.GITHUB_TOKEN }}</span>

</code></pre></div></div>

<h3 id="limitations">Limitations</h3>

<p>Knuff’s primary limitation is that it can’t force engineers to use it. However, to paraphrase the Kevin Costner movie Field of Dreams, “If you build it, they will come”: Knuff encourages adoption through its simplicity and utility.</p>

<p>A second limitation exists for implementations involving many reminders or repositories. A GitHub Action or Personal Access Token like the one used in the above example may be rate limited, forcing you to register the script as a GitHub App and to implement a more convoluted setup and authorisation flow. See <a href="https://github.com/acuminous/knuff/tree/main/examples/enterprise">here</a> for an example.</p>

<h3 id="conclusion">Conclusion</h3>
<p>Reliable reminders are not just a convenience; they are a necessity in ensuring the smooth operation of modern software systems. The consequences of missed notifications, ranging from expired keys to overlooked updates, can lead to downtime, financial loss, and unnecessary fire-fighting. Traditional approaches, such as relying on individual engineers or third-party solutions, have proven inadequate due to their inherent limitations, including ill-suited UI, lack of transparency, dependence on specific platforms, and the risk of obsolescence or inaccessibility.</p>

<p>By adopting the Reminders as Code approach, organisations can address these shortcomings. This method integrates reminders into the development process itself, ensuring they are version-controlled, transparent, and long-lasting. Tools like <a href="https://github.com/acuminous/knuff">Knuff</a> exemplify how this approach can be operationalised, providing a practical, open-source framework that enables automation, scalability, and accessibility.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Reminders" /><category term="Calendar" /><category term="RRule" /><category term="rfc5545" /><summary type="html"><![CDATA[Store critical reminders in version control to avoid costly outages from expired API keys and certificates - a reliable, long-term solution that doesn't depend on third parties.]]></summary></entry><entry><title type="html">Sleepwalking into a Big Data Nightmare</title><link href="http://www.stephen-cresswell.com/2024/10/29/sleepwalking-into-a-big-data-nightmare.html" rel="alternate" type="text/html" title="Sleepwalking into a Big Data Nightmare" /><published>2024-10-29T00:00:00+00:00</published><updated>2024-10-29T00:00:00+00:00</updated><id>http://www.stephen-cresswell.com/2024/10/29/sleepwalking-into-a-big-data-nightmare</id><content type="html" xml:base="http://www.stephen-cresswell.com/2024/10/29/sleepwalking-into-a-big-data-nightmare.html"><![CDATA[<p>Sourcing big data within organisations has become routine, with operational data stores (or their replicas) feeding data lakes like Snowflake or Redshift via expensive Change Data Capture (CDC) tools. These lakes become the foundation for reporting and increasingly, Artificial Intelligence. At first glance, this appears efficient and scalable. But there is a lurking problem: tight coupling between the operational data store and downstream analytics.</p>

<h4 id="the-hidden-danger-of-tight-coupling">The Hidden Danger of Tight Coupling</h4>

<p>Operational data stores are not static; they evolve rapidly under active development. Features are added, models are refined, and, in Agile environments, changes are iterative rather than pre-planned. By sourcing data directly from these stores, you tether your reporting and analytics to a system in flux. This creates several critical issues:</p>

<ol>
  <li>
    <p><strong>Constraints on Engineering</strong>: Changes to the operational data store must accommodate downstream dependencies. Engineers may become bogged down by these constraints, slowing down the development of new features.</p>
  </li>
  <li>
    <p><strong>Broken Integrations</strong>: Without automated tests, changes to the operational store often break models and reports. The resulting chaos is compounded when consumers, often financial and commercial teams, rely on these reports to make decisions.</p>
  </li>
  <li>
    <p><strong>Corporate Fallout</strong>: When reports fail, the fallout is rarely trivial. Commercial stakeholders typically wield significant organisational power. Engineering teams find themselves under pressure to implement ad-hoc fixes and facing new layers of process bureaucracy, introducing waste and blocking progress.</p>
  </li>
</ol>

<h4 id="a-cautionary-tale-job-adverts-at-tes-global">A Cautionary Tale: Job Adverts at Tes Global</h4>

<p>A clear example of this dynamic comes from my experience with Tes Global, the world’s largest teacher network. At Tes, I led the team developing a jobs board where the lifecycle of a job advert was anything but simple, transitioning between implicit and explicit states of ‘draft’, ‘published’, ‘live’, ‘new’, ‘ending soon’, ‘ended’, ‘closed’ and ‘deleted’. Here’s how they worked:</p>

<ul>
  <li>A job advert began as a ‘draft’, and was ‘published’ by the advert’s creator.</li>
  <li>Published job adverts automatically became ‘live’ after the advert start date.</li>
  <li>Job adverts could be soft ‘deleted’ before going live, but after that, they could only be cancelled since there were associated viewing stats, applications, and potentially charges.</li>
  <li>Job adverts were ‘new’ for the first two days, and ‘ending soon’ for the final two, after which they transitioned to ‘ended’.</li>
  <li>Finally, the job advert was ‘closed’ once the application close date had passed, indicating that no further applications could be submitted.</li>
</ul>

<p>The complexity was compounded by the absence of an explicitly persisted state for job adverts. This decision was partly because some state change triggers were time-based, making explicit persistence challenging. Instead, the state was derived in application code from various flags and dates. However, we later faced the somewhat unenviable task of reverse engineering the rules and duplicating them in the data lake, with all the reliability issues that entailed.</p>
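
<p>To illustrate the kind of derivation involved, the rules above might be sketched as follows. This is not the Tes code: the field names (<code>publishedAt</code>, <code>deletedAt</code>, <code>startDate</code>, <code>endDate</code>, <code>closeDate</code>), the boundary conditions and the decision to return a list of overlapping states are all assumptions, and cancellation is left out.</p>

```javascript
// Hypothetical sketch: field names and boundary conditions are assumptions.
const DAY_MS = 24 * 60 * 60 * 1000;

// Derive the states of a job advert at a point in time from its flags and dates.
// Several states can hold at once (e.g. 'live' and 'new'), so a list is returned.
function deriveAdvertStates(advert, now = new Date()) {
  const t = now.getTime();
  const time = (value) => new Date(value).getTime();

  if (advert.deletedAt) return ['deleted'];
  if (!advert.publishedAt) return ['draft'];

  const start = time(advert.startDate);
  const end = time(advert.endDate);
  if (t < start) return ['published'];

  const states = [];
  if (t >= end) {
    states.push('ended');
  } else {
    states.push('live');
    if (t - start < 2 * DAY_MS) states.push('new'); // first two days
    if (end - t <= 2 * DAY_MS) states.push('ending soon'); // final two days
  }
  if (t >= time(advert.closeDate)) states.push('closed'); // no further applications
  return states;
}

const exampleAdvert = {
  publishedAt: '2023-12-20T09:00:00Z',
  deletedAt: null,
  startDate: '2024-01-01T00:00:00Z',
  endDate: '2024-01-31T00:00:00Z',
  closeDate: '2024-02-07T00:00:00Z',
};

console.log(deriveAdvertStates(exampleAdvert, new Date('2024-01-01T12:00:00Z')));
// [ 'live', 'new' ]
```

<p>Even this simplified version depends on the current time and on several interacting fields, which is why reproducing it faithfully in a second system proved so error-prone.</p>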

<h4 id="a-better-approach">A Better Approach</h4>

<p>A better solution would have been to explicitly model the job advert’s state and export that refined model to the data lake. Real-time events could have fed the data lake while simultaneously unlocking the ability to coordinate and propagate changes across our microservice architecture. Alternatively, data changes (deltas) could have been exported from the database at regular intervals and uploaded to a cloud-based storage service in a machine-readable format with an explicit schema, such as <a href="https://avro.apache.org/">Apache Avro</a>. Both these approaches would also have allowed us to write automated tests to catch breaking schema changes. Although this demands more upfront thought, patience, and effort, it avoids the pitfalls of tight coupling and fragile integrations.</p>
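
<p>As a rough illustration of what such an automated test could check, the sketch below compares the previously published shape of an exported model with a proposed one. The schema representation (a plain map of field name to type name) and the function name are assumptions for illustration; a real setup would more likely lean on the compatibility rules of the chosen serialisation format.</p>

```javascript
// Hypothetical sketch: schemas are plain maps of field name to type name.
// Removing a field or changing its type is treated as breaking for consumers;
// adding a new field is not.
function findBreakingChanges(previous, current) {
  const problems = [];
  for (const [field, type] of Object.entries(previous)) {
    if (!Object.prototype.hasOwnProperty.call(current, field)) {
      problems.push(`removed field: ${field}`);
    } else if (current[field] !== type) {
      problems.push(`changed type of ${field}: ${type} -> ${current[field]}`);
    }
  }
  return problems;
}

const publishedSchema = { id: 'string', state: 'string', startDate: 'timestamp' };
const proposedSchema = { id: 'string', state: 'int', liveFrom: 'timestamp' };

console.log(findBreakingChanges(publishedSchema, proposedSchema));
// [ 'changed type of state: string -> int', 'removed field: startDate' ]
```

<p>Run in the operational system’s build pipeline, a check like this fails before a breaking change reaches the data lake, rather than after a report has already gone wrong.</p>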

<h4 id="why-fast-and-loose-fails-in-big-data">Why Fast and Loose Fails in Big Data</h4>

<p>Organisations often opt for the fast-and-loose approach because they believe it offers agility. Every data team I’ve worked with has requested that we “just give them everything”, because they are unsure which reports will be required in the future. This mindset is pervasive but flawed. Poorly modelled data leads to brittle systems, wasted engineering hours, and unreliable analytics.</p>

<p>In contrast, slow and steady wins the race when it comes to data modelling, not least because we are on the brink of a Machine Learning (ML) and Generative AI revolution, and these technologies remain impotent without high-quality, well-structured data to train on. The upfront investment in thoughtful modelling pays lasting dividends in long-term reliability and usability.</p>

<h4 id="final-thoughts">Final Thoughts</h4>

<p>The allure of big data has led many organisations to sleepwalk into an architectural nightmare. Directly coupling reporting to operational data stores may seem expedient but ultimately causes more harm than good. By prioritising deliberate schema design, testing, and simple integration, organisations can avoid these pitfalls and unlock the full potential of their data. The choice is simple: sleepwalk into chaos or step deliberately into clarity.</p>]]></content><author><name>Stephen Cresswell</name></author><category term="Big Data" /><category term="Snowflake" /><category term="AI" /><category term="Domain Modelling" /><category term="ETL" /><summary type="html"><![CDATA[Tightly coupling data lakes to operational databases creates a maintenance nightmare. Decoupling through an anti-corruption layer preserves agility without breaking downstream analytics.]]></summary></entry></feed>