“Good-enough-ness” in AI

Writing code has, for better or worse, taken up a substantial and variable share of my time pre-2023. Post-2023, the fatigue, cognitive load, and bouts of irritation suddenly began to evaporate. It isn't complete seamlessness by any means, but as you exit a work session with these remarkable new tools, you have to consider, almost philosophically, what has just happened and why. And, especially for a certain class of work and those curious about it: why programming in particular?
The questions above illustrate the gulf developing between the various users of LLM tools. For some, they're sparsely used toys; for programmers, it's like discovering a mental combustion engine.
Knowledge and Verification Bubbles
It certainly helps to examine different knowledge spheres in terms of scope and subjectivity. Here, "programming" as a sphere has several distinct advantages:
Brevity: Programming languages have the distinct advantage of very few terms (keywords), each fixed in its function.
Reasonably Verifiable: Executing a program yields a healthy population of "correct" solutions. Compare this to more subjective (and equally or more important to humans) knowledge, like literature.
Modularity in problem and solution: Problems and solutions decompose into small, reusable pieces, which has indirectly produced Q&A destinations (like Stack Overflow) that provide excellent training signal.
Logical Structure: Program structure is fairly logical, and dependencies (even in complex situations) can be reasonably discerned.
Compressing the feedback loop
Here we arrive at what I think, from personal experience, is the true philosophical moment. Recall the programming experience, let's say in the 2010s, for small tasks within a larger one ("build an app, a dashboard, components, an API, etc."):
Decide on paradigm: Programming Language opinions, frameworks, documentation maturity, etc
Execute feedback loop:
Write code
Run: Verify correctness
Debug: Read stack-traces, logs, etc
Search: Documentation, Google, etc
Repeat
Refactor: clean up with architectural best practices
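The loop above can be sketched as simple control flow. A minimal sketch: the step functions are hypothetical stand-ins for writing, verifying, and debugging, with verification modeled as passing only after enough iterations.

```python
# Hypothetical stand-ins for the steps of the 2010s feedback loop;
# not a real API, just the shape of the loop.

def write_code(attempt):
    # Each pass produces a new attempt; modeled as a counter here.
    return attempt + 1

def run_and_verify(attempt, needed=3):
    # Verification only passes after enough iterations.
    return attempt >= needed

def debug_and_search(attempt):
    # Stand-in for reading stack traces, logs, docs, Google, etc.
    return f"investigated failure on attempt {attempt}"

def feedback_loop():
    attempt = 0
    log = []
    while True:
        attempt = write_code(attempt)        # Write code
        if run_and_verify(attempt):          # Run: verify correctness
            break
        log.append(debug_and_search(attempt))  # Debug + Search, then repeat
    log.append("refactor: clean up with best practices")
    return attempt, log

attempts, log = feedback_loop()
```

The point of the sketch is the cost model: every trip around the `while` loop pays for debugging and searching before verification finally passes.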
This, of course, is a slim example of the general feedback loop. The steps can vary, but the overall loop is fairly consistent. Here is where the boundlessness, fuzziness, and variability weigh so heavily: at each step, the cost of finding, integrating, and verifying information can be high. Furthermore, the cost of a wrong answer (or "path") can be extremely high, as it cascades into prior and subsequent efforts. To help visualize:

You can imagine the vertical axis as possible starting paths, and the horizontal axis as distance in time and effort to the next task. The difficulty mainly lies in choosing and evaluating the various paths, any of which could ultimately prove more time-consuming or erroneous. What's changed:

Suddenly, because of LLMs, there is more of a "straight-line" effect between choices and tasks in programming. It cannot be overstated how impactful this is: it greatly compresses the exhausting feedback loop. Although this may already seem apparent, even taken for granted, we should appreciate its significance for software engineering.
Pathfinding: From Learning to Search
We can think of this common feedback loop as a pathfinding problem. What's interesting is that the "ideal" paths for all sorts of practical programming functions, thanks to human and synthetic data, are being discovered quickly. Higher-level tasks such as "building an app, but with these variations" are starting to transition from a time-consuming learning problem to a cached search problem. The implications of this are enormous, but why did it happen?
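The learning-to-cached-search shift can be illustrated with memoization. A minimal sketch, where `solve_from_scratch` is a hypothetical stand-in for the expensive learning process and the cache plays the role of previously discovered paths:

```python
from functools import lru_cache

calls = 0  # how many times the expensive "learning" path actually runs

@lru_cache(maxsize=None)
def solve(task):
    # Hypothetical stand-in for solving a task from scratch.
    # Once a task variant is solved, the path is cached and
    # subsequent requests become cheap lookups (search, not learning).
    global calls
    calls += 1
    return f"solution for {task}"

solve("build an app")           # expensive the first time
solve("build an app")           # cache hit: a lookup, not a re-learn
solve("build an app, variant")  # a new variant still pays the cost once
```

The analogy is loose, of course: LLM training amortizes the cost across everyone's past solutions rather than one developer's cache, which is what makes the transition feel so abrupt.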
“Good-enough-ness”
People unfamiliar with or uninterested in software engineering sometimes consider this type of work esoteric and/or arduous. Software engineers, too, have a habit of touting the willpower and intellect needed to do what they do. But in reality, the work isn't that complicated. Furthermore, things are getting quite "good enough."
Consider examples like SaaS applications with user management, dashboards, notifications, etc. At a lower level, let's also consider their interface components and CRUD APIs. This might seem simplistic, but it represents the vast majority of generic SaaS applications that have verticalized across industries. Even the smallest models can attack this problem set in components, and newer architectures attack it from high-level specification down to components. Sure, there are variations and opinions that humans need to inject, but the integration time has become fairly constant. The delta between a large team building a sophisticated app and a small one armed with new tools is shrinking significantly. SaaS, unfortunately, also relies on the belief that customers are too dumb and lazy to do it themselves. This is changing fast.
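To make "this problem set" concrete, here is a minimal sketch of the generic CRUD surface that makes up much of SaaS plumbing. The class and an in-memory dict are hypothetical stand-ins for a real persistence layer:

```python
# Minimal sketch of a generic CRUD store; an in-memory dict stands in
# for the database behind a typical SaaS resource API.

class CrudStore:
    def __init__(self):
        self._items = {}
        self._next_id = 1

    def create(self, data):
        # Assign an id and store a copy of the record.
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = dict(data)
        return item_id

    def read(self, item_id):
        return self._items.get(item_id)

    def update(self, item_id, data):
        # Merge new fields into the existing record, if any.
        if item_id in self._items:
            self._items[item_id].update(data)
            return True
        return False

    def delete(self, item_id):
        return self._items.pop(item_id, None) is not None

store = CrudStore()
uid = store.create({"name": "Ada", "role": "admin"})
store.update(uid, {"role": "member"})
```

Nearly every "users, dashboards, notifications" feature reduces to some variation of these four verbs plus an interface on top, which is exactly why the pattern is so thoroughly represented in training data.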
This is a massive industry, and if you add lower-level cloud architecture tooling, it's even more massive. In the end, only access to the physical substrate (compute, storage) really matters. This presents a difficulty for incumbents: customers now look for extreme differentiation, or simply proclaim "good enough."





