Analysis: AI, Code Generation and the Software Engineering Profession
Amazon Cloud Chief Matt Garman recently made some comments on the future of software engineering at AWS, and predictably they have caused a stir and click-baiting, polarized coverage. Time to take stock.
In a leaked recording, Amazon cloud chief tells employees that most developers could stop coding soon as AI takes over
Matt Garman sees a shift in software development as AI automates coding, telling staff to enhance product-management skills to stay competitive.
Part 1: The Maturing Landscape of Coding Assistance Tools
Productivity Gains Are Real
It’s a fairly conservative prediction that software engineers will spend less time writing code in the future. A quick look at second-generation coding tools like v0.dev or postgres.new is enough to see the writing on the wall. More specialized capabilities are making their way into developer productivity tooling and injecting dramatic productivity increases. Tasks that previously took days shrink to hours, and hours to minutes, with the right tools and an understanding of how to apply them.
The automation of web component construction for typical shadcn/ui, Tailwind and React user interfaces has reached a point where models and tools like Claude 3.5 Sonnet, v0.dev or frontend.ai produce usable results in mere seconds. There’s always the complex edge case AI struggles with, of course, but those are not what the average developer spends most of their time on - for now.
Postgres Sandbox - An AI powered Database Builder
In-browser Postgres sandbox with AI assistance
Frontend.ai
Use our Frontend AI tool to generate components from images and text. Sign up for early access to our open-source JavaScript plugin builder.
Amazon CEO Andy Jassy himself provided some data and context on the productivity gains in question in a recent LinkedIn post that led to the comments:
“The average time to upgrade an application to Java 17 plummeted from what’s typically 50 developer-days to just a few hours,” he wrote. “We estimate this has saved us the equivalent of 4,500 developer-years of work (yes, that number is crazy but, real).”
Andy Jassy on LinkedIn: One of the most tedious (but critical tasks) for software development… | 159 comments
One of the most tedious (but critical tasks) for software development teams is updating foundational software. It’s not new feature work, and it doesn’t feel… | 159 comments on LinkedIn
The job of the human developers in this scenario shifts from searching and updating the codebase to reviewing the changes proposed by the AI, accepting or rejecting them, and handling the occasional edge case or failure point surfaced by the test instrumentation.
What companies like Amazon - with large, monolithic codebases, a high degree of automated test coverage and access to top-end AI capabilities - can do today foreshadows the release of more efficient, accessible tooling into the market in the coming months.
Significant Room for Productivity Growth, Even If Models Stop Improving
While model quality will undoubtedly continue to improve, there is legitimate doubt about the industry’s ability to maintain the capability-scaling trajectory of the last two years. This debate, however, obscures the pent-up potential for further productivity gains from better, deeper integration of AI scaffolding into existing developer tooling:
Real-world interaction with the latest generation of developer tooling, while impressive, shows massive friction due to clumsy integration, mismatched interface paradigms (like an over-reliance on chat) and painful workflows caused by unaddressed technological limitations such as limited output token windows.
In our estimation, these challenges are not a function of AI or its limitations but merely a reflection of strategically under-resourced product investments by the companies producing the technology. We are confident that dramatic, unrealized gains in developer productivity are gated purely by this lack of investment, which we believe is an intentional choice - prioritizing AI model scaling over product growth as a competitive strategy.
This should be seen as a temporary situation, however. As hype fades and revenue pressures bear down on the ecosystem, competition will shift towards product differentiation and drive rapid improvements in this field. There is some indication that this process has already started: Anthropic has been steadily launching developer-focused features in Claude, and because that is not happening fast enough, the open-source community has also created independent tooling to overcome some of the highest-friction productivity killers in AI workflows.
GitHub - jahwag/ClaudeSync: ClaudeSync is a Python tool that automates the synchronization of local files with Claude.ai Projects
ClaudeSync is a Python tool that automates the synchronization of local files with Claude.ai Projects - jahwag/ClaudeSync
The specific vectors that are most influential on software engineering related productivity gains are:
- Long context windows - The ability to fit entire codebases and/or supporting libraries and documentation into an AI model’s working memory (context window) allows complex refactoring and upgrade operations to happen in a single step. LLM context windows have grown from roughly 2,000 tokens to 2 million tokens in the span of less than two years and continue to grow.
- Increasingly reliable code generation models - Code models continue to improve in quality month over month, with no plateauing yet in sight.
- Increasingly current models - Software projects change continuously, and critical bugfixes or major architecture changes happen frequently enough that outdated models quickly do more harm than good. The more current a coding model’s knowledge cutoff, the more useful it is, and the turnaround time for shipping models with later knowledge cutoffs has been improving.
- Integration with developer tooling and job context - Closer connection to developer tooling such as source control, GitHub and IDEs reduces the friction of many everyday activities. Better tooling can also fill the gap left by the knowledge cutoff by injecting what has changed into the model along with the developer’s context.
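The last vector can be illustrated with a minimal sketch. All names and the prompt shape here are hypothetical, not any specific vendor’s API: the idea is simply that tooling can paper over a stale knowledge cutoff by prepending a repository’s recent change log to the task it hands the model.

```python
def build_context_prompt(task: str, recent_changes: list[str], max_items: int = 20) -> str:
    """Hypothetical sketch: compensate for a model's knowledge cutoff by
    injecting recent repository changes into the prompt alongside the task."""
    changelog = "\n".join(recent_changes[:max_items])
    return (
        "You are assisting with this repository.\n"
        "Recent changes your training data may not include:\n"
        f"{changelog}\n\n"
        f"Task: {task}"
    )

# Usage: the change log would typically come from source control
# (e.g. the output of a `git log --oneline` call).
prompt = build_context_prompt(
    "Upgrade the HTTP client to the new API",
    ["a1b2c3 Migrate http_client to v2 request builder",
     "d4e5f6 Drop deprecated retry_on_error flag"],
)
```

In a real integration the same mechanism would also carry the open file, the failing test, or the active ticket - whatever “developer context” the IDE can see.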
By keeping an eye on these factors and their progress, we can clearly see that we are still in the early stages of AI-fuelled productivity growth in software, that model performance alone is not a significantly constraining factor, and that dramatic changes (with downstream effects on job profiles and, yes, the labor market) lie ahead of us.
Code Generation: AI’s Current Best-Case Scenario
Programming languages have clear structure, small vocabularies, and their entire history, best practices, source code, and applications are public. GitHub, Stack Overflow, and the public web hold more high-quality, annotated, and structured training data in code, issues, kernel mailing lists, Q&A, and documentation than any other field.
Recent research tells us that the quality of training data is crucial for model performance, often outweighing scale. As such, it should come as no surprise that software engineering is one of the most exposed fields when it comes to AI disruption.
Coding Assistance is real, Chatbots are meh, Agents are fiction (at present)
There’s a lot of noise in the market, particularly about futuristic agents, “robot coworkers” and chatbots. The signal under the noise however tells a different story:
- The most successful tools in the market at this point are those tightly integrated with the developer experience/IDE or offering a superior user experience for specialized use cases (such as the aforementioned v0).
- Chat-based interfaces have their uses, but only with tight integration into a developer’s context - otherwise they quickly turn into clunky productivity sinks.
- Agents are science fiction and unlikely to make an impact in the market in the next 1-2 years.
Insert: Why Agents are a distraction.
Agents - autonomous systems able to complete multi-step tasks or even multi-task jobs - are dominating the hype cycle and attracting significant investment.
There are, however, good reasons to discount them in the context of at-scale developer productivity for the time being:
The reliability of Large Language Models, the heart and brain of agentic systems, remains abysmal relative to the needs of automation. While they perform well enough for single-step inference tasks, no generalized LLM-based solution is currently able to reliably generate the expected outputs 90% or more of the time, even under lab conditions.
Fundamental shortcomings (hallucinations), unsolved challenges (prompt injection, jailbreaks) and competing objectives (usefulness vs. harmlessness) in the core technology are, by all indications, not resolved by throwing more compute, training data or parameters at them.
To achieve reliable automation over multiple tasks, the technology needs to show a more-than-order-of-magnitude improvement in reliability (to 99.9% or better) to overcome compounding error.
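The compounding-error argument is simple arithmetic. Assuming independent steps with identical reliability (a simplification for illustration), the end-to-end success rate of a multi-step job is the per-step reliability raised to the number of steps:

```python
def end_to_end_success(step_reliability: float, steps: int) -> float:
    """Probability that a multi-step job completes with zero failures,
    assuming independent, identically reliable steps."""
    return step_reliability ** steps

# A 90%-reliable step looks fine in isolation, but chained over a
# ten-step job, failures dominate; only ~99.9% per-step reliability
# keeps the whole job dependable.
for p in (0.90, 0.99, 0.999):
    print(f"per-step reliability {p}: "
          f"10-step success rate {end_to_end_success(p, 10):.1%}")
```

At 90% per step, a ten-step job succeeds only about a third of the time; at 99.9% per step, it succeeds about 99% of the time - which is why the order-of-magnitude reliability gap matters so much for agents.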
Currently, we are not aware of any scientific breakthroughs or products able to overcome these challenges, or on a credible trajectory to do so in the near future. Given the clear productivity potential of non agent systems, we suggest focusing on those and keeping an eye on agents as a frontier technology.
Agentic RAG AI — more marketing hype than tech advance
CIOs are so desperate to stop generative AI hallucinations they’ll believe anything. Unfortunately, Agentic RAG isn’t new and its abilities are exaggerated.
Focus on productivity gain, not replacement scenarios.
Too much attention is spent on narratives like “software engineers (not) going extinct” - the profession is at no risk of extinction.
At the same time, the rules of the market demand that any injection of productivity convert to corporate growth or cost savings, and from that perspective software engineering is definitely in trouble:
- Price flexibility for many software products is minimal, as they are already commodity-priced and the market is saturated.
- Increasing sales is equally challenging, as many products have matured and are entrenched in their fields.
- The equal availability of AI tooling to all market participants means limited ability to leverage it for a competitive advantage.
The significant productivity growth potential of AI, not yet realized in most companies, therefore threatens to trigger serious, industry-wide reductions in staff levels until companies find new growth opportunities in their industries.
The End of the “Engineers as Pokémon” Era
Big Tech, as always, is ahead of the market. People, especially in technical roles, are the second-largest cost center for Silicon Valley companies and more flexible than infrastructure when it comes to driving short-term savings.
It’s no surprise that aggressive investments are being made to reduce people costs after the last two years. The AI revolution in Big Tech has primarily taken the shape of infrastructure (GPU) investments replacing talent and product capabilities as the main frontline of competition.
Where once rapid response product teams were deployed to quickly match other tech giants’ moves in the market, leading to the famous hoarding of top talent like rare Pokémon cards, building the largest GPU capabilities (and denying the same to the competition) has become the new law of the land.
Silicon Valley also leads the charge in deploying AI tools internally and, concurrently, in updating performance and productivity expectations for developers to take them into account.
Meta rolling out AI chatbot trained on internal data to employees
When considering how to power the chatbot, Meta had discussions with Microsoft and OpenAI, but it decided to employ a separate, in-house model
While few organisations have the ability to follow quickly in the footsteps of Big Tech, the trend is nevertheless clear: Machines will add significant productivity to Software Engineers and, in the short to mid term, it is unclear how that productivity can be absorbed.
The broader market has started to take notice
We would like to draw attention to the transcript of BP’s latest earnings call. The company, by no means a technology company, not operating at the cutting edge of technology and not particularly beset by cost cutting pressures, offered the following:
We’ve done an awful lot to digitize many parts of our business and we’re now applying Gen AI to it. The places that we’re seeing tremendous results on are coding. We need 70% less coders from third parties to code as the AI handles most of the coding, the human only needs to look at the final 30% to validate it, that’s a big savings for the company moving forward.
BP p.l.c. (NYSE:BP) Q1 2024 Earnings Call Transcript
Craig Marshall: Well, thanks, everyone for joining BP’s First Quarter 2024 Results Call today. - All Parts
When the leadership of a large oil company offers this kind of unprompted comment in an earnings call, it is time to pay attention: within a quarter or two, shareholders will be asking the same question directly of the CEO of every other company with cost-savings opportunities.
These companies are likely not even using the latest developer tooling. The most commonly deployed solution in the field, GitHub Copilot, is approaching its second birthday, with only limited advances visible in the primarily deployed core product.
Part 2: Impact and Change Management
Impact of AI tools on developers is uneven
Our experience shows that AI’s productivity gains are unevenly distributed. High-skill, senior engineers see a much stronger boost from AI tools than their less experienced peers. This stands in contrast to lower skill ceiling jobs like call center customer service, where AI can elevate junior employees to intermediate-senior performance, driving redundancies at the higher end of the pay scale.
And this is where AI hits a snag: people aren’t born senior talent. The tasks currently used to grow the next generation of talent are being absorbed by the AI-plus-senior-engineer productivity gain much faster than senior roles are opening up.
The Current Challenge: Keeping employees (engaged)
Exhortations about the impending demise of software engineers are as misguided as saying “gunpowder made soldiers obsolete”. Yes, the productivity increase will have job impact in the short term, but much of the world runs on (legacy) software, much of the world will continue to run on software, and the reliability of AI is not improving on a trajectory that would wipe out the profession anytime soon.
However, the constant media drumbeat of “Your job is doomed”, Silicon Valley’s aggressive cost cutting and the overall market situation in technology are having a measurable negative effect on employee engagement and optimism.
How will AI affect jobs and employment? Survey suggests massively
It’s been almost a year since AI’s bloom last spring, recalling another flowering: The “tulip mania” of the Dutch Golden Age, one of the most infamous examples of a financial bubble in economic history. But will ChatGPT blossom, with ramifications for any worker’s job, or will it wither as the petals fall off the proverbial plant?
74% of IT pros see AI making their skills obsolete
The IT industry is headed toward a sea change on skillsets as AI adoption becomes more commonplace, according to a Pluralsight survey of executives and IT workers.
It also does not help that tech companies are misusing AI as cover for aggressive job cuts. As we’ve discussed in the context of such announcements from Cisco and Intuit, their job cuts and economic situations have everything to do with mismanagement, a changing competitive environment (such as the IRS launching free tax filing into the US market, making Intuit’s offering obsolete) and lack of growth potential.
These “we’re cutting jobs to invest in AI” narratives are bundled with layoffs and sprinkled into investors’ eyes to keep them from dropping the stock like a hot potato for better opportunities.
Cisco Lays Off 5,500 Workers to Invest More in AI, Despite Making $10.3 Billion in Profit
Cisco posted $10.3 billion in profits last year but is still laying off 5,500 workers as part of an effort to invest more in AI.
Intuit will lay off 1,800 workers and hire new ones to advance its AI ambitions
The company said the layoffs are about reallocating resources toward its “most critical areas”
In reality, the 2024 talent market for AI-enabled engineering talent is so constrained, and the AI credential situation so challenging, that at-scale hiring of “1000s of AI engineers” is completely unrealistic, exposing the narrative as an investor fairy tale.
Reality check
We find that most companies are unable to successfully hire AI talent due to unsuitable hiring pipelines, a lack of recruiter and hiring-manager understanding of what makes effective AI talent (vs. traditional ML), a highly constrained talent supply, competition from better-funded companies, and an inability to identify, interview and vet legitimate talent among the thousands of LinkedIn University AI graduates swamping the market every week.
Existing skills depreciate rapidly in this environment. For example, much of what was taught as “prompt engineering” over the last two years turned out to be highly specific to a single model (“ChatGPT”), not easily generalizable and, due to progress on the technology, obsolete.
Relief is not in sight: generative AI remains a rapidly changing frontier technology that requires constant learning, broad context and the ability to filter information in a heavily noise-polluted ecosystem that very few can successfully navigate.
Our engagements with institutions of learning also indicate that we are years away from scalable and durable talent pipelines into the industry. The pace of change, the inability to attract top-tier talent from the frontier to craft curricula, the pervasive presence of snake-oil sellers and internal transformation deficits are currently preventing institutions of learning from meeting the market’s needs for talent and certification.
Investing in existing talent continues to be the best option.
Given these realities, rather than pursuing fantasies of hiring unicorn AI talent from the market, companies need to shift strategy to meet the problem.
The last few years have seen significant erosion in corporate L&D efforts, especially in technology. Following the lead of Twitter, Meta and Google, and under pressure from the American culture war targeting L&D flagship projects like “DEI”, investment in employee learning, development and upskilling has been radically dialled down, with many organisations cutting up to 90% of their L&D staff and capacity.
This may turn out to be a costly mistake on the level of outsourcing much of internal IT in the years leading up to Y2K. Generative AI, as a General Purpose Technology, demands exactly the kind of broad, continuous upskilling capacity these cuts have dismantled.
Companies would be well-advised to enable their own employees to engage in upskilling. The illusion that AI talent will be available in the market to replace existing employees is an expensive pipe dream, ignoring the harsh reality of a completely overwhelmed education system unable to keep up with rapidly changing technology.
Technical Considerations
Beyond investing in talent, the quality of an organisation’s codebase is poised to make a serious difference when it comes to leveraging AI on the job:
- A well-commented codebase helps guide the code generation process.
- Access to ancillary information in machine-readable form, such as design documents and test cases, can significantly improve responses as well.
- Test coverage matters just as much. Once AI starts making changes in hundreds of places at once, such as during an upgrade or refactoring operation, full test coverage of the codebase becomes essential support for human reviewers.
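One way test coverage supports reviewers of large AI-driven changes is through characterization tests that pin down current behavior. A minimal sketch (the function and its behavior are invented for illustration):

```python
def normalize_username(raw: str) -> str:
    """Current production behavior: trim, lowercase, collapse inner whitespace."""
    return " ".join(raw.strip().lower().split())


def test_normalize_username_behavior_is_pinned():
    # Any AI rewrite of normalize_username, however sweeping, must keep
    # these outputs identical - a failing test flags the change for a human.
    assert normalize_username("  Ada  Lovelace ") == "ada lovelace"
    assert normalize_username("ADA") == "ada"


test_normalize_username_behavior_is_pinned()
```

With tests like this in place, a reviewer facing hundreds of AI-proposed edits can focus attention on the handful of spots where the test suite reports a behavioral difference.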
Appendix - Fundamental Changes on the Horizon
When the steam and internal combustion engines enabled machines to enter the labor economy and replace human labor with mechanical power, few were able to envision the longer-term implications, including the rise of the knowledge economy.
The first phase of transformer-powered AI entering the knowledge workplace is playing out similarly - the substitution of human labor with compute and models trained to replicate that labor.
But past these immediate effects, we can now see much more fundamental changes on the horizon: The democratization of General Purpose Compute will likely trigger much more dramatic changes on the world of Software and Software Engineering than the current, short term, efficiency pushes.
We’ve explored some of these possibilities in this essay last month:
Analysis: Claude and the Dawn of the “Post App” era
Claude.ai, the AI unicorn currently coming closest to beating OpenAI’s GPT4 in general performance, recently released a new UX feature which significantly changes how the user interacts with AI. It is the improvements to their User Experience (UX), however, that show us where AI is heading in the long run.
Further Reading
Ecosystem Intelligence: To keep an eye on what’s happening on this topic, you can follow our Code Generation and Work and Labor Channels.
Part 2: Report: AI Coding Tools, October 2024 is now online.
About the Author
Georg Zoeller is a Co-founder of Centre for AI Leadership and a former Facebook/Meta Business Engineering Director. He has decades of experience working on the frontlines of technology and was instrumental in building and evolving Facebook’s Partner Engineering and Solution Architecture roles.