What does a "Fullstack AI Infrastructure Engineer" do?

I am a cloud-native developer at the International University of Applied Sciences, working mostly with Microsoft technology.
There is surely no shortage of arbitrary word combinations in job titles, and this one is no exception. However, as a Fullstack AI Infrastructure Engineer myself, I want to share my interpretation of what the job and its day-to-day work actually look like.
Let’s dissect the title and see how it fits together.
“Fullstack”
“Fullstack” is a term used to describe the range of topics someone is responsible for. A “stack” in IT is a combination of approaches, tools, and frameworks. Using the classic example of web development, the stack might be TypeScript + PostgreSQL + ASP.NET API. Traditionally, you might have someone for the database (or many databases), someone doing back-end coding, and someone else focusing on delivering the back-end data in a consumable form to the user via a web browser.
Fullstack developers fill all those roles at once. They care as much for the database and its schema as they do for the middleware and presentation layers. Many modern frameworks support this, with ASP.NET MVC or Blazor as examples.
For AI infrastructure engineering, fullstack has the same meaning but a different “stack”. Caring for infrastructure is a coding challenge only as far as infrastructure as code goes, but the “AI” in “AI Infrastructure” takes away the convenience of a single, uniform platform such as Azure, GCP, or AWS.
In the young and fast-paced world of AI, standardized solutions exist but come with the problem of inevitable delays due to client-vendor communication overhead. Your competitor could already be multiple features and innovations ahead while you wait for the vendor of your software to implement a feature you asked for weeks or months ago. Therefore, keeping as much control and development as possible in-house is not a nice-to-have; it's a necessity. This is where the stack of Fullstack AI Infrastructure Engineers goes beyond running services: creating services. The non-standardized technical nature of the AI landscape makes agile software development a must-have to keep the infrastructure connected and up-to-date.
To give an example: You have a high demand for a specific LLM, but no cloud provider gives you enough capacity to fill that demand. You need to reach outwards and connect multiple cloud providers, each of which comes with its own obstacles and standards.
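Connecting multiple providers like this usually means some form of failover routing. The sketch below is a minimal illustration of that idea; the provider names, the call signature, and the error model are all assumptions, not a real SDK.

```python
class ProviderUnavailable(Exception):
    """Raised by a backend when its capacity is exhausted."""


class LLMRouter:
    """Routes completion requests across multiple cloud providers,
    falling back to the next one when capacity runs out.
    Purely illustrative: real backends would wrap vendor SDKs."""

    def __init__(self, providers):
        # providers: ordered list of (name, callable) pairs;
        # each callable takes a prompt and returns a completion string
        self.providers = providers

    def complete(self, prompt):
        errors = {}
        for name, call in self.providers:
            try:
                return name, call(prompt)
            except ProviderUnavailable as exc:
                errors[name] = str(exc)  # out of capacity, try the next one
        raise RuntimeError(f"all providers failed: {errors}")


# Stand-in backends: the first is "out of capacity", the second succeeds.
def provider_a(prompt):
    raise ProviderUnavailable("429: rate limit exceeded")

def provider_b(prompt):
    return f"echo: {prompt}"

router = LLMRouter([("provider_a", provider_a), ("provider_b", provider_b)])
provider, answer = router.complete("hello")  # transparently falls back
```

The point is that consumers call one interface and never see which provider's quota happened to be free at that moment.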
On the consumer side, the employees who benefit from the AI Infrastructure fall into three main interest groups, in my experience: developers, semi-technicals, and users. Developers use the infrastructure to ship features in existing or new products, and users leverage AI to improve their daily work. Semi-technicals are users who are not formally developers but can still use LLMs in their work beyond chat interfaces. They enhance workflows with AI capabilities, sometimes in complex and often impactful ways.
All three groups need their own unique ways of using the infrastructure. For users, it's chat front-ends, local or shared, with RAG capabilities and seamless access via web browsers or similar low-hurdle applications. Semi-technicals need documentation and consulting, while developers need bleeding-edge access to new features.
To summarize, the "fullstack" in "Fullstack AI Infrastructure Engineering" spans from running services to software development, where those services must address multiple groups: users, semi-technicals, and developers.
“AI”
While this might seem self-explanatory, there is more to “AI” than the blanket term suggests.
While the label “AI” is often attached, for marketing purposes, to software that simply makes decisions, multiple genuine technologies and ways to use them have emerged since ChatGPT, initially based on GPT-3.5, popularized the term.
The first that come to mind are LLMs - GPT, Gemini, Claude, etc. New LLMs are developed over time, some better than others, and the ranking can change with every update cycle. Where GPT models are the best choice at one point, a month later you might prefer Gemini due to a new feature or vastly improved capabilities. The AI infrastructure must be able to adapt quickly to shifting requirements and new opportunities.
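One common way to make model swaps painless is an alias layer: consumers reference a stable name while the infrastructure team changes the backing model behind it. A minimal sketch, with entirely made-up model names:

```python
# Alias table maintained by the infrastructure team; consumers never
# hard-code a vendor model ID. All names here are illustrative.
MODEL_ALIASES = {
    "default-chat": "vendor-x-large-v2",   # swapped per update cycle
    "cheap-batch": "vendor-y-small-v1",
}

def resolve(alias: str) -> str:
    # Unknown names pass through, so explicit model IDs still work
    # for developers who need bleeding-edge access.
    return MODEL_ALIASES.get(alias, alias)

print(resolve("default-chat"))  # -> vendor-x-large-v2
```

Swapping "the best model this month" then becomes a one-line config change instead of a migration across every consumer.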
Secondly, there are capabilities such as RAG, OCR, voice, images, videos, and tools. RAG is the ability to recall information from a corpus that far exceeds the context size of an LLM, while OCR extracts structure from unstructured documents, mostly images. OCR existed long before LLMs came around but has seen a huge improvement, with mistral-ocr currently leading the landscape.
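The core loop of RAG is easy to sketch: retrieve the most relevant documents for a query, then prepend them to the LLM prompt. The toy version below scores by word overlap; production systems use vector embeddings, but the control flow is the same. All documents and the query are made up.

```python
def retrieve(query, documents, k=2):
    """Toy retrieval step of RAG: rank documents by word overlap
    with the query and return the top-k."""
    q = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "GPU quota increases are requested via the capacity portal.",
    "The cafeteria menu changes weekly.",
    "Model deployments require a quota check before rollout.",
]
context = retrieve("how do I request a GPU quota increase", docs)
# The retrieved context is then prepended to the LLM prompt, letting the
# model answer from material far beyond its own context window.
```

This is why RAG matters for the "users" group: the chat front-end stays simple while the infrastructure quietly handles the retrieval behind it.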
While LLMs generate and understand text, the generation and understanding of voice, images, and videos have now been adopted by almost all major AI companies, and to a limited extent in the open-source world as well. The infrastructure needs to facilitate not only text but also voice, images, and videos, with availability at scale.
Lastly, tools are a big AI topic under the name of “agents.” Hosting agents and/or their tools must be part of an AI infrastructure. Initially standardized around OpenAPI and lately transitioning to MCP, the availability of tools is as crucial as the availability of the agent itself; without tools, the agent is useless. Dedicated infrastructure has the opportunity to shift the effort of hosting and securing tools away from developers and towards a centralized entity, giving developers more time for implementation.
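What "centrally hosted tools" can look like, reduced to its essence: the infrastructure team registers a tool with a schema, and agents call it by name with JSON arguments. This is a generic sketch, not the MCP or OpenAPI wire format; the tool name, schema, and data are invented for illustration.

```python
import json

# Hypothetical centralized tool registry: the infrastructure team hosts
# the tool, and developers' agents only see its schema and call it by name.
TOOLS = {}

def tool(name, description, parameters):
    """Decorator that registers a function with a machine-readable schema."""
    def register(fn):
        TOOLS[name] = {
            "fn": fn,
            "schema": {"name": name, "description": description,
                       "parameters": parameters},
        }
        return fn
    return register

@tool("get_headcount", "Number of employees at a campus",
      {"type": "object", "properties": {"campus": {"type": "string"}}})
def get_headcount(campus):
    return {"campus": campus, "headcount": 1200}  # dummy data

def dispatch(call_json):
    # call_json would typically be produced by an LLM tool call
    call = json.loads(call_json)
    return TOOLS[call["name"]]["fn"](**call["arguments"])

result = dispatch('{"name": "get_headcount", "arguments": {"campus": "Berlin"}}')
```

The schemas in `TOOLS` are what gets advertised to agents, while hosting, authentication, and updates stay with the central team.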
Agents are a topic for semi-technicals and users as well: Microsoft Copilot is trying hard to come back from the hallucinating initial versions that were based on GPT-4o. Agents-as-a-service can also be offered by an AI infrastructure team to enable users without technical knowledge to still leverage agents for their work.
“Infrastructure”
Infrastructure is the foundation of action and movement: road infrastructure enables the transportation of people and goods, and power-grid infrastructure lets machines and technical devices run far from the power source. Roads do not transport cars, and power lines do not generate electricity; they enable others to do so.
In IT systems, infrastructure can be the service of offering a network signal to devices, the rollout of hardware to employees, or the hosting of VMs for others to use. “AI Infrastructure” has a similar approach:
An interface for employees of varying technical backgrounds to AI capabilities. But this infrastructure needs to be “fluid”: an LLM or technology that worked yesterday might be unavailable tomorrow, or a better alternative might appear, and no one wants to miss the opportunity to use it as soon as possible. The hard part is reducing usability friction for users as much as possible while not only keeping everything running but constantly moving forward. This also implies staying up to date and solving problems before they arise.
Luckily, with modern cloud systems, agile software development approaches, and a motivated team, this is doable.
AI Infrastructure must be globally aware as well. Different requirements from interest groups might entail specific data security standards, such as GDPR in Europe. The interests in data security must be matched with the interest in availability across cloud providers, countries, and continents.
“Engineer”
A pool of requirements, current and future opportunities, data security, availability, and performance for different interest groups that constantly change cannot simply be deployed and administered; it must be engineered. Currently, custom software development is the only viable way to keep up with the pace of innovation without risking falling behind.
Active, in-house software development that tightly engineers the technology around the internal requirements and proactively acts on new technologies protects against costly dependencies on external resources, in both money and time.
This includes predicting future requirements based on domain knowledge: how are developers and users going to use new abilities of the AI infrastructure? Is a newly marketed feature actually good, or just hyped up? How many tokens per minute (TPM) and requests per minute (RPM) do you expect, from where, to where, with which constraints, and who is going to deliver that capacity? Which failovers are possible, and will some creativity be needed?
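The TPM/RPM question above is ultimately back-of-the-envelope arithmetic. The numbers below are entirely made up, as are the provider names; the point is the shape of the calculation.

```python
# Back-of-the-envelope capacity check; every number here is invented.
requests_per_minute = 600        # expected peak RPM across all consumers
avg_tokens_per_request = 1_500   # prompt + completion tokens

tokens_per_minute = requests_per_minute * avg_tokens_per_request  # 900,000 TPM

# Hypothetical per-provider TPM quotas you have actually been granted
quotas_tpm = {"provider_a": 450_000, "provider_b": 300_000}

shortfall = tokens_per_minute - sum(quotas_tpm.values())
# 900,000 needed vs. 750,000 available: 150,000 TPM must come from a
# failover path, an additional provider, or demand shaping.
```

Running this kind of check per region and per interest group is what turns "who delivers the capacity?" from a guess into a plan.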
Those and other related questions do not have blanket answers; they are highly specific to different organizations and projects. Yet, someone has to continuously ask them, manage to collect the answers, and engineer the infrastructure around them as well as possible.
A good way I found to improve the infrastructure for future requirements is to be “your own best client.” If there is an overlap between providing the infrastructure and using the infrastructure in the same role, the base of the infrastructure improves almost automatically. This, in turn, also opens the door for in-house consulting: if you have someone who already used the feature that you are about to use, they can help you jump roadblocks before you even see them.
Last but not least, scaling and cost control go hand in hand. If your projects end up on the wrong side of an Excel sheet and the benefit is not worth the cost, they will be closed. Therefore, adapting to user habits in terms of scale and functionality required is important to limit cost generation to when it is unavoidable.
Conclusion
Those are the daily tasks of a "Fullstack AI Infrastructure Engineer": providing a service to users and other engineers that removes the headache and management overhead of juggling capacity, availability, requirements, and opportunities while supplying the newest technologies in a consumable form.


