Most articles about machine learning show off pretty diagrams and perfect pipelines. My real workday looks nothing like that. It is more “half‑finished notebook, three cups of coffee, and a script called final_v7.py that definitely is not final.”
So here is the honest version of how I use ML day to day as an AI engineer, without pretending everything is slick and over‑engineered.
Mornings: messy notebooks and small experiments
Most mornings start with a notebook. Not a polished one. The kind with cells named “test” and “new idea 3.”
I open a notebook when:
I am poking at a new dataset and just want to see what is in there.
I am trying a new model or API and want to feel how it behaves.
I have a vague idea like “maybe we chunk this differently” and I do not want to overthink it.
A typical flow looks like this:
Load a small slice of data, not the whole thing.
Write the simplest prompt or model call that could possibly work.
Print 10 results and stare at them.
Add quick notes in Markdown: “too verbose,” “kept missing dates,” “surprisingly good at this part.”
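In code, that loop is tiny. A minimal sketch of what those first cells tend to look like, with a made-up entries.csv, a made-up text column, and a stubbed call_model standing in for whatever API I am actually poking at that day:

```python
import pandas as pd

# Load a small slice of the data, never the whole thing at this stage.
df = pd.read_csv("entries.csv").head(200)  # hypothetical file and column names

def call_model(prompt: str) -> str:
    # Stand-in for whatever model or API I am trying today; swap in the real call.
    return "stub response"

# The simplest prompt that could possibly work.
TEMPLATE = "Summarize this entry in one sentence:\n\n{text}"

# Print 10 results and stare at them.
for text in df["text"].head(10):
    print("---")
    print(call_model(TEMPLATE.format(text=text)))
```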
There is no dashboard, no automation, just me and a loop of “tweak, run, squint.” It feels more like sketching than engineering, which is exactly what I want at this stage. I am trying to develop a gut sense of how the model reacts, not build a product yet.
The secret is that most ideas die here. I am fine with that. Cheap, fast failures in a notebook are much better than heroic rewrites later.
Afternoons: turning the good bits into tiny tools
When something in a notebook starts to feel solid, it earns the right to leave.
My personal rule is:
If I copy the same code into a second notebook, it probably deserves a home of its own.
So I start collecting tiny tools:
A function for “clean this text and strip out the stuff we do not care about.”
A wrapper that formats prompts in a consistent way.
A little evaluator that takes input, output, and gold answer, then spits out a simple score.
None of these are fancy. Think 30 to 50 lines, tops. But once they exist, my future notebooks get lighter. I can focus on new questions instead of rewriting the same boilerplate.
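For a sense of scale, here is roughly what those bricks look like, boiled down. The names and the crude word-overlap score are illustrative choices, not the exact code I keep around:

```python
import re

def clean_text(text: str) -> str:
    """Collapse whitespace and drop the junk we never care about."""
    text = re.sub(r"\s+", " ", text)
    return text.strip()

def build_prompt(template: str, **fields: str) -> str:
    """Format prompts one consistent way so experiments stay comparable."""
    return template.format(**{k: clean_text(v) for k, v in fields.items()})

def simple_score(output: str, gold: str) -> float:
    """Crude word-overlap score between a model output and the gold answer."""
    out_words = set(output.lower().split())
    gold_words = set(gold.lower().split())
    return len(out_words & gold_words) / max(len(gold_words), 1)
```

Each one is small enough to read in a minute and boring enough to trust.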
It is a bit like building your own Lego set over time. At first you have random pieces everywhere. After a while you have a few stable bricks you trust, and you can stack them quickly.
The constant question: “Do I really need something complicated?”
People hear “AI engineer” and imagine graphs with dozens of nodes. The truth is, I spend a surprising amount of time talking myself out of complex architectures.
When a new problem shows up, my internal dialogue is pretty simple:
Can this be solved with one script that runs end to end?
Will a human look at the results anyway?
Is there an actual decision the system needs to make, or is it just a straight line of steps?
If the answers are “yes, yes, no,” I stay simple.
For example, I once needed to take a bunch of free text entries and map them to a small set of categories. The temptation was to spin up a fancy service. Instead I wrote a one‑off script that:
Read a CSV.
Called a model with a classification prompt.
Logged the result, the original text, and the confidence.
Spit out a new CSV for someone to review.
That was it. We ran it once, fixed a few edge cases by hand, and moved on. No agents, no queues, no microservices. Just a boring script that did its job, then retired.
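A hedged sketch of that one-off script, with a stubbed classify function standing in for the real model call and made-up file and column names:

```python
import csv

def classify(text: str) -> tuple[str, float]:
    # Stand-in for the real model call with a classification prompt;
    # returns (category, confidence).
    return "other", 0.0

with open("entries.csv", newline="") as src, open("reviewed.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=["text", "category", "confidence"])
    writer.writeheader()
    for row in reader:
        category, confidence = classify(row["text"])
        print(f"{category} ({confidence:.2f}): {row['text'][:60]}")  # simple log line
        writer.writerow({"text": row["text"], "category": category, "confidence": confidence})
```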
There is a weird kind of joy in finishing a problem and knowing there is nothing hiding behind it. Just a single file you can open, read, and understand in five minutes.
The time I over‑engineered everything
Of course, I do not always resist the urge to build something more elaborate. One of my favorite “oops” moments started with a content pipeline I wanted to make “smart.”
The goal sounded reasonable:
Take long articles.
Split them into chunks.
Summarize each chunk.
Tag them with topics.
Store everything so it could be searched later.
Instead of a simple pipeline, I got ambitious. I designed:
One agent to decide how to chunk.
One agent to summarize.
One agent to tag.
A fourth agent to “coordinate” the other three.
On paper, it looked clever. In reality, it was a mess:
Debugging felt like chasing ghosts. When something went wrong, which agent was to blame?
Changing a simple parameter, like chunk size, meant touching several pieces.
I spent more time babysitting the system than improving the actual summaries.
After a week of this, I did the only sane thing: I threw it out.
The replacement was almost embarrassingly simple:
A function to chunk text.
A loop that ran summarization and tagging in order.
A final write to a database.
One script. No agents. No coordinators. The results got better, not worse, because I could finally see what was going on again.
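Sketched out, the whole replacement has roughly this shape; the summarize, tag, and save helpers below are stubs for the real calls and the database write:

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    # Naive fixed-size chunking; good enough until proven otherwise.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize(chunk: str) -> str:
    return "stub summary"   # stand-in for the real summarization call

def tag(chunk: str) -> list[str]:
    return ["stub-topic"]   # stand-in for the real tagging call

def save(record: dict) -> None:
    print(record)           # stand-in for the write to the database

def run_pipeline(article: str) -> None:
    # One straight line: chunk, summarize, tag, store. No coordinator in sight.
    for chunk in chunk_text(article):
        save({"chunk": chunk, "summary": summarize(chunk), "topics": tag(chunk)})
```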
That experience burned a lesson into my brain:
If the straight-line version is not good, the multi‑agent version probably will not save you. It will just hide the problem under extra layers.
Now, whenever I get the urge to spin up a complex system, I ask myself, “Have I earned this yet?” If the answer is no, I go back to a notebook.
When an agent actually makes sense
To be fair, there are times when an agentic setup actually helps. My own portfolio assistant is one of them.
I wanted visitors to my site to be able to:
Ask “show me your project work with agents” and jump straight there.
Ask “what is your view on sleep at night investing” and get an answer drawn from my newsletter, not thin air.
Get links back to specific posts or sections without hunting through menus.
I tried the simple approach first: a regular chatbot hooked to my content. It could answer questions, but it had no sense of actions:
It could not scroll you to a section.
It could not choose when to answer directly versus when to point you somewhere.
Everything was just text in, text out.
This is where an agent layer actually added something useful. I gave it a very small toolbox:
One tool to jump to a section of the site.
One tool to search and quote from newsletter articles.
One tool to bundle up links to relevant pages.
Then I wrapped it in a simple set of rules:
If the question is “where” or “show me,” favor navigation.
If the question is “what do you think about,” search the content and answer, then give links.
If you cannot answer from my stuff, say so instead of inventing ideas I never wrote.
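Stripped to the bones, it is a tiny toolbox plus those rules. The sketch below is a deliberate caricature: the tool bodies are stubs, and the keyword routing stands in for the model actually choosing which tool to use.

```python
def go_to_section(section: str) -> str:
    return f"NAVIGATE:{section}"   # stand-in: tells the front end where to scroll

def search_newsletter(query: str) -> str:
    return ""                      # stand-in: quoted passages, or "" if nothing matches

def collect_links(query: str) -> list[str]:
    return []                      # stand-in: links to relevant pages

def respond(question: str) -> str:
    q = question.lower()
    # Rule 1: "where" / "show me" questions favor navigation.
    if q.startswith(("where", "show me")):
        return go_to_section(q)
    # Rule 2: opinion questions search the content, answer, then give links.
    passages = search_newsletter(question)
    if passages:
        return passages + "\n" + "\n".join(collect_links(question))
    # Rule 3: if it is not in my content, say so instead of inventing it.
    return "I have not written about that, so I would rather not guess."
```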
The result feels different:
Conversations can turn into actions.
Answers carry references back to my actual work.
I can add new content and trust the assistant to use it, without hand‑coding every path.
The key difference from my earlier over‑complicated project is that here, the agent is making real choices:
“Should I answer?” versus “Should I move you somewhere?”
“Do I know this from the content?” versus “I should admit I do not.”
Without those choices, an agent is just a chatbot in a fancy coat.
The quiet art of saying “nope, not yet”
If there is one skill I have slowly learned as an AI engineer, it is the art of saying no to complexity.
When a new request comes in, I ask myself five questions:
What is the smallest version of this that would be genuinely useful?
Can I prototype it in a single notebook and a tiny evaluation script?
Who will actually maintain this three months from now?
Will an agent make a clear decision that a simple script cannot?
Will I be able to explain this architecture to a tired teammate in five minutes?
If I cannot answer those questions cleanly, I pull back. I might still play with ideas in notebooks, but I do not “upgrade” them to the grand system.
Most days, my work looks like:
Cleaning up input data.
Writing quick prompts and checking a handful of outputs.
Measuring whether a change actually improved anything.
Refactoring something small so the next experiment is easier.
Deleting code that felt clever but did not earn its keep.
It is not glamorous. It is also strangely fun. There is a nice rhythm in catching yourself about to overcomplicate something, and then deciding, “No, today we stay simple.”
If you ever feel pressure to make your ML work look more shiny than it is, remember: the best systems I have shipped rarely started with a diagram. They started with a messy notebook, a blunt question (“Do we really need more than this?”), and the willingness to keep things boring until boring was no longer enough.

