Seven Weeks Using AI in Backend Development: What Actually Helped

Developers have always forged their own way of doing things. AI won’t change that.

A few weeks ago I shared my first thoughts on using AI in development in a different article.

Seven weeks ago I started seriously using AI in backend development. Not to replace engineering work, but to reduce repetition in the boring parts: migrations, DTOs, tests, analysis, and temporary tooling.

This isn’t about spinning up 30 agents working 24/7. It’s about small, practical things that genuinely helped me in day-to-day development.

One thing I realized very quickly is:

AI work isn’t defined. Try everything you can and find your own way to use it.

The workflows people share online might work great for them, but a lot of value comes from experimenting in your own context with your own problems.

For me, it started with a legacy project I was helping rewrite.

The Main Use-Case: Rewriting Old Code to a New Application Using AI

We have code written by the ancient ones. The kind of code you avoid touching because something is definitely going to break. But at the same time, you know it eventually has to go.

The application had many components, and we wanted to gradually replace them by rewriting parts into a new Spring Boot application.

I started working on it without AI and without any knowledge of the business domain, the application itself, or even the language it was written in (Perl).

The approach was pretty standard:

analyze
document
learn
slowly start making changes

The goal was to rewrite around 90 REST APIs while:

figuring out which ones were actively used
restructuring them to better fit Java conventions
writing tests
comparing old and new API responses
learning the surrounding business domain along the way

It felt like the perfect opportunity to test what AI was actually capable of.

I wrote the prompt:

Please I neeed to create new Application from this old one. Thank you very much.

Just kidding, obviously.

I wouldn’t learn anything from that, and I definitely wouldn’t be able to validate the output.

Instead, I created the initial application structure using Spring Initializr and manually implemented the first four APIs.

Only then did I spin up the first AI agent.

The First Use-Case: Reducing Repetitive Work

AI is at its best when it simplifies repetitive work. Not necessarily complex engineering work — just the annoying things you repeat over and over again.

For me, the biggest improvement came from persistent context.

Create Your F*cking CLAUDE.md File

I spent the first two days without one.

At the time I wanted to understand the base tool first before adding more context and customization. That experiment didn’t last long.

Then I finally created a CLAUDE.md file.

The impact was immediate.

It removed:

repeated explanations
missing project context
inconsistent outputs
a lot of unnecessary back-and-forth

Created with a single /init command, it acts as persistent project context loaded at the start of every session.

And the best part is that you can continuously evolve it.

You can define:

coding conventions
testing expectations
naming rules
architectural constraints
anything else important for your workflow

This was the first moment where AI stopped feeling like a generic chatbot and started feeling more like a development tool.

Oh! Sweet DTO Generation

Once the context was stable, repetitive code generation became much more reliable.

DTO generation was one of the easiest wins.

Especially when dealing with native queries returning huge datasets with dozens of fields.

AI handled most of those cases surprisingly well.

I still encountered occasional:

type mismatches
naming conflicts
inconsistencies with existing project conventions

but those were usually easy to catch and fix manually.

And honestly, many of those mistakes are the exact same mistakes developers make themselves when writing repetitive boilerplate code.

The Second Use-Case: Testing With AI While Testing the AI

Setting up Java Tests Using AI

This was the first time I was actually a little fascinated by AI as a developer.

The goal was to set up a database testing environment:

using Testcontainers
with a dual-database setup

I knew what the result should look like. But this was a learning opportunity to follow along and observe the way AI is “thinking”.

It took about 20 minutes. There were configuration, initialization, and orchestration issues along the way. What I learned was that AI works a lot like your junior colleague:

it tries until it figures it out
it won’t clean up the mess it creates in the process
it asks for more information when it gets a little bit lost

And you should be there to help in those situations. You should always be prepared to give it more context or offer direction. Especially in cases where you know what you want.

Don’t expect magic to just happen.

Still, I have to admit that watching AI generate decent tests from minimal context feels a little bit magical.

Writing Tests Using AI

Have you ever thought about code coverage as a good metric for tests?

Unfortunately, AI treats it as one by default. It aims for maximal code coverage, as many developers also do. You might end up with far more tests than necessary, tests that add little real value and are a pain in the ass to maintain.

But you are not without options:

define the tests using CLAUDE.md
write test examples
validate tests

With a little bit of work, it will use existing tests as a guide for future tests. Every test you consider good makes the new ones better.

I’m currently at about 130 tests written entirely by AI. They are not perfect, but neither would manually written tests be. You always miss something.

Like in every development process, a little bit of refactoring is always necessary.

Tests are an important part of the development process - avoid just “generating them”. AI can test what it sees, but if there are edge cases, you have to get involved.

That’s why part of my testing process was calling the new APIs and comparing their responses with the old ones.
And once you start changing response structures, field types, or formatting, validating those differences manually becomes painful very quickly.

The Third Use-Case: Simple Helper Tools With AI

One of the most annoying parts of development is writing temporary tools.

You know the kind:

migration scripts
response comparators
data transformation utilities
one-off debugging tools

They solve a real problem, but you also know you’ll probably use them once and never touch them again.

That used to make them feel like a bad investment of time.

AI changed that completely.

I knew exactly what changed in API responses:

snake_case to camelCase
strings to numbers where numbers were expected
structure for non-standard APIs
manual paging vs Spring Boot paging
datetime formatting

And there were also a few edge cases with more complicated structure changes.

AI wrote a Python script that was able to compare data even with those changes.

Hours of saved time and energy. Especially because writing those scripts manually often takes longer than expected.

And those special cases? I just gave the AI information about how to call both APIs, and it generated dedicated comparison scripts on the fly.

Of course, I tried to “break” it to verify it works. The comparisons were accurate enough to be genuinely useful.
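To make the idea concrete, here is a minimal sketch of what such a comparison script might look like. It is not the script the AI generated. The function names and sample payloads are made up for illustration, and it only covers three of the changes listed above: snake_case to camelCase, numeric strings to numbers, and datetime formatting. Paging envelopes and custom structures would need their own unwrapping step before the comparison.

```python
import re
from datetime import datetime

NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def to_camel(key):
    """Convert a snake_case key to camelCase; other keys pass through."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)


def normalize_scalar(value):
    """Bring scalars to a common form so formatting changes do not count as diffs."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        if NUMERIC.fullmatch(value):
            # Caveat: this also turns IDs like "007" into 7.0.
            return float(value)
        try:
            # Accepts both "2024-01-05 10:30:00" and "2024-01-05T10:30:00".
            return datetime.fromisoformat(value)
        except ValueError:
            return value
    return value


def normalize(value):
    """Recursively normalize keys and scalars in a decoded JSON value."""
    if isinstance(value, dict):
        return {to_camel(k): normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v) for v in value]
    return normalize_scalar(value)


def diff(old, new, path="$"):
    """Return human-readable differences between two normalized values."""
    if isinstance(old, dict) and isinstance(new, dict):
        problems = []
        for key in sorted(old.keys() | new.keys()):
            if key not in new:
                problems.append(f"{path}.{key}: missing in new response")
            elif key not in old:
                problems.append(f"{path}.{key}: only in new response")
            else:
                problems.extend(diff(old[key], new[key], f"{path}.{key}"))
        return problems
    if isinstance(old, list) and isinstance(new, list):
        if len(old) != len(new):
            return [f"{path}: list length {len(old)} != {len(new)}"]
        problems = []
        for index, (a, b) in enumerate(zip(old, new)):
            problems.extend(diff(a, b, f"{path}[{index}]"))
        return problems
    return [] if old == new else [f"{path}: {old!r} != {new!r}"]


def compare(old_response, new_response):
    """Compare an old and a new API response after normalizing both."""
    return diff(normalize(old_response), normalize(new_response))


# Hypothetical payloads: the same record as returned by the old and the new API.
old = {"user_id": "42", "created_at": "2024-01-05 10:30:00", "total_price": "19.90"}
new = {"userId": 42, "createdAt": "2024-01-05T10:30:00", "totalPrice": 19.9}
print(compare(old, new))  # prints []
```

Normalizing both sides first keeps the diff logic itself trivial: every expected change lives in one place, and anything that still shows up in the output is a real difference worth looking at.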

Another tool like that was a database migration tool. The schemas differed slightly between tables — some JSONB columns were added, others removed, and a few fields had different defaults or structures.

What was fascinating this time wasn’t that it was able to create it, but that it helped me discover problems in the old data: fields not fitting the new schema, weird defaults, inconsistent values, and much more.
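The kind of check that surfaces such problems can be sketched in a few lines. This is an illustration under assumptions, not the actual migration tool: the `Column` description, the column names, and the sample rows are all hypothetical, and JSON columns are assumed to arrive as text.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    """Expectation for one column in the new schema (hypothetical shape)."""
    kind: type                        # expected Python type of the loaded value
    nullable: bool = True
    max_length: Optional[int] = None  # only meaningful for text columns
    is_json: bool = False             # value must parse as JSON (e.g. a JSONB column)


def check_row(row, schema):
    """Return a list of reasons why an old row does not fit the new schema."""
    problems = []
    for name, col in schema.items():
        value = row.get(name)
        # Treat empty strings like NULL: a common "weird default" in old data.
        if value is None or value == "":
            if not col.nullable:
                problems.append(f"{name}: empty value in non-nullable column")
            continue
        if col.is_json:
            try:
                json.loads(value)
            except (TypeError, ValueError):
                problems.append(f"{name}: not valid JSON: {value!r}")
            continue
        if not isinstance(value, col.kind):
            problems.append(
                f"{name}: expected {col.kind.__name__}, got {type(value).__name__}"
            )
        elif col.max_length is not None and len(value) > col.max_length:
            problems.append(f"{name}: length {len(value)} exceeds {col.max_length}")
    # Columns that exist in the old data but have nowhere to go.
    for name in sorted(row.keys() - schema.keys()):
        problems.append(f"{name}: column has no target in the new schema")
    return problems


# Hypothetical target schema and one problematic old row.
schema = {
    "id": Column(int, nullable=False),
    "name": Column(str, max_length=5),
    "settings": Column(str, is_json=True),
}
bad_row = {"id": None, "name": "toolongname", "settings": "{bad", "legacy_flag": 1}

for problem in check_row(bad_row, schema):
    print(problem)
```

Running a check like this over the old data before migrating anything turns surprises during the migration into a list you can review up front.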

But even with all of that, there was still one remaining problem: not all APIs were actively used, and we didn’t want to waste time rewriting endpoints nobody depended on.

Wait… isn’t AI also a great analytical and research tool?

The Fourth Use-Case: Analyze, Analyze and Analyze Again

One of the integrations on those old APIs belongs to a different team. I could ask them to do the research, but that’s once again a pretty bad investment of their time.

Normally I would probably do it myself by reading their code, because that’s where the truth lies. But it occurred to me:

I know what I’m looking for
where to look for it
and I have access to both codebases

I pointed the AI at the frontend (FE) application and referenced the old APIs.

After a few minutes — literally a few minutes — I ended up with a detailed report:

list of APIs used by the FE application
list of APIs not used by the FE
where in the code I can find them (name, line number, etc.)
how they are used - a little bit of “context”
whether paging is used
which fields are used for sorting
which URLs I could use to inspect how the FE consumes and renders the API data

I have done a lot of analyses like this in the past, more than I would like to admit. Depending on the situation, each one could take me days.

No, it wasn’t perfect, but it was close enough. I needed to check it manually anyway, but you can imagine how much time it saved me.
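For the manual check, a crude grep-style script is often enough to confirm the used and unused lists. The sketch below is an assumption about how such a cross-check could look, not part of the AI's report: the file suffixes and endpoint paths are placeholders, and plain substring matching will miss URLs that are built dynamically and will count `/api/users/export` as a hit for `/api/users`.

```python
from pathlib import Path


def find_usages(root, endpoints, suffixes=(".ts", ".tsx", ".js", ".jsx", ".vue")):
    """Map each endpoint path to the (file, line number) pairs where it appears."""
    usages = {endpoint: [] for endpoint in endpoints}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        for number, line in enumerate(lines, start=1):
            for endpoint in endpoints:
                if endpoint in line:
                    location = path.relative_to(root).as_posix()
                    usages[endpoint].append((location, number))
    return usages


def unused(usages):
    """Endpoints with no hit anywhere in the scanned tree."""
    return [endpoint for endpoint, hits in usages.items() if not hits]
```

Calling `find_usages("path/to/frontend", ["/api/users", "/api/orders"])` returns a dictionary you can diff against the AI's report. Endpoints that show up in `unused(...)` but that the report lists as used are the first ones worth checking by hand.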

This fundamentally changed how I think about AI in development.

Generating code is useful, but that’s often the easiest part of engineering work.

The difficult part is usually understanding large systems:

figuring out dependencies
tracing behavior
identifying what is actually used
finding safe places to make changes

Especially in old codebases that nobody fully understands anymore.

And that’s where AI started feeling less like a code generator and more like a research and analysis tool.

For the kind of work waiting for me in the upcoming months, that might end up being the most valuable capability of all.

The Final Use-Case: AI Is Just Another Tool

I know all of these examples probably sound small compared to what the internet says AI can do.

But honestly, I think that’s the wrong way to look at it.

You don’t need 30 nonstop AI agents to get real value from AI.

A lot of value comes from much smaller things:

reducing repetitive work
accelerating analysis
building disposable tooling
reducing overhead in everyday development

And the more I use it, the more I realize there isn’t one correct way to work with AI.

The workflows that work for me might not work for someone else.

That’s probably the most important thing I’ve learned so far:

AI work isn’t really defined yet.

You experiment. You adjust. You discover where it helps and where it doesn’t.

And just like with any other tool, using it effectively still requires understanding the problem you’re solving.

AI doesn’t replace engineering knowledge. If anything, it makes good engineering judgment even more important.

What AI does, I think, is reduce the friction between understanding a problem and experimenting with a solution.

And learning how to use that well is just another skill developers will need to build.