TL;DR

When upgrading Spring Boot from 2.7 to 3.2 and replacing Sleuth with Micrometer, I missed enabling automatic context propagation for a reactive WebFlux service. This caused tracing issues where child spans weren’t linked to their parent. Adding a single line of code (Hooks.enableAutomaticContextPropagation()) fixed the problem, enabling proper trace propagation and improving debugging for the team.


Full tale

When we upgraded Spring Boot from version 2.7 to 3.2, we swapped Sleuth for Micrometer. I even wrote a post about that process, where I thoroughly tested MVC services for HTTP, message, and database tracing. However, there was one service — out of more than 30 — that I missed. This service is reactive and uses Spring WebFlux.

While I understand the basics of reactive applications, I’m far from an expert. When I did the upgrade for that service, I only checked whether the tracing export was working correctly, and it was.

But recently, while helping a team member with that service, I realized something was off. We didn’t have the expected waterfall of spans based on the code execution. We only had the parent trace with no child spans.

To investigate, I added some extra spans in the local environment to see if my assumption was right — that the trace wasn’t propagating properly. Sure enough, I found that the child spans didn’t have the parent ID reference, which is what links the spans together.

Here’s the code I added:

    @GetMapping(path = ["/my-entity"])
    suspend fun get(filter: MyEntityFilter): Page<MyEntity> {
        // Throwaway spans added only to check parent/child linking
        tracer.createSubSpan("JV-test-1") {
        }

        // JV-test-3 is nested, so it should show up as a child of JV-test-2
        tracer.createSubSpan("JV-test-2") {
            tracer.createSubSpan("JV-test-3") {
            }
        }

        // other operations
        return myEntity.findByFilter(filter)
    }

This code generated the following trace.

Figure: Missing child spans in the distributed trace (image showing the parent trace without the child spans)

According to the code, there should have been three additional spans: JV-test-1, JV-test-2, and JV-test-3.

If we searched for the other spans by name, we could find them, but they didn’t have the right parent reference. They were treated as root spans because their IDs matched the parent span’s ID.

Figure: Child spans without link to the parent (image showing the child span without the parent connection)
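To make the role of the parent ID concrete, here is a minimal sketch of how a tracing UI could assemble a waterfall from span records. SpanRecord, renderWaterfall, and the sample IDs are hypothetical names I made up for illustration, not the tracer's real data model, and I'm assuming an orphaned span starts a trace of its own, which would explain why the request's trace shows only one span.

```kotlin
// Hypothetical, simplified span record; real tracers store many more fields.
data class SpanRecord(
    val traceId: String,
    val spanId: String,
    val parentId: String?, // null marks a root span
    val name: String
)

// Renders one trace as an indented waterfall by following parent IDs.
fun renderWaterfall(spans: List<SpanRecord>, traceId: String): List<String> {
    val inTrace = spans.filter { it.traceId == traceId }
    val childrenByParent = inTrace.groupBy { it.parentId }
    val lines = mutableListOf<String>()

    fun visit(span: SpanRecord, depth: Int) {
        lines += "  ".repeat(depth) + span.name
        childrenByParent[span.spanId].orEmpty().forEach { visit(it, depth + 1) }
    }

    // Root spans are the ones without a parent ID.
    childrenByParent[null].orEmpty().forEach { visit(it, 0) }
    return lines
}

// What the endpoint should produce: every test span points at its parent.
fun linkedSpans(): List<SpanRecord> = listOf(
    SpanRecord("t1", "a", null, "GET /my-entity"),
    SpanRecord("t1", "b", "a", "JV-test-1"),
    SpanRecord("t1", "c", "a", "JV-test-2"),
    SpanRecord("t1", "d", "c", "JV-test-3")
)

// The broken case (assumption): each test span lost its parent
// and became the root of its own trace.
fun orphanedSpans(): List<SpanRecord> = linkedSpans().map {
    if (it.name.startsWith("JV-")) it.copy(traceId = "t-" + it.spanId, parentId = null) else it
}
```

With linkedSpans(), rendering trace "t1" gives the request span followed by the three indented test spans. With orphanedSpans(), the same trace contains only the request span, and each test span has to be looked up on its own.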

After several rounds of debugging, I discovered that adding a single line of code fixed the issue. We simply needed to add this hook:

 Hooks.enableAutomaticContextPropagation()

And place it in the main function of the StarterApp, like this:

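import org.springframework.boot.WebApplicationType
import org.springframework.boot.builder.SpringApplicationBuilder
import reactor.core.publisher.Hooks

fun main(args: Array<String>) {
    // new line added: install the hook before the application starts,
    // so it is in place before any reactive pipeline is assembled
    Hooks.enableAutomaticContextPropagation()

    SpringApplicationBuilder(StarterApp::class.java)
        .web(WebApplicationType.REACTIVE)
        .build()
        .run(*args)
}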

I was reminded of the old saying that the value of a hammer isn’t in the hammer itself, but in knowing where to use it. In this case, expertise — and knowing where to focus your effort — was much more valuable than just putting in more work (or, in my case, doing endless Googling).

Once I added the line, everything worked as expected.

Figure: Complete distributed trace with all the child spans (image showing the parent trace with all the child spans)

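From what I understand, Reactor (the engine WebFlux runs on) can execute different steps of the same request on different threads, so context stored in thread-locals, which is where the tracer keeps the current span, isn’t carried along automatically the way it is in a thread-per-request MVC application. The hook tells Reactor to restore that thread-local context as work moves between threads, so new spans can find their parent and get linked to it.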
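The underlying problem can be reproduced without Reactor at all. The sketch below uses only the JDK: a value stored in a ThreadLocal on one thread is invisible to a task running on another thread unless someone captures it and restores it there. The names currentTraceId and traceIdSeenOnWorker are hypothetical stand-ins, and the manual capture/restore is only my rough analogy for what the hook automates, not how Reactor implements it.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Stand-in for the tracer's "current span" holder (hypothetical name).
val currentTraceId = ThreadLocal<String?>()

// Sets a trace ID on the calling thread, runs a task on a worker thread,
// and returns the trace ID that the task sees there.
fun traceIdSeenOnWorker(propagate: Boolean): String? {
    val executor = Executors.newSingleThreadExecutor()
    try {
        currentTraceId.set("trace-123")
        val captured = currentTraceId.get() // captured on the calling thread

        val task = Callable<String?> {
            // Without this restore step the worker's ThreadLocal is empty.
            if (propagate) currentTraceId.set(captured)
            try {
                currentTraceId.get()
            } finally {
                currentTraceId.remove() // don't leak the value into later tasks
            }
        }
        return executor.submit(task).get()
    } finally {
        currentTraceId.remove()
        executor.shutdown()
    }
}
```

Calling traceIdSeenOnWorker(false) returns null, which is the "child span can't find its parent" situation. Calling traceIdSeenOnWorker(true) returns "trace-123", because the value was explicitly carried over to the worker thread.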

I’m happy the fix was simple and worked as expected. Now, the team can better debug and understand their service’s behavior in production. Plus, I learned something new and interesting along the way.

Cheers.