Laravel · PHP · June 24, 2026 · 4 min read

Laravel Octane performance profiling: Building Custom Flame Graphs

Laravel Octane performance profiling is essential for stable production. Learn to implement custom Xdebug-based flame graphs to debug long-running worker latency.

Tags: Laravel, Octane, PHP, Performance, Profiling, Xdebug, Backend

Last month, we noticed that our primary checkout worker in a Laravel Octane environment crept toward 400ms of latency after about two hours of uptime. Standard logs told us that it was slow, but they were useless at telling us why. We were leaking cycles somewhere in the dependency injection container, and traditional tools were too heavy to run in production without crashing the process.

If you’re running high-throughput PHP, you know that php-fpm hides a lot of sins by throwing away all application state after every request. Octane changes the game by keeping the application in memory between requests. This is a massive win for performance, but it makes any memory leak or inefficient loop a ticking time bomb.
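To make that difference concrete, here is a minimal sketch in plain PHP rather than Octane itself. The RequestLog class and the handleRequest() function are hypothetical names invented for this illustration; the loop stands in for one long-lived worker process serving several requests.

```php
<?php
declare(strict_types=1);

// Hypothetical service that appends to a static array on every request.
final class RequestLog
{
    /** @var list<string> */
    private static array $entries = [];

    public static function record(string $path): void
    {
        self::$entries[] = $path;
    }

    public static function count(): int
    {
        return count(self::$entries);
    }
}

// Stand-in for one request handled by the application.
function handleRequest(string $path): void
{
    RequestLog::record($path);
}

// With state reset between requests, the count would never exceed 1.
// In a long-lived worker the same process serves every request,
// so the static array keeps growing.
foreach (['/checkout', '/cart', '/checkout'] as $path) {
    handleRequest($path);
}

echo RequestLog::count(), PHP_EOL; // 3
```

Nothing in this code is wrong on a per-request model. It only becomes a leak once the process outlives the request.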

The Problem with Traditional Profiling

We first tried using Blackfire, but the overhead of injecting their probe into every request was too high for our specific traffic patterns. We needed a "surgical" approach: something we could trigger on demand for a single request without restarting our Octane workers.

The goal was to generate flame graphs. A flame graph draws CPU time as stacked call frames, so you can see at a glance which call paths dominate a request, something a flat list of timings tends to hide. We decided to leverage Xdebug's trace and profile capabilities, but in a way that doesn't choke the event loop.

Implementing Custom Xdebug Profiling

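To make this work, we don't want Xdebug collecting data globally. Instead, we use a custom middleware that checks for a specific "debug" header. If the header is present, we record that single request and nothing else.

One correction to the obvious approach: the profiler itself cannot be toggled from inside a running worker. Xdebug 3 reads xdebug.mode when the PHP process starts, and the old Xdebug 2 setting xdebug.profiler_enable no longer exists, so calling ini_set() in a middleware has no effect. What can be controlled at runtime is Xdebug's function trace, through xdebug_start_trace() and xdebug_stop_trace(), so the middleware below records a trace instead of a cachegrind profile.

First, ensure you have Xdebug installed and configured in your php.ini. You'll want to set the output directory to a location writable by your Octane user:


INI
; Load trace support, but record nothing until the middleware asks for it
xdebug.mode = trace
xdebug.start_with_request = no
xdebug.output_dir = /tmp/xdebug-profiles
; %p = process ID, %u = timestamp with microseconds (one file per traced request)
xdebug.trace_output_name = trace.%p.%u

Now, create a middleware to handle the trigger. This allows you to trace specific requests without paying the cost on every other one:


PHP
namespace App\Http\Middleware;

use Closure;
use Illuminate\Http\Request;

class ProfileRequest
{
    public function handle(Request $request, Closure $next)
    {
        // Trace only when the header is sent and Xdebug was started in trace mode.
        // (ini_get() may not reflect a mode set through the XDEBUG_MODE env variable.)
        $canTrace = function_exists('xdebug_start_trace')
            && str_contains((string) ini_get('xdebug.mode'), 'trace');

        if (! $canTrace || ! $request->hasHeader('X-Profile-Request')) {
            return $next($request);
        }

        // null = let Xdebug name the file from output_dir and trace_output_name;
        // the flag selects the tab-separated, machine-readable format.
        xdebug_start_trace(null, XDEBUG_TRACE_COMPUTERIZED);

        try {
            return $next($request);
        } finally {
            // Always stop, or the trace keeps growing across later requests.
            xdebug_stop_trace();
        }
    }
}

By keeping data collection off by default, we avoid the constant overhead that makes issues like the ones in "Laravel Octane memory management: solving circular reference leaks" so difficult to track down in production.

Visualizing the Bottleneck

Once you have the trace file, you need to convert it into a format that a flame graph visualizer can read. The speedscope web tool and Brendan Gregg's flamegraph.pl both accept folded stacks, so that is the target format. Note that gprof2dot and pyprof2calltree are not a fit here: gprof2dot draws call graphs rather than flame graphs, and pyprof2calltree converts Python profiles into calltree format, so it cannot read Xdebug output.

1. Generate the trace: curl -H "X-Profile-Request: 1" https://api.your-app.com/checkout
2. Locate the new .xt file in /tmp/xdebug-profiles.
3. Collapse it into folded stacks, for example with the stackcollapse-xdebug.php script from the FlameGraph repository (check the script's header for the options your copy supports): php stackcollapse-xdebug.php path/to/trace.xt > checkout.folded
4. Open checkout.folded in speedscope, or render an SVG: flamegraph.pl checkout.folded > checkout.svg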

When you load this into a tool like Speedscope, you’ll see the call stack. In our case, we found a service provider that was re-instantiating an object on every request because we hadn't properly bound it as a singleton. It was a classic "lazy loading" trap that only manifested once the container grew past a certain size.
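The difference between a plain binding and a singleton is easy to miss in a service provider, so here is a minimal sketch of it. TinyContainer and PriceCalculator are hypothetical classes written for this example, not Laravel's container or our actual service; in a real provider the equivalent calls are $this->app->bind() and $this->app->singleton().

```php
<?php
declare(strict_types=1);

// Hypothetical, stripped-down container: just enough to show the difference.
final class TinyContainer
{
    /** @var array<string, array{factory: Closure, shared: bool}> */
    private array $bindings = [];

    /** @var array<string, object> */
    private array $instances = [];

    public function bind(string $id, Closure $factory): void
    {
        $this->bindings[$id] = ['factory' => $factory, 'shared' => false];
    }

    public function singleton(string $id, Closure $factory): void
    {
        $this->bindings[$id] = ['factory' => $factory, 'shared' => true];
    }

    public function make(string $id): object
    {
        if (isset($this->instances[$id])) {
            return $this->instances[$id]; // reuse the shared instance
        }

        $binding = $this->bindings[$id];
        $object = ($binding['factory'])();

        if ($binding['shared']) {
            $this->instances[$id] = $object;
        }

        return $object;
    }
}

// Hypothetical service that counts how often it gets constructed.
final class PriceCalculator
{
    public static int $constructed = 0;

    public function __construct()
    {
        self::$constructed++;
    }
}

$container = new TinyContainer();

// Plain binding: the factory runs on every resolution.
$container->bind('calc', fn () => new PriceCalculator());
$container->make('calc');
$container->make('calc');
$afterBind = PriceCalculator::$constructed;

// Singleton: the factory runs once and the instance is reused.
PriceCalculator::$constructed = 0;
$container->singleton('calc', fn () => new PriceCalculator());
$container->make('calc');
$container->make('calc');
$afterSingleton = PriceCalculator::$constructed;

printf("bind: %d, singleton: %d\n", $afterBind, $afterSingleton); // bind: 2, singleton: 1
```

In a flame graph, the plain binding shows up as the constructor frame repeating under every resolution, which is the pattern we were looking at.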

Why This Beats Standard Monitoring

When you’re dealing with Octane, you need to be careful about state. Otherwise you’ll end up fighting the same issues I discussed in "Laravel Octane memory management: implementing custom object pooling".

Standard APM tools give you a high-level view, but they rarely show you the stack depth of a recursive Eloquent relationship or a heavy collection transformation. By using flame graphs, you can see if a specific function is being called 5,000 times per request. It’s the difference between guessing and knowing.
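You don't always need the picture to find the worst offender. Flame graph tools commonly consume the folded-stack text format: one line per call stack, frames joined by semicolons, followed by a number. As a rough sketch, assuming you have such a file, the helper below sums that number per innermost frame. The function name hottestFrames() and the sample lines are made up for illustration, and the unit of the number (time or call count) depends on how the file was produced.

```php
<?php
declare(strict_types=1);

/**
 * Sum the value of every folded-stack line per leaf frame.
 *
 * @param  list<string> $foldedLines lines such as "main;Order->total;Collection->map 42"
 * @return array<string, float> leaf frame => summed value, largest first
 */
function hottestFrames(array $foldedLines, int $limit = 5): array
{
    $totals = [];

    foreach ($foldedLines as $line) {
        $line = trim($line);
        $split = strrpos($line, ' ');
        if ($line === '' || $split === false) {
            continue; // skip blank or malformed lines
        }

        $stack = substr($line, 0, $split);
        $value = (float) substr($line, $split + 1);

        // The last frame in the stack is the function that was running.
        $frames = explode(';', $stack);
        $leaf = end($frames);

        $totals[$leaf] = ($totals[$leaf] ?? 0.0) + $value;
    }

    arsort($totals); // largest value first, keys preserved

    return array_slice($totals, 0, $limit, true);
}

// Made-up sample data, not output from a real trace.
$sample = [
    'main;CheckoutController->store;Order->total 120',
    'main;CheckoutController->store;Order->total;Collection->map 5000',
    'main;CheckoutController->store;Cart->items;Collection->map 300',
];

$top = hottestFrames($sample, 2);
print_r($top);
```

With the sample above, Collection->map comes out on top with 5300, ahead of Order->total with 120, because both stacks ending in Collection->map are added together.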

Trade-offs and Lessons

We initially tried to profile the entire process for 5 minutes, but that generated a 2GB file that crashed our analysis tool. Lesson learned: always profile a single, isolated request.

Also, remember that Xdebug adds a non-trivial overhead. If you're profiling under heavy load, the results might be slightly skewed by the profiler's own impact on the CPU. It’s better to use this on a staging environment that mirrors your production traffic volume.

If you are still seeing memory growth after fixing these bottlenecks, you might need to look into the techniques in "Laravel Octane JIT compilation: deterministic request pre-warming" to stabilize the memory footprint further.

FAQ

Does this work in production? Yes, if you use a header-based trigger. Never leave Xdebug recording globally in production.
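One caveat with a bare header check is that anyone who knows the header name can trigger it. A possible hardening, sketched here under the assumption that you keep a shared secret in configuration, is to require the header value to match that secret. The function name mayProfile() and the idea of a configured secret are illustrative additions, not part of the setup described above.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a request may trigger profiling.
 *
 * @param string|null $headerValue value of the X-Profile-Request header, if sent
 * @param string      $secret      shared secret; an empty secret disables the trigger
 */
function mayProfile(?string $headerValue, string $secret): bool
{
    if ($headerValue === null || $secret === '') {
        return false;
    }

    // hash_equals() compares in constant time, which avoids leaking
    // how much of the secret a guess got right.
    return hash_equals($secret, $headerValue);
}

var_dump(mayProfile('s3cret-token', 's3cret-token')); // bool(true)
var_dump(mayProfile('1', 's3cret-token'));            // bool(false)
var_dump(mayProfile(null, 's3cret-token'));           // bool(false)
```

In the middleware, this check would replace the plain hasHeader() test, with the secret read from wherever you keep environment-specific configuration.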

Can I use this to debug memory leaks? It’s better for CPU bottlenecks. For memory leaks, you’re better off using memory_get_usage() in a custom middleware or checking your opcache stats.
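As a rough sketch of the memory_get_usage() idea, the helper below wraps a handler and reports how much memory is still allocated after it returns. The name measureRetainedMemory() and the deliberately leaky closure are hypothetical; in a middleware, the handler would be the call to $next($request).

```php
<?php
declare(strict_types=1);

/**
 * Run a callable and report how much memory stayed allocated afterwards.
 *
 * @return array{result: mixed, retainedBytes: int}
 */
function measureRetainedMemory(callable $handler): array
{
    gc_collect_cycles();              // clear collectable cycles first
    $before = memory_get_usage();

    $result = $handler();

    gc_collect_cycles();              // ignore garbage that is merely pending
    $after = memory_get_usage();

    return ['result' => $result, 'retainedBytes' => $after - $before];
}

// Hypothetical leaky handler: it parks 100 KB in a static array on every call.
$leaky = function (): string {
    static $cache = [];
    $cache[] = str_repeat('x', 100_000);

    return 'ok';
};

foreach (range(1, 3) as $i) {
    $report = measureRetainedMemory($leaky);
    printf("request %d retained %d bytes\n", $i, $report['retainedBytes']);
}
```

A handler that cleans up after itself should hover around zero here. A number that stays positive request after request is the signal worth chasing.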

What if I'm using RoadRunner? The middleware approach works regardless of whether you're using Swoole or RoadRunner, as long as the Xdebug extension is actually loaded in the underlying PHP worker process.

I’m still experimenting with automating the conversion of these profiles to a centralized dashboard. Manually moving files around is fine for a one-off debug, but for a team, it’s a bottleneck in itself. Next time, I’ll probably hook this into a sidecar container that auto-uploads these files to a storage bucket for team review.
