Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add HTTP pprof endpoints #5363

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

inkel
Copy link
Contributor

@inkel inkel commented Feb 26, 2025

what

This PR adds a new --enable-profiling-routes flag that when true will mount the net/http/pprof endpoints, allowing to perform continuous profiling on the running server. This will allow a deeper inspection on how the application behaves, where the most CPU and memory is consumed, and so on. See profiling Go programs for additional information.

why

At Grafana Labs we use Atlantis with a large number of environments (740+) and when making a change that affects too many environments we have experienced slowness or OOM kills. We would like to make use of Grafana Pyroscope to investigate why this happens and hopefully contribute back.

tests

Added flag test.

@inkel inkel force-pushed the inkel/profiling-endpoints branch from 8be838e to 616316f Compare February 26, 2025 15:51
Signed-off-by: Leandro López (inkel) <[email protected]>
Signed-off-by: Leandro López (inkel) <[email protected]>
@inkel inkel force-pushed the inkel/profiling-endpoints branch from 616316f to ba44664 Compare February 27, 2025 18:46
@X-Guardian
Copy link
Contributor

Hi @inkel, I know this PR is still in draft, but I thought I would take a quick look, as it seems interesting.

Don't forget to update the server configuration and API docs: https://github.com/runatlantis/atlantis/blob/main/runatlantis.io/docs/server-configuration.md and https://github.com/runatlantis/atlantis/blob/main/runatlantis.io/docs/api-endpoints.md

I was also wondering whether --enable-profiling-api would be a better name for the flag?

@inkel
Copy link
Contributor Author

inkel commented Feb 28, 2025

Hi @inkel, I know this PR is still in draft, but I thought I would take a quick look, as it seems interesting.

Thanks @X-Guardian! Yeah, we are running a custom version with this enabled and I wanted to be sure that It Works™ before submitting for final review.

So far it only exposes the endpoints from net/http/pprof stdlib package, however, internally we are also using delta profilers for memory, mutex, and blocks, using this package, which give you the ability to have a "diff" view of the profiles, but that's when using Grafana tools, I'm not sure about others. Do you think it would be nice to include those as well in this PR, or better to discuss it and maybe adding those in the future?

Don't forget to update the server configuration and API docs: main/runatlantis.io/docs/server-configuration.md and main/runatlantis.io/docs/api-endpoints.md

Thank you! Will do.

I was also wondering whether --enable-profiling-api would be a better name for the flag?

🤔 that's a good question. I don't seem these endpoints as an API, as you can only fetch information from them, but I'm open to be convinced otherwise 😬

@X-Guardian
Copy link
Contributor

Your call with the profiling packages. If you want to get this one over the line first, that's fine.

Do any of these profiling packages have any performance impact if the endpoint isn't being actively read? We are trying to avoid adding unnecessary extra server options, so if there is no performance impact, maybe we should just always have the endpoint enabled...

@inkel
Copy link
Contributor Author

inkel commented Feb 28, 2025

🤔 another great question. I don't think they add much overhead, but let me ask around.

@inkel
Copy link
Contributor Author

inkel commented Feb 28, 2025

@X-Guardian asking some of my coworkers who are more knowledgeable about this, and the impact is around 1% of the performance. They also mentioned me that getting profiles for memory, block, and mutex can turn costly in some cases, and that's why it is suggested to use this package instead for those profiles.

While I like the idea of having these endpoints always enabled is appealing and would make the PR smaller, it perhaps warrants to leave it behind a flag and gather some real world usage data and then decide if removing the flag or not. WDYT?

@inkel
Copy link
Contributor Author

inkel commented Feb 28, 2025

Another reason to have this behind a flag is a security concern if you like, as they will expose information about your application. So I think that settles: better behind a flag.

@X-Guardian
Copy link
Contributor

OK, I still think enable-profiling-api what be a better option name or enable-profiling-api-endpoint. This would fit better with the terminology we use in https://github.com/runatlantis/atlantis/blob/main/runatlantis.io/docs/api-endpoints.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants