I wrote a thing about a somewhat bizarre CPU behavior observed during #PostgreSQL benchmarks on large systems. Seems related to cores being too idle, but hard to "prove" these internal CPU behaviors :-( Also, not sure how to mitigate it.

vondra.me/posts/benchmarking-i


@tomasv Interesting post. Did you consider running pgbench on a different machine from the PostgreSQL server? It might be interesting to see how that shifts things, if you think there is some benefit to the client and server processes being on the same core / core cluster.

@intrbiz Good idea. I did consider that, but then didn't actually try it for some reason. Will give it a try tomorrow.

@intrbiz I ran the test with two of the 176-core machines, with pgbench on one and the database on the other. In this case I don't observe the issue. I did that in both directions, so for each pinning strategy there are two data series. Of course, the pinning is mostly pointless here: it only applies on the pgbench machine, so it can't pin the backends at all. Still, I'm a bit surprised "none" wins by this much.

@intrbiz It's true, however, that the throughput is much lower. Here's a chart with results for local (unix sockets) runs from both machines. Those reach almost 5M tps, while the remote TCP runs only get to 1M tps.

The network is pretty good. iperf3 says it can do ~80Gb/s, and per netperf the latency is about 0.07ms (min: 40us, mean: 73us, max: 4441us).
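
For a rough sense of what those numbers imply, here's a back-of-envelope sketch. It assumes each transaction costs exactly one network round trip at the mean netperf latency and ignores server-side time entirely; that's my simplification, not something measured. The function names are made up for illustration.

```python
# Back-of-envelope bounds from the measured network numbers.
# Assumption (not measured): one round trip per transaction, zero server time.

def max_tps_per_connection(rtt_us: int) -> float:
    """Upper bound on tps for one synchronous connection."""
    return 1_000_000 / rtt_us

def min_connections(target_tps: int, rtt_us: int) -> int:
    """Fewest synchronous connections needed to reach target_tps (ceiling)."""
    return (target_tps * rtt_us + 999_999) // 1_000_000

def bytes_per_txn_budget(link_bits_per_s: int, tps: int) -> float:
    """Bytes each transaction could use before saturating the link."""
    return link_bits_per_s / 8 / tps

if __name__ == "__main__":
    rtt_us = 73                      # mean latency per netperf
    print(max_tps_per_connection(rtt_us))                 # roughly 13.7k tps
    print(min_connections(1_000_000, rtt_us))             # 73 connections
    print(bytes_per_txn_budget(80_000_000_000, 1_000_000))  # 10000.0 bytes
```

So under these assumptions a single connection tops out around 13.7k tps, and 1M tps needs at least 73 connections in flight. At 1M tps the ~80Gb/s link leaves about 10kB per transaction, so raw bandwidth is unlikely to be the limit.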
