How to Calculate Maximum Threads in a Multi-Core Server: CPU Sockets vs Cores vs Threads



From your machine specs:

CPU(s):                20
Thread(s) per core:    1
Core(s) per socket:    10
Socket(s):             2

This is a dual-socket system with 10 physical cores per socket. Because Thread(s) per core is 1 (Hyper-Threading is disabled or not supported), each core exposes a single hardware thread, for a total of 20 logical processors.

Maximum simultaneous hardware threads = Sockets × Cores per socket × Threads per core

In your case: 2 × 10 × 1 = 20 simultaneous threads
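
As a rough illustration of the arithmetic, the sketch below parses text in the lscpu format and multiplies the three fields. The sample string is copied from the specs above; the helper names `parse_lscpu` and `hardware_threads` are made up for this example and are not part of any library.

```python
import re

# Sample text in the format printed by lscpu; values copied from the specs above.
LSCPU_SAMPLE = """\
CPU(s):                20
Thread(s) per core:    1
Core(s) per socket:    10
Socket(s):             2
"""

def parse_lscpu(text):
    """Return a dict mapping each lscpu field name to its integer value."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):\s+(\d+)\s*$", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields

def hardware_threads(fields):
    """Sockets x cores per socket x threads per core."""
    return (fields["Socket(s)"]
            * fields["Core(s) per socket"]
            * fields["Thread(s) per core"])

fields = parse_lscpu(LSCPU_SAMPLE)
total = hardware_threads(fields)
print(f"Hardware threads: {total}")  # 2 x 10 x 1 = 20
print(f"Matches CPU(s): {total == fields['CPU(s)']}")
```

The product should always equal the CPU(s) line when all processors are online, so the second print is a quick consistency check.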

Java check for available processors:

int availableProcessors = Runtime.getRuntime().availableProcessors();
System.out.println("Available processors: " + availableProcessors);

Python multiprocessing:

import multiprocessing
print(f"Available CPUs: {multiprocessing.cpu_count()}")
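
One caveat worth knowing: the count above is the number of logical CPUs in the system, which can be larger than the number your process is actually allowed to run on (for example under taskset or a cpuset). A minimal sketch of a more conservative check, assuming Linux for the affinity call; `usable_cpu_count` is a hypothetical helper name:

```python
import os

def usable_cpu_count():
    """Number of logical CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: reflects affinity restrictions such as taskset or cpusets.
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity (e.g. macOS, Windows).
    return os.cpu_count() or 1

print(f"Logical CPUs in the system: {os.cpu_count()}")
print(f"CPUs usable by this process: {usable_cpu_count()}")
```

On an unrestricted process on your server both lines should report 20. Note that this does not account for CPU time quotas imposed by a container runtime.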

You can create more threads than there are logical processors, but the OS scheduler will time-slice them across the available cores. For CPU-bound tasks, running more threads than hardware threads (20 here) typically degrades performance because of context-switching overhead.

For optimal performance:

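  • Set thread pool sizes to match the physical core count for CPU-intensive workloads
  • For I/O-bound tasks, you can use more threads than cores (2x cores is a common starting point, then tune by measurement)
  • On multi-socket systems, account for NUMA effects: a thread that accesses memory attached to the other socket pays higher latency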
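
The sizing guidelines above can be sketched as a small helper. This is only a rule-of-thumb starting point, not a tuned formula; the function name `suggested_pool_size`, the "cpu"/"io" labels, and the `io_factor` parameter are all assumptions made for this example.

```python
import os

def suggested_pool_size(workload, logical_cpus=None, io_factor=2):
    """Rule-of-thumb starting point for a thread pool size.

    workload: "cpu" or "io" (hypothetical labels for this sketch).
    io_factor: multiplier for I/O-bound pools; 2 follows the guideline above.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    if workload == "cpu":
        return logical_cpus
    if workload == "io":
        return logical_cpus * io_factor
    raise ValueError(f"unknown workload type: {workload!r}")

print(suggested_pool_size("cpu", logical_cpus=20))  # 20
print(suggested_pool_size("io", logical_cpus=20))   # 40
```

Treat the result as an initial value and adjust it after measuring your actual workload.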

Linux command to view the thread count of a process (replace <pid> with the process ID):

ps -o nlwp= -p <pid>

Or for the system-wide limit on the number of threads (this is the maximum allowed, not the current count):

cat /proc/sys/kernel/threads-max
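
If you would rather check a thread count from inside a program, the same information is exposed under /proc on Linux. A minimal sketch, assuming a Linux /proc filesystem; `thread_count` is a hypothetical helper that returns None elsewhere:

```python
import threading

def thread_count(pid="self"):
    """Read the 'Threads:' field from /proc/<pid>/status (Linux only).

    Returns None when the file or field is unavailable.
    """
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None

print(f"Threads (from /proc): {thread_count()}")
print(f"Threads (Python view): {threading.active_count()}")
```

The two numbers can differ, because the kernel counts every OS thread in the process while `threading.active_count()` only sees threads known to Python's threading module.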

Decoding the lscpu output line by line:

CPU(s):                20       # Total logical processors
Thread(s) per core:    1        # No Hyper-Threading
Core(s) per socket:    10       # Physical cores per CPU
Socket(s):             2        # Physical CPU packages

Your server has:

  • 2 physical CPUs (sockets)
  • 10 cores per CPU
  • No Hyper-Threading (1 thread per core)
  • Total logical processors: 20 (2 × 10 × 1)

In your configuration with HT disabled:

Maximum efficient threads = Total logical processors = 20

For C++ thread detection:

#include <iostream>
#include <thread>

int main() {
    // hardware_concurrency() is only a hint; it returns 0 if the value cannot be determined
    unsigned int n = std::thread::hardware_concurrency();
    std::cout << "Hardware threads: " << n << std::endl;
    return 0;
}

Cases where more threads may be beneficial:

  1. I/O-bound workloads
  2. Threads waiting on external resources
  3. Implementing worker pools with oversubscription
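
To see why oversubscription helps when threads spend most of their time waiting, here is a small sketch that simulates blocking I/O with `time.sleep`. The task, the worker counts, and the delay are arbitrary values chosen for illustration, not measurements from your server.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io_task(delay):
    """Stand-in for a blocking call such as a network request (simulated with sleep)."""
    time.sleep(delay)
    return delay

def run_batch(workers, tasks=16, delay=0.05):
    """Run `tasks` simulated I/O calls on `workers` threads; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(fake_io_task, [delay] * tasks))
    return time.perf_counter() - start

narrow = run_batch(workers=2)
wide = run_batch(workers=16)
print(f"2 workers:  {narrow:.2f}s")
print(f"16 workers: {wide:.2f}s")
```

Because sleeping threads use no CPU, the wider pool should finish much sooner regardless of core count. With CPU-bound work the same experiment would not show this gain.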

Linux command to verify:

lscpu | grep -E 'Socket|Core|Thread|CPU\(s\)'

Windows PowerShell:

Get-CimInstance Win32_Processor | Select-Object NumberOfCores, NumberOfLogicalProcessors

Python multiprocessing test:

from multiprocessing import Pool
import time

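def stress_test(n):
    # CPU-bound work: enough computation per task that
    # inter-process communication overhead does not dominate
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    with Pool(processes=20) as pool:  # Match your logical CPU count
        start = time.perf_counter()
        pool.map(stress_test, [200_000] * 200)
        print(f"Duration: {time.perf_counter() - start:.2f}s")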

Try varying the process count (10, 20, 30) to observe performance differences.