How to assess the performance of your hardware fleet
If you’re serious about evaluating PC performance, you might need to rethink the way you test
Just as there’s a direct link between the performance of a business and the performance of its employees, so there’s a link between employee productivity and the hardware they use on the job. Slow application performance, unresponsive software or an unreliable PC can all have consequences. A 2020 UK study found that nearly half of British workers felt that outdated tech hindered their productivity, and that slow technology cost them up to 46 minutes a day. With a speedy, responsive PC, however, users have time to focus and work creatively, rather than wonder how long they’re going to have to wait while their PC grinds its gears.
This makes it crucial that IT teams refresh their PC and laptop fleets regularly. Yet this raises another issue: How do you evaluate performance and make meaningful comparisons that can guide you towards the right choice?
Evaluating PC performance has rarely been simple or straightforward, particularly when it comes to CPUs. Back in the 1990s you could make some assumptions based on clock speeds; the more MHz you had to play with, the faster your applications ran. The problem is that many businesses still make these assumptions – perhaps also factoring in core counts, threads and cache – even though these methods of assessment are fundamentally flawed.
As a new white paper from AMD, ‘Measuring what matters – Benchmark considerations for commercial PC purchases’, makes clear, modern processor design is less focused on ramping up clock speeds, and more focused on creating efficient superscalar designs with deep instruction pipelines and more sophisticated use of cache, enabling each core to execute multiple instructions per clock cycle. In fact, designers may even focus on reducing the number of clock cycles required to execute a given set of instructions, rather than relying on pushing clock speeds upwards. If you automatically assume that a processor running at 3 to 4.8GHz will give you more performance than one running at 1.9 to 4.4GHz, you could be making a big mistake.
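The point about clock speed versus work per cycle can be made with simple arithmetic. The sketch below uses hypothetical frequency and IPC (instructions per clock) figures – illustrative numbers, not taken from the white paper or any real product – to show how a lower-clocked design can still deliver more throughput:

```python
# Illustrative arithmetic only: hypothetical CPUs showing why peak clock
# speed alone misleads. Throughput also depends on instructions per clock (IPC).

def instructions_per_second(frequency_ghz: float, ipc: float) -> float:
    """Rough instruction throughput: cycles per second x instructions per cycle."""
    return frequency_ghz * 1e9 * ipc

# Hypothetical figures: CPU A clocks higher, CPU B does more work per cycle.
cpu_a = instructions_per_second(4.8, 1.5)  # higher clock, lower IPC
cpu_b = instructions_per_second(4.4, 2.0)  # lower clock, higher IPC

assert cpu_b > cpu_a  # the "slower" chip wins on raw throughput
print(f"CPU A: {cpu_a:.2e}/s, CPU B: {cpu_b:.2e}/s")
```

Real CPUs complicate this further – IPC varies by workload, and sustained frequency depends on thermals – but the basic trade-off holds.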
In fact, AMD’s white paper suggests, “The idea of boost and base frequencies is an oversimplification – the actual frequency at any given time will depend on workload, power-mode and platform thermal design.” This is true of desktop PCs, but even more so of laptops, where the thermal design of the device constrains how long the CPU can run at its maximum boost frequency. In real-world use, one CPU in one laptop might reach a higher boost frequency, but quickly throttle to a lower speed as power demands increase and temperatures rise. Another CPU in another laptop might reach a lower boost frequency but prove able to sustain it over longer periods.
Meanwhile, core counts matter. Provided an application is designed to make effective use of multi-threading – and most are these days – an AMD CPU with eight cores and 16 threads will still run a given operation faster than an Intel CPU with four cores and eight threads, higher clock speeds notwithstanding. In fact, the eight-core part may find it easier to stay within thermal constraints and sustain higher clock speeds for longer. Throw in all the other variables within the system – memory bandwidth, storage bandwidth, GPU performance – and simple MHz comparisons can be misleading.
So, how can businesses get a more accurate idea of real-world PC performance? The ideal would be a programme of user testing, with tests designed to mimic everyday tasks in real working environments. The problem is that this is time-consuming, difficult to manage and potentially expensive, and it’s hard to extract consistent, unbiased metrics from it. Software benchmarks can be more practical, but they need to be approached in the right way.
The obvious answer is application benchmarks, either using industry-standard PC benchmarks or, where the development resources are available, custom scripts based around real-world business tasks. However, both approaches have their issues. With industry-standard benchmarks, businesses need to make sure that they’re a good fit for the organisation, and not focused on consumer workloads or creative workloads that don’t reflect the kind of applications they actually use. Custom benchmarks, meanwhile, need regular updating to ensure that they’re not optimised for older architectures and still provide accurate, relevant results.
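A custom benchmark script of the kind described above can be very simple. The sketch below is a minimal example, assuming a hypothetical `representative_task` stands in for a real business workload (parsing a report, transforming a spreadsheet export and so on); it times several runs and reports the median, which is less sensitive to one-off noise than a single measurement:

```python
# A minimal custom-benchmark harness. The workload here is a placeholder;
# in practice you would swap in a scripted version of a real business task.
import statistics
import time

def representative_task() -> None:
    # Placeholder CPU-bound workload standing in for a real application task.
    sum(i * i for i in range(200_000))

def benchmark(task, runs: int = 5) -> float:
    """Return the median wall-clock time for `task` over `runs` executions."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

print(f"median time: {benchmark(representative_task):.4f}s")
```

As the article notes, the hard part isn’t the harness but keeping the workload representative as applications and architectures change.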
Both types share the problem that, while they can provide an accurate picture of performance in existing applications, they can’t predict how PCs will perform with new or emerging applications. What happens when a new, revolutionary tool appears that uses specific hardware features, but your benchmarks can’t tell you if a PC measures up?
This is where synthetic benchmarks come in. They can provide a snapshot of system or component-level performance, giving you some idea of how system X compares to system Y, or the read/write speeds of one SSD when compared to another. They can be quick, dirty and effective, but they need to be used with caution. As AMD’s white paper notes, narrow benchmarks of, say, the storage or memory subsystems may not call on features in the CPU pipeline or cache, skewing the result. “It is important not to overly weight these results,” it concludes, suggesting that “decision makers should instead evaluate the composite score, which exercises a broader set of processor functions”.
What’s more, synthetic benchmarks may not run for long enough or put enough stress on every part of the system to assess how performance will hold up against issues with memory bandwidth, I/O latency or thermal constraints.
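One way to probe sustained rather than burst performance is simply to run a fixed unit of work in a loop and watch whether per-iteration times degrade. The sketch below is a rough illustration of that idea, not a rigorous thermal test – on a machine that throttles under load, later iterations will tend to be slower than early ones:

```python
# A rough sustained-load check: repeat a fixed chunk of CPU work and compare
# early iteration times with late ones. A marked slowdown over the run can
# hint at thermal or power limits that a short burst benchmark would miss.
import time

def work_unit() -> None:
    sum(i * i for i in range(100_000))  # fixed chunk of CPU-bound work

def sustained_profile(duration_s: float = 2.0) -> list[float]:
    """Time each work unit until roughly `duration_s` has elapsed."""
    timings, deadline = [], time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        start = time.perf_counter()
        work_unit()
        timings.append(time.perf_counter() - start)
    return timings

profile = sustained_profile()
k = max(1, len(profile) // 10)
early = sum(profile[:k]) / k
late = sum(profile[-k:]) / k
print(f"early avg {early:.5f}s, late avg {late:.5f}s over {len(profile)} runs")
```

A few seconds of load won’t heat-soak a laptop chassis, so a serious test would run for many minutes; the point is that duration itself is a variable worth controlling.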
It’s clear that no single benchmark or type of benchmark works for evaluating PC performance on its own, but where does that leave us? AMD’s white paper takes a smart approach, which combines “a wide range of both application-based and synthetic tests”. Through this kind of holistic, comprehensive testing, AMD shows that the Ryzen 7 Pro 5850U processor exceeds its rivals in a range of real-world applications, even though it runs at lower clock speeds and falls behind in some synthetic, single-threaded tests. No single benchmark can tell the whole story – and MHz to MHz comparisons won’t even get you past ‘once upon a time’. If you’re serious about evaluating PC performance, you really need the wider picture.