So I got unhealthily curious about CPU power management.
[Note: I oversimplify CPU talk here. It's not all about CPU Speeds/GHz, it's also about type of CPU, generation, age, etc.]
Turns out you can control your processor speed in Windows. Not explaining how. Your own risk if you explore this.
Doing this stabilised my CPU's clock speed. At worst it only shifted by ±0.01, and it frequently stayed completely static.
This gave far more stable C3 readings. They're still a bit jumpy with unlocked framerate, but the numbers jump around within a narrower range.
This guarantees the CPU readings aren't affected by CPU power shenanigans, but it is NOT a real-world scenario, just an experiment to rule out CPU power management.
Without the fixed speed, maxing out the CPU caused unpredictable clock speeds, and tests sitting at 10%, 30%, etc. CPU usage all shifted the clock speed too. So yes, the CPU's power management is affecting our results.
The way we measure now, if you let your CPU do its own thing and fluctuate as the workload gets higher, then:
1. Running one test at a time, your CPU would reach differing speeds during each test. Less accurate results. (I did this method for many months).
2. Doing tests with "2 Groups enabled and compare CPU Profiler", the CPU speed gets even higher than when running single tests (and the difference in CPU will also be less clear, explained below).
Notably, the higher your CPU speed, the harder it is to observe a CPU difference between two tests. No use fluffing up both tests with a "For" loop, as CPU will just boost its speed and further decrease the CPU difference. If you had an imaginary 100GHz CPU, none of our measurements would even show a 0.1% difference and we wouldn't know there was a difference that could affect the average player with a 2.5GHz CPU. (Always test on lower-powered devices if you aim to target them).
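To put rough numbers on that 100GHz thought experiment, here is a toy model in Python. It assumes each test costs a fixed number of CPU cycles per tick and that CPU % is simply cycles used divided by cycles available, which is a big oversimplification of real hardware. The cycle counts and the 60 ticks per second are made up purely for illustration, they are not measurements.

```python
# Toy model: CPU % = cycles the test needs per second / cycles the CPU offers per second.
# All constants below are hypothetical, chosen only to illustrate the idea.

TICKS_PER_SECOND = 60        # assumed tick rate
TEST_A_CYCLES = 3_000_000    # hypothetical cost per tick of the slower test
TEST_B_CYCLES = 2_000_000    # hypothetical cost per tick of the faster test


def cpu_percent(cycles_per_tick: float, ticks_per_second: float, clock_ghz: float) -> float:
    """Percentage of one core's cycles the test uses each second (toy model)."""
    cycles_available = clock_ghz * 1e9
    return 100.0 * cycles_per_tick * ticks_per_second / cycles_available


def cpu_gap(clock_ghz: float) -> float:
    """Difference in CPU % (percentage points) between test A and test B."""
    a = cpu_percent(TEST_A_CYCLES, TICKS_PER_SECOND, clock_ghz)
    b = cpu_percent(TEST_B_CYCLES, TICKS_PER_SECOND, clock_ghz)
    return a - b


if __name__ == "__main__":
    for clock in (2.5, 100.0):
        print(f"{clock:>6.1f} GHz: gap between tests = {cpu_gap(clock):.2f} percentage points")
```

In this model test A always costs 1.5x as much as test B, but the visible gap in CPU % shrinks as the clock goes up, so at 100GHz it drops under 0.1 points and would be lost in the noise.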
Measurements with fixed CPU speed
I took the same example I posted in this topic: "Dummy Return function" vs "For Each loop", 1000 objects every tick, one test at a time.
Measurements are still approximate, since it's never a perfectly stable number (especially at higher GHz).
I also tested various clock speeds, in all 3 Framerate modes (VSync, Ticks Only, Full Frames) and noted down the CPU % for VSync, TicksPerMin for Ticks Only, and FPS for Full Frames.
Why did I do this? I don't know. Maybe someone finds it interesting or can extract some information from it, knowing it's a fixed stable CPU speed on every measurement.
--- No CPU Restriction ---
(Idle with preview open uses 2% CPU; clock speed floats between 4.15GHz and 4.21GHz).
(Very approximate as CPU speed changes often).
For Each
CPU: 13%
TPM: 1000 [~4.24GHz]
FPS: 940 [~4.22GHz]
Return Fn
CPU: 8%
TPM: 2250 [~4.24GHz]
FPS: 1840 [~4.22GHz]
--- 2.28GHz ---
For Each
CPU: 20%
TPM: 560
FPS: 545
Return Fn
CPU: 12%
TPM: 1220
FPS: 1000
--- 1.48GHz ---
For Each
CPU: 30%
TPM: 360
FPS: 345
Return Fn
CPU: 17%
TPM: 760
FPS: 670
--- 0.98GHz ---
For Each
CPU: 43%
TPM: 225
FPS: 218
Return Fn
CPU: 25%
TPM: 475
FPS: 420
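If anyone wants to poke at these numbers, here is a small Python sketch that works out how "Return Fn" compares to "For Each" at each clock speed. The readings are copied from the lists above (so they are approximate); the dictionary layout and function name are just my own choices for the example.

```python
# Readings copied from the measurements above: CPU % (VSync),
# ticks per minute (Ticks Only) and FPS (Full Frames). All approximate.
READINGS = {
    "No restriction": {
        "for_each": {"cpu": 13, "tpm": 1000, "fps": 940},
        "return_fn": {"cpu": 8, "tpm": 2250, "fps": 1840},
    },
    "2.28GHz": {
        "for_each": {"cpu": 20, "tpm": 560, "fps": 545},
        "return_fn": {"cpu": 12, "tpm": 1220, "fps": 1000},
    },
    "1.48GHz": {
        "for_each": {"cpu": 30, "tpm": 360, "fps": 345},
        "return_fn": {"cpu": 17, "tpm": 760, "fps": 670},
    },
    "0.98GHz": {
        "for_each": {"cpu": 43, "tpm": 225, "fps": 218},
        "return_fn": {"cpu": 25, "tpm": 475, "fps": 420},
    },
}


def return_fn_vs_for_each(readings: dict) -> dict:
    """For every clock speed, divide each Return Fn reading by the For Each reading."""
    result = {}
    for clock, tests in readings.items():
        result[clock] = {
            metric: tests["return_fn"][metric] / tests["for_each"][metric]
            for metric in ("cpu", "tpm", "fps")
        }
    return result


if __name__ == "__main__":
    for clock, ratio in return_fn_vs_for_each(READINGS).items():
        print(f"{clock:>14}: CPU x{ratio['cpu']:.2f}, TPM x{ratio['tpm']:.2f}, FPS x{ratio['fps']:.2f}")
```

Whatever the clock speed, Return Fn lands at roughly 2.1x to 2.25x the ticks per minute of For Each while using around 60% of the CPU, so the relative result barely moves even though the absolute numbers change a lot.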
The kicker? This is all absolutely pointless lol. Regardless of the CPU doing its thing and adapting, or a fixed CPU speed, I still have the same lesson I learned from when I whipped up the original c3p file in 5 minutes.
With my ignorance these past months, I measured mainly by comparing CPU, running 1 test at a time, re-running a few times to see if any sudden differences occur.
Every test I have done has benefited me in my C3 journey:
- Placing conditions in ideal ways
- Knowing when/when not to use conditions/actions
- Knowing the expensive actions
- Finding that "comparing longer strings can increase CPU insignificantly, but can add up quickly if reading many long strings in a large loop, so opt for a number or shorter string".
- Learning that getting a value from a deeply-nested JSON gets expensive quickly, not for 1 "Get" but observable in a loop - but keeping nests minimal with hundreds of keys is fine.
- JSON performs the worst in many read/write/iterate tests when compared with Array/Dictionary, but it is still powerful and an ideal way to store complex relational data.
- Understanding that "Pick by UID" or "Pick Child/Parent" is categorically the optimal way to pick an object, again especially in a long loop, where Instance Variables could eat away at CPU... (But now there's "Evaluate expression" to explore, which may render Instance Variables more ideal to use in a busy loop.)
These all feel accurate, and if any are inaccurate or I'm misremembering... I can just go and measure.