Cycle counting was key on the Spectrum: for obvious things like the tape-loading routines, but also for advanced techniques like the ‘Rainbow processor’. By updating the attribute bytes (the ones responsible for the infamous color clash) as each scan line was drawn, you could get different colors on each scan line.
I once made a tape-loading-like pattern, and tried to get it as stable (not moving up or down on screen) as possible.
I managed to produce a program where, with key presses, you could change the delay in the loop in +/- 1 clock cycle increments. Mind you: the fastest Z80 opcodes take 4 cycles.
How then? Well, there are also opcodes that take 5 cycles. Or 6. Or 7. And 8=2*4, 9=4+5, etc. The program just automated the insertion/removal of those in the inner loop. Of course I had to pick instructions that didn't mess with the Z80 registers in use.
Great fun (& educational) figuring out stuff like that. Fun times...
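For anyone curious, the arithmetic behind that trick can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original program: the names `delay_costs`, `delay_mnemonics` and `EXAMPLE_OPCODES` are made up here, and the example instructions (NOP = 4, RET cc with the condition false = 5, INC BC = 6, LD A,n = 7 cycles) are just one possible choice. Some of them do alter registers or depend on flags, so a real routine would pick whichever ones are safe in its loop.

```python
# Illustrative cycle (T-state) costs. The mnemonics are example choices only,
# not the instructions the original program actually used.
EXAMPLE_OPCODES = {
    4: "NOP",
    5: "RET cc (condition false)",
    6: "INC BC",
    7: "LD A,n",
}


def delay_costs(t_states):
    """Split a delay into instruction costs of 4, 5, 6 or 7 cycles."""
    if t_states < 4:
        raise ValueError("the shortest Z80 instruction takes 4 cycles")
    fours, remainder = divmod(t_states, 4)
    # Use (fours - 1) four-cycle instructions, then one instruction of
    # 4 + remainder cycles (4, 5, 6 or 7) to absorb whatever is left over.
    return [4] * (fours - 1) + [4 + remainder]


def delay_mnemonics(t_states):
    """Map each cost to its example mnemonic from EXAMPLE_OPCODES."""
    return [EXAMPLE_OPCODES[cost] for cost in delay_costs(t_states)]


if __name__ == "__main__":
    for n in (8, 9, 10):
        print(n, delay_costs(n), delay_mnemonics(n))
```

Stepping the target by one cycle only swaps the last instruction for the next-longer one (or, when it wraps from 7 back to 4, adds one more 4-cycle instruction), which is why any delay of 4 cycles or more is reachable in single-cycle steps.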
There was some game (and I think a program in a book) where the border color would be changed at a specific scan line to get a horizon that would span the entire screen.
I pretty much knew all the clock cycle counts for the instructions as a teenager, and you would code assembler with them always in mind.