For both alternatives we begin by computing how far the mouse has gone:
int m = abs(dx) + abs(dy); // Manhattan distance
For the single-pole RC exponential filter as WanderPanda suggested:
c -= c >> 5; // exponential decay without a multiply (not actually faster on most modern CPUs)
c += m;
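In case it helps to see that wrapped up as a function, here is a minimal sketch. The struct and function names (ema_state, ema_update) are made up for illustration; only the three lines of arithmetic come from the snippet above.

```c
#include <stdlib.h>

/* Hypothetical context struct; the names are illustrative only. */
struct ema_state {
    int c;  /* smoothed movement; settles near 32 * m under steady input m */
};

/* Feed one mouse event's deltas into the filter and return the new value. */
static int ema_update(struct ema_state *st, int dx, int dy)
{
    int m = abs(dx) + abs(dy);  /* Manhattan distance */
    st->c -= st->c >> 5;        /* decay by 1/32 of the current value */
    st->c += m;
    return st->c;
}
```

One quirk of the integer shift: once c drops below 32, c >> 5 is 0, so c stops decaying and can sit at a residue of up to 31 after the mouse stops. Any threshold you compare against should be comfortably above that.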
For the box filter with the running-sum table as nostrademons suggested:
s += m; // update running sum
size_t j = (i + 1) % n; // calculate index in prefix sum table to overwrite
int d = s - t[j]; // calculate sum of last n mouse movement Manhattan distances
t[j] = s;
i = j;
Here c, i, s, and t are all presumed to persist from one event to the next, so maybe they're part of some context struct, while in old-fashioned C they'd be static variables; t is an array of n ints that starts out zeroed. If n is a compile-time constant, this will be more efficient, especially if it's a power of 2, because then the compiler can turn the % into a bitwise AND. You don't really need a separate persistent s; that's an optimization nostrademons suggested, but you could instead replace the s += m line with a local s, at the cost of an extra array-indexing operation:
int s = t[i] + m;
Depending on context this might not actually cost any extra time.
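Here is a sketch of the box filter as a self-contained function, assuming the context-struct approach. The names box_state, box_update, and BOX_N are hypothetical, and so is the window length of 32. I've also swapped int for uint32_t, which is my own change: the running sum grows without bound, and unsigned arithmetic wraps around in a well-defined way, so the subtraction still gives the right answer after s overflows (as long as the sum over one window fits in 32 bits).

```c
#include <stdint.h>
#include <stdlib.h>

#define BOX_N 32  /* window length; hypothetical value, a power of 2 */

struct box_state {
    uint32_t t[BOX_N];  /* ring buffer of running sums; must start zeroed */
    uint32_t s;         /* running sum of every m so far (may wrap around) */
    size_t i;           /* index of the most recently written entry */
};

/* Feed one event and return the sum of m over the last BOX_N events. */
static uint32_t box_update(struct box_state *st, int dx, int dy)
{
    uint32_t m = (uint32_t)abs(dx) + (uint32_t)abs(dy);
    st->s += m;
    size_t j = (st->i + 1) % BOX_N;  /* oldest entry, about to be overwritten */
    uint32_t d = st->s - st->t[j];   /* modular subtraction survives wraparound */
    st->t[j] = st->s;
    st->i = j;
    return d;
}
```

Declaring the state as `struct box_state st = {0};` zeroes everything, which is all the initialization it needs.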
Once you've computed your smoothed mouse velocity in c or d, you compare it against some kind of predetermined threshold, or maybe apply a smoothstep to it to get the mouse pointer size.
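For the smoothstep option, something like the following might do. The smoothstep function is the usual cubic one from shader languages; the four constants and the pointer_size name are placeholders I made up, and you'd have to tune them by hand against whichever filter you picked (c and d are on different scales).

```c
/* Standard smoothstep: 0 below edge0, 1 above edge1, smooth cubic between. */
static float smoothstep(float edge0, float edge1, float x)
{
    float u = (x - edge0) / (edge1 - edge0);
    if (u < 0.0f) u = 0.0f;
    if (u > 1.0f) u = 1.0f;
    return u * u * (3.0f - 2.0f * u);
}

/* All four constants are hypothetical and need tuning. */
#define SPEED_LO       200.0f  /* below this, pointer stays at minimum size */
#define SPEED_HI      1000.0f  /* above this, pointer is at maximum size */
#define CURSOR_MIN_PX   16.0f
#define CURSOR_MAX_PX   64.0f

/* Map a smoothed velocity (c or d) to a pointer size in pixels. */
static float pointer_size(int v)
{
    float k = smoothstep(SPEED_LO, SPEED_HI, (float)v);
    return CURSOR_MIN_PX + (CURSOR_MAX_PX - CURSOR_MIN_PX) * k;
}
```

A plain threshold is just the degenerate case: compare v against a single constant and pick one of two sizes.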
Roughly I think WanderPanda's approach is about 12 RISCish CPU instructions, and nostrademons's approach is about 18 but works a lot better. Either way you're probably looking at about 4-8 clock cycles on one core per mouse movement, considerably less than actually drawing the mouse pointer (if you're doing it on the CPU, anyway).
Does that help?