9:48:16
stassats`
except sxtb seems to be slower than sxtw
10:11:47
stassats`
huh, it becomes slow when the input exceeds signed-byte-32
10:11:56
stassats`
like it can't predict a branch
10:13:05
stassats`
(loop for i from from to to do (setf z (= j 0))) seems to be twice as fast when J is 1
10:18:35
stassats`
true on another arm64 cpu
10:18:54
stassats`
slight difference on a POWER9 cpu, no difference on x86-64
10:45:29
stassats`
i guess that's because of CMOV
10:45:43
stassats`
gotta implement CMOV for arm64 then
15:33:10
stassats
it was pretty easy, probably around an hour
15:33:19
stassats
having three operand instructions is nice
15:41:01
stassats
gcc/clang are pretty aggressive with cmov, like computing two different things and then doing a cmov
15:44:11
stassats
clang simultaneously computes div and mul, but not two divs; gcc doesn't with one div, but two muls are ok
15:44:47
stassats
now implementing a thing like that is tricky, and tracking when and where it's a good idea is annoying
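
For reference, a minimal C sketch of the "compute two different things and then do a cmov" pattern discussed above. The function names `select_branchy` and `select_both` are hypothetical, chosen only for illustration, and whether a given compiler actually emits a conditional select (CSEL on arm64, CMOV on x86-64) for either form depends on the compiler, version, and flags, as the log itself notes. The divisor is assumed to be nonzero.

```c
/* Branchy form: only one of the two operations runs,
 * but the CPU has to predict which way the branch goes. */
long select_branchy(long x, long y, int cond)
{
    if (cond)
        return x * 3;
    return x / y;           /* assumes y != 0 */
}

/* If-converted form: both results are computed unconditionally,
 * then one is selected. A compiler may lower the final select to
 * CSEL/CMOV, trading extra work for the absence of a branch. */
long select_both(long x, long y, int cond)
{
    long a = x * 3;         /* cheap: multiply */
    long b = x / y;         /* expensive: divide, assumes y != 0 */
    return cond ? a : b;
}
```

Both functions return the same value for the same inputs. The second form only pays off when the speculatively computed operation is cheap relative to a mispredicted branch, which is presumably why the compilers treat a div differently from a mul.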