7:31:13
terrorjack4
** NICK terrorjack
9:22:56
stassats
looks like i can test for signed-byte-32 by sign extending and comparing with the original value
9:23:17
stassats
which is just a single instruction on arm64, CMP NL2, NL2, SXTW
9:33:37
stassats
x86-64 needs two instructions, MOVSX and CMP, but it's still more compact than comparing against two numbers
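[editor's note] A minimal C sketch of the check described above, set next to the usual two-bound comparison. The function names are hypothetical and not taken from the log. The narrowing cast to `int32_t` is implementation-defined in C before C23; the sketch assumes it wraps, as it does on gcc and clang. Whether a given compiler emits the single `CMP ..., SXTW` form for either version is not guaranteed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conventional range check: compare against both bounds. */
static bool fits_int32_range(int64_t x) {
    return x >= INT32_MIN && x <= INT32_MAX;
}

/* Sign-extension check: keep the low 32 bits, sign extend them back
   to 64 bits, and compare with the original value. The two are equal
   exactly when x is representable as a signed 32-bit integer.
   Assumes the narrowing cast wraps (true on gcc and clang). */
static bool fits_int32_sext(int64_t x) {
    return (int64_t)(int32_t)x == x;
}
```

Both functions should agree on every input; for example, `fits_int32_sext((int64_t)INT32_MAX + 1)` is false while `fits_int32_sext(INT32_MIN)` is true.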
9:33:55
stassats
even gcc doesn't know about that trick
9:34:01
stassats
i hope i didn't mess up the math, though
9:36:50
stassats
can be extended to 8-bit and 16-bit, but i wonder if any bit can be used by shifting first
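[editor's note] One way the "shifting first" idea could look in C, as a sketch only. The log does not confirm that this is the approach eventually used. `fits_signed_bits` is a hypothetical name. The sketch assumes that right-shifting a negative signed value is an arithmetic shift and that the unsigned-to-signed cast wraps, both of which hold on gcc and clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if x is representable as a signed integer
   of `bits` bits, for bits in 1..64. */
static bool fits_signed_bits(int64_t x, unsigned bits) {
    unsigned shift = 64 - bits;
    /* Shift left as unsigned (no overflow UB) to discard the high bits,
       then shift right as signed to sign extend from bit `bits - 1`. */
    int64_t round_trip = (int64_t)((uint64_t)x << shift) >> shift;
    return round_trip == x;
}
```

With `bits` set to 8, 16, or 32 this matches the sign-extend-and-compare test for those widths; other widths such as 20 work the same way.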
9:48:16
stassats`
except sxtb seems to be slower than sxtw
10:11:47
stassats`
huh, it becomes slow when the input exceeds signed-byte-32
10:11:56
stassats`
like it can't predict a branch
10:13:05
stassats`
(loop for i from from to to do (setf z (= j 0))) seems to be twice as fast when J is 1
10:18:35
stassats`
true on another arm64 cpu
10:18:54
stassats`
slight difference on a POWER9 cpu, no difference on x86-64
10:45:29
stassats`
i guess that's because of CMOV
10:45:43
stassats`
gotta implement CMOV for arm64 then
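[editor's note] A rough C illustration of the difference between producing a comparison result with a branch and producing it with a select, which is what CMOV (or a conditional select on arm64) allows. The constants and function names are hypothetical stand-ins, not values from any real runtime. A C compiler is free to turn either version into a conditional select on its own, so this only shows the two shapes of the computation, not what any particular compiler emits.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two objects a boolean result
   picks between (e.g. a "true" and a "false" constant). */
enum { FALSE_OBJ = 0x17, TRUE_OBJ = 0x4F };

/* Branchy shape: control flow decides which constant is returned. */
static uintptr_t select_branchy(int64_t j) {
    if (j == 0) {
        return TRUE_OBJ;
    }
    return FALSE_OBJ;
}

/* Branchless shape: build an all-ones or all-zeros mask from the
   comparison and use it to pick one constant, with no jump. */
static uintptr_t select_branchless(int64_t j) {
    uintptr_t mask = -(uintptr_t)(j == 0); /* all ones if j == 0, else 0 */
    return ((uintptr_t)TRUE_OBJ & mask) | ((uintptr_t)FALSE_OBJ & ~mask);
}
```

Both functions return the same value for every `j`; only the way the result is selected differs.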