Tuesday, 24th of November 2020, 13:23:02 UTC
14:47:12
pfdietz
It's interesting that the ML page on Wikipedia does not mention the connection to LCF (which I knew about, having been at Cornell around that time.)
23:28:28
|3b|
what's the most efficient sbcl-specific way to convert a (unsigned-byte 32) to (signed-byte 32) with same bits?
23:38:29
stassats
|3b|: SB-C::MASK-SIGNED-FIELD?
23:39:00
|3b|
and good user-level options?
23:40:31
stassats
we're not as good as gcc at detecting portable patterns
23:40:42
|3b|
i guess this would be in code using sb-kernel, so sb-c is probably ok too
23:42:29
|3b|
mask-signed-field seems to work, thanks
23:46:43
stassats
(logior x (- (mask-field (byte 1 31) x))) is not that bad
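The expression above is plain portable Common Lisp, so it can be checked outside SBCL. A minimal sketch, assuming the input really is an (unsigned-byte 32); the name ub32->sb32 is hypothetical and not part of SBCL:

```lisp
(defun ub32->sb32 (x)
  "Reinterpret the 32 bits of X as a two's-complement signed integer."
  (declare (type (unsigned-byte 32) x))
  ;; (mask-field (byte 1 31) x) keeps bit 31 in place: 0 or #x80000000.
  ;; Negating it gives an integer with bit 31 and every higher bit set,
  ;; so LOGIOR sign-extends X without touching its low 31 bits.
  (logior x (- (mask-field (byte 1 31) x))))

(ub32->sb32 #x7FFFFFFF) ; => 2147483647
(ub32->sb32 #xFFFFFFFF) ; => -1
(ub32->sb32 3240099840) ; => -1054867456
```

Whether a given compiler turns this into a single sign-extending move is a separate question, which is the point of the pattern-detection remarks that follow.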
23:48:33
stassats
let's agree you use this, and i motivate myself to finally add multi-combination transforms
23:49:07
|3b|
it's sbcl specific code already using internals, but i can use the logior if you like
23:49:30
|3b|
(fixing building floats from bits in float-features)
23:49:37
stassats
well, i don't really mind that
23:49:46
stassats
but i really want a portable pattern detector
23:50:13
stassats
like in the grown up compilers
23:50:32
|3b|
yeah, fast portable version would be nice for other code
23:50:59
stassats
it's marginally slower as it is
23:51:10
stassats
(haven't tried, but looking at the code)
0:04:05
stassats
yeah, the performance difference is very minimal
0:04:53
stassats
MOVSX also has to untag/retag for fixnums
0:10:36
stassats
|3b|: but if you're building floats, maybe you need something else?
0:11:55
|3b|
i have an (unsigned-byte 32) and want a single-float (and similarly for doubles)
0:12:04
|3b|
but sbcl expects a (signed-byte 32)
0:12:14
|3b|
in sb-kernel:make-single-float
0:14:33
stassats
what about some old fashioned lying to the compiler
0:14:46
stassats
(defun foo (x) (declare ((unsigned-byte 32) x) (optimize speed)) (sb-kernel:make-single-float (truly-the (signed-byte 32) x)))
0:15:01
stassats
(foo 3240099840) => -10.0
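The -10.0 result can be confirmed without SBCL internals by decoding the IEEE 754 single-precision fields by hand. A rough sketch, handling normal numbers only (no zeros, denormals, infinities or NaNs) and assuming the implementation's single-float is IEEE binary32; decode-single-bits is a hypothetical name:

```lisp
(defun decode-single-bits (bits)
  "Decode BITS, an (unsigned-byte 32), as a normal IEEE 754 binary32 value."
  (declare (type (unsigned-byte 32) bits))
  (let ((sign     (ldb (byte 1 31) bits))
        (exponent (ldb (byte 8 23) bits))
        (fraction (ldb (byte 23 0) bits)))
    ;; Exponent fields 0 and 255 encode zeros/denormals and inf/NaN,
    ;; which this sketch deliberately does not handle.
    (assert (< 0 exponent 255) () "only normal numbers are supported")
    ;; Restore the implicit leading 1, then scale by the unbiased
    ;; exponent minus the 23 fraction bits: exponent - 127 - 23.
    (let ((magnitude (scale-float (float (logior fraction (ash 1 23)) 1f0)
                                  (- exponent 150))))
      (if (zerop sign) magnitude (- magnitude)))))

(decode-single-bits 3240099840) ; => -10.0
```

Here 3240099840 is #xC1200000: sign bit 1, exponent field 130, fraction #x200000, which gives -(1.25 * 2^3).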
0:17:08
|3b|
:) i think i like using internals better than lying
0:17:22
stassats
that's internals too
0:17:34
stassats
you want it to be as fast as possible?
0:20:43
stassats
but do you want it to be boxed?
0:21:04
stassats
sb-kernel:make-single-float has no boxed-result vop
0:21:21
stassats
which ought to be just tagging
0:23:26
|3b|
yeah, i guess truly-the looks a bit simpler, assuming i know it is ub32 to start with
0:23:50
|3b|
which it looks like i don't
0:24:10
stassats
(sb-kernel:%make-lisp-obj (logior (ash x 32) sb-vm:single-float-widetag)) for tagging ;; don't use it
0:26:11
|3b|
with a type check, mask-signed-field and truly-the look about the same
0:26:55
|3b|
(and without a type check, mask-signed-field is a full call, so i probably want the check)
0:27:12
stassats
what about (sb-kernel:%make-lisp-obj (logior (ash x 32) sb-vm:single-float-widetag)) ?
0:30:25
stassats
that assumes it's going to be tagged, not stuffed into an array, although stuffing into an array should also not need FP instructions
0:30:42
|3b|
i think that's actually worse, since the function has an ftype to return single-float and it can't tell make-lisp-obj returns a single-float
0:30:45
stassats
only for float operations
0:31:05
stassats
|3b|: what about truly-the single-float around it?
0:32:07
|3b|
skips a movd xmm0,edx + movd edx,xmm0
0:32:19
stassats
ok, don't use that actual form, but that's what make-single-float should expand into in that case
0:32:43
|3b|
ACTION hasn't actually tested to see if it works though
0:32:57
stassats
tried, it's slightly faster
0:35:38
|3b|
ok, i'll go with (sb-kernel:make-single-float (sb-c::mask-signed-field 32 (the (unsigned-byte 32) bits)))
0:35:54
|3b|
(sb-kernel:make-single-float (sb-ext:truly-the (signed-byte 32) (the (unsigned-byte 32) bits))) is a bit ugly (and longer) :)
0:36:41
|3b|
though possibly should just suggest the ftype specify ub32 and skip THE
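The ftype variant mentioned above might look roughly like the following. This is an SBCL-only sketch built from the forms quoted in the conversation; bits->single is a hypothetical name, and sb-kernel:make-single-float and sb-c::mask-signed-field are internals that may change between releases:

```lisp
#+sbcl
(progn
  ;; Declaring the argument type up front replaces the inline
  ;; (the (unsigned-byte 32) bits) form.
  (declaim (inline bits->single)
           (ftype (function ((unsigned-byte 32)) single-float) bits->single))
  (defun bits->single (bits)
    ;; mask-signed-field reinterprets the low 32 bits as a signed integer,
    ;; the argument type make-single-float expects.
    (sb-kernel:make-single-float (sb-c::mask-signed-field 32 bits))))
```

With this definition, (bits->single 3240099840) should give -10.0, matching the truly-the example earlier in the log.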
0:39:58
stassats
ok, make-single-float to tagged single-float is used within sbcl itself, so i'll optimize that case
0:52:04
stassats
ok, i have a sb-kernel:make-single-float variant that takes tagged fixnums and returns tagged floats, without untagging
0:52:10
stassats
and using float instructions
0:55:01
stassats
https://gist.github.com/stassats/af577e4ff0b117a47801548ef31ce6f9
0:55:32
stassats
lying to the compiler plus new vop vs m-s-f + m-s-f
0:58:33
|3b|
can it not figure out that mask-signed-field 32 returns sb32?
0:58:43
|3b|
or is it because it returns untagged ub32?
1:00:13
stassats
it can figure that out, but that doesn't help it to get rid of mask-signed-field itself
1:01:02
stassats
the peephole pass could detect MOVSX RDX, EDX; MOVD XMM0, EDX
1:01:27
stassats
but then that leaves the initial untagging, which is too late to eliminate
1:01:51
|3b|
ok, will switch to truly-the instead of mask-signed-field then
1:02:06
stassats
well, maybe i'll make it detect mask-signed-field
1:02:32
stassats
mask-signed-field kinda feels safer, although on x86-64, the truly is going to work
1:02:38
|3b|
or just add sb-ext:ub32->single-float or something :)
1:02:52
stassats
won't help with older versions
1:03:20
|3b|
ok, i'll leave it with mask-signed-field for now
1:03:24
stassats
well, it's just an exercise in squeezing the most out of this
1:03:39
|3b|
should still be a lot faster than ieee-floats :)
1:03:41
stassats
but in reality, if you want fast, you'll avoid a full-call
Wednesday, 25th of November 2020, 1:23:02 UTC