libera/#sbcl - IRC Chatlog
3:08:20
pf3
hello, i'm engaged in a casual optimization golf. i've made the following (non-portable, and otherwise bad) code http://okturing.com/src/14959/body not cons, except for the last multiply, which insists on allocating a bignum. is there any way to trick python here to do a uint64*uint64->uint64 multiply?
3:09:54
hayley
I think (ldb (byte 64 0) (* ...)) should hint that you only care about the low 64 bits. But you will likely need to inline XORSHIFT1024, so that the return value may be passed as an unboxed value back to the caller.
3:12:20
pf3
aah, that's what it is, the return value. now i understand, and the note i saw at some point while trying to muck it also makes sense. thank you
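[Editor's sketch] The two suggestions above (mask the product with LDB, and inline the function so the result does not have to be returned boxed) might look roughly like this. The names MUL64 and SQUARE-MOD-64 are hypothetical and are not taken from the linked paste. Whether a given SBCL version compiles this without allocating is an assumption to check against the compiler notes, not something the code guarantees.

```lisp
;; Hypothetical helper: multiply two 64-bit words modulo 2^64.
;; Declaring it INLINE lets the compiler keep the result unboxed in the caller.
(declaim (inline mul64))
(defun mul64 (a b)
  (declare (type (unsigned-byte 64) a b)
           (optimize (speed 3)))
  ;; LDB keeps only the low 64 bits, which tells the compiler that the
  ;; full-width product is never needed.
  (ldb (byte 64 0) (* a b)))

;; Hypothetical caller, only here to show MUL64 being used inline.
(defun square-mod-64 (x)
  (declare (type (unsigned-byte 64) x))
  (mul64 x x))
```

The result is the same on any conforming implementation; only the allocation behaviour depends on the compiler.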
3:27:59
pf3
oh, is this because a 64-bit value doesn't have space for a tag on a 64-bit system, so it has to be boxed?
3:30:24
hayley
Right. If the return value exceeds the range of a fixnum, SBCL needs to allocate a bignum to store the value.
3:32:10
pf3
right, dropping the number of bits makes it work as is. now it makes sense, in my mind (unsigned-byte 64) was immediate, but that's obviously wrong.
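[Editor's sketch] The fixnum limit discussed above can be inspected from the REPL. The helper name FITS-IN-FIXNUM-P is hypothetical. The exact fixnum width is implementation-dependent, so no specific output is claimed here.

```lisp
;; Hypothetical helper: true when N can be stored as an immediate (tagged)
;; fixnum rather than a heap-allocated bignum.
(defun fits-in-fixnum-p (n)
  (typep n 'fixnum))

;; Number of value bits available in a positive fixnum on this implementation.
(defun fixnum-value-bits ()
  (integer-length most-positive-fixnum))

;; Try these at the REPL:
;;   (fixnum-value-bits)
;;   (fits-in-fixnum-p (1- (expt 2 64)))
```

Whenever (fixnum-value-bits) is below 64, some (unsigned-byte 64) values do not fit in a fixnum and must be boxed when they cross a function boundary.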
5:50:25
moon-child
st something to the effect of (progn (assert (< 0 i (+ i 10) (length array))) bunch of stuff with (aref array (+ i something between 0 and 10)))
5:51:18
hayley
I don't recall that being elided, as something similar is done manually in one-more-re-nightmare.
5:56:29
mfiano
moon-child: You can disable bounds checking for a particular body of code if that is what you are asking
5:58:04
hayley
The issue is if the compiler can automatically prove that bounds checks aren't necessary, and remove them itself.
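[Editor's sketch] The manual approach mentioned above (one explicit range check up front, then accesses with checking switched off for that body) could be written as follows. SUM-WINDOW and the window size of 4 are hypothetical choices for illustration. Using (safety 0) is one portable way to ask for unchecked accesses; it is an assumption here that the implementation honours it by dropping the per-access checks.

```lisp
(defun sum-window (array i)
  "Sum ARRAY[i] through ARRAY[i+3] after a single range check."
  (declare (type simple-vector array)
           (type fixnum i))
  ;; One explicit check that covers every index used below.
  (assert (and (<= 0 i)
               (< (+ i 3) (length array))))
  ;; Inside this body the compiler is allowed to omit per-access bounds checks.
  (locally (declare (optimize (safety 0)))
    (+ (aref array i)
       (aref array (+ i 1))
       (aref array (+ i 2))
       (aref array (+ i 3)))))
```

For example, (sum-window (vector 1 2 3 4 5) 1) adds the elements 2, 3, 4 and 5, while an index that does not leave room for the whole window fails the ASSERT before any access happens.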
6:00:26
mfiano
I wouldn't want that. I went through hell in Julia because of that. My CPU's FPU resources were bottlenecking out because of some bounds checks the compiler thought would be faster to remove.
6:01:17
mfiano
order of magnitude difference with bounds checking on for 1 particular array access :)
6:02:47
mfiano
I used llvm-mca to figure out why forcing an array to bounds check was MUCH faster than the compiler eliding the check for me.
6:02:58
moon-child
it was faster to have some bounds check than to not have it? Sounds like a separate issue which that simply revealed
6:04:52
moon-child
it is strictly more work to perform a bounds check than to not. At _most_, I would perhaps expect a few % difference from scheduling if you get unlucky. And I would not expect llvm-mca to catch that
6:04:57
mfiano
It was very short code, and on several occurrences it was faster to remove the elision of one or a couple of bounds checks
6:05:26
mfiano
Thing is, if one resource is topped out, you suffer, and my CPU has a ton of resources just for FPU alone
6:05:59
moon-child
if you are unlucky, you can get screwed over by scheduling, yes. That is not at all, not even a little bit, an argument that we should not elide bounds checks
6:06:06
mfiano
Good code will try to balance a particular piece of hardware's resources, not just assume it will run optimally by making the code optimal
11:32:32
scymtym
i'm pretty sure SBCL's constraint system can propagate /some/ information regarding indices always being within array bounds. see ARRAY-IN-BOUNDS-P in compiler/constraint.lisp