freenode/#clasp - IRC Chatlog
13:35:51
drmeister
I'm making more progress on the GCing of code and multiple entry points and image save/load.
13:42:17
drmeister
After loading the image I'll walk memory, and for each ObjectFile_O I'll add it to the LLJIT and it will again allocate memory in the AMS pool.
13:43:06
drmeister
Image save just needs to write out the literals that are in the AMS pool and replace them after the code is recreated in the AMS pool.
13:43:34
drmeister
I can choose to have a simple vector of literals or I could get fancy and put the literals directly into the code and have a table of offsets for fixup.
13:43:57
drmeister
So, here is a question, how much better would it be to put the literals directly into the code?
13:44:47
drmeister
I think putting them directly into the code would give the ultimate in performance. It's what sbcl does.
13:48:56
SAL9000
drmeister: when you say "literals directly in the code", do you mean assembler immediates, or jump-over-data?
13:51:14
drmeister
To be very precise I need to generate some examples and show you what I mean in terms of the machine instructions. But rather than read a word out of a vector into a register using RIP-relative addressing we would simply load an 8-byte value into the register. The 8-byte value would be part of the instruction and would be a fixable pointer, directly in the code.
13:52:08
drmeister
It would look like a MOV of an 8-byte constant into a register. The constant part of the instruction IS the fixable pointer.
13:53:06
drmeister
Then I would have a vector of offsets into the code where these fixable pointers live and obj_scan would read, fix and write them back as the MPS moves things around.
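The read, fix, write-back loop described above can be sketched roughly as follows. This is only an illustration, not Clasp's obj_scan or the MPS API: `CodeBlock`, `scanCodeLiterals`, and `makeMovabsBlock` are hypothetical names, the `fix` callback stands in for whatever the collector does to relocate a pointer, and the toy code block assumes a little-endian x86-64 host where `movabs rax, imm64` is encoded as `48 B8` followed by the 8-byte immediate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a block of JITted code plus its fixup table.
struct CodeBlock {
    std::vector<std::uint8_t> bytes;          // machine code, including 8-byte immediates
    std::vector<std::size_t> literalOffsets;  // byte offset of each embedded pointer
};

// Read, fix, and write back every pointer embedded in the code.
// `fix` maps an old address to a new one, as a moving collector would.
template <typename Fix>
void scanCodeLiterals(CodeBlock& code, Fix fix) {
    for (std::size_t off : code.literalOffsets) {
        assert(off + sizeof(std::uint64_t) <= code.bytes.size());
        std::uint64_t ptr;
        // memcpy, because an immediate inside an instruction is not 8-byte aligned.
        std::memcpy(&ptr, code.bytes.data() + off, sizeof ptr);
        ptr = fix(ptr);
        std::memcpy(code.bytes.data() + off, &ptr, sizeof ptr);
    }
}

// Build a toy block: `movabs rax, imm64` (48 B8 + 8 bytes) followed by `ret` (C3).
CodeBlock makeMovabsBlock(std::uint64_t literal) {
    CodeBlock code;
    code.bytes = {0x48, 0xB8};
    code.literalOffsets.push_back(code.bytes.size());  // the immediate starts at offset 2
    code.bytes.resize(code.bytes.size() + sizeof literal);
    std::memcpy(code.bytes.data() + 2, &literal, sizeof literal);
    code.bytes.push_back(0xC3);
    return code;
}
```

Compared with a separate vector of literals, the only per-object metadata here is the offset table; the cost is that the scanner has to write into the code itself.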
13:59:44
SAL9000
drmeister: looks like we're using a name->object map to track "sidecar" data through LLVM, then iterate through the relocation section and match the names
14:07:32
drmeister
With the image save I'll look through the rest of the words in every object for things that look like pointers but point into C++ malloc memory.
14:08:17
drmeister
Then we can track down any remaining C++ objects that need to be handled - there may still be a few.
14:09:38
drmeister
Through a combination of closing and cleaning up things like streams and moving things into GC memory we can totally implement image save/load.
14:10:02
drmeister
Then we can really clean things up and get rid of, for instance, the optimizations that were only there to speed up startup.
14:18:50
Bike
oh i guess gmp's random stuff is independent of whether you're doing mpz or mpn or what, actually
14:28:27
drmeister
But I think there was a problem with GMP that prevented me from using their random number stuff.
14:29:38
drmeister
It's not the painful dependency it was back in the old days - but it's still a PITA.
14:33:08
drmeister
Damn, LLVM12 is forcing us to provide the function-type when we call or invoke a function.
14:33:57
drmeister
I think adding an argument to all of the functions for setting up a call will need the function-type (sigh).
14:34:32
drmeister
They are moving to this opaque pointer type convention where pointers won't have types anymore.
14:35:20
drmeister
Bitcasts don't have any impact on the code - but generating bitcast instructions and then llvm handling them only for them to disappear is stupid.
14:52:23
drmeister
I wonder what ECL does. They use GMP for other stuff and they certainly don't use boost.
14:56:11
drmeister
Now that we are working with vectors of limbs the activation barrier has been lowered.
15:01:16
Bike
we could, but gmp has a lot of specialized assembly code to do things as fast as possible that we would not be able to write or maintain an equivalent of
15:02:25
Bike
i tried reading the papers behind some of the algorithms and am pretty sure it would take me a while to even understand them
15:02:36
drmeister
Is there any way to get around the random-state problem with GMP? Some way to initialize a GMP random state from a number and then we use that number as the random-state?
15:03:19
Bike
also, it seems like we're not doing this correctly even right now. we can dump and load random states, but can't write and read them
15:07:23
drmeister
Ugh - let's ditch it. Look at the ECL code. They generate limbs directly from their random number generator.
15:17:54
Bike
oh, mpn_random just generates a number of random limbs anyway, whatever clamps it to a range is in mpz
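Generating limbs directly from a seedable generator, with the clamping done as a separate step, might look something like the sketch below. It is not ECL's code and does not use GMP: `randomLimbs` is a hypothetical helper, `std::mt19937_64` is just a convenient standard generator, and 64-bit limbs are assumed. Here the clamp is a simple mask down to a bit count; clamping to an arbitrary range would need more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Fill a limb vector (least significant limb first) with `bits` random bits.
std::vector<std::uint64_t> randomLimbs(std::mt19937_64& rng, std::size_t bits) {
    assert(bits > 0);
    std::size_t n = (bits + 63) / 64;            // limbs needed to hold `bits` bits
    std::vector<std::uint64_t> limbs(n);
    for (auto& limb : limbs) limb = rng();       // one generator call per limb
    std::size_t topBits = bits % 64;
    if (topBits != 0)                            // mask off the excess high bits
        limbs.back() &= (std::uint64_t(1) << topBits) - 1;
    return limbs;
}
```

Because the generator is seeded from a plain integer, two generators built from the same seed produce the same limbs, which is the kind of "random state from a number" behaviour asked about above.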
15:18:02
drmeister
I need backtraces again for interpreted code. The llvmtot branch is breaking in the interpreter. Bleh.
17:45:58
Bike
that would kind of entail putting temporary storage on the heap, though, and that's kind of a weird issue in itself...
17:47:07
Bike
this time was in the middle of gc code, i also saw it in contagen_mul, and in a hash table lookup for a symbol
17:49:26
Bike
i shouldn't be using THAT much stack space though... (expt 1-word-bignum 5) should end up using, like, 32 words or something
19:14:44
Bike
a bunch of next-mul calls are (sometimes) enough to do it, but multiplying next-bignums is a really simple function. like twenty lines
19:15:50
Bike
and all that's being used are variable length arrays, it's like i'm doing something funny with alloca
19:22:16
Bike
i guess if it is a stack problem i can switch all the stack allocations to allocating a gcvec or something. kind of sucks though.
20:15:48
Bike
i'm crashing with only positive bignums so it's probably not the gc choking on negative sizes
20:30:08
Bike
and then you can run e.g. (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))) until it dies
20:32:23
Bike
um, clasp? i'm not sure what you're asking. i build with cando in but i don't think that matters
20:42:18
Bike
it builds fine. everything still uses the old bignum arithmetic, it's just that the next-bignum arithmetic is also available.
21:08:26
drmeister
(loop repeat 10000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))))
21:09:44
drmeister
What does this do? advance to the first bignum after most-positive-fixnum and then multiply by 0,1,2...19?
21:10:54
drmeister
I'd worry about writing past the end of the object. Try turning on guards and I'll find the function that checks the guards.
21:12:15
drmeister
Yeah - but other than that what could go wrong? There are no illegal values for the limb vector - right?
21:14:28
Bike
https://github.com/clasp-developers/clasp/blob/bignum/src/core/bignum.cc#L447-L465 as you can see, it allocates result_size limbs on the stack, then passes that (or one less) in as initial contents
21:18:25
drmeister
What about the mpn_ functions? What if they write outside the vector on the stack?
21:20:26
Bike
the requirements for mpn_mul are that neither size input is zero, the destination size is the sum of the input sizes, and the left size is at least as big as the right size.
21:25:01
drmeister
I put two extra elements in the array on the stack and wrote 0xccCCccCCccCCccCC into the first and last one and I'm writing into result_limbs+1
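A rough sketch of that guard-word experiment, under several assumptions: `mulLimbs` is a plain schoolbook multiply standing in for mpn_mul (it is not GMP code, only written to the same size contract quoted above), limbs are 32 bits so each partial product fits in a `uint64_t`, the guard value is the 32-bit analogue of 0xccCCccCCccCCccCC, and the buffer is a `std::vector` where the real code uses a stack array. `mulWithGuards` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Limb = std::uint32_t;            // assumption: 32-bit limbs
constexpr Limb kGuard = 0xCCCCCCCCu;   // sentinel written on both sides of the result

// Schoolbook multiply with mpn_mul's size contract:
// an >= bn >= 1, and dst has room for exactly an + bn limbs.
void mulLimbs(Limb* dst, const Limb* a, std::size_t an, const Limb* b, std::size_t bn) {
    assert(bn >= 1 && an >= bn);
    for (std::size_t i = 0; i < an + bn; ++i) dst[i] = 0;
    for (std::size_t j = 0; j < bn; ++j) {
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i < an; ++i) {
            // a[i]*b[j] + dst + carry is at most 2^64 - 1, so it cannot overflow.
            std::uint64_t t = std::uint64_t(a[i]) * b[j] + dst[i + j] + carry;
            dst[i + j] = Limb(t);
            carry = t >> 32;
        }
        dst[j + an] = Limb(carry);
    }
}

// Multiply into result_limbs + 1 of a buffer with one guard limb on each side.
// Returns true if both guards are still intact afterwards.
bool mulWithGuards(const std::vector<Limb>& a, const std::vector<Limb>& b,
                   std::vector<Limb>& out) {
    std::size_t resultSize = a.size() + b.size();
    std::vector<Limb> buf(resultSize + 2);
    buf.front() = kGuard;
    buf.back() = kGuard;
    mulLimbs(buf.data() + 1, a.data(), a.size(), b.data(), b.size());
    out.assign(buf.begin() + 1, buf.end() - 1);
    return buf.front() == kGuard && buf.back() == kGuard;
}
```

If the multiply ever wrote one limb before or after its destination, the corresponding guard would be overwritten and `mulWithGuards` would return false. Intact guards only rule out that particular kind of overrun, not stack exhaustion or writes further away.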
21:29:52
Bike
in the function I linked, result_size is the sum of two absolute values, so it's positive
21:40:46
drmeister
(loop repeat 10000000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 do (core:next-mul x x))))