freenode/#clasp - IRC Chatlog
13:48:56
SAL9000
drmeister: when you say "literals directly in the code", do you mean assembler immediates, or jump-over-data?
13:51:14
drmeister
To be very precise I need to generate some examples and show you what I mean in terms of the machine instructions. But rather than read a word out of a vector into a register using RIP-relative addressing we would simply load an 8-byte value into the register. The 8-byte value would be part of the instruction and would be a fixable pointer, directly in the code.
13:52:08
drmeister
It would look like a MOV of an 8-byte constant into a register. The constant part of the instruction IS the fixable pointer.
13:53:06
drmeister
Then I would have a vector of offsets into the code where these fixable pointers live and obj_scan would read, fix and write them back as the MPS moves things around.
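[Editor's sketch] The scheme drmeister describes (8-byte pointers embedded as instruction immediates, plus a vector of offsets that the scanner walks) could look roughly like the following. This is only an illustration under assumptions: CodeBlob, fix_embedded_pointers and read_immediate are hypothetical names, not Clasp or MPS APIs, and the "fix" step is passed in as a callback instead of being a real MPS fix operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a chunk of generated machine code.
struct CodeBlob {
  std::vector<std::uint8_t> bytes;          // raw instruction bytes
  std::vector<std::size_t> pointer_offsets; // start of each 8-byte immediate
};

// Read one embedded 8-byte immediate. memcpy is used because an
// immediate inside an instruction is generally not 8-byte aligned.
std::uint64_t read_immediate(const CodeBlob& blob, std::size_t off) {
  assert(off + sizeof(std::uint64_t) <= blob.bytes.size());
  std::uint64_t value;
  std::memcpy(&value, blob.bytes.data() + off, sizeof value);
  return value;
}

// Read, fix and write back every embedded pointer. `fix` maps an old
// address to the address the collector moved the object to.
template <typename Fix>
void fix_embedded_pointers(CodeBlob& blob, Fix fix) {
  for (std::size_t off : blob.pointer_offsets) {
    std::uint64_t value = fix(read_immediate(blob, off));
    std::memcpy(blob.bytes.data() + off, &value, sizeof value);
  }
}
```

Only the bytes named in pointer_offsets are touched, so the opcode bytes around each immediate stay as the compiler emitted them.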
13:59:44
SAL9000
drmeister: looks like we're using a name->object map to track "sidecar" data through LLVM, then iterate through the relocation section and match the names
14:07:32
drmeister
With the image save I'll look through the rest of the words in every object for things that look like pointers but point into C++ malloc memory.
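[Editor's sketch] A conservative check of the kind described here, looking at each word of an object and flagging the ones that fall inside known malloc regions, might be written as below. The AddressRange type and the idea of having a list of malloc ranges available are assumptions made for the example; the log does not say how Clasp would learn those ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open address range [begin, end). Hypothetical type for this sketch.
struct AddressRange {
  std::uintptr_t begin;
  std::uintptr_t end;
};

// True if the word, read as an address, falls inside any of the ranges.
bool in_ranges(std::uintptr_t word, const std::vector<AddressRange>& ranges) {
  for (const AddressRange& r : ranges) {
    assert(r.begin <= r.end);
    if (word >= r.begin && word < r.end) return true;
  }
  return false;
}

// Return the indices of the words that look like pointers into malloc memory.
std::vector<std::size_t> suspicious_words(
    const std::uintptr_t* words, std::size_t count,
    const std::vector<AddressRange>& malloc_ranges) {
  std::vector<std::size_t> hits;
  for (std::size_t i = 0; i < count; ++i) {
    if (in_ranges(words[i], malloc_ranges)) hits.push_back(i);
  }
  return hits;
}
```

A hit is only a hint: an integer that happens to fall in a range is reported too, which is acceptable for a tool meant to track down leftovers by hand.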
14:08:17
drmeister
Then we can track down any remaining C++ objects that need to be handled - there may still be a few.
14:09:38
drmeister
Through a combination of closing and cleaning up things like streams and moving things into GC memory we can totally implement image save/load.
14:10:02
drmeister
Then we really clean things up and get rid of, for instance, the optimizations that speed up startup.
14:18:50
Bike
oh i guess gmp's random stuff is independent of whether you're doing mpz or mpn or what, actually
14:28:27
drmeister
But I think there was a problem with GMP that prevented me from using their random number stuff.
14:29:38
drmeister
It's not the painful dependency it was back in the old days - but it's still a PITA.
14:33:08
drmeister
Damn, LLVM12 is forcing us to provide the function-type when we call or invoke a function.
14:33:57
drmeister
I think all of the functions for setting up a call will need an extra argument for the function-type (sigh).
14:34:32
drmeister
They are moving to this opaque pointer type convention where pointers won't have types anymore.
14:35:20
drmeister
Bitcasts don't have any impact on the code - but generating bitcast instructions and then llvm handling them only for them to disappear is stupid.
14:52:23
drmeister
I wonder what ECL does. They use GMP for other stuff and they certainly don't use boost.
14:56:11
drmeister
Now that we are working with vectors of limbs the activation barrier has been lowered.
15:01:16
Bike
we could, but gmp has a lot of specialized assembly code to do things as fast as possible that we would not be able to write or maintain an equivalent of
15:02:25
Bike
i tried reading the papers behind some of the algorithms and am pretty sure it would take me a while to even understand them
15:02:36
drmeister
Is there any way to get around the random-state problem with GMP? Some way to initialize a GMP random state from a number and then we use that number as the random-state?
15:03:19
Bike
also, it seems like we're not doing this correctly even right now. we can dump and load random states, but can't write and read them
15:07:23
drmeister
Ugh - let's ditch it. Look at the ECL code. They generate limbs directly from their random number generator.
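[Editor's sketch] Generating limbs straight from a random number generator, with a state that can be seeded from a number and also written and read as text, could look like this. It uses std::mt19937_64 from the C++ standard library purely as a stand-in; the log does not say which generator ECL or Clasp uses, and random_limbs, write_state and read_state are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Stand-in random state; an assumption, not the generator Clasp uses.
using RandomState = std::mt19937_64;

// Fill n 64-bit limbs directly from the generator.
std::vector<std::uint64_t> random_limbs(RandomState& state, std::size_t n) {
  assert(n > 0);
  std::vector<std::uint64_t> limbs(n);
  for (std::uint64_t& l : limbs) l = state();
  return limbs;
}

// Standard engines support stream insertion and extraction, so the
// state can be written out as text and read back later.
std::string write_state(const RandomState& state) {
  std::ostringstream out;
  out << state;
  return out.str();
}

RandomState read_state(const std::string& text) {
  std::istringstream in(text);
  RandomState state;
  in >> state;
  return state;
}
```

Clamping the limbs to a requested range is left out here; as noted in the discussion, that step is separate from producing the raw limbs.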
15:17:54
Bike
oh, mpn_random just generates a number of random limbs anyway, whatever clamps it to a range is in mpz
15:18:02
drmeister
I need backtraces again for interpreted code. The llvmtot branch is breaking in the interpreter. Bleh.
17:45:58
Bike
that would kind of entail that temporary storage should be on the heap, though, and that's kind of a weird issue in itself...
17:47:07
Bike
this time was in the middle of gc code, i also saw it in contagion_mul, and in a hash table lookup for a symbol
17:49:26
Bike
i shouldn't be using THAT much stack space though... (expt 1-word-bignum 5) should end up using, like, 32 words or something
19:14:44
Bike
a bunch of next-mul calls are (sometimes) enough to do it, but multiplying next-bignums is a really simple function. like twenty lines
19:15:50
Bike
and all that's being used are variable length arrays, it's like i'm doing something funny with alloca
19:22:16
Bike
i guess if it is a stack problem i can switch all the stack allocations to allocating a gcvec or something. kind of sucks though.
20:15:48
Bike
i'm crashing with only positive bignums so it's probably not the gc choking on negative sizes
20:30:08
Bike
and then you can run e.g. (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))) until it dies
20:32:23
Bike
um, clasp? i'm not sure what you're asking. i build with cando in but i don't think that matters
20:42:18
Bike
it builds fine. everything still uses the old bignum arithmetic, it's just that the next-bignum arithmetic is also available.
21:08:26
drmeister
(loop repeat 10000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))))
21:09:44
drmeister
What does this do? advance to the first bignum after most-positive-fixnum and then multiply by 0,1,2...19?
21:10:54
drmeister
I'd worry about writing past the end of the object. Try turning on guards and I'll find the function that checks the guards.
21:12:15
drmeister
Yeah - but other than that what could go wrong? There are no illegal values for the limb vector - right?
21:14:28
Bike
https://github.com/clasp-developers/clasp/blob/bignum/src/core/bignum.cc#L447-L465 as you can see, it allocates result_size limbs on the stack, then passes that (or one less) in as initial contents
21:18:25
drmeister
What about the mpn_ functions? What if they write outside the vector on the stack?
21:20:26
Bike
the requirements for mpn_mul are that neither size input is zero, the destination size is the sum of the input sizes, and the left size is at least as big as the right size.
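[Editor's sketch] The calling contract Bike lists (no zero sizes, left operand at least as long as the right, destination holding the sum of the two sizes) can be illustrated with a plain schoolbook multiply. This is NOT GMP's implementation and does not call GMP; limbs_mul is a hypothetical name, and 32-bit limbs are an assumption made to keep the example portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using limb = std::uint32_t; // assumption: 32-bit limbs for portability

// Schoolbook multiply, least significant limb first.
// dst must have room for exactly an + bn limbs and must not overlap a or b.
void limbs_mul(limb* dst, const limb* a, std::size_t an,
               const limb* b, std::size_t bn) {
  assert(bn >= 1);  // neither size may be zero
  assert(an >= bn); // left operand at least as long as the right
  for (std::size_t i = 0; i < an + bn; ++i) dst[i] = 0;
  for (std::size_t j = 0; j < bn; ++j) {
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < an; ++i) {
      // Cannot overflow 64 bits: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1.
      std::uint64_t t = std::uint64_t(a[i]) * b[j] + dst[i + j] + carry;
      dst[i + j] = limb(t);
      carry = t >> 32;
    }
    dst[j + an] = limb(carry); // highest limb written: index an + bn - 1
  }
}
```

The highest index written is an + bn - 1, so a destination of exactly an + bn limbs is never overrun as long as the preconditions hold.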
21:25:01
drmeister
I put two extra elements in the array on the stack and wrote 0xccCCccCCccCCccCC into the first and last one and I'm writing into result_limbs+1
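[Editor's sketch] The guard-word trick described here, two extra elements holding 0xccCCccCCccCCccCC with the real data starting one element in, might be packaged like this. GuardedLimbs is a hypothetical helper, and it uses a std::vector in place of the stack array from the log so that the example stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kGuard = 0xccCCccCCccCCccCCull;

// n payload limbs with one guard word before and one after them.
struct GuardedLimbs {
  std::vector<std::uint64_t> storage;

  explicit GuardedLimbs(std::size_t n) : storage(n + 2, 0) {
    storage.front() = kGuard;
    storage.back() = kGuard;
  }

  // Corresponds to result_limbs+1 in the log: skip the leading guard.
  std::uint64_t* payload() { return storage.data() + 1; }

  std::size_t size() const { return storage.size() - 2; }

  // False if either guard word was overwritten.
  bool guards_intact() const {
    assert(storage.size() >= 2);
    return storage.front() == kGuard && storage.back() == kGuard;
  }
};
```

After running the code under test on payload(), a false result from guards_intact() means something wrote one word before or one word past the limbs.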
21:29:52
Bike
in the function I linked, result_size is the sum of two absolute values, so it's positive
21:40:46
drmeister
(loop repeat 10000000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 do (core:next-mul x x))))
21:51:46
drmeister
It depends on the DEBUG_GUARD_EXHAUSTIVE_VALIDATE being turned on. But that might slow down compilation quite a lot.
22:00:38
drmeister
I pushed a few changes, one to add (gctools:validate-object xxx) and the other to suppress that overload that linux hates.
22:01:11
Bike
" ../../src/gctools/memoryManagement.cc:342 Invalid object with header @ 0x125ea4b40 message: header stamps are invalid"
22:47:08
drmeister
I should write a function like ROOM that walks all of the memory and checks every object
22:52:37
Bike
"../../src/gctools/memoryManagement.cc:342 Invalid object with header @ 0x124bc2ab0 message: bad tail content"
22:54:21
drmeister
For some reason when I recompiled cclasp and bclasp images won't load on linux - so I have to rebuild cclasp
23:39:31
drmeister
So I get... ./../src/gctools/memoryManagement.cc:342 Invalid object with header @ 0x14425820 message: bad tail content
23:45:20
drmeister
From https://github.com/clasp-developers/clasp/blob/bignum/include/clasp/gctools/memoryManagement.h#L487
23:46:01
drmeister
I replicated info in the header in case something gets clobbered I can still reconstruct the info for the object.
23:48:13
drmeister
The _tail_start is 0x60 and the _tail_size is 0x60. The tail_size can vary from object to object and allocation to allocation in a random way to try and avoid bugs with registration and alignment.
23:50:41
drmeister
The header is at 0x14425820 and the client starts at 0x14425860 (vtable-ptr, badge, number of limbs, limbs(2))
23:51:47
drmeister
The tail SHOULD start at 0x14425820+0x60 = 0x14425880 - and it should contain 0x60 bytes of 0xcc
23:52:30
drmeister
But the first 8 bytes of the tail contain 0x03ffffffffffffff - so we aren't calculating the size of the object properly when we are allocating it.
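[Editor's sketch] The address arithmetic in this exchange can be checked mechanically. The header address, client address and tail offset below are the ones quoted in the log; the field widths are an ASSUMPTION (8 bytes each for vtable-ptr, badge and number of limbs, and 8 bytes per limb), since the log does not state them.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the log.
constexpr std::uint64_t kHeader    = 0x14425820;
constexpr std::uint64_t kClient    = 0x14425860;
constexpr std::uint64_t kTailStart = 0x60; // offset from the header

// ASSUMPTION: every client field is one 8-byte word.
constexpr std::uint64_t kWord = 8;
constexpr std::uint64_t kFieldsBeforeLimbs = 3; // vtable-ptr, badge, number of limbs
constexpr std::uint64_t kLimbCount = 2;

// Address where the guard tail is expected to begin.
constexpr std::uint64_t tail_address() { return kHeader + kTailStart; }

// Address of limb i under the assumed field widths.
constexpr std::uint64_t limb_address(std::uint64_t i) {
  return kClient + (kFieldsBeforeLimbs + i) * kWord;
}

// First address past the last limb.
constexpr std::uint64_t client_end() { return limb_address(kLimbCount); }

// Runtime check of the sum stated in the log.
void check_quoted_tail_address() {
  assert(tail_address() == 0x14425880);
}
```

If the assumed widths are right, limb_address(1) equals tail_address(), so the second limb would sit on the first 8 bytes of the tail. That would be consistent with limb data appearing where 0xcc was expected, but it depends entirely on the assumed layout.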
23:56:35
Bike
uh.... so what's the fix here? do you mean the wrong size is being passed to TheNextBignum_O::create?
23:59:25
drmeister
I'm not sure. But it looks like a mismatch somewhere between what the allocator thinks the layout of the object should be and what you are writing into it.
23:59:46
drmeister
I'll have to dig around some more. Right now my girls are dragging me out the door to get some groceries.