freenode/#clasp - IRC Chatlog

13:34:56 drmeister Hello everyone

13:35:29 Bike good morning

13:35:51 drmeister I'm making more progress on the GCing of code and multiple entry points and image save/load.

13:38:36 drmeister I'm going to use the AMS pool to store the code and literals.

13:39:25 drmeister And every FunctionDescription will point to a code object to keep it alive.

13:39:50 drmeister And code objects will point to their ObjectFile_O objects to keep them alive.

13:39:58 drmeister The broken circle of life.

13:41:18 drmeister I just need to subclass JITLinkMemoryManager to allocate in AMS.

13:42:17 drmeister After loading the image I'll walk memory, and for each ObjectFile_O I'll add it to the LLJIT and it will again allocate memory in the AMS pool.

13:43:06 drmeister Image save just needs to write out the literals that are in the AMS pool and replace them after the code is recreated in the AMS pool.

13:43:34 drmeister I can choose to have a simple vector of literals or I could get fancy and put the literals directly into the code and have a table of offsets for fixup.

13:43:57 drmeister So, here is a question, how much better would it be to put the literals directly into the code?

13:44:08 drmeister Currently we access literals using RIP-relative addressing.

13:44:47 drmeister I think putting them directly into the code would give the ultimate in performance. It's what sbcl does.

13:48:56 SAL9000 drmeister: when you say "literals directly in the code", do you mean assembler immediates, or jump-over-data?

13:51:14 drmeister To be very precise I need to generate some examples and show you what I mean in terms of the machine instructions. But rather than read a word out of a vector into a register using RIP-relative addressing we would simply load an 8-byte value into the register. The 8-byte value would be part of the instruction and would be a fixable pointer, directly in the code.

13:52:08 drmeister It would look like a MOV of an 8-byte constant into a register. The constant part of the instruction IS the fixable pointer.

13:53:06 drmeister Then I would have a vector of offsets into the code where these fixable pointers live and obj_scan would read, fix and write them back as the MPS moves things around.

13:54:09 drmeister Like llvm patch-points but inside of an instruction.

13:54:57 drmeister I believe that I can get the offsets from the linker table in the object file.

13:55:06 SAL9000 drmeister: MOV RAX, <8-byte value> is fine, we do that :)

13:55:23 drmeister And you fix it in obj_scan?

13:55:47 drmeister I believe you will answer yes.

13:55:59 SAL9000 Yes.

13:56:01 drmeister If so - how do you get the offsets when you are using llvm generated code?

13:56:06 SAL9000 (I was double-checking in case we do something weird)

13:56:29 SAL9000 that I don't know -- I'm far from fully grokking the llvm bits :(

13:58:07 drmeister This is the best way to do it.

13:59:44 SAL9000 drmeister: looks like we're using a name->object map to track "sidecar" data through LLVM, then iterate through the relocation section and match the names
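
[Note: a minimal C++ sketch of the scheme discussed above, where pointers live inside instructions as 8-byte immediates and a side table of offsets lets the scanner read, fix, and write them back. `CodeBlob`, `fix_embedded_literals`, and `make_example_blob` are hypothetical names for illustration, not Clasp or MPS APIs, and the `fix` callback only stands in for whatever the collector uses to map an old address to a new one.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a code object: raw instruction bytes plus a
// table of byte offsets at which 8-byte pointer immediates live.
struct CodeBlob {
    std::vector<std::uint8_t> bytes;
    std::vector<std::size_t> literal_offsets;
};

// Read each embedded pointer, pass it through `fix`, and write it back.
// memcpy is used because an immediate inside an instruction is generally
// not 8-byte aligned.
template <typename Fix>
void fix_embedded_literals(CodeBlob& code, Fix fix) {
    for (std::size_t off : code.literal_offsets) {
        assert(off + sizeof(std::uint64_t) <= code.bytes.size());
        std::uint64_t ptr;
        std::memcpy(&ptr, code.bytes.data() + off, sizeof ptr);
        ptr = fix(ptr);
        std::memcpy(code.bytes.data() + off, &ptr, sizeof ptr);
    }
}

// Build a blob holding `movabs rax, imm64` (bytes 48 B8 + 8-byte immediate)
// followed by `ret` (C3). The immediate starts at byte offset 2.
inline CodeBlob make_example_blob(std::uint64_t literal) {
    CodeBlob code;
    code.bytes = {0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0, 0xC3};
    code.literal_offsets = {2};
    std::memcpy(code.bytes.data() + 2, &literal, sizeof literal);
    return code;
}
```

[Calling `fix_embedded_literals(blob, f)` after the collector moves objects would rewrite only the immediate bytes and leave the opcode bytes alone. In a real system the offsets would presumably come from the object file's relocation records, as suggested above.]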

14:01:31 Bike ok so actually it gets a uintptr which we then pass to a function expecting size_t

14:01:34 Bike what ever

14:02:07 drmeister Ha ha - the secret is that it's all ones and zeros underneath.

14:04:21 Bike it builds again, is the important part

14:04:46 Bike i think i can start actually replacing bignums now

14:04:54 Bike though there are still a few things referring to em directly...

14:05:43 drmeister Whoohooo! That's part of the master plan.

14:07:32 drmeister With the image save I'll look through the rest of the words in every object for things that look like pointers but point into C++ malloc memory.

14:08:17 drmeister Then we can track down any remaining C++ objects that need to be handled - there may still be a few.

14:09:38 drmeister Through a combination of closing and cleaning up things like streams and moving things into GC memory we can totally implement image save/load.

14:10:02 drmeister Then we really clean things up and get rid of, for instance, the optimizations that were added to speed up startup.

14:10:15 drmeister Then we release this damn thing as 1.0

14:11:29 drmeister Like the GMP random seed thing that might need to be handled.

14:12:12 drmeister clhs random-state

14:12:12 specbot http://www.lispworks.com/reference/HyperSpec/Body/t_rnd_st.htm

14:12:17 drmeister Whatever is backing that.

14:13:00 drmeister I've updated the llvmtot branch to build with the LLVM tip of trunk.

14:13:05 drmeister LLVM12 that is.

14:13:31 drmeister That's not even a twinkle in llvm.org's eye.

14:14:24 drmeister Meanwhile, the fork-server appears to be working well with jupyterlab.

14:15:16 Bike oh yeah, i don't think i've covered random

14:16:05 Bike looks like we actually use boost for that

14:17:42 Bike so that's gonna be an issue

14:18:50 Bike oh i guess gmp's random stuff is independent of whether you're doing mpz or mpn or what, actually

14:19:03 Bike why do we use boost?

14:27:47 Bike uniformity with fixnum randoms, maybe?

14:28:10 drmeister It might be historical.

14:28:27 drmeister But I think there was a problem with GMP that prevented me from using their random number stuff.

14:28:40 drmeister Do they have anything we could use as backing for random-seed?

14:29:03 drmeister If we could eliminate boost as a dependency I'd be all for it.

14:29:38 drmeister It's not the painful dependency it was back in the old days - but it's still a PITA.

14:31:06 drmeister Fewer dependencies = good

14:33:08 drmeister Damn, LLVM12 is forcing us to provide the function-type when we call or invoke a function.

14:33:13 drmeister That's going to take some refactoring.

14:33:57 drmeister I think adding an argument to all of the functions for setting up a call will need the function-type (sigh).

14:34:32 drmeister They are moving to this opaque pointer type convention where pointers won't have types anymore.

14:34:41 drmeister That will be good. It will save us a lot of bitcasting.

14:35:20 drmeister Bitcasts don't have any impact on the code - but generating bitcast instructions and then llvm handling them only for them to disappear is stupid.

14:40:40 Bike gmp randoms have an opaque gmp_randstate_t state

14:40:43 Bike might not be able to serialize it

14:48:04 drmeister That might have been it.

14:48:34 drmeister You need to be able to copy random-state.

14:49:15 Bike well you can copy, just not serialize that i can see

14:49:33 drmeister Or rather, print it out and read it back in.

14:50:20 drmeister http://www.lispworks.com/documentation/HyperSpec/Body/22_acj.htm

14:50:22 drmeister This
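
[Note: a small C++ sketch of what "print it out and read it back in" can look like for a generator state. It uses the standard library's `std::mt19937_64`, whose stream operators write and restore the full engine state as text. This is an illustration only, not what Clasp, boost, or GMP do; `dump_state` and `load_state` are hypothetical helper names.]

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Serialize the complete engine state to text.
inline std::string dump_state(const std::mt19937_64& rng) {
    std::ostringstream out;
    out << rng;
    return out.str();
}

// Rebuild an engine from text produced by dump_state.
inline std::mt19937_64 load_state(const std::string& text) {
    std::istringstream in(text);
    std::mt19937_64 rng;
    in >> rng;
    assert(!in.fail());  // malformed text would leave the stream in a failed state
    return rng;
}
```

[An engine restored this way continues the same sequence as the original, which is the property a readable printed representation of a random-state needs.]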

14:52:23 drmeister I wonder what ECL does. They use GMP for other stuff and they certainly don't use boost.

14:54:30 drmeister They don't use GMP.

14:54:38 drmeister For random numbers.

14:54:47 drmeister To create random bignums they generate random limbs.

14:55:04 drmeister https://gitlab.com/embeddable-common-lisp/ecl/blob/develop/src/c/num_rand.d#L256

14:55:46 drmeister Could we ditch GMP entirely? Implement bignums ourselves?

14:55:53 drmeister Not now - but later.

14:56:11 drmeister Now that we are working with vectors of limbs the activation barrier has been lowered.

15:01:16 Bike we could, but gmp has a lot of specialized assembly code to do things as fast as possible that we would not be able to write or maintain an equivalent of

15:01:39 drmeister Ah - good point.

15:02:25 Bike i tried reading the papers behind some of the algorithms and am pretty sure it would take me a while to even understand them

15:02:36 drmeister Is there any way to get around the random-state problem with GMP? Some way to initialize a GMP random state from a number and then we use that number as the random-state?

15:03:02 Bike i'm lookin around

15:03:19 Bike also, it seems like we're not doing this correctly even right now. we can dump and load random states, but can't write and read them

15:03:28 Bike (write *random-state* :readably t) -> #<RANDOM-STATE >

15:03:34 drmeister "God made the integers; all else is the work of man." - Leopold Kronecker

15:04:05 drmeister Ouch - my bad.

15:05:27 drmeister Could you fix that?

15:05:45 Bike first i need to understand how the de/serializer right now works

15:05:58 Bike trying to find where in the boost docs it talks about stringifying the generators

15:07:23 drmeister Ugh - let's ditch it. Look at the ECL code. They generate limbs directly from their random number generator.

15:07:47 drmeister Let's not use GMP or boost. Neither fish nor fowl be we.

15:08:01 Bike no, we're going to keep using gmp

15:08:16 drmeister I mean don't use GMP for random numbers.

15:08:22 Bike oh. ok.

15:08:44 Bike i'm not totally sure how to do that properly, though

15:08:48 drmeister ECL does random numbers without GMP.

15:08:51 Bike ensure that generating by limbs gets you something truly uniform i mean

15:08:55 Bike other than test and reject

15:09:54 Bike ecl does mod but i think that's not technically correct

15:10:17 Bike well, test and reject is fine i guess.

15:12:49 Bike man i wish i had my taocp to tell me this

15:15:27 drmeister Do you want me to order you a copy?

15:15:57 Bike i already have a copy, just not with me

15:16:08 drmeister As in do you need it to do this work.

15:16:16 Bike it's sitting securely in my parents' basement

15:16:23 Bike oh, no, i can find info on the internet i'm sure

15:17:01 drmeister Okey doke.

15:17:54 Bike oh, mpn_random just generates a number of random limbs anyway, whatever clamps it to a range is in mpz

15:17:57 Bike so i'd have to replicate it regardless

15:18:02 drmeister I need backtraces again for interpreted code. The llvmtot branch is breaking in the interpreter. Bleh.

15:19:01 drmeister That's the corrosive effect of C++, that is.

15:21:11 Bike mpz tries 80 rounds of test and reject and if that fails uses mod

15:21:23 Bike probably still introduces some bias, but oh well
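
[Note: a C++ sketch contrasting the two approaches mentioned above for drawing a number below a bound: test and reject versus plain mod. It works on a single 64-bit word rather than a limb vector, and assumes the generator yields uniformly distributed 64-bit words (as `std::mt19937_64` does). `uniform_below` and `biased_below` are hypothetical names; this is not GMP's or ECL's code.]

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Test and reject: mask the word down to the smallest power-of-two range
// covering [0, bound), and retry until the candidate is below the bound.
// Each round succeeds with probability greater than 1/2, so the expected
// number of rounds is below 2.
template <typename Rng>
std::uint64_t uniform_below(Rng& rng, std::uint64_t bound) {
    assert(bound != 0);
    std::uint64_t mask = bound - 1;
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    for (;;) {
        std::uint64_t candidate = rng() & mask;
        if (candidate < bound) return candidate;
    }
}

// Plain mod: always in range, but when 2^64 is not a multiple of `bound`
// the smaller residues are slightly more likely than the larger ones.
template <typename Rng>
std::uint64_t biased_below(Rng& rng, std::uint64_t bound) {
    assert(bound != 0);
    return rng() % bound;
}
```

[For small bounds the mod bias is tiny. It grows as the bound approaches the size of the generator's range, which is the regime bignum-sized bounds are in when the random value has only as many limbs as the bound.]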

15:27:06 Bike of course i don't technically know how big a limb is

15:27:28 Bike i guess mp_bits_per_limb is enough probably

15:40:35 Bike i'll just have it use mod and hopefully we can fix it later

15:40:42 Bike don't think anyone's doing cryptography with clasp, anyway

15:40:56 Bike might have distant weird effects on simulations tho...

16:59:21 Bike now i'm getting nondeterministic segfaults for really big next-bignums wooooo

17:44:12 Bike oh, i might just be smashing the stack

17:44:13 Bike hmm

17:45:38 Bike lots of weird stuff is happening so maybe

17:45:58 Bike that would kind of entail that temporary storage should be on the heap, though, and that's kind of a weird issue in itself...

17:46:12 drmeister If you smash the stack you see it in lldb with 'bt'

17:46:19 drmeister The backtrace will print for days.

17:46:37 drmeister It's probably something simple.

17:46:45 Bike well the backtraces are finite, they're just in pretty weird places

17:47:07 Bike this time was in the middle of gc code, i also saw it in contagion_mul, and in a hash table lookup for a symbol

17:49:26 Bike i shouldn't be using THAT much stack space though... (expt 1-word-bignum 5) should end up using, like, 32 words or something

18:16:53 selwyn so you are planning to put in image loading and saving? that's nice

18:19:01 Bike and this time it's something about an unbound class holder

18:19:06 Bike it's a segfault overall, though

19:08:23 Bike yeah i can't figure this out. can't even reproduce it regularly

19:14:44 Bike a bunch of next-mul calls are (sometimes) enough to do it, but multiplying next-bignums is a really simple function. like twenty lines

19:14:58 Bike and of course most of the time it does work

19:15:50 Bike and all that's being used are variable length arrays, it's like i'm doing something funny with alloca

19:22:16 Bike i guess if it is a stack problem i can switch all the stack allocations to allocating a gcvec or something. kind of sucks though.

19:22:28 Bike i'm gonna replace this one with a calloc/free and see if it helps.

19:45:49 Bike okay, that didn't help.

20:04:01 yitzi drmeister: Pushed file completion. quickclasp will need to be updated for CLJ.

20:15:48 Bike i'm crashing with only positive bignums so it's probably not the gc choking on negative sizes

20:27:42 Bike I got nuthin. drmeister, any chance you could have a look at it?

20:28:46 drmeister Sure - put it in a branch?

20:29:30 Bike yeah, it's the "bignum" branch on github.

20:30:04 drmeister Or - you are on bigmac

20:30:08 Bike and then you can run e.g. (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))) until it dies

20:31:02 drmeister This is on bigmac - right?

20:31:07 Bike yeah

20:31:55 drmeister clasp or cando?

20:32:23 Bike um, clasp? i'm not sure what you're asking. i build with cando in but i don't think that matters

20:32:36 drmeister Where is the executable that crashes.

20:32:52 Bike src/build/boehm/iclasp-boehm

20:32:59 Bike er, src/clasp/build

20:34:48 drmeister It doesn't let me attach.

20:35:07 Bike huh? oh, lldb? yeah, the security whatever is on so you have to sudo.

20:36:13 drmeister It doesn't crash.

20:36:22 Bike it's not regular.

20:36:30 Bike try doing it more

20:39:00 drmeister I'm going to have to build it on linux

20:41:17 Bike did you get it to crash?

20:41:40 drmeister I did - but it's coming out of a clasp compiled function.

20:41:45 drmeister How far do you get in the build?

20:42:18 Bike it builds fine. everything still uses the old bignum arithmetic, it's just that the next-bignum arithmetic is also available.

20:42:57 drmeister https://www.irccloud.com/pastebin/1qDD0l5e/

20:43:14 Bike so the long thing breaks it.

20:43:18 Bike my prophecy was correct

20:43:37 Bike delete the definition with long, i guess.

20:43:45 Bike then it'll build on linux but not mac.

20:44:47 drmeister https://www.irccloud.com/pastebin/yJcLBZz9/

20:58:15 drmeister Still building [108 of 454]

21:07:14 drmeister I have to put it in a loop to get it to fail on linux

21:07:25 Bike it is irregular.

21:07:32 Bike good to know it's cross-OS though

21:07:40 drmeister What means irregular?

21:07:48 Bike nondeterministic

21:08:26 drmeister (loop repeat 10000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 for i = x then (core:next-mul i x))))

21:09:44 drmeister What does this do? advance to the first bignum after most-positive-fixnum and then multiply by 0,1,2...19?

21:09:50 drmeister No.

21:09:59 drmeister Yes

21:10:00 Bike it computes most-positive-fixnum to the nth power

21:10:24 Bike n = 20

21:10:28 Bike or 19 or something

21:10:54 drmeister I'd worry about writing past the end of the object. Try turning on guards and I'll find the function that checks the guards.

21:11:31 Bike not sure i understand. the bignum is only written to during its creation

21:12:15 drmeister Yeah - but other than that what could go wrong? There are no illegal values for the limb vector - right?

21:13:10 Bike right.

21:13:26 Bike i mean, each limb has no illegal values.

21:13:37 Bike the bignum could be invalid if its length is longer than its actual storage, i guess?

21:13:51 Bike in this case the memory allocation aspect is very simple, though

21:14:05 drmeister Yeah - and that would clobber a neighboring object or the boehm memory.

21:14:28 Bike https://github.com/clasp-developers/clasp/blob/bignum/src/core/bignum.cc#L447-L465 as you can see, it allocates result_size limbs on the stack, then passes that (or one less) in as initial contents

21:16:58 Bike so, what do i do? put DEBUG_GUARDS or something in debug options?

21:18:25 drmeister What about the mpn_ functions. What if they write outside the vector on the stack?

21:18:38 Bike then gmp would be broken.

21:18:56 Bike or we could be calling it wrong, i guess.

21:19:02 drmeister DEBUG_GUARD and DEBUG_GUARD_VALIDATE

21:20:26 Bike the requirements for mpn_mul are that neither size input is zero, the destination size is the sum of the input sizes, and the left size is at least as big as the right size.
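
[Note: a self-contained C++ sketch of a schoolbook limb multiply that checks the same preconditions Bike lists for mpn_mul: nonzero sizes, destination size equal to the sum of the input sizes, and the left operand at least as long as the right. It does not call GMP. `Limb`, `limb_mul`, and `significant_limbs` are hypothetical names, limbs are assumed to be 64 bits and stored least significant first, and `unsigned __int128` is a GCC/Clang extension. The sketch uses heap vectors rather than the stack allocation in the linked function.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Limb = std::uint64_t;

// Schoolbook product of two little-endian limb vectors.
inline std::vector<Limb> limb_mul(const std::vector<Limb>& a,
                                  const std::vector<Limb>& b) {
    assert(!a.empty() && !b.empty());  // neither size may be zero
    assert(a.size() >= b.size());      // left operand is the longer one
    std::vector<Limb> result(a.size() + b.size(), 0);  // sum of input sizes
    for (std::size_t j = 0; j < b.size(); ++j) {
        Limb carry = 0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            // 64x64 product plus two 64-bit addends cannot overflow 128 bits.
            unsigned __int128 t =
                static_cast<unsigned __int128>(a[i]) * b[j] + result[i + j] + carry;
            result[i + j] = static_cast<Limb>(t);
            carry = static_cast<Limb>(t >> 64);
        }
        result[j + a.size()] = carry;
    }
    return result;
}

// Number of limbs left after dropping leading zero limbs. The product of an
// n-limb and an m-limb number occupies either n+m or n+m-1 limbs.
inline std::size_t significant_limbs(const std::vector<Limb>& limbs) {
    std::size_t n = limbs.size();
    while (n > 0 && limbs[n - 1] == 0) --n;
    return n;
}
```

[The "or one less" in the linked function corresponds to the case where the top limb of the product comes out zero, which `significant_limbs` detects here.]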

21:21:58 drmeister Limbs are 64 bits?

21:22:08 Bike probably

21:22:20 Bike they're C longs

21:22:53 Bike or maybe not longs, but they're supposed to be a machine word

21:23:17 Bike there's an mp_bits_per_limb constant to look at if you want

21:25:01 drmeister I put two extra elements in the array on the stack and wrote 0xccCCccCCccCCccCC into the first and last one and I'm writing into result_limbs+1

21:25:59 drmeister That doesn't seem to be a problem.
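
[Note: a C++ sketch of the guard-word check drmeister describes: pad the buffer with one extra element on each side, fill both with a recognizable pattern, hand the callee the interior, and verify the pattern afterwards. `kGuard` and `guards_intact_after` are hypothetical names, and the buffer lives in a `std::vector` here rather than on the stack.]

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kGuard = 0xCCCCCCCCCCCCCCCCULL;

// Run `fill(ptr, n)` on an n-element payload sandwiched between two guard
// words, and report whether both guards are still intact afterwards.
template <typename Fill>
bool guards_intact_after(std::size_t n, Fill fill) {
    std::vector<std::uint64_t> buf(n + 2, 0);
    buf.front() = kGuard;
    buf.back() = kGuard;
    fill(buf.data() + 1, n);  // callee sees only the interior
    return buf.front() == kGuard && buf.back() == kGuard;
}
```

[A check like this only catches writes that land exactly one element before or after the payload. A write further out, or one that happens to store the guard pattern, would go unnoticed.]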

21:29:11 drmeister These never go negative - right?

21:29:23 Bike what never go negative?

21:29:49 drmeister The bignums in this test code.

21:29:52 Bike in the function I linked, result_size is the sum of two absolute values, so it's positive

21:29:59 drmeister I'm being lazy and asking.

21:30:09 Bike the bignum itself could be negative if exactly one of the arguments is negative

21:30:12 drmeister https://www.irccloud.com/pastebin/x23en8VU/

21:30:32 drmeister So the stack isn't being clobbered.

21:30:34 Bike but in this breaking example it's all positive numbers obviously

21:30:42 drmeister Yeah

21:40:37 drmeister I return _Unbound<TheNextBignum_O>() and try this...

21:40:46 drmeister (loop repeat 10000000 do (let ((x (core:next-from-fixnum most-positive-fixnum))) (loop repeat 20 do (core:next-mul x x))))

21:40:56 drmeister It doesn't crash