freenode/#clasp - IRC Chatlog

14:27:15 drmeister Hello everyone.

14:27:35 drmeister I hit a bit of a roadblock in image save/load with Boehm - it's still possible in MPS.

14:28:34 Bike the precise boehm stuff isn't enough?

14:29:16 drmeister The problem with Boehm is 1. I need to put data and code close together for RIP relative addressing 2. So I need to put literal vectors into GC managed memory along with lots of code 3. I want the GC to ignore the code - it contains no pointers and I expect it will slow things down 4. So with Boehm I need to write my own marking procedure.

14:30:55 drmeister 5. There's a comment in the boehm header file that appears to have huge implications for this - I don't know how to get around it. They say that the marking function should only mark about 100 bytes of pointers before returning. I have no idea how I do that with things like simple-vector. I need to investigate what ECL does in this case.

14:32:05 drmeister https://github.com/ivmai/bdwgc/blob/master/include/gc_mark.h#L36

14:33:28 drmeister https://www.irccloud.com/pastebin/9Lh71L7x/

14:34:13 drmeister "do it in smaller pieces"???? How would I save how much marking work I've done for any particular object and then pick up later?

14:36:23 drmeister This is ECL's marking function

14:36:24 drmeister https://github.com/roswell/ecl/blob/develop/src/c/alloc_2.d#L288

14:36:31 drmeister I don't see it marking simple-vectors

14:37:08 drmeister I don't see how it can work - other than for objects that it doesn't handle everything is scanned conservatively.

14:37:42 drmeister That's not going to work with my plan for Code_O objects.

14:42:11 drmeister What I need is something where I can say "for this object - only scan up to this point".

14:43:54 drmeister Hmmm, now that I say this something occurs to me.

14:45:36 drmeister ECL marks fields of well defined objects in a simple way in its marking function - ECL doesn't worry about overflowing the mark stack because the well defined objects only have a few fields to mark.

14:46:02 drmeister I'm assuming for the moment that when it doesn't handle an object that boehm marks the entire object conservatively.

14:46:57 drmeister If Code_O objects had small literal vectors (not something I can guarantee) then I could handle them like a well defined object.

14:47:36 drmeister I should check the size of literal vectors. They are unbounded and there is no easy way to bound them.

14:47:52 drmeister Their size is unbounded.

14:49:44 drmeister I've emailed Ivan Maidanski - the maintainer of boehm to see if he can give me pointers on this. https://github.com/ivmai

14:49:59 drmeister Hi karlosz

14:50:06 karlosz hello

14:50:30 drmeister Could you check the log for the last 30 min? I posted some stuff that I'd love to have your input on.

14:51:08 karlosz sure thing

14:55:23 karlosz drmeister: i'm not too familiar with how boehm scanning works, but even though vectors are unbounded in size, the size is still stored in a known position in the object, so couldn't the marking procedure just use that? or is the problem that you'd need to write a custom procedure in that case to parse the objects?

14:56:51 drmeister The problem is they say I can only do a certain amount of marking work and if I don't fully mark everything then I push the object back onto the marking stack.

14:57:12 drmeister How do I keep track of how much marking work is done if the marking function returns after doing a certain amount of marking work?

14:57:44 drmeister Or does it not return and it calls a function after doing a certain amount of marking and then it continues?

14:58:02 drmeister "it" in that last sentence meaning my marking function.

15:01:36 karlosz well, since you're supplying the mark procedure, can't you keep track of the marking work there?

15:02:08 Bike the marking procedure you define is supposed to do only a small amount of work and is called repeatedly by boehm while it's marking

15:02:09 drmeister jackdaniel: In the precise GC mode ECL doesn't appear to mark things like the contents of vectors or hash-tables. Where does that happen?

15:02:11 Bike i think

15:02:26 drmeister Bike: That's what I thought.

15:02:34 Bike so drmeister is wondering how to keep state between calls

15:02:39 drmeister Right.

15:03:05 karlosz and you can't use a global or static variable?

15:03:17 Bike i suppose it would have to be thread local

15:03:18 drmeister Also, I can't be sure that I'll be called repeatedly for the same object.

15:03:40 drmeister I hope it would be called repeatedly for the same object.

15:04:27 karlosz hm, i guess otherwise you'd have to add a field to each object for the gc to keep track of the state

15:05:59 karlosz the comment there also seems to suggest that it's not actually necessary to break it up into smaller chunks

15:07:11 karlosz only that it's an optimization?

15:07:44 karlosz in which case only varyobj sized objects would need to keep track of how much marking has been done, assuming that does incur space overhead of some kind

15:10:14 jackdaniel drmeister: I'm not looking very carefully at the code, but cl_object_mark_proc seems to mark the contents of vectors

15:10:41 jackdaniel self.t contains vector data, so the array pointer is marked, similar with hash.data

15:10:51 jackdaniel doesn't boehm traverse pointers from arrays? maybe not

15:14:42 drmeister jackdaniel: What does vector.data and hash.data point to?

15:15:36 jackdaniel to pointers (i.e. ecl_uint32_t *b32) and hash table entry

15:15:47 jackdaniel but I may be wrong regarding what gc does

15:17:00 drmeister What does a hash table entry look like? Is it a key/value pair? Does hash.data point to a vector of key/value pairs that are GC managed pointers?

15:17:12 drmeister Does the precise mode work in ECL?

15:18:16 jackdaniel afaik it does work, however it has some bugs (this may be one of these)

15:18:18 drmeister karlosz: I have the same feeling that it might be an optimization and not actually necessary - but I'm anxious about that.

15:18:44 jackdaniel hashtable_entry is a structure and it doesn't seem to be marked in alloc2

15:18:45 beach Does "precise mode" mean what I think it does? And if so, how much stack-scanning code does the client need to supply?

15:19:03 drmeister If the marking function is called repeatedly on the same object (I can check that) then I could keep the state of marking in thread local storage.

15:19:20 Bike i think it just means precise heap scanning.

15:19:20 drmeister beach: I guess it's still conservative on the stack.

15:19:26 Bike as opposed to how boehm usually works

15:19:37 beach Got it, thanks!

15:21:39 drmeister You are supposed to push the object back on the marking stack if it isn't finished. Maybe that ensures that it is the next object to be passed to the marking function. Then this wouldn't be so much trouble.
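
One way to picture the "push it back if unfinished" idea above is a toy model in which the resume point travels with the mark stack entry, so no thread-local or per-object state is needed. This is a simulation only, not the Boehm API: `BUDGET`, `mark_step`, `mark_from`, and the `(object, start)` entry shape are all hypothetical names chosen for illustration, and the log does not confirm that Boehm's mark procedure interface works this way.

```python
# Toy model of budgeted marking; this is NOT the Boehm API.
BUDGET = 4  # hypothetical per-call limit on slots scanned


def mark_step(heap, marked, stack):
    """Pop one entry, scan at most BUDGET slots, push back any remainder."""
    obj, start = stack.pop()
    slots = heap[obj]
    end = min(start + BUDGET, len(slots))
    if end < len(slots):
        # Unfinished object: the resume offset is stored in the stack entry.
        stack.append((obj, end))
    for child in slots[start:end]:
        if child is not None and child not in marked:
            marked.add(child)
            stack.append((child, 0))


def mark_from(heap, root):
    """Run mark steps until the stack is empty; return marked set and step count."""
    marked = {root}
    stack = [(root, 0)]
    steps = 0
    while stack:
        mark_step(heap, marked, stack)
        steps += 1
    return marked, steps


# A ten-slot "simple vector" pointing at nine small objects.
heap = {name: [] for name in "abcdefghi"}
heap["vec"] = ["a", "b", None, "c", "d", "e", "f", "g", "h", "i"]
heap["a"] = ["vec"]  # a back-pointer, to show already-marked objects are skipped

marked, steps = mark_from(heap, "vec")
```

In this model the large vector is visited in three bounded steps, and the marker never has to remember anything between calls beyond what sits on the stack.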

15:22:34 drmeister Still, it's going to complicate the code.

15:23:04 drmeister I haven't yet gotten boehm to call my marking function - so I'm missing something.

15:23:21 drmeister I'm calling GC_new_proc(my_marking_function) - I thought that was all it took.

16:17:17 Bike the stuff i had to speed up typecase probably isn't immediately suitable for subtypep because it's simplifying in different ways

16:17:27 Bike the question of what cases to optimize is kind of interesting

16:19:00 Bike and of course for type inference we can lose a lot of precision. karlosz, what kind of types do you think will come up most? i mean we can probably lose stuff like satisfies, right?

16:21:11 Bike optimizing just cl:subtypep generally might be something to avoid. we need it early which severely limits our options, and it has a lot of distinct use cases

16:24:55 Bike handling numbers, arrays, functions, and values types and collapsing everything else to classes might be good. maybe conses, not sure

16:32:05 Bike although i'm not sure of the imprecision semantics. if a function is defined as taking a (satisfies foo) argument and we collapse that into T and eliminate the test that seems unfortunate

16:44:46 karlosz Bike: from instrumenting subtypep the main problems right now are that it doesn't handle cons types at all

16:44:51 karlosz the ftypes you wrote have a ton of cons types

16:45:12 karlosz so every time subtypep (canonicalize-type) encounters a cons type it does a C++ unwind tanking compiler performance

16:45:21 karlosz plus we basically lose out on any type inference with cons types

16:45:32 karlosz since subtypep with a cons type gives NIL NIL always

16:46:30 karlosz i think right now subtypep unwinding on cons types contributes to 10-20% compiler overhead at the moment on certain system

16:46:35 karlosz certain systems

16:46:57 karlosz i wouldn't think about satisfies too much, it hardly comes up

16:47:42 karlosz the next thing that subtypep encounters a lot are unknown types coming from type declarations in defclass's and stuff like that

16:48:11 karlosz about 90% of the types we encounter in self build that we can't handle right now in subtypep are CONS types

16:50:39 karlosz to implement cons types properly we'll need to throw the bit vector implementation out and replace it with a normalizing interpreter like the one in sbcl i think

16:51:20 Bike well what i mean here is we don't need to use cl:subtypep in the compiler, we can have the ctype methods do something more specifically tailored to our needs

16:53:02 Bike i though the fact cl:subtypep gives up on cons types is nonconforming, it looks like

16:53:07 Bike although the fact*

16:55:01 karlosz Bike: it seems hard to make subtypep in cleavir handle cons types without invoking cl:subtypep

16:55:17 karlosz since you can have cons types embedded anywhere

16:55:30 karlosz so it seems like cons type handling really does need to be done in cl:subtypep

16:55:31 Bike what i'm saying is essentially we write a more restricted subtypep implementation.

16:55:42 Bike or, well, i write.

16:56:09 karlosz sure, i mean we do some of that with the values stuff already

16:56:22 Bike and then with cons types it can just call itself.

16:56:34 karlosz what can call itself?

16:56:42 Bike the special subtypep implementation
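
A rough sketch of what "it can just call itself" might look like for cons types, written in Python rather than Lisp purely for illustration. The type encoding (`("cons", car, cdr)` tuples), the tiny `PARENTS` lattice, and the function names are all hypothetical and are not Clasp's or Cleavir's actual representation. The sketch also assumes no component type is the empty type, which a real implementation would have to handle.

```python
# Hypothetical mini type lattice; names are illustrative only.
PARENTS = {"fixnum": "integer", "integer": "number", "number": "t",
           "symbol": "t", "t": None}


def atom_subtypep(a, b):
    """Walk up the parent chain of atom a looking for b."""
    while a is not None:
        if a == b:
            return True
        a = PARENTS[a]
    return False


def subtypep(a, b):
    """Return (result, certain), mirroring cl:subtypep's two values."""
    for x in (a, b):
        if not isinstance(x, tuple) and x not in PARENTS:
            return (False, False)  # unknown type name: give up
    a_cons = isinstance(a, tuple)
    b_cons = isinstance(b, tuple)
    if a_cons and b_cons:
        # (cons A B) is a subtype of (cons C D) when A <= C and B <= D.
        car_ok, car_sure = subtypep(a[1], b[1])
        cdr_ok, cdr_sure = subtypep(a[2], b[2])
        if car_ok and cdr_ok:
            return (True, True)
        # Assumes no empty component types (see lead-in).
        if (car_sure and not car_ok) or (cdr_sure and not cdr_ok):
            return (False, True)
        return (False, False)
    if a_cons:
        return (b == "t", True)  # in this lattice only T contains conses
    if b_cons:
        return (False, True)     # no atom here is a cons type
    return (atom_subtypep(a, b), True)
```

For example, `subtypep(("cons", "fixnum", "symbol"), ("cons", "integer", "t"))` yields a certain yes, while an unknown name in the car position degrades to an uncertain no instead of unwinding.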

16:57:27 karlosz can cons types be embedded in any other specialized types?

16:57:39 karlosz that's what im worried about

16:57:46 Bike you can use them in complex and array types, if that's what you mean

16:58:13 karlosz yeah, that's why i thought it might be difficult to handle cons types outside cl:subtypep

16:58:29 karlosz since then we'd need the specialized subtypep to parse out complex and array types

16:58:35 Bike well, yes.

16:58:39 karlosz ah, okay

16:59:07 karlosz so essentially this restricted subtypep will just handle compound and specialized types and punt to cl:subtypep for the more primitive stuff

16:59:21 Bike no, i'm saying it won't punt to cl:subtypep ever.

16:59:46 Bike i mean it will be clasp-specific

17:00:02 karlosz ahhhh i thought you meant the cleavir ctype subtypep implementation

17:00:04 karlosz got it

17:00:07 karlosz yeah that should work

17:00:17 Bike yeah i meant like, specializing the methods rather than using the default implementation

17:00:24 karlosz right

17:00:25 Bike have to fix cleavir-bir to actually respect that, but i need to do that anyway

17:00:39 karlosz great

17:00:51 karlosz right now i'm making the literal dumper faster

17:01:07 karlosz apparently ltv/bignum is actually pretty common

17:01:12 karlosz but dumping it out as a string is slow

17:01:21 karlosz so probably should just dump out the raw bytes

17:01:49 karlosz ironclad dumps out a lot of bignums, i'm guessing for all the unsigned-byte 64 declarations

17:02:01 Bike ooh, yeah, that would do it.

17:04:32 Bike if you look at core__next_primitive_string in bignum.cc you can see how to get at the raw data. it's pretty simple. i guess we'd want functions to read and write it from a stream, or something?

17:04:52 karlosz okay, thanks

17:05:51 Bike next-primitive-string prints out the words for the bignum. you can try like (core:n-p-s (* m-p-fixnum m-p-fixnum))

17:05:59 Bike of course it prints in decimal but the actual data is the mp_limb_t's

17:09:15 karlosz well i was just thinking of dumping the bignum by dumping the bytes directly into the stream with a header for how many bytes

17:09:27 karlosz are you saying we still need to go through strings?

17:09:52 Bike nono, i'm just saying you can get a look at the data this way

17:10:19 Bike the data is just a size (which is sometimes negative to indicate that the bignum is) and an array of mp_limb_t, which is uint64_t or so

17:10:30 Bike actually i think it's long... i don't remember
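
A minimal sketch of the "header plus raw limbs" dump described above, in Python for illustration. It assumes 64-bit limbs, little-endian byte order, and an 8-byte signed limb count whose sign carries the sign of the bignum. Those are assumptions, not Clasp's actual fasl format, and as noted in the log the exact limb type is uncertain. `dump_bignum` and `load_bignum` are hypothetical names.

```python
import struct

LIMB_BITS = 64  # assumption; the log is unsure of mp_limb_t's exact type
LIMB_MASK = (1 << LIMB_BITS) - 1


def dump_bignum(n):
    """Encode n as a signed limb count followed by its magnitude limbs."""
    mag = abs(n)
    limbs = []
    while mag:
        limbs.append(mag & LIMB_MASK)  # least significant limb first
        mag >>= LIMB_BITS
    size = -len(limbs) if n < 0 else len(limbs)
    return struct.pack("<q", size) + struct.pack(f"<{len(limbs)}Q", *limbs)


def load_bignum(data):
    """Inverse of dump_bignum: read the header, then rebuild the integer."""
    (size,) = struct.unpack_from("<q", data, 0)
    limbs = struct.unpack_from(f"<{abs(size)}Q", data, 8)
    mag = 0
    for i, limb in enumerate(limbs):
        mag |= limb << (LIMB_BITS * i)
    return -mag if size < 0 else mag
```

A value such as `2**64` needs two limbs, so `dump_bignum(2**64)` is 24 bytes: 8 for the header and 16 for the limbs. No decimal string conversion happens in either direction.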

17:19:37 karlosz OK great

17:19:57 karlosz i think in terms of dumping, the representation doesn't matter

17:20:02 karlosz but for loading it probably will yeah

17:21:55 Bike ok it looks like sbcl actually does do what i was thinking of doing but i was worried about the performance of

17:21:59 Bike so i suppose it's probably good enough

17:22:58 karlosz you mean just dumping the bytes and loading them in?

17:23:15 Bike no i'm back to thinking about subtypep

17:23:20 karlosz ah okay

17:23:24 Bike i'm sure dumping and loading the bytes for bignums is fine

17:23:33 Bike it's just an array of longs

17:23:49 karlosz yeah the subtypep implementation is just a normalizing interpreter really

17:24:03 karlosz the els paper shows that the performance is about the same as a bitvector approach

17:39:18 karlosz hm, i don't see any ltvc calls in the ll file

17:39:44 Bike yeah it goes through the virtual machine now i think

17:40:08 karlosz ah

17:40:20 karlosz is there a way to view that?

17:41:33 Bike the virtual machine definition? yeah let me try to remember where it's generated

17:41:41 Bike man the comment on top of cmpliteral desperately needs updating

17:42:05 karlosz er, i just mean the dumped byte codes in a human readable way

17:42:14 Bike Oh. No, I don't think so

17:42:20 karlosz ah okay

17:42:50 Bike the definition is in src/core/byte-code-interpreter.cc. it's generated by that stuff at the end of cmpliteral

17:43:10 karlosz right

18:00:16 karlosz is bits-per-limb exposed to lisp?

18:00:38 karlosz i guess the way the mp interface is written it would be better to dump out the limbs and then load them in

18:02:11 kpoeck Hello all

18:03:20 kpoeck I am having some fun with floating point exceptions in the latest version of clasp

18:03:32 kpoeck all on macos

18:04:15 kpoeck (mod 1.0 0.0) -> abort trap 6, perhaps within Bignum_O::create(v)

18:05:45 kpoeck I wanted to say (mod 1 0) hangs, but this is no longer true after a distclean

18:07:03 kpoeck Now it hangs with 100% cpu load

18:08:08 kpoeck debugger says it hangs in core::clasp_truncate at num_co.cc:171

18:09:10 kpoeck (mp:process-run-function :foo #'(lambda() (mod 1 0))) does not hang, but cpu is at 100%

18:10:31 kpoeck after evaluating (mp:all-processes) 3 times, I finally get the expected DIVISION-BY-ZERO exception and cpu load is back down

18:11:03 kpoeck Do you also observe this?

18:12:21 Bike (mod 1 0) gets me an immediate division-by-zero

18:12:28 Bike karlosz: i don't think it is exposed

18:13:01 Bike oh, now it's hanging

18:13:02 Bike super

18:14:46 Bike well, as you can see, mod calls clasp_floor which calls clasp_truncate

18:14:54 Bike clasp_truncate will do a C "1 / 0"

18:15:05 Bike and that's doing some stupid bullshit instead of FPE i guess?

18:17:53 kpoeck let me try this in xcode

18:18:16 Bike well, maybe it is doing an FPE and it ends up in our signal handler and that hangs for whatever reason

18:18:56 kpoeck I put a fprintf in the beginning of handle_fpe

18:19:21 kpoeck to stderr, which I hope is not a buffered stream

18:19:32 Bike printf in a signal handler might make weird things happen

18:19:41 Bike weirder, i guess, we're already in weird territory

18:19:55 kpoeck so how could I test?

18:20:20 kpoeck a breakpoint in a signal handler is probably not better

18:20:26 Bike eh, just try it with printf

18:20:37 Bike i'm just saying it might not be as clean as usual printf debugging

18:35:19 kpoeck If i put the following in the beginning of clasp_truncate, the error goes away (not surprising)

18:35:21 kpoeck unlikely_if(clasp_zerop(divisor)) ERROR_DIVISION_BY_ZERO(dividend, divisor);

18:43:43 Bike sure. we should be able to rely on the hardware for this, though.

18:48:17 kpoeck yup

20:24:03 karlosz i changed ltvc_make_next_bignum to take a length argument instead of a string and now i get ("Mismatch of ltvc read types read '7' expected 's'")

20:24:13 karlosz do i also need to change anything in aclasp?

20:33:30 karlosz oh, it was a stale fasl thing

20:47:48 Bike yeah, i've gotten that same error for the same reason.

21:16:51 drmeister A light is starting to come on above my head. This mysterious stuff in the boehm gc_mark.h file is starting to look less mysterious.

21:16:52 drmeister https://github.com/ivmai/bdwgc/blob/master/include/gc_mark.h#L76

21:17:56 drmeister It's a tagging scheme for objects in Boehm. It's got tag bits and something like a stamp (proc_index) and something they call (env).
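
To make the tag / proc_index / env layout concrete, here is a small Python sketch of packing and unpacking such a descriptor word. The constants (2 tag bits, tag value 2 for "call a mark procedure", 6 bits of procedure index) are what the linked header appears to use, but treat them as assumptions and check gc_mark.h before relying on them. The function names are hypothetical.

```python
TAG_BITS = 2        # assumed: low bits select the descriptor kind
TAG_PROC = 2        # assumed: tag value meaning "call a mark procedure"
LOG_MAX_PROCS = 6   # assumed: bits reserved for the procedure index


def make_proc_descriptor(proc_index, env):
    """Pack env above proc_index, then shift both above the tag bits."""
    assert 0 <= proc_index < (1 << LOG_MAX_PROCS)
    assert env >= 0
    return (((env << LOG_MAX_PROCS) | proc_index) << TAG_BITS) | TAG_PROC


def split_descriptor(descr):
    """Return (tag, proc_index, env) recovered from a packed descriptor."""
    tag = descr & ((1 << TAG_BITS) - 1)
    rest = descr >> TAG_BITS
    proc_index = rest & ((1 << LOG_MAX_PROCS) - 1)
    env = rest >> LOG_MAX_PROCS
    return tag, proc_index, env
```

Under these assumptions `make_proc_descriptor(3, 5)` is 1294, and `split_descriptor(1294)` gives back `(2, 3, 5)`. The env field is free for the client to use, for instance to tell one marking procedure which layout it is looking at.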