freenode/#clasp - IRC Chatlog
18:41:28
drmeister
Since we are doing full inlining by repeatedly using partial inlining - does that mean we are spending a lot of time doing partial inlining?
18:43:01
drmeister
When we do HIR inlining we are doing full inlining of functions by repeatedly doing partial inlining. It sounds like this is very expensive - is that the case?
18:45:04
karlosz
i'm not so familiar with the code, but if the inlining is to be done at the hir level and not the ast level then either way you'd have to copy all the instructions to be inlined
18:49:00
karlosz
would it be more than just adding the copied instruction to the ownership table with the caller as its owner?
18:59:00
drmeister
Some other way - whole function at a time. The "trivial" way that compiler books say is "trivial".
18:59:37
drmeister
How much time is spent recomputing ownership information? We can profile that with dtrace.
19:05:34
Bike
i couldn't tell you how long it takes. i'm making an educated guess based on my dumb code
19:26:50
drmeister
Shit - I was doing a (gctools:garbage-collect) for every top-level form read into the REPL.
19:32:20
karlosz
and it should account for at least 75% of the time in partial inlining, if i'm reading the profile correctly
19:32:29
drmeister
I think I thought in an interactive session it would be a reasonable thing to do. But the same code was being used during LOAD
19:38:37
Shinmera
It could be more excusable if it happens during a REPL clear (C-c M-o) but the implementation doesn't know about that.
19:40:17
drmeister
I saw it immediately now that I can profile MPS. count-calls showed 64% of the time was in gctools__garbage_collecte
19:41:15
drmeister
It never showed up with Boehm - I guess Boehm has a test and returns early if there is no point to a GC.
19:42:46
drmeister
./waf build_bmps (after ./waf build_amps was run) takes 8m54s - that's comparable to Boehm.
19:46:41
drmeister
This is the first time I am running MPS with the new bclasp optimizations I added a few months ago - the ones that rewrite the llvm-ir to move lexical variables that can live in registers out of closures.
19:49:30
drmeister
So, I have this big, curved Samsung monitor - I think that when I plug it into my MacBook Pro, compilation slows down - could that be?
19:52:11
Shinmera
Either way, for instance a 1080p truecolor image only takes up about 6MB. Hardly noteworthy.
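The 6MB figure can be checked with a line of arithmetic. A minimal sketch, assuming "1080p truecolor" means 1920x1080 pixels at 3 bytes (24 bits) per pixel with no padding or alpha channel; `frame-bytes` is a hypothetical helper name:

```lisp
(defun frame-bytes (width height bytes-per-pixel)
  "Uncompressed size in bytes of one WIDTH x HEIGHT frame."
  (* width height bytes-per-pixel))

;; 1920 * 1080 = 2073600 pixels, times 3 bytes each.
(frame-bytes 1920 1080 3) ; => 6220800, i.e. about 6.2 MB
```

With a 4-byte pixel format the same frame would be 8294400 bytes, still small next to the memory a compiler build uses.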
19:54:15
drmeister
It was every two years - but this looks like a significant refresh and we need another mac laptop.
19:55:19
Shinmera
I wonder when I'll buy a new workstation. Had this baby for I think 7 years now and it still runs just fine.
19:57:41
drmeister
If I can shave a few minutes off of clasp build time - cumulatively it will improve my quality of life.
20:01:03
drmeister
Bike: Could you write a make-load-form-saving-slots for classes and slot-descriptors?
20:01:50
drmeister
Doesn't have to be now. But I'd like to explore generating code to save the system.
20:02:44
Shinmera
drmeister: No, I mean setting up your new system until everything runs again as you need it to
20:04:14
drmeister
I think I'll do a clean rebuild this time though - I think I'm accumulating problems.
20:06:24
drmeister
But if I'm compiling on all cores and close it - I might as well have cycled the power.
20:17:51
nivpgir
at first my laptop froze for a while, and when it returned the build had crashed, so I thought it was related. I ran ./waf build_... again and it compiled a few more files, but then it crashed again on the same error
20:27:00
drmeister
Ok, previously ... 'STLIB': ['boost_filesystem', 'boost_date_time', 'boost_program_options', 'boost_system', 'boost_iostreams'],
20:30:08
drmeister
Could you do that? Then I'm sure I'm not wasting time chasing some phantom problem.
20:30:28
nivpgir
but with one job I'm afraid it's gonna have to stay running for the night, it's getting late here
20:30:55
drmeister
Ok. I'll be on tonight and tomorrow - if you come back with the results and they are the same I'll dig deeper.
20:37:45
drmeister
I really thought switching from static to dynamic libraries would make the difference.
20:40:07
nivpgir
my initial thought was that I didn't install the correct dependencies, that I didn't convert package names correctly somehow
20:41:10
drmeister
What we are seeing looks exactly like what is described here: https://tecnocode.co.uk/2014/10/01/dynamic-relocs-runtime-overflows-and-fpic/
20:43:26
drmeister
We followed the prescription "Another (highly related) solution is to link against a shared version of the static object."
20:43:50
drmeister
So we had a static object, we switched to the dynamic object and the problem persists?
21:28:09
drmeister
Once compile-file starts MPS slows down because mps_ap_fill takes 67.4% of the time.
21:28:32
drmeister
That's the function that is called when an allocation point runs out of space and has to allocate a new page to start filling.
21:59:43
drmeister
Cleavir uses hash-tables to traverse trees of instructions. It has to rapidly build up a hash table and then it throws it away.
22:30:48
drmeister
Bike: beach suggested that we could convert the map-instructions-xxx functions into generic functions so that we can redefine them in clasp.
22:31:52
Bike
would take a lot of editing, and i don't know if it would be hugely better if we did the thread local thing
22:31:56
drmeister
That means adding a required parameter to each of them and defining a dynamic variable cleavir-ir:*instruction-mapper* that is by default NIL and calls the current functions.
22:34:22
drmeister
But is this how we would do it? Right - can't we define a special variable (defvar cleavir-ir:*instruction-mapper* nil) and then pass that everywhere in Cleavir? Or do we need to add a system argument to their callers and to their callers and so on?
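A rough sketch of the special-variable idea, to show why it would not require threading an extra argument through every caller. Everything here is a toy stand-in and an assumption: `*instruction-mapper*` mimics the proposed cleavir-ir:*instruction-mapper*, and `node`, `default-map-nodes` and `map-nodes` are hypothetical names, not Cleavir's API.

```lisp
;; Hypothetical stand-in for the proposed cleavir-ir:*instruction-mapper*.
;; NIL means "use the built-in traversal".
(defvar *instruction-mapper* nil)

;; Toy instruction with a list of successors (not Cleavir's class).
(defstruct node name successors)

(defun default-map-nodes (function root)
  "Depth-first walk that records visited nodes in a fresh EQ hash table."
  (let ((visited (make-hash-table :test #'eq)))
    (labels ((walk (node)
               (unless (gethash node visited)
                 (setf (gethash node visited) t)
                 (funcall function node)
                 (mapc #'walk (node-successors node)))))
      (walk root))))

(defun map-nodes (function root)
  "Use the mapper bound to *INSTRUCTION-MAPPER*, or the default when it is NIL."
  (funcall (or *instruction-mapper* #'default-map-nodes) function root))
```

Because the variable is special, a client could rebind it once around a whole compilation, for example `(let ((*instruction-mapper* #'my-mapper)) ...)` with a hypothetical `my-mapper`, and every nested call to `map-nodes` would pick it up without any signature change.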
22:35:06
Bike
by thread local thing i meant the "putting an extra slot in instructions that indicates whether they've been mapped over" dealie
22:36:39
drmeister
It wouldn't be hugely different but for the fact that allocations due to map-instructions-xxx are hammering MPS.
22:37:32
drmeister
There are three options - but it's hard to figure out if they will improve things wrt MPS performance until we try them.
22:38:02
drmeister
(1) Move to open hashed hash-tables - eliminate cons cells from hash tables. There are other good reasons to do this.
22:38:49
drmeister
(2) use stealth mixins and rewrite map-instructions-xxx to add a 'touched' slot to instructions.
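Option (2) could look roughly like the sketch below. It is an illustration under assumptions, not Cleavir code: `insn` is a toy struct, and instead of a boolean 'touched' flag it stores an integer epoch so that no second pass is needed to clear the marks (a swapped-in variation on the idea in the log).

```lisp
;; Toy instruction; MARK plays the role of the proposed 'touched' slot.
(defstruct insn name successors (mark 0))

;; Bumped once per traversal; a node is visited when its MARK equals it.
(defvar *traversal-epoch* 0)

(defun map-insns-marking (function root)
  "Depth-first walk that marks nodes in place instead of filling a hash table."
  (let ((epoch (incf *traversal-epoch*)))
    (labels ((walk (insn)
               (unless (= (insn-mark insn) epoch)
                 (setf (insn-mark insn) epoch)
                 (funcall function insn)
                 (mapc #'walk (insn-successors insn)))))
      (walk root))))
```

The walk builds no per-traversal table, which is the point with respect to allocation pressure. The cost is that the marks are shared state: two threads, or a traversal started from inside another traversal of the same graph, would overwrite each other's marks.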
22:43:21
drmeister
Well, I would argue that it's a bit perverse to be allocating so much memory just to walk trees over and over again.
22:45:20
drmeister
I probably won't hear from Ravenbrook until next week to get any ideas of how to improve this from the MPS side.
22:46:35
drmeister
Do we need to pass a system parameter into every function that calls map-instructions-xxx and so on?
22:47:01
Bike
either that or have a dynamic variable for the system, or something, but that would be unusual yeah
0:31:20
Bike
why would mps_ap_fill be called more by one kind of allocation in particular? unless that allocation is really common i guess...
0:33:28
drmeister
It is pretty common - every time you enter a function. They are supposed to be pruned if there is no return-from that can target them
0:49:01
Bike
i mean, if i do like (concatenate 'string "foo" "bar") i thought it'd give back a character strin
0:51:34
drmeister
foo allocate bar allocate baz allocate biz allocate tang allocate wang allocate cc_gatherRestArguments allocate
0:52:54
drmeister
How difficult would it be to write an optimization that checks if the rest variable escapes and if it doesn't use a &va-rest?
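For comparison, standard Common Lisp already lets the programmer assert by hand what such an optimization would have to prove: a DYNAMIC-EXTENT declaration on a &rest variable promises that the list does not escape, so an implementation may stack-allocate it. A minimal sketch with hypothetical function names; whether a given compiler (Clasp included) actually avoids consing here is an assumption, not something the log states.

```lisp
;; The rest list is only read during the call, so it does not escape.
(defun sum-args (&rest numbers)
  (declare (dynamic-extent numbers))
  (reduce #'+ numbers))

;; Here the rest list is returned to the caller, so it escapes and must
;; NOT be declared dynamic-extent; it has to be a real heap list.
(defun keep-args (&rest items)
  items)
```

An automatic version of this would have to classify each function like `sum-args` versus `keep-args` on its own, which is the escape check the question is about.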
0:55:34
Bike
i was going to ask if it's called by the caller or by the &rest function itself, but i guess that's obvious
0:57:52
Bike
even without my fancy optimization, i could put in something to eliminate almost all calls to the symbol method. might as well give it a shot i guess.
0:59:16
Bike
though... how are there 129 calls to make-instance/symbol and only 3 to make-instance/class
1:02:19
drmeister
https://github.com/clasp-developers/clasp/blob/dev/src/lisp/kernel/clos/builtin.lsp#L40
1:03:37
Bike
like i said, that one specifically i can deal with by other means. i'm wondering about some other ones, though
1:04:56
Bike
the standard allocate-instance methods ignore the initargs, but i guess the compiler doesn't know that so it gathers them anyway
1:05:57
drmeister
https://github.com/clasp-developers/clasp/blob/dev/src/lisp/kernel/clos/standard.lsp#L38
1:06:34
drmeister
That one only iterates over the list or passes it to an error. I can make a copy and iterate over the copy.
1:07:26
Bike
i don't like having implementation stuff in these general methods... but the compiler isn't good enough right now, so i'll suck it up
1:12:48
drmeister
No, I can't copy a vaslist - I'd need to be able to allocate a 24-byte structure in the stack frame and use that.
4:25:00
drmeister
Bike: I get one pass through a &va-rest list - even if I copy it into another variable.
4:26:20
drmeister
I would need a special operator to copy the &va-rest parameter into another variable so that I can convert it to a list if there is an error.
4:27:52
drmeister
But we would have to know how large they need to be at compile time - we don't have that.
4:29:01
Bike
we would? you can have variable alloca... the "function" anyway, dunno the llvm semantics
4:45:19
drmeister
The only thing we lose is the ability to provide the arguments in the error message.
4:46:42
drmeister
If I could go (let ((saved-initargs (core:vaslist-copy initargs))) (declare (dynamic-extent saved-initargs)) ... )
4:52:08
drmeister
I could allocate it on the heap - but that kind of defeats the purpose of avoiding consing.
5:11:21
drmeister
Hoookay - that was a terrible idea. shared-initialize repeatedly loops over the initargs
6:00:38
drmeister
I changed fewer than 10 lines of code and now vaslists are 64 bytes rather than 32 and they keep a copy of their original value.