freenode/#clasp - IRC Chatlog
Search
11:03:46
selwyn
i built in 1h51 just now - basically no change. i wonder if i did something wrong? asdf did take a while
11:06:30
drmeister
I don't know - timing can be tricky. The flame graphs show improvements in inlining.
11:07:13
drmeister
Do you mean what is the difference between the CST compiler and the AST compiler?
11:08:26
drmeister
A couple of things - (1) the CST compiler tracks source locations for every expression. Source information is propagated all the way from where the source is read down to the object files.
11:09:49
drmeister
(2) The CST compiler fixes a subtle bug in the AST compiler that I still don't really appreciate - I think it's with closed over environments.
11:10:26
drmeister
In fixing that bug it generates code where LET statements are expanded into calls to functions. This slows down the generated code significantly (by a factor of 2 or so).
11:11:07
drmeister
The solution is to do a lot more inlining at the HIR level to improve the generated code performance.
11:11:24
drmeister
That inlining code is what we are dealing with now - it slows the compiler down a lot.
11:12:31
drmeister
So we crawl through this valley of slowness to come out the other side with a better compiler.
11:13:02
drmeister
The next big thing to do is value numbering, type inference and removal of dead code.
11:14:13
selwyn
right. i have to go now, i will read the logs and be back in a couple of hours. thanks for the explanation
12:18:51
drmeister
I changed the file that I compile-file for this flame graph - I switched to format.lsp - it has a really long pause at the end when it is compiling the format compiler-macro.
12:22:42
drmeister
I have a suggestion: (1) We temporarily add a slot to instructions that stores a 'visited' mark and write a new version of map-instructions-xxxx that uses the 'visited' mark rather than a hash table to keep track of which instructions have been visited. I believe this will cut inlining time by quite a lot.
12:23:09
drmeister
(2) We work on value numbering, type inference and removing useless code paths to tighten up the code.
15:53:29
kpoeck
karlosz: Changing to the new sicl - with your patches - reduced the build time for clasp - after wiping the build - from "Compilation finished in 5:24:51" to "Compilation finished in 4:03:12". Quite an improvement
16:56:16
karlosz
hello everyone. I have some more inlining improvements here: https://github.com/robert-strandh/SICL/pull/131
16:56:50
karlosz
i know you guys just did another rebase, but i observed another 2-3 hours improvement with these patches
16:57:12
Bike
the problem i had with set-predecessors is that we could inline a function that doesn't return, and that will cut off whatever the call dominates
16:58:07
karlosz
could you maybe take a look? i did some instrumenting and thought i got rid of most of that
17:00:18
Bike
well it's basically the same issue. we don't know what instructions we have to unhook from predecessors or definers/users
17:30:27
selwyn
drmeister: i'll find out the hard way - building with https://github.com/robert-strandh/SICL/pull/131 now
18:47:44
drmeister
Bike: We have a _SetfFunction slot - do you know if I use eval::funcall(_sym_touched->_SetfFunction,value,object) - will that set the value of the 'touched' slot?
18:49:10
drmeister
I think that's how it's supposed to work if I want to set the value of an object's slot.
18:49:36
kpoeck
karlosz: A flamegraph compiling asdf with your fixes that are already merged to sicl
18:57:26
drmeister
I know the name of the accessor is touched - so the setf function is (setf touched).
18:58:35
drmeister
I've rewritten map-instructions, map-instructions-with-owner and map-instructions-arbitrary-owner in C++ and switched from using a hash-table to using a slot called 'touched' in each instruction.
19:00:06
drmeister
I want to see if my hypothesis is right that the overhead of using hash-tables to keep track of which instructions have been seen is responsible for a lot of the time taken by inlining.
19:20:50
drmeister
Bike: Did you add automatic generation of xxxx-instruction-p predicates? I thought you did but I don't see enclose-instruction-p
19:22:20
Bike
i guess with the dtree interpreter in place there's not as much worry of weird slowdowns, so i may as well
19:25:05
drmeister
selwyn: My new linux machine at home builds AST in about the same amount of time.
19:25:11
Bike
it'll take me a bit to figure out since it involves anonymous generic functions and stuff
19:26:02
drmeister
FYI startup is a bit slower now because there are some compilations happening even when it only compiles the 1024th time a gf is called.
19:26:37
drmeister
Also - once you start compiling things there are big slowdowns as all the compilation chickens come home to roost.
19:31:09
Bike
like, i can put the thing to set up the predicate in initialize-instance for class, but that doesn't work for anything built in
19:32:05
drmeister
It's ok - I'm just hacking here - I added an enclose-instruction-p and enter-instruction-p predicate.
19:32:43
Bike
don't push it to the repo or anything, that caused me a lot of rebasing problems before
19:33:39
drmeister
I don't touch sicl - that's beach's and your domain - this is an experiment that I would be perfectly happy to see fail.
19:34:08
selwyn
speaking of chickens: at ELS i found out there is an attempt at chicken scheme <-> c++ integration
20:06:31
drmeister
It's not directly comparable to the other flame graphs I've posted because this is single threaded.
21:01:27
drmeister
Well, I'll be hornswoggled - the C++ version doesn't appear to offer any benefits whatsoever.
21:01:58
drmeister
I'm not sure - because I have to time an entire build - but the flame graphs look exactly the same!
21:02:40
drmeister
Yeah - they are different in the respects where I changed them - that's different.
21:06:17
Bike
it did decrease the reinitialize-data fraction by a percent or two. but yeah, doesn't seem to have helped much