freenode/#clasp - IRC Chatlog
1:57:47
drmeister
I modified the static analyzer to generate more descriptive output of classes that the scraper could use to build compile-file info that clasp needs for precise GC.
2:00:02
drmeister
The stamp_wtag numbers at the end 7, 11, 15 should be ignored - we need to recalculate them in the scraper - but we already do that.
2:02:09
drmeister
The static analyzer generates this clasp_gc_xxx.desc file that contains these S-expressions, and it still generates the clasp_gc_xxx.cc file, which we will continue to use until we get the scraper to generate everything that is in clasp_gc_xxx.cc.
2:03:45
drmeister
With this I think we can merge extensions together and generate a description of the layout of all objects that works with the specific combination of extensions that a person has.
3:06:55
drmeister
Did you get any more insights into your call site optimization technique after the talk?
3:07:47
drmeister
Oh? That was worth figuring out? I was kicking myself afterwards for asking a stupid question. :-)
3:07:48
beach
The callee allocates the snippet, so it can determine how far away it is from itself, and use the appropriate call.
3:10:03
drmeister
Let me say that better. Short calls give better performance on x86. We have switched between the large code model (all 64-bit calls) and the small code model (primarily short calls) and there was a significant difference in performance. It's not all due to jumps. But a 64-bit call/jump means load a register and then jump to the contents of that register.
3:14:26
drmeister
I've been wondering if for functions with few arguments you couldn't leave space in the caller for most cases.
3:16:00
beach
Sure, that's possible. But even for functions like that, you have the indirection for the static environment and the entry point, and you load the environment even when it is not necessary, so even in cases like that, I win.
3:16:26
beach
On SICL, there are also more indirections than on most systems, so it is important to eliminate those.
3:17:03
drmeister
Could you elaborate on the "But even for functions like that, you have the indirection for the static environment and the entry point, and you load the environment even when it is not necessary, so even in cases like that, I win."?
3:19:03
drmeister
I don't doubt that the technique wins. I'm thinking about reducing the amount of snippet allocation since one might be able to guess an upper limit in bytes for the snippet and leave that amount of space in the caller and the callee would rewrite inside the caller when it is redefined.
3:19:05
beach
Without my optimization, you would have to follow an indirection through the name (a symbol or a function cell) to get to the function object. Then, from the function object, you would do two memory accesses to get to the static environment and to the entry point. Right?
3:19:56
beach
And in the case of SICL, the function is a standard object, so I have another indirection through the header.
3:20:52
beach
With my technique, you need no indirection through the name, and if the callee is not a closure, you don't load the static environment.
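[Editor's sketch, not from the log] A toy Python model of the load counts beach describes above. It only counts memory accesses; it does not emit or patch machine code. The class names, the function names, and the choice of one load for a closure's environment on the snippet path are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FunctionObject:
    entry_point: int                     # hypothetical code address
    static_env: Optional[tuple] = None   # None means "not a closure"


@dataclass
class FunctionCell:
    function: FunctionObject             # what the name (symbol or cell) points to


def default_call_loads(cell: FunctionCell, header_indirection: bool = False) -> int:
    """Count loads on the ordinary path: name -> function object -> env + entry."""
    loads = 1                 # follow the name to the function object
    if header_indirection:
        loads += 1            # SICL-style extra hop through the object header
    loads += 1                # load the static environment (even if unused)
    loads += 1                # load the entry point
    return loads


def snippet_call_loads(fn: FunctionObject) -> int:
    """Count loads when the callee has written a call snippet for this call site.

    The entry point is baked into the snippet, so it costs no load here.
    Assumption: a closure still needs one load for its environment.
    """
    return 0 if fn.static_env is None else 1


plain = FunctionObject(entry_point=0x1000)
cell = FunctionCell(plain)
print(default_call_loads(cell))                           # 3
print(default_call_loads(cell, header_indirection=True))  # 4
print(snippet_call_loads(plain))                          # 0
```

The point of the model is only the difference between the two paths: the default path pays for the name and the environment on every call, while the snippet path pays nothing for a non-closure.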
3:21:47
beach
Then the callee needs to take them from that place and store them in the places where it wants them.
3:22:45
beach
With my technique, it may be possible to avoid the intermediate location and take it directly from where the caller put them and stick them where the callee wants them.
3:24:19
drmeister
Estimating the maximum size the snippet could ever be from how the caller calls it and leaving enough space to set up the arguments, make the call and write the multiple return values.
3:24:30
beach
I think that would be messy, but you can try that if you like. You may have to do that because of the way your GC works. In fact, that was my original idea a year or so ago, and because of the space restriction, I did not pursue it.
3:25:53
beach
You may no longer be able to inline the callee completely in the snippet as I suggested in some cases.
3:27:25
drmeister
It just struck me that it's like the movitz paper - 90% of calls are 3 args or fewer sort of thing.
3:29:37
beach
then when it is traced, we can turn all the snippets into default ones, and actually do that call.
3:32:09
drmeister
Is the long jump to CAR that cheap that it's reasonable to jump to the snippet and jump back rather than inline in the caller?
3:34:26
drmeister
Spaghetti code though. I don't mean that as a criticism. But that's what it will look like in memory.
3:35:01
drmeister
::notify Bike This is what the static analyzer produces now for cando: https://gist.github.com/drmeister/8e2cbdfb89ddce2c9a83f75b470b5b4e
3:35:41
beach
I am not sure I understand the question. Nobody needs to look at the memory. So as long as it does what it is supposed to, it should be fine.
3:36:34
drmeister
Yup - it's not a criticism. It makes sense. I've been doing a lot of disassembly of memory lately.
3:38:21
beach
I see. Well, you might want your disassembler to take into account this technique and produce something more obvious than a jump.
3:45:12
Colleen
Bike: drmeister said 10 minutes, 11 seconds ago: This is what the static analyzer produces now for cando: https://gist.github.com/drmeister/8e2cbdfb89ddce2c9a83f75b470b5b4e
3:47:08
drmeister
I'm thinking we can read this info into the scraper and merge the info from clasp with the info from each extension. We can check everything for internal consistency and if it checks out then the extensions are compatible with each other and we assign stamp values and then generate the C++ info in header files like the scraper already does.
3:48:25
drmeister
I think everything in the clasp_gc.cc file can be generated from this - and the merging should be straightforward.
3:49:28
drmeister
We would need to check the consistency of each extension's data with the clasp we are extending and make sure that no two extensions define the same classes (rare).
3:51:23
drmeister
Once we assign the stamps in the scraper - the scraper can then generate the IsA and TYPEQ tests.
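[Editor's sketch, not from the log] A minimal Python illustration of the merge, check, and stamp-assignment steps drmeister outlines above. The real scraper is not written in Python and its data layout is not shown in the log. Everything here is an assumption: the dict-of-parents input format, the function names, the example class names, and the depth-first numbering that makes an IsA test a range check.

```python
def merge_descriptions(base, extensions):
    """Merge class descriptions; each maps class name -> parent name (None for a root).

    Raises ValueError if an extension redefines a class or names an unknown parent.
    """
    merged = dict(base)
    for ext_name, classes in extensions.items():
        for cls, parent in classes.items():
            if cls in merged:
                raise ValueError(f"{ext_name} redefines class {cls}")
            merged[cls] = parent
    for cls, parent in merged.items():
        if parent is not None and parent not in merged:
            raise ValueError(f"{cls} has unknown parent {parent}")
    return merged


def assign_stamps(merged):
    """Number classes depth-first so each class owns a contiguous stamp range.

    Returns class name -> (own_stamp, last_stamp_among_descendants).
    """
    children = {}
    for cls, parent in merged.items():
        children.setdefault(parent, []).append(cls)
    ranges = {}
    counter = 0

    def visit(cls):
        nonlocal counter
        first = counter
        counter += 1
        for child in sorted(children.get(cls, [])):
            visit(child)
        ranges[cls] = (first, counter - 1)

    for root in sorted(children.get(None, [])):
        visit(root)
    return ranges


def is_a(ranges, obj_class, target_class):
    """IsA test as a range check on the object's own stamp."""
    stamp = ranges[obj_class][0]
    low, high = ranges[target_class]
    return low <= stamp <= high


# Illustrative class names only; not taken from the analyzer output.
base = {"Object": None, "Function": "Object", "Cons": "Object"}
extensions = {"ext_a": {"Atom": "Object"}, "ext_b": {"Bond": "Object"}}
ranges = assign_stamps(merge_descriptions(base, extensions))
print(is_a(ranges, "Atom", "Object"))   # True
print(is_a(ranges, "Object", "Atom"))   # False
```

Because stamps are assigned only after the merge, the numbering reflects the specific combination of extensions that was merged, which is the property the discussion is after.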
3:51:34
Bike
does the scraper understand different namespaces? like is it ok if two different namespaces define the same class? well i guess it must be since llvmo has function and stuff
3:55:07
drmeister
It is stupid in that it only recognizes "namespace XXXX {" and treats everything like there is only one level of namespaces.
3:55:47
drmeister
There are places where we use two - but I think I've been careful not to define stuff the scraper will recognize inside of a nested namespace.
3:56:37
drmeister
Yeah - I think it just watches for "namespace XXXX" and that causes the scraper to say - "ok, I'm in namespace XXXX and that means I'm in package YYYY".
3:57:33
drmeister
The NAMESPACE_PACKAGE_ASSOCIATION(mp, MpPkg, "MP") macro maps C++ namespaces to CL packages.
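[Editor's sketch, not from the log] A small Python illustration of the single-level namespace tracking described above. It is not the scraper's actual code. The regular expressions and the function name are assumptions; only the NAMESPACE_PACKAGE_ASSOCIATION(mp, MpPkg, "MP") form comes from the conversation.

```python
import re

# Matches lines such as: namespace mp {
NAMESPACE_RE = re.compile(r'^\s*namespace\s+(\w+)\s*\{')
# Matches: NAMESPACE_PACKAGE_ASSOCIATION(mp, MpPkg, "MP")
ASSOC_RE = re.compile(
    r'NAMESPACE_PACKAGE_ASSOCIATION\(\s*(\w+)\s*,\s*(\w+)\s*,\s*"([^"]*)"\s*\)'
)


def scan_namespaces(lines):
    """Return a list of (namespace, package) pairs, one per 'namespace X {' line.

    Only the most recent namespace is remembered; closing braces are ignored,
    so nested namespaces simply overwrite the current one.
    """
    lines = list(lines)
    packages = {}
    for line in lines:                      # first pass: collect associations
        match = ASSOC_RE.search(line)
        if match:
            packages[match.group(1)] = match.group(3)
    entered = []
    for line in lines:                      # second pass: track namespaces
        match = NAMESPACE_RE.match(line)
        if match:
            name = match.group(1)
            entered.append((name, packages.get(name)))
    return entered


source = [
    'NAMESPACE_PACKAGE_ASSOCIATION(mp, MpPkg, "MP")',
    'namespace mp {',
    '  namespace detail {',
    '  }',
    '}',
]
print(scan_namespaces(source))   # [('mp', 'MP'), ('detail', None)]
```

The second pair in the output shows the limitation mentioned in the log: a nested namespace replaces the current one and has no package of its own, which is why scraper-visible definitions are kept out of nested namespaces.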
4:00:25
drmeister
Bike: Could you take a look at the scraper and plan how to incorporate this information from the static analyzer?
4:01:10
drmeister
We will put the output into the main directory of each extension and then tell waf how to gather them all up.
4:01:43
drmeister
Then we will pass all the names to the scraper so that it can digest them and generate a richer version of the code that it already generates.
4:03:33
drmeister
I think we can eliminate the define-stampwtag entries and put the inheritance that they define in the class-kind entries.
4:57:20
drmeister
::notify cracauer` I think I fixed things so that the buildbot should work with the main branch.