libera/#sicl - IRC Chatlog

3:24:06 hayley Though I am not proposing such a design for SICL, a read barrier could also be used to avoid the indirection through a dyad for standard objects. For example, a rack could start with a word that is 0 if the rack is up to date, or otherwise a pointer to the next rack. The barrier would test whether that word of a standard object is non-zero, just after loading a reference to a standard object. This would also allow for concurrent compaction, but it has the downside

3:24:06 hayley that a race involving CHANGE-CLASS would probably break EQ.

3:27:08 hayley I don't recommend this approach, as I don't know if all the branching in the read barrier is faster than the additional indirection in the dyad-and-rack representation, especially when the rack can be cached in the latter. The former also requires another tag test to rule out cons cells and immediate values, because the barrier needs to run just after loading a reference, whereas loading the rack from a dyad can be done at any time.
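
The read barrier hayley describes can be sketched roughly as follows. This is only an illustration in Python, not SICL code: the `Rack` class, the `forward` field (standing in for the first word, with `None` playing the role of 0), and the functions `read_barrier` and `change_class` are all hypothetical names, and the `isinstance` check stands in for the tag test that rules out cons cells and immediates.

```python
class Rack:
    """Hypothetical rack whose first word is a forwarding pointer."""

    def __init__(self, slots):
        self.forward = None       # None plays the role of 0: rack is up to date
        self.slots = list(slots)


def read_barrier(ref):
    """Run just after loading a reference; returns the up-to-date rack."""
    # Stand-in for the tag test: only standard objects carry a forwarding word.
    if not isinstance(ref, Rack):
        return ref
    # Follow "pointer to the next rack" links until an up-to-date rack is found.
    while ref.forward is not None:
        ref = ref.forward
    return ref


def change_class(rack, new_slots):
    """Create a new rack and mark the current one stale by forwarding it."""
    new = Rack(new_slots)
    read_barrier(rack).forward = new
    return new
```

Every load of a reference pays for the tag test and the forwarding test, which is the branching cost weighed above against the extra indirection through a dyad.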

3:29:16 beach I have systematically ruled out read barriers as being too costly, but I might have to start thinking about those.

3:30:02 hayley I was thinking that the indirection is a bit like a read barrier, so in a way, we already use read barriers.

3:30:25 beach It is, yes.

4:02:48 hayley But the indirection is only used when we need to retrieve a slot, whereas we'd have to run this read barrier more "eagerly", and we'd have to test tags for every tagged pointer we load. So the cost model is quite different.

4:04:21 beach Sure.

5:25:46 beach` ** NICK beach

6:02:20 moon-child what's the difficulty with races?

6:05:20 hayley If one thread had run the read barrier earlier, and held onto that rack, and then another thread performed CHANGE-CLASS on the object, then gave the first thread the new rack, the racks would not be EQ. One could put a barrier on EQ (a la Shenandoah, replication copying) but that's just another slowdown.
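
The interleaving above can be replayed sequentially to show the problem. This is a minimal sketch with hypothetical names (`Rack`, `resolve`, `raw_eq`, `barrier_eq`, `demo`); it uses no real threads, only the order of events hayley describes, and `barrier_eq` is just one possible shape for a barrier on EQ.

```python
class Rack:
    """Hypothetical rack with a forwarding word (None means up to date)."""

    def __init__(self, slots):
        self.forward = None
        self.slots = list(slots)


def resolve(ref):
    """Follow forwarding pointers; non-racks are returned unchanged."""
    while isinstance(ref, Rack) and ref.forward is not None:
        ref = ref.forward
    return ref


def raw_eq(a, b):
    """EQ as plain pointer comparison, with no barrier."""
    return a is b


def barrier_eq(a, b):
    """EQ with a barrier: resolve both operands before comparing."""
    return resolve(a) is resolve(b)


def demo():
    held = Rack(["x"])            # thread 1 ran the read barrier and kept this rack
    fresh = Rack(["x", "y"])      # thread 2 performs CHANGE-CLASS ...
    held.forward = fresh          # ... which forwards the old rack to the new one
    # Thread 2 hands `fresh` to thread 1, which compares it with `held`.
    return raw_eq(held, fresh), barrier_eq(held, fresh)
```

Here `demo()` returns `(False, True)`: the same object looks like two different objects to a plain pointer comparison, and only the extra work in `barrier_eq` repairs that.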

6:05:50 hayley Or maybe there's another time a thread can realise the old rack is stale?

6:07:00 moon-child what if the normal state of a rack is that the first word is a pointer to that rack

6:07:11 moon-child then eq is comparing the first words of the operands

6:07:43 moon-child I guess that's more expensive than it would otherwise be, since you have to dispatch on _type_
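
moon-child's variant might look something like the sketch below, where `Rack`, `first`, `change_class` and `eq` are hypothetical names. The `isinstance` test stands in for the type dispatch mentioned above, and the sketch assumes a single CHANGE-CLASS; a longer chain of stale racks would need more than one comparison of first words.

```python
class Rack:
    """Hypothetical rack whose first word normally points to the rack itself."""

    def __init__(self, slots):
        self.first = self         # normal state: self-pointer
        self.slots = list(slots)


def change_class(old, new_slots):
    """Point the old rack's first word at the replacement rack."""
    new = Rack(new_slots)
    old.first = new
    return new


def eq(a, b):
    """Compare first words for racks, plain identity for everything else."""
    if isinstance(a, Rack) and isinstance(b, Rack):
        return a.first is b.first
    return a is b
```

With this layout a stale rack and its replacement compare equal without a loop, at the cost of one extra load per rack operand and the type dispatch.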

6:09:06 moon-child I remain unconvinced that change-class needs to be fast. But sm2n said he uses it...

6:12:18 moon-child dumb idea

6:12:32 moon-child what if you reserve a bunch of lowtags

6:12:45 moon-child and cycle through them, using the chosen lowtag as a generation number

6:12:53 moon-child and when you overflow, you stop the world or something

6:15:47 moon-child hayley: oh, duh, right, I forgot this trick

6:16:16 moon-child you allocate a separate header and rack, but you put the rack directly after the header in memory

6:17:25 moon-child you load the rack pointer from the header, and then have a branch checking that the rack pointer is right after the header

6:18:01 moon-child and then you start reading from the location two words after the header right away, in the happy path--you don't have to wait for the rack pointer load to get back before you continue

6:18:26 moon-child for a change-classed object, the rack pointer will point somewhere else, and you take the mispredict

6:18:31 moon-child but the header pointer is still stable
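
The layout behind this trick can be modelled with a flat list of words. The sketch below is an assumption-laden illustration: the names `allocate`, `slot_read` and `change_class`, the two-word header (a class word plus a rack pointer), and the `"fast"`/`"slow"` labels are all hypothetical. Python cannot show the hardware effect, namely that the slot load on the fast path need not wait for the rack pointer load; it only shows the layout and the branch.

```python
HEADER_WORDS = 2  # assumed header layout: [class word, rack pointer]


def allocate(heap, cls, slots):
    """Allocate a header with its rack placed directly after it."""
    h = len(heap)
    heap.extend([cls, h + HEADER_WORDS])
    heap.extend(slots)
    return h


def slot_read(heap, h, i):
    """Read slot i of the object at header address h, reporting the path taken."""
    rack = heap[h + 1]
    if rack == h + HEADER_WORDS:
        # Happy path: the address depends only on h, not on the loaded pointer.
        return heap[h + HEADER_WORDS + i], "fast"
    # Rack was moved elsewhere (e.g. by CHANGE-CLASS): follow the pointer.
    return heap[rack + i], "slow"


def change_class(heap, h, new_cls, new_slots):
    """Allocate a new rack elsewhere; the header address h stays stable."""
    new_rack = len(heap)
    heap.extend(new_slots)
    heap[h] = new_cls
    heap[h + 1] = new_rack
```

Since the header never moves, EQ on header addresses keeps working; the price is that a change-classed object stays on the slow path until something moves its rack back next to the header.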

6:34:12 hayley I had thought about EQ comparing another word, and putting the rack after the header, but not the branch trick.

6:37:33 hayley And that is clever, but I would like the fast path to eventually happen after a CHANGE-CLASS, if possible. The read barrier does at least achieve it.

6:39:42 hayley Maybe there's some way to push fix-up work onto GC, and to push for an earlier GC if we take the slow path too many times (oh dear, it's like replication for thread-local heaps again).
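
One way to read "push for an earlier GC if we take the slow path too many times" is a simple counter with a threshold. The sketch below is purely hypothetical: `SlowPathMonitor`, its `threshold` parameter and its method names are invented for illustration, and nothing here says how SICL or any other collector would actually schedule the fix-up.

```python
class SlowPathMonitor:
    """Hypothetical counter that asks for an earlier GC after many slow paths."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.hits = 0
        self.gc_requested = False

    def record_slow_path(self):
        """Call whenever a slot access finds a moved rack."""
        self.hits += 1
        if self.hits >= self.threshold:
            self.gc_requested = True

    def gc_finished(self):
        """Call once the collector has fixed up moved racks."""
        self.hits = 0
        self.gc_requested = False
```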

6:39:59 moon-child presumably you have some kind of global compactor. So you can do that then

6:40:19 moon-child 'push for an earlier GC if...' ha, well

6:40:25 hayley Then it's a question of how often that global compactor runs.

6:40:56 moon-child yeah

6:41:00 hayley In SICL and my GC for SBCL, it hopefully doesn't run very often.

6:42:59 moon-child well. change-class hopefully doesn't run very often

8:24:45 splittist moon-child: you don't want those things to be self-fulfilling - no-one uses change-class so we can make it slow so no-one uses change-class...

8:47:46 hayley Beats "people don't use standard objects in performance-sensitive code".