freenode/#sicl - IRC Chatlog
5:21:31
drmeister
Bike: In the paper that you posted (that I got from Steve Blackburn) what are the different yieldpoint methods in Figure 3 describing?
5:22:20
drmeister
I understand the (a) conditional one (I think). But (b) and (c) - I don't see how they work.
5:23:17
no-defun-allowed
Some Java virtual machines cause threads to read or write an address, which causes a segfault, which then gets handled.
5:23:30
Bike
they're explained under "Trap-Based Polling Yieldpoints". Basically, when the yieldpoint is hit it does a meaningless memory operation on some page. When you want the yieldpoint to activate, you protect the page so that memory operations on it cause the system to trigger an interrupt
5:24:16
drmeister
That is faster than a comparison and a branch? I guess the slow path is a lot slower - right?
5:25:46
Bike
I think figure 5 is the one you want. sometimes the traps do much worse but not always.
5:26:22
drmeister
Ok. I should have read the paper again. I was going off the figure 3 that Steve posted in a Zulip post and I needed the paper to decode it.
5:26:42
Bike
though it depends. in the results under "Global Yieldpoints" they get a 2.5% overhead for a conditional, 2.0% for the load trap, and then 36% for the store trap
5:27:40
Bike
that kind of memory trap stuff isn't something i've dealt with before, so i don't know the ins and outs very well, unfortunately
5:27:57
Bike
but my impression is that garbage collectors and stuff have often used these mechanisms
5:29:58
beach
And it is not clear to me what the performance penalty of invoking the operating system would be.
5:36:45
beach
I know I need something like that when the global collector requests its "roots" from the nursery collectors.
5:38:17
Bike
they mention garbage collection, "user-level thread preemption" so i guess interrupt-thread, code patching, "biased locking" which from a quick google is probably irrelevant whatever it is, and profiling.
5:39:18
Bike
the paper is mostly about yieldpoints themselves. doesn't go into the applications too much
5:41:18
Bike
we actually have problems in clasp with thread interrupts. you can only interrupt threads at safe/yield points, which for now are allocations. So if a thread is in a loop that doesn't allocate, you can't interrupt it.
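A minimal sketch of the conditional (polling) style of yieldpoint discussed above, written in Python purely for illustration; it is not how clasp implements interrupts. The names `interrupt_requested`, `yieldpoint`, `Interrupted`, and `spin` are hypothetical. The loop below does no allocation-triggered check, so it can only be stopped because an explicit yieldpoint sits in its body.

```python
import threading


class Interrupted(Exception):
    """Raised at a yieldpoint when another thread has requested an interrupt."""


# Hypothetical global flag; a real runtime would more likely use a per-thread flag.
interrupt_requested = threading.Event()


def yieldpoint():
    # Conditional yieldpoint: a load, a compare, and a rarely taken branch.
    if interrupt_requested.is_set():
        raise Interrupted()


def spin(result):
    # A loop that would otherwise never reach a safe point.
    n = 0
    try:
        while True:
            n += 1
            yieldpoint()
    except Interrupted:
        # Record how many iterations ran before the interrupt was noticed.
        result.append(n)
```

Without the `yieldpoint()` call in the loop body, setting `interrupt_requested` from another thread would have no effect on `spin`.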
5:41:47
Bike
i suppose cleavir is probably well enough developed now that i could whip up an insertion pass. i know you already wrote some loop detection code
5:44:04
beach
Hmm. I am thinking that one could have a counter on back arcs, so only test the yieldpoint every 100 times or so.
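The back-arc counter idea can be sketched as follows. This is an illustrative Python model, not SICL or Cleavir code; `Yieldpoint`, `run_loop`, and the `flag_checks` counter (present only to make the effect visible) are assumptions. The shared flag is read once every `interval` back edges instead of on every iteration.

```python
import threading


class Yieldpoint:
    """Hypothetical yieldpoint state polled only every `interval` back edges."""

    def __init__(self, interval=100):
        self.interval = interval
        self.requested = threading.Event()
        self.flag_checks = 0  # illustration only: counts reads of the flag

    def poll(self):
        self.flag_checks += 1
        return self.requested.is_set()


def run_loop(yp, iterations):
    """Run a loop, returning the iteration index at which it yielded, or None."""
    countdown = yp.interval  # local counter, cheap to decrement
    for i in range(iterations):
        # ... loop body would go here ...
        countdown -= 1
        if countdown == 0:  # back edge taken `interval` times
            countdown = yp.interval
            if yp.poll():
                return i
    return None
```

The trade-off is latency: a pending request is noticed up to `interval - 1` iterations late, in exchange for touching the shared flag far less often.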
5:45:37
Bike
you could work type inference into it. if the code has (loop for i below n ...) and n is an (integer 0 100) the compiler doesn't bother inserting points.
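A sketch of the decision such an insertion pass might make, assuming the compiler can hand over the inferred type of the loop bound as a `(low, high)` pair, with `None` meaning unbounded. The function name `needs_yieldpoint` and the threshold parameter are hypothetical, and the sketch ignores how expensive each iteration is.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for an inferred (integer low high) type; None = unbounded.
IntegerRange = Tuple[Optional[int], Optional[int]]


def needs_yieldpoint(bound_type: IntegerRange, max_unpolled_trips: int = 100) -> bool:
    """Decide whether a counted loop needs a yieldpoint on its back edge.

    The loop is left unpolled only when the inferred upper bound proves
    that it runs at most `max_unpolled_trips` times.
    """
    _low, high = bound_type
    if high is None:
        # No known upper bound: the loop could run arbitrarily long.
        return True
    return high > max_unpolled_trips
```

For the example in the message, `needs_yieldpoint((0, 100))` returns `False`, so no yieldpoint would be inserted.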
5:48:34
beach
Here is another idea. Instead of checking at function calls, check at function returns. Then it can be done by modifying return addresses on the stack.
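The return-address idea can be modelled in Python as below. This is only a toy model: real code would overwrite a machine return address on the native stack, whereas here each `Frame` holds a callable standing in for that address. `Frame`, `request_yield_at_return`, and `do_return` are hypothetical names.

```python
class Frame:
    """Toy stack frame; `return_to` models the saved return address."""

    def __init__(self, name, return_to):
        self.name = name
        self.return_to = return_to


def request_yield_at_return(stack, handler):
    """Redirect the newest frame's return through a trampoline.

    The running code pays nothing until it returns; at that point the
    trampoline runs `handler` and then continues at the original address.
    """
    frame = stack[-1]
    original = frame.return_to

    def trampoline(value):
        frame.return_to = original  # restore the real return address
        handler(frame.name)         # the yieldpoint work happens here
        return original(value)      # then resume the caller as usual

    frame.return_to = trampoline


def do_return(stack, value):
    """Model a function return: pop the frame and jump to its return address."""
    frame = stack.pop()
    return frame.return_to(value)
```

In this model only the newest frame is patched, so the handler runs at the first return after the request and later returns are unaffected.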
6:35:37
ebrasca
no-defun-allowed: You start in big endian; if you want to change to little endian, you need to run some instructions in big endian first.
6:36:53
no-defun-allowed
I see. Wouldn't you decide when the executable starts? Generating code for both big- and little-endian to support both seems silly even for #sicl standards.
6:39:51
no-defun-allowed
If so, you might just keep running in big endian (assuming nothing else requires little endian).
6:43:48
no-defun-allowed
stylewarning told me "no one likes big endian" (when I asked about it while porting a Smalltalk implementation to the Wii, which was stupid enough to not do its own endianness conversion when loading images), but evidently the OPAL people like big endian.
6:44:21
no-defun-allowed
I think SICL will be 64-bit only, and I haven't heard of any language implementation which lets you switch between 32-bit and 64-bit code as such.
6:44:53
no-defun-allowed
From memory, one switches between modes on x86-64 through the code segment descriptor, which has a bit (L) for 32/64-bit code, and the descriptor tables are managed by the OS.
6:46:24
no-defun-allowed
From memory, macOS has "fat" binaries which could have both PowerPC and x86, and then AMD64 and AArch64 code, but that selects one architecture at load-time.
6:49:03
no-defun-allowed
Perhaps I need to modify that Smalltalk to do endianness conversion, because I really dislike that one should have to modify the Xerox image to run it on another computer - wasn't the whole point that it was device independent? But I recall that advice came from Xerox, which makes it harder to argue with.
6:49:22
beach
I don't imagine switching between endian-ness, and I am not planning to support 32-bit processors.
7:08:00
beach
It depends on the person. I don't think you know enough about the low-level details of implementing Common Lisp to take on something like compiler optimization or generic dispatch. There is lots of very mundane stuff, but that's too trivial for you, so I don't want to give you that. There is not much in the middle.
7:09:41
no-defun-allowed
The first that comes to mind is a random number generator, because I spent a few weeks writing fast RNGs for simulation.