freenode/#clasp - IRC Chatlog
Search
17:41:43
drmeister
frgo: Have you used Read-Copy-Update for lock free reading of a data structure that is updated very infrequently?
17:43:50
drmeister
It requires that updates to the data structure happen when all threads that could access that data structure is blocked or at a safe point.
17:47:53
drmeister
I'm puzzling over how to implement this for the class table - currently a hash-table.
17:49:41
drmeister
I have an idea. If I put an alist in front of the class-table that contains add/remove class instructions and search that first and then search the class-table.
17:50:03
drmeister
Then - if I could block all of the threads and use the add/remove class instructions to update the class-table.
17:50:56
drmeister
AND I keep a pointer to the alist/class-table in thread-local storage for each thread. So find-class would go through this pointer in thread-local storage.
17:55:17
drmeister
Question: Can I set things up so that all threads reach a safe-point and pause so that I can update the class-table, update all of the thread local storage pointers to the class-table and then allow the threads to continue on their way?
18:12:09
drmeister
It's something to do with a global atomic pointer to a linked list of set-class instructions followed by the class table.
18:13:51
drmeister
It's a bit long. To summarize - the compile-file-parallel was experiencing a lock contention for the read-lock for the class table.
18:14:57
drmeister
Have you ever used Read-Copy-Update? https://en.wikipedia.org/wiki/Read-copy-update . It sounds like it's easier to implement in operating systems than user land.
18:18:31
drmeister
I would continue to use a hash-table for the class-table - but first find-class would search a linked list of "set-class" instructions that would shadow anything in the class hash-table.
18:19:03
drmeister
I can atomically update a global variable to the linked list of "set-class" instructions using CAS.
18:19:46
drmeister
Occasionally I would want to update the class hash-table using the set-class instructions so that the list of set-class instructions doesn't get too long.
18:20:03
frgo
Hm - IIUC then this requires some sophisticated handling of sequencing the updates. Yeah, if this could be done by appending to a list ...
18:21:13
frgo
What I don't get is how we access the linked list by multiple threads in a lock-free way ..
18:21:23
drmeister
I think the updating of the class hash-table using the set-class list is the tricky part. I need to get all threads to a safe point and then do the update, then erase the set-class list and then tell every thread to use the class hash-table.
18:23:05
frgo
You'd need to interrupt a thread, re-orient the class access to the class hash-table, and the tell the thread to continue.
18:25:12
drmeister
Can I tell every thread - the next time you reach a safe-point pause and wait until all threads have reached a safe-point and then update the class-table and then proceed?
18:25:54
frgo
So, establishing a signal handler in each thread - this handler is the "I need to wait for class access reorientation and this is signaled by a condition variable."
18:27:11
drmeister
Is this a terrible idea? If there is one thread that is off doing something compute intensive and it doesn't reach a safe-point for an hour - everything will block for that time.
18:27:12
frgo
I can't imagine how one can "insert" a safepoint into normal C++ code. Except I manipulate at a higehr level, AST, IR, ...
18:29:04
drmeister
It's a call to this: https://github.com/clasp-developers/clasp/blob/dev/src/gctools/interrupt.cc#L225
18:30:41
drmeister
https://github.com/clasp-developers/clasp/blob/dev/include/clasp/gctools/threadlocal.h#L49
18:36:22
frgo
Coming back to your point on a thread running for an hour: That's why I think you would have to interrupt the threads by a signal. You can't just wait until they reach a safepoint.
18:38:27
frgo
Still - tricky. What if you have 10000 threads ... Delivering signals to hat nr of threads is not exactly a cheap/ fast oepration.
18:39:29
frgo
Fortunately we don't support Windows. We had a case where exactly this scenario caused Win10 to crash - very reliably so. Microsoft is working on it ...
18:43:56
drmeister
I have to ensure that a safe point call is made periodically. It’s the eternal problem of c++ exceptions and Unix signal handlers not working together
18:44:50
frgo
Exactly. I would shy away for this very reason. Too many unknowns to get this stable - tells me my experience.
18:46:09
drmeister
It would only come up when redefining classes. Then all the threads would have to sync to update the class table
18:47:39
drmeister
I have an interim solution. Replace typep and find-class calls in the compiler with predicate tests.
18:49:12
drmeister
Changing code and avoiding using TYPEP and FIND-CLASS in future code until we implement Read-Copy-Update for the class table.
18:50:11
drmeister
If you want to run parallel code - make sure it doesn't contain TYPEP or FIND-CLASS calls - and that it avoid reader/writer locks - they are useless.
18:51:18
drmeister
No - it's not attractive. The only attractive solution I see requires read-update-copy.
18:52:32
drmeister
That requires synchronizing threads and that requires judicious invocation of safe-point.
20:02:50
drmeister
::notify Bike - I'm replacing the use of *invocation-context* special variable with code that passes it as a second argument to introduce-invoke-aux here...
20:07:07
drmeister
scymtym: What was the deal with transforming FIND-CLASS into (load-time-value (find-class-cell ...)) ?
20:07:55
drmeister
Instead of storing class-symbol-name/class we store class-symbol-name/(CONS class nil) in the class table?
20:09:56
drmeister
Yeah - Bike said... "and we have (typep x 'foo) => (classp x (car (load-time-value (gethash 'foo *class-name-hash*)))) or so"
20:14:37
drmeister
I realized that RCU is even more complicated with the moving garbage collection of MPS.
20:15:20
drmeister
Any GETHASH can trigger a REHASH - which would also have to be deferred to the class-table synchronization time.
20:16:00
drmeister
This is because objects move in memory and if they move and they can't be found because the hashing algorithm has them at a different place in the hash-table - then that would normally force a REHASH.
20:16:43
drmeister
I know how to handle this but it would require dropping down to a linear search of the class-table until the class-table synchronization happens.
20:39:08
drmeister
::notify Bike I've changed how classes are stored in the class table. Now (typep x 'foo) --> (classp x (car (load-time-value (ext:find-class-cell 'foo))))
21:00:18
drmeister
I wonder what the consequences would be if I made every CONS cell have an atomic CAR and CDR.
21:26:19
drmeister
It's problematic to return a cons cell containing a class in the car. Unless the car is atomic there could be race conditions.
21:50:54
Colleen
Bike: drmeister said 1 hour, 48 minutes ago: - I'm replacing the use of *invocation-context* special variable with code that passes it as a second argument to introduce-invoke-aux here...
21:50:54
Colleen
Bike: drmeister said 1 hour, 47 minutes ago: Tell me what you think when you see this.
21:50:54
Colleen
Bike: drmeister said 1 hour, 11 minutes ago: I've changed how classes are stored in the class table. Now (typep x 'foo) --> (classp x (car (load-time-value (ext:find-class-cell 'foo))))
21:52:21
drmeister
I've saved my previous work and now I'm in a separate branch exploring making the car and cdr of cons cells atomic. Also Symbol_O::_GlobalValue is now atomic.
21:53:12
Bike
introduce invoke will be going away with the changes to hir, so dont sweat it too much
21:54:22
Bike
with the cells we would juzt need atomic accessors, it's ok if the regular accessors aren't, though i dont' know how it goes on the C++ ends
21:56:38
Bike
http://www.sbcl.org/manual/#index-compare_002dand_002dswap has a list of places sbcl can access atomically or through cas
21:58:29
drmeister
That sbcl thing is all very well and good - but in C++ land I think I need to declare the types as atomic - then the compiler should make sure no funny business happens with them.
21:58:45
drmeister
I don't think I can use atomic operations on non-atomic data - I think that would be dangerous.
22:00:18
Shinmera
That reminds me of another thing that's vastly complicated by not having a GC: lockless datastructures
22:01:07
Shinmera
I mean, it can get pretty dicey even with a GC, but trying to read some papers on how to do the same in C++ without a GC made me think a noose would be a better option
22:03:34
drmeister
What's nice about C++ is once I declare those slots as std::atomic<T_sp> - then the compiler starts spitting out all sorts of compile time error messages when the code violates the type rules.
22:05:28
Shinmera
If you guys expose a CAS to lisp that works on svref and struct places, let me know.
22:05:46
drmeister
That's the exploring part - I keep making changes to the code to spread the "atomicness" around. If things keep making sense more and more code compiles. If it doesn't - then I stop and trash these changes.
22:07:12
drmeister
https://github.com/clasp-developers/clasp/blob/hashtable/src/llvmo/intrinsics.cc#L113
22:19:11
Bike
i mean a symbol value read is a single atomic op, but we have two steps, 1 grab reference 2 read it
22:47:12
drmeister
Hmm, I would have to change how bclasp works with special variables if I wanted to proceed.
22:48:36
drmeister
I cant store classes in cons cells - because I can't guarantee that they will be read atomically.
22:51:05
drmeister
I think I do - the C++ compiler forced me to make Symbol_O::_GlobalValue atomic if I make the car and cdr of a cons cell atomic. This is because of how the symbol bindings work in clasp now that we have multithreading.
22:53:36
drmeister
No - every access to a cons cell would have to be through atomic operations. .load() .store(val) and .compare_exchange_strong(expected,new_value)
22:54:07
drmeister
Fooling around with making their car and cdr std::atomic<T_sp> - whaddayathinkofthat?
22:54:50
drmeister
Making the change and then fixing a bunch of compiler errors to propagate the atomicness out to the rest of the code.
22:55:08
stassats
if that'll allow to use std::compare_and_swap_something, then sure, but if it forces to serialize on every write, then not so sure
22:55:28
drmeister
But I hit a snag in how bclasp works with values - it hands out references to them - and I can't tell the receiver that these are atomic references.
22:56:28
drmeister
Yeah - how bad would it be if it forces to serialize on every write? I think that's what will happen. The compiler will ensure that the car and cdr are always read and written atomically. Would that be a huge drag?
23:02:39
stassats
but in general, making things that are not going to be used atomically atomic is not the best idea
23:08:07
drmeister
Yeah - that assignment involves this call: call std::__atomic_base<unsigned long>::operator=(unsigned long)
23:12:58
drmeister
Ok, does that help me though? Should I make them atomic and then use memory_ordered_relaxed everywhere that I don't care about memory order?
23:14:17
drmeister
I don't have a way to deal with atomic values in Common Lisp yet. I haven't made the conceptual leap.
23:14:41
drmeister
stassats: Do you have a few minutes - I can explain where this all came from - it's a good story.
23:15:44
drmeister
I figured out why my compile-file-parallel was only giving me about 1.5x speedup - it's the read lock in FIND-CLASS that is causing lock contention.
23:16:37
drmeister
So I've been trying to figure out how to fix all of the calls to (typep x '<cleavir-class>) that are invoking find-class and that are leading to lock contention.
23:17:04
drmeister
I started by replacing (typep x '<cleavir-class>) with predicates (<cleavir-class>-p x) - but that grew tiresome.
23:19:26
drmeister
I could - yes - for these cases. I thought instead though that I would put each class in a cons cell - and then use (classp x (car (load-time-value (ext:find-class-cell '<cleavir-class>))))
23:21:10
drmeister
In case some thread updates the class and code somewhere holds the cell for that class...
23:21:59
drmeister
I understand that these are cleavir classes and we aren't going to be modifying them - but still...
23:22:52
stassats
but even if it's atomic, what if you get the wrong value after you return from CAR?
23:23:50
stassats
suppose you have (when (typep x 'class) do-things-to-x) and something is redefined it at (when (typep x 'class) [here] do-things-to-x)
23:24:08
stassats
your atomicity won't help, it's the job of whoever does concurrent class changes to make things thread-safe
23:28:17
drmeister
Do you think it's a reasonable idea to store the class in the class table in the car of a cons cell and to let that cons cell out into Common Lisp land?
23:30:04
drmeister
Why not use a cons cell? I've been using them for cells in a couple of other places.
23:33:25
drmeister
Hmm, that seems a bit overkillish. Why would you do that? I'm asking because your rational may influence other things I'm thinking about.
23:33:43
stassats
modifying a class is not going to become atomic by declared a wrapper structure atomic, so you'll have a lock of some sort
23:34:46
stassats
why do that? that's how you program things, by creating structure out of amorphousness
23:38:42
drmeister
Then (load-time-value (find-class-holder '<cleavir-class>)) will return one of those.
23:39:07
drmeister
I'll still need a lock of some sort to read it - correct? That's what you were talking about
23:39:40
drmeister
(with-class-holder-lock-get-class (load-time-value (find-class-holder '<cleavir-class>)))
23:41:15
drmeister
So a function would work? (get-the-class-and-dont-give-a-crap-if-anyone-writes-to-the-class-holder (load-time-value (find-class-holder '<cleavir-class>)))
23:42:05
drmeister
And then (setf (find-class 'foo) new-class) uses serializes writes to the class-table with a lock?
23:43:42
drmeister
This seems dangerous - I was investigating the Read-Copy-Update approach for the class-table. https://en.wikipedia.org/wiki/Read-copy-update
23:44:54
drmeister
I was reading that wiki page and I was watching a couple of lectures on designing operating systems where they described Read-Copy-Update.
23:46:12
drmeister
I don't know, I thought if I don't declare things std::atomic that BAD THINGS would happen.
23:47:12
stassats
the problem with reordering is say you're changing a class from 3 slots to 5 slots, you're writing 3 to NSLOTS, then another code is iterating over all slots, it sees NSLOTS as being 5, but the slot vector isn't visible yet, and it's still holding 3 slots => error
23:47:36
drmeister
Seriously though - I thought worried about subtle race conditions that would be very difficult to debug if I don't use std::atomic.
23:49:59
drmeister
I probably can't plaster over the code with a consistent memory model at this point.
23:52:38
drmeister
I really appreciate the help here. So I'll create a ClassHolder_O class and have it store a class - I'll put those into the class-table. I'll read them without a lock (I'm nervous about this) but write to them with a lock.
23:56:12
drmeister
I don't have fences. I have pthreads mutexes and I have std::atomic and copy-and-swap.
23:58:30
stassats
i can't say that it'd be the right decision not now, but now, that's what i would've done
0:00:13
stassats
the only possible issue with this - the class itself is not decent yet, but the value will always be right
0:01:34
stassats
and you're attaching it to the symbol as well, if that works, then the wrapper cell would automagically work
0:05:27
drmeister
That just seemed way too simple. I could write the class into a slot in the symbol - that would be wasteful I guess because most symbols would not have a class.
0:05:47
drmeister
I could write it into the property list of the symbol - but people could screw with that.
0:07:29
drmeister
I'm starting to lose my train of thought. I am going to break for a bit and think about it.
0:24:18
drmeister
Hmmm, of course none of this class-holder stuff is portable - so we can't use it in the cleavir code. I'll have to talk with Bike about this.
0:28:26
drmeister
I figured out how to read the macOS Activity Monitor <sample process> output. It's really useful for figuring out what is going on.
0:33:16
stassats
if it doesn't exist you can actually stuff a function to run there, which then changes itself to run something else after it's determined what exists