freenode/#clasp - IRC Chatlog
12:38:06
drmeister
It happens even if I run the thing in serial (counter to what I said above). So breaking up the compiler as I have has a problem.
13:52:28
drmeister
The Department of Energy just announced a $100M program to develop new desalination technologies.
13:52:28
drmeister
https://www.energy.gov/articles/department-energy-announces-100-million-energy-water-desalination-hub-provide-secure-and
13:53:44
drmeister
We are developing our own technology that looks more like lfarm - for distributing jobs across large clusters of machines.
13:57:27
drmeister
Sadly, no. Bike and I will take a look at compile-file-parallel tomorrow - it almost works - but there is something wrong with how I'm splitting up the ast generation and the subsequent steps. There's a form in babel that causes an assertion failure in cleavir.
14:00:50
drmeister
stassats: Do you ever go looking for thread contention? I was going to try some of the tools in xcode to see if I can get any insight into why compile-file-parallel doesn't go higher than about 200% CPU.
14:02:25
drmeister
It might even be malloc - google has this tcmalloc that is supposed to give better multithreaded performance.
14:05:36
drmeister
How do I determine GC impact? Regular profiling, then looking at how much time is spent in the GC? I can do that, no problem.
14:34:08
drmeister
lparallel works great - I'm running multiple quantum mechanics calculations in parallel
15:03:00
scymtym
drmeister: where do the results go? are there side-effects for collecting the results?
15:04:20
drmeister
I write out a file, then use ext:system to launch a program that reads it, does the calculation, and writes the results to another file.
15:05:08
drmeister
But yes - like stassats implies - until I run 'ls' on the directory the results of the calculation are a quantum superposition of run and not run.
15:08:01
drmeister
I'm calculating partial charges for compounds out of this database: http://zinc.docking.org/browse/subsets/
15:09:59
scymtym
drmeister: i see. i was asking because one of the strengths of lparallel's PMAP* functions is the fact that results are returned in the same way the CL counterparts would return them
15:12:49
drmeister
I'm comparing the results of running this stand-alone executable to the results that Cando generates. Eventually I'll use Cando, since it does the calculation itself.
15:15:02
drmeister
lparallel looks pretty straightforward. I'm going to keep muddling along with it.
15:17:41
drmeister
Like if I want to shut down the calculation because it's going too long or I want to restart it with different arguments.
15:19:57
drmeister
Right now, clasp doesn't handle Control-C or interrupts well, I'm looking into it.
15:20:35
drmeister
When I start working with Cando seriously in the jupyterlab interface - almost immediately after starting up the first calculation I want to interrupt it and change the arguments and run it again.
15:21:02
scymtym
i think the safest solution is having the tasks cooperate by, for example, checking a flag
15:22:15
scymtym
iirc, it has means to cancel tasks and force the thread pool to shut down, but i don't think those are safe in general
15:23:41
stassats
when you interrupt it, it offers you "1: [TRANSFER-ERROR] Transfer this error to a dependent thread, if one exists."
15:27:49
drmeister
I think I need clasp to poll for interrupts at safe points and then use the Common Lisp restart machinery to shut things down safely.
15:29:10
scymtym
but that is only convenience, isn't it? the core issue is what drmeister says: unless interruptions happen at safe points, unwinding from them is unsafe
15:32:49
drmeister
Give me a bit of time - I'll ask more intelligent questions once I get my bearings on this.
15:33:20
scymtym
drmeister asked about aborting and restarting computations that take too long. i think there is potential to hose the system, e.g. by leaving temporary files lying around or leaving data structures in an inconsistent state
15:34:20
drmeister
Right now I can't get lldb to ignore SIGINT and SIGSTOP so I can debug what's going on with my handlers. Grrrr
15:34:55
Shinmera
You definitely at least want to unwind properly to trigger unwind-protects and such
15:35:44
Shinmera
But then you could be interrupting during an unwind protect and things like that, so safe points are nice.
15:36:56
scymtym
stassats: i think you can't expect users (and library developers) to do all the async unwind protection
15:39:18
drmeister
scymtym: C++ unwinding and unix async signals are completely incompatible with each other. There is no way to unwind out of an async signal handler. The best one can do is set a flag in the signal handler and then wait for the C++ code to check it and unwind.
15:43:37
scymtym
stassats: maybe i'm dense here. is the scenario you are assuming with or without cooperation from the worker threads? that is, are they actively checking some flag and signaling the error in response to it?
15:45:58
stassats
but still, people are interrupting things, unwinding from errors, etc. without much problem
15:46:14
scymtym
for a signaled error and the handler invoking a restart (which could even abort the thread, sure), UNWIND-PROTECT is enough to ensure consistency. but not when the error is signaled from an interruption
15:46:31
stassats
async safety only comes into play when you use it to send messages between threads or to do any other kind of work
15:47:21
stassats
what's left otherwise? killing the process? well, should've stayed in C++ for that
15:49:13
scymtym
i'm just saying you can't make async unwinds part of your system's normal operation
15:52:46
scymtym
sure, same for me. but if somebody designed a library that somehow used async unwinds, you couldn't use it in a webserver, say
0:07:14
drmeister
C1CCC1 The first 1 will be encountered by the parser before the last one - correct?