freenode/#clasp - IRC Chatlog
12:38:06
drmeister
It happens even if I run the thing in serial (counter to what I said above). So breaking up the compiler as I have has a problem.
13:52:28
drmeister
The Department of Energy just announced a $100M program to develop new desalination technologies.
13:52:28
drmeister
https://www.energy.gov/articles/department-energy-announces-100-million-energy-water-desalination-hub-provide-secure-and
13:53:44
drmeister
We are developing our own technology that looks more like lfarm - for distributing jobs across large clusters of machines.
13:57:27
drmeister
Sadly, no. Bike and I will take a look at compile-file-parallel tomorrow - it almost works - but there is something wrong with how I'm splitting up the ast generation and the subsequent steps. There's a form in babel that causes an assertion failure in cleavir.
14:00:50
drmeister
stassats: Do you ever go looking for thread contention? I was going to try some of the tools in xcode to see if I can get any insight into why compile-file-parallel doesn't go higher than about 200% CPU.
14:02:25
drmeister
It might even be malloc - google has this tcmalloc that is supposed to give better multithreaded performance.
14:05:36
drmeister
How do I determine GC impact? Regular profiling, then looking at how much time is spent in the GC? I can do that, no problem.
14:34:08
drmeister
lparallel works great - I'm running multiple quantum mechanics calculations in parallel
15:03:00
scymtym
drmeister: where do the results go? are there side-effects for collecting the results?
15:04:20
drmeister
I write out a file, then use ext:system to launch a program that reads it, does the calculation, and writes the results to another file.
15:05:08
drmeister
But yes - like stassats implies - until I run 'ls' on the directory the results of the calculation are a quantum superposition of run and not run.
15:08:01
drmeister
I'm calculating partial charges for compounds out of this database: http://zinc.docking.org/browse/subsets/
15:09:59
scymtym
drmeister: i see. i was asking because one of the strengths of lparallel's PMAP* functions is the fact that results are returned in the same way the CL counterparts would return them
15:12:49
drmeister
I'm comparing the results of running this stand-alone executable to the results that Cando generates. Eventually I'll use Cando, since it does the calculation itself.
15:15:02
drmeister
lparallel looks pretty straightforward. I'm going to keep muddling along with it.
15:17:41
drmeister
Like if I want to shut down the calculation because it's going too long or I want to restart it with different arguments.
15:19:57
drmeister
Right now, clasp doesn't handle Control-C or interrupts well, I'm looking into it.
15:20:35
drmeister
When I start working with Cando seriously in the jupyterlab interface - almost immediately after starting up the first calculation I want to interrupt it and change the arguments and run it again.
15:21:02
scymtym
i think the safest solution is having the tasks cooperate by, for example, checking a flag
15:22:15
scymtym
iirc, it has means to cancel tasks and force the thread pool to shut down, but i don't think those are safe in general
15:23:41
stassats
when you interrupt it, it offers you "1: [TRANSFER-ERROR] Transfer this error to a dependent thread, if one exists."
15:27:49
drmeister
I think I need clasp to poll for interrupts at safe points and then use the Common Lisp restart machinery to shut things down safely.
15:29:10
scymtym
but that is only convenience, isn't it? the core issue is what drmeister says: unless interruptions happen at safe points, unwinding from them is unsafe
15:32:49
drmeister
Give me a bit of time - I'll ask more intelligent questions once I get my bearings on this.
15:33:20
scymtym
drmeister asked about aborting and restarting computations that take too long. i think there is potential to hose the system, e.g. by leaving temporary files lying around or leaving data structures in an inconsistent state
15:34:20
drmeister
Right now I can't get lldb to ignore SIGINT and SIGSTOP so I can debug what's going on with my handlers. Grrrr
15:34:55
Shinmera
You definitely at least want to unwind properly to trigger unwind-protects and such
15:35:44
Shinmera
But then you could be interrupting during an unwind protect and things like that, so safe points are nice.
15:36:56
scymtym
stassats: i think you can't expect users (and library developers) to do all the async unwind protection
15:39:18
drmeister
scymtym: C++ unwinding and unix async signals are completely incompatible with each other. There is no way to unwind out of an async signal handler. The best one can do is set a flag in the signal handler and then wait for the C++ code to check it and unwind.
15:43:37
scymtym
stassats: maybe i'm dense here. is the scenario you are assuming with or without cooperation from the worker threads? that is, are they actively checking some flag and signaling the error in response to it?
15:45:58
stassats
but still, people are interrupting things, unwinding from errors, etc. without much problem
15:46:14
scymtym
for a signaled error and the handler invoking a restart (which could even abort the thread, sure), UNWIND-PROTECT is enough to ensure consistency. but not when the error is signaled from an interruption
15:46:31
stassats
async safety only comes into play when you use it to send messages between threads or to do any other kind of work
15:47:21
stassats
what's left otherwise? killing the process? well, should've stayed in C++ for that
15:49:13
scymtym
i'm just saying you can't make async unwinds part of your system's normal operation
15:52:46
scymtym
sure, same for me. but if somebody designed a library that somehow used async unwinds, you couldn't use it in a webserver, say
0:07:14
drmeister
C1CCC1 The first 1 will be encountered by the parser before the last one - correct?