libera/#clasp - IRC Chatlog
Search
13:02:29
yitzi
drmeister: It doesn't seem like anything special is required during the build or invocation of Cando for this. Sounds like the main issue is making sure the host and container have compatible MPI implementations. If they do it could be that you could just skip building a custom MPI in the container.
13:03:39
yitzi
To start I would look to see what MPI version/implementation is on the cluster. A minimal build would just be adding :mpi t to the config.sexp and seeing if it works.
13:34:34
drmeister
I could create a vector of say 20 entries and fill it with futures and check them every 100 milliseconds to see what future is complete and then put a new one in there.
13:35:54
drmeister
If I were doing this myself I would use bordeaux threads and a queue and write my own thread pool that keeps taking jobs out of the queue and running them. That I understand.
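A minimal sketch of that pattern (assuming bordeaux-threads and lparallel.queue are loaded; RUN-JOBS and the job representation are hypothetical):

```lisp
(defun run-jobs (jobs n-workers work-fn)
  (let ((queue (lparallel.queue:make-queue)))
    ;; Enqueue all jobs, then one shutdown message per worker.
    (dolist (job jobs)
      (lparallel.queue:push-queue job queue))
    (dotimes (i n-workers)
      (lparallel.queue:push-queue :shutdown queue))
    ;; Each worker blocks on the queue until it sees :shutdown.
    (let ((workers
            (loop repeat n-workers
                  collect (bt:make-thread
                           (lambda ()
                             (loop for item = (lparallel.queue:pop-queue queue)
                                   until (eq item :shutdown)
                                   do (funcall work-fn item)))))))
      (mapc #'bt:join-thread workers))))
```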
13:39:08
drmeister
Hmm, maybe I can treat a future like a worker in a thread pool. I use a queue and each future checks the queue and does whatever work it gets.
13:40:30
drmeister
I tried to do load balancing by sorting the jobs in order of size, largest to smallest.
13:42:57
drmeister
I haven't dug into it too deeply but it looked like pmap was ignoring my sorted order.
13:47:42
yitzi
Are you sure you really need futures? Those are more for building a calculation expression in which the pieces are "in process" ... guess I'd have to know what you are doing in more detail.
13:51:07
yitzi
And "checking the future to see which one is complete" ... isn't that what lparallel's worker pool is supposed to do?
13:52:30
drmeister
I'm not sure at all. I'm musing aloud and interested in your and Bike's thoughts.
13:53:31
drmeister
In compile-file-parallel I have a thread pool and a write-one/read-many queue - I understand that pattern and I've used it many times.
13:53:56
yitzi
Yeah, lparallel has a thread pool and you submit tasks to it over a channel. That is what is happening underneath pmap
13:54:11
drmeister
Each worker blocks on the queue and gets a piece of work or a message to shut down.
13:54:57
drmeister
The queue manager puts pieces of work into the queue and, when there are no more, one shutdown message for each worker.
13:58:42
drmeister
I set up 12 nodes each with 28 threads and at the start each node has a list of longish jobs and I just use `lparallel:pmap` to map over them.
13:59:14
drmeister
I tried sorting the jobs in each list - but that seems to be thwarted by something `pmap` does
14:01:03
yitzi
I'm looking at the lparallel docs....you may need to submit the tasks yourself to preserve order.
14:02:08
yitzi
Yeah, pmap chunks the sequence based on the number of worker threads....that is definitely not what you want.
14:09:11
yitzi
You have a 4 core processor. That is the default number of parts it breaks stuff up into. When you do PMAP over (j1 j2 j3 j4 j5 j6 j7) then C1 gets (j1 j2), C2 gets (j3 j4), C3 gets (j5 j6) and C4 gets (j7) ....
14:10:13
yitzi
If you do :parts 7 then C1 gets (j1), C2 gets (j2), C3 gets (j3), C4 gets (j4), in the queue goes (j5), (j6), and (j7)
14:14:23
drmeister
The default for :parts is the number of workers. Saying `:parts (length x)` would give one job to each worker if there were `(length x)` workers.
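A hedged sketch of forcing one job per part so lparallel queues the remainder instead of chunking (assumes `lparallel:*kernel*` is already set up; JOBS, JOB-SIZE, and RUN-ONE-JOB are placeholder names):

```lisp
;; Sort largest-first, then hand pmap one job per part so the
;; workers pull remaining parts from the queue as they finish.
(let ((jobs (sort (copy-list jobs) #'> :key #'job-size)))
  (lparallel:pmap 'vector #'run-one-job :parts (length jobs) jobs))
```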
14:15:20
yitzi
Yes, I am assuming that if the length is greater than the number of workers then lparallel will queue them
14:17:11
drmeister
I have a problem in the search that gets smaller and smaller the more searching I do.
14:17:43
drmeister
It's a bit difficult to describe but imagine I'm generating puzzle pieces that must connect to other puzzle pieces.
14:17:47
yitzi
I seem to recall that is how the "channels" in lparallel work. I think that there is a bug in lparallel in that the next job in the queue won't start if you don't retrieve the result waiting on the channel. But that shouldn't be a problem for PMAP. I ran into this issue for the TIRUN app when we were sketching the ligands.
14:18:27
drmeister
With a short search - of say 20 - about 1% of the puzzle pieces don't fit a following piece.
14:18:53
drmeister
With a search of 200 - it's about 0.4% of the puzzle pieces don't fit a following piece.
14:26:45
drmeister
The measurement for each node is not very good - or lparallel is doing crazy things.
14:27:56
drmeister
I am assuming I need to watch the trend - and the trend looks like there is still a long tail.
14:41:02
yitzi
There are some examples of using futures in https://github.com/cando-developers/cando/blob/0fc1fa09ee22521403bd46e1b8298f82ae2d94f5/src/lisp/cando-widgets/molecule-select.lisp
14:42:59
yitzi
drmeister: yes...that was me. You could also keep it simple https://lparallel.org/kernel/
14:46:26
yitzi
drmeister: I am pretty sure you just add all the tasks and then just idle while waiting for the results ... which are just indicative that the job completed.
15:24:16
drmeister
So you just open a channel and submit-task's to it and they automatically go to the *kernel* and then you call receive-result for each task?
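That usage, sketched from the kernel API at https://lparallel.org/kernel/ (JOBS and RUN-ONE-JOB are placeholder names; the kernel size of 28 matches the per-node thread count mentioned above):

```lisp
(setf lparallel:*kernel* (lparallel:make-kernel 28))
(let ((channel (lparallel:make-channel)))
  ;; Submit every task up front; the kernel's workers drain them.
  (dolist (job jobs)
    (lparallel:submit-task channel #'run-one-job job))
  ;; Then collect one result per submitted task.
  (loop repeat (length jobs)
        collect (lparallel:receive-result channel)))
```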
16:30:58
drmeister
I added per-node/per-thread logging and my attempt at load balancing is absolute shite.
16:31:37
drmeister
I was sorting the jobs based on the number of atoms - figuring more atoms take more time.
16:32:34
drmeister
That's not at all the case - the amount of time varies hugely. Now I suspect that some non-linear optimizations are getting trapped and I'm letting them wander too long.
19:37:10
stassats
i would have made a queue of jobs from which each thread repeatedly gets a job (or a batch of jobs, if each individual one is very small)
20:04:50
drmeister
It's not a burning issue - it looks like I can push MPI into the future a bit because I think I solved the issue with the tail. I had an almost infinite loop of error/error-handling.
20:24:06
yitzi
If it is not already in the container then just add that to the apt-get install in the def file
22:18:46
drmeister
It was a handler that recognized 3 or 4 linear atoms (a problem for non-linear optimization) and that caught the error and tried to shake up the 3 or 4 linear atoms. It doesn't work very well probably because the rest of the structure forces the atoms back into a linear arrangement.
22:19:28
drmeister
There was a potential infinite loop of handling the error and then restarting the calculation and it generating the error again. It would very occasionally knock itself out of that cycle.