freenode/#sbcl - IRC Chatlog
16:01:04
stassats`
another way to get rid of &rest copying: if the caller knows how much stack space the callee allocates, it can do that on its behalf and then push all the arguments
16:03:08
stassats`
difficult with all the different function types, closures, symbols, callable instances
17:31:01
puchacz
okay, I got sbcl corruption and as stassats advised, I checked the function where it happened:
18:20:38
puchacz
pfdietz: it happens inside hunchentoot, and I think it is some sort of internet error, like I recorded yesterday, here: https://paste.ubuntu.com/p/cbC7jWjsmz/
18:25:31
flip214
puchacz: you are sure the machine is okay, hardware-wise? did you run a memtest already?
18:26:30
puchacz
flip214 - it is a virtual computer at a provider, I rented it because I wanted to run a simulation on 32 cores
18:28:15
puchacz
but I deleted it and restored it a few times, I tried 2 locations, so it must be different hardware. unlikely it is a "hardware" (virtualised or otherwise) problem
18:31:28
puchacz
I haven't tried full setup, but it looked to me that for my problem CCL (when I ran on my PC) was about 10 times slower, so I gave up.
18:43:23
flip214
puchacz: and you're running full-speed against hunchentoot there? or a limited amount of traffic?
18:45:50
puchacz
hunchentoot is an "interface" between my program on that virtual computer that runs simulations with different parameters, and Mathematica running on my PC that provides these parameters via HTTP
18:46:25
puchacz
so no simultaneous calls but hundreds and then thousands of calls in rapid succession.
18:47:59
puchacz
split the work in hunchentoot handler into 64 pieces, then join them all and respond
19:51:50
puchacz
okay, it seems I will need to prepare an isolated test case and send. Realistically - next weekend :)
20:00:44
stassats
bad file descriptor, faulting when writing to a buffer, it appears you're touching a stream that has been already closed
20:06:26
puchacz
my simulation is 64 threaded, split the work in hunchentoot handler into 64 pieces, then join them all and respond
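The split/join pattern described here could look roughly like the following on SBCL. This is only a sketch: SIMULATE-PIECE and RUN-SIMULATION are hypothetical names, and the arithmetic stands in for the real per-thread simulation work, which is not shown in the log.

```lisp
;; Sketch only: SIMULATE-PIECE is a hypothetical stand-in for one slice
;; of the simulation; the real work is not shown in the log.
(defun simulate-piece (index params)
  "Return a partial score for slice INDEX (placeholder arithmetic)."
  (* (1+ index) (reduce #'+ params)))

(defun run-simulation (params &key (pieces 64))
  "Start PIECES threads, wait for all of them, and sum the partial scores."
  (let ((threads
          (loop for i below pieces
                ;; Passing I through :ARGUMENTS avoids closing over the
                ;; loop variable, which may be shared between iterations.
                collect (sb-thread:make-thread #'simulate-piece
                                               :name (format nil "sim-~D" i)
                                               :arguments (list i params)))))
    ;; JOIN-THREAD blocks until the thread finishes and returns its value.
    (reduce #'+ (mapcar #'sb-thread:join-thread threads))))
```

A handler would call something like (run-simulation (list x1 x2 x3)) and send the single resulting number back as the response body.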
20:08:12
stassats
next i'd modify HUNCHENTOOT::*SUPPORTS-THREADS-P* to always default to NIL and try that
20:09:12
puchacz
but you are right, hunchentoot does not need to be parallel, it is not a bottleneck
20:09:14
stassats
sure, but i have a feeling it's not going to help, but it's useful to know that it's not going to help anyway
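An alternative to patching HUNCHENTOOT::*SUPPORTS-THREADS-P*, for the same experiment, might be to hand the acceptor a single-threaded taskmaster explicitly. This is a sketch, not what was actually run in the channel; it assumes Hunchentoot is already loaded, and the *ACCEPTOR* variable is a hypothetical name (port 8080 is the one mentioned later in the log).

```lisp
;; Assumes Hunchentoot is loaded, e.g. via (ql:quickload "hunchentoot").
;; *ACCEPTOR* is a hypothetical name used only for this sketch.
(defvar *acceptor*
  (make-instance 'hunchentoot:easy-acceptor
                 :port 8080
                 ;; Handle every request synchronously, one at a time,
                 ;; instead of spawning a thread per connection.
                 :taskmaster (make-instance
                              'hunchentoot:single-threaded-taskmaster)))

;; With a single-threaded taskmaster the accept loop runs in the calling
;; thread, so expect this call to block rather than return.
(hunchentoot:start *acceptor*)
```

The simulation threads are unaffected by this choice; only Hunchentoot's own request handling becomes sequential.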
20:14:02
stassats
but all that does is close the socket, which might explain the bad-file-descriptor-error, but not the segfault
20:24:09
puchacz
(it is running the simulations now with --lose-on-corruption by the way, if we are lucky it will corrupt it in 30 minutes maybe)
20:26:36
puchacz
mathematica sends something like http://my-ip/simulate?x1=3423.324&x2=34.7&x3=.... etc. and waits
20:27:17
puchacz
when all threads complete, it sends back in the same handler one number as a result, simulation score
20:27:38
puchacz
mathematica immediately after receiving it sends another http request, with different values of x1 x2 etc.
20:31:37
puchacz
it seems to be completing one simulation (so one http request / 64 threads / response cycle) within say 3 seconds
21:23:35
puchacz
that it was another requester, totally different from Mathematica trying to reach my regular web application at "/"
21:24:23
puchacz
so the reason it worked on my PC was simply that there were no web spiders etc. randomly trying to knock at port 8080
21:25:23
puchacz
whereas on the rented computer at the provider site, it is being inspected by all sorts of spiders
21:25:59
puchacz
rather than saying something like no postgres connection, which is a socket protocol, no native library loaded
21:27:22
puchacz
in Lisp I have a habit of having one image.... just save-and-die, load it with all the extra baggage I don't need
21:29:43
pfdietz
There's a CL library for handling ELF format. The dream: a gdb replacement written in CL.
21:37:10
puchacz
(and sorry for confusion, it is a regular bug in my application after all, to leave the other handlers in)
21:43:22
|3b|
trying to execute the query even though there isn't a postgres server available does sound like user code bug though
21:43:23
puchacz
when I prepare the image for the simulation, I clear fasl cache, then I start everything up (which takes long as it has to compile all that is required from quicklisp again), then I load data from postgres into a global variable, and save-and-die
21:44:10
puchacz
it is just a strange habit to have the same image for a web application and numerical simulation :)
21:44:51
|3b|
nah, being lazy and reusing something that is already set up doesn't sound that strange :)
21:46:17
puchacz
anyway, if anybody wants a more isolated program that would trigger the same crash, I can try to prepare it next weekend, ping me at piotr.wasik@gmail.com
21:59:50
|3b|
yeah, looks like writing to streams left open from when image was saved gets corruption warning on linux
22:02:13
|3b|
file maybe could try to reopen and seek to same place or something, but i suspect that would be worse than erroring as often as not
22:05:08
|3b|
though in this case it looks like it is the (foreign?) buffer rather than the stream that isn't surviving the save
22:07:08
puchacz
I have enough knowledge now to run my simulations without interference from undercover spy spiders
22:07:31
|3b|
maybe file a bug that writing to old streams after image save/reload should be handled better
22:45:07
pfdietz
I would recommend not building a deliverable from a development image. You should have a script that builds the deliverable from scratch.
22:47:34
|3b|
sounded like it was more or less doing that, just that part of the build involved grabbing data from a DB
22:48:00
|3b|
and it happened to also include code that would try to reuse that db connection in some unexpected cases
22:50:12
|3b|
arguably the db lib should clear its connection cache on image saving, but not many libs bother with that sort of thing :/
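On SBCL, the kind of cleanup suggested here can be hooked into image saving through SB-EXT:*SAVE-HOOKS*. The sketch below is an illustration under assumptions: *DB-CONNECTIONS* and DROP-DB-CONNECTIONS are hypothetical names standing in for whatever cache the database library really keeps.

```lisp
;; Hypothetical connection cache; a real DB library keeps its own.
(defvar *db-connections* '())

(defun drop-db-connections ()
  "Forget cached connections so the saved image has no stale handles."
  ;; A real version would also close each connection before dropping it.
  (setf *db-connections* '()))

;; SBCL calls every function designator in SB-EXT:*SAVE-HOOKS* before
;; SAVE-LISP-AND-DIE writes the core file.
(pushnew 'drop-db-connections sb-ext:*save-hooks*)
```

Code in the restored image then has to open a fresh connection (or fail cleanly) instead of writing to a socket that no longer exists.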
23:55:59
pkhuong
BTS tracing isn't too bad https://gist.github.com/pkhuong/1ce34e33c6df4b9be3bc9beb22415a47