Friday, 31st of July 2020, 18:30:10 UTC
19:32:46
pjb
gendl: nothing useful is exported from ccl…
19:34:53
pjb
gendl: but it works only on unix or posix systems, and only for this version.
21:47:03
gendl
I only need it on Linux for now
21:47:24
gendl
trying to start a server on port 80 as root then quickly demote the process to a normal user before anyone notices
22:08:23
pjb
gendl: go ahead. You can use #+ccl (ccl::getuid) #-ccl (posix-getuid) with #-ccl (cffi:defcfun (posix-getuid "getuid") :int) and stuff…
22:28:19
gendl
Ok another issue -- after loading some stuff then saving an image with ccl:save-application, then restarting the saved image, I'm getting weird threading errors after restarting the server.
22:29:06
phoe
what sort of weird threading errors?
22:29:06
gendl
i don't think i'm running many threads in the image before doing the save-application - but is there something I should do to shut down multiprocessing before saving the image or something?
22:29:24
gendl
well if you really want to know, here they are:
22:29:54
phoe
did you close all foreign libraries and their data structures and such?
22:29:57
gendl
(and they don't happen if I just start a blank Gendl/CCL image and load the monolithic lx64fsl for my application -- in that case, the server runs fine)
22:30:13
phoe
asking because things like C callbacks and foreign memory do not get restored after image thawing
22:30:29
gendl
as far as i know the image is not loading any foreign libs before save-application
22:30:32
phoe
they need to be explicitly reinitialized and reopened after thawing
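[Editor's sketch] One way to arrange the "reinitialize after thawing" step phoe describes is to collect reopen functions in a hook list and run them from the saved image's entry point. Everything named here (`*after-thaw-hooks*`, `register-after-thaw-hook`, `run-after-thaw-hooks`, `*resource-state*`, `reopen-resources`, `main`, `build-image`) is hypothetical and does not come from the conversation; only `ccl:save-application` and its `:toplevel-function` and `:prepend-kernel` arguments are CCL's own. A minimal sketch, not a definitive build script:

```lisp
;; All names below are hypothetical illustrations, not Gendl, aserve or CCL code.
(defvar *after-thaw-hooks* '()
  "Function names to call when the saved image starts up.")

(defun register-after-thaw-hook (function-name)
  "Remember FUNCTION-NAME so that it runs after the image is restarted."
  (pushnew function-name *after-thaw-hooks*)
  function-name)

(defun run-after-thaw-hooks ()
  "Call the registered hooks, oldest first, and return their results."
  (mapcar #'funcall (reverse *after-thaw-hooks*)))

;; Stand-in for something that does not survive in a saved image,
;; such as a foreign library handle, a socket, or a log stream.
(defvar *resource-state* :stale)

(defun reopen-resources ()
  (setf *resource-state* :open))

(register-after-thaw-hook 'reopen-resources)

(defun main ()
  "Hypothetical entry point: repair state first, then start the server."
  (run-after-thaw-hooks)
  ;; Starting the actual server would go here.
  *resource-state*)

;; Defined but never called here: saving an image ends the running Lisp.
#+ccl
(defun build-image (path)
  (ccl:save-application path :toplevel-function #'main :prepend-kernel t))
```

Calling `(main)` by hand in a fresh image is a cheap way to check that the hooks run before anything tries to use the reopened resources.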
22:30:47
gendl
here let me get the specific errors...
22:32:19
gendl
https://www.irccloud.com/pastebin/OuS0CN3R/
22:32:39
phoe
do you save your streams anywhere?
22:32:49
gendl
i'm not sure what that means
22:32:56
phoe
(defvar *x* *standard-output*)
22:33:17
phoe
network sockets, or anything?
22:33:26
gendl
i'm not doing any of that as far as I know.
22:33:31
phoe
that's not a threading error, that's a no-applicable-method
22:33:37
phoe
which is a CLOS version of type error
22:34:02
gendl
to me it looks like compounded errors within the error message itself
22:34:07
phoe
the function SOCKET-ERROR-IDENTIFIER was called with a STREAM-IS-CLOSED error
22:34:20
phoe
a backtrace would be required
22:34:32
phoe
that's definitely not a threading error, doesn't look like that to me
22:34:43
gendl
ok let me turn on aserve debugging and try to get a backtrace
22:35:59
phoe
which package is SOCKET-ERROR-IDENTIFIER from?
22:36:18
gendl
hold on, I'll attach with slime and get a proper backtrace
22:39:34
gendl
https://www.irccloud.com/pastebin/kPwqN3et/
22:40:10
phoe
(NET.ASERVE::CONNECTION-RESET-ERROR #<CCL::STREAM-IS-CLOSED-ERROR #x302001BB751D>)
22:40:14
phoe
this looks like the culprit
22:40:28
phoe
it doesn't seem to expect to get a stream-is-closed error
22:40:33
phoe
where is the source for this function?
22:40:44
phoe
where is the source for net.aserve::http-worker-thread?
22:41:01
gendl
again, this is only happening in a saved application. If I load the exact same fasl into a fresh Gendl image and start the server, all is well.
22:41:09
gendl
hold on i'll get the source
22:41:27
phoe
I expect that some stream value gets cached in there and it's dead by the time the application thaws.
22:41:32
phoe
let's try to figure that out
22:42:14
gendl
i'm not aware that i'm starting aserve at all in the image when doing the build and save-application... but maybe i am accidentally somehow
22:42:24
gendl
finding source for net.aserve::http-worker-thread...
22:43:13
gendl
https://www.irccloud.com/pastebin/MKsAUf4V/
22:44:31
phoe
the actual error happens elsewhere, but the stack is already destroyed by the time it's handled
22:45:01
phoe
wait a second though... there's three places with CONNECTION-RESET-ERROR
22:46:12
phoe
all errors are passed to CONNECTION-RESET-ERROR
22:46:19
phoe
give me the source for that function
22:46:38
phoe
because this means that CONNECTION-RESET-ERROR *MUST* be able to handle all possible subtypes of errors
22:46:52
gendl
https://www.irccloud.com/pastebin/WXFStO1P/
22:46:55
phoe
and I suspect that it actually doesn't handle them all and makes implicit assumptions
22:47:15
phoe
if the error is a stream error, then it calls STREAM-ERROR-IDENTIFIER on it
22:47:30
phoe
and not all conditions of type STREAM-ERROR have a STREAM-ERROR-IDENTIFIER
22:47:47
phoe
is STREAM-ERROR-IDENTIFIER a generic function?
22:48:03
phoe
if that is the case, then this might be solved by defining a default method on it that returns NIL
22:48:45
phoe
I assume that on ACL this STREAM-ERROR-IDENTIFIER is present in all CL:STREAM-ERRORs
22:48:52
phoe
so we must make it portable
22:49:10
gendl
https://www.irccloud.com/pastebin/BGPpz400/
22:49:33
gendl
so the excl:stream-error-identifier is coming from zacl.
22:49:52
phoe
(defmethod stream-error-identifier (s) nil)
22:50:16
phoe
give me a list of methods that exist on that GF
22:50:24
phoe
(mop:ge-fu-me #'stream-error-identifier)
22:50:46
phoe
if there is no collision, then you can specialize this workaround on STREAM-ERROR
22:51:13
phoe
...wait, no, there can be no method on STREAM-ERROR because otherwise we wouldn't be getting N-A-M
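[Editor's sketch] The catch-all method phoe suggests can be shown in isolation. The condition types and the generic function below are hypothetical stand-ins, not the zacl or aserve definitions, and the sketch assumes the implementation allows methods to specialize on condition types (CCL does, as the NO-APPLICABLE-METHOD in the paste shows):

```lisp
;; Hypothetical stand-ins for the real condition types.
(define-condition demo-socket-error (error)
  ((identifier :initarg :identifier :reader demo-identifier-slot)))

(define-condition demo-closed-error (error) ())

(defgeneric demo-error-identifier (condition)
  (:documentation "Return a keyword describing CONDITION."))

;; Only socket errors are covered, mirroring the implicit assumption.
(defmethod demo-error-identifier ((condition demo-socket-error))
  (demo-identifier-slot condition))

(defun identifier-or-failure (condition)
  "Return the identifier, or :NO-APPLICABLE-METHOD if the call signals an error."
  (handler-case (demo-error-identifier condition)
    (error () :no-applicable-method)))

;; Before the workaround: no method applies to the closed-stream condition.
(defparameter *before*
  (identifier-or-failure (make-condition 'demo-closed-error)))

;; The workaround: an unspecialized default method that returns NIL.
(defmethod demo-error-identifier (condition)
  (declare (ignore condition))
  nil)

;; After the workaround: the same call quietly returns NIL.
(defparameter *after*
  (identifier-or-failure (make-condition 'demo-closed-error)))
```

The default method does not fix the underlying problem. It only stops the error handler from failing on its own, so that the original condition can reach the debugger.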
22:51:43
gendl
There is no package named "MOP"
22:51:52
phoe
use your favorite mop package
22:52:21
phoe
ccl:generic-function-methods
22:52:34
phoe
...right, CCL doesn't have an internal concept of different packages
22:52:45
phoe
everything is in #<GOD-PACKAGE CCL>
22:52:59
gendl
https://www.irccloud.com/pastebin/y6RjYbpk/
22:53:16
phoe
we're lucky that AFAIK compiler backends actually go against the grain and define their own packages
22:54:05
phoe
https://www.irccloud.com/pastebin/kPwqN3et/ mentions a different function
22:54:11
pjb
threads are not saved in images. You need to restart them when you reboot it.
22:54:17
phoe
not STREAM-ERROR-IDENTIFIER
22:54:22
phoe
pjb: we're not talking about threads here
22:54:28
phoe
and I've already said that
22:54:39
phoe
gendl: we need to backtrack
22:55:15
phoe
for whatever reason, SOCKET-ERROR-IDENTIFIER gets called, and I don't know where or how
22:55:23
phoe
the source for HTTP-WORKER-THREAD doesn't mention this function
22:55:36
phoe
so I assume it's called from some sorta handler established somewhere
22:55:56
phoe
try defining that method on SOCKET-ERROR-IDENTIFIER instead
22:55:58
gendl
i just did the (defmethod excl:stream-error-identifier (s) nil)
22:56:20
phoe
OK - and s/stream/socket/ in there and let's try again
22:56:41
phoe
I need to crash asleep now
22:57:10
phoe
but let's see how the code behaves once that method is defined
22:57:36
gendl
https://www.irccloud.com/pastebin/vhvdP6aY/
22:58:00
phoe
do the same for SOCKET-ERROR-CODE, lol
22:58:43
phoe
it seems we're violating some implicit assumptions that aserve makes about condition objects
22:58:58
phoe
e.g. that socket errors have socket-error-code and such
22:59:06
phoe
why the hell does it assume that this is a socket-error though?...
22:59:16
phoe
I ain't gonna think about this tonight
22:59:25
phoe
ACTION goes for his garbage collection
22:59:44
gendl
https://www.irccloud.com/pastebin/h2WuZiXB/
23:00:02
gendl
that's with both redefined
23:00:18
gendl
socket-error-identifier and socket-error-code
23:01:09
phoe
the million dollar question that I think you'll need to solve: how did #<BASIC-CHARACTER-OUTPUT-STREAM :CLOSED #x3020003C96DD> end up in there?
23:03:05
gendl
now the error seems to break the swank connection and i'm not getting a backtrace
23:03:26
gendl
go to sleep, i'll configure it to run by loading the fasl at startup instead of doing a pre-saved image for now
23:03:37
phoe
OK - try to get a backtrace when you have a free moment
23:03:37
gendl
it only takes a fraction of a second to load the fasl anyway
23:04:10
phoe
we got the proper error hitting the debugger now, we just need to figure out how it happened
23:04:43
gendl
will do. And will go through my build steps with a fine-toothed comb and see where i might be leaving streams hanging open
23:05:12
gendl
it seems like i'm accidentally starting aserve in the build image before doing save-application but i didn't think i was doing that
23:05:36
gendl
i'm not sure that's happening - just speculation
23:05:47
gendl
how else do we end up with these closed streams in the thawed image
23:05:50
phoe
yes, but that would have such symptoms
23:05:57
phoe
that's sort of what I'm expecting
23:06:05
phoe
ACTION lands in the debugger
0:23:58
gendl
phoe: I know you're (hopefully) asleep, but I got to the bottom of it!
0:25:30
gendl
still not sure of all the "why"s but the closed stream was a log stream
0:26:00
gendl
net.aserve's (vhost-log-stream (wserver-default-vhost *wserver*))
0:28:16
gendl
setting that to *terminal-io* fixes the issue.
0:28:49
gendl
I still have to track down when/where that is being initialized and how it ends up as a closed stream in the saved & thawed image etc.
0:30:10
gendl
Doing those catch-all defmethods you recommended really helped to get to the real backtrace which made it obvious it was getting the error when attempting to do logging. (should have kinda realized that anyway since the http requests were still being responded to -- the http streams weren't the problem)
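[Editor's sketch] The fix gendl describes amounts to replacing a cached stream that is no longer open. The structure and function below are hypothetical stand-ins for aserve's vhost object and are not its API; the sketch only assumes that a stale stream reports false from the standard `open-stream-p`, as the `:CLOSED` stream in the paste suggests:

```lisp
;; Hypothetical stand-in for a server object that caches a log stream.
(defstruct demo-vhost
  (log-stream *standard-output*))

(defun ensure-live-log-stream (vhost &optional (fallback *terminal-io*))
  "Replace VHOST's log stream with FALLBACK if it is missing or closed.
Return the stream that VHOST ends up with."
  (let ((stream (demo-vhost-log-stream vhost)))
    (unless (and (streamp stream) (open-stream-p stream))
      (setf (demo-vhost-log-stream vhost) fallback))
    (demo-vhost-log-stream vhost)))

;; Simulate a stream that was live at build time but is closed after thawing.
(defparameter *stale-stream*
  (let ((stream (make-string-output-stream)))
    (close stream)
    stream))

(defparameter *vhost* (make-demo-vhost :log-stream *stale-stream*))

;; Run this at startup, before the server writes its first log line.
(defparameter *repaired-stream* (ensure-live-log-stream *vhost*))
```

With aserve itself, the equivalent step from the conversation is setting `(vhost-log-stream (wserver-default-vhost *wserver*))` to `*terminal-io*` when the saved image starts.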
Saturday, 1st of August 2020, 6:30:10 UTC