freenode/#sbcl - IRC Chatlog
10:51:05
pfdietz
I've wanted to be able to compile sbcl with coverage, so I could see what parts are not being tested. Also, random test generation + coverage => automatically generating minimized tests that extend coverage.
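[For context, SBCL ships a coverage contrib, sb-cover, that covers code you compile yourself (it does not cover the pre-built compiler without rebuilding SBCL with instrumentation, which is pfdietz's actual goal). A minimal sketch of the workflow; the file name is hypothetical:

```lisp
(require :sb-cover)

;; compile the code of interest with coverage instrumentation
(declaim (optimize sb-cover:store-coverage-data))
(compile-file "my-code.lisp")          ; hypothetical file
(load "my-code.fasl")

;; ... exercise it, e.g. run the test suite ...

;; write per-file HTML coverage reports
(sb-cover:report "/tmp/coverage-report/")
```
]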
12:00:48
pkhuong
pfdietz: the easiest way to get coverage everywhere might be editcore + HW/binary tracing like honggfuzz?
12:02:14
pfdietz
I want something where the coverage is made available in a form the reducer can easily access. It would also help if a snapshot of the coverage could be taken and the state rolled back to the snapshot.
12:03:08
pfdietz
So: generate a test case, determine that it extends coverage, then reduce it to a minimal form that still extends coverage (and repeat that if the original had more coverage than the reduced form).
12:03:52
pfdietz
This is inspired right now by Doug K's comment on vop coverage in the most recent commit.
12:08:28
pkhuong
i don't have access to any machine with intel pt, but I do have BTS. i'll try to hack something up with intel's PMU today
12:28:00
pkhuong
the hardest part will be dropping branches to / from the C runtime and newly generated code
12:31:42
pfdietz
The approach I've taken on this sort of thing is instrumenting lisp when it's compiled. So if I can recompile part of the compiler, I can collect the information I want.
13:46:12
pfdietz
Ugh, obviously I was wrong. (let ((*macroexpand-hook* …)) <form>) does not affect the macroexpansions in <form>.
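[A sketch of why: the whole LET form is macroexpanded when it is compiled, before the run-time binding of the hook is in effect; the hook only fires for expansions performed while it is dynamically bound, e.g. via EVAL. The macro name is hypothetical:

```lisp
(defmacro demo () ''expanded)

(let ((*macroexpand-hook*
        (lambda (expander form env)
          (format t "expanding ~S~%" form)
          (funcall expander form env))))
  (demo))    ; prints nothing: (DEMO) was already expanded when the
             ; enclosing LET form was compiled

(let ((*macroexpand-hook*
        (lambda (expander form env)
          (format t "expanding ~S~%" form)
          (funcall expander form env))))
  (eval '(demo)))   ; here the expansion happens at run time, while the
                    ; hook is bound, so it does fire
```
]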
14:30:21
stassats
on a large number of &rest it's a clear win, but it requires modifying a lot of call/return vops and is slightly slower when the number is small
14:34:33
flip214
Would it make sense to move &rest to some heap-allocated frame instead of holding them on the stack?
14:35:49
flip214
apropos, I'd really appreciate a pair of functions for splitting a closure into a code pointer and an environment pointer, so that C callback structures that use a single (void*) argument for many functions (like a C++ vtable) could be handled more easily
14:48:25
flip214
stassats: well, in this case I'm pushing multiple data items to C, and the C functions will run the callbacks at some later time, possibly one after another.
14:49:30
flip214
and instead of allocating some struct manually, storing my data in there, and passing its locked address around as a void*, I hoped that I could do that implicitly
14:52:08
flip214
instead of creating fresh (C-api) structures of closures all the time, I'd like to have one static structure with all the functions set up, and get the environment passed in via the (void*)
14:52:49
stassats
i still don't understand because i guess you're describing a solution, not the problem
14:54:04
flip214
there's a (foreign) structure that stores 8 or 10 function pointers. I allocate one, store closures in there, and call the C api with it. many, many times.
14:54:45
flip214
I would have hoped that I could instead allocate _one_ of these structures, store function pointers in there, and pass that _single_ structure to the C api _every_ time.
14:56:07
flip214
as an implicit way to pass the required state on, instead of allocating some class with the data myself
15:00:26
flip214
I'd like to avoid _manually_ allocating something to keep the state. The dynamic environment has all the information, so I'd like to pack that into a void* and send it on
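[The usual workaround for this pattern (a sketch assuming CFFI; all names are hypothetical) is one static callback per vtable slot, with the void* carrying a key into a Lisp-side registry rather than a raw pointer to the closure, which the GC could move:

```lisp
;; registry mapping small integer ids to Lisp closures
(defvar *callback-env* (make-hash-table))
(defvar *next-env-id* 0)

(defun register-env (closure)
  "Store CLOSURE and return an id suitable for smuggling through a void*."
  (let ((id (incf *next-env-id*)))
    (setf (gethash id *callback-env*) closure)
    id))

;; one static C-callable entry point per vtable slot;
;; (cffi:callback on-event) gives the function pointer to store in the struct
(cffi:defcallback on-event :void ((user-data :pointer))
  ;; USER-DATA carries the registry key, not a Lisp object address
  (funcall (gethash (cffi:pointer-address user-data) *callback-env*)))

;; pass (cffi:make-pointer (register-env (lambda () ...))) as the void*
```
]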
15:02:44
flip214
I guess it would be easier to explain in person... sadly you won't be at SBCL20, but perhaps I'll get a chance at the next ELS
15:03:52
stassats
i doubt it, i'm not even close to understanding why normal closures wouldn't be suitable
15:16:01
pfdietz
Support for closure saving/loading in fasls would be really nice for some things I've tried to do.
15:31:03
puchacz
hi, can somebody tell me please if I am on the right track: I'm trying to prepare a report for a bug that happens only after I run my program for about 30-60 minutes on a 32-core (virtual) machine. I have not isolated it yet, but I tried to follow http://www.sbcl.org/manual/#Signal-Related-Bugs - I recompiled sbcl with :sb-show, :sb-show-assem, :sb-qshow, :sb-xref-for-internals and :sb-hash-table-debug
15:32:03
puchacz
with these settings, I delivered the image to the 32-core server, and when the bug appeared it did not corrupt the image this time, but I still tried to follow the steps with gdb and got this: https://paste.ubuntu.com/p/GJphMPXRwV/
15:33:36
puchacz
before, when hunchentoot's unbound session secret and a USOCKET:BAD-FILE-DESCRIPTOR-ERROR appeared, the image was corrupted with the message "continuing with fingers crossed", e.g. https://paste.ubuntu.com/p/cbC7jWjsmz/
15:34:23
puchacz
maybe these extra safety settings prevented image corruption, or I did not wait long enough
15:45:02
puchacz
if I manage to isolate it, shall I submit the program that triggers it even if all I get is something like https://paste.ubuntu.com/p/cbC7jWjsmz/, or is that not worth it?
15:46:19
stassats`
you'll get the function name with (sb-di::code-header-from-pc (sb-sys:int-sap #x539f3163))
15:47:37
puchacz
okay, I will recompile without these extras and try to provoke it again - I don't have the previous image anymore
15:52:05
puchacz
I don't know the syntax, I will stick to Lisp form: (sb-di::code-header-from-pc (sb-sys:int-sap #x539f3163))
16:01:04
stassats`
another way to get rid of &rest copying: if the caller knows how much stack space the callee allocates, it can do that on the callee's behalf and then push all the arguments
16:03:08
stassats`
difficult with all the different function types, closures, symbols, callable instances
17:31:01
puchacz
okay, I got sbcl corruption and as stassats advised, I checked the function where it happened:
18:20:38
puchacz
pfdietz: it happens inside hunchentoot, and I think it is some sort of internet error, like I recorded yesterday, here: https://paste.ubuntu.com/p/cbC7jWjsmz/
18:25:31
flip214
puchacz: you are sure the machine is okay, hardware-wise? did you run a memtest already?
18:26:30
puchacz
flip214 - it is a virtual computer at a provider, I rented it because I wanted to run a simulation on 32 cores
18:28:15
puchacz
but I deleted it and restored it a few times, and I tried 2 locations, so it must be different hardware. unlikely it is a "hardware" (virtualised or otherwise) problem
18:31:28
puchacz
I haven't tried the full setup, but it looked to me that for my problem CCL (when I ran it on my PC) was about 10 times slower, so I gave up.
18:43:23
flip214
puchacz: and you're running full-speed against hunchentoot there? or a limited amount of traffic?
18:45:50
puchacz
hunchentoot is an "interface" between my program on that virtual computer that runs simulations with different parameters, and Mathematica running on my PC that provides these parameters via HTTP
18:46:25
puchacz
so no simultaneous calls but hundreds and then thousands of calls in rapid succession.
18:47:59
puchacz
split the work in hunchentoot handler into 64 pieces, then join them all and respond
19:51:50
puchacz
okay, it seems I will need to prepare an isolated test case and send. Realistically - next weekend :)
20:00:44
stassats
bad file descriptor, faulting when writing to a buffer - it appears you're touching a stream that has already been closed
20:06:26
puchacz
my simulation is 64 threaded, split the work in hunchentoot handler into 64 pieces, then join them all and respond
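[A minimal sketch of the setup described, assuming hunchentoot and SBCL's native threads; RUN-PIECE and COMBINE-SCORES are hypothetical:

```lisp
(hunchentoot:define-easy-handler (simulate :uri "/simulate") (x1 x2)
  (let* ((params (list x1 x2))          ; query parameters arrive as strings
         (threads
           (loop for piece below 64
                 collect (let ((p piece))  ; fresh binding: LOOP may reuse its variable
                           (sb-thread:make-thread
                            (lambda () (run-piece p params))))))
         ;; join all 64 workers, then respond with a single score
         (results (mapcar #'sb-thread:join-thread threads)))
    (format nil "~F" (combine-scores results))))
```
]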
20:08:12
stassats
next i'd modify HUNCHENTOOT::*SUPPORTS-THREADS-P* to always default to NIL and try that
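[For reference, the suggested experiment; note that *SUPPORTS-THREADS-P* is an unexported hunchentoot internal:

```lisp
;; make hunchentoot pick its single-threaded taskmaster by default
(setf hunchentoot::*supports-threads-p* nil)
;; restart the acceptor afterwards so the new default takes effect
```
]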
20:09:12
puchacz
but you are right, hunchentoot does not need to be parallel, it is not a bottleneck
20:09:14
stassats
sure, but i have a feeling it's not going to help - it's still useful to know for certain that it doesn't
20:14:02
stassats
but all that does is close the socket - it might explain the bad-file-descriptor-error, but not the segfault
20:24:09
puchacz
(it is running the simulations now with --lose-on-corruption by the way, if we are lucky it will corrupt it in 30 minutes maybe)
20:26:36
puchacz
mathematica sends something like http://my-ip/simulate?x1=3423.324&x2=34.7&x3=.... etc. and waits
20:27:17
puchacz
when all threads complete, it sends back in the same handler one number as a result, simulation score
20:27:38
puchacz
mathematica immediately after receiving it sends another http request, with different values of x1 x2 etc.
20:31:37
puchacz
it seems to be completing one simulation (so one http request / 64 threads / response cycle) within say 3 seconds