libera/#clasp - IRC Chatlog
14:53:49
Bike
clasp is not really built with the idea of having functions run freestanding like that.
15:41:25
yitzi
Bike: I just reverted the return of -1 in set_column and made it return the column value directly. This matches set_position, etc., as opposed to making set_column return void.
18:05:55
Bike
one thing that sticks out in the swank perf data is that 3.5% of the time is spent in lisp_multipleValues, which does nothing but load a thread local variable. i guess doing so is pretty expensive.
18:15:43
drmeister
It takes 0.8 seconds to read the 916,200,191 bytes into a static vector with this...
18:19:48
Bike
with inline definitions disabled, the next most serious slow point in the swank perf data is eclector. it takes 18.6% of the time. that might need some closer review
18:22:40
Bike
i would still expect it to take a pretty good chunk of time since you're still making a gigabyte of objects
18:24:28
Bike
i want to fix that stuff but doing it right will be some pretty serious changes to the compiler
18:29:48
Bike
seems like it's still taking a while in fast-read-byte. might be more arithmetic problems
18:48:06
Bike
ok, dpb thing might be more fixable actually. in a quick test looks like if %ldb is inlined (which it is not, at the moment), and the inputs are declared fixnum, clasp does optimize it down to take 8 ms instead of 51 ms (a million iterations)
18:48:35
yitzi
You could use a different flamegraph library. There are some online ones. Would be nice to have a Lisp one.
18:48:38
Bike
dunno if the type inference will handle things on its own, though. it's not super good at the log*** functions right now
18:49:32
Bike
things are at least set up so that log*** on fixnums will be inlined, so llvm can treat them as bitwise ops and do whatever it does for c code
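For readers following along, the reason inlining matters here is that ldb and dpb on fixnums boil down to a shift and a mask. A minimal C++ sketch of that arithmetic is below. It is an analogue for illustration, not Clasp's implementation: the names field_mask, ldb and dpb are just stand-ins, and it assumes unsigned 64-bit inputs with position below 64.

```cpp
#include <cstdint>

// Mask with the low `size` bits set. Illustrative helper, not a Clasp function.
constexpr std::uint64_t field_mask(unsigned size) {
    return size >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << size) - 1;
}

// Analogue of (ldb (byte size position) n): extract a bit field.
// Assumes position < 64.
constexpr std::uint64_t ldb(unsigned size, unsigned position, std::uint64_t n) {
    return (n >> position) & field_mask(size);
}

// Analogue of (dpb newbyte (byte size position) n): replace a bit field.
// Assumes position < 64; bits of newbyte beyond `size` are discarded.
constexpr std::uint64_t dpb(std::uint64_t newbyte, unsigned size,
                            unsigned position, std::uint64_t n) {
    const std::uint64_t mask = field_mask(size) << position;
    return (n & ~mask) | ((newbyte << position) & mask);
}
```

Once everything is a plain machine integer like this, the compiler backend sees only shifts, ands and ors, which is the situation the fixnum declarations are meant to get to.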
18:54:13
drmeister
Recognizing something like " 7f3893eea528 FAST-READ-BYTE^FAST-IO^FN^^-lcl+0x1c8 (/tmp/perf-710561.map)" works fine.
18:55:26
drmeister
Does anyone know any perl regular expression magic so that it handles a line like...
18:57:28
Bike
so the regex is looking for some letters and numbers, then maybe space, then characters, and then " (something)", i guess?
18:57:48
Bike
oh, so in that case it's not choking on the space, it's the parentheses that are throwing it
18:58:05
drmeister
I WANT "clbind::WRAPPER_VariadicFunction<bool (*)(gctools::smart_ptr<core::Number_O>, gctools::smart_ptr<core::Number_O>), core::policy::clasp_policy, clbind::pureOutsPack<std::integral_constant<bool, true>, std::integral_constant<bool, true> >, clbind::BytecodeWrapper>::entry_point_2+0x83" to be the second matched string.
18:58:36
drmeister
And "(/home/meister/Development/cando/build/boehmprecise/iclasp)" as the last string.
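A rough sketch of the matching being asked for, written in C++ with std::regex rather than Perl. The idea is to anchor the last group to the end of the line and require it to be a parenthesized token with no whitespace, so that spaces and parentheses inside the symbol cannot confuse it. The struct PerfFrame and the function parse_perf_frame are hypothetical names, and the line layout (hex address, symbol, parenthesized object path) is assumed from the two examples quoted above.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical record for one perf stack line.
struct PerfFrame {
    std::string address;  // leading hex address
    std::string symbol;   // symbol plus offset; may contain spaces and parens
    std::string object;   // trailing "(path)" token, parentheses included
};

std::optional<PerfFrame> parse_perf_frame(const std::string& line) {
    // Group 2 is greedy, so it backtracks only far enough to leave the final
    // whitespace-free "(...)" token at the end of the line for group 3.
    static const std::regex frame_re(
        R"re(^\s*([0-9a-fA-F]+)\s+(.*\S)\s+(\(\S+\))\s*$)re");
    std::smatch m;
    if (!std::regex_match(line, m, frame_re)) {
        return std::nullopt;
    }
    return PerfFrame{m[1].str(), m[2].str(), m[3].str()};
}
```

This relies on the object path containing no spaces. A path with spaces would need a different rule for where the last group starts.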
19:36:15
drmeister
That didn't work and the problem is deeper in the perl code. So I hacked it and got it to work. It was a waste of time.
19:39:22
Bike
i have not been able to run perf with CLASP_ENABLE_TRAMPOLINES. it complains pretty inscrutably.
19:40:29
Bike
i wouldn't bother spending too much time on this though - the point is that it's spending time in a structure writer, which is a pretty simple function
21:08:52
yitzi
Bike: I am running flamegraphs on Inravina right now. There, core::lisp_multipleValues is about 8%.
21:48:18
drmeister
https://www.tripadvisor.com/Attractions-g8068204-Activities-Ravina_Trento_Province_of_Trento_Trentino_Alto_Adige.html
21:56:57
Bike
and hey, that bad eh? how unexpected and terrible. i guess this is why we profile. is it perchance being called from bytecode_vm
21:58:36
Bike
i was thinking of inlining it, but if the problem is just that it's called a lot and the main overhead is from the TLS lookup that might not help much
22:04:21
Bike
that's gonna require some creativity to deal with. drmeister rewrote things so bytecode_vm only calls lisp_multipleValues once (mostly) a few months ago, but that also means that bytecode_vm always calls it, even if multiple values don't actually need to be touched
22:04:25
Bike
https://github.com/clasp-developers/clasp/commit/7d430ced25cae2098bad163372779ff7d43a9055#diff-c42f70b38d648f901b23ff50fd9212017beabcf4a2cfef4d7ae091c9013fd12d
22:05:04
Bike
maybe we could do a little caching thing - so bytecode_vm doesn't call lisp_multipleValues usually, but if it does, calls it once and then just uses that thereafter
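The caching idea could look roughly like the sketch below: the interpreter loop holds a lazily filled pointer, pays for the thread-local lookup only the first time an instruction needs multiple values, and reuses the pointer afterwards. Everything here is a hypothetical stand-in, not Clasp's code: MultipleValues, lookup_multiple_values (standing in for lisp_multipleValues), LazyMultipleValues and run_ops are made-up names, and the lookup_count global exists only to make the number of lookups visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-thread multiple-values record.
struct MultipleValues {
    std::size_t count = 0;
};

thread_local MultipleValues tls_multiple_values;

// Demonstration-only counter of how often the TLS lookup runs (not thread-safe).
std::size_t lookup_count = 0;

// Stand-in for the function that does nothing but load the thread-local.
MultipleValues& lookup_multiple_values() {
    ++lookup_count;
    return tls_multiple_values;
}

// Performs the TLS lookup at most once, on first use.
class LazyMultipleValues {
public:
    MultipleValues& get() {
        if (cached_ == nullptr) {
            cached_ = &lookup_multiple_values();
        }
        return *cached_;
    }

private:
    MultipleValues* cached_ = nullptr;
};

// Toy interpreter loop: a nonzero op is one that touches multiple values.
void run_ops(const std::vector<int>& ops) {
    LazyMultipleValues mv;  // one cache per invocation, so it stays on one thread
    for (int op : ops) {
        if (op != 0) {
            mv.get().count += 1;
        }
    }
}
```

A run whose ops never touch multiple values performs no lookup at all, and any other run performs exactly one. The cached pointer is only valid as long as the loop stays on the thread that filled it.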
22:06:00
Bike
which won't show up in a flame graph since it's not done through a function, but i saw bytecode_call having a pretty respectable amount of self time
22:07:13
Bike
i would guess so. i forget the details of how TLS access works, but that drepper paper didn't make it seem easy
22:08:57
Bike
yeah, and there's other stuff to improve with what i'm doing as well, but it does stick out. bytecode_call and bytecode_vm are called A Lot after all, and if we adopt the bytecode fasl sort of stuff moving forward they'll be called even more
22:36:22
drmeister
Bike: I got imagemagick to tile an image from the 10GB image tiles - it's pretty fast. I can tile 12 of them in 6 seconds.
1:37:08
Bike
yitzi: not sure how the tests missed this, but this is an implicit downcast (bad) https://github.com/clasp-developers/clasp/blob/main/src/core/lispStream.cc#L166-L173
1:42:00
Bike
there's also a little bug in the tests in that a couple of them expect a #! macro and now are failing
1:43:13
Bike
readtable-3 might be really wrong, since it makes a standard readtable, which shouldn't have an extension like #! in it