libera/#clasp - IRC Chatlog
18:19:48
Bike
with inline definitions disabled, the next most serious slow point in the swank perf data is eclector. it takes 18.6% of the time. that might need some closer review
18:22:40
Bike
i would still expect it to take a pretty good chunk of time since you're still making a gigabyte of objects
18:24:28
Bike
i want to fix that stuff but doing it right will be some pretty serious changes to the compiler
18:29:48
Bike
seems like it's still taking a while in fast-read-byte. might be more arithmetic problems
18:48:06
Bike
ok, dpb thing might be more fixable actually. in a quick test looks like if %ldb is inlined (which it is not, at the moment), and the inputs are declared fixnum, clasp does optimize it down to take 8 ms instead of 51 ms (a million iterations)
18:48:35
yitzi
You could use a different flamegraph library. There are some online ones. Would be nice to have lisp one
18:48:38
Bike
dunno if the type inference will handle things on its own, though. it's not super good at the log*** functions right now
18:49:32
Bike
things are at least set up so that log*** on fixnums will be inlined, so llvm can treat them as bitwise ops and do whatever it does for c code
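For readers unfamiliar with ldb/dpb, here is a rough C++ sketch of what they reduce to once the inputs are known to be fixed-width integers: a shift and a mask. The function names mirror the Lisp ones but are otherwise illustrative, and the sketch assumes position is below 64 and uses unsigned 64-bit words rather than Clasp's actual fixnum representation.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `size` bits set (assumption: size may be 0..64).
inline std::uint64_t low_mask(unsigned size) {
  return size >= 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << size) - 1);
}

// Like (ldb (byte size position) integer): extract a bit field.
inline std::uint64_t ldb(unsigned size, unsigned position, std::uint64_t integer) {
  return (integer >> position) & low_mask(size);
}

// Like (dpb newbyte (byte size position) integer): replace a bit field.
inline std::uint64_t dpb(std::uint64_t newbyte, unsigned size, unsigned position,
                         std::uint64_t integer) {
  const std::uint64_t mask = low_mask(size);
  return (integer & ~(mask << position)) | ((newbyte & mask) << position);
}
```

With nothing but shifts, ands, and ors left, a backend like llvm can fold these the same way it would for hand-written C.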
18:54:13
drmeister
Recognizing something like " 7f3893eea528 FAST-READ-BYTE^FAST-IO^FN^^-lcl+0x1c8 (/tmp/perf-710561.map)" works fine.
18:55:26
drmeister
Does anyone know any perl regular expression magic so that it handles a line like...
18:57:28
Bike
so the regex is looking for some letters and numbers, then maybe space, then characters, and then " (something)", i guess?
18:57:48
Bike
oh, so in that case it's not choking on the space, it's the parentheses that are throwing it
18:58:05
drmeister
I WANT "clbind::WRAPPER_VariadicFunction<bool (*)(gctools::smart_ptr<core::Number_O>, gctools::smart_ptr<core::Number_O>), core::policy::clasp_policy, clbind::pureOutsPack<std::integral_constant<bool, true>, std::integral_constant<bool, true> >, clbind::BytecodeWrapper>::entry_point_2+0x83" to be the second matched string.
18:58:36
drmeister
And "(/home/meister/Development/cando/build/boehmprecise/iclasp)" as the last string.
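A pattern along the lines Bike describes can be sketched as follows, in C++ with std::regex rather than in the Perl script under discussion. The names PerfFrame and parse_perf_frame are hypothetical, and the sketch assumes the trailing "(...)" group contains no nested parentheses, which is what lets the symbol in the middle contain spaces and parentheses of its own.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical holder for the three pieces of one perf stack-frame line.
struct PerfFrame {
  std::string address;  // leading hex address
  std::string symbol;   // symbol+offset, may contain spaces and parentheses
  std::string object;   // trailing "(...)" group, parentheses included
};

// Returns true and fills `out` if `line` looks like "ADDR SYMBOL (OBJECT)".
bool parse_perf_frame(const std::string& line, PerfFrame& out) {
  // The last group forbids inner parentheses and is anchored at the end of
  // the line, so only the final "(...)" can match it; everything between
  // the address and that group is taken as the symbol.
  static const std::regex frame_re(
      R"re(^\s*([0-9a-fA-F]+)\s+(.*\S)\s+(\([^()]*\))\s*$)re");
  std::smatch m;
  if (!std::regex_match(line, m, frame_re)) return false;
  out.address = m[1].str();
  out.symbol = m[2].str();
  out.object = m[3].str();
  return true;
}
```

On the FAST-READ-BYTE line quoted above this yields the address, the symbol, and "(/tmp/perf-710561.map)"; a line whose symbol is the long clbind::WRAPPER_VariadicFunction<...> name should split the same way, because only the last parenthesized group can reach the end of the line.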
19:36:15
drmeister
That didn't work and the problem is deeper in the perl code. So I hacked it and got it to work. It was a waste of time.
19:39:22
Bike
i have not been able to run perf with CLASP_ENABLE_TRAMPOLINES. it complains pretty inscrutably.
19:40:29
Bike
i wouldn't bother spending too much time on this though - the point is that it's spending time in a structure writer, which is a pretty simple function
21:08:52
yitzi
Bike: I am running flamegraphs on Inravina right now. There, core::lisp_multipleValues is about 8%.
21:48:18
drmeister
https://www.tripadvisor.com/Attractions-g8068204-Activities-Ravina_Trento_Province_of_Trento_Trentino_Alto_Adige.html
21:56:57
Bike
and hey, that bad eh? how unexpected and terrible. i guess this is why we profile. is it perchance being called from bytecode_vm
21:58:36
Bike
i was thinking of inlining it, but if the problem is just that it's called a lot and the main overhead is from the TLS lookup that might not help much
22:04:21
Bike
that's gonna require some creativity to deal with. drmeister rewrote things so bytecode_vm only calls lisp_multipleValues once (mostly) a few months ago, but that also means that bytecode_vm always calls it, even if multiple values don't actually need to be touched
22:04:25
Bike
https://github.com/clasp-developers/clasp/commit/7d430ced25cae2098bad163372779ff7d43a9055#diff-c42f70b38d648f901b23ff50fd9212017beabcf4a2cfef4d7ae091c9013fd12d
22:05:04
Bike
maybe we could do a little caching thing - so bytecode_vm doesn't call lisp_multipleValues usually, but if it does, calls it once and then just uses that thereafter
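The caching idea might look roughly like this. It is a toy sketch, not Clasp's code: MultipleValues, lookup_multiple_values, LazyMultipleValues, Op, and run are all hypothetical stand-ins, and the lookup counter exists only to make the effect visible.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-thread multiple-values buffer.
struct MultipleValues {
  std::size_t count = 0;
};

// Demo-only counter of how often the thread-local accessor is hit.
thread_local int g_lookup_calls = 0;

// Hypothetical stand-in for the accessor that does the TLS lookup.
MultipleValues& lookup_multiple_values() {
  thread_local MultipleValues mv;
  ++g_lookup_calls;
  return mv;
}

// Does the TLS lookup on first use only, then reuses the cached pointer.
class LazyMultipleValues {
 public:
  MultipleValues& get() {
    if (cached_ == nullptr) cached_ = &lookup_multiple_values();
    return *cached_;
  }

 private:
  MultipleValues* cached_ = nullptr;
};

enum class Op { Plain, TouchValues };

// Toy interpreter loop: only TouchValues ops need the buffer.
// Returns how many ops touched it.
std::size_t run(const Op* ops, std::size_t n) {
  LazyMultipleValues mv;  // one cache per invocation of the loop
  std::size_t touched = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (ops[i] == Op::TouchValues) {
      mv.get().count = i;
      ++touched;
    }
  }
  return touched;
}
```

A run with no TouchValues ops never pays for the lookup, and a run with many pays once. The cost moved onto the hot path is a null check per access, so whether this wins in practice would need profiling.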
22:06:00
Bike
which won't show up in a flame graph since it's not done through a function, but i saw bytecode_call having a pretty respectable amount of self time
22:07:13
Bike
i would guess so. i forget the details of how TLS access works, but that drepper paper didn't make it seem easy
22:08:57
Bike
yeah, and there's other stuff to improve with what i'm doing as well, but it does stick out. bytecode_call and bytecode_vm are called A Lot after all, and if we adopt the bytecode fasl sort of stuff moving forward they'll be called even more
22:36:22
drmeister
Bike: I got imagemagick to tile an image from the 10GB image tiles - it's pretty fast. I can tile 12 of them in 6 seconds.
1:37:08
Bike
yitzi: not sure how the tests missed this, but this is an implicit downcast (bad) https://github.com/clasp-developers/clasp/blob/main/src/core/lispStream.cc#L166-L173
1:42:00
Bike
there's also a little bug in the tests in that a couple of them expect a #! macro and now are failing
1:43:13
Bike
readtable-3 might be really wrong, since it makes a standard readtable, which shouldn't have an extension like #! in it
2:52:36
drmeister
That's the leucine amino acid: when the backbone phi, psi angles are 180, 140, it gives the probability of finding the side chain at the chi1, chi2 dihedral angles listed.
2:53:40
drmeister
There are about 720,000 entries in this database, one rotamer per line with comments.
2:57:10
drmeister
I compared the size of the gzip'd spiroligomer conformations.cpk file vs the protein rotamer library and it's 17:1 in size.
2:58:01
drmeister
So it takes 17 times as much information to specify spiroligomer structures in a modular way vs proteins.
2:58:52
drmeister
Ok, that's not completely fair because they store only the essential dihedral angles and I store every dihedral angle, angle and bond length.
3:00:45
drmeister
So it takes about as much information to describe the essence of protein structure as it does to store spiroligomer structure and we both have about 20 functional groups. Within a factor of 2.
3:09:10
drmeister
Oh dear lord - this is ridiculous. I googled "Periodic Table" to look up atomic masses and there's a javascript widget for rotating a Bohr model of an atom. That makes zero sense.