17:19:15 stylewarning flip214: working on it again now; I don't think it has to do with inlining or anything like that
17:19:32 stylewarning flip214: I think this has to do with some tricky business about where code is being loaded and why that location in RAM is faster.
17:20:59 rk[ghost] stassats: aye. i figured i volunteered myself in asking, but i have been too far mentally from computers that it isn't easy for me to just get swinging again
17:23:42 stylewarning |3b|: the profiler didn't give me enough info to track down serious differences
17:23:47 stassats stylewarning: even individual runs have too much variation, it's hard to diagnose anything
17:24:36 stylewarning stassats: i started computing standard deviations, and while the stddev of all but the fast case are around 500ms, the difference in timing is still greater than the stddev
17:24:38 stassats rk[ghost]: well, the fastest way for arm32 to grow threads would be for me to do it, but it's not pleasant and everyone should just use arm64
17:25:28 stylewarning stassats: and interestingly, the stddev (120ms) in the fast case is about 80% smaller than the stddev (500ms) in the slow case
17:25:30 stassats stylewarning: can you measure gc timings as well?
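
[editor's note] The comparison stylewarning describes (is the gap between the fast and slow cases larger than the run-to-run noise, and how much smaller is one stddev than the other) can be sketched roughly as below. This is only an illustration in Python, not the code used in the channel: the function names and the sample timings are hypothetical, and only the 120ms and 500ms stddev figures come from the log.

```python
import statistics


def summarize(timings_ms):
    """Return (mean, sample standard deviation) of a list of timings in ms."""
    return statistics.mean(timings_ms), statistics.stdev(timings_ms)


def relative_reduction(small, large):
    """Fraction by which `small` is smaller than `large`."""
    return (large - small) / large


def difference_exceeds_noise(fast_ms, slow_ms):
    """True if the gap between the means is larger than the larger stddev.

    This is a crude check, not a proper significance test.
    """
    fast_mean, fast_sd = summarize(fast_ms)
    slow_mean, slow_sd = summarize(slow_ms)
    return abs(slow_mean - fast_mean) > max(fast_sd, slow_sd)


# Hypothetical timings in milliseconds, made up for illustration only.
fast_runs = [4000, 4100, 3900, 4050, 3950]
slow_runs = [6000, 6500, 5500, 6800, 5200]

if __name__ == "__main__":
    print("fast (mean, stddev):", summarize(fast_runs))
    print("slow (mean, stddev):", summarize(slow_runs))
    print("gap exceeds noise:", difference_exceeds_noise(fast_runs, slow_runs))
    # Stddev figures quoted in the log: 120ms (fast) vs 500ms (slow).
    print("stddev reduction:", relative_reduction(120, 500))  # 0.76
```

With the quoted figures the reduction works out to 76%, which matches the "about 80%" in the log. A gap larger than one stddev is suggestive at best; more runs or a proper statistical test would be needed to say anything firm.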