freenode/#sicl - IRC Chatlog
4:17:44
no-defun-allowed
Is there much of a convention for example code in SICL? I ported the metadata probing code to use a made-up SIMD package, so that if one appears, implementing faster probing is just a matter of substituting function names; but it certainly won't work now, as that package doesn't exist.
4:19:46
no-defun-allowed
It is code that could be used, but won't work without some platform-specific modification. Maybe "example code" is the wrong term.
4:22:20
no-defun-allowed
I currently have a CL:ERROR form with a brief description at the start of the file, so that it won't load, and have named it "simd-metadata-table.example.lisp".
4:24:32
no-defun-allowed
I recall reading other codebases which renamed a file from .lisp to .example.lisp to denote that it was not meant to be used as is.
4:45:53
jeosol
no-defun-allowed: just saw your pull request to implement a linear-probing hash table in SICL. Good work
8:38:35
beach
I am watching the presentation by Matt Kulukundis. I think a lot of the complications he talks about, like large keys and large objects, are problems that don't exist in a language with uniform reference semantics. Also, I think the tricks they invented are quite clever. Some of them may require some assembly code to be taken advantage of in a Common Lisp system.
8:48:18
no-defun-allowed
Yes, you can cheat the large object problem with references, when you have objects (like symbols and fixnums) which can be compared with EQ. But I kept fairly faithful to the design of the table (except that I keep hashes in the table, so that resizing doesn't have to recompute hashes), and the only unportable part I recall is probing using SIMD.
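A minimal sketch (in Python, for illustration; none of this is SICL's or no-defun-allowed's actual code) of the probing design under discussion: each slot gets a metadata byte holding the low bits of the key's hash, so a probe can reject most slots without a full key comparison, and the full hash stored per entry lets resizing skip rehashing. A SIMD implementation would compare a whole group of metadata bytes at once; the scalar loop below stands in for that.

```python
EMPTY = 0x80  # high bit set marks an empty slot; real metadata
              # bytes are 0..127, so the two can never collide

class ProbingTable:
    """Linear-probing table with per-slot metadata bytes.
    Resizing is omitted to keep the sketch short."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.metadata = bytearray([EMPTY] * capacity)
        self.hashes = [0] * capacity     # kept so resizing need not rehash
        self.keys = [None] * capacity
        self.values = [None] * capacity

    def _probe(self, key):
        h = hash(key)
        meta = h & 0x7F                  # low 7 bits of the hash
        i = h % self.capacity
        while True:
            if self.metadata[i] == EMPTY:
                return i, h, False       # definite miss: empty slot found
            # cheap metadata comparison first, full key comparison only
            # when the metadata byte matches
            if self.metadata[i] == meta and self.keys[i] == key:
                return i, h, True
            i = (i + 1) % self.capacity  # linear probing

    def put(self, key, value):
        i, h, _ = self._probe(key)
        self.metadata[i] = h & 0x7F
        self.hashes[i] = h
        self.keys[i] = key
        self.values[i] = value

    def get(self, key, default=None):
        i, _, found = self._probe(key)
        return self.values[i] if found else default
```

The point of the metadata byte is that a miss usually terminates on a byte comparison alone, which is exactly the part that vectorizes.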
8:56:02
no-defun-allowed
But I would imagine a Common Lisp implementation could include a safe SIMD library (and I'd like to say "portable" too, I think LLVM gets away with portable vector code somehow). It wouldn't be hard, as the only operations that touch memory are loads and stores, which would just require bounds checking.
8:57:57
no-defun-allowed
...when the elements are integers or floats. I don't think anyone has seriously considered putting references to objects in vector registers.
10:21:47
jackdaniel
fun (subjective) observation: among all cl-related irc channels I'm in this is the most active
10:23:58
beach
no-defun-allowed: I didn't follow all the details in the talk. I'll probably watch it again. But, here is a question: would your implementation be particularly fast for the use case that is so common in Cleavir, namely 1. test (not in the table) 2. insert it?
10:24:32
beach
If so, we should suggest it to the Clasp people who seem to care about compilation time.
10:25:04
no-defun-allowed
I have not tested the performance of inserting yet, but GETHASH for keys which aren't in the table appears to be about as fast as for keys in the table.
10:27:56
no-defun-allowed
On average, a miss is about 3% faster than a hit, but that doesn't seem significant.
10:28:41
beach
I think this might be a common use case. For Cleavir, we traverse an instruction graph with relatively few back arcs. So most of the time, it is not going to be in the table.
10:28:48
no-defun-allowed
The list and bucket hash tables have slower misses than hits, but those are also insignificant results.
10:30:07
beach
Do you do some caching so that a gethash followed by a (setf gethash) with the same object does not do the search all over again?
10:32:20
beach
So here is an idea. Keep a bitmap indexed by hash values modulo its size, but say 10 times as many entries as in the table. It would act as a Bloom filter, so that if there is a 0, then the object is definitely not in the table.
10:33:06
beach
Then, instead of inserting immediately, accumulate objects sequentially until there is a hit in the table.
10:33:54
beach
Then, at the very least, it is possible to make a good guess of the size the table should be to accommodate the new objects.
10:37:42
no-defun-allowed
I have found the list-hash-table to be faster than both the bucket table and linear probing table, with fewer than 7 mappings. A miss would provoke worst-case behaviour, but it's still somehow faster.
10:43:26
no-defun-allowed
Now I'm thinking about how similar a Bloom filter and the metadata table are in purpose. (They are evidently different in construction, but they both are used to handle misses quickly.)
10:44:55
no-defun-allowed
Though, isn't there an often repeated observation that a list is a faster set than a hash table for, say, fewer than 20 elements? There is a constant overhead due to using generic function on SBCL, so the crossover point is lower.
10:46:28
beach
When the hash table is created, allocate a bitmap of some reasonable size that is significantly smaller than the overhead of the hash table itself, say 8 full words, so 512 bits.
10:47:23
beach
For gethash, first take the hash value modulo the size of the bitmap. If the bit is cleared, return NIL.
10:48:58
beach
For (setf gethash), either remember the latest hash value or do the test again. If it is a new object, accumulate it in a vector, I guess an adjustable one.
10:49:51
beach
When the bit test returns 1, check how many objects have accumulated, and resize the table accordingly. Then insert all the existing elements plus the accumulated ones.
10:50:32
beach
Also, adjust the size of the bitmap to reflect the current number of elements, and set the bits for the existing elements.
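A hypothetical sketch of the bitmap scheme beach describes (in Python, for illustration; all names are invented): a clear bit means the key is definitely absent, so misses cost a single bit test; genuinely new keys are accumulated until the first bitmap hit, at which point the table can be resized once for the whole batch instead of growing incrementally.

```python
class FilteredTable:
    """One-hash-function Bloom filter in front of a hash table,
    with deferred bulk insertion of new keys."""

    def __init__(self, bitmap_bits=512):      # 8 full 64-bit words
        self.bitmap_bits = bitmap_bits
        self.bitmap = 0                       # integer used as a bit vector
        self.table = {}                       # stand-in for the real table
        self.pending = []                     # accumulated (key, value) pairs

    def _bit(self, key):
        return 1 << (hash(key) % self.bitmap_bits)

    def get(self, key, default=None):
        if not (self.bitmap & self._bit(key)):
            return default                    # definite miss: no probing
        self._flush()                         # key may still be pending
        return self.table.get(key, default)

    def put(self, key, value):
        if self.bitmap & self._bit(key):
            # Possible duplicate (or a bitmap collision): flush the
            # accumulated keys, then update the table directly.
            self._flush()
            self.table[key] = value
        else:
            self.bitmap |= self._bit(key)
            self.pending.append((key, value))

    def _flush(self):
        # In the real scheme, this is the moment to resize the table
        # once, sized from len(self.table) + len(self.pending), and to
        # grow the bitmap to match the new element count.
        for k, v in self.pending:
            self.table[k] = v
        self.pending.clear()
```

A collision in the bitmap only costs a wasted flush and probe; it never produces a wrong answer, which is the usual Bloom-filter property of false positives but no false negatives.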
10:52:37
beach
It's the Boehm collector that doesn't handle the frequent reallocation of a bigger table.
10:56:25
no-defun-allowed
It sounds like the metadata table could be used for this, but I can't work out how yet.
10:57:53
beach
Yes, that's what inspired me. So let's put this idea on the back burner but keep it in mind.
11:00:53
no-defun-allowed
But I'm also not sure how the bitmap allows for estimating hash table sizes. Do we count the number of keys used between some given key being used and then reused (ignoring collisions in the bitmap)?
11:20:10
beach
No, the total number of existing elements, plus the new accumulated ones will give the estimate.
11:21:03
no-defun-allowed
Ah right. Don't we resize the adjustable vector with accumulated objects using a similar scheme to the hash table?
11:21:12
beach
I should put a meter in the Cleavir code so that we can see how many nodes are typically processed.
11:21:48
beach
Yes, that's what I meant by the adjustable vector being a problem. But at least no probing is required.
11:35:18
no-defun-allowed
Well, when you insert the elements at the end of measurement, you still probe as well.
12:36:41
heisig
Speaking of hash tables - I always wondered why hash tables with less than about 15 elements aren't implemented as an alist.
12:37:24
heisig
I occasionally write my own hash table wrapper that does exactly that, and it is measurably faster in many cases.
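A rough sketch of the kind of hybrid wrapper heisig describes (in Python, for illustration; this is not his code): keep a small association list and promote it to a real hash table once it grows past a threshold. The threshold of 15 here is arbitrary; the right value is implementation- and platform-dependent.

```python
THRESHOLD = 15  # crossover point; would need benchmarking per platform

class HybridMap:
    """Association list for small maps, hash table after promotion."""

    def __init__(self):
        self.alist = []    # list of (key, value) pairs
        self.table = None  # created lazily on promotion

    def get(self, key, default=None):
        if self.table is not None:
            return self.table.get(key, default)
        for k, v in self.alist:        # linear scan, fast while short
            if k == key:
                return v
        return default

    def put(self, key, value):
        if self.table is not None:
            self.table[key] = value
            return
        for i, (k, _) in enumerate(self.alist):
            if k == key:
                self.alist[i] = (key, value)
                return
        self.alist.append((key, value))
        if len(self.alist) > THRESHOLD:
            self.table = dict(self.alist)  # promote to a hash table
            self.alist = []
```

The win for small maps is that a linear scan over a few pairs has no hashing overhead at all, which matters even more when the hash and test go through generic function dispatch.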
13:37:12
pjb
heisig: note the actual threshold count is implementation and platform dependent (and of course, also dependent on the test and hash functions (not sxhash!)). You would need to benchmark it at the start of the program.
13:37:58
pjb
heisig: I've measured as low as 5 elements with sbcl and as high as 35 elements with clisp, on the platform where I tested it.