freenode/#sicl - IRC Chatlog
Search
4:59:32
no-defun-allowed
I got the list-hash-table and bucket-hash-table (derived from Code/Hash-tables/hash-table.lisp) working and passing the test cases.
5:01:03
no-defun-allowed
Currently that still relies on the host's SXHASH, so it probably should not be used for EQ or EQL hash tables.
5:03:29
no-defun-allowed
There were some small mistakes (like calling ASSOC with :KEY #'CAR which is already done by ASSOC) and the macro-expansion of WITH-HASH-TABLE-ITERATOR wasn't correct.
5:04:25
beach
Can I assume you used generic functions and standard classes instead of what I did in hash-table.lisp?
5:05:47
no-defun-allowed
Oh, I made one change: HASH-TABLE-TEST should return a symbol, but it would be silly for hash table code to coerce a function to a symbol and back, so I introduced %HASH-TABLE-TEST which returns the "real" test function.
5:09:50
beach
Unless the vector needs to be resized, a bucket hash table can lock each bucket separately.
5:12:29
beach
There are also some papers on lock-free hash tables, and other papers criticizing those papers.
5:12:56
beach
Maybe you can turn that stuff into some kind of project to integrate with your studies.
5:28:56
beach
Some universities give travel grants for stuff like that, and it may take some time for them to process an application, so if you plan to go, you should investigate soon.
5:40:18
no-defun-allowed
I start in March, so the timing would be quite tight. I may also get an offer from a different university about halfway through February.
5:42:39
no-defun-allowed
Probably both. I don't know if the university would bother with a grant if I will only have attended for a month by ELS.
5:44:25
no-defun-allowed
But I do only have two weeks between the last round of offers and the start of the first semester, so there's not much time to think if I do get another offer.
5:46:44
no-defun-allowed
I heard it's done differently elsewhere, but here all the student preferences and results are given to a government organisation (VTAC) and then universities put out offers in three rounds across three months.
5:47:49
beach
That's an interesting way of dealing with the mixture of private and public universities, I guess.
5:48:08
no-defun-allowed
Good thing it does start late, because I would probably be re-considering giving up and growing vegetables if I had to commute to and from university three days a week on days like that enrollment day.
5:51:30
no-defun-allowed
(i.e. on 20/12/2019 when it was 40°; that kind of weather usually goes away late February)
5:52:35
beach
So you are not too worried about the duration of the commute if the temperature is reasonable?
5:54:11
no-defun-allowed
I am also considering renting an apartment closer to the university; the dorm prices are horrendous but they do provide a system for finding other hosts.
5:55:38
no-defun-allowed
Everyone told me that it's usable as time for studying, and I did have decent results doing some programming on the train (though I always fear someone will think I'm doing the bad kind of hacking in public somehow).
5:57:24
no-defun-allowed
Maybe, but they're approximately €150 per week, and currently I only have classes scheduled for three of every five days.
5:58:51
beach
The least expensive alternative would then probably be to commute, and perhaps try to crash at a friend's place one night per week.
6:00:53
no-defun-allowed
ACTION wonders how a time can have a "preference" value of 104% while picking timetable preferences; is it overbooked now?
6:04:15
no-defun-allowed
There's another time with 170%. If it goes over 200%, do they have to run two lectures in parallel?
6:22:27
beach
OK, this is Monday, and (as usual) Monday mornings are chaotic around here. I'll be off and on for the next several hours.
8:32:06
no-defun-allowed
On an SBCL host, I could use SB-KERNEL:GET-LISP-OBJ-ADDRESS to get the representation (not an address per sē for immediate values) of a value and push a function that rehashes all affected hash tables to *AFTER-GC-HOOKS*, but I'm not sure how much effort you think is appropriate with unportable optimisations.
8:33:41
beach
It is up to you if you want your code to work for SBCL. For SICL, the idea is to move keys to the global heap where their addresses will never change.
8:37:34
no-defun-allowed
Another plan at the back of my head is that bucket hash tables with EQ or EQL tests could use the hash as a "first guess", searching the bucket it designates first, then trying the others after. That would usually be faster than using an alist of all the values (provided the hash function is faster than doing however many comparisons, which admittedly is unlikely for EQ or EQL) but still slower than
8:39:41
beach
Do you mean that it would use the address, but that may be wrong if the object has moved?
8:40:07
no-defun-allowed
Then if it's in another bucket, it could be rehashed; but I would have to do a lot of testing because it could also be much slower.
8:40:50
beach
I think you can rely on the system having a way to detect that the GC has been run so that you can rehash then.
8:41:06
no-defun-allowed
It could, but I think it is also possible (but maybe not likely) it could be effective if even SXHASH is used.
8:42:08
beach
I see. I have not given much thought to SXHASH, so I don't think I'll be of much help.
8:46:43
no-defun-allowed
...since, well, I got interested in hash table implementation after reading about other language implementations' hash functions being the same for every table, and thus possible to attack by picking keys which have the same hash (modulo the table size), forcing it to make O(n) lookups.
8:52:54
no-defun-allowed
I have an implementation of the Fowler–Noll–Vo hash function from earlier experiments which can take any random initial state, which could make it difficult to pick adversarial keys. It also is very easy to write as a function which combines a previous state and new data, so it is quite suitable for implementing a SXHASH in my opinion.
8:56:48
beach
I think it would be fantastic if we (i.e. the SICL project) could suggest better hash-table implementations not only for SICL, but for other Common Lisp implementations as well.
9:10:00
no-defun-allowed
Yeah, after that, implementations of some other languages added salts and variance, but I haven't (so far) heard of any Lisp implementations that do.
9:14:37
shka_
hey heisig i implemented closure compilation approach, i don't have benchmarks yet, but code is much cleaner then the old one
9:19:20
no-defun-allowed
Nope, I don't, I made a typo while generating some benign keys to compare timings from.
9:46:36
no-defun-allowed
But something like that should be possible, and it would make for a not uninteresting talk.
10:17:31
no-defun-allowed
By picking fixnums that have the same hash (modulo the size of the table), I can get a very small variance (around 10%), but it could well be experimental error even though I test it a lot of times.
10:19:49
no-defun-allowed
Maybe the trick involves strings or some other composite type instead. Strings are slower to compare, and some web libraries and languages use hash tables to represent request parameters (such as PHP's $_GET).
10:50:09
no-defun-allowed
And I can create a table where all the pairs are in one bucket, but there seems to be no performance degradation.
10:51:49
no-defun-allowed
There is a single cache element to my knowledge, but I try to miss the cache by testing several different keys.
11:05:18
no-defun-allowed
Actually, I can't quite get them all in one bucket, but there is at least significant clustering.
11:16:37
no-defun-allowed
Got it, the bucket position is calculated modulo the index vector's length (which is the power of 2 above the pair vector's length), not the pair vector's length. Now I have all the eggs in one basket, and I can drop it as I please.
11:33:29
no-defun-allowed
Oddly, it appears putting all the string keys into one bucket makes GETHASH 10% or so faster than with them spread out. Maybe if you're kind enough to arrange the keys optimally, the program will get embarrased and will fall over.
14:23:44
beach
pnp: You should do: (asdf:load-system '#:sicl-boot) (in-package #:sicl-boot) and finally (defparameter *b* (boot))
14:24:16
beach
pnp: Some of the new SICL code is incompatible with the Cleavir version 1 systems. The SICL-BOOT system has the right dependencies.
14:26:31
beach
And since it is not really necessary for further development, I have not taken time to fix it.
14:29:43
beach
And I don't know whether any of the environments resulting from bootstrapping could play that role.
14:33:53
beach
Like I said, the REPL isn't that useful to me. So it would be there only to show off.
14:34:52
beach
Well, all the phases are executed by the boot procedure. So then all the environments are as full as they get. And I am thinking none of them is full enough to justify a REPL.
14:35:42
beach
The best thing might be to create a special environment for the REPL, and import lots of stuff from the others, and the rest from the host.
14:36:13
beach
So the REPL itself is not hard to accomplish, but it is tedious to set up an appropriate environment for it.