

Oh yes, the training cost is definitely a big drawback here; it's not optimized at all, and it plateaus at an average playing level.
Interestingly, I believe some people did research on this and found internal activations in the model that seem to represent the state of the chess board (as in, they track the current position, and when artificially modified, the model takes the modification into account in its play). A French YouTuber used this to show how LLMs can have some kind of internal representation of the world. I can try to dig up the sources if you're interested.
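If it helps make that concrete, here's a minimal sketch of the linear-probe idea as I understand it (my own illustration, not code from that research; the dimensions and labels are made up, and random tensors stand in for activations that would really be captured from the model mid-game):

```python
import torch
import torch.nn as nn

HIDDEN_DIM = 512    # assumed size of the model's residual stream
NUM_SQUARES = 64    # one prediction per board square
NUM_CLASSES = 3     # hypothetical labels: empty / white piece / black piece

# Stand-in data: in the real experiment these would be activations recorded
# from the chess-playing model, paired with the true board state at that move.
activations = torch.randn(1000, HIDDEN_DIM)
board_labels = torch.randint(0, NUM_CLASSES, (1000, NUM_SQUARES))

# The probe is deliberately a single linear layer: if it predicts the board
# well above chance, the structure must already be present in the activations.
probe = nn.Linear(HIDDEN_DIM, NUM_SQUARES * NUM_CLASSES)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    logits = probe(activations).view(-1, NUM_SQUARES, NUM_CLASSES)
    # CrossEntropyLoss expects (batch, classes, squares), hence the permute
    loss = loss_fn(logits.permute(0, 2, 1), board_labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

The intervention half of the experiment then goes the other way: you edit the activation along the direction the probe found and check whether the model's next move respects the edited board rather than the real one.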
Here is the main blog post I remembered: it has a follow-up, a more scientific version, and builds on two other articles, so you might want to dig into what they mention in the introduction.
It is indeed a fairly technical discovery, and it still lacks a complete and broader analysis, but it is very interesting in that it somewhat invalidates the common gut feeling that LLMs are just lucky randomness.