This publication arrives a whole week late due to some personal issues that I may write about in the next few weeks, but in the meantime:
🗞️ Hot Off the Press
This will become the main section of every newsletter. Here is where I share the new articles (either “blogs” or “TILs”) on my site since my last update (see my 2024 resolutions for additional context on this).
For this week I bring you a blog and a TIL:
[Blog] Late Arrival to the Fuss of LLMs: A list of introductory resources to the field of LLMs and transformer architectures.
[TIL] Remember `to('cpu')` in PyTorch to release GPU memory: A funny issue that I encountered while taking my first baby steps with PyTorch.
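In case the TIL's gist is useful on its own, here is a minimal sketch of the idea, assuming a CUDA device is available: moving a tensor back to the CPU drops the GPU-side reference, and `torch.cuda.empty_cache()` then returns the cached blocks to the driver. The tensor size and variable name are just placeholders for illustration.

```python
import torch

# Assumption: a CUDA GPU is available; otherwise this example is a no-op.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate a large tensor on the GPU.
x = torch.randn(4096, 4096, device=device)
print(torch.cuda.memory_allocated())  # bytes currently held by tensors

# Move the tensor back to the CPU (dropping the GPU copy),
# then release PyTorch's cached GPU blocks back to the driver.
x = x.to("cpu")
torch.cuda.empty_cache()
print(torch.cuda.memory_allocated())  # should now be (near) zero
```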
🍙 Byte-sized bites
In this new section of the newsletter, I will share a list of interesting links that I have come across since the last publication and that are worth sharing.
They can be really varied and may not follow a single theme, so to help you filter them a bit I am adding tags at the beginning of each one so you can clearly identify the topic they are about: 🤖 = AI; 👨🏻💻 = software; 💾 = hardware; 🖧 = decentralization; 🔑 = cryptography; 🔬 = research. You get the idea!
[🤖, 💾] Doom, Dark Compute and AI: I love the concept of dark compute: the idle computing capacity of devices sitting at home, unused. I feel this “dry powder” will become increasingly useful in a world of AIs hungry for compute. Will someone be able to capitalize on it?
[🤖] Sleeper Agents, or how to attack LLMs by poisoning their datasets or by releasing open models with poisoned weights.
[👨🏻💻] Who else is working on nothing?: From FOMO to post-AI techno-pessimism, I really enjoyed reading the different points of view from tech workers about the present and future of the field: how they approached their careers and determined their priorities in life. I feel like many of us carry this background question of whether it makes sense to learn new skills or write new code these days if LLMs and AI will end up doing it for us in no time. Where should we focus our efforts, then?
[👨🏻💻, 🖧] And here is one more reason in favor of open source, open platforms, and the decentralization of AI?
[👨🏻💻, 🖧] Willow Protocol: A protocol for peer-to-peer data stores.
[🤖,🔑] ZK-ML tutorial example over the MNIST dataset from 0xPARC.
[🤖,🔬] Blending is all you need (paper): Interesting paper that shows how integrating smaller models can outperform larger ones while reducing computational requirements: “integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters).”