I had my own suspicions back when HashiCorp changed the Terraform licensing (and, yes, I’ve pretty much switched to OpenTofu on the one or two things I use that require infra nuke & pave), but I can’t say that IBM did a bad deal here. Even though I don’t see that much value in the enterprise market (multi-cloud is very much a reality, but once you have Kubernetes a lot of the truly relevant plumbing vanishes from Terraform), Terraform is a nice complement to Red Hat’s Ansible and IBM’s own (almost unknown) orchestration software.
And, of course, it’s yet another data point for doomsayers harping on about Open Source licensing being a barrier to corporate growth…
Just updated my Ideapad Flex, and it’s nice to see how much Fedora has improved since I started out on Fedora 36. So yes, I’ve definitely settled on Fedora as my laptop distribution of choice, even if I still target Ubuntu on the server.
(I’ve resisted moving my laptop to Silverblue or Bluefin for now, but I suspect it won’t take long.)
This week’s notes come a little earlier, partly because of an upcoming long weekend and partly because I’ve been mulling over the LLM space again due to the back-to-back releases of llama3 and phi-3.
Thanks to my recent AMD iGPU tinkering, I’ve been spending a fair bit of time seeing how feasible it is to run these “small” models on a slow(ish), low-power consumer-grade GPU (as well as on more ARM hardware that I will write about later). I think we’re now at a point where these things are finally borderline usable for some tasks and the tooling is becoming truly polished.
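To give a sense of how low the barrier has become, here’s a minimal sketch of chatting with one of these models through ollama’s Python client; it assumes the ollama daemon is running locally and that you’ve already pulled a model tag such as phi3 or llama3 (the prompt is just an example):

```python
# Minimal sketch: chat with a locally hosted model via ollama's Python client.
# Assumes the ollama daemon is running and the model tag has been pulled
# (e.g. `ollama pull phi3`).
import ollama

response = ollama.chat(
    model="phi3",  # or "llama3", or any other tag you have locally
    messages=[{"role": "user", "content": "Why are small local models useful?"}],
)
print(response["message"]["content"])
```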
As an example, I’ve been playing around with dify for a few days. In a nutshell, it is a node-based visual environment to define chatbots, workflows and agents that can run locally against llama3 or phi-3 (or any other LLM, for that matter) and makes it reasonably easy to define and test new “applications”.
It’s pretty great in the sense that it is a docker-compose invocation away and the choice of components is sane, but it is, like all modern solutions, just a trifle too busy:
It has most of what you’d need to do RAG or ReAct except things like database and filesystem indexing, and it is sophisticated to the point where you can not only give your models a plethora of baked-in tools (for function calling) but also define your own tools and API endpoints to call. It is pretty neat, really, and probably the best all-round, self-hostable graphical environment I’ve come across so far.
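For reference, most of these environments (dify included) build on the same generic, OpenAI-style function-calling shape: a JSON-schema description of the tool for the model, plus a plain function you dispatch to when the model decides to call it. A hedged sketch with purely illustrative names (dify’s own tool registration format will differ):

```python
import json

# Illustrative local function the model is allowed to call; in dify this
# would be one of your registered tools or API endpoints.
def get_local_weather(city: str) -> str:
    return json.dumps({"city": city, "forecast": "sunny"})  # stub result

# The JSON-schema description handed to the model so it knows the tool exists.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_local_weather",
        "description": "Get the weather forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# When the model replies with a tool call, the runtime routes it back to code:
def dispatch(tool_call: dict) -> str:
    args = json.loads(tool_call["arguments"])
    if tool_call["name"] == "get_local_weather":
        return get_local_weather(**args)
    raise ValueError(f"unknown tool: {tool_call['name']}")
```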
But I have trouble building actually useful applications with it, because none of the data I want to use is accessible to it, and it can’t automate the things I want to automate because it doesn’t have the right hooks. By shifting everything to the web, we’ve forgone the ability to, for instance, index a local filesystem or e-mail archive, interact with desktop applications, or even create local files (like, say, a PDF).
And all the stuff I want to do with local models is, well, local. And still relies on things like Mail.app and actual documents. They might be in the cloud, but they are neither in the same cloud nor are they accessible via uniform APIs (and, let’s face it, I don’t want them to be).
This may be an unpopular opinion in these days of cloud-first everything, but the truth is that I don’t want to have to centralize all my data or deal with the hassle of multiple cloud integrations just to be able to automate it. I want to be able to run models locally, and I want to be able to run them against my own data without having to jump through hoops.
On the other hand, I am moderately concerned with control over the tooling and code that runs these agents and workflows. I’ve had great success using Node-RED to summarize my RSS feeds (partly because it can be done “upstream” without the need for any local data), and there’s probably nothing I can do in dify that I can’t do in Node-RED, but building a bunch of custom nodes for this would take time.
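The gist of that flow is simple enough to sketch in plain code. Here’s a rough Python equivalent (not my actual Node-RED flow) using feedparser and the ollama client; the feed URL and model name are placeholders:

```python
# A rough Python equivalent of the Node-RED flow: fetch a feed, then ask a
# local model to summarize each entry.
import feedparser
import ollama

feed = feedparser.parse("https://example.com/feed.xml")  # placeholder feed
for entry in feed.entries[:5]:  # keep the demo short
    text = f"{entry.title}\n{entry.get('summary', '')}"
    response = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": f"Summarize this in two sentences:\n{text}"}],
    )
    print(entry.title, "->", response["message"]["content"])
```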
Dropping back to actual code, a cursory review of the current state of the art around langchain and all sorts of other LLM-related projects shows that code quality still leaves a lot to be desired¹, to the point where it’s just easier (and more reliable) to write some things from scratch.
But the key thing for me is that creating something genuinely useful and reliable that does more than just spew text is still… hard. And I really don’t like that the current state of the art is still so focused on content generation.
It’s not the prompting, or the endless looping and filtering, or even the fact that many agent frameworks actually generate their own prompts to a degree where they’re impossible to debug–it’s the fact that the actual useful stuff is still hard to do, and LLMs are still a long way from being able to do it.
They do make for amazing Marketing demos and autocomplete engines, though.
¹ I recently came across a project that was so bad that it just had to be AI-generated. The giveaway was the complete comment coverage on what were essentially re-implementations of stuff in the Python standard library, written in the typical “Java cosplay” object-oriented style that is so prevalent in the LLM space. It was that bad.
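If you haven’t had the misfortune of seeing the style, here’s a made-up caricature of the kind of thing I mean (not the actual project’s code):

```python
# A made-up caricature of the style: exhaustive comments, a class hierarchy,
# and a manual loop, all to re-implement what str.lower() already does.
class StringTransformationStrategy:
    """Abstract base strategy for transforming strings."""

    def transform(self, input_string: str) -> str:
        """Transform the input string and return the result."""
        raise NotImplementedError

class LowercaseTransformationStrategy(StringTransformationStrategy):
    """Concrete strategy that converts a string to lowercase."""

    def transform(self, input_string: str) -> str:
        """Iterate over each character and convert it to lowercase."""
        result = ""
        for character in input_string:
            result += character.lower()
        return result

# The idiomatic one-liner, of course: input_string.lower()
```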
Back when I last tried it (on the Apple TV, quite a few years ago), Provenance was very nice. I just hope it won’t follow in Delta’s footsteps and discriminate against EU users.
I ended up throwing my back out early in the week, so most of my time was spent in comical pain, moving around like a crab on stilts and trying to get some work done in between bouts of lying down, watching Fallout and reading Scalzi’s Starter Villain, which was actually quite fun.
Since it seems to be iOS emulator season, I thought I’d rewind to my Easter break, when, as a concession to the need to relax, I decided to pack some form of gaming device. But since I also wanted to minimize packing, I settled on a game controller and my iPad Pro for light gaming.
I know I am very late to this party, but the stupidest thing about this for me is that, as an EU resident, if I want to download Delta (which is a free app), I have to install AltStore and pay Eur. 1.50 + VAT a year, which is definitely something I don’t agree with in principle. And it’s all Apple’s fault to begin with: even though the EU forced them to allow third-party app stores, that yearly fee exists to cover Apple’s Core Technology Fee.
I’m just going to wait until this stupidity gets sorted out and I can get it via Apple’s own app store–there are literally dozens of Nintendo emulators out there still, and I suspect Delta will be outclassed by all the Android ports that are likely to pop up.
I’ve been pointing out for ages now that LLMs are barely optimized, so here’s another example of a possible inference speedup that seems very promising (it works somewhat like on-the-fly distillation).
If this technique checks out and ends up being implemented in mainstream tooling like ollama, it’s going to significantly lower compute and memory requirements for a bunch of scenarios.
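I don’t know the internals of this particular technique, but for a flavor of how this family of speedups tends to work, here’s a toy, greedy sketch of speculative decoding (a related, better-known approach, not necessarily the one linked above); draft_model and full_model are hypothetical stand-ins for real inference calls:

```python
# Toy, greedy sketch of speculative decoding (illustrative only): a cheap
# draft model proposes k tokens, the full model verifies them in one batched
# pass, and the longest agreeing prefix is kept.
def speculative_step(prompt_tokens, draft_model, full_model, k=4):
    # 1. The cheap model guesses the next k tokens autoregressively.
    context = list(prompt_tokens)
    draft = []
    for _ in range(k):
        token = draft_model(context)
        draft.append(token)
        context.append(token)

    # 2. The expensive model scores all k positions in a single pass
    #    (this batching is where the speedup comes from).
    verified = full_model(prompt_tokens, candidates=draft)

    # 3. Keep tokens while the draft agrees; the first mismatch still yields
    #    one correct token, so progress is never slower than plain decoding.
    accepted = []
    for guess, truth in zip(draft, verified):
        accepted.append(truth)
        if guess != truth:
            break
    return accepted
```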
As part of my forays into LLMs and GPU compute “on the small” I’ve been playing around with the AceMagic AM18 in a few unusual ways. If you missed my initial impressions, you might want to check them out first.
This is what I’d splurge on if I had free rein on my budget (and the time to relax and actually play with it).
The only thing it seems to be missing is audio over the USB port (an omission that seems kind of silly, since it would let people do away with their audio interfaces).
Also, it would be nice if Arturia ever bothered to ship V-Collection as iOS Audio Units, but I’ve mostly given up waiting on that.
Guess what, I just kept writing notes anyway. Easter Break helped reset my expectations towards work and this week I managed to get back into the swing of things, with a fair bit of writing and documentation work done on my own time.
I am so, so sorry to see this happen. I paid my way through college doing design, but decided I would never again use Adobe software after Creative Cloud became the mess it is today.
Even though I don’t do graphical design as part of my profession (well, not explicitly at least), I willingly paid for Affinity to sponsor them as a viable alternative to Adobe.
And it sort of worked (until it backfired today): the Affinity suite is (for the moment) good-quality native Mac software that neither relies on cloud features nor has a subscription model.
As much as their FAQ claims that will not change, I think we’ve all seen this before–in short, I don’t trust Canva one whit and fully expect to revisit this post in a year when Serif/Affinity breaks one of those three tenets above and forces me to move away from their software.
I have been having a highly unusual couple of weeks (which included recovering from a bout of food poisoning that hit the day after my last post), and I am now taking a break from my usual routine to try to get some perspective on things.
Instabuy. I’ve been a long-term PICO-8 fan, and this, even in such an early stage, is simply sublime.
I can’t wait to see what people come up with. I’m a bit worried that one of my kids joked that he wanted to do a PowerPoint clone, though.
The only flaw I can see is that it’s not available for ARM Linux yet (so I can’t run it on a Pi or any of my development boards), but I’m sure that will be fixed soon enough.
I’ve read through a bunch of articles on this and a fair chunk of the filing, and it seems like the DOJ doesn’t really have that much of a case here. Their accusations are so broad and so full of holes (well, most of the technical ones at least, except some of the paywall-related points) that it’s almost as if they wanted to make the European Commission look good by comparison.
Or maybe, just maybe, someone had to justify their job. Stranger things have happened.
I watched this the other day and it was simply sublime. My own attempt at doing something similar (but far less realistic) has been languishing for the better part of two years, and it may take another two until I even print the first parts…
If you’ve been tracking my homelab adventures, you’ll know that I’ve been on the lookout for a small, powerful, and quiet Ryzen desktop and mini-server. I’ve been using a Ryzen 7 5700U-based laptop for almost three years now and it’s been a great experience, but I wanted to explore the desktop side of things, especially since my meanderings into edge computing have shown that ARM-based servers are at an interesting inflection point for running small AI models and I just had no real data on AMD’s capabilities in that space.
Not sure if it’s genius, madness, or both, but generating near-real-time video of performance stats and viewing it in picture-in-picture mode is not exactly the first approach that comes to mind for designing a system monitor…