In its history, computing has gone through cycles of consolidation, often described as a “Mainframe -> Personal Computer -> Mainframe” loop. Compute was originally in the hands of IBM, followed by a (still somewhat enjoyed) period of personal computer sovereignty, leading to today’s landscape of AWS and Azure data mega-warehouses accessed by low-powered Chromebooks. This seems to be driven by both technological constraints and, unfortunately, socio-political maneuvering.

I’m hopeful for a similar but different trajectory for advanced machine learning models. Obviously, the creation of a true artificial general intelligence would be an unprecedented event in human history, and the first-mover advantage would be completely insurmountable. But perhaps large models, through a combination of hardware advances, code optimization, and sufficient specialization, can reach a Goldilocks zone where individuals can perform meaningful compute with them.

AI image generation, currently riding a wave of hype, is a walled garden. Models like DALL-E and MidJourney consume a gargantuan amount of compute. David Holz, CEO of MidJourney, said in one of his office-hours sessions that the project was using ~1% of the available GPUs on their (major) cloud provider. As these tools gain more traction, they may increasingly face scaling issues.

Getting large models into the hands of individuals, if possible, will be key to gaining data sovereignty.