Language models keep growing in size, driven by the fact that model quality scales extremely well with model size. As a result, delivering these models to end users is becoming increasingly challenging, and how to make serving them faster and more cost-effective is a constant question.
Hacker News is one of the leading online communities for discussing software and startup topics. I’ve frequented the site for over ten years and constantly admire its high signal-to-noise ratio. It houses a wealth of knowledge and insightful discussions accumulated over the years. That invaluable