Creative Machines Archives - 311 Institute
https://www.311institute.com/tag/creative-machines/

Researchers trained an OpenAI rival in half an hour for $50
https://www.311institute.com/researchers-trained-an-openai-rival-in-half-an-hour-for-50/
Thu, 27 Feb 2025 00:01:27 +0000


WHY THIS MATTERS IN BRIEF

Once a foundational AI model has been trained, which costs a huge amount of money, it’s actually easy to clone its core functions cheaply.

 

Love the Exponential Future? Join our XPotential Community, future-proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Fresh on the back of the release of China’s DeepSeek R1 Artificial Intelligence (AI) model, which cost $5.6 million to train – a fraction of the cost of other foundation AI models – researchers in the US managed to create a low-cost AI reasoning model rivalling OpenAI’s o1 in just 26 minutes, as outlined in a paper published last week. The model, called s1, was refined using a small dataset of 1,000 questions and for under $50, according to TechCrunch.

 

 

To do this, researchers at Stanford University and the University of Washington used a method known as distillation – which allows smaller models to draw from the answers produced by larger ones – to refine s1 using answers from Google’s AI reasoning model, Gemini 2.0 Flash Thinking Experimental. Google’s terms of service note that you can’t use Gemini’s API to “develop models that compete with” the company’s AI models.
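Mechanically, distillation boils down to harvesting a teacher model’s full responses (reasoning trace included) and fine-tuning the student on them as ordinary supervised data. A minimal sketch of the data-collection step – `query_teacher` here is a hypothetical stub, not the Gemini API:

```python
# Sketch of the data-collection step in distillation: pair prompts with a
# larger "teacher" model's responses, then fine-tune the smaller "student"
# on those pairs. `query_teacher` is a hypothetical stub standing in for a
# real model API call.

def query_teacher(prompt: str) -> str:
    """Stub teacher; a real pipeline would call the larger model's API here."""
    return f"<think>worked steps for: {prompt}</think> final answer"

def build_distillation_set(prompts: list[str]) -> list[dict]:
    """Build a supervised fine-tuning dataset from teacher outputs."""
    return [{"prompt": p, "completion": query_teacher(p)} for p in prompts]

dataset = build_distillation_set(["What is 7 * 8?", "Factor 91."])
```

The student is then fine-tuned on `dataset` with an ordinary supervised-learning loop; the expensive part – generating good reasoning traces – is borrowed from the teacher, which is why distillation is so cheap relative to training from scratch.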

 

The Future of Generative AI and AI 2040, by AI Keynote Matthew Griffin

 

The researchers based s1 on Qwen2.5, an open source model from Alibaba Cloud. They initially started with a pool of 59,000 questions to train the model on, but found that the larger data set didn’t offer “substantial gains” over a whittled-down set of just 1,000. The researchers say they trained the model on just 16 Nvidia H100 GPUs.

 

Breaking the AI Scaling Laws, Deepseek R1

 

The s1 model also uses a technique called test-time scaling, allowing the model to “think” for a longer amount of time before producing an answer. As noted in the paper, researchers forced the model to continue reasoning by adding “Wait” to the model’s response. “This can lead the model to double-check its answer, often fixing incorrect reasoning steps,” the paper says.
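In spirit, the trick is a decoding-time intervention: whenever the model emits its end-of-reasoning marker, the marker is stripped and “Wait” is appended so decoding continues. A toy sketch, with `generate_chunk` as a hypothetical stand-in for a real decoding call and `</think>` as an assumed marker name:

```python
END = "</think>"  # assumed end-of-reasoning marker; real token names vary

def generate_chunk(context: str) -> str:
    """Hypothetical stub for one decoding pass; a real system would call
    the model here. This stub always tries to stop reasoning."""
    return " ...reasoning... " + END

def think_longer(prompt: str, extra_rounds: int = 2) -> str:
    """Budget forcing: each time the model emits the end marker, strip it
    and append "Wait" so reasoning continues; the final pass may stop."""
    context = prompt
    for _ in range(extra_rounds):
        chunk = generate_chunk(context)
        context += chunk.replace(END, "") + "Wait"
    return context + generate_chunk(context)

out = think_longer("Q: Is 97 prime?")
```

The `extra_rounds` budget controls how much additional "thinking" is forced; the paper’s finding is that even this crude intervention often repairs faulty reasoning steps.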

 

 

OpenAI’s o1 reasoning model uses a similar approach, something the buzzy AI startup DeepSeek sought to replicate with the launch of its R1 model that it claims was trained at a fraction of the cost. OpenAI has since accused DeepSeek of distilling information from its models to build a competitor, violating its terms of service. As for s1, the researchers claim that s1 “exceeds o1-preview on competition math questions by up to 27%.”

The rise of smaller and cheaper AI models threatens to upend the entire industry. They could prove that major companies like OpenAI, Microsoft, Meta, and Google don’t need to spend billions of dollars training AI, while building massive data centers filled with thousands of Nvidia GPUs.

Researchers discover DeepSeek AI sending user data directly to China Mobile
https://www.311institute.com/researchers-discover-deepseek-ai-sending-user-data-directly-to-china-mobile/
Tue, 25 Feb 2025 11:09:33 +0000


WHY THIS MATTERS IN BRIEF

While people are used to apps sending data to different companies, it’s much harder to discover where AIs are sending it, and in this case your data is going straight to China Mobile, a Chinese government-owned organisation.

 


The website of the sudden global smash hit Chinese Artificial Intelligence (AI) company DeepSeek, whose chatbot became the most downloaded app in the United States, has computer code that’s sending user login information to a Chinese state-owned telecommunications company that has been barred from operating in the US, security researchers say.

 

 

The web login page of DeepSeek’s chatbot contains heavily obfuscated computer script that when deciphered shows connections to computer infrastructure owned by China Mobile, a state-owned telecommunications company. The code appears to be part of the account creation and user login process for DeepSeek.
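Analyses like this typically involve peeling back string-obfuscation layers and scanning the decoded script for hard-coded domains. A simplified sketch of the idea – the watchlist entry and sample snippet below are illustrative, not DeepSeek’s actual code:

```python
import re

# Illustrative watchlist; a real analysis matches decoded strings against
# known infrastructure of the organisation in question.
WATCHLIST = {"cmpassport.com"}  # assumed indicator domain, for illustration

def decode_hex_escapes(src: str) -> str:
    r"""Undo one common obfuscation layer: \xNN hex string escapes."""
    return re.sub(r"\\x([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), src)

def flag_domains(script: str) -> set[str]:
    """Return watchlisted domains referenced anywhere in the decoded script."""
    decoded = decode_hex_escapes(script)
    candidates = set(re.findall(r"[\w-]+(?:\.[\w-]+)+", decoded))
    return {d for d in candidates if any(d.endswith(w) for w in WATCHLIST)}

# A toy obfuscated snippet hiding the first letter behind a hex escape.
sample = 'var u = "https://\\x63mpassport.com/api/login";'
```

Real obfuscation stacks many more layers (packed code, dynamic string assembly), which is why researchers deobfuscate the script rather than just grepping the raw source.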

In its privacy policy, DeepSeek acknowledged storing data on servers inside the People’s Republic of China. But the link to China Mobile revealed by researchers suggests its chatbot is more directly tied to the Chinese state than previously known. The US has claimed there are close ties between China Mobile and the Chinese military as justification for placing limited sanctions on the company. DeepSeek and China Mobile did not respond to emails seeking comment.

 

The Future of Cyber Security, by Cyber Keynote Speaker Matthew Griffin

 

The growth of Chinese-controlled digital services has become a major topic of concern for US national security officials. Lawmakers in Congress last year voted, on an overwhelmingly bipartisan basis, to force the Chinese parent company of the popular video-sharing app TikTok to divest or face a nationwide ban, though the app has since received a 75-day reprieve from President Donald Trump, who is hoping to work out a sale.

 

 

The code linking DeepSeek to one of China’s leading mobile phone providers was first discovered by Feroot Security, a Canadian cybersecurity company, which shared its findings with The Associated Press. The AP took Feroot’s findings to a second set of computer experts, who independently confirmed that China Mobile code is present. Neither Feroot nor the other researchers observed data transferred to China Mobile when testing logins in North America, but they could not rule out that data for some users was being transferred to the Chinese telecom.

 

Inside DeepSeek R1, by AI Keynote Matthew Griffin

 

The analysis only applies to the web version of DeepSeek. The researchers did not analyze the mobile version, which remains one of the most downloaded pieces of software on both the Apple and Google app stores.

The US Federal Communications Commission unanimously denied China Mobile authority to operate in the United States in 2019, citing “substantial” national security concerns about links between the company and the Chinese state. In 2021, the Biden administration also issued sanctions limiting the ability of Americans to invest in China Mobile after the Pentagon linked it to the Chinese military.

“It’s mindboggling that we are unknowingly allowing China to survey Americans and we’re doing nothing about it,” said Ivan Tsarynny, CEO of Feroot.

 

 

“It’s hard to believe that something like this was accidental. There are so many unusual things to this. You know that saying ‘Where there’s smoke, there’s fire’? In this instance, there’s a lot of smoke,” Tsarynny said.

Stewart Baker, a Washington, DC-based lawyer and consultant who has previously served as a top official at the Department of Homeland Security and the National Security Agency, said DeepSeek “raises all of the TikTok concerns plus you’re talking about information that is highly likely to be of more national security and personal significance than anything people do on TikTok,” one of the world’s most popular social media platforms.

Users are increasingly putting sensitive data into Generative AI systems — everything from confidential business information to highly personal details about themselves. People are using generative AI systems for spell-checking, research, and even highly personal queries and conversations. The data security risks of such technology are magnified when the platform is owned by a geopolitical adversary and could represent an intelligence goldmine for a country, experts warn.

“The implications of this are significantly larger because personal and proprietary information could be exposed. It’s like TikTok but at a much grander scale and with more precision. It’s not just sharing entertainment videos. It’s sharing queries and information that could include highly personal and sensitive business information,” said Tsarynny, of Feroot.

CEOs today will be the last to manage all human workforces as AI agents get to work
https://www.311institute.com/ceos-today-will-be-the-last-to-manage-all-human-workforces-as-ai-agents-get-to-work/
Mon, 24 Feb 2025 10:49:36 +0000


WHY THIS MATTERS IN BRIEF

Historically CEOs have presided over purely human workforces, but as AI agents come into the workforce they might have to manage those as well.

 


Today’s CEOs are likely the last who will “manage a workforce of only human beings,” Salesforce CEO Marc Benioff told Axios’ Ina Fried in Davos. The rise of Generative AI agents, which Benioff also described as “limitless digital labor,” is among the next wave of advancements for the tech.

 

 

“We are really moving into a world now of managing humans and agents together,” he said, highlighting his own company’s Agentforce product, launched in September along with a sandbox where Salesforce’s customers can test their agents. At the time, Benioff applauded the technology as “AI as it was meant to be.”

“Because I’m using Agentforce I just have that much more productivity,” he said, highlighting Agentforce’s increased ability to resolve support inquiries, before adding that he’s thinking of ways to “redeploy” support agents in sales positions because those employees “don’t have as much to do because Agentforce is so productive for them.”

 

 

So, while I’ve talked about the unlimited human workforce many times before, as Artificial Intelligence (AI) automates jobs, tasks, and skills – such as coding and graphic design – and then democratises them for everyone, this new “limitless” workforce adds a very interesting new wrinkle. It also raises important questions, such as: what happens when labour and the skills you and your business need are abundant, and what is the impact of that on company revenues and global GDP?

Researchers use Quantum AI algorithms to break Generative AI bottlenecks
https://www.311institute.com/researchers-use-quantum-ai-algorithms-to-break-generative-ai-bottlenecks/
Sun, 16 Feb 2025 10:20:34 +0000


WHY THIS MATTERS IN BRIEF

Quantum computers promise exponential speed-ups for certain classes of problems, and Quantum AI is therefore hugely promising – and it’s starting to make a difference.

 


As we continue to see big advances in quantum computing, researchers at the University of Waterloo Institute for Quantum Computing (IQC) have found that quantum algorithms could speed up Generative Artificial Intelligence (GAI) creation and usage.

 

 

Ronagh, who led the research team, says his work “focuses on the intersection of quantum science and AI and whether quantum computing can speed up mimicking real-world patterns and phenomena as AI and machine learning scientists have done.”

“We found that yes it can – but not for the typical generative AI problems such as those in computer vision and speech,” Ronagh says. “We saw more significant speed ups for the types of problems that have periodic patterns, for example in analysing molecular dynamics.”

 

 

The BIGGEST Societal Level Changes by 2040, by Futurist Keynote Matthew Griffin

 

The function of large molecules like proteins depends on how they fold into specific 3D structures, which makes the search and generation of these structures a vital problem in pharmacology. And current state-of-the-art techniques use generative AI to enhance this process.

Ronagh says even though quantum mechanical effects are typically ignored in molecular dynamics simulations, they can benefit from quantum computing solutions thanks to the periodicity of molecular bond angles. Many other examples of problems with such periodic structures exist in condensed matter physics and quantum field theories.

 

 

Ronagh says one of the most salient examples of the power of quantum computers is in cryptography. Shor’s algorithm famously uses the periodicity that underlies the factoring problem to break RSA encryption. However, he clarifies that this is not a practical use case in itself but rather a demonstration of the unique capabilities of quantum algorithms – evidence that quantum computing holds true potential beyond being merely a threat to information security.
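The periodicity Shor’s algorithm exploits can be shown classically for tiny numbers: factoring N reduces to finding the period r of a^x mod N, and that period-finding step is what a quantum computer accelerates exponentially. A brute-force sketch:

```python
from math import gcd

def period(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1: the quantity Shor's algorithm
    extracts with a quantum Fourier transform; brute-forced here."""
    x, r = a % n, 1
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def factor_via_period(n: int, a: int) -> set[int]:
    """Classical post-processing from Shor's algorithm: an even period r
    yields the factor gcd(a**(r//2) - 1, n) (for a lucky choice of base a)."""
    r = period(a, n)
    assert r % 2 == 0, "odd period: retry with another base"
    f = gcd(pow(a, r // 2) - 1, n)
    return {f, n // f}
```

For n = 15 and base a = 2, the powers 2, 4, 8, 1 repeat with period 4, and gcd(2² − 1, 15) = 3 recovers the factors {3, 5}; classically the period search takes exponential time in the size of n, which is exactly the gap quantum hardware closes.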

“Hacking is a scary implication that drives our urgency for changing our encryption protocols, as well as our curiosity for whether quantum computers are buildable,” he says. “But, instead, we can aspire to simulate molecules better, leading to the development of superior materials and life-saving drugs. This holds the potential of being a very economically valuable application of quantum computers to our daily lives.”

He says exploring applications of quantum computing goes beyond daydreaming about the future impacts of quantum technologies.

 

 

“That’s where I think finding useful quantum algorithms is so important. They can tell us more about the types of applications we want to run on the computer we are trying to build, so we can design and optimize the computer architecture more informed, and plan the massive undertaking of building it better,” Ronagh says.

The paper was published in the Proceedings of Machine Learning Research.

AI’s killer app could be the limitless digital workforce
https://www.311institute.com/ais-killer-app-could-be-the-limitless-digital-workforce/
Sat, 15 Feb 2025 14:22:15 +0000


WHY THIS MATTERS IN BRIEF

As Agentic AI rolls out it could end up being used to create a limitless digital workforce that we could all tap into.

 


If Don Draper from “Mad Men” was quintessentially, at his deepest self, an ad man, then Salesforce CEO Marc Benioff is likewise a sales guy. Lately he’s been selling – or more like singing the gospel – about Artificial Intelligence (AI) agents, their role in helping companies create “limitless workforces,” and Salesforce’s recently released agent-maker platform Agentforce.

 

 

It’s true that Benioff famously gets hyped about all of Salesforce’s latest offerings, but on Tuesday, as part of the company’s latest quarterly results, he also released numbers to back up why he’s so excited. Benioff said that Salesforce closed 200 deals for Agentforce in just one quarter and that it plans to hire 1,400 salespeople to help it close the many more deals it is working on.

 

The Future of Work and AI, by Future of Work Keynote Matthew Griffin

 

Agentforce “first became available on October 24th, and we’re already seeing this incredible velocity, more than 200 Agentforce deals just in Q3,” Benioff told analysts on the quarterly conference call. “The pipeline is in the thousands for potential transactions that are coming up in future quarters.” He named FedEx, Adecco, Accenture, ACE Hardware, IBM, and RBC Wealth Management as Agentforce customers.

The company said it now expects to bring in more revenue for its fiscal year than it previously projected, too – $37.8 billion to $38 billion, up 8% to 9% over the previous year – largely on the strength of its AI products.

 

 

Benioff recently told reporters that he expects Salesforce customers to deploy over a billion AI agents within the next year and that AI agents will allow companies to have an unlimited workforce – which is something I’ve been talking about for over a decade.

“These agents are not tools. They are becoming collaborators. They’re working 24/7 to analyze data, make decisions, take action,” he said on the conference call. “Salesforce has become, right out of the gate here, the largest supplier of digital labor, and this is just the beginning.”

How much of this vision turns into reality we’ll have to wait and see. LLM-based tech is still working to solve its hallucination problem – an issue baked into a technology that is, at its core, about imitating creativity. Benioff said on the call that because Agentforce can train on the up to 300 petabytes of actual company data Salesforce manages, “you’re going to see remarkably low hallucinogenic performance.”

 

 

Other startups are working on other LLM issues necessary to turn AI agents into actual digital collaborators, like memory and state.

But as 2025 also becomes the year of AI, it’s becoming clear that enterprises have found a direction for their AI investments. After spending much of 2024 throwing around experimental budgets to answer the board-level question “what are we doing with AI?” the answer is apparently: AI agents for sales and customer service.

It is interesting, and not without irony, that Salesforce will be hiring humans to help them sell this tech. Maybe that means that AI will create jobs, not just replace them. Maybe it means that even a company touting the rise of the digital workforce isn’t ready to turn over the reins entirely to software just yet. But it will also be equipping its sales humans with AI sales development representatives.

 

 

As Salesforce COO Brian Millham explained on the call, “To capture this increased demand for Agentforce, we’re hiring 1,400 AEs globally in our fourth quarter and we’re also using new sales SDR agent and sales coaching agent to augment every seller.”

Salesforce isn’t alone in chasing this killer enterprise AI app. Startups offering SDR technology have boomed in 2024, attracting lots of VC investment and a lot of initial revenue – the object of many an exploratory enterprise AI budget. But it’s an area where incumbents like Salesforce, HubSpot, and ZoomInfo – which hold the customer data to train the bots – have the advantage. Ditto for customer service bots.

Deepseek claims its reasoning model beats OpenAI o1 in benchmarks
https://www.311institute.com/deepseek-claims-its-reasoning-model-beats-openai-o1-in-benchmarks/
Thu, 13 Feb 2025 10:20:47 +0000


WHY THIS MATTERS IN BRIEF

DeepSeek is a Chinese AI that cost $5.6 million to train, versus OpenAI’s models, which cost hundreds of millions of dollars, so beating OpenAI’s benchmarks is significant.

 


Chinese Artificial Intelligence (AI) lab DeepSeek has released an open version of DeepSeek-R1, its so-called reasoning model, that it claims performs as well as OpenAI’s o1 on certain AI benchmarks.

R1 is available from the AI dev platform Hugging Face under an MIT license, meaning it can be used commercially without restrictions. According to DeepSeek, R1 beats o1 on the benchmarks AIME, MATH-500, and SWE-bench Verified. AIME is a set of competition math problems, while MATH-500 is a collection of word problems. SWE-bench Verified, meanwhile, focuses on programming tasks.

 

 

Being a reasoning model, R1 effectively fact-checks itself, which helps it to avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer – usually seconds to minutes longer – to arrive at solutions compared to a typical nonreasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

 

Inside DeepSeek AI, by Futurist Keynote Matthew Griffin

 

R1 contains 671 billion parameters, DeepSeek revealed in a technical report. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

Indeed, 671 billion parameters is massive, but DeepSeek also released “distilled” versions of R1 ranging in size from 1.5 billion parameters to 70 billion parameters. The smallest can run on a laptop. As for the full R1, it requires beefier hardware, but it is available through DeepSeek’s API at prices 90%-95% cheaper than OpenAI’s o1.

 

 

There is a downside to R1. Being a Chinese model, it’s subject to benchmarking by China’s internet regulator to ensure that its responses “embody core socialist values.” R1 won’t answer questions about Tiananmen Square, for example, or Taiwan’s autonomy.

Many Chinese AI systems, including other reasoning models, decline to respond to – that is, censor – topics that might raise the ire of regulators in the country, such as speculation about the Xi Jinping regime.

R1 arrives weeks after the outgoing Biden administration proposed harsher export rules and restrictions on AI technologies for Chinese ventures. Companies in China were already prevented from buying advanced AI chips, but if the new rules go into effect as written, companies will be faced with stricter caps on both the semiconductor tech and models needed to bootstrap sophisticated AI systems.

 

 

In a policy document last week, OpenAI urged the US government to support the development of US AI, lest Chinese models match or surpass them in capability. In an interview with The Information, OpenAI’s VP of policy Chris Lehane singled out High-Flyer Capital Management, DeepSeek’s corporate parent, as an organization of particular concern.

So far, at least three Chinese labs – DeepSeek, Alibaba, and Kimi, which is owned by Chinese unicorn Moonshot AI – have produced models that they claim rival o1. Of note, DeepSeek was the first – it announced a preview of R1 in late November. In a post on X, Dean Ball, an AI researcher at George Mason University, said that the trend suggests Chinese AI labs will continue to be “fast followers.”

“The impressive performance of DeepSeek’s distilled models […] means that very capable reasoners will continue to proliferate widely and be runnable on local hardware,” Ball wrote, “far from the eyes of any top-down control regime.”

Experiment proves the bigger the AI model the more AI agents collaborate
https://www.311institute.com/experiment-proves-the-bigger-the-ai-model-the-more-ai-agents-collaborate/
Thu, 06 Feb 2025 14:02:50 +0000


WHY THIS MATTERS IN BRIEF

As we look at a future dominated by AI and AI agents, knowing how much and how well they can collaborate before failure matters a lot.

 


Humans are social animals, but there appear to be hard limits to the number of relationships we can maintain at once. And now new research suggests Artificial Intelligence (AI) Agents may be capable of collaborating in much larger groups. In the 1990s, British anthropologist Robin Dunbar suggested that most humans can only maintain social groups of roughly 150 people. While there is considerable debate about the reliability of the methods Dunbar used to reach this number, it has become a popular benchmark for the optimal size of human groups in business management.

 

 

There is growing interest in using groups of AI agents to solve tasks in various settings, which prompted researchers to ask whether today’s Large Language Models (LLMs) are similarly constrained when it comes to the number of individuals that can effectively work together. They found the most capable models could cooperate in groups of at least 1,000 – almost an order of magnitude more than humans.

 

The Future of AI and Business, by AI Keynote Matthew Griffin

 

“I was very surprised,” Giordano De Marzo at the University of Konstanz, Germany, told New Scientist. “Basically, with the computational resources we have and the money we have, we [were able to] simulate up to thousands of agents, and there was no sign at all of a breaking of the ability to form a community.”

To test the social capabilities of LLMs the researchers spun up many instances of the same model and assigned each one a random opinion. Then, one by one, the researchers showed each copy the opinions of all its peers and asked if it wanted to update its own opinion.
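That protocol resembles a classic opinion-dynamics simulation. A toy sketch, with a probabilistic majority-adoption rule standing in for the LLM’s decision to update (an illustrative model, not the researchers’ actual code):

```python
import random

def consensus_sim(n_agents: int, p_adopt: float, max_rounds: int,
                  seed: int = 0) -> bool:
    """Toy opinion-dynamics stand-in for the experiment: each agent holds a
    binary opinion and, when shown all peers' opinions, adopts the majority
    view with probability p_adopt (a proxy for how readily the model
    updates). Returns True if full consensus is reached in time."""
    rng = random.Random(seed)
    opinions = [rng.choice([0, 1]) for _ in range(n_agents)]
    for _ in range(max_rounds):
        for i in range(n_agents):
            majority = int(2 * sum(opinions) > n_agents)
            if rng.random() < p_adopt:
                opinions[i] = majority
        if len(set(opinions)) == 1:
            return True
    return False
```

Agents with an adoption probability near 1 converge within a round or two even at n = 1,000; the experimental question is whether a given LLM actually behaves that way when shown its peers’ opinions, which the researchers found only the strongest models did.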

 

 

The team found that the likelihood of the group reaching consensus was directly related to the power of the underlying model. Smaller or older models, like Claude 3 Haiku and GPT-3.5 Turbo, were unable to come to agreement, while the 70-billion-parameter version of Llama 3 reached agreement if there were no more than 50 instances. But for GPT-4 Turbo, the most powerful model the researchers tested, groups of up to 1,000 copies could achieve consensus – another significant finding.

The researchers didn’t test larger groups due to limited computational resources.

The results suggest that larger AI models could potentially collaborate at scales far beyond humans, Dunbar told New Scientist.

“It certainly looks promising that they could get together a group of different opinions and come to a consensus much faster than we could do, and with a bigger group of opinions,” he said.

 

 

The results add to a growing body of research into Multi-Agent AI Systems that has found groups of AIs working together could do better at a variety of math and language tasks. However, even if these models can effectively operate in very large groups, the computational cost of running so many instances may make the idea impractical.

Also, agreeing on something doesn’t mean it’s right, Philip Feldman at the University of Maryland told New Scientist. It perhaps shouldn’t be surprising that identical copies of a model quickly form a consensus, but there’s a good chance that the solution they settle on won’t be optimal.

However, it does seem intuitive that AI agents are likely to be capable of larger scale collaboration than humans, as they are unconstrained by biological bottlenecks on speed and information bandwidth. Whether current models are smart enough to take advantage of that is unclear, but it seems entirely possible that future generations of the technology will be able to.

The post Experiment proves the bigger the AI model the more AI agents collaborate appeared first on 311 Institute.

Nvidia doubles down on AI World Models
https://www.311institute.com/nvidia-doubles-down-on-ai-world-models/
Sun, 02 Feb 2025 13:49:11 +0000

WHY THIS MATTERS IN BRIEF

AI companies are increasingly trying to model everything in the world – at all scales – in simulation, and the upsides are HUGE.

 

Love the Exponential Future? Join our XPotential Community, future-proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Nvidia, the multi-trillion dollar Artificial Intelligence (AI) chip behemoth, has announced that it’s getting into “World Models” – AI models that take inspiration from the mental models of the world that humans develop naturally.

 

 

At CES 2025 in Las Vegas, the company announced that it is making openly available a family of world models that can predict and generate “physics-aware” videos – a huge deal when you consider that more companies than ever before are developing new products in simulation, and even creating massive world-scale digital twins, for example to model the entire Earth. Nvidia is calling this family Cosmos World Foundation Models, or Cosmos WFMs for short.

The models, which can be fine-tuned for specific applications, are available from Nvidia’s API and NGC catalogs, GitHub, and the AI dev platform Hugging Face.

“Nvidia is making available the first wave of Cosmos WFMs for physics-based simulation and synthetic data generation,” the company wrote in a blog post. “Researchers and developers, regardless of their company size, can freely use the Cosmos models under Nvidia’s permissive open model license that allows commercial usage.”

There are a number of models in the Cosmos WFM family, divided into three categories: Nano for low latency and real-time applications, Super for “highly performant baseline” models, and Ultra for maximum quality and fidelity outputs.

 

 

The models range in size from 4 billion to 14 billion parameters, with Nano being the smallest and Ultra being the largest. Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters.

As a part of Cosmos WFM, Nvidia is also releasing an “upsampling model,” a video decoder optimized for augmented reality, and guardrail models to ensure responsible use, as well as fine-tuned models for applications like generating sensor data for autonomous vehicle development. These, as well as the other Cosmos WFM models, were trained on 9,000 trillion tokens drawn from 20 million hours of real-world human interaction, environmental, industrial, robotics, and driving data, Nvidia said – in AI, “tokens” represent bits of raw data, which in this case is video footage.

Nvidia wouldn’t say where this training data came from, but at least one report – and lawsuit – alleges that the company trained on copyrighted YouTube videos without permission.

When reached for comment, an Nvidia spokesperson said that Cosmos “isn’t designed to copy or infringe any protected works.”

 

 

“Cosmos learns just like people learn,” the spokesperson said. “To help Cosmos learn, we gathered data from a variety of public and private sources and are confident our use of data is consistent with both the letter and spirit of the law. Facts about how the world works – which are what the Cosmos models learn – are not copyrightable or subject to the control of any individual author or company.”

Setting aside the fact that models like Cosmos don’t really learn like people learn, copyright experts say claims like Nvidia’s, which draw support from the fair use legal doctrine, may not stand up to judicial scrutiny. Whether these companies prevail will largely depend on how courts decide whether fair use – which allows the use of copyrighted works to make something new, so long as the result is transformative – applies to AI training.

Nvidia claimed that Cosmos WFM models, given text or video frames, can generate “controllable, high-quality” synthetic data to bootstrap the training of models for robotics, driverless cars, and more.

“Nvidia Cosmos’ suite of open models means developers can customize the WFMs with data sets, such as video recordings of autonomous vehicle trips or robots navigating a warehouse,” Nvidia wrote in a press release.

 

 

“Cosmos WFMs are purpose-built for physical or Embodied AI research and development, and can generate physics-based videos from a combination of inputs, like text, image and video, as well as robot sensor or motion data.”

Nvidia said that companies, including Waabi, Wayve, Foretellix, and Uber, have already committed to piloting Cosmos WFMs for various use cases, from video search and curation to building AI models for self-driving vehicles.

“Generative AI will power the future of mobility, requiring both rich data and very powerful compute,” Uber CEO Dara Khosrowshahi said in a statement. “By working with Nvidia, we are confident that we can help supercharge the timeline for safe and scalable autonomous driving solutions for the industry.”

Important to note is that Nvidia’s world models aren’t “open source” in the strictest sense. To abide by one widely accepted definition of open source AI, an AI model has to provide enough information about its design so that a person could substantially re-create it and disclose any pertinent details about its training data, including the provenance and how the data can be obtained or licensed.

 

 

Nvidia hasn’t published Cosmos WFM training data details, nor has it made available all the tools needed to re-create the models from scratch. That’s probably why the tech giant is referring to the models as “open” as opposed to open source.

“We really hope [Cosmos will] do for the world of robotics and industrial AI what Llama … has done for enterprise,” Nvidia CEO Jensen Huang said onstage during a press event on Monday.

The post Nvidia doubles down on AI World Models appeared first on 311 Institute.

Chinese OpenAI alternative DeepSeek AI tanks US stocks by over $1 Trillion
https://www.311institute.com/chinese-openai-alternative-deepseek-ai-tanks-us-stocks-by-over-1-trillion/
Fri, 31 Jan 2025 13:49:20 +0000

WHY THIS MATTERS IN BRIEF

It was always going to happen – as I’ve been tracking for the past year – China was always going to develop a cheaper-to-train AI.

 

Love the Exponential Future? Join our XPotential Community, future-proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Conventional Artificial Intelligence (AI) wisdom suggests that building Large Language Models (LLMs) requires deep pockets – typically billions in investment. But DeepSeek, a Chinese AI startup, has just shattered that paradigm. Just as I’ve shown with other Chinese AIs that are being trained in new ways – ways that result in much cheaper AI model pricing for consumers – DeepSeek has developed a world-class AI model for just $5.6 million, one with orders of magnitude fewer parameters than the giant multi-trillion-parameter AI models it’s competing against.

 

 

DeepSeek’s V3 model can go head-to-head with industry giants like Google’s Gemini and OpenAI’s latest offerings, all while using a fraction of the typical computing resources. The achievement caught the attention of many industry leaders, and what makes this particularly remarkable is that the company accomplished this despite facing massive US export restrictions that limited their access to the latest Nvidia chips.

The numbers tell a compelling story of efficiency. While most advanced AI models require between 16,000 and 100,000 GPUs for training, DeepSeek managed with just 2,048 GPUs running for 57 days. The model’s training consumed 2.78 million GPU hours on Nvidia H800 chips – remarkably modest for a 671-billion-parameter model.

 

Inside DeepSeek, with Futurist Matthew Griffin

 

To put this in perspective, Meta needed approximately 30.8 million GPU hours – roughly 11 times more computing power – to train its Llama 3 model, which actually has fewer parameters, at just 405 billion.
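These figures hang together arithmetically – GPU hours are just GPU count × days × 24:

```python
# Sanity-check the article's efficiency numbers.
deepseek_gpu_hours = 2_048 * 57 * 24   # GPUs x days x hours/day
meta_llama3_gpu_hours = 30_800_000     # reported for Llama 3

print(f"DeepSeek: ~{deepseek_gpu_hours / 1e6:.2f}M GPU hours")
print(f"Ratio:    ~{meta_llama3_gpu_hours / deepseek_gpu_hours:.1f}x")
# 2,048 GPUs for 57 days works out to ~2.80M GPU hours, matching the
# ~2.78M reported (training presumably didn't run the final day in
# full), and Meta's 30.8M GPU hours is indeed roughly 11x more.
```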

 

 

DeepSeek’s approach resembles a masterclass in optimization under constraints. Working with H800 GPUs – AI chips designed by Nvidia specifically for the Chinese market with reduced capabilities – the company turned potential limitations into innovation. Rather than using off-the-shelf solutions for processor communication, they developed custom solutions that maximized efficiency.

While competitors continue to operate under the assumption that massive investments are necessary, DeepSeek is demonstrating that ingenuity and efficient resource utilization can level the playing field.

DeepSeek’s achievement lies in its innovative technical approach, showcasing that sometimes the most impactful breakthroughs come from working within constraints rather than throwing unlimited resources at a problem.

At the heart of this innovation is a strategy called “auxiliary-loss-free load balancing.” Think of it like orchestrating a massive parallel processing system where traditionally, you’d need complex rules and penalties to keep everything running smoothly. DeepSeek turned this conventional wisdom on its head, developing a system that naturally maintains balance without the overhead of traditional approaches.
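A minimal sketch of the idea – a simplification for illustration, not DeepSeek’s actual code, with all constants invented – is to give every expert a routing bias that is nudged down whenever the expert is picked and up whenever it sits idle. Selection stays balanced without adding any penalty term to the training loss:

```python
import random

NUM_EXPERTS, TOP_K, STEP = 8, 2, 0.02
bias = [0.0] * NUM_EXPERTS   # per-expert routing bias, outside the loss

def route(affinity):
    # Select the top-k experts by affinity + bias; the bias affects only
    # *which* experts are chosen, never the gradient of the loss.
    ranked = sorted(range(NUM_EXPERTS),
                    key=lambda e: affinity[e] + bias[e], reverse=True)
    return ranked[:TOP_K]

def update_bias(chosen):
    # Overloaded (chosen) experts get nudged down, idle ones up.
    for e in range(NUM_EXPERTS):
        bias[e] += -STEP if e in chosen else STEP * TOP_K / (NUM_EXPERTS - TOP_K)

random.seed(1)
loads = [0] * NUM_EXPERTS
for step in range(5000):
    # Skewed token affinities: experts 0 and 1 are "naturally" favoured.
    affinity = [random.gauss(0.5 if e < 2 else 0.0, 0.2)
                for e in range(NUM_EXPERTS)]
    chosen = route(affinity)
    update_bias(chosen)
    if step >= 1000:             # measure load after the biases settle
        for e in chosen:
            loads[e] += 1

print("loads per expert:", loads)   # roughly equal despite the skew
```

The feedback loop pushes every expert toward the same selection frequency (TOP_K / NUM_EXPERTS), which is the load-balancing outcome an auxiliary loss would otherwise have to enforce.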

 

 

The team also pioneered what they call “Multi-Token Prediction” (MTP) – a technique that lets the model think ahead by predicting multiple tokens at once. In practice, this translates to an impressive 85-90% acceptance rate for these predictions across various topics, delivering 1.8 times faster processing speeds than previous approaches.
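Those two figures are consistent with each other. Assuming, for illustration, a single speculative extra token per decoding step, the expected tokens per step is 1 plus the acceptance rate, which puts the throughput ceiling at about 1.85–1.9x – in line with the reported 1.8x once verification overhead is subtracted:

```python
# Expected tokens emitted per decoding step when the model predicts one
# speculative extra token: the base token is always kept, the extra one
# is kept with probability equal to the acceptance rate.
for acceptance_rate in (0.85, 0.90):
    tokens_per_step = 1 + acceptance_rate
    print(f"acceptance {acceptance_rate:.0%} -> ~{tokens_per_step:.2f}x ceiling")
```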

The technical architecture itself is a masterpiece of efficiency. DeepSeek’s V3 employs a Mixture-of-Experts approach with 671 billion total parameters, but here is the clever part – it only activates 37 billion for each token. This selective activation means they get the benefits of a massive model while maintaining practical efficiency.

Their choice of FP8 mixed precision training framework is another leap forward. Rather than accepting the conventional limitations of reduced precision, they developed custom solutions that maintain accuracy while significantly reducing memory and computational requirements.
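To see why this is delicate, here is a toy stand-in for FP8 rounding: it keeps a float’s sign and exponent and rounds the mantissa to 3 bits, roughly mimicking the E4M3 format (4 exponent bits, 3 mantissa bits, maximum finite value 448). Real FP8 kernels, scaling strategies, and subnormals are far more involved – this only illustrates the bounded relative error that makes the format workable.

```python
import math

E4M3_MAX = 448.0   # largest finite value in the E4M3 FP8 format

def fake_fp8(x, mantissa_bits=3):
    """Round x to an FP8-like grid: keep sign and exponent, round the
    mantissa to `mantissa_bits` bits, and saturate at the E4M3 maximum.
    (Subnormals and exponent-range clamping are deliberately ignored.)"""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    x = min(abs(x), E4M3_MAX)        # saturate instead of overflowing
    m, e = math.frexp(x)             # x = m * 2**e, with m in [0.5, 1)
    scale = 2 ** (mantissa_bits + 1) # implicit leading bit + 3 stored bits
    m = round(m * scale) / scale
    return sign * math.ldexp(m, e)

for v in (0.3, -7.77, 1000.0):
    print(f"{v:>8} -> {fake_fp8(v)}")
```

Because only the mantissa is truncated, the relative error is capped at a few percent regardless of magnitude – which is why, with careful scaling, training can tolerate one-byte values.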

The impact of DeepSeek’s achievement ripples far beyond just one successful model.

 

 

For European AI development, this breakthrough is particularly significant. Many advanced models do not make it to the EU because companies like Meta and OpenAI either cannot or will not adapt to the EU AI Act. DeepSeek’s approach shows that building cutting-edge AI does not always require massive GPU clusters – it is more about using available resources efficiently.

This development also shows how export restrictions can actually drive innovation, which is something I’ve been saying forever during my keynotes. DeepSeek’s limited access to high-end hardware forced them to think differently, resulting in software optimizations that might have never emerged in a resource-rich environment. This principle could reshape how we approach AI development globally.

The democratization implications are profound. While industry giants continue to burn through billions of dollars, DeepSeek has created a blueprint for efficient, cost-effective AI development. This could open doors for smaller companies and research institutions that previously could not compete due to resource limitations.

 

 

However, this does not mean large-scale computing infrastructure is becoming obsolete. The industry is shifting focus toward scaling inference time – how long a model takes to generate answers. As this trend continues, significant compute resources will still be necessary, likely even more so over time.

But DeepSeek has fundamentally changed the conversation. The long-term implications are clear: we are entering an era where innovative thinking and efficient resource use could matter more than sheer computing power. For the AI community, this means focusing not just on what resources we have, but on how creatively and efficiently we use them.

The post Chinese OpenAI alternative DeepSeek AI tanks US stocks by over $1 Trillion appeared first on 311 Institute.

Microsoft CEO weighs in on AI Agents ending SaaS
https://www.311institute.com/microsoft-ceo-weighs-in-on-ai-agents-ending-saas/
Fri, 24 Jan 2025 14:11:45 +0000

WHY THIS MATTERS IN BRIEF

Software as a Service is the bedrock of the global software industry, and AI Agents could change it dramatically.

 

Love the Exponential Future? Join our XPotential Community, future-proof yourself with courses from XPotential University, read about exponential tech and trends, connect, watch a keynote, or browse my blog.

Over the holidays, an excerpt from an interview with Microsoft CEO Satya Nadella went viral in which Nadella allegedly predicted the “Death of Software-as-a-Service” (SaaS) because of the emergence of ever-smarter Artificial Intelligence (AI) agents that will work together to build new kinds of software and deliver new services and capabilities very differently to the way they’re built and delivered today.

 

 

But Nadella did not, in fact, speak about the “death of SaaS apps” at all. He repeatedly referred to business apps as “canvases” for writing, calculation, or other business functions.

Still, it is clear that the way business users interact with apps will change dramatically as AI agents become the primary interface to SaaS apps for many users.

An AI agent might, for example, proactively identify customers for a specific seasonal upsell based on information in your CRM, draft and send a sales E-Mail using your mail service, register a purchase on your website and trigger the shipment of the product in your E-Commerce platform, monitor complaints through your customer support platform, and feed all this information back into your main CRM.

Looking at the complexity of these so-called agentic workflows, it seems unlikely that a single AI system will be able to integrate all of these activities into one single piece of software in the near future – and by that point, the agent will essentially have become a SaaS product itself.
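Wired up by hand, such a workflow is just a pipeline of tool calls. The sketch below is deliberately simplified, and every function name in it is invented for illustration – a real agent would choose and order the tools itself rather than follow a hard-coded script:

```python
# Hypothetical tool stubs standing in for real SaaS integrations.
def crm_find_upsell_candidates(season):
    return [{"customer": "Acme Ltd", "product": "winter bundle"}]

def send_email(customer, offer):
    return {"to": customer, "offer": offer, "status": "sent"}

def check_support_complaints(customer):
    return []   # no complaints on file for this customer

def crm_log(event):
    print("CRM <-", event)
    return event

def seasonal_upsell_agent(season):
    """One pass of the agentic workflow sketched above: find candidates,
    reach out, watch for complaints, and feed everything back to the CRM."""
    results = []
    for lead in crm_find_upsell_candidates(season):
        email = send_email(lead["customer"], lead["product"])
        complaints = check_support_complaints(lead["customer"])
        results.append(crm_log({"lead": lead, "email": email,
                                "complaints": complaints}))
    return results

log = seasonal_upsell_agent("winter")
```

Even in this toy form, the pipeline touches four separate systems – which is precisely why a single monolithic model is unlikely to absorb the whole workflow any time soon.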

 

 

In the interview, Nadella also seems to suggest that certain elements of a software product, such as a spreadsheet, could be created “on the fly” by a Large Language Model (LLM). But it is doubtful whether this would be economical. While it is nice to be able to create a minimum viable product in an AI chatbot, it would not make sense to re-invent the wheel every time someone makes a certain standard query.

 

The Future of Artificial Intelligence Keynote, by Futurist Matthew Griffin

 

The current pace of technological progress is extremely fast, but as these technologies meet legacy technologies, there will be all kinds of unforeseen challenges to unlocking the full potential of AI. Organisations and societies naturally change at a slower pace than start-ups do.

In his essay “Machines of Loving Grace”, Anthropic co-founder Dario Amodei introduces the concept of a “compressed 21st century” because advanced AI is likely to allow us to make discoveries in health and biology in a time span of five to ten years that would require 50 to 100 years of research without AI.

In the same essay, however, Amodei also acknowledges that existing regulations may significantly slow down the impact that new technologies can have. It is therefore not surprising that it may take decades until we see the economic impact of AI – I actually think it’s more one decade than two, because of the emergence of companies built on AI – but in any case, it will be more than two to three years.

 

 

There is also an alternative scenario to the one outlined by Satya Nadella where AI accelerates the release of new software. This scenario was described most eloquently by Scott Belsky in his newsletter “Implications”:

“DIY software will revolutionize apps for consumers and the enterprise. There has been much discussion of AI code reviews, GitHub co-pilot, and no-code application builders for the enterprise, but what are the implications of agent-assisted software development for consumers? Quick apps for your home or family were too hard to build until now. I think we’ll see some pretty remarkable and super niche software applications emerge in 2025, by and for consumers. And in the enterprise, the cost calculation of building your own internal tools and Generative Apps will start to merit AI-made homegrown solutions to workflows and enterprise functions – and increasingly agents will replace these functions, per the last forecast on the list – as opposed to the usual ‘find a SaaS product to solve every need.’”

In this scenario, AI agents trigger a surge of custom-made apps that are built with just one customer in mind.

While it is hard to predict how the market for software will develop, it is easy to see that the change will be significant: If AI agents become the primary point of interaction with users, developers will have to reimagine the architecture of their software and tailor it specifically to AI agents. Expenses, for example, may be filed simply by taking a photo of a bill, without the user ever having to open an app.

 

 

This, however, means that certain parts of a software business will become less relevant. As Azeem Azhar puts it in a recent episode of “Exponential View”:

“I think that you’ve got a classic innovator’s dilemma if you are a traditional SaaS company, because the way in which you build, the way in which your products work, the way in which your teams are organized doesn’t necessarily sit nicely with an AI model,” he says.

On the other hand, larger SaaS firms will double down on their investments in well-designed user interfaces – let’s face it, the chat interface is not the end of UX design for AI. And many SaaS firms have already done a pretty good job in integrating AI and will continue doing so.

But what does this mean for the European Commission’s political agenda? AI agents will play a large role in closing the productivity gap between the US and the EU. At a time where talent is scarce, AI bots will allow companies to build a “limitless workforce”, according to Salesforce CEO Marc Benioff. But in order to do so, they will need to understand how AI agents fit into the existing digital rulebook of the EU.

 

 

In his interview, Nadella already pointed out some of the regulatory issues related to AI agents: Maintaining cybersecurity is naturally a key concern, closely followed by questions around how an AI agent’s access to these various apps should be governed. Should a company be allowed to restrict access for agents that, for example, are not built by them? If they do, could this be considered anti-competitive behaviour?

 

The Future of AI Agentic Cybersecurity Keynote, by Futurist Matthew Griffin

 

On a related note, it remains to be seen whether the emergence of AI agents shifts where value is created in the software market and how this shapes market dynamics in the software industry.

The more complex questions will revolve around how the deployment of AI agents will interact with existing laws such as the GDPR, cybersecurity regulations, or the Product Liability Directive. Without a doubt, the software value chain will become more complex, and it will become more difficult to assign responsibilities to individual economic actors.

Finally, while the Data Act mostly does not cover SaaS, its main idea of making data accessible is highly relevant for agentic AI. The law, however, only comes into force later this year, and it remains to be seen whether it is even fit for purpose for the Internet of Things, for which it was originally created.

 

 

Reducing regulatory complexity will be crucial for the Commission’s goal to increase the uptake of advanced technologies such as AI agents. Next week, the Commission plans to announce its “Competitiveness Compass.” This will be an indication of how serious the new Commission is about streamlining the proliferation of digital rules.

The post Microsoft CEO weighs in on AI Agents ending SaaS appeared first on 311 Institute.
