Decoder and Encoder LLM Models

MLPerf and the rise of latency-aware LLM benchmarking

Here is a sneak peek at the evolution of the MLPerf benchmark and how generative AI forced a radical shift in AI hardware ...

The New York Times

White House Considers Vetting A.I. Models Before They Are Released

The Trump administration, which took a noninterventionist approach to artificial intelligence, is now discussing imposing oversight on A.I. models before they are made publicly available. By Tripp ...

Ars Technica

Google’s TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

Even if you don’t know much about the inner workings of generative AI models, you probably know they need a lot of memory. Hence, it is currently almost impossible to buy a measly stick of RAM without ...

tvbeurope.com

Matrox Video’s encoders/decoders built into Rai’s IP workflow

As part of its move to an IP workflow, Italian broadcaster Rai has signed a three-year framework agreement with Matrox Video to use its Matrox ConvertIP Series of encoders/decoders and converters.

VentureBeat

Nvidia says it can shrink LLM memory 20x without changing model weights

Nvidia researchers have introduced a new technique that dramatically reduces how much memory large language models need to track conversation history — by as much as 20x — without modifying the model ...

VentureBeat

How LinkedIn replaced five feed retrieval systems with one LLM model, at 1.3 billion-user scale

LinkedIn's feed reaches more than 1.3 billion members — and the architecture behind it hadn't kept pace. The system had accumulated five separate retrieval pipelines, each with its own infrastructure ...

TechCrunch

Guide Labs debuts a new kind of interpretable LLM

The challenge of wrangling a deep learning model is often understanding why it does what it does: Whether it’s xAI’s repeated struggle sessions to fine-tune Grok’s odd politics, ChatGPT’s struggles ...

Fast Company

Are LTMs the next LLMs? This new type of AI can do what large-language models can’t

Large-language models (LLMs) have taken the world by storm, but they’re only one type of underlying AI model. An under-the-radar company, Fundamental, is set to bring a new type of enterprise AI model ...

Semiconductor Engineering

Ultra-low-bit LLM Inference Allows AI-PC CPUs And Discrete Client GPUs To Approach High-end GPU-Level (Intel)

A new technical paper titled “Pushing the Envelope of LLM Inference on AI-PC and Intel GPUs” was published by researchers at Intel. “The advent of ultra-low-bit LLM models (1/1.58/2-bit), which match ...
