Apple & NVIDIA Collaborate on 'ReDrafter' for Faster LLM Text Generation

Although Apple prefers its own silicon for AI tasks, the company has collaborated with NVIDIA to bring 'ReDrafter,' a technique that speeds up text generation with large language models (LLMs), to NVIDIA GPUs. The collaboration reflects a shared goal of improving LLM performance, regardless of the complex history between the two tech giants.

'ReDrafter' Technique

'ReDrafter,' which Apple has open-sourced, is a speculative decoding technique that combines beam search and tree attention to speed up text generation. It has since been integrated into NVIDIA's TensorRT-LLM, a library designed to accelerate LLM inference on NVIDIA GPUs. According to Apple, the integration raises generation speed and lowers latency, and may also reduce power consumption.
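To make the draft-and-verify idea behind speculative decoding concrete, here is a minimal Python sketch. It is not Apple's or NVIDIA's implementation: it follows a single draft sequence rather than ReDrafter's beam of candidates, it has no tree attention, and the two "models" (`toy_target`, `toy_draft`) are hypothetical stand-ins invented for illustration. It only shows why accepting several drafted tokens per verification round can cut the number of expensive target-model passes without changing the greedy output.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token context to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target_next: NextToken, prompt: List[int], n: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[int],
    n: int,
    draft_len: int = 4,
) -> Tuple[List[int], int]:
    """Draft-and-verify greedy decoding; returns (tokens, verification rounds)."""
    tokens = list(prompt)
    verify_rounds = 0
    while len(tokens) - len(prompt) < n:
        # 1. The cheap draft model proposes a short continuation.
        draft: List[int] = []
        for _ in range(draft_len):
            draft.append(draft_next(tokens + draft))

        # 2. The target model checks the proposal. A real system would score
        #    all drafted positions in one batched forward pass; this sketch
        #    calls the target per position and counts the round as one pass.
        verify_rounds += 1
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            accepted.append(expected)  # always keep the target's own choice
            if tok != expected:
                break  # first mismatch: discard the rest of the draft
        else:
            # Every drafted token matched, so the pass yields one extra token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + n], verify_rounds


def toy_target(ctx: List[int]) -> int:
    """Hypothetical target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Hypothetical draft model: agrees with the target except after token 2."""
    return 9 if ctx[-1] == 2 else (ctx[-1] + 1) % 10


baseline = greedy_decode(toy_target, [0], 8)
output, rounds = speculative_decode(toy_target, toy_draft, [0], 8)
print(output == baseline, rounds)
```

In this toy run the speculative output matches plain greedy decoding while needing far fewer verification rounds than the eight target calls of the baseline. Real gains depend on how often the draft model agrees with the target and on how cheap the draft model is.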

"This research work demonstrated strong results, but its greater impact comes from being applied in production to accelerate LLM inference... ML developers using NVIDIA GPUs can now easily benefit from ReDrafter’s accelerated token generation for their production LLM applications with TensorRT-LLM." - Apple

Integration with TensorRT-LLM

To integrate ReDrafter, NVIDIA added new operators to TensorRT-LLM and exposed existing ones, broadening the range of complex models and decoding methods the library can handle. As a result, developers on NVIDIA GPUs can use ReDrafter through TensorRT-LLM in their production LLM applications. Benchmarks cited by Apple showed a 2.7x speed-up in generated tokens per second for greedy decoding when using TensorRT-LLM with ReDrafter. That could considerably reduce the latency users experience while consuming less power.
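As a rough way to read the 2.7x figure, the short Python sketch below converts a throughput speed-up into the implied reduction in average time per generated token. The helper name `time_per_token_reduction` is made up for this illustration, and the calculation covers token generation only; it says nothing about prompt processing time or power draw, which the benchmark figure does not break out.

```python
def time_per_token_reduction(speedup: float) -> float:
    """Fraction of per-token generation time saved at a given throughput speed-up."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    # Throughput and time per token are reciprocals, so time scales by 1/speedup.
    return 1.0 - 1.0 / speedup


reduction = time_per_token_reduction(2.7)
print(f"{reduction:.0%}")  # about 63% less time per generated token
```

Put differently, if generating a response previously took 10 seconds of decoding, a 2.7x speed-up would bring that portion down to roughly 3.7 seconds, assuming the speed-up holds for that workload.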

While this collaboration indicates a shared interest, a long-term partnership seems unlikely given the history between Apple and NVIDIA. We may see similar collaborations in the future, but a formal business relationship is not anticipated.

Source: Apple

Technetbook | The Tech Experts
