SANTA CLARA, Calif., March 21, 2023 (GLOBE NEWSWIRE) -- GTC -- NVIDIA today launched four inference platforms optimized for a diverse set of rapidly emerging generative AI applications — helping ...
Red Hat and Nvidia are packaging AIOps into a single “factory” stack by combining Red Hat AI Enterprise with NVIDIA AI Enterprise for end-to-end, production-scale deployments. The focus is scaling ...
Flaws replicated from Meta’s Llama Stack to Nvidia TensorRT-LLM, vLLM, SGLang, and others expose enterprise AI stacks to systemic risk. Cybersecurity researchers have uncovered a chain of critical ...
Nvidia noted that the cost per token fell from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to ...
Using these new TensorRT-LLM optimizations, NVIDIA achieved a 2.4x performance leap with its current H100 AI GPU on the GPT-J test in the offline scenario between MLPerf Inference v3.1 and v4.0.
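For context, here is a minimal sketch of what offline-scenario-style batch generation looks like through TensorRT-LLM's Python LLM API; the EleutherAI/gpt-j-6b checkpoint and the sampling settings are assumptions for illustration, and this is not the actual MLPerf harness.

    # Minimal sketch (assumed setup): batched generation with TensorRT-LLM's LLM API.
    # Model name and sampling values are illustrative, not taken from the article.
    from tensorrt_llm import LLM, SamplingParams

    # The MLPerf "offline" scenario submits the whole query set at once and measures
    # raw throughput, so a simple batched generate() call mirrors its shape.
    prompts = [
        "Summarize the following article: ...",
        "Summarize the following article: ...",
    ]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    llm = LLM(model="EleutherAI/gpt-j-6b")  # builds or loads a TensorRT engine for the model
    outputs = llm.generate(prompts, sampling_params)

    for out in outputs:
        print(out.outputs[0].text)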
Security researchers have lifted the lid on a chain of high-severity vulnerabilities that could lead to remote code execution (RCE) on Nvidia's Triton Inference Server. … Wiz Research said that if the ...
Nvidia has set new MLPerf performance records with its H200 Tensor Core GPU and TensorRT-LLM software. MLPerf Inference is a benchmarking suite that measures inference performance across ...
Apple and NVIDIA shared details of a collaboration to improve LLM performance with a new text generation technique. Cupertino writes: Accelerating LLM inference is an important ML ...