Hands-On LLM Serving and Optimization: Hosting LLMs at Scale
Authors:
Wang, Chi / Hu, Peiheng
Publisher:
O'Reilly Media
ISBN:
9798341621497
Publication date:
30 Apr. 2026
Dimensions:
23.2 x 17.8 cm
Language:
English
Country of origin:
USA
As demand for real-time AI applications grows, this comprehensive guide addresses the complexities of deploying and optimizing LLMs at scale. The authors take a real-world approach backed by practical examples and code, and assemble essential strategies for designing infrastructure that can meet the demands of modern AI applications.