Seoul, South Korea–(BUSINESS WIRE)–At OSDI 2022, FriendliAI unveiled “Orca”, a serving system that greatly improves the efficiency of serving large-scale generative models. FriendliAI is a start-up that provides PeriFlow, a development platform for large-scale AI models.
Orca is a serving system that enables large-scale AI models to be operated efficiently. It removes the unnecessary delays found in existing serving systems by using two basic techniques: “iteration-level scheduling” and “selective batching”.
To understand the idea, imagine a group of friends who would like to ride a four-seater tandem bike along the Hudson River. Some want to cycle for just 10 minutes, while others want a full hour. In existing serving systems, once a ride has started, every rider is forced to keep cycling until the rider with the longest target time is satisfied, and no other friends can join until the whole group has finished and returned to the starting point.
Orca solves this problem with a ‘shuttle’ system in which the bike returns to the starting point every 10 minutes, so riders can get off individually soon after reaching their goals and latecomers can join the ride without waiting long. This shuttle-style execution corresponds to iteration-level scheduling. Orca's second technique, selective batching, groups together riders who initially could not be grouped before cycling begins.
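The shuttle analogy can be sketched as a toy simulation. This is an illustrative sketch only, not FriendliAI's implementation: the `Request` class, the two scheduler functions and the "tick" time unit (one tick standing for one 10-minute lap, or one model iteration) are all assumptions introduced here, and selective batching is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Request:
    name: str
    arrival: int   # tick at which the request (rider) shows up
    length: int    # number of iterations (laps) it needs

def request_level(requests, max_batch=4):
    """Existing style: a batch is fixed until its longest member finishes."""
    pending = sorted(requests, key=lambda r: r.arrival)
    finish, clock = {}, 0
    while pending:
        clock = max(clock, pending[0].arrival)
        batch = [r for r in pending if r.arrival <= clock][:max_batch]
        clock += max(r.length for r in batch)   # everyone waits for the longest
        for r in batch:
            finish[r.name] = clock
        pending = [r for r in pending if r.name not in finish]
    return finish

def iteration_level(requests, max_batch=4):
    """Shuttle style: the batch is re-formed after every single iteration."""
    pending = sorted(requests, key=lambda r: r.arrival)
    running, finish, clock = {}, {}, 0          # running: name -> laps left
    while pending or running:
        # Admit arrived requests into any free seats.
        while pending and pending[0].arrival <= clock and len(running) < max_batch:
            r = pending.pop(0)
            running[r.name] = r.length
        if not running:                          # idle until the next arrival
            clock = pending[0].arrival
            continue
        clock += 1                               # run exactly one iteration
        for name in list(running):
            running[name] -= 1
            if running[name] == 0:               # finished riders leave at once
                finish[name] = clock
                del running[name]
    return finish

# Two short rides, two long rides, and one latecomer (hypothetical workload).
reqs = [Request("A", 0, 1), Request("B", 0, 6), Request("C", 0, 1),
        Request("D", 0, 6), Request("E", 1, 1)]
print("request-level:  ", request_level(reqs))
print("iteration-level:", iteration_level(reqs))
```

In this toy workload, the short rides A and C finish at tick 1 under iteration-level scheduling instead of tick 6, and the latecomer E finishes at tick 2 instead of tick 7, while the long rides B and D still finish at tick 6.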
With Orca, large-scale models can perform their generative tasks tens of times faster than with existing serving systems (measured with GPT-3 175B). Also, the cost of using large-scale models like GPT-3 is one-hundredth lower with Orca. The new serving system removes major obstacles to using large-scale models, making them accessible to a much wider range of users.
The Orca paper, “Orca: A Distributed Serving System for Transformer-Based Generative Models”, was presented on July 12 at OSDI 2022 (16th USENIX Symposium on Operating Systems Design and Implementation), a premier conference in the field of computer systems. Orca is already used in production.
“Not only is data acquisition important for improving model learning, but maximizing the efficiency of the serving system itself allows users to use large-scale generative models like OpenAI’s GPT-3,” said Byung-Gon Chun, CEO of FriendliAI. “I expect this research to increase the possibilities of using large-scale models in a variety of products.”
Download the Orca paper for detailed information.
FriendliAI, which introduced Orca, is a start-up providing the PeriFlow platform, which makes large-scale AI development convenient for everyone. PeriFlow is a cloud-based platform that automates the entire AI development process, from training to inference and service deployment. FriendliAI has open-sourced ‘GPT-FAI 13B’, a large-scale language model with 13 billion parameters that was trained on PeriFlow.
Corresponding author Byung-Gon Chun received his doctorate from the Department of Computer Science at the University of California, Berkeley. He is interested in building services to simplify AI at scale. Chun won the ACM SIGOPS Hall of Fame Award in 2020 and the EuroSys Test of Time Award in 2021. He has also received research awards from Google, Amazon, Facebook and Microsoft.