Kazakhstan Introduces Multilingual GPT Model, Advancing AI in Central Asia

Photo: Nazarbayev University

  • 13 Dec, 21:00

Researchers at the Institute of Intelligent Systems and Artificial Intelligence (ISSAI) at Nazarbayev University in Astana have introduced ISSAI KAZ-LLM, a large language model (LLM) specifically designed for the Kazakh language.

Built on advanced neural network technology, the model serves as the foundation for Kazakhstan's version of a generative AI system similar to GPT, The Caspian Post reports, citing The Times of Central Asia.

ISSAI KAZ-LLM is tailored to Kazakhstan's multilingual and multicultural environment, supporting Kazakh, Russian, and English, with additional functionality for Turkish. The model addresses language barriers and advances generative artificial intelligence for low-resource languages.

The development team processed and synthesized over 150 billion tokens to ensure high-performance language capabilities. Beyond creating an AI tool, the project also fostered local expertise, involving Kazakhstani researchers at every stage, from data preparation to model implementation. Collaboration with international institutes enabled the creation of language-specific datasets and comparative analysis tools, utilizing input from linguists and state-of-the-art machine translation techniques.

KAZ-LLM has a wide range of applications, including Kazakh language translations, content generation, and bulk text processing. Training data was sourced exclusively from publicly available materials, such as Kazakh websites, news articles, and online libraries, supplemented by contributions from various organizations.

ISSAI director Prof. Huseyin Atakan Varol told The Times of Central Asia: "This model reflects Kazakhstan's commitment to innovation, self-reliance, and the growth of its technology ecosystem. Our team developed two versions of ISSAI KAZ-LLM: one with 8 billion parameters and another with 70 billion parameters. Both are built on the Meta Llama architecture, optimized for use on high-performance systems as well as resource-constrained environments. Released under a CC-BY-NC license, the models are available for non-commercial use on the Hugging Face platform."
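To give a sense of what the two model sizes mean in practice, the back-of-the-envelope sketch below estimates the memory needed just to hold the weights, computed as parameter count times bytes per parameter. The 8 billion and 70 billion figures come from the statement above; the numeric precisions (fp32, fp16, int8, int4) are illustrative assumptions, since the article does not say in which formats the models are distributed, and the function names are hypothetical.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Illustrative only: ignores activations, KV cache, and runtime overhead.

# Assumed storage precisions (not stated in the article).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts quoted for the two ISSAI KAZ-LLM versions.
MODEL_SIZES = {"8B": 8e9, "70B": 70e9}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def memory_table() -> dict:
    """Weight-memory estimate for every model size and precision."""
    return {
        name: {p: weight_memory_gb(n, p) for p in BYTES_PER_PARAM}
        for name, n in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in memory_table().items():
        cells = ", ".join(f"{p}: {gb:.0f} GB" for p, gb in row.items())
        print(f"{name} -> {cells}")
```

Under these assumptions, the 8-billion-parameter version needs roughly 16 GB for its weights at 16-bit precision, while the 70-billion-parameter version needs roughly 140 GB, which is consistent with the smaller model being the one suited to resource-constrained environments.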

The ISSAI team is already exploring next-generation AI systems, including language-vision models, while expanding support for additional Turkic and regional languages. These initiatives aim to strengthen regional ties, promote linguistic integration, and create substantial economic and technological opportunities in Kazakhstan and beyond.

The model was developed without government funding, with significant contributions from Kazakhstani IT companies.

Kazakhstan is also preparing to launch the International Center for Artificial Intelligence, alem.ai. The center is intended to serve as a hub for transforming the country into an AI-driven economy by fostering innovation, attracting investment, and supporting startups.

By 2029, the export of Kazakhstani AI solutions is expected to reach $5 billion.
