What is the name of your organization?
AI for Vietnam Foundation
What is the name of your solution?
Project ViGen
Provide a one-line summary or tagline for your solution.
Project ViGen enables transformative applications of AI that drive economic growth and improve people’s lives.
In what city, town, or region is your solution team headquartered?
211 Hudson Bay St, Foster City, CA 94404, USA
In what country is your solution team headquartered?
USA
What type of organization is your solution team?
Nonprofit
Film your elevator pitch.
What specific problem are you solving?
Vietnamese is significantly underrepresented in generative AI systems because comprehensive, high-quality datasets are lacking, which creates barriers for over 100 million native speakers worldwide. Although Vietnamese is the 16th most spoken language, Vietnamese datasets make up less than 0.1% of global AI training data, while English represents over 60%. This gap affects sectors such as education, with 23 million students, and business, with over 800,000 enterprises. It also holds back Vietnam’s digital economy, which is projected to reach $49 billion by 2025. Contributing factors include fragmented data collection, limited community collaboration, the absence of standardized AI benchmarks, and low-quality data. These limitations hinder AI innovation and economic growth in Vietnam, and they reflect a global challenge shared by many underrepresented languages.
What is your solution?
Project ViGen addresses the Vietnamese linguistic gap by creating high-quality datasets for training and evaluating generative AI (GenAI) models. It establishes a centralized open-data portal to crowdsource textual and audio content, encouraging contributions from individuals, businesses, academia, and government. This approach ensures datasets reflect Vietnam’s cultural and linguistic diversity, improving GenAI’s real-world performance.
ViGen also employs a standardized data-processing pipeline with quality control, deduplication, and regular updates to maintain dataset integrity and relevance. It develops culturally relevant evaluation benchmarks to measure AI model performance across tasks like language understanding, reasoning, conversation, and coding, guiding continuous improvement.
By promoting open-source licensing, community events, and user-friendly tools, ViGen fosters collaboration and innovation. This ecosystem empowers Vietnamese communities to shape their AI future, enhancing GenAI capabilities and unlocking economic opportunities and quality-of-life improvements for millions of Vietnamese speakers.
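The quality-control and deduplication steps of the data-processing pipeline described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ViGen's actual pipeline: the function names `normalize` and `clean_corpus` and the `min_chars` threshold are hypothetical. Unicode NFC normalization is included because the same Vietnamese word can be encoded with precomposed or combining diacritics, and exact-match deduplication would otherwise miss such duplicates.

```python
import hashlib
import unicodedata


def normalize(text: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def clean_corpus(records, min_chars=20):
    """Return normalized records, dropping short entries and exact duplicates.

    min_chars is an assumed, illustrative quality threshold.
    """
    seen = set()
    kept = []
    for raw in records:
        text = normalize(raw)
        # Quality filter: skip entries too short to be useful.
        if len(text) < min_chars:
            continue
        # Case-insensitive exact-duplicate detection via a content hash.
        key = hashlib.sha256(text.casefold().encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept
```

Exact hashing only catches identical text after normalization. Detecting near-duplicates (for example, lightly edited copies of the same article) would need an additional technique such as MinHash, which is outside this sketch.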
Who does your solution serve, and in what ways will the solution impact their lives?
Project ViGen directly benefits over 100 million Vietnamese speakers globally, including approximately 23 million students, 800,000 enterprises, educators, healthcare providers, and everyday digital users who are currently underserved by limited Vietnamese-language support in AI systems. Without high-quality datasets, foundation model providers (e.g., Meta, Google, OpenAI) face significant cost and complexity in integrating natural, culturally relevant Vietnamese support into global AI platforms. Researchers and developers encounter substantial barriers (fragmented data, insufficient benchmarks, and high training costs) that slow local AI innovation and limit inclusive AI solutions. Vietnamese businesses and individual users receive ineffective, linguistically inaccurate AI-driven services, which hurts customer experience, competitiveness, and digital adoption.
ViGen addresses these issues by building comprehensive, high-quality Vietnamese datasets and standardized evaluation benchmarks. These open-source resources reduce complexity and cost for foundation model providers, making it easier to integrate Vietnamese into global AI models. Researchers and developers benefit from ready-to-use datasets, tools, and benchmarks, which accelerate inclusive AI innovation and reduce development overhead. Businesses and individuals gain tailored, culturally sensitive AI applications that improve customer interactions, user experiences, and overall economic competitiveness, laying the groundwork for widespread, meaningful GenAI adoption across Vietnam.