The National University of Sciences and Technology (NUST), the National Information Technology Board (NITB), and telecom network operator Jazz have signed a Memorandum of Understanding (MOU) to develop Pakistan’s first indigenous Large Language Model (LLM), with a focus on Urdu and with datasets for Pashto and Punjabi. The project aims to give individuals, businesses, and organizations advanced AI tools in their native languages. The envisioned LLM is expected to drive innovation in Generative AI applications, boosting productivity and accessibility in critical sectors like healthcare, education, and agriculture.
[Figure: GPT-4 Accuracy Scores. Source: The Economist]
Generative AI tools such as ChatGPT are powered by large language models, or LLMs. To be useful in a given language, these models need to be trained on vast amounts of data in that language. Unfortunately, Urdu accounts for less than 0.1% of the content on the Internet. This presents a challenge for the developers of Urdu LLMs.
[Figure: Online Content of Various Languages. Source: W3Techs]
The lack of Urdu content available for training ChatGPT reduces the accuracy of its results for Urdu users. For example, GPT-4's accuracy in question-answer tests is just over 70% in Urdu, compared with 85% in English, according to data from OpenAI. Other South Asian languages, including Hindi, Bengali, Punjabi, Marathi and Telugu, suffer from the same problem.
It's not just a South Asian problem. The same challenges exist across the developing world, where non-European languages are generally poorly represented online. This is a major obstacle for non-European nations in developing their own generative artificial intelligence (AI) models, which rely on vast amounts of training data. Generative AI can also produce biased results due to a number of factors, including the data it's trained on, the algorithms used, and how it's deployed.
Until local-language models are available, the use of AI in developing nations such as Pakistan will remain limited to the small number of people proficient in English. Broadening the adoption of AI applications will require LLMs trained on local language content. Without them, Pakistan could miss the opportunity to take full advantage of the AI revolution.
Riaz Haq
TCF set to bring AI-powered learning to teachers with Khanmigo
https://www.thenews.com.pk/print/1296015-tcf-set-to-bring-ai-powere...
The Citizens Foundation (TCF) and Khan Academy Pakistan have announced an innovative AI-powered collaboration to support teachers and enhance classroom learning in selected TCF schools.
This pilot initiative aims to empower teachers by enhancing their lesson delivery, fostering critical thinking, and improving classroom engagement for students in Grades 6-8. Under this collaboration, Khanmigo will be integrated into selected TCF schools to enhance mathematics and science instruction.
Unlike traditional AI, Khanmigo acts as an interactive teaching assistant, helping educators enhance their knowledge, craft lesson hooks, develop quizzes, and foster deeper student engagement.
The pilot programme will equip teachers with AI-driven teacher tools, provide structured prompts to guide teachers to develop learning material relevant to their students, and offer bilingual support in English and Urdu.
Additionally, Khan Academy Pakistan will train school leaders on effective AI integration, offering guidance on best practices for using Khanmigo in classrooms. This initiative will empower TCF teachers to refine their teaching methods, personalise learning experiences, and drive meaningful classroom discussions, making AI-driven learning more accessible, structured, and engaging for students. “At TCF, we want to ensure that technology serves as a bridge to better learning opportunities rather than a barrier,” shared Syed Asaad Ayub Ahmad, the president and CEO of TCF.
“We are hopeful that Khanmigo will be useful in serving as a thinking partner for TCF teachers in the classroom and a transformative step towards making high-quality education accessible and engaging.”
One of Khanmigo’s most promising features is its bilingual support, allowing teachers to instruct in both English and Urdu. This ensures that educators from diverse backgrounds can fully engage with the content. As the programme progresses, regional language support will be explored, further broadening its accessibility.
“Khanmigo aims to give every child in Pakistan access to world-class education,” said Zeeshan Hasan, CEO of Khan Academy Pakistan. “By empowering teachers, we are ensuring that AI becomes a tool for empowerment rather than a shortcut. This partnership with TCF is a step forward towards transforming how education is delivered in classrooms.”
“TCF strongly believes in the power of good teachers, and there is an undeniable social aspect of learning from a teacher. We are hopeful that Khanmigo will augment teacher skills to make classroom experience fun, engaging, and meaningful for the students,” shared Shazia Kamal, executive vice president, Outcomes at TCF.
With Pakistan facing a critical education crisis and a shortage of trained teachers, AI-powered solutions like Khanmigo offer a scalable and cost-effective way to enhance teaching quality.
While this initiative is currently in its pilot phase, TCF and Khan Academy Pakistan envision expanding the programme to more schools.
As AI continues to reshape global education, this partnership reaffirms TCF’s commitment to equipping teachers with the best tools to inspire and educate the next generation of Pakistan’s changemakers.
TCF is a non-profit organisation set up in 1995 by a group of citizens who wanted to bring about positive social change through education.
The 30-year-old organisation is among Pakistan’s leading organisations in the field of education, educating 301,000 students across 2,033 school units in the country.
Mar 29, 2025
Riaz Haq
Pakistani Developer Builds First AI Voice Tool for Sindhi Users
https://propakistani.pk/2025/04/07/pakistani-developer-builds-first...
A young Pakistani developer has created the first AI tools for the Sindhi language. These tools enable text-to-speech (TTS) and speech-to-text (STT) in Sindhi for the first time.
The 23-year-old software developer from Hyderabad, Fahad Maqsood Qazi, began work last year on an AI-based dubbing system for his company, Flis Technologies. During development, he realized there were no basic text-to-speech (TTS) or speech-to-text (STT) tools for Sindhi—a language spoken by nearly 40 million people worldwide.
Starting from Scratch
In August 2023, Qazi began gathering and transcribing hours of Sindhi audio from various sources, including YouTube videos, audiobooks, and news reports, to build a training dataset. Around the same time, he came across Mozilla’s Common Voice project, where Google employee Asad Memon had added Sindhi support.
Qazi merged that data with his own and began training AI models. By January 2024, he had built initial working versions of Sindhi TTS and STT systems. He also developed a tokenizer, a necessary tool for processing language in machine learning models, since one was not previously available for Sindhi.
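To illustrate what a tokenizer does, here is a minimal sketch in Python. It is not Qazi's tokenizer, whose design the article does not describe; the function names, the `<unk>` placeholder, and the sample phrase are all assumptions made for illustration. It splits text into words and punctuation marks, then maps each token to an integer ID that a model could consume.

```python
import re

# Illustrative sketch only: all names here are hypothetical and this is
# not the tokenizer described in the article.
# \w matches Unicode letters and digits in Python 3, including the
# Arabic-script letters used to write Sindhi; [^\w\s] catches punctuation.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation marks."""
    return TOKEN_PATTERN.findall(text)


def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign an integer ID to every distinct token, reserving 0 for unknowns."""
    vocab = {"<unk>": 0}
    for sentence in corpus:
        for token in tokenize(sentence):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Convert text to token IDs, mapping unseen tokens to <unk>."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]


# Sample phrase ("Sindhi language"), chosen only for illustration.
corpus = ["سنڌي ٻولي"]
vocab = build_vocab(corpus)
ids = encode("سنڌي ٻولي", vocab)
```

A word-level scheme like this is the simplest case. It would split off combining diacritics, and production systems generally rely on subword methods instead, so this should be read as a picture of the idea and not as a usable Sindhi tokenizer.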
Supporting Language Access
Sindhi is not formally taught in many countries where Sindhi-speaking communities live, which can result in younger generations being less familiar with the language. Qazi hopes his tools will make it easier for people to read, write, and speak Sindhi through digital platforms.
Qazi told Arab News:
“My goal is to help them stay connected to it through speech and text tools. In many diaspora communities, younger Sindhis grow up without learning to read or write in their language.”
In March, he uploaded his models to Hugging Face, a platform often described as the GitHub for AI models, giving developers and researchers access to his work.
Everyday Use and Accessibility
Qazi’s models could help Sindhi speakers send messages using speech input or listen to written text read aloud in Sindhi. These tools may also assist older adults and people with limited formal education in using the language in everyday communication.
Qazi said:
“A person who can’t read Sindhi could use the TTS model to hear written stories. Or someone who never learned to write could still search for information and get answers by speaking.”
Long-Term Potential
Qazi believes that the addition of Sindhi to tools like TTS and STT is necessary for the language to remain relevant in digital communication and technology.
“Without access to tools like these, Sindhi could be excluded from digital spaces,” he said. “Now it can be part of systems like voice interfaces, educational resources, and translation tools.”
By addressing a basic gap in language technology, Qazi’s work gives others a foundation to build further tools for Sindhi users, ensuring better access and usability in an increasingly digital world.
Apr 10, 2025
Riaz Haq
Bottom layer: Energy
Second layer: AI Chips
Third layer: Infrastructure (data centers, cloud services)
Fourth layer: AI Models
Top layer: Applications
Five layers of artificial intelligence (#AI): #Energy (#electricity), #Semiconductor #Chips, #DataCenters/ #CloudServices, AI #LLM Models and #Applications.
https://x.com/haqsmusings/status/2015281283077923050?s=61&t=mgT...
yesterday