Hyderabad: The Language Technologies Research Centre (LTRC) at the International Institute of Information Technology Hyderabad (IIITH) celebrated its 25th anniversary with a two-day event over the weekend that included several talks and panel discussions on the vision, mission, evolution and future of language technologies. The event highlighted the importance of academic research connected with real-life problems, the need for collaboration in technology development, and the role of language technology across regional languages for social good.
To mark the occasion, LTRC launched the free-to-download BhashaVerse language resources and machine translation models. The multitask encoder-decoder model can translate from any of the 36 languages into any other, including Tulu, Bodo, Bhojpuri, Magahi and Santhali. It also helps with machine translation evaluation, error identification and automatic post-editing of 10 billion sentence pairs.
The Centre also announced the development of the BhashaVerse LLM decoder model for 36 Indian languages, which can be used for summarisation, Q&A and similar tasks with some fine-tuning. In addition to these models, LTRC released the synthetically generated and curated 10 billion Bhashik datasets for Indian language to Indian language pairs: a generic dataset, one in the Education domain that works across 17 different fields in English and 5 Indian languages, and one for the Health domain in English and 8 Indian languages. For the first time in Indian languages, a dataset for automatic post-editing of text and machine translation evaluation has also been made available.
LTRC was established in October 1999 in response to the preeminent scientific challenge of enabling machines to read, understand and derive meaning from human languages, especially in the Indian context and Indian languages. It was perhaps the first research centre in the country focused on this theme. Today LTRC is the largest academic centre of speech and language technology in South Asia.
Initially focused on areas like Machine Translation, Semantic Parsing, and Information Extraction, the centre has expanded its research portfolio to include Speech Recognition, Text Generation, Sentiment Analysis, Dialogue Systems, and more. Over the years, LTRC has built a thriving ecosystem of researchers, students, and collaborators who have carried forward the centre’s pioneering work in various directions, many of which had their origins at LTRC itself.
The goodwill that the Centre has garnered over the years was evident in the active participation of several notable figures in the field of language technologies from academia and industry, as well as faculty, students, alumni and collaborators at the event.
Commenting on LTRC and its impact, Prof Vasudeva Varma, Head, LTRC, said, “As the first natural language processing centre in the country, we have pioneered several aspects of research and education. We have trained brilliant minds who are leading advances worldwide. Our lasting contributions to research, open datasets, tools and technologies have made a huge impact. Our successful technology transfers have brought industry and academia closer. We look forward to continuing to push the boundaries and our legacy of innovation.”