Institute of Artificial Intelligence Problems


Department of Visual and Speech Recognition

The Department was created in 2014 through the merger of the Department of Fundamental Problems of Speech Recognition and the Department of Visual Image Recognition.

The Department of Visual and Verbal Images Recognition is one of the oldest and most powerful structural divisions of the Institute of Artificial Intelligence Problems. It was established in October 1993.

The main task of the Department is basic and applied research in speech and visual recognition. The Department has developed systems that recognize separately pronounced words from a predefined dictionary, treating each word as a whole. Such systems require prior training, in which a voice template is created for each word. On this basis a number of applications have been built, including voice control of a mobile robot, a program for entering mathematical formulas by voice, and many others.
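Whole-word recognition against one stored reference per dictionary word can be illustrated with a minimal sketch. The page does not say which matching method the Department uses; dynamic time warping (DTW) is assumed here only because it is a common way to compare an utterance with a per-word reference spoken at a different speed. The function names, the toy one-dimensional features, and the two-word dictionary are all hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the dictionary word whose stored reference is closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical references, one per dictionary word. Real systems would store
# a sequence of spectral feature vectors per word, not toy 1-D values.
templates = {
    "forward": [(0.0,), (1.0,), (2.0,), (1.0,)],
    "stop": [(3.0,), (3.0,), (0.0,), (0.0,)],
}

# The same word spoken more slowly: one extra frame, slightly noisy values.
utterance = [(0.1,), (0.9,), (1.1,), (2.1,), (1.0,)]
print(recognize(utterance, templates))  # prints "forward"
```

Because every dictionary word needs its own recorded reference, adding a word means recording a new reference rather than changing the matching code, which is consistent with the per-word preparation step described above.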

 

The scientists of the Department work with advanced open-source software packages in the following areas:

- Development of fundamental principles of visual image recognition;

- Development of methods and technologies for identifying objects under varying conditions;

- Creation of machine vision systems for autonomous mobile robots;

- Development of intelligent special-purpose video surveillance systems;

- Continuous (fused) speech recognition systems;

- Creation of speech recognition methods and algorithms;

- Development of voice control systems for robotic complexes;

- Development of speaker-independent recognition systems;

- Creation of the theoretical basis for identification in speaker recognition systems;

- Development of text-independent speaker verification systems.

The Department's high-tech products are presented at regional and international scientific events, including CeBIT (Hanover, Germany), the largest annual computer expo.

 

High-tech products of the Department:

 

Lip-reading-based speech training system

A software system for teaching correct articulation in spoken Ukrainian, based on lip reading, has been developed. Its purpose is to make the visual perception of Ukrainian speech easier for people with hearing impairments. Four technologies were developed within the system: face recognition, localization of the lip area in the image, recognition of the lip configuration, and translation of the sequence of recognized lip configurations into Ukrainian words. The system imitates human visual perception of speech, which lets it check whether the user pronounces individual sounds and words correctly. The software system is a unique scientific development with no analogues in the world, and its technologies meet international standards. The results advance speech recognition technology by automating lip reading. They can also be used to give existing acoustic speech recognition systems an additional information channel that improves their performance in noise and in the presence of extraneous sounds.
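The last stage of this pipeline, turning a sequence of recognized lip configurations into a word, can be sketched as follows. This is only an illustration under stated assumptions: the page does not describe the actual decoding method, so the sketch collapses repeated per-frame labels and picks the lexicon entry with the smallest edit distance. The configuration labels (`closed`, `open`, `narrow`, `round`), the transliterated words, and the lexicon itself are hypothetical.

```python
def collapse_runs(frames):
    """Merge consecutive identical per-frame labels into a single symbol."""
    out = []
    for label in frames:
        if not out or out[-1] != label:
            out.append(label)
    return out


def edit_distance(a, b):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


def decode_word(frames, lexicon):
    """Pick the lexicon word whose lip-configuration sequence is closest."""
    seq = collapse_runs(frames)
    return min(lexicon, key=lambda word: edit_distance(seq, lexicon[word]))


# Hypothetical lexicon: each word is described by its expected sequence
# of lip configurations.
LEXICON = {
    "mama": ["closed", "open", "closed", "open"],
    "tato": ["narrow", "open", "narrow", "round"],
}

# Hypothetical per-frame output of the lip-configuration recognizer.
frames = ["closed", "closed", "open", "open", "open", "closed", "open", "open"]
print(decode_word(frames, LEXICON))  # prints "mama"
```

Edit distance tolerates an occasional misrecognized frame, which matters because several sounds share nearly the same lip shape.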

 

Experimental system for detection and recognition of vehicle license plates

An experimental set of algorithms for automating the monitoring of vehicle traffic has been developed. The complex includes methods for locating the license plate in the image, normalizing the plate image by angle, matching a model of symbol positions against the image to determine the plate type, the scale, and the position of the symbols, and recognizing the symbols on the plate.

Two license plate models have been implemented. The proposed license plate model can be easily extended to recognize other types of license plates. The results are on par with the state of the art in this field and can be used to automate the monitoring and management of road traffic.
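Matching a symbol-position model against the image can be sketched as a small fitting problem. The sketch assumes that each plate type is described by the relative horizontal positions of its symbols and that candidate symbol centers have already been found in the angle-normalized image. A least-squares fit then gives the scale and offset for each model, and the model with the smallest residual determines the plate type. The page does not describe the actual model, so the model names, slot values, and pixel coordinates below are hypothetical.

```python
def fit_layout(slots, centers):
    """Least-squares fit of centers ~ scale * slots + offset.

    Returns (scale, offset, rms_error).
    """
    n = len(slots)
    mean_u = sum(slots) / n
    mean_x = sum(centers) / n
    var = sum((u - mean_u) ** 2 for u in slots)
    cov = sum((u - mean_u) * (x - mean_x) for u, x in zip(slots, centers))
    scale = cov / var
    offset = mean_x - scale * mean_u
    sq_err = sum((scale * u + offset - x) ** 2 for u, x in zip(slots, centers))
    return scale, offset, (sq_err / n) ** 0.5


def classify_plate(centers, models):
    """Return (model_name, scale, offset, rms_error) of the best-fitting model."""
    best = None
    for name, slots in models.items():
        if len(slots) != len(centers):
            continue  # this plate type has a different number of symbols
        scale, offset, err = fit_layout(slots, centers)
        if best is None or err < best[3]:
            best = (name, scale, offset, err)
    return best


# Hypothetical plate models: relative x-positions of symbol centers (0..1).
# Supporting another plate type means adding one more entry here.
MODELS = {
    "grouped_2_4_2": [0.10, 0.20, 0.36, 0.46, 0.56, 0.66, 0.82, 0.92],
    "evenly_spaced": [0.09, 0.21, 0.33, 0.45, 0.57, 0.69, 0.81, 0.93],
}

# Hypothetical symbol centers (pixels) found in a normalized plate image.
centers = [70, 90, 122, 142, 162, 182, 214, 234]

name, scale, offset, err = classify_plate(centers, MODELS)
print(name)  # prints "grouped_2_4_2"
```

Keeping the models as plain data is one way to get the easy extension to other plate types that the text mentions: the fitting code stays unchanged when a new layout is added.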


 

 

 

Methods for recognizing continuous (fused) spoken phrases within the concept of phonemic speech recognition with generalized transcription

 

 

Automated workstation for the phonoscopy expert

 

 

Intelligent motion identification video technology