Audio recognition and AI are two fields that increasingly intersect, and their combination is set to transform many sectors, especially investigative and forensic work.
The patent covering our audio technology, for a “method of processing an audio stream for the recognition of voices and/or background sounds and related system”, has given rise to further research and development in the “Polyphonic” platform.
The synergistic combination of the capabilities of speech recognition and the predictive abilities of AI has been at the center of Pragma Etimos’ studies for years and represents an important part of its core business.
The aim of this article is to illustrate the new “Polyphonic Verse” tool, a feature designed to enrich the Speaker Recognition platform. It is a significant evolution that adds advanced features and multilingual capabilities, improving performance and supporting the work of experts.
Let’s see all the details.
What is Polyphonic Verse?
Polyphonic Verse is a major innovation in speech recognition and spoken language understanding.
The model was trained on a large and varied dataset comprising hundreds of thousands of hours of audio. The data include transcriptions of audio in multiple languages, including dialect forms.
What sets Polyphonic Verse apart from other solutions on the market is its robustness and versatility. Thanks to its large and diverse training data, it is highly robust to accents and varied pronunciations, background noise, and technical language.
In addition to speech recognition and translation, Polyphonic Verse presents itself as a complete system for understanding and analysing spoken language. Integrated with the “Polyphonic” platform, it provides a detailed analysis of the speaker, identifying not only the language but also the reference isogloss. This makes it possible to narrow speakers down to restricted geographical areas and to identify their linguistic variants and slang traits, while offering an accurate transcription of the phonetic evidence both within a single speaker and across speakers. Operators can therefore analyse and understand audio content with greater efficiency and accuracy.
In summary, it is not just a speech recognition tool, but an integrated solution for understanding and analysing spoken language, aimed at improving the accuracy of work operations through a wide range of applications.
Dialect Recognition
Dialect recognition represents one of the most fascinating and complex challenges in the field of computational linguistics.
With the advent of advanced technologies such as speech recognition and Artificial Intelligence, identifying and understanding local dialects is becoming increasingly achievable. Polyphonic Verse was created to meet this challenge and to produce real-time transcriptions of complex dialect forms.
A crucial aspect of dialect recognition is, as mentioned above, the identification and use of isoglosses: imaginary lines drawn on a map that separate linguistic areas on the basis of specific phonetic, morphological or lexical characteristics. Isoglosses help to identify the boundaries between dialects and to understand how language variants are distributed geographically.
For example, in Italy, isoglosses can distinguish Neapolitan from Sicilian, or Lombard from Venetian. These lines are not rigid and often intertwine, reflecting the evolution of the various language varieties and their influence on one another. Precise isogloss mapping is key to building accurate and reliable dialect recognition models, and this is what the Pragma Etimos team focused on in developing Polyphonic Verse.
The tool is constantly evolving, and Deep Learning algorithms will progressively refine its ability to recognize and translate complex dialect forms and other languages.
The main features of Polyphonic Verse
Here are the main features of Polyphonic Verse:
- Language Identification and Speaker Profiling: one of the features of Polyphonic Verse in our “Polyphonic” platform is advanced language identification. Combined with tools for predicting the age and gender of the speaker, it offers a complete and detailed description of the speaker, recognizing not only the spoken language but also signs of alteration in the speaker’s usual vocal stress pattern. Currently, our platform supports the recognition of more than 30 languages. In addition, we are collaborating with a Calabrian academic to develop specific and detailed datasets for identifying the dialects of Calabria, an approach that can be extended to any other language or dialect given appropriate training data.
- Linguistic variant research: Pragma Etimos and its experts have started in-depth sociolinguistic research aimed at identifying and isolating linguistic variants within the same isogloss, so that new training samples can be collected for every possible lexeme or vernacular filler expression. This research is also aimed at broadening the spectrum of sociolinguistic acquisition by isolating paralinguistic aspects (such as prosodic features), as well as those frequently overlooked traits of communication hidden in the extra-communicative dynamics of the so-called “silent language”.
- Advanced transcription: this feature allows for accurate and fast transcription of speech, making it easier for operators to analyse audio.
- Speech translation: Polyphonic Verse’s multilingual translation tool translates audio from other languages into Italian or English, further expanding the ability to analyse and understand audio content. Pragma Etimos plans to refine and enhance this service by integrating the understanding of the main dialects, offering users even more complete and global access to information.
- Search for fragments of interest: the tool’s work grid supports a wide range of search queries. Operators can query the system to locate specific fragments within an audio exhibit, such as occurrences of the word “drug”, or to select the parts of the exhibit where a given topic is discussed, including matches by analogy and semantic similarity.
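As a rough illustration of the fragment search described in the list above, the following Python sketch looks up terms in a timestamped transcript. It is not the Polyphonic Verse implementation: the `Segment` structure, the `find_fragments` helper, the sample sentences and the hand-written `RELATED` map (a crude stand-in for semantic matching) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span of audio (hypothetical structure)."""
    start: float  # seconds from the beginning of the recording
    end: float
    text: str


def find_fragments(segments, terms):
    """Return the segments that contain at least one of the given terms."""
    wanted = {t.lower() for t in terms}
    hits = []
    for seg in segments:
        # Normalise each word: drop surrounding punctuation, ignore case.
        words = {w.strip(".,;:!?\"'").lower() for w in seg.text.split()}
        if words & wanted:
            hits.append(seg)
    return hits


# Invented sample transcript, for illustration only.
segments = [
    Segment(0.0, 4.2, "We meet at the station tonight."),
    Segment(4.2, 9.8, "Bring the drug to the usual place."),
    Segment(9.8, 13.1, "Nobody must see the package."),
]

# Hypothetical map of related terms; a real system would more likely
# rely on a learned semantic model than on a hand-written list.
RELATED = {"drug": ["drug", "package"]}

hits = find_fragments(segments, RELATED["drug"])
for seg in hits:
    print(f"{seg.start:.1f}s to {seg.end:.1f}s: {seg.text}")
```

Returning whole segments with their timestamps, rather than bare matches, lets an operator jump straight to the relevant part of the recording.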
At the heart of all these features is the Denoiser tool, which can “clean” audio files of background noise. This Speech Enhancement tool performs signal cleaning operations measured against the SNR (Signal-to-Noise Ratio), offering interested parties a documented technical and scientific record of the filtering operations for evaluation in judicial proceedings.
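One common way to document the effect of a cleaning step is to compare the SNR, in decibels, before and after filtering. The sketch below uses the standard definition, SNR = 10 · log10(signal power / noise power), and assumes a clean reference signal is available, which is rarely the case with real exhibits. The function names and the synthetic signals are illustrative assumptions and do not reflect how the Denoiser works internally.

```python
import math


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(reference, observed):
    """SNR in dB, treating (observed - reference) as the noise."""
    residual = [o - r for r, o in zip(reference, observed)]
    return 10.0 * math.log10(mean_power(reference) / mean_power(residual))


# Synthetic example: a sine wave plus an alternating disturbance.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
noise = [0.1 if n % 2 == 0 else -0.1 for n in range(100)]
noisy = [r + e for r, e in zip(reference, noise)]

# Pretend a denoiser reduced the noise amplitude by a factor of ten.
denoised = [r + 0.1 * e for r, e in zip(reference, noise)]

snr_before = snr_db(reference, noisy)
snr_after = snr_db(reference, denoised)
print(f"SNR before: {snr_before:.2f} dB")
print(f"SNR after:  {snr_after:.2f} dB")
print(f"Improvement: {snr_after - snr_before:.2f} dB")
```

Reducing the noise amplitude by a factor of ten cuts the noise power by a factor of one hundred, so the reported improvement in this toy case is about 20 dB.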
The new challenge of Pragma Etimos
The integration of Polyphonic Verse into the “Polyphonic” platform accelerates the process of speech analysis and understanding, allowing operators to perform their tasks more efficiently and quickly.
Automatic transcription and multilingual translation simplify day-to-day operations, reducing work time and optimizing available resources.
The release of the new tool is just the beginning of a path of continuous improvement.
Pragma Etimos is committed to continuing to refine and enhance the capabilities of Polyphonic Verse, ensuring a solution that is always at the forefront of speech recognition and spoken language understanding.