10+ Views

Semantic Similarity

Recently, solutions for automatically measuring semantic similarity have gained much popularity in both academia and industry. The reason is that the possibility of automatically determining the degree of similarity between textual expressions has a great impact on a wide range of fields of knowledge. From question answering systems to automatic translation solutions, from data integration systems to query expansion methods to algorithms for automatic classification of text documents.

Currently, the solutions that yield the best results are those based on deep learning. Most of these solutions try to vectorize each word according to its occurrence in a text corpus, so that each word is transformed into a characteristic vector that can serve as input for a neural network. The solutions that have appeared and that make use of this approach give very good results. The problem is that it is almost impossible for a human operator to understand the models that have been created during the training phase, so in the end they are black box models. To overcome these limitations, we have been working on the so-called semantic similarity controllers, which are devices that can also achieve very good accuracy, but without giving up the interpretability of the learned model.