Scientists Present New Solution to Imbalanced Learning Problem

Specialists at the HSE Faculty of Computer Science and Sber AI Lab have developed a geometric oversampling technique known as Simplicial SMOTE. Tests on various datasets have shown that it significantly improves classification performance. This technique is particularly valuable in scenarios where rare cases are crucial, such as fraud detection or the diagnosis of rare diseases. The study's results are available on arXiv.org, an open-access archive, and will be presented at the International Conference on Knowledge Discovery and Data Mining (KDD) in summer 2025 in Toronto, Canada.

The problem of imbalanced learning is becoming increasingly relevant across various fields, including banking and medicine. Conventional methods, such as random oversampling, often generate low-quality samples or fail to accurately model rare class data.

Simplicial SMOTE (Synthetic Minority Oversampling Technique), a novel solution proposed by scientists from HSE University and Sber AI Lab, addresses these issues by enabling more accurate modelling of complex topological data structures and improving classifier performance on imbalanced datasets.

It generates new examples of a rare class by leveraging information from several close instances at once (a 'simplex'), rather than from just two close points, as in the original SMOTE and its well-known modifications. This captures the structure of the data more faithfully and improves performance. The technique improves training on imbalanced data, where one class (e.g. normal transactions) has many examples, while another class (e.g. fraud) has few.
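The difference between the two sampling schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of a point plus its two nearest neighbours as the simplex, and the flat Dirichlet weights are all assumptions made for illustration only.

```python
import numpy as np

def smote_pair_sample(x_a, x_b, rng):
    """Classic SMOTE-style sample: a random point on the segment between two close points."""
    lam = rng.uniform(0.0, 1.0)
    return x_a + lam * (x_b - x_a)

def simplex_sample(vertices, rng):
    """Simplex-style sample: a random convex combination of several close points.

    Assumption: weights come from a flat Dirichlet distribution, so they are
    non-negative and sum to 1, which keeps the new point inside the simplex.
    """
    vertices = np.asarray(vertices, dtype=float)
    weights = rng.dirichlet(np.ones(len(vertices)))
    return weights @ vertices

def k_nearest(points, index, k):
    """Indices of the k points closest to points[index], excluding the point itself."""
    dists = np.linalg.norm(points - points[index], axis=1)
    order = np.argsort(dists)
    return order[order != index][:k]

rng = np.random.default_rng(0)
minority = rng.normal(size=(20, 2))  # hypothetical rare-class points in 2D

anchor = 0
neighbours = k_nearest(minority, anchor, k=2)

# Two-point interpolation: the new example lies on an edge.
pair_point = smote_pair_sample(minority[anchor], minority[neighbours[0]], rng)

# Simplex interpolation: the new example lies inside a triangle of close points.
simplex_vertices = minority[[anchor, *neighbours]]
simplex_point = simplex_sample(simplex_vertices, rng)
```

In this sketch the pairwise sample can only land on a line segment, whereas the simplex sample can land anywhere inside the triangle spanned by three close points, which is the intuition behind using more than two neighbours.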

Researchers have shown experimentally, on a large number of test datasets, that the proposed approach achieves significantly better performance metrics, such as the F1 score and the Matthews correlation coefficient, than both the basic SMOTE and its modifications. In particular, an improvement was observed for gradient boosting, a classifier commonly used in practice.
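For readers unfamiliar with these metrics, the sketch below computes both from the four cells of a binary confusion matrix using their standard definitions. The counts are hypothetical and are not taken from the study; returning 0.0 when a denominator is zero is a common convention assumed here.

```python
import math

def f1_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall for the rare (positive) class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, which uses all four confusion-matrix cells."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical test set: 1,000 transactions, 10 of them fraudulent.
# A classifier that never flags fraud is 99% accurate but useless for the rare class.
never_flags = dict(tp=0, tn=990, fp=0, fn=10)
accuracy = (never_flags["tp"] + never_flags["tn"]) / 1000
f1_trivial = f1_score(never_flags["tp"], never_flags["fp"], never_flags["fn"])
mcc_trivial = matthews_corrcoef(**never_flags)

# A classifier that catches 8 of the 10 fraud cases at the cost of 5 false alarms.
f1_useful = f1_score(tp=8, fp=5, fn=2)
mcc_useful = matthews_corrcoef(tp=8, tn=985, fp=5, fn=2)
```

Both metrics give the trivial classifier a score of zero despite its high accuracy, which is why they are preferred for evaluating models on imbalanced data.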

'Our technique is particularly effective for tasks involving imbalanced data, where the rare class holds greater significance. Banks can use Simplicial SMOTE to detect fraud more effectively, and medical centres can apply it to diagnose rare diseases,' says Andrey Savchenko, co-author of the article and Leading Research Fellow at the Laboratories for Theoretical Modelling in AI of the HSE AI and Digital Science Institute.

The new technique can be integrated into existing oversampling algorithms (such as Borderline-SMOTE, Safe-level-SMOTE, and ADASYN), enabling better accuracy without significantly increasing computational complexity. According to the researchers, the developed approach could contribute to the creation of more accurate and reliable machine learning models, thereby improving the quality of analytics.

The study was conducted with support from the HSE Basic Research Programme.
