Table of Contents
Book of Abstracts
Plenary reports
Generative AI for large models and digital twins
A.Boukhanovsky, ITMO University, St.-Petersburg
Methodology for the use of neural networks in the data analysis of the collider experiments
L.V.Dudko, Lomonosov Moscow State University, Moscow
Most modern data analysis in collider experiments uses multivariate methods. The application of neural networks in such analysis involves several steps. Possible optimizations for each step are discussed in the talk. General recipes for the steps can move the application of neural networks from the state-of-the-art stage to a deterministic procedure.
Machine Learning in Gamma Astronomy
A.Kryukov, Lomonosov Moscow State University, Moscow
A.Demichev, Lomonosov Moscow State University, Moscow
Imaging Atmospheric Cherenkov Telescopes (IACTs) capture images of extensive air showers (EASs) generated by gamma rays and cosmic rays (charged particles) as they interact with the atmosphere. The much more frequent events associated with charged particles form the main background in the search for sources of gamma radiation, and therefore the sensitivity of IACTs depends to a large extent on the ability to distinguish between these two types of events. A conclusion about the properties of a primary high-energy particle can be drawn from the images, in the telescope camera, of the EAS that it initiated. In addition to classification (gamma or charged particle), its properties, such as energy and direction of arrival, can be estimated. Recently, along with the previously developed dedicated data processing methods for identifying gamma-ray events and determining their parameters, methods based on deep learning have been successfully applied. These techniques, as applied to the analysis of IACT data, are the main subject of this survey report. After very brief background information on Cherenkov telescopes and on methods for analyzing their data without deep learning, we discuss the existing deep learning methods for analyzing IACT data, both for classifying the types of detected particles and rejecting background and for determining EAS parameters. Showers initiated by gamma rays account for an insignificant fraction of the total number of observed EASs, while the background of showers induced by cosmic rays (charged particles) predominates. Therefore, the problem of effective background rejection by deep learning methods is extremely important and is discussed in detail.
Obtaining the parameters of the reconstructed gamma-ray events, such as the energy and direction of arrival, is the central task of this entire area of research, since it allows one to obtain information about the physical processes in the sources of these gamma rays and, as a result, about the important physical processes in the Universe. We also consider the possibility of using generative neural networks for fast modeling of images of gamma and proton events in IACT cameras, which can significantly speed up and improve the efficiency of experimental data processing. In addition, we briefly discuss two specialized free software packages developed for analyzing IACT data using deep learning methods (available in public Git repositories). The CTLearn software provides a backend for training neural networks to reconstruct IACT events using TensorFlow. The goal of the GammaLearn project is to find and build optimal neural networks for classifying gamma/cosmic rays and reconstructing gamma-ray parameters in combination with user tools for their easy launch and suitability for various experiments. In the conclusions, the main lessons from the works considered in the review are presented.
Deep learning methods for building “digital twins” of technological processes
M.I. Petrovsky, CMC MSU, Moscow
I.S. Lazukhin, CMC MSU, Moscow
A “digital twin” of a technological process is a set of mathematical models that make it possible to: determine qualitative and quantitative dependencies between process parameters; predict the values of controlled and observed parameters over time, depending on the current state of the process and the control actions; reveal hidden dependencies, states, and factors affecting the technological process; and select optimal control actions depending on the goals and constraints. When machine learning methods are used to build such models, the input data are multivariate time series of readings from the sensors of the production system that carries out the technological process. This report is devoted to the experience of developing and deploying such a “digital twin” for the oil cracking process. The key components of the developed technologies are: comprehensive methods of statistical analysis, visualization, and data preprocessing, including selecting periods of stable operation of the unit and detecting artifacts, outliers, and sensor failures; feature extraction using traditional methods of statistical analysis as well as machine learning methods based on gradient boosting; construction of differentiable predictive models based on modern deep neural networks for predicting the values of controlled parameters depending on the current state and control; and the use of these differentiable neural network models as constraints, objective functions, and state equations for solving the optimal control problem using reinforcement learning approaches and the classical shooting method.
High-performance computing systems in machine learning problems
Sponsor report
A.Moskovsky, RSK, Moscow
Track 1. Machine Learning in Fundamental Physics
23. Generating Synthetic Images of Gamma-Ray Events for Imaging Atmospheric Cherenkov Telescopes Using Conditional Generative Adversarial Networks
Ju.Dubenskaya, SINP MSU (Moscow),
A.Demichev, SINP MSU (Moscow), E.Gres, IGU (Irkutsk), A.Kryukov, SINP MSU (Moscow), S.Polyakov, SINP MSU (Moscow), D.Zhurov, IGU (Irkutsk), A.Vlaskina, SINP MSU (Moscow)
In recent years, machine learning techniques have seen huge adoption in astronomy applications. In this work we discuss the generation of realistic synthetic images of gamma-ray events, similar to the ones taken by Imaging Atmospheric Cherenkov Telescopes (IACTs), using a generative model called a conditional generative adversarial network (cGAN). The big advantage of the cGAN technique is the much faster generation of new images compared to the standard Monte Carlo simulation. However, in order to use cGAN-generated images in a real IACT experiment, we need to ensure that these images are statistically indistinguishable from those generated by the Monte Carlo method. In this work we present the results of a study comparing the parameters of cGAN-generated image samples with the parameters of image samples obtained using Monte Carlo simulation. The comparison is made using the so-called Hillas parameters, which form a set of geometric features of the event image widely used in gamma-ray astronomy. Our study shows that the key point is the proper preparation of the training set for the neural network. A properly trained cGAN not only generates individual images well, but also reproduces well the Hillas parameters of the entire sample of generated images. Thus, machine learning simulations are a good alternative to time-consuming Monte Carlo simulations, and they are also fast enough to meet the growing demand for synthetic images in IACT experiments.
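To make the comparison by Hillas parameters more concrete, the sketch below computes a few of them (size, centroid, length, width, orientation) from the amplitude-weighted moments of an image. It is a minimal illustration, assuming the image is given as a list of hypothetical `(x, y, amplitude)` pixel triples; the function name and the returned keys are illustrative and are not taken from the authors' software.

```python
import math

def hillas_parameters(pixels):
    """Basic Hillas-style moments of an image given as (x, y, amplitude) triples.

    Assumption: pixel coordinates are in camera units and amplitudes are
    non-negative; image cleaning is expected to have been done beforehand.
    """
    size = sum(a for _, _, a in pixels)
    if size <= 0:
        raise ValueError("image has no signal")
    # First moments: amplitude-weighted centroid.
    cx = sum(x * a for x, _, a in pixels) / size
    cy = sum(y * a for _, y, a in pixels) / size
    # Second central moments (covariance of the light distribution).
    sxx = sum(a * (x - cx) ** 2 for x, _, a in pixels) / size
    syy = sum(a * (y - cy) ** 2 for _, y, a in pixels) / size
    sxy = sum(a * (x - cx) * (y - cy) for x, y, a in pixels) / size
    # Eigenvalues of the 2x2 covariance matrix give length^2 and width^2.
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt(0.25 * (sxx - syy) ** 2 + sxy ** 2)
    length = math.sqrt(half_trace + root)
    width = math.sqrt(max(half_trace - root, 0.0))
    # Orientation of the major axis relative to the x axis.
    psi = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return {"size": size, "cx": cx, "cy": cy,
            "length": length, "width": width, "psi": psi}
```

Distributions of these quantities over a generated sample can then be compared with the same distributions over a Monte Carlo sample, for example by overlaying histograms.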
8. Generation of the ground detector readings of the Telescope Array experiment and the search for anomalies using neural networks
R.Fitagdinov, MIPT (Moscow region), INR RAS (Moscow)
I.Kharuk
The main goal of this work was to develop neural network models for generating the readings of the surface detector with the largest signal amplitude in the Telescope Array experiment. The data used to train the model were obtained with the Monte Carlo method. To this end, Wasserstein generative adversarial networks with gradient penalty were implemented. The generated data are visually similar to the simulated ones. An anomaly search function was also written, which makes it possible not only to look for discrepancies between real and simulated data but also to reproduce data close to a given sample. Further research may include building a generative adversarial network for all surface detectors, not only for the detector with the largest signal amplitude. Thus, this model may be a good alternative to the currently used Monte Carlo method. Its advantage may be speed, which differs by several orders of magnitude.
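For reference, the critic and generator objectives of a Wasserstein GAN with gradient penalty are usually written as follows. The notation is conventional and is an assumption here, not taken from the abstract: $D$ is the critic, $G$ the generator, $P_r$ and $P_g$ the distributions of training and generated samples, $\lambda$ the penalty weight. The loss actually used by the authors may differ in details.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x}\sim P_g}\left[D(\tilde{x})\right]
     - \mathbb{E}_{x\sim P_r}\left[D(x)\right]
     + \lambda\,\mathbb{E}_{\hat{x}}\left[\left(\left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1\right)^2\right],\\
\hat{x} &= \epsilon\,x + (1-\epsilon)\,\tilde{x}, \qquad \epsilon \sim U[0,1],\\
L_G &= -\,\mathbb{E}_{z\sim p(z)}\left[D(G(z))\right].
\end{aligned}
```

The penalty term pushes the norm of the critic's gradient towards 1 on points interpolated between training and generated samples, which serves as a soft version of the Lipschitz constraint required by the Wasserstein formulation.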
40. Neural network approach to impact parameter estimation in high-energy collisions using the microchannel plate detector data
K.Galaktionov (SPbSU, St.-Petersburg),
V.Roudnev, F.Valiev
Evaluation of the impact parameter (the distance between the trajectories of the colliding particles) in a single event in high-energy ion collisions is an important problem in data analysis in particle physics, since the impact parameter is crucial for understanding the dynamics of these collisions and is essential for extracting information about the properties of nuclear matter. In this work, we present a study of a neural network approach to estimating both the value of the impact parameter and the class of collisions (head-on or peripheral). We have modeled and investigated the response of the detectors based on microchannel plates proposed in [1], using the $\mbox{Au}+\mbox{Au}$ collision dataset at an energy of $\sqrt{s_{NN}} = 11\mbox{ GeV}$ obtained by the QGSM MC event generator. We have used the spatial distribution of particles and their time-of-flight data as event features, and we have shown that adding the time-of-flight information to such event characteristics as the number of registered hits and the average angle of hits on the detectors improves the quality of the impact parameter evaluation. Moreover, by comparing two detector systems with different pseudorapidity acceptance ($3.5 < \eta < 5.8$ and $4.4 < \eta < 5.8$), we have shown that the wider interval significantly improves the results. In the course of studies on this data set, the proposed algorithm was able to successfully classify more than 98% of $\mbox{Au}+\mbox{Au}$ head-on collision events with an impact parameter of less than 5 fm, and it can further be useful as a fast trigger system. We also discuss further developments and improvements for possible applications of this technique in future experimental setups.
[1] A.A. Baldin, G.A. Feofilov, P. Har’yuzov, and F.F. Valiev. Fast beam–beam collisions monitor for experiments at NICA. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 958:162154, 2020. Proceedings of the Vienna Conference on Instrumentation 2019.
28. The selection of rare gamma event from IACT images with deep learning methods
E.Gres (IGU, Irkutsk)
A.Demichev (SINP MSU, Moscow), Ju.Dubenskaya (SINP MSU, Moscow), A.Kryukov (SINP MSU, Moscow), S.Polyakov (SINP MSU, Moscow), D.Zhurov (IGU, Irkutsk), A.Vlaskina (SINP MSU, Moscow)
Imaging Atmospheric Cherenkov Telescopes (IACTs) of the TAIGA gamma-ray observatory detect Extensive Air Showers (EASs) originating from the interactions of cosmic rays or gamma rays with the atmosphere, and thereby obtain images of the EASs. The ability to distinguish gamma rays from the hadronic cosmic-ray background in the images is one of the main features of this type of detector. However, actual IACT observations require simultaneous observation of the background and the gamma-ray source. This observation mode (called wobbling) modifies the images of events, which affects the quality of selection by neural networks. In this work, we present the results of applying convolutional neural networks (CNNs) to the image classification task on MC images of the TAIGA-IACTs. The wobbling mode is considered together with the image adaptation needed for adequate analysis by the CNN. Taking into account all necessary image modifications, we also give an estimate of the selection quality of the CNN for rare gamma events in MC simulation.
6. Preliminary results of neural network models in HiSCORE experiment
A.Kryukov (SINP MSU, Moscow),
A.Vlaskina (SINP MSU, Moscow), A.Demichev (SINP MSU, Moscow), Ju.Dubenskaya (SINP MSU, Moscow), E.Gres (IGU, Irkutsk), S.Polyakov (SINP MSU, Moscow), D.Zhurov (IGU, Irkutsk)
The paper presents preliminary results on determining the direction of the EAS axis in experiments based on an array of Cherenkov detectors. An example of such a facility is HiSCORE, deployed near Lake Baikal as part of the TAIGA experiment. Two approaches have been considered. One is based on representing the signal arrival times registered by the stations in the form of an image, which is processed using a convolutional network. The other approach is to select a subset of the same number of triggered stations in each event, which includes data on the location of the stations relative to each other and the relative time of signal registration. The analysis is performed using a fully connected deep network. It is shown that both approaches give approximately the same accuracy. In the future, we propose to optimize the architecture of both networks and the process of their training to improve the accuracy of predicting EAS parameters.
The work was supported by RSF, grant no. 22-21-00442. The work was done using the data of UNU “Astrophysical Complex of MSU-ISU” (agreement EB-075-15-2021-675).
27. The use of conditional variational autoencoders for simulation of EASs images from IACTs
A.Kryukov (SINP MSU, Moscow)
S.Polyakov (SINP MSU, Moscow), A.Demichev (SINP MSU, Moscow), Ju.Dubenskaya (SINP MSU, Moscow), E.Gres (IGU, Irkutsk), D.Zhurov (IGU, Irkutsk), A.Vlaskina (SINP MSU, Moscow)
Imaging atmospheric Cherenkov telescopes (IACTs) are used to record images of extensive air showers (EASs) caused by high-energy particles colliding with the upper atmosphere. Individual images can be used to distinguish between the types of primary particles and to estimate the parameters of the events. The algorithms typically use data from events with known parameters simulated using the Monte Carlo method. In this paper, we investigate a possible alternative approach: simulating IACT images with conditional variational autoencoders. Both the characteristics of the individual images and their distributions are compared to those of the images generated by Monte Carlo simulation.
15. A universal method for separating extensive air showers by primary mass using machine learning for a SPHERE-type Cherenkov telescope
V.Latypova (SINP MSU, Moscow),
K.Zh. Azra, E.A. Bonvech, V.I. Galkin, M.D. Ziva, V.A. Ivanov, D.A. Podgrudkov, T.M. Roganova, D.V. Chernov, E.L. Entina
An important problem in high-energy physics is determining the mass composition of cosmic rays in the energy range 1–100 PeV. The goal of the SPHERE-2 experiment is to determine the mass composition of primary cosmic radiation. The better the separation of events produced by the Cherenkov light of extensive air showers, the more accurate the estimate of the mean mass. In this work, an efficient method has been developed for separating the primary nuclei that produced extensive air showers, based on simulated events of the SPHERE instrument and using machine learning methods. Since a nucleus-nucleus interaction model is used to simulate artificial events, the result of event separation, and hence the estimate of the mean mass, may vary. In the SPHERE-2 experiment this problem is solved. First, Cherenkov light data are used, which depend only weakly on the hadronic interaction model. Second, the neural network was trained simultaneously on two nuclear interaction models that differ strongly in the 1–100 PeV energy range: QGSJET-01 and QGSJETII-04. This ensures that the processing of experimental data does not depend on the choice of the nucleon interaction model. A regression problem was solved using machine learning methods. The separation of events into three groups of nuclei, protons (p), nitrogen (N), and iron (Fe), is more accurate with a neural network than with classical methods.
7. Using Neural Networks for Reconstructing Particle Arrival Angles in the Baikal-GVD Neutrino Telescope
A.Leonov (MIPT, Moscow region),
O.Kalashev
The problem of reconstructing the arrival angles of neutrinos in the Baikal-GVD experiment using neural networks is studied. We use Monte Carlo simulation data for single-cluster events of atmospheric neutrinos with energies from 10 GeV to 100 TeV. The performance of the networks was compared with the standard reconstruction in terms of median angular resolution. It was shown that neural networks can be more accurate than the standard reconstruction for small polar arrival angles.
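The median angular resolution used as the figure of merit above can be computed as the median of the opening angle between the true and reconstructed directions. The sketch below is an illustration under stated assumptions: directions are given as hypothetical `(theta, phi)` pairs in radians (polar angle and azimuth), and the function names are not taken from the Baikal-GVD software.

```python
import math
from statistics import median

def unit_vector(theta, phi):
    """Unit vector for polar angle theta and azimuth phi (both in radians)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def angular_error(true_dir, reco_dir):
    """Opening angle in radians between two (theta, phi) directions."""
    a = unit_vector(*true_dir)
    b = unit_vector(*reco_dir)
    dot = sum(p * q for p, q in zip(a, b))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot)))

def median_angular_resolution(true_dirs, reco_dirs):
    """Median opening angle over a sample of events."""
    return median(angular_error(t, r) for t, r in zip(true_dirs, reco_dirs))
```

Evaluating this quantity in bins of the true polar angle is one way to see where a network and a standard reconstruction differ.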
9. Application of machine learning methods in Baikal-GVD: background noise rejection and selection of neutrino-induced events
A.Matseiko (MIPT, Moscow region, INR RAS, Moscow)
Baikal-GVD is a large ($\sim 1\mbox{ km}^3$) underwater neutrino telescope located in Lake Baikal, Russia. In the talk, we present two machine learning techniques developed for its data analysis. First, we introduce a neural network for efficient rejection of noise hits that arise from natural water luminescence. On Monte Carlo simulated data, it reaches up to 99% signal purity (precision) and 96% survival efficiency (recall). Second, we develop a neural network for distinguishing muon- and neutrino-induced events. By choosing an appropriate classification threshold, we preserve 60% of neutrino-induced events, while muon-induced events are suppressed by a factor of $10^{-6}$. Both of the developed neural networks employ the causal structure of events and surpass the precision of standard algorithmic approaches.
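The quality measures quoted above (purity as precision, survival efficiency as recall, and the fraction of background passing a threshold) can be computed as in the following sketch. The function names and the toy inputs are illustrative assumptions; labels are taken to be 1 for signal and 0 for background.

```python
def precision_recall(labels, predictions):
    """Precision (purity) and recall (efficiency) for binary labels."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fp = sum(1 for y, p in pairs if not y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pass_fractions(scores, labels, threshold):
    """Fractions of signal and of background with score >= threshold."""
    signal = [s for s, y in zip(scores, labels) if y]
    background = [s for s, y in zip(scores, labels) if not y]
    signal_kept = sum(s >= threshold for s in signal) / len(signal)
    background_kept = sum(s >= threshold for s in background) / len(background)
    return signal_kept, background_kept
```

Scanning the threshold and recording both fractions gives the trade-off between preserved neutrino-induced events and suppressed muon-induced events.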
36. Novelty Detection Neural Networks for Model-Independent New Physics Search
A.Zaborenko (MSU, Moscow)
L.Dudko
Recent advancements in model-independent approaches in High Energy Physics have encountered challenges due to the limited effectiveness of unsupervised algorithms when compared to their supervised counterparts. In this paper, we present a novel approach utilizing a one-class Deep Neural Network (DNN) to achieve accuracy levels comparable to supervised learning methods. Our proposed novelty detection algorithm uses a multilayer perceptron to learn and distinguish a specific class from simulated noise signals. By training on a single class, our algorithm constructs a hyperplane similar to one-class Support Vector Machines (SVMs) but with enhanced accuracy and significantly reduced training and inference times. This research contributes to the advancement of model-independent techniques for uncovering New Physics phenomena, showcasing the potential of one-class DNNs as a viable alternative to traditional supervised learning approaches. The obtained results demonstrate the effectiveness of our proposed algorithm, paving the way for improved anomaly detection and exploration of uncharted territories in High Energy Physics.
Track 2. Modern Machine Learning Methods
14. Reconstruction Methods for a Partial Differential Equation: Application to Physical and Engineering Problems
N.Bykov (ITMO University, St.-Petersburg),
A.A.Hvatov, T.A.Andreeva, A.Y.Lukin, M.A.Maslyaev, N.V.Obraztcov, A.V.Surov, A.V.Boukhanovsky
Two approaches for restoring a process model in the form of a partial differential equation based on experimental data are developed. The first approach is based on the evolutionary algorithm, and the second one on the algorithm for best subset selection. Different techniques for numerical differentiation of noisy data are proposed for solving physical and engineering problems. A full-scale experiment of heating a viscous liquid by a merged heat source was carried out as the benchmark for verification of developed algorithms. The efficiency of the algorithms for restoring the structure of the equation, determining the coefficients of derivatives and the possibility of detecting the change in the object operation mode are shown. Additionally, the prospects of restoring a part of a hybrid model in order to simplify the description of complex engineering objects are demonstrated. The advantages and disadvantages of the proposed approaches are discussed.
This research is financially supported by The Russian Science Foundation, Agreement N 21-11-00296, https://rscf.ru/en/project/21-11-00296/
45. Decomposition of Spectral Contour into Gaussian Bands using Improved Modification of Gender Genetic Algorithm
S.Dolenko (SINP MSU, Moscow),
G.A.Kupriyanov (1), I.V.Isaev (2,3), I.V.Plastinin (2), T.A.Dolenko (1,2)
(1) Faculty of Physics, Lomonosov Moscow State University, Moscow,
(2) SINP MSU, Moscow
One of the methods for analysis of complex spectral contours (especially for spectra of liquid objects) is their decomposition into a limited number of spectral bands with physically reasonable shapes (Gaussian, Lorentzian, Voigt etc.). Consequent analysis of the dependencies of the parameters of these bands on some external conditions in which the spectra are obtained may reveal some regularities bearing information about the physical processes taking place in the object.
The difficulty is that such a decomposition, in the presence of noise in the spectra, is an ill-posed inverse problem. Therefore, this problem is often solved by advanced optimization methods that are less prone to getting stuck in local minima, such as genetic algorithms (GA).
In the conventional version of GA, all individuals are similar regarding the probabilities and implementation of the main genetic operators (crossover and mutation) and the procedure of selection. In their preceding studies [1, 2], the authors tested gender GA (GGA), where the individuals of the two genders differ by the probability of mutation (higher for males) and by the procedures of selection for crossover (with the number of crossovers limited for females). In this study, we introduce additional differences between the genders in the procedures of selection and mutation. The improved modification of GGA is tested by comparison of the efficiency of gradient descent, conventional GA and two versions of GGA in solving the problems of decomposition of the Raman valence band of liquid water into Gaussian bands.
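Whatever optimizer is used (gradient descent, conventional GA or GGA), the quantity being minimized in such a decomposition is typically the discrepancy between the measured contour and a sum of Gaussian bands. The sketch below shows that objective only; it is a minimal illustration in which the band parametrization `(amplitude, center, width)` and the function names are assumptions, and it does not reproduce the authors' genetic operators.

```python
import math

def gaussian_sum(x, bands):
    """Model spectrum at x as a sum of (amplitude, center, width) Gaussians."""
    return sum(a * math.exp(-0.5 * ((x - c) / w) ** 2) for a, c, w in bands)

def sum_squared_error(bands, xs, ys):
    """Fitness to minimize: squared residual between model and measured contour."""
    return sum((gaussian_sum(x, bands) - y) ** 2 for x, y in zip(xs, ys))
```

In a GA setting, each individual would encode one candidate `bands` list and `sum_squared_error` would serve as its (inverse) fitness.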
This study has been funded by the SINP MSU state budget topic 6.1 (01201255512).
References
1. G.A.Kupriyanov, I.V.Isaev, I.V.Plastinin, T.A.Dolenko, S.A.Dolenko. Decomposition of Spectral Contour into Gaussian Bands using Gender Genetic Algorithm. The 6th International Workshop on Deep Learning in Computational Physics (DLCP-2022). Proceedings of Science, 2022, V.429, paper 009. DOI: 10.22323/1.429.0009.
2. G.Kupriyanov, I.Isaev, S.Dolenko. A Gender Genetic Algorithm and Its Comparison with Conventional Genetic Algorithm. In: Kryzhanovsky, B., Dunin-Barkowski, W., Redko, V., Tiumentsev, Y. (eds). Advances in Neural Computation, Machine Learning, and Cognitive Research VI. NEUROINFORMATICS 2022. Studies in Computational Intelligence, 2023, V.1064, pp. 158-166. Springer, Cham. DOI: 10.1007/978-3-031-19032-2_16.
3. Robust equation discovery as a machine learning method
A.Hvatov (ITMO University),
M.Maslyaev, R.Titov, D. Aminev, J.Schvartsberg, N.Demyanchuk, E.Ivanchik, I.Markov
In this presentation, we delve into the differential equation discovery process as an AI tool to mine unknown equations from observational data. While modern methods can provide an “answer” in the form of one equation, it raises the question of potential coincidence. We intend to explore ways to verify if the provided “answer” is precisely what we need by extracting multiple potential equations from the data, solving them, and assessing uncertainty to ensure accuracy.
20. Modification of soft connectives in machine learning models
Poster
V.Kalnitsky (SPbSU, St.-Petersburg)
The problem of limited accuracy of machine learning models using soft logical connectives is investigated. Such connectives have shown their effectiveness in models with fuzzy initial data. On the one hand, the fundamental disadvantage of soft connectives is their non-associativity. On the other hand, the disadvantages of the currently used soft connectives include the loss of monotonicity and the inability to control several factors simultaneously. We have proposed an approximation of the signum function by a smooth spline. We are controlling the difference between the soft connective and the associative connective. It was shown that the spline approximation is able to reduce the influence of all negative factors and is more flexible in setting. Moreover, the constructed spline model allows numerous modifications depending on the factor that requires the most attention for different tasks.
26. The importance of the number of overfits in time series forecasting by some optimizers and loss functions in neural networks
Z.Kurdoshev (Tomsk State University, Tomsk),
E.Pchelintsev
Time series forecasting remains one of the most important and relevant problems today. High prediction accuracy is difficult to achieve, both for simple neural networks and for recurrent ones. The loss of a forecast depends on the amount of overfitting and on the optimizer, its efficiency and its scale in recurrent neural networks. Various loss functions and modern optimizers are used in the study. Empirical studies are carried out using LSTM, RNN and GRU networks. For each network, the dependence of the time series forecast on overfitting is examined using the Adam, AdaGrad, and stochastic gradient descent optimizers. The results of the study are presented in the form of graphs.
21. Training by relaxation of Boltzmann machine with multispin interactions
I.Lobanov (ITMO University, St.-Petersburg)
A Boltzmann machine (BM) is an energy-based generative model capable of solving unsupervised and supervised machine learning problems. BMs have been successfully used for facial recognition, phone recognition, sentiment analysis, aspect extraction, etc. [G.M. Harshvardhan, et al. A comprehensive survey and analysis of generative models in machine learning. Computer Science Review 38 (2020), 100285]. The mass use of BMs is restricted by the slow convergence of the corresponding training algorithms. The training can be facilitated by limiting connections in the BM, which allows training by the contrastive divergence algorithm. Such restricted BMs simulated on GPU or FPGA devices are efficient enough for practical use. More efficient hardware implementations of BMs are possible if the close relation between the BM and the Ising model is utilized. Several devices based on new physical principles have been proposed, including spin-ice systems, electric circuits involving memristors and the atomic spin system [B. Kiraly, et al. An atomic Boltzmann machine capable of self-adaption. Nat. Nanotechnol. 16 (2021), 414]. Most of the implementations include the training procedure as part of the hardware design, while the device cannot change connections and biases on its own. Making the devices capable of self-training is highly desirable, as this allows them to be reused for new tasks and speeds up learning due to the intrinsic parallelism of the hardware implementation.
We proposed to implement the training of the biases as energy minimization of a carefully designed spin system, where the biases are implemented as macrospins included in the device [I. Lobanov. Spin Boltzmann machine. Nanosystems: Physics, Chemistry, Mathematics, 13, 6 (2022), 593]. Since energy is naturally minimized during relaxation, no additional circuitry supporting the learning is needed. Devices of this kind can become the basis for the future of machine learning due to their energy efficiency, large capacity and high speed.
A universal BM should be capable of learning correlations in data, that is, the connections between the units should be optimized. We propose to control the connections using multispin interactions between two units and one or two macrospins representing weights. In this approach, the training data is provided to the device by applying an external magnetic field, and all the parameters of the BM can be trained by relaxation. To test the approach, we simulated a generalized Heisenberg model including symmetric two-, three- and four-spin exchanges, uniaxial anisotropy and the Zeeman term. We call the model a spin Boltzmann machine (SBM). The simulation is done using a semi-implicit second-order integrator implemented in custom GPU-accelerated software. The efficiency of the SBM is demonstrated on the classification of handwritten digits from the MNIST database. The significance of the contribution of multispin exchanges to the total energy was demonstrated in [Hoffmann M., Blügel S. Systematic derivation of realistic spin models for beyond-Heisenberg solids. Phys. Rev. B, 2020, 101, P. 024418]; however, the purposeful design of such materials is an open problem. In this work we outline constraints on the multispin exchange parameters, which can guide the further search for materials for an experimental realization of the SBM.
17. Hyper-parameter tuning of neural network for high-dimensional problems in the case of Helmholtz equation
D.Poliakov (SPbSU, St.-Petersburg),
M.Stepanova
In this work we study the effectiveness of common hyper-parameter optimization (HPO) methods for physics-informed neural networks (PINNs) with application to the multidimensional Helmholtz problem. The network was built on the PyTorch framework without the use of special PINN-oriented libraries. We investigate the effect of hyper-parameters on NN model performance and conduct automatic hyper-parameter optimization using different combinations of search algorithms and trial schedulers. As the HPO tool we chose Ray Tune, an open-source HPO framework that provides a unified interface to many HPO packages. We consider two search algorithms, random search and a Bayesian method based on the tree-structured Parzen estimator (TPE, in two implementations: hyperopt and hpbandster), and the “Asynchronous Successive Halving” early-stopping algorithm (ASHA, implemented in Ray Tune). For our problem, enabling the early-stopping algorithm is shown to give faster HPO convergence than switching from random search to the Bayesian method.
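The idea behind early stopping by successive halving can be sketched in a few lines. Note that this is the plain synchronous variant, not ASHA itself and not Ray Tune's implementation: all surviving configurations are evaluated at the same budget, the best `1/eta` are kept, and the budget is multiplied by `eta`. The function name, the `evaluate(config, budget)` callback and the default values are assumptions made for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Synchronous successive halving over a list of configurations.

    evaluate(config, budget) must return a loss (lower is better) obtained
    after training `config` with the given budget (e.g. number of epochs).
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank all surviving configurations at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the best 1/eta of them (at least one) for the next rung.
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]
```

ASHA removes the synchronization barrier by promoting a configuration to the next rung as soon as it ranks in the top `1/eta` of the results available so far, which keeps parallel workers busy.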
Track 3. Machine Learning in Natural Sciences
12. Estimating cloud base height from all-sky imagery using artificial neural networks
M.Borisov (MIPT, Moscow region),
M.Krinitskiy, N.Tilinina
Cloud Base Height (CBH) is one of the most important meteorological parameters. CBH strongly correlates with planetary boundary layer depth. Existing methods for assessing CBH in practice either involve the use of complex and expensive equipment, such as lidars, airplanes, and meteorological pilot balloons, or have high uncertainty introduced by an expert in the process of visual assessment. In addition, most of the instrumental methods are designed for equipment installed on a stable platform and are difficult to apply under wave-induced motion. In this study, we propose a new way to estimate CBH using two optical wide-angle cameras mounted at a distance that may vary from 15 m to 30 m. In our approach, we use the phenomenon of parallax, namely the property of observing a point of a cloud at angles that differ due to the displacement of the observation device. Optical images of the visible sky hemisphere are acquired synchronously during field observations from two cameras of the SAIL Cloud v.2 cloud characteristics assessment device developed in the Sea-Atmosphere Interaction Laboratory (SAIL) of the Shirshov Institute of Oceanology of the Russian Academy of Sciences. In order to exploit the parallax phenomenon for closely located objects (clouds), we adjust the transformation of one of the images of each pair to compensate for inaccurate camera installation. To do this, we calculate the correction of the position and orientation of the cameras using distant objects, such as the Sun or the Moon. To find the key points of clouds that are observed from the cameras at different angles, the graph artificial neural network SuperGlue is used. With the help of SuperGlue, we identify key points and extract their geometric features, according to which we then match key points on synchronous images.
Based on the location of key points on synchronous aligned images, we calculate the angle at which they are visible in two cameras, which allows us to estimate the distance from the camera installation base to the clouds. As a result of this study, we developed a new algorithm for calculating CBH. We compared the results of CBH estimates with the values of the ERA-5 reanalysis for marine expeditions AI57, AI58, AI61 of the Shirshov Institute of Oceanology of the Russian Academy of Sciences. The closest consistency with the reanalysis data is observed for cirrus and cumulus clouds.
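The parallax relation behind such an estimate can be sketched for a simplified planar case. The sketch below assumes that both cameras are at the same level, that the cloud point lies in the vertical plane containing the baseline, and that it is located beyond the second camera as seen from the first; the function name and the numbers are hypothetical, and the actual algorithm works with full hemispheric images rather than this two-angle geometry.

```python
import math


def height_from_parallax(baseline_m, elev1_deg, elev2_deg):
    """Height of a point seen at elevation angles elev1 < elev2 from two cameras.

    Camera 1 sits at x = 0, camera 2 at x = baseline_m, and the point at (x, h)
    with x > baseline_m, so that tan(elev1) = h / x and
    tan(elev2) = h / (x - baseline_m). Eliminating x gives
    h = baseline_m / (cot(elev1) - cot(elev2)).
    """
    cot1 = 1.0 / math.tan(math.radians(elev1_deg))
    cot2 = 1.0 / math.tan(math.radians(elev2_deg))
    diff = cot1 - cot2
    if diff <= 0:
        raise ValueError("expected elev2 > elev1 for this geometry")
    return baseline_m / diff


# Hypothetical check: a point 1000 m high, 500 m from camera 1, baseline 20 m.
a1 = math.degrees(math.atan2(1000.0, 500.0))
a2 = math.degrees(math.atan2(1000.0, 480.0))
h = height_from_parallax(20.0, a1, a2)
```

Because the difference of cotangents is small when the baseline is short compared with the cloud height, small errors in the matched angles translate into large errors in the height, which is one reason the camera alignment step matters.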
46. Classification Approach to Prediction of Geomagnetic Disturbances
I.Gadzhiev (1,2),
I.Isaev (1), O.Barinov (1), S.Dolenko (1), I.Myagkova (1)
(1) D.V.Skobeltsyn Institute of Nuclear Physics, M.V.Lomonosov Moscow State University, Moscow, Russia
(2) Faculty of Physics, M.V.Lomonosov Moscow State University, Moscow, Russia
Magnetic storms can disrupt the operation of radio communications, pipelines, power lines and electrical networks, and may affect human health. Therefore, prediction of geomagnetic disturbances is of great practical value. Geomagnetic disturbances are usually described by the planetary index Kp, which is provided at 3-hour intervals. The approach used in this study classifies geomagnetic disturbances according to the level of the Kp index. To do so, the whole range of index values is divided into several intervals according to the degree of disturbance. The input data are time series of parameters of the solar wind and interplanetary magnetic field, measured onboard spacecraft at the L1 Lagrange point between the Sun and the Earth, and of the Kp index itself. To account for the "memory" of the time series, delay embedding of all the parameters is used: for each parameter, several of its preceding values are taken into account. Additional pre-processing is performed by calculating moving averages and other statistical indicators of the time series. Classification is performed with various machine learning methods, such as gradient boosting and artificial neural networks. The optimal parameter values of each method are determined by cross-validation, and class imbalance is partially reduced using the SMOTE technique. It is demonstrated that the suggested approach outperforms the trivial inertial model for all values of the prediction horizon from 3 to 24 hours (in 3-hour steps). The most efficient pre-processing methods are described, as well as the best machine learning models.
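The delay embedding and the trivial inertial baseline described above can be sketched as follows. This is only an illustration: the class thresholds, the embedding depth, and the Kp values are hypothetical and are not taken from the study.

```python
def delay_embed(series, depth):
    """Rows [x[t-depth+1], ..., x[t]] for every t that has a full history."""
    return [series[t - depth + 1 : t + 1] for t in range(depth - 1, len(series))]


def kp_class(kp, edges=(3.0, 5.0)):
    """Map a Kp value to a disturbance class (thresholds are assumed)."""
    return sum(kp >= edge for edge in edges)


def inertial_accuracy(classes, horizon):
    """Accuracy of the inertial model: the class at t + horizon equals the class at t."""
    predictions = classes[:-horizon]
    targets = classes[horizon:]
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)


kp_values = [1.0, 2.0, 4.0, 6.0, 5.0, 2.0]  # hypothetical 3-hour Kp values
features = delay_embed(kp_values, depth=3)  # each row holds 3 consecutive values
classes = [kp_class(kp) for kp in kp_values]
baseline = inertial_accuracy(classes, horizon=1)  # horizon of one 3-hour step
```

Any trained classifier would need to beat `baseline` at the same horizon to be considered useful, which is the comparison the abstract refers to.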
The study has been conducted at the expense of the grant of the Russian Science Foundation No. 23-21-00237, https://rscf.ru/en/project/23-21-00237/.
50. CLIENT-SERVER APPLICATION FOR AUTOMATED ESTIMATION OF THE MATERIAL COMPOSITION OF BOTTOM SEDIMENTS IN THE >0.1 MM FRACTION FROM MICROPHOTOGRAPHY USING MODERN DEEP LEARNING METHODS
V.Golikov(MIPT, Moscow region),
Krinitskiy M.A., Borisov D.G.
Information on past climates and deep-water environments is preserved in natural archives, such as the marine sediments covering the sea floor. The study of sediment composition in the coarse fraction (a variety of mineral and biogenic grains over 0.063 mm in size) is widely used to extract climate clues from the sediment record. At present, specialists use a binocular microscope to visually classify grains from a small portion of a sediment sample. This time-consuming technique requires the observer to possess geological expertise. In previous work, we proposed an algorithm for automatic unsupervised detection of particles and their clustering. In the current work, we present qualitative improvements to the algorithm, which now employs the state-of-the-art clustering method SPICE. This method allowed us to eliminate over-clustering and limit the number of clusters to three, making the results easier to interpret. We trained the algorithm and interpreted its results. The resulting model can be used as a classifier, enabling the calculation of particle distribution by clusters, analysis of grain-size distribution, and comparison of these results with those obtained through other lithological analyses. According to the mean average deviation from the results of X-ray diffraction analysis, our method outperforms visual description techniques. Lastly, we developed and deployed an application that automates server-side calculations and allows users to evaluate the results of clustering and grain-size measurements.
49. Transfer Learning for Neural Network Solution of an Inverse Problem in Optical Spectroscopy.
A.Guzkov (1),
I.Isaev (2), S.Burikov (1), T.Dolenko (1), K.Laptinskiy (2), S.Dolenko (2)
1. Faculty of Physics Lomonosov Moscow State University
2. SINP MSU, Moscow
This work is devoted to the inverse problems of optical spectroscopy, which consist in determining the qualitative and quantitative composition of a sample from its spectra. We consider optical absorption, IR and Raman spectroscopy for determining the concentrations of heavy metal ions in water. Since this problem does not have a numerical solution, practically the only way to solve it is to use approximation methods, including machine learning methods, based on experimental data. However, this approach faces some obstacles. On the one hand, obtaining spectra is a laborious and expensive process, which makes it difficult to collect a training sample of sufficient size. On the other hand, the experimental spectra are sensitive to impurities contained in water, including those of organic origin, which also degrade the quality of the solution. At the same time, the impurities contained in the water are specific to each source, so it is not possible to create a universal, representative training dataset that provides a stable solution on any sample. We therefore propose to use the transfer learning approach to improve the quality of the neural network solution. The neural networks were pre-trained on a basic dataset (about 3700 patterns) containing the spectra of solutions prepared with distilled water. Fine-tuning and testing of the networks were then carried out on specific datasets (200-400 samples) containing the spectra of solutions prepared with water taken from different rivers.
This study has been performed at the expense of the grant of the Russian Science Foundation no. 19-11-00333, https://rscf.ru/en/project/19-11-00333/.
48. The study of the integration of physical methods in the neural network solution of the inverse problem of exploration geophysics with variable physical properties of the medium.
I.Isaev (1, 3),
I.Obornev (SINP MSU, Moscow), E.Obornev (2), E.Rodionov (2), M.Shimelevich (2), S.Dolenko (1)
1. D.V.Skobeltsyn Institute of Nuclear Physics, M.V.Lomonosov Moscow State University, Moscow
2. Sergo Ordjonikidze Russian State University for Geological Prospecting, Moscow
3. KIRE RAS
The subject area of this study is exploration geophysics, which requires solving specific inverse problems: reconstructing the spatial distribution of the properties of the medium in the thickness of the earth from geophysical fields measured on its surface. We consider the inverse problems of gravimetry, magnetometry and magnetotelluric sounding, and their integration, which means the simultaneous use of various geophysical fields to reconstruct the desired distribution. Integrating various geophysical methods requires the determined parameters to be the same for each of the methods. This may be achieved by the spatial statement of the problem, in which the task is to determine the boundaries of geophysical objects. In our previous studies, we considered a parameterization scheme in which the inverse problem was to determine the lower boundary of several geological layers. Each layer was characterized by variable values of the depth of the lower boundary along the section, and by values of density, magnetization, and resistivity that were fixed both for the layer and over the entire data set [1]. It was demonstrated that integration of geophysical methods provides significantly better results than the use of each of the methods separately. The present study continues this line of work: here we consider a parameterization scheme with variable properties of the medium along each layer and variable properties over the data set.
This study has been performed at the expense of the grant of the Russian Science Foundation no. 19-11-00333, https://rscf.ru/en/project/19-11-00333/.
References
1. I.V. Isaev, I.E. Obornev, E.A. Obornev, E.A. Rodionov, M.I. Shimelevich, S.A. Dolenko. Comparison of data integration methods for neural network solution of the inverse problem of exploration geophysics. 2022 VIII International Conference on Information Technology and Nanotechnology (ITNT), 2022, IEEE, pp. 1-4. DOI: 10.1109/ITNT55410.2022.9848628
13. Estimating significant wave height from X-band navigation radar using convolutional neural networks
M.Krinitsky (Shirshov Institute of Oceanology, RAS, Moscow),
V. Golikov, N.Anikin, A.Suslov, A.Gavrikov, N.Tilinina
Marine radars are vital for safe navigation at sea, detecting vessels and obstacles. Sea clutter, caused by Bragg scattering, is filtered out as noise, but becomes detectable on radar images when wind speed and wave height exceed certain thresholds. Wind-induced wave parameters can be determined from these images, but traditional spectral methods for obtaining wave characteristics face limitations and are difficult to make more accurate. Machine learning techniques offer advantages in image processing tasks, being more robust and able to handle noisier data. Convolutional neural networks (CNNs) have gained popularity for their efficiency in image recognition problems, extracting features automatically and exhibiting robustness to noise. CNNs have shown success in geosciences, for example in satellite image recognition and in retrieving information about waves, currents, and oil spills from radar images. Recently, CNNs have been applied to real-time significant wave height estimation using ocean images from various sources. CNN-based techniques for calculating wave period and direction from radar sea clutter images have also been developed. Recent studies highlight the potential of integrating machine learning techniques such as CNNs to improve wave measurement accuracy in marine radar applications. In our study, we use CNNs to restore wave characteristics from shipborne radar data and compare the results with those obtained from a Spotter buoy. Using CNNs together with traditional spectral analysis methods helps to improve the accuracy of wave characteristics reconstruction and increase computational efficiency.
38. Machine learning techniques for anomaly detection in high-frequency time series of wind speed and greenhouse gas concentration measurements
A.Kasatkin (Shirshov Institute of Oceanology, RAS),
M.Krinitsky
Fluxes of greenhouse gases (GHG) may be assessed in situ with the eddy covariance method by processing high-frequency measurements of gas concentration and wind speed acquired at certain sites, e.g., the carbon measurement test areas of the pilot project of the Ministry of Education and Science of Russia. The measurements commonly come with noise, anomalies and gaps of various nature. These anomalies result in biased GHG flux estimates. There are a number of empirical and heuristic approaches for filtering the noise and anomalies, as well as for gap filling. These approaches have many tuning parameters that are commonly adjusted by an expert, which is a limiting factor for large-scale deployment of GHG monitoring stations. In this study, we propose an alternative approach to anomaly detection in high-frequency measurements of GHG concentration and wind speed. Our approach is based on machine learning techniques and has a smaller number of tuning parameters. The goal of our study is to develop a fully automated data preprocessing routine based on machine learning algorithms. We collected a dataset of high-frequency GHG concentration and wind speed measurements from one of the carbon measurement test areas. In order to compare anomaly detection algorithms, we labeled anomalies in a subset of this dataset. We present two approaches to anomaly detection: (a) identification of outliers based on the error magnitude of a statistical time series forecast performed by a machine learning (ML) algorithm; (b) classification of anomalies by an ML model trained on the labeled dataset of outliers mentioned above. We compare the approaches and algorithms using the F1-score assessed with respect to the expert-labeled subset of anomalies in the GHG concentration and wind speed time series.
Within the forecast-error based approach, we trained several ML models: the ARIMA autoregression method, a CatBoost model for autoregression, a CatBoost model for forecasting that employs additional features, and an LSTM artificial neural network. Within the supervised classification approach, we tested a CatBoost classification model. We demonstrate that ML models for forecasting deliver high-quality time series prediction within the autoregression approach. We also show that the anomaly identification method based on the autoregression approach with additional features delivers the best quality, with F1-score = 0.75.
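The forecast-error approach (a) and its evaluation can be sketched in a few lines. The sketch assumes that a forecast is already available from some model and that a fixed error threshold is used; the threshold, the series, and the expert labels below are hypothetical and are not data from the study.

```python
def flag_by_forecast_error(observed, forecast, threshold):
    """Mark a point as anomalous when the absolute forecast error exceeds the threshold."""
    return [abs(o - f) > threshold for o, f in zip(observed, forecast)]


def f1_score(predicted, labeled):
    """F1-score of predicted anomaly flags against expert labels."""
    tp = sum(p and l for p, l in zip(predicted, labeled))
    fp = sum(p and not l for p, l in zip(predicted, labeled))
    fn = sum((not p) and l for p, l in zip(predicted, labeled))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical measurements, model forecast, and expert labels.
observed = [1.0, 1.1, 5.0, 1.0, 0.9, 1.2]
forecast = [1.0, 1.0, 1.0, 1.0, 1.0, 1.9]
expert = [False, False, True, False, True, False]

flags = flag_by_forecast_error(observed, forecast, threshold=0.5)
score = f1_score(flags, expert)
```

In this toy case one anomaly is found, one is missed, and one normal point is flagged, so precision and recall are both 0.5.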
42. Identifying cetacean mammals in high-resolution optical imagery using anomaly detection approach employing Machine Learning models
I.Khabutdinov (Shirshov Institute of Oceanology, RAS),
M.Krinitskiy, R.Belikov
Cetacean mammal populations, particularly dolphins, have recently experienced significant declines due to various artificial and natural factors. A crucial aspect of studying these populations is determining their numbers and assessing spatial distributions. In our study, we focus on monitoring dolphin populations in the Black Sea using high-resolution photographs taken from helicopters for counting purposes. Currently, expert analysts manually count dolphins in these images, which is a time-consuming process.
To address this issue, we propose the use of machine learning (ML) approaches, specifically anomaly detection using ML models. We examine a dataset collected during marine census expeditions of the Shirshov Institute of Oceanology of the Russian Academy of Sciences (IORAS) in the Black Sea from 2018 to 2019. The dataset consists of 3730 high-resolution photographs, with dolphins present in 205 images (5.5%). Each dolphin occupies approximately 0.01% of the image area (around 70 x 70 pixels), making its presence a rare event. Consequently, we treat dolphin identification as an anomaly detection task.
Our study compares classical and naive anomaly detection methods with reconstruction-based approaches that discriminate anomalies based on the magnitude of reconstruction errors. Within this latter approach, we utilize various artificial neural networks, such as Convolutional Autoencoders (CAE) and U-Net, for image reconstruction. Overall, our research aims to streamline the process of counting and monitoring dolphin populations in high-resolution imagery using advanced ML techniques.
41. Recognition of skin lesions by images
M.Ledovskikh (SPbSU, St.-Petersburg),
V. Gorikhovskii
The growing number of skin cancer patients has revealed the need for diagnostic expert systems that help detect lesions with high accuracy. The identification of dangerous diseases associated with skin lesions, especially malignant neoplasms, requires the identification of pigmented skin lesions. Image recognition techniques and computer classification capabilities can improve the accuracy of skin cancer detection. In this work, a program was developed for segmentation of affected and unaffected skin areas. The resulting masks were used to train machine learning models. Based on the results, expert analytical systems were proposed that will help doctors diagnose skin diseases and distinguish between cancerous and non-cancerous skin lesions.
44. Determination of singular points of the polymerase chain reaction fluorescence accumulation curve using unsupervised machine learning methods
A.Orekhov (SPbSU, St.-Petersburg),
M.A.Potekhina
The polymerase chain reaction (PCR) method is a cyclic process based on repeated selective copying of a specific DNA region by enzymes under artificial conditions (in vitro). The main molecular mechanism of PCR is amplification, the accumulation of copies of the selected nucleotide sequence. The efficiency of the polymerase chain reaction is characterized by the exponential section of the fluorescence accumulation curve (the PCR kinetic curve). This curve consists of a baseline, an exponential phase, and a plateau phase.
During the initial cycles, the instrument registers the background fluorescence level, and the plot rises slowly as a linear function. Then, as the product accumulates, the signal increases exponentially and at the final stage reaches a plateau. The accumulation curve has a sigmoidal shape. Differences in the initial number of DNA molecules affect the number of cycles required to raise the fluorescence level above the noise level. The plateau is reached because of the gradual depletion of the initial reaction components and the growing number of amplicons.
There are many mathematical models describing the fluorescence plot for real-time PCR. In some cases, what is of theoretical and practical interest is not a heuristic inference but a formal determination of the moments at which the fluorescence accumulation curve passes from linear growth to exponential growth and then reaches the plateau.
Unsupervised machine learning methods can be used to solve this problem. If amplification is treated as a quasi-deterministic discrete random process for which the fluorescence accumulation curves are monotonic trajectories, then the moments of transition from the baseline to the exponential phase and from the exponential phase to the plateau phase can be regarded as trajectory anomalies. These anomalies can be detected using quadratic forms of exponential and arctangent approximation-estimation criteria (Orekhov A.V. Quasi-Deterministic Processes with Monotonic Trajectories and Unsupervised Machine Learning. Mathematics 2021, 9, 2301. https://doi.org/10.3390/math9182301).
10. Application of machine learning methods to numerical simulation of hypersonic flow
S.Pavlov (SPbSU, St.-Petersburg),
V.Istomin
The present study is devoted to applying machine learning methods to the numerical simulation of hypersonic five-component air-mixture flow past a sphere. Exact methods of transport coefficient calculation, such as the Chapman-Enskog method, are computationally very expensive for numerical simulation. At the same time, approximate formulae for transport coefficients sometimes do not provide the necessary accuracy. For these reasons, different machine learning methods for the calculation of transport coefficients are considered. A comparison with experimental data is given. The results are analyzed in terms of calculation speed and computation accuracy.
24. A technique for the total ozone columns retrieval using spectral measurements of the IKFS-2 instrument.
A.Polyakov (SPbSU, St.-Petersburg),
Virolainen Y.A., Timofeev Yu.M., Nerobelov G.M., Kozlov D.A.
Atmospheric ozone plays an important role in the Earth's biosphere by absorbing dangerous solar UV radiation and by contributing to climate formation. Ozone variations are monitored by different local and remote sensing methods. However, only satellite methods can provide data on the global distribution of ozone and its anomalies. Unlike methods based on solar radiation measurements, methods that use measurements of thermal radiation can provide information regardless of solar illumination. For such measurements, Fourier spectrometers operating in the infrared spectral region (FTIR) are usually used. We present a technique for estimating the total ozone columns (TOCs) from outgoing IR radiation spectra based on artificial neural networks (ANNs). The technique was applied to the spectral measurements of the IKFS-2 instrument aboard the Russian meteorological satellite Meteor M N2 in 2015-2020, using data from the OMI instrument for the ANN training. A detailed validation of the TOC retrievals was carried out by comparison with ground-based measurements provided by the WOUDC and Eubrewnet networks and with satellite measurements of the OMI (Aura), TROPOMI (Sentinel 5p), and IASI (MetOp) instruments. The latitudinal-seasonal behavior of the differences between the results of different measurements was analyzed, which made it possible to estimate the corresponding dependences of the measurement errors. The TOC distribution for the period 2015 to 2020 is shown. The average differences between the IKFS-2 data and independent TOC measurements are less than 2%, and the standard deviations of the differences (SDDs) vary from 2 to 4%. At the same time, both the analysis of the ANN approximation errors of the OMI data and the comparison of the IKFS-2 results with independent data demonstrate an increase in discrepancies towards the poles. In the spring-winter period, SDDs reach up to 8% in the Southern and up to 6% in the Northern Hemisphere.
The technique developed made it possible to derive a global distribution of TOCs for the 2015-2020 period, regardless of solar illumination and presence of clouds. In addition, this technique can be used to process the spectral data of the IKFS-2 instruments on the Meteor M N2 series of satellites. The study was carried out in the “Ozone Layer and Upper Atmosphere Research” laboratory of St. Petersburg State University (agreement with the Ministry of Education and Science of the Russian Federation No. 075-15-2021-583).
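The validation statistics quoted above, the average difference and the standard deviation of the differences (SDD) in percent, can be computed as in the following sketch. It assumes that differences are taken relative to the reference measurement and that the sample standard deviation is used; the abstract does not state either convention, and the input values are hypothetical.

```python
import statistics


def relative_differences_percent(retrieved, reference):
    """Per-pair difference of retrieved and reference values, in percent of the reference."""
    return [100.0 * (r - ref) / ref for r, ref in zip(retrieved, reference)]


def mean_and_sdd(retrieved, reference):
    """Average difference and standard deviation of the differences, both in percent."""
    diffs = relative_differences_percent(retrieved, reference)
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical collocated total ozone columns (same units for both series).
retrieved = [306.0, 294.0, 303.0]
reference = [300.0, 300.0, 300.0]
mean_diff, sdd = mean_and_sdd(retrieved, reference)
```

Grouping the collocated pairs by latitude band and season before calling `mean_and_sdd` would give the kind of latitudinal-seasonal breakdown the abstract describes.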
24. Improving the accuracy of neural network estimation of significant wind wave height from shipborne navigation radar data through pre-training on synthetic data
V.Rezvov (Shirshov Institute of Oceanology RAS, Moscow),
M.Krinitskiy, N.Tilinina, V. Golikov
Shipborne navigation radars (SNRs) are an important tool for marine navigation. Usually, the signal reflected from the water surface and recorded by the SNR is filtered and used to identify obstacles and other vessels. However, the literature shows that the raw SNR signal characterizes the state of the sea surface, including wind wave parameters. There are approaches to estimating significant wind wave height from radar images of the sea that are based on the physical laws governing the intensity of the signal reflected from a rough surface. Recent studies also demonstrate that significant wave height can be approximated within a data science approach using machine learning models. Both approaches are demanding in terms of field data, which are collected in expensive marine expeditions or with wave monitoring systems. An alternative may be to generate synthetic radar data characteristic of given wind wave parameters. This approach makes it possible to simulate the backscatter intensity pattern of the radar signal for wind waves of any given height, which is not available for measurements in the open sea. Assuming steady-state wind waves, we generate such patterns in arbitrary quantity. We use these data to pre-train a neural network model for estimating significant wind wave height. We also apply an unsupervised learning approach, using the generated synthetic radar images to train the convolutional part of the neural network as the encoder of a neural network autoencoder.
In this study, we demonstrate how much the accuracy of estimating significant wind wave height from shipborne radar images improves when using an artificial neural network pre-trained on synthetic data that imitate the pattern of the radar signal reflected from a rough sea surface.
Decoding fluorescence excitation-emission matrices of carbon dots aqueous solutions with convolutional neural networks to create multimodal nanosensor of metal ions
Poster
O.Sarmanova (SINP MSU, Moscow),
G.N.Chugreeva, K.A.Laptinskiy, S.A. Burikov, S.A. Dolenko, T.A. Dolenko
This study addresses the creation of a multimodal fluorescent nanosensor based on carbon dots for simultaneous measurement of the concentrations of Cu2+, Ni2+, Cr3+, and NO3- ions in liquid media, as well as the pH of the medium. Applying convolutional neural networks to 1000 complex fluorescence excitation-emission matrices of carbon dots in the presence of the studied ions made it possible to determine the ions with mean absolute errors that satisfy the needs of monitoring the composition of waste and process media. After the representativeness of the dataset was further extended with variational autoencoders, the mean absolute error of ion determination decreased.
This study has been conducted at the expense of the grant of the Russian Science Foundation № 22-12-00138, https://rscf.ru/en/project/22-12-00138/
34. SMAP sea surface salinity improvement in the Arctic region using machine learning approaches
A.Savin (MIPT, Moscow region, Shirshov Institute of Oceanology RAS, Moscow)
M. Krinitskiy, A. Osadchiev
Sea salinity is one of the fundamental characteristics of physicochemical processes occurring in the ocean, and taking it into account plays a substantial role in the description of the climate. Modern sea surface salinity (SSS) retrieval algorithms using satellite data have been developed and verified with high accuracy for the most typical regions of the World Ocean but show significantly lower quality in the Arctic. In this study, the quality of standard algorithms is improved by using machine learning (ML) approaches. Different ML models are examined: from classical ones that use the vector features employed by the standard Soil Moisture Active Passive (SMAP) satellite salinity algorithms, to deep artificial neural networks that combine these vector features with two-dimensional fields taken from the ERA5 reanalysis. The models are validated using in-situ data collected during expeditions of the Shirshov Institute of Oceanology RAS in the Barents, Kara, Laptev, and East Siberian seas from 2015 to 2021. As a result of the study, the SMAP SSS standard product is improved in the examined region, and the most significant features for SSS retrieval are determined. The obtained results enable research of the Arctic region using improved sea surface salinity maps.
33. Determination of the charge of molecular fragments by machine learning methods
A.Shevchenko (Samara State Technical University, Samara),
A.Chuvakov
Machine learning is currently used to solve a variety of scientific and industrial problems, including the screening of new drugs, design of new materials, management of artificial intelligence, search for extraterrestrial civilizations and detection of scammers among bank customers. Our report is devoted to a narrower, but no less important task for chemists and cosmochemists: determining the charge of a molecular fragment from its various representations. We selected data, one of the most important components of machine learning, from information on the structure of monoligand coordination compounds in the Cambridge Structural Database [1], which contains more than one million entries. The separation of molecular fragments from the structure was carried out using the ToposPro software package [2], which implements a wide range of geometric and topological algorithms for solving such problems. As a result, we collected information on 18.3 thousand molecular fragments from 38.6 thousand crystal structures. Data labeling, that is, adding information about the charge of fragments to the database, turned out to be the most time-consuming operation. We determined the charge values from literature data and calculated them from chemical considerations, by balancing the charges in a particular structure. So far, charge values have been determined for 13.3 thousand ligands, on which we have tested predictive machine learning models. Coordination sequences, SMILES strings, names of chemical compounds and bit mask methods are considered as feature spaces, among which the space of coordination sequences has shown the best results so far.
The study is supported by a grant from the Russian Science Foundation no. 23-23-00387 https://rscf.ru/project/23-23-00387/.
[1] Allen F.H. Acta Cryst. Sect. A 58 2002, 380. https://www.ccdc.cam.ac.uk/solutions/csd-core/components/csd/
[2] Blatov V.A., et al. Cryst. Growth Des. 2014, 14, 7, 3576.
18. AUTOMATIC DETECTION OF ACOUSTIC SIGNALS OF BELUGA WHALES AND BOTTLENOSE DOLPHINS
A.Tyshko (Shirshov Institute of Oceanology, RAS, Moscow),
Shatravin A.V., Krinitskiy M.A.
Acoustic monitoring of marine mammals is one of the main methods for estimating animal population sizes in a given water area; it makes it possible to reveal characteristic migration patterns and to track how the presence of animals relates to external environmental conditions. The White Sea beluga whale and the Black Sea bottlenose dolphin belong to species characterized by active vocalization, so passive acoustic monitoring (PAM) is often used to detect them. The main problem of this method is the huge volume of material, which cannot be analyzed manually even by a trained human expert, so tools for automatic signal detection are in demand. In this study, we developed our own algorithm for detecting tonal signals of toothed whales. It is based on artificial neural networks, automatically extracts features from animal signals, and identifies with high accuracy the time intervals with the highest probability of presence of these signals in audio recordings converted to spectrograms. As baseline methods we used energy-based algorithms, against which we compared the neural network on the test set. The algorithms were evaluated against reference (expert) labeling using the quality measures Precision, Recall, area under the ROC curve (ROC-AUC), and F1-score. Experiments on the test set showed that the artificial neural network presented in this study is markedly superior in quality. The baseline algorithms were also run on recordings several days long that contain signals of White Sea beluga whales and were made by the staff of the marine mammal laboratory; this gave an idea of the time intervals in which acoustic activity of the animals was observed.
Applying the algorithms saved the staff the time needed to process these recordings manually and made it possible to establish the presence of animals in the water area where the acoustic monitoring was carried out. In addition, our method can be embedded in acoustic monitoring devices for marine mammals for preliminary detection of animals. Further research will be devoted to testing the robustness of the neural network to the level and type of noise present in the audio recordings.
37. The role of artificial intelligence in the preparation of modern scientific and pedagogical staff. The experience of the course "Neural networks and their application in scientific research" of Moscow State University named after M. V. Lomonosov.
A.Vasiliev (MSU, AI, Moscow),
D. Vasina, S. Zapunidi, A. Ivchenko, L. Antyufrieva, V. Nemchenko, A. Ganichev, S. Kolpinsky, D. Mitina, A. Tatarintseva, A. Zadorozhny, A. Zolotareva
Today, there is a lot of talk about artificial intelligence and its great importance both in industry and in science.
The demand for specialists who know how to use the capabilities of intelligent systems in their work is invariably high both in scientific laboratories and in the research departments of commercial companies.
Therefore, one of the tasks of modern education is to train such specialists while they are still at university, so that alongside their core education students also learn to work with artificial intelligence and to apply its capabilities directly to their own scientific or applied problems.
The report is devoted to the experience of running a practical course on machine learning hosted by the Faculty of Physics of Moscow State University.
The course “Neural networks and their application in scientific research” shows that it is feasible to launch an AI program and begin training highly qualified personnel at any higher educational institution.
Our course materials are open source, which makes them easy to reuse, adapt, and iterate on.
The authors hope that the experience of training modern scientific and pedagogical personnel accumulated in this course will be of interest to a professional audience and will significantly speed up the development of machine learning courses at interested educational institutions.
31. Machine learning for diagnostics of space weather effects in the Arctic region
A.Vorobev (Geophysical Center RAS)
Despite the variety of existing approaches to monitoring space weather and geophysical parameters in the auroral oval region, the problem of effective prediction and diagnostics of auroras as a special state of the upper ionosphere at high latitudes remains practically open. Another significant problem is the diagnostics of geomagnetically induced currents (GICs) in extended grounded technological systems, which are driven by telluric electric fields induced by rapid changes of the geomagnetic field. The paper investigates the possibility of local diagnostics of aurora presence based on intelligent analysis of geomagnetic data from ground-based sources. The significance of the indicative variables and their statistical relationships are assessed. For example, applying Bayesian inference to the data of the Lovozero geophysical station for 2012–2020 showed that the dependence of the a posteriori probability of observing auroras in the optical range on the state of geomagnetic parameters is logarithmic, and the degree of significance is inversely proportional to the discrepancy between the empirical data and the approximating function. The accuracy of the random-forest-based approach to diagnosing aurora presence is at least 86% when using several local predictors and ~80% when using several global indices of geomagnetic activity characterizing the disturbance of the geomagnetic field in the auroral zone. Promising ways to improve the quality metrics of the diagnostic models are considered, and the areas of their possible application are discussed. The paper also examines an approach to the diagnostics of GICs in power transmission lines in northwestern Russia based on data from IMAGE magnetometers.
Based on statistical and correlation analysis of the target variable (the GIC level recorded at the Vykhodnoy transformer station) and geomagnetic data recorded by the nearby IMAGE magnetometers, the features that best characterize the target variable in the given region are identified. Using machine learning (ML) methods, the selected set of features is used to derive a relationship for GIC diagnostics. Evaluation of the coefficient of determination for a range of ML methods showed that the regression approach and artificial neural networks (ANNs) are the best solutions for the problem under consideration. Verification tests have shown that the ANN-based approach and regression methods provide nearly the same diagnostic accuracy for GIC (mean square error of 0.12 A²). However, ANN-based methods are less interpretable and require more computing resources.
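For readers unfamiliar with the two quality measures quoted in this abstract, the sketch below fits a one-feature least-squares regression and computes the mean square error and the coefficient of determination. It is an illustration under stated assumptions, not the model from the paper: the single predictor, the numbers, and all function names are invented for the example and are not station data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_square_error(y_true, y_pred):
    """Average squared residual; its unit is the squared unit of y (here A^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic values (assumption): a geomagnetic variability feature and the
# corresponding GIC level in amperes. These are NOT measured data.
feature = [0.0, 1.0, 2.0, 3.0, 4.0]
gic = [0.1, 0.9, 2.1, 2.9, 4.0]

slope, intercept = fit_line(feature, gic)
predicted_gic = [slope * f + intercept for f in feature]
mse = mean_square_error(gic, predicted_gic)
r2 = r_squared(gic, predicted_gic)
print(slope, intercept, mse, r2)
```

A real comparison of this kind would compute both measures on held-out data rather than on the points used for fitting, so that the regression and ANN models are judged on data they have not seen.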
30. Search for Meteors in the Mini-EUSO Orbital Telescope Data with Neural Networks
M.Zotov (SINP MSU, Moscow)
Since autumn 2019, the Mini-EUSO ("UV Atmosfera") experiment, developed by the JEM-EUSO collaboration, has been running on board the International Space Station. Mini-EUSO is a wide-field-of-view telescope equipped with two Fresnel lenses and a focal surface of 2304 pixels with a spatial resolution of ~6.3 km × 6.3 km. The instrument operates in the UV range, observing the nocturnal atmosphere of the Earth in the nadir direction. During operation, the telescope registers a variety of phenomena that manifest themselves in the UV, among them transient luminous events, anthropogenic illumination, bioluminescence, space debris, and meteors. Here we discuss how one can effectively find meteor tracks in the Mini-EUSO data using a pipeline built of two simple neural networks.