dlcp21:abstracts

Differences

This shows you the differences between two versions of the page.

dlcp21:abstracts [21/06/2021 23:26] – [Legacy of Tunka-Rex software and data.] admin
dlcp21:abstracts [22/06/2021 19:54] (current) – [Equivariant Gaussian Processes as Limiting Convolutional Networks with Infinite Number of Channels] admin
Line 6: Line 6:
  
  
-**N.V. Abasov, Melentiev Energy Systems Institute SB RAS** \\
-E.N. Osipchuk, Melentiev Energy Systems Institute SB RAS \\
-V.M. Berdnikov, Melentiev Energy Systems Institute SB RAS
+**N.V. Abasov, Melentiev Energy Systems Institute SB RAS, Russia** \\
+E.N. Osipchuk, Melentiev Energy Systems Institute SB RAS, Russia \\
+V.M. Berdnikov, Melentiev Energy Systems Institute SB RAS, Russia
  
 //Short presentation (15 min)//
Line 25: Line 25:
 ===== Equivariant Gaussian Processes as Limiting Convolutional Networks with Infinite Number of Channels =====
  
-**A.Demichev, SINP MSU**
+**A.Demichev, SINP MSU, Russia**
  
 //Short presentation (15 min)//
Line 31: Line 31:
 This work concerns the relationships between various methods of machine learning (ML). The ultimate goal of establishing such relationships is a better theoretical understanding of these methods and their improvement. In particular, a correspondence has recently been established between the appropriate asymptotics of deep neural networks (DNNs), including convolutional ones (CNNs), and the ML method based on Gaussian processes. Since Gaussian processes are mathematically equivalent to free (Euclidean) quantum field theory (QFT), one intriguing consequence of these relationships is the potential for using a broad range of QFT methods to analyze DNNs. There is evidence (including experimental evidence) that non-asymptotic (that is, implementable in practice) DNNs correspond to QFT with interactions. An important feature of convolutional networks is their equivariance (consistency) with respect to symmetry transformations of the input data. In this work, we establish a relationship between the many-channel limit of equivariant CNNs and the corresponding equivariant Gaussian process (GP), and hence the QFT with the appropriate symmetry. The approach used provides explicit equivariance at each stage of the derivation of the relationship.
  
 +
 +===== Modeling images of proton events for the TAIGA project using a generative adversarial network: features of the network architecture and the learning process =====
 +
 +**Ju.Dubenskaya**, SINP MSU, Russia \\ 
 +A.P.Kryukov,  SINP MSU, Russia
 + 
 +//Short presentation (15 min)//
 +
 +High-energy particles interacting with the Earth's atmosphere give rise to extensive air showers that emit Cherenkov light. This light can be detected on the ground by imaging atmospheric Cherenkov telescopes (IACTs). One of the main problems in the primary processing of experimental data is separating signal events (gamma quanta) from the hadronic background, the bulk of which is made up of proton events. To ensure correct gamma/proton event separation under real conditions, a large amount of experimental data, including model data, is required. Thus, although proton events are considered background, their images are also necessary for accurate registration of gamma quanta. We applied a machine learning method, generative adversarial networks (GANs), to generate images of proton events for the TAIGA project. This approach allowed us to significantly increase the speed of image generation. At the same time, testing the results with third-party software showed that over 90% of the generated images were correct. In this article we provide an example of a GAN architecture suitable for generating images of proton events similar to those obtained from the IACTs of the TAIGA project. The features of the training process are also discussed, including the number of training epochs and the selection of appropriate network parameters.
 ===== Use of conditional generative adversarial networks to improve representativity of data in optical spectroscopy =====
  
Line 55: Line 64:
  
  
-**Elizaveta Gres, ISU, Irkutsk** \\
-Alexander Kryukov, SINP MSU, Moscow
+**Elizaveta Gres, ISU, Irkutsk, Russia** \\
+Alexander Kryukov, SINP MSU, Moscow, Russia
  
 //Short presentation (15 min)//
Line 86: Line 95:
 ===== Identifying partial differential equations of land surface schemes in INM climate models with neural networks =====
  
-**Mikhail Krinitskiy, Shirshov Institute of Oceanology, RAS** \\
-Viktor Stepanenko, Research Computing Center, MSU \\
-Ruslan Chernyshev, Research Computing Center, MSU
+**Mikhail Krinitskiy, Shirshov Institute of Oceanology, RAS, Russia** \\
+Viktor Stepanenko, Research Computing Center, MSU, Russia \\
+Ruslan Chernyshev, Research Computing Center, MSU, Russia
  
-//Short presentation (30 min)//
+//Long presentation (30 min)//
  
 The core of a land surface scheme in climate models is a solver for a nonlinear PDE system describing thermal conductance and water diffusion in soil. This system includes thermal conductivity and water diffusivity coefficients that are functions of the solution of the system, i.e., water vapor content W and soil temperature T. For climate models to represent the Earth system's evolution accurately, one needs to identify the equations, meaning either approximating the coefficients or estimating their values empirically. Measuring the coefficients is a complicated laboratory experiment that cannot cover the full range of environmental conditions. The large number of soil types obstructs comprehensive studies as well. There are also known approximate parametric forms of the coefficients, but they lack accuracy and, in turn, need to be identified w.r.t. their own parameters. In this work, we propose a data-driven approach for approximating the parameters of the PDE system describing the evolution of soil characteristics. We formulate the coefficients as parametric functions, namely artificial neural networks with expressive power high enough to represent a wide range of nonlinear functions. In a routine supervised data-driven problem, one needs to present ground truth for a target value. In the case of a soil PDE system, one cannot afford measurements of the full range of the coefficients' ground truth values. In contrast with the supervised approach, we propose training the neural networks with a loss function computed as the discrepancy between the PDE system solution and the measured characteristics W and T. We also propose a scheme inherited from the backpropagation method for calculating the gradients of the loss function w.r.t. the network parameters. In contrast with recently developed physics-informed neural network (PINN) methods, our approach is not meant to approximate a PDE solution directly. Instead, we state an inverse problem and propose its solution using artificial neural networks and a method for its optimization. As a very first step, we assessed the capabilities of our approach in four scenarios: a nonlinear thermal diffusion equation, a nonlinear water vapor W diffusion equation, the Richards equation, and the system of the thermal conductance equation and the Richards equation. In all these scenarios, we generated realistic initial conditions and simulated synthetic evolutions of W and T that we used as measurements in the networks' training procedure. We exploited batch normalization, a learning rate schedule, and scheduling of the additive noise rate to improve convergence. We also added a few regularization terms to the loss function that penalize negative output values, non-zero output values at the origin, and negative gradients of the network w.r.t. its input. The results of our study show that our approach provides an opportunity for reconstructing PDE coefficients of different forms accurately without actual knowledge of their ground truth values.
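
The loss described in this abstract (a data term comparing the PDE solution with measurements, plus three regularization penalties) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the coefficient is a two-parameter linear function `coeff` standing in for a neural network, the solver `simulate` is a plain explicit finite-difference scheme for a 1D nonlinear diffusion equation, and the names, grid sizes and penalty weight are hypothetical. No gradient computation or training is shown.

```python
# Illustrative sketch only: a linear stand-in replaces the neural network,
# and the solver, grid and penalty weight are assumptions, not the authors' setup.

def coeff(T, params):
    """Hypothetical parametric coefficient k(T) = a + b*T (stand-in for a network)."""
    a, b = params
    return a + b * T

def simulate(T0, params, dx=0.1, dt=0.005, steps=50):
    """Explicit scheme for dT/dt = d/dx( k(T) dT/dx ) with fixed boundary values."""
    T = list(T0)
    for _ in range(steps):
        k = [coeff(v, params) for v in T]
        new = T[:]
        for i in range(1, len(T) - 1):
            k_right = 0.5 * (k[i] + k[i + 1])   # coefficient at the right cell face
            k_left = 0.5 * (k[i - 1] + k[i])    # coefficient at the left cell face
            flux_right = k_right * (T[i + 1] - T[i]) / dx
            flux_left = k_left * (T[i] - T[i - 1]) / dx
            new[i] = T[i] + dt * (flux_right - flux_left) / dx
        T = new
    return T

def loss(params, T0, measured, samples, weight=1.0, h=1e-3):
    """Data discrepancy plus the three penalties named in the abstract."""
    sim = simulate(T0, params)
    data = sum((s - m) ** 2 for s, m in zip(sim, measured)) / len(measured)
    # Penalty 1: negative coefficient values on sample inputs.
    negative = sum(max(0.0, -coeff(v, params)) ** 2 for v in samples)
    # Penalty 2: non-zero coefficient value at the origin.
    origin = coeff(0.0, params) ** 2
    # Penalty 3: negative slope of the coefficient w.r.t. its input (finite difference).
    slope = sum(
        max(0.0, -(coeff(v + h, params) - coeff(v, params)) / h) ** 2
        for v in samples
    )
    return data + weight * (negative + origin + slope)

# Synthetic "measurements" generated with known parameters, as in a twin experiment.
true_params = (0.0, 0.5)
T0 = [0.2 + 0.08 * i for i in range(11)]
measured = simulate(T0, true_params)
samples = [0.1 * i for i in range(11)]

loss_true = loss(true_params, T0, measured, samples)
loss_wrong = loss((0.3, -0.2), T0, measured, samples)
print(loss_true, loss_wrong)
```

In this sketch the loss vanishes at the parameters that generated the synthetic data and is positive for the second parameter set, which violates the origin and slope constraints. An optimizer would search for parameters minimizing this quantity without ever seeing the coefficient's true values.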
Line 106: Line 115:
  
  
-Alexander Sboev, National Research Centre "Kurchatov Institute" \\
-Ivan Moloshnikov, National Research Centre "Kurchatov Institute" \\
-**Alexander Naumov, National Research Centre "Kurchatov Institute"** \\
-Anastasia Levochkina, National Research Centre "Kurchatov Institute"
+Alexander Sboev, National Research Centre "Kurchatov Institute", Russia \\
+Ivan Moloshnikov, National Research Centre "Kurchatov Institute", Russia \\
+**Alexander Naumov, National Research Centre "Kurchatov Institute", Russia** \\
+Anastasia Levochkina, National Research Centre "Kurchatov Institute", Russia
  
 //Short presentation (15 min)//
  
 The problem of forecasting the evolution of the Covid-19 pandemic is extremely relevant because of the need to plan hospital bed demand and containment policies. Given the limited temporal datasets available on the evolution of Covid-19, machine learning algorithms for pandemic time series data must not be highly complex, and their efficiency mainly depends on the correct selection of highly meaningful features. We view as one such feature the number of tweets in which Internet users report having Covid-19. Extracting such tweets is complicated by the lack of a Russian tweet dataset for training machine learning models to extract this category of tweets. In this work we present a corpus of about 10000 tweets labelled with the following classes: the Ill class, when the authors declare that they are ill; the Recovered class, when the authors declare that they have been ill and have recovered; and the Others class for all other cases. Using this corpus, a data-driven model based on the XLM-RoBERTa language model has been trained. It demonstrates a macro-averaged F1 score of 0.85 for the binary task (class 1: Ill / Recovered, class 2: Others) and 0.60 for the five-class task (Ill with high/low confidence, Recovered with high/low confidence, Others). These results outperform the RuDR-BERT model by 5 to 7%. The XLM-RoBERTa model thus obtained has been applied to the binary classification task on unlabeled data of 486 000 tweets. Based on the model's predictions, a curve of COVID cases in Moscow from January 1, 2020 to March 1, 2021 has been plotted. This curve has been compared to the official statistics on confirmed cases, and a correlation analysis of the two curves, shifted relative to one another by 1 to 5 days, has been performed. This analysis shows that the correlation is highest when the true curve is shifted 4 days into the future relative to the predicted one. Therefore, the data from the collected corpus are 4 days ahead of the official statistics. Thus, the number of collected tweets has proven to be a helpful input feature for pandemic forecasting models.
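
The shifted correlation analysis described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not name the correlation measure, so Pearson correlation is assumed, the series are synthetic, and the function names `pearson` and `best_lag` are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def best_lag(predicted, official, max_lag=5):
    """Correlate predicted[t] with official[t + lag] for lag = 1..max_lag.

    Returns the lag with the highest correlation and all scores.
    """
    scores = {}
    for lag in range(1, max_lag + 1):
        scores[lag] = pearson(predicted[:-lag], official[lag:])
    return max(scores, key=scores.get), scores

# Synthetic daily counts (assumption): the "official" curve repeats the
# tweet-based curve with a delay of four days.
predicted = [(13 * t * t + 7 * t) % 29 for t in range(60)]
official = [0] * 4 + predicted[:-4]

best, scores = best_lag(predicted, official)
print(best, scores)
```

With real data the two series would be the daily count of tweets classified as Ill / Recovered and the official daily count of confirmed cases; the lag with the highest score estimates how far the tweet signal runs ahead of the statistics.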
 +
 +===== Performance of convolutional neural networks processing simulated IACT images in the TAIGA experiment =====
 +
 +**Stanislav Polyakov**, SINP MSU, Russia \\ 
 +Alexander Kryukov, SINP MSU, Russia \\ 
 +Evgeny Postnikov, SINP MSU, Russia
 +
 +//Short presentation (15 min)//
 +
 +Extensive air showers created by high-energy particles interacting
 +with the Earth's atmosphere can be detected using imaging atmospheric
 +Cherenkov telescopes (IACTs). The IACT images can be analyzed to
 +distinguish between events caused by gamma rays and by hadrons and
 +to infer parameters of the event such as the energy of the primary
 +particle. We use convolutional neural networks (CNNs) to analyze Monte
 +Carlo-simulated images from the telescopes of the TAIGA experiment. The
 +analysis includes selecting the images that correspond to showers
 +caused by gamma rays and estimating the energy of the gamma rays. We
 +compare the performance of CNNs that use images from a single telescope
 +with that of CNNs that use images from two telescopes as inputs. \\ 
 +//Keywords: deep learning; convolutional neural networks; gamma astronomy;
 +extensive air shower; IACT; stereoscopic mode; TAIGA//
 ===== EVALUATION OF MACHINE LEARNING METHODS FOR RELATION EXTRACTION BETWEEN DRUG ADVERSE EFFECTS AND MEDICATIONS IN RUSSIAN TEXTS OF INTERNET USER REVIEWS =====
  
-A.G. Sboev, NRC «Kurchatov Institute», NRNU MEPHI \\
-**A.A. Selivanov, NRC «Kurchatov Institute»** \\
-R.B. Rybka, NRC «Kurchatov Institute»I.A. Moloshnikov NRC «Kurchatov Institute»
+A.G. Sboev, NRC «Kurchatov Institute», NRNU MEPHI, Russia \\
+**A.A. Selivanov, NRC «Kurchatov Institute», Russia** \\
+R.B. Rybka, NRC «Kurchatov Institute», Russia \\
+I.A. Moloshnikov NRC «Kurchatov Institute», Russia
  
 //Short presentation (15 min)//
Line 126: Line 158:
 ===== Using modern machine learning methods on KASCADE data for science and education =====
  
-**Victoria Tokareva, IAP KIT**
+**Victoria Tokareva, IAP KIT, Germany**
  
 //Short presentation (15 min)//
Line 135: Line 167:
 ===== Gamma/hadron separation for a ground based IACT (imaging atmospheric Cherenkov telescope) in experiment TAIGA using machine learning methods Random Forest =====
  
-**Vasyutina M.R., Moscow State University. Physical Department** \\
-Sveshnikova L.G., Moscow State University. Skobeltsyn Institute of Nuclear Research.
+**Vasyutina M.R., Moscow State University. Physical Department, Russia** \\
+Sveshnikova L.G., SINP MSU, Russia.
  
 //Short presentation (15 min)//
Line 144: Line 176:
 ===== Using convolutional neural network for analysis of HiSCORE events =====
  
-**Vlaskina Anna, Physics Department, MSU** \\
-A. Kryukov, SINP MSU
+**Vlaskina Anna, Physics Department, MSU, Russia** \\
+A. Kryukov, SINP MSU, Russia
  
 //Short presentation (15 min)//
dlcp21/abstracts.1624307179.txt.gz · Last modified: 21/06/2021 23:26 by admin