dlcp21:abstracts
Differences
This shows you the differences between two versions of the page.
dlcp21:abstracts [22/06/2021 00:03] – [Identifying partial differential equations of land surface schemes in INM climate models with neural networks] admin | dlcp21:abstracts [22/06/2021 16:14] – [Performance of convolutional neural networks processing simulated IACT images in the TAIGA experiment] admin | ||
---|---|---|---|
Line 114: | Line 114: | ||
The problem of forecasting the evolution of the Covid-19 pandemic is extremely relevant because of the need to plan hospital bed demand and containment policies. Given the limited temporal datasets available on the evolution of Covid-19, machine learning algorithms for pandemic time series data must not be overly complex, and their efficiency mainly depends on the correct selection of highly meaningful features. We view as one such feature the number of tweets in which Internet users report having Covid-19. Extracting such tweets is complicated by the lack of a Russian tweet dataset for training machine learning models to identify this category of tweets. In this work we present a corpus of about 10 000 tweets labelled with the following classes: the Ill class, when the authors declare that they are ill; the Recovered class, when the authors declare that they have been ill and have recovered; and the Others class for all other cases. Using this corpus, a data-driven model based on the XLM-RoBERTa language model has been trained. It achieves a macro-averaged F1 score of 0.85 on the binary task (class 1 – Ill / Recovered, class 2 – Others) and 0.60 on the five-class task (Ill with high/low confidence, Recovered with high/low confidence, Others). These results outperform the RuDR-BERT model by 5 to 7%. The XLM-RoBERTa model thus obtained has been applied to binary classification of 486 000 unlabeled tweets. Based on the model’s predictions, | The problem of forecasting the evolution of the Covid-19 pandemic is extremely relevant because of the need to plan hospital bed demand and containment policies. Given the limited temporal datasets available on the evolution of Covid-19, machine learning algorithms for pandemic time series data must not be overly complex, and their efficiency mainly depends on the correct selection of highly meaningful features.
We view as one such feature the number of tweets in which Internet users report having Covid-19. Extracting such tweets is complicated by the lack of a Russian tweet dataset for training machine learning models to identify this category of tweets. In this work we present a corpus of about 10 000 tweets labelled with the following classes: the Ill class, when the authors declare that they are ill; the Recovered class, when the authors declare that they have been ill and have recovered; and the Others class for all other cases. Using this corpus, a data-driven model based on the XLM-RoBERTa language model has been trained. It achieves a macro-averaged F1 score of 0.85 on the binary task (class 1 – Ill / Recovered, class 2 – Others) and 0.60 on the five-class task (Ill with high/low confidence, Recovered with high/low confidence, Others). These results outperform the RuDR-BERT model by 5 to 7%. The XLM-RoBERTa model thus obtained has been applied to binary classification of 486 000 unlabeled tweets. Based on the model’s predictions, | ||
+ | |||
+ | ===== Performance of convolutional neural networks processing simulated IACT images in the TAIGA experiment ===== | ||
+ | |||
+ | **Stanislav Polyakov**, SINP MSU, Russia \\ | ||
+ | Alexander Kryukov, | ||
+ | Evgeny Postnikov, SINP MSU, Russia
+ | |||
+ | //Short presentation (15 min)// | ||
+ | |||
+ | Extensive air showers created by high-energy particles interacting | ||
+ | with the Earth's atmosphere can be detected using imaging atmospheric
+ | Cherenkov telescopes (IACTs). The IACT images can be analyzed to | ||
+ | distinguish between the events caused by gamma rays and by hadrons and | ||
+ | to infer the parameters of the event such as the energy of the primary | ||
+ | particle. We use convolutional neural networks (CNNs) to analyze Monte | ||
+ | Carlo-simulated images of the telescopes of the TAIGA experiment. The | ||
+ | analysis includes selection of the images corresponding to the showers | ||
+ | caused by gamma rays and estimation of the energy of the gamma rays. We
+ | compare the performance of CNNs that take images from a single telescope
+ | as input with that of CNNs that take images from two telescopes. \\ | ||
+ | //Keywords: deep learning; convolutional neural networks; gamma astronomy; | ||
+ | extensive air shower; IACT; stereoscopic mode; TAIGA// | ||
===== EVALUATION OF MACHINE LEARNING METHODS FOR RELATION EXTRACTION BETWEEN DRUG ADVERSE EFFECTS AND MEDICATIONS IN RUSSIAN TEXTS OF INTERNET USER REVIEWS ===== | ===== EVALUATION OF MACHINE LEARNING METHODS FOR RELATION EXTRACTION BETWEEN DRUG ADVERSE EFFECTS AND MEDICATIONS IN RUSSIAN TEXTS OF INTERNET USER REVIEWS ===== | ||
dlcp21/abstracts.txt · Last modified: 22/06/2021 19:54 by admin