The Kyoto Information and Society Seminar (KISS) hosts talks by researchers working in a wide range of fields, from informatics to the mathematical sciences and the humanities, as well as by people active in the social application and deployment of information technology. The seminar is held in English, roughly once or twice a month. Anyone interested is welcome to attend. If you would like to give a talk, please contact takeuchi@i.kyoto-u.ac.jp.
[English page]
- Organizers
- 松下 旦 (Kyoto University, University of Tokyo Market Design Center)
- 新 恭兵 (Kyoto University)
- 包 含 (Kyoto University)
- 竹内 孝 (Kyoto University, RIKEN AIP)
Current schedule
- [KISS-004]
- Title: A tutorial on metaheuristics for combinatorial optimization problems
- Presenter: 梅谷 俊治 (Recruit) [web]
- Date: 12/17 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
We often encounter computationally hard (a.k.a. NP-hard) combinatorial optimization problems in a wide range of industrial applications. A standard approach is to formulate the real-world problem as a mixed integer programming (MIP) problem and then solve it with one of the state-of-the-art MIP solvers. Continuous development of MIP technology, accompanied by advances in computing machinery, has greatly improved the performance of MIP solvers. However, many real-world problems still remain unsolved due to a large gap between the lower and upper bounds of the optimal values. For such hard problems, we often consider an alternative approach based on heuristic algorithms that give us feasible solutions which are not necessarily optimal but good enough in practice.
Metaheuristics can be regarded as a collection of ideas for designing heuristic algorithms for combinatorial optimization problems. These ideas give us a systematic view by incorporating them into basic strategies such as greedy and local search algorithms. In this tutorial, we first introduce how to design efficient local search algorithms along with their ingredients. We then introduce their extensions, called “metaheuristics”, covering strategies such as iterated local search (ILS), simulated annealing (SA), genetic algorithms (GA), guided local search (GLS), and so on.
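As a minimal, hedged illustration of the kind of algorithm the tutorial discusses (this sketch is not from the talk; the toy Euclidean TSP instance, the 2-opt neighbourhood, and the geometric cooling schedule are assumptions made only for the example), the following Python code runs simulated annealing on a random instance:

```python
# Illustrative only: simulated annealing over a 2-opt neighbourhood for a
# random Euclidean TSP instance (problem, neighbourhood, and cooling schedule
# are assumptions for this sketch, not material from the talk).
import math
import random

def tour_length(tour, dist):
    """Total length of the closed tour under the distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def simulated_annealing(dist, n_iters=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Return the best tour found by simulated annealing with 2-opt moves."""
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    rng.shuffle(tour)
    cur = tour_length(tour, dist)
    best, best_len = tour[:], cur
    for it in range(n_iters):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (it / n_iters)
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # 2-opt: reverse one segment
        cand_len = tour_length(cand, dist)
        # Always accept improvements; accept worsening moves with Boltzmann probability.
        if cand_len < cur or rng.random() < math.exp((cur - cand_len) / t):
            tour, cur = cand, cand_len
            if cur < best_len:
                best, best_len = tour[:], cur
    return best, best_len

points = [(random.random(), random.random()) for _ in range(30)]
dist = [[math.dist(p, q) for q in points] for p in points]
_, length = simulated_annealing(dist)
print(f"best tour length found: {length:.3f}")
```

Other strategies mentioned in the abstract, such as ILS or GA, would fit the same skeleton by changing the neighbourhood, the acceptance rule, or how new starting solutions are generated.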
Past schedule
- [KISS-003]
- Title: How Transformers Learn Causal Structure with Gradient Descent
- Presenter: Jason D. Lee (Associate Professor, Princeton University) [web]
- Date: 12/5 13:30-15:00
- Location: Lecture room 2, Research Bldg. No. 7 (101, 1st floor)
- Abstract:
The incredible success of transformers on sequence modeling tasks can be largely attributed to the self-attention mechanism, which allows information to be transferred between different parts of a sequence. Self-attention allows transformers to encode causal structure, which makes them particularly suitable for sequence modeling. However, the process by which transformers learn such causal structure via gradient-based training algorithms remains poorly understood. To better understand this process, we introduce an in-context learning task that requires learning latent causal structure. We prove that gradient descent on a simplified two-layer transformer learns to solve this task by encoding the latent causal graph in the first attention layer. The key insight of our proof is that the gradient of the attention matrix encodes the mutual information between tokens. As a consequence of the data processing inequality, the largest entries of this gradient correspond to edges in the latent causal graph. As a special case, when the sequences are generated from in-context Markov chains, we prove that transformers learn an induction head (Olsson et al., 2022). We confirm our theoretical findings by showing that transformers trained on our in-context learning task are able to recover a wide variety of causal structures.
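As a small, hedged illustration of the in-context Markov chain setting mentioned in the abstract (the vocabulary size, sequence length, and Dirichlet prior are assumptions of this sketch, not the paper's configuration), each sequence below is drawn from its own freshly sampled transition matrix, so the chain must be inferred from the context rather than memorised:

```python
# Illustration of in-context Markov chain data; hyperparameters here are
# assumptions of this sketch, not the paper's exact configuration.
import numpy as np

def sample_in_context_markov(n_seqs=4, seq_len=32, vocab=5, seed=0):
    """Each sequence uses its own random transition matrix P (one row per token)."""
    rng = np.random.default_rng(seed)
    seqs = []
    for _ in range(n_seqs):
        P = rng.dirichlet(np.ones(vocab), size=vocab)  # rows are categorical distributions
        x = [int(rng.integers(vocab))]
        for _ in range(seq_len - 1):
            x.append(int(rng.choice(vocab, p=P[x[-1]])))
        seqs.append(x)
    return seqs

for s in sample_in_context_markov():
    print(s)
```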
- [KISS-002]
- Title: Best Arm Identification: Fixed Confidence and Fixed Budget Settings
- Presenter: 小宮山 純平 (NYU) [web]
- Date: 10/29 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
We consider the best arm identification problem, where the goal is to find the arm with the largest mean. In this problem, there are two popular settings: the fixed-confidence setting, where the desired confidence level is given, and the fixed-budget setting, where the sample size is predetermined. We introduce the basic ideas of this problem and discuss how differences in the problem setting affect algorithmic design. If time permits, I will also introduce other recent works by the speaker.
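As a hedged, textbook-style sketch of the fixed-confidence setting (this is the classical successive elimination baseline with Hoeffding confidence intervals, not necessarily any algorithm from the talk; the arm means and confidence level are illustrative):

```python
# Textbook-style successive elimination for fixed-confidence best arm
# identification; not the speaker's algorithms, just a baseline sketch.
import math
import random

def successive_elimination(arms, delta=0.05):
    """`arms` is a list of no-argument callables returning rewards in [0, 1].
    Returns the index of the arm believed best with probability >= 1 - delta."""
    active = list(range(len(arms)))
    sums = [0.0] * len(arms)
    pulls = [0] * len(arms)
    t = 0
    while len(active) > 1:
        t += 1
        for a in active:  # pull every surviving arm once per round
            sums[a] += arms[a]()
            pulls[a] += 1
        # Anytime Hoeffding radius; the union bound over arms and rounds is
        # absorbed into the log term (a standard, slightly conservative choice).
        radius = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {a: sums[a] / pulls[a] for a in active}
        best_mean = max(means.values())
        active = [a for a in active if means[a] + radius >= best_mean - radius]
    return active[0]

random.seed(0)
true_means = [0.3, 0.5, 0.7, 0.65]
arms = [lambda p=p: 1.0 if random.random() < p else 0.0 for p in true_means]
print("identified best arm:", successive_elimination(arms))  # expect index 2
```

In the fixed-budget setting the stopping rule disappears: the total number of pulls is given in advance, and the algorithm instead decides how to allocate that budget (e.g., successive rejects) before recommending an arm.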
- [KISS-001]
- Title: Outlier-Robust Neural Network Training: Efficient Optimization of Transformed Trimmed Loss with Variation Regularization
- Presenter: 奥野 彰文 (Assistant Professor, The Institute of Statistical Mathematics; also Assistant Professor, SOKENDAI / Visiting Researcher, RIKEN AIP) [web]
- Date: 10/8 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
In this study, we consider outlier-robust predictive modeling using highly expressive neural networks. To this end, we employ (1) a transformed trimmed loss (TTL), which is a computationally feasible variant of the classical trimmed loss, and (2) a higher-order variation regularization (HOVR) of the prediction model. Note that training the neural network with TTL alone may still be vulnerable to outliers, as its high expressive power allows it to overfit even the outliers perfectly. Simultaneously introducing HOVR, however, constrains the effective degrees of freedom and thereby avoids fitting the outliers. We also provide a new, efficient stochastic gradient supergradient descent (SGSD) algorithm for optimization, together with a theoretical convergence guarantee. (This is joint work with Shotaro Yagishita (ISM).)
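As a hedged sketch of the classical trimmed loss that the abstract builds on (this is not the paper's transformed trimmed loss; HOVR and the SGSD algorithm are omitted, and the toy data, network, and keep ratio are illustrative assumptions), each training step below keeps only the h smallest squared residuals, so gross outliers do not contribute to the gradient:

```python
# Classical trimmed squared loss (not the paper's TTL); HOVR and SGSD omitted.
import torch

def trimmed_mse(pred, target, keep_ratio=0.8):
    residuals = (pred - target.view_as(pred)) ** 2
    h = max(1, int(keep_ratio * residuals.numel()))
    kept, _ = torch.topk(residuals.flatten(), h, largest=False)  # h smallest residuals
    return kept.mean()

# Toy usage: a small MLP fit to noisy data with a few large outliers injected.
torch.manual_seed(0)
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = torch.sin(3 * x) + 0.05 * torch.randn_like(x)
y[::25] += 5.0  # sparse gross outliers
model = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = trimmed_mse(model(x), y)
    loss.backward()
    opt.step()
```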