The Kyoto Information and Society Seminar (KISS) hosts talks by researchers working in a wide range of fields, from informatics to the mathematical sciences and the humanities, as well as by people active in the social application and deployment of information technology. The seminar is held in English, roughly once or twice a month. Anyone interested is welcome to attend. If you would like to give a talk, please contact takeuchi@i.kyoto-u.ac.jp.
[English page]
- Organizers
- 松下 旦 (Kyoto University, University of Tokyo Market Design Center)
- 新 恭兵 (Kyoto University)
- 包 含 (Kyoto University)
- 竹内 孝 (Kyoto University, RIKEN AIP)
Current schedule
- [KISS-004]
- Title: A tutorial on metaheuristics for combinatorial optimization problems
- Presenter: 梅谷 俊治 (Recruit) [web]
- Date: 12/17 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
We often encounter computationally hard (a.k.a. NP-hard) combinatorial optimization problems in a wide range of industrial applications. A standard approach is to formulate the real-world problem as a mixed integer programming (MIP) problem and then solve it with one of the state-of-the-art MIP solvers. Continuous development of MIP technology, accompanied by advances in computing machinery, has greatly improved the performance of MIP solvers. However, many real-world problems still remain unsolved due to a large gap between the lower and upper bounds of the optimal values. For such hard problems, we often consider an alternative approach based on heuristic algorithms that give us feasible solutions which are not necessarily optimal but good enough in practice.
Metaheuristics can be regarded as a collection of ideas for designing heuristic algorithms for combinatorial optimization problems. These ideas give us a systematic view by incorporating them into basic strategies such as greedy and local search algorithms. In this tutorial, we first introduce how to design efficient local search algorithms along with their ingredients. We then introduce their extensions, called “metaheuristics”, covering strategies such as iterated local search (ILS), simulated annealing (SA), genetic algorithms (GA), guided local search (GLS), and so on.
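As a minimal, hedged illustration of the kind of algorithm the tutorial discusses (this sketch is not from the talk; the toy Euclidean TSP instance, the 2-opt neighbourhood, and the geometric cooling schedule are assumptions made only for the example), the following Python code runs simulated annealing on a random instance:

```python
# Illustrative only: simulated annealing over a 2-opt neighbourhood for a
# random Euclidean TSP instance (problem, neighbourhood, and cooling schedule
# are assumptions for this sketch, not material from the talk).
import math
import random

def tour_length(tour, dist):
    """Total length of the closed tour under the distance matrix `dist`."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def simulated_annealing(dist, n_iters=20000, t_start=1.0, t_end=1e-3, seed=0):
    """Return the best tour found by simulated annealing with 2-opt moves."""
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    rng.shuffle(tour)
    cur = tour_length(tour, dist)
    best, best_len = tour[:], cur
    for it in range(n_iters):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (it / n_iters)
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # 2-opt: reverse one segment
        cand_len = tour_length(cand, dist)
        # Always accept improvements; accept worsening moves with Boltzmann probability.
        if cand_len < cur or rng.random() < math.exp((cur - cand_len) / t):
            tour, cur = cand, cand_len
            if cur < best_len:
                best, best_len = tour[:], cur
    return best, best_len

points = [(random.random(), random.random()) for _ in range(30)]
dist = [[math.dist(p, q) for q in points] for p in points]
_, length = simulated_annealing(dist)
print(f"best tour length found: {length:.3f}")
```

Other strategies mentioned in the abstract, such as ILS or GA, would fit the same skeleton by changing the neighbourhood, the acceptance rule, or how new starting solutions are generated.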
Past schedule
- [KISS-003]
- Title: How Transformers Learn Causal Structure with Gradient Descent
- Presenter: Jason D. Lee (Associate Professor, Princeton University) [web]
- Date: 12/5 13:30-15:00
- Location: Lecture room 2, Research Bldg. No. 7 (101, 1st floor)
- Abstract:
The incredible success of transformers on sequence modeling tasks can be largely attributed to the self-attention mechanism, which allows information to be transferred between different parts of a sequence. Self-attention allows transformers to encode causal structure, which makes them particularly suitable for sequence modeling. However, the process by which transformers learn such causal structure via gradient-based training algorithms remains poorly understood. To better understand this process, we introduce an in-context learning task that requires learning latent causal structure. We prove that gradient descent on a simplified two-layer transformer learns to solve this task by encoding the latent causal graph in the first attention layer. The key insight of our proof is that the gradient of the attention matrix encodes the mutual information between tokens. As a consequence of the data processing inequality, the largest entries of this gradient correspond to edges in the latent causal graph. As a special case, when the sequences are generated from in-context Markov chains, we prove that transformers learn an induction head (Olsson et al., 2022). We confirm our theoretical findings by showing that transformers trained on our in-context learning task are able to recover a wide variety of causal structures.
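As a small, hedged illustration of the in-context Markov chain setting mentioned in the abstract (the vocabulary size, sequence length, and Dirichlet prior are assumptions of this sketch, not the paper's configuration), each sequence below is drawn from its own freshly sampled transition matrix, so the chain must be inferred from the context rather than memorised:

```python
# Illustration of in-context Markov chain data; hyperparameters here are
# assumptions of this sketch, not the paper's exact configuration.
import numpy as np

def sample_in_context_markov(n_seqs=4, seq_len=32, vocab=5, seed=0):
    """Each sequence uses its own random transition matrix P (one row per token)."""
    rng = np.random.default_rng(seed)
    seqs = []
    for _ in range(n_seqs):
        P = rng.dirichlet(np.ones(vocab), size=vocab)  # rows are categorical distributions
        x = [int(rng.integers(vocab))]
        for _ in range(seq_len - 1):
            x.append(int(rng.choice(vocab, p=P[x[-1]])))
        seqs.append(x)
    return seqs

for s in sample_in_context_markov():
    print(s)
```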
- [KISS-002]
- Title: Best Arm Identification: Fixed Confidence and Fixed Budget Settings
- Presenter: 小宮山 純平 (NYU) [web]
- Date: 10/29 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
We consider the best arm identification problem, where the goal is to find the arm with the largest mean. In this problem, there are two popular settings: the fixed-confidence setting, where the desired confidence level is given, and the fixed-budget setting, where the sample size is predetermined. We introduce the basic ideas of this problem and discuss how differences in the problem setting affect algorithmic design. If time permits, I will also introduce other recent works by the speaker.
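As a hedged, textbook-style sketch of the fixed-confidence setting (this is the classical successive elimination baseline with Hoeffding confidence intervals, not necessarily any algorithm from the talk; the arm means and confidence level are illustrative):

```python
# Textbook-style successive elimination for fixed-confidence best arm
# identification; not the speaker's algorithms, just a baseline sketch.
import math
import random

def successive_elimination(arms, delta=0.05):
    """`arms` is a list of no-argument callables returning rewards in [0, 1].
    Returns the index of the arm believed best with probability >= 1 - delta."""
    active = list(range(len(arms)))
    sums = [0.0] * len(arms)
    pulls = [0] * len(arms)
    t = 0
    while len(active) > 1:
        t += 1
        for a in active:  # pull every surviving arm once per round
            sums[a] += arms[a]()
            pulls[a] += 1
        # Anytime Hoeffding radius; the union bound over arms and rounds is
        # absorbed into the log term (a standard, slightly conservative choice).
        radius = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {a: sums[a] / pulls[a] for a in active}
        best_mean = max(means.values())
        active = [a for a in active if means[a] + radius >= best_mean - radius]
    return active[0]

random.seed(0)
true_means = [0.3, 0.5, 0.7, 0.65]
arms = [lambda p=p: 1.0 if random.random() < p else 0.0 for p in true_means]
print("identified best arm:", successive_elimination(arms))  # expect index 2
```

In the fixed-budget setting the stopping rule disappears: the total number of pulls is given in advance, and the algorithm instead decides how to allocate that budget (e.g., successive rejects) before recommending an arm.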
- [KISS-001]
- Title: Outlier-Robust Neural Network Training: Efficient Optimization of Transformed Trimmed Loss with Variation Regularization
- Presenter: 奥野 彰文 (Assistant Professor, The Institute of Statistical Mathematics; also Assistant Professor, SOKENDAI / Visiting Researcher, RIKEN AIP) [web]
- Date: 10/8 13:30-15:00
- Location: Lecture room 3, Research Bldg. No. 7 (104, 1st floor)
- Abstract:
In this study, we consider outlier-robust predictive modeling using highly expressive neural networks. To this end, we employ (1) a transformed trimmed loss (TTL), which is a computationally feasible variant of the classical trimmed loss, and (2) a higher-order variation regularization (HOVR) of the prediction model. Note that training the neural network with TTL alone may still be vulnerable to outliers, as its high expressive power allows it to overfit even the outliers perfectly. Simultaneously introducing HOVR, however, constrains the effective degrees of freedom and thereby avoids fitting the outliers. We also provide a new, efficient stochastic gradient supergradient descent (SGSD) algorithm for optimization, together with a theoretical convergence guarantee. (This is joint work with Shotaro Yagishita (ISM).)
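As a hedged sketch of the classical trimmed loss that the abstract builds on (this is not the paper's transformed trimmed loss; HOVR and the SGSD algorithm are omitted, and the toy data, network, and keep ratio are illustrative assumptions), each training step below keeps only the h smallest squared residuals, so gross outliers do not contribute to the gradient:

```python
# Classical trimmed squared loss (not the paper's TTL); HOVR and SGSD omitted.
import torch

def trimmed_mse(pred, target, keep_ratio=0.8):
    residuals = (pred - target.view_as(pred)) ** 2
    h = max(1, int(keep_ratio * residuals.numel()))
    kept, _ = torch.topk(residuals.flatten(), h, largest=False)  # h smallest residuals
    return kept.mean()

# Toy usage: a small MLP fit to noisy data with a few large outliers injected.
torch.manual_seed(0)
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = torch.sin(3 * x) + 0.05 * torch.randn_like(x)
y[::25] += 5.0  # sparse gross outliers
model = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = trimmed_mse(model(x), y)
    loss.backward()
    opt.step()
```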