Data-Driven Biomedical Science Team (https://aip.riken.jp/labs/goalorient_tech/datadrive_biomed/) at RIKEN AIP
Speaker 1 (Approx. 15 mins): Ichiro Takeuchi (Team Leader)
Title: Introduction of Data-driven biomedical science team
Abstract: In this short talk, I will introduce our team's research activities. The goal of our team is to develop machine learning methods that lead to scientific discovery in the fields of biology, medicine, and materials science.
Speaker 2 (Approx. 25 mins): Vo Nguyen Le Duy
Title: Statistically Quantifying the Reliability of Data-Driven Science by Selective Inference
Abstract: Data-driven science is an approach to scientific research in which new knowledge and insights are extracted from data using techniques such as machine learning, and it has attracted much attention in numerous areas. However, it also raises an essential problem: how reliable are the results obtained by data-driven algorithms? False discoveries are harmful in high-stakes decision making such as medical diagnosis or autonomous driving. Therefore, we develop a framework for properly evaluating the statistical reliability of data-driven results based on the concept of Selective Inference, which has recently received a lot of attention.
Speaker 3 (Approx. 25 mins): Yu Inatsu
Title: Bayesian experimental design for optimizing black-box functions under uncertainty
Abstract: Bayesian experimental design (BED) is a powerful design strategy for efficiently optimizing black-box functions that are costly to evaluate. BED places a prior distribution on the black-box function and builds a design strategy based on the posterior distribution obtained after observing data. However, in practical applications, black-box functions often involve uncertainty, such as input uncertainty, and standard BED methods cannot be applied directly in such cases. In this seminar, I will talk about BED under uncertainty.
Speaker 4 (Approx. 25 mins): Eugene Ndiaye
Title: Screening Rules and its Complexity for Active Set Identification
Abstract: Screening rules were recently introduced as a technique for explicitly identifying active structures, such as sparsity, in optimization problems arising in machine learning. This has led to new acceleration methods based on substantial dimension reduction. We show that screening rules stem from a combination of natural properties of subdifferential sets and optimality conditions, and can hence be understood in a unified way. Under mild assumptions, we analyze the number of iterations needed to identify the optimal active set for any converging algorithm. We show that this number depends only on the algorithm's convergence rate.
Speaker 5 (Approx. 25 mins): Hiroyuki Hanada
Title: “Predictive” pattern mining: Supervised learning for sets, sequences or the like
Abstract: We often encounter data where each instance is represented as a set, a sequence, or the like. In such cases we can use a pattern mining algorithm (e.g., frequent pattern mining) to find characteristic “patterns” (subsets for sets, subsequences for sequences). In this talk we additionally assume that each instance also has a label. For example, given sequences of amino acids (i.e., proteins), each labeled as allergenic or not, we would like to predict whether a new protein is allergenic based on the patterns it contains. We present some methods for such problems and our achievements.
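As a rough illustration of the problem setting in this last abstract (not the speaker's method), the sketch below mines short contiguous subsequences from labeled toy sequences and scores each by how much more often it occurs in one class than in the other. The sequences, the labels, the function names, and the simple additive scoring rule are all assumptions made up for this example.

```python
from collections import Counter


def contiguous_patterns(seq, max_len):
    """Return the set of contiguous subsequences of `seq` with length 1..max_len."""
    return {
        seq[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(seq) - n + 1)
    }


def mine_predictive_patterns(sequences, labels, max_len=3, min_support=2):
    """Score frequent patterns by the gap in occurrence rate between classes.

    A positive score means the pattern appears more often in label-1 instances.
    Assumes both classes are non-empty. This is a simple stand-in for a real
    predictive pattern mining method.
    """
    pos = [s for s, y in zip(sequences, labels) if y == 1]
    neg = [s for s, y in zip(sequences, labels) if y == 0]
    # Count, per class, the number of instances containing each pattern.
    counts_pos = Counter(p for s in pos for p in contiguous_patterns(s, max_len))
    counts_neg = Counter(p for s in neg for p in contiguous_patterns(s, max_len))
    scores = {}
    for p in set(counts_pos) | set(counts_neg):
        support = counts_pos[p] + counts_neg[p]
        if support >= min_support:  # keep only frequent patterns
            scores[p] = counts_pos[p] / len(pos) - counts_neg[p] / len(neg)
    return scores


def predict(seq, scores, max_len=3):
    """Predict label 1 if the summed scores of patterns found in `seq` are positive."""
    total = sum(scores.get(p, 0.0) for p in contiguous_patterns(seq, max_len))
    return 1 if total > 0 else 0


# Hypothetical toy data: made-up letter sequences with made-up labels.
sequences = ["MKTLL", "AKTLG", "MKTAA", "GGSAV", "PGSAL", "GGSVV"]
labels = [1, 1, 1, 0, 0, 0]

scores = mine_predictive_patterns(sequences, labels)
print(predict("QKTLQ", scores))  # contains "KT", seen only in label-1 sequences
print(predict("GGSAQ", scores))  # contains "GS", seen only in label-0 sequences
```

Enumerating every subsequence like this scales poorly with sequence length, which is one reason dedicated pattern mining algorithms rely on pruning.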