Interview with researchers 12 | Doshisha University Organization for Research Initiatives and Development

AI
Clinical Data
Co-nonlinearity
Imbalanced Data
Knowledge Discovery
Machine Learning

Synergy between Machine Learning and Knowledge Discovery

Machine translation, voice recognition, and smart home appliances are some examples of Artificial Intelligence (AI) we encounter in our daily lives. In this rapidly developing field, Machine Learning (ML) and Knowledge Discovery (KD), two subfields of AI, have been garnering particular attention. ML mathematically and statistically derives regularities hidden in data, and its focus is on improving predictive performance. For example, an ML model can be trained to better identify the face of a criminal in security camera images, or to accurately detect the presence of cancer in clinical examination results. In contrast, KD aims to make the results obtained by ML understandable as new knowledge. In the examples above, KD attempts to explain why and how the criminal was identified or the cancer was detected.

“It could be said that ML and KD are the two wheels of a cart. If a phenomenon can be elucidated by KD, we can use that knowledge to improve the performance of ML,” says Professor Ohsaki of the Faculty of Science and Engineering. She develops ML and KD methods and techniques and applies them to the medical and educational fields to support human intellectual activities. “We, in our laboratory, assist those who make hypotheses, such as doctors and educators,” she continued, “by presenting analytical findings through KD and ML with theoretical explanations. The findings serve as evidence for their hypotheses. For instance, the technologies that we have developed have been applied to clinical data to obtain insights for diagnosis and treatment. We are also starting to apply our technologies to educational support, wherein learning history data is being mined to find factors that promote learning.”

Methods that make accurate predictions even with limited data

While big data analysis is attracting attention in the field of AI, Professor Ohsaki focuses on small- and medium-scale data. “Large-scale data collection in the natural sciences is difficult as conditions differ widely. We are developing analytical methods specific to small- and medium-scale data, which are required to have functions different from those of big data algorithms,” she said.

Phenomena such as the appearance of cancer or an intrusion through a cyber-attack occur with low probability. However, when they do occur, the consequences can be serious, and if they are missed, there may be no turning back. To meet the requirements of predicting such phenomena, Professor Ohsaki studied imbalanced data classification in ML and developed a new method for it. “The small number of serious cases makes prediction difficult, which is a fundamental problem. To solve this problem, conventional resampling methods attempted to raise prediction performance by artificially augmenting the data with pseudo data. The augmentation may change the characteristics of the original data. We did not take such an approach in our research; instead, we proposed a new method that penalizes the classification patterns that cause incorrect results. By assigning weights of penalties depending on the patterns, our method achieved more accurate predictions (Figure 1).” This method is innovative and has a wide range of applications because it accepts any weights of penalties. A user of this method can raise prediction performance by setting weights suited to their application based on domain knowledge.
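The general idea of penalizing errors unequally, rather than resampling the data, can be illustrated with a minimal Python sketch. This is not Professor Ohsaki's method, whose details the interview does not give; it is a generic cost-sensitive example. The function names (`weighted_log_loss`, `min_cost_label`), the weights `w_fn` and `w_fp`, and the toy data are all assumptions made for illustration.

```python
import math

# Illustrative only: a generic cost-sensitive sketch, not the method
# described in the interview. All names and numbers are hypothetical.

def weighted_log_loss(y_true, p_pred, w_fn=1.0, w_fp=1.0):
    """Mean log loss in which errors on rare positives are scaled by w_fn
    and errors on negatives (false alarms) are scaled by w_fp."""
    eps = 1e-12
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        if y == 1:
            total += -w_fn * math.log(p)
        else:
            total += -w_fp * math.log(1.0 - p)
    return total / len(y_true)

def min_cost_label(p, w_fn=1.0, w_fp=1.0):
    """Choose the label with the lower expected penalty, given P(positive) = p."""
    cost_if_predict_negative = w_fn * p          # risk of missing a positive
    cost_if_predict_positive = w_fp * (1.0 - p)  # risk of a false alarm
    return 1 if cost_if_predict_positive < cost_if_predict_negative else 0

# Toy imbalanced sample: four negatives and one rare positive (the last item).
y_true = [0, 0, 0, 0, 1]
p_pred = [0.1, 0.2, 0.1, 0.3, 0.3]

plain_loss = weighted_log_loss(y_true, p_pred)
weighted_loss = weighted_log_loss(y_true, p_pred, w_fn=3.0)

labels_plain = [min_cost_label(p) for p in p_pred]
labels_weighted = [min_cost_label(p, w_fn=3.0) for p in p_pred]

print(labels_plain)     # equal penalties: the rare positive is missed
print(labels_weighted)  # heavier miss penalty: it is caught, with one false alarm
```

With equal penalties the rare case is missed; tripling the penalty for a miss lowers the decision threshold from 0.5 to 0.25, so the rare case is flagged at the price of one false alarm. How to set such weights is where domain knowledge would come in.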

In the medical field, Professor Ohsaki was engaged in research on chronic hepatitis. Chronic hepatitis gradually stiffens the liver over 10 to 30 years after infection, leading to cirrhosis and hepatocarcinoma. To determine the degree of liver stiffness, patients undergo invasive procedures that involve inserting a needle into the abdomen, which places a heavy mental and physical burden on them. Professor Ohsaki developed an alternative method based on ML and KD that reduces the burden on patients (Figure 2). Generally, for chronic diseases, blood and urine examination data can be obtained regularly through long-term follow-up. The method makes it possible to estimate the degree of liver stiffness non-invasively by analyzing how the time series of blood and urine examination results change over time. “Insight into dynamics is important for many types of time series, such as speech and earthquakes. The originality of this research is to focus on the dynamics of clinical time series data and apply the technology of speech analysis to clinical diagnosis and treatment.” Beyond the medical field, the method has the potential to be applied to other time series as well. For example, it could be applied to determine whether an autonomous driving situation is safe or dangerous.

Creating AI to support people’s intellectual activities with new ideas

Professor Ohsaki compares her role as a researcher to that of a knife-maker. “I do not directly cook ingredients (data) into dishes (discoveries),” she says, “but by providing kitchen knives (ML and KD methods) tailored to the person who cooks, new dishes will become possible by transforming ingredients that otherwise could not have been used. My goal is to create a knife that is sharp enough to be used in various dishes; not the kind that could only be used for accomplishing one task, but a system that can be applied widely,” she explains. Her current research project, “Development and Application of Co-nonlinearity Analysis Methods Leading to Novel Knowledge Awareness,” is no exception. It explores which of the variables composing a complex phenomenon are highly related and how strong the relationships are. The goal is to establish a method to detect latent nonlinear relationships among variables that are hard for the human brain alone to detect. One such application will be the analysis of factors influencing COVID-19 transmission.
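Why nonlinear relationships are easy to overlook can be shown with a small Python sketch. It does not implement the project's co-nonlinearity analysis, which the interview does not describe in detail; instead it compares the familiar Pearson correlation with distance correlation, a well-known general-purpose dependence measure used here as a stand-in. The function names and the toy data (y = x squared) are assumptions for illustration.

```python
import math

# Illustrative only: distance correlation is a stand-in dependence measure,
# not the co-nonlinearity method mentioned in the interview.

def pearson(x, y):
    """Pearson correlation coefficient (captures linear association only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def _centered_distances(v):
    """Pairwise absolute distances, double-centered by row/column/grand means."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # the matrix is symmetric, so column means match
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation; values near 0 suggest independence,
    values near 1 suggest strong dependence, linear or not."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_prod(a, b)
    denom = math.sqrt(mean_prod(a, a) * mean_prod(b, b))
    return math.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

# Toy data: y depends entirely on x, but through a symmetric curve.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]

r = pearson(x, y)
dcor = distance_correlation(x, y)
print(f"Pearson: {r:.3f}, distance correlation: {dcor:.3f}")
```

In this toy case the linear measure sees no relationship at all, while the nonlinear measure clearly registers one. Scanning many variable pairs this way is one simple route to surfacing candidate relationships for a domain expert to examine.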

With extensive AI research and applications expected to continue in the future, the relationship between humans and AI is being discussed and sometimes debated. “When people talk about AI, we often hear them raising concerns over their job security, worrying that their jobs will be taken away. Indeed, there is a possibility that AI will replace relatively simple intellectual labor. However, I believe that AI should be viewed as a means for humans to enhance their intelligence. Utilizing ML and KD will enable us humans to generate and validate better hypotheses of which we are unaware. This can lead to the further development of human intellect. Geniuses may not need it, but AI can be useful for ordinary people, including me, who make up most of humanity.”

Professor Ohsaki, who has loved creative activities such as drawing since she was a young child, feels that her true calling is to be a researcher who creates and embodies new ideas. She finds joy in those moments when an idea takes shape and becomes useful. Her enthusiasm for research and her inquisitiveness were evident as she spoke about the future of AI.

PROFILE

Research Area: Machine learning / Knowledge discovery
Research Themes: ・Development and Application of Methods for Imbalanced Data Classification
・Development and Application of Methods for Co-nonlinearity Analysis Leading to Knowledge Awareness

Research Goals: 1) Development of analytical methods suitable to the scale of the data
2) Support of intellectual activities by machine learning and knowledge discovery
Researcher Database: https://kendb.doshisha.ac.jp/profile/en.827dbfe966e77032.html

Miho Ohsaki Professor, Faculty of Science and Engineering, Department of Information Systems Design