Updated on 2024/04/14

HARA Sunao
 
Organization
Faculty of Environmental, Life, Natural Science and Technology
Position
Associate Professor
Profile

He received the B.S., M.S., and Ph.D. degrees from Nagoya University in 2003, 2005, and 2011, respectively.
He is currently an associate professor in the Faculty of Environmental, Life, Natural Science and Technology, Okayama University.
His research interests include the development and evaluation of spoken dialog systems in real environments.
He is a member of the Acoustical Society of Japan, the Human Interface Society, and the Information Processing Society of Japan.

Degree

  • Ph.D. (Information Science) (Nagoya University)

Research Interests

  • Human Interface

  • Spoken dialogue

  • Speech recognition

  • Lifelog

  • Acoustic scene analysis

  • Acoustic event detection

  • Deep Learning

  • Machine Learning

  • Speech processing

  • Spoken dialog system

Research Areas

  • Informatics / Intelligent informatics

  • Informatics / Web informatics and service informatics

  • Informatics / Perceptual information processing

Research History

  • Okayama University   Faculty of Environmental, Life, Natural Science and Technology   Associate Professor

    2024.4

  • Okayama University   Graduate School of Interdisciplinary Science and Engineering in Health Systems   Assistant Professor

    2019.4 - 2024.4

    Country: Japan

    Notes: Faculty of Engineering, information-related department

  • Okayama University   The Graduate School of Natural Science and Technology   Assistant Professor

    2012.9 - 2019.3

    Country: Japan

    Notes: Faculty of Engineering, information-related department

  • Nara Institute of Science and Technology   Assistant Professor

    2011.11 - 2012.9

Professional Memberships

  • IEEE

    2016.6

  • The Institute of Electronics, Information and Communication Engineers

    2012.2

  • Information Processing Society of Japan

    2007

  • Acoustical Society of Japan

    2004

  • Human Interface Society

    2005 - 2022.12

Committee Memberships

  • Acoustical Society of Japan   Member, Journal Section, Editorial Committee

    2023.6

    Committee type: Academic society

  • Acoustical Society of Japan, Kansai Branch   Chair, Organizing Committee, 24th Research Presentation Meeting for Young Researchers

    2021.4 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Public Relations and Digitization Committee

    2013.10

    Committee type: Academic society

    Digitization and Public Relations Promotion Committee

  • Acoustical Society of Japan   Member, Research Presentation Meeting Preparation Committee

    2023.6

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Remote-Meeting Executive Committee, 2023 Spring Research Presentation Meeting

    2022.12 - 2023.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Remote-Meeting Executive Committee, 2022 Spring Research Presentation Meeting

    2021.10 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Remote-Meeting Executive Committee, 2021 Autumn Research Presentation Meeting

    2021.7 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Remote-Meeting Executive Committee, 2021 Spring Research Presentation Meeting

    2020.11 - 2021.3

  • Acoustical Society of Japan   Member, Remote-Meeting Executive Committee, 2020 Autumn Research Presentation Meeting

    2020.7 - 2020.9

    Committee type: Academic society

  • The Institute of Electronics, Information and Communication Engineers   Reviewer, Society Transactions Editorial Committee

    2017.8

    Committee type: Academic society

  • Information Processing Society of Japan, Chugoku Branch   Member, Branch Steering Committee

    2015.5 - 2019.5

    Committee type: Academic society

  • Acoustical Society of Japan   Reviewer, Editorial Committee

    2014.2

    Committee type: Academic society

  • Acoustical Society of Japan, Kansai Branch   Member, Organizing Committee, Research Presentation Meetings for Young Researchers

    2011.11 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Secretariat, Students and Young Researchers Forum

    2007.3 - 2012.3

    Committee type: Academic society


Papers

  • Speech Synthesis Using Ambiguous Inputs From Wearable Keyboards Reviewed International journal

    Matsuri Iwasaki, Sunao Hara, Masanobu Abe

    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023)   1172 - 1178   2023.11

    Language:English   Publishing type:Research paper (international conference proceedings)  

    This paper proposes a new application in speech communication using text-to-speech (TTS), and the goal is to enable persons with dysarthria or articulation disorders, or others who have difficulty speaking, to communicate anywhere and anytime using speech to express their thoughts and feelings. To achieve this goal, an input method is required. Thus, we propose a new text-entry method based on three concepts. First, from an easy-to-carry perspective, we used a wearable keyboard that inputs digits from 0 to 9 in decimal notation according to 10-finger movements. Second, from a no-training perspective, users input sentences by touch typing on the wearable keyboard. With this method, we obtain a sequence of numbers corresponding to the sentence. Third, a neural machine translation (NMT) method is applied to estimate texts from the sequence of numbers. The NMT was trained using two datasets; one is a Japanese-English parallel corpus containing 2.8 million pairs of sentences, which were extracted from TV and movie subtitles, while the other is a Japanese text dataset containing 32 million sentences, which were extracted from a question-and-answer platform. Using the model, phonemes and accent symbols were estimated from a sequence of numbers. The resulting symbol-level accuracy was 91.48%, and 43.45% of all sentences were estimated completely without errors. To subjectively evaluate the feasibility of the NMT model, a two-person word association game was conducted; one player gave hints using synthesized speech generated from symbols estimated by NMT, while the other guessed answers. As a result, 67.95% of all the quizzes were correctly answered, and the experimental results show that the proposed method has the potential to let persons with dysarthria communicate with TTS using a wearable keyboard.

    DOI: 10.1109/APSIPAASC58517.2023.10317228

  • Speech-Emotion Control for Text-to-Speech in Spoken Dialogue Systems Using Voice Conversion and x-vector Embedding Reviewed International journal

    Shunichi Kohara, Masanobu Abe, Sunao Hara

    2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2023)   2280 - 2286   2023.11

    Language:English   Publishing type:Research paper (international conference proceedings)  

    In this paper, we propose an algorithm to control both speaker individuality and emotional expressions in synthesized speech, where the most important feature is the controllability of intensity in emotional expressions. The aim of the proposed algorithm is to generate varied responses, including emotions, in text-to-speech (TTS) for spoken dialogue systems (SDS), which makes the system more human-like. The idea is to control emotion and its intensity in line with the user's utterances. For example, when a user happily talks to the SDS, the agent of the SDS responds with a happy voice. Generally, the voice qualities of the user and the agent differ. Therefore, the proposed algorithm consists of two steps: (1) voice conversion to change speaker individuality including emotional expressions and (2) TTS with an x-vector acting as an embedding vector to mainly control speech quality related to the intensity of emotions. Evaluation experiments are carried out using a spoken dialogue system scenario, where a TTS teacher system encourages or cheers up students according to the students' utterances. The experimental results showed that TTS can successfully reproduce the emotion and its intensity that are extracted from students' utterances, while maintaining the teacher's speaker individuality.

    DOI: 10.1109/APSIPAASC58517.2023.10317413

  • Sound map of urban areas recorded by smart devices: case study at Okayama and Kurashiki Invited

    Sunao Hara, Masanobu Abe

    Proceedings of the 52nd International Congress and Exposition on Noise Control Engineering (Inter-Noise 2023)   1 - 12   2023.8

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

  • Predictions for sound events and soundscape impressions from environmental sound using deep neural networks Invited

    Sunao Hara, Masanobu Abe

    Proceedings of the 52nd International Congress and Exposition on Noise Control Engineering (Inter-Noise 2023)   1 - 12   2023.8

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

  • Prediction method of Soundscape Impressions using Environmental Sounds and Aerial Photographs Reviewed International journal

    Yusuke Ono, Sunao Hara, Masanobu Abe

    2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)   1222 - 1227   2022.11

    Authorship:Corresponding author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.23919/apsipaasc55919.2022.9980290

    Other Link: https://arxiv.org/abs/2209.04077

  • Incremental Audio Scene Classifier Using Rehearsal-Based Strategy Reviewed International journal

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    2022 IEEE 10th Global Conference on Consumer Electronics (GCCE)   69 - 623   2022.10

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

  • Speech-Like Emotional Sound Generation Using WaveNet Reviewed

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    IEICE Transactions on Information and Systems   E105.D ( 9 )   1581 - 1589   2022.9

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Institute of Electronics, Information and Communications Engineers (IEICE)  

    In this paper, we propose a new algorithm to generate Speech-like Emotional Sound (SES). Emotional expressions may be the most important factor in human communication, and speech is one of the most useful means of expressing emotions. Although speech generally conveys both emotional and linguistic information, we have undertaken the challenge of generating sounds that convey emotional information alone. We call the generated sounds "speech-like," because the sounds do not contain any linguistic information. SES can provide another way to generate emotional response in human-computer interaction systems. To generate "speech-like" sound, we propose employing WaveNet as a sound generator conditioned only by emotional IDs. This concept is quite different from the WaveNet Vocoder, which synthesizes speech using spectrum information as an auxiliary feature. The biggest advantage of our approach is that it reduces the amount of emotional speech data necessary for training by focusing on non-linguistic information. The proposed algorithm consists of two steps. In the first step, to generate a variety of spectrum patterns that resemble human speech as closely as possible, WaveNet is trained with auxiliary mel-spectrum parameters and Emotion ID using a large amount of neutral speech. In the second step, to generate emotional expressions, WaveNet is retrained with auxiliary Emotion ID only using a small amount of emotional speech. Experimental results reveal the following: (1) the two-step training is necessary to generate the SES with high quality, and (2) it is important that the training use a large neutral speech database and spectrum information in the first step to improve the emotional expression and naturalness of SES.

    DOI: 10.1587/transinf.2021edp7236

    Other Link: https://www.webofscience.com/wos/woscc/full-record/WOS:000852731400008

  • Concept drift adaptation for audio scene classification using high-level features Reviewed International journal

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    2022 IEEE International Conference on Consumer Electronics (ICCE)   2022.1

    Authorship:Last author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.1109/icce53296.2022.9730332

  • Acoustic Scene Classifier Based on Gaussian Mixture Model in the Concept Drift Situation Reviewed International journal

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    Advances in Science, Technology and Engineering Systems Journal   6 ( 5 )   167 - 176   2021.9

    Authorship:Last author   Language:English   Publishing type:Research paper (scientific journal)   Publisher:ASTES Journal  

    DOI: 10.25046/aj060519

  • Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech Reviewed International journal

    Naoto Kakegawa, Sunao Hara, Masanobu Abe, Yusuke Ijima

    Interspeech 2021   126 - 130   2021.8

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    The biggest obstacle to develop end-to-end Japanese text-to-speech (TTS) systems is to estimate phonetic and prosodic information (PPI) from Japanese texts. The following are the reasons: (1) the Kanji characters of the Japanese writing system have multiple corresponding pronunciations, (2) there is no separation mark between words, and (3) an accent nucleus must be assigned at appropriate positions. In this paper, we propose to solve the problems by neural machine translation (NMT) on the basis of encoder-decoder models, and compare NMT models of recurrent neural networks and the Transformer architecture. The proposed model handles texts on token (character) basis, although conventional systems handle them on word basis. To ensure the potential of the proposed approach, NMT models are trained using pairs of sentences and their PPIs that are generated by a conventional Japanese TTS system from 5 million sentences. Evaluation experiments were performed using PPIs that are manually annotated for 5,142 sentences. The experimental results showed that the Transformer architecture has the best performance, with 98.0% accuracy for phonetic information estimation and 95.0% accuracy for PPI estimation. Judging from the results, NMT models are promising toward end-to-end Japanese TTS.

    DOI: 10.21437/interspeech.2021-914

  • Model architectures to extrapolate emotional expressions in DNN-based text-to-speech Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    Speech Communication   126   35 - 43   2021.2

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV  

    This paper proposes architectures that facilitate the extrapolation of emotional expressions in deep neural network (DNN)-based text-to-speech (TTS). In this study, the meaning of “extrapolate emotional expressions” is to borrow emotional expressions from others, and the collection of emotional speech uttered by target speakers is unnecessary. Although a DNN has potential power to construct DNN-based TTS with emotional expressions and some DNN-based TTS systems have demonstrated satisfactory performances in the expression of the diversity of human speech, it is necessary and troublesome to collect emotional speech uttered by target speakers. To solve this issue, we propose architectures to separately train the speaker feature and the emotional feature and to synthesize speech with any combined quality of speakers and emotions. The architectures are parallel model (PM), serial model (SM), auxiliary input model (AIM), and hybrid models (PM&AIM and SM&AIM). These models are trained through emotional speech uttered by few speakers and neutral speech uttered by many speakers. Objective evaluations demonstrate that the performances in the open-emotion test provide insufficient information. They make a comparison with those in the closed-emotion test, but each speaker has their own manner of expressing emotion. However, subjective evaluation results indicate that the proposed models could convey emotional information to some extent. Notably, the PM can correctly convey sad and joyful emotions at a rate of 60%.

    DOI: 10.1016/j.specom.2020.11.004

    Other Link: https://arxiv.org/abs/2102.10345

  • Module Comparison of Transformer-TTS for Speaker Adaptation based on Fine-tuning Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference (APSIPA-ASC 2020)   826 - 830   2020.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE/APSIPA  

    End-to-end text-to-speech (TTS) models have achieved remarkable results in recent times. However, the model requires a large amount of text and audio data for training. A speaker adaptation method based on fine-tuning has been proposed for constructing a TTS model using small-scale data. Although these methods can replicate the target speaker's voice quality, synthesized speech includes the deletion and/or repetition of speech. The goal of speaker adaptation is to change the voice quality to match the target speaker's, on the premise that adjusting only the necessary modules will reduce the amount of data to be fine-tuned. In this paper, we clarify the role of each module in the Transformer-TTS process by not updating it. Specifically, we froze the character embedding, the encoder, the layer predicting the stop token, and the loss function for estimating sentence endings. The experimental results showed the following: (1) fine-tuning the character embedding did not result in an improvement in the deletion and/or repetition of speech, (2) speech deletion increases if the encoder is not fine-tuned, (3) speech deletion was suppressed when the layer predicting the stop token is not fine-tuned, and (4) there are frequent speech repetitions at sentence end when the loss function estimating sentence ending is omitted.

  • Concept Drift Adaptation for Acoustic Scene Classifier Based on Gaussian Mixture Model Reviewed

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    Proceedings of IEEE REGION 10 CONFERENCE (TENCON 2020)   450 - 455   2020.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.1109/tencon50793.2020.9293766

  • Controlling the Strength of Emotions in Speech-Like Emotional Sound Generated by WaveNet Reviewed International journal

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Proceedings of Interspeech 2020   3421 - 3425   2020.10

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    DOI: 10.21437/interspeech.2020-2064

    Other Link: https://www.webofscience.com/wos/woscc/full-record/WOS:000833594103112

  • Semi-Supervised Speaker Adaptation for End-to-End Speech Synthesis with Pretrained Models Reviewed International coauthorship International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020)   7634 - 7638   2020.5

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    Recently, end-to-end text-to-speech (TTS) models have achieved remarkable performance; however, they require a large amount of paired text and speech data for training. On the other hand, we can easily collect dozens of minutes of unpaired speech recordings for a target speaker without corresponding text data. To make use of such accessible data, the proposed method leverages the recent great success of state-of-the-art end-to-end automatic speech recognition (ASR) systems and obtains corresponding transcriptions from pretrained ASR models. Although these models could only provide text output instead of intermediate linguistic features like phonemes, end-to-end TTS can be well trained with such raw text data directly. Thus, the proposed method can greatly simplify a speaker adaptation pipeline by consistently employing end-to-end ASR/TTS ecosystems. The experimental results show that our proposed method achieved comparable performance to a paired data adaptation method in terms of subjective speaker similarity and objective cepstral distance measures.

    DOI: 10.1109/icassp40776.2020.9053371

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients' Speech Reviewed International journal

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2019   138 - 142   2019.11

    Authorship:Corresponding author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    In this paper, we propose using phonemic information in addition to acoustic features to improve the intelligibility of speech uttered by patients with articulation disorders caused by a wide glossectomy. Our previous studies showed that a voice conversion algorithm improves the quality of glossectomy patients' speech. However, losses in the acoustic features of glossectomy patients' speech are so large that the quality of the reconstructed speech is low. To solve this problem, we explored the potential of several kinds of additional information to improve speech intelligibility. One of the candidates is phonemic information, more specifically Phoneme Labels as Auxiliary input (PLA). To combine both acoustic features and PLA, we employed a DNN-based algorithm. PLA is represented by a kind of one-of-k vector, i.e., PLA has a weight value (<1.0) that gradually changes along the time axis, whereas one-of-k has a binary value (0 or 1). The results showed that the proposed algorithm reduced the mel-frequency cepstral distortion for all phonemes, and almost always improved intelligibility. Notably, the intelligibility was largely improved in the phonemes /s/ and /z/, mainly because the tongue is used to sustain constriction to produce these phonemes. This indicates that PLA works well to compensate for the lack of a tongue.

    DOI: 10.1109/APSIPAASC47483.2019.9023168

  • Speech-like Emotional Sound Generator by WaveNet Reviewed International journal

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2019   143 - 147   2019.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    In this paper, we propose a new algorithm to generate Speech-like Emotional Sound (SES). Emotional information plays an important role in human communication, and speech is one of the most useful media to express emotions. Although, in general, speech conveys emotional information as well as linguistic information, we have undertaken the challenge to generate sounds that convey emotional information without linguistic information, which results in making conversations in human-machine interactions more natural in some situations by providing non-verbal emotional vocalizations. We call the generated sounds "speech-like", because the sounds do not contain any linguistic information. For the purpose, we propose to employ WaveNet as a sound generator conditioned by only emotional IDs. The idea is quite different from WaveNet Vocoder that synthesizes speech using spectrum information as auxiliary features. The biggest advantage of the idea is to reduce the amount of emotional speech data for the training. The proposed algorithm consists of two steps. In the first step, WaveNet is trained to obtain phonetic features using a large speech database, and in the second step, WaveNet is re-trained using a small amount of emotional speech. Subjective listening evaluations showed that the SES could convey emotional information and was judged to sound like a human voice.

    DOI: 10.1109/APSIPAASC47483.2019.9023346

  • A signal processing perspective on human gait: Decoupling walking oscillations and gestures Reviewed International journal

    Adrien Gregorj, Zeynep Yücel, Sunao Hara, Akito Monden, Masahiro Shiomi

    Proceedings of the 4th International Conference on Interactive Collaborative Robotics 2019 (ICR 2019)   11659   75 - 85   2019.8

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:SPRINGER INTERNATIONAL PUBLISHING AG  

    This study focuses on gesture recognition in mobile interaction settings, i.e. when the interacting partners are walking. This kind of interaction requires a particular coordination, e.g. by staying in the field of view of the partner, avoiding obstacles without disrupting group composition and sustaining joint attention during motion. In the literature, various studies have shown that gestures are closely related to achieving such goals. Thus, a mobile robot moving in a group with human pedestrians has to identify such gestures to sustain group coordination. However, decoupling the inherent walking oscillations from gestures is a big challenge for the robot. To that end, we employ video data recorded in uncontrolled settings and detect arm gestures performed by human-human pedestrian pairs by adopting a signal processing approach. Namely, we exploit the fact that there is an inherent oscillatory motion at the upper limbs arising from the gait, independent of the view angle or distance of the user to the camera. We identify arm gestures as disturbances on these oscillations. In doing so, we use a simple pitch detection method from speech processing and assume data involving a low-frequency periodicity to be free of gestures. In testing, we employ a video data set recorded in uncontrolled settings and show that we achieve a detection rate of 0.80.

    DOI: 10.1007/978-3-030-26118-4_8

  • Naturalness Improvement Algorithm for Reconstructed Glossectomy Patient's Speech Using Spectral Differential Modification in Voice Conversion Reviewed International journal

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2018, 19th Annual Conference of the International Speech Communication Association   2464 - 2468   2018.9

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    In this paper, we propose an algorithm to improve the naturalness of the reconstructed glossectomy patient's speech that is generated by voice conversion to enhance the intelligibility of speech uttered by patients with a wide glossectomy. While existing VC algorithms make it possible to improve intelligibility and naturalness, the result is still not satisfying. To solve the continuing problems, we propose to directly modify the speech waveforms using a spectrum differential. The motivation is that glossectomy patients mainly have problems in their vocal tract, not in their vocal cords. The proposed algorithm requires no source parameter extractions for speech synthesis, so there are no errors in source parameter extractions and we are able to make the best use of the original source characteristics. In terms of spectrum conversion, we evaluate with both GMM and DNN. Subjective evaluations show that our algorithm can synthesize more natural speech than the vocoder-based method. Judging from observations of the spectrogram, power in high-frequency bands of fricatives and stops is reconstructed to be similar to that of natural speech.

    DOI: 10.21437/Interspeech.2018-1239

  • Sound sensing using smartphones as a crowdsourcing approach Reviewed International journal

    Sunao Hara, Asako Hatakeyama, Shota Kobayashi, Masanobu Abe

    2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017   1328 - 1333   2017.12

    Authorship:Lead author, Corresponding author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    Sounds are one of the most valuable information sources for human beings from the viewpoint of understanding the environment around them. We are investigating methods of detecting and visualizing crowded situations in a city through sound sensing. For this purpose, we have developed a sound collection system oriented to a crowdsourcing approach and carried out sound collection in two Japanese cities, Okayama and Kurashiki. In this paper, we present an overview of the sound collections. Then, to show the effectiveness of analyzing the sensed sounds, we profile characteristics of the cities through visualizations of the collected sound.

    DOI: 10.1109/APSIPA.2017.8282238

  • An investigation to transplant emotional expressions in DNN-based TTS synthesis Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017   1253 - 1258   2017.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    In this paper, we investigate deep neural network (DNN) architectures to transplant emotional expressions to improve the expressiveness of DNN-based text-to-speech (TTS) synthesis. DNN is expected to have potential power in mapping between linguistic information and acoustic features. From multispeaker and/or multi-language perspectives, several types of DNN architecture have been proposed and have shown good performances. We tried to expand the idea to transplant emotion, constructing shared emotion-dependent mappings. The following three types of DNN architecture are examined; (1) the parallel model (PM) with an output layer consisting of both speaker-dependent layers and emotion-dependent layers, (2) the serial model (SM) with an output layer consisting of emotion-dependent layers preceded by speaker-dependent hidden layers, (3) the auxiliary input model (AIM) with an input layer consisting of emotion and speaker IDs as well as linguistics feature vectors. The DNNs were trained using neutral speech uttered by 24 speakers, and sad speech and joyful speech uttered by 3 speakers from those 24 speakers. In terms of unseen emotional synthesis, subjective evaluation tests showed that the PM performs much better than the SM and slightly better than the AIM. In addition, this test showed that the SM is the best of the three models when training data includes emotional speech uttered by the target speaker.

    DOI: 10.1109/APSIPA.2017.8282231

  • New monitoring scheme for persons with dementia through monitoring-area adaptation according to stage of disease Reviewed International journal

    Shigeki Kamada, Yuji Matsuo, Sunao Hara, Masanobu Abe

    Proceedings of the 1st ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks, LocalRec@SIGSPATIAL 2017   1:1 - 1:7   2017.11

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ACM  

    DOI: 10.1145/3148150.3148151

    researchmap

    Other Link: http://doi.acm.org/10.1145/3148150.3148151

  • Prediction of subjective assessments for a noise map using deep neural networks Reviewed International journal

    Shota Kobayashi, Masanobu Abe, Sunao Hara

    Adjunct Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers, UbiComp/ISWC 2017   113 - 116   2017.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ACM  

    In this paper, we investigate a method of creating noise maps that take account of human senses. Physical measurements are not enough to design our living environment, and we need to know subjective assessments. To predict subjective assessments from loudness values, we propose to use metadata related to where, who, and what is recording. The proposed method is implemented using deep neural networks because these can naturally treat a variety of information types. First, we evaluated its performance in predicting five-point subjective loudness levels based on a combination of several features: location-specific, participant-specific, and sound-specific features. The proposed method achieved a 16.3-point increase compared with the baseline method. Next, we evaluated its performance based on noise map visualization results. The proposed noise maps were generated from the predicted subjective loudness level. Considering the differences between the two visualizations, the proposed method made fewer errors than the baseline method.

    DOI: 10.1145/3123024.3123091

    Web of Science

    researchmap

    Other Link: http://doi.acm.org/10.1145/3123024.3123091

  • Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion

    Kei Tanaka, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6   3384 - 3388   2017

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA-INT SPEECH COMMUNICATION ASSOC  

    In this paper, using GMM-based voice conversion algorithm, we propose to generate speaker-dependent mapping functions to improve the intelligibility of speech uttered by patients with a wide glossectomy. The speaker-dependent approach enables to generate the mapping functions that reconstruct missing spectrum features of speech uttered by a patient without having influences of a speaker's factor. The proposed idea is simple, i.e., to collect speech uttered by a patient before and after the glossectomy, but in practice it is hard to ask patients to utter speech just for developing algorithms. To confirm the performance of the proposed approach, in this paper, in order to simulate glossectomy patients, we fabricated an intraoral appliance which covers lower dental arch and tongue surface to restrain tongue movements. In terms of the Mel-frequency cepstrum (MFC) distance, by applying the voice conversion, the distances were reduced by 25% and 42% for speaker dependent case and speaker-independent case, respectively. In terms of phoneme intelligibility, dictation tests revealed that speech reconstructed by speaker-dependent approach almost always showed better performance than the original speech uttered by simulated patients, while speaker-independent approach did not.

    DOI: 10.21437/Interspeech.2017-841

    Web of Science

    researchmap

  • Enhancing a glossectomy patient's speech via GMM-based voice conversion Reviewed International journal

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)   1 - 4   2016.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    DOI: 10.1109/apsipa.2016.7820909

    Web of Science

    researchmap

  • LiBS: Lifelog browsing system to support sharing of memories Reviewed International journal

    Atsuya Namba, Sunao Hara, Masanobu Abe

    UbiComp 2016 Adjunct - Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing   165 - 168   2016.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Association for Computing Machinery, Inc  

    We propose a lifelog browsing system through which users can share memories of their experiences with other users. Most importantly, by using global positioning system data and time stamps, the system simultaneously displays a variety of log information in a time-synchronous manner. This function empowers users with not only an easy interpretation of other users' experiences but also nonverbal notifications. Shared information on this system includes photographs taken by users, Google street views, shops and restaurants on the map, daily weather, and other items relevant to users' interests. In evaluation experiments, users preferred the proposed system to conventional photograph albums and maps for explaining and sharing their experiences. Moreover, through displayed information, the listeners found out their interest items that had not been mentioned by the speakers.

    DOI: 10.1145/2968219.2971401

    Scopus

    researchmap

    Other Link: http://doi.acm.org/10.1145/2968219.2971401

  • Safety vs. Privacy: User preferences from the monitored and monitoring sides of a monitoring system Reviewed International journal

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    UbiComp 2016 Adjunct - Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing   101 - 104   2016.9

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Association for Computing Machinery, Inc  

    In this study, in order to develop a monitoring system that takes into account privacy issues, we investigated user preferences in terms of the monitoring and privacy protection levels. The people on the monitoring side wanted the monitoring system to allow them to monitor in detail. Conversely, it was observed for the people being monitored that the more detailed the monitoring, the greater the feelings of being surveilled intrusively. Evaluation experiments were performed using the location data of three people in different living areas. The results of the experiments show that it is possible to control the levels of monitoring and privacy protection without being affected by the shape of a living area by adjusting the quantization level of location information. Furthermore, it became clear that the granularity of location information satisfying the people on the monitored side and the monitoring side is different.

    DOI: 10.1145/2968219.2971412

    Scopus

    researchmap

    Other Link: http://doi.acm.org/10.1145/2968219.2971412

  • Sound collection systems using a crowdsourcing approach to construct sound map based on subjective evaluation Reviewed International journal

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    IEEE ICME Workshop on Multimedia Mobile Cloud for Smart City Applications (MMCloudCity-2016)   1 - 6   2016

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper presents a sound collection system that uses crowdsourcing to gather information for visualizing area characteristics. First, we developed a sound collection system to simultaneously collect physical sounds, their statistics, and subjective evaluations. We then conducted a sound collection experiment using the developed system on 14 participants. We collected 693,582 samples of equivalent A-weighted loudness levels and their locations, and 5,935 samples of sounds and their locations. The data also include subjective evaluations by the participants. In addition, we analyzed the changes in sound properties of some areas before and after the opening of a large-scale shopping mall in a city. Next, we implemented visualizations on the server system to attract users' interest. Finally, we published the system, which can receive sounds from any Android smartphone user. The sound data were continuously collected and achieved the intended result.

    DOI: 10.1109/ICMEW.2016.7574694

    Web of Science

    researchmap

  • A Spoken Dialog System with Redundant Response to Prevent User Misunderstanding Reviewed International journal

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2015   229 - 232   2015.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    We propose a spoken dialog strategy for car navigation systems to facilitate safe driving. To drive safely, drivers need to concentrate on their driving; however, their concentration may be disrupted by disagreement with their spoken dialog system. Therefore, we need to solve the problem of user misunderstandings as well as misunderstandings by spoken dialog systems. For this purpose, we introduced a driver workload level into spoken dialog management in order to prevent user misunderstandings. A key strategy of the dialog management is to make speech redundant if the driver's workload is too high, assuming that the user will probably misunderstand the system utterance under such a condition. An experiment was conducted to compare the performance of the proposed method and a conventional method using a user simulator. The simulator is developed under the assumption of two types of drivers: an experienced driver model and a novice driver model. Experimental results showed that the proposed strategies achieved better performance than the conventional one for task completion time, task completion rate, and user's positive speech rate. In particular, these performance differences are greater for novice users than for experienced users.

    DOI: 10.1109/APSIPA.2015.7415511

    Web of Science

    researchmap

  • Extracting Daily Patterns of Human Activity Using Non-Negative Matrix Factorization Reviewed International journal

    Masanobu Abe, Akihiko Hirayama, Sunao Hara

    Proceedings of IEEE International Conference on Consumer Electronics (IEEE-ICCE 2015)   36 - 39   2015.1

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper presents an algorithm to mine basic patterns of human activities on a daily basis using non-negative matrix factorization (NMF). The greatest benefit of the algorithm is that it can elicit patterns whose meanings can be easily interpreted. To confirm its performance, the proposed algorithm was applied to PC logging data collected from three occupations in offices. Daily patterns of software usage were extracted for each occupation. Results show that each occupation uses specific software in its own time period, and uses several types of software in parallel in its own combinations. Experiment results also show that patterns of 144-dimension vectors were compressible to those of 11-dimension vectors without degradation in occupation classification performance. Therefore, the proposed algorithm compressed basic software usage patterns to about one-tenth of their original dimensions while preserving the original information. Moreover, the extracted basic patterns showed reasonable interpretation of daily working patterns in offices.

    DOI: 10.1109/ICCE.2015.7066309

    Web of Science

    researchmap

  • Sub-Band Text-to-Speech Combining Sample-Based Spectrum with Statistically Generated Spectrum Reviewed International journal

    Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki, Hideyuki Mizuno

    Proceedings of Interspeech 2015   264 - 268   2015

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    As described in this paper, we propose a sub-band speech synthesis approach to develop a high-quality Text-to-Speech (TTS) system: a sample-based spectrum is used in the high-frequency band and a spectrum generated by HMM-based TTS is used in the low-frequency band. Herein, sample-based spectrum means a spectrum selected from a phoneme database such that it is the most similar to the spectrum generated by HMM-based speech synthesis. A key idea is to compensate for the over-smoothing caused by statistical procedures by introducing a sample-based spectrum, especially in the high-frequency band. Listening test results show that the proposed method has better performance than HMM-based speech synthesis in terms of clarity. It is at the same level as HMM-based speech synthesis in terms of smoothness. In addition, preference test results among the proposed method, HMM-based speech synthesis, and waveform speech synthesis using 80 min of speech data reveal that the proposed method is the most preferred.

    Web of Science

    researchmap

    Other Link: http://dblp.uni-trier.de/db/conf/interspeech/interspeech2015.html#conf/interspeech/InaiHAIMM15

  • Sound collection and visualization system enabled participatory and opportunistic sensing approaches Reviewed International journal

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    Proceedings of 2015 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops)   390 - 395   2015

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper presents a sound collection system to visualize environmental sounds that are collected using a crowdsourcing approach. An analysis of physical features is generally used to analyze sound properties; however, human beings not only analyze but also emotionally connect to sounds. If we want to visualize the sounds according to the characteristics of the listener, we need to collect not only the raw sounds, but also the subjective feelings associated with them. For this purpose, we developed a sound collection system using a crowdsourcing approach to collect physical sounds, their statistics, and subjective evaluations simultaneously. We then conducted a sound collection experiment using the developed system on ten participants. We collected 6,257 samples of equivalent loudness levels and their locations, and 516 samples of sounds and their locations. Subjective evaluations by the participants are also included in the data. Next, we tried to visualize the sound on a map. The loudness levels are visualized as a color map and the sounds are visualized as icons which indicate the sound type. Finally, we conducted a discrimination experiment on the sound to implement a function of automatic conversion from sounds to appropriate icons. The classifier is trained on the basis of the GMM-UBM (Gaussian Mixture Model and Universal Background Model) method. Experimental results show that the F-measure is 0.52 and the AUC is 0.79.

    DOI: 10.1109/PERCOMW.2015.7134069

    Web of Science

    researchmap

    Other Link: https://ousar.lib.okayama-u.ac.jp/ja/53271

  • Algorithm to Estimate a Living Area Based on Connectivity of Places with Home Reviewed International journal

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    HCI International 2015 - Posters’ Extended Abstracts (Part II), CCIS 529   529   570 - 576   2015

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:SPRINGER-VERLAG BERLIN  

    We propose an algorithm to estimate a person's living area using his/her collected Global Positioning System (GPS) data. The most important feature of the algorithm is the connectivity of places with a home, i.e., a living area must consist of a home, important places, and routes that connect them. This definition is logical because people usually go to a place from home, and there can be several routes to that place. Experimental results show that the proposed algorithm can estimate a living area with a precision of 0.82 and recall of 0.86 compared with the ground truth established by users. It is also confirmed that the connectivity of places with a home is necessary to estimate a reasonable living area.

    DOI: 10.1007/978-3-319-21383-5_95

    Web of Science

    researchmap

  • Extraction of Key Segments from Day-Long Sound Data Reviewed International journal

    Akinori Kasai, Sunao Hara, Masanobu Abe

    HCI International 2015 - Posters’ Extended Abstracts (Part I), CCIS 528   528   620 - 626   2015

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:SPRINGER-VERLAG BERLIN  

    We propose a method to extract particular sound segments from the sound recorded during the course of a day in order to provide sound segments that can be used to facilitate memory. To extract important parts of the sound data, the proposed method utilizes human behavior based on a multisensing approach. To evaluate the performance of the proposed method, we conducted experiments using sound, acceleration, and global positioning system data collected by five participants for approximately two weeks. The experimental results are summarized as follows: (1) various sounds can be extracted by dividing a day into scenes using the acceleration data; (2) sound recorded in unusual places is preferable to sound recorded in usual places; and (3) speech is preferable to nonspeech sound.

    DOI: 10.1007/978-3-319-21380-4_105

    Web of Science

    researchmap

  • Inhibitory Effects of an Orally Active Small Molecule Alpha4beta1/Alpha4beta7 Integrin Antagonist, TRK-170, on Spontaneous Colitis in HLA-B27 Transgenic Rats

    Hiroe Hirokawa, Yoko Koga, Rie Sasaki, Sunao Hara, Hiroyuki Meguro, Mie Kainoh

    GASTROENTEROLOGY   146 ( 5 )   S640 - S640   2014.5

     More details

    Language:English   Publisher:W B SAUNDERS CO-ELSEVIER INC  

    Web of Science

    researchmap

  • A graph-based spoken dialog strategy utilizing multiple understanding hypotheses Reviewed

    Norihide Kitaoka, Yuji Kinoshita, Sunao Hara, Chiyomi Miyajima, Kazuya Takeda

    Information and Media Technologies   9 ( 1 )   111 - 120   2014.3

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Information and Media Technologies Editorial Board  

    We regarded a dialog strategy for information retrieval as a graph search problem and proposed several novel dialog strategies that can recover from misrecognition through a spoken dialog that traverses the graph. To recover from misrecognition without seeking confirmation, our system kept multiple understanding hypotheses at each turn and searched for a globally optimal hypothesis in the graph whose nodes express understanding states across user utterances in a whole dialog. In the search, we used a new criterion based on efficiency in information retrieval and consistency with understanding hypotheses, which is also used to select an appropriate system response. We showed that our system can make more efficient and natural dialogs than previous ones.

    DOI: 10.11185/imt.9.111

    CiNii Article

    researchmap

  • New approach to emotional information exchange: Experience metaphor based on life logs Reviewed International journal

    Masanobu Abe, Daisuke Fujioka, Kazuto Hamano, Sunao Hara, Rika Mochizuki, Tomoki Watanabe

    2014 IEEE International Conference on Pervasive Computing and Communication Workshops, PerCom 2014 Workshops   191 - 194   2014.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    We are striving to develop a new communication technology based on individual experiences that can be extracted from life logs. We have proposed the "Emotion Communication Model" and confirmed that significant correlation exists between experience and emotion. As the second step, particularly addressing impressive places and events, this paper describes an investigation of the extent to which we can share emotional information with others through individuals' experiences. Subjective experiments were conducted using life log data collected during 7-47 months. Experiment results show that (1) impressive places are determined by the distance from home, visit frequency, and direction from home and that (2) positive emotional information is highly consistent among people (71.4%), but it is not true for negative emotional information. Therefore, experiences are useful as metaphors to express positive emotional information.

    DOI: 10.1109/PerComW.2014.6815198

    researchmap

  • Development of a Toolkit Handling Multiple Speech-Oriented Guidance Agents for Mobile Applications Reviewed International journal

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    Natural Interaction with Robots, Knowbots and Smartphones, Putting Spoken Dialog Systems into Practice   79 - 85   2014

     More details

    Authorship:Lead author   Language:English   Publishing type:Part of collection (book)   Publisher:Springer  

    DOI: 10.1007/978-1-4614-8280-2_8

    researchmap

  • Evaluation of Invalid Input Discrimination Using Bag-of-Words for Speech-Oriented Guidance System Reviewed International journal

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    Natural Interaction with Robots, Knowbots and Smartphones, Putting Spoken Dialog Systems into Practice   389 - 397   2014

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:Springer  

    DOI: 10.1007/978-1-4614-8280-2_35

    researchmap

  • A Hybrid Text-to-Speech Based on Sub-Band Approach Reviewed International journal

    Takuma Inoue, Sunao Hara, Masanobu Abe

    Proceedings of Asia-Pacific Signal and Information Processing Association 2014 Annual Summit and Conference   1 - 4   2014

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper proposes a sub-band speech synthesis approach to develop high-quality Text-to-Speech (TTS). For the low-frequency band and high-frequency band, Hidden Markov Model (HMM)-based speech synthesis and waveform-based speech synthesis are used, respectively. Both speech synthesis methods are widely known to show good performance and to have benefits and shortcomings from different points of view. One motivation is to apply the right speech synthesis method in the right frequency band. Experiment results show that in terms of smoothness the proposed approach shows better performance than waveform-based speech synthesis, and in terms of clarity it shows better performance than HMM-based speech synthesis. Consequently, the proposed approach combines the inherent benefits of both waveform-based speech synthesis and HMM-based speech synthesis.

    DOI: 10.1109/APSIPA.2014.7041575

    Web of Science

    researchmap

  • Invalid Input Rejection Using Bag-of-Words for Speech-oriented Guidance System Reviewed

    Haruka Majima, Yoko Fujita, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    Journal of Information Processing   54 ( 2 )   443 - 451   2013.2

     More details

    Language:Japanese  

    In a real-environment speech-oriented information guidance system, discrimination between valid and invalid inputs is important, as invalid inputs such as noise, laughter, coughing, and utterances between users lead to unpredictable system responses. Generally, acoustic features such as MFCC (Mel-Frequency Cepstral Coefficients) are used for discrimination. Comparing acoustic likelihoods of GMMs (Gaussian Mixture Models) trained on speech data and noise data is one of the typical methods. In addition, using linguistic features, such as the speech recognition result, is considered to improve discrimination accuracy, as it reflects the task domain of invalid inputs and the meaningless recognition results from noise inputs. In this paper, we introduce Bag-of-Words (BOW) as a feature to discriminate between valid and invalid inputs. Support Vector Machine (SVM) and Maximum Entropy method (ME) are also employed to realize robust classification. We evaluated the methods using real-environment data obtained from the guidance system "Takemaru-kun." By applying BOW with SVM, the F-measure is improved to 85.09%, from 82.19% when using GMMs. In addition, experiments using features combining BOW with acoustic likelihoods from GMMs, duration, and SNR were conducted, improving the F-measure to 86.58%.

    CiNii Article

    CiNii Books

    researchmap

  • On-line detection of task incompletion for spoken dialog systems based on utterance and behavior tag N-gram Reviewed

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The IEICE Transactions on Information and Systems (Japanese edition)   J96-D ( 1 )   81 - 93   2013.1

     More details

    Authorship:Lead author   Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

    Other Link: http://search.ieice.org/bin/summary.php?id=j96-d_1_81&category=D&year=2013&lang=J&abst=

  • Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition Reviewed

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    IEICE TRANSACTIONS on Information and Systems   E95D ( 10 )   2479 - 2485   2012.10

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG  

    A novel speech feature generation-based acoustic model training method for robust speaker-independent speech recognition is proposed. For decades, speaker adaptation methods have been widely used. All of these adaptation methods need adaptation data. However, our proposed method aims to create speaker-independent acoustic models that cover not only known but also unknown speakers. We achieve this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then we train our models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the transformation matrices for the existing speakers is estimated. Next, we construct pseudo-speaker transformations by sampling the weight parameters from the distribution, and apply the transformation to the normalized features of the existing speakers to generate the features of the pseudo-speakers. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models trained using our proposed method are robust for unknown speakers.

    DOI: 10.1587/transinf.E95.D.2479

    Web of Science

    researchmap

    Other Link: http://search.ieice.org/bin/summary.php?id=e95-d_10_2479

  • Causal analysis of task completion errors in spoken music retrieval interactions Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of the 8th international conference on Language Resources and Evaluation (LREC 2012)   1365 - 1372   2012.5

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, we analyze the causes of task completion errors in spoken dialog systems, using a decision tree with N-gram features of the dialog to detect task-incomplete dialogs. The dialog for a music retrieval task is described by a sequence of tags related to user and system utterances and behaviors. The dialogs are manually classified into two classes: completed and uncompleted music retrieval tasks. Differences in tag classification performance between the two classes are discussed. We then construct decision trees which can detect if a dialog finished with the task completed or not, using information gain criterion. Decision trees using N-grams of manual tags and automatic tags achieved 74.2% and 80.4% classification accuracy, respectively, while the tree using interaction parameters achieved an accuracy rate of 65.7%. We also discuss more details of the causality of task incompletion for spoken dialog systems using such trees.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2012/summaries/1059.html

  • Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation Reviewed International journal

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of 2011 Automatic Speech Recognition and Understanding Workshop (ASRU 2011)   169 - 172   2011.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    In this paper, we propose a novel acoustic model training method which is suitable for speaker adaptation in speech recognition. Our method is based on feature generation from a small amount of speakers' data. For decades, speaker adaptation methods have been widely used. Such adaptation methods need some amount of adaptation data, and if the data is not sufficient, speech recognition performance degrades significantly. If the seed models to be adapted to a specific speaker can widely cover more speakers, speaker adaptation can perform robustly. To make such robust seed models, we adopt inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our seed models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the MLLR transformation matrices for the existing speakers is estimated. Next, we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformation to the normalized existing speaker features to generate the pseudo-speakers' features. Finally, using these features, we train the acoustic seed models. Using these seed models, we obtained better speaker adaptation results than using simply environmentally adapted models. © 2011 IEEE.

    DOI: 10.1109/ASRU.2011.6163925

    Scopus

    researchmap

  • Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations Reviewed International journal

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of 2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)   1 - 5   2011.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:APSIPA  

    In this paper, a novel speech-feature-generation-based acoustic model training method is proposed. For decades, speaker adaptation methods have been widely used. All existing adaptation methods need adaptation data. However, our proposed method creates speaker-independent acoustic models that cover not only known but also unknown speakers. We do this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the MLLR transformation matrices for the existing speakers is estimated. Next, we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformation to the normalized existing speaker features to generate the pseudo-speakers' features. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models which are created are robust for unknown speakers.

    Scopus

    researchmap

  • On-line detection of task incompletion for spoken dialog systems using utterance and behavior tag N-gram vectors Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of the Paralinguistic Information and Its Integration in Spoken Dialogue Systems Workshop   215 - 225   2011.9

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of INTERSPEECH 2011   1312 - 1315   2011

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA-INT SPEECH COMMUNICATION ASSOC  

    We propose a method of detecting "task incomplete" dialogs in spoken dialog systems using N-gram-based dialog models. We used a database created during a field test in which inexperienced users used a client-server music retrieval system with a spoken dialog interface on their own PCs. In this study, the dialog for a music retrieval task consisted of a sequence of user and system tags that related their utterances and behaviors. The dialogs were manually classified into two classes: the dialog either completed the music retrieval task or it didn't. We then detected dialogs that did not complete the task, using N-gram probability models or a Support Vector Machine with N-gram feature vectors trained using manually classified dialogs. Off-line and on-line detection experiments were conducted on a large amount of real data, and the results show that our proposed method achieved good classification performance.
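The utterance-and-behavior tag N-gram vectors used as SVM features in this abstract can be sketched as count vectors over tag bigrams of a dialog. The tag names below are invented for illustration and are not the annotation scheme used in the paper:

```python
from collections import Counter

def ngram_features(tags, n=2):
    """Count n-grams (here bigrams) over a dialog's utterance/behavior tags.

    `tags` is a time-ordered list of user/system tag strings; the resulting
    sparse count vector can feed an SVM or other classifier.
    """
    padded = ["<s>"] + list(tags) + ["</s>"]
    return Counter(tuple(padded[i:i + n]) for i in range(len(padded) - n + 1))

# Illustrative dialog: alternating user (U:) and system (S:) tags.
dialog = ["U:request", "S:confirm", "U:repeat", "S:confirm", "U:bye"]
feats = ngram_features(dialog)
print(feats[("S:confirm", "U:repeat")])  # 1
```

In the on-line setting, the same vector can be recomputed after each new tag, so a partial dialog can be classified before it ends.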

    Web of Science

    researchmap

    Other Link: http://www.isca-speech.org/archive/interspeech_2011/i11_1305.html

  • Music Recommendation System Based on Human-to-human Conversation Recognition Reviewed International journal

    Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Workshop Proceedings of the 7th International Conference on Intelligent Environments: Ambient Intelligence and Smart Environments   10   352 - 361   2011

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:IOS PRESS  

    We developed an ambient system that plays music suitable for the mood of a human-human conversation using words obtained from a continuous-speech recognition system. Using the correspondence between a document space based on the texts related to the music and an acoustic space that expresses various audio features, the continuous-speech recognition results are mapped to an acoustic space. We performed a subjective evaluation of the system. The subjects rated the recommended music and the result reveals that the 10 most highly recommended selections included suitable music.
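The matching idea in this abstract — relating words recognized from the conversation to text documents associated with each piece of music — can be sketched with a plain bag-of-words cosine similarity. The song names, word lists, and scoring below are illustrative assumptions, not the paper's actual document and acoustic spaces:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words count vectors."""
    num = sum(a[k] * b.get(k, 0) for k in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

# Toy text descriptions standing in for each song's document-space vector.
songs = {
    "calm_piano": Counter("quiet rain evening piano calm".split()),
    "party_pop":  Counter("dance party night fun loud".split()),
}

# Words obtained from continuous-speech recognition of the conversation.
conversation = Counter("it is a quiet rainy evening tonight".split())

best = max(songs, key=lambda s: cosine(conversation, songs[s]))
print(best)  # calm_piano
```

The paper additionally maps this document space onto an acoustic feature space, so recommendations generalize to music whose related texts were not directly matched.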

    DOI: 10.3233/978-1-60750-795-6-352

    Web of Science

    researchmap

  • Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of INTERSPEECH2010   3034 - 3037   2010.9

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    In this paper, we propose a method of detecting task-incomplete users for a spoken dialog system using an N-gram-based dialog history model. We collected a large amount of spoken dialog data accompanied by usability evaluation scores given by users in real environments. The database was made in a field test in which naive users used a client-server music retrieval system with a spoken dialog interface on their own PCs. An N-gram model was trained from sequences that consist of user dialog acts and/or system dialog acts for two dialog classes: dialogs that completed the music retrieval task and dialogs that did not. The system then detects unknown dialogs that did not complete the task based on the N-gram likelihood. Experiments were conducted on a large amount of real data, and the results show that our proposed method achieved good classification performance. When the classifier correctly detected all of the task-incomplete dialogs, our proposed method achieved a false detection rate of 6%.
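The N-gram likelihood classification described above can be sketched with two additively smoothed bigram models, one per dialog class, comparing log-likelihoods of an unseen dialog-act sequence. The dialog-act names and toy training data are illustrative, not from the paper's corpus:

```python
import math
from collections import Counter

def train_bigram(sequences, vocab):
    """Additive-smoothed bigram model over dialog-act tag sequences."""
    counts, ctx = Counter(), Counter()
    for seq in sequences:
        padded = ["<s>"] + seq + ["</s>"]
        for a, b in zip(padded, padded[1:]):
            counts[(a, b)] += 1
            ctx[a] += 1
    V = len(vocab) + 2  # vocabulary plus the two boundary symbols
    return lambda a, b: (counts[(a, b)] + 1) / (ctx[a] + V)

def loglik(model, seq):
    padded = ["<s>"] + seq + ["</s>"]
    return sum(math.log(model(a, b)) for a, b in zip(padded, padded[1:]))

# Illustrative dialog-act sequences for the two classes.
completed = [["ask", "confirm", "play"], ["ask", "play"]]
failed = [["ask", "reject", "ask", "reject"], ["ask", "reject"]]
vocab = {"ask", "confirm", "play", "reject"}
p_ok, p_ng = train_bigram(completed, vocab), train_bigram(failed, vocab)

test = ["ask", "reject", "ask"]
label = "incomplete" if loglik(p_ng, test) > loglik(p_ok, test) else "complete"
print(label)  # incomplete
```

Thresholding the log-likelihood ratio instead of taking the plain maximum gives the operating-point control behind the reported false detection rate.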

    Web of Science

    researchmap

    Other Link: http://www.isca-speech.org/archive/interspeech_2010/i10_3034.html

  • Rapid acoustic model adaptation using inverse MLLR-based feature generation Reviewed International journal

    Arata Ito, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of ICA2010   5   1 - 6   2010.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a technique for generating a large amount of target-speaker-like speech features by converting a large amount of prepared speech features of many speakers into features similar to those of the target speaker using a transformation matrix. To generate a large amount of target-speaker-like features, the system only needs a very small amount of the target speaker's utterances. This technique enables the system to adapt the acoustic model efficiently from a small amount of the target speaker's utterances. To evaluate the proposed method, we prepared 100 reference speakers and 12 target (test) speakers. We conducted the experiments on an isolated word recognition task using a speech database collected in real PC-based distributed environments and compared our proposed method with MLLR, MAP, and a method theoretically equivalent to SAT. Experimental results proved that the proposed method needed a significantly smaller amount of the target speaker's utterances than conventional MLLR, MAP, and SAT.

    Scopus

    researchmap

  • Estimation method of user satisfaction using N-gram-based dialog history model for spoken dialog system Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION   78 - 83   2010.5

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, we propose an estimation method of user satisfaction for a spoken dialog system using an N-gram-based dialog history model. We have collected a large amount of spoken dialog data accompanied by usability evaluation scores given by users in real environments. The database was made in a field test in which naive users used a client-server music retrieval system with a spoken dialog interface on their own PCs. An N-gram model is trained from the sequences that consist of users' dialog acts and/or the system's dialog acts for each of six user satisfaction levels: from 1 to 5, and phi (task not completed). Then, the satisfaction level is estimated based on the N-gram likelihood. Experiments were conducted on the large amount of real data, and the results show that our proposed method achieved good classification performance; the classification accuracy was 94.7% in the experiment on classifying dialogs into those with task completion and those without. Even when the classifier detected all of the task-incomplete dialogs correctly, our proposed method achieved a false detection rate of only 6%.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2010/summaries/579.html

  • Data collection and usability study of a PC-based speech application in various user environments Reviewed International journal

    Sunao Hara, Chiyomi Miyajima, Katsunobu Ito, Norihide Kitaoka, Kazuya Takeda

    Proceedings of Oriental-COCOSDA 2008   39 - 44   2008.11

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • In-car Speech Data Collection along with Various Multimodal Signals Reviewed International journal

    Akira Ozaki, Sunao Hara, Takashi Kusakawa, Chiyomi Miyajima, Takanori Nishino, Norihide Kitaoka, Katunobu Itou, Kazuya Takeda

    Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)   1846 - 1851   2008.5

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, a large-scale real-world speech database is introduced along with other multimedia driving data. We designed a data collection vehicle equipped with various sensors to synchronously record twelve-channel speech, three-channel video, driving behavior including gas and brake pedal pressures, steering angles, and vehicle velocities, physiological signals including driver heart rate, skin conductance, and emotion-based sweating on the palms and soles, etc. These multimodal data are collected while driving on city streets and expressways under four different driving task conditions including two kinds of monologues, human-human dialog, and human-machine dialog. We investigated the response timing of drivers against navigator utterances and found that most overlapped with the preceding utterance due to the task characteristics and the features of Japanese. When comparing utterance length, speaking rate, and the filler rate of driver utterances in human-human and human-machine dialogs, we found that drivers tended to use longer and faster utterances with more fillers to talk with humans than machines.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2008/summaries/472.html

  • Data Collection System for the Speech Utterances to an Automatic Speech Recognition System under Real Environments Reviewed

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    The IEICE transactions on information and systems   J90-D ( 10 )   2807 - 2816   2007.10

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (scientific journal)   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • An online customizable music retrieval system with a spoken dialogue interface Reviewed International journal

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    The Journal of the Acoustical Society of America   120 ( 5-2 )   3378 - 3379   2006.11

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    CiNii Article

    researchmap

  • Novel orally active alpha 4 integrin antagonist, T-728, attenuates dextran sodium sulfate-induced chronic colitis in mice

    Ken-Ichi Hayashi, Hiroyuki Meguro, Sunao Hara, Rie Sasaki, Yoko Koga, Meiko Takeshita, Naoyoshi Yamamoto, Hiroe Hirokawa, Mie Kainoh

    GASTROENTEROLOGY   130 ( 4 )   A352 - A352   2006.4

     More details

    Language:English   Publisher:W B SAUNDERS CO-ELSEVIER INC  

    Web of Science

    researchmap

  • Preliminary Study of a Learning Effect on Users to Develop a New Evaluation of the Spoken Dialogue System Reviewed International journal

    Sunao Hara, Ayako Shirose, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Proceedings of Oriental-COCOSDA 2005   164 - 168   2005.12

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

▼display all

MISC

  • Generating subjective noise maps from environmental sounds by machine learning Invited

    原直, 阿部匡伸

    騒音制御   46 ( 3 )   126 - 130   2022.6

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    researchmap

  • Collecting environmental sounds by crowd-sensing Invited

    阿部匡伸, 原直

    騒音制御   42 ( 1 )   20 - 23   2018

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    researchmap

  • Environmental sound sensing by smartdevices, and its applications Invited

    73 ( 8 )   483 - 490   2017.8

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    DOI: 10.20697/jasj.73.8_483

    CiNii Article

    CiNii Books

    researchmap

  • A study of an emotion communication method using events as metaphors Reviewed

    濱野和人, 原直, 阿部匡伸

    電子情報通信学会論文誌   J97-D ( 12 )   1680 - 1683   2014.12

     More details

    Language:Japanese   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)   Publisher:電子情報通信学会  

    researchmap

  • Potential Applications of Acoustic Signal Processing from Lifelog Research Perspectives Invited

    38 ( 1 )   15 - 21   2014

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    CiNii Article

    CiNii Books

    researchmap

  • [Toward the Practical Use of Spoken Dialogue Systems] Technologies of the speech-oriented guidance system "Takemaru-kun" that supported ten years of long-term operation Invited

    西村竜一, 原直, 川波弘道, LEE Akinobu, 鹿野清宏

    人工知能学会誌   28 ( 1 )   52 - 59   2013.1

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)   Publisher:The Japanese Society for Artificial Intelligence  

    DOI: 10.11517/jjsai.28.1_52

    CiNii Article

    CiNii Books

    J-GLOBAL

    researchmap

  • Detection of Task Incomplete Dialogs Based on Utterance Sequences N-gram for Spoken Dialog System Reviewed

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The IEICE transactions on information and systems (Japanese edition)   J94-D ( 2 )   497 - 500   2011.2

     More details

    Language:Japanese   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • A study of emotion control using voice conversion and x-vector embeddings for text-to-speech synthesis in spoken dialogue systems

    小原俊一, 阿部匡伸, 原直

    日本音響学会講演論文集   1275 - 1278   2023.9

     More details

    Language:Japanese   Publisher:一般社団法人日本音響学会  

    researchmap

  • A study of a conversation support system based on ambiguous input from wearable devices

    岩崎茉理, 原直, 阿部匡伸

    日本音響学会講演論文集   1369 - 1372   2023.9

     More details

    Language:Japanese   Publisher:一般社団法人日本音響学会  

    researchmap

  • A study of intonation-controllable end-to-end speech synthesis trained on emphasized passages of novel audiobooks

    和田拓海, 阿部匡伸, 原直

    日本音響学会講演論文集   903 - 906   2023.3

     More details

    Language:Japanese   Publisher:一般社団法人日本音響学会  

    researchmap

  • A study of response generation according to the user's emotion polarity in an empathetic dialogue system based on lifelogs

    前薗そよぎ, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   122 ( 423 )   102 - 107   2023.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of text-to-speech synthesis using x-vector embeddings and voice conversion that matches the emotion of input speech, for spoken dialogue systems

    小原俊一, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   122 ( 389 )   203 - 208   2023.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of a similarity calculation method for personal authentication using granularity-adjustable living areas based on GPS data

    遠山大督, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   122 ( 338 )   25 - 30   2023.1

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of improving strained singing voices by voice conversion using differential mel-cepstra

    植田遥人, 阿部匡伸, 原直

    日本音響学会講演論文集   1405 - 1408   2022.9

     More details

    Language:Japanese   Publisher:一般社団法人日本音響学会  

    researchmap

  • A study of a dialogue system that conveys familiarity to users by varying its utterances according to their lifelogs

    前薗そよぎ, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2022)講演論文集   1182 - 1190   2022.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of end-to-end speech synthesis with seamless speaker-identity control by manipulating speaker features

    青谷直樹, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   122 ( 81 )   55 - 60   2022.6

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of features for estimating engagement in discussions from audio and video

    金岡翼, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   121 ( 401 )   57 - 62   2022.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of estimating SSQP-based impressions of places from environmental sounds and aerial photographs

    小野祐介, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   121 ( 401 )   51 - 56   2022.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of improving the phoneme intelligibility of glossectomy patients by knowledge distillation using lip features

    高島和嗣, 阿部匡伸, 原直

    電子情報通信学会技術研究報告   121 ( 385 )   108 - 113   2022.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of highly natural DNN-based singing voice synthesis for backing-chorus synthesis

    木岡智宏, 阿部匡伸, 原直

    電子情報通信学会技術研究報告   121 ( 385 )   102 - 107   2022.3

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • Parkinson's disease severity estimation using an insole-type pressure sensor

    林倖生, 原直, 阿部匡伸, 武本麻美

    電子情報通信学会総合大会(H-4-7)   2022.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A study of estimating impressions of places using environmental sounds and aerial photographs

    小野祐介, 原直, 阿部匡伸

    第24回 日本音響学会関西支部 若手研究者交流研究発表会 発表概要集   34 - 34   2021.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本音響学会関西支部  

    researchmap

  • A study of vibrato expression using bidirectional LSTMs for singing voice synthesis

    金子隼人, 阿部匡伸, 原直

    日本音響学会講演論文集   1109 - 1112   2021.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A method for improving the phoneme intelligibility of glossectomy patients by knowledge distillation of phoneme information

    高島和嗣, 阿部匡伸, 原直

    日本音響学会講演論文集   1057 - 1060   2021.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Analysis of gait data measured with an insole-type pressure sensor toward Parkinson's disease severity estimation

    林倖生, 原直, 阿部匡伸, 武本麻美

    第20回情報科学技術フォーラム (FIT 2021),CK-001   3   71 - 74   2021.8

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of dialogue strategies to support natural topic development in human-to-human conversation

    前薗そよぎ, 原直, 阿部匡伸

    情報処理学会研究報告   2021-SLP-137 ( 16 )   1 - 6   2021.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • An experimental study on articulation improvement with a surface-contact-type artificial tongue designed for easy control of the expiratory airflow path

    長塚弘亮, 川上滋央, 古寺寛志, 佐藤匡晃, 田中祐貴, 兒玉直紀, 原直, 皆木省吾

    日本顎顔面補綴学会 第38回総会・学術大会   30 - 30   2021.6

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本顎顔面補綴学会  

    researchmap

  • Evaluation of Japanese end-to-end speech synthesis whose inputs are pronunciation and prosodic symbols estimated by neural machine translation

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   847 - 850   2021.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Evaluation of Concept Drift Adaptation for Acoustic Scene Classifier Based on Kernel Density Drift Detection and Combine Merge Gaussian Mixture Model

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    日本音響学会講演論文集   201 - 204   2021.3

     More details

    Language:English   Publisher:日本音響学会  

    researchmap

  • A study of singing voice synthesis using bidirectional LSTMs capable of adding singing expressions

    金子隼人, 原直, 阿部匡伸

    日本音響学会講演論文集   987 - 990   2021.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Development of a glove-type input device with pressure sensors for a TTS-based conversation support system and evaluation of its input speed

    小林誠, 原直, 阿部匡伸

    情報処理学会研究報告   2020-HCI-190 ( 20 )   1 - 6   2020.12

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of temporal features using an insole-type pressure sensor for Parkinson's disease severity estimation

    林倖生, 原直, 阿部匡伸, 武本麻美

    2020年度(第71回)電気・情報関連学会中国支部連合大会,R20-14-02-03   1 - 1   2020.10

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:電気・情報関連学会中国支部  

    researchmap

  • Estimating pronunciation and prosodic symbol sequences from Japanese text using a Transformer

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   829 - 832   2020.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of emotion intensity control in WaveNet-based emotional speech synthesis without linguistic information

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   867 - 870   2020.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of estimating engagement and positive/negative attitudes in discussions using video and audio

    金岡翼, 上原佑太郎, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2020)講演論文集   1422 - 1429   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of a spoken dialogue system that responds according to the user's familiarity with the topic

    加藤大地, 原直, 阿部匡伸

    情報処理学会研究報告   2020-SLP-132 ( 21 )   1 - 6   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Analysis of the importance of places in daily life by clustering GPS data

    平田瑠, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2020)講演論文集   785 - 793   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A method for estimating Japanese sentences from ambiguous wearable-device input using neural machine translation

    渡邊淳, 原直, 阿部匡伸

    情報処理学会研究報告   2020-HCI-187 ( 7 )   1 - 7   2020.3

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of voice conversion using estimated phoneme posterior probabilities for improving the phoneme intelligibility of glossectomy patients

    荻野聖也, 原直, 阿部匡伸

    電子情報通信学会総合大会 情報・システムソサイエティ特別企画 学生ポスターセッション予稿集   124 - 124   2020.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using end-to-end speech recognition International coauthorship

    井上勝喜, 原直, 阿部匡伸, 林知樹, 山本龍一, 渡部晋治

    日本音響学会講演論文集   1095 - 1098   2020.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of emotion conversion using CycleGAN trained on emotional synthetic speech without linguistic information

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   1165 - 1168   2020.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Performance evaluation of end-to-end speech synthesis using end-to-end speech recognition

    井上勝喜, 原直, 阿部匡伸, 渡部晋治

    第22回 日本音響学会 関西支部若手研究者交流研究発表会 概要集   6 - 6   2019.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本音響学会関西支部  

    researchmap

  • A study of speaker individuality in WaveNet-based emotional speech synthesis without linguistic information

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   993 - 996   2019.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Effects of a newly designed artificial tongue and an anatomical artificial tongue, and criteria for selecting between them

    佐藤匡晃, 長塚弘亮, 川上滋央, 兒玉直紀, 原直, 阿部匡伸, 皆木省吾

    日本補綴歯科学会 中国・四国支部学術大会,抄録集   28 - 28   2019.9

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本補綴歯科学会  

    researchmap

  • Environmental sound classification using bottleneck features extracted from a CNN autoencoder

    松原拓未, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2019)講演論文集   339 - 346   2019.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Detection of special behaviors in daily life based on GPS data

    小林誠, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2019)講演論文集   846 - 853   2019.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of WaveNet-based emotional speech synthesis without linguistic information

    松本剣斗, 原直, 阿部匡伸

    情報処理学会研究報告   2019-SLP-127 ( 61 )   1 - 6   2019.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of a method for estimating bustle sounds based on i-vectors

    呉セン陽, 朝田興平, 原直, 阿部匡伸

    情報処理学会研究報告   2019-SLP-127 ( 33 )   1 - 6   2019.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of adding emotion using a small amount of target emotional speech in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   1085 - 1088   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of estimating auxiliary phoneme information for improving the phoneme intelligibility of glossectomy patients by voice conversion

    荻野聖也, 村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1155 - 1158   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of voice conversion using auxiliary phoneme information based on bidirectional LSTM-RNNs for improving the phoneme intelligibility of glossectomy patients

    村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1151 - 1154   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of the effect of background-listening music on workload and a method for selecting such music

    高瀬郁, 阿部匡伸, 原直

    情報処理学会研究報告   2018-MUS-121 ( 19 )   1 - 6   2018.11

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Evaluation of methods for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   1105 - 1108   2018.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Speech Enhancement of Glossectomy Patient's Speech using Voice Conversion Approach

    Masanobu Abe, Seiya Ogino, Hiroki Murakami, Sunao Hara

    日本生物物理学会第56回年会,シンポジウム:ヘルスシステムの理解と応用   198 - 198   2018.9

     More details

    Authorship:Corresponding author   Language:English   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本生物物理学会  

    researchmap

  • A study of auxiliary information for improving the phoneme intelligibility of glossectomy patients by voice conversion

    村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1175 - 1178   2018.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of subjective noisiness estimation for constructing environmental sound maps via crowdsourcing

    原直, 阿部匡伸

    第17回情報科学技術フォーラム (FIT 2018),O-001   4   343 - 346   2018.9

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of improving the phoneme intelligibility of glossectomy patients by voice conversion using speech and lip shapes

    荻野聖也, 村上博紀, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   118 ( 112 )   7 - 12   2018.6

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • Construction of a multimodal database for improving the phoneme intelligibility of glossectomy patients

    村上博紀, 荻野聖也, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    日本音響学会講演論文集   355 - 358   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of duration models for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   279 - 282   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Field-experiment evaluation of a bustle-sound classification method using crowdsourcing

    朝田興平, 原直, 阿部匡伸

    日本音響学会講演論文集   79 - 82   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of model structures for handling speaker and emotion information in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   263 - 266   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of improving the phoneme intelligibility of glossectomy patients by voice conversion using DNN-based differential spectrum correction

    村上博紀, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    日本音響学会講演論文集   297 - 300   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Creating noise maps based on a DNN noisiness estimation method that accounts for human perception

    小林将大, 原直, 阿部匡伸

    日本音響学会講演論文集   623 - 626   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of visualization methods providing effective incentives for environmental sound collection Reviewed

    畠山晏彩子, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2017)講演論文集   255 - 262   2017.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of model structures for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    電子情報通信学会技術研究報告   117 ( 106 )   23 - 28   2017.6

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A watching system based on living areas of two granularities

    鎌田成紀, 原直, 阿部匡伸

    電子情報通信学会総合大会 (D-9-12)   1 - 1   2017.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A DNN-based noisiness estimation method for creating noise maps that account for human perception

    小林将大, 原直, 阿部匡伸

    日本音響学会講演論文集   799 - 802   2017.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Environmental sound classification with CNNs using an environmental sound database recorded by smartphones

    鳥羽隼司, 原直, 阿部匡伸

    日本音響学会講演論文集   139 - 142   2017.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of a spectral conversion method that accounts for fundamental frequency modification Reviewed

    床建吾, 阿部匡伸, 原直

    第18回IEEE広島支部学生シンポジウム(HISS 18th)   174 - 176   2016.11

     More details

    Language:Japanese   Publisher:IEEE広島支部  

    researchmap

  • Multiple acoustic event detection from real-environment data using RNNs

    鳥羽隼司, 原直, 阿部匡伸

    日本音響学会講演論文集   43 - 44   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of removing tap noises contained in environmental sounds recorded by smartphones

    朝田興平, 原直, 阿部匡伸

    日本音響学会講演論文集   45 - 48   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A basic study of features for environmental sound detection in an environmental sound database containing overlapping sounds

    原直, 田中智康, 阿部匡伸

    日本音響学会講演論文集   3 - 6   2016.9

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of improving the phoneme intelligibility of glossectomy patients using GMM-based voice conversion

    田中慧, 原直, 阿部匡伸, 皆木省吾

    日本音響学会講演論文集   141 - 144   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Sound collection systems using a crowdsourcing approach for constructing subjective evaluation-based sound maps

    116 ( 189 )   41 - 46   2016.8

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese  

    CiNii Article

    CiNii Books

    researchmap

  • A study of an objective index expressing the subjective acceptability of GPS data anonymization levels Reviewed

    三藤優介, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2016)講演論文集   798 - 805   2016.7

     More details

    Authorship:Corresponding author   Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A noisiness estimation method for creating noise maps that account for human perception Reviewed

    小林将大, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2016)講演論文集   141 - 148   2016.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A measure for transfer tendency between staying places

    116 ( 23 )   95 - 100   2016.5

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • A watching method to protect users' privacy using living area

    115 ( 409 )   19 - 24   2016.1

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • A classification method for crowded situation using environmental sounds based on Gaussian mixture model-universal background model Reviewed

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    The Journal of the Acoustical Society of America   140 ( 4 )   3110 - 3110   2016

     More details

    Language:Japanese   Publishing type:Research paper, summary (international conference)  

    DOI: 10.1121/1.4969721

    researchmap

  • Method to efficiently retrieve memorable scenes from video using automatically collected life log

    115 ( 27 )   23 - 28   2015.5

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Method to efficiently retrieve memorable scenes from video using automatically collected life log

    IPSJ SIG Notes   2015 ( 4 )   1 - 6   2015.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    One application of life logs is to retrieve memorable scenes. In this paper, to extract memorable scenes from video, we propose a method that uses a life log automatically collected together with the video during an event. The life-log data consist of GPS, pulse, and sound. Three kinds of important points are extracted from these three data streams, and based on those points, particular parts of the video are extracted. Subjective evaluation experiments revealed that users can easily remember things by watching the extracted video and can recall details of the events, including things that were not recorded in the video.

    CiNii Article

    CiNii Books

    researchmap

  • A "a big day" search method using features of staying place

    HAYASHI Keigo, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   89 - 94   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In recent years, life logs have become popular and are used to provide services tailored to a particular person. One such service is memory retrieval. A life log helps us recall events, activities, accidents, and so on, but the huge amount of data it contains makes it difficult to find what we really want. In this paper, we propose a method to retrieve "a big day" using a feature calculated from GPS data; the feature is defined as a function of the visiting frequency of places. According to the experimental results, the proposed method can retrieve "a big day" at a rate of 60% and an unusual day at a rate of 90%. Subjective evaluations were also carried out from the perspectives of effectiveness, efficiency, and satisfaction. The results showed that the proposed method outperforms a conventional method.

    CiNii Article

    CiNii Books

    researchmap

  • Living area extraction using staying places and routes

    MATSUO Yuji, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   77 - 82   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Demand for watching over the safety of the elderly and children has been increasing rapidly. We believe that a person's living area plays an important role in improving the quality of such watching. In this paper, we therefore propose an algorithm to estimate the living area of a person from his/her accumulated GPS data. The living area is defined by important places and routes. First, important places are extracted taking visiting frequency into account; routes connecting the important places are then found using best-first search. Experiments were carried out for three users and evaluated by precision and recall. We confirmed that the proposed algorithm outperforms a conventional method.

    CiNii Article

    CiNii Books

    researchmap

  • Behavior analysis of persons by classification of moving routes

    SETO Ryo, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   31 - 36   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we propose a method for behavior analysis of persons using GPS data, focusing on how active a person is. We evaluated whether a person can be judged to be active from moving-route data using PCA- or NMF-based classification. The evaluation results show that the number of important eigenvalues from PCA and the approximation error from NMF are effective indicators of whether a person is active. In addition, we extracted patterns of moving routes and evaluated the differences between the routes extracted by PCA and by NMF. As a result, PCA extracted high-frequency patterns, whereas NMF extracted both high-frequency and low-frequency patterns.

    CiNii Article

    CiNii Books

    researchmap

  • FLAG: Lifelog aggregation system that was centered on position information

    2014 ( 6 )   1 - 6   2014.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Recently, applications and services that utilize location information from GPS have become highly prevalent. In this paper, we develop a system called FLAG that aggregates a variety of lifelogs around location information. FLAG manages location information by discriminating between moving and staying. Using FLAG, we visualize categorized location information on a map and a timetable, and implement a function that lets users register individual names for staying places. As an example of aggregating various kinds of lifelogs, we also link Twitter posts to location information from FLAG using the posting time; this function enables a lifelog to be shown on the map even if it has no positional information. To evaluate the FLAG system, six users created ground-truth data of staying, and we compared the staying-detection accuracies of two detection methods. As a result, we confirmed that FLAG detects staying more accurately than the original data.

    CiNii Article

    CiNii Books

    researchmap

    Other Link: http://id.nii.ac.jp/1001/00102345/

  • FLAG : Lifelog aggregation system that was centered on position information

    IEICE technical report. SC, Services Computing   114 ( 157 )   29 - 34   2014.7

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Recently, applications and services that utilize location information from GPS have become highly prevalent. In this paper, we develop a system called FLAG that aggregates a variety of lifelogs around location information. FLAG manages location information by discriminating between moving and staying. Using FLAG, we visualize categorized location information on a map and a timetable, and implement a function that lets users register individual names for staying places. As an example of aggregating various kinds of lifelogs, we also link Twitter posts to location information from FLAG using the posting time; this function enables a lifelog to be shown on the map even if it has no positional information. To evaluate the FLAG system, six users created ground-truth data of staying, and we compared the staying-detection accuracies of two detection methods. As a result, we confirmed that FLAG detects staying more accurately than the original data.

    CiNii Article

    CiNii Books

    researchmap

  • Development of environmental sound collection system using smart devices based on crowd-sourcing approach

    HARA Sunao, KASAI Akinori, ABE Masanobu, SONEHARA Noboru

    IEICE technical report. Speech   114 ( 52 )   177 - 180   2014.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this study, we aim to construct an environmental sound database covering various sounds as the wisdom of crowds. For example, when considering environmental noise pollution as a kind of environmental sound, we need to consider not only signature sounds, e.g., car noise and railway noise, but also life-space noise, e.g., festivity noise on streets or in parks. However, daily, wide-area sound collection is difficult to realize with only a few participants. We therefore aim to measure environmental sound over a vast area by applying a crowdsourcing approach. First, we developed a prototype application that runs on smart devices with Android OS, and then a prototype server system for collecting and browsing the collected sound data. We then calibrated the noise levels measured by the smart devices and carried out a sound collection experiment to validate the accuracy of the devices' sensors. In this report, we introduce the environmental sound collection system and the sound-collection experiment using it.

    CiNii Article

    CiNii Books

    researchmap

  • Preliminary study for behavior analysis based on degree of nodes in a network constructed from GPS data

    2014 ( 6 )   1 - 6   2014.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We discuss a behavior-analysis method based on structural features of networks constructed from a personal location history. In this paper, a directed network constructed from a GPS location history is called a stay network; its nodes are the stays extracted from the location history. In general, the nodes of directed networks have out-degrees and in-degrees as structural features, and the stay network shows biased values of out-degree, in-degree, and their ratio. We therefore assumed that relationships exist between these biased degree values and human behaviors during stays, and analyzed those relationships. We focused on the purpose of a stay, which is assumed to arise as a result of human behavior, and in particular analyzed the relationships between the biased degree values and the purpose of the stay.

    CiNii Article

    CiNii Books

    researchmap

  • Development of environmental sound collection system using smart devices based on crowd-sourcing approach

    2014 ( 36 )   1 - 4   2014.5

     More details

  • Influence analysis on user's workload in a spoken dialog strategy for a car navigation system

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    IPSJ SIG Notes   2014 ( 7 )   1 - 6   2014.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We assess dialog strategies from the standpoint of user workload to suggest how spoken dialog systems should be used while driving. We evaluate several dialog strategies, which combine methods of dialog initiative and confirmation manner, with an objective evaluation using computer simulation. Two simulation conditions are set up: in one, the system speaks only when the user has leeway to talk with it; in the other, the system speaks even if the user may fail to recognize the system's utterance. We also evaluate spoken dialog systems applying these methods with a subjective evaluation. The evaluations show that the user-initiative strategy has advantages in lower turn number and lower task-completion rate over both the system-initiative and mixed-initiative strategies when the recognition rate is high. The results also show that the system-initiative and mixed-initiative strategies have the same advantages over the user-initiative strategy when the recognition rate is low. Additionally, the results show that the method in which the system speaks only when the user has enough leeway in the driving operation lowers the user's workload, although it needs more time to complete tasks.

    CiNii Article

    CiNii Books

    researchmap

  • Working Patterns Extractions by Applying Nonnegative Matrix Factorization to PC Operation Logs

    HIRAYAMA Akihiko, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 32 )   33 - 38   2014.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we extract working patterns by applying nonnegative matrix factorization to PC operation logs. Experiments were carried out for three occupations. In terms of daily working patterns, we successfully extracted patterns particular to each occupation as well as some working patterns common to all occupations. We also showed that the extracted patterns can be easily interpreted as ways of working.

    CiNii Article

    CiNii Books

    researchmap

  • Working Patterns Extractions by Applying Nonnegative Matrix Factorization to PC Operation Logs

    2014 ( 6 )   1 - 6   2014.5

     More details

  • A study of a "special day" retrieval method using features of staying places

    林啓吾, 原直, 阿部匡伸

    Proceedings of the 76th National Convention (IPSJ)   2014 ( 1 )   459 - 460   2014.3

     More details

    Language:Japanese  

    A life log is a record of human activities as digital data. Consider looking back on one's own activities using life-log data. For example, one can look back by reviewing past posts and photos on now-widespread Social Networking Services (SNS), but these record only what the user consciously chose to record. One kind of life-log data that can be recorded unconsciously is location data from the Global Positioning System (GPS). In this study, we proposed a method for retrieving "special days" that one wants to look back on, using features of staying places obtained from GPS data. Evaluation experiments showed that the proposed method can retrieve "special days" with high accuracy.

    CiNii Article

    CiNii Books

    researchmap

  • A study of a living-area extraction method focusing on staying places and routes

    松尾雄二, 原直, 阿部匡伸

    Proceedings of the 76th National Convention (IPSJ)   2014 ( 1 )   37 - 38   2014.3

     More details

    Language:Japanese  

    In recent years, wandering behavior by elderly people has become a social problem. Existing watching systems require the activity range to be set in advance and cannot specify that range in detail, so sufficient watching is impossible. In this study, we therefore examine a method for extracting a person's living area (activity range) from accumulated GPS data for use in a watching system. The location information (latitude and longitude) contained in the GPS data is converted into quantized codes called HEX codes using GEOHEX. These HEX codes are classified into staying places and routes, the high-frequency HEX codes of each class are extracted as living areas, and we evaluated the consistency between the living area derived from staying-place data and that derived from route data.

    CiNii Article

    CiNii Books

    researchmap

  • D-9-4 Individual behavior analysis by comparing GPS logs with others

    Seto Ryo, Abe Masanobu, Hara Sunao

    Proceedings of the IEICE General Conference   2014 ( 1 )   88 - 88   2014.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • A development of a smart-device application for environmental sound collection based on crowdsourcing approach

    Sunao Hara, Akinori Kasai, Masanobu Abe, Noboru Sonehara

    IEICE Technical Report   113 ( 479 )   29 - 34   2014.3

     More details

    Authorship:Lead author   Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this study, we aim to construct an environmental sound database covering various sounds as the wisdom of crowds. For example, when considering environmental noise pollution as a kind of environmental sound, we need to consider not only signature sounds, e.g., car noise and railway noise, but also life-space noise, e.g., festivity noise on streets or in parks. However, daily, wide-area sound collection is difficult to realize with only a few participants. We therefore aim to measure environmental sound over a vast area by applying a crowdsourcing approach. First, we developed a prototype application that runs on smart devices with Android OS, and then a prototype server system for collecting and browsing the collected sound data. We then calibrated the noise levels measured by the smart devices and carried out a sound collection experiment to validate the accuracy of the devices' sensors. As a result of the experiment, we collected nine hundred minutes of sound data and analyzed the relationships between the measured noise levels and several subjective evaluations.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of HMM-based speech synthesis using high-frequency component of speech waveform

    349 - 352   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Relationship between the size of speech database and subjective scores on phone-sized unit selection speech synthesis

    331 - 334   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Sound-map construction method based on symbolization for environmental sounds collected by crowd-sensing

    1535 - 1538   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Estimation of fuel consumption using an acoustic signal and multi-sensing signals of smartphone

    NANBA Shohei, HARA Sunao

    IEICE technical report. Signal processing   113 ( 28 )   1 - 6   2013.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Many vehicles are equipped with fuel-consumption meters; however, these meters only display the fuel-consumption or fuel-efficiency value and do not allow the data to be used for other purposes, e.g., gathering and analysis. One way to obtain vehicle data is to use a diagnostic connector compliant with the OBD2 standard, which can output several vehicle signals such as velocity, engine revolutions, and fuel consumption. However, because the protocols depend on the manufacturer or vehicle type, this method is not easy for the public to use. In this study, we aim to estimate fuel consumption using acoustic signals and several sensor signals from a smartphone. Estimating fuel consumption requires estimates of the number of engine revolutions and the torque. To estimate the number of revolutions, we analyze the acoustic signal from the engine by fast Fourier transform and calculate the estimate after reducing road noise, which is approximated by a Gamma mixture distribution. To estimate the torque, we use the physics of the car together with the outputs of several sensors and the vehicle's specifications. Finally, we obtain the fuel consumption by looking up a fuel-consumption-rate table, created in advance, with the estimated number of revolutions and the estimated torque. In a fuel-consumption estimation experiment, we achieved acceptable values of instantaneous fuel consumption, although the average fuel-consumption values contained some errors.

    CiNii Article

    CiNii Books

    researchmap

  • The Individual Feature Analysis of the Network of the Stay Extracted from GPS Data

    FUJIOKA Daisuke, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   112 ( 466 )   179 - 184   2013.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we propose a novel technique for behavior analysis focusing on the structure of a personal "stay" network obtained from GPS data. We analyzed network features, namely the scale-free, small-world, and cluster properties, for six subjects. We also analyzed the distribution of a feature called the "motif" and compared the stay networks with other networks based on these features. We evaluated the biases in hub degree and in the number of unique connected nodes, and clarified the differences between the distributions of each subject's network.

    CiNii Article

    CiNii Books

    researchmap

  • Examination of an event sampling process with the similar impression from the others' life log

    HAMANO Kazuto, ABE Masanobu, HARA Sunao, FUJIOKA Daisuke, MOTIZUKI Rika, WATANABE Tomoki

    IEICE technical report. Life intelligence and office information systems   112 ( 466 )   173 - 178   2013.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    When we tell a story of our experience, we often replace our experience with another person's experience that recalls a similar impression, to help the listener understand the story. In this paper, we study a method for extracting a shared sense of impressive events between people with different backgrounds and experiences. First, we showed several emotional words to the subjects and asked them to recall events matching the words. Using the recalled events, we compared two methods for evaluating the similarity of events: one quantifies each event by the subject's emotion on a five-point scale, and the other decides the similarity of events through discussion between two subjects. The experimental results suggest that similarity decisions made through discussion are heavily affected by events with strong emotions.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of the portability of a Bag-of-Words-based invalid-input rejection model for a speech-oriented guidance system

    真嶋温佳, TORRES Rafael, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    Proceedings of the Acoustical Society of Japan Meeting (CD-ROM)   2013   ROMBUNNO.3-9-5   2013.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • The 2nd stage activity report of ASJ students and young researchers forum

    Okamoto Takuma, Okuzono Takeshi, Kidani Shunsuke, Hara Sunao, Ohta Tatsuya, Imoto Keisuke

    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN   69 ( 9 )   519 - 520   2013

     More details

    Language:Japanese   Publisher:Acoustical Society of Japan  

    DOI: 10.20697/jasj.69.9_519

    CiNii Article

    researchmap

  • Evaluation of invalid-input rejection using the maximum entropy method for a speech information system

    真嶋温佳, TORRES Rafael, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    Proceedings of the Acoustical Society of Japan Meeting (CD-ROM)   2012   ROMBUNNO.3-1-8   2012.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Invalid Input Rejection Using Bag-of-Words for Speech-Oriented Guidance System

    Majima Haruka, Fujita Yoko, Torres Rafael, Kawanami Hiromichi, Hara Sunao, Matsui Tomoko, Saruwatari Hiroshi, Shikano Kiyohiro

    IPSJ SIG Notes   2012 ( 7 )   1 - 6   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    In a real-environment speech-oriented information guidance system, discriminating valid from invalid inputs is important, because invalid inputs such as noise, laughter, coughs, and utterances between users lead to unpredictable system responses. Generally, acoustic features are used for this discrimination; comparing the acoustic likelihoods of GMMs (Gaussian Mixture Models) trained on speech data and noise data is one typical method. In addition, using linguistic features is expected to improve discrimination accuracy, as they reflect the task domain of invalid inputs and meaningless recognition re...

    CiNii Article

    CiNii Books

    researchmap

  • New Speech Research Paradigm in the Cloud Era

    Tomoyoshi Akiba, Koji Iwano, Jun Ogata, Tetsuji Ogawa, Nobutaka Ono, Takahiro Shinozaki, Koichi Shinoda, Hiroaki Nanjo, Hiromitsu Nishizaki, Masafumi Nishida, Ryuichi Nishimura, Sunao Hara, Takaaki Hori

    IPSJ SIG Notes   2012 ( 4 )   1 - 7   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Recently, most individuals have come to use mobile information devices and daily upload the information obtained by such devices to the Internet cloud. Accordingly, the applications of speech information processing have been changing drastically. We need to create a new paradigm for the research and development of speech information processing to adapt to this change. In this paper, we summarize state-of-the-art speech technologies, propose how to create a research platform for this new paradigm, and discuss the problems we should solve to realize it.

    CiNii Article

    CiNii Books

    researchmap

  • Design of a network service for developing a speech-oriented guidance system used on mobile computers

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    IPSJ SIG Notes   2012 ( 1 )   1 - 6   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    In this paper, we propose novel speech service software for speech-oriented guidance systems. The software has been developed based on the Takemaru-kun system, which has been deployed at a community center since Nov. 2002. It consists of several modules: Automatic Speech Recognition, Dialog Management, Text-to-Speech, an Internet browser, and a Computer Graphics Agent. The software and toolkit are planned to be freely distributed. They will be used as speech service software in a Software-as-a-Service (SaaS) model for WWW site developers, and also for an upgraded version of our system for advanc...

    CiNii Article

    CiNii Books

    researchmap

  • D-9-36 DEVELOPMENT OF A SPEECH-ORIENTED GUIDANCE PLATFORM AS A SOFTWARE-AS-A-SERVICE FOR VARIOUS USAGE AND ENVIRONMENTS

    Hara Sunao, Kawanami Hiromichi, Saruwatari Hiroshi, Shikano Kiyohiro

    Proceedings of the IEICE General Conference   2012 ( 1 )   168 - 168   2012.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • Multi-band Speech Recognition using Confidence of Blind Source Separation

    ANDO Atsushi, OHASHI Hiromasa, HARA Sunao, KITAOKA Norihide, TAKEDA Kazuya

    IEICE technical report. Speech   111 ( 431 )   219 - 224   2012.2

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    One of the main applications of Blind Source Separation (BSS) is to improve the performance of Automatic Speech Recognition (ASR) systems. However, conventional BSS algorithms have been applied to speech signals only as a pre-processing step. In this paper, a closely coupled framework between an FDICA-based BSS algorithm and a speech recognition system is proposed. In the source separation step, a confidence score of the separation accuracy is first estimated for each frequency bin. Subsequently, using a multi-band speech recognition system, the acoustic likelihood is calculated on the Mel-scale filter-bank energies with the estimated BSS confidence scores. Our proposed method can thus reduce the ASR errors caused by separation errors in BSS and permutation errors in ICA that occur in the conventional approach. Experimental results showed that our proposed method improved the word correct rate of ASR by 8.2% and the word accuracy by 5.7% on average.

    CiNii Article

    CiNii Books

    researchmap

  • Development of speech-oriented guidance service software for various usage environments

    原直

    Proceedings of the IEICE General Conference, 2012   168   2012

  • Robust Acoustic Modeling Using MLLR Transformation-based Speech Feature Generation

    2010 ( 5 )   1 - 6   2011.2

     More details

  • Robust acoustic modeling using acoustic feature generation constrained by MLLR transformation matrices (Speech)

    伊藤 新, 原 直, 北岡 教英

    IEICE Technical Report   110 ( 357 )   55 - 60   2010.12

     More details

    Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    researchmap

  • Music recommendation system based on chat speech recognition

    OHASHI Hiromasa, KITAOKA Norihide, HARA Sunao, TAKEDA Kazuya

    IEICE technical report   110 ( 220 )   59 - 64   2010.10

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    We developed an ambient system that plays music suited to the mood of a human-human conversation, using words obtained from a continuous speech recognition system. Using the correspondence between a document space based on texts related to the music and an acoustic space expressing various audio features, the continuous speech recognition results are mapped to the acoustic space. Proper nouns, which are not covered by the continuous speech recognizer, are recognized by a word spotter. In this paper, we show the results of a performance evaluation of the system. For read music-review texts, the system obtained a reasonable MRR of 0.83 despite a high WER of 70.55% and an F-measure of 31.58. We also show an example result for chat conversations.

    CiNii Article

    CiNii Books

    researchmap

  • Estimation method of user satisfaction based on dialog history N-gram for spoken dialog system

    Hara Sunao, Kitaoka Norihide, Takeda Kazuya

    IEICE technical report   109 ( 355 )   77 - 82   2009.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    researchmap

  • Estimation method of user satisfaction based on dialog history N-gram for spoken dialog system

    HARA SUNAO, KITAOKA NORIHIDE, TAKEDA KAZUYA

    2009 ( 14 )   1 - 6   2009.12

     More details

  • User modeling for a satisfaction evaluation of a speech recognition system

    HARA Sunao, KITAOKA Norihide, TAKEDA Kazuya

    IEICE technical report   108 ( 338 )   61 - 66   2008.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    A mathematical model for predicting the user satisfaction of a speech dialogue system is studied based on a field trial of a voice-navigated music retrieval system. The subjective word accuracy (subjective-WA) of the user is introduced as the background psychometric for satisfaction. In the field test, subjective-WA is collected through questionnaires together with satisfaction indexes and various user profiles. First, we show that subjective-WA is more significant to user satisfaction than objective word accuracy (objective-WA), which is calculated using manually produced transcriptions of the recorded dialogue. Then, through top-down clustering of the joint distribution of subjective- and objective-WA, we show that the user population can be grouped into several sub-groups in terms of sensitivity to recognition accuracy. The lower bound of objective-WA for a given subjective-WA is also calculated from the joint distribution. Finally, a graphical model is built that predicts the user satisfaction index from user profiles and reduces the distribution uncertainty of user satisfaction by 13% of its variance.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of training effects by long-term use of a spoken dialogue interface

    HARA Sunao, MIYAJIMA Chiyomi, ITOU Katsunobu, KITAOKA Norihide, TAKEDA Kazuya

    70 ( 5 (3L-4) )   5-341 - 5-342   2008.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    CiNii Article

    CiNii Books

    researchmap

  • User modeling for a satisfaction evaluation of a speech recognition system

    HARA Sunao, KITAOKA Norihide, TAKEDA Kazuya

    IPSJ SIG Notes   2008 ( 123 )   61 - 66   2008

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    A mathematical model for predicting the user satisfaction of a speech dialogue system is studied based on a field trial of a voice-navigated music retrieval system. The subjective word accuracy (subjective-WA) of the user is introduced as the background psychometric for satisfaction. In the field test, subjective-WA is collected through questionnaires together with satisfaction indexes and various user profiles. First, we show that subjective-WA is more significant to user satisfaction than objective word accuracy (objective-WA), which is calculated using manually produced transcriptions of the recorded dialogue. Then, through top-down clustering of the joint distribution of subjective- and objective-WA, we show that the user population can be grouped into several sub-groups in terms of sensitivity to recognition accuracy. The lower bound of objective-WA for a given subjective-WA is also calculated from the joint distribution. Finally, a graphical model is built that predicts the user satisfaction index from user profiles and reduces the distribution uncertainty of user satisfaction by 13% of its variance.

    CiNii Article

    CiNii Books

    researchmap

  • Data Collection System for the Speech Utterances to an Automatic Speech Recognition System under Real Environments

    HARA Sunao, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    The IEICE transactions on information and systems   90 ( 10 )   2807 - 2816   2007.10

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • Constructing Acoustic Model for User-specific Song List in a Music Retrieval System

    HARA Sunao, MIYAJIMA Chiyomi, KITAOKA Norihide, ITOU Katsunobu, TAKEDA Kazuya

    IPSJ SIG Notes   2007 ( 75 )   87 - 90   2007.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    This paper discusses a training method for an HMM acoustic model that efficiently covers a given vocabulary, in order to apply it to the speech interface of a music retrieval system. Customizing the acoustic model to each user is important in this application because 1) song titles and artist names contain many phonetic contexts that are rare in general corpora, e.g., text-reading corpora, and 2) the songs stored in a device differ among users. In particular, finding an optimal state-tying structure for a given vocabulary is a new problem in acoustic model training. We propose a method for building a task-dependent acoustic model that uses task-related synthetic utterances of more than one hundred speakers generated by HMM-based speech synthesis. From an experimental evaluation using field test data, we confirmed that the task-dependent acoustic model trained by the proposed method reduces the word error rate by 10% compared to a task-independent model.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of a music retrieval system using spoken dialogue

    2007   47 - 50   2007

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Speech data collection and evaluation by using a spoken dialogue system on general purpose PCs

    HARA Sunao, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    IEICE technical report   106 ( 443 )   167 - 172   2006.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    We developed a user-customizable speech dialogue system and a framework for automatic speech data collection in field experiments over the Internet. Users can download and install the speech dialogue system onto their own PCs and customize the system on a remote server for their own use. The speech data recorded on their PCs are transferred to the remote server through the Internet. The system enables us to collect speech data spoken by many users in a wide variety of acoustic environments. During a two-month field test, we obtained 59 hours of recorded data including 5 hours and 41 minutes detected as speech, which corresponds to 11351 speech segments. The word correct rate for the 4716 speech utterances spoken to the dialogue system was 66.0%, which improved to 70.5% after applying unsupervised MLLR for each user.
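
    The word correct rate reported above can be reproduced from a minimum edit-distance alignment of reference transcriptions against recognizer hypotheses. A minimal Python sketch of that computation, not the authors' tooling, with hypothetical utterances:

    ```python
    # Word correct rate: Corr = (N - S - D) / N, where N is the number of
    # reference words, and S and D are the substitutions and deletions found
    # by a minimum edit-distance alignment (insertions are not penalized).

    def align_counts(ref, hyp):
        """Return (substitutions, deletions, insertions) of a minimal alignment."""
        n, m = len(ref), len(hyp)
        # dp[i][j] = (cost, subs, dels, ins) for aligning ref[:i] with hyp[:j]
        dp = [[None] * (m + 1) for _ in range(n + 1)]
        dp[0][0] = (0, 0, 0, 0)
        for i in range(1, n + 1):
            dp[i][0] = (i, 0, i, 0)      # delete all reference words
        for j in range(1, m + 1):
            dp[0][j] = (j, 0, 0, j)      # insert all hypothesis words
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                c, s, d, ins = dp[i - 1][j - 1]
                match = ref[i - 1] == hyp[j - 1]
                cand = [(c if match else c + 1,
                         s if match else s + 1, d, ins)]
                c, s, d, ins = dp[i - 1][j]
                cand.append((c + 1, s, d + 1, ins))   # deletion
                c, s, d, ins = dp[i][j - 1]
                cand.append((c + 1, s, d, ins + 1))   # insertion
                dp[i][j] = min(cand)                  # lowest cost wins
        return dp[n][m][1:]

    def word_correct_rate(ref, hyp):
        subs, dels, _ = align_counts(ref, hyp)
        return (len(ref) - subs - dels) / len(ref)
    ```

    For example, `word_correct_rate("play the song".split(), "play a song".split())` counts one substitution and returns 2/3.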

    CiNii Article

    CiNii Books

    researchmap

  • Speech data collection and evaluation by using a spoken dialogue system on general purpose PCs

    HARA Sunao, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    IPSJ SIG Notes   2006 ( 136 )   167 - 172   2006

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We developed a user-customizable speech dialogue system and a framework for automatic speech data collection in field experiments over the Internet. Users can download and install the speech dialogue system onto their own PCs and customize the system on a remote server for their own use. The speech data recorded on their PCs are transferred to the remote server through the Internet. The system enables us to collect speech data spoken by many users in a wide variety of acoustic environments. During a two-month field test, we obtained 59 hours of recorded data including 5 hours and 41 minutes detected as speech, which corresponds to 11351 speech segments. The word correct rate for the 4716 speech utterances spoken to the dialogue system was 66.0%, which improved to 70.5% after applying unsupervised MLLR for each user.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of training effects of a spoken dialogue interface by long-term use

    HARA Sunao, SHIROSE Ayako, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    2005 ( 1 )   153 - 154   2005.3

     More details

  • Evaluation of training effects by long-term use of a spoken dialogue interface

    HARA Sunao, SHIROSE Ayako, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    IPSJ SIG Notes   2005 ( 12 )   17 - 22   2005

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We are developing a music retrieval system for in-car use based on a spoken dialogue interface. The system can retrieve and play music that a user wants to listen to. We have previously conducted experiments in which each subject used the system for one hour. In those experiments, we found that speech recognition performance improves as the subjects get used to the system, although the degree of the training effect depends on the subject. In this paper, we conduct extended experiments in which each subject uses the system over five one-hour sessions. Experimental results for twelve subjects show that the system achieves about 60% relative improvement in recognition performance in the fifth session compared to the first session.

    CiNii Article

    CiNii Books

    researchmap

  • A music searching system by spoken dialogue

    HARA Sunao, SHIROSE Ayako, MIYAJIMA Chiyomi, ITOU Katsunobu, TAKEDA Kazuya

    IPSJ SIG Notes   2004 ( 103 )   31 - 36   2004.2

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Recently, various applications equipped with speech recognition have been developed; for example, it is used in car-navigation systems for hands-free operation. Music download services over the Internet have also appeared, so an easy-to-use music search interface is expected. We therefore created a spoken-dialogue music search system intended for in-car use. The system can search for and play the music that the user wants to listen to. In this paper, we describe the details of the system and spoken dialogue recording with the system. Experimental results for 150 subjects with a prototype system show that the system achieved about 80% word correct in an indoor environment and about 76% word correct in a car environment.

    CiNii Article

    CiNii Books

    researchmap

  • Preliminary study on the evaluation of a quality of spoken dialogue system in terms of user factors

    SHIROSE Ayako, HARA Sunao, FUJIMURA Hiroshi, ITO Katsunobu, TAKEDA Kazuya, ITAKURA Fumitada

    IEICE technical report. Natural language understanding and models of communication   103 ( 518 )   7 - 12   2003.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    This study aims to describe users' problems and the process of learning skills in using spoken dialogue systems, and to reveal how these affect the evaluation of system usefulness. To this end, we designed a new dialogue system, carried out a field test with a large number of subjects, and asked them to evaluate the usefulness of the system. The results showed that the evaluation of the system correlated not with recognition rate but with user satisfaction and comprehension. This suggests that spoken dialogue systems should be evaluated in terms of user factors. Controlled experiments are needed for a more detailed discussion.

    CiNii Article

    CiNii Books

    researchmap

  • Preliminary study on the evaluation of a quality of spoken dialogue system in terms of user factors

    Ayako Shirose, Sunao Hara, Hiroshi Fujimura, Katsunobu Ito, Kazuya Takeda, Fumitada Itakura

    IPSJ SIG Technical Report   2003 ( 124 (2003-SLP-049) )   253 - 258   2003.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:Information Processing Society of Japan (IPSJ)  

    This study aims to describe users' problems and the process of learning skills in using spoken dialogue systems, and to reveal how these affect the evaluation of system usefulness. To this end, we designed a new dialogue system, carried out a field test with a large number of subjects, and asked them to evaluate the usefulness of the system. The results showed that the evaluation of the system correlated not with recognition rate but with user satisfaction and comprehension. This suggests that spoken dialogue systems should be evaluated in terms of user factors. Controlled experiments are needed for a more detailed discussion.

    CiNii Article

    CiNii Books

    researchmap

  • Implementation and evaluation of Julius/Julian on the PDA environment

    Sunao Hara, Nobuo Kawaguchi, Kazuya Takeda, Fumitada Itakura

    IPSJ SIG Technical Report   2003 ( 14 (2002-SLP-045) )   131 - 136   2003.2

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:Information Processing Society of Japan (IPSJ)  

    In order to develop an open-source platform for speech recognition on Personal Digital Assistants (PDAs), the general-purpose speech recognition engine Julius/Julian is ported to the PDA environment. From the experimental evaluations, the following results are obtained. In isolated word recognition, 90% accuracy is obtained at about 1.9 times real time, which is about 73 times that of a standard PC environment. In connected digit recognition, 99% word accuracy is obtained using HMMs trained on sentences recorded with the PDA.

    CiNii Article

    CiNii Books

    researchmap

▼display all

Presentations

  • A study of dialogue strategies for supporting natural topic development in human-to-human conversation

    前薗そよぎ, 原直, 阿部匡伸

    音学シンポジウム2021 (IPSJ SIG on Spoken Language Processing)  2021.6.18  Information Processing Society of Japan

     More details

    Event date: 2021.6.18 - 2021.6.19

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • An experimental study on articulation improvement with a surface-contact artificial tongue designed for easy control of the expiratory airflow path

    長塚弘亮, 川上滋央, 古寺寛志, 佐藤匡晃, 田中祐貴, 兒玉直紀, 原直, 皆木省吾

    The 38th Annual Meeting of the Japanese Academy of Maxillofacial Prosthetics  2021.6.4 

     More details

    Event date: 2021.6.3 - 2021.6.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Evaluation of Japanese End-to-End speech synthesis using pronunciation kana and prosodic symbols estimated by neural machine translation

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    2021 Spring Meeting of the Acoustical Society of Japan  2021.3.11  Acoustical Society of Japan

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluation of Concept Drift Adaptation for Acoustic Scene Classifier Based on Kernel Density Drift Detection and Combine Merge Gaussian Mixture Model

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    2021.3.10 

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of a singing voice synthesis method using Bidirectional-LSTM capable of adding singing expressions

    金子隼人, 原直, 阿部匡伸

    2021 Spring Meeting of the Acoustical Society of Japan  2021.3.10  Acoustical Society of Japan

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Development of a glove-type input device using pressure sensors for a TTS-based conversation support system and evaluation of its input speed

    IPSJ SIG-HCI  2020.12.9 

     More details

    Event date: 2020.12.8 - 2020.12.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Module Comparison of Transformer-TTS for Speaker Adaptation based on Fine-tuning International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2020  2020.12  APSIPA

     More details

    Event date: 2020.12.7 - 2020.12.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Auckland, New Zealand)  

    researchmap

    Other Link: https://ieeexplore.ieee.org/document/9306250

  • Concept Drift Adaptation for Acoustic Scene Classifier Based on Gaussian Mixture Model International conference

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    The 2020 IEEE Region 10 Conference (IEEE-TENCON 2020)  2020.11  IEEE

     More details

    Event date: 2020.11.16 - 2020.11.19

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Osaka, Japan)  

    researchmap

  • Controlling the Strength of Emotions in Speech-like Emotional Sound Generated by WaveNet International conference

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Interspeech 2020  2020.10  ISCA

     More details

    Event date: 2020.10.25 - 2020.10.29

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Shanghai, China)  

    researchmap

  • A study of temporal features using insole-type pressure sensors for estimating the severity of Parkinson's disease

    林倖生, 原直, 阿部匡伸

    2020 (71st) Joint Conference of Electrical and Information Related Societies, Chugoku Branch  2020.10.24 

     More details

    Event date: 2020.10.24

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Estimation of pronunciation kana and prosodic symbol sequences from Japanese text using Transformer

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    2020 Autumn Meeting of the Acoustical Society of Japan  2020.9.11  Acoustical Society of Japan

     More details

    Event date: 2020.9.9 - 2020.9.11

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Online  

    researchmap

  • A study on controlling emotion strength in WaveNet-based emotional speech synthesis without linguistic information

    松本剣斗, 原直, 阿部匡伸

    2020 Autumn Meeting of the Acoustical Society of Japan  2020.9.10  Acoustical Society of Japan

     More details

    Event date: 2020.9.9 - 2020.9.11

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Online  

    researchmap

  • A study of methods for estimating engagement and positive or negative attitudes in discussions using video and audio

    Tsubasa Kanaoka, Yutaro Uehara, Sunao Hara, Masanobu Abe

    DICOMO 2020  2020.6.26 

     More details

    Event date: 2020.6.24 - 2020.6.26

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Analysis of the importance of places in daily life by clustering GPS data

    Rui Hirata, Sunao Hara, Masanobu Abe

    DICOMO 2020  2020.6.25 

     More details

    Event date: 2020.6.24 - 2020.6.26

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using End-to-End speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    2020.6.7 

     More details

    Event date: 2020.6.6 - 2020.6.7

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of a spoken dialogue system that responds according to the user's familiarity with the topic

    Daichi Kato, Sunao Hara, Masanobu Abe

    2020.6.6 

     More details

    Event date: 2020.6.6 - 2020.6.7

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Semi-supervised speaker adaptation for end-to-end speech synthesis with the pretrained models International coauthorship International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    ICASSP 2020  2020.5  IEEE

     More details

    Event date: 2020.5.4 - 2020.5.8

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Barcelona, Spain)  

    researchmap

  • A study of voice conversion using estimated phoneme posterior probabilities to improve the phoneme intelligibility of glossectomy patients

    Seiya Ogino, Sunao Hara, Masanobu Abe

    2020.3.17 

     More details

    Event date: 2020.3.17 - 2020.3.18

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of a CycleGAN-based emotion conversion method trained with emotional synthetic sounds without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2020.3.18 

     More details

    Event date: 2020.3.16 - 2020.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using End-to-End speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    2020.3.17 

     More details

    Event date: 2020.3.16 - 2020.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A Japanese sentence estimation method using neural machine translation from ambiguous wearable-device input

    Jun Watanabe, Sunao Hara, Masanobu Abe

    IPSJ SIG-HCI  2020.3.16 

     More details

    Event date: 2020.3.16 - 2020.3.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Performance evaluation of End-to-End speech synthesis using End-to-End speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Shinji Watanabe

    2019.11.30 

     More details

    Event date: 2019.11.30

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients’ Speech International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11.19  APSIPA

     More details

    Event date: 2019.11.18 - 2019.11.21

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • Speech-like Emotional Sound Generator by WaveNet International conference

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11  APSIPA

     More details

    Event date: 2019.11.18 - 2019.11.21

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • A study of speaker identity in WaveNet-based emotional speech synthesis without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2019.9.6 

     More details

    Event date: 2019.9.4 - 2019.9.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Effects of a newly designed artificial tongue and an anatomical artificial tongue, and criteria for selecting between them

    佐藤匡晃, 長塚弘亮, 川上滋央, 兒玉直紀, 原直, 阿部匡伸, 皆木省吾

    Japan Prosthodontic Society, Chugoku-Shikoku Branch Scientific Meeting  2019.9.1  Japan Prosthodontic Society, Chugoku-Shikoku Branch

     More details

    Event date: 2019.8.31 - 2019.9.1

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Fukuyama, Hiroshima, Japan  

    researchmap

  • A signal processing perspective on human gait: Decoupling walking oscillations and gestures International coauthorship International conference

    Adrien Gregorj, Zeynep Yücel, Sunao Hara, Akito Monden, Masahiro Shiomi

    The 4th International Conference on Interactive Collaborative Robotics 2019 (ICR 2019)  2019.8 

     More details

    Event date: 2019.8.20 - 2019.8.25

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Detection of special behaviors in daily life based on GPS data

    Makoto Kobayashi, Sunao Hara, Masanobu Abe

    DICOMO 2019  2019.7.4 

     More details

    Event date: 2019.7.3 - 2019.7.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Environmental sound classification using bottleneck features extracted from a CNN autoencoder

    Takumi Matsubara, Sunao Hara, Masanobu Abe

    DICOMO 2019  2019.7.3 

     More details

    Event date: 2019.7.3 - 2019.7.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of WaveNet-based emotional speech synthesis without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2019.6.23 

     More details

    Event date: 2019.6.22 - 2019.6.23

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of an i-vector-based method for estimating bustle sounds

    Zhenyang Wu, Kohei Tomoda, Sunao Hara, Masanobu Abe

    2019.6.22 

     More details

    Event date: 2019.6.22 - 2019.6.23

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of a voice conversion method using auxiliary phonemic information based on Bidirectional LSTM-RNN to improve the phoneme intelligibility of glossectomy patients

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    2019.3.4 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of methods for estimating auxiliary phonemic information to improve the phoneme intelligibility of glossectomy patients via voice conversion

    Seiya Ogino, Hiroki Murakami, Sunao Hara, Masanobu Abe

    2019.3.4 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of emotion transplantation methods using a small amount of target emotional speech in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima

    2019.3.3 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of the effect of background-listening music on workload and of methods for selecting such music

    Kaoru Takase, Masanobu Abe, Sunao Hara

    2018.11.21 

     More details

    Event date: 2018.11.21 - 2018.11.22

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of subjective noisiness estimation for constructing environmental sound maps by crowdsourcing

    Sunao Hara, Masanobu Abe

    2018.9.20 

     More details

    Event date: 2018.9.19 - 2018.9.21

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Speech Enhancement of Glossectomy Patient’s Speech using Voice Conversion Approach

    Masanobu Abe, Seiya Ogino, Hiroki Murakami, Sunao Hara

    The 56th Annual Meeting of the Biophysical Society of Japan, Symposium: Understanding and Application of Health Systems  2018.9.15  Biophysical Society of Japan

     More details

    Event date: 2018.9.15 - 2018.9.17

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Okayama University, Tsushima Campus  

    researchmap

  • A study of auxiliary information for improving the phoneme intelligibility of glossectomy patients via voice conversion

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    2018.9.12 

     More details

    Event date: 2018.9.12 - 2018.9.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Evaluation of emotion transplantation methods in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

    2018.9.12 

     More details

    Event date: 2018.9.12 - 2018.9.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Naturalness Improvement Algorithm for Reconstructed Glossectomy Patient’s Speech Using Spectral Differential Modification in Voice Conversion International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2018  2018.9.5  ISCA

     More details

    Event date: 2018.9.2 - 2018.9.6

    Language:English   Presentation type:Poster presentation  

    Venue:Hyderabad, India  

    researchmap

  • A study on improving the phoneme intelligibility of glossectomy patients by voice conversion using speech and lip shape

    Seiya Ogino, Hiroki Murakami, Sunao Hara, Masanobu Abe

    SP  2018.6.28 

     More details

    Event date: 2018.6.28 - 2018.6.29

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction of a multimodal database for improving the phoneme intelligibility of glossectomy patients

    Hiroki Murakami, Seiya Ogino, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Field-experiment evaluation of a crowdsourcing-based bustle sound classification method

    Kohei Tomoda, Sunao Hara, Masanobu Abe

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of duration models for emotion transplantation in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Sound sensing using smartphones as a crowdsourcing approach International conference

    Sunao Hara, Asako Hatakeyama, Shota Kobayashi, Masanobu Abe

    APSIPA Annual Summit and Conference 2017  2017.12.15  APSIPA

     More details

    Event date: 2017.12.12 - 2017.12.15

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kuala Lumpur, Malaysia  

    researchmap

  • An investigation to transplant emotional expressions in DNN-based TTS synthesis International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    APSIPA Annual Summit and Conference 2017  2017.12.14  APSIPA

     More details

    Event date: 2017.12.12 - 2017.12.15

    Language:English   Presentation type:Poster presentation  

    Venue:Kuala Lumpur, Malaysia  

    researchmap

  • New monitoring scheme for persons with dementia through monitoring-area adaptation according to stage of disease International conference

    Shigeki Kamada, Yuji Matsuo, Sunao Hara, Masanobu Abe

    ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks (LocalRec 2017)  ACM

     More details

    Event date: 2017.11.7 - 2017.11.10

    Language:English   Presentation type:Poster presentation  

    Venue:Redondo Beach, CA, USA  

    researchmap

  • A study on improving the phoneme intelligibility of glossectomy patients by voice conversion with DNN-based spectral differential modification

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    2017.9.26 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Creating noise maps based on a DNN noisiness estimation method that takes human perception into account

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017.9.26 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of model structures for handling speaker and emotion information in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

    2017.9.25 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Prediction of subjective assessments for a noise map using deep neural networks International conference

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UniComp 2017)  2017.9.13  ACM

     More details

    Event date: 2017.9.11 - 2017.9.15

    Language:English   Presentation type:Poster presentation  

    Venue:Maui, Hawaii, USA  

    researchmap

  • Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion International conference

    Kei Tanaka, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2017  2017.8.23  ISCA

     More details

    Event date: 2017.8.20 - 2017.8.24

    Language:English   Presentation type:Poster presentation  

    Venue:Stockholm, Sweden  

    researchmap

  • A study of visualization methods that provide effective incentives for environmental sound collection

    Asako Hatakeyama, Sunao Hara, Masanobu Abe

    DICOMO 2017  2017.6.28 

     More details

    Event date: 2017.6.28 - 2017.6.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of model structures for emotion transplantation in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

    SP  2017.6.22 

     More details

    Event date: 2017.6.22 - 2017.6.23

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A monitoring system based on living areas of two granularities

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    2017.3.22 

     More details

    Event date: 2017.3.22 - 2017.3.25

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • CNN-based environmental sound classification using a database of environmental sounds recorded with smartphones

    Shunji Toba, Sunao Hara, Masanobu Abe

    2017.3.16 

     More details

    Event date: 2017.3.15 - 2017.3.17

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A DNN-based noisiness estimation method that takes human perception into account for creating noise maps

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017.3.16 

     More details

    Event date: 2017.3.15 - 2017.3.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Enhancing a Glossectomy Patient’s Speech via GMM-based Voice Conversion International conference

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    APSIPA Annual Summit and Conference 2016  2016.12.13  APSIPA

     More details

    Event date: 2016.12.13 - 2016.12.16

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Jeju, Korea  

    researchmap

  • A classification method for crowded situation using environmental sounds based on Gaussian mixture model-universal background model International conference

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    ASA/ASJ 5th Joint Meeting  2016.11.29  Acoustical Society of America / Acoustical Society of Japan

     More details

    Event date: 2016.11.28 - 2016.12.2

    Language:English   Presentation type:Poster presentation  

    Venue:Honolulu, Hawaii  

    researchmap

  • A study of spectral conversion methods considering fundamental frequency transformation

    Kengo Toko, Sunao Hara, Masanobu Abe

    2016.11.19 

     More details

    Event date: 2016.11.19 - 2016.11.20

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of methods for removing tap sounds from environmental sounds recorded with smartphones

    Kohei Tomoda, Sunao Hara, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A basic study of features for environmental sound detection in an environmental sound database containing overlapping sounds

    Sunao Hara, Tomoyasu Tanaka, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study on improving the phoneme intelligibility of glossectomy patients using GMM-based voice conversion

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Multiple acoustic event detection from real-environment data using RNNs

    Shunji Toba, Sunao Hara, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • LiBS: Lifelog browsing system to support sharing of memories International conference

    Atsuya Namba, Sunao Hara, Masanobu Abe

    2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UniComp 2016)  2016.9.13  ACM

     More details

    Event date: 2016.9.12 - 2016.9.16

    Language:English   Presentation type:Poster presentation  

    Venue:Heidelberg, Germany  

    researchmap

  • Safety vs. Privacy: User Preferences from the Monitored and Monitoring Sides of a Monitoring System International conference

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UniComp 2016)  2016.9.13  ACM

     More details

    Event date: 2016.9.12 - 2016.9.16

    Language:English   Presentation type:Poster presentation  

    Venue:Heidelberg, Germany  

    researchmap

  • A crowdsourcing-based environmental sound collection system for constructing noise maps based on subjective evaluation

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    SP  2016.8.25 

     More details

    Event date: 2016.8.24 - 2016.8.25

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Sound collection systems using a crowdsourcing approach to construct sound map based on subjective evaluation International conference

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    IEEE ICME Workshop on Multimedia Mobile Cloud for Smart City Applications (MMCloudCity-2016)  2016.7.15  IEEE

     More details

    Event date: 2016.7.11 - 2016.7.15

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Seattle, WA, USA  

    researchmap

  • A study of indexes that objectively express the subjective acceptability of GPS data anonymization levels

    Yusuke Mitou, Sunao Hara, Masanobu Abe

    DICOMO 2016  2016.7.7 

     More details

    Event date: 2016.7.6 - 2016.7.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A noisiness estimation method that takes human perception into account for creating noise maps

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    DICOMO 2016  2016.7.6 

     More details

    Event date: 2016.7.6 - 2016.7.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A measure for transfer tendency between staying places

    Takashi Ofuji, Sunao Hara, Masanobu Abe

    LOIS  2016.5.13 

     More details

    Event date: 2016.5.12 - 2016.5.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction of an environmental sound database for estimating liveliness

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    2016 Spring Meeting of ASJ  2016.3.11 

     More details

    Event date: 2016.3.9 - 2016.3.11

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A watching method to protect users' privacy using living area

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    LOIS  2016.1.21 

     More details

    Event date: 2016.1.21 - 2016.1.22

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A Spoken Dialog System with Redundant Response to Prevent User Misunderstanding International conference

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2015  2015.12.19  APSIPA

     More details

    Event date: 2015.12.16 - 2015.12.19

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Hong Kong  

    researchmap

  • A study on visualization of environmental sounds using environmental sound detectors

    田中智康,原直,阿部匡伸

    The 17th IEEE Hiroshima Section Student Symposium (HISS 17th)  IEEE Hiroshima Section

     More details

    Event date: 2015.11.21 - 2015.11.22

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Okayama University  

    researchmap

  • Dialogue strategies for a listener-style spoken dialogue system that lets users talk comfortably

    齊藤椋太,原直,阿部匡伸

    The 17th IEEE Hiroshima Section Student Symposium (HISS 17th)  IEEE Hiroshima Section

     More details

    Event date: 2015.11.21 - 2015.11.22

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Okayama University  

    researchmap

  • A preliminary recording experiment using a crowdsourced environmental sound collection system

    原直,阿部匡伸,曽根原登

    2015 Autumn Meeting of ASJ  2015.9.18  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:University of Aizu  

    researchmap

  • A study of spoken dialogue systems robust to user misunderstanding using redundant system responses

    山岡将綺,原直,阿部匡伸

    2015 Autumn Meeting of ASJ  2015.9.18  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:University of Aizu  

    researchmap

  • A study of a method for reviewing daily-life recordings using music

    鳥羽隼司,原直,阿部匡伸

    2015 Autumn Meeting of ASJ  2015.9.17  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:University of Aizu  

    researchmap

  • A Sub-Band Text-to-Speech by Combining Sample-Based Spectrum with Statistically Generated Spectrum International conference

    Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki and Hideyuki Mizuno

    Interspeech 2015  ISCA

     More details

    Event date: 2015.9.6 - 2015.9.10

    Language:English   Presentation type:Poster presentation  

    Venue:Dresden, Germany  

    researchmap

  • Algorithm to estimate a living area based on connectivity of places with home International conference

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    HCI International 2015 

     More details

    Event date: 2015.8.2 - 2015.8.7

    Language:English   Presentation type:Poster presentation  

    Venue:Los Angeles, CA, USA  

    researchmap

  • Extraction of key segments from day-long sound data International conference

    Akinori Kasai, Sunao Hara, Masanobu Abe

    HCI International 2015 

     More details

    Event date: 2015.8.2 - 2015.8.7

    Language:English   Presentation type:Poster presentation  

    Venue:Los Angeles, CA, USA  

    researchmap

  • LiBS: A lifelog browsing scheme enabling discovery and awareness

    難波敦也, 原直,阿部匡伸

    Multimedia, Distributed, Cooperative, and Mobile Symposium (DICOMO 2015)  2015.7.10 

     More details

    Event date: 2015.7.8 - 2015.7.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Appi Kogen, Iwate, Japan  

    researchmap

  • A method for extracting sounds worth keeping as lifelogs from long-term recordings

    笠井昭範,原直,阿部匡伸

    Multimedia, Distributed, Cooperative, and Mobile Symposium (DICOMO 2015)  2015.7.10 

     More details

    Event date: 2015.7.8 - 2015.7.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Appi Kogen, Iwate, Japan  

    researchmap

  • Utilizing automatically collected lifelogs for efficient video summarization in review support

    大西杏菜,原直,阿部匡伸

    IEICE Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.5.14 - 2015.5.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tsuda College, Kodaira Campus  

    researchmap

  • Sound collection and visualization system enabled participatory and opportunistic sensing approaches International conference

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    2nd International Workshop on Crowd Assisted Sensing, Pervasive Systems and Communications (CASPer 2015)  IEEE

     More details

    Event date: 2015.3.27

    Language:English   Presentation type:Oral presentation (general)  

    Venue:St. Louis, Missouri, USA  

    researchmap

  • Similarity comparison of mixed voice with chest voice and falsetto

    家村朋典,原直,阿部匡伸

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • Environmental sound recording for sound map construction based on listeners' subjective evaluations

    原直,阿部匡伸,曽根原登

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • A study on improving the quality of HMM-synthesized speech by introducing sample-based and HMM-generated spectra into the high-frequency band

    稻井禎,原直,阿部匡伸,井島勇祐,宮崎昇,水野秀之

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • Human behavior analysis by pattern classification of travel routes

    瀬藤諒,原直,阿部匡伸

    IEICE Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology Graduate University  

    researchmap

  • A "special day" retrieval method using features of stay locations

    林啓吾,原直,阿部匡伸

    IEICE Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology Graduate University  

    researchmap

  • A living-area extraction method focusing on stay locations and routes

    松尾雄二,原直,阿部匡伸

    IEICE Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology Graduate University  

    researchmap

  • Analysis of regional liveliness using crowd-sensed data: revitalizing the local economy International conference

    Sunao Hara

    ISSI2014 

     More details

    Event date: 2015.2.16 - 2015.2.17

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Extracting Daily Patterns of Human Activity Using Non-Negative Matrix Factorization International conference

    Masanobu Abe, Akihiko Hirayama, Sunao Hara

    IEEE International Conference on Consumer Electronics  IEEE

     More details

    Event date: 2015.1.9 - 2015.1.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Las Vegas, USA  

    researchmap

  • Effectiveness of listeners' judgment criteria for discriminating interest in utterances

    Ryota Saito, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.12.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Acoustic feature analysis of "distorted voice" and "mixed voice" in rock singing

    Tomonori Iemura, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.12.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A Hybrid Text-to-Speech Based on Sub-Band Approach International conference

    Takuma Inoue, Sunao Hara, Masanobu Abe

    Asia-Pacific Signal and Information Processing Association 2014 Annual Summit and Conference (APSIPA ASC 2014)  Asia-Pacific Signal and Information Processing Association (APSIPA)

     More details

    Event date: 2014.12.9 - 2014.12.12

    Language:English   Presentation type:Poster presentation  

    Venue:Cambodia  

    researchmap

  • Analysis of the relationship between database size and subjective evaluation in phoneme-waveform-selection speech synthesis

    Tadashi Inai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.9.3 - 2014.9.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A sound map construction method using symbolic representations of environmental sounds collected by crowd-sensing

    Sunao Hara, Masanobu Abe, Noboru Sonehara

     More details

    Event date: 2014.9.3 - 2014.9.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • FLAG: A location-based lifelog aggregation system

    Akinori Kasai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.6.28 - 2014.6.29

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of behavior analysis based on node degree in network structures built from GPS data

    Daisuke Fujioka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.29 - 2014.5.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Development of a crowdsourced environmental sound collection system using smart devices

    Sunao Hara, Akinori Kasai, Masanobu Abe, Noboru Sonehara

     More details

    Event date: 2014.5.24 - 2014.5.25

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of dialogue strategies considering user workload in in-car spoken dialogue systems

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.22 - 2014.5.23

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Extraction of work patterns from PC operation logs using non-negative matrix factorization

    Akihiko Hirayama, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.15 - 2014.5.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Environmental sound collection by crowdsourcing

    原直

    The 15th Okayama Information and Communication Technology Workshop  Okayama ICT Workshop

     More details

    Event date: 2014.4.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • New Approach to Emotional Information Exchange: Experience Metaphor Based on Life Logs International conference

    Masanobu Abe, Daisuke Fujioka, Kazuto Hamano, Sunao Hara, Rika Mochizuki, Tomoki Watanabe

    The 12th IEEE International Conference on Pervasive Computing and Communications (PerCom 2014)  IEEE

     More details

    Event date: 2014.3.24 - 2014.3.28

    Language:English   Presentation type:Poster presentation  

    Venue:Budapest, Hungary  

    researchmap

  • Analysis of individual behavioral characteristics by comparing activity logs with others

    Ryo Seto, Masanobu Abe, Sunao Hara

     More details

    Event date: 2014.3.18 - 2014.3.21

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of a living-area extraction method focusing on stay locations and routes

    松尾雄二,原直,阿部匡伸

    The 76th IPSJ National Convention  2014 

     More details

    Event date: 2014.3.11 - 2014.3.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo Denki University  

    researchmap

  • A study of a "special day" retrieval method using features of stay locations

    林啓吾,原直,阿部匡伸

    The 76th IPSJ National Convention  2014 

     More details

    Event date: 2014.3.11 - 2014.3.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo Denki University  

    researchmap

  • Evaluation of an HMM-based speech synthesis method using high-frequency bands of speech waveforms

    井上拓真,原直,阿部匡伸,井島勇祐,水野秀之

    2014 Spring Meeting of ASJ  2014 

     More details

    Event date: 2014.3.10 - 2014.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Nihon University  

    researchmap

  • Development of a smart-device application for crowdsourced environmental sound collection

    原直,笠井昭範,阿部匡伸,曽根原登

    IEICE Technical Committee on LOIS  2014 

     More details

    Event date: 2014.3.7 - 2014.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Research on mobile spoken dialogue systems utilizing geographic information

    原直

    IPSJ SIG Spoken Language Processing (The 100th SIG-SLP Symposium) 

     More details

    Event date: 2014.1.31 - 2014.2.1

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Analysis of work styles based on software usage patterns extracted from PC operation logs

    平山明彦,原直,阿部匡伸

    The 15th IEEE Hiroshima Section Student Symposium (HISS 15th)  2013 

     More details

    Event date: 2013.11

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Tottori University  

    researchmap

  • A behavior-review support method focusing on stay locations in GPS data

    林啓吾,原直,阿部匡伸

    The 64th Joint Convention of Electrical and Information Engineering Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study of waypoint detection for behavior analysis using location data

    瀬藤諒,原直,阿部匡伸

    The 64th Joint Convention of Electrical and Information Engineering Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study on data volume for dynamic generation of living areas using GPS data

    松尾雄二,原直,阿部匡伸

    The 64th Joint Convention of Electrical and Information Engineering Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study on the relationship between the number of mixture components and spectral conversion accuracy in GMM-based voice conversion

    遠藤一輝,原直,阿部匡伸

    The 64th Joint Convention of Electrical and Information Engineering Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study on the effects of spectral envelope and fundamental frequency conversion on speaker individuality

    Keisuke Kawai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.9.25 - 2013.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Quality improvement of HMM-based speech synthesis using high-frequency bands of speech waveforms

    Takuma Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima, Hideyuki Mizuno

     More details

    Event date: 2013.9.25 - 2013.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Fuel-efficiency estimation for cars using acoustic and multi-sensor signals on a smartphone

    Shohei Nanba, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.5.16 - 2013.5.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluation of the portability of a Bag-of-Words-based invalid-input rejection model for speech-oriented guidance systems

    真嶋温佳, トーレスラファエル, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    2013 Spring Meeting of ASJ  Acoustical Society of Japan

     More details

    Event date: 2013.3.13 - 2013.3.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo University of Technology  

    researchmap

  • Evaluation of a hybrid method combining HMM-based and waveform-based speech synthesis

    Takuma Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima, Hideyuki Mizuno

     More details

    Event date: 2013.3.13 - 2013.3.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • The Individual Feature Analysis of the Network of the Stay Extracted from GPS Data

    Daisuke Fujioka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.3.7 - 2013.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Examination of an event sampling process with the similar impression from the others' life log

    Kazuto Hamano, Sunao Hara, Masanobu Abe, Daisuke Fujioka, Rika Mochizuki, Tomoki Watanabe

     More details

    Event date: 2013.3.7 - 2013.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluation of the portability of a Bag-of-Words-based invalid-input rejection model for speech-oriented guidance systems

    真嶋温佳, トーレスラファエル, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    The 15th Young Researchers' Presentation Meeting of the ASJ Kansai Section  ASJ Kansai Section

     More details

    Event date: 2012.12.9

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Development of a toolkit handling multiple speech-oriented guidance agents for mobile applications International conference

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    The 4th International Workshop on Spoken Dialog Systems (IWSDS2012) 

     More details

    Event date: 2012.11.28 - 2012.11.30

    Language:English   Presentation type:Poster presentation  

    Venue:Paris, France  

    researchmap

  • Evaluation of invalid input discrimination using BOW for speech-oriented guidance system International conference

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    The 4th International Workshop on Spoken Dialog Systems (IWSDS2012) 

     More details

    Event date: 2012.11.28 - 2012.11.30

    Language:English   Presentation type:Poster presentation  

    Venue:Paris, France  

    researchmap

  • Evaluation of invalid-input rejection using the maximum entropy method in a speech-oriented guidance system

    真嶋温佳, トーレスラファエル, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    2012 Autumn Meeting of ASJ  Acoustical Society of Japan

     More details

    Event date: 2012.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of network services for developing a speech-oriented guidance system for mobile devices

    原直,川波弘道,猿渡洋,鹿野清宏

    IPSJ SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A new paradigm for speech research in the cloud era

    秋葉友良,岩野公司,緒方淳,小川哲司,小野順貴,篠崎隆宏,篠田浩一,南條浩輝,西崎博光,西田昌史,西村竜一,原直,堀貴明

    IPSJ SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Symposium, workshop panel (nominated)  

    researchmap

  • Rejection of invalid inputs using Bag-of-Words features in a speech-oriented guidance system

    真嶋温佳,トーレス・ラファエル,川波弘道,原直,松井知子,猿渡洋,鹿野清宏

    IPSJ SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Causal analysis of task completion errors in spoken music retrieval interactions International conference

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    LREC 2012  2012  ELDA

     More details

    Event date: 2012.5

    Language:English   Presentation type:Poster presentation  

    Venue:Istanbul, Turkey  

    researchmap

  • Multi-band speech recognition using band-dependent confidence measures of blind source separation International conference

    Atsushi Ando, Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The Acoustics 2012  2012 

     More details

    Event date: 2012.5

    Language:English   Presentation type:Poster presentation  

    Venue:Hong Kong  

    researchmap

  • An investigation of microphone input for a mobile speech-oriented guidance system

    中清行,原直,川波弘道,猿渡洋,鹿野清宏

    IEICE General Conference, Information and Systems Society Special Program, Student Poster Session  2012  IEICE

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Multi-band speech recognition using per-band source-separation confidence measures

    安藤厚志,大橋宏正,原直,北岡教英,武田一哉

    2012 Spring Meeting of ASJ  2012  Acoustical Society of Japan

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Development of speech-oriented guidance service software for diverse usage environments

    原直,川波弘道,猿渡洋,鹿野清宏

    IEICE General Conference  2012  IEICE

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Multi-band speech recognition using confidence measures of blind source separation

    安藤厚志,大橋宏正,原直,北岡教英,武田一哉

    IEICE Technical Committee on Speech (SP)  2012  IEICE

     More details

    Event date: 2012.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation International conference

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    ASRU 2011  2011  IEEE

     More details

    Event date: 2011.12

    Language:English   Presentation type:Poster presentation  

    Venue:Hawaii  

    researchmap

  • Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations International conference

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    APSIPA-ASC 2011  2011  APSIPA

     More details

    Event date: 2011.10

    Language:English   Presentation type:Poster presentation  

    Venue:Xi'an, China  

    researchmap

  • On-line detection of task incompletion for spoken dialog systems using utterance and behavior tag N-gram vectors International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    IWSDS 2011  2011 

     More details

    Event date: 2011.9

    Language:English   Presentation type:Poster presentation  

    Venue:Granada, Spain  

    researchmap

  • Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    Interspeech 2011  2011  ISCA

     More details

    Event date: 2011.8

    Language:English   Presentation type:Poster presentation  

    Venue:Florence, Italy  

    researchmap

  • Music recommendation system based on human-to-human conversation recognition International conference

    Hiromasa OHASHI, Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    HCIAmI'11  2011 

     More details

    Event date: 2011.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Nottingham, U.K.  

    researchmap

  • A music association playback system based on recognition of casual conversation

    大橋宏正,原直,北岡教英,武田一哉

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Acoustic model training by generating acoustic features based on MLLR transformation matrices

    伊藤新,原直,北岡教英,武田一哉

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Detection and analysis of task-incomplete dialogues using utterance-and-behavior tag N-grams in spoken dialogue systems

    原直,北岡教英,武田一哉

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Robust acoustic models via acoustic feature generation constrained by MLLR transformation matrices

    伊藤新,北岡教英,原直,武田一哉

    Spoken Language Symposium  2010  IPSJ

     More details

    Event date: 2010.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A music suggestion system based on continuous recognition of casual conversation

    大橋宏正,北岡教英,原直,武田一哉

    IEICE Technical Committee on Speech (SP)  2010  IEICE

     More details

    Event date: 2010.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    INTERSPEECH2010  2010.9 

     More details

    Event date: 2010.9

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Online detection of task-incomplete dialogues using utterance-sequence N-grams in spoken dialogue systems

    原直,北岡教英,武田一哉

    2010 Autumn Meeting of ASJ  2010  Acoustical Society of Japan

     More details

    Event date: 2010.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Rapid acoustic model adaptation using inverse MLLR-based feature generation International conference

    Arata ITO, Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    The 20th International Congress on Acoustics (ICA2010)  2010.8 

     More details

    Event date: 2010.8

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Estimation method of user satisfaction using N-gram-based dialog history model for spoken dialog system International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    7th conference on International Language Resources and Evaluation (LREC'10)  2010.5 

     More details

    Event date: 2010.5

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Rapid model adaptation based on speech features generated by MLLR transformation matrices

    伊藤新,原直,北岡教英,武田一哉

    2010 Spring Meeting of ASJ  2010  Acoustical Society of Japan

     More details

    Event date: 2010.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Associating document features with acoustic features for music association playback

    高橋量衛, 大石康智, 原直, 北岡教英, 武田一哉

    The 4th Spoken Document Processing Workshop  2010 

     More details

    Event date: 2010.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A user satisfaction estimation method using dialogue-history N-grams for spoken dialogue systems

    原直,北岡教英,武田一哉

    Spoken Language Symposium  2009  IPSJ

     More details

    Event date: 2009.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Speech recognition by optimal selection from multiple acoustic models

    伊藤新,原直,宮島千代美,北岡教英,武田一哉

    FY2009 Tokai-Section Joint Convention of Electrical Engineering Societies  2009 

     More details

    Event date: 2009.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study on relating subjective and acoustic similarity between musical pieces

    平賀悠介,大石康智,原直,武田一哉

    2009 Autumn Meeting of ASJ  2009  Acoustical Society of Japan

     More details

    Event date: 2009.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction and evaluation of a network model for inferring user satisfaction in spoken dialogue systems

    原直,北岡教英,武田一哉

    2009 Spring Meeting of ASJ  2009  Acoustical Society of Japan

     More details

    Event date: 2009.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • User models in satisfaction evaluation of speech recognition systems

    原直,北岡教英,武田一哉

    Spoken Language Symposium  2008  IPSJ

     More details

    Event date: 2008.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Data collection and usability study of a PC-based speech application in various user environments International conference

    Sunao Hara, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Oriental-COCOSDA 2008  2008.11 

     More details

    Event date: 2008.11

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • In-car speech data collection along with various multimodal signals International conference

    Akira Ozaki, Sunao Hara, Takashi Kusakawa, Chiyomi Miyajima, Takanori Nishino, Norihide Kitaoka, Katunobu Itou, Kazuya Takeda

    The 6th International Language Resources and Evaluation (LREC08)  2008.5 

     More details

    Event date: 2008.5

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients' Speech International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11  APSIPA

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • Construction of a multimodal database for improving the phonemic intelligibility of glossectomy patients' speech

    村上博紀, 荻野聖也, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    2018 Spring Meeting of ASJ  2018.3.26  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • Field evaluation of a crowdsourcing-based liveliness sound classification method

    朝田興平, 原直, 阿部匡伸

    2018 Spring Meeting of ASJ  2018.3.25  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • A study of duration models for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    2018 Spring Meeting of ASJ  2018.3.25  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • An online customizable music retrieval system with a spoken dialogue interface International conference

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    4th Joint Meeting of ASA/ASJ  2006.11 

     More details

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Preliminary Study of a Learning Effect on Users to Develop a New Evaluation of the Spoken Dialogue System International conference

    Sunao Hara, Ayako Shirose, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Oriental-COCOSDA 2005  2005.12 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap


Works

  • ChartEx

    Sunao Hara

    2017.5

     More details

    Work type:Software   Location:GitHub  

    An Excel add-in for exporting charts as image files such as PNG, JPEG, and PDF.

    researchmap

  • オトログマッパー

    原 直

    2014
    -
    2016

     More details

    Work type:Software   Location:Google Play  

    An Android application developed for research purposes.

    researchmap

  • TTX KanjiMenu Plugin

    Sunao Hara

    2007.3

     More details

    Work type:Software  

    researchmap

  • Pocket Julius

    原直

    2003.1

     More details

    Work type:Software  

    This package is a demo of Pocket Julius, a port of the large-vocabulary speech recognition decoder Julius to the Microsoft Pocket PC 2002 environment.

    researchmap

Awards

  • Contribution Award for Society Activities

    2023.3   Acoustical Society of Japan  

     More details

  • Education Contribution Award

    2022.3   Faculty of Engineering, Okayama University   Building an audio streaming environment for laboratory and exercise courses

    原 直, 右田 剛史

     More details

  • Education Contribution Award

    2022.3   Faculty of Engineering, Okayama University   Contributions to enhancing the educational computer system

    乃村 能成, 上野 史, 原 直, 渡邊 誠也

     More details

  • Social Contribution Award

    2021.3   Faculty of Engineering, Okayama University  

     More details

  • Best Teacher Award

    2020.3   Faculty of Engineering, Okayama University  

     More details

  • Education Contribution Award

    2019.3   Faculty of Engineering, Okayama University  

    原直

     More details

  • Contribution Award for Society Activities

    2019.3   Acoustical Society of Japan  

    原直

     More details

  • FIT Encouragement Award

    2018.9   The 17th Forum on Information Technology (FIT 2018)  

    原直

     More details

  • Outstanding Paper Award

    2016.8   IPSJ DICOMO 2016  

    小林将大, 原直, 阿部匡伸

     More details

  • Education Contribution Award

    2016.3   Faculty of Engineering, Okayama University  

    原直

     More details

  • Science and Technology Award (FY2013)

    2013.7   Okayama Foundation for the Promotion of Engineering (岡山工学振興会)  

    原直

     More details

  • Poster Award, 2004 Autumn Meeting

    2004.9   Acoustical Society of Japan  

    原直

     More details


Research Projects

  • Research on a machine learning method for estimating atmospheres of tourist attractions from environmental sounds considering concept drift

    Grant number:23K11335  2023.04 - 2027.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    原 直

      More details

    Grant amount: ¥4,680,000 (Direct expense: ¥3,600,000; Indirect expense: ¥1,080,000)

    researchmap

  • Technical research on online active learning supported by collaborative live recording

    Grant number:21K12155  2021.04 - 2024.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    西村 竜一, 原 直

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,030,000 (Direct expense: ¥3,100,000; Indirect expense: ¥930,000)

    This project develops the component technologies needed to conduct active learning online. In particular, assuming group work is carried out online, we develop technologies that support communication between students, between students and instructors, and between instructors.
    To realize a flexible interface adaptable to users, we studied speaker discrimination methods. In particular, we applied deep learning to a young-speaker discrimination task and compared different classification models, using a dataset of real-environment online utterances collected through crowdsourcing.
    We studied a method for visualizing fast speech, a frequent problem in online communication. Using automatic speech recognition, we attempted to measure the number of characters uttered per unit time (speech rate), but found that the detected fast-speech segments did not always coincide with the moments listeners perceived as fast. In experiments combining multiple ASR engines, the ASR output tended to contain fewer characters than accurate manual transcripts, and we examined using this reduction as a factor for visualizing fast speech.
    We studied a method for evaluating discussions using combined audio and video features. Combining sound and images tended to improve classification accuracy, and even when no acoustic signal was available, judgments could sometimes be made from people's movements in the images. We will further investigate how to identify appropriate features from diverse information sources.
    We evaluated a speaker anonymization method using generative adversarial networks, examining the naturalness, speaker identifiability, and speaker distinguishability of the anonymized speech. Naturalness scores improved over the conventional method, and identifying the original speaker from processed speech proved difficult, while speaker-discrimination accuracy showed that different speakers remained distinguishable from one another after processing.

    researchmap

  • Research on DNN-based speech synthesis capable of high-quality expression of emotion and speaker individuality

    Grant number:21K11963  2021.04 - 2024.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    阿部 匡伸, 原 直

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,160,000 (Direct expense: ¥3,200,000; Indirect expense: ¥960,000)

    Work conducted in FY2021 on the topics described in the research proposal is as follows.
    (Topic 1) Expression models for non-linguistic information. For "1-1: study of emotional expression models," we improved the model structure by adding a speaker ID as auxiliary information to control speaker identity, and by allowing emotion strength to be controlled at synthesis time through the weight of the one-hot emotion-ID vector. For "1-2: study of emotion-strength expression," we evaluated controllability of emotion strength with a MOS test. The experiments showed that manipulating the emotion ID could control the strength of "Happy," whereas the strength of "Angry" could not be controlled as well. Analysis revealed that the "Angry" data used in the experiment had acoustic parameters similar to "Normal," making fine-grained control difficult. For "1-3: application to speaker diversity," we evaluated the speaker identity of synthesized speech with an ABX test, presenting either natural or synthesized speech as X and asking whether X was closer to speaker A or speaker B. For natural speech, accuracy was about 95% for "Happy" and "Normal" and about 85% for "Angry," suggesting smaller speaker differences for "Angry." For synthesized speech, accuracy dropped to about 70% for all emotions, but speaker identification remained possible. "Happy" showed high speaker identification rates, and for "Angry" some speakers were identified with high accuracy. Speaker identity involves both differences in voice quality and differences in emotional expression; further experiments are needed to determine which factor is more important.

    researchmap

  • Research on deep learning methods based on lightweight annotation for visualizing the atmosphere of tourist attractions

    Grant number:20K12079  2020.04 - 2023.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    原 直

      More details

    Authorship:Principal investigator 

    Grant amount: ¥4,290,000 (Direct expense: ¥3,300,000; Indirect expense: ¥990,000)

    For Task 1, detailed annotations were added by a single annotator to the roughly 800 recordings collected so far, following the items examined in Task 3. While listening to the environmental sounds, the annotator was also shown Street View images, so that the impression and atmosphere of a place could be annotated without relying on sound alone.
    For Task 2, regional characteristics were classified with a simple DNN using the data obtained in Task 1. Feeding sound-source information into the classifier improves the accuracy of estimating regional characteristics. We showed that sound-source information estimated from the acoustic signal and aerial photographs yields accuracy comparable to manually annotated information, suggesting that information comparable to detailed annotation can be obtained by supplementing simple annotation with auxiliary information. We also advanced research on an adaptation method based on concept drift.
    For Task 3, continuing from the previous year, we proceeded on the basis of the soundscape concept of ISO 12913. Eight evaluation axes were adopted as annotations expressing regional characteristics; however, considering the variability of human ratings, we adopted a representation that expresses the eight axes more concisely with two axes, and studied the estimation method of Task 2 accordingly.
    On the concept-drift work of Task 2, we published one international conference paper and one journal paper; on the other topics of the tasks, we gave two domestic conference presentations.
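The reduction from eight evaluation axes to two mentioned under Task 3 is in the spirit of the soundscape circumplex of ISO/TS 12913-3, which projects eight perceptual attributes onto Pleasantness and Eventfulness axes. The sketch below shows that standard projection for ratings on a 1-5 scale; it is an assumption that the project used this exact formula.

```python
import math

C = math.cos(math.radians(45))  # weight of the diagonal attributes
NORM = 4 + math.sqrt(32)        # scales the result to [-1, 1] for 1..5 ratings

def iso_pleasantness(r: dict) -> float:
    """Project eight attribute ratings onto the Pleasantness axis."""
    raw = ((r["pleasant"] - r["annoying"])
           + C * (r["calm"] - r["chaotic"])
           + C * (r["vibrant"] - r["monotonous"]))
    return raw / NORM

def iso_eventfulness(r: dict) -> float:
    """Project eight attribute ratings onto the Eventfulness axis."""
    raw = ((r["eventful"] - r["uneventful"])
           + C * (r["chaotic"] - r["calm"])
           + C * (r["vibrant"] - r["monotonous"]))
    return raw / NORM

ratings = {"pleasant": 5, "annoying": 1, "calm": 5, "chaotic": 1,
           "vibrant": 5, "monotonous": 1, "eventful": 5, "uneventful": 1}
p, e = iso_pleasantness(ratings), iso_eventfulness(ratings)
```

Using two derived axes instead of eight raw ratings reduces the effect of rater variability, which matches the motivation stated above.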

  • Development of PBL instruction support system to measure learners' activities using acoustic signals

    Grant number:18K02862  2018.04 - 2022.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    NISIMURA Ryuichi

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,420,000 (Direct expense: ¥3,400,000, Indirect expense: ¥1,020,000)

    In this study, we developed technology to realize an instructor-support system for group work by applying sound information processing. (1) Wearable devices worn by learners were improved through evaluation of sound-source separation features. (2) Deep-learning identification algorithms were developed to visualize learners' participation attitudes. (3) We developed a group-work logging system and a support system for annotating group-work participation information. (4) We developed a method for speaker anonymization of recorded group-work speech by applying deep-learning voice conversion. Due to the impact of COVID-19, we had to change our original plan and discontinued the face-to-face experiments, but we obtained new knowledge useful for online education.

  • A Study on Algorithms to Improve Intelligibility of Glossectomy Patients' Speech Using Deep Neural Networks

    Grant number:18K11376  2018.04 - 2022.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Abe Masanobu

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,290,000 (Direct expense: ¥3,300,000, Indirect expense: ¥990,000)

    In this study, we investigated voice conversion algorithms to improve the intelligibility of speech uttered by patients with articulation disorders caused by wide glossectomy and/or segmental mandibulectomy. To achieve real-time processing, the voice conversion directly modifies the waveform using the spectrum differential between a healthy speaker and a glossectomy speaker. The spectrum differential is estimated by deep neural networks (DNNs). To improve performance, we proposed using lip shapes as auxiliary inputs and introducing a knowledge distillation approach to make the best use of phoneme labels as auxiliary inputs. Experimental results showed that both approaches work well, and that phoneme labels with knowledge distillation perform better than lip shapes.
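The waveform-filtering step of the differential-spectrum approach described above can be sketched as follows. The DNN that predicts the log-spectral differential is out of scope here, so a precomputed `log_diff` array stands in for its output, and the frame and FFT sizes are assumptions.

```python
import numpy as np

def apply_spectral_differential(frames: np.ndarray, log_diff: np.ndarray,
                                n_fft: int = 512) -> np.ndarray:
    """Filter windowed frames by a predicted log-magnitude differential.

    frames:   (n_frames, n_fft) time-domain frames of the source speech.
    log_diff: (n_frames, n_fft // 2 + 1) log-spectral differential
              (target minus source), e.g. predicted by a DNN.
    """
    spec = np.fft.rfft(frames, n=n_fft)   # frame-wise spectrum
    spec *= np.exp(log_diff)              # scale magnitudes, keep phase
    return np.fft.irfft(spec, n=n_fft)    # back to the time domain

# A zero differential leaves the waveform unchanged (identity filter):
x = np.random.randn(8, 512)
y = apply_spectral_differential(x, np.zeros((8, 257)))
```

Because only a differential is applied to the original waveform, the source speaker's phase and residual detail are preserved, which is what makes waveform-level modification feasible in real time.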

  • Research and development of a platform for surveying the "liveliness" of areas from acoustic signals to support regional revitalization policy making

    2015.07 - 2018.03

    Ministry of Internal Affairs and Communications  Strategic Information and Communications R&D Promotion Programme

    Masanobu Abe, Sunao Hara

    Authorship:Coinvestigator(s)  Grant type:Competitive

  • Development of activity sound visualization method for personal evaluation of PBL

    Grant number:15K01069  2015.04 - 2018.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Ryuichi Nishimura, Sunao Hara

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,680,000 (Direct expense: ¥3,600,000, Indirect expense: ¥1,080,000)

    In this study, we developed methods to support the evaluation of students participating in PBL (Project-Based Learning) on the basis of sound-information visualization technologies. We examined a method for detecting activated communication in group work from dialogue speech, and developed a prototype system that presents the overall state of a group-work session using wearable voice-recording terminals. In addition, we investigated sound-source information visualization methods based on deep neural networks.

  • A study on monitoring systems with privacy protection control

    Grant number:15K00128  2015.04 - 2018.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Masanobu Abe, Sunao Hara

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,420,000 (Direct expense: ¥3,400,000, Indirect expense: ¥1,020,000)

    In this study, we developed a monitoring system that takes privacy issues into account. The system is based on a "living area" and controls the degree of watching-over and privacy protection by changing the granularity of that area. The living area is defined as the set of the home, frequently visited places of stay, and the travel routes connecting them. We proposed an algorithm that generates the living area from GPS data collected over a long period. Experimental results show that the proposed algorithm can estimate the living area with a precision of 0.85.
    We also carried out a questionnaire on user preferences regarding the monitoring and privacy-protection levels based on these living areas. The results showed that people on the monitoring side wanted the system to allow detailed monitoring. Conversely, for the people being monitored, the more detailed the monitoring, the stronger the feeling of being intrusively surveilled.
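As one concrete way to realize the living-area construction described above, stay points (places where the user remained within a small radius for a minimum duration) can be extracted from the GPS log and then connected by travel routes. The sketch below is a generic stay-point detector under assumed thresholds, not the paper's actual algorithm.

```python
import numpy as np

def stay_points(track, radius_m=200.0, min_stay_s=1200.0):
    """Detect stay points in a GPS track.

    track: list of (t_seconds, x_m, y_m) samples in a local metric frame.
    Returns centroids (x, y) of segments where the user stayed within
    radius_m of the segment start for at least min_stay_s seconds.
    """
    points, i, n = [], 0, len(track)
    while i < n:
        j = i + 1
        while j < n:
            dx = track[j][1] - track[i][1]
            dy = track[j][2] - track[i][2]
            if (dx * dx + dy * dy) ** 0.5 > radius_m:
                break
            j += 1
        if track[j - 1][0] - track[i][0] >= min_stay_s:
            seg = np.array([(x, y) for _, x, y in track[i:j]])
            points.append(tuple(seg.mean(axis=0)))
            i = j
        else:
            i += 1
    return points

# 30 minutes at the origin, then a quick departure: one stay point.
track = [(t * 60.0, 0.0, 0.0) for t in range(31)]
track += [(1860.0 + k * 60.0, 1000.0 * (k + 1), 0.0) for k in range(3)]
found = stay_points(track)
```

Changing `radius_m` and `min_stay_s` coarsens or refines the living area, which is the granularity knob the privacy-control scheme above relies on.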

  • Study on spoken dialogue system with safety consideration based on automatic estimation of driving situation

    Grant number:26730092  2014.04 - 2017.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Hara Sunao

    Authorship:Principal investigator 

    Grant amount: ¥3,640,000 (Direct expense: ¥2,800,000, Indirect expense: ¥840,000)

    We conducted the following:
    (1) We used biosignals and driving-information signals measured from the driver's body and the vehicle to estimate the road being driven from those signals. Furthermore, we estimated driving load assuming the use of sensors available on smartphones. (2) We evaluated the spoken dialog strategy from the viewpoint of the user's driving load; an objective evaluation by computer simulation was conducted considering both dialog initiative and the existence of confirmation utterances. (3) We conducted a subjective evaluation of the proposed dialog strategy as a spoken dialog system, in which subjects drove a simulator while talking with the system. (4) A dialog strategy based on graph search was introduced to realize a strategy that takes into account the estimated mental load of the user; we evaluated the proposed system by both objective and subjective evaluations.
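As an illustration of the graph-search formulation in (4), a dialog strategy can be cast as a shortest-path problem over dialog states, where edge costs reflect the estimated mental load each system action imposes on the driver. The graph, states, costs, and the `best_dialog_path` helper below are all hypothetical, not the system's actual design.

```python
import heapq

# Hypothetical dialog-state graph: each edge is (next_state, load_cost),
# where the cost models the mental load an action imposes on the driver
# (e.g. as estimated from biosignals). All states and costs are invented
# for illustration.
GRAPH = {
    "start":          [("ask_genre", 1.0), ("confirm_genre", 2.0)],
    "ask_genre":      [("confirm_shop", 1.5), ("present_result", 3.0)],
    "confirm_genre":  [("present_result", 1.0)],
    "confirm_shop":   [("present_result", 0.4)],
    "present_result": [],
}

def best_dialog_path(graph, start, goal):
    """Dijkstra search for the dialog plan with minimum total load."""
    heap, seen = [(0.0, start, [start])], set()
    while heap:
        cost, state, path = heapq.heappop(heap)
        if state == goal:
            return cost, path
        if state in seen:
            continue
        seen.add(state)
        for nxt, c in graph[state]:
            if nxt not in seen:
                heapq.heappush(heap, (cost + c, nxt, path + [nxt]))
    return float("inf"), []

cost, path = best_dialog_path(GRAPH, "start", "present_result")
```

In such a formulation, re-estimating the load costs online lets the planner switch to a less demanding dialog route when the driver's estimated mental load rises.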

 

Class subject in charge

  • Exercises on Programming 1 (2023 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2023 academic year) Second semester  - Wed 1–3

  • Exercises on Programming 1 (2023 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2023 academic year) Second semester  - Wed 1–3

  • Advanced Internship for Interdisciplinary Medical Sciences and Engineering (2023 academic year) Year-round  - Other

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2023 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2023 academic year) Year-round  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2023 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2023 academic year) Second semester  - Mon 1–2

  • Introduction to Information Processing 2 (2023 academic year) Second semester  - Thu 1–2

  • Information Technology Experiments B (Media Processing) (2023 academic year) Third semester  - Tue 3–7, Fri 3–7

  • Information Technology Experiments B (Media Processing) (2023 academic year) Third semester  - Tue 3–7, Fri 3–7

  • Advanced Research on Speech Processing I (2023 academic year) First semester  - Mon 1–2

  • Advanced Research on Speech Processing II (2023 academic year) First semester  - Thu 5–6

  • Digital Signal Processing (2022 academic year) Third semester  - Tue 1–2, Thu 1–2

  • Exercises on Programming 1 (2022 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2022 academic year) Second semester  - Wed 1–3

  • Exercises on Programming 1 (2022 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2022 academic year) Second semester  - Wed 1–3

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2022 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2022 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2022 academic year) Second semester  - Thu 1–2

  • Introduction to Information Processing 2 (2022 academic year) Second semester  - Mon 1–2

  • Information Technology Experiments B (Media Processing) (2022 academic year) Third semester  - Tue 3–7, Thu 3–7

  • Advanced Research on Speech Processing I (2022 academic year) First semester  - Mon 1–2

  • Advanced Research on Speech Processing II (2022 academic year) First semester  - Thu 5–6

  • Digital Signal Processing (2021 academic year) Fourth semester  - Mon 1–2, Thu 1–2

  • Exercises on Programming 1 (2021 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2021 academic year) Second semester  - Wed 1–3

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2021 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2021 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2021 academic year) Second semester  - Mon 1–2

  • Introduction to Information Processing 2 (2021 academic year) Second semester  - Thu 1–2

  • Information Technology Experiments B (Media Processing) (2021 academic year) Third semester  - Tue 3–7, Thu 3–7

  • Advanced Research on Speech Processing I (2021 academic year) First semester  - Thu 5–6

  • Advanced Research on Speech Processing II (2021 academic year) First semester  - Thu 5–6

  • Exercises on Programming (2020 academic year) 1st and 2nd semester  - Wed 1–3

  • Exercises on Programming 1 (2020 academic year) 1st semester  - Wed 1–3

  • Exercises on Programming 2 (2020 academic year) Second semester  - Wed 1–3

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2020 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2020 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2020 academic year) Second semester  - Mon 1–2

  • Introduction to Information Processing 2 (2020 academic year) Second semester  - Thu 1–2

  • Information Technology Experiments B (Media Processing) (2020 academic year) Third semester  - Tue 3–6, Thu 3–6

  • Laboratory Work on Information Technology III (2020 academic year) Third semester  - Tue 3–6

  • Laboratory Work on Information Technology IV (2020 academic year) Third semester  - Thu 3–6

  • Advanced Research on Speech Processing I (2020 academic year) First semester  - Thu 5–6

 

Academic Activities

  • The 24th Kansai Chapter Research Presentation Meeting for Young Researchers, Acoustical Society of Japan

    Role(s):Planning, management, etc.

    Acoustical Society of Japan, Kansai Chapter  ( Online (Gather.Town) )  2021.12.4

    Type:Academic society, research group, etc. 

    The Kansai Chapter of the Acoustical Society of Japan has held the "Research Presentation Meeting for Young Researchers" since 1998 to promote research exchange and mutual inspiration among young researchers, and many young researchers have participated and presented over the years. This year, in view of the social circumstances surrounding events due to COVID-19, the meeting will be held online. To deepen not only exchange among researchers but also industry-academia exchange, exhibits by supporting-member companies are also planned.
