Updated on 2022/06/16

HARA Sunao
Organization
Faculty of Interdisciplinary Science and Engineering in Health Systems
Position
Assistant Professor
Profile
He received the B.S., M.S., and Ph.D. degrees from Nagoya University in 2003, 2005, and 2011, respectively.
He is currently an assistant professor in the Faculty of Interdisciplinary Science and Engineering in Health Systems, Okayama University.
His research interests include the development and evaluation of spoken dialog systems in real environments.
He is a member of the Acoustical Society of Japan, the Human Interface Society, and the Information Processing Society of Japan.
Degree

  • Ph.D. (Information Science) (Nagoya University)

Research Interests

  • Human Interface

  • Spoken dialogue

  • Speech recognition

  • Lifelog

  • Acoustic scene analysis

  • Acoustic event detection

  • Deep Learning

  • Machine Learning

  • Speech processing

  • Spoken dialog system

Research Areas

  • Informatics / Intelligent informatics

  • Informatics / Web informatics and service informatics

  • Informatics / Perceptual information processing

Research History

  • Okayama University   Graduate School of Interdisciplinary Science and Engineering in Health Systems   Assistant Professor

    2019.4

    Country: Japan

    Notes: Faculty of Engineering, information-related departments

  • Okayama University   Graduate School of Natural Science and Technology   Assistant Professor

    2012.9 - 2019.3

    Country: Japan

    Notes: Faculty of Engineering, information-related departments

  • Nara Institute of Science and Technology   Assistant Professor

    2011.11 - 2012.9

Professional Memberships

  • IEEE

    2016.6

  • The Institute of Electronics, Information and Communication Engineers (IEICE)

    2012.2

  • Information Processing Society of Japan

    2007

  • Acoustical Society of Japan

    2004

  • Human Interface Society

    2005

Committee Memberships

  • Acoustical Society of Japan   Member, Executive Committee for the Remote 2022 Spring Meeting

    2021.10 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Executive Committee for the Remote 2021 Autumn Meeting

    2021.7 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan, Kansai Chapter   Chair, Executive Committee, 24th Research Meeting for Young Researchers

    2021.4 - 2022.3

    Committee type: Academic society

  • Acoustical Society of Japan   Member, Executive Committee for the Remote 2021 Spring Meeting

    2020.11 - 2021.3

  • Acoustical Society of Japan   Member, Executive Committee for the Remote 2020 Autumn Meeting

    2020.7 - 2020.9

    Committee type: Academic society

  • The Institute of Electronics, Information and Communication Engineers (IEICE)   Reviewer, Society Transactions Editorial Committee

    2017.8

    Committee type: Academic society

  • Information Processing Society of Japan, Chugoku Chapter   Member, Chapter Steering Committee

    2015.5 - 2019.5

    Committee type: Academic society

  • Acoustical Society of Japan   Reviewer, Editorial Committee

    2014.2

    Committee type: Academic society

  • Acoustical Society of Japan   Committee on Digitization and Public Relations Promotion

    2013.10

    Committee type: Academic society

  • Acoustical Society of Japan, Kansai Chapter   Member, Executive Committee, Research Meeting for Young Researchers

    2011.11

    Committee type: Academic society

  • Acoustical Society of Japan   Secretariat, Student and Young Researchers Forum

    2007.3 - 2012.3

    Committee type: Academic society

 

Papers

  • Acoustic Scene Classifier Based on Gaussian Mixture Model in the Concept Drift Situation Reviewed

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    Advances in Science, Technology and Engineering Systems Journal   6 ( 5 )   167 - 176   2021.9

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:ASTES Journal

    DOI: 10.25046/aj060519

  • Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech Reviewed International journal

    Naoto Kakegawa, Sunao Hara, Masanobu Abe, Yusuke Ijima

    Interspeech 2021   126 - 130   2021.8

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA

    The biggest obstacle to developing end-to-end Japanese text-to-speech (TTS) systems is estimating phonetic and prosodic information (PPI) from Japanese texts. The reasons are as follows: (1) the Kanji characters of the Japanese writing system have multiple corresponding pronunciations, (2) there is no separation mark between words, and (3) an accent nucleus must be assigned at appropriate positions. In this paper, we propose to solve these problems by neural machine translation (NMT) on the basis of encoder-decoder models, and compare NMT models of recurrent neural networks and the Transformer architecture. The proposed model handles texts on a token (character) basis, whereas conventional systems handle them on a word basis. To verify the potential of the proposed approach, NMT models are trained using pairs of sentences and their PPIs that are generated by a conventional Japanese TTS system from 5 million sentences. Evaluation experiments were performed using PPIs that were manually annotated for 5,142 sentences. The experimental results showed that the Transformer architecture has the best performance, with 98.0% accuracy for phonetic information estimation and 95.0% accuracy for PPI estimation. Judging from the results, NMT models are promising toward end-to-end Japanese TTS.

    DOI: 10.21437/interspeech.2021-914

  • Model architectures to extrapolate emotional expressions in DNN-based text-to-speech Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    Speech Communication   126   35 - 43   2021.2

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:Elsevier BV

    This paper proposes architectures that facilitate the extrapolation of emotional expressions in deep neural network (DNN)-based text-to-speech (TTS). In this study, "extrapolating emotional expressions" means borrowing emotional expressions from other speakers, so that collecting emotional speech uttered by target speakers is unnecessary. Although a DNN has the potential to construct DNN-based TTS with emotional expressions, and some DNN-based TTS systems have demonstrated satisfactory performance in expressing the diversity of human speech, it is necessary yet troublesome to collect emotional speech uttered by target speakers. To solve this issue, we propose architectures that separately train the speaker feature and the emotional feature and synthesize speech with any combination of speakers and emotions. The architectures are the parallel model (PM), serial model (SM), auxiliary input model (AIM), and hybrid models (PM&AIM and SM&AIM). These models are trained on emotional speech uttered by a few speakers and neutral speech uttered by many speakers. Objective evaluations show that performance in the open-emotion test is insufficient compared with that in the closed-emotion test, partly because each speaker has their own manner of expressing emotion. However, subjective evaluation results indicate that the proposed models can convey emotional information to some extent. Notably, the PM can correctly convey sad and joyful emotions at a rate of 60%.

    DOI: 10.1016/j.specom.2020.11.004

    Other Link: https://arxiv.org/abs/2102.10345

  • Module Comparison of Transformer-TTS for Speaker Adaptation based on Fine-tuning Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference (APSIPA-ASC 2020)   826 - 830   2020.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE/APSIPA

    End-to-end text-to-speech (TTS) models have achieved remarkable results in recent times. However, such models require a large amount of text and audio data for training. A speaker adaptation method based on fine-tuning has been proposed for constructing a TTS model from small-scale data. Although these methods can replicate the target speaker's voice quality, the synthesized speech includes deletions and/or repetitions of speech. The goal of speaker adaptation is to change the voice quality to match the target speaker's, on the premise that adjusting only the necessary modules will reduce the amount of data needed for fine-tuning. In this paper, we clarify the role of each module in the Transformer-TTS process by not updating it. Specifically, we froze the character embedding, the encoder, the layer predicting the stop token, and the loss function for estimating sentence endings. The experimental results showed the following: (1) fine-tuning the character embedding did not improve the deletion and/or repetition of speech, (2) speech deletion increases if the encoder is not fine-tuned, (3) speech deletion was suppressed when the layer predicting the stop token was not fine-tuned, and (4) speech repetitions occur frequently at sentence endings when the loss function estimating sentence endings is omitted.

  • Concept Drift Adaptation for Acoustic Scene Classifier Based on Gaussian Mixture Model Reviewed

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    Proceedings of IEEE REGION 10 CONFERENCE (TENCON 2020)   450 - 455   2020.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    DOI: 10.1109/tencon50793.2020.9293766

  • Controlling the Strength of Emotions in Speech-Like Emotional Sound Generated by WaveNet Reviewed International journal

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Proceedings of Interspeech 2020   3421 - 3425   2020.10

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA

    DOI: 10.21437/interspeech.2020-2064

  • Semi-Supervised Speaker Adaptation for End-to-End Speech Synthesis with Pretrained Models Reviewed International coauthorship International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020)   7634 - 7638   2020.5

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    Recently, end-to-end text-to-speech (TTS) models have achieved remarkable performance; however, they require a large amount of paired text and speech data for training. On the other hand, we can easily collect a few dozen minutes of unpaired speech recordings for a target speaker without corresponding text data. To make use of such accessible data, the proposed method leverages the recent success of state-of-the-art end-to-end automatic speech recognition (ASR) systems and obtains corresponding transcriptions from pretrained ASR models. Although these models provide only text output rather than intermediate linguistic features such as phonemes, end-to-end TTS can be trained well with such raw text data directly. Thus, the proposed method can greatly simplify a speaker adaptation pipeline by consistently employing end-to-end ASR/TTS ecosystems. The experimental results show that our proposed method achieved performance comparable to a paired-data adaptation method in terms of subjective speaker similarity and objective cepstral distance measures.

    DOI: 10.1109/icassp40776.2020.9053371

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients' Speech Reviewed International journal

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2019   138 - 142   2019.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    In this paper, we propose using phonemic information in addition to acoustic features to improve the intelligibility of speech uttered by patients with articulation disorders caused by a wide glossectomy. Our previous studies showed that a voice conversion algorithm improves the quality of glossectomy patients' speech. However, the losses in acoustic features of glossectomy patients' speech are so large that the quality of the reconstructed speech is low. To solve this problem, we explored the potential of several types of additional information to improve speech intelligibility. One of the candidates is phonemic information, more specifically Phoneme Labels as Auxiliary input (PLA). To combine both acoustic features and PLA, we employed a DNN-based algorithm. PLA is represented by a kind of one-of-k vector, i.e., PLA has a weight value (<1.0) that gradually changes along the time axis, whereas one-of-k has a binary value (0 or 1). The results showed that the proposed algorithm reduced the mel-frequency cepstral distortion for all phonemes and almost always improved intelligibility. Notably, the intelligibility was largely improved for the phonemes /s/ and /z/, mainly because the tongue is used to sustain the constriction that produces these phonemes. This indicates that PLA works well to compensate for the lack of a tongue.

    DOI: 10.1109/APSIPAASC47483.2019.9023168
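The mel-frequency cepstral distortion used as the objective measure in the entry above has a widely used standard definition; the following helper is our own illustrative sketch (not the authors' code), using the common 10·sqrt(2)/ln 10 scaling with the 0th coefficient excluded:

```python
import numpy as np

def mel_cepstral_distortion(c_ref, c_conv):
    """Mean mel-cepstral distortion (dB) between two aligned sequences.

    c_ref, c_conv: arrays of shape (frames, order), assumed time-aligned
    and with the 0th (energy) coefficient already excluded.
    """
    diff = np.asarray(c_ref, dtype=float) - np.asarray(c_conv, dtype=float)
    # per-frame distortion, then average over frames
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    return float(np.mean(per_frame))
```

Identical sequences give a distortion of exactly zero; any spectral mismatch yields a positive dB value.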

  • Speech-like Emotional Sound Generator by WaveNet Reviewed International journal

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2019   143 - 147   2019.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    In this paper, we propose a new algorithm to generate Speech-like Emotional Sound (SES). Emotional information plays an important role in human communication, and speech is one of the most useful media for expressing emotions. Although, in general, speech conveys emotional information as well as linguistic information, we have undertaken the challenge of generating sounds that convey emotional information without linguistic information, which makes conversations in human-machine interactions more natural in some situations by providing non-verbal emotional vocalizations. We call the generated sounds "speech-like" because they do not contain any linguistic information. For this purpose, we propose to employ WaveNet as a sound generator conditioned only on emotional IDs. This idea is quite different from the WaveNet vocoder, which synthesizes speech using spectrum information as auxiliary features. The biggest advantage of the idea is that it reduces the amount of emotional speech data needed for training. The proposed algorithm consists of two steps. In the first step, WaveNet is trained to obtain phonetic features using a large speech database; in the second step, WaveNet is re-trained using a small amount of emotional speech. Subjective listening evaluations showed that the SES could convey emotional information and was judged to sound like a human voice.

    DOI: 10.1109/APSIPAASC47483.2019.9023346

  • A signal processing perspective on human gait: Decoupling walking oscillations and gestures Reviewed International journal

    Adrien Gregorj, Zeynep Yücel, Sunao Hara, Akito Monden, Masahiro Shiomi

    Proceedings of the 4th International Conference on Interactive Collaborative Robotics 2019 (ICR 2019)   11659   75 - 85   2019.8

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:SPRINGER INTERNATIONAL PUBLISHING AG

    This study focuses on gesture recognition in mobile interaction settings, i.e. when the interacting partners are walking. This kind of interaction requires particular coordination, e.g. staying in the field of view of the partner, avoiding obstacles without disrupting group composition, and sustaining joint attention during motion. In the literature, various studies have shown that gestures are closely related to achieving such goals. Thus, a mobile robot moving in a group with human pedestrians has to identify such gestures to sustain group coordination. However, decoupling the inherent walking oscillations from gestures is a big challenge for the robot. To that end, we employ video data recorded in uncontrolled settings and detect arm gestures performed by human-human pedestrian pairs by adopting a signal processing approach. Namely, we exploit the fact that there is an inherent oscillatory motion at the upper limbs arising from the gait, independent of the view angle or the distance of the user to the camera. We identify arm gestures as disturbances on these oscillations. In doing so, we use a simple pitch detection method from speech processing and assume data involving a low-frequency periodicity to be free of gestures. In testing, we employ a video data set recorded in uncontrolled settings and show that we achieve a detection rate of 0.80.

    DOI: 10.1007/978-3-030-26118-4_8
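The core signal-processing idea in the entry above (treating segments whose upper-limb motion lacks the low-frequency gait periodicity as candidate gestures) can be illustrated with a simple autocorrelation-based periodicity check. This toy sketch is ours, not the authors' pipeline; the gait band and frame rate are assumptions:

```python
import numpy as np

def periodicity_strength(signal, fs, fmin=0.5, fmax=3.0):
    """Normalized autocorrelation peak within an assumed gait band (Hz).

    A peak near 1 suggests a regular walking oscillation; a weak peak
    suggests the oscillation is disturbed, e.g. by an arm gesture.
    """
    x = np.asarray(signal, dtype=float)
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0..N-1
    ac = ac / (ac[0] + 1e-12)                          # normalize: lag 0 == 1
    lo, hi = int(fs / fmax), int(fs / fmin)            # lag range for the band
    return float(np.max(ac[lo:min(hi, len(ac) - 1)]))

fs = 30.0                                   # e.g. 30 fps video tracking (assumed)
t = np.arange(0, 10, 1 / fs)
walking = np.sin(2 * np.pi * 1.8 * t)       # clean ~1.8 Hz gait-like oscillation
rng = np.random.default_rng(0)
gesture = walking + 2.0 * rng.standard_normal(t.size)  # oscillation disturbed
```

Thresholding this score would then separate "gesture-free" from "gesture-candidate" windows.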

  • Naturalness Improvement Algorithm for Reconstructed Glossectomy Patient's Speech Using Spectral Differential Modification in Voice Conversion Reviewed International journal

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2018, 19th Annual Conference of the International Speech Communication Association   2464 - 2468   2018.9

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA

    In this paper, we propose an algorithm to improve the naturalness of reconstructed glossectomy patients' speech that is generated by voice conversion (VC) to enhance the intelligibility of speech uttered by patients with a wide glossectomy. While existing VC algorithms make it possible to improve intelligibility and naturalness, the results are still not satisfactory. To solve the remaining problems, we propose to directly modify the speech waveforms using a spectrum differential. The motivation is that glossectomy patients mainly have problems in their vocal tract, not in their vocal cords. The proposed algorithm requires no source parameter extraction for speech synthesis, so there are no errors in source parameter extraction and we are able to make the best use of the original source characteristics. For spectrum conversion, we evaluate both GMM and DNN. Subjective evaluations show that our algorithm can synthesize more natural speech than the vocoder-based method. Judging from observations of the spectrogram, power in the high-frequency bands of fricatives and stops is reconstructed to be similar to that of natural speech.

    DOI: 10.21437/Interspeech.2018-1239

  • Sound sensing using smartphones as a crowdsourcing approach Reviewed International journal

    Sunao Hara, Asako Hatakeyama, Shota Kobayashi, Masanobu Abe

    2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017   1328 - 1333   2017.12

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    DOI: 10.1109/APSIPA.2017.8282238

  • An investigation to transplant emotional expressions in DNN-based TTS synthesis Reviewed International journal

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017   1253 - 1258   2017.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    DOI: 10.1109/APSIPA.2017.8282231

  • New monitoring scheme for persons with dementia through monitoring-area adaptation according to stage of disease Reviewed International journal

    Shigeki Kamada, Yuji Matsuo, Sunao Hara, Masanobu Abe

    Proceedings of the 1st ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks, LocalRec@SIGSPATIAL 2017   1:1 - 1:7   2017.11

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ACM

    DOI: 10.1145/3148150.3148151

    Other Link: http://doi.acm.org/10.1145/3148150.3148151

  • Prediction of subjective assessments for a noise map using deep neural networks Reviewed International journal

    Shota Kobayashi, Masanobu Abe, Sunao Hara

    Adjunct Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers, UbiComp/ISWC 2017   113 - 116   2017.9

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ACM

    DOI: 10.1145/3123024.3123091

    Other Link: http://doi.acm.org/10.1145/3123024.3123091

  • Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-Based Voice Conversion

    Kei Tanaka, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2017   3384 - 3388   2017.8

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA

    In this paper, using a GMM-based voice conversion algorithm, we propose generating speaker-dependent mapping functions to improve the intelligibility of speech uttered by patients with a wide glossectomy. The speaker-dependent approach makes it possible to generate mapping functions that reconstruct the missing spectrum features of speech uttered by a patient without being influenced by speaker-specific factors. The proposed idea is simple, i.e., to collect speech uttered by a patient before and after the glossectomy, but in practice it is hard to ask patients to utter speech just for developing algorithms. To confirm the performance of the proposed approach, in order to simulate glossectomy patients, we fabricated an intraoral appliance that covers the lower dental arch and tongue surface to restrain tongue movements. In terms of the mel-frequency cepstrum (MFC) distance, applying the voice conversion reduced the distances by 25% and 42% for the speaker-dependent and speaker-independent cases, respectively. In terms of phoneme intelligibility, dictation tests revealed that speech reconstructed by the speaker-dependent approach almost always performed better than the original speech uttered by simulated patients, while the speaker-independent approach did not.

    DOI: 10.21437/Interspeech.2017-841
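As a rough illustration of GMM-based spectral mapping of the kind used in the entry above, one common formulation fits a GMM on joint source-target feature vectors and converts new source frames with the conditional mean. This is a generic sketch under our own toy setup, not the paper's system:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=4, seed=0):
    """Fit a full-covariance GMM on joint [source, target] vectors."""
    Z = np.hstack([X, Y])
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=seed).fit(Z)

def convert(gmm, X, dx):
    """Minimum mean-square-error mapping E[y|x] under the joint GMM."""
    mu_x, mu_y = gmm.means_[:, :dx], gmm.means_[:, dx:]
    S = gmm.covariances_
    Sxx, Syx = S[:, :dx, :dx], S[:, dx:, :dx]
    # mixture responsibilities for x under the marginal p(x)
    logp = np.stack([multivariate_normal.logpdf(X, mu_x[k], Sxx[k])
                     for k in range(gmm.n_components)], axis=1)
    logp += np.log(gmm.weights_)
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # responsibility-weighted per-component conditional means
    Y_hat = np.zeros((X.shape[0], mu_y.shape[1]))
    for k in range(gmm.n_components):
        cond = mu_y[k] + (X - mu_x[k]) @ np.linalg.solve(Sxx[k], Syx[k].T)
        Y_hat += r[:, [k]] * cond
    return Y_hat
```

With parallel (time-aligned) source/target spectral features, `fit_joint_gmm` learns the mapping and `convert` applies it frame by frame; real systems add refinements such as dynamic features and global-variance compensation.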

  • Enhancing a glossectomy patient's speech via GMM-based voice conversion Reviewed International journal

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)   1 - 4   2016.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    DOI: 10.1109/apsipa.2016.7820909

  • LiBS: Lifelog browsing system to support sharing of memories Reviewed International journal

    Atsuya Namba, Sunao Hara, Masanobu Abe

    UbiComp 2016 Adjunct - Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing   165 - 168   2016.9

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Association for Computing Machinery, Inc

    We propose a lifelog browsing system through which users can share memories of their experiences with other users. Most importantly, by using global positioning system data and time stamps, the system simultaneously displays a variety of log information in a time-synchronous manner. This function provides users not only with an easy interpretation of other users' experiences but also with nonverbal notifications. Shared information on this system includes photographs taken by users, Google Street Views, shops and restaurants on the map, daily weather, and other items relevant to users' interests. In evaluation experiments, users preferred the proposed system to conventional photograph albums and maps for explaining and sharing their experiences. Moreover, through the displayed information, the listeners discovered items of interest that had not been mentioned by the speakers.

    DOI: 10.1145/2968219.2971401

    Other Link: http://doi.acm.org/10.1145/2968219.2971401

  • Safety vs. Privacy: User preferences from the monitored and monitoring sides of a monitoring system Reviewed International journal

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    UbiComp 2016 Adjunct - Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing   101 - 104   2016.9

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:Association for Computing Machinery, Inc

    In this study, in order to develop a monitoring system that takes privacy issues into account, we investigated user preferences in terms of the monitoring and privacy protection levels. The people on the monitoring side wanted the monitoring system to allow them to monitor in detail. Conversely, for the people being monitored, the more detailed the monitoring, the greater the feeling of being intrusively surveilled. Evaluation experiments were performed using the location data of three people in different living areas. The results of the experiments show that it is possible to control the levels of monitoring and privacy protection without being affected by the shape of a living area by adjusting the quantization level of location information. Furthermore, it became clear that the granularity of location information that satisfies the monitored side differs from that which satisfies the monitoring side.

    DOI: 10.1145/2968219.2971412

    Other Link: http://doi.acm.org/10.1145/2968219.2971412

  • Sound collection systems using a crowdsourcing approach to construct sound map based on subjective evaluation Reviewed International journal

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    IEEE ICME Workshop on Multimedia Mobile Cloud for Smart City Applications (MMCloudCity-2016)   1 - 6   2016

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    This paper presents a sound collection system that uses crowdsourcing to gather information for visualizing area characteristics. First, we developed a sound collection system to simultaneously collect physical sounds, their statistics, and subjective evaluations. We then conducted a sound collection experiment using the developed system with 14 participants. We collected 693,582 samples of equivalent A-weighted loudness levels and their locations, and 5,935 samples of sounds and their locations. The data also include subjective evaluations by the participants. In addition, we analyzed the changes in the sound properties of some areas before and after the opening of a large-scale shopping mall in a city. Next, we implemented visualizations on the server system to attract users' interest. Finally, we published the system, which can receive sounds from any Android smartphone user. The sound data were continuously collected, achieving the intended result.

    DOI: 10.1109/ICMEW.2016.7574694
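The equivalent A-weighted loudness levels mentioned in the entry above are, by convention, energy averages rather than arithmetic averages of decibel values. The helper below is our own minimal illustration of that aggregation, not the system's code:

```python
import math

def equivalent_level(levels_db):
    """Energy-average a list of sound levels (dB) into an equivalent level Leq.

    Each level is converted back to relative energy (10^(L/10)), the
    energies are averaged, and the mean is converted back to decibels.
    """
    if not levels_db:
        raise ValueError("no samples")
    mean_energy = sum(10 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)
```

A constant level passes through unchanged, while a single loud sample dominates the average, which is why Leq is preferred for noise mapping over a plain mean of dB values.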

  • A Spoken Dialog System with Redundant Response to Prevent User Misunderstanding Reviewed International journal

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    Proceedings of APSIPA Annual Summit and Conference 2015   229 - 232   2015.12

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    We propose a spoken dialog strategy for car navigation systems to facilitate safe driving. To drive safely, drivers need to concentrate on their driving; however, their concentration may be disrupted by disagreement with their spoken dialog system. Therefore, we need to solve the problem of user misunderstandings as well as misunderstandings by spoken dialog systems. For this purpose, we introduced a driver workload level into spoken dialog management in order to prevent user misunderstandings. A key strategy of the dialog management is to make speech redundant if the driver's workload is too high, assuming that the user would probably misunderstand the system utterance under such a condition. An experiment was conducted to compare the performance of the proposed method and a conventional method using a user simulator. The simulator was developed under the assumption of two types of drivers: an experienced driver model and a novice driver model. Experimental results showed that the proposed strategy achieved better performance than the conventional one in terms of task completion time, task completion rate, and the user's positive speech rate. In particular, these performance differences are greater for novice users than for experienced users.

    DOI: 10.1109/APSIPA.2015.7415511

  • Extracting Daily Patterns of Human Activity Using Non-Negative Matrix Factorization Reviewed International journal

    Masanobu Abe, Akihiko Hirayama, Sunao Hara

    Proceedings of IEEE International Conference on Consumer Electronics (IEEE-ICCE 2015)   36 - 39   2015.1

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE

    This paper presents an algorithm to mine basic patterns of human activities on a daily basis using non-negative matrix factorization (NMF). The greatest benefit of the algorithm is that it can elicit patterns whose meanings can be easily interpreted. To confirm its performance, the proposed algorithm was applied to PC logging data collected from three occupations in offices. Daily patterns of software usage were extracted for each occupation. Results show that each occupation uses specific software in its own time period, and uses several types of software in parallel in its own combinations. Experimental results also show that 144-dimension pattern vectors were compressible to 11-dimension vectors without degradation in occupation classification performance. Therefore, the proposed algorithm compressed basic software usage patterns to about one-tenth of their original dimensions while preserving the original information. Moreover, the extracted basic patterns gave a reasonable interpretation of daily working patterns in offices.

    DOI: 10.1109/ICCE.2015.7066309

  • Sub-Band Text-to-Speech Combining Sample-Based Spectrum with Statistically Generated Spectrum Reviewed International journal

    Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki, Hideyuki Mizuno

    Proceedings of Interspeech 2015   264 - 268   2015

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    In this paper, we propose a sub-band speech synthesis approach to develop a high-quality Text-to-Speech (TTS) system: a sample-based spectrum is used in the high-frequency band, and a spectrum generated by HMM-based TTS is used in the low-frequency band. Here, a sample-based spectrum means a spectrum selected from a phoneme database as the one most similar to the spectrum generated by HMM-based speech synthesis. The key idea is to compensate for the over-smoothing caused by statistical procedures by introducing a sample-based spectrum, especially in the high-frequency band. Listening test results show that the proposed method performs better than HMM-based speech synthesis in terms of clarity and is at the same level in terms of smoothness. In addition, preference test results among the proposed method, HMM-based speech synthesis, and waveform speech synthesis using 80 min of speech data reveal that the proposed method is the most preferred.

    Web of Science

    researchmap

    Other Link: http://dblp.uni-trier.de/db/conf/interspeech/interspeech2015.html#conf/interspeech/InaiHAIMM15

  • Sound collection and visualization system enabled participatory and opportunistic sensing approaches Reviewed International journal

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    Proceedings of 2015 IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops)   390 - 395   2015

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper presents a sound collection system that visualizes environmental sounds collected using a crowdsourcing approach. An analysis of physical features is generally used to analyze sound properties; however, human beings not only analyze sounds but also connect to them emotionally. If we want to visualize sounds according to the characteristics of the listener, we need to collect not only the raw sounds but also the subjective feelings associated with them. For this purpose, we developed a sound collection system using a crowdsourcing approach to collect physical sounds, their statistics, and subjective evaluations simultaneously. We then conducted a sound collection experiment with ten participants using the developed system. We collected 6,257 samples of equivalent loudness levels with their locations, and 516 samples of sounds with their locations. Subjective evaluations by the participants are also included in the data. Next, we visualized the sounds on a map: the loudness levels are visualized as a color map, and the sounds as icons that indicate the sound type. Finally, we conducted a discrimination experiment on the sounds to implement automatic conversion from sounds to appropriate icons. The classifier is trained on the basis of the GMM-UBM (Gaussian Mixture Model and Universal Background Model) method. Experimental results show an F-measure of 0.52 and an AUC of 0.79.
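    The GMM-UBM scoring step can be sketched as follows; the data and model sizes are synthetic stand-ins, and the simple re-estimation from UBM parameters below stands in for full MAP adaptation.

    ```python
    # Sketch of GMM-UBM scoring for sound-type classification, assuming
    # per-frame acoustic feature vectors; all data here is synthetic.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)
    background = rng.normal(0.0, 1.0, size=(500, 13))   # pooled features (UBM data)
    class_a    = rng.normal(0.5, 1.0, size=(200, 13))   # one target sound class

    # 1) Train the Universal Background Model on pooled data.
    ubm = GaussianMixture(n_components=4, random_state=0).fit(background)

    # 2) Derive a class model from the UBM (re-estimation initialized with
    #    the UBM parameters, a stand-in for MAP adaptation).
    gmm_a = GaussianMixture(
        n_components=4,
        weights_init=ubm.weights_,
        means_init=ubm.means_,
        random_state=0,
    ).fit(class_a)

    # 3) Score a test clip as the average log-likelihood ratio.
    test = rng.normal(0.5, 1.0, size=(50, 13))
    score = gmm_a.score(test) - ubm.score(test)
    print(score)
    ```

    A positive score indicates the clip matches the class model better than the background model; thresholding (or comparing across class models) gives the icon label.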

    DOI: 10.1109/PERCOMW.2015.7134069

    Web of Science

    researchmap

    Other Link: https://ousar.lib.okayama-u.ac.jp/ja/53271

  • Algorithm to Estimate a Living Area Based on Connectivity of Places with Home Reviewed International journal

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    HCI International 2015 - Posters’ Extended Abstracts (Part II), CCIS 529   529   570 - 576   2015

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:SPRINGER-VERLAG BERLIN  

    We propose an algorithm to estimate a person's living area using his/her collected Global Positioning System (GPS) data. The most important feature of the algorithm is the connectivity of places with the home, i.e., a living area must consist of a home, important places, and the routes that connect them. This definition is logical because people usually go to a place from home, and there can be several routes to that place. Experimental results show that the proposed algorithm can estimate a living area with a precision of 0.82 and a recall of 0.86 compared with the ground truth established by users. It is also confirmed that the connectivity of places with the home is necessary to estimate a reasonable living area.
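    The connectivity-with-home idea can be illustrated with a minimal graph search; the place names and routes below are hypothetical illustration data, not from the paper.

    ```python
    # Sketch of the "connectivity with home" idea: keep only places that
    # are reachable from home through observed travel routes.
    from collections import deque

    # Edges = routes observed between significant places in GPS traces.
    routes = {
        "home":        ["station", "supermarket"],
        "station":     ["home", "office"],
        "office":      ["station"],
        "supermarket": ["home"],
        "airport":     ["remote_city"],   # visited once; no route from home kept
        "remote_city": ["airport"],
    }

    def living_area(graph, start="home"):
        """Return the set of places connected to home (BFS over routes)."""
        seen, queue = {start}, deque([start])
        while queue:
            place = queue.popleft()
            for nxt in graph.get(place, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    print(sorted(living_area(routes)))
    # ['home', 'office', 'station', 'supermarket']
    ```

    Places with no retained route from home (here, the one-off airport trip) fall outside the estimated living area even if they were visited.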

    DOI: 10.1007/978-3-319-21383-5_95

    Web of Science

    researchmap

  • Extraction of Key Segments from Day-Long Sound Data Reviewed International journal

    Akinori Kasai, Sunao Hara, Masanobu Abe

    HCI International 2015 - Posters’ Extended Abstracts (Part I), CCIS 528   528   620 - 626   2015

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:SPRINGER-VERLAG BERLIN  

    We propose a method to extract particular sound segments from the sound recorded during the course of a day in order to provide sound segments that can be used to facilitate memory. To extract important parts of the sound data, the proposed method utilizes human behavior based on a multisensing approach. To evaluate the performance of the proposed method, we conducted experiments using sound, acceleration, and global positioning system data collected by five participants for approximately two weeks. The experimental results are summarized as follows: (1) various sounds can be extracted by dividing a day into scenes using the acceleration data; (2) sound recorded in unusual places is preferable to sound recorded in usual places; and (3) speech is preferable to nonspeech sound.

    DOI: 10.1007/978-3-319-21380-4_105

    Web of Science

    researchmap

  • Inhibitory Effects of an Orally Active Small Molecule Alpha4beta1/Alpha4beta7 Integrin Antagonist, TRK-170, on Spontaneous Colitis in HLA-B27 Transgenic Rats

    Hiroe Hirokawa, Yoko Koga, Rie Sasaki, Sunao Hara, Hiroyuki Meguro, Mie Kainoh

    GASTROENTEROLOGY   146 ( 5 )   S640 - S640   2014.5

     More details

    Language:English   Publisher:W B SAUNDERS CO-ELSEVIER INC  

    Web of Science

    researchmap

  • A graph-based spoken dialog strategy utilizing multiple understanding hypotheses Reviewed

    Norihide Kitaoka, Yuji Kinoshita, Sunao Hara, Chiyomi Miyajima, Kazuya Takeda

    Information and Media Technologies   9 ( 1 )   111 - 120   2014.3

     More details

    Language:English   Publishing type:Research paper (scientific journal)  

    We regarded a dialog strategy for information retrieval as a graph search problem and proposed several novel dialog strategies that can recover from misrecognition through a spoken dialog that traverses the graph. To recover from misrecognition without seeking confirmation, our system kept multiple understanding hypotheses at each turn and searched for a globally optimal hypothesis in a graph whose nodes express understanding states across user utterances over the whole dialog. In the search, we used a new criterion based on efficiency in information retrieval and consistency with the understanding hypotheses, which is also used to select an appropriate system response. We showed that our system produces more efficient and natural dialogs than previous ones.

    DOI: 10.11185/imt.9.111

    researchmap

  • New approach to emotional information exchange: Experience metaphor based on life logs Reviewed International journal

    Masanobu Abe, Daisuke Fujioka, Kazuto Hamano, Sunao Hara, Rika Mochizuki, Tomoki Watanabe

    2014 IEEE International Conference on Pervasive Computing and Communication Workshops, PerCom 2014 Workshops   191 - 194   2014.3

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    We are striving to develop a new communication technology based on individual experiences that can be extracted from life logs. We have proposed the "Emotion Communication Model" and confirmed that significant correlation exists between experience and emotion. As the second step, particularly addressing impressive places and events, this paper describes an investigation of the extent to which we can share emotional information with others through individuals' experiences. Subjective experiments were conducted using life log data collected during 7-47 months. Experiment results show that (1) impressive places are determined by the distance from home, visit frequency, and direction from home and that (2) positive emotional information is highly consistent among people (71.4%), but it is not true for negative emotional information. Therefore, experiences are useful as metaphors to express positive emotional information.

    DOI: 10.1109/PerComW.2014.6815198

    researchmap

  • Development of a Toolkit Handling Multiple Speech-Oriented Guidance Agents for Mobile Applications Reviewed International journal

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    Natural Interaction with Robots, Knowbots and Smartphones, Putting Spoken Dialog Systems into Practice   79 - 85   2014

     More details

    Authorship:Lead author   Language:English   Publishing type:Part of collection (book)   Publisher:Springer  

    DOI: 10.1007/978-1-4614-8280-2_8

    researchmap

  • Evaluation of Invalid Input Discrimination Using Bag-of-Words for Speech-Oriented Guidance System Reviewed International journal

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    Natural Interaction with Robots, Knowbots and Smartphones, Putting Spoken Dialog Systems into Practice   389 - 397   2014

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:Springer  

    DOI: 10.1007/978-1-4614-8280-2_35

    researchmap

  • A Hybrid Text-to-Speech Based on Sub-Band Approach Reviewed International journal

    Takuma Inoue, Sunao Hara, Masanobu Abe

    Proceedings of Asia-Pacific Signal and Information Processing Association 2014 Annual Summit and Conference   1 - 4   2014

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    This paper proposes a sub-band speech synthesis approach to develop high-quality Text-to-Speech (TTS). Hidden Markov Model (HMM)-based speech synthesis is used for the low-frequency band, and waveform-based speech synthesis for the high-frequency band. Both methods are widely known to perform well while having benefits and shortcomings from different points of view; one motivation is therefore to apply the right speech synthesis method in the right frequency band. Experimental results show that the proposed approach performs better than waveform-based speech synthesis in terms of smoothness and better than HMM-based speech synthesis in terms of clarity. Consequently, the proposed approach combines the inherent benefits of both waveform-based and HMM-based speech synthesis.
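    The sub-band combination can be sketched with standard filtering; the two "synthesized" input signals and the band boundary below are assumptions for illustration, not the paper's actual synthesis outputs.

    ```python
    # Sketch of the sub-band combination: take the low band from one
    # synthesis method and the high band from another, then sum them.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt

    fs = 16000
    t = np.arange(fs) / fs
    hmm_like  = np.sin(2 * np.pi * 200 * t)    # stands in for HMM-based output
    wave_like = np.sin(2 * np.pi * 5000 * t)   # stands in for waveform-based output

    cutoff = 4000  # Hz, hypothetical band boundary
    sos_lo = butter(8, cutoff, btype="lowpass",  fs=fs, output="sos")
    sos_hi = butter(8, cutoff, btype="highpass", fs=fs, output="sos")

    # Low band from the statistical method, high band from the sample-based one.
    hybrid = sosfiltfilt(sos_lo, hmm_like) + sosfiltfilt(sos_hi, wave_like)
    print(hybrid.shape)
    ```

    Zero-phase filtering (`sosfiltfilt`) keeps the two bands time-aligned when they are summed.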

    DOI: 10.1109/APSIPA.2014.7041575

    Web of Science

    researchmap

  • Invalid Input Rejection Using Bag-of-Words for Speech-oriented Guidance System Reviewed

    Haruka Majima, Yoko Fujita, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    Journal of Information Processing   54 ( 2 )   443 - 451   2013.2

     More details

    Language:Japanese  

    In a real-environment speech-oriented information guidance system, discriminating valid from invalid inputs is important because invalid inputs such as noise, laughter, coughs, and utterances between users lead to unpredictable system responses. Generally, acoustic features such as MFCCs (Mel-Frequency Cepstral Coefficients) are used for the discrimination; comparing the acoustic likelihoods of GMMs (Gaussian Mixture Models) trained on speech data and noise data is one typical method. In addition, using linguistic features such as speech recognition results is expected to improve discrimination accuracy, as they reflect the task domain of valid inputs and the meaningless recognition results produced by noise inputs. In this paper, we introduce Bag-of-Words (BOW) as a feature for discriminating between valid and invalid inputs. A Support Vector Machine (SVM) and the Maximum Entropy method (ME) are employed to realize robust classification. We evaluated the methods using real-environment data obtained from the guidance system "Takemaru-kun." By applying BOW with an SVM, the F-measure improved to 85.09% from 82.19% when using GMMs. In addition, experiments combining BOW with acoustic likelihoods from GMMs, duration, and SNR further improved the F-measure to 86.58%.
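    The BOW-plus-SVM discrimination can be sketched with scikit-learn; the toy transcripts and labels below are hypothetical stand-ins for the Takemaru-kun data.

    ```python
    # Sketch of valid/invalid input discrimination with Bag-of-Words
    # features and a linear SVM, on hypothetical recognition results.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    texts = [
        "where is the library",        # valid query
        "what time do you close",      # valid query
        "tell me about the station",   # valid query
        "ha ha ha",                    # laughter recognized as filler words
        "uh uh",                       # noise / hesitation
        "ha ha",                       # laughter
    ]
    labels = [1, 1, 1, 0, 0, 0]        # 1 = valid, 0 = invalid

    # CountVectorizer builds the BOW features; LinearSVC classifies them.
    clf = make_pipeline(CountVectorizer(), LinearSVC())
    clf.fit(texts, labels)
    print(clf.predict(["where is the station", "ha ha ha ha"]))
    ```

    In the paper's setting the BOW features come from the recognizer's output, so meaningless recognition results from noise cluster away from in-domain queries.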

    CiNii Article

    CiNii Books

    researchmap

  • On-line detection of task incompletion for spoken dialog systems based on utterance and behavior tag N-gram Reviewed

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The IEICE Transactions on Information and Systems (Japanese edition)   J96-D ( 1 )   81 - 93   2013.1

     More details

    Authorship:Lead author   Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

    Other Link: http://search.ieice.org/bin/summary.php?id=j96-d_1_81&category=D&year=2013&lang=J&abst=

  • Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition Reviewed

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    IEICE TRANSACTIONS on Information and Systems   E95D ( 10 )   2479 - 2485   2012.10

     More details

    Language:English   Publishing type:Research paper (scientific journal)   Publisher:IEICE-INST ELECTRONICS INFORMATION COMMUNICATIONS ENG  

    A novel speech feature generation-based acoustic model training method for robust speaker-independent speech recognition is proposed. For decades, speaker adaptation methods have been widely used, and all of these adaptation methods need adaptation data. Our proposed method, however, aims to create speaker-independent acoustic models that cover not only known but also unknown speakers. We achieve this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then we train our models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then, we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters that express the transformation matrices for the existing speakers is estimated. Next, we construct pseudo-speaker transformations by sampling the weight parameters from the distribution, and apply the transformations to the normalized features of the existing speakers to generate the features of the pseudo-speakers. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models trained using our proposed method are robust for unknown speakers.
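    The pseudo-speaker generation pipeline (PCA over per-speaker transforms, sampling new weight vectors, transforming normalized features) can be sketched as follows; the matrices are random stand-ins for estimated MLLR transforms, and the Gaussian weight model is a simplifying assumption.

    ```python
    # Sketch of pseudo-speaker generation: PCA over per-speaker transform
    # matrices, then sampling new weights to build pseudo-speaker transforms.
    import numpy as np

    rng = np.random.default_rng(2)
    n_speakers, dim = 20, 13
    # One (dim x dim) transformation matrix per existing speaker, flattened;
    # here: identity plus small perturbations as stand-ins for MLLR estimates.
    A = np.eye(dim).ravel() + 0.05 * rng.normal(size=(n_speakers, dim * dim))

    # PCA via SVD of the mean-removed matrix collection.
    mean = A.mean(axis=0)
    U, S, Vt = np.linalg.svd(A - mean, full_matrices=False)
    k = 5                           # number of retained bases
    bases = Vt[:k]                  # PCA bases of the transforms
    weights = (A - mean) @ bases.T  # per-speaker weight parameters

    # Model the weight distribution as a diagonal Gaussian and sample
    # one pseudo-speaker's weight vector from it.
    mu, sigma = weights.mean(axis=0), weights.std(axis=0)
    w_new = rng.normal(mu, sigma)
    A_new = (mean + w_new @ bases).reshape(dim, dim)

    # Apply the inverse transform to normalized features to obtain
    # pseudo-speaker features for acoustic model training.
    feats = rng.normal(size=(100, dim))
    pseudo_feats = feats @ np.linalg.inv(A_new).T
    print(A_new.shape, pseudo_feats.shape)
    ```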

    DOI: 10.1587/transinf.E95.D.2479

    Web of Science

    researchmap

    Other Link: http://search.ieice.org/bin/summary.php?id=e95-d_10_2479

  • Causal analysis of task completion errors in spoken music retrieval interactions Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of the 8th international conference on Language Resources and Evaluation (LREC 2012)   1365 - 1372   2012.5

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, we analyze the causes of task completion errors in spoken dialog systems, using a decision tree with N-gram features of the dialog to detect task-incomplete dialogs. The dialog for a music retrieval task is described by a sequence of tags related to user and system utterances and behaviors. The dialogs are manually classified into two classes: completed and uncompleted music retrieval tasks. Differences in tag classification performance between the two classes are discussed. We then construct decision trees which can detect if a dialog finished with the task completed or not, using information gain criterion. Decision trees using N-grams of manual tags and automatic tags achieved 74.2% and 80.4% classification accuracy, respectively, while the tree using interaction parameters achieved an accuracy rate of 65.7%. We also discuss more details of the causality of task incompletion for spoken dialog systems using such trees.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2012/summaries/1059.html

  • Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation Reviewed International journal

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of 2011 Automatic Speech Recognition and Understanding Workshop (ASRU 2011)   169 - 172   2011.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:IEEE  

    In this paper, we propose a novel acoustic model training method that is suitable for speaker adaptation in speech recognition. Our method is based on feature generation from a small amount of speakers' data. For decades, speaker adaptation methods have been widely used. Such adaptation methods need some amount of adaptation data, and if the data is not sufficient, speech recognition performance degrades significantly. If the seed models to be adapted to a specific speaker can cover a wider range of speakers, speaker adaptation can perform robustly. To make such robust seed models, we adopt inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our seed models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then, we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters that express the MLLR transformation matrices for the existing speakers is estimated. Next, we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformations to the normalized existing-speaker features to generate the pseudo-speakers' features. Finally, using these features, we train the acoustic seed models. Using these seed models, we obtained better speaker adaptation results than with simply environmentally adapted models.

    DOI: 10.1109/ASRU.2011.6163925

    Scopus

    researchmap

  • Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations Reviewed International journal

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of 2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)   1 - 5   2011.12

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:APSIPA  

    In this paper, a novel speech feature generation-based acoustic model training method is proposed. For decades, speaker adaptation methods have been widely used, and all existing adaptation methods need adaptation data. Our proposed method, however, creates speaker-independent acoustic models that cover not only known but also unknown speakers. We do this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then train our models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then, we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters that express the MLLR transformation matrices for the existing speakers is estimated. Next, we generate pseudo-speaker MLLR transformations by sampling the weight parameters from the distribution, and apply the inverse of the transformations to the normalized existing-speaker features to generate the pseudo-speakers' features. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models thus created are robust for unknown speakers.

    Scopus

    researchmap

  • On-line detection of task incompletion for spoken dialog systems using utterance and behavior tag N-gram vectors Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of the Paralinguistic Information and Its Integration in Spoken Dialogue Systems Workshop   215 - 225   2011.9

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of INTERSPEECH 2011   1312 - 1315   2011

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA-INT SPEECH COMMUNICATION ASSOC  

    We propose a method of detecting "task incomplete" dialogs in spoken dialog systems using N-gram-based dialog models. We used a database created during a field test in which inexperienced users used a client-server music retrieval system with a spoken dialog interface on their own PCs. In this study, the dialog for a music retrieval task consisted of a sequence of user and system tags representing their utterances and behaviors. The dialogs were manually classified into two classes: the dialog either completed the music retrieval task or it did not. We then detected dialogs that did not complete the task, using N-gram probability models or a Support Vector Machine with N-gram feature vectors trained on the manually classified dialogs. Off-line and on-line detection experiments were conducted on a large amount of real data, and the results show that our proposed method achieved good classification performance.

    Web of Science

    researchmap

    Other Link: http://www.isca-speech.org/archive/interspeech_2011/i11_1305.html

  • Music Recommendation System Based on Human-to-human Conversation Recognition Reviewed International journal

    Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Workshop Proceedings of the 7th International Conference on Intelligent Environments: Ambient Intelligence and Smart Environments   10   352 - 361   2011

     More details

    Language:English   Publishing type:Part of collection (book)   Publisher:IOS PRESS  

    We developed an ambient system that plays music suitable for the mood of a human-human conversation using words obtained from a continuous-speech recognition system. Using the correspondence between a document space based on the texts related to the music and an acoustic space that expresses various audio features, the continuous-speech recognition results are mapped to an acoustic space. We performed a subjective evaluation of the system. The subjects rated the recommended music and the result reveals that the 10 most highly recommended selections included suitable music.

    DOI: 10.3233/978-1-60750-795-6-352

    Web of Science

    researchmap

  • Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of INTERSPEECH2010   3034 - 3037   2010.9

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:ISCA  

    In this paper, we propose a method of detecting task-incomplete dialogs for a spoken dialog system using an N-gram-based dialog history model. We collected a large amount of spoken dialog data, accompanied by usability evaluation scores from users, in real environments. The database was made in a field test in which naive users used a client-server music retrieval system with a spoken dialog interface on their own PCs. An N-gram model was trained from sequences of user and/or system dialog acts for two dialog classes: dialogs that completed the music retrieval task and dialogs that did not. The system then detects, based on the N-gram likelihood, unknown dialogs in which the task was not completed. Experiments were conducted on a large amount of real data, and the results show that our proposed method achieved good classification performance. When the classifier correctly detected all of the task-incomplete dialogs, our proposed method achieved a false detection rate of 6%.
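    The N-gram likelihood classification can be sketched with a small bigram model over dialog-act tags; the tag names and sequences below are hypothetical illustration data.

    ```python
    # Sketch of the N-gram dialog-history classifier: train a bigram model
    # over dialog-act tags for each class, then label a new dialog by which
    # class model assigns the higher likelihood.
    from collections import Counter
    import math

    def bigram_model(sequences, alpha=1.0):
        """Laplace-smoothed bigram counts over tag sequences."""
        bi, uni, vocab = Counter(), Counter(), set()
        for seq in sequences:
            seq = ["<s>"] + seq
            vocab.update(seq)
            for a, b in zip(seq, seq[1:]):
                bi[(a, b)] += 1
                uni[a] += 1
        return bi, uni, vocab, alpha

    def log_likelihood(model, seq):
        bi, uni, vocab, alpha = model
        seq = ["<s>"] + seq
        V = len(vocab) + 1  # +1 for unseen tags
        return sum(
            math.log((bi[(a, b)] + alpha) / (uni[a] + alpha * V))
            for a, b in zip(seq, seq[1:])
        )

    completed   = [["ask", "confirm", "play"], ["ask", "play"]]
    incompleted = [["ask", "reject", "ask", "reject"], ["ask", "reject", "bye"]]

    m_ok = bigram_model(completed)
    m_ng = bigram_model(incompleted)

    dialog = ["ask", "reject", "ask"]
    label = ("incomplete"
             if log_likelihood(m_ng, dialog) > log_likelihood(m_ok, dialog)
             else "complete")
    print(label)  # prints "incomplete"
    ```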

    Web of Science

    researchmap

    Other Link: http://www.isca-speech.org/archive/interspeech_2010/i10_3034.html

  • Rapid acoustic model adaptation using inverse MLLR-based feature generation Reviewed International journal

    Arata Ito, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Proceedings of ICA2010   5   1 - 6   2010.8

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)  

    We propose a technique for generating a large amount of target speaker-like speech features by converting prepared speech features of many speakers into features similar to those of the target speaker using a transformation matrix. To generate a large amount of target speaker-like features, the system needs only a very small amount of the target speaker's utterances. This technique enables the system to adapt the acoustic model efficiently from a small amount of the target speaker's utterances. To evaluate the proposed method, we prepared 100 reference speakers and 12 target (test) speakers. We conducted experiments on an isolated-word recognition task using a speech database collected in real PC-based distributed environments, and compared our proposed method with MLLR, MAP, and a method theoretically equivalent to SAT. Experimental results showed that the proposed method needed a significantly smaller amount of the target speaker's utterances than conventional MLLR, MAP, and SAT.

    Scopus

    researchmap

  • Estimation method of user satisfaction using N-gram-based dialog history model for spoken dialog system Reviewed International journal

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION   78 - 83   2010.5

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, we propose an estimation method of user satisfaction for a spoken dialog system using an N-gram-based dialog history model. We collected a large amount of spoken dialog data, accompanied by usability evaluation scores from users, in real environments. The database was made in a field test in which naive users used a client-server music retrieval system with a spoken dialog interface on their own PCs. An N-gram model is trained from the sequences of users' and/or the system's dialog acts for each of six user satisfaction levels: from 1 to 5, and phi (task not completed). The satisfaction level is then estimated based on the N-gram likelihood. Experiments were conducted on the large amount of real data, and the results show that our proposed method achieved good classification performance: the classification accuracy was 94.7% in the experiment classifying dialogs with and without task completion. Even when the classifier correctly detected all of the task-incomplete dialogs, our proposed method achieved a false detection rate of only 6%.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2010/summaries/579.html

  • Data collection and usability study of a PC-based speech application in various user environments Reviewed International journal

    Sunao Hara, Chiyomi Miyajima, Katsunobu Ito, Norihide Kitaoka, Kazuya Takeda

    Proceedings of Oriental-COCOSDA 2008   39 - 44   2008.11

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • In-car Speech Data Collection along with Various Multimodal Signals Reviewed International journal

    Akira Ozaki, Sunao Hara, Takashi Kusakawa, Chiyomi Miyajima, Takanori Nishino, Norihide Kitaoka, Katunobu Itou, Kazuya Takeda

    Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)   1846 - 1851   2008.5

     More details

    Language:English   Publishing type:Research paper (international conference proceedings)   Publisher:EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA  

    In this paper, a large-scale real-world speech database is introduced along with other multimedia driving data. We designed a data collection vehicle equipped with various sensors to synchronously record twelve-channel speech, three-channel video, driving behavior including gas and brake pedal pressures, steering angles, and vehicle velocities, physiological signals including driver heart rate, skin conductance, and emotion-based sweating on the palms and soles, etc. These multimodal data are collected while driving on city streets and expressways under four different driving task conditions including two kinds of monologues, human-human dialog, and human-machine dialog. We investigated the response timing of drivers against navigator utterances and found that most overlapped with the preceding utterance due to the task characteristics and the features of Japanese. When comparing utterance length, speaking rate, and the filler rate of driver utterances in human-human and human-machine dialogs, we found that drivers tended to use longer and faster utterances with more fillers to talk with humans than machines.

    Web of Science

    researchmap

    Other Link: http://www.lrec-conf.org/proceedings/lrec2008/summaries/472.html

  • Data Collection System for the Speech Utterances to an Automatic Speech Recognition System under Real Environments Reviewed

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    The IEICE transactions on information and systems   J90-D ( 10 )   2807 - 2816   2007.10

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Research paper (scientific journal)  

    researchmap

  • An online customizable music retrieval system with a spoken dialogue interface Reviewed International journal

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    The Journal of the Acoustical Society of America   120 ( 5-2 )   3378 - 3379   2006.11

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap

  • Novel orally active alpha 4 integrin antagonist, T-728, attenuates dextran sodium sulfate-induced chronic colitis in mice

    Ken-Ichi Hayashi, Hiroyuki Meguro, Sunao Hara, Rie Sasaki, Yoko Koga, Meiko Takeshita, Naoyoshi Yamamoto, Hiroe Hirokawa, Mie Kainoh

    GASTROENTEROLOGY   130 ( 4 )   A352 - A352   2006.4

     More details

    Language:English   Publisher:W B SAUNDERS CO-ELSEVIER INC  


    Web of Science

    researchmap

  • Preliminary Study of a Learning Effect on Users to Develop a New Evaluation of the Spoken Dialogue System Reviewed International journal

    Sunao Hara, Ayako Shirose, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Proceedings of Oriental-COCOSDA 2005   164 - 168   2005.12

     More details

    Authorship:Lead author   Language:English   Publishing type:Research paper (international conference proceedings)  

    researchmap


MISC

  • Collection of environmental sounds by crowd-sensing Invited

    阿部匡伸, 原直

    騒音制御   42 ( 1 )   20 - 23   2018

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    researchmap

  • Sound recording with smart devices and its applications Invited

    原直

    日本音響学会誌   73 ( 8 )   483 - 490   2017.8

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    DOI: 10.20697/jasj.73.8_483

    researchmap

  • A study of an emotion communication method using events as metaphors Reviewed

    濱野和人, 原直, 阿部匡伸

    電子情報通信学会論文誌   J97-D ( 12 )   1680 - 1683   2014.12

     More details

    Language:Japanese   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)   Publisher:電子情報通信学会  

    researchmap

  • Potential Applications of Acoustic Signal Processing from Lifelog Research Perspectives Invited

    38 ( 1 )   15 - 21   2014

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    CiNii Article

    CiNii Books

    researchmap

  • "Toward practical use of spoken dialogue systems": Technologies of the speech information guidance system "Takemaru-kun" that supported ten years of long-term operation Invited

    西村竜一, 原直, 川波弘道, LEE Akinobu, 鹿野清宏

    人工知能学会誌   28 ( 1 )   52 - 59   2013.1

     More details

    Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)   Publisher:The Japanese Society for Artificial Intelligence  

    DOI: 10.11517/jjsai.28.1_52

    CiNii Article

    CiNii Books

    J-GLOBAL

    researchmap

  • Detection of Task Incomplete Dialogs Based on Utterance Sequences N-gram for Spoken Dialog System Reviewed

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The IEICE transactions on information and systems (Japanese edition)   J94-D ( 2 )   497 - 500   2011.2

     More details

    Language:Japanese   Publishing type:Rapid communication, short report, research note, etc. (scientific journal)   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • Subjective noise-map generation from environmental sounds by machine learning Invited

    原直, 阿部匡伸

    騒音制御   46 ( 3 )   126 - 130   2022.6

     More details

    Authorship:Lead author   Language:Japanese   Publishing type:Article, review, commentary, editorial, etc. (scientific journal)  

    researchmap

  • A method for estimating the impression of a place using environmental sounds and aerial photographs

    小野祐介, 原直, 阿部匡伸

    第24回 日本音響学会関西支部 若手研究者交流研究発表会 発表概要集   34   2021.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    researchmap

  • A vibrato expression method based on bidirectional LSTM for singing voice synthesis

    金子隼人, 阿部匡伸, 原直

    日本音響学会講演論文集   1109 - 1112   2021.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A phoneme-intelligibility improvement method for glossectomy patients by knowledge distillation of phoneme information

    高島和嗣, 阿部匡伸, 原直

    日本音響学会講演論文集   1057 - 1060   2021.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Analysis of gait data measured with an insole-type pressure sensor toward Parkinson's disease severity estimation

    林倖生, 原直, 阿部匡伸, 武本麻美

    第20回情報科学技術フォーラム (FIT 2021),CK-001   3   71 - 74   2021.8

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • An experimental study on articulation improvement with a surface-contact-type artificial tongue designed for easy control of the expiratory airflow path

    長塚弘亮, 川上滋央, 古寺寛志, 佐藤匡晃, 田中祐貴, 兒玉直紀, 原直, 皆木省吾

    日本顎顔面補綴学会 第38回総会・学術大会   30 - 30   2021.6

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本顎顔面補綴学会  

    researchmap

  • A study of dialogue strategies to support natural topic development in human-to-human conversation

    前薗そよぎ, 原直, 阿部匡伸

    情報処理学会研究報告   2021-SLP-137 ( 16 )   1 - 6   2021.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Evaluation of Japanese end-to-end speech synthesis using pronunciation and prosody symbols estimated by neural machine translation

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   847 - 850   2021.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A singing voice synthesis method using bidirectional LSTM capable of adding singing expressions

    金子隼人, 原直, 阿部匡伸

    日本音響学会講演論文集   987 - 990   2021.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Evaluation of Concept Drift Adaptation for Acoustic Scene Classifier Based on Kernel Density Drift Detection and Combine Merge Gaussian Mixture Model

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    日本音響学会講演論文集   201 - 204   2021.3

     More details

    Language:English   Publisher:日本音響学会  

    researchmap

  • Development of a glove-type input device using pressure sensors for a TTS-based conversation support system and evaluation of its input speed

    小林誠, 原直, 阿部匡伸

    情報処理学会研究報告   2020-HCI-190 ( 20 )   1 - 6   2020.12

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of temporal features using an insole-type pressure sensor for Parkinson's disease severity estimation

    林倖生, 原直, 阿部匡伸, 武本麻美

    2020年度(第71回)電気・情報関連学会中国支部連合大会,R20-14-02-03   1 - 1   2020.10

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:電気・情報関連学会中国支部  

    researchmap

  • Estimation of pronunciation and prosody symbol sequences from Japanese text using Transformer

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   829 - 832   2020.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of emotion-intensity control in emotional speech synthesis without linguistic information using WaveNet

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   867 - 870   2020.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A method for estimating engagement and positive/negative attitudes in discussions using video and audio

    金岡翼, 上原佑太郎, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2020)講演論文集   1422 - 1429   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A spoken dialogue system that responds according to the user's familiarity with the topic

    加藤大地, 原直, 阿部匡伸

    情報処理学会研究報告   2020-SLP-132 ( 21 )   1 - 6   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Analysis of the importance of places in daily life by clustering GPS data

    平田瑠, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2020)講演論文集   785 - 793   2020.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A Japanese sentence estimation method using neural machine translation from ambiguous input on a wearable device

    渡邊淳, 原直, 阿部匡伸

    情報処理学会研究報告   2020-HCI-187 ( 7 )   1 - 7   2020.3

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of voice conversion using estimated phoneme posterior probabilities for improving the phoneme intelligibility of glossectomy patients

    荻野聖也, 原直, 阿部匡伸

    電子情報通信学会総合大会 情報・システムソサイエティ特別企画 学生ポスターセッション予稿集   124 - 124   2020.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using end-to-end speech recognition International coauthorship

    井上勝喜, 原直, 阿部匡伸, 林知樹, 山本龍一, 渡部晋治

    日本音響学会講演論文集   1095 - 1098   2020.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • An emotion conversion method based on CycleGAN trained with emotional synthesized speech without linguistic information

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   1165 - 1168   2020.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Performance evaluation of end-to-end speech synthesis using end-to-end speech recognition

    井上勝喜, 原直, 阿部匡伸, 渡部晋治

    第22回 日本音響学会 関西支部若手研究者交流研究発表会 概要集   6 - 6   2019.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本音響学会関西支部  

    researchmap

  • A study of speaker identity in WaveNet-based emotional speech synthesis without linguistic information

    松本剣斗, 原直, 阿部匡伸

    日本音響学会講演論文集   993 - 996   2019.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Effects of a newly designed artificial tongue and an anatomical artificial tongue, and criteria for their selection

    佐藤匡晃, 長塚弘亮, 川上滋央, 兒玉直紀, 原直, 阿部匡伸, 皆木省吾

    日本補綴歯科学会 中国・四国支部学術大会,抄録集   28 - 28   2019.9

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本補綴歯科学会  

    researchmap

  • Environmental sound classification using bottleneck features extracted from a CNN autoencoder

    松原拓未, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2019)講演論文集   339 - 346   2019.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Detection of special activities in daily life based on GPS data

    小林誠, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2019)講演論文集   846 - 853   2019.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of a WaveNet-based emotional speech synthesis method without linguistic information

    松本剣斗, 原直, 阿部匡伸

    情報処理学会研究報告   2019-SLP-127 ( 61 )   1 - 6   2019.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of an i-vector-based method for estimating bustle sounds

    呉セン陽, 朝田興平, 原直, 阿部匡伸

    情報処理学会研究報告   2019-SLP-127 ( 33 )   1 - 6   2019.6

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A method for adding emotion in DNN-based speech synthesis using a small amount of target emotional speech

    井上勝喜, 原直, 阿部匡伸, 井島勇祐

    日本音響学会講演論文集   1085 - 1088   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A method for estimating phonemic auxiliary information for improving the phoneme intelligibility of glossectomy patients by voice conversion

    荻野聖也, 村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1155 - 1158   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A voice conversion method using phonemic auxiliary information based on bidirectional LSTM-RNN for improving the phoneme intelligibility of glossectomy patients

    村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1151 - 1154   2019.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Effects of background-listening music on workload and a method for its selection

    高瀬郁, 阿部匡伸, 原直

    情報処理学会研究報告   2018-MUS-121 ( 19 )   1 - 6   2018.11

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Evaluation of emotion-adding methods in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   1105 - 1108   2018.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Speech Enhancement of Glossectomy Patient's Speech using Voice Conversion Approach

    Masanobu Abe, Seiya Ogino, Hiroki Murakami, Sunao Hara

    日本生物物理学会第56回年会,シンポジウム:ヘルスシステムの理解と応用   198 - 198   2018.9

     More details

    Authorship:Corresponding author   Language:English   Publishing type:Research paper, summary (national, other academic conference)   Publisher:日本生物物理学会  

    researchmap

  • A study of auxiliary information for improving the phoneme intelligibility of glossectomy patients by voice conversion

    村上博紀, 原直, 阿部匡伸

    日本音響学会講演論文集   1175 - 1178   2018.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A subjective noisiness estimation method for constructing environmental sound maps by crowdsourcing

    原直, 阿部匡伸

    第17回情報科学技術フォーラム (FIT 2018),O-001   4   343 - 346   2018.9

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • Improving the phoneme intelligibility of glossectomy patients by voice conversion using speech and lip shape

    荻野聖也, 村上博紀, 原直, 阿部匡伸

    電子情報通信学会技術研究報告   118 ( 112 )   7 - 12   2018.6

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • Construction of a multimodal database for improving the phoneme intelligibility of glossectomy patients

    村上博紀, 荻野聖也, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    日本音響学会講演論文集   355 - 358   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of a duration model for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   279 - 282   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A field-experiment evaluation of a crowdsourcing-based bustle-sound identification method

    朝田興平, 原直, 阿部匡伸

    日本音響学会講演論文集   79 - 82   2018.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of model architectures for handling speaker and emotion information in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    日本音響学会講演論文集   263 - 266   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Improving the phoneme intelligibility of glossectomy patients by voice conversion using DNN-based differential spectrum compensation

    村上博紀, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    日本音響学会講演論文集   297 - 300   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Noise-map generation based on a DNN noisiness estimation method reflecting human perception

    小林将大, 原直, 阿部匡伸

    日本音響学会講演論文集   623 - 626   2017.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of visualization methods that provide effective incentives for environmental sound collection Reviewed

    畠山晏彩子, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2017)講演論文集   255 - 262   2017.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of model architectures for adding emotion in DNN-based speech synthesis

    井上勝喜, 原直, 阿部匡伸, 北条伸克, 井島勇祐

    電子情報通信学会技術研究報告   117 ( 106 )   23 - 28   2017.6

     More details

    Language:Japanese   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A watching-over system based on living areas at two levels of granularity

    鎌田成紀, 原直, 阿部匡伸

    電子情報通信学会総合大会 (D-9-12)   1 - 1   2017.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:一般社団法人電子情報通信学会  

    researchmap

  • A noisiness estimation method reflecting human perception for DNN-based noise-map generation

    小林将大, 原直, 阿部匡伸

    日本音響学会講演論文集   799 - 802   2017.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Environmental sound classification by CNN using an environmental sound database recorded with smartphones

    鳥羽隼司, 原直, 阿部匡伸

    日本音響学会講演論文集   139 - 142   2017.3

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A study of a spectrum conversion method considering fundamental-frequency modification Reviewed

    床建吾, 阿部匡伸, 原直

    第18回IEEE広島支部学生シンポジウム(HISS 18th)   174 - 176   2016.11

     More details

    Language:Japanese   Publisher:IEEE広島支部  

    researchmap

  • Multiple acoustic event detection from real-environment data by RNN

    鳥羽隼司, 原直, 阿部匡伸

    日本音響学会講演論文集   43 - 44   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A method for removing tap sounds from environmental sounds recorded with smartphones

    朝田興平, 原直, 阿部匡伸

    日本音響学会講演論文集   45 - 48   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • A basic study of features for environmental sound detection in an environmental sound database containing overlapping sounds

    原直, 田中智康, 阿部匡伸

    日本音響学会講演論文集   3 - 6   2016.9

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Improving the phoneme intelligibility of glossectomy patients using GMM-based voice conversion

    田中慧, 原直, 阿部匡伸, 皆木省吾

    日本音響学会講演論文集   141 - 144   2016.9

     More details

    Language:Japanese   Publisher:日本音響学会  

    researchmap

  • Sound collection systems using a crowdsourcing approach for constructing subjective evaluation-based sound maps

    116 ( 189 )   41 - 46   2016.8

     More details

    Authorship:Lead author, Corresponding author   Language:Japanese  

    CiNii Article

    CiNii Books

    researchmap

  • A noisiness estimation method reflecting human perception for noise-map generation Reviewed

    小林将大, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2016)講演論文集   141 - 148   2016.7

     More details

    Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A study of objective indices expressing the subjective acceptability of GPS-data anonymization levels Reviewed

    三藤優介, 原直, 阿部匡伸

    マルチメディア,分散,協調とモバイルシンポジウム(DICOMO2016)講演論文集   798 - 805   2016.7

     More details

    Authorship:Corresponding author   Language:Japanese   Publisher:一般社団法人情報処理学会  

    researchmap

  • A measure for transfer tendency between staying places

    116 ( 23 )   95 - 100   2016.5

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • A classification method for crowded situation using environmental sounds based on Gaussian mixture model-universal background model Reviewed

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    The Journal of the Acoustical Society of America   140 ( 4 )   3110 - 3110   2016

     More details

    Language:Japanese   Publishing type:Research paper, summary (international conference)  

    DOI: 10.1121/1.4969721

    researchmap

  • Method to efficiently retrieve memorable scenes from video using automatically collected life log

    IPSJ SIG Notes   2015 ( 4 )   1 - 6   2015.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    One application of life logs is retrieving memorable scenes. In this paper, we propose a method for extracting memorable scenes from video using life-log data that are automatically collected together with the video during an event. The life-log data here are GPS, pulse, and sound. Three kinds of important points are extracted from these data, and particular parts of the video are extracted based on them. Subjective evaluation experiments revealed that users can easily recall things by watching the extracted video and can remember details of the events, including things that were not recorded in the video.

    CiNii Article

    CiNii Books

    researchmap

  • A "big day" search method using features of staying places

    HAYASHI Keigo, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   89 - 94   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In recent years, life logs have become popular and are used to provide services tailored to a particular person. One such service is memory retrieval. A life log helps us recall events, activities, accidents, and so on, but the huge amount of data it contains makes it difficult to find what we really want. In this paper, we propose a method to retrieve "a big day" using a feature calculated from GPS data; the feature is defined as a function of the visiting frequency of places. Experimental results show that the proposed method retrieves "a big day" at a rate of 60% and an unusual day at a rate of 90%. Subjective evaluations were also carried out from the perspectives of effectiveness, efficiency, and satisfaction; the results showed that the proposed method outperforms a conventional method.

    CiNii Article

    CiNii Books

    researchmap
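
    The abstract above defines the retrieval feature as a function of the visiting frequency of places. A minimal sketch of one such feature, assuming each day is represented by the list of places visited that day; the inverse-frequency scoring is an illustrative choice, not necessarily the paper's exact definition:

    ```python
    from collections import Counter

    def day_rarity(days):
        """Score each day by the rarity of the places visited on it.

        `days` maps a day label to the list of place IDs visited on it.
        A day spent at rarely visited places gets a high score, so the
        highest-scoring days are candidates for "a big day".
        """
        visits = Counter(p for places in days.values() for p in places)
        total = sum(visits.values())
        scores = {}
        for day, places in days.items():
            # average inverse visiting frequency of the day's places
            scores[day] = sum(total / visits[p] for p in places) / len(places)
        return scores

    days = {
        "mon": ["home", "office", "home"],
        "tue": ["home", "office", "home"],
        "wed": ["home", "stadium", "restaurant"],  # the unusual day
    }
    scores = day_rarity(days)
    ranked = sorted(scores, key=scores.get, reverse=True)
    ```

    With this toy data, the day containing the rarely visited places ranks first.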

  • Living area extraction using staying places and routes

    MATSUO Yuji, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   77 - 82   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    These days, demands for watching over the safety of the elderly and children are increasing rapidly. We believe a person's living area plays an important role in improving the quality of such watching. In this paper, we therefore propose an algorithm that estimates the living area of a person from his or her accumulated GPS data. The living area is defined by important places and routes. First, important places are extracted taking visiting frequency into account; routes are then found so as to connect the important places using best-first search. Experiments were carried out on three users and evaluated by precision and recall. We confirmed that the proposed algorithm outperforms a conventional method.

    CiNii Article

    CiNii Books

    researchmap
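
    The route-finding step above connects important places with best-first search. A minimal sketch of that step, assuming a transition-count graph between places; the `transitions` dictionary, its weights, and the frequency-first cost design are illustrative assumptions, not the paper's exact formulation:

    ```python
    import heapq

    def best_first_route(transitions, start, goal):
        """Connect two important places by best-first search,
        expanding the most frequently traveled link first."""
        # priority = -frequency of the last hop, so frequent links expand first
        frontier = [(0, [start])]
        seen = set()
        while frontier:
            _, path = heapq.heappop(frontier)
            node = path[-1]
            if node == goal:
                return path
            if node in seen:
                continue
            seen.add(node)
            for nxt, freq in transitions.get(node, {}).items():
                if nxt not in seen:
                    heapq.heappush(frontier, (-freq, path + [nxt]))
        return None

    # hypothetical transition counts extracted from GPS traces
    transitions = {
        "home": {"station": 30, "park": 2},
        "station": {"office": 28},
        "park": {"office": 1},
    }
    route = best_first_route(transitions, "home", "office")
    ```

    Here the frequently traveled home-station-office route is preferred over the rarely used path through the park.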

  • Behavior analysis of persons by classification of moving routes

    SETO Ryo, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 500 )   31 - 36   2015.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we propose a method for analyzing personal behavior from GPS data, focusing on how active a person is. We evaluated whether a person can be judged active or not from moving-route data using PCA- and NMF-based classification. The results show that the number of important eigenvalues in PCA and the approximation error of NMF are effective indicators of whether a person is active. In addition, we extracted patterns of moving routes and compared those obtained by PCA and NMF. PCA extracted high-frequency patterns, whereas NMF extracted not only high-frequency patterns but also low-frequency ones.

    CiNii Article

    CiNii Books

    researchmap
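
    Of the two decompositions compared above, the NMF side can be sketched in stdlib Python with the standard Lee-Seung multiplicative updates; the reconstruction error returned at the end plays the role of the activity indicator discussed in the abstract. The toy day-by-route-cell matrix below is an illustrative assumption:

    ```python
    import random

    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]

    def transpose(A):
        return [list(col) for col in zip(*A)]

    def nmf(V, rank, iters=200, seed=0):
        """Factorize a nonnegative matrix V into W * H with Lee-Seung
        multiplicative updates; returns W, H and the squared error."""
        rng = random.Random(seed)
        n, m = len(V), len(V[0])
        W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
        H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
        eps = 1e-9
        for _ in range(iters):
            WH, Wt = matmul(W, H), transpose(W)
            num, den = matmul(Wt, V), matmul(Wt, WH)
            H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
                 for i in range(rank)]
            WH, Ht = matmul(W, H), transpose(H)
            num, den = matmul(V, Ht), matmul(WH, Ht)
            W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
                 for i in range(n)]
        WH = matmul(W, H)
        err = sum((V[i][j] - WH[i][j]) ** 2 for i in range(n) for j in range(m))
        return W, H, err

    # three days of repetitive routes over two route cells: rank-1 structure,
    # so a rank-1 NMF reconstructs it with near-zero approximation error
    V = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    W, H, err = nmf(V, rank=1)
    ```

    A person whose daily routes repeat yields a small error at low rank; varied (more active) routes leave a larger residual, matching the abstract's use of approximation error.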

  • FLAG: A lifelog aggregation system centered on position information

    2014 ( 6 )   1 - 6   2014.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Recently, applications and services that utilize location information from GPS have become highly prevalent. In this paper, we develop a system called FLAG that aggregates a variety of lifelogs around location information. FLAG manages location information by discriminating between moving and staying, visualizes the categorized location information on a map and a timetable, and provides a function that lets users register individual names for staying places. As an example of aggregating various kinds of lifelogs, we also link Twitter posts to FLAG's location information using posting time; this function allows lifelogs without positional information to be shown on the map. To evaluate FLAG, six users created ground-truth staying data, and we compared the staying-detection accuracies of two methods. As a result, we confirmed that FLAG detects staying more accurately than the original data.

    CiNii Article

    CiNii Books

    researchmap

    Other Link: http://id.nii.ac.jp/1001/00102345/

  • FLAG: A lifelog aggregation system centered on position information

    IEICE technical report. SC, Services Computing   114 ( 157 )   29 - 34   2014.7

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Recently, applications and services that utilize location information from GPS have become highly prevalent. In this paper, we develop a system called FLAG that aggregates a variety of lifelogs around location information. FLAG manages location information by discriminating between moving and staying, visualizes the categorized location information on a map and a timetable, and provides a function that lets users register individual names for staying places. As an example of aggregating various kinds of lifelogs, we also link Twitter posts to FLAG's location information using posting time; this function allows lifelogs without positional information to be shown on the map. To evaluate FLAG, six users created ground-truth staying data, and we compared the staying-detection accuracies of two methods. As a result, we confirmed that FLAG detects staying more accurately than the original data.

    CiNii Article

    CiNii Books

    researchmap

  • Development of an environmental sound collection system using smart devices based on a crowdsourcing approach

    HARA Sunao, KASAI Akinori, ABE Masanobu, SONEHARA Noboru

    IEICE technical report. Speech   114 ( 52 )   177 - 180   2014.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this study, we aim to construct a database of various environmental sounds using the wisdom of crowds. Considering environmental noise-pollution problems, for example, we need to collect not only signature sounds, such as car and railway noise, but also life-space sounds, such as festivity noise on streets or in parks. Daily, wide-area sound collection is difficult to realize with only a few participants, so we apply a crowdsourcing approach to measure environmental sounds over a vast area. We first developed a prototype application running on Android smart devices, and then a prototype server system for collecting and browsing the sound data. We then calibrated the noise levels measured by the smart devices and carried out a sound-collection experiment to validate the accuracy of their sensors. In this report, we introduce the environmental sound collection system and the sound-collection experiment conducted with it.

    CiNii Article

    CiNii Books

    researchmap

  • Preliminary study for behavior analysis based on degree of nodes in a network constructed from GPS data

    2014 ( 6 )   1 - 6   2014.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We discuss a behavior-analysis method based on structural features of networks constructed from a personal location history. In this paper, a directed network constructed from a GPS location history is called a stay network; it treats the set of stays extracted from the location history as nodes. Nodes of a directed network generally have out-degrees and in-degrees as structural features, and the stay network shows biased values of out-degree, in-degree, and their ratio. We therefore assumed that relationships exist between these biased degree values and human behavior during a stay, and analyzed them. In particular, we focused on the purpose of the stay, which is assumed to arise as a result of human behavior, and analyzed its relationship to the biased degree values.

    CiNii Article

    CiNii Books

    researchmap
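
    The stay network described above can be sketched with stdlib Python. Representing the stays as a chronological sequence of place IDs is an assumption for illustration; the report's analysis concerns the biased out-degree and in-degree values such a network produces:

    ```python
    from collections import defaultdict

    def stay_degrees(stay_sequence):
        """Build the directed stay network from a chronological sequence
        of stay places and return (out-degree, in-degree) per node,
        counting distinct neighbours."""
        out_n, in_n = defaultdict(set), defaultdict(set)
        for src, dst in zip(stay_sequence, stay_sequence[1:]):
            if src != dst:  # consecutive stays at the same place add no edge
                out_n[src].add(dst)
                in_n[dst].add(src)
        nodes = set(out_n) | set(in_n)
        return {v: (len(out_n[v]), len(in_n[v])) for v in nodes}

    # hypothetical week of stays extracted from a GPS history
    seq = ["home", "office", "home", "gym", "home", "station", "office", "home"]
    deg = stay_degrees(seq)
    ```

    In this example "home" fans out to three distinct places but is entered from only two, the kind of out/in bias the report relates to the purpose of a stay.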

  • Influence analysis on user's workload in a spoken dialog strategy for a car navigation system

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    IPSJ SIG Notes   2014 ( 7 )   1 - 6   2014.5

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    We assess dialog strategies from the standpoint of user workload to suggest how spoken dialog systems should be used while driving. We evaluate several dialog strategies, which combine dialog-initiative and confirmation methods, through objective evaluation using computer simulation. Two simulation conditions are set up: in one, the system speaks only when the user has leeway to talk with it; in the other, the system speaks even if the user may fail to recognize the system's utterance. We also evaluate spoken dialog systems applying these methods through subjective evaluation. The results show that the user-initiative strategy yields fewer turns and a lower task-completion rate than the system-initiative and mixed-initiative strategies when the recognition rate is high, whereas the system-initiative and mixed-initiative strategies yield fewer turns and a lower task-completion rate than the user-initiative strategy when the recognition rate is low. Additionally, the method in which the system speaks only when the user has enough leeway in the driving operation keeps the user's workload low, although it needs more time to complete tasks.

    CiNii Article

    CiNii Books

    researchmap

  • Working Pattern Extraction by Applying Nonnegative Matrix Factorization to PC Operation Logs

    HIRAYAMA Akihiko, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   114 ( 32 )   33 - 38   2014.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we extract working patterns by applying nonnegative matrix factorization to PC operation logs. Experiments were carried out for three occupations; in terms of daily working patterns, we successfully extracted patterns particular to each occupation as well as some working patterns common to all occupations. We also show that the extracted patterns can be easily interpreted as ways of working.

    CiNii Article

    CiNii Books

    researchmap

  • D-9-4 Individual behavior analysis by comparing GPS logs with others

    Seto Ryo, Abe Masanobu, Hara Sunao

    Proceedings of the IEICE General Conference   2014 ( 1 )   88 - 88   2014.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • Development of a smart-device application for environmental sound collection based on a crowdsourcing approach

    Sunao Hara, Akinori Kasai, Masanobu Abe, Noboru Sonehara

    IEICE Technical Report   113 ( 479 )   29 - 34   2014.3

     More details

    Authorship:Lead author   Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this study, we aim to construct a database of various environmental sounds using the wisdom of crowds. Considering environmental noise-pollution problems, for example, we need to collect not only signature sounds, such as car and railway noise, but also life-space sounds, such as festivity noise on streets or in parks. Daily, wide-area sound collection is difficult to realize with only a few participants, so we apply a crowdsourcing approach to measure environmental sounds over a vast area. We first developed a prototype application running on Android smart devices, and then a prototype server system for collecting and browsing the sound data. We then calibrated the noise levels measured by the smart devices and carried out a sound-collection experiment to validate the accuracy of their sensors. As a result of the experiment, we collected nine hundred minutes of sound data and analyzed the relationships between the measured noise levels and some subjective evaluations.

    CiNii Article

    CiNii Books

    researchmap
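
    The calibration step in the abstract maps smartphone-measured levels to reference noise levels. A minimal sketch of the underlying level computation, assuming raw waveform samples normalized to full scale and a hypothetical per-device offset obtained by calibration against a reference sound-level meter:

    ```python
    import math

    def noise_level_db(samples, calibration_offset_db=0.0):
        """Sound level in dB from raw waveform samples (full scale = 1.0),
        computed as 20*log10 of the RMS plus a per-device calibration
        offset. The offset value is device-specific and hypothetical."""
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        return 20.0 * math.log10(max(rms, 1e-12)) + calibration_offset_db

    # a full-scale sine wave has RMS 1/sqrt(2), i.e. about -3.01 dB
    # relative to full scale, before the calibration offset is applied
    sine = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    level = noise_level_db(sine, calibration_offset_db=90.0)
    ```

    With a 90 dB offset the full-scale tone measures about 86.99 dB; in practice the offset per device would be fitted from paired readings with a calibrated meter.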

  • Evaluation of HMM-based speech synthesis using high-frequency component of speech waveform

    349 - 352   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Sound-map construction method based on symbolization for environmental sounds collected by crowd-sensing

    1535 - 1538   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Relationship between the size of speech database and subjective scores on phone-sized unit selection speech synthesis

    331 - 334   2014

     More details

    Language:Japanese  

    CiNii Article

    researchmap

  • Estimation of fuel consumption using an acoustic signal and multi-sensing signals of smartphone

    NANBA Shohei, HARA Sunao

    IEICE technical report. Signal processing   113 ( 28 )   1 - 6   2013.5

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    Fuel-consumption meters are equipped in many vehicles; however, they can only display the fuel-consumption/efficiency value and do not allow the data to be used for other purposes, e.g., gathering and analysis. One method of obtaining vehicle data is to use a diagnostic connector compliant with the OBD2 standard, which can output several vehicle signals, such as velocity, engine revolutions, and fuel consumption. However, because the protocols depend on the manufacturer or type of vehicle, this method is not easy for the public to use. In this study, we aim to estimate fuel consumption using acoustic signals and the signals of several sensors equipped in a smartphone. Estimating fuel consumption requires estimates of the number of engine revolutions and of the torque. For the estimation of the number of revolutions, we analyze the acoustic signal from the engine by fast Fourier transform and calculate the estimate from the acoustic signal after reducing road noise approximated by a Gamma mixture distribution. For the estimation of the torque, we use the physics of the car together with the outputs of several sensors and the vehicle's data. Finally, we obtain the fuel consumption from a table of fuel-consumption rates, created in advance, indexed by the estimated number of revolutions and the estimated torque. As a result of an experiment on fuel-consumption estimation, we achieved acceptable values of instantaneous fuel consumption, although the values of average fuel consumption contained some errors.

    CiNii Article

    CiNii Books

    researchmap

  • The Individual Feature Analysis of the Network of the Stay Extracted from GPS Data

    FUJIOKA Daisuke, HARA Sunao, ABE Masanobu

    IEICE technical report. Life intelligence and office information systems   112 ( 466 )   179 - 184   2013.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    In this paper, we propose a novel technique for behavior analysis focusing on the structure of a personal "stay" network obtained from GPS data. We analyzed network features, namely the scale-free, small-world, and cluster properties, for six subjects. We also analyzed the distribution of a feature called a "motif" and compared the subjects' networks with other networks based on these features. We evaluated biases in the degree of hubs and in the number of unique connection nodes, and clarified the differences between the distributions of each subject's network.

    CiNii Article

    CiNii Books

    researchmap

  • Examination of an event sampling process with the similar impression from the others' life log

    HAMANO Kazuto, ABE Masanobu, HARA Sunao, FUJIOKA Daisuke, MOTIZUKI Rika, WATANABE Tomoki

    IEICE technical report. Life intelligence and office information systems   112 ( 466 )   173 - 178   2013.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    When we tell a story about an experience, we try to describe it by replacing our experience with someone else's that can recall a similar impression, to help the listener understand the story. In this paper, we study a method for extracting a shared sense of impressive events between people with different backgrounds and experiences. First, we showed several emotional words to the subjects and asked them to recall events matching those words. Using the recalled events, we compared two methods for evaluating the similarity of events: one based on quantifying the events by the subject's emotion on a five-point scale, and the other based on deciding the similarity of the events through discussion between two subjects. The experimental results suggested that deciding similarity by discussion is heavily affected by strong emotions attached to the event.

    CiNii Article

    CiNii Books

    researchmap

  • Evaluation of the portability of an invalid-input rejection model using Bag-of-Words for a speech-oriented guidance system

    真嶋温佳, TORRES Rafael, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)   2013   ROMBUNNO.3-9-5   2013.3

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • The 2nd stage activity report of ASJ students and young researchers forum

    Okamoto Takuma, Okuzono Takeshi, Kidani Shunsuke, Hara Sunao, Ohta Tatsuya, Imoto Keisuke

    THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN   69 ( 9 )   519 - 520   2013

     More details

    Language:Japanese   Publisher:Acoustical Society of Japan  

    DOI: 10.20697/jasj.69.9_519

    CiNii Article

    researchmap

  • Evaluation of invalid-input rejection using the maximum entropy method for a speech information system

    真嶋温佳, TORRES Rafael, 川波弘道, 原直, 松井知子, 猿渡洋, 鹿野清宏

    Proceedings of the Meeting of the Acoustical Society of Japan (CD-ROM)   2012   ROMBUNNO.3-1-8   2012.9

     More details

    Language:Japanese  

    J-GLOBAL

    researchmap

  • Invalid Input Rejection Using Bag-of-Words for Speech-Oriented Guidance System

    Majima Haruka, Fujita Yoko, Torres Rafael, Kawanami Hiromichi, Hara Sunao, Matsui Tomoko, Saruwatari Hiroshi, Shikano Kiyohiro

    IPSJ SIG Notes   2012 ( 7 )   1 - 6   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    In a real-environment speech-oriented information guidance system, discrimination between valid and invalid inputs is important, as invalid inputs such as noise, laughter, coughs, and utterances between users lead to unpredictable system responses. Generally, acoustic features are used for discrimination. Comparing the acoustic likelihoods of GMMs (Gaussian Mixture Models) trained on speech data and noise data is one of the typical methods. In addition, using linguistic features is considered to improve discrimination accuracy, as it reflects the task domain of invalid inputs and meaningless recognition re...

    CiNii Article

    CiNii Books

    researchmap

  • New Speech Research Paradigm in the Cloud Era

    Tomoyoshi Akiba, Koji Iwano, Jun Ogata, Tetsuji Ogawa, Nobutaka Ono, Takahiro Shinozaki, Koichi Shinoda, Hiroaki Nanjo, Hiromitsu Nishizaki, Masafumi Nishida, Ryuichi Nishimura, Sunao Hara, Takaaki Hori

    IPSJ SIG Notes   2012 ( 4 )   1 - 7   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    Recently, most individuals have come to use mobile information devices, and they daily upload the information obtained by such devices to the Internet cloud. Accordingly, the applications of speech information processing have been changing drastically. We need to create a new paradigm for the research and development of speech information processing to adapt to this change. In this paper, we summarize the state-of-the-art speech technologies, propose how to create a research platform for this new paradigm, and discuss the problems we should solve to realize it.

    CiNii Article

    CiNii Books

    researchmap

  • Design of a network service for developing a speech-oriented guidance system used on mobile computers

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    IPSJ SIG Notes   2012 ( 1 )   1 - 6   2012.7

     More details

    Language:Japanese   Publisher:Information Processing Society of Japan (IPSJ)  

    In this paper, we propose novel speech service software for speech-oriented guidance systems. This software has been developed based on the Takemaru-kun system, which has been deployed at a community center since Nov. 2002. It consists of several modules, such as Automatic Speech Recognition, Dialog Management, Text-to-Speech, an Internet browser, and a Computer Graphics Agent. This software and toolkit are planned to be freely distributed. They will be used as speech service software in a Software-as-a-Service (SaaS) style for WWW site developers, and also for an upgraded version of our system for advanc...

    CiNii Article

    CiNii Books

    researchmap

  • D-9-36 DEVELOPMENT OF A SPEECH-ORIENTED GUIDANCE PLATFORM AS A SOFTWARE-AS-A-SERVICE FOR VARIOUS USAGE AND ENVIRONMENTS

    Hara Sunao, Kawanami Hiromichi, Saruwatari Hiroshi, Shikano Kiyohiro

    Proceedings of the IEICE General Conference   2012 ( 1 )   168 - 168   2012.3

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    CiNii Books

    researchmap

  • Multi-band Speech Recognition using Confidence of Blind Source Separation

    ANDO Atsushi, OHASHI Hiromasa, HARA Sunao, KITAOKA Norihide, TAKEDA Kazuya

    IEICE technical report. Speech   111 ( 431 )   219 - 224   2012.2

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    One of the main applications of Blind Source Separation (BSS) is to improve the performance of Automatic Speech Recognition (ASR) systems. However, conventional BSS algorithms have been applied to speech signals only as a pre-processing approach. In this paper, a closely coupled framework between an FDICA-based BSS algorithm and a speech recognition system is proposed. In the source separation step, a confidence score of the separation accuracy for each frequency bin is first estimated. Subsequently, employing a multi-band speech recognition system, the acoustic likelihood is calculated on the Mel-scale filter-bank energies using the estimated BSS confidence scores. Therefore, our proposed method can reduce the ASR errors caused by separation errors in BSS and permutation errors in ICA that occur in the conventional approach. Experimental results showed that our proposed method improved the word correct rate of ASR by 8.2% and the word accuracy rate by 5.7% on average.

    CiNii Article

    CiNii Books

    researchmap

  • Development of speech-oriented guidance service software for various usage environments

    原直

    Proceedings of the IEICE General Conference, 2012   168   2012

  • Robust Acoustic Modeling Using MLLR Transformation-based Speech Feature Generation

    2010 ( 5 )   1 - 6   2011.2

     More details

  • Robust acoustic modeling by generating acoustic features constrained by MLLR transformation matrices (Speech)

    伊藤 新, 原 直, 北岡 教英

    IEICE Technical Report   110 ( 357 )   55 - 60   2010.12

     More details

    Publisher:The Institute of Electronics, Information and Communication Engineers  

    CiNii Article

    researchmap

  • A music suggestion system based on continuous recognition of chat speech (General Session: Image and Speech Processing for Welfare and Monitoring)

    大橋 宏正, 北岡 教英, 原 直, 武田 一哉

    IEICE Technical Report. SP, Speech   110 ( 220 )   59 - 64   2010.10

     More details

    Publisher:The Institute of Electronics, Information and Communication Engineers  

    We constructed a system that suggests and plays music suitable for the mood of a conversation, using the word sequences obtained by continuously recognizing speech with a continuous speech recognition system. By exploiting a mapping between a document vector space built from texts describing songs and an acoustic vector space representing the songs' acoustic features, the system maps word sequences obtained by large-vocabulary speech recognition into the acoustic vector space. Keywords such as proper nouns that cannot be covered by large-vocabulary recognition are recognized by word spotting. This paper presents an overview of the system, the results of a basic performance evaluation, and preliminary experimental results toward application to real chat speech. We compared music retrieval results obtained by recognizing read-aloud music reviews with those obtained from the review texts themselves: whereas the text input achieved an MRR of 1, the speech input achieved a relatively good MRR of 0.83, even though the speech recognition WER was 70.55% and the word-spotting F-measure was 31.58%. We also conducted a preliminary experiment toward future application to chat recognition and showed examples from chat transcriptions.

    CiNii Article

    researchmap

  • A method for estimating user satisfaction using N-gram models of the dialogue histories of a spoken dialogue system

    原 直, 北岡 教英, 武田 一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2009 ( 14 )   1 - 6   2009.12

     More details

    Publisher:Information Processing Society of Japan  

    In this paper, we propose a method for estimating user satisfaction based on the likelihoods of N-gram models of the dialogue history of a spoken dialogue system. In the experiments, we estimate each user's satisfaction using N-gram likelihoods computed from the user-system dialogue histories recorded in real-environment speech data collected with a spoken-dialogue music retrieval system. An N-gram model for each satisfaction level is trained on sequences that encode the user's and the system's dialogue acts. We evaluated the classification performance of satisfaction estimation using the trained models. The experimental results show that the proposed method achieves high classification performance; in particular, in classifying users who completed the task versus those who did not, the false-detection rate was kept to 10% when detecting all users who failed to complete the task.

    CiNii Article

    researchmap

  • User modeling for a satisfaction evaluation of a speech recognition system

    HARA Sunao, KITAOKA Norihide, TAKEDA Kazuya

    IEICE technical report   108 ( 338 )   61 - 66   2008.12

     More details

    Language:Japanese   Publisher:The Institute of Electronics, Information and Communication Engineers  

    A mathematical model for predicting user satisfaction with spoken dialogue systems is studied based on a field trial of a voice-navigated music retrieval system. The Subjective Word Accuracy (subjective-WA) of the user is introduced as a background psychometric for satisfaction. In the field test, subjective-WA is collected through questionnaires together with satisfaction indexes and various user profiles. First, we show that subjective-WA is more significant to user satisfaction than (objective) Word Accuracy (objective-WA), which is calculated using manually given transcriptions of the recorded dialogues. Then, through top-down clustering of the joint distribution of subjective- and objective-WA, we show that the user population can be grouped into several subgroups in terms of sensitivity to recognition accuracy. The lower bound of objective-WA for a given subjective-WA is also calculated from the joint distribution. Finally, a graphical model is built that predicts the user satisfaction index from user profiles and reduces the uncertainty of the user-satisfaction distribution by 13% of its variance.

    CiNii Article

    researchmap

  • Field test and evaluation of a spoken-dialogue music retrieval system in various usage environments

    原直, 宮島千代美, 伊藤克亘, 北岡教英, 武田一哉

    Proceedings of the 70th IPSJ National Convention   70 ( 5 (3L-4) )   5-341 - 5-342   2008.3

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)   Publisher:Information Processing Society of Japan  

    CiNii Article

    researchmap

  • User modeling for satisfaction evaluation of a speech recognition system (Language Models and Systems, 10th Spoken Language Symposium)

    原直, 北岡教英, 武田一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2008 ( 123 )   61 - 66   2008

     More details

    Publisher:Information Processing Society of Japan  

    We investigated a mathematical model for estimating user satisfaction with a spoken dialogue system, using data from a field test of a spoken-dialogue music retrieval system. We introduce subjective recognition accuracy, based on the user's perception, as a psychological scale underlying user satisfaction. The subjective recognition accuracy was collected through post-experiment questionnaires in the field test, together with satisfaction indexes and various user profiles. First, we show that subjective recognition accuracy is more strongly related to the satisfaction index than an objective recognition accuracy computed from transcriptions of the dialogue data. Next, by top-down clustering of the joint distribution of subjective and objective recognition accuracies, we group users in terms of their sensitivity to recognition accuracy. The lower bound of the objective recognition accuracy that can yield a given subjective recognition accuracy is also computed from this joint distribution. Finally, we build a graphical model for estimating the user satisfaction index from user profiles and environmental conditions, and show that it reduces the uncertainty of the user-satisfaction distribution by about 13% of its variance.

    CiNii Article

    researchmap

  • A data collection system for speech recognition system usage under diverse acoustic environments (Speech, Hearing)

    原 直, 宮島 千代美, 伊藤 克亘, 武田 一哉

    IEICE Transactions on Information and Systems (Japanese Edition), D   90 ( 10 )   2807 - 2816   2007.10

     More details

    Publisher:The Institute of Electronics, Information and Communication Engineers  

    To conduct subject experiments on speech recognition in diverse PC usage environments, we built a WWW-based spoken dialogue system and a data collection system for it. Using a client program installed on their PCs, users of the system can play music stored on the PC or preview songs on music download sites through spoken dialogue. The speech data and dialogue logs collected through the client program are stored on a server via the Internet. The server also implements functions such as per-user dictionary customization. We released the system on the Internet on a trial basis and collected data in a two-month field test, obtaining a total of 59 hours of system-operation data. Of these, about 5 hours and 41 minutes of data (11,351 segments) were detected as speech segments, from which we obtained 6,335 utterances spoken to the system.

    CiNii Article

    researchmap

  • A study of methods for building acoustic models adapted to playlists in a music retrieval system (Acoustic and Phonetic Models)

    原直, 宮島千代美, 北岡教英, 伊藤克亘, 武田一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2007 ( 75 )   87 - 90   2007.7

     More details

    Publisher:Information Processing Society of Japan  

    This paper describes a method for training HMM acoustic models optimal for a given recognition vocabulary, for application to the speech interface of a music retrieval system. In the music retrieval application targeted in this paper, customizing the HMM acoustic model for each user is important because 1) song and artist names contain phonemic contexts that do not appear in general read-speech text corpora, and 2) the stored music differs from user to user. In particular, finding the optimal state-sharing structure for a given recognition vocabulary is a new problem in acoustic model training. We therefore propose a method that builds task-dependent acoustic models by generating task-related vocabulary utterances with synthesized speech of more than 100 speakers. Evaluation in a field test confirmed that the task-dependent acoustic model built by the proposed method achieves a word error reduction rate of about 10% compared with a task-independent acoustic model.

    CiNii Article

    researchmap

  • Evaluation of a spoken-dialogue music retrieval system (Kansei)

    原 直, 伊藤 克亘, 北岡 教英

    Proceedings of the Mobile Symposium   2007   47 - 50   2007

     More details

    Publisher:Mobile Society of Japan  

    CiNii Article

    researchmap

  • Speech collection and evaluation with a spoken dialogue system used on general-purpose PCs (8th Spoken Language Symposium)

    原直, 宮島千代美, 伊藤克亘, 武田一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2006 ( 136 )   167 - 172   2006

     More details

    Publisher:Information Processing Society of Japan  

    To conduct subject experiments closer to real usage environments, we built a spoken dialogue system customizable via the WWW and a data collection system for it. Users install the speech recognition system on their own PCs. To support an unspecified number of users, each user can customize the system on a remote server on the Internet. Furthermore, the speech data recorded on each user's PC is sent to the remote server over the Internet. Using this system, we can collect data from multiple users operating a speech recognition system in real environments. In this paper, we conducted a field test using the system and collected and analyzed data. During two months of public release on the Internet, 59 hours of data were collected, of which about 5 hours and 41 minutes (11,351 segments) were detected as speech segments. In recognition experiments using 4,716 of the detected utterances that were spoken to the system as evaluation data, the word correct rate was 66.0%, which improved to 70.5% with unsupervised MLLR adaptation for each user.

    CiNii Article

    researchmap

  • Evaluation of learning effects during long-term use of a spoken dialogue interface

    原 直, 白勢 彩子, 宮島 千代美, 伊藤 克亘, 武田 一哉

    Proceedings of the Meeting of the Acoustical Society of Japan   2005 ( 1 )   153 - 154   2005.3

  • Evaluation of learning effects in long-term use of a spoken dialogue interface

    原直, 白勢彩子, 宮島千代美, 伊藤克亘, 武田一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2005 ( 12 )   17 - 22   2005

     More details

    Publisher:Information Processing Society of Japan  

    In this study, we have been building a spoken-dialogue music retrieval system intended for in-car use. With this system, users search for and play the songs they want to hear through dialogue. In a previous report, we conducted an experiment in which users operated the system for about one hour; although the degree varied between subjects, we found that recognition performance improved as users became accustomed to the system. In this report, we therefore conducted an experiment repeating one-hour sessions five times so that the subjects could become sufficiently accustomed to the system. Analysis of the speech of the 12 subjects recorded in the experiment showed an error reduction rate of about 60% on the final day relative to the first day.

    CiNii Article

    researchmap

  • A spoken-dialogue music retrieval system (Spoken Dialogue Systems A) (Theme: Spoken Dialogue Systems, Spoken Language Processing, General)

    原直, 白勢彩子, 宮島千代美, 伊藤克亘, 武田一哉

    IPSJ SIG Technical Report. SLP, Spoken Language Processing   2004 ( 103 )   31 - 36   2004.2

     More details

    Publisher:Information Processing Society of Japan  

    In recent years, various applications using speech recognition technology have been considered. For example, taking advantage of hands-free operation, it is used in car navigation systems. In addition, music download services over the Internet are increasing, raising expectations for easy-to-use music retrieval interfaces. In this study, we therefore developed a spoken-dialogue music retrieval system intended for in-car use. With this system, users search for and play the songs they want to hear through dialogue. This report describes the details of the system and the recording of spoken dialogues using it. In a subject experiment with about 150 participants using a prototype, we obtained word correct rates of about 80% indoors and about 76% while actually driving.

    CiNii Article

    researchmap

  • Preliminary study on the evaluation of a quality of spoken dialogue system in terms of user factors

    Ayako Shirose, Sunao Hara, Hiroshi Fujimura, Katsunobu Ito, Kazuya Takeda, Fumitada Itakura

    IPSJ SIG Technical Report   2003 ( 124 (2003-SLP-049) )   253 - 258   2003.12

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    CiNii Article

    researchmap

  • Implementation and evaluation of Julius/Julian on the PDA environment

    Sunao Hara, Nobuo Kawaguchi, Kazuya Takeda, Fumitada Itakura

    IPSJ SIG Technical Report   2003 ( 14 (2002-SLP-045) )   131 - 136   2003.2

     More details

    Language:Japanese   Publishing type:Research paper, summary (national, other academic conference)  

    CiNii Article

    researchmap


Presentations

  • A study of dialogue strategies to support natural topic development in human-to-human conversation

    前薗そよぎ, 原直, 阿部匡伸

    Ongaku Symposium 2021 (IPSJ Special Interest Group on Spoken Language Processing)  2021.6.18  Information Processing Society of Japan

     More details

    Event date: 2021.6.18 - 2021.6.19

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • An experimental study on articulation improvement with a surface-contact artificial tongue designed for easy control of the expiratory airflow path

    長塚弘亮, 川上滋央, 古寺寛志, 佐藤匡晃, 田中祐貴, 兒玉直紀, 原直, 皆木省吾

    The 38th Annual Meeting of the Japanese Academy of Maxillofacial Prosthetics  2021.6.4 

     More details

    Event date: 2021.6.3 - 2021.6.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Evaluation of Japanese end-to-end speech synthesis using pronunciation kana and prosodic symbols estimated by neural machine translation as input

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    2021 Spring Meeting of the Acoustical Society of Japan  2021.3.11  Acoustical Society of Japan

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluation of Concept Drift Adaptation for Acoustic Scene Classifier Based on Kernel Density Drift Detection and Combine Merge Gaussian Mixture Model

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    2021.3.10 

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of a singing voice synthesis method using Bidirectional-LSTM capable of adding singing expressions

    金子隼人, 原直, 阿部匡伸

    2021 Spring Meeting of the Acoustical Society of Japan  2021.3.10  Acoustical Society of Japan

     More details

    Event date: 2021.3.10 - 2021.3.12

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Development of a glove-type input device using pressure sensors for a TTS-based conversation support system and evaluation of its input speed

    IPSJ SIG-HCI  2020.12.9 

     More details

    Event date: 2020.12.8 - 2020.12.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Module Comparison of Transformer-TTS for Speaker Adaptation based on Fine-tuning International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2020  2020.12  APSIPA

     More details

    Event date: 2020.12.7 - 2020.12.10

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Auckland, New Zealand)  

    researchmap

    Other Link: https://ieeexplore.ieee.org/document/9306250

  • Concept Drift Adaptation for Acoustic Scene Classifier Based on Gaussian Mixture Model International conference

    Ibnu Daqiqil Id, Masanobu Abe, Sunao Hara

    The 2020 IEEE Region 10 Conference (IEEE-TENCON 2020)  2020.11  IEEE

     More details

    Event date: 2020.11.16 - 2020.11.19

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Osaka, Japan)  

    researchmap

  • Controlling the Strength of Emotions in Speech-like Emotional Sound Generated by WaveNet International conference

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    Interspeech 2020  2020.10  ISCA

     More details

    Event date: 2020.10.25 - 2020.10.29

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Shanghai, China)  

    researchmap

  • A study of temporal features using insole-type pressure sensors for Parkinson's disease severity estimation

    林倖生, 原直, 阿部匡伸

    2020 (71st) Joint Convention of Electrical and Information Engineering Societies, Chugoku Branch  2020.10.24 

     More details

    Event date: 2020.10.24

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Estimation of pronunciation kana and prosodic symbol sequences from Japanese text using Transformer

    懸川直人, 原直, 阿部匡伸, 井島勇祐

    2020 Autumn Meeting of the Acoustical Society of Japan  2020.9.11  Acoustical Society of Japan

     More details

    Event date: 2020.9.9 - 2020.9.11

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Online  

    researchmap

  • A study on controlling emotion strength in emotional speech synthesis without linguistic information using WaveNet

    松本剣斗, 原直, 阿部匡伸

    2020 Autumn Meeting of the Acoustical Society of Japan  2020.9.10  Acoustical Society of Japan

     More details

    Event date: 2020.9.9 - 2020.9.11

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Online  

    researchmap

  • A study of methods for estimating engagement and positive/negative attitudes in discussions using video and audio

    Tsubasa Kanaoka, Yutaro Uehara, Sunao Hara, Masanobu Abe

    DICOMO 2020  2020.6.26 

     More details

    Event date: 2020.6.24 - 2020.6.26

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Analysis of the importance of places in daily life by clustering GPS data

    Rui Hirata, Sunao Hara, Masanobu Abe

    DICOMO 2020  2020.6.25 

     More details

    Event date: 2020.6.24 - 2020.6.26

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using end-to-end speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    2020.6.7 

     More details

    Event date: 2020.6.6 - 2020.6.7

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of a spoken dialogue system that responds according to the user's familiarity with the topic

    Daichi Kato, Sunao Hara, Masanobu Abe

    2020.6.6 

     More details

    Event date: 2020.6.6 - 2020.6.7

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Semi-supervised speaker adaptation for end-to-end speech synthesis with the pretrained models International coauthorship International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    ICASSP 2020  2020.5  IEEE

     More details

    Event date: 2020.5.4 - 2020.5.8

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Online/Virtual Conference (Barcelona, Spain)  

    researchmap

  • A study of voice conversion using estimated phoneme posterior probabilities to improve the phoneme intelligibility of patients with subtotal glossectomy

    Seiya Ogino, Sunao Hara, Masanobu Abe

    2020.3.17 

     More details

    Event date: 2020.3.17 - 2020.3.18

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of an emotion conversion method based on CycleGAN trained with emotional synthesized sounds without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2020.3.18 

     More details

    Event date: 2020.3.16 - 2020.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Semi-supervised speaker adaptation of speech synthesis using end-to-end speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe

    2020.3.17 

     More details

    Event date: 2020.3.16 - 2020.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A Japanese sentence estimation method using neural machine translation from ambiguous input on wearable devices

    Jun Watanabe, Sunao Hara, Masanobu Abe

    IPSJ SIG-HCI  2020.3.16 

     More details

    Event date: 2020.3.16 - 2020.3.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Performance evaluation of end-to-end speech synthesis using end-to-end speech recognition International coauthorship

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Shinji Watanabe

    2019.11.30 

     More details

    Event date: 2019.11.30

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients’ Speech International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11.19  APSIPA

     More details

    Event date: 2019.11.18 - 2019.11.21

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • Speech-like Emotional Sound Generator by WaveNet International conference

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11  APSIPA

     More details

    Event date: 2019.11.18 - 2019.11.21

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • A study of speaker identity in WaveNet-based emotional speech synthesis without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2019.9.6 

     More details

    Event date: 2019.9.4 - 2019.9.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Effects of a newly designed artificial tongue and an anatomical artificial tongue, and criteria for choosing between them

    佐藤匡晃, 長塚弘亮, 川上滋央, 兒玉直紀, 原直, 阿部匡伸, 皆木省吾

    Academic Meeting of the Chugoku-Shikoku Branch of the Japan Prosthodontic Society  2019.9.1  Japan Prosthodontic Society, Chugoku-Shikoku Branch

     More details

    Event date: 2019.8.31 - 2019.9.1

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Fukuyama, Hiroshima, Japan  

    researchmap

  • A signal processing perspective on human gait: Decoupling walking oscillations and gestures International coauthorship International conference

    Adrien Gregorj, Zeynep Yücel, Sunao Hara, Akito Monden, Masahiro Shiomi

    The 4th International Conference on Interactive Collaborative Robotics 2019 (ICR 2019)  2019.8 

     More details

    Event date: 2019.8.20 - 2019.8.25

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Detection of special behaviors in daily life based on GPS data

    Makoto Kobayashi, Sunao Hara, Masanobu Abe

    DICOMO 2019  2019.7.4 

     More details

    Event date: 2019.7.3 - 2019.7.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Environmental sound classification using bottleneck features extracted from a CNN autoencoder

    Takumi Matsubara, Sunao Hara, Masanobu Abe

    DICOMO 2019  2019.7.3 

     More details

    Event date: 2019.7.3 - 2019.7.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of WaveNet-based emotional speech synthesis without linguistic information

    Kento Matsumoto, Sunao Hara, Masanobu Abe

    2019.6.23 

     More details

    Event date: 2019.6.22 - 2019.6.23

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of an i-vector-based method for estimating bustle sounds

    Zhenyang Wu, Kohei Tomoda, Sunao Hara, Masanobu Abe

    2019.6.22 

     More details

    Event date: 2019.6.22 - 2019.6.23

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of a voice conversion method using auxiliary phonemic information based on Bidirectional LSTM-RNNs to improve the phoneme intelligibility of patients with subtotal glossectomy

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    2019.3.4 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of methods for estimating auxiliary phonemic information to improve the phoneme intelligibility of glossectomy patients via voice conversion

    Seiya Ogino, Hiroki Murakami, Sunao Hara, Masanobu Abe

    2019.3.4 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of emotion transplantation using a small amount of target emotional speech in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima

    2019.3.3 

     More details

    Event date: 2019.3.3 - 2019.3.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of the effect of background-listening music on workload and a method for selecting such music

    Kaoru Takase, Masanobu Abe, Sunao Hara

    2018.11.21 

     More details

    Event date: 2018.11.21 - 2018.11.22

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of subjective noisiness estimation for constructing environmental sound maps by crowdsourcing

    Sunao Hara, Masanobu Abe

    2018.9.20 

     More details

    Event date: 2018.9.19 - 2018.9.21

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Speech Enhancement of Glossectomy Patient’s Speech using Voice Conversion Approach

    Masanobu Abe, Seiya Ogino, Hiroki Murakami, Sunao Hara

    The 56th Annual Meeting of the Biophysical Society of Japan, Symposium: Understanding and Application of Health Systems  2018.9.15  Biophysical Society of Japan

     More details

    Event date: 2018.9.15 - 2018.9.17

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Okayama University, Tsushima Campus  

    researchmap

  • A study of auxiliary information for improving the phoneme intelligibility of glossectomy patients via voice conversion

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    2018.9.12 

     More details

    Event date: 2018.9.12 - 2018.9.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Evaluation of emotion transplantation methods in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

    2018.9.12 

     More details

    Event date: 2018.9.12 - 2018.9.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Naturalness Improvement Algorithm for Reconstructed Glossectomy Patient’s Speech Using Spectral Differential Modification in Voice Conversion International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2018  2018.9.5  ISCA

     More details

    Event date: 2018.9.2 - 2018.9.6

    Language:English   Presentation type:Poster presentation  

    Venue:Hyderabad, India  

    researchmap

  • A study on improving the phoneme intelligibility of patients with subtotal glossectomy via voice conversion using speech and lip shapes

    Seiya Ogino, Hiroki Murakami, Sunao Hara, Masanobu Abe

    SP  2018.6.28 

     More details

    Event date: 2018.6.28 - 2018.6.29

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction of a multimodal database for improving the phoneme intelligibility of patients with subtotal glossectomy

    Hiroki Murakami, Seiya Ogino, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Field-experiment evaluation of a bustle-sound identification method using crowdsourcing

    Kohei Tomoda, Sunao Hara, Masanobu Abe

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of duration models for emotion transplantation in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Katsunobu Houjou, Yusuke Ijima

     More details

    Event date: 2018.3.13 - 2018.3.15

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Sound sensing using smartphones as a crowdsourcing approach International conference

    Sunao Hara, Asako Hatakeyama, Shota Kobayashi, Masanobu Abe

    APSIPA Annual Summit and Conference 2017  2017.12.15  APSIPA

     More details

    Event date: 2017.12.12 - 2017.12.15

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Kuala Lumpur, Malaysia  

    researchmap

  • An investigation to transplant emotional expressions in DNN-based TTS synthesis International conference

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    APSIPA Annual Summit and Conference 2017  2017.12.14  APSIPA

     More details

    Event date: 2017.12.12 - 2017.12.15

    Language:English   Presentation type:Poster presentation  

    Venue:Kuala Lumpur, Malaysia  

    researchmap

  • New monitoring scheme for persons with dementia through monitoring-area adaptation according to stage of disease International conference

    Shigeki Kamada, Yuji Matsuo, Sunao Hara, Masanobu Abe

    ACM SIGSPATIAL Workshop on Recommendations for Location-based Services and Social Networks (LocalRec 2017)  ACM

     More details

    Event date: 2017.11.7 - 2017.11.10

    Language:English   Presentation type:Poster presentation  

    Venue:Redondo Beach, CA, USA  

    researchmap

  • A study on improving the phoneme intelligibility of glossectomy patients' speech via voice conversion using DNN-based differential spectrum compensation

    Hiroki Murakami, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    2017.9.26 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Creating noise maps based on a DNN noisiness-estimation method that accounts for human perception

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017.9.26 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of model architectures for handling speaker and emotion information in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    2017.9.25 

     More details

    Event date: 2017.9.25 - 2017.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Prediction of subjective assessments for a noise map using deep neural networks International conference

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2017)  2017.9.13  ACM

     More details

    Event date: 2017.9.11 - 2017.9.15

    Language:English   Presentation type:Poster presentation  

    Venue:Maui, Hawaii, USA  

    researchmap

  • Speaker Dependent Approach for Enhancing a Glossectomy Patient's Speech via GMM-based Voice Conversion International conference

    Kei Tanaka, Sunao Hara, Masanobu Abe, Masaaki Sato, Shogo Minagi

    Interspeech 2017  2017.8.23  ISCA

     More details

    Event date: 2017.8.20 - 2017.8.24

    Language:English   Presentation type:Poster presentation  

    Venue:Stockholm, Sweden  

    researchmap

  • A study of visualization methods providing effective incentives for environmental sound collection

    Asako Hatakeyama, Sunao Hara, Masanobu Abe

    DICOMO 2017  2017.6.28 

     More details

    Event date: 2017.6.28 - 2017.6.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of model architectures for adding emotional expressions in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    SP  2017.6.22 

     More details

    Event date: 2017.6.22 - 2017.6.23

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A monitoring system based on living areas of two granularities

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    2017.3.22 

     More details

    Event date: 2017.3.22 - 2017.3.25

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Environmental sound classification with CNNs using an environmental-sound database recorded by smartphones

    Shunji Toba, Sunao Hara, Masanobu Abe

    2017.3.16 

     More details

    Event date: 2017.3.15 - 2017.3.17

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A DNN-based noisiness estimation method accounting for human perception for creating noise maps

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    2017.3.16 

     More details

    Event date: 2017.3.15 - 2017.3.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Enhancing a Glossectomy Patient’s Speech via GMM-based Voice Conversion International conference

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    APSIPA Annual Summit and Conference 2016  2016.12.13  APSIPA

     More details

    Event date: 2016.12.13 - 2016.12.16

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Jeju, Korea  

    researchmap

  • A classification method for crowded situation using environmental sounds based on Gaussian mixture model-universal background model International conference

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    ASA/ASJ 5th Joint Meeting  2016.11.29  Acoustical Society of America / Acoustical Society of Japan

     More details

    Event date: 2016.11.28 - 2016.12.2

    Language:English   Presentation type:Poster presentation  

    Venue:Honolulu, Hawaii  

    researchmap

  • A study of spectral conversion methods considering fundamental frequency modification

    Kengo Toko, Sunao Hara, Masanobu Abe

    2016.11.19 

     More details

    Event date: 2016.11.19 - 2016.11.20

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of methods for removing tap sounds from environmental sounds recorded by smartphones

    Kohei Tomoda, Sunao Hara, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A basic study of features for detecting environmental sounds in an environmental-sound database containing overlapping sounds

    Sunao Hara, Tomoyasu Tanaka, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study on improving the phoneme intelligibility of glossectomy patients' speech using GMM-based voice conversion

    Kei Tanaka, Sunao Hara, Masanobu Abe, Shogo Minagi

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Multiple acoustic event detection from real-environment data using RNNs

    Shunji Toba, Sunao Hara, Masanobu Abe

    2016.9.15 

     More details

    Event date: 2016.9.14 - 2016.9.16

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • LiBS: Lifelog browsing system to support sharing of memories International conference

    Atsuya Namba, Sunao Hara, Masanobu Abe

    2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2016)  2016.9.13  ACM

     More details

    Event date: 2016.9.12 - 2016.9.16

    Language:English   Presentation type:Poster presentation  

    Venue:Heidelberg, Germany  

    researchmap

  • Safety vs. Privacy: User Preferences from the Monitored and Monitoring Sides of a Monitoring System International conference

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2016)  2016.9.13  ACM

     More details

    Event date: 2016.9.12 - 2016.9.16

    Language:English   Presentation type:Poster presentation  

    Venue:Heidelberg, Germany  

    researchmap

  • A crowdsourced environmental sound collection system for constructing noise maps based on subjective evaluation

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    SP  2016.8.25 

     More details

    Event date: 2016.8.24 - 2016.8.25

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Sound collection systems using a crowdsourcing approach to construct sound map based on subjective evaluation International conference

    Sunao Hara, Shota Kobayashi, Masanobu Abe

    IEEE ICME Workshop on Multimedia Mobile Cloud for Smart City Applications (MMCloudCity-2016)  2016.7.15  IEEE

     More details

    Event date: 2016.7.11 - 2016.7.15

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Seattle, WA, USA  

    researchmap

  • A study of objective indices expressing the subjective acceptability of GPS-data anonymization levels

    Yusuke Mitou, Sunao Hara, Masanobu Abe

    DICOMO 2016  2016.7.7 

     More details

    Event date: 2016.7.6 - 2016.7.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A noisiness estimation method accounting for human perception for creating noise maps

    Shota Kobayashi, Sunao Hara, Masanobu Abe

    DICOMO 2016  2016.7.6 

     More details

    Event date: 2016.7.6 - 2016.7.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A measure for transfer tendency between staying places

    Takashi Ofuji, Sunao Hara, Masanobu Abe

    LOIS  2016.5.13 

     More details

    Event date: 2016.5.12 - 2016.5.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction of an environmental-sound database for crowdedness estimation

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    2016 Spring Meeting of ASJ  2016.3.11 

     More details

    Event date: 2016.3.9 - 2016.3.11

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A watching method to protect users' privacy using living area

    Shigeki Kamada, Sunao Hara, Masanobu Abe

    LOIS  2016.1.21 

     More details

    Event date: 2016.1.21 - 2016.1.22

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A Spoken Dialog System with Redundant Response to Prevent User Misunderstanding International conference

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2015  2015.12.19  APSIPA

     More details

    Event date: 2015.12.16 - 2015.12.19

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Hong Kong  

    researchmap

  • A study on visualization of environmental sounds using environmental sound detectors

    Tomoyasu Tanaka, Sunao Hara, Masanobu Abe

    The 17th IEEE Hiroshima Section Student Symposium (HISS 17th)  IEEE Hiroshima Section

     More details

    Event date: 2015.11.21 - 2015.11.22

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Okayama University  

    researchmap

  • Dialogue strategies for a listener-type spoken dialogue system that users can talk with comfortably

    Ryota Saito, Sunao Hara, Masanobu Abe

    The 17th IEEE Hiroshima Section Student Symposium (HISS 17th)  IEEE Hiroshima Section

     More details

    Event date: 2015.11.21 - 2015.11.22

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Okayama University  

    researchmap

  • A preliminary recording experiment using a crowdsourced environmental sound collection system

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    2015 Autumn Meeting of ASJ  2015.9.18  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:University of Aizu  

    researchmap

  • A study on a spoken dialogue system robust to user misunderstanding using redundant system responses

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

    2015 Autumn Meeting of ASJ  2015.9.18  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:University of Aizu  

    researchmap

  • A study of methods for reviewing daily-life sound recordings using music

    Shunji Toba, Sunao Hara, Masanobu Abe

    2015 Autumn Meeting of ASJ  2015.9.17  Acoustical Society of Japan

     More details

    Event date: 2015.9.16 - 2015.9.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:University of Aizu  

    researchmap

  • A Sub-Band Text-to-Speech by Combining Sample-Based Spectrum with Statistically Generated Spectrum International conference

    Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki and Hideyuki Mizuno

    Interspeech 2015  ISCA

     More details

    Event date: 2015.9.6 - 2015.9.10

    Language:English   Presentation type:Poster presentation  

    Venue:Dresden, Germany  

    researchmap

  • Algorithm to estimate a living area based on connectivity of places with home International conference

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    HCI International 2015 

     More details

    Event date: 2015.8.2 - 2015.8.7

    Language:English   Presentation type:Poster presentation  

    Venue:Los Angeles, CA, USA  

    researchmap

  • Extraction of key segments from day-long sound data International conference

    Akinori Kasai, Sunao Hara, Masanobu Abe

    HCI International 2015 

     More details

    Event date: 2015.8.2 - 2015.8.7

    Language:English   Presentation type:Poster presentation  

    Venue:Los Angeles, CA, USA  

    researchmap

  • LiBS: A lifelog browsing method enabling discovery and awareness

    Atsuya Namba, Sunao Hara, Masanobu Abe

    Multimedia, Distributed, Cooperative, and Mobile Symposium (DICOMO 2015)  2015.7.10 

     More details

    Event date: 2015.7.8 - 2015.7.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Appi Kogen, Iwate, Japan  

    researchmap

  • A method for extracting sounds worth keeping as lifelogs from long-term recordings

    Akinori Kasai, Sunao Hara, Masanobu Abe

    Multimedia, Distributed, Cooperative, and Mobile Symposium (DICOMO 2015)  2015.7.10 

     More details

    Event date: 2015.7.8 - 2015.7.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Appi Kogen, Iwate, Japan  

    researchmap

  • Utilizing automatically collected lifelogs for efficient video summarization to support reminiscence

    Anna Onishi, Sunao Hara, Masanobu Abe

    Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.5.14 - 2015.5.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Kodaira Campus, Tsuda College  

    researchmap

  • Sound collection and visualization system enabled participatory and opportunistic sensing approaches International conference

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    2nd International Workshop on Crowd Assisted Sensing, Pervasive Systems and Communications (CASPer 2015)  IEEE

     More details

    Event date: 2015.3.27

    Language:English   Presentation type:Oral presentation (general)  

    Venue:St. Louis, Missouri, USA  

    researchmap

  • Similarity comparison of mixed voice with chest voice and falsetto

    Tomonori Iemura, Sunao Hara, Masanobu Abe

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • Environmental sound recording for creating sound maps based on listeners' subjective evaluations

    Sunao Hara, Masanobu Abe, Noboru Sonehara

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • A study on quality improvement of HMM-synthesized speech by introducing sample-based spectra and HMM-generated spectra into the high-frequency band

    Tadashi Inai, Sunao Hara, Masanobu Abe, Yusuke Ijima, Noboru Miyazaki, Hideyuki Mizuno

    2015 Spring Meeting of ASJ  2015.3.17  Acoustical Society of Japan

     More details

    Event date: 2015.3.16 - 2015.3.18

    Language:Japanese   Presentation type:Poster presentation  

    Venue:College of Science and Technology, Nihon University  

    researchmap

  • Human behavior analysis by pattern classification of travel routes

    Ryo Seto, Sunao Hara, Masanobu Abe

    Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology  

    researchmap

  • A method for retrieving "special days" using features of staying places

    Keigo Hayashi, Sunao Hara, Masanobu Abe

    Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology  

    researchmap

  • A living-area extraction method focusing on staying places and routes

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    Technical Committee on Life Intelligence and Office Information Systems (LOIS)  IEICE

     More details

    Event date: 2015.3.5 - 2015.3.6

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okinawa Institute of Science and Technology  

    researchmap

  • Analysis of regional bustle using crowd-sensing data: revitalizing regional economies International conference

    Sunao Hara

    ISSI2014 

     More details

    Event date: 2015.2.16 - 2015.2.17

    Language:Japanese   Presentation type:Oral presentation (invited, special)  

    researchmap

  • Extracting Daily Patterns of Human Activity Using Non-Negative Matrix Factorization International conference

    Masanobu Abe, Akihiko Hirayama, Sunao Hara

    IEEE International Conference on Consumer Electronics  IEEE

     More details

    Event date: 2015.1.9 - 2015.1.12

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Las Vegas, USA  

    researchmap

  • Effectiveness of listeners' judgment criteria in discriminating interest in utterances

    Ryota Saito, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.12.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Acoustic feature analysis of "distorted voice" and "mixed voice" in rock singing

    Tomonori Iemura, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.12.14

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A Hybrid Text-to-Speech Based on Sub-Band Approach International conference

    Takuma Inoue, Sunao Hara, Masanobu Abe

    Asia-Pacific Signal and Information Processing Association 2014 Annual Summit and Conference (APSIPA ASC 2014)  Asia-Pacific Signal and Information Processing Association (APSIPA)

     More details

    Event date: 2014.12.9 - 2014.12.12

    Language:English   Presentation type:Poster presentation  

    Venue:Cambodia  

    researchmap

  • Analysis of the relationship between database size and subjective evaluation in phoneme-waveform-selection speech synthesis

    Tadashi Inai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.9.3 - 2014.9.5

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A sound-map construction method using symbolic representations of environmental sounds collected by crowd sensing

    Sunao Hara, Masanobu Abe, Noboru Sonehara

     More details

    Event date: 2014.9.3 - 2014.9.5

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • FLAG: A lifelog aggregation system based on location information

    Akinori Kasai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.6.28 - 2014.6.29

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of behavior analysis based on node degree in network structures constructed from GPS data

    Daisuke Fujioka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.29 - 2014.5.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Development of a crowdsourced environmental sound collection system using smart devices

    Sunao Hara, Akinori Kasai, Masanobu Abe, Noboru Sonehara

     More details

    Event date: 2014.5.24 - 2014.5.25

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • A study of dialogue strategies considering user workload for in-vehicle spoken dialogue systems

    Masaki Yamaoka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.22 - 2014.5.23

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Extracting work patterns from PC operation logs using non-negative matrix factorization

    Akihiko Hirayama, Sunao Hara, Masanobu Abe

     More details

    Event date: 2014.5.15 - 2014.5.16

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Environmental sound collection by crowdsourcing

    Sunao Hara

    The 15th Okayama Information and Communication Technology Workshop  Okayama Information and Communication Technology Workshop

     More details

    Event date: 2014.4.30

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • New Approach to Emotional Information Exchange: Experience Metaphor Based on Life Logs International conference

    Masanobu Abe, Daisuke Fujioka, Kazuto Hamano, Sunao Hara, Rika Mochizuki, Tomoki Watanabe

    The 12th IEEE International Conference on Pervasive Computing and Communications (PerCom 2014)  IEEE

     More details

    Event date: 2014.3.24 - 2014.3.28

    Language:English   Presentation type:Poster presentation  

    Venue:Budapest, Hungary  

    researchmap

  • Analyzing individual behavioral characteristics by comparing behavior logs with those of others

    Ryo Seto, Masanobu Abe, Sunao Hara

     More details

    Event date: 2014.3.18 - 2014.3.21

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of a living-area extraction method focusing on staying places and routes

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    The 76th National Convention of IPSJ  2014 

     More details

    Event date: 2014.3.11 - 2014.3.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo Denki University  

    researchmap

  • A study of a "special days" retrieval method using features of staying places

    Keigo Hayashi, Sunao Hara, Masanobu Abe

    The 76th National Convention of IPSJ  2014 

     More details

    Event date: 2014.3.11 - 2014.3.13

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo Denki University  

    researchmap

  • Evaluation of an HMM speech synthesis method using high-frequency bands of speech waveforms

    Takuma Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima, Hideyuki Mizuno

    2014 Spring Meeting of ASJ  2014 

     More details

    Event date: 2014.3.10 - 2014.3.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Nihon University  

    researchmap

  • Development of a smart-device application for crowdsourced environmental sound collection

    Sunao Hara, Akinori Kasai, Masanobu Abe, Noboru Sonehara

    IEICE Technical Committee on LOIS  2014 

     More details

    Event date: 2014.3.7 - 2014.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study on mobile spoken dialogue systems utilizing geographic information

    Sunao Hara

    IPSJ SIG Spoken Language Processing (The 100th SIG-SLP Symposium) 

     More details

    Event date: 2014.1.31 - 2014.2.1

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Analysis of work styles based on software usage patterns extracted from PC operation logs

    Akihiko Hirayama, Sunao Hara, Masanobu Abe

    The 15th IEEE Hiroshima Section Student Symposium (HISS 15th)  2013 

     More details

    Event date: 2013.11

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Tottori University  

    researchmap

  • A behavior-review support method focusing on staying places in GPS data

    Keigo Hayashi, Sunao Hara, Masanobu Abe

    The 64th Joint Convention of Electrical and Information Related Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study of waypoint detection for behavior analysis using location information

    Ryo Seto, Sunao Hara, Masanobu Abe

    The 64th Joint Convention of Electrical and Information Related Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study on the amount of data required for dynamically generating living areas from GPS data

    Yuji Matsuo, Sunao Hara, Masanobu Abe

    The 64th Joint Convention of Electrical and Information Related Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study of the relationship between the number of mixture components and spectral conversion accuracy in GMM-based voice conversion

    Kazuki Endo, Sunao Hara, Masanobu Abe

    The 64th Joint Convention of Electrical and Information Related Societies, Chugoku Branch (FY2013) 

     More details

    Event date: 2013.10.19

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Okayama University  

    researchmap

  • A study of the effects of spectral envelope and fundamental frequency conversion on speaker individuality

    Keisuke Kawai, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.9.25 - 2013.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Quality improvement of HMM speech synthesis using high-frequency bands of speech waveforms

    Takuma Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima, Hideyuki Mizuno

     More details

    Event date: 2013.9.25 - 2013.9.27

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Estimating automobile fuel consumption with a smartphone using acoustic and multi-sensor signals

    Shohei Nanba, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.5.16 - 2013.5.17

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluating the portability of a Bag-of-Words-based invalid-input rejection model for a speech-oriented guidance system

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    2013 Spring Meeting of ASJ  Acoustical Society of Japan

     More details

    Event date: 2013.3.13 - 2013.3.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    Venue:Tokyo University of Technology  

    researchmap

  • Evaluation of a method combining HMM-based and waveform-based speech synthesis

    Takuma Inoue, Sunao Hara, Masanobu Abe, Yusuke Ijima, Hideyuki Mizuno

     More details

    Event date: 2013.3.13 - 2013.3.15

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • The Individual Feature Analysis of the Network of the Stay Extracted from GPS Data

    Daisuke Fujioka, Sunao Hara, Masanobu Abe

     More details

    Event date: 2013.3.7 - 2013.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Examination of an event sampling process with the similar impression from the others' life log

    Kazuto Hamano, Sunao Hara, Masanobu Abe, Daisuke Fujioka, Rika Mochizuki, Tomoki Watanabe

     More details

    Event date: 2013.3.7 - 2013.3.8

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Evaluating the portability of a Bag-of-Words-based invalid-input rejection model for speech-oriented guidance systems

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    The 15th Young Researchers' Meeting of the ASJ Kansai Chapter  ASJ Kansai Chapter

     More details

    Event date: 2012.12.9

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Development of a toolkit handling multiple speech-oriented guidance agents for mobile applications International conference

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    The 4th International Workshop on Spoken Dialog Systems (IWSDS2012) 

     More details

    Event date: 2012.11.28 - 2012.11.30

    Language:English   Presentation type:Poster presentation  

    Venue:Paris, France  

    researchmap

  • Evaluation of invalid input discrimination using BOW for speech-oriented guidance system International conference

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    The 4th International Workshop on Spoken Dialog Systems (IWSDS2012) 

     More details

    Event date: 2012.11.28 - 2012.11.30

    Language:English   Presentation type:Poster presentation  

    Venue:Paris, France  

    researchmap

  • Evaluation of invalid-input rejection using the maximum entropy method in a speech-oriented information system

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    2012 Autumn Meeting of ASJ  Acoustical Society of Japan

     More details

    Event date: 2012.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study of network services for developing speech-oriented guidance systems for mobile devices

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A new paradigm of speech research in the cloud era

    Tomoyoshi Akiba, Koji Iwano, Jun Ogata, Tetsuji Ogawa, Nobutaka Ono, Takahiro Shinozaki, Koichi Shinoda, Hiroaki Nanjo, Hiromitsu Nishizaki, Masafumi Nishida, Ryuichi Nishimura, Sunao Hara, Takaaki Hori

    SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Symposium, workshop panel (nominated)  

    researchmap

  • Rejecting invalid inputs using Bag-of-Words features in a speech-oriented guidance system

    Haruka Majima, Rafael Torres, Hiromichi Kawanami, Sunao Hara, Tomoko Matsui, Hiroshi Saruwatari, Kiyohiro Shikano

    SIG Spoken Language Processing (SIG-SLP)  2012  IPSJ

     More details

    Event date: 2012.7

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Causal analysis of task completion errors in spoken music retrieval interactions International conference

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    LREC 2012  2012  ELDA

     More details

    Event date: 2012.5

    Language:English   Presentation type:Poster presentation  

    Venue:Istanbul, Turkey  

    researchmap

  • Multi-band speech recognition using band-dependent confidence measures of blind source separation International conference

    Atsushi Ando, Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The Acoustics 2012  2012 

     More details

    Event date: 2012.5

    Language:English   Presentation type:Poster presentation  

    Venue:Hong Kong  

    researchmap

  • An investigation of microphone input for speech-oriented guidance systems on mobile devices

    Kiyoyuki Naka, Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    IEICE General Conference, Information and Systems Society Student Poster Session  2012  IEICE

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Poster presentation  

    researchmap

  • Multi-band speech recognition using per-band source-separation confidence measures

    Atsushi Ando, Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2012 Spring Meeting of ASJ  2012  Acoustical Society of Japan

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Development of speech-oriented guidance service software for diverse usage environments

    Sunao Hara, Hiromichi Kawanami, Hiroshi Saruwatari, Kiyohiro Shikano

    IEICE General Conference  2012  IEICE

     More details

    Event date: 2012.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Multi-band speech recognition using confidence measures of blind source separation

    Atsushi Ando, Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Technical Committee on Speech (SP)  2012  IEICE

     More details

    Event date: 2012.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation International conference

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    ASRU 2011  2011  IEEE

     More details

    Event date: 2011.12

    Language:English   Presentation type:Poster presentation  

    Venue:Hawaii  

    researchmap

  • Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations International conference

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    APSIPA-ASC 2011  2011  APSIPA

     More details

    Event date: 2011.10

    Language:English   Presentation type:Poster presentation  

    Venue:Xi'an, China  

    researchmap

  • On-line detection of task incompletion for spoken dialog systems using utterance and behavior tag N-gram vectors International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    IWSDS 2011  2011 

     More details

    Event date: 2011.9

    Language:English   Presentation type:Poster presentation  

    Venue:Granada, Spain  

    researchmap

  • Detection of task-incomplete dialogs based on utterance-and-behavior tag N-gram for spoken dialog systems International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    Interspeech 2011  2011  ISCA

     More details

    Event date: 2011.8

    Language:English   Presentation type:Poster presentation  

    Venue:Florence, Italy  

    researchmap

  • Music recommendation system based on human-to-human conversation recognition International conference

    Hiromasa OHASHI, Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    HCIAmI'11  2011 

     More details

    Event date: 2011.7

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Nottingham, U.K.  

    researchmap

  • A music associative-playback system based on recognition of casual conversation

    Hiromasa Ohashi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Acoustic model training by generating acoustic features based on MLLR transformation matrices

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Detection and analysis of task-incomplete dialogues using utterance-and-behavior tag N-grams for spoken dialogue systems

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2011 Spring Meeting of ASJ  2011  Acoustical Society of Japan

     More details

    Event date: 2011.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Robust acoustic models based on acoustic features generated under MLLR transformation-matrix constraints

    Arata Itoh, Norihide Kitaoka, Sunao Hara, Kazuya Takeda

    Spoken Language Symposium  2010  IPSJ

     More details

    Event date: 2010.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A music recommendation system based on continuous recognition of casual conversation

    Hiromasa Ohashi, Norihide Kitaoka, Sunao Hara, Kazuya Takeda

    Technical Committee on Speech (SP)  2010  IEICE

     More details

    Event date: 2010.10

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Automatic detection of task-incompleted dialog for spoken dialog system based on dialog act N-gram International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    INTERSPEECH2010  2010.9 

     More details

    Event date: 2010.9

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Online detection of task-incomplete dialogues using utterance-sequence N-grams for spoken dialogue systems

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2010 Autumn Meeting of ASJ  2010  Acoustical Society of Japan

     More details

    Event date: 2010.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Rapid acoustic model adaptation using inverse MLLR-based feature generation International conference

    Arata ITO, Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    The 20th International Congress on Acoustics (ICA2010)  2010.8 

     More details

    Event date: 2010.8

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Estimation method of user satisfaction using N-gram-based dialog history model for spoken dialog system International conference

    Sunao HARA, Norihide KITAOKA, Kazuya TAKEDA

    7th conference on International Language Resources and Evaluation (LREC'10)  2010.5 

     More details

    Event date: 2010.5

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • Rapid model adaptation based on speech features generated by MLLR transformation matrices

    Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    2010 Spring Meeting of ASJ  2010  Acoustical Society of Japan

     More details

    Event date: 2010.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Associating document features with acoustic features for music associative playback

    Kazue Takahashi, Yasunori Ohishi, Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    The 4th Spoken Document Processing Workshop  2010 

     More details

    Event date: 2010.2

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A user-satisfaction estimation method using dialogue-history N-grams for spoken dialogue systems

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Spoken Language Symposium  2009  IPSJ

     More details

    Event date: 2009.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Speech recognition by optimal selection from multiple acoustic models

    Arata Itoh, Sunao Hara, Chiyomi Miyajima, Norihide Kitaoka, Kazuya Takeda

    2009 Tokai-Section Joint Conference of Electrical Societies  2009 

     More details

    Event date: 2009.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • A study on associating subjective and acoustic similarity between musical pieces

    Yusuke Hiraga, Yasunori Ohishi, Sunao Hara, Kazuya Takeda

    2009 Autumn Meeting of ASJ  2009  Acoustical Society of Japan

     More details

    Event date: 2009.9

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Construction and evaluation of network models for user satisfaction inference in spoken dialog systems

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Spring Meeting 2009  2009  Acoustical Society of Japan

     More details

    Event date: 2009.3

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • User models in satisfaction evaluation of speech recognition systems

    Sunao Hara, Norihide Kitaoka, Kazuya Takeda

    Spoken Language Symposium  2008  Information Processing Society of Japan

     More details

    Event date: 2008.12

    Language:Japanese   Presentation type:Oral presentation (general)  

    researchmap

  • Data collection and usability study of a PC-based speech application in various user environments International conference

    Sunao Hara, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Oriental-COCOSDA 2008  2008.11 

     More details

    Event date: 2008.11

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • In-car speech data collection along with various multimodal signals International conference

    Akira Ozaki, Sunao Hara, Takashi Kusakawa, Chiyomi Miyajima, Takanori Nishino, Norihide Kitaoka, Katunobu Itou, Kazuya Takeda

    The 6th International Conference on Language Resources and Evaluation (LREC'08)  2008.5 

     More details

    Event date: 2008.5

    Language:English   Presentation type:Oral presentation (general)  

    researchmap

  • DNN-based Voice Conversion with Auxiliary Phonemic Information to Improve Intelligibility of Glossectomy Patients' Speech International conference

    Hiroki Murakami, Sunao Hara, Masanobu Abe

    APSIPA Annual Summit and Conference 2019  2019.11  APSIPA

     More details

    Language:English   Presentation type:Oral presentation (general)  

    Venue:Lanzhou, China  

    researchmap

  • Construction of a multimodal database for improving the phonemic intelligibility of glossectomy patients' speech

    村上博紀, 荻野聖也, 原直, 阿部匡伸, 佐藤匡晃, 皆木省吾

    Spring Meeting 2018 of the Acoustical Society of Japan  2018.3.26  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • Field-experiment evaluation of a crowdsourcing-based method for identifying bustle sounds

    朝田興平, 原直, 阿部匡伸

    Spring Meeting 2018 of the Acoustical Society of Japan  2018.3.25  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • A study of duration models for adding emotion in DNN-based speech synthesis

    Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

    Spring Meeting 2018 of the Acoustical Society of Japan  2018.3.25  Acoustical Society of Japan

     More details

    Language:Japanese   Presentation type:Poster presentation  

    Venue:Nippon Institute of Technology, Miyashiro Campus  

    researchmap

  • An online customizable music retrieval system with a spoken dialogue interface International conference

    Sunao Hara, Chiyomi Miyajima, Katsunobu Itou, Kazuya Takeda

    4th Joint Meeting of ASA/ASJ  2006.11 

     More details

    Language:English   Presentation type:Poster presentation  

    researchmap

  • Preliminary Study of a Learning Effect on Users to Develop a New Evaluation of the Spoken Dialogue System International conference

    Sunao Hara, Ayako Shirose, Chiyomi Miyajima, Katsunobu Ito, Kazuya Takeda

    Oriental-COCOSDA 2005  2005.12 

     More details

    Language:English   Presentation type:Oral presentation (general)  

    researchmap


Works

  • ChartEx

    Sunao Hara

    2017.5

     More details

    Work type:Software   Location:GitHub  

    An Excel add-in for exporting charts as image files such as PNG, JPEG, and PDF.

    researchmap

  • TTX KanjiMenu Plugin

    Sunao Hara

    2007.3

     More details

    Work type:Software  

    researchmap

  • Pocket Julius

    Sunao Hara

    2003.1

     More details

    Work type:Software  

    This package is a demo of Pocket Julius, a port of the large-vocabulary speech recognition decoder Julius to the Microsoft Pocket PC 2002 environment.

    researchmap

Awards

  • Social Contribution Award

    2021.3   Faculty of Engineering, Okayama University  

     More details

  • Best Teacher Award

    2020.3   Faculty of Engineering, Okayama University  

     More details

  • Educational Contribution Award

    2019.3   Faculty of Engineering, Okayama University  

    Sunao Hara

     More details

  • Award for Contribution to Society Activities

    2019.3   Acoustical Society of Japan  

    Sunao Hara

     More details

  • FIT Encouragement Award

    2018.9   The 17th Forum on Information Technology  

    Sunao Hara

     More details

  • Outstanding Paper Award

    2016.8   Information Processing Society of Japan, DICOMO2016  

    小林将大, 原直, 阿部匡伸

     More details

  • Educational Contribution Award

    2016.3   Faculty of Engineering, Okayama University  

    Sunao Hara

     More details

  • FY2013 Science and Technology Award

    2013.7   Okayama Foundation for Science and Technology  

    Sunao Hara

     More details

  • Poster Award, Autumn Meeting 2004

    2004.9   Acoustical Society of Japan  

    Sunao Hara

     More details


Research Projects

  • Technical research on online active learning supported by collaborative live recording

    Grant number:21K12155  2021.04 - 2024.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Ryuichi Nishimura, Sunao Hara

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,030,000 (Direct expense: ¥3,100,000, Indirect expense: ¥930,000)

    researchmap

  • Research on DNN-based speech synthesis methods capable of high-quality expression of emotion and speaker individuality

    Grant number:21K11963  2021.04 - 2024.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Masanobu Abe, Sunao Hara

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,160,000 (Direct expense: ¥3,200,000, Indirect expense: ¥960,000)

    researchmap

  • Research on deep learning methods based on simple annotations for visualizing the atmosphere of tourist spots

    Grant number:20K12079  2020.04 - 2023.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Sunao Hara

      More details

    Authorship:Principal investigator 

    Grant amount: ¥4,290,000 (Direct expense: ¥3,300,000, Indirect expense: ¥990,000)

    researchmap

  • Development of a PBL instruction support system that measures learner activity from acoustic signals

    Grant number:18K02862  2018.04 - 2022.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Ryuichi Nishimura, Sunao Hara

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,420,000 (Direct expense: ¥3,400,000, Indirect expense: ¥1,020,000)

    Applying sound information processing technologies, this study develops a support system for instructors of PBL (Project-Based Learning), which is increasingly being introduced at universities and other institutions of higher education.
    As part of the algorithm for extracting learner activity, we investigated a sound classification method using a CNN autoencoder. In particular, experiments with two corpora recorded in different environments showed that a model with higher classification performance can be trained by interposing autoencoder-based feature extraction, rather than by simply training on a mixture of the two corpora. We also recorded meeting data containing both video and audio, and investigated estimating participants' utterance intentions and the quality of their engagement by combining visual and audio features.
    As an effort toward PBL data collection and collaborative annotation, we developed a system that records the participation of individual students engaged in group work. With this system, an instructor can record a student's participation status by touching the student's face image presented on a tablet. We designed an interface that can record the number of utterances and back-channel responses as well as the degree of contribution to the group work. The face images presented by the system are extracted from panoramic images captured by a 360-degree omnidirectional video camera. In our experiments, we asked cooperating participants to carry out actual group work on camera. To facilitate use of the collected data, we also developed a voice conversion system that anonymizes speech. The proposed system realizes (a) recording of dialog speech with a microphone array, (b) extraction of the features contained in each speaker's voice, and (c) voice conversion by deep learning. For (c), we used Cycle-Consistent Adversarial Networks, a generative adversarial network algorithm based on neural networks.

    researchmap

  • Research on improving the phonemic intelligibility of glossectomy patients' speech using deep neural networks

    Grant number:18K11376  2018.04 - 2022.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Masanobu Abe, Sunao Hara, Shogo Minagi

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,290,000 (Direct expense: ¥3,300,000, Indirect expense: ¥990,000)

    The work carried out in FY2019 on the topics described in the research proposal is as follows.
    (Topic 1) Improving phonemic intelligibility. For "1-1: Modeling coarticulation constraints with a DNN", a comparison of bidirectional and forward LSTMs showed that the bidirectional model contributes more to naturalness and overall quality than to intelligibility. For "1-2: Investigating lip-information representations", since spectral features of speech and lip-image features are heterogeneous, we pursued a DNN-based approach. First, using lip contours acquired with a Kinect together with spectral features improved the intelligibility of vowels, presumably because vowels have longer durations than consonants. Next, exploiting the characteristics of DNNs, we investigated using phonemic information as an auxiliary input. In experiments where the correct phonemic information was provided to confirm the potential for improvement, the intelligibility of consonants such as plosives and fricatives improved.
    (Topic 2) Reducing patients' speaking burden. For "2-2: Constructing a database of simulated glossectomy speech", using the plate covering the lower teeth produced in the previous year, we constructed a database of simultaneously recorded speech and lip video from three simulated glossectomy speakers.
    (Topic 3) Real-time operation. For "3-2: Handling analysis errors using differential spectra", we found that a voice conversion method using WORLD analysis, spectral estimation by a three-layer DNN, and a residual signal can run in real time with a delay of 100 milliseconds.

    researchmap

  • Research and development of a platform for surveying the "degree of liveliness" from acoustic signals for regional revitalization policymaking

    2015.07 - 2018.03

    Ministry of Internal Affairs and Communications  Strategic Information and Communications R&D Promotion Programme 

    Masanobu Abe, Sunao Hara

      More details

    Authorship:Coinvestigator(s)  Grant type:Competitive

    researchmap

  • Development of activity sound visualization method for personal evaluation of PBL

    Grant number:15K01069  2015.04 - 2018.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Ryuichi Nishimura, Sunao Hara

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,680,000 (Direct expense: ¥3,600,000, Indirect expense: ¥1,080,000)

    In this study, we developed methods to support the evaluation of students participating in PBL (Project-Based Learning) on the basis of sound information visualization technologies. We examined a method for detecting activated communication in group work from dialog speech, and developed a prototype system that presents the overall state of a group work session using wearable voice-recording terminals. In addition, we investigated sound-source information visualization methods based on deep neural networks.

    researchmap

  • A study on monitoring systems with privacy protection control

    Grant number:15K00128  2015.04 - 2018.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Scientific Research (C)

    Masanobu Abe, Sunao Hara

      More details

    Authorship:Coinvestigator(s) 

    Grant amount: ¥4,420,000 (Direct expense: ¥3,400,000, Indirect expense: ¥1,020,000)

    In this study, we developed a monitoring system that takes privacy issues into account. The system uses a "living area" and controls the degree of monitoring and privacy protection by changing the granularity of that area. The living area is defined as the set of the home, frequently visited places of stay, and the travel routes connecting them. We proposed an algorithm that generates the living area from GPS data collected over a long period. Experimental results show that the proposed algorithm can estimate the living area with a precision of 0.85.
    We also conducted a questionnaire survey on user preferences regarding the monitoring and privacy-protection levels based on these living areas. The results showed that people on the monitoring side wanted the system to allow them to monitor in detail. Conversely, for people being monitored, the more detailed the monitoring, the stronger the feeling of being intrusively surveilled.

    researchmap

  • Study on spoken dialogue system with safety consideration based on automatic estimation of driving situation

    Grant number:26730092  2014.04 - 2017.03

    Japan Society for the Promotion of Science  Grants-in-Aid for Scientific Research  Grant-in-Aid for Young Scientists (B)

    Sunao Hara

      More details

    Authorship:Principal investigator 

    Grant amount: ¥3,640,000 (Direct expense: ¥2,800,000, Indirect expense: ¥840,000)

    We conducted the following:
    (1) We used biosignals and driving-information signals measured from the driver's body and the vehicle to estimate the road being driven, and further estimated driving load assuming the use of sensors available on smartphones. (2) We evaluated the spoken dialog strategy from the viewpoint of the user's driving load; an objective evaluation by computer simulation considered both dialog initiative and the existence of confirmation utterances. (3) We conducted a subjective evaluation of the proposed dialog strategy as a spoken dialog system, in which subjects drove a simulator while talking with the system. (4) We introduced a dialog strategy based on graph search to realize a strategy that considers the estimated mental load of the user, and evaluated the proposed system both objectively and subjectively.

    researchmap


 

Class subject in charge

  • Digital Signal Processing (2021 academic year) Fourth semester  - Mon 1, Mon 2, Thu 1, Thu 2

  • Exercises on Programming 1 (2021 academic year) 1st semester  - Wed 1, Wed 2, Wed 3

  • Exercises on Programming 2 (2021 academic year) Second semester  - Wed 1, Wed 2, Wed 3

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2021 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2021 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2021 academic year) Second semester  - Mon 1-2

  • Introduction to Information Processing 2 (2021 academic year) Second semester  - Thu 1-2

  • Information Technology Experiments B (Media Processing) (2021 academic year) Third semester  - Tue 3, Tue 4, Tue 5, Tue 6, Tue 7, Thu 3, Thu 4, Thu 5, Thu 6, Thu 7

  • Advanced Research on Speech Processing I (2021 academic year) First semester  - Thu 5-6

  • Advanced Research on Speech Processing II (2021 academic year) First semester  - Thu 5-6

  • Exercises on Programming (2020 academic year) 1st and 2nd semester  - Wed 1, Wed 2, Wed 3

  • Exercises on Programming 1 (2020 academic year) 1st semester  - Wed 1, Wed 2, Wed 3

  • Exercises on Programming 2 (2020 academic year) Second semester  - Wed 1, Wed 2, Wed 3

  • Technical English for Interdisciplinary Medical Sciences and Engineering (2020 academic year) Second semester  - Other

  • Research Works for Interdisciplinary Medical Sciences and Engineering (2020 academic year) Year-round  - Other

  • Introduction to Information Processing 2 (2020 academic year) Second semester  - Mon 1, Mon 2

  • Introduction to Information Processing 2 (2020 academic year) Second semester  - Thu 1, Thu 2

  • Information Technology Experiments B (Media Processing) (2020 academic year) Third semester  - Tue 3, Tue 4, Tue 5, Tue 6, Thu 3, Thu 4, Thu 5, Thu 6

  • Laboratory Work on Information Technology III (2020 academic year) Third semester  - Tue 3, Tue 4, Tue 5, Tue 6

  • Laboratory Work on Information Technology IV (2020 academic year) Third semester  - Thu 3, Thu 4, Thu 5, Thu 6

  • Advanced Research on Speech Processing I (2020 academic year) First semester  - Thu 5, Thu 6
