Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
We provide a new solution to the problem of feature variations caused by the overlapping of sounds in instrument identification in polyphonic music. When multiple instruments simultaneously play,...
Hisashi K, Tetsuya Ogata, Toru Takahashi, Kazunori Komatani, Hiroshi G. Okuno
This paper shows a continuous vocal imitation system using a computational model that explains the process of phoneme acquisition by infants. Human infants perceive speech sounds as continuous...
Shun Shiramatsu, Kazunori Komatani, Kôiti Hasida, Tetsuya Ogata, Hiroshi G. Okuno
This paper presents a quantitative modeling of referential coherence by which conversation systems measure the smoothness of discourse. Investigations of the corpora show that referential coherence...
Two-Channel-Based Voice Activity Detection for Humanoid Robots in Noisy Home Environments (2009)
Hyun-don Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract—The purpose of this research is to accurately classify the speech signals originating from the front even in noisy home environments. This ability can help robots to improve speech...
Kouhei Sumi, Katsutoshi Itoyama, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper presents a method that identifies musical chords in polyphonic musical signals. As musical chords mainly represent the harmony of music and are related to other musical elements such as melody...
Hyun-don Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — We propose a way to evaluate various sound localization systems for moving sounds under the same conditions. To construct a database for moving sounds, we developed a moving sound...
A Robot Listens to Music and Counts Its Beats Aloud by Separating Music from Counting Voice (2009)
Takeshi Mizumoto, Ryu Takeda, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — This paper presents a beat-counting robot that can
Hisashi K, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno
Abstract — This paper proposes a computational model for phoneme acquisition by infants. Human infants perceive speech sounds not as discrete phoneme sequences but as continuous acoustic signals....
Active Sensing based Dynamical Object Feature Extraction (2009)
Shun Nishide, Tetsuya Ogata, Ryunosuke Yokoya, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract — This paper presents a method to autonomously extract object features that describe their dynamics from active sensing experiences. The model is composed of a dynamics learning module and...
Katsutoshi Itoyama, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a music remixing interface, called Instrument Equalizer, that allows users to control the volume of each instrument part within existing audio recordings in real time. Although...
Takehiro Abe, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper presents an analysis-manipulation method that can generate musical instrument sounds with arbitrary pitches and durations from the sound of a given musical instrument (called seed) without...
Spatially Mapping of Friendliness for Human-Robot Interaction (2009)
Tsuyoshi Tasaki, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — It is important that robots interact with multiple people. However, most research has dealt with only interaction between one robot and one person and assumed that the distance between...
Kazunori Komatani, Yuichiro Fukubayashi, Tetsuya Ogata, Hiroshi G. Okuno
A method is presented that helps novice users understand the language expressions that a system can accept, even from unacceptable utterances made that may contain automatic speech recognition...
Hyun-don Kim, Jinsung Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — In normal human communication, people face the speaker when listening and usually pay attention to the speaker’s face. Therefore, in robot audition, the recognition of the front talker...
Yuichiro Fukubayashi, Kazunori Komatani, Mikio Nakano, Kotaro Funakoshi, Tetsuya Ogata, Hiroshi G. Okuno
Language understanding (LU) modules for spoken dialogue systems in the early phases of their development need to be (i) easy to construct and (ii) robust against various expressions. Conventional...
Yuji Kubota, Masatoshi Yoshida, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
If machine audition can recognize an auditory scene containing simultaneous and moving talkers, what kinds of awareness will people gain from an auditory scene visualizer? This paper presents the...
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, ...
Abstract—This paper presents a hybrid music recommender system that ranks musical pieces while efficiently maintaining collaborative and content-based data, i.e., rating scores given by users and...
Katsutoshi Itoyama, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a sound source separation method for polyphonic sound mixtures of music to build an instrument equalizer for remixing multiple tracks separated from compact-disc recordings by...
Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
We present methods for automatic speaker identification in noisy environments. To improve noise robustness of speaker identification, we developed two methods, the harmonic structure extraction...
Hiromasa Fujihara, Masataka Goto, Jun Ogata, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
synchronization between lyrics and music CD recordings
Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno
We exploit the barge-in rate of individual users to predict automatic speech recognition (ASR) errors. A barge-in is a situation in which a user starts speaking during a system prompt, and it can be...
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
We aimed at improving the efficiency and scalability of a hybrid music recommender system based on a probabilistic generative model that integrates both collaborative data (rating scores provided by...
Robot Motion Control using Listener’s Back-Channels and Head Gesture Information (2008)
Tsuyoshi Tasaki, Takeshi Yamaguchi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
A novel method is described for robot gestures and utterances during a dialogue based on the listener’s understanding and interest, which are recognized from back-channels and head gestures....
Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a method for estimating F0s of vocal from polyphonic audio signals. Because melody is sung by a singer in many musical pieces, the estimation of F0s of the vocal part is useful...
Vocal Imitation Using Physical Vocal Tract Model (2008)
Hisashi K, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno
Abstract — A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to generate speech...
Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World (2008)
Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, ...
Abstract — This paper presents a robot audition system that recognizes simultaneous speech in the real world by using robot-embedded microphones. We have previously reported Missing Feature Theory...
Satoshi Ikeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
In a multi-domain spoken dialogue system, a user’s utterances are more prone to be out-of-grammar, because this kind of system deals with more tasks than a single-domain system. We defined a topic...
Discovery of Other Individuals by Projecting a Self-Model Through Imitation (2008)
Ryunosuke Yokoya, Tetsuya Ogata, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract — This paper proposes a novel model which enables a humanoid robot infant to discover other individual (e.g. human parent). In this work, the authors define “other individual” as an...
Multiple Moving Speaker Tracking by Microphone Array on Mobile Robot (2008)
Masamitsu Murase, Shunichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, ...
Real-world applications often require tracking multiple moving speakers for improving human-robot interactions and/or sound source separation. This paper presents multiple moving speaker tracking...
Analyzing Temporal Transition of Real User’s Behaviors in a Spoken Dialogue System (2008)
Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno
Managing various behaviors of real users is indispensable for spoken dialogue systems to operate adequately in real environments. We have analyzed various users’ behaviors using data collected...
Ryu Takeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Robot audition systems require capabilities for sound source separation and the recognition of separated sounds, since we hear a mixture of sounds in our daily lives, especially mixed of speech. We...
Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — This paper describes a new semi-blind source separation (semi-BSS) technique with independent component analysis (ICA) for enhancing a target source of interest and for suppressing other...
Dynamic help generation by estimating user’s mental model in spoken dialogue systems (2008)
Yuichiro Fukubayashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
In a speech interface, a gap between a user’s mental model and actual structures of systems tends to be large because the amount of information conveyed by speech is limited. We address dynamic...
Ryu Takeda, Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, ...
This paper addresses automatic speech recognition (ASR) for robots integrated with sound source separation (SSS) by using leak noise based missing feature mask generation. The missing feature theory...
Two-way Translation of Compound Sentences and Arm Motions by Recurrent Neural Networks (2008)
Tetsuya Ogata, Masamitsu Murase, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract- We present a connectionist model that combines motions and language based on the behavioral experiences of a real robot. Two models of recurrent neural network with parametric bias (RNNPB)...
Hyun-don Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract—The purpose of this research is to develop techniques that enable robots to choose and track a desired person for interaction in daily-life environments. Therefore, localizing multiple...
Kazuyoshi Yoshii, Kazuhiro Nakadai, Toyotaka Torii, Yuji Hasegawa, Hiroshi Tsujino, Kazunori Komatani, ...
Abstract — We aim at enabling a biped robot to interact with humans through real-world music in daily-life environments, e.g., to autonomously keep its steps (stamps) in time with musical beats. To...
Object Dynamics Prediction and Motion Generation based on Reliable (2008)
Shun Nishide, Tetsuya Ogata, Ryunosuke Yokoya, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract — Consistency of object dynamics, which is related to reliable predictability, is an important factor for generating object manipulation motions. This paper proposes a technique to...
Instrument Identification in Polyphonic Music: Feature Weighting to Minimize (2007)
Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper addresses the problem of identifying musical instruments in polyphonic music. Musical instrument identification (MII) is an important task in music information retrieval because MII...
Kazuhiro Nakadai, Mikio Nakano, Hiroshi Tsujino, Jean-Marc Valin, Kazunori Komatani, Tetsuya Ogata, ...
This paper addresses robot audition that can cope with speech that has a low signal-to-noise ratio (SNR) in real time by using robot-embedded microphones. To cope with such a noise, we exploited two...
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper presents a hybrid music recommendation method that solves problems of two prominent conventional methods: collaborative filtering and content-based recommendation. The former cannot...
Experience based imitation using RNNPB (2006)
Ryunosuke Yokoya, Tetsuya Ogata, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract — Robot imitation is a useful and promising alternative to robot programming. Robot imitation involves two crucial issues. The first is how a robot can imitate a human whose physical...
Ryu Takeda, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Abstract — Robot audition is a critical technology in making robots symbiosis with people. Since we hear a mixture of sounds in our daily lives, sound source localization and separation, and...
Automatic Feature Weighting in Automatic Transcription of Specified Part (2006)
Katsutoshi Itoyama, Tetsuro Kitahara, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
We studied the problem of automatic music transcription (AMT) for polyphonic music. AMT is an important task for music information retrieval because AMT results enable retrieving musical pieces,...
Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a new technique for recognizing musical instruments in polyphonic music. Because the conventional framework for musical instrument recognition in polyphonic music had to estimate...
Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
Instrumentation is an important cue in retrieving musical content. Conventional methods for instrument recognition performing notewise require accurate estimation of the onset time and fundamental...
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper presents a framework for correcting errors of automatic drum sound detection focusing on the periodicity of drum patterns. We define drum patterns as periodic structures found in onset...
Robot Gesture Generation from Environmental Sounds Using Inter-modality Mapping (2005)
Yuya Hattori, Hideki Kozima, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
We propose a motion generation model in which robots presume the sound source of an environmental sound and imitate its motion. Sharing environmental sounds between humans and robots enables them to...
Making a robot recognize three simultaneous sentences in real-time (2005)
Kazuhiro Nakadai, Jean-Marc Valin, Jean Rouat, François Michaud, Kazunori Komatani, Tetsuya Ogata, ...
Abstract — A humanoid robot under real-world environments usually hears mixtures of sounds, and thus three capabilities are essential for robot audition; sound source localization, separation, and...
Singer identification based on accompaniment sound reduction and reliable frame selection (2005)
Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a method for automatic singer identification from polyphonic musical audio signals including sounds of various instruments. Because singing voices play an important role in...
Extracting multi-modal dynamics of objects using RNNPB (2005)
Tetsuya Ogata, Hayato Ohba, Jun Tani, Kazunori Komatani, Hiroshi G. Okuno
Abstract — Dynamic features play an important role in recognizing objects that have similar static features in colors and/or shapes. This paper focuses on active sensing that exploits dynamic feature...
Kazunori Komatani, Naoyuki K, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes the incorporation of contextual information into spoken dialogue systems in the database search task. Appropriate dialogue modeling is required to manage automatic speech...
Shun Shiramatsu, Kazunori Komatani, Takashi Miyata, Koiti Hasida, Hiroshi G. Okuno
PACLIC 19 / Taipei, Taiwan / December 1-3, 2005
Automatic chord transcription with concurrent recognition of chord symbols and boundaries (2004)
Takuya Yoshioka, Tetsuro Kitahara, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno
This paper describes a method that recognizes musical chords from real-world audio signals in compact-disc recordings. The automatic recognition of musical chords is necessary for music information...
Dynamic Communication of Humanoid Robot with multiple (2004)
Tsuyoshi Tasaki, Shohei Matsumoto, Hayato Ohba, Mitsuhiko Toda, Kazunori Komatani, Tetsuya Ogata, ...
Research on human-robot interaction is getting an increasing amount of attention. Since almost all the research has dealt with only communication between one robot and one person, there have been...
Flexible guidance generation using user model in spoken dialogue systems (2003)
Kazunori Komatani, Shinichi Ueno, Tatsuya Kawahara, Hiroshi G. Okuno
We address appropriate user modeling in order to generate cooperative responses to each user in spoken dialogue systems. Unlike previous studies that focus on user’s knowledge or typical kinds of...
Kazunori Komatani, Fumihiro Adachi, Shinichi Ueno, Tatsuya Kawahara, Hiroshi G. Okuno
We realize a telephone-based collaborative natural language dialogue system.
Spoken Dialogue Systems for Information Retrieval with Domain-Independent Dialogue Strategies (2002)
Kyoto University
Efficient dialogue strategy to find users’ intended items from information query results (2002)
Kazunori Komatani, Tatsuya Kawahara, Ryosuke Ito, Hiroshi G. Okuno
We address a dialogue framework that narrows down the user's query results obtained by an information retrieval system. The follow-up dialogue to constrain query results is significant especially...
Kazunori Komatani, Tatsuya Kawahara
We present a method to realize flexible mixed-initiative dialogue, in which the system can make effective confirmation and guidance using concept-level confidence measures (CMs) derived from speech...