Speech+Music data
The signal and annotation for music part detection are available!
We would also like to post the Tony Awards data here someday.
Speech+Music data for scene detection
The speech data are taken from the ATR speech database (ATR/SDB) [1], and the popular music is taken from the RWC Music Database [2].
Format: WAV
Sampling rate: 22050 Hz
Quantization: 16 bit
Signal is here.
Annotation is here.
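As a quick sanity check after downloading, the header of a signal file can be compared against the format stated above. The following is a minimal sketch using only Python's standard `wave` module. The function names (`describe_wav`, `matches_stated_format`, `write_silence`) are hypothetical helpers introduced for illustration, and the demo runs on a synthetic silent file because the actual file names of the signal data are not given here. The number of channels is not stated above, so it is reported but not checked.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 22050        # sampling rate stated in the data description
EXPECTED_SAMPLE_WIDTH = 2       # 16 bit quantization = 2 bytes per sample


def describe_wav(path):
    """Return basic header fields of a WAV file as a dict."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return {
            "rate_hz": rate,
            "sample_width_bytes": wf.getsampwidth(),
            "channels": wf.getnchannels(),
            "duration_s": wf.getnframes() / rate,
        }


def matches_stated_format(path):
    """True if the file has the stated sampling rate and bit depth."""
    info = describe_wav(path)
    return (info["rate_hz"] == EXPECTED_RATE_HZ
            and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH)


def write_silence(path, rate_hz=EXPECTED_RATE_HZ, seconds=1.0):
    """Write a mono 16-bit silent WAV file (a stand-in for the real signal)."""
    n_frames = int(rate_hz * seconds)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(EXPECTED_SAMPLE_WIDTH)
        wf.setframerate(rate_hz)
        wf.writeframes(b"\x00\x00" * n_frames)


if __name__ == "__main__":
    # Demo on a synthetic file; replace demo_path with a downloaded signal file.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo_path)
    print(describe_wav(demo_path))
    print("matches stated format:", matches_stated_format(demo_path))
```

To check the real data, pass the path of a downloaded signal file to `matches_stated_format` instead of the synthetic demo file.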
References
[1] A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara and K. Shikano:
"ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis",
Speech Communication, Vol. 9, No. 4, pp. 357-363 (1990).
[2] M. Goto, H. Hashiguchi, T. Nishimura and R. Oka:
"RWC Music Database: Popular, Classical, and Jazz Music Databases",
Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR 2002), pp. 287-288 (2002).