The signal and annotation for music part detection are available!
Speech+Music data for scene detection
Speech data from the ATR speech database (ATR/SDB) and popular music from the RWC music database.
Sampling Rate: 22050 [Hz]
Quantization: 16 [bit]
Signal is here.
Annotation is here.
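The format listed above can be checked before the signal is used. The following is a minimal sketch, assuming the signal is distributed as a WAV file, which this page does not state; the function name `check_wav_format` and the file name in the usage note are hypothetical.

```python
import wave

# Values taken from the format listed on this page.
EXPECTED_RATE_HZ = 22050
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bit


def check_wav_format(path):
    """Return (ok, sampling_rate_hz, bits_per_sample) for a WAV file.

    Assumption: the signal is a RIFF/WAV file. A headerless raw PCM file
    would have to be read differently.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    ok = rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
    return ok, rate, width * 8
```

For example, `check_wav_format("signal.wav")` would return `(True, 22050, 16)` for a file that matches the listed format, where `signal.wav` stands in for the downloaded signal.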
A. Kurematsu, K. Takeda, Y. Sagisaka, S. Katagiri, H. Kuwabara and K. Shikano: "ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis", Speech Communication, Vol. 9, No. 4, pp. 357-363 (1990).
M. Goto, H. Hashiguchi, T. Nishimura and R. Oka: "RWC Music Database: Popular, Classical, and Jazz Music Databases", Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR 2002), pp. 287-288 (2002).