
Chapter 4: Music Structure Analysis


In Chapter 4 of [Müller, FMP, Springer 2015], we address a central and well-researched area within music information retrieval (MIR) known as music structure analysis. Given a music recording, the objective is to identify important structural elements and to temporally segment the recording according to these elements. Within this scenario, we discuss fundamental segmentation principles based on repetition, homogeneity, and novelty, principles that also apply to other types of multimedia beyond music. As an important technical tool, we study in detail the concept of self-similarity matrices and discuss their structural properties. Finally, we briefly touch on the topic of evaluation, introducing the notions of precision, recall, and F-measure. These measures are used to compare the results computed by an automated procedure with so-called ground truth annotations, which are typically generated manually by a domain expert.
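To make the notion of a self-similarity matrix more concrete, here is a minimal sketch. It assumes one common choice of similarity measure, the inner product of features normalized to unit Euclidean length (cosine similarity). The function name `compute_ssm` and the toy feature sequence are illustrative assumptions and are not taken from the book or its notebooks.

```python
import numpy as np

def compute_ssm(X):
    """Return the self-similarity matrix of a feature sequence.

    X has shape (K, N): K feature dimensions, N frames.
    Each column is normalized to unit Euclidean length, so that
    S[n, m] is the cosine similarity between frames n and m.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # keep all-zero frames as they are (avoids division by zero)
    X_norm = X / norms
    return X_norm.T @ X_norm

# Toy sequence with musical structure A B A (hypothetical features):
# all "A" frames share one feature vector, all "B" frames another.
A = np.array([1.0, 0.0, 0.0])
B = np.array([0.0, 1.0, 0.0])
X = np.stack([A, A, B, B, A, A], axis=1)  # shape (3, 6)

S = compute_ssm(X)
print(np.round(S, 2))
```

In the printed matrix, frames belonging to the same part form blocks of high similarity, including the off-diagonal blocks that relate the first and the second occurrence of the A part. With real audio features such as chroma or MFCCs, the same computation applies, but the resulting structures are typically much noisier.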

4.1 General Principles
4.2 Self-Similarity Matrices
4.3 Audio Thumbnailing
4.4 Novelty-Based Segmentation
4.5 Evaluation
4.6 Further Notes

Notebooks

| Topic | Relation to [Müller, FMP, Springer 2015] & Description | HTML | IPYNB |
| --- | --- | --- | --- |
| Music Structure Analysis: General Principles | [Section 4.1] Repetition; homogeneity; novelty; segment; part; structure annotation (format, read, convert); feature representation; chroma; harmony; MFCC; timbre; tempogram; rhythm; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| Self-Similarity Matrix (SSM) | [Section 4.2.1] SSM; recurrence plot; path structure; repetition; block structure; homogeneity; cell; score; segment; path; block; induced segment; step size condition; overview of SSM computation; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| SSM: Synthetic Generation | [Section 4.2.1, Exercise 4.9] SSM; annotation; discretization; SSM from annotations; path structure; block structure; distortions; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| SSM: Feature Smoothing | [Section 4.2.2.1] Average filtering; median filtering; block structure; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| SSM: Path Enhancement | [Section 4.2.2.2] Diagonal smoothing; tempo invariance; multiple filtering; forward–backward smoothing; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| SSM: Transposition Invariance | [Section 4.2.2.3] Cyclic transposition; transposed SSM; transposition index; Zager and Evans example (In the Year 2525) | [html] | [ipynb] |
| SSM: Thresholding | [Section 4.2.2.4, Exercise 4.5] Global threshold; scaling; penalty; relative threshold; local threshold; Brahms example (Hungarian Dance No. 5); Zager and Evans example (In the Year 2525) | [html] | [ipynb] |
| Audio Thumbnailing | [Section 4.3] Audio thumbnail; repetition; normalized SSM; segment family; path family; induced segmentation; score; coverage; optimal path family; accumulated score matrix; dynamic programming; fitness measure; thumbnail selection; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| Scape Plot Representation | [Section 4.3.2, Exercise 4.12] Segment; center; triangular representation; fitness scape plot; (normalized) score; (normalized) coverage; Brahms example (Hungarian Dance No. 5); Beatles examples | [html] | [ipynb] |
| Novelty-Based Segmentation | [Section 4.4.1] SSM; block structure; checkerboard kernel (box-like, Gaussian); novelty function; kernel size; Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| Structure Feature | [Section 4.4.2, Exercise 4.13] Lag; time–lag representation; circular time–lag representation; structure feature; novelty function; synthetic examples; median filter; smoothing filter; Gaussian kernel; Chopin example (Op. 28, No. 11); Brahms example (Hungarian Dance No. 5) | [html] | [ipynb] |
| Evaluation | [Section 4.5] Ground truth; reference; estimation; item; relevant; true positive; false negative; false positive; precision; recall; F-measure; pairwise precision, recall, and F-measure; label sequence; boundary annotation; tolerance | [html] | [ipynb] |
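
The evaluation measures named in the last entry can be illustrated with a small boundary-evaluation sketch. It assumes that boundaries are given as time positions in seconds and that an estimated boundary counts as a true positive if it lies within a tolerance of a reference boundary that has not been matched before. The function name `evaluate_boundaries`, the simple greedy one-to-one matching, and the example values are illustrative assumptions, not the procedure of the book or its notebooks.

```python
def evaluate_boundaries(reference, estimate, tolerance=0.5):
    """Precision, recall, and F-measure for boundary positions (in seconds).

    Reference and estimated boundaries are matched one-to-one, in temporal
    order, whenever they differ by at most `tolerance`.
    """
    reference = sorted(reference)
    estimate = sorted(estimate)
    tp = 0
    i = j = 0
    while i < len(reference) and j < len(estimate):
        diff = estimate[j] - reference[i]
        if abs(diff) <= tolerance:
            tp += 1      # matched pair: true positive
            i += 1
            j += 1
        elif diff < 0:
            j += 1       # estimate lies too early for all remaining references
        else:
            i += 1       # reference lies too early for all remaining estimates
    fp = len(estimate) - tp    # estimated boundaries without a match
    fn = len(reference) - tp   # reference boundaries that were missed
    precision = tp / (tp + fp) if estimate else 0.0
    recall = tp / (tp + fn) if reference else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure

# Hypothetical ground truth and estimated boundaries (in seconds)
reference = [10.0, 20.0, 30.0, 40.0]
estimate = [10.2, 19.0, 31.0, 39.8, 50.0]

for tol in (0.5, 1.0):
    P, R, F = evaluate_boundaries(reference, estimate, tolerance=tol)
    print(f"tolerance={tol}: P={P:.2f}, R={R:.2f}, F={F:.2f}")
```

Enlarging the tolerance turns near misses into true positives, so the reported values depend strongly on this parameter. It should therefore always be stated alongside the results.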