
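MIREX stands for Music Information Retrieval Evaluation eXchange, i.e. an evaluation campaign for music information retrieval. I am not sure what "eXchange" is meant to convey here; perhaps something like an exchange of ideas. The contest is hosted by IMIRSEL, and each subtask is designed and managed by its task organizers, who are basically the leading experts in their respective fields.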


  • Audio Classification (Train/Test) Tasks

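This covers the following subtasks: 1. genre classification for US pop music, Latin music, and K-pop; 2. music mood classification and K-pop mood classification; 3. classical composer identification. The task has been running for many years and accuracy seems to have hit a plateau: across the different subtasks it stays roughly between 0.65 and 0.8.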


  • Audio Music Similarity and Retrieval

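Audio similarity and retrieval: given 7000 song clips of 30 s each, return a sparse matrix that lists, for every song, its 100 most similar songs together with their similarity scores. As for the use case: A music similarity system can help a music consumer find new music by finding the music that is most musically similar to specific query songs (or is nearest to songs that the consumer already likes). It is not really clear to me which criterion this similarity is measured by: one or several of beat, tempo, mode, rhythm, melody, harmony, and chords.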
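To make the "top 100 per song" output concrete, here is a minimal sketch that trims a full pairwise similarity table down to the N best matches per query. The dictionary layout, the function name `top_n_similar`, and the toy scores are assumptions for illustration only; they are not the official MIREX submission format, and the sketch says nothing about how the similarity scores themselves should be computed.

```python
import heapq

def top_n_similar(similarity, n=100):
    """Keep only the n most similar other songs for each query song.

    `similarity` maps query id -> {candidate id -> similarity score}
    (a hypothetical layout, higher score = more similar).
    """
    result = {}
    for query, row in similarity.items():
        # Exclude the query itself, then pick the n highest scores.
        candidates = ((score, other) for other, score in row.items() if other != query)
        best = heapq.nlargest(n, candidates)
        result[query] = [(other, score) for score, other in best]
    return result

# Toy data: three songs with made-up similarity scores.
toy = {
    "a": {"a": 1.0, "b": 0.2, "c": 0.9},
    "b": {"a": 0.2, "b": 1.0, "c": 0.4},
    "c": {"a": 0.9, "b": 0.4, "c": 1.0},
}
sparse = top_n_similar(toy, n=1)
```

With `n=1`, each song keeps only its single nearest neighbour, so the result is sparse in the same sense as the 7000 x 100 output described above.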



  • Symbolic Melodic Similarity

This task computes melodic similarity, which presumably means comparing melodies through their symbolic (MIDI) representation. Retrieve the most similar items from a collection of symbolic pieces, given a symbolic query, and rank them by melodic similarity. There will be only 1 task this year which comprises a set of six "base" monophonic MIDI queries to be matched against a monophonic MIDI collection. The data looks something like the following structure:

CUT[Das Hildebrandslied]
REG[Europa, Mitteleuropa, Deutschland]
KEY[A0001  04  G 4/2]
MEL[1_  3b_3b_4_4_  5__5__
    0_5__5_  5_6_7b_5_  5__0_
    5_  5_6_7b_5_  6b__5__
    0_5_4_3b_  5_3b_3b__
    0_3b_3b_3b_  4_4_5__  5__0_
    5_  4_3b_3b_3b_  2__1__
    0_5_5_.4  3b__0_
    5_  6b_5_5_3b_  4__5__
    0_4_3b3b1_  1_-6_-7__  1__. //] >>
FCT[Romanze, Ballade, Lied]



  • Structural Segmentation

The segment structure (or form) is one of the most important musical parameters. It is furthermore special because musical structure -- especially in popular music genres (e.g. verse, chorus, etc.) -- is accessible to everybody: it needs no particular musical knowledge. The input is a piece of music and the output is its segmentation, in a format like the following:

0.000    5.223    A
5.223    15.101   B
15.101   20.334   A
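As a rough illustration of how such a segmentation file could be consumed, the sketch below parses "start end label" lines and sums the time spent in each label. The function names `parse_segments` and `label_durations` are made up for this example, and the whitespace-separated layout is assumed from the sample above.

```python
def parse_segments(text):
    """Parse lines of the form 'start end label' into (start, end, label) tuples."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        segments.append((float(parts[0]), float(parts[1]), parts[2]))
    return segments

def label_durations(segments):
    """Total duration in seconds covered by each segment label."""
    totals = {}
    for start, end, label in segments:
        totals[label] = totals.get(label, 0.0) + (end - start)
    return totals

example = """0.000    5.223    A
5.223    15.101   B
15.101   20.334   A
"""
segments = parse_segments(example)
durations = label_durations(segments)
```

Here the two A segments together cover about 10.456 s and the B segment about 9.878 s.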



  • Multiple Fundamental Frequency Estimation & Tracking


Example :
time	F01	F02	F03	
time	F01	F02	F03	F04
time	...	...	...	...
which might look like:
0.78	146.83	220.00	349.23
0.79	349.23	146.83	369.99	220.00	
0.80	...	...	...	...

For the second task, for each row, the file should contain the onset, offset and the F0 of each note event separated by a tab, ordered in terms of onset times:
onset	offset F01
onset	offset F02
...	... ...
which might look like:
0.68	1.20	349.23
0.72	1.02	220.00
...	...	...



  • Audio Tempo Estimation

Submitted programs should output two tempi (a slower tempo, T1, and a faster tempo, T2) as well as the strength of T1 relative to T2 (0-1). The relative strength ST2 (not output) is simply 1 - ST1. The tempo estimates from each algorithm should be written to a text file in the following format

60	180	0.7


P = ST1 * TT1 + (1 - ST1) * TT2

where ST1 is the relative perceptual strength of T1 (given by groundtruth data, varies from 0 to 1.0), TT1 is the ability of the algorithm to identify T1 to within 8%, and TT2 is the ability of the algorithm to identify T2 to within 8%. No credit will be given for tempi other than T1 and T2. Here is the odd part: ST1 is said to be given by the groundtruth data, so does the ST1 you predict yourself not take part in the evaluation at all?

The algorithm with the best average P-score will achieve the highest rank in the task.
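The P-score formula can be sketched in a few lines. Read literally, it only uses the ground-truth ST1, so the strength value a submission outputs plays no role in this sketch. Two things here are assumptions rather than facts from the task description: TT1 and TT2 are treated as 0/1 indicators, and a ground-truth tempo counts as identified if either of the two estimated tempi falls within the 8% tolerance. The function name `p_score` is likewise just illustrative.

```python
def p_score(estimated_tempi, true_t1, true_t2, true_st1, tol=0.08):
    """P = ST1 * TT1 + (1 - ST1) * TT2, with ST1 taken from the ground truth.

    Assumption: TT1 (TT2) is 1.0 if any estimated tempo lies within
    `tol` (relative error) of the true T1 (T2), else 0.0.
    """
    def identified(truth):
        return any(abs(est - truth) <= tol * truth for est in estimated_tempi)

    tt1 = 1.0 if identified(true_t1) else 0.0
    tt2 = 1.0 if identified(true_t2) else 0.0
    return true_st1 * tt1 + (1.0 - true_st1) * tt2

# Ground truth: T1 = 60 BPM, T2 = 180 BPM, ST1 = 0.7 (made-up values).
both_right = p_score((60, 180), 60, 180, 0.7)   # both tempi found
only_slow = p_score((62, 120), 60, 180, 0.7)    # only T1 found (62 is within 8% of 60)
neither = p_score((90, 120), 60, 180, 0.7)      # no credit
```

Under these assumptions, finding only the perceptually stronger tempo earns ST1 of the credit, and finding only the weaker one earns 1 - ST1.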


  • Audio Tag Classification


 <example path and filename>\t<tag classification>\t<affinity>\n
 /data/file1.wav    rock      0.9
 /data/file1.wav    guitar    0.7
 /data/file1.wav    vocal     0.3
 /data/file2.wav    rock      0.5



  • Set List Identification 


1. To identify the order of the songs performed in a live concert.

In this subtask, the participants know the artist and the artist's studio song collection. Given the audio of a live concert and the studio song collection of a specific artist, where all songs in the live concert are included in the studio song collection, the goal is to identify the order of the songs in this live concert.

2. To identify the start/end time of each song in the song sequence.

In this subtask, the participants know the artist, the artist's studio song collection, and the song sequence. Given the audio of a live concert, the song sequence, and the studio song collection of a specific artist, where all songs in the live concert are included in the studio song collection, the goal is to identify the start time and end time of each song in the live concert.




  • Audio Onset Detection


  • Audio Offset Detection


  • Audio Beat Tracking


  • Audio Key Detection


  • Audio Downbeat Detection


  • Real-time Audio to Score Alignment(a.k.a Score Following)


  • Audio Cover Song Identification


Example distance matrix 0.1
1    /path/to/audio/file/track1.wav
2    /path/to/audio/file/track2.wav
3    /path/to/audio/file/track3.wav
4    /path/to/audio/file/track4.wav
5    /path/to/audio/file/track5.wav
Q/R   1        2        3        4        5
1     0.00000  1.24100  0.2e-4   0.42559  0.21313
3     50.2e-4  0.62640  0.00000  0.38000  0.15152


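The following evaluation metrics will be computed for each submission: 1. Total number of covers identified in top 10; 2. Mean number of covers identified in top 10 (average performance); 3. Mean (arithmetic) of Avg. Precisions; 4. Mean rank of first correctly identified cover. Metrics 1 and 2 say essentially the same thing (2 is just 1 averaged over the queries), both measured at a cutoff of 10; 3 is the mean average precision, which also depends on where in the ranking the correct covers appear; 4 is the rank of the first correctly identified cover song.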
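A minimal per-query sketch of these metrics, assuming each query comes with a row of distances (smaller means more similar) and a known set of true covers. The helper names, the choice to exclude the query itself from its own ranking, and the set of "true covers" used below are all assumptions for illustration; the distances are the first row of the example matrix above.

```python
def rank_candidates(distances, query):
    """Candidate ids sorted by increasing distance, excluding the query itself."""
    return sorted((c for c in distances if c != query), key=lambda c: distances[c])

def cover_metrics(ranked, covers, k=10):
    """Return (covers in top k, average precision, rank of first cover) for one query."""
    hits_in_top_k = sum(1 for c in ranked[:k] if c in covers)
    hits = 0
    precision_sum = 0.0
    first_rank = None
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in covers:
            hits += 1
            precision_sum += hits / rank  # precision at this cover's position
            if first_rank is None:
                first_rank = rank
    avg_precision = precision_sum / len(covers) if covers else 0.0
    return hits_in_top_k, avg_precision, first_rank

# Row for query 1 from the example matrix; covers {2, 5} is a made-up ground truth.
row = {1: 0.00000, 2: 1.24100, 3: 0.2e-4, 4: 0.42559, 5: 0.21313}
ranked = rank_candidates(row, query=1)
top10, ap, first = cover_metrics(ranked, covers={2, 5})
```

Averaging `top10`, `ap`, and `first` over all queries would give metrics 2, 3, and 4; summing `top10` gives metric 1.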



  • Discovery of Repeated Themes & Sections

Algorithms that take a single piece of music as input, and output a list of patterns repeated within that piece. Also known as intra-opus discovery. Input: a piece of music; output: the patterns that repeat within it. So what exactly counts as a pattern? For the purposes of this task, a pattern is defined as a set of ontime-pitch pairs that occurs at least twice (i.e., is repeated at least once) in a piece of music. The second, third, etc. occurrences of the pattern will likely be shifted in time and perhaps also transposed, relative to the first occurrence. Ideally an algorithm would be able to discover all exact and inexact occurrences of a pattern within a piece, so in evaluating this task we are interested in both (1) whether an algorithm can discover one occurrence, up to time shift and transposition, and (2) to what extent it can find all occurrences. It has been pointed out by Lartillot and Toiviainen (2007) among others that as well as ontime-pitch patterns, there are various types of repeating pattern (e.g., ontimes alone, duration, contour, harmony, etc.). For the sake of simplicity, the current task is restricted to ontime-pitch pairs.
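To illustrate what "an occurrence up to time shift and transposition" means, the sketch below takes a known pattern and lists every (time shift, pitch shift) under which it appears exactly in a piece. This is only a matching helper for exact occurrences, not a discovery algorithm and not the task's evaluation code; the function name, the integer ontimes, and the toy note data are assumptions chosen to keep the arithmetic exact.

```python
def find_occurrences(pattern, piece):
    """Return all (time_shift, pitch_shift) translations that map `pattern` into `piece`.

    Both arguments are collections of (ontime, pitch) pairs. Only exact
    occurrences are found; inexact ones are out of scope for this sketch.
    """
    pattern = sorted(pattern)
    points = set(piece)
    anchor_time, anchor_pitch = pattern[0]
    shifts = []
    for time, pitch in sorted(points):
        # Try aligning the first pattern note with this note of the piece.
        dt, dp = time - anchor_time, pitch - anchor_pitch
        if all((t + dt, p + dp) in points for t, p in pattern):
            shifts.append((dt, dp))
    return shifts

# Toy piece: a rising three-note figure, then the same figure 4 beats later
# and 5 semitones higher, followed by an unrelated note.
piece = [(0, 60), (1, 62), (2, 64), (4, 65), (5, 67), (6, 69), (8, 60)]
pattern = [(0, 60), (1, 62), (2, 64)]
occurrences = find_occurrences(pattern, piece)
```

The shift (0, 0) is the pattern itself and (4, 5) is its transposed repetition, so this set of ontime-pitch pairs qualifies as a pattern under the definition above because it occurs at least twice.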


  • Audio Melody Extraction


  • Query by Singing/Humming


  • Audio Chord Estimation


  • Singing Voice Separation


  • Audio Fingerprinting


