Humdrum Extras

themax manpage


COMMAND

    themax -- Search melodic index data created by tindex from **kern data

SYNOPSIS

    [cat indexfile |] themax [options] [indexfile(s)]

OPTIONS

-a Anchor search to start of search field.
-f filename Search only in specified filename tags.
-m Search only entries in a minor key.
-M Search only entries in a major key.
-t Search only entries in a particular key tonic.
-T Search only entries in particular metrical classes.
-c Search by pitch refined contour.
-C Search by pitch gross contour.
-d Search by scale degree.
-i Search by 12-tone interval string.
-I Search by diatonic musical interval.
-p Search by diatonic pitch class.
-P Search by 12-tone pitch class.
-b Search by beat level.
-u Search by duration.
-e Search by metric refined contour.
-E Search by metric gross contour.
-r Search by duration refined contour.
-R Search by duration gross contour.
-l Search by metric position.
-L Search by metric level.
-q Perform an interleaved feature query.
-B Ignore boundary markers in the features. This option should not be used with intervallic features, since boundaries occlude interval features at segmentation boundaries.
--count Report the number of matches to the search query found in any particular entry in the index file rather than echoing the index line if a match is found.
-D Convert a -p search (pitch class) into a diatonic pitch class search, where any accidental on a diatonic pitch letter will match.
--features Display the component feature regular expression search string and then exit without searching.
-k Search using **kern notes consisting of pitches and/or durations.
--limit # Stop searching the index file(s) after the specified number of matching entries have been found.
--loc Report the starting and ending note number of the match. The longest parallel match indicates the ending note.
--locstart Report the starting note number of matches.
--no-messages Do not echo control messages in the index data.
--raw Do not attempt to clean up search queries from the command-line.
--regex Display the final regular expression search string and then exit without searching.
--total Return the total number of entry matches rather than echoing lines which match.
--trim Return only the first column of index lines which match query.
--unlink When multiple feature queries are searched, do not require that they start on the same note in the underlying data.
-v Negate the search query and return entries which do not match.

DESCRIPTION

    The themax command is the search-engine core of Themefinder. The program reads an index file created by the tindex program and serves as a user interface to regular-expression searches on the index data. This search index is extracted from **kern data (monophonic or polyphonic, composed of monophonic spines) and consists of a series of lines, one for each musical file (or part for a polyphonic file) being searched.

    Each line in the index file consists of an ordered sequence of musical features extracted from an underlying Humdrum file, with each musical feature description starting with a unique identifying character and with fields separated by tab characters. The index file is searched via regular expressions by the themax program. The original thema command utilizes the command-line regular expression tool, grep, to do the final searching in the index file. The themax program uses the Perl-Compatible Regular Expression library to search the index file internally, so it does not require any external tools to run, which is useful when running the program on Windows computers.

    As an example application, suppose that you want to search in the melodies of the vocal parts in Ludwig Erk's Deutscher Liederschatz, volume 1, which consists of 201 songs. First, you will have to extract the vocal parts from the full score. The vocal part is always in spine three of the score for these particular songs (the first two spines being the piano accompaniment), so the following commands can extract the vocal parts using a bash-shell for-loop and then build the index:

          for i in *.krn
          do
             extractx -f3 "$i" > "$(basename "$i" .krn).thm"
          done
          tindex -a *.thm > liederschatz1.thema

    The above commands will create the index file liederschatz1.thema, which contains one line for each extracted vocal part.

    The first column contains a filename given as input to the tindex command (the .thm extensions removed manually for this example). The filename can also be replaced with some other unique identifier for the melodic data described on the line. Subsequent tab-separated fields on the line each start with a unique character which identifies what musical feature follows. For example, "Z" marks the start of the key-designation field, and "{" marks the start of the twelve-tone interval feature.

    Using the -a option (for extract all musical features), the tindex command will extract all seven pitch features and eight rhythm features into the index data. You can select which features to store in the index file using the -f option in the tindex program. Here is an example of the features extracted from song 53 in the example set:

    erk053
    The filename (or any arbitrary identification tag). This field will be ignored by themax when searching the index file.
    ZE-=
    The key of the melodic data. The field starts with a capital "Z" for major keys, and a lower-case "z" for minor keys. The field ends with an equals sign (=). In this case the music is in E-flat major. A major would be encoded as "ZA=", and C-sharp minor would be encoded as "zC#=".
    {p1p2p0m1p1p2m2p0m2p0p0p0p0m3p0m3m4p2p2p1p7p0p5m1m2m2m2
     m1m2m2m1m2p0p8p4m2p0p0p2p2p1p2m2m1m2p0m2p0p0m8p1
    The "12-tone interval" search feature which is prefixed by the "{" character. The data field consists of the number of half-steps in a melodic interval, as well as "p" (plus) for rising intervals and "m" (minus) for falling intervals. A repeated note is indicated by "p0".
    #uusduudsdssssDsDDuuuUsUdddddddddsUUdssuuuudddsdssDu
    The "pitch refined contour" search feature which is prefixed by the "#" character. There are five possible characters in this feature: s for a unison (repeated note), d for a step-wise (major/minor/augmented second) interval down, u for a step-wise interval up, D for a leap (larger than a second) down, and U for a leap up.
    :UUSDUUDSDSSSSDSDDUUUUSUDDDDDDDDDSUUDSSUUUUDDDSDSSDU
    The "pitch gross contour" search feature which is prefixed by the ":" character. There are three possible characters in this feature: S for a unison, D for a falling interval, and U for a rising interval.
    %3455456554444422756715517654321766465556712176655571
    The "scale degree" search feature which is prefixed by the "%" character. The digits 1 through 7 are allowed in this feature field, with 1 indicating the note is on the tonic of the musical key for the data, 2 is the second scale-degree in the key, and so on.
    }Xm2XM2P1xm2Xm2XM2xM2P1xM2P1P1P1P1xm3P1xm3xM3XM2XM2Xm2XP5P1XP4xm2xM2x
          M2xM2xm2xM2xM2xm2xM2P1Xm6XM3xM2P1P1XM2XM2Xm2XM2xM2xm2xM2P1xM2P1P1xm6Xm2
    The "musical interval" search feature which is prefixed by the "}" character. This is the diatonic interval description of melodic intervals in the data. Typically there are three components to each interval: (1) direction, (2) quality, (3) diatonic value. Lower-case "x" indicates a falling interval, and upper-case "X" indicates a rising interval. For the quality there are five possible characters: "A" for augmented intervals, "M" for major, "m" for minor, "d" for diminished, and "P" for perfect. Doubly augmented/diminished intervals are represented by repeating "A" or "d" respectively. Perfect unisons ("P1") are the only intervals which are not given a direction marker (unlike the 12-tone interval feature, which needs a separator between the interval numbers).
    j78AA9A0AA88888552A023AA320A875320080AAA02353200AAA23
    The "12-tone pitch class" search feature which is prefixed by the "j" character. This is the pitch class in terms of half steps above C. For example, C = 0, D = 2, and so on. The pitch class numbers 10 and 11 are mapped to the characters A and B respectively.
    JG Ab Bb Bb A Bb C Bb Bb Ab Ab Ab Ab Ab F F D Bb C D Eb Bb Bb Eb D C
          Bb Ab G F Eb D C C Ab C Bb Bb Bb C D Eb F Eb D C C Bb Bb Bb D Eb
    The "diatonic pitch class" search feature which is prefixed by the "J" character. This is the standard diatonic-based note name for the pitches within an octave. Each pitch name is separated by a space from the previous pitch (unlike most other features). Natural diatonic pitches are just the diatonic letter in upper case (A-G). Sharps are represented after the diatonic pitch name with "#". Flats are represented after the diatonic pitch name with a lower-case "b". Double sharps/flats are indicated by repeating "#" or "b".
    M4/4quadruplesimple
    The "metric description" field, which is prefixed by the character "M" and consists of three components: (1) the time signature as two numbers separated by a slash, (2) "duple", "triple", "quadruple" or "irregular" depending on the number of beats in a measure, and (3) "simple" or "compound" depending on whether the meter is a compound meter (such as 6/8 where the beat is a dotted quarter note), or simple otherwise. No spaces are given between the three descriptive components.
    ~<><======><=====><=><===========><=><><=><===><=><>
    The "duration gross contour" search feature which is prefixed by the "~" character. The three possible characters are: < if the current note is shorter than the previous note; > if the current note is longer than the previous note; = if the current note has the same duration as the previous note.
    ^[][======><=====><=][===========><=][><=][===><=][]
    The "duration refined contour" search feature which is prefixed by the "^" character. The five possible characters are the three used in the duration gross contour plus two additional states: "[" if the current note is much shorter than the previous note (by more than half the duration), and "]" if the current note is much longer than the previous note.
    ;8d 16 4d 8 8 8 8 8 8 8 4 8 8 8 8 8 8 4 8 8 2 8 8 8 8 8 8 8 8 8 8 8 8 4
          8 8 4d 8 4 8 8 2 8 8 8 8 4 8 8 4d 8 2
    The "duration" search feature which is prefixed by the ";" (semi-colon) character. The durations are represented as rhythm reciprocals (4 for quarter note, 8 for 8th-note, etc.), and dotted rhythms are indicated by appending one or more "d" characters after the basic rhythmic unit.
    &1010101010110101011011010101010101101011011010110101
    The "beat level" search feature which is prefixed by the "&" character. The two possible characters are: 1 if the note is played on the beat, or 0 if the note is played off the beat. (In this case the 3/8 time signature is being treated as "simple" rather than compound, so the beats are being assigned to the eighth-note level.)
    '0 m2 p2 m1 p1 m1 0 m1 p2 m1 0 0 m1 p2 m1 0 m1 p1 0 m1 p2 0 m1 p2 m1 0
          m1 p1 m1 0 m1 p2 m1 0 0 m1 p2 m1 p1 0 m1 p2 p1 m1 0 m1 p2 0 m1 p1 m1 p2
    The "metric level" search feature which is prefixed by the "'" (apostrophe) character. Basically a log-2 rhythmic interval. 0 indicates that the previous duration is the same as that of the current note.
    `WHWHWhwHWhSwHWhwHwwHWwHWhwHWhwHWhSwHWHwwHwWhwHWwHWH
    The "metric refined contour" search feature which is prefixed by the "`" (back-quote) character.
    @WHWHWHWHWHSWHWHWHWWHWWHWHWHW
         HWHWHSWHWHWWHWWHWHWWHWH
    The "metric gross contour" search feature which is prefixed by the "@" (at symbol) character.
    =4 4_3/4 1 2_1/2 3 3_1/2 4 4_1/2 1 1_1/2 2 4 4_1/2 1 1_1/2 2 2_1/2 3 4 4_1/2 1 4 4_1/2 1 1_1/2 2 2_1/2 3 3_1/2 4 4_1/2 1 1_1/2 2 4 4_1/2 1 2_1/2 3 4 4_1/2 1 3 3_1/2 4 4_1/2 1 2 2_1/2 3 4_1/2 1
    The "metric position" search feature which is prefixed by the "=" (equals sign) character. This is the beat number within the measure on which a note starts. Off-beats contain an extra fractional value which follows an underscore.

    The tindex program will always place these search features in the given order, although not all features need be extracted into the index. This ordering is important if you want to search multiple features in parallel since a single regular expression is generated to do the search in a single pass.
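    Because every field after the filename starts with its own identifying character, a single feature can be pulled out of an index line with standard tools. The following is only a sketch: get_feature is a hypothetical helper (not part of themax or tindex), and the sample line is abbreviated and made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the filename tag and the field that begins with
# the given prefix character (prefix removed) for each index line on stdin.
# Usage: get_feature PREFIX < indexfile
get_feature() {
    awk -F '\t' -v prefix="$1" '{
        # Field 1 is the filename tag, so start looking at field 2.
        for (i = 2; i <= NF; i++) {
            if (substr($i, 1, 1) == prefix) {
                print $1 "\t" substr($i, 2)
                break
            }
        }
    }'
}

# Abbreviated, made-up index line in the tab-separated format described above.
# Prints the tag "erk053", a tab, and the scale-degree data "3455456".
printf 'erk053\tZE-=\t:UUSDUUD\t%%3455456\n' | get_feature '%'
```

    Looking fields up by prefix rather than by column number keeps the helper working when only some features were extracted into the index.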

    In the above example generation of an index file, the score sequence of the notes as printed on paper was used. This may cause problems by identifying sequences which span a first/second repeat measure. If you are concerned about these sorts of errors, you should pass the songs through the thrux program first so that the performance sequence of the notes is used in the index file.

    Another caveat is that the tindex program requires an explicit key designation when creating the key field in the index data. If there is only a key signature (such as *k[f#] for G major or E minor), the input music file still needs a key designation, such as *G: for G major, or *e: for E minor. This key designation, in turn, is required for calculating the scale-degree feature.

    Search filters

    Search in major keys only

    When using the -M option with themax, only music in major keys will be searched. The -M option can also be used without any other search queries. In that case, the themax program will return a list of the lines in the index file which represent music in a major key.

    For example, to count the number of songs in any major key in the example Deutscher Liederschatz volume, type the following command which should give an answer of 196 songs:

        themax -M indexfile | wc -l
        196 

    Search in minor keys only

    Likewise, the -m option is used to selectively search minor key entries in the index file. In the example set of songs, only 5 are in a minor key:

        themax -m indexfile | wc -l
        5 

    The themax program echoes the entire text line in the index on which a match was found for a search query. The command "wc -l" is used above to count the number of lines being returned by the themax command, which is also a count of the number of songs which match the query. If you want to only see which files contained the match, then try this command:

        themax -m liederschatz1.thema | awk '{print $1}'
        erk018
        erk042
        erk081
        erk090
        erk133

    Filtering music for a particular tonic

    The -t option can be used to only search music with a particular tonic pitch. For example "-t C" will only search music in either C major or C minor. Like -M and -m, the -t option does not have to be used with a particular music query, and will return all lines which match the requested tonic key if no additional search queries are added.

    For key tonics which contain a flat sign, use a minus sign after the diatonic pitch name (such as "-t A-" for A-flat). For sharp signs use a "#" character; however, you should probably enclose the option in single quotes on the command-line so that the sharp sign is not interpreted as a comment marker: "-t 'C#'" for a C-sharp tonic. Alternatively, the themax program will accept forms such as "-t A-flat" for A-flat, and "-t F-sharp" for F-sharp.

    As an example, here are searches for various keys which can be used to get statistics on the frequency of tonics used in the songs:

        themax -t C indexfile | wc -l
        30
        themax -t D-flat indexfile | wc -l
        1
        themax -t D indexfile | wc -l
        15
        themax -t E-flat indexfile | wc -l
        20
        themax -t E indexfile | wc -l
        5
        themax -t F indexfile | wc -l
        33
        themax -t G indexfile | wc -l
        47
        themax -t A-flat indexfile | wc -l
        3
        themax -t A indexfile | wc -l
        22
        themax -t B-flat indexfile | wc -l
        24
        themax -t B indexfile | wc -l
        1 

    In this case, the most commonly used key has a tonic on G. The -t option does not select the major/minor modality. Use the -M or -m options to choose the mode:

        themax -M -t G indexfile | wc -l
        46
        themax -m -t G indexfile | wc -l
        1 

    In this case, there is one piece in G minor, and 46 in G major.

    Of course, a more efficient method of doing this particular analysis would be to process the second data field of the index directly without using themax:

        awk '{print $2}' indexfile | sort | uniq -c
           3 ZA-=
          19 ZA=
          24 ZB-=
          30 ZC=
           1 ZD-=
          15 ZD=
          20 ZE-=
           5 ZE=
          33 ZF=
          46 ZG=
           3 zA=
           1 zB=
           1 zG= 

    Filtering music for a particular meter

    Music in a particular meter or metrical category can be selected by using the -T option. The metrical description of the music contains the time signature (numerator/denominator) followed by the string "duple", "triple", "quadruple" or "irregular", and finally "simple" or "compound". As with the previously described options, the -T option can be used alone without any musical feature search.

    In the example data set, the -T option can be used to identify how many songs are in 4/4:

        themax -T 4/4 indexfile | wc -l
        63

    To search music in any duple meter (2/4 and 6/8, for example), use "duple":

        themax -T duple indexfile | wc -l
        62

    Any meter with more than 4 beats is classified as "irregular" (such as 5/4, but not 9/8 which is considered to be a compound time signature with three beats).

        themax -T irregular indexfile | wc -l
        1

    You can compare the number of songs in a simple meter (such as 2/4, 3/4) to songs in compound meter (such as 6/8, 9/8) with the following commands.

        themax -T simple indexfile | wc -l
        176
        themax -T compound indexfile | wc -l
        25

    So about 1/8 of the songs in the example set are in a compound meter.

    Be careful, because the labeling of meters as compound or simple is done automatically by the tindex program. It cannot tell whether 6/8 is meant as a compound meter (two beats at the dotted-quarter duration) or as a simple meter (six beats at the eighth-note duration), an ambiguity inherent in modern time signatures. Also, 3/8 is currently identified as being simple (three beats on the eighth note level) rather than compound (one beat at the dotted quarter note level).

    Multiple components of the metrical description can be searched at the same time. The order must be (1) time signature (2) beat count (3) simple/compound. Here is an example search for music in triple, simple meters:

        themax -T triplesimple indexfile | wc -l
        74

    Note that -T simpletriple would not return any matches since the ordering of the metric components is incorrect.

    Search anchoring

    By including the -a option when searching with themax, matches are required to start at the beginning of a data entry. Potential matches which start after the initial position in a data entry will be ignored. This option requires an actual feature query (unlike the previously described search filtering options). If -a is used without any other arguments, all lines in the search index will be echoed to standard output.

    Here is an example search for the pitch sequence "C D E" in the example index, searching both unanchored (pattern can start anywhere in the data), or anchored (pattern must start at the beginning of the music). Notice that this melodic pattern occurs in 29 songs, but only occurs at the start of one song.

        themax -p "C D E" indexfile | wc -l
        29
        themax -a -p "C D E" indexfile | awk '{print $1}'
        erk131 
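    The difference between an anchored and an unanchored search can be pictured with plain grep on the "J" (diatonic pitch class) field. This is only a sketch under stated assumptions: the patterns below are not the regular expressions that themax generates internally, and the sample function and its three index lines are made up for illustration.

```shell
#!/bin/sh
TAB=$(printf '\t')

# Three made-up index lines: tag, key field, diatonic pitch class field.
sample() {
    printf 's1\tZC=\tJC D E F\n'
    printf 's2\tZC=\tJG C D E\n'
    printf 's3\tZC=\tJG C D Eb\n'
}

# Unanchored: "C D E" may start at any note of the J field.
UNANCHORED="${TAB}J([^${TAB}]* )?C D E( |${TAB}|\$)"
# Anchored: "C D E" must directly follow the J prefix (start of the music).
ANCHORED="${TAB}JC D E( |${TAB}|\$)"

sample | grep -E -c "$UNANCHORED"   # prints 2 (s1 and s2; s3 ends in Eb)
sample | grep -E -c "$ANCHORED"     # prints 1 (s1 only)
```

    The trailing alternation keeps "C D E" from matching the start of "C D Eb", which a bare substring search would accept.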

    Pitch query options

    diatonic pitch class

    The -p option is used to search the music by pitch class. The pitch class name is composed of a diatonic letter A-G, and can be followed by one or more flat (-) or sharp (#) signs. For example, A-- is A double-flat, and C## is C double-sharp. An "X" can be used to represent a double sharp. Accidentals only apply to the note they follow, so repeated notes using the same accidental must also repeat the accidental signs. In the classic thema command, a single space between note names is required; however, this is optional in themax, and the diatonic pitch names are case-insensitive.

    Here is an example search for the melodic sequence "D F# A" in the example data index:

        themax -p "D F# A" indexfile | wc -l
        8

    Extended pitch-class query syntax

    Northern European pitch-class names are understood by themax, where "is" is equivalent to a sharp sign, and "es" is equivalent to a flat sign. You will have to be careful about the B/H diatonic pitch name. If an "H" is present in the query, it will be assumed to be equivalent to an English "B" pitch, while "B" will be assumed to be equivalent to an English "B-". If no "H" is present, then "B" is assumed to be equivalent to an English "B" pitch.

        themax -p "d fis a" indexfile | wc -l
        8

    Southern European and fixed-do solfège are also understood in the pitch-class query string: C = ut or do, D = re, E = mi, F = fa, G = sol or so, A = la, B = si or ti. English or German accidental syntax can be applied to these basic diatonic codes. Spaces between pitch names are optional.

        themax -p "re fa# la" indexfile | wc -l
        8
        themax -p "refaisla" indexfile | wc -l
        8

    Wildcards in pitch-class query strings

    Five characters are used to represent special meanings in the pitch class search query:

    .
    To search for an unknown or arbitrary pitch, use the dot to represent any one pitch. Example:
      (G# . B) will match to (G# E  B)
      (G# A# B)
      (G# E- B)
      (G# C-- B)
      etc.
    ?
    To indicate an optional note, append a question mark. Examples:
      (F#? A) will match to (F#  A), or (  A)

      (.? G) will match to (E  G)
      (C    G)
      (F# G)
      etc.
    *
    To indicate any number of optional notes, insert an asterisk. Example:
      (* A) will match to (   A)
      (G  A)
      (B G  A)
      (B F# G  A)
      etc.
    @
    To indicate any number of repeated notes, append an at sign. Example:
      (G@ E) will match to (G E)
      (G G E)
      (G G G E)
      etc.
    ^
    A caret appended to a note indicates an arbitrary accidental. Examples:
      (E^ G) will match to (E  G)
      (E# G)
      (E- G)
      (Ex G)
      (E-- G)
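    As a rough picture of how two of these wildcards could be turned into a regular expression over the "J" index field, consider the sketch below. It is an assumption for illustration only, not the translation themax performs: pitch_query_to_regex is a hypothetical helper that handles only "." and "^", expects space-separated letter names, and rewrites the query flat sign "-" as the "b" used in the index.

```shell
#!/bin/sh
# Hypothetical helper: translate a space-separated pitch query into an
# extended regular expression over space-separated index pitch names.
pitch_query_to_regex() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            tok = $i
            if (tok == ".") {
                # Dot: any one pitch with any accidental.
                tok = "[A-G][#b]*"
            } else {
                tok = toupper(substr(tok, 1, 1)) substr(tok, 2)
                any = sub(/\^$/, "", tok)   # trailing caret removed?
                gsub(/-/, "b", tok)         # query flat "-" becomes index "b"
                if (any) tok = substr(tok, 1, 1) "[#b]*"
            }
            sep = (i > 1) ? " " : ""
            out = out sep tok
        }
        print out
    }'
}

pitch_query_to_regex 'G# . B'   # prints: G# [A-G][#b]* B
pitch_query_to_regex 'E^ G'     # prints: E[#b]* G
printf 'E# G\n' | grep -E -c "$(pitch_query_to_regex 'E^ G')"   # prints 1
```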

Diatonic mode for pitch class queries

Using the -D option with a -p search is equivalent to adding a "^" wildcard after each diatonic pitch name. This option allows searching for diatonic pitch sequences with any accidental after the pitch name. This capability is useful for searching for modal equivalents, and various interpretations of musica ficta.

Count the number of songs which contain the exact pitch sequence "A B C D E":

    themax -p "a b c d e" indexfile  --total
    8

Now count the number of songs which contain the diatonic pitch sequence "A B C D E", allowing for accidentals attached to any of those notes:

    themax -D -p "a b c d e" indexfile  --total
    13

The previous search is equivalent to:

    themax -p "a^ b^ c^ d^ e^" indexfile  --total
    13
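The equivalence between -D and the "^" wildcard can be reproduced by rewriting the query string before passing it to -p. A minimal sketch, assuming the pitch names in the query are separated by spaces (add_caret is a hypothetical helper, not a themax feature):

```shell
#!/bin/sh
# Hypothetical helper: append "^" to every space-separated pitch name.
add_caret() {
    printf '%s\n' "$1" | sed -E 's/[^ ]+/&^/g'
}

add_caret "a b c d e"   # prints: a^ b^ c^ d^ e^
```

The result could then be used as the argument to -p in place of the -D option.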

Scale degree

The -d option is used to search using a scale-degree representation of the music. 1 is assigned to the tonic note, and values from 1-7 represent the 7 scale degrees of the major and minor scales. Accidentals are not considered. For example, in C major a C-sharp is encoded as "1" for this musical feature. Spaces are optional between scale degrees.

An example query by scale degree, searching for "1 3 5":

    themax -d 135 indexfile | wc -l
    56
    themax -d "1 3 5" indexfile | wc -l
    56

Solfège syllables or letter names can be used in scale-degree queries. Movable-do is used in this case, so do=1, re=2, mi=3 and so on. If letter names are used, then the key of C is used to convert to scale degrees. In that case C=1, D=2, E=3 and so on.

    themax -d "do mi so" indexfile | wc -l
    56
    themax -d "C E G" indexfile | wc -l
    56
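The conversion from syllables or letter names to scale degrees described above can be sketched as follows. This is an illustration rather than themax's own conversion: to_degrees is a hypothetical helper, it assumes space-separated names without accidentals, and letter names are read in the key of C.

```shell
#!/bin/sh
# Hypothetical helper: map movable-do syllables or letter names (in C)
# to scale-degree digits. Names must be separated by spaces.
to_degrees() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' |
        sed -e 's/do/1/g' -e 's/re/2/g' -e 's/mi/3/g' -e 's/fa/4/g' \
            -e 's/sol/5/g' -e 's/so/5/g' -e 's/la/6/g' \
            -e 's/ti/7/g' -e 's/si/7/g' |
        tr 'cdefgab' '1234567'
}

to_degrees "do mi so"   # prints: 1 3 5
to_degrees "C E G"      # prints: 1 3 5
```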

12-tone pitch class

The -P option is used to search using a twelve-tone pitch class representation of the music. The pitch classes are numbers from 0 (for C) to 11 (for B). For values 10 and higher, the pitch class numbers are mapped to letters of the alphabet. So 10 = Y (representing the pitch class B-flat/A-sharp), and 11 = Z (representing the pitch class B). The multi-digit pitch class names can also be used, provided spaces separate the numbers from adjacent values. If you use "Y" or "Z" for pitch classes, then you cannot also use 10 or 11 as the respective alternate names for the pitch classes.

Searching for the 12-tone pitch class sequence "6 8 Y":

    themax -P 68Y indexfile --trim
    erk035
    erk081 
    themax -P "6 8 10" indexfile --trim
    erk035
    erk081 
    themax -P "68 10" indexfile --trim
    erk035
    erk081 

Note that twelve-tone pitch classes are less selective than diatonic+accidental based pitch classes:

    themax -p "f# g# a#" indexfile --trim
    erk081 
    themax -p "g- a- b-" indexfile --trim
    erk035 

Pitch names can be used for 12-tone pitch class searches, with accidentals up to two sharps/flats. These values will be converted automatically into numeric 12-tone pitch classes within themax before searching the index file:

    themax -P "g- solis b-flat" indexfile --trim
    erk035 
    erk081 
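The reason "f# g# a#" and "g- a- b-" find different songs with -p but the same songs with -P is that both spellings reduce to the same numeric pitch classes. The sketch below shows that reduction; pitch_classes is a hypothetical helper that only understands letter names with "#" and "-" accidentals separated by spaces, not the other name forms that themax accepts.

```shell
#!/bin/sh
# Hypothetical helper: convert space-separated pitch names to numeric
# 12-tone pitch classes (0 = C ... 11 = B).
pitch_classes() {
    printf '%s\n' "$1" | awk '
    BEGIN {
        # Half steps above C for the letters A B C D E F G, in that order.
        split("9 11 0 2 4 5 7", base, " ")
        letters = "ABCDEFG"
    }
    {
        out = ""
        for (i = 1; i <= NF; i++) {
            name = toupper(substr($i, 1, 1))
            pc = base[index(letters, name)]
            acc = substr($i, 2)
            sharps = gsub(/#/, "", acc)   # each sharp raises by a half step
            flats = gsub(/-/, "", acc)    # each flat lowers by a half step
            pc = (pc + sharps - flats + 12) % 12
            sep = (i > 1) ? " " : ""
            out = out sep pc
        }
        print out
    }'
}

pitch_classes "f# g# a#"   # prints: 6 8 10
pitch_classes "g- a- b-"   # prints: 6 8 10
```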

Musical interval

The -I option can be used to search by musical interval. A musical interval has three components in this order: (1) interval direction, (2) interval quality, and (3) diatonic interval size.

  • The direction can be either a + (plus) for a rising interval, or a - (minus) for a falling interval.
  • The interval quality can be: d or D for diminished, m for minor, M for major, P or p for perfect, A or a for augmented.
  • The diatonic interval size is 1 for unisons, 2 for seconds, 3 for thirds, etc., 10 for tenths (octave plus a third), and so on.

Not all components are required: if any component is missing, then all states for that component will be matched. Example interval descriptions: +P5 = a rising perfect fifth. +3 = a rising third (can be a rising major third or a rising minor third). "P4" is a query for a perfect fourth (either rising or falling). A "+" by itself represents a rising interval of any quality and diatonic size.

Songs which start with a rising perfect fifth:

    themax -a -I "+P5" indexfile --trim
    erk035 

Number of songs which contain a rising perfect fifth anywhere in the song:

    themax -I +P5 indexfile --total
    87

Number of songs which contain a rising or falling octave:

    themax -I P8 indexfile --total
    61

Songs which contain a long up/down sequence of intervals:

    themax -I "+-+-+-+-+-" indexfile --trim
    erk009
    erk016
    erk038
    erk066
    erk103

Songs which contain 3 rising thirds in a row:

    themax -I "+3 +3 +3" indexfile --trim
    erk029
    erk074
    erk099
    erk101

Wildcards for musical interval search queries

Not all of these wildcards are active yet.

    .
    To search for an unknown or arbitrary interval, use the dot to represent any one interval. Example:
      (+M2 . -m3) will match to (+M2 -M2 -m3)
      (+M2 +P4 -m3)
      (+M2 -P5 -m3)
      (+M2  P1 -m3)
      etc.
    ?
    To indicate an optional interval, append a question mark. Example:
      (+M3 +m3? -m3) will match to (+M3 +m3 -m3)
      (+M3     -m3)
      etc.
    *
    To indicate any number of optional intervals, insert an asterisk. Example:
      (+M2 * -m3) will match to (+M2 -M2 P1 +M6 -m3)
      (+M2 +P5 -m3)
      (+M2 -P4 +m6 -m2 -m2 -m3)
      (+M2 +m7 -P8 +M2 -m3)
      etc.
    ^
    A caret appended to an interval indicates an optional inversion of the interval. The inverted interval is in the opposite direction. Examples:
      (+M3 +P4^ -m2) will match to (+M3 +P4 -m2)
      (+M3 -P5 -m2)
      (+P4 +M3^ -P5) will match to (+P4 +M3 -P5)
      (+P4 -m6 -P5)
      (+m2 A4^ +M2) will match to (+m2 +A4 +M2)
      (+m2 -d5 +M2)
      (+m2 -A4 +M2)
      (+m2 +d5 +M2)
      (+m6 -3^ -M2) will match to (+m6 -M3 -M2)
      (+m6 +m6 -M2)
      (+m6 -m3 -M2)
      (+m6 +M6 -M2)
      (+m6 -d3 -M2)
      (+m6 +A6 -M2)
      (+m6 -A3 -M2)
      (+m6 +d6 -M2)
    #
    To search for an interval or its inversion in the same direction, append the number symbol. Examples:
      (+M3 +P4# -m2) will match to (+M3 +P4 -m2)
      (+M3 +P5 -m2)
      (+P4 +M3# -P5) will match to (+P4 +M3 -P5)
      (+P4 +m6 -P5)
      (+m2 A4# +M2) will match to (+m2 +A4 +M2)
      (+m2 +d5 +M2)
      (+m2 -A4 +M2)
      (+m2 -d5 +M2)
      (+m6 -3# -M2) will match to (+m6 -M3 -M2)
      (+m6 -m6 -M2)
      (+m6 -m3 -M2)
      (+m6 -M6 -M2)
      (+m6 -d3 -M2)
      (+m6 -A6 -M2)
      (+m6 -A3 -M2)
      (+m6 -d6 -M2)
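The inversion used by the "^" wildcard follows the usual rule for simple intervals: the diatonic sizes add up to 9, major and minor swap, augmented and diminished swap, perfect stays perfect, and the direction flips. A minimal sketch of that rule, assuming a fully specified simple interval such as "+P4" (invert_interval is a hypothetical helper, not part of themax):

```shell
#!/bin/sh
# Hypothetical helper: invert a simple interval given as direction,
# quality, and size (for example "+M3"), flipping the direction.
invert_interval() {
    printf '%s\n' "$1" | awk '{
        dir = substr($0, 1, 1)
        qual = substr($0, 2, 1)
        size = substr($0, 3)
        if (dir == "+") dir = "-"
        else dir = "+"
        if (qual == "M") qual = "m"
        else if (qual == "m") qual = "M"
        else if (qual == "A") qual = "d"
        else if (qual == "d") qual = "A"
        inv = 9 - size
        print dir qual inv
    }'
}

invert_interval "+P4"   # prints: -P5
invert_interval "+M3"   # prints: -m6
```

For the "#" wildcard the same quality and size rule would apply with the direction left unchanged.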

12-tone interval

The -i option can be used to search the music using twelve-tone intervals. This is equivalent to counting the number of half-steps between notes in the sequence. Rising intervals may be preceded by an optional "+" (plus), and falling intervals must be preceded by a "-" (minus). To specify that the direction is optional, prepend a "~" (tilde) character to the interval value. A repeated note is represented by a "0" interval. If an interval is preceded by a plus/minus or tilde sign, then spaces between the intervals are optional.

Look for songs which contain three rising whole-tones in a row:

    themax -i "2 2 2" indexfile --trim
    erk052
    erk078
    erk147 

Count the number of songs which contain three major seconds in a row, which can be either rising or falling:

    themax -i "~2 ~2 ~2" indexfile --total
    119

Count the number of songs which have a major sixth interval up, followed by a major third down:

    themax -i "+9 -4" indexfile --total
    19 

Find a song with a long string of repeated notes:

    themax -i "0 0 0 0 0 0 0 0 0" liederschatz1.thema --trim
    erk130 
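In the index itself, the 12-tone interval feature stores each half-step count with a "p" (rising or repeated) or "m" (falling) prefix. The sketch below derives that string from a list of MIDI note numbers; the use of MIDI numbers as input and the helper name tt_intervals are assumptions made for illustration, since tindex works from **kern data.

```shell
#!/bin/sh
# Hypothetical helper: turn space-separated MIDI note numbers into the
# p/m interval string used by the 12-tone interval index feature.
tt_intervals() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            d = $i - $(i - 1)        # half steps from the previous note
            if (d < 0)
                out = out "m" (-d)
            else
                out = out "p" d      # a repeated note becomes p0
        }
        print out
    }'
}

tt_intervals "60 62 64 64 62"   # prints: p2p2p0m2
```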

Pitch gross contour

The -C option can be used to search the musical data for gross contour (also called Parsons Code), which is a basic intervallic description of the melodic line split into three categories: (1) up (next pitch is higher than the current one), (2) down (next pitch is lower than the current one), (3) same (next pitch is the same as the current one). For up intervals, you can use u, U, or /. For down intervals, you can use d, D, or \. For repeated (same) intervals you can use s, S, = (equals sign), or - (dash).

Extended regular expression syntax works in the pitch gross contour search queries. For example S+ means one or more repeated intervals (two or more repeated notes in a row).

Find the songs which have 6 up intervals followed by 6 down intervals:

    themax -C "uuuuuudddddd" indexfile  --trim
    erk115 

Same search as above, but using extended regular expressions:

    themax -C "u{6}d{6}" indexfile  --trim
    erk115 

Count the number of songs which have at least 4 upward intervals followed by any type of intervals, and eventually followed by 4 or more falling intervals:

    themax -C "u{4,}.*d{4,}" indexfile --total
    34

Wildcard characters in Pitch gross contour queries:

?
An optional interval.
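The gross contour is a coarser view of the same melodic motion that the 12-tone interval feature records, so one can be derived from the other. A small sketch, using the first nine intervals of the erk053 index entry as input (gross_contour is a hypothetical helper, not a themax or tindex option):

```shell
#!/bin/sh
# Hypothetical helper: reduce a p/m 12-tone interval string to the
# U/D/S characters of the pitch gross contour feature.
gross_contour() {
    printf '%s\n' "$1" |
        sed -E -e 's/p0/S/g' -e 's/p[0-9]+/U/g' -e 's/m[0-9]+/D/g'
}

gross_contour "p1p2p0m1p1p2m2p0m2"   # prints: UUSDUUDSD
```

The repeated-note case (p0) has to be replaced first, since it shares the "p" prefix with rising intervals.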

Pitch refined contour

The -c option searches using refined interval contour. Refined contour contains five intervallic levels rather than the three of gross contour. In the refined case, up and down intervals are split into two sub-categories (1) step-wise movement, and (2) leap movement.

Any step-wise movement up or down (a half-step, whole-step or augmented second) is represented by a lower case "u" or "d". Any leap up or down (a third or larger) is represented by an upper case "U" or "D". A repeated note is represented as an "s", "S" or "-" as with gross contour features.

Like gross contour, all extended regular expression wildcards are allowed.

Count the number of songs which contain a leap down followed by one or more steps or leaps upwards, followed by a leap down:

    themax -c "D(u|U)+D" indexfile --total
    109

Find songs which contain 10 or more successive leaps (any mixture of up and down leaps):

    themax -c "(U|D){10,}" indexfile --trim
    erk025 
    erk047 
    erk124 

Wildcard characters in Pitch refined contour queries:

?
An optional interval.
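
As a rough sketch of how the five refined contour states relate to the notes, the following Python function classifies each melodic interval as a step, a leap, or a repetition. It assumes the notes are given as diatonic step numbers (a hypothetical encoding in which adjacent letter names differ by 1, for example C4=0, D4=1, E4=2), so that an augmented second counts as a step and a third as a leap. The function name and encoding are not part of themax.

```python
import re

def refined_contour(diatonic_steps):
    """Return a refined-contour string (U, u, s, d, D) for diatonic step numbers."""
    symbols = []
    for current, following in zip(diatonic_steps, diatonic_steps[1:]):
        difference = following - current
        if difference == 0:
            symbols.append("s")                              # repeated note
        elif abs(difference) == 1:
            symbols.append("u" if difference > 0 else "d")   # step-wise movement
        else:
            symbols.append("U" if difference > 0 else "D")   # leap of a third or larger
    return "".join(symbols)

# Hypothetical fragment G4 E4 F4 A4 F4, with C4 = 0.
fragment = [4, 2, 3, 5, 3]
contour = refined_contour(fragment)
print(contour)

# The query used with -c above: a leap down, one or more rising intervals, a leap down.
print(bool(re.search(r"D(u|U)+D", contour)))
```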

Rhythm query options

Duration

The -u option allows search by duration. Durations are specified by rhythmic values (actually the reciprocal of a duration value) using Humdrum's **recip representation. In other words "4" is a quarter note, "16" is a 16th note, "12" is a triplet-eighth note (there are 12 triplet-eighth notes in a whole note). Note that notated eighth notes in a 6/8 meter are not considered triplet eighth notes but rather, plain eighth notes ("8").

Count the number of songs which contain any 16th notes:

    themax -u 16 indexfile --total
    121 

Multiple rhythms in the search query must be separated by one or more spaces in order to parse the rhythmic entities properly.

Count the number of songs which contain the rhythmic pattern "4 8 8":

    themax -u "4 8 8" indexfile --total
    154 

A period (".") or "d" can be used to represent dotted rhythmic values.

Count the number of songs which have the rhythmic sequence of a dotted eighth note followed by a sixteenth note:

    themax -u "8. 16" indexfile --total
    100 

Count the number of songs which contain dotted rhythms (Note that rests are not examined by themax):

    themax -u "d" indexfile --total
    174 

The duration feature allows for one regular expression wildcard. An "X" (or "x") can be used to represent any single duration of any type. A dot cannot be used as in the pitch feature, since it might be confused with an augmentation dot.

For example, count the number of songs which contain a sequence of a quarter note, followed by any rhythm, followed by a half note:

    themax -u "4 x 2" indexfile --total
    47
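
To make the reciprocal notation concrete, here is a small Python sketch which converts a duration token into a length measured in quarter notes. The function name recip_to_quarters is hypothetical, and the sketch only handles positive integer values with optional dots (written as "." or "d", as accepted by the -u option).

```python
from fractions import Fraction

def recip_to_quarters(token):
    """Convert a token such as "4", "12", "8." or "8d" to a duration in quarter notes."""
    token = token.replace("d", ".")          # "d" is an alternative spelling of a dot
    dots = token.count(".")
    base = Fraction(4, int(token.rstrip(".")))   # "4" is one quarter note, "16" a quarter of one
    # Each augmentation dot adds half of the previously added value.
    return base * (2 - Fraction(1, 2 ** dots))

for token in ["4", "16", "12", "8.", "8d"]:
    print(token, recip_to_quarters(token))
```

Using exact fractions avoids rounding problems with values such as the triplet eighth note ("12"), which is one third of a quarter note.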

Duration gross contour

The -R option is used to search duration gross contour features: "S" if the following note has a shorter duration than the current note; "L" if the following note is longer, and "=" if the following note has the same duration as the current note. The search features are case insensitive, and extended regular expressions can be used in this search option.

Search for songs which contain a long string of equal durations (30 or more repeated durations):

    themax -R "={30}" indexfile --trim
    erk056 
    erk063 
    erk110 

Search for songs which contain four notes, each shorter than the previous one:

    themax -R 'sss' indexfile --trim
    erk052  (starting in second measure of second page)
    erk131 
    erk134 
    erk142 
    erk162  

Search for songs which have a long sequence of shorter-longer duration pairs:

    themax -R '(SL){10}' indexfile  --trim
    erk131 
    erk148 

Duration refined contour

The -r option is used to search duration refined contour which is similar to duration gross contour described above, but allows for 5 states, like pitch refined contour:

  • "S" next note is less than half as long as the current note.
  • "s" next note is shorter than the current note, but at least half as long.
  • "=" next note is equal in duration to the current note.
  • "l" (lower-case L) next note is longer than the current note, but at most twice as long.
  • "L" next note is more than twice as long as the current note.

Find songs with two shorter notes (of the same duration) followed by three longer notes (of the same duration) which are more than twice the duration of the previous two notes:

    themax -r '=L==' indexfile  --trim
    erk009
    erk059
    erk111 (starting in measure 8)
    erk130
    erk168
    erk169
    erk197
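
The five duration refined contour states can be sketched as a comparison between each duration and the next one. The Python below is an illustration only. It reads the boundary cases as "exactly half" belonging to "s" and "exactly twice" belonging to "l", which is an interpretation of the list above rather than something confirmed from the themax source. The function name is hypothetical.

```python
from fractions import Fraction

def duration_refined_contour(durations):
    """Return a string of S, s, =, l, L symbols for a list of note durations."""
    symbols = []
    for current, following in zip(durations, durations[1:]):
        if following == current:
            symbols.append("=")
        elif following < current:
            # "S" when the next note is less than half as long, otherwise "s"
            symbols.append("S" if following * 2 < current else "s")
        else:
            # "L" when the next note is more than twice as long, otherwise "l"
            symbols.append("L" if following > current * 2 else "l")
    return "".join(symbols)

# Hypothetical rhythm in quarter notes: two eighth notes, then three dotted quarters.
rhythm = [Fraction(1, 2), Fraction(1, 2), Fraction(3, 2), Fraction(3, 2), Fraction(3, 2)]
print(duration_refined_contour(rhythm))
```

The example rhythm produces the same pattern as the '=L==' query shown above.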

Beat level

The -b option can be used to search using the beat level musical feature. If a note is to be played on a beat, its feature value is "1". If a note is to be played off of the beat, its feature value is "0". All extended regular expression wildcards are allowed in this search field.

Count songs which start with an anacrusis:

    themax -a -b 0 indexfile --total
    94 

Search for songs which contain a sequence of at least 40 notes which alternate on/off of the beat:

    themax -b "(10){20}" indexfile --trim
    erk010 
    erk017 
    erk020 
    erk027 
    erk035 
    erk148 
    erk195 

Metric position

The -l (lower-case L) option is used to search the "metrical position" feature. This is the beat number on which a note attack occurs. Beat values must be separated by one or more spaces.

In 4/4 meters, count the number of songs which contain notes on each of the four beats of a measure (with no notes in between), followed by a note on the downbeat of the next measure:

    themax -T 4/4 -l "1 2 3 4 1" indexfile  --total
    34 

Off-beats are specified by adding a space and then a fractional offset into the given beat. A dash ("-") can also be used to separate the beat from the fractional offbeat part of the number. For example, to search for a dotted eighth followed by a sixteenth note on beat four, you can search using the feature "4 4 3/4" or "4 4-3/4":

    themax -l "4 4 3/4" indexfile --total
    15
    themax -l "4 4-3/4" indexfile --total
    15 

Wildcards in metric position features

~
To search for a note anywhere within a given beat. For example, 3~ will match to:
    3       (note starting on beat 3)
    3-1/2   (note starting an 8th note after beat 3)
    3-3/4   (note starting three 16th notes after beat 3)
    etc.
^
To search for a note anywhere within a given beat, except at the start. For example, 2^ will match to:
    2-1/3   (note starting a triplet 8th note after beat 2)
    2-1/2   (note starting an 8th note after beat 2)
    2-3/4   (note starting three 16th notes after beat 2)
    etc.
.
To search for a note on any beat, use a dot. It acts as a placeholder for a note which can be on any beat, so that the other notes in the query can be searched on specific metric positions.

The wildcard "~" is used after a beat to indicate that the note must fall within a given beat, either on the beat or an offbeat after that beat position, but before the next beat.

Count the number of songs in a 2/4 meter which have three notes occurring during the span of beat 2, with the first note occurring on the beat, and two others on subsequent off-beats before a note on beat 1 of the next measure. In other words, the search query will match to an eighth and two sixteenths on beat 2, or to three triplet eighth notes on beat 2:

    themax -T 2/4 -l "2 2~ 2~ 1" indexfile --total
    37 

The wildcard "^" is used to indicate any offbeat within the given beat, not including any note attack occurring at the start of the beat.

Search for songs which have a note on the first beat of a measure, none on the second beat but two notes on off-beats of beat two, followed by a note on beat three:

    themax -l "1 2^ 2^ 3" indexfile --trim
    erk009
    erk028
    erk058
    erk078
    erk147
    erk148

The dot wildcard in metric position searches represents any beat position. Here is a search for melodies which have a note on beat one, followed by two notes on any beat, followed by a note on beat 3 (either in the same measure or in a different measure):

    themax -l "1 . . 3" liederschatz1.thema  --total
    111
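
The behaviour of the three metric position wildcards can be sketched as follows. This Python is a simplified model, not the regular expressions which themax actually builds: it assumes every position is written in the dash form (such as "2-1/2"), and the function names are hypothetical.

```python
def position_matches(token, position):
    """Test one query token ("3", "3-1/2", "3~", "3^" or ".") against one note position."""
    if token == ".":
        return True                           # any beat position
    beat = position.split("-")[0]             # "3-1/2" -> "3"
    on_beat = "-" not in position
    if token.endswith("~"):
        return beat == token[:-1]             # anywhere within the beat
    if token.endswith("^"):
        return beat == token[:-1] and not on_beat   # within the beat, but not at its start
    return token == position                  # exact position

def query_matches(query, positions):
    """Return True if the query matches a run of consecutive note positions."""
    tokens = query.split()
    for start in range(len(positions) - len(tokens) + 1):
        window = positions[start:start + len(tokens)]
        if all(position_matches(t, p) for t, p in zip(tokens, window)):
            return True
    return False

# Hypothetical 2/4 fragment: an eighth and two sixteenths on beat 2, then a downbeat.
fragment = ["1", "2", "2-1/2", "2-3/4", "1"]
print(query_matches("2 2~ 2~ 1", fragment))
print(query_matches("1 2^ 2^ 3", ["1", "2", "2-1/2", "3"]))
```

The second query fails because "2^" excludes a note attack at the start of beat two.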

Metric level

The -L option is used to search metric level features. The metric level is a log2 indication of the metric stress of a note. Beats are assigned the value 0, eighth-note off-beats are assigned -1, sixteenth-notes after beats and eighth-note off-beats are assigned -2, and so on. In 4/4, the first beat of a measure is +2 and the third beat is +1. For non-negative metric levels (i.e., beats), the symbol "B" (or "b") can be used to indicate any beat. Likewise, "S" (or "s") can be used to indicate any sub-beat (metric position which does not fall on a beat). Note that this is similar to the beat level features. This feature is used to generate the metric refined contour and metric gross contour features.

Count the number of songs which have a downbeat in 4/4 followed by an eighth-note offbeat:

    themax -L "+2 -1" indexfile --total
    30 

Count the number of songs which start with an eighth-note upbeat:

    themax -a -L "-1" indexfile --total
    46 

Find any songs which do not contain sub-beats:

    themax -L "S" -v indexfile --trim
    erk039  
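
The metric level values described above can be illustrated for the 4/4 case with a short Python sketch. It assumes that a note's position is given as its distance from the barline in quarter notes, and it only handles duple subdivisions (eighths, sixteenths, and so on). How themax assigns levels to triplets or to other meters is not covered here. The function names are hypothetical.

```python
from fractions import Fraction

def metric_level_44(offset):
    """Metric level of a note in 4/4, given its offset from the barline in quarter notes."""
    offset = Fraction(offset)
    if offset == 0:
        return 2                      # first beat of the measure
    if offset == 2:
        return 1                      # third beat of the measure
    if offset.denominator == 1:
        return 0                      # any other beat
    denominator = offset.denominator
    if denominator & (denominator - 1):
        raise ValueError("only duple subdivisions are handled in this sketch")
    return -(denominator.bit_length() - 1)   # 1/2 -> -1, 1/4 and 3/4 -> -2, ...

def beat_level(offset):
    """Beat level feature: 1 on a beat, 0 off the beat."""
    return 1 if metric_level_44(offset) >= 0 else 0

# Downbeat followed by an eighth-note offbeat, as in the "+2 -1" query above.
print(metric_level_44(0), metric_level_44(Fraction(1, 2)))
```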

Metric gross contour

Metric gross contour is analogous to pitch gross contour. There are three states which describe the rhythmic relationship between a note and the following one, which can be either on a stronger metrical position, a weaker position or an equivalent position.

The -E option (or --MGC) is used to search metric gross contours, where there are 3 possible states:

  • "U" (up), or "H" (heavier). The next note is at a stronger metrical strength than the current note.
  • "D" (down), or "W" (weaker). The next note is at a weaker metrical strength than the current note.
  • "=" (equals sign), "E" (equal), or "S" (same). The next note is at the same metrical strength as the current note.
Lower-case letters may also be used, and they map to the upper-case versions.

Metric refined contour

Like metric gross contour, metric refined contour is modeled after pitch refined contour. The -e option (or --MRC) is used to search metric refined contour, where there are 5 metric levels:

  • "U" if the next note is on a metric level which is two or more levels weaker than the current one (example: a dotted eighth note followed by a sixteenth).
  • "u" if the next note is one metric level weaker than the current note (example: two sixteenth notes on an eighth-note offbeat).
  • "d" next note is one metric level stronger than current note (example: second to third note in four 16ths)
  • "D" next note is two or more metric levels stronger than the current note (example: eighth note pickup followed by a downbeat).
  • "S", "s", "E", "e", or "=" next note is on the same metric level as the current note (example: two whole notes in 4/4).

Find songs which contain five successive notes on the same metric level:

    themax -e 'SSSS' indexfile  --trim
    erk030 

Other options

Interleaved search query

Features can be searched in parallel by specifying multiple feature options to themax at the same time. Alternately, you can create a single search string which contains the parallel features interleaved together. For example, to search for both pitch and rhythm at the same time for the notes C, D and E all in quarter-note durations:
     themax -p "c d e" -u "4 4 4" 
The -q option can be used to interleave these two options into a single equivalent search string:
     themax -q "p:u c:4 d:4 e:4" 
or, reversing the order of the features:
     themax -q "u:p 4:c 4:d 4:e" 
Spaces separate individual elements in the search, and the first element gives the names of the features being interleaved. Interleaved features within each element are separated by a colon character (:). Some search features have longer equivalent names, so the following command is equivalent to the above three:
     themax -q "pitch:duration c:4 d:4 e:4" 

Any number of interleaved features can occur in the -q option string. Also, interleaved features do not need to have the same number of elements, provided that shorter-length feature queries occur later in each element list. For example, here is a search primarily by interval and duration, but with a starting pitch specified:

     themax -q "interval:duration:pitch P4:4:C +m3:4 -M2:4" 
which is equivalent to:
     themax -u "4 4 4" -p "C" -I "P4 +m3 -M2" 
which means: search for four notes in a row which have the interval pattern perfect fourth (up or down) followed by a rising minor third followed by a falling major second. The first three notes of the search match must be quarter notes, and the search match must start on the pitch-class "C". Note that in this case, there are three elements in the interval and duration queries, but only one in the pitch query. Therefore the pitch query must be listed after the other two features in the interleaved query string.
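
The relationship between an interleaved query and the separate feature queries can be sketched in Python. The function deinterleave below is hypothetical and only models the string layout described above. It does not reproduce how themax itself parses the -q option.

```python
def deinterleave(query):
    """Split an interleaved -q style string into one query string per feature."""
    elements = query.split()
    names = elements[0].split(":")            # first element names the features
    features = {name: [] for name in names}
    for element in elements[1:]:
        # zip() stops at the shorter list, so later features may have fewer values
        for name, value in zip(names, element.split(":")):
            features[name].append(value)
    return {name: " ".join(values) for name, values in features.items()}

print(deinterleave("p:u c:4 d:4 e:4"))
print(deinterleave("interval:duration:pitch P4:4:C +m3:4 -M2:4"))
```

Because values are assigned to features by position within each element, a feature with fewer values has to be named last, as in the pitch example above.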

Returning total entry match count

By default, themax returns the lines in the index file which the search query matched, so that further processing of the index data can be done (via a pipe to themax again, for example). Instead, the --total option can be used to return the number of matches found in the index file. This is equivalent to piping the default output of themax through the command "wc -l".
    themax --total -p "df#a" indexfile
    8
    themax -p "df#a" indexfile | wc -l
    8

Negated queries

The -v option allows a search query to be negated. This causes only entries which do not match to be returned.

Count the number of songs which do not contain the pitch C:

    themax -v -p C indexfile --total
    39 

Search for themes which do not contain a descending minor second:

    themax -v -I "-m2" indexfile  --trim
    erk040
    erk042
    erk068
    erk108
    erk159
    erk162
    erk188
    erk191

Display regular expression search query

The --regex option can be used to display the regular expression which will be used to search the index file given the input query options. The actual search will not be done, and the program will exit after the regular expression is printed to standard output. The regular expression uses extended syntax which can be used in the egrep program.
    themax -a -M --regex -d 135 indexfile
    Z[^=]*=.*%135
    themax -a -M -d 135 indexfile --total
    12
    egrep `themax -a -M --regex -d 135` indexfile | wc -l
    12

Prevent cleaning of search queries

By default, the themax command will attempt to automatically clean up the search queries for musical features so that users can input the features in multiple ways. For example, the pitch search using the -p option accepts both C, ut and do for the pitch name C.

If you are using themax under automated conditions, you can add the --raw option to prevent such pre-processing of the search queries. This will save some negligible time, and will allow the use of extended regular expressions directly on the data. Wildcard characters specific to themax (and not to regular expressions) are not available when using the --raw option.

Display post-processed user query by features

A more verbose version of --regex can be viewed with the --features option. This option will display the cleaned version of the user input queries which is useful for debugging complex queries.

In the following example, -M is converted into Z[^=]*=, -p "utdoissies" is converted into (?:C) (?:C#) (?:Bb), and the final regular expression which will be used to search the index file is Z[^=]*=.*J(?:C) (?:C#) (?:Bb).

    themax -a -M  -p "utdoissies" indexfile --features
    Tonic:		Z[^=]*=
    Pitch-class:	(?:C) (?:C#) (?:Bb)
    Final Regular Expression: Z[^=]*=.*J(?:C) (?:C#) (?:Bb)[ 	]

Match counts per line

The --count option will display the number of matches to the search query found on each line in the index file. This can be used to obtain statistics on the frequency of a pattern in a database or a particular file.
    themax -p "g g e" indexfile --count
    erk027  1
    erk033  1
    erk043  2
    erk045  1
    erk058  1
    erk075  2
    erk101  1
    erk109  2
    erk129  2
    erk192  2
    erk200  3 

The output from themax when the --count option is given is the first column from the matched line, followed by the number of times the search query was found on that line. In this case erk200 contains three match locations for the search query "g g e". Other matching entries contain one or two "g g e" pitch sequences.

When the --count option is used at the same time as the --total option, the last line of the output will be a count of the number of times the search query was found in all matched entries:

    themax -p "df#a" indexfile  --count --total
    erk005  1
    erk024  1
    erk047  2
    erk050  1
    erk094  1
    erk104  1
    erk107  1
    erk166  1
    9 

In the above example, 9 is a count of the number of occurrences of the search query in all matched entries in the index file. There are 8 matched songs from the index file, but there is one song where the pattern "D F# A" occurs twice, so the total count is listed as 9 (number of matched patterns) instead of 8 (number of songs containing at least one matched pattern).
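
The difference between the two totals can be checked by post-processing the --count output. The Python sketch below assumes the two-column layout shown above (an identifier followed by a count). The function name summarize_counts is hypothetical.

```python
def summarize_counts(lines):
    """Return (number of matched entries, total number of matches) from --count output."""
    counts = {}
    for line in lines:
        fields = line.split()
        # Keep only "identifier count" lines; a lone total line is skipped.
        if len(fields) == 2 and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return len(counts), sum(counts.values())

# The output listed above, including the final total line.
output = """erk005  1
erk024  1
erk047  2
erk050  1
erk094  1
erk104  1
erk107  1
erk166  1
9""".splitlines()

entries, matches = summarize_counts(output)
print(entries, matches)
```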

Here is the number of perfect 4ths, 5ths, tritones, and the number of major/minor 6ths which occur in all of the songs ("tail -n 1" means to display only the last line from the output of themax):

    themax -I P4 --total --count indexfile | tail -n 1
    792
    themax -I P5 --total --count indexfile | tail -n 1
    309
    themax -i 6 --total --count indexfile | tail -n 1
    27
    themax -I 6 --total --count indexfile | tail -n 1
    383

Using --total with --count is also an easy method to count the number of notes of a particular pitch class in a corpus. Counting the number of "C" pitches in the set of example songs:

    themax -p C --total --count indexfile | tail -n 1
    1586

Counting the number of C-sharps and D-flats:

    themax -p C-sharp --total --count indexfile | tail -n 1
    451
    themax -p D-flat --total --count indexfile | tail -n 1
    30

Counting both C-sharp and D-flats at the same time, as a 12-tone pitch class feature (which should be the sum of C-sharps and D-flats counted independently):

    themax -P 1 --total --count indexfile | tail -n 1
    481

Displaying match starting-note locations

The --location option is similar to the --count option, except that the starting note(s) of each match within the features is listed rather than the total number of matches within an index line.
    themax -p "g g e" indexfile --location
    erk027  10
    erk033  14
    erk043  9 20
    erk045  4
    erk058  2
    erk075  6 38
    erk101  12
    erk109  6 46
    erk129  4 20
    erk192  25 33
    erk200  62 76 99
For these search results, the pitch pattern "g g e" occurs starting on note 10 of erk027, note 14 of erk033, on both notes 9 and 20 of erk043, and so on.

Displaying match start/stop notes

Use the --location2 option instead of --location in order to list both the starting note and the ending note of the match. As an example usage of the --location2 option, consider a one-octave ascending C major scale (eight notes, from C up to C), stored in the file scale.krn.
This scale will be searched by diatonic interval, so the tindex command to extract the data necessary for searching can use the option -f "INT" to extract only interval information, and the -E option to suppress key and meter extraction.
    tindex -E -f "INT" scale.krn
Searching for the interval pattern of three seconds in a row will find many matches in the data. In the data, the pattern is found starting on note 1 through 4, 2 through 5, 3 through 6, 4 through 7, and 5 through 8.
    tindex -E -f "INT" scale.krn | themax --location2 -I "2 2 2"
However, searching for three major seconds in a row will only find a match on notes 4 through 7.
    tindex -E -f "INT" scale.krn | themax --location2 -I "M2 M2 M2"
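
The note ranges quoted above can be reproduced with a small Python model of the scale. This is a sketch which only mimics the start/stop note numbering described in this section. It assumes an ascending one-octave major scale, and it is not how themax computes locations. The names used are hypothetical.

```python
# Semitone sizes of the seven intervals in an ascending one-octave major scale.
MAJOR_SCALE_SEMITONES = [2, 2, 1, 2, 2, 2, 1]

def interval_names(semitones):
    """Name each second as major ("M2") or minor ("m2")."""
    return ["M2" if size == 2 else "m2" for size in semitones]

def match_spans(intervals, pattern, generic=False):
    """Return (start note, end note) pairs, numbered from 1, where the pattern occurs."""
    if generic:
        intervals = [name[-1] for name in intervals]   # drop the quality: "M2" -> "2"
    length = len(pattern)
    spans = []
    for i in range(len(intervals) - length + 1):
        if intervals[i:i + length] == pattern:
            spans.append((i + 1, i + length + 1))      # n intervals span n+1 notes
    return spans

scale = interval_names(MAJOR_SCALE_SEMITONES)
print(match_spans(scale, ["2", "2", "2"], generic=True))
print(match_spans(scale, ["M2", "M2", "M2"]))
```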

Returning filenames only

The --trim option will remove all data fields in the matching lines except for the first column of data (the filename or identity string). This is equivalent to piping the default output of themax through the command "awk '{print $1}'".
    themax --trim -p "df#a" indexfile
    erk005
    erk024
    erk047
    erk050
    erk094
    erk104
    erk107
    erk166 
    themax -p "df#a" indexfile | awk '{print $1}'
    erk005
    erk024
    erk047
    erk050
    erk094
    erk104
    erk107
    erk166 

Parallel feature searching

The themax command allows for multiple musical features to be searched at the same time. By default, the features will be required to start on the same note, although the various feature queries can be of different lengths. When wildcards match a variable number of notes, the only requirement is that each independent feature query starts on the same first note.

As an example, here is a search of the pitch sequence "G E G G" and the duration sequence "8 8 8 8" at the same time:

    themax -p "g e g g" -u "8 8 8 8" indexfile --trim
    erk075
    erk192 
In this case there are two songs which contain the pitch sequence "g e g g" and the duration sequence "8 8 8 8" starting on the same note.

If, however, you want to search for songs which have both a pitch sequence of "g e g g" and a rhythm sequence of "8 8 8 8", but not necessarily at the same time, then use the --unlink option to indicate that the multiple feature queries are not required to be linked to same starting note:

    themax -p "g e g g" -u "8 8 8 8" --unlink indexfile --trim
    erk075
    erk192
    erk200

In this case there is an extra song which matches the search query. This song contains both the pitch and rhythm sequences, but these features do not align to the same notes (the pitch sequence occurs on the rhythms "8 2 4 8").

Another way to unlink multiple features (or do two independent searches on the same feature) would be to pipe the output from themax into another call to themax:

    themax -p "g e g g" indexfile | themax -u "8 8 8 8" --trim
    erk075
    erk192
    erk200
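
The difference between linked and unlinked feature searches can be modelled in Python as a comparison of match starting positions. The functions and the example song below are hypothetical. The sketch only illustrates the "same starting note" rule and ignores wildcards.

```python
def find_starts(sequence, pattern):
    """Return the set of note indexes at which the pattern begins."""
    length = len(pattern)
    return {i for i in range(len(sequence) - length + 1)
            if sequence[i:i + length] == pattern}

def song_matches(pitches, durations, pitch_query, duration_query, unlink=False):
    """Linked search needs a shared starting note; unlinked only needs both to occur."""
    pitch_starts = find_starts(pitches, pitch_query)
    duration_starts = find_starts(durations, duration_query)
    if unlink:
        return bool(pitch_starts) and bool(duration_starts)
    return bool(pitch_starts & duration_starts)

# Hypothetical song: "g e g g" falls on the rhythms "8 2 4 8", and four
# eighth notes in a row occur later on other pitches.
pitches = ["g", "e", "g", "g", "c", "d", "e", "f"]
durations = ["8", "2", "4", "8", "8", "8", "8", "8"]

print(song_matches(pitches, durations, ["g", "e", "g", "g"], ["8", "8", "8", "8"]))
print(song_matches(pitches, durations, ["g", "e", "g", "g"], ["8", "8", "8", "8"], unlink=True))
```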

Themax does not search absolute pitch values (pitches with octave information), only pitch chroma. One way to search in a semi-absolute pitch manner is to combine the pitch (or 12-tone pitch) feature with an interval or contour search.

For example, searching for the sequence "C A" will return all songs which contain a C pitch followed by an A pitch, regardless of whether the following A pitch is above or below the C pitch:

    themax -p "C A"  indexfile --total
    82

A parallel contour search can separate cases where the A pitch is above or below the C pitch:

    themax -p "C A"  -C D indexfile --total
    78
    themax -p "C A" -C U indexfile --total
    9
    themax -pca -Cu indexfile | themax -pca -Cd --total
    5

78 songs contain "C A" with the A pitch below the C pitch, while 9 songs contain a rising A pitch. Five songs contain a melodic fragment which has both a rising A pitch and a falling A pitch.

Kern-based note searching

A special case of parallel feature searches can be done with the -k option. This option allows for a sequence of kern notes to be used as the search query. The kern notes can contain both pitch and duration information (in any order), or just one feature of pitch or duration. The **kern notes can only contain pitch and duration values, and each note must be separated from adjacent notes in the sequence by one or more spaces. No other **kern characters (such as articulations or stem directions) are allowed.

Search for the pitch/duration sequence "8c 8e 4g":

    themax -k "8c 8e 4g" indexfile --trim
    erk171
    erk200 

The above search is equivalent to specifying the features independently:

    themax -p "c e g" -u "8 8 4" indexfile --trim
    erk171
    erk200 

If a note is missing either pitch or duration information, then a wildcard for the missing feature will be inserted automatically in the equivalent independent feature searches. In the following example, the middle note contains only a duration value, so any pitch is allowed as the second note in the search query:

    themax -k "8c 8 4g" liederschatz1.thema   --trim
    erk027
    erk034
    erk107
    erk110
    erk160
    erk171
    erk200 

which is equivalent to the following search:

    themax -p "c . g" -u "8 8 4" indexfile --trim
    erk027
    erk034
    erk107
    erk110
    erk160
    erk171
    erk200 
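
The way a -k query breaks down into separate pitch and duration queries can be sketched in Python. The function split_kern_query is hypothetical. It uses "." as the pitch wildcard, as in the example above, and assumes "x" as the wildcard for a missing duration, based on the duration wildcard described for the -u option.

```python
import re

def split_kern_query(query):
    """Split kern-style notes into a pitch query and a duration query."""
    pitches, durations = [], []
    for note in query.split():
        duration = "".join(re.findall(r"[0-9.]+", note))     # digits and augmentation dots
        pitch = "".join(re.findall(r"[A-Ga-g#n-]+", note))   # letter names and accidentals
        pitches.append(pitch if pitch else ".")      # any pitch
        durations.append(duration if duration else "x")   # any duration (assumed wildcard)
    return " ".join(pitches), " ".join(durations)

print(split_kern_query("8c 8e 4g"))
print(split_kern_query("8c 8 4g"))
```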

Search limit

The --limit option can be used to limit how long a search of the index file(s) runs. When a limit is given, the themax program will stop searching once the specified number of matching entries has been found in the index file.
    themax -P "C" indexfile --total
    162
    themax -P "C" --limit 100 indexfile --total
    100

When using --limit with the --count or --location options, the limiting will still apply to index entries, not to the output values given by --count or --location. In the following two example uses of themax, the total number of C pitches in the index file is 1472, but there are 1009 C pitches in the first 100 songs which contain at least one C pitch.

    themax -P "C" --limit 100 indexfile --count --total | tail -n 1
    1009
    themax -P "C" indexfile --count --total | tail -n 1
    1472

Segmentation boundaries

When a thema index has been created with tindex using any of the options --rest, --phrase or --fermata, the resulting index will contain segmentation boundaries which are the character R followed by an optional space (depending on the feature).

If segmentation markers are encoded within the features, the symbol R or r can be used in feature queries. In other words, searching for the diatonic pitch-class sequence "C R G" will search for a C followed by a segmentation boundary, followed by a G.

These segmentation boundaries can be ignored when searching features such as diatonic pitch-class names by using the -B option. The -B option should not be used for searching intervallic data (of both rhythm and pitch features) since this will yield inaccurate results. For intervallic searches without segmentation boundaries, the tindex command should be run without --rest, --fermata or --phrase instead of using the -B option; otherwise, inaccurate results are possible.

Thema indexes generated with tindex may contain the control messages #REST, #FERMATA, or #PHRASE if the -q option was not used to suppress these messages. In any case, the presence of an R within a feature indicates that one or more of the segmentation options in tindex were used.

Control messages

The themebuilderx program may store control messages in the thema index data if certain options are used. By default, these messages are automatically echoed by the themax command. If the matching entries need to be counted, use the --no-messages option to suppress these messages, or filter out lines in the output which start with a "#" character. A count of matched index entries can be obtained by piping the output of themax --no-messages into the wc -l command, which counts the number of lines in its input. In some cases these messages are necessary for theloc to produce correct output. If the required messages are not passed to theloc, you can instead specify them manually on the command-line call to theloc.

REFERENCES

  • Sapp, Craig, Yi-Wen Liu, and Eleanor Selfridge-Field. "Search-Effectiveness Measures for Symbolic Music Queries in Very Large Databases", ISMIR 2004, Barcelona, Spain. October 10-14, 2004.

SEE ALSO

    • thememakerx (optional use: create incipits from longer files), tindex (create required musical feature index which themax reads), and pae2kern (used to convert Plaine & Easie data used in RISM A/II entries into **kern data for processing with thememakerx and searching with themax). See the United States RISM A/II section of Themefinder for an application of pae2kern in conjunction with themax.

    • The theloc command, which can be used in conjunction with the --location option to identify the position of notes within the music.

DOWNLOAD

    The compiled themax program can be downloaded for the following platforms:
    • Linux (i386 processors) (dynamically linked) compiled on 28 Jun 2012.
    • Windows compiled on 29 Jun 2012.
    • Mac OS X/i386 compiled on 13 Nov 2013.

    The source code for the program was last modified on 17 Jan 2011.