Humdrum Extras

hgrep manpage


COMMAND

    hgrep -- Humdrum-aware grep.

SYNOPSIS

    hgrep [options] searchstring [input [> output]]

OPTIONS

-a Display the absolute beat position of a match in a Humdrum file which contains **kern data.
-b Display the metrical beat position of a match in a Humdrum file which contains **kern data.
-B Search only in bibliographic records.
-C Search only in comment records (global, local, and bibliographic).
-d Search only in data records (excluding measures).
-D Search from the top of the file until the first data line (but not stopping the search at any initial barlines).
-f Display the absolute position of a match in a Humdrum file which contains **kern data, normalized as a fraction of the total duration of the music represented in the file.
-k Search only in **kern spines, ignoring other data types. (see -x)
-m Display the bar number of the measure in which the match was found (for files which contain **kern data). The bar numbers must be present in the data. See the barnum command if you need to add bar numberings.
-N Resolve null data tokens into the data token they represent (after searching). By default the null token resolution will be enclosed in parentheses. Use --no-paren to display raw resolution.
-p Search only in the primary (left-most) spine of data in a multi-field exclusive interpretation spine (which occurs when a spine is split with the *^ spine manipulator). Only works if -k or -x is also being used.
-P Search only non-primary spines in multi-spine exclusive interpretation (opposite of the -p option). Only works if -k or -x are also being used.
-q Quiet mode. Do not display the line which matches the search string, but do print any location information according to options such as -a, -b, -f, -m, -n, and -s.
-s Display the spine number in which the match occurs. Can only be used with the -k, -x or -t options.
-t Search only in tandem interpretations.
-T Treat individual tokens as independent lines so that ^ matches the start of any data token, and $ matches to the end of any token.
-x exinterps Search only spines of the listed exclusive interpretation types, ignoring all other spines, as well as global comments and bibliographic records. Example: -x "**text, **kern" where the comma and space are optional. (see -k).
--and search Also search using the following search string(s), with the search results anded with the primary search string. Multiple search strings can be specified by separating them with the newline character or the text string "\n".
--sep separator Use the given string as a separator between location information at the start of the line (when -a, -b, -f, -m, and/or -n is used).

Options which are similar to those of grep:
-G Use basic regular expression syntax. The default is to use extended regular expression syntax (like most other Humdrum Toolkit commands).
-H Display the filename before the matching line.
-h Suppress the filename from being printed along with the matching line.
-i Ignore the difference between upper and lower case letters in the search string.
-L List only the names of the files which do not contain any matches for the search string.
-l List only the names of the files which contain one or more matches for the search string.
-n Display the line number of the match before the matched line.
-v Invert the match / non-matching status of lines which match the search string.

DESCRIPTION

    Hgrep functions similarly to the standard Unix tool grep, except that it can selectively search exclusive interpretation data and can also report the location of matches in more musically intelligent ways, such as the measure in which a match occurs, or the beat of a measure on which the match occurs.

    Regular Expressions

    The search string is the only required argument to hgrep. Optionally, a list of files to search can follow the search string. If no files are listed after the search string, then input data will be read from standard input (such as data piped from another command).

    By default, both the basic and extended regular expression sets of meta characters can be used in the search string. The basic set alone can be used instead if you specify the -G option. Here is a list of the regular-expression operators:

      Basic Regular-Expression Operators:
      . Matches to any single character.
      * Matches to zero or more occurrences of the character (or atoms) which it follows
      [] Match to one character in the list between the brackets.
      ^ Called the beginning-of-line anchor. Does not match to specific characters, but rather to the start of a line. If you use this search operator when also using the -d option to search only data, or the -k/-x options for searching through specific types of data, then the ^ operator will match to the start of a data token (of which there may be several on a line). To complicate matters more ^ is also used to negate the character list inside of the [] operator.
      $ Matches to the end of a line. When -k, -x or -t options are used, then $ matches to the end of individual spine tokens.
      Extended Regular-Expression Operators:
      ? Matches to zero or one occurrence of the character or atom which precedes it.
      + Similar to *, but matches to one or more of the preceding character or atom.
      | Logical or operator for searching for alternate strings.
      () Combines multiple characters into an atom.
      {} Generalized counting operator; related to ?, * and +.
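
    The token-level behavior of ^ and $ described above can be imitated with standard tools. The following is only a sketch using plain grep rather than hgrep itself: it assumes tab-separated tokens on a data line and places each token on its own line so that the anchors apply per token. The sample line and the variable names are made up for illustration.

```shell
# Hypothetical **kern data line with three tab-separated tokens.
line=$(printf '8cc\t4e\t4.g')

# Put each token on its own line, then anchor the pattern to the token start.
tokens_starting_with_4=$(printf '%s\n' "$line" | tr '\t' '\n' | grep -E '^4')

printf '%s\n' "$tokens_starting_with_4"
```

    Searching the unsplit line with the same pattern would find nothing, because the line as a whole starts with 8.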

    Examples:

    Search for notes which are equal to or higher than C5:

            hgrep -kd '[a-g]{2,}' file.krn
     

    Search for starting or ending phrase marks:

           hgrep -kd '\{|\}' file.krn
           hgrep -Gkd '{\|}' file.krn
     

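    Search for C-naturals, but not C-sharps or C-flats. The -i option ignores the case of letters, so capital or lowercase text will match the search string. The pitch letter is also excluded from the trailing character class so that a doubled letter followed by an accidental, such as cc#, is not matched by its first letter:

           hgrep -kid 'c+([^c#-]|$)' file.krn
           hgrep -kd '[Cc]+([^Cc#-]|$)' file.krn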
     

    Search for quarter notes (but not, for example, rhythm values of 24 (triplet sixteenths)):

           hgrep -kd '(^|[^0-9])4([^0-9]|$)' file.krn
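
    The quarter-note pattern can be checked against a few sample tokens with ordinary grep -E. This is a sketch under the assumption that each token sits on its own line, which stands in for the per-token matching that hgrep performs with -k; the tokens themselves are invented.

```shell
# Hypothetical rhythm tokens, one per line.
tokens=$(printf '4c\n24d\n4.e\n8f\n14g\n')

# A 4 that is neither preceded nor followed by another digit.
quarters=$(printf '%s\n' "$tokens" | grep -E '(^|[^0-9])4([^0-9]|$)')

printf '%s\n' "$quarters"
```

    Note that the dotted quarter 4.e is also selected, since the dot is not a digit, while 24d and 14g are rejected.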
     

    If you want to search for a character which also serves as a regular expression operator, then place a backslash before the character. For example \+ will search for a plus character, while + without the backslash will be interpreted as the operator which matches to one or more of the preceding atom. Note that if you use the -G option, the extended regular expression operators are not recognized as being special characters, so the non-backslashed versions are literal characters. However, if you add a backslash in front of one of these characters, then it will be treated as an extended regular-expression operator. For example, the meanings of + and \+ are reversed when using the -G option. This is helpful when searching for characters which are normally used as regular-expression operators, since it avoids backslashes which make the regular expression more difficult to read (such as in the phrase-mark example above).
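
    The difference between the two syntaxes can be tried out with standard grep, whose -E and -G options select extended and basic syntax. This sketch does not call hgrep, and the two sample lines are invented; it only shows the behavior of an unescaped and an escaped plus sign.

```shell
# Two made-up lines: one with a literal plus sign, one without.
sample=$(printf 'a+b\naab\n')

# Extended syntax: + is an operator, \+ is a literal plus sign.
ere_operator=$(printf '%s\n' "$sample" | grep -E 'a+b')
ere_literal=$(printf '%s\n' "$sample" | grep -E 'a\+b')

# Basic syntax (-G): an unescaped + is a literal plus sign.
bre_literal=$(printf '%s\n' "$sample" | grep -G 'a+b')

printf 'extended a+b:  %s\n' "$ere_operator"
printf 'extended a\\+b: %s\n' "$ere_literal"
printf 'basic a+b:     %s\n' "$bre_literal"
```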

    For more information about regular expressions, search for the term in a search engine (such as on Google.com) or read the regular expression page on Wikipedia.

    Location information

    The options -a, -b, -f, -m, -n, and -s will display information about the location of a match in the Humdrum file. The location information will be displayed in front of the matched line.

    -a Absolute beat position of match: number of quarter-note durations from the start of the file.
    -b Metrical beat location inside of current measure.
    -f Fractional position of the match. Equivalent to the absolute beat position (-a) divided by the total score duration of the music in the file.
    -m Displays the current measure in which the match was found. The measure number is taken from the last barline above the matched line which is labeled with a measure number.
    -n Display the line number on which the match was found.
    -s Display the spine (data column) in which the match was found.
    -t Search only in tandem interpretations.

    Any of these options can be mixed and matched. The options -a, -b and -f require data which can be parsed rhythmically, such as **kern data, in order to work. The options -n and -s can be used with any type of data, and -m (measure number) can be used when measures are labeled with numbers in the data file.
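
    The relationship between -a and -f stated above is a simple division. As a worked example with invented numbers (a match 12 quarter notes into a score lasting 48 quarter notes), the fraction can be computed as follows; the three-decimal formatting is an assumption made for this sketch.

```shell
# Hypothetical values: absolute beat of the match and total score duration,
# both counted in quarter notes.
absolute_beat=12
total_duration=48

fraction=$(awk -v a="$absolute_beat" -v t="$total_duration" 'BEGIN { printf "%.3f", a / t }')

printf '%s\n' "$fraction"
```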

    Location information will be separated from the matched line and other location information fields by a colon character (:). This separator can be changed to another string by using the --sep option. --sep "\t" can be used to separate information with tabs, and --sep "\n" to use the newline character.

    file.krn
    hgrep -a star file.krn

    hgrep -b star file.krn

    hgrep -f star file.krn

    hgrep -m star file.krn

    hgrep -mb star file.krn

    hgrep -n star file.krn

    hgrep -x "**text" -s star file.krn

    Adding measure numbers before using -m

    The measure location of matches reported when using the -m option requires a number attached to each barline. If there are no numbers on the barlines, then -m will not return useful results (measure number for any match will be 1). To add measure numbers beforehand, you can use the barnum command:

    file.krn
    hgrep -m cc file.krn
    invalid measure numbers due to missing numbers in data.


    barnum file.krn | hgrep -m cc
    valid measure numbers (supplied by barnum).

    Barlines which split a full measure are labeled as the same measure before and after the incomplete measure barline:

    file.krn
    hgrep -mb cc file.krn

    Selective search regions

    Searching can be limited to certain types of lines or to spines belonging to a particular exclusive interpretation. To only search in data lines, ignoring global/bibliographic comments, local comments, interpretations or measure lines, use the -d option. This is analogous to using the Humdrum Toolkit command rid -GLId before searching with grep.
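
    The effect of restricting a search to data lines can be approximated with plain grep. The sketch below is not how hgrep is implemented; it assumes that comments start with !, interpretations with * and barlines with =, drops those lines, and then searches what is left. The small Humdrum fragment is invented.

```shell
# Hypothetical two-spine Humdrum fragment (tab-separated).
humdrum=$(printf '!!!COM: Example\n**kern\t**text\n*M4/4\t*\n=1\t=1\n4c\tstar\n4d\t.\n*-\t*-\n')

# Keep only data lines: drop comments (!), interpretations (*) and barlines (=).
data_lines=$(printf '%s\n' "$humdrum" | grep -v '^[!*=]')

# Search the remaining data lines.
matches=$(printf '%s\n' "$data_lines" | grep 'star')

printf '%s\n' "$matches"
```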

    To search only in **kern spines, use the -k option. This will search any type of token in **kern spines, for example: local comments, interpretations, barlines, and musical data. Global comments and bibliographic records will be ignored when using the -k option. To search only in **kern data tokens, ignoring local comments, interpretations and barlines, combine the -k and -d options:

          hgrep -kd 4 file.krn 
    which will search for all rhythms containing a "4" in the **kern data.

    The -k option is shorthand for the more verbose option -x "**kern", which searches only in **kern spines. You can list more than one exclusive interpretation type in the argument to the -x option. For example, -x "**text, **silbe" will search only in spines which belong to **text or **silbe exclusive interpretations, while all other types of spines will be ignored. Alternative formats of the -x option which behave the same way: -x "**text **silbe" and -x "**text**silbe". Note that the exclusive interpretation list only has to match the starting portion of an exclusive interpretation, so -x "**dyn" will search in both **dyn and **dynam spines.
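
    The prefix rule for exclusive interpretation names can be illustrated with a small shell function. The function name matches_exinterp is hypothetical and exists only for this sketch; it reports success when a spine's exclusive interpretation begins with the requested name.

```shell
# Succeed when the spine's exclusive interpretation ($2) starts with the
# requested name ($1). Quoting "$1" keeps the asterisks literal.
matches_exinterp() {
  case "$2" in
    "$1"*) return 0 ;;
    *) return 1 ;;
  esac
}

# List which of these spine types a request for **dyn would select.
selected=""
for spine in '**dyn' '**dynam' '**kern'; do
  if matches_exinterp '**dyn' "$spine"; then
    selected="$selected $spine"
  fi
done

printf '%s\n' "$selected"
```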

    The -t option works similarly to the -d option, except that only tandem interpretations are searched, while other types of lines are ignored. This option might be useful, for example, when searching for the location of metronome markings in **kern data:

          hgrep -tk '^.MM[0-9.]+$' file.krn

    Here is another example of using the tandem search to count the number of clef changes present in Chopin mazurkas:

          hgrep -t 'clef' mazurka*.krn | wc -l
    which returns 392 matches. This means that the clef changes 340 times across the 52 mazurkas that were searched (presumably after discounting one initial clef line per file), or about 6.5 times per mazurka.

    If you want to only search in comment records (global comments, bibliographic records, or local comments), then use the -C option. Both the -d and -t options can be used at the same time, but -C cannot be used together with either of those two options. If you want to only search in bibliographic records, but not in local comments or other global comments, then use the -B option. The -f option combined with the -B option can be useful to determine if bibliographic records are found before, after or in the middle of the data file:

    file.krn
    hgrep -Bf "." file.krn

    In the above example, bibliographic records found before the first data line (either before or after the first exclusive interpretation line) are labeled with "frac 0.000". Lines which occur after the last data token (either before or after the last interpretation lines) are labeled with "frac 1.000". Bibliographic records occurring somewhere in the middle of the data records in the file will be labeled with a fractional value between 0.000 and 1.000.

    --and Usage

    Several searches can be done at the same time, with their results logically anded together. For example, here is a search for lines where there is a note higher than B4 played at the same time as a note lower than C3:

          hgrep -kd '[a-g]{2,}' --and '[A-G]{2,}' file.krn

    Multiple anded searches can be done at the same time. For example, here is a search for lines which contain the three diatonic pitch classes C, E, and G on the same line:

          hgrep -kid 'c' --and 'e\ng' file.krn
    In the --and search string, a newline character or the string "\n" is used to separate searches.
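
    The logical "and" of several searches can be imitated by chaining ordinary grep commands, so that a line survives only if every pattern matches it. This is a sketch of the idea rather than of the way hgrep works internally, and the three sample lines are invented.

```shell
# Hypothetical data lines, each holding three tab-separated notes.
lines=$(printf '4c\t4e\t4g\n4c\t4d\t4f\n4e\t4g\t4b\n')

# A line is kept only if it contains c AND e AND g (ignoring case).
anded=$(printf '%s\n' "$lines" | grep -i 'c' | grep -i 'e' | grep -i 'g')

printf '%s\n' "$anded"
```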

    If you want to do anded searches located in several data types, you can add an exclusive interpretation requirement in front of an anded search entry. That exclusive interpretation requirement will apply to all subsequent search strings in the --and list. Here is an example of searching for a diatonic pitch class C which occurs on the same line as **text data containing a vowel:

           hgrep -kid c --and '**text\n[aeiou]' file.krn
           hgrep -idx '**kern' c --and '**text\n[aeiou]' file.krn
           hgrep -idx '**text' '[aeiou]' --and '**kern\nc' file.krn
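
    The same kind of cross-spine condition can be sketched with awk for the simple case of a file with exactly two tab-separated columns. The assumption that column 1 is **kern and column 2 is **text is made only for this illustration, as are the sample lines; hgrep itself identifies spines from the exclusive interpretations.

```shell
# Hypothetical data lines: column 1 is assumed to be **kern, column 2 **text.
lines=$(printf '4c\tstar\n4d\tlight\n4cc\t.\n')

# Keep lines whose first column contains a C (any case) and whose second
# column contains a vowel.
both=$(printf '%s\n' "$lines" | awk -F '\t' 'tolower($1) ~ /c/ && $2 ~ /[aeiou]/')

printf '%s\n' "$both"
```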
     

EXAMPLES

ONLINE DATA

    Input arguments or piped data which are expected to be Humdrum files can also be web addresses. For example, if a program can process files like this:
           program file.krn
    It can also read the data over the web:
           program http://www.some-computer.com/some-directory/file.krn
    Piped data works in a somewhat similar manner:
           cat file.krn | program
    is equivalent to a web file using this form:
           echo http://www.some-computer.com/some-directory/file.krn | program

    Besides the http:// protocol, there is another special resource indicator prefix called humdrum:// which downloads data from the kernscores website. For example, using the URI humdrum://brandenburg/bwv1046a.krn:

          program humdrum://brandenburg/bwv1046a.krn
    will download the URL:
    Which is found in the Musedata Bach Brandenburg Concerto collection.

    This online-access of Humdrum data can also interface with the classical Humdrum Toolkit commands by using humcat to download the data from the kernscores website. For example, try the command pipeline:

          humcat humdrum://brandenburg/bwv1046a.krn | census -k

SEE ALSO

DOWNLOAD

    The compiled hgrep program can be downloaded for the following platforms:
    • Linux (i386 processors) (dynamically linked) compiled on 28 Jun 2012.
    • Mac OS X/i386 compiled on 13 Nov 2013.
    • Mac OS X/PowerPC (version 10.2 and higher) compiled on 13 May 2009.

    The source code for the program was last modified on 6 Apr 2013.