Humdrum Extras

extractx manpage


COMMAND

    extractx -- Select spines from Humdrum input.

SYNOPSIS

    extractx [-s spines | -x spines | -t tracefile | -c] [input [> output]]

OPTIONS

-c exinterp Exclusive interpretation used for "c" co-spine extraction modifiers in -s arguments.
-C Return a count of the spines present in the input data.
-e Expand all spines which contain subspines.
-E exinterp Expand all spines which match the given exclusive interpretation and contain subspines.
-g regex Extract all spines which contain the regular expression pattern.
-i interps Extract spines which contain any interpretations found in the list.
-I interps Exclude spines which contain any interpretations found in the list.
-m model Secondary sub-spine expansion model. The model is a single letter: d = duplicate (default), copy the primary subspine token if the secondary subspine is not present; n = use a null token if the secondary subspine is not present; r = use rests if the secondary **kern subspine is not present.
-M model Cospine enumeration model. If the model is 'r' then rests in **kern are not considered when doing cospine extraction; otherwise rests are included by default.
-r Reverse spine order, grouped by **kern. See -R for grouping by other exclusive interpretations.
-R exinterp Reverse spine order, grouped by exinterp.
-s spines Extract spines from the Humdrum data according to the spines list. -p is an alias for -s. Currently -f is also an alias, but this may change in the future.
-t tracefile Extract spines from the Humdrum data according to the data found in the trace file.
-T separator The token separator in co-spines (see the -c option). By default the separator is a single space: -T " ".
-x spines Exclude spines from the Humdrum data according to the spines list.
-Y Do not display editorial interpretation mark "yy" on rests when doing secondary subspine extraction using the -mr method.

DESCRIPTION

    Extractx emulates the behavior of the Humdrum Toolkit command called extract. Extractx is nearly backwards compatible with the extract command, and has these enhancements:
    1. Exclusion of particular spine paths.
    2. Extraction of multiple spine paths at the same time.
    3. Extraction of spines in different sequential orderings.
    4. Extraction of multiple copies of the same spine path.
    5. Extraction of sub-spine paths.
    6. Extraction of co-spine paths.

    Basic spine extraction

    The primary purpose of extractx is to select a subset of spines from a Humdrum file. For example, extractx can be used to extract an instrumental part from a full score, or a particular data spine can be pulled out of a file.

    Spines are enumerated from left to right (and top to bottom) in a file, with the first spine labeled as "1". To extract a particular spine, use the -s option (or equivalently -p) followed by the spine number. Here is an example which extracts the second spine from the input data:

       extractx -s 2 input-file > output-file
    input-file:
    output-file:

    A range of spines can be extracted in one pass by placing a dash between the starting and ending spine. Here is an example which extracts the second, third and fourth spine from the input data:

       extractx -s 2-4 input-file > output-file
    input-file:
    output-file:

    If non-consecutive spines need to be extracted, then separate each spine by a comma. Note that the spine extraction string should not include spaces (such as after commas) since this will confuse the command-line interpreter. However, if you place single or double quote marks ('...' or "...") around the string, spaces are tolerated. Here is an example of how to extract the first, third, and fourth spines from a file:

       extractx -s 1,3,4 input-file > output-file
    input-file:
    output-file:

    The dollar sign ($) is used as a variable symbol representing the last spine contained in the file. Additionally, spines can be referenced from the end by placing a number after the dollar sign. $1 (or $-1 for backwards compatibility with the extract command) refers to the penultimate spine (one spine before the end). Similarly, $2 or $-2 is the spine two before the last spine. When using a dollar sign in a spine extraction string, it is wise to enclose the string in single quotes ('...') to prevent the shell command interpreter from trying to treat the dollar sign as part of a shell variable name. Here is an example of how to extract the second spine through to the last spine:

       extractx -s '2-$' input-file > output-file
    input-file:
    output-file:

    And here is an example which extracts from one before the last spine until the last spine:

       extractx -s '$1-$' input-file > output-file
    input-file:
    output-file:
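
    The spine-list syntax described above (single numbers, comma-separated lists, ranges, and $-relative references) can be illustrated with a short Python sketch. This is not the extractx parser: the function name expand_spine_list and its max_spine parameter are hypothetical, and the sketch ignores the 0 value and the a/b/c suffixes described in later sections.

```python
import re

# One endpoint of a spine specification: "3", "$", "$1" or "$-1".
_ENDPOINT = r"\$(?:-?\d+)?|\d+"
_ITEM = re.compile(rf"({_ENDPOINT})(?:-({_ENDPOINT}))?")


def _resolve(endpoint: str, max_spine: int) -> int:
    """Convert one endpoint to a 1-based spine number."""
    if endpoint.startswith("$"):
        offset = endpoint[1:].lstrip("-")  # "$1" and "$-1" mean the same thing
        value = max_spine - (int(offset) if offset else 0)
    else:
        value = int(endpoint)
    if not 1 <= value <= max_spine:
        raise ValueError(f"spine {endpoint!r} is outside 1..{max_spine}")
    return value


def expand_spine_list(spec: str, max_spine: int) -> list[int]:
    """Expand a spine list such as '$,2-4,1,1' into spine numbers."""
    spines = []
    for item in spec.split(","):
        match = _ITEM.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"cannot parse {item!r}")
        start = _resolve(match.group(1), max_spine)
        if match.group(2) is None:
            spines.append(start)
            continue
        end = _resolve(match.group(2), max_spine)
        step = 1 if end >= start else -1  # descending ranges run backwards
        spines.extend(range(start, end + step, step))
    return spines


print(expand_spine_list("2-$", 5))   # second spine through the last
print(expand_spine_list("$1-$", 5))  # penultimate spine through the last
```

    Because the endpoint pattern tries "$-1" before a bare "$", a dash directly after the dollar sign binds to the dollar sign, while '$0-1' is read as a range from the last spine down to the first.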

    In addition to extracting spines based on enumerations, you can also extract based on interpretation strings. For this case, use the -i option followed by a list of interpretations which are present in the spine(s) you want to extract, separating multiple interpretations by a comma (and no spaces). The spine interpretation list should be enclosed in single or double quotes to prevent the command-line interpreter from parsing asterisks (*) as file-completion wildcards. As an example of extraction by interpretation, here is a demonstration of how to extract all spines which are either **b or **c spines (which essentially removes the **a spine from the input in this case):

       extractx -i '**b,**c' input-file > output-file
    input-file:
    output-file:

    An example use of the -i option is to extract a specific instrumental part from a score. Here is an example of how to extract the viola part from a string quartet by selecting spines which contain the interpretation "*Iviola":

       extractx -i '*Iviola' input-file > output-file
    input-file:
    output-file:

    Reordering/duplicating spines

    Examples in the previous section always extract spines in the same order in which they are found in the input data. However, spines can be extracted in any order and also duplicated any number of times in the output data:
       extractx -s '$,2-4,1,1' input-file > output-file
    input-file:
    output-file:
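
    As a rough illustration of reordering and duplication, the following Python sketch selects tab-separated columns in whatever order they are requested. It assumes the input contains no spine-path manipulators (such as *^ or *v), so that every spine occupies exactly one column on every line; the function name extract_columns and the sample data are hypothetical and are not part of extractx.

```python
def extract_columns(humdrum: str, spines: list[int]) -> str:
    """Return the given 1-based spines, in the given order, from Humdrum text."""
    output = []
    for line in humdrum.splitlines():
        # Global comments and reference records ("!!...") are not divided
        # into spines, so they pass through unchanged; so do empty lines.
        if line == "" or line.startswith("!!"):
            output.append(line)
            continue
        fields = line.split("\t")
        output.append("\t".join(fields[n - 1] for n in spines))
    return "\n".join(output)


example = "\n".join([
    "!! sample data",
    "**a\t**b\t**c\t**d",
    "a1\tb1\tc1\td1",
    "*-\t*-\t*-\t*-",
])

# With four spines, the list '$,2-4,1,1' corresponds to 4,2,3,4,1,1.
print(extract_columns(example, [4, 2, 3, 4, 1, 1]))
```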

    Spine exclusion

    If you need to remove specific spines from a file rather than select spines to extract, use the -x option in a manner similar to the -s option. Here is an example which excludes the second and third spines in a file:

       extractx -x 2,3 input-file > output-file
    input-file:
    output-file:

    Likewise, the -I option can be used to exclude spines which contain particular interpretations:

       extractx -I '**b' input-file > output-file
    input-file:
    output-file:
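
    Selection by interpretation and exclusion can both be viewed as ways of building a list of spine numbers. The Python sketch below is only an illustration: it assumes input without spine-path changes and assumes that an interpretation must match a token exactly, which may differ from what extractx does. The names spines_with_interp and complement are hypothetical.

```python
def spines_with_interp(humdrum: str, interps: list[str]) -> list[int]:
    """1-based numbers of the spines containing any of the interpretations."""
    wanted = set(interps)
    found = set()
    for line in humdrum.splitlines():
        if not line.startswith("*"):
            continue  # only interpretation records start with "*"
        for number, token in enumerate(line.split("\t"), start=1):
            if token in wanted:
                found.add(number)
    return sorted(found)


def complement(excluded: list[int], spine_count: int) -> list[int]:
    """Spines that remain when the excluded ones are removed."""
    drop = set(excluded)
    return [n for n in range(1, spine_count + 1) if n not in drop]


example = "\n".join([
    "**a\t**b\t**c",
    "*Ivioln\t*Iviola\t*Icello",
    "1\t2\t3",
    "*-\t*-\t*-",
])

keep = spines_with_interp(example, ["**b", "**c"])      # like -i '**b,**c'
drop = complement(spines_with_interp(example, ["**b"]), 3)  # like -I '**b'
print(keep, drop)
```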

    Files containing *+ spine manipulators may not reverse gracefully. However, when reversing with the -r option, the reversal may work as expected if the adjacent created spines are not **kern spines.

    Blank spines

    The extraction string may contain the number 0 which will insert a **blank spine into the output at the specified sequence location. This is similar to using the blank command, but this option allows for easier insertion of the **blank spine into the middle of the spine sequence rather than only as the first or last spine in the data. Below is an example of inserting a **blank spine between each of the original input spines.
       extractx -s 1,0,2,0,3 input > output
    input-file:
    output-file:

    Reversing spines

    A simple way to reverse all spines in a file is to use the spine extraction string '$0-1'. This extraction string means: extract from the last spine to the first spine in the input. The 0 after the $ marker prevents the dash (-) from being interpreted as a minus sign operating on the $ marker, forcing it instead to be interpreted as a range operator (which has lower precedence than the subtraction operator).
       extractx -s '$0-1' input > output
    input-file:
    output-file:

    This basic method of reversing spines is not as useful if ordering relationships between certain spines need to be preserved when the data is reversed. For example, **kern spines are usually followed by other spines related to them. If you use the above spine-reversal method, then the non-kern spines will instead precede the **kern spines to which they are related.

    To handle this sort of case, the -r option can be used. This will reverse the order of all **kern spines in the data, but preserve the original ordering of non-kern spines after each **kern spine. Any initial non-kern spines coming before the first **kern spine in the input will be unaffected by the reversal operation and will remain in front of the first reversed **kern spine.

       extractx -r input > output
    input-file:
    output-file:

    You may use the -R option to do a group-based reversal with the primary spine being something other than **kern spines:

       extractx -R '**num' input > output
    input-file:
    output-file:
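
    The grouped reversal performed by -r and -R can be sketched in Python as a reordering of spine numbers. This is a reading of the description above rather than the extractx source; the function name reverse_by_group is hypothetical, and the sketch works only from the list of exclusive interpretations, one per spine.

```python
def reverse_by_group(exinterps: list[str], primary: str = "**kern") -> list[int]:
    """New left-to-right order (1-based spine numbers) after a grouped reversal."""
    leading = []  # spines before the first primary spine stay in front
    groups = []   # each group: a primary spine plus the spines that follow it
    for number, exinterp in enumerate(exinterps, start=1):
        if exinterp == primary:
            groups.append([number])
        elif groups:
            groups[-1].append(number)
        else:
            leading.append(number)
    order = list(leading)
    for group in reversed(groups):
        order.extend(group)
    return order


# Two **kern spines, each followed by a related spine (like -r).
print(reverse_by_group(["**kern", "**dynam", "**kern", "**text"]))
# Grouping on another exclusive interpretation (like -R '**num').
print(reverse_by_group(["**x", "**num", "**a", "**num"], primary="**num"))
```

    The resulting list could then be used like any other explicit spine list, which keeps each dependent spine to the right of the spine it belongs to.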

    Sub-spines

    A spine which branches into two subspines (by using the *^ manipulator) can be divided into two separate spines by following the spine number with "a" or "b".
       extractx -s 1a,1b input-file > output-file
    input-file:
    output-file:

    There are three methods for extracting subspines. The default behavior is shown above: when a secondary spine is not present, extracting the secondary spine will borrow data from the primary spine. The three methods are:

    • "d" = duplicate tokens in primary spine when no secondary spine is present (default).
    • "r" = replace notes in the primary spine with rests.
    • "n" = use null tokens instead of primary spine data.

    Using an "n" method to use null tokens can lead to non-rhythmically parsable data in **kern spines. Using "r" in non-kern spines will automatically cause the subspine extraction method to switch to "n". Here are examples of the two additional expansion methods using the -m option:

    input-file:
       extractx -mr -s 1a,1b
       extractx -mn -s 1a,1b

    When filling the secondary spine with rests, the -Y option can be used to suppress the editorial interpretation marker "yy" after generated rests:

    input-file:
       extractx -mr -s 1b
       extractx -mr -Y -s 1b
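
    The three fill methods and the effect of -Y can be summarized in a small Python sketch. It is an illustration under stated assumptions, not the extractx implementation: the function name fill_secondary and its parameters are hypothetical, and the way the rest is built (keeping the first run of digits and dots from the primary token as its duration) is a guess at a plausible behavior.

```python
import re


def fill_secondary(primary_token: str, model: str = "d",
                   is_kern: bool = True, mark: bool = True) -> str:
    """Token to place in a secondary subspine that is not present."""
    if model not in ("d", "n", "r"):
        raise ValueError(f"unknown model {model!r}")
    if model == "r" and not is_kern:
        model = "n"  # "r" falls back to "n" outside **kern spines
    if model == "d":
        return primary_token  # duplicate the primary token
    if model == "n" or primary_token == ".":
        return "."  # null token
    # Assumption: the rest keeps the duration (first run of digits and
    # dots) found in the primary token.
    duration = re.search(r"\d+\.*", primary_token)
    rest = (duration.group(0) if duration else "") + "r"
    # mark=False plays the role of -Y: no "yy" after the generated rest.
    return rest + "yy" if mark else rest


for model in ("d", "n", "r"):
    print(model, fill_secondary("4c", model))
print("r with -Y", fill_secondary("4c", "r", mark=False))
```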

    Different methods of secondary subspine data generation can be used at the same time by applying the appropriate method letters after a spine enumeration value:

    input-file:
    extractx -Y -s 1b,1bd,1br,1bn

    An example application of the subspine splitting feature is extracting the first and second instrumental parts from a single spine which contains both parts, or extracting the left- and right-hand parts from piano music where the encoding combines multiple staves into a single spine. For example, below is a section of the flute parts for a Beethoven symphony. The first and second flutes are encoded in a single spine. However, analyzing the properties of flute I or flute II in isolation requires extracting the individual part from the combined part. In the following example, the combined parts are split into two separate spines:

    input-file:
    extractx -s 1a,2,1b,2

    For music-analysis purposes, the extracted subspines can be used to measure the difficulty of each part, measure the average pitch of each part, identify which part plays more notes, etc. In terms of printing, the original composite spine can be placed on a single staff showing both parts, while the extracted subspines can be used to create individual parts.

    The -e option can be used to automatically expand all spines in the input which contain subspines.

       extractx -e input-file > output-file
    input-file:
    output-file:

    If you only want spines with a particular interpretation to be expanded, you can use the -E option. The following example expands any spines containing the interpretation '**a', but will not expand subspines in any other type of spine:

       extractx -E '**a' input-file > output-file
    input-file:
    output-file:

    Co-spines

    A co-spine is a spine of data which contains multiple tokens indexed against data of a particular exclusive interpretation (by default all **kern spines) found on the current line.

    Below is an example use giving a serial number to every note in a file. The first data line contains four notes, so the **serial entry on that line contains four tokens. The first token is for the first spine of **kern data, the second token is for the second spine, and so on. Note on the second line of data that the **kern spines contain only three notes in total, with the third spine not containing any notes. The last spine therefore contains serial numbers only for the first, second and fourth spines of data.

    If you want to extract the third spine and its corresponding serial numbers from the **serial spine, then use the following command.

       extractx -s '3,$c' input.krn > output.krn 

    In the above example, the field specification '3,$c' indicates that the 3rd spine should be extracted, and the last spine ($) should be treated as a co-spine, with only tokens extracted from the last spine which correspond to the data in the 3rd spine. The letter "c" following a number or $ construct (such as $c, $-1c, $1c, $-0c, or $0c) indicates that the spine should be treated as a co-spine. As with the subspine indicators "a" and "b", the co-spine indicator cannot be used with the range operator; it can only be used after single-number spine indicators separated from other spine numbers by a comma.

    Other exclusive interpretations besides **kern spines can be used with the co-spine feature by using the -c option:

       extractx -s 2,3c -c "**a" input-file > output-file
    input-file:
    output-file:

    In the above example, the field specification 2,3c indicates that the second spine should be extracted verbatim, and the third spine should be treated as a co-spine when extracting it. The -c "**a" option further indicates that the co-spine is indexed against **a exclusive interpretation spines.
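
    The indexing of co-spine tokens against primary spines can be sketched in Python for a single data line. This follows the description in this section and is not taken from the extractx source: the function name cospine_token and its parameters are hypothetical, the test for a rest (an "r" in the token) is a simplification, and returning a null token when the target spine has no entry is an assumption.

```python
def cospine_token(tokens: list[str], exinterps: list[str], target: int,
                  cospine: int, primary: str = "**kern",
                  separator: str = " ", skip_rests: bool = False) -> str:
    """Part of the co-spine token on one data line that belongs to `target`.

    `tokens` are the tab-separated fields of the line and `exinterps` the
    exclusive interpretation of each spine; spine numbers are 1-based.
    """
    def counts(token: str) -> bool:
        if token == ".":
            return False  # null tokens have no co-spine entry
        if skip_rests and "r" in token:
            return False  # assumed behavior of the 'r' enumeration model
        return True

    # Primary spines, left to right, that own a subtoken on this line.
    owners = [n for n, (tok, ex) in enumerate(zip(tokens, exinterps), start=1)
              if ex == primary and counts(tok)]
    parts = tokens[cospine - 1].split(separator)
    if target not in owners or len(parts) != len(owners):
        return "."  # assumption: emit a null token when nothing matches
    return parts[owners.index(target)]


exinterps = ["**kern", "**kern", "**kern", "**kern", "**serial"]
line1 = ["4c", "4e", "4g", "4cc", "1 2 3 4"]
line2 = ["4d", "4f", ".", "4dd", "5 6 7"]
print(cospine_token(line1, exinterps, target=3, cospine=5))
print(cospine_token(line2, exinterps, target=4, cospine=5))
```

    The primary and separator parameters stand in for the -c and -T options, and skip_rests for the 'r' model of -M.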

EXAMPLES

RUN ONLINE

    The extractx command can be run online here.

ONLINE DATA

    Input arguments or piped data which are expected to be Humdrum files can also be web addresses. For example, if a program can process files like this:
           program file.krn
    It can also read the data over the web:
           program http://www.some-computer.com/some-directory/file.krn
    Piped data works in a somewhat similar manner:
           cat file.krn | program
    is equivalent to a web file using this form:
           echo http://www.some-computer.com/some-directory/file.krn | program

    Besides the http:// protocol, there is another special resource indicator prefix called humdrum:// which downloads data from the kernscores website. For example, using the URI humdrum://brandenburg/bwv1046a.krn:

          program humdrum://brandenburg/bwv1046a.krn
    will download the corresponding file, which is found in the Musedata Bach Brandenburg Concerto collection.

    This online-access of Humdrum data can also interface with the classical Humdrum Toolkit commands by using humcat to download the data from the kernscores website. For example, try the command pipeline:

          humcat humdrum://brandenburg/bwv1046a.krn | census -k

SEE ALSO

DOWNLOAD

    The compiled extractx program can be downloaded for the following platforms:
    • Linux (i386 processors) (dynamically linked) compiled on 28 Jun 2012.
    • Windows compiled on 29 Jun 2012.
    • Mac OS X/i386 compiled on 13 Nov 2013.

    The source code for the program was last modified on 24 Oct 2013. Click here to go to the full source-code download page.