Imagine that you want to print a report on ATP tennis matches played over a number of years. It is to be grouped in three levels: by year, tournament and player. Each level has its own heading, and after each level there is a summary.
The input data is sorted by year, tournament and winning player. Only players that have reached the fifth round in the current tournament are included; otherwise there would be a lot of small groups containing players that lost, perhaps, in the very first round. This exclusion is per tournament, so losing early in one tournament does not exclude a player from occurring later in the input data.
    ...
    2015 Acapulco        Ferrer, D   Harrison, R  4-6 6-0 6-0
    2015 Acapulco        Ferrer, D   Matosevic, M 7-6(3) 6-4
    2015 Acapulco        Ferrer, D   Nishikori, K 6-3 7-5
    2015 Acapulco        Ferrer, D   Sijsling, I  6-3 7-6(4)
    2015 Acapulco        Ferrer, D   Tomic, B     6-4 3-6 6-1
    2015 Auckland        Vesely, J   Anderson, K  6-4 7-6(4)
    2015 Auckland        Vesely, J   Bellucci, T  6-3 7-6(4)
    2015 Auckland        Vesely, J   Gulbis, E    6-2 3-6 6-1
    2015 Auckland        Vesely, J   Mannarino, A 6-3 6-2
    2015 Auckland        Vesely, J   Young, D     6-2 6-3
    2015 Australian Open Berdych, T  Falla, A     6-3 7-6(1) 6-3
    2015 Australian Open Berdych, T  Melzer, J    7-6(0) 6-2 6-2
    2015 Australian Open Berdych, T  Nadal, R     6-2 6-0 7-6(5)
    2015 Australian Open Berdych, T  Tomic, B     6-2 7-6(3) 6-2
    2015 Australian Open Berdych, T  Troicki, V   6-4 6-3 6-4
    2015 Australian Open Djokovic, N Bedene, A    6-3 6-2 6-4
    2015 Australian Open Djokovic, N Kuznetsov, A 6-0 6-1 6-4
    2015 Australian Open Djokovic, N Muller, G    6-4 7-5 7-5
    2015 Australian Open Djokovic, N Murray, A    7-6(5) 6-7(4) 6-3 6-0
    2015 Australian Open Djokovic, N Raonic, M    7-6(5) 6-4 6-2
    2015 Australian Open Djokovic, N Verdasco, F  7-6(8) 6-3 6-4
    2015 Australian Open Djokovic, N Wawrinka, S  7-6(1) 3-6 6-4 4-6 6-0
    ...
For each match, the opponent and the score are printed. For each player, tournament and year, a count of the included matches is printed. Of course, the total number of matches at a tournament is much higher than the number printed, but remember that only matches that are part of a suite of at least five matches by one player are included.
    Year 2015
      Tournament Acapulco
        Ferrer, D
          Harrison, R 4-6 6-0 6-0
          Matosevic, M 7-6(3) 6-4
          Nishikori, K 6-3 7-5
          Sijsling, I 6-3 7-6(4)
          Tomic, B 6-4 3-6 6-1
          5 matches played by Ferrer, D at Acapulco
        5 long-suite matches played at Acapulco
      Tournament Auckland
        Vesely, J
          Anderson, K 6-4 7-6(4)
          Bellucci, T 6-3 7-6(4)
          Gulbis, E 6-2 3-6 6-1
          Mannarino, A 6-3 6-2
          Young, D 6-2 6-3
          5 matches played by Vesely, J at Auckland
        5 long-suite matches played at Auckland
      Tournament Australian Open
        Berdych, T
          Falla, A 6-3 7-6(1) 6-3
          Melzer, J 7-6(0) 6-2 6-2
          Nadal, R 6-2 6-0 7-6(5)
          Tomic, B 6-2 7-6(3) 6-2
          Troicki, V 6-4 6-3 6-4
          5 matches played by Berdych, T at Australian Open
        Djokovic, N
          Bedene, A 6-3 6-2 6-4
          Kuznetsov, A 6-0 6-1 6-4
          Muller, G 6-4 7-5 7-5
          Murray, A 7-6(5) 6-7(4) 6-3 6-0
          Raonic, M 7-6(5) 6-4 6-2
          Verdasco, F 7-6(8) 6-3 6-4
          Wawrinka, S 7-6(1) 3-6 6-4 4-6 6-0
          7 matches played by Djokovic, N at Australian Open
        ... more players
        23 long-suite matches played at Australian Open
      ... more tournaments
      325 long-suite matches played in 2015
    ... more years
The problem is easy to understand, but the solution is more difficult. It is easy to get it wrong, messy, or both.
What complicates the problem is that the end of a level is not necessarily followed by a new instance of the same level, or even a new instance of the enclosing level. When we reach the last match of one player, there are several possibilities:
- 1) a new player in the same tournament follows;
- 2) end of tournament, a new tournament follows;
- 3) end of year, a new year follows;
- 4) end of input data.
At the start of a year, there are three levels to be closed, unless it is the first year. Then, unconditionally, there are three levels to open. Similarly, there are two levels to close and open for a change of tournament within a year, and one level for a change of player within a tournament.
At the start of the program we must make sure to open all three levels. At the end of the program we must make sure to close all three levels.
To complicate things, we do not know that a record is the last in a group until we have read the next record, at which time it is almost too late.
Sigh. Typical code will involve a lot of jumping around, and flags to try to keep track of the current state.
But there is a simple solution involving just a single flag. Let us first number the levels:
- 1: Year, the outermost group level.
- 2: Tournament, the second group level.
- 3: Player, the innermost group level.
This can be extended to any number of levels, but three is enough for demonstration. We will use a single integer, change-level, that indicates the outermost (lowest-numbered) level at which the input has changed from the previous record. If there is a change in year, we set it to 1. Otherwise, if there is a change in tournament, we set it to 2. Otherwise, if there is a change in player, we set it to 3. If nothing has changed, we set it to 9.
The flag is set in three places:
- At start of execution it is set to 1. This will cause the program to open all levels.
- After reading the next record, it is set according to the above rules.
- At end-of-file it is set to 0. This will cause the program to close all levels, and to exit the read loop.
    set change-level to 1
    read first record (set change-level to 0 if end-of-file)
    while change-level > 0
        if change-level < 9
            open levels
        process record
        save current record
        read next record
        set flag according to rules (0 if end-of-file)
        if change-level < 9
            close levels
    end
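The loop above can be tried out quickly in a scripting language before it is written in Cobol. What follows is a minimal Python sketch, not the program from this article: the function name report, the tuple layout (year, tournament, player, rest) and the wording of the summary lines are all assumptions made for the example. It starts the flag at 1, uses 0 for end-of-file and 9 for "no change", as described above.

```python
# Sketch of the single-flag loop (levels: 1 = year, 2 = tournament, 3 = player).
# The record layout and the report wording are assumptions for this example.

END_OF_FILE = 0   # closes every level and ends the loop
NO_CHANGE = 9     # same year, tournament and player as the previous record


def report(records):
    """Return report lines for records sorted by (year, tournament, player)."""
    out = []
    counts = [0, 0, 0, 0]              # counts[1..3]: matches per open level
    rows = iter(records)
    current = next(rows, None)         # read first record
    change_level = 1 if current is not None else END_OF_FILE

    while change_level > END_OF_FILE:
        year, tournament, player, rest = current

        # Open levels, outermost first.
        if change_level <= 1:
            out.append(f"Year {year}")
            counts[1] = 0
        if change_level <= 2:
            out.append(f"  Tournament {tournament}")
            counts[2] = 0
        if change_level <= 3:
            out.append(f"    {player}")
            counts[3] = 0

        # Process the record.
        out.append(f"      {rest}")
        counts[3] += 1

        # Save the current record, read the next one, set the flag.
        previous, current = current, next(rows, None)
        if current is None:
            change_level = END_OF_FILE
        elif current[0] != previous[0]:
            change_level = 1
        elif current[1] != previous[1]:
            change_level = 2
        elif current[2] != previous[2]:
            change_level = 3
        else:
            change_level = NO_CHANGE

        # Close levels, innermost first, using the keys of the saved record.
        if change_level <= 3:
            out.append(f"    {counts[3]} matches by {previous[2]}")
            counts[2] += counts[3]
        if change_level <= 2:
            out.append(f"  {counts[2]} matches at {previous[1]}")
            counts[1] += counts[2]
        if change_level <= 1:
            out.append(f"{counts[1]} matches in {previous[0]}")
    return out


if __name__ == "__main__":
    sample = [
        ("2015", "Acapulco", "Ferrer, D", "Harrison, R 4-6 6-0 6-0"),
        ("2015", "Acapulco", "Ferrer, D", "Matosevic, M 7-6(3) 6-4"),
        ("2015", "Auckland", "Vesely, J", "Anderson, K 6-4 7-6(4)"),
    ]
    print("\n".join(report(sample)))
```

Note that the summaries are built from the saved (previous) record, because by the time a level is closed the current record already belongs to the next group.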
The open levels step opens one or more levels, outermost first. The higher the change level (confusingly, the lower the value of change-level), the more levels there are to open.

    if change-level <= 1
        open year level
    if change-level <= 2
        open tournament level
    if change-level <= 3
        open player level
The close levels step closes one or more levels, innermost first.

    if change-level <= 3
        close player level
    if change-level <= 2
        close tournament level
    if change-level <= 1
        close year level
The set flag step updates change-level according to the rules above, also taking note of end-of-file.
- If end-of-file, then set change-level to 0. This will cause all levels to be closed, and the loop to be exited.
- Otherwise, if there is a change at year level, the outermost group level, set change-level to 1.
- Otherwise, if there is a change at tournament group level, set change-level to 2.
- Otherwise, if there is a change at player group level, set change-level to 3.
- Otherwise, set change-level to 9, the match level, a level that will cause no closing or opening of levels.
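Since the rules only ever look for the outermost key that differs, they generalise to any number of levels. The following Python sketch illustrates that idea; the function name change_level and the use of tuples and None for end-of-file are assumptions for the example, and it presumes fewer than nine group levels so that 9 can keep its meaning of "no change".

```python
END_OF_FILE = 0
NO_CHANGE = 9   # the "match level"; assumes fewer than nine group levels


def change_level(previous_keys, current_keys):
    """Return the outermost level (1-based) at which the keys differ.

    previous_keys and current_keys are tuples ordered from the outermost
    key to the innermost one. current_keys is None at end-of-file.
    """
    if current_keys is None:
        return END_OF_FILE
    for level, (old, new) in enumerate(zip(previous_keys, current_keys), start=1):
        if old != new:
            return level          # first difference wins: outermost level
    return NO_CHANGE


if __name__ == "__main__":
    prev = ("2015", "Acapulco", "Ferrer, D")
    print(change_level(prev, ("2015", "Auckland", "Vesely, J")))
```

Because the comparison stops at the first difference, a new year is reported as level 1 even when the tournament and player happen to be the same as on the previous record.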
The code below might be slightly different from the downloadable code. Use the latter when trying out the program.
Start of the program.
       IDENTIFICATION DIVISION.
       program-id. Nested.
       author. Lars Nordenstrom.

       ENVIRONMENT DIVISION.
       input-output section.
       file-control.
           select data-in assign to 'datain'.

       DATA DIVISION.
       file section.
      *2015 Acapulco Ferrer, D Harrison, R 4-6 6-0 6-0
       fd  data-in record contains 128 characters.
       01  row.
           05 key1    pic x(4).
           05 filler  pic x.
           05 key2    pic x(28).
           05 key3    pic x(23).
           05 rest    pic x(72).

       working-storage section.
       01  filler.
           06 prev-key1    pic x(4).
           06 prev-key2    pic x(28).
           06 prev-key3    pic x(23).
           06 fmt          pic z(04).
           06 psum         occurs 3 pic 9(09) usage comp.
           06 grand-total  pic 9(09) usage comp value 0.
           06 change-level pic S9 usage comp value 0.

The main paragraph is at the very end of the source file.

       PROCEDURE DIVISION.
           perform main
           stop run
           .
Code run when opening a level.
       100-begin.
           display ' Year ' key1
           move 0 to psum(1)
           .
       200-begin.
           display ' Tournament ' key2
           move 0 to psum(2)
           .
       300-begin.
           display ' ' key3
           move 0 to psum(3)
           .

Code run between groups at the same level, e.g. between tournaments within the same year, but neither before the first tournament of the year, nor after the last tournament of the year. This is useful for inserting space between groups, without superfluous space before or after.
       100-between.
           display ' '
           .
       200-between.
           display ' '
           .
       300-between.
           continue
           .
Code run when closing a level.
       300-end.
           add psum (3) to psum (2)
           move psum (3) to fmt
           display ' ' fmt ' matches won by ' prev-key3
                   ' at ' prev-key2
           .
       200-end.
           add psum (2) to psum (1)
           move psum (2) to fmt
           display ' ' fmt ' long-suite matches played at '
                   prev-key2
           .
       100-end.
           add psum (1) to grand-total
           move psum (1) to fmt
           display ' ' fmt ' long-suite matches played in '
                   prev-key1
           .
Processing of the current record.
       901-process.
           display ' ' rest
           add 1 to psum (3)
           .

Read the next record. Set change-level to -1 at end-of-file; otherwise leave it unchanged, so it must have been set to some other value before this paragraph is performed. See the main paragraph, below, for more on change-level.

       990-read.
           read data-in
               at end move -1 to change-level
           end-read
           .

All the logic is located in this single paragraph. The value of change-level is 0 when the program starts. This will trigger all the begin actions but not any of the between actions. Think of 0 and -1 as two additional, surrounding, outer levels. (The earlier description used 1 at start and 0 at end-of-file; here the values are shifted to 0 and -1 so that the between actions can tell a real change of year, level 1, from the start of the program.)
       main.
           open input data-in
           perform 990-read
           perform until change-level < 0
      *        Separators between groups at the changed level only.
               if change-level = 1 perform 100-between end-if
               if change-level = 2 perform 200-between end-if
               if change-level = 3 perform 300-between end-if
      *        Open levels, outermost first.
               if change-level <= 1 perform 100-begin end-if
               if change-level <= 2 perform 200-begin end-if
               if change-level <= 3 perform 300-begin end-if
               perform 901-process
      *        Save the keys, read ahead and set the flag.
               move key1 to prev-key1
               move key2 to prev-key2
               move key3 to prev-key3
               move 9 to change-level
               perform 990-read
               if change-level > 0
                   evaluate true
                       when key1 not = prev-key1
                           move 1 to change-level
                       when key2 not = prev-key2
                           move 2 to change-level
                       when key3 not = prev-key3
                           move 3 to change-level
                   end-evaluate
               end-if
      *        Close levels, innermost first.
               if change-level <= 3 perform 300-end end-if
               if change-level <= 2 perform 200-end end-if
               if change-level <= 1 perform 100-end end-if
           end-perform
           close data-in
           move grand-total to fmt
           display "Grand Total: " fmt
           .
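To check which paragraphs main performs for a given value of change-level, the tests can be tabulated. The Python sketch below is only an illustration of those tests; the function name actions_for and its phase argument are made up for the example and do not appear in the Cobol source. Values that cannot occur at a given point in the loop (for example -1 ahead of 901-process) are not treated specially.

```python
def actions_for(change_level, phase):
    """List the paragraphs main would perform for a change-level value.

    phase is "before" (the tests ahead of 901-process) or
    "after" (the tests following 990-read and the evaluate).
    """
    if phase == "before":
        # A between action runs only when exactly that level changed.
        between = [f"{n}00-between" for n in (1, 2, 3) if change_level == n]
        # Begin actions run for the changed level and every level inside it.
        begin = [f"{n}00-begin" for n in (1, 2, 3) if change_level <= n]
        return between + begin
    if phase == "after":
        # End actions run innermost first.
        return [f"{n}00-end" for n in (3, 2, 1) if change_level <= n]
    raise ValueError("phase must be 'before' or 'after'")


if __name__ == "__main__":
    for value in (0, 1, 2, 3, 9):
        print(value, actions_for(value, "before"))
    for value in (-1, 1, 2, 3, 9):
        print(value, actions_for(value, "after"))
```

The start value 0 selects every begin action but no between action, and the end-of-file value -1 selects every end action, which is the behaviour described for the two surrounding outer levels.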
Adding or removing a level requires five changes in main. But this is tiny compared to the amount of work that is likely to go into the application code for a level, so it is not a real problem.
All the logic is nicely grouped in a single paragraph. No other part of the code needs to know how it works. It just does.
Downloading the files
The individual files are available for download. There are also archives, one .zip and one .tar.gz.
Compiling and running on z/OS with Enterprise Cobol
Sample JCL for compile, load and go.
The partitioned data set (PDS) ????.SRC.COB.DATA.FB128 is fixed block, 128 characters.
    //ZEKE JOB (),MSGCLASS=A,MSGLEVEL=(1,1),TIME=(0,10)
    //TEST EXEC IGYWCLG,LNGPRFX=IGY410
    //COBOL.SYSIN DD DSN=&SYSUID..SRC.COB(NESTED),DISP=SHR
    //*----------------------------------------------------------------
    //GO.DATAIN DD DSN=&SYSUID..SRC.COB.DATA.FB128(NESTEDIN),DISP=SHR
Compiling and running on Linux with GNU Cobol
The source is almost identical for z/OS and GNU Cobol. The only difference is that we must code LINE SEQUENTIAL on the SELECT statement in the GNU version.
The script below compiles and runs the program.

    #!/bin/bash -e
    PROG=nested-gnu
    #
    # In GNU Cobol, the SELECT statement looks in the environment for
    # the "assigned-to" identifiers prefixed by "DD_".
    # Similar to a JCL DD statement.
    #
    export DD_datain=nestedin.txt
    #
    # Build. Tested with GNU Cobol 2.2.
    # (Using ".elf" for executables is a local convention.)
    #
    set -x
    rm -f *.elf
    cobc -std=cobol85 -x -o $PROG.elf $PROG.cob
    #
    # Run the program
    #
    ./$PROG.elf
Compiling and running on Windows with GNU Cobol
Untested, but should work. GNU Cobol is available on Windows.
Portability
The only difference between the z/OS and the GNU version is the use of LINE SEQUENTIAL.
Files in the archive
nested-zos.cob | Source code for z/OS Cobol. |
nested-gnu.cob | Source code for GNU Cobol. |
nested.jcl | JCL for running the program on z/OS. |
nested.sh | Shell script for running the program on UNIX. |
nestedin.txt | The input data file. |
nestedin.pl | A Perl script for generating nestedin.txt. |
data | A directory with raw input files. |
nested.tar.gz | UNIX archive. |
nested.zip | Windows archive. |
You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090