UST921028I4 - Sort process - Google Patents

Sort process

Info

Publication number
UST921028I4
Authority
US
United States
Prior art keywords
records
sort
buckets
pass
distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/22 Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
    • G06F7/24 Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers; sorting methods in general
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/06 Arrangements for sorting, selecting, merging, or comparing data on individual record carriers

Definitions

  • a distribution sort process is provided which results in the distribution of the records of a file into a plurality of buckets such that the distributed records can be recovered in sequential order of key value in a single sort pass.
  • the tags of the records are first sorted into sequential order by key value (a tag including a record's key and its address in the file).
  • the address portions of the tags, as they are arranged in the tag sort, are then sorted into a set of numbered substrings by a modification of a conventional internal sort method such as replacement selection.
  • the substrings are then merged into a final string.
  • in the merge, the string number is added to the tag, and the key and address portion may be deleted to leave a list of string numbers which lie in the same order as the records in the file.
  • a distribution sort is then performed on the records by distributing them into a quantity of buckets equal to the quantity of substrings produced in the internal sort of the addresses, the buckets being numbered to correspond to the numbering of the substrings.
  • the distribution sort is designed such that the records distributed to a given bucket are those whose addresses are in the substring which has the same number as the bucket.
  • the buckets are arranged in sequential order of numerical value.
  • at the completion of the distribution sort, a single sort pass of the distributed records produces a string of records arranged in sequential order of key value.
  • the distribution phase can be either single- or multi-pass.
  • the invention contemplates an arrangement where the buckets can be visited cyclically for distribution of records thereinto, or they can be selected for visitation in accordance with chosen criteria.
  • the invention enables the use of buckets whose size is on average twice as large as the main store and makes possible advantageous minimization of seek and latency times.
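
The steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented procedure: it uses conventional replacement selection rather than the modification the publication refers to, it treats a record's list index as its address, and the names `replacement_selection`, `distribute`, `capacity`, and the sample keys are all hypothetical. The final single sort pass over the buckets is not shown.

```python
import heapq

def replacement_selection(items, capacity):
    """Conventional replacement selection: split items into ascending runs
    while holding at most `capacity` items in memory."""
    it = iter(items)
    heap = [(0, x) for _, x in zip(range(capacity), it)]
    heapq.heapify(heap)
    runs = []
    while heap:
        run_no, x = heapq.heappop(heap)
        if run_no == len(runs):          # first item of a new run
            runs.append([])
        runs[run_no].append(x)
        nxt = next(it, None)
        if nxt is not None:
            # An item smaller than the one just output must wait for the next run.
            heapq.heappush(heap, (run_no if nxt >= x else run_no + 1, nxt))
    return runs

def distribute(records, capacity):
    """records: list of (key, payload); a record's address is its list index."""
    # 1. Tag sort: (key, address) pairs in key order.
    tags = sorted((key, addr) for addr, (key, _) in enumerate(records))
    # 2. Sort the address portions, taken in tag order, into numbered substrings.
    substrings = replacement_selection([addr for _, addr in tags], capacity)
    # 3. Merge the substrings, keeping only the string number of each address.
    merged = heapq.merge(*[[(addr, n) for addr in sub]
                           for n, sub in enumerate(substrings, start=1)])
    string_numbers = [n for _, n in merged]   # same order as the file
    # 4. One sequential pass over the file distributes records to buckets.
    buckets = {n: [] for n in range(1, len(substrings) + 1)}
    for record, n in zip(records, string_numbers):
        buckets[n].append(record)
    return substrings, string_numbers, buckets

keys = [12, 11, 10, 6, 7, 8, 3, 4, 1, 2, 5, 9]   # made-up example file
records = [(k, f"rec{k}") for k in keys]
substrings, string_numbers, buckets = distribute(records, capacity=3)
```

In this sketch each bucket ends up holding exactly the records whose addresses fall in the substring with the same number, in address order, and the number of buckets equals the number of substrings.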

Abstract

A DISTRIBUTION SORT PROCESS IS PROVIDED WHICH RESULTS IN THE DISTRIBUTION OF THE RECORDS OF A FILE INTO A PLURALITY OF BUCKETS SUCH THAT THE DISTRIBUTED RECORDS CAN BE RECOVERED IN SEQUENTIAL ORDER OF KEY VALUE IN A SINGLE SORT PASS. IN THE PROCESS, THE TAGS OF THE RECORDS ARE FIRST SORTED INTO SEQUENTIAL ORDER BY KEY VALUE (A TAG INCLUDING A RECORD'S KEY AND ITS ADDRESS IN THE FILE). THE ADDRESS PORTIONS OF THE TAGS, AS THEY ARE ARRANGED IN THE TAG SORT, ARE THEN SORTED INTO A SET OF NUMBERED SUBSTRINGS BY A MODIFICATION OF A CONVENTIONAL INTERNAL SORT METHOD SUCH AS REPLACEMENT SELECTION. THE SUBSTRINGS ARE THEN MERGED INTO A FINAL STRING. IN THE MERGE, THE STRING NUMBER IS ADDED TO THE TAG, AND THE KEY AND ADDRESS PORTION MAY BE DELETED TO LEAVE A LIST OF STRING NUMBERS WHICH LIE IN THE SAME ORDER AS THE RECORDS IN THE FILE. A DISTRIBUTION SORT IS THEN PERFORMED ON THE RECORDS BY DISTRIBUTING THEM INTO A QUANTITY OF BUCKETS EQUAL TO THE QUANTITY OF SUBSTRINGS PRODUCED IN THE INTERNAL SORT OF THE ADDRESSES, THE BUCKETS BEING NUMBERED TO CORRESPOND TO THE NUMBERING OF THE SUBSTRINGS. THE DISTRIBUTION SORT IS DESIGNED SUCH THAT THE RECORDS DISTRIBUTED TO A GIVEN BUCKET ARE THOSE WHOSE ADDRESSES ARE IN THE SUBSTRING WHICH HAS THE SAME NUMBER AS THE BUCKET. THE BUCKETS ARE ARRANGED IN SEQUENTIAL ORDER OF NUMERICAL VALUE. AT THE COMPLETION OF THE DISTRIBUTION SORT, A SINGLE SORT PASS OF THE DISTRIBUTED RECORDS PRODUCES A STRING OF RECORDS ARRANGED IN SEQUENTIAL ORDER OF KEY VALUE. THE DISTRIBUTION PHASE CAN BE EITHER SINGLE- OR MULTI-PASS. THE INVENTION CONTEMPLATES AN ARRANGEMENT WHERE THE BUCKETS CAN BE VISITED CYCLICALLY FOR DISTRIBUTION OF RECORDS THEREINTO, OR THEY CAN BE SELECTED FOR VISITATION IN ACCORDANCE WITH CHOSEN CRITERIA. THE INVENTION ENABLES THE USE OF BUCKETS WHOSE SIZE IS ON AVERAGE TWICE AS LARGE AS THE MAIN STORE AND MAKES POSSIBLE ADVANTAGEOUS MINIMIZATION OF SEEK AND LATENCY TIMES.
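
The statement that buckets average twice the size of the main store is consistent with a well-known property of replacement selection: on randomly ordered input, its runs average about twice the memory capacity. The short experiment below illustrates that property only. It assumes conventional replacement selection (not the modified method of the publication) and a randomly shuffled address sequence; `average_run_length` and the sizes used are hypothetical choices for the illustration.

```python
import heapq
import random

def average_run_length(items, capacity):
    """Mean length of the runs produced by conventional replacement selection."""
    it = iter(items)
    heap = [(0, x) for _, x in zip(range(capacity), it)]
    heapq.heapify(heap)
    runs = 0
    total = 0
    current = -1
    while heap:
        run_no, x = heapq.heappop(heap)
        if run_no != current:            # a new run starts here
            current = run_no
            runs += 1
        total += 1
        nxt = next(it, None)
        if nxt is not None:
            # Items smaller than the one just output are deferred to the next run.
            heapq.heappush(heap, (run_no if nxt >= x else run_no + 1, nxt))
    return total / runs if runs else 0.0

rng = random.Random(0)                   # fixed seed so the run is repeatable
addresses = list(range(20000))
rng.shuffle(addresses)                   # stand-in for addresses in key order
capacity = 100                           # items that fit in the "main store"
ratio = average_run_length(addresses, capacity) / capacity
print(f"average run length is about {ratio:.2f} x capacity")
```

Since each substring becomes one bucket, a run length near twice the capacity corresponds to a bucket roughly twice the size of the main store.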

Description

DEFENSIVE PUBLICATION
UNITED STATES PATENT OFFICE
Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection, and reproduction may be purchased for 30 cents a sheet.
Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED APRIL 23, 1974
T921,028 SORT PROCESS
Brian T. Bennett, Mohegan Lake, and Archie C. McKellar, Mount Kisco, N.Y., assignors to International Business Machines Corporation, Armonk, N.Y.
Continuation of application Ser. No. 208,546, Dec. 16, 1971. This application Sept. 17, 1973, Ser. No. 398,620. Int. Cl. G06f 9/12. U.S. Cl. 444-1. 22 Sheets Drawing. 68 Pages Specification.
A distribution sort process is provided which results in the distribution of the records of a file into a plurality of buckets such that the distributed records can be recovered in sequential order of key value in a single sort pass. In the process, the tags of the records are first sorted into sequential order by key value (a tag including a record's key and its address in the file). The address portions of the tags, as they are arranged in the tag sort, are then sorted into a set of numbered substrings by a modification of a conventional internal sort method such as replacement selection. The substrings are then merged into a final string. In the merge, the string number is added to the tag, and the key and address portion may be deleted to leave a list of string numbers which lie in the same order as the records in the file. A distribution sort is then performed on the records by distributing them into a quantity of buckets equal to the quantity of substrings produced in the internal sort of the addresses, the buckets being numbered to correspond to the numbering of the substrings. The distribution sort is designed such that the records distributed to a given bucket are those whose addresses are in the substring which has the same number as the bucket. The buckets are arranged in sequential order of numerical value. At the completion of the distribution sort, a single sort pass of the distributed records produces a string of records arranged in sequential order of key value. The distribution phase can be either single- or multi-pass. The invention contemplates an arrangement where the buckets can be visited cyclically for distribution of records thereinto, or they can be selected for visitation in accordance with chosen criteria. The invention enables the use of buckets whose size is on average twice as large as the main store and makes possible advantageous minimization of seek and latency times.
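
The cyclic visiting of buckets mentioned above can be pictured with the following sketch. It is only one possible reading, loosely following the legible boxes of the drawing-sheet flowchart that reads "get next record", "is main store full?", "output all records for current bucket", "go to next bucket if necessary". The function `distribute_cyclic`, its parameters, and the sample data are hypothetical, and the main store is modelled simply as a bounded set of in-memory lists.

```python
def distribute_cyclic(records, string_numbers, n_buckets, store_size):
    """Distribute records to buckets, visiting buckets cyclically.

    records        -- the file, in address order
    string_numbers -- bucket number (1-based) for each record, same order
    store_size     -- how many records the main store can hold at once
    """
    buckets = [[] for _ in range(n_buckets)]   # stands in for external storage
    store = [[] for _ in range(n_buckets)]     # records held in main store
    held = 0
    current = 0
    visits = 0                                 # bucket visits (one seek each)

    def flush():
        nonlocal held, current, visits
        while not store[current]:              # skip buckets with nothing to write
            current = (current + 1) % n_buckets
        buckets[current].extend(store[current])
        held -= len(store[current])
        store[current] = []
        visits += 1
        current = (current + 1) % n_buckets    # move on to the next bucket

    for record, n in zip(records, string_numbers):
        if held == store_size:                 # main store full: write one bucket
            flush()
        store[n - 1].append(record)
        held += 1
    while held:                                # input exhausted: drain the store
        flush()
    return buckets, visits

sample_records = [12, 11, 10, 6, 7, 8, 3, 4, 1, 2, 5, 9]        # made-up keys
sample_numbers = [3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 2]           # made-up string numbers
cyclic_buckets, visits = distribute_cyclic(sample_records, sample_numbers, 3, 4)
```

Writing every buffered record for one bucket at a time means each visit costs one positioning operation for several records, which is the kind of seek and latency saving the text alludes to. Records within a bucket stay in address order because each in-memory list is written out in arrival order.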
[Drawings: 22 sheets, not reproduced here. Each sheet is headed "April 23, 1974, B. T. Bennett et al., T921,028, Sort Process, Original Filed Dec. 16, 1971". The following text is recoverable from the extraction; figure tables of REC, PTR, BTM, TOP, NB and AR values are not legible.]
  • String-formation flowchart (inventors Brian Bennett and Archie C. McKellar): load the main store area with G items from the input sequence, with no items marked; test whether all items are marked and, if so, end the string, unmark the marked items, and start the next string; test for end of the input sequence; compare the item from the input sequence with the smallest unmarked item in main store (is the input item larger?); append the smallest unmarked item in main store to the current output string and replace it in main store by the input item; at the end, sort the marked items in the main store and output them as the final string. One box of this flowchart is illegible.
  • Merge flowchart: input the first item from each string to be merged; select the smallest item, output it, and replace it if possible with the next occurring item on the same string; repeat while any items are left to be merged.
  • FIGS. 5A to 5C: example strings 1 to 3 (values not reliably legible).
  • Distribution flowcharts: get next bucket number; REC(I) = next record; output REC(I) to bucket L; test BTM(L) = TOP(L).
  • FIG. 17: any more records?; get next record; is main store full?; output all records for current bucket; go to next bucket if necessary; if no records, end.
  • A further distribution flowchart: initialization; get next bucket number; get next record; main store full?; any more records to be read?; any records for current bucket?; output all records for current bucket; any records in main store?; choose new bucket; end.
  • FIG. 32 (32A, 32B): multi-pass set-up using bounds BND(I) derived from UPPR and LWR and a TRACE written for each pass (formulas not legible).
  • Sheets from FIG. 33 through FIG. 41: trace for pass 1 (bucket numbers 1 to 16 in address order); superbuckets and subtraces for pass 2 (produced by pass 1); superbuckets and subtraces for pass 3 (produced by pass 2); buckets resulting from pass 3.
  • Buffering flowcharts (near FIGS. 44 and 45): initiate read of first block of records into buffer 1; initiate read of next block of records into buffer 2; test whether the read into buffer 1 is complete and, if not, wait until complete; initiate read of next block of records into buffer W; test whether the read into buffer W is complete and, if not, wait until complete.
  • Partial-block handling: output partial block from buffer to bucket L; read previous partial block in bucket L into buffer.
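
The merge flowchart among the drawing sheets (take the first item from each string, repeatedly output the smallest, and replace it with the next item on the same string) can be sketched as follows. This is an illustrative reading only: `merge_strings` and the sample strings are hypothetical, and pairing each output address with its string number reflects the abstract's remark that the string number is added during the merge.

```python
def merge_strings(strings):
    """Merge ascending strings; return (item, string_number) pairs in order.

    String numbers are 1-based, matching the numbering of the buckets.
    """
    cursors = [0] * len(strings)               # next unread position per string
    merged = []
    while True:
        # The current head item of every string that still has items left.
        heads = [(s[c], i) for i, (s, c) in enumerate(zip(strings, cursors))
                 if c < len(s)]
        if not heads:                          # no items left to be merged
            break
        item, i = min(heads)                   # select the smallest item
        merged.append((item, i + 1))           # output it with its string number
        cursors[i] += 1                        # replace it with the next item
    return merged

sample_strings = [[6, 7, 8, 9, 10], [3, 4, 5, 11], [0, 1, 2]]   # made-up addresses
merged = merge_strings(sample_strings)
numbers_in_file_order = [n for _, n in merged]
```

Because the merged addresses come out in ascending order, dropping the address from each pair leaves a list of string numbers in the same order as the records in the file. A linear scan over the heads is used here for clarity; with many strings a heap would be the usual choice.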
US921028D 1973-09-17 1973-09-17 Sort process Pending UST921028I4 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US39862073A 1973-09-17 1973-09-17

Publications (1)

Publication Number Publication Date
UST921028I4 (en) 1974-04-23

Family

ID=23576091

Family Applications (1)

Application Number Title Priority Date Filing Date
US921028D Pending UST921028I4 (en) 1973-09-17 1973-09-17 Sort process

Country Status (1)

Country Link
US (1) UST921028I4 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4536857A (en) 1982-03-15 1985-08-20 U.S. Philips Corporation Device for the serial merging of two ordered lists in order to form a single ordered list
US5349684A (en) * 1989-06-30 1994-09-20 Digital Equipment Corporation Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
