UST921028I4 - Sort process - Google Patents
- Publication number
- UST921028I4
- Authority
- US
- United States
- Prior art keywords
- records
- sort
- buckets
- pass
- distribution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/22—Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
- G06F7/24—Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers sorting methods in general
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/06—Arrangements for sorting, selecting, merging, or comparing data on individual record carriers
Definitions
- a distribution sort process is provided which distributes the records of a file into a plurality of buckets such that the distributed records can be recovered in sequential order of key value in one sort pass.
- the tags of the records are first sorted into sequential order by key value (a tag including a record's key and its address in the file).
- the address portions of the tags as they are arranged in the tag sort are then sorted into a set of numbered substrings by a modification of a conventional internal sort method such as replacement selection.
- the substrings then are merged into a final string.
- the string number is added to the tag, and the key and address portion may be deleted to leave a list of string numbers which lie in the same order as the records in the file.
- a distribution sort is then performed on the records by distributing them into a quantity of buckets equal to the quantity of substrings produced in the internal sort of the addresses, the buckets being numbered to correspond to the numbering of the substrings.
- the distribution sort is designed such that the records distributed to a given bucket are those whose addresses are in the substring which has the same number as the bucket.
- the buckets are arranged in sequential order of numerical value.
- a single sort pass of the distributed records produces a string of records arranged in sequential order of key value.
- the distribution phase can be either single- or multi-pass.
- the invention contemplates an arrangement where the buckets can be visited cyclically for distributions of records thereinto or they can be selected for visitation in accordance with chosen criteria.
- the invention enables the use of buckets whose size is on an average twice as large as the main store and makes possible advantageous minimization of seek and latency times.
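The steps listed above can be illustrated with a small sketch. It is not the published procedure: for brevity it numbers the buckets from fixed-size chunks of the key-ordered tag list instead of the replacement-selection substrings the text describes, and all names (`assign_buckets`, `distribute`, `recover`, `chunk`) are hypothetical. It only shows the central idea, that a list of bucket numbers in address order lets one sequential pass over the file place every record so that the buckets, visited in numerical order, yield the records in key order.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str]  # (key, payload); an illustrative record layout


def assign_buckets(keys: List[int], chunk: int) -> List[int]:
    """Return one bucket number per file address, in address order."""
    # Tag sort: a tag pairs a record's key with its address in the file.
    tags = sorted((key, addr) for addr, key in enumerate(keys))
    trace = [0] * len(keys)
    for rank, (_key, addr) in enumerate(tags):
        # Assumption: fixed-size chunks of the key-ordered tags stand in
        # for the replacement-selection substrings of the text.
        trace[addr] = rank // chunk + 1
    return trace


def distribute(records: List[Record], trace: List[int]) -> Dict[int, List[Record]]:
    """One sequential pass over the file, appending each record to its bucket."""
    buckets: Dict[int, List[Record]] = {}
    for record, bucket_no in zip(records, trace):
        buckets.setdefault(bucket_no, []).append(record)
    return buckets


def recover(buckets: Dict[int, List[Record]]) -> List[Record]:
    """Visit the buckets in numerical order, sorting each one in memory."""
    out: List[Record] = []
    for bucket_no in sorted(buckets):
        out.extend(sorted(buckets[bucket_no]))
    return out


records = [(12, "a"), (11, "b"), (6, "c"), (7, "d"), (8, "e"), (3, "f")]
trace = assign_buckets([key for key, _ in records], chunk=2)
result = recover(distribute(records, trace))
```

Here `trace` plays the role of the list of string numbers lying in the same order as the records in the file; the bucket contents themselves are never compared with one another during recovery.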
Abstract
A distribution sort process is provided which distributes the records of a file into a plurality of buckets such that the distributed records can be recovered in sequential order of key value in one sort pass. In the process, the tags of the records are first sorted into sequential order by key value (a tag including a record's key and its address in the file). The address portions of the tags as they are arranged in the tag sort are then sorted into a set of numbered substrings by a modification of a conventional internal sort method such as replacement selection. The substrings then are merged into a final string. In the merge, the string number is added to the tag, and the key and address portion may be deleted to leave a list of string numbers which lie in the same order as the records in the file. A distribution sort is then performed on the records by distributing them into a quantity of buckets equal to the quantity of substrings produced in the internal sort of the addresses, the buckets being numbered to correspond to the numbering of the substrings. The distribution sort is designed such that the records distributed to a given bucket are those whose addresses are in the substring which has the same number as the bucket. The buckets are arranged in sequential order of numerical value. At the completion of the distribution sort, a single sort pass of the distributed records produces a string of records arranged in sequential order of key value. The distribution phase can be either single- or multi-pass. The invention contemplates an arrangement where the buckets can be visited cyclically for distributions of records thereinto or they can be selected for visitation in accordance with chosen criteria. The invention enables the use of buckets whose size is on an average twice as large as the main store and makes possible advantageous minimization of seek and latency times.
Description
DEFENSIVE PUBLICATION UNITED STATES PATENT OFFICE Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection, and reproduction may be purchased for 30 cents a sheet.
Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED APRIL 23, 1974 T921,028 SORT PROCESS Brian T. Bennett, Mohegan Lake, and Archie C. McKellar, Mount Kisco, N.Y., assignors to International Business Machines Corporation, Armonk, N.Y.
Continuation of application Ser. No. 208,546, Dec. 16, 1971. This application Sept. 17, 1973, Ser. No. 398,620. Int. Cl. G06f 9/12. U.S. Cl. 444-1. 22 Sheets Drawing. 68 Pages Specification.

A distribution sort process is provided which distributes the records of a file into a plurality of buckets such that the distributed records can be recovered in sequential order of key value in one sort pass. In the process, the tags of the records are first sorted into sequential order by key value (a tag including a record's key and its address in the file). The address portions of the tags as they are arranged in the tag sort are then sorted into a set of numbered substrings by a modification of a conventional internal sort method such as replacement selection. The substrings then are merged into a final string. In the merge, the string number is added to the tag, and the key and address portion may be deleted to leave a list of string numbers which lie in the same order as the records in the file. A distribution sort is then performed on the records by distributing them into a quantity of buckets equal to the quantity of substrings produced in the internal sort of the addresses, the buckets being numbered to correspond to the numbering of the substrings. The distribution sort is designed such that the records distributed to a given bucket are those whose addresses are in the substring which has the same number as the bucket. The buckets are arranged in sequential order of numerical value. At the completion of the distribution sort, a single sort pass of the distributed records produces a string of records arranged in sequential order of key value. The distribution phase can be either single- or multi-pass. The invention contemplates an arrangement where the buckets can be visited cyclically for distributions of records thereinto or they can be selected for visitation in accordance with chosen criteria.
The invention enables the use of buckets whose size is on an average twice as large as the main store and makes possible advantageous minimization of seek and latency times.
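Cyclic visitation of the buckets can be pictured with the following sketch. It is one possible reading, not the published procedure: records are read into a bounded main store, and each time the store is full (or the input is exhausted) every held record destined for the current bucket is written out in a single visit before moving on to the next bucket that has records waiting. The function name, the `capacity` parameter, and the sample trace are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def distribute_cyclically(
    records: Iterable[str], trace: Iterable[int], n_buckets: int, capacity: int
) -> Tuple[Dict[int, List[str]], int]:
    """Distribute records through a bounded main store; return buckets and visit count."""
    buckets: Dict[int, List[str]] = {b: [] for b in range(1, n_buckets + 1)}
    store: Dict[int, List[str]] = defaultdict(list)  # held records, grouped by bucket
    held = 0        # number of records currently in main store
    visits = 0      # number of bucket visits (a stand-in for seeks)
    current = 1     # bucket to be visited next
    stream = iter(zip(records, trace))
    exhausted = False
    while True:
        # Fill main store from the input until it is full or the input ends.
        while not exhausted and held < capacity:
            item = next(stream, None)
            if item is None:
                exhausted = True
            else:
                record, bucket_no = item
                store[bucket_no].append(record)
                held += 1
        if held == 0:
            break
        # Advance cyclically to the next bucket that has records waiting.
        while not store[current]:
            current = current % n_buckets + 1
        # Write every held record for this bucket in a single visit.
        buckets[current].extend(store[current])
        held -= len(store[current])
        store[current] = []
        visits += 1
        current = current % n_buckets + 1
    return buckets, visits


# Illustrative input: twelve records and a bucket number for each address.
sample_trace = [2, 1, 1, 1, 1, 2, 2, 2, 1, 3, 2, 3]
buckets, visits = distribute_cyclically("abcdefghijkl", sample_trace, 3, capacity=4)
```

Because several records are written per visit, the twelve records above reach their buckets in six visits instead of twelve, which is the kind of saving in seek and latency time the text alludes to. Each bucket still receives its records in address order.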
[Drawing sheet 1 of 22 (original filed Dec. 16, 1971). The OCR text preserves a key table whose values are illegible and a flowchart with these legible steps: load the main store area with G items from the input sequence, with no items marked; test whether all items are marked (12), and if so end the string, unmark the marked items and start the next string (22); test for end of the input sequence (14); compare the item from the input sequence with the smallest unmarked item in main store and ask whether the input item is larger (16); append the smallest unmarked item in main store to the current output string and replace it in main store by the input item (18); sort the marked items in the main store and output them as the final string (24). Inventors: Brian T. Bennett and Archie C. McKellar.]
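For orientation, conventional replacement selection can be sketched as follows. This is the textbook method, not the modified version the publication refers to, whose details cannot be recovered from the drawing text; the function name and the `capacity` parameter (the number of items the main store holds) are assumptions. Items that cannot extend the current string are held back for the next one, which corresponds to marking.

```python
import heapq
from typing import Iterable, List, Tuple


def replacement_selection(items: Iterable[int], capacity: int) -> List[List[int]]:
    """Split `items` into ascending strings using a main store of `capacity` items."""
    source = iter(items)
    # Heap entries are (string_number, value). An entry carrying the next
    # string number plays the role of a "marked" item and is held back.
    heap: List[Tuple[int, int]] = []
    for value in source:
        heap.append((0, value))
        if len(heap) == capacity:
            break
    heapq.heapify(heap)

    strings: List[List[int]] = []
    while heap:
        string_no, value = heapq.heappop(heap)
        if string_no == len(strings):
            strings.append([])  # only held-back items remain: start the next string
        strings[string_no].append(value)
        incoming = next(source, None)
        if incoming is not None:
            # An input item smaller than the item just output cannot extend
            # the current string, so it is assigned to the next one.
            target = string_no if incoming >= value else string_no + 1
            heapq.heappush(heap, (target, incoming))
    return strings


runs = replacement_selection([5, 2, 8, 1, 9, 3, 7, 4, 6], capacity=3)
# runs == [[2, 5, 8, 9], [1, 3, 4, 6, 7]]
```

With a store of three items the nine inputs fall into two strings of four and five items, longer than the store itself; on randomly ordered input the expected string length of this method is about twice the store size.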
[Drawing sheets 2 and 3 of 22. Legible content: a merge flowchart with the steps "input the first item from each string to be merged" (26), "select the smallest item, output it and replace it if possible with the next occurring item on the same string" (28), and "any items left to be merged?" (30); FIGS. 5A, 5B and 5C, labelled string 1, string 2 and string 3; FIGS. 6A and 6B; and a distribution flowchart that refers to REC(I), PTR, BTM(L), TOP(L), the next bucket number, the next record, and "output REC(I) to bucket (L)". The numeric values in these figures are not reliably recoverable.]
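The merge step, in which each address picks up the number of the string it came from, might look like the sketch below. The string contents are a guess at what the garbled FIGS. 5A to 5C show and should be treated as an assumption, as should the function name. The merge itself follows the legible flowchart text: take the first item of every string, repeatedly output the smallest, and replace it with the next item from the same string.

```python
import heapq
from typing import List, Tuple


def merge_with_string_numbers(strings: List[List[int]]) -> List[Tuple[int, int]]:
    """Merge ascending strings of addresses into (address, string_number) pairs."""
    # Input the first item from each string to be merged.
    heap = [
        (string[0], number, 0)
        for number, string in enumerate(strings, start=1)
        if string
    ]
    heapq.heapify(heap)
    merged: List[Tuple[int, int]] = []
    while heap:  # any items left to be merged?
        # Select the smallest item and output it with its string number.
        address, number, pos = heapq.heappop(heap)
        merged.append((address, number))
        # Replace it, if possible, with the next item on the same string.
        if pos + 1 < len(strings[number - 1]):
            heapq.heappush(heap, (strings[number - 1][pos + 1], number, pos + 1))
    return merged


# Assumed example: three ascending strings of record addresses 1..12.
strings = [[2, 3, 4, 5, 9], [1, 6, 7, 8, 11], [10, 12]]
merged = merge_with_string_numbers(strings)
trace = [number for _address, number in merged]
```

Dropping the address from each merged pair leaves `trace`, a list of string numbers in address order, which is the form the distribution pass needs: the record at address 1 goes to bucket 2, the records at addresses 2 to 5 go to bucket 1, and so on.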
[Drawing sheets 4 to 22 of 22 (FIGS. 8B through 51A). The OCR text of these sheets is largely illegible. Recoverable content: tables of REC, PTR, TOP, BTM and NB values; distribution flowcharts with the steps "get next bucket number", "get next record", "is main store full?", "any more records to be read into main store?", "any records for current bucket?", "output all records for current bucket", "go to next bucket if necessary" and "choose new bucket"; a flowchart computing bucket boundaries BND(I) from UPPR and LWR for a multi-pass distribution; a diagram (FIG. 33 and following) showing the trace for pass 1 (bucket numbers 1 to 16 in address order) and the superbuckets and subtraces produced for passes 2 and 3; and buffered input/output flowcharts with the steps "initiate read of next block of records into buffer W", "test: is read into buffer W complete? If not, wait until complete", "output partial block from buffer to bucket L" and "read previous partial block in bucket L into buffer".]
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US39862073A | 1973-09-17 | 1973-09-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
UST921028I4 (en) | 1974-04-23 |
Family
ID=23576091
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US921028D (pending; published as UST921028I4, en) | 1973-09-17 | 1973-09-17 | Sort process |
Country Status (1)
Country | Link |
---|---|
US (1) | UST921028I4 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4536857A (en) | 1982-03-15 | 1985-08-20 | U.S. Philips Corporation | Device for the serial merging of two ordered lists in order to form a single ordered list |
US5349684A (en) * | 1989-06-30 | 1994-09-20 | Digital Equipment Corporation | Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA1165449A (en) | Qualifying and sorting file record data | |
US4433392A (en) | Interactive data retrieval apparatus | |
US2800277A (en) | Controlling arrangements for electronic digital computing machines | |
US20190324947A1 (en) | Method, device and computer program product for deleting snapshots | |
US6457014B1 (en) | System and method for extracting index key data fields | |
UST921028I4 (en) | Sort process | |
US7003653B2 (en) | Method for rapid interpretation of results returned by a parallel compare instruction | |
US3662400A (en) | Subsidiary document identification system | |
CN106354721A (en) | Retrieval method and device based on authority | |
US3613086A (en) | Compressed index method and means with single control field | |
US3633179A (en) | Information handling systems for eliminating distinctions between data items and program instructions | |
US6182071B1 (en) | Sorting and summing record data including generated sum record with sort level key | |
US4327407A (en) | Data driven processor | |
US6163783A (en) | Check data operation for DB2 | |
Brooker | An attempt to simplify coding for the Manchester electronic computer | |
JPH0773187A (en) | Retrieving system | |
Katz et al. | An experiment in non-procedural programming | |
Lombardi | Mathematical structure of nonarithmetic data processing procedures | |
GB812015A (en) | Improvements in or relating to information sorting systems | |
JP2943693B2 (en) | Sort work file space management method | |
JP3012482B2 (en) | String data management system | |
JP2697630B2 (en) | Level-by-level development method for material requirements planning | |
JPS63104132A (en) | Extension system for summary of parts list | |
CN116821090A (en) | Financial system migration test method, device, computer equipment and storage medium | |
GB1498371A (en) | Text transforming apparatus |