UST913007I4 - Sort process - Google Patents

Sort process

Info

Publication number
UST913007I4
Authority
US
United States
Prior art keywords
file
sample
subsets
sorted
records
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/22: Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
    • G06F7/24: Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers; sorting methods in general

Definitions

  • the records of each of the subsets are again partitioned into l+1 subsets as described above and this process is continued until the size of the subsets is small enough so that they can be conveniently sorted into respective ordered sequences, preferably employing a tree type sort.
  • the ordered sequences are then concatenated to form the sorted file.
  • the random sample may be provided, for example, by using a random number generator to generate integers in the range of 1 to n, wherein n is the total quantity of records in the file, until sl+s-1 distinct integers have been generated.
  • the records at these integer addresses in the file can then be selected to constitute the sample.
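
The sample-size arithmetic behind these definitions can be checked directly. The following is a sketch rather than text from the publication: it assumes the sorted sample is written Y(1) ≤ ... ≤ Y(sl+s-1) and that the partitioning keys are Y(s), Y(2s), ..., Y(ls). The values s = 3 and l = 4 are chosen here purely for illustration.

```latex
\begin{align*}
\text{partitioning keys: } & Y(s),\, Y(2s),\, \dots,\, Y(ls)
    && (l \text{ keys, hence } l+1 \text{ subsets}) \\
\text{sample keys in each of the } l+1 \text{ gaps around them: } & s-1 \\
\text{total sample size: } & l + (l+1)(s-1) = sl + s - 1 \\
\text{example } (s=3,\ l=4): \quad & 4 + 5 \cdot 2 = 14 = 3 \cdot 4 + 3 - 1
\end{align*}
```

Under these assumptions every subset boundary has exactly s-1 sample keys on either side of it, which is presumably why the sample size takes the form sl+s-1.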

Abstract

In the sort process disclosed herein, there is first selected a random sample of the records of a file to be sorted into an ordered sequence. This sample may suitably have the size sl+s-1, wherein l+1 is the quantity of subsets desired from a particular distribution pass and s is a selectable parameter. The selected sample is sorted into an ordered sequence and the file is then partitioned in accordance with every sth key of the sorted sample into l+1 subsets. The records of each of the subsets are again partitioned into l+1 subsets as described above, and this process is continued until the size of the subsets is small enough so that they can be conveniently sorted into respective ordered sequences, preferably employing a tree type sort. The ordered sequences are then concatenated to form the sorted file. The random sample may be provided, for example, by using a random number generator to generate integers in the range of 1 to n, wherein n is the total quantity of records in the file, until sl+s-1 distinct integers have been generated. The records at these integer addresses in the file can then be selected to constitute the sample.
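
The sample-selection step described at the end of the abstract can be sketched as follows. This is a minimal illustration, not the publication's own program: the function names `select_sample_addresses` and `select_sample`, the `rng` parameter, and the use of an in-memory Python list to stand in for the file are all assumptions made here.

```python
import random


def select_sample_addresses(n, s, l, rng=None):
    """Generate random integers in 1..n until s*l + s - 1 distinct ones exist.

    Hypothetical helper: n is the number of records in the file, s and l
    are the parameters named in the abstract.
    """
    rng = rng or random.Random()
    target = s * l + s - 1
    if target > n:
        raise ValueError("sample size exceeds the number of records")
    chosen = set()
    while len(chosen) < target:
        # A repeated integer leaves the set unchanged, so the loop simply
        # keeps drawing until enough distinct addresses are collected.
        chosen.add(rng.randint(1, n))
    return sorted(chosen)


def select_sample(records, s, l, rng=None):
    """Return the records stored at the selected 1-based addresses."""
    addresses = select_sample_addresses(len(records), s, l, rng)
    return [records[a - 1] for a in addresses]
```

For example, `select_sample(records, s=3, l=4)` returns 14 records taken from distinct positions of `records`. Drawing until enough distinct integers appear is cheap when the sample is much smaller than the file, which is the situation the abstract appears to have in mind.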

Description

DEFENSIVE PUBLICATION UNITED STATES PATENT OFFICE Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection and reproduction may be purchased for 30 cents a sheet.
Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED AUGUST 14, 1973
T913,007
SORT PROCESS
Archie Charles McKellar, Mount Kisco, N.Y., assignor to International Business Machines Corporation, Armonk, N.Y.
Continuation of application Ser. No. 214,200, Dec. 30, 1971. This application Feb. 20, 1973, Ser. No. 333,920
Int. Cl. G06f 7/06, 7/22
U.S. Cl. 444-1
8 Sheets Drawing. 25 Pages Specification
In the sort process disclosed herein, there is first selected a random sample of the records of a file to be sorted into an ordered sequence. This sample may suitably have the size sl+s-1, wherein l+1 is the quantity of subsets desired from a particular distribution pass and s is a selectable parameter. The selected sample is sorted into an ordered sequence and the file is then partitioned in accordance with every sth key of the sorted sample into l+1 subsets. The records of each of the subsets are again partitioned into l+1 subsets as described above, and this process is continued until the size of the subsets is small enough so that they can be conveniently sorted into respective ordered sequences, preferably employing a tree type sort. The ordered sequences are then concatenated to form the sorted file. The random sample may be provided, for example, by using a random number generator to generate integers in the range of 1 to n, wherein n is the total quantity of records in the file, until sl+s-1 distinct integers have been generated. The records at these integer addresses in the file can then be selected to constitute the sample.
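
The overall process can be sketched in a few lines. This is an illustrative reading of the description under stated assumptions, not the publication's own program: records are treated as directly comparable keys held in a Python list, the name `sample_sort` and the `small` threshold are hypothetical, the built-in `sorted` stands in for the preferred tree type sort, and `random.Random.sample` replaces the generate-until-distinct procedure for picking the sample. How records equal to a partitioning key are assigned is also an assumption (here they go to the higher subset).

```python
import bisect
import random


def sample_sort(records, s=3, l=4, small=16, rng=None):
    """Sort by partitioning on every s-th key of a sorted random sample."""
    rng = rng or random.Random()
    n = len(records)
    sample_size = s * l + s - 1
    # Small subsets are sorted directly (stand-in for the tree type sort).
    if n <= max(small, sample_size):
        return sorted(records)
    # Select and sort a random sample of sl+s-1 records.
    sample = sorted(rng.sample(records, sample_size))
    # Every s-th sample key, Y(s), Y(2s), ..., Y(ls), bounds a subset.
    splitters = [sample[i * s - 1] for i in range(1, l + 1)]
    # Distribution pass: place each record in one of the l+1 subsets.
    subsets = [[] for _ in range(l + 1)]
    for rec in records:
        subsets[bisect.bisect_right(splitters, rec)].append(rec)
    # Guard (an addition of this sketch): if one subset received every
    # record, e.g. when all keys are equal, stop recursing.
    if max(len(sub) for sub in subsets) == n:
        return sorted(records)
    # Partition each subset again, then concatenate the ordered sequences.
    out = []
    for sub in subsets:
        out.extend(sample_sort(sub, s, l, small, rng))
    return out
```

A small usage example with illustrative values:

```python
data = [4, 9, 17, 24, 13, 6, 25, 2, 18, 14, 3]
print(sample_sort(data, s=2, l=2, small=4))
# → [2, 3, 4, 6, 9, 13, 14, 17, 18, 24, 25]
```

Because the subsets cover disjoint, increasing key ranges, concatenating them in order needs no merge step.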
[Drawings: 8 sheets. Sheet headers: Aug. 14, 1973, A. C. McKellar, T913,007, Sort Process, Original Filed Dec. 30, 1971. Legible figure content is summarized below.]
[Sheet 1: figure with the label "maximum key value".]
[FIG. 2: flowchart. This program will sort a file consisting of records X(1), ..., X(n): select a random sample of size sl+s-1 from the file; sort the sample to obtain Y(1), ..., Y(sl+s-1); partition the file into l+1 subsets; test the size of each subset S_i; sort S_i by this program; concatenate the subsets.]
[FIG. 3A: table of addresses 1 to 11 against records, legible as 4, 9, 17, 24, 13, 6, 25, 2, 18, 14, 3.]
[FIG. 3B: the same addresses with the records in sorted order, legible as 2, 3, 4, 6, 9, 13, 14, 17, 18, 24, 25.]
[Further figures through FIG. 6: a file divided into subsets.]
[FIG. 7: flowchart. Sort bucket K(I) of level (I); select a random sample; sort the sample; partition the file, counting the quantity of records which go into each bucket; end.]
[FIG. 8: flowchart of the partitioning step. Put sample records into the subsets; input a record T from the file; test for end of file; find i such that T belongs to S_i; put T into S_i; end.]
[FIG. 10A: table of addresses against records.]
[FIG. 10B: addresses and records laid out across cylinders, showing filled and unfilled areas.]
[FIG. 11: flowchart of a procedure on Y(1), ..., Y(n) that uses LOWER and UPPER pointers to exchange records and then sorts each resulting part "by this procedure".]
US913007D 1973-02-20 1973-02-20 Sort process Pending UST913007I4 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US33392073A 1973-02-20 1973-02-20

Publications (1)

Publication Number Publication Date
UST913007I4 (en) 1973-08-14

Family

ID=23304814

Family Applications (1)

Application Number Title Priority Date Filing Date
US913007D Pending UST913007I4 (en) 1973-02-20 1973-02-20 Sort process

Country Status (1)

Country Link
US (1) UST913007I4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5349684A (en) * 1989-06-30 1994-09-20 Digital Equipment Corporation Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
