US20030226135A1 - Optimized program analysis - Google Patents

Optimized program analysis

Info

Publication number
US20030226135A1
Authority
US
United States
Prior art keywords
optimized
computer
file
executable
symbol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/443,316
Inventor
Ajay Sethi
Sameer Shisodia
Mahantesh Hosmath
Ritesh Motlani
Ramesh Bhattiprolu
Kirk Bradley
John Pullokkaran
Sunil Kumar
Gopalaswamy Ramesh
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corp
Priority to US10/443,316
Assigned to ORACLE INTERNATIONAL CORPORATION. Assignment of assignors interest (see document for details). Assignors: BHATTIPROLU, RAMESH; BRADLEY, KIRK; HOSMATH, MAHANTESH; MOTLANI, RITESH; SETHI, AJAY; SHISODIA, SAMEER; RAMESH, GOPALASWAMY; KUMAR, SUNIL; PULLOKKARAN, JOHN
Publication of US20030226135A1
Priority to US11/508,576 (US8156478B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/366 Software debugging using diagnostics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0769 Readable error formats, e.g. cross-platform generic formats, human understandable formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0766 Error or fault reporting or storing
    • G06F11/0778 Dumping, i.e. gathering error/state information after a fault for later diagnosis

Definitions

  • symbol table 204 is represented by two distinct lists, symbol list 310 and symbol info list 308 .
  • Each entry in symbol list 310 points to an entry in symbol info list 308 , which lists symbol type details.
  • Entries in symbol info list 308 each have a pointer which corresponds to an entry in type table 206 .
  • the type table 206 is represented by two distinct lists, types list 306 and type offset list 314. There is an entry in types list 306 corresponding to every type in the executable 102. Complex types, such as structures and functions, have an additional pointer to the type offset list 314, which lists related elements or parameters.
  • Symbol table 204 has an entry for each type, listed in the type offset list 314 . Entries in this type offset list 314 refer to an entry in the symbol table 204 that identifies its parent. An identifier, such as a flag, may be used to distinguish a parent symbol from a child symbol.
  • the four closely interlinked lists 300 represent details for reconstructed symbol and type tables as depicted in 204 and 206 respectively.
  • the approach for analysis of optimized executables described herein may be implemented in a variety of ways and the invention is not limited to any particular implementation.
  • the approach may be implemented as a stand-alone mechanism.
  • the approach may be implemented in computer software, hardware, or a combination thereof.
  • FIG. 4 is a block diagram that depicts a computer system 400 upon which an embodiment of the invention may be implemented.
  • Computer system 400 includes a bus 402 or other communication mechanism for communicating information, and a processor 404 coupled with bus 402 for processing information.
  • Computer system 400 also includes a main memory 406 , such as a random access memory (RAM) or other dynamic storage device, coupled to bus 402 for storing information and instructions to be executed by processor 404 .
  • Main memory 406 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404 .
  • Computer system 400 further includes a read only memory (ROM) 408 or other static storage device coupled to bus 402 for storing static information and instructions for processor 404 .
  • a storage device 410 such as a magnetic disk or optical disk, is provided and coupled to bus 402 for storing information and instructions.
  • Computer system 400 may be coupled via bus 402 to a display 412 , such as a cathode ray tube (CRT), for displaying information to a computer user.
  • An input device 414 is coupled to bus 402 for communicating information and command selections to processor 404 .
  • Another type of user input device is cursor control 416, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 404 and for controlling cursor movement on display 412.
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • the invention is related to the use of computer system 400 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406 . Such instructions may be read into main memory 406 from another computer-readable medium, such as storage device 410 . Execution of the sequences of instructions contained in main memory 406 causes processor 404 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
  • Non-volatile media includes, for example, optical or magnetic disks, such as storage device 410 .
  • Volatile media includes dynamic memory, such as main memory 406 .
  • Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 402 . Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
  • Computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
  • Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 404 for execution.
  • the instructions may initially be carried on a magnetic disk of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
  • a modem local to computer system 400 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal.
  • An infrared detector can receive the data carried in the infrared signal and appropriate circuitry can place the data on bus 402 .
  • Bus 402 carries the data to main memory 406 , from which processor 404 retrieves and executes the instructions.
  • the instructions received by main memory 406 may optionally be stored on storage device 410 either before or after execution by processor 404 .
  • Computer system 400 also includes a communication interface 418 coupled to bus 402 .
  • Communication interface 418 provides a two-way data communication coupling to a network link 420 that is connected to a local network 422 .
  • communication interface 418 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line.
  • communication interface 418 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
  • Wireless links may also be implemented.
  • communication interface 418 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • Network link 420 typically provides data communication through one or more networks to other data devices.
  • network link 420 may provide a connection through local network 422 to a host computer 424 or to data equipment operated by an Internet Service Provider (ISP) 426 .
  • ISP 426 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 428 .
  • Internet 428 uses electrical, electromagnetic or optical signals that carry digital data streams.
  • the signals through the various networks and the signals on network link 420 and through communication interface 418 which carry the digital data to and from computer system 400 , are exemplary forms of carrier waves transporting the information.
  • Computer system 400 can send messages and receive data, including program code, through the network(s), network link 420 and communication interface 418 .
  • a server 430 might transmit a requested code for an application program through Internet 428 , ISP 426 , local network 422 and communication interface 418 .
  • Processor 404 may execute the received code as it is received, and/or the code may be stored in storage device 410 or other non-volatile storage for later execution. In this manner, computer system 400 may obtain application code in the form of a carrier wave.

Abstract

The present invention generally relates to computer software, and more specifically, to a computerized utility for analysis of optimized program files. A method and apparatus for optimized program analysis is disclosed.

Description

    PRIORITY CLAIM AND RELATED APPLICATION
  • This application claims domestic priority from prior U.S. provisional application Ser. No. 60/384,206, entitled “Platform Independent Core Dump Analysis,” filed May 29, 2002, naming as inventor Ajay Sethi, the entire disclosure of which is hereby incorporated by reference for all purposes as if fully set forth herein. This application is related to U.S. non-provisional application Ser. No. 10/XXX,XXX (Attorney Docket No. 50277-2028), entitled “Representation of Core Files in a Generic Format,” filed on the same day herewith, naming as inventors Ajay Sethi, Sameer Shisodia, Mahantesh Hosmath, Ritesh Motlani, Ramesh Bhattiprolu, Kirk Bradley, John Pullokkaran, Sunil Kumar, and Gopalaswamy Ramesh, the entire disclosure of which is hereby incorporated by reference for all purposes as if fully set forth herein. [0001]
  • FIELD OF THE INVENTION
  • The present invention generally relates to computer software, and more specifically, to a computerized utility for debugging software. [0002]
  • BACKGROUND OF THE INVENTION
  • Unless otherwise indicated, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section. [0003]
  • When developing program code for multiple computer operating systems, the program code is generic except for specific layers performing platform-dependent tasks. Generic program code should compile and run on all platforms. A core file is typically generated by the operating system when a process fails because of an irrecoverable error. Information obtained from this core file serves as a starting point for determining and analyzing what contributed to the failure. [0004]
  • Commercially available software programs are often shipped in an optimized format, without symbol and type information. In conventional debugging and analysis techniques, lack of this information can necessitate running a process multiple times. Rebuilding unoptimized code is extremely inefficient for software programs with a large source base. When it is not readily apparent how much of the code needs rebuilding, it is impractical to rebuild the code in its entirety because of the size of the resulting binary. [0005]
  • To circumvent this limitation, engineers manually inspect source code while running optimized executables, trying to pinpoint areas that could have contributed to the error. To debug code, engineers typically rebuild the suspect portion of the code unoptimized, run it in a debugging environment on the same platform where the error occurred, and attempt to replicate the error. Time and inaccuracy are major drawbacks to this conventional debugging and analysis technique. In addition, the unoptimized code may not behave consistently with the optimized code because the behavior of the executable may be different, and therefore, the error may not be reproducible. [0006]
  • Support and development teams typically perform debugging in tandem. Platforms at client, development, and support sites may well vary, and core file formats vary from platform to platform. Additionally, byte ordering of data differs depending on machine architecture. There are many limitations to conventional debugging and analysis techniques. [0007]
  • For example, in most collaborative support and development environments, support teams are the first to receive and analyze core files generated by a software crash at a client site. Generally, development and support work together to troubleshoot and resolve code errors. One benefit of collaborative environments is that individuals are able to contribute to areas of the code in which they have expertise. However, a drawback to traditional techniques is that collaborative environments often include multiple platforms, operating system versions, and environments. Traditional techniques can require support and development personnel to repeat steps in their separate environments. Both time and effort would be saved if developers and support analysts were able to contribute to editing and building code without duplicating effort. Incremental and persistent capture and storage of analysis and debugging data would save additional time and effort. [0008]
  • In addition, when platforms at client and development sites are different, replicating bugs may be difficult or even impossible. Conventional debuggers require a compiled binary for each platform. A drawback to traditional techniques is that even with platform-specific layers, there may be bugs on a specific platform that will not replicate on another platform. Traditional techniques require the developer to replicate, change, test, and debug the code in both the deployment and development environments. This approach requires that the developer be familiar with tools, debuggers and other support software on both platforms. If the developer could analyze the code in a generic format on any platform, time and effort would be saved. [0009]
  • Based on the foregoing, it is desirable to provide techniques for analysis of optimized code in a generic format, wherein the optimized code can be analyzed to help determine the errors in code that the operating system was unable to handle. Additionally, it is desirable to analyze code using existing core dumps from existing optimized binaries. [0010]
  • SUMMARY OF THE INVENTION
  • When developing code for multiple platforms, traditionally, a majority of the code is generic except for the platform-specific layers. Ideally, the generic code should compile and run as is on all platforms. The platform-specific layers are added for tasks which need platform-specific implementations and work differently across platforms. Because of the variations in core file formats, traditional techniques make it impossible to transparently analyze core files from one platform on another. [0011]
  • Techniques are provided for analysis of optimized program files. According to one aspect, a mechanism for analysis of core files in a platform independent manner is provided. According to another aspect, a method of analysis and interpretation of converted data from existing optimized executables, source files, and core files is provided. The converted data can be created in a generic format, such as the generic core format (GCORE) according to the techniques as described in non-provisional patent application Ser. No. 10/XXX,XXX “Representation of Core Files in a Generic Format” filed on the same day herewith. Converted data, henceforth referenced as a GCORE, can be analyzed and interpreted according to the techniques described herein. [0012]
  • Other objects and advantages will become readily apparent from the following detailed description. The invention can be embodied in different ways, and its details varied without departing from the invention. Accordingly, the drawings and descriptions are to be regarded as illustrative in nature, and not as restrictive. [0013]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention is depicted by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which: [0014]
  • FIG. 1 is a block diagram that depicts a high level overview of a system for optimized program analysis; [0015]
  • FIG. 2 is a block diagram that depicts an example of a system for optimized program analysis; [0016]
  • FIG. 3 is a block diagram that depicts an example of a generated symbol and type table for analysis of optimized code in a generic format; [0017]
  • FIG. 4 is a block diagram that depicts a computer system upon which embodiments of the invention may be implemented. [0018]
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • A method and apparatus for analysis of optimized program files is herein described. Specific details are set forth to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these details. In other instances, well-known structures and devices are depicted in block diagram format to avoid unnecessarily obscuring the present invention. [0019]
  • GCORE: The Generic Core
  • A generic representation of core files and executables, or GCORE as it is henceforth referenced, contains information about core files and executables. According to an embodiment, GCORE includes a superset of binary formats used within UNIX. Examples include: the Executable and Linking Format (ELF), the Common Object File Format (COFF), the Programmable Instruction Set Computers Format (PRISC), and the Mobilization Stationing, Planning, and Execution System Format (MSPES). This superset of binary formats can be extended to support a multitude of binary formats. Since GCORE captures different segments across a multitude of binary formats, GCORE overcomes the debugging requirement of having a compiled binary for each platform. The code base for GCORE is generic; therefore, analysis can be performed on any platform. According to an embodiment, analysis of the GCORE can be done according to the techniques described herein. [0020]
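  • As an illustration of why a generic format needs to record platform details, the following minimal Python sketch decodes the identification bytes at the start of an ELF image and uses the reported byte order to read a 32-bit value. It is not taken from the application; the function names `describe_elf_ident` and `read_u32` are hypothetical, and only the ELF identification layout (magic bytes, class byte, data-encoding byte) comes from the ELF format itself.

```python
import struct

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF image


def describe_elf_ident(header: bytes) -> dict:
    """Decode the start of an ELF file (hypothetical helper, not from the patent)."""
    if len(header) < 6 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    word_size = {1: 32, 2: 64}.get(header[4])            # EI_CLASS byte
    byte_order = {1: "little", 2: "big"}.get(header[5])  # EI_DATA byte
    if word_size is None or byte_order is None:
        raise ValueError("unrecognized ELF class or data encoding")
    return {"format": "ELF", "word_size": word_size, "byte_order": byte_order}


def read_u32(raw: bytes, offset: int, byte_order: str) -> int:
    """Read an unsigned 32-bit value using the byte order of the source platform."""
    prefix = "<" if byte_order == "little" else ">"
    return struct.unpack_from(prefix + "I", raw, offset)[0]
```

  • The same four bytes decode to different integers depending on the recorded byte order, which is one reason an analyzer running on a different platform cannot simply reuse its own native conventions.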
  • Optimized Program Analysis
  • In the analysis of a core file, it is often difficult to ascertain what caused an executable to fail. Most of the data required for meaningful analysis exists in the core file's data sections, in raw binary format. This raw data cannot be interpreted directly because symbol information is stripped from optimized executables and is therefore not available. Debugging core dumps produced by executables on many operating systems involves determining the state of a process at the time of the core dump, which comprises information such as the following: [0021]
  • The function call stack and parameters of the called function. [0022]
  • The values of local and global variables in the executable. [0023]
  • Contents of registers. [0024]
  • Signal state at point of failure. [0025]
  • Of the above, in optimized executables, it is often not possible to get the parameters of the function calls and the values of the variables, whether local or global. This necessitates recompiling the code unoptimized and reproducing the problem to produce a core dump. However, in the real world, this can cause a few problems: [0026]
  • Unoptimized executables do not always behave exactly like optimized ones. [0027]
  • The problem may be difficult to reproduce consistently. [0028]
  • For larger executables, it may be difficult to isolate errors because it may not be feasible to recompile large portions of code as unoptimized. [0029]
  • According to an embodiment, analyzing the core dump of an optimized executable file is accomplished by reconstructing the information about symbol types found in the executable. Type information describes the entire declaration of a symbol. For example, given a declaration like “int *a[10]”, the expressions “*a”, “*a[5]”, or just “a” itself can all produce meaningful data. Reconstruction of this information is possible by parsing declarations in the original source code. After parsing, symbols extracted from the core are matched with their corresponding type details. For each symbol, an entry is added to a types table. The type information is combined with the starting address for each symbol in the core and the type's size to extract the values that program variables held when execution of the program was halted. According to an embodiment, an analyzer examines the declaration of a structure and its type information, referring to the header file where the structure is defined. Based on this information, the size of a type may be determined from the sizes of intrinsic data types, for example, the number of bytes for an integer and for a character. [0030]
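  • The step of combining a starting address, a type, and a size to recover values can be sketched as follows for the “int *a[10]” example. This is a minimal illustration under stated assumptions, not the implementation described in the application: the `Segment` class, the toy memory layout, and the sizes (4-byte int, 8-byte pointer, little-endian) are all assumptions chosen for the example.

```python
import struct

INT_SIZE = 4  # assumed size of int on the platform that produced the core
PTR_SIZE = 8  # assumed size of a pointer on that platform


class Segment:
    """A data segment from a core file: raw bytes plus the address they start at."""

    def __init__(self, base, data, byte_order="little"):
        self.base = base
        self.data = data
        self.prefix = "<" if byte_order == "little" else ">"

    def _read(self, address, fmt):
        # Translate the virtual address into an offset inside the raw bytes.
        return struct.unpack_from(self.prefix + fmt, self.data, address - self.base)[0]

    def read_int(self, address):
        return self._read(address, "i")

    def read_ptr(self, address):
        return self._read(address, "Q")


def array_of_int_ptrs(seg, start, count):
    """Value of `a` for `int *a[count]`: the list of stored pointers."""
    return [seg.read_ptr(start + i * PTR_SIZE) for i in range(count)]


def build_toy_segment():
    """Fabricate a tiny segment: ten pointers followed by the ten ints they target."""
    base = 0x1000
    ints_at = base + 10 * PTR_SIZE
    pointers = [ints_at + i * INT_SIZE for i in range(10)]
    values = [i * i for i in range(10)]
    data = struct.pack("<10Q", *pointers) + struct.pack("<10i", *values)
    return Segment(base, data), base


seg, a_addr = build_toy_segment()
a = array_of_int_ptrs(seg, a_addr, 10)  # "a": all ten pointers
star_a = a[0]                           # "*a": the first pointer, a[0]
star_a5 = seg.read_int(a[5])            # "*a[5]": the int that a[5] points to
```

  • Without the reconstructed declaration, the analyzer would know only the address of `a`; the element size, the element count, and the fact that each element must be dereferenced all come from the type information.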
  • After an executable has been compiled and optimized, symbol type information is stripped from the executable. An entry in an optimized executable has an address which points to a data segment within the core file. From just the operating system core file and the optimized executable, it is impossible to gather enough information to reconstruct what caused the failure. After compilation, some information exists about global symbols, such as the symbol name, the address of the data, and its value. However, no information about symbol type and size exists. According to an embodiment, analyzing the core dump of an optimized executable is done by reconstructing information about the types of the symbols found in the optimized executable. [0031]
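  • To see why a name and an address are not enough, consider the short Python sketch below, which decodes the same four raw bytes under several candidate types. The helper name `interpretations` and the little-endian assumption are illustrative only.

```python
import struct


def interpretations(raw: bytes) -> dict:
    """Decode four raw bytes under several candidate C types (little-endian assumed)."""
    if len(raw) != 4:
        raise ValueError("this sketch expects exactly four bytes")
    return {
        "int": struct.unpack("<i", raw)[0],
        "float": struct.unpack("<f", raw)[0],
        "short[2]": struct.unpack("<2h", raw),
        "unsigned char[4]": struct.unpack("<4B", raw),
    }


# Four bytes found at some symbol's address in a data segment.
candidates = interpretations(bytes([0x00, 0x00, 0x80, 0x3F]))
```

  • Each reading is equally valid as far as the bytes are concerned; only the declared type and size of the symbol select the correct one.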
  • FIG. 1 is a block diagram that depicts a high level overview of a system for analysis of optimized executables. According to an embodiment, a system for analysis of a generic representation of an optimized executable core file, such as a GCORE file, is provided. [0032]
  • To create a generic core file for analysis, a converter component 110 is employed to convert data from optimized executable 102 and operating system core file 104. The converter component 110 reads both input files, the executable 102 and the operating system core file 104, combines them into a generic format, and establishes initial linkages between these two input files within the GCORE 106. Symbol information 118 and type information 120 extracted from source files 130 are added to GCORE 106. The GCORE 106 is processed by an offline analyzer 200, which provides access to program structures and values that existed at the point of failure. The program structures and values are used in analysis and debugging of this failure. [0033]
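  • The flow just described can be sketched in Python as follows. The actual GCORE layout is defined in the related application and is not reproduced here, so every name below (`CoreSegment`, `GCore`, `convert`, `add_type_info`) and the choice of dictionaries and indices is a hypothetical stand-in meant only to show the idea of linking executable symbols to core segments and then attaching type information.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CoreSegment:
    """A block of memory captured in the operating system core file."""
    vaddr: int
    data: bytes

    def contains(self, address: int) -> bool:
        return self.vaddr <= address < self.vaddr + len(self.data)


@dataclass
class GCore:
    """Hypothetical in-memory stand-in for a generic core representation."""
    symbols: Dict[str, int]                 # symbol name -> address, from the executable
    segments: List[CoreSegment]             # memory contents, from the core file
    links: Dict[str, Optional[int]] = field(default_factory=dict)  # name -> segment index
    types: Dict[str, str] = field(default_factory=dict)            # name -> declared type


def convert(exe_symbols: Dict[str, int], core_segments: List[CoreSegment]) -> GCore:
    """Combine both inputs and record which segment holds each symbol's data."""
    gcore = GCore(dict(exe_symbols), list(core_segments))
    for name, address in gcore.symbols.items():
        gcore.links[name] = next(
            (i for i, seg in enumerate(gcore.segments) if seg.contains(address)),
            None,  # the symbol's memory was not captured in the core
        )
    return gcore


def add_type_info(gcore: GCore, declared_types: Dict[str, str]) -> None:
    """Attach type names parsed from source files to symbols the executable knows."""
    for name, type_name in declared_types.items():
        if name in gcore.symbols:
            gcore.types[name] = type_name
```

  • Keeping the linkage step separate from the type step mirrors the order in the text: the converter first joins the two binary inputs, and source-derived information is added afterward.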
  • [0034] According to an embodiment, FIG. 2 depicts details of offline analyzer 200. A parser and analyzer 202 processes information from executable 102, such as global, local, and structure/union members, and information about function parameters. The parser and analyzer 202 processes information from the operating system core file 104, such as virtual addresses and offsets. The parser and analyzer 202 also processes user commands 208, which contain user-defined type definitions that share a namespace with global symbols extracted by parsing code declarations for various types and functions. Parser and analyzer 202 interprets the processed information and generates an external reconstructed symbol table 204 and a types table 206. The reconstructed symbol and type information can then be made available to third-party applications such as a debugger or some other tool 212.
  • Layout of the Symbol and Type Tables
  • [0035] According to an embodiment, reconstruction of symbol and type information is performed by parsing declarations in source code. Symbols obtained from the operating system core file have corresponding type details. Therefore, for each type there exists an entry in a types table. A starting address for each symbol is available in the basic symbol table of the executable. From this information, type and size information can be gleaned as well.
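Parsing declarations to populate a types table can be sketched as follows. The sketch handles only single-line declarations of intrinsic types with optional pointer stars and one array dimension; the regular expression, the `BASE_SIZES` and `POINTER_SIZE` values (an assumed 32-bit target), and the keying of entries by symbol name are all assumptions for illustration, not the parser the specification describes.

```python
import re

# Assumed sizes for a 32-bit target; not taken from the specification.
BASE_SIZES = {"char": 1, "short": 2, "int": 4, "long": 4}
POINTER_SIZE = 4

DECL = re.compile(
    r"^\s*(?P<base>(?:unsigned\s+)?(?:char|short|int|long))\b\s*"
    r"(?P<stars>\**)\s*(?P<name>[A-Za-z_]\w*)\s*"
    r"(?:\[(?P<len>\d+)\])?\s*;\s*$"
)

def parse_declaration(line):
    """Return a types-table entry for one simple declaration, or None."""
    m = DECL.match(line)
    if m is None:
        return None
    base = " ".join(m["base"].split())
    depth = len(m["stars"])
    length = int(m["len"]) if m["len"] is not None else None
    # An array of pointers stores pointers, so the element size is the pointer size.
    elem = POINTER_SIZE if depth else BASE_SIZES[base.replace("unsigned ", "")]
    return {
        "name": m["name"],
        "base": base,
        "pointer_depth": depth,
        "array_len": length,
        "size": elem * (length if length is not None else 1),
    }

def build_types_table(source):
    """Collect an entry for every line of `source` that parses as a declaration."""
    table = {}
    for line in source.splitlines():
        entry = parse_declaration(line)
        if entry is not None:
            table[entry["name"]] = entry
    return table
```

Under these assumptions, the declaration `int *a[10];` yields an entry with a pointer depth of 1, an array length of 10, and a total size of 40 bytes, which is the size needed to read the whole array back from the core.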
  • [0036] As depicted in FIG. 2, symbol table 204 and types table 206 are generated by the parser and analyzer 202 with entries corresponding to each symbol in the executable 102. FIG. 3 depicts details of symbol table 204 and types table 206 according to an embodiment of the present invention.
  • [0037] According to an embodiment, reconstruction of symbol and type information is depicted through four closely interlinked lists 300. The four closely interlinked lists 300 represent value and parameter details for reconstructing symbol and type information.
  • Symbol Table
  • [0038] According to an embodiment, symbol table 204 is represented by two distinct lists, symbol list 310 and symbol info list 308. Each entry in symbol list 310 points to an entry in symbol info list 308, which lists symbol type details. Entries in symbol info list 308 each have a pointer which corresponds to an entry in types table 206.
  • Type Table
  • [0039] The types table 206 is represented by two distinct lists, types list 306 and type offset list 314. There is an entry in types list 306 corresponding to every type in the executable 102. Complex types, such as structures and functions, have an additional pointer to the type offset list 314, which lists related elements or parameters. Symbol table 204 has an entry for each type listed in the type offset list 314. Entries in the type offset list 314 refer to an entry in the symbol table 204 that identifies its parent. An identifier, such as a flag, may be used to distinguish a parent symbol from a child symbol. The four closely interlinked lists 300 represent details for the reconstructed symbol and types tables, depicted as 204 and 206 respectively.
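One way to picture the four interlinked lists is the sketch below, which uses list indices in place of pointers. The class names, field names, and the sample `struct point` data are hypothetical; only the overall shape (symbol list, symbol info list, types list, type offset list, with a parent reference and a child flag) follows the description above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TypeOffset:       # one entry of the type offset list (314)
    member: str
    offset: int         # byte offset of the member inside its parent
    type_index: int     # index into the types list
    parent_symbol: int  # index into the symbol list, identifying the parent

@dataclass
class TypeEntry:        # one entry of the types list (306)
    name: str
    size: int
    members: List[int] = field(default_factory=list)  # indices into the type offset list

@dataclass
class SymbolInfo:       # one entry of the symbol info list (308)
    type_index: int
    is_child: bool = False  # flag distinguishing child symbols from parents

@dataclass
class Symbol:           # one entry of the symbol list (310)
    name: str
    address: int
    info_index: int

types = [TypeEntry("int", 4), TypeEntry("char", 1), TypeEntry("struct point", 8, [0, 1])]
type_offsets = [TypeOffset("x", 0, 0, 0), TypeOffset("y", 4, 0, 0)]
symbol_info = [SymbolInfo(2)]
symbols = [Symbol("origin", 0x2000, 0)]

def layout(name):
    """Follow symbol -> symbol info -> type -> offsets to list member addresses."""
    sym = next(s for s in symbols if s.name == name)
    typ = types[symbol_info[sym.info_index].type_index]
    if not typ.members:
        return [(sym.name, sym.address, typ.name)]
    rows = []
    for i in typ.members:
        off = type_offsets[i]
        rows.append((f"{sym.name}.{off.member}", sym.address + off.offset, types[off.type_index].name))
    return rows
```

Calling `layout("origin")` walks all four lists once and returns one row per structure member, each with the address at which that member's value would be read from the core.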
  • [0040] External creation of the symbol and types information can be an effective solution to the problems that arise because of the optimization of executables after compilation of program source code, such as C programs. The invention eliminates the need for recompiling optimized executables, which reduces overhead (due to recompilation time) and enables analysts to determine causes of core dumps. Some issues, for example those related to memory corruption, disappear once executables are recompiled with the debug option. The invention is therefore applicable to any executable. To ensure that the released code performs well, executables are built with the maximum optimization level. Therefore, the invention simplifies analysis of errors encountered in any optimized executable.
  • Hardware Overview
  • [0041] The approach for analysis of optimized executables described herein may be implemented in a variety of ways and the invention is not limited to any particular implementation. The approach may be implemented as a stand-alone mechanism. Furthermore, the approach may be implemented in computer software, hardware, or a combination thereof.
  • [0042] FIG. 4 is a block diagram that depicts a computer system 400 upon which an embodiment of the invention may be implemented. Computer system 400 includes a bus 402 or other communication mechanism for communicating information, and a processor 404 coupled with bus 402 for processing information. Computer system 400 also includes a main memory 406, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 402 for storing information and instructions to be executed by processor 404. Main memory 406 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404. Computer system 400 further includes a read only memory (ROM) 408 or other static storage device coupled to bus 402 for storing static information and instructions for processor 404. A storage device 410, such as a magnetic disk or optical disk, is provided and coupled to bus 402 for storing information and instructions.
  • [0043] Computer system 400 may be coupled via bus 402 to a display 412, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 414, including alphanumeric and other keys, is coupled to bus 402 for communicating information and command selections to processor 404. Another type of user input device is cursor control 416, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 404 and for controlling cursor movement on display 412. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • [0044] The invention is related to the use of computer system 400 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406. Such instructions may be read into main memory 406 from another computer-readable medium, such as storage device 410. Execution of the sequences of instructions contained in main memory 406 causes processor 404 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
  • [0045] The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 404 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 410. Volatile media includes dynamic memory, such as main memory 406. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 402. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
  • [0046] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
  • [0047] Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to processor 404 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 400 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector can receive the data carried in the infrared signal and appropriate circuitry can place the data on bus 402. Bus 402 carries the data to main memory 406, from which processor 404 retrieves and executes the instructions. The instructions received by main memory 406 may optionally be stored on storage device 410 either before or after execution by processor 404.
  • [0048] Computer system 400 also includes a communication interface 418 coupled to bus 402. Communication interface 418 provides a two-way data communication coupling to a network link 420 that is connected to a local network 422. For example, communication interface 418 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 418 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 418 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • [0049] Network link 420 typically provides data communication through one or more networks to other data devices. For example, network link 420 may provide a connection through local network 422 to a host computer 424 or to data equipment operated by an Internet Service Provider (ISP) 426. ISP 426 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 428. Local network 422 and Internet 428 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 420 and through communication interface 418, which carry the digital data to and from computer system 400, are exemplary forms of carrier waves transporting the information.
  • [0050] Computer system 400 can send messages and receive data, including program code, through the network(s), network link 420 and communication interface 418. In the Internet example, a server 430 might transmit a requested code for an application program through Internet 428, ISP 426, local network 422 and communication interface 418.
  • [0051] Processor 404 may execute the received code as it is received, and/or store it in storage device 410 or other non-volatile storage for later execution. In this manner, computer system 400 may obtain application code in the form of a carrier wave.
  • Extensions and Alternatives
  • [0052] In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. Thus, the specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The invention includes other contexts and applications in which the mechanisms and processes described herein are available to other mechanisms, methods, programs, and processes.
  • [0053] In addition, in this disclosure, certain process steps are set forth in a particular order, and alphabetic and alphanumeric labels are used to identify certain steps. Unless specifically stated in the disclosure, embodiments of the invention are not limited to any particular order of carrying out such steps. In particular, the labels are used merely for convenient identification of steps, and are not intended to imply, specify or require a particular order of carrying out such steps. Furthermore, other embodiments may use more or fewer steps than those discussed herein.

Claims (8)

What is claimed is:
1. A method for analysis of optimized executables, the method comprising the computer-implemented steps of:
generating a types table,
generating a symbol table;
combining said types table, and said symbols table with existing global symbol table data available from an executable to generate an optimized program file,
parsing said optimized program file with an offline program analyzer,
analyzing said optimized program file with an offline program analyzer.
2. The method of claim 1, the method further comprising the computer-implemented steps of:
employing a converter component to read input from an executable and a core file,
employing a converter component to extract said symbol and type information from said source files,
establishing linkages between said two input files within a generic core file;
wherein,
said converter component establishes said linkages.
3. The method of claim 2, wherein storage of said linkages is done persistently.
4. The method of claim 2, wherein said method of analysis further comprises the computer-implemented step of:
parsing and analyzing said linkages wherein said linkages may be parsed and
analyzed transparently on a plurality of platforms.
5. The method of claim 2, wherein the method further comprises the computer-implemented step of:
analyzing said linkages within an analyzer component.
6. The method of claim 1, the method further comprising the computer-implemented steps of:
reconstructing information about the types of the symbols found in the executable;
adding entries into said types table;
wherein,
symbols that were obtained from an operating system core file include corresponding type details.
7. A computer-readable medium carrying one or more sequences of instructions for analysis of optimized executables, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
generating a types table,
generating a symbol table;
combining types table, and said symbols table with existing global symbol table data available from an executable to generate an optimized program file,
parsing said optimized program file with an offline program analyzer,
analyzing said optimized program file with an offline program analyzer.
8. A computer apparatus comprising:
a processor; and
a memory coupled to the processor, the memory containing one or more sequences of instructions for optimized program analysis, wherein execution of the one or more sequences of instructions by the processor causes the processor to perform the steps of:
generating a types table,
generating a symbol table;
combining types table, and said symbols table with existing global symbol table data available from an executable to generate an optimized program file,
parsing said optimized program file with an offline program analyzer,
analyzing said optimized program file with an offline program analyzer.
US10/443,316 2002-05-29 2003-05-21 Optimized program analysis Abandoned US20030226135A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/443,316 US20030226135A1 (en) 2002-05-29 2003-05-21 Optimized program analysis
US11/508,576 US8156478B2 (en) 2002-05-29 2006-08-22 Optimized program analysis

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US38420602P 2002-05-29 2002-05-29
US10/443,316 US20030226135A1 (en) 2002-05-29 2003-05-21 Optimized program analysis

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/508,576 Division US8156478B2 (en) 2002-05-29 2006-08-22 Optimized program analysis

Publications (1)

Publication Number Publication Date
US20030226135A1 true US20030226135A1 (en) 2003-12-04

Family

ID=29587069

Family Applications (3)

Application Number Title Priority Date Filing Date
US10/443,311 Active 2025-04-27 US7243338B2 (en) 2002-05-29 2003-05-21 Representation of core files in a generic format
US10/443,316 Abandoned US20030226135A1 (en) 2002-05-29 2003-05-21 Optimized program analysis
US11/508,576 Active 2027-01-27 US8156478B2 (en) 2002-05-29 2006-08-22 Optimized program analysis

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US10/443,311 Active 2025-04-27 US7243338B2 (en) 2002-05-29 2003-05-21 Representation of core files in a generic format

Family Applications After (1)

Application Number Title Priority Date Filing Date
US11/508,576 Active 2027-01-27 US8156478B2 (en) 2002-05-29 2006-08-22 Optimized program analysis

Country Status (1)

Country Link
US (3) US7243338B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030226134A1 (en) * 2002-05-29 2003-12-04 Oracle International Corporation Representation of core files in a generic format
US20050004936A1 (en) * 2003-07-03 2005-01-06 Oracle International Corporation Fact table storage in a decision support system environment
US20080177525A1 (en) * 2007-01-23 2008-07-24 Microsoft Corporation Integrated debugger simulator
US11487681B2 (en) * 2018-06-29 2022-11-01 Visa International Service Association Chip card socket communication

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8769517B2 (en) * 2002-03-15 2014-07-01 International Business Machines Corporation Generating a common symbol table for symbols of independent applications
US7496895B1 (en) * 2004-12-29 2009-02-24 The Mathworks, Inc. Multi-domain unified debugger
US7865774B2 (en) * 2007-09-19 2011-01-04 Cisco Technology, Inc. Multiprocessor core dump retrieval
US20100332549A1 (en) * 2009-06-26 2010-12-30 Microsoft Corporation Recipes for rebuilding files
US20110029819A1 (en) * 2009-07-31 2011-02-03 Virendra Kumar Mehta System and method for providing program tracking information
US20110113409A1 (en) * 2009-11-10 2011-05-12 Rodrick Evans Symbol capabilities support within elf
US8607098B2 (en) * 2011-05-26 2013-12-10 International Business Machines Corporation Generating appropriately sized core files used in diagnosing application crashes
US9619779B2 (en) * 2011-08-26 2017-04-11 Apple Inc. Client-side policy enforcement of developer API use
US9104796B2 (en) 2012-12-21 2015-08-11 International Business Machines Corporation Correlation of source code with system dump information
US9632911B2 (en) * 2013-02-08 2017-04-25 Red Hat, Inc. Stack trace clustering
US9098627B2 (en) * 2013-03-06 2015-08-04 Red Hat, Inc. Providing a core dump-level stack trace
US10460513B2 (en) * 2016-09-22 2019-10-29 Advanced Micro Devices, Inc. Combined world-space pipeline shader stages
US11687547B1 (en) * 2020-10-21 2023-06-27 T-Mobile Innovations Llc System and methods for an automated core dump to a Java heap dump conversion

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5339406A (en) * 1992-04-03 1994-08-16 Sun Microsystems, Inc. Reconstructing symbol definitions of a dynamically configurable operating system defined at the time of a system crash
US5560009A (en) * 1990-09-21 1996-09-24 Hewlett-Packard Company Generating symbolic debug information by merging translation and compiler debug information
US5740444A (en) * 1992-11-19 1998-04-14 Borland International, Inc. Symbol browsing in an object-oriented development system
US5854924A (en) * 1996-08-08 1998-12-29 Globetrotter Software, Inc. Static debugging tool and method
US5999933A (en) * 1995-12-14 1999-12-07 Compaq Computer Corporation Process and apparatus for collecting a data structure of a memory dump into a logical table
US6151701A (en) * 1997-09-30 2000-11-21 Ahpah Software, Inc. Method for reconstructing debugging information for a decompiled executable file
US6226786B1 (en) * 1996-12-24 2001-05-01 International Business Machines Corporation Minimizing debug information for global types in compiled languages

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6163858A (en) * 1998-06-08 2000-12-19 Oracle Corporation Diagnostic methodology for debugging integrated software
US6226761B1 (en) * 1998-09-24 2001-05-01 International Business Machines Corporation Post dump garbage collection
US6601188B1 (en) * 1999-10-28 2003-07-29 International Business Machines Corporation Method and apparatus for external crash analysis in a multitasking operating system
US6795963B1 (en) * 1999-11-12 2004-09-21 International Business Machines Corporation Method and system for optimizing systems with enhanced debugging information
US6678883B1 (en) * 2000-07-10 2004-01-13 International Business Machines Corporation Apparatus and method for creating a trace file for a trace of a computer program based on loaded module information
US6681348B1 (en) * 2000-12-15 2004-01-20 Microsoft Corporation Creation of mini dump files from full dump files
US7243338B2 (en) 2002-05-29 2007-07-10 Oracle International Corporation Representation of core files in a generic format
US7149929B2 (en) * 2003-08-25 2006-12-12 Hewlett-Packard Development Company, L.P. Method of and apparatus for cross-platform core dumping during dynamic binary translation

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030226134A1 (en) * 2002-05-29 2003-12-04 Oracle International Corporation Representation of core files in a generic format
US7243338B2 (en) 2002-05-29 2007-07-10 Oracle International Corporation Representation of core files in a generic format
US20050004936A1 (en) * 2003-07-03 2005-01-06 Oracle International Corporation Fact table storage in a decision support system environment
US7480662B2 (en) 2003-07-03 2009-01-20 Oracle International Corporation Fact table storage in a decision support system environment
US20080177525A1 (en) * 2007-01-23 2008-07-24 Microsoft Corporation Integrated debugger simulator
US8135572B2 (en) 2007-01-23 2012-03-13 Microsoft Corporation Integrated debugger simulator
US11487681B2 (en) * 2018-06-29 2022-11-01 Visa International Service Association Chip card socket communication
US11816048B2 (en) 2018-06-29 2023-11-14 Visa International Service Association Chip card socket communication

Also Published As

Publication number Publication date
US8156478B2 (en) 2012-04-10
US20070006164A1 (en) 2007-01-04
US20030226134A1 (en) 2003-12-04
US7243338B2 (en) 2007-07-10


Legal Events

Date Code Title Description
AS Assignment

Owner name: ORACLE INTERNATIONAL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SETHI, AJAY;SHISODIA, SAMEER;HOSMATH, MAHANTESH;AND OTHERS;REEL/FRAME:014110/0089;SIGNING DATES FROM 20030516 TO 20030520

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION