EP1864222A2 - System and method for transferring web page data - Google Patents

System and method for transferring web page data

Info

Publication number
EP1864222A2
Authority
EP
European Patent Office
Prior art keywords
image
cache
image data
data
client device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06748854A
Other languages
German (de)
French (fr)
Other versions
EP1864222A4 (en)
Inventor
Blaise Aguera Y Arcas
Julian Walker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/141,958 (US7546419B2)
Priority claimed from US11/247,513 (US7912299B2)
Priority claimed from US11/252,181 (US7930434B2)
Application filed by Microsoft Corp
Publication of EP1864222A2
Publication of EP1864222A4

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/12 Replacement control
    • G06F12/121 Replacement control using replacement algorithms
    • G06F12/123 Replacement control using replacement algorithms with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/12 Replacement control
    • G06F12/121 Replacement control using replacement algorithms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents

Definitions

  • Opera Mobile Accelerator uses a proxy server to access Web pages in their original HTML form on demand, and then compresses this HTML and forwards the compressed version to the mobile terminal.
  • Opera Mobile Accelerator reduces the size of Web pages by approximately 50-70%, but it has the ability to reduce them up to 90%.
  • the data is retrieved faster and the Web pages are displayed sooner.
  • the increase in speed depends on the type of sites you are visiting.
  • the present invention discloses a method that may include accessing an Internet site by a proxy server; converting image data from the Internet site into a multiresolution representation; and transmitting the image data in the multiresolution representation to a client device.
  • the method further includes navigating a web page by the client device.
  • the navigating step includes (a) navigating the Internet site image data by the proxy server; and (b) navigating the multiresolution image data by the client device, wherein the navigating of step (a) and the navigating of step (b) occur substantially simultaneously.
  • the navigating includes storing multiresolution image data for at least substantially the entirety of the web page at the proxy server; and navigating the stored multiresolution image data by the client device.
  • the converting step comprises: continuously converting the Internet site image data as needed by the navigating by the client device.
  • the converting step comprises: pre-converting selected image data from the Internet site by the proxy server prior to the selected image data being needed for the client device navigation; and storing the pre-converted image data at the proxy server.
  • the method further comprises transmitting the stored image data to the client device as needed by the client device navigation.
  • the method further includes ensuring that the stored pre-converted image data is not stale prior to transmission thereof to the client device.
  • the ensuring step includes comparing one of a timestamp and a checksum of the pre-converted image data with a respective one of a timestamp and a checksum of corresponding image data originating from the Internet site.
  • the pre-converting step includes converting image data for at least one web page by the proxy server; and storing the converted image data for the at least one web page at the proxy server.
  • the method further includes enabling bidirectional interactive communication between the client device and the Internet site.
  • the interactive communication comprises: transmitting navigation commands from the client device to the proxy server.
  • the method further includes emulating dynamic Internet graphics for the client device by the proxy server.
  • the emulating step comprises emulating at least one of: dynamic HTML (Hypertext Markup Language); and applets.
  • the navigating comprises: selecting a resolution level at which to view at least a portion of the web page, at the client device.
  • the navigating includes selecting a region of the web page for display on the client device.
  • the navigating further includes selecting a resolution at which to view the region of the web page.
  • the navigating step comprises selecting a plurality of portions of the web page for viewing at the client device.
  • the navigating step further includes selecting a plurality of respective resolutions at which to view the plurality of web page portions.
  • navigating the multiresolution image data includes panning the multiresolution image data.
  • the navigating the multiresolution image data comprises: zooming in on the multiresolution image data.
  • the invention may provide an apparatus that may include a client device; and a proxy server in communication with, and serving as an intermediary between, the client device and an Internet site, wherein the proxy server is operable to convert image data from the Internet site into a multiresolution visual data format.
  • the multiresolution visual data format is JPEG2000.
  • the client device is a cell phone, and wherein the cell phone comprises: at least one mechanism for navigating image data in the multiresolution visual data format.
  • the mechanism is a touchpad and the touchpad enables panning image data in the multiresolution visual data format.
  • the mechanism enables zooming in on image data in the multiresolution visual data format.
  • the mechanism for zooming comprises a roller that is rotatable by a user of the client device.
  • the client device is one of the group consisting of: a cell phone, a PDA (Personal Digital Assistant), a notebook computer, a desktop computer, a tablet computer, and a television set.
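As an illustration of the multiresolution conversion described in the items above, the sketch below builds a simple level-of-detail pyramid from a page image. It is not the JPEG2000 wavelet encoding named in the text, just a minimal stand-in; `build_lod_pyramid` and `source_path` are hypothetical names, and the Pillow imaging library is assumed.

```python
# A minimal LOD-pyramid sketch (a stand-in for the JPEG2000 multiresolution
# format named in the text). Assumes the Pillow library; source_path is a
# hypothetical screenshot of the rendered page.
from PIL import Image

def build_lod_pyramid(source_path, min_side=32):
    """Return a list of images: level 0 is full resolution, each later level
    is a minification of the previous one by a factor of 2."""
    img = Image.open(source_path).convert("RGB")
    levels = [img]
    width, height = img.size
    while min(width, height) // 2 >= min_side:
        width, height = width // 2, height // 2
        levels.append(img.resize((width, height)))
    return levels

# The proxy could keep the returned levels in cache 160 and stream only the
# level/region combinations that the client's current view actually needs.
```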
  • FIG. 1 is a block diagram of a communication system including a proxy server and a client device in accordance with one or more embodiments of the present invention
  • FIG. 2A is a front view of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention
  • FIG. 2B is an elevational view of the back of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention
  • FIG. 2C is an elevational view of the back of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention
  • FIG. 3 is a flowchart of a method for navigating image data by a client device in accordance with one or more embodiments of the present invention
  • FIG. 4 is a view of a web page column as seen on a client device in accordance with one or more embodiments of the present invention
  • FIG. 5A is a view of the web page of FIG. 4 after about two seconds of loading at a rate of 20 kilobits per second, in accordance with one or more embodiments of the present invention
  • FIG. 5B is a view of the web page of FIG. 4 after about ten seconds of loading at a rate of 20 kilobits per second, in accordance with one or more embodiments of the present invention
  • FIG. 6A is a view of a top portion of the web page of FIG. 4 after loading data at 20 kilobits per second for about two seconds, in accordance with one or more embodiments of the present invention
  • FIG. 6B is a view of a top portion of the web page of FIG. 4 after loading data at 20 kilobits per second for about ten seconds, in accordance with one or more embodiments of the present invention
  • FIG. 7 is a zoomed in view of a portion of the web page of FIG. 4 captured shortly after capture of the views of FIG. 6, in accordance with one or more embodiments of the present invention.
  • FIGS. 8A and 8B are portrait and landscape views, respectively, of a portion of the web page of FIG. 4, in accordance with one or more embodiments of the present invention.
  • FIG. 9 is a block diagram of a computer system adaptable for use with one or more embodiments of the present invention.
  • FIG. 1 is a block diagram of a communication system 140 including a proxy server 150 and a client device 230 in accordance with one or more embodiments of the present invention.
  • Communication system 140 may include Internet 180, client device 230, and proxy server 150, which may in turn include cache 160.
  • Internet 180 may be interpreted to include communication links and computing and communication hardware needed to enable proxy server 150 to browse the Internet in a conventional manner. While a high-bandwidth connection is preferred for use between proxy server 150 and Internet 180, any communication link, such as is known in the art, may be used to connect proxy server 150 to Internet 180.
  • proxy server 150 may be connected to an Internet Service Provider (ISP) (not shown) which serves as a gateway to the Internet as a whole.
  • Internet 180 is intended to include such an Internet Service Provider.
  • Proxy server 150 may be a personal computer or other computing device suitably configured to communicate with the Internet 180 and with client device 230.
  • Proxy server 150 is preferably able to transmit commands to and receive image data from Internet 180.
  • Proxy server 150 may include cache 160 for storing data, including image data, to enable rapid and easy access to such data.
  • Cache 160 may be physically incorporated within proxy server 150, such as by being within the same physical enclosure as proxy server 150. Alternatively, cache 160 may be housed separately from proxy server 150 but be accessible to proxy server 150.
  • Client device 230 is preferably a portable computing device able to conduct communication with proxy server 150.
  • client device could be a larger, non-portable computing device, such as a desktop computer.
  • Client device 230 may be one of a cell phone, a Personal Digital Assistant (PDA), a notebook computer, a desktop computer, a tablet computer, and a television set.
  • client device 230 is not limited to being one of the foregoing items.
  • where client device 230 is a television set, this television set is able to receive multiresolution image data and to transmit navigation commands to an external device, such as, but not limited to, proxy server 150.
  • Client device 230 may also include a data storage capability, including volatile and/or non-volatile storage, for storing multiresolution image data received from proxy server 150.
  • data storage may include but is not limited to: RAM (Random Access Memory), ROM (Read Only Memory), hard drive, CD-ROM drive, CD read-write, and bubble memory.
  • the client device 230 data storage may include cache memory.
  • Proxy server 150 may serve as an intermediary between Internet 180 and client device 230 to enable client device 230 to browse image data from Internet 180 after conversion of such data into a multiresolution format, as discussed in greater detail in connection with FIG. 2.
  • FIG. 2A is a front view of client device 230, which may be a cell phone, customized in accordance with one or more embodiments of the present invention.
  • FIG. 2B is an elevational view of the back of the client device 230 of FIG. 2A.
  • FIG. 2C is an elevational view of the back of the client device of FIG. 2A.
  • client device 230 is a cell phone or PDA that is specially configured for browsing multiresolution image data.
  • Client device 230 may include a body 232, a display 234, an antenna 236, a touchpad 238, a joystick 240, and/or a roller 242.
  • roller 242 is shown protruding to some extent from body 232 of client device 230. This protrusion is shown for illustrative purposes. Roller or wheel 242 need only be accessible to a user of client device 230. Thus, it may protrude from body 232 very slightly. Alternatively, a recess may be provided in the side of body 232 into which wheel 242 is nested, thereby enabling wheel 242 to not protrude beyond the straight-edged portion of the side of body 232 of client device 230.
  • display 234 may be located on the front of client device 230 (FIG. 2A) . However, in other embodiments, display 234 may be located anywhere on client device 230.
  • Touchpad 238 may be located on the rear of client device 230 (FIG. 2B) and may be used for panning image data received from proxy server 150. However, alternatively, touchpad 238 may be located on any surface of client device 230. Alternatively, touchpad 238 may be used for zooming image data. In one or more other embodiments, touchpad 238 may be used for panning and zooming image data.
  • a user of client device 230 may pan or zoom image data by moving a finger along the surface of touchpad 238. Additionally or alternatively, touchpad 238 may be employed to move a cursor over one or more image regions or web page regions displayed on display 234. Where plural images (possibly but not necessarily from two different web pages) are displayed on display 234, use of a cursor may be beneficial for identifying one of the images for the purpose of performing a navigation step thereon.
  • joystick 240 may be employed for panning image data either in addition to, or in place of touchpad 238.
  • Joystick may be used for panning image data by pushing the joystick 240 in a direction corresponding to a direction in which a user of client device 230 wishes to pan across an image.
  • joystick 240 may be located on any surface of client device 230.
  • zooming may be accomplished using a separate device, such as wheel 242, or other device such as a slider (not shown), or using a stereotyped gesture on touchpad 238, such as sliding the finger up and down one edge of the touchpad 238 surface.
  • both index fingers could be used on the touchpad simultaneously, one panning and the other zooming.
  • a user of client device 230 could rapidly switch back and forth between zooming and panning, or even pan and zoom simultaneously.
  • wheel 242 may also be referred to as a "roller."
  • panning and/or zooming may become tactile experiences, as though a user of client device 230 were looking through the display (screen) 234 at a larger surface.
  • the index finger of one hand could stroke touchpad 238, panning the image and providing a sense that the larger image is being moved from behind display 234 using touchpad 238.
  • the use of wheel 242 for zooming in on an image may be employed to provide a sense that movement of wheel 242 moves the "larger image" closer to or further away from display 234, depending upon the direction of rotation of wheel 242.
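A hedged sketch of how the pan and zoom inputs described above might be folded into a single view state; the callback names and the 10%-per-click zoom step are assumptions, not part of the patent.

```python
# Hypothetical view state for client device 230; touchpad strokes pan and
# wheel clicks zoom, and both may be applied within the same frame.
class ViewState:
    def __init__(self, cx=0.5, cy=0.5, zoom=1.0):
        self.cx, self.cy, self.zoom = cx, cy, zoom  # view center (page coords) and zoom factor

    def on_touchpad_move(self, dx, dy):
        # Panning: finger motion shifts the "larger surface" behind the display,
        # so the displacement is scaled down as the zoom factor grows.
        self.cx -= dx / self.zoom
        self.cy -= dy / self.zoom

    def on_wheel(self, clicks):
        # Zooming: each click of wheel 242 scales the view by an assumed 10%.
        self.zoom *= 1.1 ** clicks
```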
  • the haptic interface described above may provide a natural and familiar "feel" with a shallow learning curve.
  • One or more embodiments may also provide the advantage, compared to the more conventional solution of using a touchscreen, of allowing an unobstructed view of the display while the finger strokes touchpad 238 to pan an image.
  • synergistic benefits may be obtained by being able to readily alternate between panning and zooming and/or to conduct these two activities simultaneously or substantially simultaneously, whether using touchpad 238 and wheel 242, or using other navigation mechanisms.
  • touchpad 238 may have dimensions of about 2 inches by 1.5 inches; however, touchpad 238 may be smaller or larger in either dimension than the listed values.
  • FIG. 3 is a flowchart of a method 320 for navigating image data by a client device 230 in accordance with one or more embodiments of the present invention.
  • the steps shown in FIG. 3 are not limited to being performed in the order shown. In certain cases, two or more of the illustrated steps may be performed simultaneously. In the following, the steps are listed and summarized, following which the application of the steps to basic interactive browsing and the variations thereof are discussed.
  • Method 320 may include accessing 322 an Internet site at Internet 180 by proxy server 150.
  • Proxy server 150 may convert 324 image data accessed from the Internet site into multiresolution image data.
  • converting generally corresponds to the term "repackaging," as applied to turning ordinary web page data into multiresolution image data.
  • Method 320 may include transmitting 326 the multiresolution image data to client device 230 by proxy server 150, for ultimate display 328 on client device 230.
  • Client device 230 may navigate 330 the transmitted multiresolution image data, either after or concurrently with the transmission in step 326.
  • the transmitted multiresolution image data navigated 330 by client device 230 may include emulated Internet graphical features, such as, but not limited to dynamic HTML and applet functionality.
  • proxy server 150 converts 324 image data from web pages on Internet 180 into multiresolution image data, as such data is needed by the navigation 330 of client device 230.
  • the multiresolution image data is then preferably transmitted 326 to client device 230 and displayed 328 thereon.
  • the accessing 322 of the Internet site, which may be a web page, by proxy server 150 and the navigation of the multiresolution image data by client device 230 may occur substantially simultaneously.
  • the conversion 324 of web page (or other type of) image data occurs substantially "on the fly" in response to the navigation 330 conducted by client device 230.
  • Navigation 330 may include but is not limited to panning and zooming by client device 230.
  • proxy server may convert, or "pre-convert", one or more entire web pages, or other form of Internet image data, into multiresolution image data and store the converted data in cache 160, or other memory device accessible to proxy server 150, for ready accessibility by, and transmission to, client device 230.
  • This pre-conversion may be performed prior to the web page data concerned being needed in response to client device 230 navigation 330 commands.
  • proxy server 150 may ensure that, when servicing a request for image data from client device 230, the cached pre-converted multiresolution image data is not "stale" by verifying that the original HTML content is unaltered in comparison with the cached version.
  • This verification may be accomplished by comparing timestamps, checksums, or by using other well-known methods. Some tolerance for "staleness" may be built in to bound the number of HTML requests the proxy server must make. In this disclosure, data that is not "stale" is "fresh."
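A minimal sketch of the freshness test just described, assuming a checksum comparison against the origin HTML and an arbitrary tolerance window that bounds how often the proxy re-fetches the page; the helper names are hypothetical.

```python
import hashlib
import time

STALENESS_TOLERANCE_S = 60   # assumed tolerance window bounding re-fetches

def is_fresh(cache_entry, fetch_html):
    """cache_entry holds 'checksum' and 'checked_at'; fetch_html() returns the
    origin page's current HTML as a string."""
    if time.time() - cache_entry["checked_at"] < STALENESS_TOLERANCE_S:
        return True   # inside the tolerance window: skip the extra HTML request
    current = hashlib.sha256(fetch_html().encode("utf-8")).hexdigest()
    cache_entry["checked_at"] = time.time()
    return current == cache_entry["checksum"]   # unchanged HTML -> still fresh
```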
  • proxy server 150 serves as an intermediary between the Internet 180, or more specifically a site or Internet page of Internet 180, and client device 230.
  • proxy server 150's communication with the Internet 180 is preferably responsive to the browsing of multiresolution image data by client device 230. Also, two or more of the steps illustrated in FIG. 3 may occur interactively and/or concurrently rather than occurring one at a time and strictly in the sequence shown.
  • web page image data flows from Internet 180 to proxy server 150 where it may be converted into multiresolution image data.
  • the multiresolution image data may be stored in cache 160 and/or transmitted to client device 230 for display thereon.
  • Client device may, where needed, store the received multiresolution image data.
  • Navigation or browsing commands generally proceed in the reverse direction. Specifically, navigation requests and/or other commands pertinent to browsing a web page or Internet site may originate at client device 230, proceed to proxy server 150 and thereafter may be sent, possibly in modified form, from proxy server 150 to the Internet 180. Suitable conversion or reconfiguration of commands received at proxy server 150 from client device 230 may be implemented at proxy server 150 before re-transmitting such commands to the Internet 180.
  • For example, where a command from client device 230 selects a hyperlink, proxy server 150 may re-transmit this linking command to the Internet 180 in order to access the target web page.
  • the commands that may be sent by client device 230 may include navigation commands such as panning and zooming. Further commands may include clicking with a mouse or other device, typing, speaking (where voice commands may be understood at client device 230), gesturing, movement of a finger or implement along the surface of touchpad 238, movement of wheel 242, or movement of joystick 240.
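The command relay described above might be dispatched along these lines; the command names and the `cache`/`http_get` interfaces are assumptions used only to illustrate which commands stay at the proxy and which are re-issued toward the Internet.

```python
# Pan/zoom requests are satisfied from the proxy's pre-converted data, while a
# hyperlink selection is re-issued to the origin site as an ordinary request.
def handle_client_command(cmd, cache, http_get):
    if cmd["type"] in ("pan", "zoom"):
        # serve a region at a resolution level from multiresolution data in cache 160
        return cache.region(cmd["url"], cmd["region"], cmd["level"])
    if cmd["type"] == "click_link":
        html = http_get(cmd["target_url"])          # re-transmit toward the Internet
        return cache.convert_and_store(cmd["target_url"], html)
    raise ValueError("unknown command type: %r" % cmd["type"])
```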
  • a first example is considered in which a single web page is accessed 322, converted 324 into multiresolution image data, and stored in cache 160. After being stored in cache 160, the multiresolution image data may be browsed as needed by client device 230.
  • client device 230 receives an initial rendition of the converted web page at a low resolution level. From this starting point, the user may, for example, zoom in on a selected region of the web page; the zoom command is preferably transmitted to proxy server 150, which may retrieve the needed data from cache 160 and which may transmit this image data to client device 230.
  • the extent of the zoom requested by client device 230 may be associated with a selected resolution level ("resolution level 2" for the sake of this example) by proxy server 150.
  • the resulting resolution level may be selected from one of a set of pre-defined resolution levels of the selected region (region "A").
  • the requested resolution level may be calculated by conducting one or more interpolation calculations as disclosed in a document incorporated herein.
  • Display 234 of client device 230 then preferably shows the selected region "A" at the selected resolution. Presumably, substantially only region A of the converted web page is then displayed on display 234.
  • where client device 230 then navigates to another region, "B", at the same resolution level, proxy server 150 may retrieve the corresponding multiresolution image data from cache 160 and transmit it. Upon receiving this data, client device 230 should be able to display region "B" at "resolution level 2". It is noted that, if the image data for resolution level 2 had not been available in cache 160, proxy server 150 would preferably have again accessed the pertinent web page from Internet 180, and converted data for that web page into multiresolution image data having greater resolution than that stored in cache 160.
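A sketch, under assumed conventions, of how a requested zoom factor might be mapped onto one of the pre-defined resolution levels and used to crop a region such as "A" or "B" from a cached pyramid (for instance one built as in the earlier sketch); level 0 is taken to be full resolution, which is an assumption.

```python
import math

def level_for_zoom(zoom, num_levels):
    """zoom = 1.0 shows the whole page at the coarsest level; each doubling of
    zoom moves one level toward level 0 (full resolution)."""
    steps = int(math.floor(math.log2(max(zoom, 1.0))))
    return max(0, num_levels - 1 - steps)

def crop_region(pyramid, level, region):
    """region = (left, top, right, bottom) given as fractions of the page."""
    img = pyramid[level]
    w, h = img.size
    l, t, r, b = region
    return img.crop((int(l * w), int(t * h), int(r * w), int(b * h)))
```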
  • proxy server 150 may determine that the web page image data associated with the hyperlink is not located in cache 160. Accordingly, proxy server 150 preferably replicates the request for the hyperlink's target web page in its communication with Internet 180.
  • proxy server 150 then accesses the hyperlink's target web page.
  • Proxy server 150 may then convert image data from the target web page into multiresolution image data and dynamically stream the multiresolution image data to client device 230 for display thereon.
  • proxy server 150 may store all or part of the multiresolution image data for the target web page in cache 160 for later access, as needed, by client device 230.
  • Such storage in cache 160, or other suitable storage device, may be beneficial since client device 230 may not be able to receive and display all of the multiresolution image data for the target (hyperlinked) web page at once. This inability is more likely to be present where client device 230 has selected a high resolution level at which to view one or more regions of the target web page. It is noted that, in general, the higher the resolution (of a displayed region) at which client device 230 displays multiresolution image data, the smaller the displayed region will be.
  • client device 230 may be able to display plural regions of a web page, or plural regions drawn from a plurality of respective web pages.
  • client device 230 is preferably able to independently select display resolutions for each such displayed region.
  • touchpad 238 or other mechanism within client device 230 is preferably able to move a cursor over each image, in turn, for the purpose of selecting a desired resolution for each such image.
  • the resolution need not be constant.
  • the resolution may vary as needed to reconcile the competing priorities of viewing a region of interest at high resolution and avoiding the data storage and data communication burden of receiving the amount of multiresolution image data needed to display an entire image at high resolution.
  • One or more embodiments of the approach described herein for Web browsing by proxy may provide the features discussed below.
  • a zooming function may be provided with web browsing by proxy.
  • a Web page formatted for viewing on a display wider than that of client device 230 can nonetheless be seen in its full column width on a small display such as display 234.
  • where the text is small relative to the column size, it may be beneficial to pan horizontally to read single lines in their entirety.
  • the use of continuous zooming may enable many text columns to be read in their entirety even on a 176 pixel wide display (which width is featured on certain commercially available mobile phones).
  • this is possible without any reformatting, which would alter the layout of the Web page.
  • FIG. 4 shows a column of a Web page (originally designed for viewing on an 800x200 pixel display) being viewed in this fashion.
  • an initial view of an entire Web page can be rendered visible at client device 230 shortly after requesting the pertinent web page.
  • This ability may be provided for Web pages of any length. It also applies whether the initial view of the Web page is a zoomed-out overview of the entire page or a zoomed-in view of the top (or any other part) of the web page.
  • FIGS. 5A and 6A illustrate these situations, providing a general overview and a top view, respectively, of a large Web page after about 2 seconds of loading over a 20 kilobit/second connection.
  • FIGS. 5B and 6B show clearer versions of FIGS. 5A and 6A, respectively, after about ten seconds of data loading time.
  • Web pages of virtually unlimited length may be viewed without taxing the resources of client device 230, without delaying the time to initial view, and without causing degraded performance.
  • a Web page may have very large images, which client device 230 may zoom in on to see fine details of, without taxing client device 230's resources, without delaying the time to initial viewing of the web page, and/or without resulting in degraded performance.
  • small device displays are taller than they are wide (e.g. 176 pixels wide by 220 pixels high for the Samsung® SGH-D410 mobile phone).
  • the most constraining variable may be column width, making it desirable for the cellular display to use a "landscape" rather than a "portrait" page setup. Because multiresolution image data allows rotation of visual content by a selectable angular distance, the larger dimension of the display can be used to its greatest effect during content browsing. See FIGS. 8A and 8B.
  • a client device 230 accessing multiresolution image data may be enabled to view image content originating on any server having access to multiresolution image data, not only the multiresolution image data generated by proxy server 150.
  • a client device 230 may be enabled to continuously navigate large numbers of large images, large multiresolution vectorial maps, and other kinds of image content that might be difficult or impossible to serve using conventional HTML.
  • these capabilities may be compelling enough to make multiresolution image data browsing using proxy server 150 beneficial not only for cellular phone users, but also for users of larger devices, including, but not limited to: PDA or palmtop wireless devices; notebook computers with wireless internet; home computer users connected to the internet via a narrowband connection such as a modem; and/or home broadband users.
  • one or more embodiments of a system using proxy server 150 to transmit multiresolution image data to client device 230 may provide features that include but are not limited to the following: a) Web pages may be zoomed continuously to improve legibility of small type or examine details of images; b) web pages designed for a small display could be browsed in sharp detail (for text or vector components) on a high-resolution display; c) multiple Web pages could be rearranged, manipulated and resized in a rich 2D (two-dimensional) or 3D (three-dimensional) multiscale environment, providing alternative user experiences to conventional multi-window or tabbed browsing; and/or d) through the judicious use of proxy server 150 caching, potentially unlimited numbers of Web pages could be aggregated into live large-scale "views" of the Web, without requiring excessive client device 230 memory or other resources.
  • displaying images in a multiresolution image data format may provide complementary advantages for large-display, high-bandwidth platforms and small-display, low-bandwidth platforms, displaying in this manner may unify user experience and may simplify cross-platform engineering and development, rather than requiring different versions of visual content and/or browsers to be provided for the different platforms.
  • FIG. 9 is a block diagram of a computing system 800 adaptable for use with one or more embodiments of the present invention.
  • central processing unit (CPU) 802 may be coupled to bus 804.
  • bus 804 may be coupled to random access memory (RAM) 806, read only memory (ROM) 808, input/output (I/O) adapter 810, communications adapter 822, user interface adapter 816, and display adapter 818.
  • RAM 806 and/or ROM 808 may hold user data, system data, and/or programs.
  • I/O adapter 810 may connect storage devices, such as hard drive 812 and/or a CD-ROM drive, to computing system 800.
  • Communications adapter 822 may couple computing system 800 to a network.
  • User interface adapter 816 may couple user input devices, such as keyboard 826 and/or pointing device 814, to computing system 800.
  • display adapter 818 may be driven by CPU 802 to control the display on display device 820.
  • CPU 802 may be any general purpose CPU. It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above.
  • One or more embodiments of the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
  • the present invention relates generally to graphical zooming user interfaces (ZUI) for computers. More specifically, the invention is a system and method for progressively rendering zoomable visual content in a manner that is both computationally efficient, resulting in good user responsiveness and interactive frame rates, and exact, in the sense that vector drawings, text, and other non-photographic content is ultimately drawn without the resampling which would normally lead to degradation in image quality, and without interpolation of other images, which would also lead to degradation.
  • visual components could be represented and manipulated in such a way that they do not have a fixed spatial scale on the display, but can be zoomed in or out.
  • the desirability of zoomable components is obvious in many application domains; to name only a few: viewing maps, browsing through large heterogeneous text layouts such as newspapers, viewing albums of digital photographs, and working with visualizations of large data sets.
  • Even when viewing ordinary documents, such as spreadsheets and reports, it is often useful to be able to glance at a document overview, and then zoom in on an area of interest.
  • zoomable components are provided in applications such as Microsoft® Word® and other Office® products (Zoom under the View menu), Adobe® Photoshop®, Adobe® Acrobat®, QuarkXPress®, etc.
  • these applications allow zooming in and out of documents, but not necessarily zooming in and out of the visual components of the applications themselves. Further, zooming is normally a peripheral aspect of the user's interaction with the software, and the zoom setting is only modified occasionally.
  • continuous panning over a document is standard (i.e., using scrollbars or the cursor to translate the viewed document left, right, up or down).
  • the ability to zoom and pan continuously in a user-friendly manner is absent from prior art systems.
  • a display is the device or devices used to output rendered imagery to the user.
  • a frame buffer is used to dynamically represent the contents of at least a portion of the display.
  • Display refresh rate is the rate at which the physical display, or portion thereof, is refreshed using the contents of the frame buffer.
  • a frame buffer's frame rate is the rate at which the frame buffer is updated. For example, in a typical personal computer, the display refresh rate is 60-90 Hz.
  • Most digital video, for example, has a frame rate of 24-30 Hz. Thus, each frame of digital video will actually be displayed at least twice as the display is refreshed.
  • Plural frame buffers may be utilized at different frame rates and thus be displayed substantially simultaneously on the same display. This would occur, for example, when two digital videos with different frame rates were being played on the same display, in different windows.
  • The complete set of LODs (levels of detail), organized conceptually as a stack of images of decreasing resolution, is termed the LOD pyramid (see Figure 10).
  • the system interpolates between the LODs and displays a resulting image at a desired resolution. While this approach solves the computational issue, it displays a final compromised image that is often blurred and unrealistic, and often involves loss of information due to the fact that it represents interpolation of different LODs. These interpolation errors are especially noticeable when the user stops zooming and has the opportunity to view a still image at a chosen resolution which does not precisely match the resolution of any of the LODs.
  • the prior art typically treats vector data in the same way as photographic or image data.
  • Vector data, such as blueprints or line drawings, are displayed by processing a set of abstract instructions using a rendering algorithm, which can render lines, curves and other primitive shapes at any desired resolution.
  • Text rendered using scalable fonts is an important special case of vector data.
  • Image or photographic data (including text rendered using bitmapped fonts) are not so generated, but must be displayed either by interpolation between precomputed LODs or by resampling an original image. We refer to the latter herein as nonvector data.
  • a further object of the present invention is to allow the user to zoom arbitrarily far in on vector content while maintaining a crisp, unblurred view of the content and maintaining interactive frame rates.
  • a further object of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex vectorial content, while both preserving the overall appearance of the content and maintaining interactive frame rates .
  • a further object of the present invention is to diminish the user's perception of transitions between LODs or rendition qualities during interaction.
  • a further object of the present invention is to allow the graceful degradation of image quality by blurring when information ordinarily needed to render portions of the image is as yet incomplete.
  • a further object of the present invention is to gradually increase image quality by bringing it into sharper focus as more complete information needed to render portions of the image becomes available.
  • the final image is then displayed by preferably first displaying an intermediate final image.
  • the intermediate final image is the first image displayed at the desired resolution before that image is refined as described hereafter.
  • the intermediate final image may correspond to the image that would be displayed at the desired resolution using the prior art.
  • the transition from the intermediate final image to the final image may be gradual, as explained in more detail below.
  • the present invention allows LODs to be spaced in any resolution increments, including irrational increments (i.e. magnification or minification factors between consecutive LODs which cannot be expressed as the ratio of two integers), as explained in more detail below.
  • portions of the image at each different LOD are denoted tiles, and such tiles are rendered in an order that minimizes any perceived imperfections to a viewer.
  • the displayed visual content is made up of plural LODs (potentially a superset of the surrounding LODs as described above), each of which is displayed in the proper proportion and location in order to cause the display to gradually fade into the final image in a manner that conceals imperfections.
  • the rendition of various tiles in plural LODs is accomplished in an order that optimizes the appearance of the visual content while staying within acceptable levels of computational complexity so that the system can run on standard computers with typical clock speeds available in most laptop and desktop personal computers.
  • the present invention involves a hybrid strategy, in which an image is displayed using predefined LODs during rapid zooming and panning, but when the view stabilizes sufficiently, an exact LOD is rendered and displayed.
  • the exact LOD is rendered and displayed at the precise resolution chosen by the user, which is normally different from the predefined LODs. Because the human visual system is insensitive to fine detail in the visual content while it is still in motion, this hybrid strategy can produce the illusion of continuous "perfect rendering" with far less computation.
  • Figure 10 depicts an LOD pyramid (in this case the base of the pyramid, representing the highest-resolution representation, is a 512x512 sample image, and successive minifications of this image are shown in factors of 2);
  • Figure 11 depicts a flow chart for use in an exemplary embodiment of the invention;
  • Figure 12 is another flow chart that shows how the system displays the final image after zooming;
  • Figure 13 is the LOD pyramid of Figure 10 with grid lines added showing the subdivision of each LOD into rectangular tiles of equal size in samples;
  • Figure 14 is another flow chart, for use in connection with the present invention, and it depicts a process for displaying rendered tiles on a display;
  • Figure 15 shows a concept termed irrational tiling, explained in more detail herein;
  • Figure 16 depicts a composite tile and the tiles that make up the composite tile, as explained more fully below.
  • FIG. 17 depicts successive stages of the rendering, in foveated order, of the second layer of the LOD pyramid of FIG. 13, the successive stages being illustrated in FIGS. 17A, 17B, etc.
  • FIG. 11 shows a flowchart of a basic technique for implementation of the present invention.
  • the flowchart of Figure 11 represents an exemplary embodiment of the invention and would begin executing when an image is displayed at an initial resolution.
  • the invention may be used in the client/server model, and the client and server may be on the same or different machines.
  • the actual hardware platform and system utilized are not critical to the present invention.
  • the flowchart is entered at start block 241 with an initial view of an image at a particular resolution.
  • the image is taken to be static.
  • the image is displayed at block 242.
  • a user may navigate that image by moving, for example, a computer mouse.
  • the initial view displayed at block 242 will change when the user navigates the image.
  • the underlying image may itself be dynamic, such as in the case of motion video, however, for purposes of this example, the image itself is treated as static.
  • any image to be displayed may also have textual or other vector data and/or nonvector data such as photographs and other images.
  • the present invention, and the entire discussion below, is applicable regardless of whether the image comprises vector or nonvector data, or both.
  • the method transfers control to decision point 243 at which navigation input may be detected. If such input is not detected, the method loops back to block 242 and continues displaying the stationary visual content. If a navigation input is detected, control will be transferred to block 244 as shown.
  • Decision point 243 may be implemented by a continuous loop in software looking for a particular signal that detects movement, an interrupt system in hardware, or any other desired methodology. The particular technique utilized to detect and analyze the navigation request is not critical to the present invention. Regardless of the methodology used, the system can detect the request, thus indicating a desire to navigate the image. Although much of the discussion herein relates to zooming, it is noted that the techniques are applicable to zooming, panning, or otherwise navigating.
  • Such transformations may include, for example, three dimensional translation and rotation, application of an image filter, local stretching, dynamic spatial distortion applied to selected areas of the image, or any other kind of distortion that might reveal more information.
  • Another example would be a virtual magnifying glass, that can get moved over the image and which magnifies parts of the image under the virtual magnifying glass.
  • the selected LODs may be those two LODs that "surround" the desired resolution, i.e., the resolution of the new view.
  • the interpolation in prior systems constantly occurs as the user zooms and is thus often implemented directly in the hardware to achieve speed.
  • the combination of detection of movement in decision point 245 and a substantially immediate display of an appropriate interpolated image at block 244 results in the image appearing to zoom continuously as the user navigates.
  • an interpolated image is sufficient to look realistic and clear. Any interpolation error is only minimally detectable by the human visual system, as such errors are disguised by the constantly changing view of the image.
  • the system tests whether or not the movement has substantially ceased. This can be accomplished using a variety of techniques, including, for example, measuring the rate of change of one or more parameters of the view. That is, the methodology ascertains whether or not the user has arrived at the point where he has finished zooming. Upon such stabilization at decision point 245, control is transferred to block 246, where an exact image is rendered, after which control returns to block 243. Thus, at any desired resolution, the system will eventually display an exact LOD. Notably, the display is not simply rendered and displayed by an interpolation of two predefined LODs, but may be rendered and displayed by re-rendering vector data using the original algorithm used to render the text or other vector data when the initial view was displayed at block 242.
  • Nonvector data may also be resampled for rendering and displayed at the exact required LOD.
  • the required re-rendering or resampling may be performed not only at the precise resolution required for display at the desired resolution, but also on a sampling grid corresponding precisely to the correct positions of the display pixels relative to the underlying content, as calculated based on the desired view.
  • translation of the image on the display by ½ pixel in the display plane does not change the required resolution, but it does alter the sampling grid, and therefore requires re-rendering or resampling of the exact LOD.
  • Fig. 11 represents a hybrid approach in which interpolation based upon predefined LODs is utilized while the view is changing (e.g. navigation is occurring) but an exact view is rendered and displayed when the view becomes substantially stationary.
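A compact sketch of that hybrid strategy: interpolate between the two surrounding LODs while the view is changing, and switch to an exact rendering once the view has settled. The `render_exact`/`interpolate` callbacks, the `view.resolution` attribute and the 0.25 s settling threshold are assumptions.

```python
import time

def surrounding_lods(resolution, lod_resolutions):
    """lod_resolutions: ascending list of the pre-defined LOD resolutions."""
    res = min(max(resolution, lod_resolutions[0]), lod_resolutions[-1])
    for lo, hi in zip(lod_resolutions, lod_resolutions[1:]):
        if lo <= res <= hi:
            t = 0.0 if hi == lo else (res - lo) / (hi - lo)  # weight toward the finer LOD
            return lo, hi, t
    return lod_resolutions[-1], lod_resolutions[-1], 0.0

def choose_frame(view, last_motion_time, render_exact, interpolate,
                 lod_resolutions, settle_s=0.25):
    if time.time() - last_motion_time > settle_s:
        return render_exact(view)      # exact LOD on the display's own sampling grid
    lo, hi, t = surrounding_lods(view.resolution, lod_resolutions)
    return interpolate(view, lo, hi, t)
```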
  • the term render refers to the generation by the computer of a tile at a specific LOD based upon vector or nonvector data. With respect to nonvector data, these may be rerendered at an arbitrary resolution by resampling an original image at higher or lower resolution.
  • the intermediate image is generated from an interpolation of the surrounding LODs.
  • the intermediate image may be interpolated from more than two discrete LODs, or from two discrete LODs other than the ones that surround the desired resolution.
  • block 334 is entered, which causes the image to begin to gradually fade towards an exact rendition of the image, which we term the final image.
  • the final image differs from the intermediate image in that the final image may not involve interpolation of any predefined LODs. Instead, the final image, or portions thereof, may comprise newly rendered tiles. In the case of photographic data, the newly rendered tiles may result from resampling the original data, and in the case of vector data, the newly rendered tiles may result from rasterization at the desired resolution.
  • step 334 is executed so the changeover from the intermediate final image to the final image is done gradually and smoothly.
  • This gradual fading, sometimes called blending, causes the image to come into focus gradually when navigation ceases, producing an effect similar to automatic focusing in cameras or other optical instruments.
  • the illusion of physicality created by this effect is an important aspect of the present invention.
  • a first LOD may take a 1 inch by 1 inch area of a viewable object and generate a single 32 by 32 sample tile.
  • the information may also be rendered by taking the same 1 inch by 1 inch area and representing it as a tile that is 64 by 64 samples, and therefore at a higher resolution.
  • Tiling granularity, which we will write as the variable g, is defined as the ratio of the linear tiling grid size at a higher-resolution LOD to the linear tiling grid size at the next lower-resolution LOD.
  • in some prior systems, g = 2.
  • This same value of g has been used in other prior art.
  • While LODs may be subdivided into tiles in any fashion, in an exemplary embodiment each LOD is subdivided into a grid of square or rectangular tiles containing a constant number of samples (except, as required, at the edges of the visual content).
  • Figure 15(b) illustrates the advantage gained by irrational tiling granularity.
  • the curves 631 represent the bounds of the visible area of the visual content at the relevant LOD during a zooming operation: as the resolution is increased (zooming in to reveal more detail), the area under examination decreases. Darker bars (e.g., 632) represent tiles which have already been rendered over the course of the zoom. Lighter bars have not yet been rendered, so cannot be displayed. Note that when the tiling is integral as in Figure 15(a), abrupt changes in resolution over space are common; if the user were to pan right after the zoom, then at the spatial boundary indicated by the arrow, four LODs would "end" abruptly. The resulting image would look sharp to the left of this boundary, and extremely blurry to the right.
  • Another benefit of irrational tiling granularity is that it allows finer control of g, since there are a great many more irrational numbers than integers, particularly over the useful range where g is not too large. This additional freedom can be useful for tuning the zooming performance of certain applications. If g is set to the irrational square root of an integer (such as sqrt(2), sqrt(5) or sqrt(8)), then in the embodiment described above the grid lines of alternate LODs would align exactly; if g is an irrational cube root, then every third LOD would align exactly; and so on. This confers an additional benefit with respect to limiting the complexity of a composite tiling, as defined below.
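A small numeric illustration of the granularity argument above, assuming g = sqrt(2) and an arbitrary coarsest grid spacing; it only shows that grid lines of every second LOD coincide while consecutive LODs never share all tile boundaries.

```python
import math

g = math.sqrt(2)          # irrational granularity (square root of an integer)
coarsest_spacing = 8.0    # grid spacing of the coarsest LOD, arbitrary content units
for lod in range(6):
    spacing = coarsest_spacing / g ** lod   # finer LODs get proportionally finer grids
    print("LOD %d: tile grid spacing = %.4f" % (lod, spacing))
# Prints 8.0, 5.6569, 4.0, 2.8284, 2.0, 1.4142: the spacings 8 -> 4 -> 2 of
# every second LOD nest exactly, while consecutive LODs never share all
# boundaries, spreading abrupt LOD transitions out in space.
```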
  • An important aspect of the invention is the order in which the tiles are rendered. More particularly, the various tiles of the various LODs are optimally rendered such that all visible tiles are rendered first. Nonvisible tiles may not be rendered at all. Within the set of visible tiles, rendition proceeds in order of increasing resolution, so that tiles within low-resolution LODs are rendered first. Within any particular LOD, tiles are rendered in order of increasing distance from the center of the display, which we refer to as foveated rendering. To sort such tiles in the described order, numerous sorting algorithms such as heapsort, quicksort, or others may be used.
  • a lexicographic key may be used for sorting "requests" to render tiles, such that the outer subkey is visibility, the middle subkey is resolution in samples per physical unit, and the inner subkey is distance to the center of the display.
  • Other methods for ordering tile rendering requests may also be used.
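The lexicographic ordering described above can be expressed directly as a sort key; the `TileRequest` record and its field names are assumptions used for illustration.

```python
from dataclasses import dataclass

@dataclass
class TileRequest:
    visible: bool
    samples_per_unit: float   # resolution of the tile's LOD
    dist_to_center: float     # distance from the tile to the display center

def render_order(requests):
    # outer subkey: visibility; middle subkey: resolution (coarse first);
    # inner subkey: foveated distance to the center of the display
    return sorted(requests,
                  key=lambda r: (not r.visible, r.samples_per_unit, r.dist_to_center))
```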
  • the actual rendering of the tiles optimally takes place as a parallel process with the navigation and display described herein. When rendering and navigation/display proceed as parallel processes, user responsiveness may remain high even when tiles are slow to render.
  • rendering of the tile would involve running the algorithm to rasterize the alphabetic data and possibly transmitting that data to a client from a server.
  • the data fed to the rasterization algorithm could be sent to the client, and the client could run the algorithm to rasterize the tile.
  • rendering of a tile involving digitally sampled photographic data could involve resampling of that data to generate the tile at the appropriate LOD. For discrete LODs that are prestored, rendering may involve no more than simply transmitting the tile to a client computer for subsequent display.
  • the algorithm attempts to render tiles from the various LODs in a priority order best suited to supply the rendered tiles for display as they are most needed.
  • the actual display of the rendered tiles will be explained in more detail later with reference to Figure 14.
  • such an area is termed a composite tile area, or simply a composite tile.
  • To define a composite tile we consider all of the LODs stacked on top of each other. Each LOD has its own tile grid. The composite grid is then formed by the projection of all of the grids from all of the LODs onto a single plane. The composite grid is then made up of various composite tiles of different sizes, defined by the boundaries of tiles from all of the different LODs. This is shown conceptually in Fig. 16. Fig. 16 depicts the tiles from three different LODs, 701 through 703, all representing the same image. One can imagine the LODs 701 through 703 being stacked up on top of each other.
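A one-dimensional sketch of forming the composite grid by taking the union of tile boundaries from every LOD; the two-dimensional composite tiling is simply the product of the x and y cut sets. Grid spacings are passed in as plain numbers purely for illustration.

```python
def composite_cuts(extent, grid_spacings):
    """Union of tile-boundary positions from every LOD along one axis."""
    cuts = set()
    for spacing in grid_spacings:
        position, n = 0.0, 0
        while position < extent:
            cuts.add(round(position, 9))   # rounding merges floating-point near-duplicates
            n += 1
            position = n * spacing
        cuts.add(float(extent))
    return sorted(cuts)

# Example: three LODs with spacings 4, 2*sqrt(2) and 2 over a span of 8 units.
# composite_cuts(8.0, [4.0, 2 * 2 ** 0.5, 2.0])
# -> [0.0, 2.0, 2.828..., 4.0, 5.656..., 6.0, 8.0]
```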
  • FIG. 14 depicts a flow chart of an algorithm for updating the frame buffer as tiles are rendered. The arrangement of Fig. 14 is intended to operate on every composite tile in the displayed image each time the frame buffer is updated.
  • each of the composite tiles on the entire screen would preferably be examined and updated during each 1/20 of a second.
  • the composite tile may lack the relevant tiles in one or more LODs.
  • the process of Fig. 14 attempts to display each composite tile as a weighted average of all the available superimposed tiles within which the composite tile lies. Note that composite tiles are defined in such a way that they fall within exactly one tile at any given LOD; hence the weighted average can be expressed as a relative proportion of each LOD.
  • the process attempts to determine the appropriate weights for each LOD within the composite tile, and to vary those weights gradually over space and time to cause the image to gradually fade towards the final image discussed above.
  • the composite grid includes plural vertices which are defined to be any intersection or corner of gridlines in the composite grid. These are termed composite grid vertices.
  • the current weights at any particular time for each LOD at each vertex are maintained in memory.
  • the following variables, which are taken to be numbers between 0.0 and 1.0, are kept in memory for each tile: centerOpacity, cornerOpacity for each corner (4 if the tiling is a rectangular grid), and edgeOpacity for each edge (4 if the tiling is a rectangular grid).
  • all of its opacities as just listed are normally set to 1.0.
  • the algorithm walks through the composite tiling once for each relevant LOD, beginning with the highest-resolution LOD.
  • the algorithm maintains the following variables: levelOpacityGrid and opacityGrid. Both of these variables are again numbers between 0.0 and 1.0, and are maintained for each vertex in the composite tiling.
  • the algorithm walks through each LOD in turn, in order from highest-resolution to lowest, performing the following operations.
  • First, 0.0 is assigned to levelOpacityGrid at all vertices.
  • Then, for each rendered tile at the current LOD, the algorithm updates the parts of the levelOpacityGrid touching that tile based on the tile's centerOpacity, cornerOpacity and edgeOpacity values: If the vertex is entirely in the interior of the tile, then it gets updated using centerOpacity.
  • If the vertex is, e.g., on the tile's left edge, it gets updated with the left edgeOpacity.
  • If the vertex is, e.g., on the top right corner, it gets updated with the top right cornerOpacity.
  • Updating means the following: if the pre-existing levelOpacityGrid value is greater than 0.0, then set the new value to the minimum of the present value, or the value it's being updated with. If the pre-existing value is zero (i.e. this vertex hasn't been touched yet) then just set the levelOpacityGrid value to the value it's being updated with. The end result is that the levelOpacityGrid at each vertex position gets set to the minimum nonzero value with which it gets updated.
  • the algorithm then walks through the levelOpacityGrid and sets to 0.0 any vertices that touch a tile which has not yet been rendered, termed a hole. This ensures spatial continuity of blending: wherever a composite tile falls within a hole, at the current LOD, drawing opacity should fade to zero at all vertices abutting that hole.
  • the algorithm can then relax all levelOpacityGrid values to further improve spatial continuity of LOD blending.
• the situation as described thus far can be visualized as follows: every vertex is like a tentpole, where the levelOpacityGrid value at that point is the tentpole's height.
  • the algorithm has thus far ensured that at all points bordering on a hole, the tentpoles have zero height; and in the interior of tiles that have been rendered, the tentpoles are set to some (probably) nonzero value. In the extreme case, perhaps all the values inside a rendered tile are set to 1.0. Assume for purposes of illustration that the rendered tile has no rendered neighbors yet, so the border values are 0.0.
• the algorithm then walks over all composite grid vertices, considering corresponding values of levelOpacityGrid and opacityGrid at each vertex: if levelOpacityGrid is greater than 1.0 - opacityGrid, then levelOpacityGrid is set to 1.0 - opacityGrid. Then, again for each vertex, corresponding values of levelOpacityGrid are added to opacityGrid. Due to the previous step, this can never bring opacityGrid above 1.0. These steps in the algorithm ensure that as much opacity as possible is contributed by higher-resolution LODs when they are available, allowing lower-resolution LODs to "show through" only where there are holes.
  • levelOpacityGrid can be multiplied by a scalar overallOpacity variable in the range 0.0 to 1.0 just before drawing; this allows the entire image to be drawn with partial transparency given by the overallOpacity.
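• A condensed sketch of this per-LOD bookkeeping follows. It is a simplified, hypothetical rendition (vertices are keyed by composite-grid coordinates, and each rendered tile is assumed to already know the opacity it contributes at each vertex it touches, standing in for the center/edge/corner selection above); it is intended only to show the min-of-nonzero update, the hole handling, and the clamp-and-accumulate step:

```python
# Sketch (simplified): per-LOD opacity accumulation over the composite grid.
# `lods` is ordered finest-to-coarsest; vertices are (x, y) grid indices.

def blend_lods(lods, vertices, overall_opacity=1.0):
    """Each LOD is a dict with:
         'tiles': rendered tiles, each a dict whose 'vertex_opacity' entry
                  maps {vertex: opacity} for the vertices the tile touches;
         'holes': set of vertices touching a not-yet-rendered tile.
       Returns one {vertex: drawing opacity} dict per LOD."""
    opacity_grid = {v: 0.0 for v in vertices}      # total opacity drawn so far
    draw_opacities = []

    for lod in lods:                               # finest LOD first
        level = {v: 0.0 for v in vertices}         # levelOpacityGrid := 0

        # Min-of-nonzero update from every rendered tile at this LOD.
        for tile in lod['tiles']:
            for v, opacity in tile['vertex_opacity'].items():
                level[v] = opacity if level[v] == 0.0 else min(level[v], opacity)

        # Spatial continuity: fade to zero at vertices bordering a hole.
        for v in lod['holes']:
            level[v] = 0.0

        # Clamp so the total never exceeds 1.0, then accumulate, so that
        # higher-resolution LODs contribute as much opacity as possible.
        for v in vertices:
            level[v] = min(level[v], 1.0 - opacity_grid[v])
            opacity_grid[v] += level[v]

        draw_opacities.append({v: level[v] * overall_opacity for v in vertices})

    return draw_opacities
```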
  • drawing an image-containing polygon, such as a rectangle, with different opacities at each vertex is a standard procedure. It can be accomplished, for example, using industry-standard texture mapping functions using the OpenGL or Direct3D graphics libraries.
  • the drawn opacity within the interior of each such polygon is spatially interpolated, resulting in a smooth change in opacity over the polygon.
  • tiles maintain not only their current values of centerOpacity, cornerOpacity and edgeOpacity (called the current values) , but also a parallel set of values called targetCenterOpacity, targetCornerOpacity and targetEdgeOpacity (called the target values) .
• the current values are all set to 0.0 when a tile is first rendered, but the target values are all set to 1.0. Then, after each frame, the current values are adjusted to new values closer to the target values.
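• The per-frame fade from the current values toward the target values might, for example, be an exponential approach; the blend rate used below is an assumed parameter, not one specified above:

```python
# Sketch: per-frame relaxation of a tile's current opacities toward its
# target opacities (assumed exponential approach with rate `blend`).

def step_tile_opacities(tile, blend=0.2):
    """`tile` holds parallel dicts `current` and `target`, each mapping
       opacity names (center, corners, edges) to values in [0.0, 1.0]."""
    for key, target in tile['target'].items():
        current = tile['current'][key]
        tile['current'][key] = current + blend * (target - current)

# A newly rendered tile starts fully transparent and fades in over frames:
tile = {'current': {'centerOpacity': 0.0}, 'target': {'centerOpacity': 1.0}}
for _ in range(10):
    step_tile_opacities(tile)
print(round(tile['current']['centerOpacity'], 3))   # 0.893, approaching 1.0
```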
  • FIG. 17 depicts successive stages of the rendering, in foveated order, of the second layer of the LOD pyramid of FIG. 17, in accordance with one or more embodiments of the present invention .
  • the present invention relates to zooming user interfaces (ZUI) for computers.
  • ZUI zooming user interfaces
• Most present day graphical computer user interfaces are designed using visual components of a fixed spatial scale.
  • the visual content can be manipulated by zooming in or out or otherwise navigating through it.
• the precision with which coordinates of various objects can be represented is limited by the number of bits, usually between 16 and 64, designated to represent such coordinates. Because of this limited representational size, precision is limited.
• with a zooming user interface, the user is easily able to zoom in, causing the area which previously covered only a single pixel to fill the entire display. Conversely, the user may zoom out, causing the contents of the entire display to shrink to the size of a single pixel.
• because each zoom in or out may multiply or divide the x-y coordinates by numerous orders of magnitude, just a few such zooms completely exhaust the precision available with a 64-bit floating point number, for example. Thereafter, round-off causes noticeable degradation of image quality. It is an object of the present invention to provide a ZUI in which a larger range of zooms is possible.
  • a further objective of the present invention is to allow zooming out after a deep zoom-in to behave like the "back" button of a web browser, letting the user retrace his or her steps through a visual navigation.
  • a further objective of the present invention is to allow zooming in immediately after zooming out to behave analogously to the "forward" button of a web browser, letting the user precisely undo the effects of an arbitrarily long zoom-out.
• a further objective of the present invention is to allow a node, a visual object as defined more precisely below, to have a very large number of child nodes (for example, up to 10^28).
  • a further objective of the present invention is to allow a node to generate its own children programmatically on the fly, enabling content to be defined, created or modified dynamically during navigation.
  • a further objective of the present invention is to enable near-immediate viewing of arbitrarily complex visual content, even if this content is ultimately represented using a very large amount of data, and even if the data are stored at a remote location and shared over a low-bandwidth network.
  • a further objective of the present invention is to allow the user to zoom arbitrarily far in on visual content while maintaining interactive frame rates.
  • a further objective of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex visual content, in the process both preserving the overall appearance of the content and maintaining interactive frame rates.
  • Each node preferably has its own coordinate system and rendering method, but may be contained within a parent node, and may be represented in the coordinate system and rendering method of the parent node.
  • a node is only "launched” when the zooming results in an appropriate level of detail.
  • the launching of the node causes the node to be represented in its own coordinate system and/or rendering method, rather than in the coordinate system and/or rendering method of a different node.
• prior to being launched, the node is either represented in the coordinate system of the parent node, or not represented at all.
  • the precision of a coordinate system is a function of the zoom level of detail of what is being displayed. This allows a variable level of precision, up to and including the maximum permissible by the memory of the computer in which the system operates.
  • FIG. 18 is a depiction of visual content on a display
  • FIG. 19 is an image of the visual content of FIG. 18 at a different level of detail
  • FIG. 20 is a representation of an embodiment of the invention
  • FIG. 21 is an exemplary embodiment of the invention showing plural nodes on a display
  • FIG 22 is a tree diagram corresponding to the exemplary embodiment shown in FIG 21.
  • FIG. 23 is a block diagram corresponding to a portion of the tree diagram of FIG. 22.
• the exemplary universe in turn contains 2D objects, or nodes, which have a visual representation, and may also be dynamic or interactive (e.g., video clips, applications, editable text documents, CAD drawings, or still images).
• for a node to be visible, it must be associated with a rendering method, which is able to draw it in whole or in part on some area of the display.
  • Each node is also endowed with a local coordinate system of finite precision. For illustrative purposes, we assume a node is rectangular and represented by a local coordinate system.
  • Each node may have 0 or more child nodes, which are addressed by reference.
  • the node need not, and generally does not, contain all the information of each child node, but instead only an address providing information necessary to obtain the child node.
  • the nodes are displayed on the screen, as shown, for example in FIG. 18.
• a node is the basic unit of functionality in the present invention. Most nodes manifest visually on the user's display during navigation, and some nodes may also be animated and/or respond to user input.
  • Nodes are hierarchical, in that a node may contain child nodes. The containing node is then called a parent node. When a parent node contains a child node, the child's visual manifestation is also contained within the parent's visual manifestation.
  • Each node has a logical coordinate system, such that the entire extent of the node is contained within an exemplary rectangle defined in this logical coordinate system; e.g. a node may define a logical coordinate system such that it is contained in the rectangle (0, 0) - (100, 100) .
• Each node may have the following data defining its properties:
  o the node's logical coordinate system, including its logical size (100 x 100 in the above example);
  o the identities, positions and sizes of any child nodes, specified in the (parent) node's logical coordinates;
  o optionally, any necessary user data; and
  o executable code defining these operations or "methods":
    - initialization of the node's data based on "construction arguments";
    - rendering all or a portion of the node's visual appearance (the output of this method is a rendered tile); and
    - optionally, responding to user input, such as keyboard or mouse events.
  • the executable code defines a "node class", and may be shared among many "node instances". Node instances differ in their data content. Hence a node class might define the logic needed to render a JPEG image.
  • the "construction arguments" given to the initialization code would then include the URL of the JPEG image to display.
  • a node displaying a particular image would be an instance of the JPEG node class.
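• The node class / node instance distinction might be sketched as follows; the class and method names are illustrative assumptions only, not names prescribed by the present description:

```python
# Sketch: a node class whose instances differ only in their data
# (construction arguments), as in the JPEG-node example above.

class Node:
    def __init__(self, logical_size, construction_args):
        self.logical_size = logical_size      # e.g. (100, 100)
        self.children = []                    # (child, position, size) tuples
        self.init(construction_args)

    def add_child(self, child, position, size):
        """Position and size are in this (parent) node's logical coordinates."""
        self.children.append((child, position, size))

    # Operations a concrete node class provides:
    def init(self, construction_args): ...    # initialize from construction args
    def render_tile(self, region, lod): ...   # produce a rendered tile
    def on_input(self, event): ...            # optionally handle user input

class JpegNode(Node):
    """Node class holding the logic needed to render a JPEG image."""
    def init(self, construction_args):
        self.url = construction_args['url']   # which image this instance shows

# Two instances of the same node class, differing only in their data:
a = JpegNode((100, 100), {'url': 'http://example.com/a.jpg'})
b = JpegNode((100, 100), {'url': 'http://example.com/b.jpg'})
```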
  • Plural instances of a node may be viewable in the same visual content, similar to the way a software application may be instantiated numerous times simultaneously. Note that in a complex visual document or application, it is usually possible to divide the necessary functionality into nodes in many different ways.
• a scripted web-page-like document containing multiple images, pull-down menus and buttons could be implemented as a single node with complex rendering and user input methods.
  • it could be implemented as a parent node which only defines the overall layout of the page, with every constituent image and button a child node.
• This has the obvious advantage of reusing or "factoring" the functionality more effectively: the buttons may all have the same behavior, and hence all be instances of the same node class; the images may all be in the same format and so also be instances of a common node class, etc. This also simplifies rearranging the layout, since the parent node can easily move or resize the child nodes.
  • FIG. 18 shows a node 105 which may be the image of a portion of the city.
  • Node 105 may contain child nodes 101-103.
  • Node 101 may be an image of a building in the city, node 107 could be an image of a playground, and node 103 might be a sports arena.
• nodes 101-103 are relatively small, so they can be represented as small darkened areas with no detail in node 105, located at the correct locations in the coordinate system of node 105. Only the coordinate system and the rendering method of node 105 are needed.
  • LOD level of detail
• nodes 101 and 107 are no longer visible on the screen, due to the fact that the visual content is displayed as much larger. Additionally, it is noted that because the size at which sports arena node 103 is displayed is now much larger, the details of the sports arena, such as the individual seats, the field, etc., now must be displayed.
• sports arena node 103 would now be displayed not as a darkened area with no detail in the coordinate system of node 105, but rather, it would be "launched" to be displayed using its own coordinate system and rendering method. When displayed using its own coordinate system and rendering method, the details such as seating, the field of play, etc. would be individually shown. Other functions discussed above, and associated with the node 103, would also begin executing at the point when node 103 is launched.
  • the particular navigation condition that causes the launching of node 103, or any node for that matter, is a function of design choice and is not critical to the present invention .
  • the precision with which the node 103 will be displayed is the combined precision of the coordinate system utilized by node 105, as well as that of node 103.
• the combined precision will be 16 bits (for example, 8 bits from each of the two coordinate systems) because the coordinate system of node 103 is only utilized to specify the position of items in node 103, but the overall location of node 103 within node 105 is specified within the coordinate system of node 105. Note that this nesting may continue repeatedly if sports arena 103 itself contains additional nodes within it. For example, one such node 201 may in fact be a particular concession stand within the sports arena. It is represented without much detail in the coordinate system and rendering method of node 103.
• upon further zooming in, node 201 will launch. If it is displayed using 8 bits of precision, those 8 bits will specify where within the node 201 coordinate system particular items are to be displayed. Yet, the location of node 201 within node 103 will be maintained to 8 bits of precision within the coordinate system of node 103, the location of which will in turn be maintained within the coordinate system of node 105 using 8 bits. Hence, items within node 201 will ultimately be displayed using 24 bits of precision. By nesting nodes within nodes, the precision at which visual content may ultimately be displayed is limited only by the memory capacity of the computer.
  • the ultimate precision with which visual content in a node is displayed after that node is launched is effectively the combined precision of all parent nodes and the precision of the node that has launched.
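• As a worked illustration of this combined precision (reusing the 8-bits-per-node figure from the example above; the fixed-point layout is an assumption), a position can be resolved by composing one coordinate per level of nesting, giving 24 bits of effective precision across three levels:

```python
# Sketch: composing nested fixed-point coordinates (8 bits per level) into
# a single high-precision position within the outermost node's extent.

BITS_PER_LEVEL = 8

def compose(levels):
    """`levels` lists one coordinate (0..255) per node, outermost first.
       Returns a fraction in [0, 1) of the outermost node's extent."""
    position, scale = 0.0, 1.0
    for coord in levels:
        scale /= (1 << BITS_PER_LEVEL)
        position += coord * scale
    return position

# Node 105 locates node 103, node 103 locates node 201, and node 201
# locates an item: 3 x 8 = 24 bits of effective precision.
print(compose([17, 200, 5]))   # a position resolved to 1 part in 2**24
print(1 / (1 << 24))           # the smallest representable step at this depth
```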
  • the precision may increase as needed limited only by the storage capacity of the computer, which is almost always much more than sufficient.
  • the increased precision is only utilized when necessary, because if the image is at an LOD that does not require launching, then in accordance with the above description, it will only be displayed with the precision of the node within which it is contained if that node has been launched.
• with nodes nested within other nodes, as one moves from the outermost node inward, one may traverse nodes that have launched until finally reaching a node that has not launched yet.
  • any such unlaunched node, and nodes further within it, will be displayed only with the precision of the last traversed node that has launched. This results in an "accordion" type precision, wherein the precision at which visual content is displayed expands and contracts as necessary and as dictated by the navigational input of the user, maximizing the efficiency of system resources by using them only when necessary for higher precision .
  • An additional issue solved by the present invention relates to a system for maintaining the spatial interrelationship among all nodes during display. More particularly, during dynamic navigation such as zooming and panning, many different coordinate systems are being used to display potentially different nodes. Some nodes, as explained above, are being displayed merely as an image in the coordinate system of other nodes, and some are being displayed in their own coordinate systems. Indeed, the entire visual display may be populated with nodes displayed at different positions in different coordinate systems, and the coordinate systems and precisions used for the various nodes may vary during navigation as nodes are launched. Hence, it is important to ensure that the nodes are properly located with respect to each other, because each node is only knowledgeable of its own coordinate system.
  • the present invention provides a technique for propagating relative location information among all of the nodes and for updating that information when needed so that each node will "know" the proper position in the overall view at which it should render itself.
  • the expanded node definition includes a field which we term the "view" field, and which is used by the node to locate itself relative to the entire display.
  • the view field represents, in the coordinates of that node, the visible area of the node—that is, the image of the display rectangle in the node's coordinates. This rectangle may only partially overlap the node's area, as when the node is partially offscreen.
  • the view field cannot always be kept updated for every node, as we cannot necessarily traverse the entire directed graph of nodes in real time as navigation occurs.
  • the stack structure is defined thus:
• Stack<Address> viewStack, where this stack is a global variable of the client (the computer connected to the display).
  • the viewStack will specify the addresses of a sequence of nodes "pierced" by a point relative to the display, which we will take in our exemplary implementation to be the center of the display.
  • This sequence must begin with the root node, but may be infinite, and therefore must be truncated.
  • the sequence is truncated when the nodes "pierced" become smaller than some minimum size, defined as minimumArea.
• the current view is then represented by the view fields of all of the nodes in the viewStack, each of which specifies the current view in terms of the node's local coordinate system. If a user has zoomed very deeply into a universe, then the detailed location of the display will be given most precisely by the view field of the last node in the stack.
  • the last element's view field does not, however, specify the user's viewpoint relative to the entire universe, but only relative to its local coordinates.
  • the view field of the root node does specify where in the universe the user is looking. Nodes closer to the "fine end" of the viewStack thus specify the view position with increasing precision, but relative to progressively smaller areas in the universe. This is shown conceptually in FIG. 20, where it can be seen that of the three nodes 301, 302, and 303 that have been launched, node 303 provides the most accurate indication of where the user is looking, since its coordinate system is the "finest", but node 301 provides information, albeit not as fine, on a much larger area of the visual content.
  • the first step is to alter the view field of this last node; this altered view is taken to be the correct new view, and any other visible nodes must follow along.
  • the second step is to propagate the new view "upward" toward the root node, which entails making progressively smaller and smaller changes to the view fields of nodes earlier in the stack. If the user is deeply zoomed, then at some point in the upward propagation the alteration to the view may be so small that it ceases to be accurately representable; upward propagation stops at this node. At each stage of the upward propagation, the change is also propagated downward to other visible nodes.
• the last node's parent's view is modified first; then, in the downward propagation, the views of the last node's "siblings" are updated.
  • the downward propagation is halted, as before, when the areas of "cousin nodes" become smaller than minimumArea, or when a node falls entirely offscreen.
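• A minimal sketch of the viewStack construction follows. The data layout, the area threshold, and the child_at helper (which maps the pierce point into the pierced child's coordinates) are assumptions made for illustration; the upward and downward propagation of view changes is not shown:

```python
# Sketch: build the viewStack of nodes "pierced" by the display center,
# truncating once pierced nodes become smaller than minimumArea.

MINIMUM_AREA = 1e-6          # pierced-node area, as a fraction of the display

class ViewNode:
    def __init__(self, name):
        self.name = name
        self.view = None     # visible rectangle, in this node's coordinates
        self.children = []   # (child, rect) with rect as fractions of this node

def build_view_stack(root, center, child_at):
    """child_at(node, point) -> (child, rect, point_in_child) or None,
       where rect gives the child's extent as fractions of its parent."""
    stack, node, area, point = [root], root, 1.0, center
    while True:
        hit = child_at(node, point)
        if hit is None:
            break
        child, (x0, y0, x1, y1), point = hit
        area *= (x1 - x0) * (y1 - y0)     # child's share of the screen
        if area < MINIMUM_AREA:
            break                          # truncate the stack here
        stack.append(child)
        node = child
    return stack
```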
• The foregoing technique involves translating the layout of the various nodes into a tree, which conceptually is illustrated in FIGs. 21 and 22.
• as shown in FIGs. 21 and 22, there is a corresponding tree for a particular displayed set of nodes, and the tree structure may be used to propagate the view information as previously described.
  • Each of the nodes 401 through 408 corresponds to a point on the tree with a similar number shown in FIG. 22.
  • Shown conceptually in Figure 20 is the information that would be stored with respect to a node, such as the pointer and the information indicative of the size and location of another node, for example.
• FIG. 23 is a block diagram corresponding to a portion of the tree diagram of FIG. 22.
  • node 403 includes a pointer to child node 406, information about the logical size (100X100) of child node 406, and data specifying the location of child node 406 in the coordinates of node 403: 10, 20.
  • a panning operation may move the last node far enough away that it no longer belongs in the viewStack.
  • zooming in might enlarge a child to the extent that a lengthening of the viewStack is required, or zooming out might bring the last node's area below a minimum area requiring a truncation of the viewStack.
• in each of these cases, the identity of the last node changes.
  • the present invention relates to methods and apparatus for navigating, such as zooming and panning, over an image of an object in such a way as to provide the appearance of smooth, continuous navigational movement.
  • GUIs graphical computer user interfaces
  • visual components may be represented and manipulated such that they do not have a fixed spatial scale on the display; indeed, the visual components may be panned and/or zoomed in or out.
  • the ability to zoom in and out on an image is desirable in connection with, for example, viewing maps, browsing through text layouts such as newspapers, viewing digital photographs, viewing blueprints or diagrams, and viewing other large data sets .
  • zoomable components are a peripheral aspect of a user's interaction with the software and the zooming feature is only employed occasionally.
  • These computer applications permit a user to pan over an image smoothly and continuously (e.g., utilizing scroll bars or the cursor to translate the viewed image left, right, up or down) .
  • a significant problem with such computer applications is that they do not permit a user to zoom smoothly and continuously. Indeed, they provide zooming in discrete steps, such as 10%, 25%, 50%, 75%, 100%, 150%, 200%, 500%, etc. The user selects the desired zoom using the cursor and, in response, the image changes abruptly to the selected zoom level .
  • FIGS. 1-4 are examples of images that one may obtain from the MapQuest website in response to a query for a regional map of Long Island, NY, U.S.A.
  • the MapQuest website permits the user to zoom in and zoom out to discrete levels, such as 10 levels.
  • FIG. 24 is a rendition at zoom level 5, which is approximately 100 meters/pixel.
  • FIG. 25 is an image at a zoom level 6, which is about 35 meters/pixel.
  • FIG. 26 is an image at a zoom level 7, which is about 20 meters/pixel.
  • FIG. 27 is an image at a zoom level 9, which is about 10 meters/pixel.
  • the abrupt transitions between zoom levels result in a sudden and abrupt loss of detail when zooming out and a sudden and abrupt addition of detail when zooming in.
  • no local, secondary or connecting roads may be seen in FIG. 24 (at zoom level 5) , although secondary and connecting roads suddenly appear in FIG. 25, which is the very next zoom level.
  • Such abrupt discontinuities are very displeasing when utilizing the MapQuest website. It is noted, however, that even if the MapQuest software application were modified to permit a view of, for example, local streets at zoom level 5 (FIG. 24), the results would still be unsatisfactory.
• the visual density of the map would change with the zoom level. While at some level of zoom the result might be pleasing (e.g., at level 7, FIG. 26), as one zoomed in the roads would not thicken, making the map look overly sparse. As one zoomed out, the roads would eventually run into each other, rapidly forming a solid nest in which individual roads would be indistinguishable.
• A1 roads may be about 16 meters wide
  • A2 roads may be about 12 meters wide
  • A3 roads may be about 8 meters wide
  • A4 roads may be about 5 meters wide
  • A5 roads may be about 2.5 meters wide.
• the MapQuest computer application deals with these varying levels of coarseness by displaying only the road categories deemed appropriate at a particular zoom level. For example, a nation-wide view might only show A1 roads, while a state-wide view might show A1 and A2 roads, and a county-wide view might show A1, A2 and A3 roads. Even if MapQuest were modified to allow continuous zooming of the roadmap, this approach would lead to the sudden appearance and disappearance of road categories during zooming, which is confusing and visually displeasing.
  • methods and apparatus are contemplated to perform various actions, including: zooming into or out of an image having at least one object, wherein at least some elements of at least one object are scaled up and/or down in a way that is non-physically proportional to one or more zoom levels associated with the zooming.
• the scale power a is not equal to -1 (typically -1 < a < 0) within a range of zoom levels z0 and z1, where z0 is of a lower physical linear size/pixel than z1.
• z0 and z1 may vary for one or more elements of the object. It is noted that a, c and d may also vary from element to element.
  • At least some elements of the at least one object may also be scaled up and/or down in a way that is physically proportional to one or more zoom levels associated with the zooming.
  • the elements of the object may be of varying degrees of coarseness.
• the coarseness of the elements of a roadmap object manifests because there are considerably more A4 roads than A3 roads, there are considerably more A3 roads than A2 roads, and there are considerably more A2 roads than A1 roads.
  • Degree of coarseness in road categories also manifests in such properties as average road length, frequency of intersections, and maximum curvature.
  • the coarseness of the elements of other image objects may manifest in other ways too numerous to list in their entirety.
  • the scaling of the elements in a given predetermined image may be physically proportional or non-physically proportional based on at least one of: (i) a degree of coarseness of such elements; and (ii) the zoom level of the given predetermined image.
  • the object may be a roadmap
  • the elements of the object may be roads
  • the varying degrees of coarseness may be road hierarchies.
  • the scaling of a given road in a given predetermined image may be physically proportional or non-physically proportional based on: (i) the road hierarchy of the given road; and (ii) the zoom level of the given predetermined image.
  • methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of a roadmap; receiving one or more user navigation commands including zooming information at the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
• methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of at least one object, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and (ii) non-physically proportional to the zoom level.
  • methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of a roadmap to a client terminal over a communications channel; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
  • methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of at least one object to a client terminal over a communications channel, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and (ii) non-physically proportional to the zoom level; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; blending two of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands; and displaying the intermediate image on the client terminal.
  • FIG. 24 is an image taken from the MapQuest website, which is at a zoom level 5;
  • FIG. 25 is an image taken from the MapQuest website, which is at a zoom level 6;
  • FIG. 26 is an image taken from the MapQuest website, which is at a zoom level 7;
  • FIG. 27 is an image taken from the MapQuest website, which is at a zoom level 9;
  • FIG. 28 is an image of Long Island produced at a zoom level of about 334 meters/pixel in accordance with one or more aspects of the present invention;
  • FIG. 29 is an image of Long Island produced at a zoom level of about 191 meters/pixel in accordance with one or more further aspects of the present invention.
  • FIG. 30 is an image of Long Island produced at a zoom level of about 109.2 meters/pixel in accordance with one or more further aspects of the present invention.
  • FIG. 31 is an image of Long Island produced at a zoom level of about 62.4 meters/pixel in accordance with one or more further aspects of the present invention.
  • FIG. 32 is an image of Long Island produced at a zoom level of about 35.7 meters/pixel in accordance with one or more further aspects of the present invention
  • FIG. 33 is an image of Long Island produced at a zoom level of about 20.4 meters/pixel in accordance with one or more further aspects of the present invention
  • FIG. 34 is an image of Long Island produced at a zoom level of about 11.7 meters/pixel in accordance with one or more further aspects of the present invention.
  • FIG. 35 is a flow diagram illustrating process steps that may be carried out in order to provide smooth and continuous navigation of an image in accordance with one or more aspects of the present invention
  • FIG. 36 is a flow diagram illustrating further process steps that may be carried out in order to smoothly navigate an image in accordance with various aspects of the present invention
  • FIG. 37 is a log-log graph of a line width in pixels versus a zoom level in meters/pixel illustrating physical and non-physical scaling in accordance with one or more further aspects of the present invention.
  • FIG. 38 is a log-log graph illustrating variations in the physical and non-physical scaling of FIG. 37.
  • FIGS. 16A-D illustrate respective antialiased vertical lines whose endpoints are precisely centered on pixel coordinates
  • FIGS. 17A-C illustrate respective antialiased lines on a slant, with endpoints not positioned to fall at exact pixel coordinates
  • FIG. 41 is the log-log graph of line width versus zoom level of FIG. 37 including horizontal lines indicating incremental line widths, and vertical lines spaced such that the line width over the interval between two adjacent vertical lines changes by no more than two pixels.
• FIGS. 5-11 show a series of images representing the road system of Long Island, NY, U.S.A., where each image is at a different zoom level (or resolution).
  • zoom level or resolution
  • the various aspects of the present invention may be applied in contexts other than the navigation of a roadmap image .
  • the extent of images and implementations for which the present invention may be employed are too numerous to list in their entirety.
  • the features of the present invention may be used to navigate images of the human anatomy, complex topographies, engineering diagrams such as wiring diagrams or blueprints, gene ontologies, etc. It has been found, however, that the invention has particular applicability to navigating images in which the elements thereof are of varying levels of detail or coarseness. Therefore, for the purposes of brevity and clarity, the various aspects of the present invention will be discussed in connection with a specific example, namely, images of a roadmap.
• the image 100A of the roadmap illustrated in FIG. 28 is at a zoom level that may be characterized by units of physical length/pixel (or physical linear size/pixel).
• the zoom level, z, represents the actual physical linear size that a single pixel of the image 100A represents.
  • the zoom level is about 334 meters/pixel.
• FIG. 29 is an image 100B of the same roadmap as FIG. 28.
• Another significant feature of the present invention, as illustrated in FIGS. 5-11, is that little or no detail abruptly appears or disappears when zooming from one level to another level.
  • the image object in this case the roadmap, includes elements (i.e., roads) of varying degrees of coarseness.
• the roadmap 100D of FIG. 31 includes at least A1 highways such as 102, A3 secondary roads such as 104, and A4 local roads such as 106. Yet these details, even the A4 local roads 106, may still be seen in image 100A of FIG. 28, which is substantially zoomed out in comparison with the image 100D of FIG. 31.
• the A1, A2, A3, and A4 roads may be distinguished from one another. Even A1 primary highways 102 and A2 primary roads 108 may be distinguished from one another vis-a-vis the relative weight given to such roads in the rendered image 100A.
  • the information that the user seeks to obtain at a given zooming level is not obscured by undesirable artifacts.
  • A4 local roads 106 or even A5 dirt roads.
• at a zoom level of z = 11.7 meters/pixel, a user may be interested in finding a particular A4 local road such as 112, and may do so without interference by significantly larger roads such as the A1 primary highway 102.
  • FIGS. 12-13 are flow diagrams illustrating process steps that are preferably carried out by the one or more computing devices and/or related equipment.
  • the process flow is carried out by commercially available computing equipment (such as Pentium-based computers)
  • any of a number of other techniques may be employed to carry out the process steps without departing from the spirit and scope of the present invention as claimed.
  • the hardware employed may be implemented utilizing any other known or hereinafter developed technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, one or more programmable digital devices or systems, such as programmable read only memories (PROMs) , programmable array logic devices (PALs), any combination of the above, etc.
  • the methods of the present invention may be embodied in a software program that may be stored on any of the known or hereinafter developed media.
  • FIG. 35 illustrates an embodiment of the invention in which a plurality of images are prepared (each at a different zoom level or resolution) , action 200, and two or more of the images are blended together to achieve the appearance of smooth navigation, such as zooming (action 206) .
  • a service provider would expend the resources to prepare a plurality of pre-rendered images (action 200) and make the images available to a user's client terminal over a communications channel, such as the Internet (action 202) .
  • the pre-rendered images may be an integral or related part of an application program that the user loads and executes on his or her computer.
• in response to user-initiated navigation commands (action 204), such as zooming commands, the client terminal is preferably operable to blend two or more images in order to produce an intermediate resolution image that coincides with the navigation command (action 206).
• This blending may be accomplished by a number of methods, such as the well-known trilinear interpolation technique described by Lance Williams, Pyramidal Parametrics, Computer Graphics, Proc. SIGGRAPH '83, 17(3): 1-11 (1983), the entire disclosure of which is incorporated herein by reference.
  • Other approaches to image interpolation are also useful in connection with the present invention, such as bicubic-linear interpolation, and still others may be developed in the future.
  • the present invention does not require or depend on any particular one of these blending methods.
  • the user may wish to navigate to a zoom level of 62.4 meters/pixel.
  • this zoom level may be between two of the pre-rendered images (e.g., in this example between zoom level 50 meters/pixel and zoom level 75 meters/pixel)
  • the desired zoom level of 62.4 meters/pixel may be achieved using the trilinear interpolation technique.
  • any zoom level between 50 meters/pixel and 75 meters/pixel may be obtained utilizing a blending method as described above, which if performed quickly enough provides the appearance of smooth and continuous navigation.
  • the blending technique may be carried through to other zoom levels, such as the 35.7 meters/pixel level illustrated in FIG. 32. In such case, the blending technique may be performed as between the pre-rendered images of 30 meters/pixel and 50 meters/pixel of the example discussed thus far.
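• A rough sketch of this blending step: given pre-rendered images at 50 and 75 meters/pixel, a view at 62.4 meters/pixel can be approximated by resampling both images to the target scale and alpha-blending them. The log-zoom interpolation of the blend weight below is an assumption for illustration; the trilinear interpolation cited above blends between pyramid levels in an analogous way:

```python
# Sketch: approximating an intermediate zoom level by blending two
# pre-rendered levels (blend weight interpolated in log-zoom space).
import math

def blend_weight(z, z_fine, z_coarse):
    """Fraction of the coarse image to use at zoom z (meters/pixel)."""
    return (math.log(z) - math.log(z_fine)) / (math.log(z_coarse) - math.log(z_fine))

def blend_pixel(fine_value, coarse_value, z, z_fine=50.0, z_coarse=75.0):
    """Blend one (already resampled) pixel from each pre-rendered image."""
    t = blend_weight(z, z_fine, z_coarse)
    return (1.0 - t) * fine_value + t * coarse_value

# At 62.4 meters/pixel the view sits between the two pre-rendered levels:
print(round(blend_weight(62.4, 50.0, 75.0), 3))   # ~0.546 of the coarse image
```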
  • the above blending approach may be used when the computing power of the processing unit on which the invention is carried out is not high enough to (i) perform the rendering operation in the first instance, and/or (ii) perform image rendering "just-in-time” or “on the fly” (for example, in real time) to achieve a high image frame rate for smooth navigation.
  • image rendering "just-in-time” or "on the fly” (for example, in real time) to achieve a high image frame rate for smooth navigation.
  • other embodiments of the invention contemplate use of known, or hereinafter developed, high power processing units that are capable of rendering at the client terminal for blending and/or high frame rate applications.
  • the process flow of FIG. 36 illustrates the detailed steps and/or actions that are preferably conducted to prepare one or more images in accordance with the present invention.
  • the information is obtained regarding the image object or objects using any of the known or hereinafter developed techniques.
  • image objects have been modeled using appropriate primitives, such as polygons, lines, points, etc.
  • appropriate primitives such as polygons, lines, points, etc.
• UTM Universal Transverse Mercator
  • the model is usually in the form of a list of line segments (in any coordinate system) that comprise the roads in the zone.
  • the list may be converted into an image in the spatial domain (a pixel image) using any of the known or hereinafter developed rendering processes so long as it incorporates certain techniques for determining the weight (e.g., apparent or real thickness) of a given primitive in the pixel (spatial) domain.
  • the rendering processes should incorporate certain techniques for determining the weight of the lines that model the roads of the roadmap in the spatial domain. These techniques will be discussed below.
  • the elements of the object are classified.
• the classification may take the form of recognizing already existing categories, namely, A1, A2, A3, A4, and A5. Indeed, these road elements have varying degrees of coarseness and, as will be discussed below, may be rendered differently based on this classification.
  • mathematical scaling is applied to the different road elements based on the zoom level. As will be discussed in more detail below, the mathematical scaling may also vary based on the element classification.
  • the actual physical scaling technique dictates that the roadmap is rendered as if viewing an actual physical image of the roads at different scales.
• A1 highways, for example, might be 16 meters wide, A2 roads might be 12 meters wide, A3 roads might be 8 meters wide, A4 roads might be 5 meters wide, and A5 roads might be 2.5 meters wide.
• While this might be acceptable to the viewer when zoomed in on a small area of the map, as one zooms out, all roads, both major and minor, become too thin to make out.
• at the state level (e.g., about 200 meters/pixel), no roads would be seen at all.
  • the pre-set pixel width approach dictates that every road is a certain pixel width, such as one pixel in width on the display.
  • the images are produced in such a way that at least some image elements are scaled up and/or down either (i) physically proportional to the zoom level/ or (ii) non-physically proportional to the zoom level, depending on parameters that will be discussed in more detail below.
  • scaling being "physically proportional to the zoom level” means that the number of pixels representing the road width increases or decreases with the zoom level as the size of an element would appear to change with its distance from the human eye.
• with physically proportional scaling, the size of an object in display pixels is p = c · d / x, where d is the physical linear size of the object, c is a constant determining the angular perspective, and x is the distance of the object from the viewer.
• more generally, the linear size in display pixels p of an object of physical linear size d' at zoom level z is given by p = c · d' · z^a.
  • a may be set to a power law other than -1, and d' may be set to a physical linear size other than the actual physical linear size d.
  • p may represent the displayed width of a road in pixels and d' may represent an imputed width in physical units
• non-physically proportional to the zoom level means that the road width in display pixels increases or decreases with the zoom level in a way other than being physically proportional to the zoom level, i.e., a ≠ -1.
  • the scaling is distorted in a way that achieves certain desirable results.
  • the linear sizes of the elements of an object may involve length, width, radius, diameter, and/or any other measurement that one can read off with a ruler on the Euclidean plane.
  • the thickness of a line, the length of a line, the diameter of a circle or disc, the length of one side of a polygon, and the distance between two points are all examples of linear sizes. In this sense the "linear size" in two dimensions is the distance between two identified points of an object on a 2D Euclidean plane.
• a < 0 will cause the rendered size of an element to decrease as one zooms out, and increase as one zooms in.
• when a < -1, the rendered size of the element will decrease faster than it would with proportional physical scaling as one zooms out.
• when -1 < a < 0, the size of the rendered element decreases more slowly than it would with proportional physical scaling as one zooms out.
• p(z), for a given length of a given object, is preferably substantially continuous, so that during navigation the user does not experience a sudden jump or discontinuity in the size of an element of the image (as opposed to the conventional approaches, which permit the most extreme discontinuity: a sudden appearance or disappearance of an element during navigation).
• p(z) preferably decreases monotonically with zooming out, such that zooming out causes the elements of the object to become smaller (e.g., roads to become thinner), and zooming in causes the elements of the object to become larger. This gives the user a sense of physicality about the object(s) of the image.
  • the basic characteristics of the line (road) width versus zoom level plot are:
• (i) the scaling of the road widths may be physically proportional to the zoom level when zoomed in (e.g., up to about 0.5 meters/pixel);
• (ii) the scaling of the road widths may be non-physically proportional to the zoom level when zoomed out (e.g., above about 0.5 meters/pixel); and (iii) the scaling of the road widths may be physically proportional to the zoom level when zoomed further out (e.g., above about 50 meters/pixel or higher depending on parameters which will be discussed in more detail below).
• for physically proportional scaling, a = -1.
• z0 = 0.5 meters/pixel, or 2 pixels/meter, which when expressed as a map scale on a 15 inch display (with 1600x1200 pixel resolution) corresponds to a scale of about 1:2600.
• with d = 16 meters, which is a reasonable real physical width for A1 roads, the rendered road will appear at its actual size when one is zoomed in (0.5 meters/pixel or less).
• under physical scaling, at a zoom level of 0.1 meters/pixel the rendered line is about 160 pixels wide.
• at z0 = 0.5 meters/pixel, the rendered line is 32 pixels wide.
• when -1 < a < 0, the width of the rendered road decreases more slowly than it would with proportional physical scaling as one zooms out.
• this permits the A1 road to remain visible (and distinguishable from other smaller roads) as one zooms out. For example, as shown in FIG.
  • the width of the rendered line using physical scaling would have been about 0.005 pixels at a zoom level of about 3300 meters/pixel, rendering it virtually invisible.
  • the width of the rendered line is about 0.8 pixels at a zoom level of 3300 meters/pixel, rendering it clearly visible .
• z1 is chosen to be the most zoomed-out scale at which a given road still has "greater than physical" importance.
• if the entire continental United States were displayed on such a screen, the resolution would be approximately 3300 meters/pixel, or 3.3 kilometers/pixel. If one looks at the entire world, then there may be no reason for U.S. highways to assume enhanced importance relative to the view of the country alone.
• beyond z1, the scaling of the road widths is again physically proportional to the zoom level, but preferably with a large d' (much greater than the real width d) for continuity of p(z).
• z1 and the new value for d' are preferably chosen in such a way that, at the outer scale z1, the rendered width of the line will be a reasonable number of pixels.
• at this outer scale, A1 roads may be about 1/2 pixel wide, which is thin but still clearly visible; this corresponds to an imputed physical road width of 1650 meters, or 1.65 kilometers.
• p(z) has six parameters: z0, z1, d0, d1, d2 and a.
• z0 and z1 mark the scales at which the behavior of p(z) changes.
• below z0, zooming is physical (i.e., the exponent of z is -1), with a physical width of d0, which preferably corresponds to the real physical width d.
• above z1, zooming is again physical, but with a physical width of d1, which in general does not correspond to d.
• between z0 and z1, the rendered line width scales with a power law of a, which can be a value other than -1.
• z0 = 0.5 meters/pixel for all roads, although it may vary from element to element depending on the context.
  • the dotted lines all have a slope of -1 and represent physical scaling at different physical widths. From the top down, the corresponding physical widths of these dotted lines are: 1.65 kilometers, 312 meters, 100 meters, 20 meters, 16 meters, 12 meters, 8 meters, 5 meters, and 2.5 meters.
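• The piecewise behavior of p(z) can be sketched as follows. The values d0 = 16 m, z0 = 0.5 m/pixel and z1 = 3300 m/pixel are taken from the A1-road example above; the exponent a is fitted so that the A1 road is about half a pixel wide at z1 (matching the 1.65 km imputed width quoted above), and the continuity construction of the middle and outer segments is an assumption consistent with the stated preference that p(z) be continuous:

```python
# Sketch: a piecewise road-width scaling curve p(z) built from the
# parameters discussed above; the middle and outer segments are
# constructed so that p(z) stays continuous.

def make_scaling(d0=16.0, z0=0.5, z1=3300.0, a=-0.4729):
    p_z0 = d0 / z0                      # physical width in pixels at the inner knee
    p_z1 = p_z0 * (z1 / z0) ** a        # width in pixels at the outer knee
    d1 = p_z1 * z1                      # imputed physical width used beyond z1

    def p(z):
        if z <= z0:                     # zoomed in: true physical scaling
            return d0 / z
        if z <= z1:                     # middle: power law with exponent a
            return p_z0 * (z / z0) ** a
        return d1 / z                   # zoomed out: physical again, width d1

    return p, d1

p, d1 = make_scaling()
print(round(p(0.5), 1))    # 32.0 pixels: a 16 m road at its true size
print(round(p(3300.0), 2)) # ~0.5 pixels at the outer knee
print(round(d1))           # ~1650 m imputed width, i.e. about 1.65 km
```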
• a one pixel wide vertical line drawn in black on a white background, with its horizontal position aligned exactly to the pixel grid, consists simply of a 1-pixel-wide column of black pixels.
  • the line width is a non-integral number of pixels.
  • This approach to drawing lines of non-integer width on a pixellated display results in a sense (or illusion) of visual continuity as line width changes, allowing lines of different widths to be clearly distinguished even if they differ in width only by a fraction of a pixel.
• this approach, known as antialiased line drawing, is designed to ensure that the line integral of the intensity function (or the (1 - intensity) function, for black lines on a white background) over a perpendicular to the line drawn is equal to the line width.
  • This method generalizes readily to lines whose endpoints do not lie precisely in the centers of pixels, to lines which are in other orientations than vertical, and to curves.
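• The invariant just described (the perpendicular integral of the drawn intensity equals the line width) can be illustrated for a vertical line centered on a pixel column; the pixel-coverage computation below is a simple sketch, not the antialiasing routine of any particular graphics library:

```python
# Sketch: intensity profile across a vertical black line of fractional
# width w, centered at `center`, on a white background. Each pixel's
# darkness equals the fraction of that pixel the line covers, so the
# sum across the row equals w (the invariant described above).

def line_coverage(width, center, n_pixels=7):
    left, right = center - width / 2.0, center + width / 2.0
    row = []
    for px in range(n_pixels):
        lo, hi = px - 0.5, px + 0.5          # pixel px spans [px-0.5, px+0.5]
        overlap = max(0.0, min(hi, right) - max(lo, left))
        row.append(overlap)                  # 0 = white, 1 = fully black
    return row

row = line_coverage(width=2.5, center=3.0)
print([round(v, 2) for v in row])   # [0.0, 0.0, 0.75, 1.0, 0.75, 0.0, 0.0]
print(sum(row))                      # 2.5 -> the integral equals the width
```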
• the effect illustrated in FIGS. 16A-D could also be accomplished by alpha-blending two images, one (image A) in which the line is 1 pixel wide, and the other (image B) in which the line is wider.
• in FIGS. 17A-C, a 1 pixel wide line (FIG. 17A), a 2 pixel wide line (FIG. 17B) and a 3 pixel wide line (FIG. 17C) are illustrated in an arbitrary orientation.
  • the same principle applies to the arbitrary orientation of FIGS. 17A-C as to the case where the lines are aligned exactly to the pixel grid, although the spacing of the line widths between which to alpha-blend may need to be finer than two pixels for good results.
  • FIG. 41 is substantially similar to FIG. 37 except that FIG. 41 includes a set of horizontal lines and vertical lines.
  • the horizontal lines indicate line widths between 1 and 10 pixels, in increments of one pixel.
  • the vertical lines are spaced such that line width over the interval between two adjacent vertical lines changes by no more than two pixels.
  • the vertical lines represent a set of zoom values suitable for pre-rendition, wherein alpha-blending between two adjacent such pre-rendered images will produce characteristics nearly equivalent to rendering the lines representing roads at continuously variable widths.
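• Given a monotonically decreasing width curve p(z), such as the make_scaling() sketch shown earlier, one illustrative way to pick such pre-render zoom values is a simple outward walk that records a breakpoint whenever the rendered width has dropped by two pixels; this is only a sketch of the idea behind FIG. 41, not the procedure used to produce it:

```python
# Sketch: choose pre-render zoom values so that the rendered line width
# changes by at most `max_step` pixels between adjacent values.
# Assumes p(z) decreases monotonically as one zooms out.

def prerender_zooms(p, z_min, z_max, max_step=2.0, factor=1.01):
    zooms, z = [z_min], z_min
    last_width = p(z_min)
    while z < z_max:
        z = min(z * factor, z_max)           # walk outward in zoom
        if last_width - p(z) >= max_step or z == z_max:
            zooms.append(z)
            last_width = p(z)
    return zooms

# Example with the A1-road curve sketched earlier (9 to 1000 m/pixel):
# p, _ = make_scaling()
# print(prerender_zooms(p, 9.0, 1000.0))     # a handful of zoom values
```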
• Interpolation between the six resolutions represented by the vertical lines shown in FIG. 41 is sufficient to render the A1 highways accurately using the scaling curve shown at about nine meters/pixel and above. Rendition below about nine meters/pixel does not require pre-rendition, as such views are very zoomed-in and thus show very few roads, making it more computationally efficient (and more efficient with respect to data storage requirements) to render them vectorially than to interpolate between pre-rendered images. At resolutions of more than about 1000 meters/pixel (such views encompass large fractions of the Earth's surface), the final pre-rendered image alone can be used, as it is a rendition using 1 pixel wide lines. Lines that are thinner than a single pixel render the same pixels more faintly.
• for example, to render a line 0.5 pixels wide, the 1 pixel wide line image can be multiplied by an alpha of 0.5.
  • This tiling technique may be employed for resolving an image at a particular zoom level, even if that level does not coincide with a pre-rendered image. If each image in the somewhat larger set of resolutions is pre-rendered at the appropriate resolution and tiled, then the result is a complete system for zooming and panning navigation through a roadmap of arbitrary complexity, such that all lines appear to vary in width continuously in accordance with the scaling equations disclosed herein. Additional details concerning other techniques for blending images, which may be employed in connection with implementing the present invention, may be found in U.S. Provisional Patent Application No.
  • the user enjoys the appearance of smooth and continuous navigation through the various zoom levels. Further, little or no detail abruptly appears or disappears when zooming from one level to another. This represents a significant advancement over the state of the art.
  • the various aspects of the present invention may be applied in numerous products, such as interactive software applications over the Internet, automobile-based software applications and the like.
  • the present invention may be employed by an Internet website that provides maps and driving directions to client terminals in response to user requests.
  • various aspects of the invention may be employed in a GPS navigation system in an automobile.
• the invention may also be incorporated into medical imaging equipment, whereby detailed information concerning, for example, a patient's circulatory system, nervous system, etc. may be rendered and navigated as discussed hereinabove.
  • the applications of the invention are too numerous to list in their entirety, yet a skilled artisan will recognize that they are contemplated herein and fall within the scope of the invention as claimed.
  • the present invention may also be utilized in connection with other applications in which the rendered images provide a means for advertising and otherwise advancing commerce. Additional details concerning these aspects and uses of the present invention may be found in U.S. Provisional Patent Application No. 60/553,803, entitled METHODS AND APPARATUS FOR EMPLOYING MAPPING TECHNIQUES TO ADVANCE COMMERCE, filed on even date herewith, Attorney Docket No. 489/7, the entire disclosure of which is hereby incorporated by reference.
• MRU caching, where MRU stands for "most recently used," is a known concept for implementing a client-side memory in a client-server system. It is assumed that the server has access to and can serve to a client a large number of data objects, which in the aggregate may occupy a large amount of memory. The available bandwidth between client and server is limited, however, so client requests for data objects to be sent from the server take time. If access to data objects is reasonably "coherent," meaning that objects which the client needed recently are likely to be needed again in the near future, then MRU caching may increase the efficiency of the client-server system.
  • the client generally sets aside some limited amount of memory (generally much less than would be needed to store all of the objects on the server) , and stores in this memory (a cache) as many of the most recently requested objects as will fit.
• when the cache is full, the least recently used (LRU) object is erased from the cache to create data storage space in which the new object may be stored.
• when an object is needed, the cache is first examined to see if the object is cached. If it is cached, then the cached representation is used, obviating the need for a slow or computationally expensive server request.
• using a cached representation also "promotes" that object to the MRU end of the cache.
  • This approach generally provides substantial performance advantages over having to request data from the server for every data object accessed.
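• A conventional MRU cache with LRU eviction, as described above, is commonly built on an ordered map. The sketch below is generic (the fetch_from_server callable stands in for the slow server request) and is not specific to the image tiles discussed elsewhere in this document:

```python
# Sketch: a conventional MRU cache with LRU eviction, built on an
# ordered dictionary.
from collections import OrderedDict

class MruCache:
    def __init__(self, capacity, fetch_from_server):
        self.capacity = capacity
        self.fetch = fetch_from_server
        self.objects = OrderedDict()         # oldest (LRU) first

    def get(self, key):
        if key in self.objects:
            self.objects.move_to_end(key)    # promote to the MRU end
            return self.objects[key]
        value = self.fetch(key)              # slow / expensive server path
        self.objects[key] = value
        if len(self.objects) > self.capacity:
            self.objects.popitem(last=False) # erase the LRU object
        return value

cache = MruCache(capacity=2, fetch_from_server=lambda k: f"data:{k}")
cache.get("a"); cache.get("b"); cache.get("a"); cache.get("c")
print(list(cache.objects))                   # ['a', 'c'] - 'b' was evicted
```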
  • the erasure of the least recently used object from a cache when a new object is accessed by a computing system and stored in the cache may cause inefficiency in cache usage.
• even the erased, least-recently-used object may again be requested from the server after it has been erased from the cache.
  • the server may undertake the relatively slow or computationally expensive task of retrieving this object from a more remote source of data storage, such as a main memory or mass storage device.
  • the invention may provide a method, comprising providing a cache in a computing system having an initial group of cache objects, each cache object having an initial compression ratio and including stored data; decreasing an amount of data storage space occupied by at least one of the cache objects other than a given one of the cache objects; and increasing an amount of data storage space occupied by the given cache object.
  • the decreasing comprises decreasing the amount of data storage space by a given amount.
  • the increasing comprises increasing the amount of data storage space occupied by the given cache object by the given amount.
  • the decreasing comprises increasing the initial compression ratio of the at least one cache object.
  • the increasing comprises decreasing the initial compression ratio of the given cache object.
  • the given cache object is a most recently used cache object of the cache objects.
  • the at least one cache object undergoing the decreasing step comprises a least recently used cache object of the cache objects.
  • the decreasing comprises removing a portion of the stored data for the at least one cache object.
  • the increasing comprises supplementing the stored data for the given cache object.
  • an amount of data storage space available for each of the cache objects may equal one of a finite number of discrete values.
  • the decreasing comprises reducing the amount of data storage space for at least one randomly selected cache object of the cache objects, other than the given cache object.
  • the reducing comprises reducing the amount of data storage space for the at least one randomly selected cache object to a randomly determined extent.
  • the randomly selected cache object is selected using one of a random method and pseudorandom method.
  • the selection of the randomly selected cache object is guided by a heuristic.
  • the method further comprises storing the given cache object in a losslessly compressed form after the increasing.
  • the method further comprises storing the given cache object in uncompressed form after the increasing.
  • the decreasing comprises removing at least one of the cache objects other than the given cache object .
  • the invention may provide an apparatus, comprising a computing system having at least one processor capable of operative communication with a main memory; and a cache in the computing system having an initial group of cache objects, each cache object having an initial compression ratio and including stored data; wherein the computing system is operable to decrease an amount of data storage space occupied by at least one of the cache objects other than a given one of the cache objects; and increase an amount of data storage space occupied by the given cache object.
  • the decreasing comprises decreasing the amount of data storage space by a given amount.
  • the increasing comprises increasing the amount of data storage space occupied by the given cache object by the given amount.
  • the decreasing comprises increasing the initial compression ratio of the at least one cache object.
  • the increasing comprises decreasing the initial compression ratio of the given cache object.
  • the given cache object is a most recently used cache object of the cache objects.
  • the decreasing comprises removing a portion of the stored data for the at least one cache object.
  • the increasing comprises supplementing the stored data for the given cache object.
  • an amount of data storage space available for each of the cache objects may equal one of a finite number of discrete values.
  • the decreasing comprises reducing the amount of data storage space for at least one randomly selected cache object of the cache objects, other than the given cache object.
  • the reducing comprises reducing the amount of data storage space for the at least one randomly selected cache object to a randomly determined extent.
• the invention provides a method, comprising: providing a cache in a computing system, the cache having an initial condition; if insufficient data storage space is present in the cache under the initial condition to store at least one new object in the cache, compressing at least one object in the cache to clear data storage space for the at least one new object; and storing the at least one new object in the cache.
  • the initial condition corresponds to the cache being empty.
  • the method further comprises continuing to store new objects in the cache without compressing the objects stored in the cache until insufficient data storage space remains in the cache to store any additional new object.
  • the method further comprises storing the at least one new object in the cache without the compressing, if sufficient space for storing the at least one new object is present in the cache under the initial condition.
  • FIG. 42 is a graph of the data storage sizes of individual cache objects in a cache as a function of the cache objects' recency of use within the cache, in accordance with one or more embodiments of the present invention.
  • FIG. 43 is a graph of the cumulative sum of the data storage space occupied by cache objects in a cache plotted against the number, "N," of cache objects summed, in accordance with one or more embodiments of the present invention
  • FIG. 44 is a block diagram of a data cache including a plurality of cache objects in accordance with one or more embodiments of the present invention
  • FIG. 45 is a block diagram of the data cache of FIG. 44 in which cache objects have been resized in accordance with one or more embodiments of the present invention
  • FIG. 46 is a block diagram of the data cache of FIG. 45 in which an accessed cache object has been resized and restored to the cache in accordance with one or more embodiments of the present invention
  • FIG. 47 is a flow diagram of a method for accessing a cache object in a data cache and updating the data cache in accordance with one or more embodiments of the present invention.
  • FIG. 48 is a block diagram of a computing system adaptable for use with one or more embodiments of the present invention .
  • This disclosure makes reference to the LRU and MRU ends of the cache. Objects are generally added at the MRU end, and are generally erased from the LRU end. However, the present invention is not limited to such a scheme. It is noted that the physical layout of cache objects in a cache need not correspond to the LRU-MRU layout. The layout of a cache merely preferably enables a computing system to find, insert, and/or erase objects in the manner described herein.
• the linear LRU-MRU arrangement is a convenient mechanism for describing the operation of a cache, but represents only one of many possible implementations of a cache memory. Herein, the terms "cache" and "cache memory" are used interchangeably.
  • MRU caching and its extensions disclosed herein are discussed in the context of a client/server architecture, similar principles apply to many other scenarios, such as efficient hard disk access on a single computer (where access to the hard disk is slower than access to RAM, and RAM is thus used to cache the most recently used content on the hard disk) .
  • data are gathered from the environment or generated computationally rather than being loaded from a disk or sent across a network.
  • the client has access to a small but fast temporary cache memory, and a larger but slower data source from which information is requested repeatedly. This slower data source is generally referred to herein as the "server.”
• the following discussion of convergent series is provided as an introduction to the cache memory apparatus and method disclosed herein.
• the infinite sum of the series y(n) = n^(-p), with n going from 1 to infinity, and with p > 1, is finite.
  • the principles underlying such convergent series may be used to implement one or more embodiments of efficient data caching methods and apparatus as described herein.
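• As a concrete instance of the convergence result relied upon above (a standard textbook fact, stated here only for orientation):

```latex
\sum_{n=1}^{\infty} n^{-p} < \infty \quad \text{for } p > 1,
\qquad \text{e.g.}\quad \sum_{n=1}^{\infty} n^{-2} = \frac{\pi^{2}}{6} \approx 1.645 .
```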
  • One or more embodiments of the methods and apparatus described herein may employ concepts related to the "Zeno paradox," which is described below. While this discussion provides a conceptual underpinning applicable to one or more embodiments described herein, the present invention is not limited by the conceptual aspects discussed below.
• Zeno Caching Concept. Zeno is a runner who is so quick that in one step (which, for the sake of discussion, it is assumed he makes every second) he covers half the distance from his current position to the end of any racetrack.
  • the paradox is that he never finishes the course, even though he moves forward with every step.
  • This concept may be extended to the storage of cache objects (with the cache itself being analogized to the "racetrack") by enabling the cache objects to be compressed to a progressively greater extent with decreasing recency of use or access.
• in proceeding from the MRU end of a cache to the LRU end thereof, a theoretically infinite number of additional cache objects of ever decreasing size could be put in place, without ever running out of space. This principle is referred to herein as the Zeno cache concept.
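• In quantitative terms, and assuming for illustration that each successively less recently used object is allotted half the space of its predecessor, the total space consumed stays finite even for an unbounded number of objects:

```latex
\sum_{n=1}^{\infty} 2^{-(n-1)} = 2
\qquad \text{(two units of storage suffice for infinitely many ever-smaller objects).}
```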
  • the cache objects concerned herein are compressible, which in this disclosure, corresponds to being amenable to lossy data compression techniques.
  • Lossy data compression may be characterized by the ability to represent a data object with fewer bytes than the full representation of the data object. Higher compression ratios generally incur higher distortion of the data object and lower quality of an image rendered using the compressed data (where the object includes one or more image files) .
  • lossy compression techniques may also be applicable to sound, video, and many other data types .
  • compressed versions of the data may be suitable as substitutes for the uncompressed data.
  • the compressed representations of the data may be fully adequate, and above the given level of distortion, the compressed representations may be adequate as a temporary measure while the client waits for a higher quality version of the data.
  • the higher quality version may merely be less compressed than the temporarily used version, or may be losslessly compressed or uncompressed.
  • lower quality representations may be subsets of higher quality representations, meaning that improving the representation quality at the client side may involve merely sending data to supplement lower quality representations, thereby providing the higher quality representation.
  • the reverse process of lowering the representation quality of an object may involve merely removing a portion of the data employed for a high quality representation of an image, rather than requiring compression, or re-compression, of the data used for the high quality representation.
  • This property preferably also enhances the efficiency of the caching apparatus and method disclosed herein.
  • the compression technique may provide objects with compression levels that scale from lossy to lossless. This feature may allow a lossless representation of a data object to be built up in steps, from highly lossy to lossless, at little or no extra total cost relative to sending across a lossless version initially.
• An example of a data type and compression technology enabling the above features is the wavelet compression of images, as exemplified by the JPEG2000 standard.
• the present invention is not limited to the use of the JPEG2000 standard.
  • the variable "N" preferably corresponds to the number of each cache object, the value of the number of each cache object representing the recency of use of each such cache object, with increasing values of N corresponding to decreasing recency of use of the cache object associated with that number.
• the variable "Y" preferably corresponds to the size of each cache object.
• a value of "1" may correspond to the size of a cache object in its highest-quality condition, i.e. when it is not compressed at all. Most-recently-used objects may be represented with low distortion, and less recently used objects may be subject to an extent of compression consistent with the recency of the last use thereof.
• as may be seen in FIG. 42, the value of the compressed cache-object size, Y, declines with decreasing recency of use of the pertinent cache object.
  • the "Y" variable may correspond to an absolute size of each object (whether compressed or not) in the cache (expressed in arbitrary units) .
  • "Y” may correspond to a compression ratio, with, for example, the value "1" corresponding to a full-size object, and the value "0.5" corresponding to an object occupying one half of its uncompressed size.
  • N may still be a finite number, as shown in FIG. 43.
• the units of the variable "Y" may be units of data storage space corresponding to the size of (or data storage space needed by) one representative fully expanded (uncompressed) cache object. Since FIGS. 42 and 43 aid an understanding of the theory of one or more embodiments of the present invention, information describing the size of the cache objects in bits and/or bytes of data is not provided herein.
  • the theoretical implementation described above is preferably modified for two reasons.
  • memory storage is preferably composed of discrete storage units.
  • it is usually meaningless in practice to compress a cache object to occupy an amount of storage space that is smaller than one bit.
  • the total number of operations performed on the cache is preferably finite.
  • enforcing a continuous curve of compression ratios described by one of the convergent formulas above may involve reducing the size of every cache object in the cache every time additional cache storage space was needed. This would require an impractically large number of operations.
  • the number of objects in the cache will in practice be finite. However, where the Zeno cache concept is employed, this number may be much larger than would be possible with conventional MRU caching.
• cached objects may have the property that if recently used, they may be stored at a high quality level (anywhere from a low level of distortion, or compression lossiness, to lossless compression, to uncompressed data).
• the quality level of cache objects may become progressively worse (i.e. be subject to progressively higher levels of distortion or compression lossiness) with each successive cache memory access in which these cache objects are not accessed.
  • cached representations may be subject to a maximum compression ratio that yields this minimum compressed size.
  • the maximum number of cache objects that can be stored in the cache may equal the total data storage space in the cache divided by the amount of data storage space occupied by a cache object having the above-described minimum compressed size, if the objects are all of equal size.
  • the cache objects need not all be of equal size.
  • An additional constraint can also be introduced, specifically that the likelihood of any given value repeating in successive values of a series increases at higher values of N such that the number of different values of Y employed may be limited to a reasonable number.
• An example of such a series is: 1, 1/4, 1/4, 1/16, 1/16, 1/16, 1/16, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/256, etc.
  • the log function described above provides one way to cause the number of available values of Y (possible sizes of the cache objects) to grow much more slowly than the value of N.
  • the present invention is not limited to the use of this log function, and other mathematical operations that cause the number of values of Y to grow more slowly than the value of N may be employed in connection with the present invention.
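• A minimal sketch of one such slowly growing size schedule follows, under the assumption that the repeated-value series above is intended to allot the size 4^(-k) to the 2^k objects at recency ranks 2^k through 2^(k+1)-1; the function name and the exact schedule are illustrative, not prescribed by this disclosure.

```python
import math

def zeno_size(n):
    """Fraction of an object's uncompressed size allotted to the object
    with recency rank n (n = 1 is the most recently used)."""
    k = int(math.floor(math.log2(n)))   # number of distinct sizes grows ~ log2(N)
    return 4.0 ** (-k)                  # 1, 1/4 x2, 1/16 x4, 1/64 x8, ...

# The total space stays bounded: each "band" of 2**k objects contributes
# 2**k * 4**(-k) = 2**(-k) units, so the sum over all bands is below 2.
total = sum(zeno_size(n) for n in range(1, 1_000_001))
distinct_sizes = len({zeno_size(n) for n in range(1, 1_000_001)})
print(round(total, 3), distinct_sizes)   # ~2.0 total units, about 20 distinct sizes
```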
• for example, even if N were one million, log2(N) would be only about 20, so only about 20 distinct cache object sizes would be needed.
  • random algorithms may be used to improve the basic algorithm in a number of ways.
  • the 2*1/4, 4*1/16 etc. series, described above may include only a small number of available cache object sizes, possibly leading to stark differences in compression ratios between different objects within a cache.
  • Random choice may be used to "squeeze" (reduce the data storage space used by) a randomly selected subset of the cache objects in a weighted fashion until some target amount of space is made available for new cache objects. This approach may provide beneficial results because the exact position in the cache of a cache object may decrease in importance with an increasing number of objects in the cache.
  • the amount by which each object is "squeezed” may also be at least partially randomized.
  • Using randomization algorithms like those discussed herein may reduce obvious discontinuities or thresholds in cache-object quality, which may be perceived in images rendered using cache objects stored in the cache.
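• One possible randomized "squeeze" pass is sketched below, assuming the allowed object sizes form a discrete ladder and that selection is weighted toward the LRU end; the weighting, the iteration guard, and the helper names are illustrative assumptions.

```python
import random

def squeeze_random(objects, bytes_needed, shrink):
    """objects: list ordered from MRU (index 0) to LRU (last entry).
    shrink(obj) reduces obj to its next smaller allowed size and returns the
    number of bytes freed (0 if obj is already at its minimum size).
    Randomly selected objects, weighted toward the LRU end, are squeezed
    until enough space has been cleared for the new or expanded object."""
    freed = 0
    weights = [rank + 1 for rank in range(len(objects))]   # LRU end weighted most heavily
    for _ in range(16 * len(objects)):                     # guard loop for this sketch
        if freed >= bytes_needed:
            break
        victim = random.choices(objects, weights=weights, k=1)[0]
        freed += shrink(victim)                            # possibly randomized step size
    return freed
```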
  • FIG. 44 is a block diagram of a data cache 300 including a plurality of cache objects 302-310 in accordance with one or more embodiments of the present invention.
  • FIG. 45 is a block diagram of the data cache 300 of FIG. 44 in which cache objects 302, 304, 308, and 310 have been resized in accordance with one or more embodiments of the present invention.
  • FIG. 46 is a block diagram of the data cache of FIG. 45 in which an accessed cache object 306 has been resized and restored to cache 300 in accordance with one or more embodiments of the present invention.
• cache 300 may include five cache objects (abbreviated "CO" in FIGS. 44-47 for the sake of brevity): CO 1 302, CO 2 304, CO 3 306, CO 4 308, and CO 5 310.
• the number of cache objects (5) shown in FIG. 44 has been selected for convenience in illustrating various concepts described herein. However, fewer or more than five cache objects may be included within cache 300. There is in principle no lower limit to the number of cache objects which may be included within cache 300. In principle, an upper limit on the number of cache objects that may be included within cache 300 may correspond to the total size of the cache divided by the smallest acceptable cache object size, which is discussed elsewhere herein.
• the size of each illustrated cache object is intended to be proportional to the data storage space used by that cache object. Also, in FIGS. 44-46, proceeding from the left-most cache object to the right-most cache object corresponds to increasing recency of access of the displayed cache objects, with the least recently used of the cache objects shown at the extreme left, and the most recently used of the cache objects shown at the extreme right.
• FIG. 45 shows cache 300 after CO 3 306 has been accessed by a computing system, such as computing system 900 of FIG. 48, employing cache 300.
  • CO 3 306 is not shown within cache 300 in its original position, since this position has been overwritten in the condition of cache 300 shown in FIG. 45.
  • free space 402 has been created to make room for an expanded version of CO 3 306 which may occupy more data storage space within cache 300 than did the original version of CO 3 in FIG. 44.
• in FIG. 46, an expanded version of CO 3 306 has been written into a portion of cache 300 which was occupied by free space 402 in the condition of cache 300 shown in FIG. 45.
• FIG. 47 is a flow diagram of a method 600 for accessing a cache object in data cache 300 and updating the data cache 300 in accordance with one or more embodiments of the present invention. Reference is made to FIGS. 44-47 in the following.
  • cache objects 302, 304, 306, 308, and 310 are provided to a program having access to cache 300.
  • the group of cache objects initially present in cache 300 are shown in FIG. 44.
• This initial condition of cache 300 may result from a default setting in one or more programs or from program steps previously executed within one or more programs. In any case, it is the changes made to cache 300 after establishment of the initial condition shown in FIG. 44 that are of interest in the following discussion.
• while any one of many programs operating on any one of various computing systems may be in communication with cache 300, for the sake of convenience, software which may access cache 300 is referred to merely as "the program" in the following.
  • an indication may be provided as to which cache object will be the next to be used by the program.
• the indicated cache object, which in this example is CO 3 306, may be accessed by the program.
  • a determination may be made as to whether the accessed cache object is expandable.
  • a cache object is expandable when it may occupy more data storage space by being subject to a lower compression ratio. Such expansion may be accomplished by supplementing the data already present in the cache object rather than by providing a completely new set of data corresponding to the new compression ratio (or corresponding to a lack of compression) .
• if the accessed cache object is not expandable, it is preferably restored to cache 300 in step 610.
  • the restored cache object occupies the same amount of data storage space after being accessed as it did prior to such access.
• upon being restored to cache 300, the accessed cache object may be written to the rightmost, or MRU, end of cache 300. Alternatively, however, the accessed cache object could be written to any part of cache 300. Continuing with this branch of method 600, the method 600 preferably ends at step 612.
• if the accessed cache object, such as cache object 306, is expandable, it is preferably expanded (step 614) in accordance with one or more embodiments of the present invention.
• expanding a cache object as described above preferably helps provide an arrangement in which the most recently and/or most frequently accessed cache objects are stored in cache 300 at the highest quality levels.
• the number of possible sizes (as measured in data storage space) of such cache objects may be limited to the quantity equal to log2(N).
  • Establishing a limited, finite number of possible cache object sizes, as described above, preferably limits the computational expense of determining a new, expanded size for a cache object, such as CO 306, to be expanded in step 614.
  • the amount of data storage space needed for the expanded (or otherwise stated, less compressed) version of CO 306 may be calculated by a computing system (not shown) having access to cache 300. Where cache 300 is not yet ready to receive the expanded version of CO 306, the expanded version of CO 306 may be written to another memory storage location (not shown) for temporary storage therein.
• data storage space 402 needed for storing an expanded version of CO 306 is preferably made available within cache 300. If there is sufficient space present within cache 300 to store an expanded version of CO 306 without altering any cache objects within cache 300, then a reduction in size of one or more of the cache objects in cache 300 may be omitted. However, where all or substantially all of the storage space in cache 300 was occupied prior to CO 306 being accessed, one or more of the cache objects other than CO 306 may undergo a reduction in size to free up space in cache 300 for storage of an expanded version of CO 306. In one or more embodiments, the number of cache object size reduction operations may be reduced where there is a limited number of possible cache object sizes.
  • Limiting the number of cache object size reduction operations preferably operates to reduce the computational burden on a computing system accessing cache 300 and preferably provides for overall computing system efficiency.
  • the term "clearing" may correspond to making data storage space available in cache 300 by reducing the data storage space allocated to one or more cache objects within cache 300.
• the amount of data storage space to be cleared may correspond to the amount of additional storage needed by the expanded cache object over and above the space it occupied prior to its most recent access by a computing system. However, in other embodiments, the amount of space to be cleared may be smaller or greater than the amount of space by which the most recently accessed cache object has increased in size.
  • the space cleared for the most recently used, expanded cache object may be at one end of cache 300, as is illustrated in FIG. 46. However, in other embodiments, the cleared space could be placed at other locations within cache 300.
  • the data storage space to be made available may be provided at the expense of one or more of the cache objects of FIG. 44 other than CO 3 306 (the most recently used cache object). Specifically, it may be possible to provide the needed additional space by reducing the size of just one remaining cache object or by reducing the size of all but the most recently used cache object. Moreover, any number of cache objects in between these two extremes may be caused to shed storage space in favor of the expanded, most recently used cache object. In the following, all of the cache objects other than the most recently accessed cache object are considered to be "eligible for size reduction.”
• the extent of size reduction of the one or more cache objects eligible for size reduction may be selected according to one or more considerations. In one embodiment, the cache objects eligible for size reduction may each shed an equal or substantially equal amount of storage space.
  • the eligible cache objects may shed an equal or substantially equal proportion of their pre-reduction size to clear space for the expanded, most recently used cache object.
  • the extent of size reduction of each cache object may be based on how recently the cache object was last accessed. Specifically, cache objects eligible for size reduction may shed progressively more storage space with decreasing recency of the last access thereof. Thus, under this approach, the most recently used of the cache objects eligible for size reduction may shed a relatively small amount of storage space, and the least recently used cache object may shed a relatively large amount of data storage space, with those cache objects in between these two extremes shedding intermediate amounts of storage space. While the discussion of storage space reduction herein is directed primarily to merely reducing the size of cache objects that are not the most recently accessed, in one or more embodiments, one or more cache objects may be removed from cache 300 to clear data storage space.
  • cache object removal may be practiced either alone, or in combination with cache object data storage space reduction of cache objects that will remain within cache 300.
  • all four cache objects 302, 304, 308, and 310 remaining in cache 300 have been reduced in size to clear data storage space for the writing of CO 306 to the position shown at the rightmost end of cache 300.
  • three or fewer of the four cache objects 302, 304, 308, and 310 eligible for size reduction could undergo size reduction.
  • the method ends at step 620.
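• The branch structure of method 600 (steps 602 through 620) might be expressed as the following sketch; the cache helper names (is_expandable, expand, clear_space, insert_mru, free_space) are illustrative assumptions, with clear_space standing in for whichever of the size-reduction strategies described above is used.

```python
def access_and_update(cache, key):
    """Sketch of the FIG. 47 flow: access a cache object, expand it if it is
    expandable, clear space at the expense of other cache objects, and restore
    the accessed object at the MRU end of the cache."""
    obj = cache.pop(key)                      # steps 602-606: access the indicated object
    if not cache.is_expandable(obj):          # step 608: expandable?
        cache.insert_mru(key, obj)            # step 610: restore unchanged
        return obj                            # step 612: end
    expanded = cache.expand(obj)              # step 614: supplement the stored data
    extra = expanded.size - obj.size
    if cache.free_space() < extra:            # step 616: clear space only if needed
        cache.clear_space(extra - cache.free_space(), exclude=key)
    cache.insert_mru(key, expanded)           # step 618: write at the MRU end
    return expanded                           # step 620: end
```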
  • FIG. 48 is a block diagram of a computing system 900 adaptable for use with one or more embodiments of the present invention.
  • central processing unit (CPU) 902 may be coupled to bus 904.
• bus 904 may be coupled to inventive cache 300, random access memory (RAM) 906, read only memory (ROM) 908, input/output (I/O) adapter 910, communications adapter 922, user interface adapter 916, and display adapter 918.
  • RAM 906 and/or ROM 908 may hold user data, system data, and/or programs.
• I/O adapter 910 may connect storage devices, such as hard drive 912, a CD-ROM (not shown), or other mass storage device, to computing system 900.
  • Communications adapter 922 may couple computing system 900 to a local, wide-area, or Internet network 924.
  • User interface adapter 916 may couple user input devices, such as keyboard 926 and/or pointing device 914, to computing system 900.
  • display adapter 918 may be driven by CPU 902 to control the display on display device 920.
  • CPU 902 may be any general purpose CPU.
• JPEG2000/JPIP (JPEG 2000 Interactive Protocol)
  • the invention provides a method that may include converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values.
  • the plurality of visual features comprise at least one graphical symbol.
  • the visual feature identification data comprises ASCII codes.
  • the visual feature identification data comprises frequency-order positions of the visual features.
  • the visual feature identification data is determined based on a combination of a) a frequency of use of the visual features and b) transition probabilities between successive ones of the visual features.
  • the pixel characteristic data values comprise at least one data type selected from the group consisting of: pixel intensity data values, pixel color data values, and image contrast data values.
  • the pixel characteristic data values comprise pixel color data values.
• the pixel color data values comprise at least one color-data value type selected from the group consisting of: RGB (Red, Green, Blue) pixel color data values, HSV (Hue, Saturation, Value) pixel color data values, and CMYK (Cyan, Magenta, Yellow, Black) color data values.
  • the method further comprises losslessly compressing said image file.
  • the visual feature identification data comprises differences in values of the identification data for successive ones of the visual features.
  • the visual features comprise alphanumeric characters.
  • the method further comprises locating the pixels in said image file in a same order that said alphanumeric characters are ordered in, within an original document.
  • the method further comprises transmitting the image file to a client.
  • the method further comprises decoding at least a region of said image file into the alphanumeric characters for viewing by the client.
  • the method further comprises enabling the client to browse the decoded image file region in a first dimension.
  • the method further comprises enabling the client to browse the decoded image file region in first and second dimensions.
  • browsing in said first dimension includes advancing along a line of the alphanumeric characters
  • browsing in the second dimension preferably includes scrolling over a plurality of lines of the alphanumeric characters.
  • the browsing comprises browsing the decoded image file in a manner that emulates browsing within the original document.
  • the method further comprises correlating locations of the pixels in the image file with locations of the alphanumeric characters corresponding to said pixels in an original document containing the alphanumeric characters .
  • the pixels' locations within the image file at least substantially correspond to locations of the corresponding alphanumeric characters in the original document.
  • the pixels' locations within the image file are substantially the same as locations of the corresponding alphanumeric characters in the original document.
  • the method further comprises transmitting at least a region of the image file from a server to a client.
  • the method further comprises requesting a region of the image file from a server by a client.
  • the method further comprises sending the requested region of the image file by the server.
  • the sending step comprises transmitting compressed data suitable for decoding by the client.
  • one or more embodiments of the present invention provide an apparatus that may include a processor operating under the instructions of a software program, the software program causing the apparatus to perform actions, including converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values.
  • one or more embodiments of the present invention may provide a storage medium containing a software program operable to cause an apparatus including a processor under the instructions of the software program to perform actions, comprising: converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values.
  • FIG. 49 is a text image of the full text of "Ulysses,” using raw ASCII encoding in accordance with one or more embodiments of the present invention, with the color White having a numerical value of 0 and the color Black having a numerical value of 255;
  • FIG. 50 is a text image of the first five chapters of Ulysses, encoded using raw ASCII in accordance with one or more embodiments of the present invention.
  • FIG. 51 is a text image of the first five chapters of Ulysses encoded using frequency-order encoding in accordance with one or more embodiments of the present invention
  • FIG. 52 is a block diagram of a computing system that may be adaptable for use with one or more embodiments of the present invention.
  • One or more embodiments of the present invention may involve extending selectively decompressable image compression and transmission technologies to textual or other data that may be identified using a binary representation.
• binary data that identify and/or describe visual features may be converted from an initial format, such as ASCII (American Standard Code for Information Interchange) or another format, into a format suitable for incorporation into image data, such as but not limited to pixel intensity data.
  • the visual features referred to above may include but are not limited to graphical symbols and/or alphanumeric characters.
  • visual features may include any visible image features that may be described and/or identified using binary data.
  • the present invention is not limited to encoding in this format.
  • the initial data may be converted into several possible types of pixel characteristic data including but not limited to pixel intensity data, pixel color data, image contrast data, and/or other form of image data.
• pixel color data may include but is not limited to Red, Green, Blue (RGB) pixel color data, Hue Saturation Value (HSV) pixel color data, Cyan, Magenta, Yellow, Black (CMYK) pixel color data, and/or other forms of pixel color data.
  • this text may be formatted by putting each chapter in its own column, with columns for successive chapters arranged from left to right.
  • Columns are assumed to have a maximum width in characters, such as 100.
  • FIG. 49 shows the entire text of Ulysses encoded as an image in this fashion, with each pixel within FIG. 49 corresponding to a single text character (or character position that doesn't include text, such as empty space) in the original text document.
  • FIG. 50 shows a text image of the first five chapters of Ulysses, using the same encoding method as in FIG. 49.
• the intensity value of each pixel may be set equal to the ASCII code of the character being encoded in the pixel. Because grayscale pixels and ASCII characters may both be represented using 8-bit sequences (which may both have values in the range 0-255), the correspondence between a pixel value and a character may be readily implemented.
• while textual and other characters may be represented with pixels using the ASCII code as a value for pixel intensity, it will be appreciated that other codes for textual or other characters may be employed for this purpose.
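• A minimal sketch of the raw encoding just described follows, assuming a fixed column width and one 8-bit grayscale pixel per character position; numpy is used for the pixel array, and the function names and layout details are illustrative.

```python
import numpy as np

def text_to_character_pixel_image(text, width=100):
    """Encode text as a grayscale image: one pixel per character position,
    with the pixel intensity equal to the character code (0 = empty space)."""
    lines = text.splitlines()
    image = np.zeros((len(lines), width), dtype=np.uint8)   # empty space encodes as 0
    for row, line in enumerate(lines):
        for col, ch in enumerate(line[:width]):
            image[row, col] = min(ord(ch), 255)              # e.g. the ASCII code as intensity
    return image

def character_pixel_image_to_text(image):
    """Decode the image back to text (empty positions are dropped in this sketch)."""
    return "\n".join("".join(chr(v) for v in row if v != 0) for row in image)
```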
  • the full text of Ulysses in ordinary ASCII representation occupies 1.5MB of storage space, which may be too large to transmit in its entirety over a narrowband communication channel.
  • the pixel- characteristic-data representation of character data (which is known herein as the "character-pixel image” and also as the "text-image") of FIG. 49, encoded as a lossless JPEG2000 image, requires about 2.2MB (Megabytes) of data storage space. This storage space requirement could be significantly reduced if the chapters of the book were more equal in length, resulting in less empty space (encoded as zeros) in the character-pixel image (text-image) .
• more important than the overall file size is the ability of an ordinary JPIP server to serve this file to a client selectively and incrementally. Specifically, of concern here is the ability of a server to serve selected portions of the file at controllable increments of resolution.
  • a client viewing the text at a zoom level sufficient to read the characters can use JPIP (or other suitable protocol) to request only the relevant portion of the text. This operation is efficient, and adequate performance could be achieved for a reader of the text even with a very low bandwidth connection to the server, under conditions that would make it prohibitive to download the entire text, due to the magnitude of data involved in such a download.
  • similar effects may be achieved using a client/server technology specifically designed for selective access to large texts, but the character-pixel image approach described above has a number of advantages over conventional implementations, which are listed below.
  • Access to text or other visual features within a region of interest is preferably two-dimensional in one or more embodiments of the present invention.
  • the character-pixel image approach disclosed herein preferably enables more efficient browsing than is available using existing approaches that deal with text as a one-dimensional string.
  • the character- pixel image is preferably subject to a multi-resolution representation.
• the text, or other visual feature identified using visual feature identification data, will generally not be readable at other than the final (most detailed) resolution.
  • the broader spatial support of lower- resolution wavelets preferably provides an intelligent client- side cache for text, and/or other visual features, near the region of interest. Moving the ROI over various regions of the original text, as may be done when scrolling through text in a traditional manner, may therefore be efficiently performed using one or more embodiments of the present invention.
  • JPEG2000 may be used as a lossy- compression format, meaning that the decoded image bytes are not necessarily identical to the original bytes.
  • decoding refers to converting pixel data in a text image back into the original text data, or other visual feature data. If the image bytes represent text, lossy compression will generally not be acceptable.
  • One of the design goals of JPEG2000 was, however, to support lossless compression efficiently, as this is important for certain imaging functions (such as for medical and scientific applications) .
  • Lossless compression ratios for photographic images are typically only around 2:1, as compared with visually acceptable lossy images, which can usually easily be compressed by 24:1.
• image compression techniques, both lossy and lossless, generally operate best on images that have good spatial continuity, meaning that the differences between the intensity values of adjacent pixels are low.
  • the raw ASCII encoding is not optimal from this perspective, since successive text characters encoded in ASCII may have widely varying values. Thus, some alternatives are considered below.
  • FIG. 51 is a text image of the first five chapters of Ulysses encoded using frequency-order encoding in accordance with one or more embodiments of the present invention.
  • the encoding efficiency may be improved by reordering characters according to their frequency of occurrence in the pertinent text, in the English language as a whole, or in another language as a whole, from highest frequency to lowest frequency.
  • empty space would have code zero, and a pixel value of zero in the character-pixel image.
  • the "space” character could receive code "one” (with its corresponding value in the character- pixel image also being "1") .
• a sequence of characters such as e, t, a, o, i, n, s, r, h, l, etc. could be caused to convert to successive pixel values starting with "2" (corresponding to "e") and proceeding upward therefrom up to the value 255.
• FIG. 51 illustrates this point. Pixel values tend to cluster near zero. At least as importantly, the difference in pixel values between successive character-pixel values in the embodiment of FIG. 51 is preferably lower than in the embodiment of FIG. 50.
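• A sketch of building such a frequency-ordered code from a corpus is given below; empty space maps to 0 and the space character to 1, as described above, while the remaining codes are assigned in descending order of frequency (the function name and corpus file are illustrative assumptions).

```python
from collections import Counter

def build_frequency_order_code(corpus):
    """Map characters to small pixel values: ' ' -> 1, the most frequent other
    character -> 2, the next -> 3, and so on (0 is reserved for empty space)."""
    counts = Counter(ch for ch in corpus if ch != " ")
    code = {" ": 1}
    for value, (ch, _count) in enumerate(counts.most_common(), start=2):
        code[ch] = min(value, 255)        # sketch: clip if the alphabet is very large
    return code

# usage (hypothetical file name): code = build_frequency_order_code(open("ulysses.txt").read())
# each character position is then encoded as code[ch] rather than ord(ch)
```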
  • the encoding algorithm may encode the 0 value with a single "0" bit, and non-zero values with a leading "1" bit (to signal the presence of a non-zero value) followed by an 8-bit representation of the non-zero value.
• this approach conserves 7 bits per pixel for 99 out of 100 pixels, but uses one extra bit to represent a non-zero pixel, which occurs only 1% of the time.
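• Under the stated assumption that roughly 99 of every 100 pixels are zero, the expected cost of this encoding works out to (simple arithmetic, shown only for illustration):

```latex
E[\text{bits per pixel}] \approx 0.99 \times 1 + 0.01 \times 9 = 1.08,
\qquad \text{versus 8 bits per pixel for the raw representation.}
```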
• the decoding algorithm corresponding to the above-described encoding algorithm thus preferably interprets a "0" bit as representing a 0 pixel value, and a bit sequence starting with a "1" bit as having the value represented by the bits succeeding the leading "1" bit.
  • pixel values that occur much more frequently than others may enable considerable savings in storage space, without incurring any loss of pixel data, and by logical extension without incurring any loss of the visual-feature data represented by the pixel data.
  • two or more categories of value occurrence frequency may be established, generally using a progressively increasing number of bits to represent values occurring with decreasing frequency. In this manner, smaller bit sequences may be used most of the time, thereby conserving data storage space and data communication bandwidth requirements.
  • five bits could be used to represent the most frequently occurring pixel values, and nine bits for the less frequently occurring values.
  • a leading bit which may be a "0" may be provided, which may be followed by four bits representing the actual value of the pixel.
  • a leading bit which may be a "1” may be provided, which may be followed by eight bits representing the actual value of the pixel.
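• The two-tier prefix code just described may be sketched as follows (values below 16 take a "0" flag plus four bits, and all other values a "1" flag plus eight bits); representing the bit stream as a character string here is purely for illustration.

```python
def encode_value(v):
    """5 bits for small values (v < 16), 9 bits otherwise (v up to 255)."""
    if v < 16:
        return "0" + format(v, "04b")
    return "1" + format(v, "08b")

def decode_values(bits):
    """Inverse of encode_value applied to a concatenated bit string."""
    values, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            values.append(int(bits[i + 1:i + 5], 2)); i += 5
        else:
            values.append(int(bits[i + 1:i + 9], 2)); i += 9
    return values

# round trip, e.g.: decode_values("".join(encode_value(v) for v in [0, 3, 200])) -> [0, 3, 200]
```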
  • frequency encoding may benefit from spatial coherence to represent a text image using a reduced number of bits.
• the image may be divided into 8x8 pixel blocks, thus providing 64 pixels in each block, with each pixel representing a frequency-encoded visual feature (which may be a text character).
  • the encoding method may then review each block and determine the number of bits needed to represent the highest-valued pixel value in the block. This determined number of bits may then be used to represent all pixel values in that block.
  • the highest-value pixel may be able to be represented with four or fewer bits. Thus, considerable savings in data storage requirements may be obtained when employing this block by block approach.
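• A sketch of the per-block bit budget described above, assuming 8x8 blocks of frequency-encoded pixel values (numpy is used for convenience; the function name is illustrative):

```python
import numpy as np

def bits_per_block(block):
    """Number of bits needed per pixel in a block: the bit length of the
    largest value in the block (minimum of 1 bit so that zeros are representable)."""
    return max(1, int(block.max()).bit_length())

# example: a block whose largest code is 13 needs only 4 bits per pixel,
# so the whole 8x8 block costs 64 * 4 = 256 bits instead of 64 * 8 = 512.
block = np.zeros((8, 8), dtype=np.uint8)
block[2, 5] = 13
print(bits_per_block(block))   # -> 4
```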
  • the file size is 1.6MB, barely larger than the raw ASCII text file (1.5MB), and 37% smaller than the ASCII encoded text-image.
  • the compressed file size can drop well below the ASCII text file size.
  • the further optimizations can include, but are not limited to: using letter transition probabilities (Markov-1) to develop the encoding, instead of just frequencies (Markov-0) ; and/or encoding as pixels the delta or difference between one character and the next, rather than the characters themselves .
  • One or more embodiments of the present invention discussed herein include using JPEG2000/JPIP as a selective image decompression technology, but the present invention is not limited to using this image compression technology.
  • Other image compression formats and protocols may be employed in conjunction with the present invention, including but not limited to, for example, LizardTech's MrSID format and protocol, which has similar properties.
  • FIG. 52 is a block diagram of a computing system 1400 adaptable for use with one or more embodiments of the present invention.
  • central processing unit (CPU) 1402 may be coupled to bus 1404.
• in addition, bus 1404 may be coupled to random access memory (RAM) 1406, read only memory (ROM) 1408, input/output (I/O) adapter 1410, communications adapter 1422, user interface adapter 1416, and display adapter 1418.
  • RAM 1406 and/or ROM 1408 may hold user data, system data, and/or programs.
  • I/O adapter 1410 may connect storage devices, such as hard drive 1412, a CD-ROM (not shown) , or other mass storage device to computing system 1400.
  • Communications adapter 1422 may couple computing system 1400 to a local, wide-area, or Internet network 1424.
  • User interface adapter 1416 may couple user input devices, such as keyboard 1426 and/or pointing device 1414, to computing system 1400.
  • display adapter 1418 may be driven by CPU 1402 to control the display on display device 1420.
  • CPU 1402 may be any general purpose CPU.
  • the present invention provides a method that may include establishing communication between a first computer and a second computer over a communication link, the second computer having an image collection stored therein in the form of compressed image data; selecting a plurality of images in the collection for communication to said first computer; and transmitting low-resolution image data for all of the selected images from the second computer to the first computer before transmitting full-resolution image data for any of the selected images .
• FIG. 53 is a block diagram of a system that may be connected to enable communication of image data between a plurality of computers in accordance with one or more embodiments of the present invention.
  • FIG. 54 is a block diagram of an image having at least two regions of interest therein in accordance with one or more embodiments of the present invention.
  • FIG. 55 is a block diagram of a "virtual book” that employs aspects of the technology disclosed herein in accordance with one or more embodiments of the present invention.
  • FIG. 56 is an illustration of a three-dimensional version of the virtual book of FIG. 55 in accordance with one or more embodiments of the present invention.
  • FIG. 57 is block diagram of a system for managing image data communication between one or more portable devices and one or more other computers in accordance with one or more embodiments of the present invention
  • FIG. 58A illustrates the results of an incomplete image data download employing an existing approach
  • FIG. 58B illustrates the results of an incomplete image data download in accordance with one or more embodiments of the present invention
  • FIG. 59 is a block diagram of a "common space" that may include a physical display (screen) and two virtual displays in accordance with one or more embodiments of the present invention
  • FIG. 60 illustrates a collection of over one thousand images (a collection of digitized maps of various sizes) packed into a montage in accordance with one or more embodiments of the present invention
  • FIG. 61 illustrates a snapshot of about three thousand images that have been dynamically re-arranged into a random configuration in accordance with one or more embodiments of the present invention.
  • FIG. 62 is a block diagram of a computer system that may be adaptable for use with one or more embodiments of the present invention.
  • FIG. 53 is a block diagram of a system 170 that may be connected to enable communication of image data between a plurality of computers in accordance with one or more embodiments of the present invention.
  • System 170 preferably includes client computer 182 which is connected to display 184 and data storage device 186.
  • System 170 preferably also includes server computer 188 which may be connected to data storage device 190.
  • Server computer 188 may also be connected to the Internet 180.
  • image data may be communicated between a plurality of computers 182, 188 so as to enable viewing of large collections of potentially large images using a relatively low-bandwidth connection therebetween.
  • desirable viewing and navigation of images stored at server computer 188 may be accomplished by transmitting selected portions of the image data stored at server computer 188 at controllable levels of resolution.
  • the selectivity of image data 114 may be such as to select a particular image at high resolution, or even a selected portion of a particular image at high resolution.
  • various embodiments are discussed that include varying the types of devices used as client computer 182 and server 188, the types of image data 114 transmitted therebetween, and various applications of the ability to transmit selected image data at specified levels of resolution.
  • FIG. 54 is a block diagram of an image 250 having at least two regions of interest 252, 254 therein in accordance with one or more embodiments of the present invention.
  • Image 250 could be a subset of image data 114.
  • image data 114 could represent a subset of image 250, depending upon what image data is requested by client computer 182.
  • image 250 may be stored in compressed form on server computer 188, or within storage device 190.
  • data for a plurality of resolution levels for various regions of image 250 may be stored and may be requested for downloading by client computer 182.
• the resolution level at which a particular image or region of an image is stored on client computer 182 may be readily increased or decreased. Where a prior download results in storage of a region or image at a first resolution level (which may be less-than-full resolution), this first resolution level may be increased by adding data representing the next higher level of resolution, preferably without having to discard the data representing the first resolution, thereby avoiding redundancy and increasing the efficiency of the image data communication contemplated herein. Conversely, the resolution level of a region or image stored at client 182 may be decreased by discarding the highest level of resolution stored therein, without losing data corresponding to the lower levels of resolution for the same region or image. Such resolution reduction may be practiced at client 182 to clear data storage space needed for one or more regions or images other than the one for which data is being discarded.
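• The add-a-level / drop-a-level bookkeeping described above may be sketched as follows, assuming each resolution level of a region arrives as an independent chunk of compressed data; the class and method names are illustrative assumptions, and the transport could be JPIP or any comparable protocol.

```python
class RegionLevels:
    """Per-region store of resolution levels: raising the resolution appends a
    new chunk without retransmitting lower levels; lowering it discards only
    the highest chunk, keeping the coarser levels intact."""

    def __init__(self):
        self.chunks = []                 # chunks[0] = coarsest resolution level

    def raise_resolution(self, fetch_level):
        next_level = len(self.chunks)
        self.chunks.append(fetch_level(next_level))   # only the new level is transmitted

    def lower_resolution(self):
        if self.chunks:
            return self.chunks.pop()     # space freed; coarser levels are retained
```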
  • the pertinent image compression may be provided by, for instance, the use of JPEG2000 or another discrete wavelet transform-based image compression scheme.
  • the present invention is not limited to the use of any particular compression format or image data representation.
  • Other formats may be employed, including image formats whose sizes in bytes are not substantially smaller than the uncompressed image data. It is merely preferable that the selected image format be susceptible to multiscale representation and storage of image data.
  • client computer 182 may seek to download one or more regions of image 250, where such regions may be portions of image 250.
  • the one or more regions of interest 252, 254 may be the only ones that client computer 182 seeks to download.
  • client computer (client) 182 may merely seek to download one or more selected regions at higher resolution than the resolution at which the remainder of image 250 is downloaded.
  • client 182 may request a download by identifying both a specified region of image 250 to download and a resolution level at which this specified region will be provided by server computer (server) 188.
• client 182 preferably requests a download of all of image 250 at low resolution. (The exact resolution level at which the bulk of image 250 is downloaded is not pertinent to this discussion.) However, client 182 seeks to download region of interest 1 252 at a higher resolution, or even at full resolution. Accordingly, client 182 preferably specifies the coordinates and the desired resolution level of region of interest 1 252 to server 188. Thus, in addition to downloading the bulk (including that portion external to region of interest 1 252) of image 250 at low resolution, client 182 preferably downloads region of interest 1 252 at the specified higher resolution. In other situations, client 182 could seek to download only the region(s) of interest and omit a download of the remainder of image 250.
  • a user of client computer 182 may view region of interest 1 252 at high resolution without having to download the entirety of image 250 at this high resolution.
• a relatively low-bandwidth data communication link between client 182 and server 188 could nevertheless transmit the entirety of image 250 while providing a region of particular interest (in this case, region of interest 1 252) at high resolution. This provides the viewer with the same viewing experience, with respect to the region of interest, that would have occurred had client 182 downloaded the entirety of image 250 at the high resolution, an option that would demand considerably more download time and data storage space at client computer 182 or data storage device 186.
• Shifting the Region of Interest
  • a user of client computer 182 may wish to pan across image 250. Normally, panning from one region of interest 252 to another 254 would involve having both regions downloaded at client 182 at the level of resolution at which the regions will be viewed. Moreover, generally, all image territory in between region of interest 1 252 and region of interest 2 254 would be stored at client computer 182 to enable the described panning to occur. As described in the following, in one or more embodiments of the present invention, viewing of such regions of interest 252, 254 may be accomplished by downloading much less data and using less storage space at client computer 182 than in the approach described above.
  • client 182 may shift from a high resolution view of region of interest 1 252 to region of interest 2 254.
  • image data corresponding to a low-resolution representation of region of interest 2 254 is already present in client computer 182 from the download of image 250, discussed above.
  • all that is needed is to supplement the existing image data for region of interest 2 254 with additional image data describing the pertinent higher levels of resolution to arrive at a high- resolution rendition of region of interest 2 254 at client computer 182.
  • image data representing the higher resolution levels of region of interest 1 252 may be discarded or overwritten to make space in data storage device 186 or other data storage space for the additional image data to be downloaded for region of interest 2 254.
  • the shift in view from region of interest 1 252 to region of interest 2 254 may be accomplished gradually, to provide a viewer of display 184 with a viewing experience that may closely simulate that available on a computer that has the entirety of image 250 downloaded at high resolution. Specifically, the level of resolution at which region of interest 1 252 is displayed may be reduced gradually to the resolution level at which most of image 250 is represented. Thereafter, the view on display 184 may present a gradual pan across the low-resolution territory in between region of interest 1 252 and region of interest 2 254.
  • region of interest 2 254 may increase toward a high-resolution rendition of region of interest 2 254 either after completing the panning across image 250 or concurrently with the latter portion of this panning operation.
  • region of interest 2 254 may be stored in client computer 182 at high resolution and may be displayed on display 184 at this high resolution.
  • FIG. 55 is a block diagram of a "virtual book" 350 that employs aspects of the technology disclosed herein in accordance with one or more embodiments of the present invention.
  • Virtual book 350 may include display 352, backward cache 354, and forward cache 356. While the caches 354, 356 are each shown having two pages stored therein, any number of pages may be stored in either of caches 354 and 356.
• the embodiment of FIG. 55 employs the ability to present selected image data at controllable levels of resolution for the particular case of virtual book 350.
  • each image may be a page within display 352 of virtual book 350.
  • Display 352 may correspond to display 184 of FIG. 53 or may be a special purpose display that accommodates the specific features of virtual book 350.
  • Virtual book 350 may correspond to client computer 182 of FIG. 53, or may be a special purpose computer that is substantially limited to communicating, storing, and displaying pages of books .
  • virtual book 350 may include only one page that is stored and/or displayed at full resolution, with other pages, both earlier and later in the sequence of pages displayed, at a variety of other resolutions.
• the page currently displayed on display 352 (i.e. the active page) is preferably stored and/or displayed at full resolution.
  • other pages may be displayed at progressively lower resolutions with increasing distance in pages from the active page.
• the resolution at which each page is stored may equal the resolution of the active page being displayed in display 352, divided by the quantity equal to 2 raised to a power equal to the number of pages between each stored page and the active page.
  • page 11 (in forward cache 356) and page 9 (in backward cache 354) may each occupy one half the amount of data storage space occupied by the active page in display 352.
  • page 12 (in forward cache 356) and page 8 (in backward cache 354) may each occupy one quarter the amount of data storage space occupied by the active page in display 352.
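• In formula form, the factor-of-two allocation just described can be written as follows, with S the storage used by the full-resolution active page and d the distance in pages from the active page (an illustrative restatement only):

```latex
\text{storage}(d) = S \cdot 2^{-d},
\qquad \sum_{d=1}^{\infty} 2\,S \cdot 2^{-d} = 2S,
```

where the factor of 2 inside the sum accounts for pages cached in both the forward and backward directions; the cached pages other than the active page therefore occupy at most about twice the storage of the active page itself.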
  • a new active page may be selected in place of page 10 which is shown displayed in FIG. 55.
  • the new selected page may, but need not, be a page immediately adjacent page 10 (either page 9 or page 11) .
  • any page from 1 to the last page in the pertinent book may be the new active page .
• a transition between the currently active page and the new active page is preferably conducted. This transition to a new active page may include acquiring additional image data for the new active page to enable the new active page to be stored and/or displayed at full resolution. If the new active page is "page 11", and the "factor-of-two" embodiment, discussed above, is employed, the amount of data storage space allocated to page 11 will preferably double.
  • the data storage space allocated to page 10 will preferably be halved as part of the transition away from page 10 and toward page 11, as the active page.
• the data for the active version of page 10 that is not included in the post-transition page 10 may be discarded (which may include overwriting thereof).
  • this "surplus" data for page 10 may be stored in another cache. Such caching of the page-10 surplus data may provide efficiency if a transition to page 10 occurs soon after (i.e. within a reasonable number of page transitions) the transition away therefrom.
• the transition from page 10 to page 11 may include a gradual fade-out from page 10 and gradual fade-in of page 11, to provide an experience that is visually pleasing and/or reminiscent of a physical page transition to the user of virtual book 350.
• a sequence of images showing the folding and turning of the old active page may be provided to make the virtual page transition look still more reminiscent of a physical turn of a page.
  • FIG. 56 is an illustration of a three-dimensional version of the virtual book of FIG. 55 in accordance with one or more embodiments of the present invention.
  • the embodiment of FIG. 56 illustrates a method in which an alpha channel, for partial transparency (the rough edges) may be stored as image information in addition to the red, green and blue color components. While color components are discussed above, for the sake of convenience, only a black and white rendition of the image of FIG. 56 is provided herein.
• hardware-accelerated texture mapping may be employed to support an alpha channel.
• Another feature that may be practiced in connection with either two-dimensional or three-dimensional embodiments of the virtual book is dynamic deformation of images, e.g. bending the pages of this book as they turn, as illustrated in FIG. 56.
  • One or more embodiments of the present invention may provide a method which may include providing a collection of digital images or other visual objects on a server; establishing communication between a client and said server; and enabling efficient multi-scale navigation by the client of collections of visual objects residing on the server.
  • digital image data may include digital photographs, digital images, visual documents or other forms of visual content.
• the term "image" generally corresponds to the term "digital image," and either of these terms may correspond to a "digital photograph."
• the term "client" generally corresponds to the term "client side" and to the term "client device".
• a "digital image capturing device" may include but is not limited to a digital camera, a camera-enabled mobile phone (which may be referred to as a camera-enabled cell phone), a personal digital assistant, and/or a digital video recorder able to record digital still images.
  • a “digital image capturing device” may include devices that are capable of receiving image data by directly optically receiving and recording such data (such as with a standard digital camera) and may also include devices that are able to receive image data via a wired or wireless Internet or other network connection.
  • One or more embodiments of the methods described herein may use a multi-resolution approach to address the problems of storing, synchronizing, browsing, and organizing collections of digital image data, which may be visual documents.
• One or more of the methods described herein may also apply to visual data objects other than images, such as the roadmap or other vector data of Applicant reference document 489/17NP (U.S. Patent Application Serial No. 11/082,556) or the textual data of Applicant reference document 489/13 (U.S. Provisional Patent Application Serial No. 60/617,485). (Both documents are identified in greater detail at the beginning of this document, and both documents are incorporated by reference herein).
  • a problem facing users of existing systems is that the camera devices can quickly create large numbers of potentially large visual documents.
  • these devices typically don't have sufficient memory or visual browsing facilities to allow satisfactory archiving, viewing, or organization of these documents.
  • Digital photographs or other digital image data stored in a camera or other portable device are generally downloaded periodically to a desktop or notebook computer, cleared from the camera's memory to allow more pictures to be taken, and organized and/or viewed on the desktop or notebook computer. Thereafter, digital photographs may be shared with friends by posting a selection of digital photographs to one or more Internet sites .
• a mobile device, which may be a digital camera or other digital image data capturing device, takes pictures. Then, potentially after some culling of the pictures, the pictures are downloaded to the camera user's PC.
  • the camera device's local storage may be limited and, in this conventional approach, only holds images transiently, until they are safely stored on the PC.
  • the PC may permanently retain in its memory (e.g. hard disk drive or other non-volatile storage) any subset of the digital photos.
  • the user may in turn upload some further culled subset of those images to a web server which may be owned by a web photo publishing service, typically at reduced resolution.
• the images uploaded may be made publicly viewable by any third party using a web browser on a PC or other device, or by some subset of those users with restricted access.
  • Limitations of the existing approach may include lengthy download times from the camera device to the PC.
• FIG. 57 is a block diagram of a system 550 for managing image data communication between one or more portable devices 562, 572 and one or more other computers in accordance with one or more embodiments of the present invention.
  • System 550 may include a client side 560 and a server side 570. However, in alternative embodiments, the client and server statuses of the groupings of devices shown in FIG. 57 may be reversed.
  • system 550 may include portable device 1 562, portable device 2 572, personal computer 182 (which may be substantially the same as client computer 182 of FIG. 53), server 188 (which may be substantially the same as server computer 188 of FIG. 53) and/or additional computers 574.
• each of devices 562, 572 and computers 182, 188, and 574 has memory and one or more displays included therewith.
  • the devices and computers of FIG. 57 could be in communication with memories and/or displays.
  • FIG. 57 illustrates various possible data paths useable in accordance with one or more embodiments of the present invention. One or more embodiments may use less than all of the data paths shown in FIG. 57.
  • the available data paths shown in FIG. 57 may have one or more of the following features in common: 1) The data paths may involve server side
  • both the client side 560 and the server side 570 may include one or more digital computing and/or storage devices including but not limited to: camera devices, personal computers, and personal digital assistants.
  • a client device may have one or more displays.
  • the client may browse a collection of documents residing on the server using one or more of the efficient multi-resolution browsing methods described in
  • one of the properties of this navigation method is that the display contents may gradually come into focus as information is sent from the server to the client.
  • the rate at which this information comes into focus may be governed by the ratio of connection bandwidth to display pixels.
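• As illustrative arithmetic only (these figures are assumptions, not values from the disclosure), the ratio described above can be computed directly:

```python
# At 20 kilobits per second feeding a 320x240 virtual display, roughly a
# quarter of a bit of refinement arrives per display pixel each second.
bandwidth_bits_per_second = 20_000
display_pixels = 320 * 240
print(bandwidth_bits_per_second / display_pixels)  # ~0.26 bits per pixel per second
```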
  • a client's "display” need not necessarily be physical, or visible to an end-user.
  • this display can be a "virtual display", i.e. an abstract model of a display with a specified resolution.
• Such a "virtual display" might be represented as an array of pixel values in the client's memory, irrespective of whether those pixel values are rendered to a screen.
  • a virtual display may include wavelet data that at least partially describes one or more images. The wavelet data is preferably able to represent an image at a range of possible resolutions. In one or more embodiments, the wavelet data may correspond to that employed using JPEG2000. In one or more embodiments, a virtual display may include enough wavelet data to completely describe one or more images.
  • this device could create a "virtual display" of the appropriate size, establish a connection with a server, and request a view of the entire collection. The full set of thumbnails could then be transmitted to and rendered on this "virtual display". If transmission were interrupted before all of the relevant data were sent from the server to the client, then the client's virtual display would not yet have all of the thumbnail images in perfectly focused condition. However, all of the requested thumbnail images would preferably be stored within the client's virtual display with sufficient resolution to enable rendering visible versions of these images on a screen. The images rendered in the manner described would generally be of lower visual quality than if the transmission of the image had concluded without interruption. Thus, some image degradation may be present in the images rendered using data from an incomplete, interrupted transmission.
  • FIG. 58 illustrates this difference.
  • FIG. 58A illustrates the results of an incomplete image data download employing an existing approach; and
  • FIG. 58B illustrates the results of an incomplete image data download in accordance with one or more embodiments of the present invention.
• FIG. 58A shows a prior art scenario in which all of the data for three thumbnails (shown with square shapes) have been received, and in which the remaining nine thumbnails (shown with X's) have not been received at all.
• FIG. 58B illustrates a situation that may arise employing one or more embodiments of the present invention, in which all twelve thumbnails have received at least some image data and may therefore all be rendered, albeit at reduced quality where the transmission was incomplete.
• a client may have a client-side cache that caches recently viewed visual content.
  • a standard MRU (most-recently-used) cache may be employed for the caching needs for one or more embodiments of the present invention.
  • a cache disclosed in U.S. Patent Application Serial No. 11/141,958 (client reference document 489/10NP) , entitled “Efficient Data Cache", which is incorporated herein by reference, may be beneficially employed to enable more sophisticated client-side caching. In either case, a given amount of client-side memory may be devoted to the cache.
  • navigation back to a recently viewed image may permit using image data stored in the cache, rather than requiring that this image data be re-sent from the server.
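• A minimal sketch of such a client-side cache of recently viewed content appears below; it bounds memory and keeps the most recently used items, evicting the least recently used when full. It is only an illustration, not the cache of the cited application, and all names are assumptions.

```python
from collections import OrderedDict

class RecentlyUsedCache:
    """Bounded cache that retains the most recently used items."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, key: str, data: bytes) -> None:
        if key in self._items:
            self.used -= len(self._items.pop(key))
        self._items[key] = data
        self.used += len(data)
        while self.used > self.max_bytes and len(self._items) > 1:
            _, evicted = self._items.popitem(last=False)  # drop least recently used
            self.used -= len(evicted)

    def get(self, key: str):
        data = self._items.get(key)
        if data is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return data
```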
  • a client may have multiple displays.
  • a given display may be physical or virtual.
  • a given display may be driven directly by user input, or it may be driven programmatically by software within a client computer such as computer 182.
  • the total size in pixels of all of the displays may be fixed or bounded by some limit, and this limit may define a minimum amount of client-side memory needed for visual content.
  • This client-side memory is preferably separate from storage space allocated to the cache memory.
  • a physical display within a client device is visible to the user and allows zooming and panning navigation through, and rearrangement of, a collection of digitally stored images.
  • the user may also select one or more images from the collection and send them to a "holding pen" which may serve as a place for storing user-selected images.
  • the holding pen may be visualized in some way on the physical display. Adding an image to the holding pen preferably causes the image to be placed on the virtual display, which may be invisible to the user. As images are added to the holding pen, the virtual display representing the holding pen gradually fills up.
  • This virtual display may increase in size (as measured in number of pixels) up to some limit, after which its size may remain fixed at this limit.
  • the virtual display may be too small to display all of the images in the holding pen at full resolution.
  • the data storage space needed for the images resident in the virtual display is preferably reduced as needed to fit the images into the virtual display.
  • an off-screen view (the virtual display) preferably gets supplemented with images as the user puts viewable images into the holding pen. This supplementing of the off-screen view may occur invisibly to the user.
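• A minimal sketch of a holding pen backed by a bounded virtual display follows; adding images grows the off-screen pixel budget up to a cap, beyond which a uniform scale factor shrinks the held images so they still fit. The cap and names are illustrative assumptions.

```python
MAX_VIRTUAL_PIXELS = 4_000_000  # assumed cap on the virtual display's size

class HoldingPen:
    def __init__(self):
        self.images = []  # (name, full_width, full_height)

    def add(self, name: str, width: int, height: int) -> None:
        self.images.append((name, width, height))

    def scale_factor(self) -> float:
        """Uniform linear scale applied so every held image fits within the cap."""
        needed = sum(w * h for _, w, h in self.images)
        if needed <= MAX_VIRTUAL_PIXELS:
            return 1.0
        return (MAX_VIRTUAL_PIXELS / needed) ** 0.5

pen = HoldingPen()
pen.add("a.jpg", 3000, 2000)
pen.add("b.jpg", 3000, 2000)
print(round(pen.scale_factor(), 3))  # images shrink once the cap is exceeded
```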
• the method disclosed in that document for determining the order in which information is sent from the server to the client based on the client's view may be modified for a multiple display scenario.
  • the 489/2NP document discloses that visual information may be broken up into tiles, with each tile covering a region in space at a given resolution. Low-resolution tiles may then occupy large physical areas, while high-resolution tiles may occupy smaller physical areas, such that the amount of information in each tile may be substantially the same.
  • the 489/2NP document discloses methods for ordering tiles using criteria discussed in the following.
  • One criterion may be tile resolution and tile position on the display. Sorting of tiles could be lexicographic, such that lower-resolution tiles always precede higher-resolution tiles, with spatial position only playing a role in resolving order within a resolution.
• Lexicographic sorting is referenced here in the generalized tuple sense; for example, the lexicographic sort of the set of triplets {(1,2,3), (0,3,1), (4,0,0), (0,0,1), (0,3,2)} would be (0,0,1), (0,3,1), (0,3,2), (1,2,3), (4,0,0).
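• For reference, the ordering in the example above is exactly what built-in tuple comparison produces; a one-line check in Python:

```python
triplets = [(1, 2, 3), (0, 3, 1), (4, 0, 0), (0, 0, 1), (0, 3, 2)]
print(sorted(triplets))
# [(0, 0, 1), (0, 3, 1), (0, 3, 2), (1, 2, 3), (4, 0, 0)]
```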
  • non-lexicographic sorting criteria may be employed.
• For instance, a linear combination of a plurality of properties could be used to sort tiles. Such properties may include but are not limited to: resolution (which could be expressed in logarithmic units) and distance of the tile from the center of the display.
• the term "sort key" generally corresponds to the term "sorting criterion."
• lower-resolution tiles may be sent in preference to higher-resolution tiles, and tiles near the center of the display may be sent in preference to tiles near the periphery, but these properties can trade off against each other.
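• A hedged sketch of such a non-lexicographic ordering appears below: each tile's priority is a weighted sum of its (logarithmic) resolution and its distance from the display center, so a coarse tile at the edge can precede a fine tile at the center. The weights and field names are assumptions made for illustration.

```python
import math

def tile_priority(resolution: float, distance_to_center: float,
                  w_res: float = 1.0, w_dist: float = 0.5) -> float:
    """Lower values are sent first."""
    return w_res * math.log2(resolution) + w_dist * distance_to_center

tiles = [
    {"id": "coarse-center", "resolution": 1, "distance_to_center": 0.0},
    {"id": "fine-center", "resolution": 8, "distance_to_center": 0.0},
    {"id": "coarse-edge", "resolution": 1, "distance_to_center": 4.0},
]
send_order = sorted(tiles, key=lambda t: tile_priority(t["resolution"], t["distance_to_center"]))
print([t["id"] for t in send_order])  # ['coarse-center', 'coarse-edge', 'fine-center']
```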
  • display number can be added as an extra lexicographic sort key.
  • a first display might refine completely (in accordance with the other sort keys) before any tiles are sent relevant to a second display.
  • display number can be an additional variable for inclusion in a linear combination, allowing display number to trade off in some fashion against resolution and proximity to the center of the display.
• the displays can coexist in an imaginary "common space".
  • the "common space” is a notional space establishing an imaginary spatial relationship between multiple displays, as if they were regions of a single, larger display. Defining this imaginary spatial relationship determines all parameters needed for prioritizing tiles in the multiple displays.
• FIG. 59 is a block diagram of a "common space" 740 that may include a physical display (screen) 742 and two virtual displays V1 744 and V2 746.
  • the physical display 742 is preferably in the center of "common space" 740 at normal size.
• Virtual displays V1 744 and V2 746 are preferably off to the side, and V2 is preferably scaled down, so that its pixels are preferably half the linear size of the physical display's pixels. This means that, assuming purely lexicographic tile sort order, the contents of each resolution level in V1 744 will preferably be sent from the server to the client after the corresponding resolution for the physical display (since V1 is farther from the center of the space than any point on the physical display).
• Resolutions in V2 746 may be sent after all the tiles at a resolution twice as fine have been sent for both the physical display 742 and V1 744. It is noted that it isn't necessary for the "common space" 740 to correspond to any real larger display or memory address space.
• the "common space" 740 is merely a conceptual convenience for establishing the relationships among tile priorities across different displays. Clearly many tradeoffs are possible. These tradeoffs can have the consequence, as in the lexicographic example above, of giving refinement of the physical display 742 the highest priority, while using any excess time and bandwidth not required for bringing the physical display into focus to continue refining the virtual display(s) 744, 746.
• the tradeoffs may alternatively begin refining the virtual display(s) after the physical display has largely, but not completely, come into focus. After the physical display 742 has largely come into focus, the physical display and the virtual displays 744, 746 can share bandwidth resources to refine in concert.
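• The sketch below illustrates, under stated assumptions, how placing displays in such a "common space" can determine tile ordering: each display gets an offset and a scale, a half-scale display effectively pushes its tiles one resolution level later, and tiles from all displays are then sorted lexicographically by effective level and distance from the center of the space. Class and field names are illustrative, not from the disclosure.

```python
import math
from dataclasses import dataclass

@dataclass
class Display:
    name: str
    offset: tuple  # (x, y) of the display's top-left corner in common space
    scale: float   # 1.0 for the physical display, 0.5 for a half-size virtual one

@dataclass
class Tile:
    display: Display
    level: int     # resolution level within its own display (0 = coarsest)
    center: tuple  # tile center in display-local pixel coordinates

def common_space_key(tile: Tile, space_center=(0.0, 0.0)):
    effective_level = tile.level + math.log2(1.0 / tile.display.scale)
    x = tile.display.offset[0] + tile.center[0] * tile.display.scale
    y = tile.display.offset[1] + tile.center[1] * tile.display.scale
    distance = math.hypot(x - space_center[0], y - space_center[1])
    return (effective_level, distance)  # lexicographic: level first, then distance

physical = Display("screen", offset=(-400, -300), scale=1.0)
v2 = Display("V2", offset=(900, -300), scale=0.5)
tiles = [Tile(physical, 1, (400, 300)), Tile(v2, 1, (400, 300)), Tile(physical, 2, (400, 300))]
for t in sorted(tiles, key=common_space_key):
    print(t.display.name, "level", t.level)
# screen level 1, then screen level 2, then V2 level 1
```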
  • any subset of the data for a given image can itself comprise a JPEG2000 image file.
  • the client may progressively download image data from the server, thereby supplementing the quality of the client's subset of the image and giving the client the ability to create a JPEG2000 file that is an increasingly accurate approximation of the full image.
  • JPEG2000 can be extended to other multi-resolution document types as well. If the client never zoomed in beyond a given resolution, then no information would be available regarding the image content beyond that given resolution. In this case, the version of the JPEG2000 image which may be created and/or stored by the client may have a lower overall resolution than the original version of that image.
  • the camera or camera-enabled mobile device may operate as the server, and a PC may operate as the client.
  • the PC can rapidly browse through the complete set of images available on the camera. During navigation, a group of images can be selected and put in the holding pen. Note that if all images on the camera are to be downloaded to the PC in their entirety, then the total time needed to accomplish the transfer remains the same as in the prior art.
  • this method can provide a number of advantages over the conventional serial download of images which are listed and discussed below. The present invention is not limited to the features listed below.
  • Image download and user navigation of the full image set on the camera or other mobile device may be concurrent and cooperative in their use of bandwidth (in effect, navigation merely influences the order in which tiles are sent from server to client) .
• if the PC's display is larger than the mobile device's display, then better choices can be made about which images to download, which to leave on the mobile device, and which to discard, without incurring the delay of downloading the entire set before deciding.
  • the experiences of browsing on the PC and on the mobile device are preferably simple and experientially similar, thereby increasing usability.
  • the amount of memory allocated to photos on the PC can be bounded. Also, different constraints can be placed on different photos, and hence space can be allocated based on recency or one or more other criteria.
  • premature loss of connectivity results in a degradation in the quality of some or all of the images to be downloaded, instead of completely removing some images from the download operation.
• the bulk of the data volume for an image is very high-resolution detail, some of which is camera noise, and all of which is less critical for ordinary viewing than the coarser image structure.
  • Hybrid prioritization of the image data is also possible, for example, favoring the complete download of a subset of the photos before proceeding to refine a second set beyond thumbnail detail.
  • one or more methods disclosed herein are resilient to intermittent connectivity, since any JPEG2000 object can continue to be augmented at any time with additional information while still allowing browsing and interaction with whatever visual data has already been received.
  • checkmarks or green dots can appear next to images as they finish downloading. When all images in the "holding pen” include green dots, the connection can be broken without loss.
  • Operations such as requesting that the camera discard some of its images using the client computer may benefit from some additional communication from the client to the server beyond that contemplated in Applicant reference document 489/15P.
  • the client side could also instruct the server side (which may be a mobile device such as a digital camera or mobile phone) to launch its own client side, and create its own view to receive content from the PC.
  • the PC can render the camera/mobile phone's "view” of content on the PC, thus (for example) displaying the green completion dots described above for images uploaded from the PC to the camera.
  • Each of the reciprocal arrows of FIG. 57 can be implemented using either a “push” or “pull” arrangement. Specifically, the viewport setting, the arrangement, and other navigation settings may be controlled from either the client side 560 ("pull") or from the server side 570 ("push") . A user interacting with one device can be connected reciprocally to another device, thereby enabling both "pulling” and “pushing” to occur simultaneously.
• Referring to FIG. 57, we will now enumerate the potential client-server connections shown therein, and describe briefly how they can be used and why they are useful.
• a mobile device 562, which may be a camera or camera-enabled mobile phone, may serve content to a user's PC.
  • This connection might typically take place over a USB cable or a Bluetooth ad-hoc wireless network.
• the PC 182 may serve content back to the mobile device 562. This can be useful for the following applications, among others.
  • the PC may be a home appliance without a display, and the mobile device may then be used as a primary visual interface to the archived visual material.
  • the mobile device in this context may be a digital camera, a camera-enabled cell phone, a PDA, or a mobile tablet PC with a display.
• a first mobile device can be connected directly to, or form an ad-hoc network with, another mobile device (the "guest"). The two mobile devices can then view and share each other's photos.
  • the PC could upload images (via push) to a remote server.
  • the server may be a photo sharing service, and may therefore implement the kind of space constraints envisioned in the above processes of reducing the size of the item on the physical display and bounding the amount of memory allocated to photos on the PC.
  • the remote server could then serve its collection to one or more additional PCs. Typically this would be a broadband connection. However, other connection types could be employed.
• the remote server could also serve collections to mobile device users. Typically this would be a mobile wireless wide-area network.
  • Mobile devices could upload their images via "push" (that is, under control of the mobile devices) to a remote server.
  • the upload may be automatic, allowing the mobile device to transparently extend its apparent storage space by transferring content freely to a server and deleting it locally when transfers are complete.
  • local caching on the mobile device 562 may allow the mobile device 562 to support browsing through very large thumbnail collections using only local storage, even if the local storage is limited. Zooming in on details of recently viewed images may also be possible, if the relevant information is still in the mobile device's local cache.
• zooming in on images whose details are only available on a remote server could result in a blurry and un-detailed image. If the mobile device is on a network that includes the remote server 188, however, the blurry image can become progressively more refined as more and more detailed image data is downloaded to the mobile device 562. If the mobile device is not connected to a network that can supply additional image data, the image may not be presented with any greater detail than is available in the initial thumbnail image.
• One or more embodiments of the present invention may define precomputed steps and interactive rendering algorithms which can be used in a variety of configurations to implement downloading of selected images and/or image regions at controllable levels of resolution for various applications. Many of these applications (such as focusing on regions of interest, the virtual book, etc.) may involve user interaction with a "universe" of images.
  • the starting point for precomputation may therefore be a list of the filenames, URLs, or other strings referencing the individual images. When a user is zoomed out far enough to view all of these images at once, it is impractical for either the client or the server to traverse all of the image files, as there may be a very large number of them.
  • a montage may be a mosaic or collage of all of the images, rendered at low resolution and packed efficiently into a rectangular area, as shown in FIG. 60.
• Auxiliary metadata, which can be embedded in the montage image file or stored separately, may associate rectangular regions on the montage image with particular image files.
  • the montage image itself can be navigated using a zooming and panning interface.
• when the user zooms in on a particular image, the metadata for that image may refer the client to one or more individual image files, and the client may use imagery from these image files to render the images at higher resolution.
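• A minimal sketch of such montage metadata follows: a table maps each source image file to the rectangle it occupies in the packed montage, and the rectangles intersecting the current viewport tell the client which individual image files to open. The file names and rectangle format are illustrative assumptions.

```python
# (x, y, width, height) of each image's region within the montage, in montage pixels
montage_index = {
    "photos/img_0001.jp2": (0, 0, 256, 192),
    "photos/img_0002.jp2": (256, 0, 256, 192),
    "photos/img_0003.jp2": (0, 192, 512, 384),
}

def images_in_view(viewport):
    """Return the image files whose montage rectangles intersect the viewport."""
    vx, vy, vw, vh = viewport
    hits = []
    for filename, (x, y, w, h) in montage_index.items():
        if vx < x + w and x < vx + vw and vy < y + h and y < vy + vh:
            hits.append(filename)
    return hits

# Zooming in on the lower-left of the montage touches only the third image.
print(images_in_view((0, 200, 300, 200)))  # ['photos/img_0003.jp2']
```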
  • the overall size of the montage in pixels may be chosen such that its resolution is only exhausted when zooming in to a stage where only a small number of images, which may be referred to herein as a "set" of images, are visible simultaneously. Therefore, access to more than this small number of images at high resolution is preferably not needed at any given time.
  • image streams may be opened and closed as needed to limit the number of high resolution images that are open at any given time.
  • the montage layout is preferably designed for packing efficiency, but the user may want a different arrangement of the images onscreen. Moreover, the user may want to be able to dynamically rearrange the layout of images on the screen.
  • Texture mapping allows a portion of a “texture”, or source image, to be drawn on the display, optionally rescaling the image, rotating it, and/or performing a three- dimensional perspective transform.
  • Other hardware-accelerated transformations are often supported, including color correction or alteration, full or partial transparency, lighting, occlusion, and coordinate remapping.
• a low-resolution version of the montage can be used as a "texture", so that when the user is zoomed out, the individual images within the montage can be dynamically remapped in any way, as in FIG. 61.
  • More than one texture map may be used, in which case each texture map may be a montage containing a subset of the images. Transitions between arrangements may or may not be animated. It is noted that rearrangement can take place while the user is zoomed in, but because the rearrangement might result in a new zoomed-in view of an image which was previously off-screen, the new image may initially be very blurry.
  • the texture mapping technique may be used only during dynamic rearrangement of images.
  • software compositing can be used to assemble all or part of a higher-definition rearranged montage on-screen.
  • This software compositing method is especially valuable in combination with the multiresolution rendering techniques described in US Patent application Serial No. 10/790,253, (Applicant reference document 489/2NP) , identified in detail earlier in this disclosure.
• This method may in effect create a new "display montage" by rearranging the imagery of the original montage.
• Texture mapping may also be used to display high resolution images, but in this case, rather than using textures containing montages of multiple images, textures are used that contain tiles of individual images.
  • This technique is also described in US Patent application Serial No. 10/790,253 (Applicant reference document 489/2NP) .
  • montage rearrangement may be used to support reorganization of the images without recourse to texture mapping.
• texture mapping, software rendering, or any combination of the two may be used to render imagery in three dimensions instead of on a two-dimensional plane. Dynamic rearrangement in three dimensions is also possible. Three-dimensional applications may include virtual galleries or other walk-through environments as well as virtual books. Virtual books are described herein and still further in Provisional Patent Application Serial No. 60/619,053 which is incorporated herein by reference.
  • FIG. 62 is a block diagram of a computing system 1000 adaptable for use with one or more embodiments of the present invention.
  • central processing unit (CPU) 1002 may be coupled to bus 1004.
• bus 1004 may be coupled to random access memory (RAM) 1006, read only memory (ROM) 1008, input/output (I/O) adapter 1010, communications adapter 1022, user interface adapter 1016, and display adapter 1018.
  • RAM 1006 and/or ROM 1008 may hold user data, system data, and/or programs.
• I/O adapter 1010 may connect storage devices, such as hard drive 1012, a CD-ROM (not shown), or other mass storage device to computing system 1000.
  • Communications adapter 1022 may couple computing system 1000 to a local, wide-area, or Internet network 1024.
  • User interface adapter 1016 may couple user input devices, such as keyboard 1026 and/or pointing device 1014, to computing system 1000.
  • display adapter 1018 may be driven by CPU 1002 to control the display on display device 1020.
  • CPU 1002 may be any general purpose CPU.
• mapping and geospatial applications are a booming industry. They have been attracting rapidly increasing investment from businesses in many different markets—from candidates like Federal Express, clothing stores and fast food chains. In the past several years, mapping has also become one of the very few software applications on the web that generate significant interest (so-called "killer apps"), alongside search engines, web-based email, and matchmaking.
• while mapping should in principle be highly visual, at the moment its utility for end users lies almost entirely in generating driving directions.
  • the map images which invariably accompany the driving directions are usually poorly rendered, convey little information, and cannot be navigated conveniently, making them little more than window dressing. Clicking on a pan or zoom control causes a long delay, during which the web browser becomes unresponsive, followed by the appearance of a new map image bearing little visual relationship to the previous image.
• although computers should be able to navigate digital maps more effectively than we navigate paper atlases, in practice visual navigation of maps by computer is still inferior.
  • the present invention is intended to be employed in combination with a novel technology permitting continuous and rapid visual navigation of a map (or any other image) , even over a low bandwidth connection.
• This technology relates to new techniques for rendering maps continuously in a panning and zooming environment. It is an application of fractal geometry to line and point rendering, allowing networks of roads (1D curves) and dots marking locations (0D points) to be drawn at all scales, producing the illusion of continuous physical zooming, while still keeping the "visual density" of the map bounded.
  • Related techniques apply to text labels and iconic content. This new approach to rendition avoids such effects as the sudden appearance or disappearance of small roads during a zoom, an adverse effect typical of digital map drawing. The details of this navigation technology may be found in U.S.
  • Voss technology enables a number of novel and commercially valuable business models for Internet mapping. These models take as their point of departure the proven success of businesses like Yahoo! Maps and MapQuest, both of which generate revenue from geographical advertising. Our approach, however, goes well beyond advertising, capitalizing on the ability of new technology to add substantial value for both businesses and end users.
  • the essential idea is to allow businesses and people to rent "real estate" on the map, normally at their physical address, in which they can embed zoomable content. This content can appear in an iconic form—i.e. golden arches for McDonalds—when viewed on a large-scale map, but then smoothly and continuously resolve into any kind of web-like content when viewed closely.
• the term "GIS" generally corresponds to the term "geographical information service".
  • a map consists of many layers of information; ultimately, the Voss map application will allow the user to turn most of these layers on and off, making the map highly customizable.
• Layers include:
• aerial photography-based orthoimagery (aerial photography which has been digitally "unwarped" such that it tiles a map perfectly);
• topography; public infrastructure locations, e.g. schools, churches, public telephones, restrooms;
• the most salient layers from the typical user's point of view are 1-4 and 7.
• the advertising/user content layers 10-11 are also of particular interest in this patent application.
  • the most relevant commercial geographic information service (GIS) offering for our application is geocoding, which enables the conversion of a street address into precise latitude/longitude coordinates.
  • national Yellow Pages/White Pages data may also be valuable in implementing the present invention. This information may also be licensed. National Yellow Pages/White Pages data may be used in combination with geocoding to allow geographical user searches for businesses, or filtering (e.g. "highlight all restaurants in Manhattan") . Perhaps most importantly, directory listings combined with geocoding will greatly simplify associating business and personal users with geographic locations, allowing "real estate" to be rented or assigned via an online transaction and avoiding the need for a large sales force.
  • the "neartime” data are updated at least every 90 days. Combined with 90-day caching of entries already obtained on our end, this is a very economical way to obtain high-quality national listings.
• "Realtime" data, updated nightly, are also available, but are more expensive ($0.20/hit).
  • the realtime data are identical to those used by 411 operators.
  • Classic advertising-based business models, as well as "media player” models involving a proprietary data format and a downloadable plug-in normally face a chicken-and-egg problem.
  • An advertising venue only becomes worth advertising in when people are already looking; a plug-in (even if free) only becomes worth downloading when there is already useful content to view; and content only becomes attractive to invest in making if there is already an installed user base ready to view it.
• although the Voss mapping application requires downloadable client software and generates revenue through advertising, it will not suffer the disadvantages of classic advertising-based business models. Even before any substantial commercial space has been "rented", the present invention will provide a useful and visually compelling way of viewing maps and searching for addresses—that is, similar functionality to that of existing mapping applications, but with a greatly improved visual interface. Furthermore, the approach of the present invention provides limited but valuable service to non-commercial users free of charge to attract a user base. The limited service consists of hosting a small amount (5-15 MB) of server space per user, at the user's geographical location—typically a house.
• the client software may include simple authoring capabilities, allowing users to drag and drop images and text into their "physical address", which can then be viewed by any other authorized user with the client software. (Password protection may be available.) Because the zooming user interface approach is of obvious benefit for navigating digital photo collections—especially over limited bandwidth—the photo album sharing potential alone may attract substantial numbers of users. Additional server space may be available for a modest yearly fee. This very horizontal market is likely to be a major source of revenue.
  • the sources of revenue may include:
• Focusing priority brings areas of the image into focus during navigation as data become available. By default, all visual content is treated equally, and focusing goes from the center of the screen outward. Focusing priority allows commercial content to come into focus faster than it would otherwise, increasing its prominence in the user's "peripheral vision". This feature will be tuned to deliver commercial value without compromising the user's navigation experience.
• Making the geographic area rented refer to an outside commercial server, which will itself host zoomable content of any type and size—this is a fancier version of #3, and allows any kind of e-business to be conducted via the map. We can once again either charge a flat fee or charge per user connecting to the outside server (though this no longer requires a click, just a zoom-in).
• Billboards: as in real life, many high-visibility areas of the map will have substantial empty space. Companies can buy this space and insert content, including hyperlinks and "hyperjumps", which if clicked will make the user jump through space to a commercial site elsewhere on the map. In contrast to ordinary commercial space, billboard space need not be rented at a fixed location; its location can be generated on the fly during user navigation.
  • Plus services geared toward non-commercial users will consist of some of the same products, but scaled, priced and marketed differently.
  • Limited authoring of zoomable Voss content will be possible from within the free client. This will include inserting text, dragging and dropping digital photos, and setting passwords.
  • Professional authoring software may be a modified version of the client designed to allow more flexible zoomable content creation, as well as facilities for making hyperlinks and hyperjumps, and inserting custom applets.
  • Use of the present invention may generate a great deal of aggregate and individual information on spatial attention density, navigation routes and other patterns. These data are of commercial value.

Abstract

A system and method are disclosed which may include accessing an Internet site by a proxy server; converting image data from the Internet site into a multiresolution representation; and transmitting the image data in said multiresolution representation to a client device.

Description

SYSTEM AND METHOD FOR TRANSFERRING WEB PAGE DATA
BACKGROUND ART
It has long been recognized that browsing the Web on a small device, such as a Personal Digital Assistant (PDA), cellular phone, or wireless terminal, would be useful. The difficulties involved in implementing web browsing on such devices generally do not arise because of limitations in computing power. Rather, such difficulties arise from the two following causes. First, the displays are usually very small, making it difficult for them to display web pages designed specifically for larger displays or windows (e.g., typically, at least 800x600 or 1024x768 pixels). Second, small wireless devices are typically connected to wide-area networks at low bandwidth, making traditional web browsing a nuisance due to the need to download HTML (Hypertext Markup Language) pages in their entirety before viewing them.
Some partial solutions to this problem exist. For example, one computer hardware manufacturer has designed a new platform for wireless delivery of content to small devices based on BREW (Binary Runtime Environment for Wireless) . This kind of solution, however, requires that content providers agree to provide their content using BREW technologies, or that third-party content aggregators continually repackage existing content using formatting specialized to the content or application, again serving the results using BREW technologies .
Another partial solution is Opera For Mobile, in combination with Opera Mobile Accelerator and Small-Screen Rendering™. These technologies are based on a popular Web browser, Opera, which has been ported to mobile wireless platforms. Small-Screen Rendering™ attempts to reformat Web pages dynamically to fit within the horizontal dimension of the mobile terminal's display. Opera Mobile Accelerator uses a proxy server to access Web pages in their original HTML form on demand, and then compresses this HTML and forwards the compressed version to the mobile terminal. According to Opera, Opera Mobile Accelerator reduces the size of Web pages by approximately 50-70%, but it has the ability to reduce them up to 90%. With this in mind, the data is retrieved faster and the Web pages are displayed sooner. Generally, the increase in speed depends on the type of sites you are visiting. These two technologies present partial solutions to the problems of mobile terminal display size and bandwidth in connection with Web browsing. However, there is a need in the art for an improved system and method for providing web page data to digital devices having limited bandwidth or other communication and/or processing power limitations.
DISCLOSURE OF THE INVENTION
According to one aspect, the present invention discloses a method, that may include accessing an Internet site by a proxy server; converting image data from the Internet site into a multiresolution representation; and transmitting the image data in the multiresolution representation to a client device. Preferably, the method further includes navigating a web page by the client device. Preferably, the navigating step includes (a) navigating the Internet site image data by the proxy server; and (b) navigating the multiresolution image data by the client device, wherein the navigating of step (a) and the navigating of step (b) occur substantially simultaneously. Preferably, the navigating includes storing multiresolution image data for at least substantially the entirety of the web page at the proxy server; and navigating the stored multiresolution image data by the client device. Preferably, the converting step comprises: continuously converting the Internet site image data as needed by the navigating by the client device. Preferably, the converting step comprises: pre-converting selected image data from the Internet site by the proxy server prior to the selected image data being needed for the client device navigation; and storing the pre-converted image data at the proxy server.
Preferably, the method further comprises transmitting the stored image data to the client device as needed by the client device navigation. Preferably, the method further includes ensuring that the stored pre-converted image data is not stale prior to transmission thereof to the client device.
Preferably, the ensuring step includes comparing one of a timestamp and a checksum of the pre-converted image data with a respective one of a timestamp and a checksum of corresponding image data originating from the Internet site.
Preferably, the pre-converting step includes converting image data for at least one web page by the proxy server; and storing the converted image data for the at least one web page at the proxy server.
Preferably, the method further includes enabling bidirectional interactive communication between the client device and the Internet site. Preferably, the interactive communication comprises: transmitting navigation commands from the client device to the proxy server. Preferably, the method further includes emulating dynamic Internet graphics for the client device by the proxy server. Preferably, the emulating step comprises emulating at least one of: dynamic HTML (Hypertext Markup Language); and applets. Preferably, the navigating comprises: selecting a resolution level at which to view at least a portion of the web page, at the client device. Preferably, the navigating includes selecting a region of the web page for display on the client device. Preferably, the navigating further includes selecting a resolution at which to view the region of the web page. Preferably, the navigating step comprises selecting a plurality of portions of the web page for viewing at the client device. Preferably, the navigating step further includes selecting a plurality of respective resolutions at which to view the plurality of web page portions. Preferably, navigating the multiresolution image data includes panning the multiresolution image data. Preferably, the navigating the multiresolution image data comprises: zooming in on the multiresolution image data.
According to another aspect, the invention may provide an apparatus, that may include a client device; and a proxy server in communication with, and serving as an intermediary between, the client device and an Internet site, wherein the proxy server is operable to convert image data from the Internet site into a multiresolution visual data format. Preferably, the multiresolution visual data format is JPEG2000. Preferably, the client device is a cell phone, and wherein the cell phone comprises: at least one mechanism for navigating image data in the multiresolution visual data format. Preferably, the mechanism is a touchpad and the touchpad enables panning image data in the multiresolution visual data format. Preferably, the mechanism enables zooming in on image data in the multiresolution visual data format. Preferably, the mechanism for zooming comprises a roller that is rotatable by a user of the client device. Preferably, the client device is one of the group consisting of: a cell phone, a PDA (Personal Digital Assistant) , a notebook computer, a desktop computer, a tablet computer, and a television set.
Other aspects, features, advantages, etc. will become apparent to one skilled in the art when the description of the preferred embodiments of the invention herein is taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
For the purposes of illustrating the various aspects of the invention, there are shown in the drawings forms that are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
FIG. 1 is a block diagram of a communication system including a proxy server and a client device in accordance with one or more embodiments of the present invention;
FIG. 2A is a front view of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention; FIG. 2B is an elevational view of the back of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention;
FIG. 2C is an elevational view of the back of a client device, which may be a cell phone, customized in accordance with one or more embodiments of the present invention;
FIG. 3 is a flowchart of a method for navigating image data by a client device in accordance with one or more embodiments of the present invention;
FIG. 4 is a view of a web page column as seen on a client device in accordance with one or more embodiments of the present invention;
FIG. 5A is a view of the web page of FIG. 4 after about two seconds of loading at a rate of 20 kilobits per second, in accordance with one or more embodiments of the present invention; FIG. 5B is a view of the web page of FIG. 4 after about ten seconds of loading at a rate of 20 kilobits per second, in accordance with one or more embodiments of the present invention; FIG. 6A is a view of a top portion of the web page of FIG. 4 after loading data at 20 kilobits per second for about two seconds, in accordance with one or more embodiments of the present invention;
FIG. 6B is a view of a top portion of the web page of FIG. 4 after loading data at 20 kilobits per second for about ten seconds, in accordance with one or more embodiments of the present invention;
FIG. 7 is a zoomed in view of a portion of the web page of FIG. 4 captured shortly after capture of the views of FIG. 6, in accordance with one or more embodiments of the present invention;
FIGS. 8A and 8B are portrait and landscape views, respectively, of a portion of the web page of FIG. 4, in accordance with one or more embodiments of the present invention; and
FIG. 9 is a block diagram of a computer system adaptable for use with one or more embodiments of the present invention.
BEST MODE OF CARRYING OUT THE INVENTION
FIG. 1 is a block diagram of a communication system 140 including a proxy server 150 and a client device 230 in accordance with one or more embodiments of the present invention. Communication system 140 may include Internet 180, client device 230, and proxy server 150, which may in turn include cache 160. Internet 180 may be interpreted to include communication links and computing and communication hardware needed to enable proxy server 150 to browse the Internet in a conventional manner. While a high-bandwidth connection is preferred for use between proxy server 150 and Internet 180, any communication link, such as is known in the art, may be used to connect proxy server 150 to Internet 180. Generally, proxy server 150 may be connected to an Internet Service Provider (ISP) (not shown) which serves as a gateway to the Internet as a whole. In one or more embodiments herein, Internet 180 is intended to include such an Internet Service Provider.
Proxy server 150 may be a personal computer or other computing device suitably configured to communicate with the
Internet and to process data, including image data, downloaded therefrom. Proxy server 150 is preferably able to transmit commands to and receive image data from Internet 180.
Proxy server 150 may include cache 160 for storing data, including image data, to enable rapid and easy access to such data. Cache 160 may be physically incorporated within server 150, such as by being within the same physical enclosure as proxy server 150. Alternatively, cache 160 may be housed separately from proxy server 150 but be accessible to proxy server 150.
Client device 230 is preferably a portable computing device able to conduct communication with proxy server 150. However, client device could be a larger, non-portable computing device, such as a desktop computer. Client device 230 may be one of a cell phone, a Personal Digital Assistant (PDA), a notebook computer, a desktop computer, a tablet computer, and a television set. However, client device 230 is not limited to being one of the foregoing items. Preferably, if client device 230 is a television set, this television set is able to receive multiresolution image data and to transmit navigation commands to an external device, such as, but not limited to, proxy server 150. Client device 230 may also include a data storage capability, including volatile and/or non-volatile storage, for storing multiresolution image data received from proxy server 150. Such data storage may include but is not limited to: RAM (Random Access Memory), ROM (Read Only Memory), hard drive, CD-ROM drive, CD read-write, and bubble memory. Where desired, the client device 230 data storage may include cache memory. Proxy server 150 may serve as an intermediary between Internet 180 and client device 230 to enable client device 230 to browse image data from Internet 180 after conversion of such data into a multiresolution format, as discussed in greater detail in connection with FIG. 2. FIG. 2A is a front view of client device 230, which may be a cell phone, customized in accordance with one or more embodiments of the present invention. FIG. 2B is an elevational view of the back of the client device 230 of FIG. 2A. And, FIG. 2C is an elevational view of the back of the client device of FIG. 2A.
In one or more embodiments covered by the illustrations of FIG. 2, client device 230 is a cell phone or PDA that is specially configured for browsing multiresolution image data. Client device 230 may include a body 232, a display 234, an antenna 236, a touchpad 238, a joystick 240, and/or a roller 242. In FIG. 2, roller 242 is shown protruding to some extent from body 232 of client device 230. This protrusion is shown for illustrative purposes. Roller or wheel 242 need only be accessible to a user of client device 230. Thus, it may protrude from body 232 very slightly. Alternatively, a recess
(not shown) in the side of body 232 may be provided into which wheel 242 is nested, thereby enabling wheel 242 to not protrude beyond the straight-edged portion of the side of body 232 of client device 230.
In one or more embodiments, display 234 may be located on the front of client device 230 (FIG. 2A) . However, in other embodiments, display 234 may be located anywhere on client device 230. Touchpad 238 may be located on the rear of client device 230 (FIG. 2B) and may be used for panning image data received from proxy server 150. However, alternatively, touchpad 238 may be located on any surface of client device 230. Alternatively, touchpad 238 may be used for zooming image data. In one or more other embodiments, touchpad 238 may be used for panning and zooming image data.
A user of client device 230 may pan or zoom image data by moving a finger along the surface of touchpad 238. Additionally or alternatively, touchpad 238 may be employed to move a cursor over one or more image regions or web page regions displayed on display 234. Where plural images (possibly but not necessarily from two different web pages) are displayed on display 234, use of a cursor may be beneficial for identifying one of the images for the purpose of performing a navigation step thereon.
In an alternative embodiment, joystick 240 (FIG. 2C) may be employed for panning image data either in addition to, or in place of, touchpad 238. The joystick may be used for panning image data by pushing the joystick 240 in a direction corresponding to a direction in which a user of client device 230 wishes to pan across an image. As with touchpad 238, joystick 240 may be located on any surface of client device 230. In one or more embodiments, zooming may be accomplished using a separate device, such as wheel 242, or other device such as a slider (not shown), or using a stereotyped gesture on touchpad 238, such as sliding the finger up and down one edge of the touchpad 238 surface. In the latter case, both index fingers could be used on the touchpad simultaneously, one panning and the other zooming. In one or more embodiments that include both touchpad 238 and wheel 242, a user of client device 230 could rapidly switch back and forth between zooming and panning, or even pan and zoom simultaneously. Herein, wheel 242 may also be referred to as a "roller." Employing one or more of the above-described features, panning and/or zooming may become tactile experiences, as though a user of client device 230 were looking through the display (screen) 234 at a larger surface. The index finger of one hand could stroke touchpad 238, panning the image and providing a sense that the larger image is being moved from behind display 234 using touchpad 238. In an analogous manner, the use of wheel 242 for zooming in on an image may be employed to provide a sense that movement of wheel 242 moves the "larger image" closer to or further away from display 234, depending upon the direction of rotation of wheel 242.
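A minimal sketch of how touchpad strokes and roller rotation might drive this pan-and-zoom behavior is given below. The event handlers, scaling constants, and class names are assumptions made for illustration; they are not taken from the disclosure or from any device firmware.

```python
class ViewState:
    """Tracks the portion of the larger image currently shown on the display."""

    def __init__(self):
        self.center_x = 0.0  # view center in image coordinates
        self.center_y = 0.0
        self.zoom = 1.0      # display pixels per image pixel

    def on_touchpad_move(self, dx_pixels: float, dy_pixels: float) -> None:
        # Stroking the touchpad pans the view; dividing by the zoom level keeps
        # the apparent motion matched to the finger regardless of magnification.
        self.center_x -= dx_pixels / self.zoom
        self.center_y -= dy_pixels / self.zoom

    def on_wheel_rotate(self, clicks: int, factor_per_click: float = 1.1) -> None:
        # Rotating the wheel zooms in (positive clicks) or out (negative clicks).
        self.zoom *= factor_per_click ** clicks

view = ViewState()
view.on_touchpad_move(40, -10)  # pan as the finger strokes the touchpad
view.on_wheel_rotate(3)         # zoom in roughly 33%
print(round(view.center_x, 1), round(view.center_y, 1), round(view.zoom, 3))
```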
The haptic interface described above may provide a natural and familiar "feel" with a shallow learning curve. One or more embodiments may also provide the advantage, compared to the more conventional solution of using a touchscreen, of allowing an unobstructed view of the display while the finger strokes touchpad 238 to pan an image. Moreover, in one or more embodiments, synergistic benefits may be obtained by being able to readily alternate between panning and zooming and/or to conduct these two activities simultaneously or substantially simultaneously, whether using touchpad 238 and wheel 242, or using other navigation mechanisms. In one or more embodiments, touchpad 238 may have dimensions of about 2 inches by 1.5 inches; however, touchpad 238 may be smaller or larger in either dimension than the listed values. FIG. 3 is a flowchart of a method 320 for navigating image data by a client device 230 in accordance with one or more embodiments of the present invention. The steps shown in FIG. 3 are not limited to being performed in the order shown. In certain cases, two or more of the illustrated steps may be performed simultaneously. In the following, the steps are listed and summarized, following which the application of the steps to basic interactive browsing and the variations thereof are discussed.
Method 320 may include accessing 322 an Internet site at Internet 180 by proxy server 150. Proxy server 150 may convert 324 image data from the accessed Internet site into a multiresolution visual data format that supports spatial random access (multiresolution image data). JPEG2000 is one possible example of such a visual data format. However, the present invention is not limited to the use of this standard.
Herein, the term "converting" generally corresponds to the term "repackaging," as applied to turning ordinary web page data into multiresolution image data.
Method 320 may include transmitting 326 the multiresolution image data to client device 230 by proxy server 150, for ultimate display 328 on client device 230.
Client device 230 may navigate 330 the transmitted multiresolution image data, either after or concurrently with the transmission in step 326. The transmitted multiresolution image data navigated 330 by client device 230 may include emulated Internet graphical features, such as, but not limited to, dynamic HTML and applet functionality. In one embodiment, proxy server 150 converts 324 image data from web pages on Internet 180 into multiresolution image data, as such data is needed by the navigation 330 of client device 230. The multiresolution image data is then preferably transmitted 326 to client device 230 and displayed 328 thereon. In this embodiment, the accessing 322 of the Internet site, which may be a web page, by proxy server 150 and the navigation of the multiresolution image data by client device 230 may occur substantially simultaneously. Thus, in this embodiment, the conversion 324 of web page (or other type of) image data occurs substantially "on the fly" in response to the navigation 330 conducted by client device 230. Navigation 330 may include but is not limited to panning and zooming by client device 230.
In another embodiment, proxy server 150 may convert, or "pre-convert", one or more entire web pages, or other form of Internet image data, into multiresolution image data and store the converted data in cache 160, or other memory device accessible to proxy server 150, for ready accessibility by, and transmission to, client device 230. This pre-conversion may be performed prior to the web page data concerned being needed in response to client device 230 navigation 330 commands. In one or more embodiments, proxy server 150 may ensure that, when servicing a request for image data from client device 230, the cached pre-converted multiresolution image data is not "stale" by verifying that the original HTML content is unaltered in comparison with the cached version. This verification may be accomplished by comparing timestamps, checksums, or by using other well-known methods. Some tolerance for "staleness" may be built in to bound the number of HTML requests the proxy server must make. In this disclosure, data that is not "stale" is "fresh."
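By way of illustration only, the following Python sketch shows one way the freshness check described above might be implemented using checksums and a bounded staleness tolerance; the function names, cache-entry fields, and 60-second tolerance are assumptions for the example and are not drawn from the disclosure.

```python
import hashlib
import time

# Assumed bound on how often the proxy re-checks the origin HTML.
STALENESS_TOLERANCE_SECONDS = 60


def html_checksum(html_text):
    """Checksum of the original HTML, recorded at pre-conversion time."""
    return hashlib.sha256(html_text.encode("utf-8")).hexdigest()


def cached_entry_is_fresh(cache_entry, fetch_current_html):
    """cache_entry is assumed to hold 'checksum' and 'checked_at' fields;
    fetch_current_html is a callable that returns the current page HTML."""
    age = time.time() - cache_entry["checked_at"]
    if age < STALENESS_TOLERANCE_SECONDS:
        # Within tolerance: serve cached multiresolution data without
        # issuing another HTML request to the origin server.
        return True
    current = html_checksum(fetch_current_html())
    cache_entry["checked_at"] = time.time()
    return current == cache_entry["checksum"]
```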
In one or more embodiments, proxy server 150 serves as an intermediary between the Internet 180, or more specifically a site or Internet page of Internet 180, and client device 230.
Consequently, proxy server 150's communication with the
Internet 180 is preferably responsive to the browsing of multiresolution image data by client device 230. Also, two or more of the steps illustrated in FIG. 3 may occur interactively and/or concurrently rather than occurring one at a time and strictly in the sequence shown.
Generally, web page image data flows from Internet 180 to proxy server 150, where it may be converted into multiresolution image data. At that point, the multiresolution image data may be stored in cache 160 and/or transmitted to client device 230 for display thereon. Client device 230 may, where needed, store the received multiresolution image data.
Navigation or browsing commands generally proceed in the reverse direction. Specifically, navigation requests and/or other commands pertinent to browsing a web page or Internet site may originate at client device 230, proceed to proxy server 150 and thereafter may be sent, possibly in modified form, from proxy server 150 to the Internet 180. Suitable conversion or reconfiguration of commands received at proxy server 150 from client device 230 may be implemented at proxy server 150 before re-transmitting such commands to the
Internet 180. For example, where a user of client device 230 clicks on a link to another web page, proxy server 150 may re-transmit this linking command to the Internet 180 in order to access the target web page. The commands that may be sent by client device 230 may include navigation commands such as panning and zooming. Further commands may include clicking with a mouse or other device, typing, speaking (where voice commands may be understood at client device 230), gesturing, movement of a finger or implement along the surface of touchpad 238, movement of wheel 242, or movement of joystick 240.
A first example is considered in which a single web page is accessed 322, converted 324 into multiresolution image data, and stored in cache 160. After being stored in cache
160, the multiresolution image data may be browsed as needed by client device 230.
Continuing with the example, let us suppose that client device 230 receives an initial rendition of the converted web page at a low resolution level. From this starting point
("resolution level 1") , the user of client device 230 zooms in on a selection region λλA" of the web page. Assuming that client device does not have the higher-resolution image data in its local storage, the zoom command is preferably transmitted to proxy server 150 which may retrieve the needed data from cache 160 and which may transmit this image data to client device 230.
Preferably, the extent of the zoom requested by client device 230 may be associated with a selected resolution level ("resolution level 2" for the sake of this example) by proxy server 150. The resulting resolution level may be selected from one of a set of pre-defined resolution levels of the selected region (region "A"). Alternatively, if the selected resolution level does not precisely correspond to one of the pre-defined resolution levels, the requested resolution level may be calculated by conducting one or more interpolation calculations as disclosed in a document incorporated herein.
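By way of illustration only, the following Python sketch shows one way a requested zoom extent might be associated with the surrounding pre-defined resolution levels and an interpolation weight; the assumption that successive levels differ by a factor of two, and all names, are illustrative.

```python
import math


def select_resolution_levels(requested_zoom, num_levels):
    """Return (lower_level, upper_level, blend) for a requested zoom factor,
    assuming level k corresponds to a 2**k magnification of the base view."""
    exact_level = math.log2(max(requested_zoom, 1.0))
    lower = min(int(math.floor(exact_level)), num_levels - 1)
    upper = min(lower + 1, num_levels - 1)
    blend = exact_level - lower if upper != lower else 0.0
    return lower, upper, blend


# Example: a 3x zoom falls between level 1 (2x) and level 2 (4x),
# with an interpolation weight of about 0.585 toward level 2.
print(select_resolution_levels(3.0, num_levels=6))
```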
Display 234 of client device 230 then preferably shows the selected region "A" at the selected resolution. Presumably, substantially only region A of the converted web page is then displayed on display 234.
Continuing with the example, we consider a situation where the user of client device 230 next wishes to pan laterally from region "A" to region "B" (regions are not shown) within a web page, while preferably maintaining the resolution level last used to display region "A" ("resolution level 2") . Assuming that the multiresolution image data for region "B" is not available within client device 230, the step of panning toward region "B" preferably causes proxy server 150 to seek the image data for region "B" from cache 160, or other data storage device accessible to proxy server 150, and to transmit this data to client device 230. Upon receiving this data, client device 230 should be able to display region "B" at "resolution level 2". It is noted that, if the image data for resolution level 2 had not been available in cache 160, proxy server 150 would preferably have again accessed the pertinent web page from Internet 180, and converted data for that web page into multiresolution image data having greater resolution than that stored in cache 160.
Continuing with the example, we next consider a situation in which a user of client device 230 wishes to activate a hyperlink within the first web page, preferably within region B of the first web page. As the user of client device 230 clicks on the hyperlink, proxy server 150 may determine that the web page image data associated with the hyperlink is not located in cache 160. Accordingly, proxy server 150 preferably replicates the request for the hyperlink's target web page in its communication with Internet 180.
Preferably, proxy server 150 then accesses the hyperlink's target web page. Proxy server 150 may then convert image data from the target web page into multiresolution image data and dynamically stream the multiresolution image data to client device 230 for display thereon. Alternatively or additionally, proxy server 150 may store all or part of the multiresolution image data for the target web page in cache 160 for later access, as needed, by client device 230. Such storage in cache 160, or other suitable storage device, may be beneficial since client device 230 may not be able to receive and display all of the multiresolution image data for the target (hyperlinked) web page at once. This inability is more likely to be present where client device 230 has selected a high resolution level at which to view one or more regions of the target web page. It is noted that, in general, the higher the resolution (of a displayed region) at which client device 230 displays multiresolution image data, the smaller the displayed region will be.
In one or more embodiments, client device 230 may be able to display plural regions of a web page, or plural regions drawn from a plurality of respective web pages. In this case, client device 230 is preferably able to independently select display resolutions for each such displayed region. In this case, touchpad 238 or other mechanism within client device 230 is preferably able to move a cursor over each image, in turn, for the purpose of selecting a desired resolution for each such image.
Moreover, within any one image, which image may form only a portion of a web page, the resolution need not be constant. The resolution may vary as needed to reconcile the competing priorities of viewing a region of interest at high resolution and avoiding the data storage and data communication burden of receiving the amount of multiresolution image data needed to display an entire image at high resolution. One or more embodiments of the approach described herein for Web browsing by proxy may provide the features discussed below.
As discussed above, a zooming function may be provided with web browsing by proxy. Thus, a Web page formatted for viewing on a display wider than that of client device 230 can nonetheless be seen in its full column width on a small display such as display 234. If the text is small relative to the column size, it may be beneficial to pan horizontally to read single lines in their entirety. However, the use of continuous zooming may enable many text columns to be read in their entirety even on a 176-pixel-wide display (a width featured on certain commercially available mobile phones).
In one or more embodiments, this is possible without any reformatting, which would alter the layout of the Web page.
FIG. 4 shows a column of a Web page (originally designed for viewing on an 800x200 pixel display) being viewed in this fashion.
Provided that either conversion to multiresolution image data format is reasonably fast or that cached multiresolution image content is already available at cache 160, an initial view of an entire Web page can be rendered visible at client device 230 shortly after requesting the pertinent web page.
This ability may be provided for Web pages of any length. It also applies whether the initial view of the Web page is a zoomed-out overview of the entire page or a zoomed-in view of the top (or any other part) of the web page.
FIGS. 5A and 6A illustrate these situations, providing a general overview and a top view, respectively, of a large Web page after about 2 seconds of loading over a 20 kilobit/second connection. FIGS. 5B and 6B show clearer versions of FIGS. 5A and 6A, respectively, after about ten seconds of data loading time.
For complex Web pages with a variety of content, such as newspapers, the client can continuously focus in on interesting areas immediately, without waiting for the entire document to download. See FIG. 7.
Web pages of virtually unlimited length may be viewed without taxing the resources of client device 230, without delaying the time to initial view, and without causing degraded performance.
In one or more embodiments, a Web page may have very large images, which client device 230 may zoom in on to see fine details of, without taxing client device 230's resources, without delaying the time to initial viewing of the web page, and/or without resulting in degraded performance.
In some cases, small device displays are taller than they are wide (e.g. 176 pixels wide by 220 pixels high for the Samsung® SGH-D410 mobile phone) . Yet for Web browsing, the most constraining variable may be column width, making it desirable for the cellular display to use a "landscape" rather than a "portrait" page setup. Because multiresolution image data allows rotation of visual content by a selectable angular distance, the larger dimension of the display can be used to its greatest effect during content browsing. See FIGS. 8A and 8B.
In one or more embodiments, a client device 230 accessing multiresolution image data may be enabled to view image content originating on any server having access to multiresolution image data, not only the multiresolution image data generated by proxy server 150. Thus, such a client device 230 may be enabled to continuously navigate large numbers of large images, large multiresolution vectorial maps, and other kinds of image content that might be difficult or impossible to serve using conventional HTML.
The above-listed advantages may be compelling enough to make multiresolution image data browsing using proxy server 150 beneficial not only for cellular phone users, but also for users of larger devices, including, but not limited to: PDA or palmtop wireless devices; notebook computers with wireless internet; home computer users connected to the internet via a narrowband connection such as a modem; and/or home broadband users.
In the case of home broadband users, one or more embodiments of a system using proxy server 150 to transmit multiresolution image data to client device 230 may provide features that include but are not limited to the following: a) Web pages may be zoomed continuously to improve legibility of small type or examine details of images; b) web pages designed for a small display could be browsed in sharp detail (for text or vector components) on a high-resolution display; c) multiple Web pages could be rearranged, manipulated and resized in a rich 2D (two-dimensional) or 3D (three- dimensional) multiscale environment, providing alternative user experiences to conventional multi-window or tabbed browsing; and/or d) through the judicious use of proxy server 150 caching, potentially unlimited numbers of Web pages could be aggregated into live large-scale "views" of the Web, without requiring excessive client device 230 memory or other resources.
Finally, because displaying images in a multiresolution image data format may provide complementary advantages for large-display, high-bandwidth platforms and small-display, low-bandwidth platforms, displaying in this manner may unify user experience and may simplify cross-platform engineering and development, rather than requiring different versions of visual content and/or browsers to be provided for the different platforms.
FIG. 9 is a block diagram of a computing system 800 adaptable for use with one or more embodiments of the present invention. In one or more embodiments, central processing unit (CPU) 802 may be coupled to bus 804. In addition, bus 804 may be coupled to random access memory (RAM) 806, read only memory (ROM) 808, input/output (I/O) adapter 810, communications adapter 822, user interface adapter 816, and display adapter 818.
In one or more embodiments, RAM 806 and/or ROM 808 may hold user data, system data, and/or programs. I/O adapter 810 may connect storage devices, such as hard drive 812, a CD-ROM
(not shown) , or other mass storage device to computing system
800. Communications adapter 822 may couple computing system
800 to a local, wide-area, or Internet network 824. User interface adapter 816 may couple user input devices, such as keyboard 826 and/or pointing device 814, to computing system
800. Moreover, display adapter 818 may be driven by CPU 802 to control the display on display device 820. CPU 802 may be any general purpose CPU. It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above. One or more embodiments of the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the claims appended to the end of this description.
It is noted that additional disclosure text is appended hereto which precedes the claims and abstract of this application.
SYSTEM AND METHOD FOR EXACT RENDERING IN A ZOOMING USER INTERFACE
FIELD OF THE INVENTION The present invention relates generally to graphical zooming user interfaces (ZUI) for computers. More specifically, the invention is a system and method for progressively rendering zoomable visual content in a manner that is both computationally efficient, resulting in good user responsiveness and interactive frame rates, and exact, in the sense that vector drawings, text, and other non-photographic content is ultimately drawn without the resampling which would normally lead to degradation in image quality, and without interpolation of other images, which would also lead to degradation.
BACKGROUND OF THE INVENTION
Most present-day graphical computer user interfaces (GUIs) are designed using visual components of a fixed spatial scale. However, it was recognized from the birth of the field of computer graphics that visual components could be represented and manipulated in such a way that they do not have a fixed spatial scale on the display, but can be zoomed in or out. The desirability of zoomable components is obvious in many application domains; to name only a few: viewing maps, browsing through large heterogeneous text layouts such as newspapers, viewing albums of digital photographs, and working with visualizations of large data sets. Even when viewing ordinary documents, such as spreadsheets and reports, it is often useful to be able to glance at a document overview, and then zoom in on an area of interest. Many modern computer applications include zoomable components, such as Microsoft® Word® and other Office® products (Zoom under the View menu), Adobe® Photoshop®, Adobe® Acrobat®, QuarkXPress®, etc. In most cases, these applications allow zooming in and out of documents, but not necessarily zooming in and out of the visual components of the applications themselves. Further, zooming is normally a peripheral aspect of the user's interaction with the software, and the zoom setting is only modified occasionally. Although continuous panning over a document is standard (i.e., using scrollbars or the cursor to translate the viewed document left, right, up or down), the ability to zoom and pan continuously in a user-friendly manner is absent from prior art systems.
First, we set forth several definitions. A display is the device or devices used to output rendered imagery to the user. A frame buffer is used to dynamically represent the contents of at least a portion of the display. Display refresh rate is the rate at which the physical display, or portion thereof, is refreshed using the contents of the frame buffer. A frame buffer's frame rate is the rate at which the frame buffer is updated. For example, in a typical personal computer, the display refresh rate is 60-90 Hz. Most digital video, for example, has a frame rate of 24-30 Hz. Thus, each frame of digital video will actually be displayed at least twice as the display is refreshed. Plural frame buffers may be utilized at different frame rates and thus be displayed substantially simultaneously on the same display. This would occur, for example, when two digital videos with different frame rates were being played on the same display, in different windows.
One problem with zooming user interfaces (ZUI) is that the visual content has to be displayed at different resolutions as the user zooms. The ideal solution to this problem would be to display, in every consecutive frame, an exact and newly computed image based on the underlying visual content. The problem with such an approach is that the exact recalculation of each resolution of the visual content in real time as the user zooms is computationally impractical if the underlying visual content is complex. As a result of the foregoing, many prior art ZUI systems use a plurality of precomputed images, each being a representation of the same visual content but at different resolutions. We term each of those different precomputed images a Level of Detail (LOD) . The complete set of LODs, organized conceptually as a stack of images of decreasing resolution, is termed the LOD pyramid— see Fig. 10. In such prior systems, as zooming occurs, the system interpolates between the LODs and displays a resulting image at a desired resolution. While this approach solves the computational issue, it displays a final compromised image that is often blurred and unrealistic, and often involves loss of information due to the fact that it represents interpolation of different LODs. These interpolation errors are especially noticeable when the user stops zooming and has the opportunity to view a still image at a chosen resolution which does not precisely match the resolution of any of the LODs.
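By way of illustration only, the following Python sketch shows how a pyramid of precomputed LODs of the kind described above might be constructed by repeated factor-of-2 minification of a base image; the 2x2 block-averaging filter and the grayscale 512x512 base image are assumptions for the example.

```python
import numpy as np


def build_lod_pyramid(base_image):
    """Build an LOD pyramid from a 2-D (grayscale) array by successive
    factor-of-2 minification (2x2 block averaging), mirroring the stack of
    decreasing-resolution images described in the text."""
    levels = [base_image]
    current = base_image
    while min(current.shape) > 1:
        h, w = current.shape
        h, w = h - (h % 2), w - (w % 2)  # crop a trailing odd row/column
        current = current[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(current)
    return levels  # levels[0] is the base; levels[-1] is the coarsest LOD


pyramid = build_lod_pyramid(np.random.rand(512, 512))
print([level.shape for level in pyramid])  # (512, 512), (256, 256), ..., (1, 1)
```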
Another problem with interpolating between precomputed LODs is that this approach typically treats vector data in the same way as photographic or image data. Vector data, such as blueprints or line drawings, are displayed by processing a set of abstract instructions using a rendering algorithm, which can render lines, curves and other primitive shapes at any desired resolution. Text rendered using scalable fonts is an important special case of vector data. Image or photographic data (including text rendered using bitmapped fonts) are not so generated, but must be displayed either by interpolation between precomputed LODs or by resampling an original image. We refer to the latter herein as nonvector data. Prior art systems that use rendering algorithms to redisplay vector data at a new resolution for each frame during a zoom sequence must restrict themselves to simple vector drawings only in order to achieve interactive frame rates. On the other hand, prior art systems that precompute LODs for vector data and interpolate between them, as for nonvector data, suffer from markedly degraded visual quality, as the sharp edges inherent in most vector data renditions are particularly sensitive to interpolation error. This degradation is usually unacceptable for textual content, which is a special case of vector data.
It is an object of the invention to create a ZUI that replicates the zooming effect a user would see if he or she actually had viewed a physical object and moved it closer to himself or herself.
It is an object of the invention to create a ZUI that displays images at an appropriate resolution but which avoids or diminishes the interpolation errors in the final displayed image. A further object of the present invention is to allow the user to zoom arbitrarily far in on vector content while maintaining a crisp, unblurred view of the content and maintaining interactive frame rates.
A further object of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex vectorial content, while both preserving the overall appearance of the content and maintaining interactive frame rates .
A further object of the present invention is to diminish the user's perception of transitions between LODs or rendition qualities during interaction.
A further object of the present invention is to allow the graceful degradation of image quality by blurring when information ordinarily needed to render portions of the image is as yet incomplete.
A further object of the present invention is to gradually increase image quality by bringing it into sharper focus as more complete information needed to render portions of the image becomes available.
It is an object of the invention to optimally and independently render both vector and nonvector data.
These and other objects of the present invention will become apparent to those skilled in the art from a review of the specification that follows.
SUMMARY OF THE INVENTION
The above and other problems of the prior art are overcome in accordance with the present invention, which relates to a hybrid strategy for implementing a ZUI allowing an image to be displayed at a dynamically varying resolution as a user zooms in or out, rotates, pans, or otherwise changes his or her view of an image. Any such change in view is termed navigation. Zooming of the image to a resolution not equal to that of any of the predefined LODs is accomplished by displaying the image at a new resolution that is interpolated from predefined LODs that "surround" the desired resolution. By "surrounding LODs" we mean the LOD of lowest resolution which is greater than the desired resolution and the LOD of highest resolution which is less than the desired resolution. If the desired resolution is either greater than the resolution of the LOD with the highest available resolution or less than the resolution of the LOD with the lowest resolution, then there will be only a single "surrounding LOD". The dynamic interpolation of an image at a desired resolution based on a set of precomputed LODs is termed in the literature mipmapping or trilinear interpolation. The latter term further indicates that bilinear sampling is used to resample the surrounding LODs, followed by linear interpolation between these resampled LODs (hence trilinear). See, e.g., Lance Williams, "Pyramidal Parametrics," Computer Graphics (Proc. SIGGRAPH '83) 17(3): 1-11 (1983). The foregoing document is incorporated herein by reference in its entirety. Obvious modifications of or extensions to the mipmapping technique introduced by Williams use nonlinear resampling and/or interpolation of the surrounding LODs. In the present invention it is immaterial whether the resampling and interpolation operations are zeroth-order (nearest-neighbor), linear, higher-order, or more generally nonlinear. In accordance with the invention described herein, when the user defines an exact desired resolution, which is almost never the resolution of one of the predefined LODs, the final image is then displayed by preferably first displaying an intermediate final image. The intermediate final image is the first image displayed at the desired resolution before that image is refined as described hereafter. The intermediate final image may correspond to the image that would be displayed at the desired resolution using the prior art.
In a preferred embodiment, the transition from the intermediate final image to the final image may be gradual, as explained in more detail below.
In an enhanced embodiment, the present invention allows LODs to be spaced in any resolution increments, including irrational increments (i.e. magnification or minification factors between consecutive LODs which cannot be expressed as the ratio of two integers) , as explained in more detail below.
In another enhanced embodiment, portions of the image at each different LOD are denoted tiles, and such tiles are rendered in an order that minimizes any perceived imperfections to a viewer. In other embodiments, the displayed visual content is made up of plural LODs (potentially a superset of the surrounding LODs as described above) , each of which is displayed in the proper proportion and location in order to cause the display to gradually fade into the final image in a manner that conceals imperfections.
The rendition of various tiles in plural LODs is accomplished in an order that optimizes the appearance of the visual content while staying within acceptable levels of computational complexity so that the system can run on standard computers with typical clock speeds available in most laptop and desktop personal computers.
The present invention involves a hybrid strategy, in which an image is displayed using predefined LODs during rapid zooming and panning, but when the view stabilizes sufficiently, an exact LOD is rendered and displayed. The exact LOD is rendered and displayed at the precise resolution chosen by the user, which is normally different from the predefined LODs. Because the human visual system is insensitive to fine detail in the visual content while it is still in motion, this hybrid strategy can produce the illusion of continuous "perfect rendering" with far less computation.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 10 depicts an LOD pyramid (in this case the base of the pyramid, representing the highest-resolution representation, is a 512x512 sample image, and successive minifications of this image are shown in factors of 2) ; Figure 11 depicts a flow chart for use in an exemplary embodiment of the invention; Figure 12 is another flow chart that shows how the system displays the final image after zooming;
Figure 13 is the LOD pyramid of Figure 10 with grid lines added showing the subdivision of each LOD into rectangular tiles of equal size in samples;
Figure 14 is another flow chart, for use in connection with the present invention, and it depicts a process for displaying rendered tiles on a display;
Figure 15 shows a concept termed irrational tiling, explained in more detail herein;
Figure 16 depicts a composite tile and the tiles that make up the composite tile, as explained more fully below; and
FIG. 17 depicts successive stages of the rendering, in foveated order, of the second layer of the LOD pyramid of FIG. 13, the successive stages being illustrated in FIGS. 17A, 17B,
17C, and 17D, respectively, in accordance with one or more embodiments of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT Figure 11 shows a flowchart of a basic technique for implementation of the present invention. The flowchart of Figure 11 represents an exemplary embodiment of the invention and would begin executing when an image is displayed at an initial resolution. It is noted that the invention may be used in the client/server model, but that the client and server may be on the same or different machines. Thus, for example, there could be a set of discrete LODs stored remotely at a host computer, and the user can be connected to said host through a local PC. The actual hardware platform and system utilized are not critical to the present invention.
The flowchart is entered at start block 241 with an initial view of an image at a particular resolution. In this example, the image is taken to be static. The image is displayed at block 242. A user may navigate that image by moving, for example, a computer mouse. The initial view displayed at block 242 will change when the user navigates the image. It is noted that the underlying image may itself be dynamic, such as in the case of motion video, however, for purposes of this example, the image itself is treated as static. As explained above, any image to be displayed may also have textual or other vector data and/or nonvector data such as photographs and other images. The present invention, and the entire discussion below, is applicable regardless of whether the image comprises vector or nonvector data, or both.
Regardless of the type of visual content displayed in block 242, the method transfers control to decision point 243 at which navigation input may be detected. If such input is not detected, the method loops back to block 242 and continues displaying the stationary visual content. If a navigation input is detected, control will be transferred to block 244 as shown . Decision point 243 may be implemented by a continuous loop in software looking for a particular signal that detects movement, an interrupt system in hardware, or any other desired methodology. The particular technique utilized to detect and analyze the navigation request is not critical to the present invention. Regardless of the methodology used, the system can detect the request, thus indicating a desire to navigate the image. Although much of the discussion herein relates to zooming, it is noted that the techniques are applicable to zooming, panning, or otherwise navigating. Indeed, the techniques described herein are applicable to any type of dynamic transformation or change in perspective on the image. Such transformations may include, for example, three dimensional translation and rotation, application of an image filter, local stretching, dynamic spatial distortion applied to selected areas of the image, or any other kind of distortion that might reveal more information. Another example would be a virtual magnifying glass, that can get moved over the image and which magnifies parts of the image under the virtual magnifying glass. When decision point 243 detects that a user is initiating navigation, block 244 will then render and display a new view of the image, which may be, for example, at a different resolution from the prior displayed view.
One straightforward prior art technique of displaying the new view is based upon interpolating LODs as the user zooms in or out. The selected LODs may be those two LODs that "surround" the desired resolution, i.e., the resolution of the new view. The interpolation, in prior systems, constantly occurs as the user zooms and is thus often implemented directly in the hardware to achieve speed. The combination of detection of movement in decision point 245 and a substantially immediate display of an appropriate interpolated image at block 244 results in the image appearing to zoom continuously as the user navigates. During zooming in or out, since the image is moving, an interpolated image is sufficient to look realistic and clear. Any interpolation error is only minimally detectable by the human visual system, as such errors are disguised by the constantly changing view of the image.
At decision point 245, the system tests whether or not the movement has substantially ceased. This can be accomplished using a variety of techniques, including, for example, measuring the rate of change of one or more parameters of the view. That is, the methodology ascertains whether or not the user has arrived at the point where he has finished zooming. Upon such stabilization at decision point 245, control is transferred to block 246, where an exact image is rendered, after which control returns to block 243. Thus, at any desired resolution, the system will eventually display an exact LOD. Notably, the display is not simply rendered and displayed by an interpolation of two predefined LODs, but may be rendered and displayed by re-rendering vector data using the original algorithm used to render the text or other vector data when the initial view was displayed at block 242. Nonvector data may also be resampled for rendering and displayed at the exact required LOD. The required re-rendering or resampling may be performed not only at the precise resolution required for display at the desired resolution, but also on a sampling grid corresponding precisely to the correct positions of the display pixels relative to the underlying content, as calculated based on the desired view. As an example, translation of the image on the display by ½ pixel in the display plane does not change the required resolution, but it does alter the sampling grid, and therefore requires re-rendering or resampling of the exact LOD.
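By way of illustration only, the following Python sketch shows one way the rate-of-change test suggested above for decision point 245 might be implemented; the (center, zoom) view parameterization and the numeric thresholds are assumptions for the example.

```python
# Illustrative per-frame thresholds; a real system would tune these.
PAN_THRESHOLD = 0.001   # fraction of the view width per frame
ZOOM_THRESHOLD = 0.002  # relative zoom change per frame


def view_is_stable(previous_view, current_view):
    """Each view is assumed to be a (center_x, center_y, zoom) tuple;
    return True when the rate of change has fallen below both thresholds."""
    cx0, cy0, z0 = previous_view
    cx1, cy1, z1 = current_view
    pan_rate = abs(cx1 - cx0) + abs(cy1 - cy0)
    zoom_rate = abs(z1 - z0) / max(z0, 1e-9)
    return pan_rate < PAN_THRESHOLD and zoom_rate < ZOOM_THRESHOLD
```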
The foregoing system of Fig. 11 represents a hybrid approach in which interpolation based upon predefined LODs is utilized while the view is changing (e.g. navigation is occurring) but an exact view is rendered and displayed when the view becomes substantially stationary.
For purposes of explanation herein, the term render refers to the generation by the computer of a tile at a specific LOD based upon vector or nonvector data. With respect to nonvector data, these may be rerendered at an arbitrary resolution by resampling an original image at higher or lower resolution. We turn now to the methodology of rendering and displaying the different portions of the visual content needed to achieve an exact final image as represented by block 246 of Fig. 11. With reference to Fig. 12, when it is determined that navigation has ceased, control is transferred to block 333 and an interpolated image is immediately displayed, just as is the case during zooming. We call this interpolated image that may be temporarily displayed after the navigation ceases the intermediate final image, or simply an intermediate image. This image is generated from an interpolation of the surrounding LODs. In some cases, as explained in more detail below, the intermediate image may be interpolated from more than two discrete LODs, or from two discrete LODs other than the ones that surround the desired resolution. Once the intermediate image is displayed, block 334 is entered, which causes the image to begin to gradually fade towards an exact rendition of the image, which we term the final image. The final image differs from the intermediate image in that the final image may not involve interpolation of any predefined LODs. Instead, the final image, or portions thereof, may comprise newly rendered tiles. In the case of photographic data, the newly rendered tiles may result from resampling the original data, and in the case of vector data, the newly rendered tiles may result from rasterization at the desired resolution.
It is also noted that it is possible to skip directly from block 333 to 335, immediately replacing the interpolated image with a final and exact image. However, in the preferred embodiment, step 334 is executed so the changeover from the intermediate final image to the final image is done gradually and smoothly. This gradual fading, sometimes called blending, causes the image to come into focus gradually when navigation ceases, producing an effect similar to automatic focusing in cameras or other optical instruments. The illusion of physicality created by this effect is an important aspect of the present invention.
Following is a discussion of the manner in which this fading or blending may take place in order to minimize perceived irregularities, sudden changes, seams, and other imperfections in the image. It is understood however that the particular technique of fading is not critical to the present invention, and that many variations will be apparent to those of skill in the art.
Different LODs differ in the number of samples per physical area of the underlying visual content. Thus, a first LOD may take a 1 inch by 1 inch area of a viewable object and generate a single 32 by 32 sample tile. However, the information may also be rendered by taking the same 1 inch by 1 inch area and representing it as a tile that is 64 by 64 samples, and therefore at a higher resolution.
We define a concept called irrational tiling. Tiling granularity, which we will write as the variable g, is defined as the ratio of the linear tiling grid size at a higher- resolution LOD to the linear tiling grid size at the next lower-resolution LOD. In the Williams paper introducing trilinear interpolation, g = 2. This same value of g has been used in other prior art. Although LODs may be subdivided into tiles in any fashion, in an exemplary embodiment each LOD is subdivided into a grid of square or rectangular tiles containing a constant number of samples (except, as required, at the edges of the visual content) . Conceptually, when g = 2, each tile at a certain LOD "breaks up" into 2x2=4 tiles at the next higher-resolution LOD (again, except potentially at the edges), as shown in Figure 13.
There are fundamental shortcomings in tilings of granularity 2. Usually, if a user zooms in on a random point in a tile, every g-fold increase in zoom will require the rendition of a single additional tile corresponding to the next higher-resolution LOD near the point toward which the user is zooming. However, if a user is zooming in on a grid line in the tiling grid, then two new tiles need to be rendered, one on either side of the line. Finally, if a user is zooming in on the intersection of two grid lines, then four new tiles need to be rendered. If these events—requests for 1, 2 or 4 new tiles with each g-fold zoom—are interspersed randomly throughout an extended zooming sequence, then overall performance will be consistent. However, a grid line in any integral-granularity tiling (i.e. where g is a whole number) remains a grid line for every higher-resolution LOD.
Consider, for example, zooming in on the center of a very large image tiled with granularity 2. We will write the (x,y) coordinates of this point as (½, ½), adopting the convention that the visual content falls within a square with corners (0,0), (0,1), (1,0) and (1,1). Because the center is at the intersection of two grid lines, as the user reaches each higher-resolution LOD, four new tiles need to be rendered every time; this will result in slow performance and inefficiency for zooming on this particular point. Suppose, on the other hand, that the user zooms in on an irrational point—meaning a point (x,y) such that x and y cannot be expressed as the ratios of two whole numbers. Examples of such numbers are pi (=3.14159...) and the square root of 2 (=1.414213...). Then, it can easily be demonstrated that the sequence of 1's, 2's and 4's given by the number of tiles that need to be rendered for every g-fold zoom is quasi-random, i.e. follows no periodic pattern. This kind of quasi-random sequence is clearly more desirable from the point of view of performance; then there are no distinguished points for zooming from a performance standpoint. Irrational tiling resolves this issue: g itself is taken to be an irrational number, typically the square root of 3, 5 or 12. Although this means that on average 3, 5 or 12 tiles
(correspondingly) at a given LOD are contained within a single tile at the next lower-resolution LOD, note that the tiling grids at consecutive LODs no longer "agree" on any grid lines in this scheme (except potentially at the leading edges of the visual content, x=0 and y=0, or at some other preselected single grid line along each axis) . If g is chosen such that it is not the nth root of any integer (pi is such a number) , then no LODs will share any grid lines (again, potentially except x=0 and y=0) . Hence it can be shown that each tile may randomly overlap 1, 2, or 4 tiles at the next lower LOD, whereas with g=2 this number is always 1. With irrational tiling granularity, zooming in on any point will therefore produce a quasi-random stream of requests for 1, 2 or 4 tiles, and performance will be on average uniform when zooming in everywhere. Perhaps the greatest benefit of irrational tiling emerges in connection with panning after a deep zoom. When the user pans the image after having zoomed in deeply, at some point a grid line will be moved onto the display. It will usually be the case that the region on the other side of this grid line will correspond to a lower-resolution LOD than the rest of the display; it is desirable, however, for the difference between these resolutions to be as small as possible. With integral g, however, the difference will often be extremely large, because grid lines can overlap over many consecutive LODs. This creates "deep cracks" in resolution over the node area, as shown in Figure 15 (a) .
On the other hand, because grid lines in an irrational tiling never overlap those of an adjacent LOD (again with the possible exception of one grid line in each direction, which may be at one corner of the image) , discontinuities in resolution of more than one LOD do not occur. This increased smoothness in relative resolution allows the illusion of spatial continuity to be much more convincing. Figure 15 (b) illustrates the advantage gained by irrational tiling granularity. Figure 15 shows cross-sections through several LODs of the visual content; each bar represents a cross-section of a rectangular tile. Hence the second level from the top, in which there are two bars, might be a 2x2=4 tile LOD. The curves 631, drawn from top to bottom, represent the bounds of the visible area of the visual content at the relevant LOD during a zooming operation: as the resolution is increased (zooming in to reveal more detail) , the area under examination decreases. Darker bars (e.g., 632) represent tiles which have already been rendered over the course of the zoom. Lighter bars have not yet been rendered, so cannot be displayed. Note that when the tiling is integral as in Figure 15 (a) , abrupt changes in resolution over space are common; if the user were to pan right after the zoom, then at the spatial boundary indicated by the arrow, four LODs would "end" abruptly. The resulting image would look sharp to the left of this boundary, and extremely blurry to the right. The same visual content represented using an irrational tiling granularity lacks such resolution "cracks": adjacent LODs do not share tile boundaries, except as shown at the left edge. Mathematically, this shared boundary may occur at most in one position on the x-axis and at one position on the y-axis. In the embodiment shown these shared boundaries are positioned at y=0 and x=0, but, if present, they may also be placed at any other position.
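By way of illustration only, the following Python sketch gives a toy model of the 1-, 2-, or 4-tile request pattern discussed above: a grid line is counted as crossed when the view window around the zoom point straddles it at the current LOD. The window half-width and the exact geometric criterion are simplifying assumptions, not taken from the disclosure.

```python
import math


def tiles_needed_per_level(x, y, g, num_levels, window=0.3):
    """Toy model: at LOD k the tile grid has g**k tiles per unit length, and
    'window' is the assumed half-width of the view in tile units. Returns the
    number of new tiles (1, 2, or 4) requested at each g-fold zoom toward (x, y)."""
    counts = []
    for k in range(1, num_levels + 1):
        fx = (x * g ** k) % 1.0
        fy = (y * g ** k) % 1.0
        near_x = fx < window or fx > 1.0 - window  # view straddles a vertical grid line
        near_y = fy < window or fy > 1.0 - window  # view straddles a horizontal grid line
        counts.append(4 if (near_x and near_y) else 2 if (near_x or near_y) else 1)
    return counts


# Zooming on the center with g = 2 requests 4 tiles at every level ...
print(tiles_needed_per_level(0.5, 0.5, 2, 8))
# ... while an irrational point produces a quasi-random mix of 1's, 2's and 4's.
print(tiles_needed_per_level(1 / math.pi, 1 / math.e, 2, 8))
```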
Another benefit of irrational tiling granularity is that it allows finer control of g, since there are a great many more irrational numbers than integers, particularly over the useful range where g is not too large. This additional freedom can be useful for tuning the zooming performance of certain applications. If g is set to the irrational square root of an integer (such as sqrt(2), sqrt(5) or sqrt(8)), then in the embodiment described above the grid lines of alternate LODs would align exactly; if g is an irrational cube root, then every third LOD would align exactly; and so on. This confers an additional benefit with respect to limiting the complexity of a composite tiling, as defined below. An important aspect of the invention is the order in which the tiles are rendered. More particularly, the various tiles of the various LODs are optimally rendered such that all visible tiles are rendered first. Nonvisible tiles may not be rendered at all. Within the set of visible tiles, rendition proceeds in order of increasing resolution, so that tiles within low-resolution LODs are rendered first. Within any particular LOD, tiles are rendered in order of increasing distance from the center of the display, which we refer to as foveated rendering. To sort such tiles in the described order, numerous sorting algorithms such as heapsort, quicksort, or others may be used. To implement this ordering, a lexicographic key may be used for sorting "requests" to render tiles, such that the outer subkey is visibility, the middle subkey is resolution in samples per physical unit, and the inner subkey is distance to the center of the display. Other methods for ordering tile rendering requests may also be used. The actual rendering of the tiles optimally takes place as a parallel process with the navigation and display described herein. When rendering and navigation/display proceed as parallel processes, user responsiveness may remain high even when tiles are slow to render.
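By way of illustration only, the following Python sketch expresses the lexicographic render-ordering key described above (visibility first, then lower resolution before higher, then distance to the display center); the tile fields used here are illustrative assumptions.

```python
import math


def render_priority_key(tile, display_center):
    """Lexicographic key: visible tiles first, coarser LODs before finer ones,
    and, within an LOD, tiles nearer the display center first (foveated order)."""
    dx = tile["center_x"] - display_center[0]
    dy = tile["center_y"] - display_center[1]
    return (
        0 if tile["visible"] else 1,   # outer subkey: visibility
        tile["samples_per_unit"],      # middle subkey: resolution
        math.hypot(dx, dy),            # inner subkey: distance to display center
    )


requests = [
    {"visible": True,  "samples_per_unit": 256, "center_x": 0.9, "center_y": 0.1},
    {"visible": True,  "samples_per_unit": 64,  "center_x": 0.5, "center_y": 0.5},
    {"visible": False, "samples_per_unit": 64,  "center_x": 0.2, "center_y": 0.2},
]
requests.sort(key=lambda t: render_priority_key(t, (0.5, 0.5)))
```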
We now describe the process of rendering a tile in an exemplary embodiment. If a tile represents vector data, such as alphabetic typography in a stroke based font, then rendering of the tile would involve running the algorithm to rasterize the alphabetic data and possibly transmitting that data to a client from a server. Alternatively, the data fed to the rasterization algorithm could be sent to the client, and the client could run the algorithm to rasterize the tile. In another example, rendering of a tile involving digitally sampled photographic data could involve resampling of that data to generate the tile at the appropriate LOD. For discrete LODs that are prestored, rendering may involve no more than simply transmitting the tile to a client computer for subsequent display. For tiles that fall between discrete LODs, such as tiles in the final image, some further calculation as described above may be required. At any given time, when the tiles are rendered and the image begins to fade toward the exact image, the actual display may comprise different mixes of different tiles from different LODs. Thus, any portion of the display could contain for example, 20% from LOD 1, 40% from LOD 2, and 40% from LOD 3. Regardless of the tiles displayed, the algorithm attempts to render tiles from the various LODs in a priority order best suited to supply the rendered tiles for display as they are most needed. The actual display of the rendered tiles will be explained in more detail later with reference to Figure 14.
In what follows we describe a method for drawing the plural LODs using an algorithm which can guarantee spatial and temporal continuity of image detail. The algorithm is designed to make the best use of all rendered tiles, using high-resolution tiles in preference to lower-resolution tiles covering the same display area, yet using spatial blending to avoid sharp boundaries between LODs, and temporally graduated blending weights to blend in higher detail if and when it becomes available (i.e. when higher-resolution tiles have been rendered). Unlike the prior art, this algorithm and variants thereof can result in more than two LODs being blended together at a given point on the display; it can also result in blending coefficients that vary smoothly over the display area; and it can result in blending coefficients that evolve in time even after the user has stopped navigating. In this exemplary embodiment it is nonetheless computationally efficient, and can be used to render imagery as partially transparent, or with an overall transparency that varies over the image area, as will become apparent.
We define herein a composite tile area, or simply a composite tile. To define a composite tile we consider all of the LODs stacked on top of each other. Each LOD has its own tile grid. The composite grid is then formed by the projection of all of the grids from all of the LODs onto a single plane. The composite grid is then made up of various composite tiles of different sizes, defined by the boundaries of tiles from all of the different LODs. This is shown conceptually in Fig. 16. Fig. 16 depicts the tiles from three different LODs, 701 through 703, all representing the same image. One can imagine the LODs 701 through 703 being stacked up on top of each other. In such a case, if one lined up corner 750 from each of these LODs and stacked them on top of each other, an area represented by 740 would be inside the area represented by 730, and the areas represented by 730 and 740, would be inside the area represented by 720. Area 710 of Fig. 16 shows that there would be a single "composite tile" 710. Each of the composite tiles is examined during each frame, wherein the frame rate may be typically greater than ten frames per second. Note that, as explained above, this frame rate is not necessarily the display refresh rate. Fig. 14 depicts a flow chart of an algorithm for updating the frame buffer as tiles are rendered. The arrangement of Fig. 14 is intended to operate on every composite tile in the displayed image each time the frame buffer is updated. Thus, for example, if a frame duration is 1/20 of a second, each of the composite tiles on the entire screen would preferably be examined and updated during each 1/20 of a second. When a composite tile is operated upon by the process of Fig. 14, the composite tile may lack the relevant tiles in one or more LODs. The process of Fig. 14 attempts to display each composite tile as a weighted average of all the available superimposed tiles within which the composite tile lies. Note that composite tiles are defined in such a way that they fall within exactly one tile at any given LOD; hence the weighted average can be expressed as a relative proportion of each LOD. The process attempts to determine the appropriate weights for each LOD within the composite tile, and to vary those weights gradually over space and time to cause the image to gradually fade towards the final image discussed above. The composite grid includes plural vertices which are defined to be any intersection or corner of gridlines in the composite grid. These are termed composite grid vertices. We define an opacity for each LOD at each composite grid vertex. The opacity can be expressed as a weight between 0.0 and 1.0, and the sum of all the LOD weights at each vertex should therefore be 1.0 if the desired result is for the image to be totally opaque. The current weights at any particular time for each LOD at each vertex are maintained in memory.
The algorithm for updating vertex weights proceeds as described below.
The following variables, which are taken to be numbers between 0.0 and 1.0, are kept in memory for each tile: centerOpacity, cornerOpacity for each corner (4 if the tiling is a rectangular grid) , and edgeOpacity for each edge (4 if the tiling is a rectangular grid) . When a tile is first rendered, all of its opacities as just listed are normally set to 1.0. During a drawing pass, the algorithm walks through the composite tiling once for each relevant LOD, beginning with the highest-resolution LOD. In addition to the per-tile variables, the algorithm maintains the following variables: levelOpacityGrid and opacityGrid. Both of these variables are again numbers between 0.0 and 1.0, and are maintained for each vertex in the composite tiling.
The algorithm walks through each LOD in turn, in order from highest-resolution to lowest, performing the following operations. First 0.0 is assigned to levelOpacityGrid at all vertices. Then, for each rendered tile at that LOD (which may be a subset of the set of tiles at that LOD, if some have not yet been rendered) , the algorithm updates the parts of the levelOpacityGrid touching that tile based on the tile's centerOpacity, cornerOpacity and edgeOpacity values : If the vertex is entirely in the interior of the tile, then it gets updated using centerOpacity.
If the vertex is e.g. on the tile's left edge, it gets updated with the left edgeOpacity.
If the vertex is e.g. on the top right corner, it gets updated with the top right cornerOpacity.
"Updating" means the following: if the pre-existing levelOpacityGrid value is greater than 0.0, then set the new value to the minimum of the present value, or the value it's being updated with. If the pre-existing value is zero (i.e. this vertex hasn't been touched yet) then just set the levelOpacityGrid value to the value it's being updated with. The end result is that the levelOpacityGrid at each vertex position gets set to the minimum nonzero value with which it gets updated.
The algorithm then walks through the levelOpacityGrid and sets to 0.0 any vertices that touch a tile which has not yet been rendered, termed a hole. This ensures spatial continuity of blending: wherever a composite tile falls within a hole, at the current LOD, drawing opacity should fade to zero at all vertices abutting that hole.
In an enhanced embodiment, the algorithm can then relax all levelOpacityGrid values to further improve spatial continuity of LOD blending. The situation as described thus far can be visualized as follows: every vertex is like a tentpole, where the levelOpacityGrid value at that point are the tentpole' s height. The algorithm has thus far ensured that at all points bordering on a hole, the tentpoles have zero height; and in the interior of tiles that have been rendered, the tentpoles are set to some (probably) nonzero value. In the extreme case, perhaps all the values inside a rendered tile are set to 1.0. Assume for purposes of illustration that the rendered tile has no rendered neighbors yet, so the border values are 0.0. We have not specified how narrow the "margin" is between a 0.0 border tentpole and one of the 1.0 internal tentpoles. If this margin is too small, then even though the blending is technically continuous, the transition may be too sharp when measured as an opacity derivative over space. The relax operation smoothes out the tent, always preserving values of 0.0, but possibly lowering other tentpoles to make the function defined by the tent surface smoother, i.e. limiting its maximum spatial derivative. It is immaterial to the invention which of a variety of methods are used to implement this operation; one approach, for example, is to use selective low-pass filtering, locally replacing every nonzero value with a weighted average of its neighbors while leaving zeroes intact. Other methods will also be apparent to those skilled in the art.
The algorithm then walks over all composite grid vertices, considering corresponding values of levelOpacityGrid and opacityGrid at each vertex: if levelOpacityGrid is greater than 1.0 - opacityGrid, then levelOpacityGrid is set to 1.0 - opacityGrid. Then, again for each vertex, corresponding values of levelOpacityGrid are added to opacityGrid. Due to the previous step, this can never bring opacityGrid above 1.0. These steps in the algorithm ensure that as much opacity as possible is contributed by higher-resolution LODs when they are available, allowing lower-resolution LODs to "show through" only where there are holes.
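The clamp-and-accumulate step may be sketched as follows; the vector representation of the two grids is an assumption of this sketch:

    #include <cstddef>
    #include <vector>

    // Clamp this level's contribution to the opacity not yet claimed by
    // higher-resolution LODs, then record it in the running total.
    void accumulateLevel(std::vector<float>& levelOpacityGrid,
                         std::vector<float>& opacityGrid)
    {
        for (std::size_t i = 0; i < levelOpacityGrid.size(); ++i) {
            float remaining = 1.0f - opacityGrid[i];
            if (levelOpacityGrid[i] > remaining)
                levelOpacityGrid[i] = remaining;     // never let the total exceed 1.0
            opacityGrid[i] += levelOpacityGrid[i];   // lower LODs show through only what is left
        }
    }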
The final step in the traversal of the current LOD is to actually draw the composite tiles at the current LOD, using levelOpacityGrid as the per-vertex opacity values. In an enhanced embodiment, levelOpacityGrid can be multiplied by a scalar overallOpacity variable in the range 0.0 to 1.0 just before drawing; this allows the entire image to be drawn with partial transparency given by the overallOpacity. Note that drawing an image-containing polygon, such as a rectangle, with different opacities at each vertex is a standard procedure. It can be accomplished, for example, using industry-standard texture mapping functions using the OpenGL or Direct3D graphics libraries. In practice, the drawn opacity within the interior of each such polygon is spatially interpolated, resulting in a smooth change in opacity over the polygon.
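As one example of such industry-standard texture mapping, a tile could be drawn with differing per-vertex opacities using the legacy OpenGL immediate-mode calls below. The sketch assumes a current OpenGL context, an orthographic projection, and a texture already uploaded under the name tex; these assumptions, and the function itself, are illustrative rather than a required implementation:

    #include <GL/gl.h>

    // Draw one textured tile as a quad whose opacity differs at each corner.
    // The fixed-function pipeline interpolates the per-vertex alpha smoothly
    // across the quad's interior.
    void drawTileWithCornerOpacities(unsigned int tex,
                                     float x0, float y0, float x1, float y1,
                                     const float corner[4])  // opacity at (x0,y0),(x1,y0),(x1,y1),(x0,y1)
    {
        glEnable(GL_TEXTURE_2D);
        glBindTexture(GL_TEXTURE_2D, tex);
        glEnable(GL_BLEND);
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

        glBegin(GL_QUADS);
        glColor4f(1.0f, 1.0f, 1.0f, corner[0]); glTexCoord2f(0.0f, 0.0f); glVertex2f(x0, y0);
        glColor4f(1.0f, 1.0f, 1.0f, corner[1]); glTexCoord2f(1.0f, 0.0f); glVertex2f(x1, y0);
        glColor4f(1.0f, 1.0f, 1.0f, corner[2]); glTexCoord2f(1.0f, 1.0f); glVertex2f(x1, y1);
        glColor4f(1.0f, 1.0f, 1.0f, corner[3]); glTexCoord2f(0.0f, 1.0f); glVertex2f(x0, y1);
        glEnd();
    }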
In another enhanced embodiment of the algorithm described above, tiles maintain not only their current values of centerOpacity, cornerOpacity and edgeOpacity (called the current values), but also a parallel set of values called targetCenterOpacity, targetCornerOpacity and targetEdgeOpacity (called the target values). In this enhanced embodiment, the current values are all set to 0.0 when a tile is first rendered, but the target values are all set to 1.0. Then, after each frame, the current values are adjusted to new values closer to the target values. This may be implemented using a number of mathematical formulae, but as an example, it can be done in the following way: newValue = oldValue*(1 - b) + targetValue*b, where b is a rate greater than 0.0 and less than 1.0. A value of b close to 0.0 will result in a very slow transition toward the target value, and a value of b close to 1.0 will result in a very rapid transition toward the target value. This method of updating opacities results in exponential convergence toward the target, and results in a visually pleasing impression of temporal continuity. Other formulae can achieve the same result. FIG. 17 depicts successive stages of the rendering, in foveated order, of the second layer of the LOD pyramid of FIG. 17, in accordance with one or more embodiments of the present invention.
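The per-frame adjustment toward the target values might be sketched as follows; the packing of the nine opacity values into one array is an assumption of this sketch:

    // Current and target opacities of one tile. The packing of the nine values
    // (centerOpacity, four edgeOpacity, four cornerOpacity) into one array is
    // an assumption made for illustration.
    struct TileBlendState {
        float current[9];
        float target[9];
    };

    // Per-frame update: b is the rate, with 0.0 < b < 1.0; values of b near 0.0
    // give a slow fade and values near 1.0 a rapid fade. Repeated application
    // converges exponentially toward the targets.
    void stepTowardTargets(TileBlendState& t, float b)
    {
        for (int i = 0; i < 9; ++i)
            t.current[i] = t.current[i] * (1.0f - b) + t.target[i] * b;
    }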
The foregoing describes the preferred embodiment of the present invention. The invention is not limited to such preferred embodiment, and various modifications consistent with the appended claims are included within the invention as well .
A SYSTEM AND METHOD FOR MULTIPLE NODE DISPLAY
BACKGROUND OF THE INVENTION
The present invention relates to zooming user interfaces (ZUI) for computers.
Most present-day graphical computer user interfaces are designed using visual components of a fixed spatial scale. The visual content can be manipulated by zooming in or out or otherwise navigating through it. However, the precision with which coordinates of various objects can be represented is extremely limited by the number of bits, usually between 16 and 64, designated to represent such coordinates. Because of their limited representational size, there is limited precision. In the context of the zooming user interface, the user is easily able to zoom in, causing the area which previously covered only a single pixel to fill the entire display. Conversely, the user may zoom out, causing the contents of the entire display to shrink to the size of a single pixel. Since each zoom in or out may multiply or divide the xy coordinates by numerous orders of magnitude, just a few such zooms completely exhaust the precision available with a 64-bit floating point number, for example. Thereafter, round-off causes noticeable degradation of image quality. It is an object of the present invention to provide a ZUI in which a larger range of zooms is possible.
It is a further object of the invention to provide a ZUI in which the precision in which coordinates are represented is related to the required precision needed at a particular zoom level of detail. It is a further object of the present invention to allow a pannable and zoomable two-dimensional space of a finite physical size, but of an arbitrarily high complexity or resolution, to be embedded into a well-defined area of a larger pannable and zoomable two-dimensional space. A further objective of the present invention is to allow zooming out after a deep zoom-in to behave like the "back" button of a web browser, letting the user retrace his or her steps through a visual navigation. A further objective of the present invention is to allow zooming in immediately after zooming out to behave analogously to the "forward" button of a web browser, letting the user precisely undo the effects of an arbitrarily long zoom-out.
A further objective of the present invention is to allow a node, a visual object as defined more precisely below, to have a very large number of child nodes (for example, up to 10^28).
A further objective of the present invention is to allow a node to generate its own children programmatically on the fly, enabling content to be defined, created or modified dynamically during navigation.
A further objective of the present invention is to enable near-immediate viewing of arbitrarily complex visual content, even if this content is ultimately represented using a very large amount of data, and even if the data are stored at a remote location and shared over a low-bandwidth network.
A further objective of the present invention is to allow the user to zoom arbitrarily far in on visual content while maintaining interactive frame rates. A further objective of the present invention is to allow the user to zoom arbitrarily far out to get an overview of complex visual content, in the process both preserving the overall appearance of the content and maintaining interactive frame rates. These and other broader objectives of the present invention will become apparent to those skilled in the art from a review of the specification that follows.
SUMMARY OF THE INVENTION
The above and other objects of the present invention are accomplished by displaying visual content as plural "nodes." Each node preferably has its own coordinate system and rendering method, but may be contained within a parent node, and may be represented in the coordinate system and rendering method of the parent node. As a user navigates the visual content, by, for example, zooming in or out, a node is only "launched" when the zooming results in an appropriate level of detail. The launching of the node causes the node to be represented in its own coordinate system and/or rendering method, rather than in the coordinate system and/or rendering method of a different node.
Prior to the node being launched, the node is either represented in the coordinate system of the parent node, or not represented at all. By launching nodes only when they are required, the precision of a coordinate system is a function of the zoom level of detail of what is being displayed. This allows a variable level of precision, up to and including the maximum permissible by the memory of the computer in which the system operates.
DESCRIPTION OF THE DRAWINGS
For the purposes of illustration, there are forms shown in the drawings that are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
FIG. 18 is a depiction of visual content on a display;
FIG. 19 is an image of the visual content of FIG. 18 at a different level of detail;
FIG. 20 is a representation of an embodiment of the invention;
FIG. 21 is an exemplary embodiment of the invention showing plural nodes on a display;
FIG. 22 is a tree diagram corresponding to the exemplary embodiment shown in FIG. 21; and
FIG. 23 is a block diagram corresponding to a portion of the tree diagram of FIG. 22.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
We assume a user interface metaphor in which the display is a camera, through which the user can view part of a two-dimensional surface, or 2D universe. For convenience, although it is not necessary to do so, we ascribe physical dimensions to this universe, so that it may be, for example, one meter square. The invention is equally applicable to N-dimensional representations.
The exemplary universe in turn contains 2D objects, or nodes, which have a visual representation, and may also be dynamic or interactive (i.e. video clips, applications, editable text documents, CAD drawings, or still images) . For a node to be visible it must be associated with a rendering method, which is able to draw it in whole or in part on some area of the display. Each node is also endowed with a local coordinate system of finite precision. For illustrative purposes, we assume a node is rectangular and represented by a local coordinate system.
These two parameters, the rendering method and coordinate system, specify how to display the node, and the positions of items in the node. Each node may have 0 or more child nodes, which are addressed by reference. The node need not, and generally does not, contain all the information of each child node, but instead only an address providing information necessary to obtain the child node. As a user navigates, for example, zooms in and out, the nodes are displayed on the screen, as shown, for example in FIG. 18.
Generally, a "node" is the basic unit of functionality in the present invention. Most nodes manifest visually on the user's display during navigation, and some nodes may also be animated and/or respond to user input. Nodes are hierarchical, in that a node may contain child nodes. The containing node is then called a parent node. When a parent node contains a child node, the child's visual manifestation is also contained within the parent's visual manifestation. Each node has a logical coordinate system, such that the entire extent of the node is contained within an exemplary rectangle defined in this logical coordinate system; e.g. a node may define a logical coordinate system such that it is contained in the rectangle (0, 0) - (100, 100) .
Each node may have the following data defining its properties:
o the node's logical coordinate system, including its logical size (100 x 100 in the above example);
o the identities, positions and sizes of any child nodes, specified in the (parent) node's logical coordinates;
o optionally, any necessary user data;
- executable code defining these operations or "methods":
o initialization of the node's data based on "construction arguments";
o rendering all or a portion of the node's visual appearance (the output of this method is a rendered tile);
o optionally, responding to user input, such as keyboard or mouse events.
The executable code defines a "node class", and may be shared among many "node instances". Node instances differ in their data content. Hence a node class might define the logic needed to render a JPEG image. The "construction arguments" given to the initialization code would then include the URL of the JPEG image to display. A node displaying a particular image would be an instance of the JPEG node class. Plural instances of a node may be viewable in the same visual content, similar to the way a software application may be instantiated numerous times simultaneously.

Note that in a complex visual document or application, it is usually possible to divide the necessary functionality into nodes in many different ways. For example, a scripted web-page-like document containing multiple images, pull-down menus and buttons could be implemented as a single node with complex rendering and user input methods. Alternatively, it could be implemented as a parent node which only defines the overall layout of the page, with every constituent image and button a child node. This has the obvious advantage of reusing or "factoring" the functionality more effectively: the buttons may all have the same behavior, and hence all be instances of the same node class; the images may all be in the same format and so also be instances of a common node class, etc. This also simplifies rearranging the layout: the parent node can easily move or resize the child nodes.

In accordance with the present invention, visual content may be displayed in a manner that depends upon the state of navigation input by a user. For example, FIG. 18 shows a node 105 which may be the image of a portion of the city. Node 105 may contain child nodes 101-103. Node 101 may be an image of a building in the city, node 107 could be an image of a playground, and node 103 might be a sports arena. At the level of zoom shown, nodes 101-103 are relatively small, so they can be represented as a small darkened area with no detail in node 105, located at the correct location in the coordinate system of node 105. Only the coordinate system and the rendering method of node 105 are needed.
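The node class / node instance distinction and the data listed above might be sketched as follows; the type names (Node, ChildRef, RenderedTile, JpegNode) and the particular interface are assumptions introduced for illustration only:

    #include <string>
    #include <utility>
    #include <vector>

    struct RenderedTile { /* pixel data for part of a node's visual appearance */ };

    // A reference to a child: only an address and a placement, never the child's full data.
    struct ChildRef {
        std::string address;   // information needed to obtain the child node
        double x, y, w, h;     // position and size in the parent's logical coordinates
    };

    // A "node class" is the executable code; a "node instance" is one object of that
    // class, differing from its siblings only in data content.
    class Node {
    public:
        virtual ~Node() = default;
        // Render all or part of the node's visual appearance; the output is a rendered tile.
        virtual RenderedTile render(double x, double y, double w, double h) = 0;
        double logicalWidth = 100.0, logicalHeight = 100.0;  // the node's logical coordinate system
        std::vector<ChildRef> children;
    };

    // Example node class that knows how to render a JPEG image. The URL is the
    // "construction argument"; a node displaying a particular image is an instance.
    class JpegNode : public Node {
    public:
        explicit JpegNode(std::string url) : url_(std::move(url)) {}
        RenderedTile render(double, double, double, double) override {
            // decode the JPEG at url_ and rasterize the requested region (omitted)
            return RenderedTile{};
        }
    private:
        std::string url_;
    };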
Consider the case where the user now zooms in so that a different level of detail (LOD) such as that shown in FIG. 19 results. In the LOD of FIG. 19, nodes 101 and 107 are no longer visible on the screen, due to the fact that the visual content is displayed as much larger. Additionally, it is noted that the because the size at which sports arena node 103 is displayed is now much larger, the details of the sports arena, such as the individual seats, the field, etc, now must be displayed.
In furtherance of the foregoing, sports arena node 103 would now be displayed not as a darkened area with no detail in the coordinate system of node 105, but rather, it would be "launched" to be displayed using its own coordinate system and rendering method. When displayed using its own coordinate system and rendering method, the details such as seating, the field of play, etc. would be individually shown. Other functions discussed above, and associated with the node 103, would also begin executing at the point when node 103 is launched. The particular navigation condition that causes the launching of node 103, or any node for that matter, is a function of design choice and is not critical to the present invention.

The precision with which the node 103 will be displayed is the combined precision of the coordinate system utilized by node 105, as well as that of node 103. Thus, for example, if the coordinate system of each of said nodes utilizes 8 bits, then the combined precision will be 16 bits because the coordinate system of node 103 is only utilized to specify the position of items in node 103, but the overall location of node 103 within node 105 is specified within the coordinate system of node 105. Note that this nesting may continue repeatedly if sports arena 103 itself contains additional nodes within it. For example, one such node 201 may in fact be a particular concession stand within the sports arena. It is represented without much detail in the coordinate system and rendering method of node 103. As a user continues zooming in on sports arena 103, at some point node 201 will launch. If it is displayed using 8 bits of precision, those 8 bits will specify where within the node 201 coordinate system particular items are to be displayed. Yet, the location of node 201 within node 103 will be maintained to 8 bits of precision within the coordinate system of node 103, the location of which will in turn be maintained within the coordinate system of node 105 using 8 bits. Hence, items within node 201 will ultimately be displayed using 24 bits of precision.

By nesting nodes within nodes, the precision at which visual content may ultimately be displayed is limited only by the memory capacity of the computer. The ultimate precision with which visual content in a node is displayed after that node is launched is effectively the combined precision of all parent nodes and the precision of the node that has launched. Hence, depending upon the level of nesting, the precision may increase as needed, limited only by the storage capacity of the computer, which is almost always much more than sufficient. Additionally, the increased precision is only utilized when necessary, because if the image is at an LOD that does not require launching, then in accordance with the above description, it will only be displayed with the precision of the node within which it is contained if that node has been launched. Thus, for nodes nested within other nodes, as one moves from the outermost node inward, one may traverse nodes that have launched until finally reaching a node that has not launched yet. Any such unlaunched node, and nodes further within it, will be displayed only with the precision of the last traversed node that has launched.
This results in an "accordion" type precision, wherein the precision at which visual content is displayed expands and contracts as necessary and as dictated by the navigational input of the user, maximizing the efficiency of system resources by using them only when necessary for higher precision .
It is also noted that when a node launches, the display of that node changes from being based upon the coordinates and rendering method of the parent node to the coordinates and rendering method of the child node. That change is optimally made gradual through the use of blending, as described, for example, in copending US patent application no. 10/790,253. However, other methodologies of gradually changing from the display of the information in the coordinate system and rendering method of the parent node to those of the child node are possible. The system could be programmed, for example, so that the blending from parent to child occurs over a particular range. Then, as the user traverses through that range during a zoom, the changeover occurs, unless the navigation is ceased during that range, in which case the blending may continue until fully displayed in the appropriate coordinate system.
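One possible range-based blend is sketched below; the zLaunchStart and zLaunchEnd parameters and the linear ramp are illustrative choices, with zoom taken as a magnification value that increases as the user zooms in:

    #include <algorithm>

    // Opacity of the launched child: 0.0 while it is still drawn as part of its
    // parent, 1.0 once it is fully drawn in its own coordinate system and with
    // its own rendering method, and a linear ramp across the launch range.
    float childBlendFactor(double zoom, double zLaunchStart, double zLaunchEnd)
    {
        double t = (zoom - zLaunchStart) / (zLaunchEnd - zLaunchStart);
        return static_cast<float>(std::clamp(t, 0.0, 1.0));
    }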
An additional issue solved by the present invention relates to a system for maintaining the spatial interrelationship among all nodes during display. More particularly, during dynamic navigation such as zooming and panning, many different coordinate systems are being used to display potentially different nodes. Some nodes, as explained above, are being displayed merely as an image in the coordinate system of other nodes, and some are being displayed in their own coordinate systems. Indeed, the entire visual display may be populated with nodes displayed at different positions in different coordinate systems, and the coordinate systems and precisions used for the various nodes may vary during navigation as nodes are launched. Hence, it is important to ensure that the nodes are properly located with respect to each other, because each node is only knowledgeable of its own coordinate system. The present invention provides a technique for propagating relative location information among all of the nodes and for updating that information when needed so that each node will "know" the proper position in the overall view at which it should render itself.
The foregoing may be accomplished with the addition of a field to the node structure and an additional address stack data structure. The expanded node definition includes a field which we term the "view" field, and which is used by the node to locate itself relative to the entire display. The view field represents, in the coordinates of that node, the visible area of the node—that is, the image of the display rectangle in the node's coordinates. This rectangle may only partially overlap the node's area, as when the node is partially offscreen. Clearly the view field cannot always be kept updated for every node, as we cannot necessarily traverse the entire directed graph of nodes in real time as navigation occurs. The stack structure is defined thus:
Stack<Address> viewStack;
where this stack is a global variable of the client (the computer connected to the display). For exemplary purposes we assume that navigation begins with an overview of a universe of content, defined by a root node; then this root node is pushed onto the viewStack, and the root node's view field might be initialized to be the entire area of the root node, i.e.
rootNode.view = rootNode.coordSystem;
Push(viewStack, rootNode);
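Elaborated into compilable form, and with placeholder types that are assumptions of this sketch only, the fragment above might read:

    #include <stack>
    #include <string>

    struct Rect { double x, y, w, h; };
    using Address = std::string;   // enough information to obtain a node (assumed representation)

    struct Node {
        Rect coordSystem;          // the node's full logical extent
        Rect view;                 // image of the display rectangle in this node's coordinates
    };

    std::stack<Address> viewStack; // global variable of the client, as in the text

    void beginNavigation(Node& rootNode, const Address& rootAddress)
    {
        rootNode.view = rootNode.coordSystem;  // the initial view is an overview of the universe
        viewStack.push(rootAddress);
    }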
Schematically, the viewStack will specify the addresses of a sequence of nodes "pierced" by a point relative to the display, which we will take in our exemplary implementation to be the center of the display. This sequence must begin with the root node, but may be infinite, and therefore must be truncated. In an exemplary embodiment, the sequence is truncated when the nodes "pierced" become smaller than some minimum size, defined as minimumArea. The current view is then represented by the view fields of all of the nodes in the viewStack, each of which specify the current view in terms of the node's local coordinate system. If a user has zoomed very deeply into a universe, then the detailed location of the display will be given most precisely by the view field of the last node in the stack. The last element's view field does not, however, specify the user's viewpoint relative to the entire universe, but only relative to its local coordinates. The view field of the root node, on the other hand, does specify where in the universe the user is looking. Nodes closer to the "fine end" of the viewStack thus specify the view position with increasing precision, but relative to progressively smaller areas in the universe. This is shown conceptually in FIG. 20, where it can be seen that of the three nodes 301, 302, and 303 that have been launched, node 303 provides the most accurate indication of where the user is looking, since its coordinate system is the "finest", but node 301 provides information, albeit not as fine, on a much larger area of the visual content.
The problem then reduces to the following: the views (i.e. view fields) of all visible nodes must be kept synchronized as the user navigates through the universe, panning and zooming. Failure to keep them synchronized would result in the appearance of nodes moving on the display independently of each other, rather than behaving as a cohesive and physically consistent 2D surface.
Changing the view during any navigation operation proceeds as follows. Because the last node in the viewStack has the most precise representation of the view, the first step is to alter the view field of this last node; this altered view is taken to be the correct new view, and any other visible nodes must follow along. The second step is to propagate the new view "upward" toward the root node, which entails making progressively smaller and smaller changes to the view fields of nodes earlier in the stack. If the user is deeply zoomed, then at some point in the upward propagation the alteration to the view may be so small that it ceases to be accurately representable; upward propagation stops at this node. At each stage of the upward propagation, the change is also propagated downward to other visible nodes. Hence, first, the last node's parent's view is modified; then, in the downward propagation, the last node's "siblings". The next upward propagation modifies the grandparent's view, and the second downward propagation modifies first uncles, then first cousins. The downward propagation is halted, as before, when the areas of "cousin nodes" become smaller than minimumArea, or when a node falls entirely offscreen.
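A simplified skeleton of this two-phase synchronization is sketched below; the coordinate transforms, the minimumArea test, and the omission of the off-screen and representability checks are simplifications made for illustration only:

    #include <vector>

    struct Rect { double x, y, w, h; double area() const { return w * h; } };

    struct Node {
        double logicalW = 100.0, logicalH = 100.0;  // this node's logical coordinate system
        Rect placeInParent{0, 0, 100, 100};         // where this node sits, in parent coordinates
        Rect view{0, 0, 100, 100};                  // display rectangle in this node's coordinates
        Node* parent = nullptr;
        std::vector<Node*> children;

        // Map a rectangle from this node's coordinates into the parent's coordinates.
        Rect toParent(const Rect& r) const {
            double sx = placeInParent.w / logicalW, sy = placeInParent.h / logicalH;
            return { placeInParent.x + r.x * sx, placeInParent.y + r.y * sy, r.w * sx, r.h * sy };
        }
        // Map a rectangle from the parent's coordinates into this node's coordinates.
        Rect fromParent(const Rect& r) const {
            double sx = logicalW / placeInParent.w, sy = logicalH / placeInParent.h;
            return { (r.x - placeInParent.x) * sx, (r.y - placeInParent.y) * sy, r.w * sx, r.h * sy };
        }
    };

    const double minimumArea = 1e-6;  // illustrative truncation threshold

    // Downward phase: push a node's (already updated) view into its descendants,
    // truncating where nodes become too small to matter.
    void propagateDown(Node& n) {
        for (Node* c : n.children) {
            c->view = c->fromParent(n.view);
            if (c->view.area() < minimumArea) continue;
            propagateDown(*c);
        }
    }

    // Upward phase: starting from the most precise node (whose view was altered
    // first), update each ancestor, then that ancestor's other descendants.
    void synchronizeViews(Node& last) {
        for (Node* n = &last; n->parent != nullptr; n = n->parent) {
            n->parent->view = n->toParent(n->view);
            for (Node* sibling : n->parent->children)
                if (sibling != n) {
                    sibling->view = sibling->fromParent(n->parent->view);
                    if (sibling->view.area() >= minimumArea) propagateDown(*sibling);
                }
        }
    }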
The foregoing technique involves translating the layout of the various nodes into a tree, which conceptually is illustrated in FIGs. 21 and 22. As can be seen from FIGs. 21 and 22, there is a corresponding tree for a particular displayed set of nodes, and the tree structure may be used to propagate the view information as previously described. Each of the nodes 401 through 408 corresponds to a point on the tree with a similar number shown in FIG. 22. Shown conceptually in Figure 20 is the information that would be stored with respect to a node, such as the pointer and the information indicative of the size and location of another node, for example. FIG. 23 is a block diagram corresponding to a portion of the tree diagram of FIG. 22, in which exemplary values are employed to illustrate the data that may be used to define the properties of node 403. In this exemplary case, node 403 includes a pointer to child node 406, information about the logical size (100X100) of child node 406, and data specifying the location of child node 406 in the coordinates of node 403: 10, 20.
A panning operation may move the last node far enough away that it no longer belongs in the viewStack. Alternatively, zooming in might enlarge a child to the extent that a lengthening of the viewStack is required, or zooming out might bring the last node's area below a minimum area requiring a truncation of the viewStack. In all of these cases the identity of the last node changes. These situations are detected during the downward propagation, which may alter the viewStack accordingly, potentially leaving it longer or shorter.
One simple case of the foregoing is that during zooming, a node gets launched so that now it needs to be placed in the view stack. Another example is that by zooming out, a previously visible node becomes so small that it must be removed from the viewstack.
An extension of the idea is to avoid truncating the viewStack immediately in response to a long outward zoom. Truncating the viewStack is only necessary if the user then pans. Although a long outward zoom will cause the view fields of deeply zoomed nodes to grow very large (and therefore numerically inaccurate) , a field Point2D viewCenter; can be added to the Node structure, representing the central point of the view rectangle; zooming without panning therefore does not alter the viewCenter field of any node. This construction allows zooming far outward to be followed immediately by zooming back in. Because the viewStack has been left intact, the user can then return to precisely the starting view. This behavior is analogous to the "back" and "forward" buttons of a web browser: "back" is analogous to zooming out, and "forward" is analogous to zooming back in. In a web browser, if a user uses "back" to return to a previous web page, but then follows an alternate link, it is at this point that "forward" ceases to work. Following an alternate link is thus analogous to panning after zooming out.
The foregoing provides that visual content may be displayed and navigated in a variety of fashions with substantially infinite precision, limited only by the capacity of the computer system on which the application is running. The visual content displayed at any given time is then displayed as an assembly of nodes, wherein only the nodes needed for the particular view have been launched, and all other nodes are displayed without launching as part of another node or not at all. It is understood that various other embodiments will be apparent to those of ordinary skill in the art, and that the invention is not limited to the embodiments described herein.
METHODS AND APPARATUS FOR NAVIGATING AN IMAGE
BACKGROUND OF THE INVENTION The present invention relates to methods and apparatus for navigating, such as zooming and panning, over an image of an object in such a way as to provide the appearance of smooth, continuous navigational movement.
Most conventional graphical computer user interfaces (GUIs) are designed using visual components of fixed spatial scale. It has long been recognized, however, that visual components may be represented and manipulated such that they do not have a fixed spatial scale on the display; indeed, the visual components may be panned and/or zoomed in or out. The ability to zoom in and out on an image is desirable in connection with, for example, viewing maps, browsing through text layouts such as newspapers, viewing digital photographs, viewing blueprints or diagrams, and viewing other large data sets.
Many existing computer applications, such as Microsoft Word, Adobe Photo Shop, Adobe Acrobat, etc., include zoomable components. In general, the zooming capability provided by these computer applications is a peripheral aspect of a user's interaction with the software and the zooming feature is only employed occasionally. These computer applications permit a user to pan over an image smoothly and continuously (e.g., utilizing scroll bars or the cursor to translate the viewed image left, right, up or down) . A significant problem with such computer applications, however, is that they do not permit a user to zoom smoothly and continuously. Indeed, they provide zooming in discrete steps, such as 10%, 25%, 50%, 75%, 100%, 150%, 200%, 500%, etc. The user selects the desired zoom using the cursor and, in response, the image changes abruptly to the selected zoom level .
The undesirable qualities of discontinuous zooming also exist in Internet-based computer applications. The computer application underlying the www.mapquest.com website illustrates this point. The MapQuest website permits a user to enter one or more addresses and receive an image of a roadmap in response. FIGS. 24-27 are examples of images that one may obtain from the MapQuest website in response to a query for a regional map of Long Island, NY, U.S.A. The MapQuest website permits the user to zoom in and zoom out to discrete levels, such as 10 levels. FIG. 24 is a rendition at zoom level 5, which is approximately 100 meters/pixel. FIG. 25 is an image at a zoom level 6, which is about 35 meters/pixel. FIG. 26 is an image at a zoom level 7, which is about 20 meters/pixel. FIG. 27 is an image at a zoom level 9, which is about 10 meters/pixel.
As can be seen by comparing FIGS. 24-27, the abrupt transitions between zoom levels result in a sudden and abrupt loss of detail when zooming out and a sudden and abrupt addition of detail when zooming in. For example, no local, secondary or connecting roads may be seen in FIG. 24 (at zoom level 5), although secondary and connecting roads suddenly appear in FIG. 25, which is the very next zoom level. Such abrupt discontinuities are very displeasing when utilizing the MapQuest website. It is noted, however, that even if the MapQuest software application were modified to permit a view of, for example, local streets at zoom level 5 (FIG. 24), the results would still be unsatisfactory. Although the visual density of the map would change with the zoom level such that at some level of zoom, the result might be pleasing (e.g., at level 7, FIG. 26), as one zoomed in the roads would not thicken, making the map look overly sparse. As one zoomed out, the roads would eventually run into each other, rapidly forming a solid nest in which individual roads would be indistinguishable.
The ability to provide smooth, continuous zooming on images of road maps is problematic because of the varying levels of coarseness associated with the road categories. In the United States, there are about five categories of roads (as categorized under the Tiger/Line Data distributed by the U.S. Census Bureau): A1, primary highways; A2, primary roads; A3, state highways, secondary roads, and connecting roads; A4, local streets, city streets and rural roads; and A5, dirt roads. These roads may be considered the elements of an overall object (i.e., a roadmap). The coarseness of the road elements manifests because there are considerably more A4 roads than A3 roads, there are considerably more A3 roads than A2 roads, and there are considerably more A2 roads than A1 roads. In addition, the physical dimensions of the roads (e.g., their widths) vary significantly. A1 roads may be about 16 meters wide, A2 roads may be about 12 meters wide, A3 roads may be about 8 meters wide, A4 roads may be about 5 meters wide, and A5 roads may be about 2.5 meters wide.
The MapQuest computer application deals with these varying levels of coarseness by displaying only the road categories deemed appropriate at a particular zoom level. For example, a nation-wide view might only show A1 roads, while a state-wide view might show A1 and A2 roads, and a county-wide view might show A1, A2 and A3 roads. Even if MapQuest were modified to allow continuous zooming of the roadmap, this approach would lead to the sudden appearance and disappearance of road categories during zooming, which is confusing and visually displeasing.
In view of the foregoing, there are needs in the art for new methods and apparatus for navigating images of complex objects, which permit smooth and continuous zooming of the image while also preserving visual distinctions between the elements of the objects based on their size or importance.
SUMMARY OF THE INVENTION
In accordance with one or more aspects of the present invention, methods and apparatus are contemplated to perform various actions, including: zooming into or out of an image having at least one object, wherein at least some elements of at least one object are scaled up and/or down in a way that is non-physically proportional to one or more zoom levels associated with the zooming. The non-physically proportional scaling may be expressed by the following formula: p = c · d · z^a, where p is a linear size in pixels of one or more elements of the object at the zoom level, c is a constant, d is a linear size in physical units of the one or more elements of the object, z is the zoom level in units of physical linear size/pixel, and a is a scale power where a ≠ -1.
Under non-physical scaling, the scale power a is not equal to -1 (typically -1 < a < 0) within a range of zoom levels z0 and z1, where z0 is of a lower physical linear size/pixel than z1. Preferably, at least one of z0 and z1 may vary for one or more elements of the object. It is noted that a, c and d may also vary from element to element.
At least some elements of the at least one object may also be scaled up and/or down in a way that is physically proportional to one or more zoom levels associated with the zooming. The physically proportional scaling may be expressed by the following formula: p = c d/z, where p is a linear size in pixels of one or more elements of the object at the zoom level, c is a constant, d is a linear size of the one or more elements of the object in physical units, and z is the zoom level in units of physical linear size/pixel.
It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above. The invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
The elements of the object may be of varying degrees of coarseness. For example, as discussed above, the coarseness of the elements of a roadmap object manifests because there are considerably more A4 roads than A3 roads, there are considerably more A3 roads than A2 roads, and there are considerably more A2 roads than Al roads. Degree of coarseness in road categories also manifests in such properties as average road length, frequency of intersections, and maximum curvature. The coarseness of the elements of other image objects may manifest in other ways too numerous to list in their entirety. Thus, the scaling of the elements in a given predetermined image may be physically proportional or non-physically proportional based on at least one of: (i) a degree of coarseness of such elements; and (ii) the zoom level of the given predetermined image. For example, the object may be a roadmap, the elements of the object may be roads, and the varying degrees of coarseness may be road hierarchies. Thus; the scaling of a given road in a given predetermined image may be physically proportional or non-physically proportional based on: (i) the road hierarchy of the given road; and (ii) the zoom level of the given predetermined image.
In accordance with one or more further aspects of the present invention, methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of a roadmap; receiving one or more user navigation commands including zooming information at the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
In accordance with one or more still further aspects of the present invention, methods and apparatus are contemplated to perform various actions, including: receiving at a client terminal a plurality of pre-rendered images of varying zoom levels of at least one object, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and
(ii) non-physically proportional to the zoom level; receiving one or more user navigation commands including zooming information at the client terminal; blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands; and displaying the intermediate image on the client terminal.
In accordance with one or more still further aspects of the present invention, methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of a roadmap to a client terminal over a communications channel; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; and blending two or more of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands such that a display of the intermediate image on the client terminal provides the appearance of smooth navigation.
In accordance with one or more still further aspects of the present invention, methods and apparatus are contemplated to perform various actions, including: transmitting a plurality of pre-rendered images of varying zoom levels of at least one object to a client terminal over a communications channel, at least some elements of the at least one object being scaled up and/or down in order to produce the plurality of pre-determined images, and the scaling being at least one of: (i) physically proportional to the zoom level; and (ii) non-physically proportional to the zoom level; receiving the plurality of pre-rendered images at the client terminal; issuing one or more user navigation commands including zooming information using the client terminal; blending two of the pre-rendered images to obtain an intermediate image of an intermediate zoom level that corresponds with the zooming information of the navigation commands; and displaying the intermediate image on the client terminal.
Other aspects, features, and advantages will become apparent to one of ordinary skill in the art when the description herein is taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
For the purposes of illustrating the invention, forms are shown in the drawing, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
FIG. 24 is an image taken from the MapQuest website, which is at a zoom level 5;
FIG. 25 is an image taken from the MapQuest website, which is at a zoom level 6;
FIG. 26 is an image taken from the MapQuest website, which is at a zoom level 7;
FIG. 27 is an image taken from the MapQuest website, which is at a zoom level 9;
FIG. 28 is an image of Long Island produced at a zoom level of about 334 meters/pixel in accordance with one or more aspects of the present invention;
FIG. 29 is an image of Long Island produced at a zoom level of about 191 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 30 is an image of Long Island produced at a zoom level of about 109.2 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 31 is an image of Long Island produced at a zoom level of about 62.4 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 32 is an image of Long Island produced at a zoom level of about 35.7 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 33 is an image of Long Island produced at a zoom level of about 20.4 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 34 is an image of Long Island produced at a zoom level of about 11.7 meters/pixel in accordance with one or more further aspects of the present invention;
FIG. 35 is a flow diagram illustrating process steps that may be carried out in order to provide smooth and continuous navigation of an image in accordance with one or more aspects of the present invention;
FIG. 36 is a flow diagram illustrating further process steps that may be carried out in order to smoothly navigate an image in accordance with various aspects of the present invention;
FIG. 37 is a log-log graph of a line width in pixels versus a zoom level in meters/pixel illustrating physical and non-physical scaling in accordance with one or more further aspects of the present invention; and
FIG. 38 is a log-log graph illustrating variations in the physical and non-physical scaling of FIG. 37.
FIGS. 16A-D illustrate respective antialiased vertical lines whose endpoints are precisely centered on pixel coordinates;
FIGS. 17A-C illustrate respective antialiased lines on a slant, with endpoints not positioned to fall at exact pixel coordinates; and
FIG. 41 is the log-log graph of line width versus zoom level of FIG. 37 including horizontal lines indicating incremental line widths, and vertical lines spaced such that the line width over the interval between two adjacent vertical lines changes by no more than two pixels.
DETAILED DESCRIPTION OF THE INVENTION
Referring now to the drawings, wherein like numerals indicate like elements, there is shown in FIGS. 28-34 a series of images representing the road system of Long Island, NY, U.S.A., where each image is at a different zoom level (or resolution). Before delving into the technical details of how the present invention is implemented, these images will now be discussed in connection with desirable resultant features of using the invention, namely, at least the appearance of smooth and continuous navigation, particularly zooming, while maintaining informational integrity.
It is noted that the various aspects of the present invention that will be discussed below may be applied in contexts other than the navigation of a roadmap image. Indeed, the images and implementations for which the present invention may be employed are too numerous to list in their entirety. For example, the features of the present invention may be used to navigate images of the human anatomy, complex topographies, engineering diagrams such as wiring diagrams or blueprints, gene ontologies, etc. It has been found, however, that the invention has particular applicability to navigating images in which the elements thereof are of varying levels of detail or coarseness. Therefore, for the purposes of brevity and clarity, the various aspects of the present invention will be discussed in connection with a specific example, namely, images of a roadmap. Although it is impossible to demonstrate the appearance of smooth and continuous zooming in a patent document, this feature has been demonstrated through experimentation and prototype development by executing a suitable software program on a Pentium-based computer.

The image 100A of the roadmap illustrated in FIG. 28 is at a zoom level that may be characterized by units of physical length/pixel (or physical linear size/pixel). In other words, the zoom level, z, represents the actual physical linear size that a single pixel of the image 100A represents. In FIG. 28, the zoom level is about 334 meters/pixel. Those skilled in the art will appreciate that the zoom level may be expressed in other units without departing from the spirit and scope of the claimed invention. FIG. 29 is an image 100B of the same roadmap as FIG. 28, although the zoom level, z, is about 191 meters/pixel. In accordance with one or more aspects of the present invention, a user of the software program embodying one or more aspects of the invention may zoom in or out between the levels illustrated in FIGS. 28 and 29. It is significant to note that such zooming has the appearance of smooth and continuous transitions from the 334 meters/pixel level (FIG. 28) to/from the 191 meters/pixel level (FIG. 29) and any levels therebetween. Likewise, the user may zoom to other levels, such as z = 109.2 meters/pixel (FIG. 30), z = 62.4 meters/pixel (FIG. 31), z = 35.7 meters/pixel (FIG. 32), z = 20.4 meters/pixel (FIG. 33), and z = 11.7 meters/pixel (FIG. 34). Again, the transitions through these zoom levels and any levels therebetween advantageously have the appearance of smooth and continuous movements.
Another significant feature of the present invention as illustrated in FIGS. 28-34 is that little or no detail abruptly appears or disappears when zooming from one level to another level. The detail shown in FIG. 31 (at the zoom level of z = 62.4 meters/pixel) may also be found in FIG. 28 (at a zoom level of z = 334 meters/pixel). This is so even though the image object, in this case the roadmap, includes elements (i.e., roads) of varying degrees of coarseness. Indeed, the roadmap 100D of FIG. 31 includes at least A1 highways such as 102, A3 secondary roads such as 104, and A4 local roads such as 106. Yet these details, even the A4 local roads 106, may still be seen in image 100A of FIG. 28, which is substantially zoomed out in comparison with the image 100D of FIG. 31.
Still further, despite the fact that the A4 local roads 106 may be seen at the zoom level of z = 334 meters/pixel (FIG. 28), the A1, A2, A3, and A4 roads may be distinguished from one another. Even differences between A1 primary highways 102 and A2 primary roads 108 may be distinguished from one another vis-a-vis the relative weight given to such roads in the rendered image 100A.
The ability to distinguish among the road hierarchies is also advantageously maintained when the user continues to zoom in, for example, to the zoom level of z = 20.4 meters/pixel as illustrated in image 100F of FIG. 33. Although the weight of the A1 primary highway 102 significantly increases as compared with the zoom level of z = 62.4 meters/pixel in FIG. 31, it does not increase to such an extent as to obliterate other detail, such as the A4 local roads 106 or even the A5 dirt roads. Nevertheless, the roads at lower hierarchical levels, such as the A4 local roads 106, significantly increase in weight as compared with their counterparts at the zoom level z = 62.4 meters/pixel in FIG. 31.
Thus, even though the dynamic range of zoom levels between that illustrated in FIG. 28 and that illustrated in FIG. 34 is substantial and detail remains substantially consistent (i.e., no roads suddenly appear or disappear while smoothly zooming), the information that the user seeks to obtain at a given zooming level is not obscured by undesirable artifacts. For example, at the zoom level of z = 334 meters/pixel (FIG. 28), the user may wish to gain a general sense of what primary highways exist and in what directions they extend. This information may readily be obtained even though the A4 local roads 106 are also depicted. At the zoom level of z = 62.4 meters/pixel (FIG. 31), the user may wish to determine whether a particular A1 primary highway 102 or A2 primary road 108 services a particular city or neighborhood. Again, the user may obtain this information without interference from other much more detailed information, such as the existence and extent of A4 local roads 106 or even A5 dirt roads. Finally, at the zoom level of z = 11.7 meters/pixel, a user may be interested in finding a particular A4 local road such as 112, and may do so without interference by significantly larger roads such as the A1 primary highway 102.
In order to achieve one or more of the various aspects of the present invention discussed above, it is contemplated that one or more computing devices execute one or more software programs that cause the computing devices to carry out appropriate actions. In this regard, reference is now made to FIGS. 35-36, which are flow diagrams illustrating process steps that are preferably carried out by the one or more computing devices and/or related equipment.
While it is preferred that the process flow is carried out by commercially available computing equipment (such as Pentium-based computers) , any of a number of other techniques may be employed to carry out the process steps without departing from the spirit and scope of the present invention as claimed. Indeed, the hardware employed may be implemented utilizing any other known or hereinafter developed technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, one or more programmable digital devices or systems, such as programmable read only memories (PROMs) , programmable array logic devices (PALs), any combination of the above, etc. Further, the methods of the present invention may be embodied in a software program that may be stored on any of the known or hereinafter developed media.
FIG. 35 illustrates an embodiment of the invention in which a plurality of images are prepared (each at a different zoom level or resolution) , action 200, and two or more of the images are blended together to achieve the appearance of smooth navigation, such as zooming (action 206) . Although not required to practice the invention, it is contemplated that the approach illustrated in FIG. 35 be employed in connection with a service provider - client relationship. For example, a service provider would expend the resources to prepare a plurality of pre-rendered images (action 200) and make the images available to a user's client terminal over a communications channel, such as the Internet (action 202) . Alternatively, the pre-rendered images may be an integral or related part of an application program that the user loads and executes on his or her computer.
It has been found through experimentation that, when the blending approach is used, a set of images at the following zoom levels work well when the image object is a roadmap: 30 meters/pixel, 50 meters/pixel, 75 meters/pixel, 100 meters/pixel, 200 meters/pixel, 300 meters/pixel, 500 meters/pixel, 1000 meters/pixel, and 3000 meters/pixel. It is noted, however, that any number of images may be employed at any number of resolutions without departing from the scope of the invention. Indeed, other image objects in other contexts may be best served by a larger or smaller number of images, where the specific zoom levels are different from the example above.
Irrespective of how the images are obtained by the client terminal, in response to user-initiated navigation commands (action 204), such as zooming commands, the client terminal is preferably operable to blend two or more images in order to produce an intermediate resolution image that coincides with the navigation command (action 206). This blending may be accomplished by a number of methods, such as the well-known trilinear interpolation technique described by Lance Williams, Pyramidal Parametrics, Computer Graphics, Proc. SIGGRAPH '83, 17(3): 1-11 (1983), the entire disclosure of which is incorporated herein by reference. Other approaches to image interpolation are also useful in connection with the present invention, such as bicubic-linear interpolation, and still others may be developed in the future. It is noted that the present invention does not require or depend on any particular one of these blending methods. For example, as shown in FIG. 31, the user may wish to navigate to a zoom level of 62.4 meters/pixel. As this zoom level may be between two of the pre-rendered images (e.g., in this example between zoom level 50 meters/pixel and zoom level 75 meters/pixel), the desired zoom level of 62.4 meters/pixel may be achieved using the trilinear interpolation technique. Further, any zoom level between 50 meters/pixel and 75 meters/pixel may be obtained utilizing a blending method as described above, which if performed quickly enough provides the appearance of smooth and continuous navigation. The blending technique may be carried through to other zoom levels, such as the 35.7 meters/pixel level illustrated in FIG. 32. In such case, the blending technique may be performed as between the pre-rendered images of 30 meters/pixel and 50 meters/pixel of the example discussed thus far.
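For illustration only, the final cross-level blend between two bracketing pre-rendered images might be sketched as follows; both inputs are assumed to have already been resampled to the output's pixel dimensions (the per-image bilinear step of trilinear interpolation), and the logarithmic weighting is one reasonable choice rather than a requirement:

    #include <cmath>
    #include <cstdint>
    #include <vector>

    // Blend two bracketing pre-rendered images (assumed equal in size, one
    // channel per byte) to synthesize an intermediate zoom level z, zLow < z < zHigh.
    std::vector<std::uint8_t> blendLevels(const std::vector<std::uint8_t>& imgLow,   // e.g. 50 m/pixel
                                          const std::vector<std::uint8_t>& imgHigh,  // e.g. 75 m/pixel
                                          double zLow, double zHigh, double z)
    {
        double t = (std::log(z) - std::log(zLow)) / (std::log(zHigh) - std::log(zLow));
        std::vector<std::uint8_t> out(imgLow.size());
        for (std::size_t i = 0; i < out.size(); ++i)
            out[i] = static_cast<std::uint8_t>((1.0 - t) * imgLow[i] + t * imgHigh[i] + 0.5);
        return out;
    }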
The above blending approach may be used when the computing power of the processing unit on which the invention is carried out is not high enough to (i) perform the rendering operation in the first instance, and/or (ii) perform image rendering "just-in-time" or "on the fly" (for example, in real time) to achieve a high image frame rate for smooth navigation. As will be discussed below, however, other embodiments of the invention contemplate use of known, or hereinafter developed, high power processing units that are capable of rendering at the client terminal for blending and/or high frame rate applications.
The process flow of FIG. 36 illustrates the detailed steps and/or actions that are preferably conducted to prepare one or more images in accordance with the present invention. At action 220, the information is obtained regarding the image object or objects using any of the known or hereinafter developed techniques. Usually, such image objects have been modeled using appropriate primitives, such as polygons, lines, points, etc. For example, when the image objects are roadmaps, models of the roads in any Universal Transverse Mercator (UTM) zone may readily be obtained. The model is usually in the form of a list of line segments (in any coordinate system) that comprise the roads in the zone. The list may be converted into an image in the spatial domain (a pixel image) using any of the known or hereinafter developed rendering processes so long as it incorporates certain techniques for determining the weight (e.g., apparent or real thickness) of a given primitive in the pixel (spatial) domain. In keeping with the roadmap example above, the rendering processes should incorporate certain techniques for determining the weight of the lines that model the roads of the roadmap in the spatial domain. These techniques will be discussed below.
At action 222 (FIG. 36), the elements of the object are classified. In the case of a roadmap object, the classification may take the form of recognizing already existing categories, namely, A1, A2, A3, A4, and A5. Indeed, these road elements have varying degrees of coarseness and, as will be discussed below, may be rendered differently based on this classification. At action 224, mathematical scaling is applied to the different road elements based on the zoom level. As will be discussed in more detail below, the mathematical scaling may also vary based on the element classification.
By way of background, there are two conventional techniques for rendering image elements such as the roads of a map: actual physical scaling, and pre-set pixel width. The actual physical scaling technique dictates that the roadmap is rendered as if viewing an actual physical image of the roads at different scales. A1 highways, for example, might be 16 meters wide, A2 roads might be 12 meters wide, A3 roads might be 8 meters wide, A4 roads might be 5 meters wide, and A5 roads might be 2.5 meters wide. Although this might be acceptable to the viewer when zoomed in on a small area of the map, as one zooms out, all roads, both major and minor, become too thin to make out. At some zoom level, say at the state level (e.g., about 200 meters/pixel), no roads would be seen at all. The pre-set pixel width approach dictates that every road is a certain pixel width, such as one pixel in width on the display.
Major roads, such as highways, may be emphasized by making them two pixels wide, etc. Unfortunately, this approach makes the visual density of the map change as one zooms in and out. At some level of zoom, the result might be pleasing, e.g., at a small-size county level. As one zooms in, however, roads would not thicken, making the map look overly sparse. Further, as one zooms out, roads would run into each other, rapidly forming a solid nest in which individual roads would be indistinguishable. In accordance with one or more aspects of the present invention, at action 224, the images are produced in such a way that at least some image elements are scaled up and/or down either (i) physically proportional to the zoom level, or (ii) non-physically proportional to the zoom level, depending on parameters that will be discussed in more detail below. It is noted that the scaling being "physically proportional to the zoom level" means that the number of pixels representing the road width increases or decreases with the zoom level as the size of an element would appear to change with its distance from the human eye. The perspective formula, giving the apparent length y of an object of physical size d, is: y = c·d/x, where c is a constant determining the angular perspective and x is the distance of the object from the viewer. In the present invention, the linear size of an object of physical linear size d' in display pixels p is given by
p = d'·z^a, where z is the zoom level in units of physical linear size per pixel (e.g., meters/pixel), and a is a power law. When a = -1 and d' = d (the real physical linear size of the object), this equation is dimensionally correct and becomes equivalent to the perspective formula, with p = y and z = x/c. This expresses the equivalence between physical zooming and perspective transformation: zooming in is equivalent to moving an object closer to the viewer, and zooming out is equivalent to moving the object farther away.
To implement non-physical scaling, a may be set to a power law other than -1, and d' may be set to a physical linear size other than the actual physical linear size d. In the context of a road map, where p may represent the displayed width of a road in pixels and d' may represent an imputed width in physical units, "non-physically proportional to the zoom level" means that the road width in display pixels increases or decreases with the zoom level in a way other than being physically proportional to the zoom level, i.e. a ≠ -1. The scaling is distorted in a way that achieves certain desirable results.
It is noted that "linear size" means one-dimensional size. For example, if one considers any 2-dimensional object and doubles its "linear size" then one multiplies the area by 4 = 2². In the two-dimensional case, the linear sizes of the elements of an object may involve length, width, radius, diameter, and/or any other measurement that one can read off with a ruler on the Euclidean plane. The thickness of a line, the length of a line, the diameter of a circle or disc, the length of one side of a polygon, and the distance between two points are all examples of linear sizes. In this sense the "linear size" in two dimensions is the distance between two identified points of an object on a 2D Euclidean plane. For example, the linear size can be calculated by taking the square root of (dx² + dy²), where dx = x1 - x0, dy = y1 - y0, and the two identified points are given by the Cartesian coordinates (x0, y0) and (x1, y1).
The concept of "linear size" extends naturally to more than two dimensions; for example, if one considers a volumetric object, then doubling its linear size involves increasing the volume by 8 = 2³. Similar measurements of linear size can also be defined for non-Euclidean spaces, such as the surface of a sphere.
Any power law a < 0 will cause the rendered size of an element to decrease as one zooms out, and increase as one zooms in. When a < -1, the rendered size of the element will decrease faster than it would with proportional physical scaling as one zooms out. Conversely, when -1 < a < 0, the size of the rendered element decreases more slowly than it would with proportional physical scaling as one zooms out.
In accordance with at least one aspect of the invention, p(z), for a given length of a given object, is permitted to be substantially continuous so that during navigation the user does not experience a sudden jump or discontinuity in the size of an element of the image (as opposed to the conventional approaches that permit the most extreme discontinuity - a sudden appearance or disappearance of an element during navigation). In addition, it is preferred that p(z) monotonically decrease with zooming out such that zooming out causes the elements of the object to become smaller (e.g., roads to become thinner), and such that zooming in causes the elements of the object to become larger. This gives the user a sense of physicality about the object(s) of the image.
The scaling features discussed above may be more fully understood with reference to FIG. 37, which is a log-log graph of a rendered line width in pixels for an A1 highway versus the zoom level in meters/pixel. (Plotting log(z) on the x-axis and log(p) on the y-axis is convenient because the plots become straight lines due to the relationship log(x^a) = a·log(x)). The basic characteristics of the line (road) width versus zoom level plot are:
(i) that the scaling of the road widths may be physically proportional to the zoom level when zoomed in (e.g., up to about 0.5 meters/pixel);
(ii) that the scaling of the road widths may be non-physically proportional to the zoom level when zoomed out (e.g., above about 0.5 meters/pixel); and
(iii) that the scaling of the road widths may be physically proportional to the zoom level when zoomed further out (e.g., above about 50 meters/pixel or higher depending on parameters which will be discussed in more detail below).
As for the zone in which the scaling of the road widths is physically proportional to the zoom level, the scaling formula of p = d'·z^a is employed, where a = -1. In this example, a reasonable value for the physical width of an actual A1 highway is about d' = 16 meters. Thus, the rendered width of the line representing the A1 highway monotonically decreases with physical scaling as one zooms out, at least up to a certain zoom level z0, say z0 = 0.5 meters/pixel.
The zoom level for z0 = 0.5 is chosen to be an inner scale below which physical scaling is applied. This avoids a non-physical appearance when the roadmap is combined with other fine-scale GIS content with real physical dimensions. In this example, z0 = 0.5 meters/pixel, or 2 pixels/meter, which when expressed as a map scale on a 15-inch display (with 1600x1200 pixel resolution) corresponds to a scale of about 1:2600. At d = 16 meters, which is a reasonable real physical width for A1 roads, the rendered road will appear to be its actual size when one is zoomed in (0.5 meters/pixel or less). At a zoom level of 0.1 meters/pixel, the rendered line is about 160 pixels wide. At a zoom level of 0.5 meters/pixel, the rendered line is 32 pixels wide.
As for the zone in which the scaling of the road widths is non-physically proportional to the zoom level, the scaling formula of p = d'·z^a is employed where -1 < a < 0 (within a range of zoom levels z0 and z1). In this example, the non-physical scaling is performed between about z0 = 0.5 meters/pixel and z1 = 3300 meters/pixel. Again, when -1 < a < 0, the width of the rendered road decreases more slowly than it would with proportional physical scaling as one zooms out. Advantageously, this permits the A1 road to remain visible (and distinguishable from other smaller roads) as one zooms out. For example, as shown in FIG. 28, the A1 road 102 remains visible and distinguishable from other roads at the zoom level of z = 334 meters/pixel. Assuming that the physical width of the A1 road is d' = d = 16 meters, the width of the rendered line using physical scaling would have been about 0.005 pixels at a zoom level of about 3300 meters/pixel, rendering it virtually invisible. Using non-physical scaling, however, where -1 < a < 0 (in this example, a is about -0.473), the width of the rendered line is about 0.8 pixels at a zoom level of 3300 meters/pixel, rendering it clearly visible.
It is noted that the value for z1 is chosen to be the most zoomed-out scale at which a given road still has "greater than physical" importance. By way of example, if the entire U.S. is rendered on a 1600x1200 pixel display, the resolution would be approximately 3300 meters/pixel or 3.3 kilometers/pixel. If one looks at the entire world, then there may be no reason for U.S. highways to assume enhanced importance relative to the view of the country alone.
Thus, at zoom levels above z1, which in the example above is about 3300 meters/pixel, the scaling of the road widths is again physically proportional to the zoom level, but preferably with a large d' (much greater than the real width d) for continuity of p(z). In this zone, the scaling formula of p = d'·z^a is employed where a = -1. In order for the rendered road width to be continuous at z1 = 3300 meters/pixel, a new imputed physical width of the A1 highway is chosen, for example, d' = 1.65 kilometers. z1 and the new value for d' are preferably chosen in such a way that, at the outer scale z1, the rendered width of the line will be a reasonable number of pixels. In this case, at a zoom level in which the entire nation may be seen on the display (about 3300 meters/pixel), A1 roads may be about ½ pixel wide, which is thin but still clearly visible; this corresponds to an imputed physical road width of 1650 meters, or 1.65 kilometers. The above suggests a specific set of equations for the rendered line width as a function of the zoom level:
p(z) = d0·z^-1, if z ≤ z0
p(z) = d1·z^a, if z0 < z < z1
p(z) = d2·z^-1, if z ≥ z1
The above form of p(z) has six parameters: z0, z1, d0, d1, d2 and a. z0 and z1 mark the scales at which the behavior of p(z) changes. In the zoomed-in zone (z ≤ z0), zooming is physical (i.e., the exponent of z is -1), with a physical width of d0, which preferably corresponds to the real physical width d.
In the zoomed-out zone (z ≥ z1), zooming is again physical, but with a physical width of d2, which in general does not correspond to d. Between z0 and z1, the rendered line width scales with a power law of a, which can be a value other than -1. Given the preference that p(z) is continuous, specifying z0, z1, d0 and d2 is sufficient to uniquely determine d1 and a, as is clearly shown in FIG. 37.
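As a concrete illustration of these equations, the sketch below derives d1 and a from z0, z1, d0, and d2 under the continuity preference just described, and evaluates the piecewise width p(z) for the A1 example. The function and variable names are illustrative; only the parameter values and the continuity conditions come from the text.

```python
# A sketch of the equations above: d1 and a are derived from z0, z1, d0, and d2
# using the continuity preference, and the piecewise width p(z) is evaluated for
# the A1 example. Function and variable names are illustrative.
import math

def solve_middle_zone(z0, z1, d0, d2):
    """Return (d1, a) such that p(z) is continuous at z0 and z1."""
    # Continuity at z0: d0/z0 = d1 * z0**a; at z1: d2/z1 = d1 * z1**a.
    # Dividing the two conditions eliminates d1 and yields a.
    a = -1.0 - math.log(d0 / d2) / math.log(z1 / z0)
    d1 = (d0 / z0) * z0 ** (-a)
    return d1, a

def rendered_width(z, z0, z1, d0, d2):
    d1, a = solve_middle_zone(z0, z1, d0, d2)
    if z <= z0:
        return d0 / z        # physical scaling when zoomed in
    if z < z1:
        return d1 * z ** a   # non-physical power-law scaling
    return d2 / z            # physical scaling with the imputed width d2

# A1 highway: d0 = 16 m, z0 = 0.5 m/px, z1 = 3300 m/px, d2 = 1650 m.
print(solve_middle_zone(0.5, 3300.0, 16.0, 1650.0))       # a comes out near -0.473
print(rendered_width(0.5, 0.5, 3300.0, 16.0, 1650.0))     # 32 pixels, as in the text
print(rendered_width(3300.0, 0.5, 3300.0, 16.0, 1650.0))  # 0.5 pixels when fully zoomed out
```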
The approach discussed above with respect to A1 roads may be applied to the other road elements of the roadmap object. An example of applying these scaling techniques to the A1, A2, A3, A4, and A5 roads is illustrated in the log-log graph of FIG. 38.
In this example, z0 = 0.5 meters/pixel for all roads, although it may vary from element to element depending on the context. As
A2 roads are generally somewhat smaller than A1 roads, d0 = 12 meters. Further, A2 roads are "important," e.g., on the U.S. state level, so z1 = 312 meters/pixel, which is approximately the rendering resolution for a single state (about 1/10 of the country in linear scale). At this scale, it has been found that line widths of one pixel are desirable, so d2 = 312 meters is a reasonable setting. Using the general approach outlined above for A1 and A2 roads, the parameters of the remaining elements of the roadmap object may be established. A3 roads: d0 = 8 meters, z0 = 0.5 meters/pixel, z1 = 50 meters/pixel, and d2 = 100 meters. A4 streets: d0 = 5 meters, z0 = 0.5 meters/pixel, z1 = 20 meters/pixel, and d2 = 20 meters. And A5 dirt roads: d0 = 2.5 meters, z0 = 0.5 meters/pixel, z1 = 20 meters/pixel, and d2 = 20 meters. It is noted that using these parameter settings, A5 dirt roads look more and more like streets at zoomed-out zoom levels, while their physical scale when zoomed in is half as wide.
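The per-class parameters just listed can be collected into a small table and evaluated at sample zoom levels, as in the self-contained sketch below. The table values come from the text; the compact width() helper restates the piecewise function from the previous sketch, and everything else is an illustrative assumption.

```python
# A self-contained sketch collecting the per-class parameters quoted above and
# evaluating rendered widths at a few sample zoom levels.
import math

ROAD_PARAMS = {
    # class: (z0 m/px, z1 m/px, d0 meters, d2 meters)
    "A1": (0.5, 3300.0, 16.0, 1650.0),
    "A2": (0.5,  312.0, 12.0,  312.0),
    "A3": (0.5,   50.0,  8.0,  100.0),
    "A4": (0.5,   20.0,  5.0,   20.0),
    "A5": (0.5,   20.0,  2.5,   20.0),
}

def width(z, z0, z1, d0, d2):
    a = -1.0 - math.log(d0 / d2) / math.log(z1 / z0)   # continuity at z0 and z1
    d1 = (d0 / z0) * z0 ** (-a)
    if z <= z0:
        return d0 / z
    return d1 * z ** a if z < z1 else d2 / z

for z in (0.5, 10.0, 334.0, 3300.0):                   # sample zoom levels in m/px
    print(z, {cls: round(width(z, *p), 2) for cls, p in ROAD_PARAMS.items()})
```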
The log-log plot of FIG. 38 summarizes the scaling behaviors for the road types. It is noted that at every scale the apparent width of A1 > A2 > A3 > A4 ≥ A5. Note also that, with the exception of dirt roads, the power laws all come out in the neighborhood of a = -0.41. The dotted lines all have a slope of -1 and represent physical scaling at different physical widths. From the top down, the corresponding physical widths of these dotted lines are: 1.65 kilometers, 312 meters, 100 meters, 20 meters, 16 meters, 12 meters, 8 meters, 5 meters, and 2.5 meters.

When interpolation between a plurality of pre-rendered images is used, it is possible in many cases to ensure that the resulting interpolation is humanly indistinguishable or nearly indistinguishable from an ideal rendition of all lines or other primitive geometric elements at their correct pixel widths as determined by the physical and non-physical scaling equations. To appreciate this alternative embodiment of the current invention, some background on antialiased line drawing will be presented below. The discussion of antialiased line drawing will be presented in keeping with the roadmap example discussed at length above, in which all primitive elements are lines, and the line width is subject to the scaling equations as described previously.

With reference to FIG. 39A, a one pixel wide vertical line drawn in black on a white background, such that the horizontal position of the line is aligned exactly to the pixel grid, consists simply of a 1-pixel-wide column of black pixels on a white background. In accordance with various aspects of the present invention, it is desirable to consider and accommodate the case where the line width is a non-integral number of pixels. With reference to FIG. 39B, if the endpoints of a line remain fixed, but the weight of the line is increased to be 1.5 pixels wide, then on an antialiased graphics display, the columns of pixels to the left and right of the central column are drawn at 25% grey. With reference to FIG. 39C, at 2 pixels wide, these flanking columns are drawn at 50% grey. With reference to FIG. 39D, at 3 pixels wide, the flanking columns are drawn at 100% black, and the result is three solid black columns as expected.
This approach to drawing lines of non-integer width on a pixellated display results in a sense (or illusion) of visual continuity as line width changes, allowing lines of different widths to be clearly distinguished even if they differ in width only by a fraction of a pixel. In general, this approach, known as antialiased line drawing, is designed to ensure that the line integral of the intensity function (or "1-intensity" function, for black lines on a white background) over a perpendicular to the line drawn is equal to the line width. This method generalizes readily to lines whose endpoints do not lie precisely in the centers of pixels, to lines which are in other orientations than vertical, and to curves.
Note that drawing the antialiased vertical lines of FIGS. 39A-D could also be accomplished by alpha-blending two images, one (image A) in which the line is 1 pixel wide, and the other (image B) in which the line is 3 pixels wide. Alpha blending assigns to each pixel on the display (1 - alpha)·(corresponding pixel in image A) + alpha·(corresponding pixel in image B). As alpha is varied between zero and one, the effective width of the rendered line varies smoothly between one and three pixels. This alpha-blending approach only produces good visual results in the most general case if the difference between the two rendered line widths in images A and B is one pixel or less; otherwise, lines may appear haloed at intermediate widths. This same approach can be applied to rendering points, polygons, and many other primitive graphical elements at different linear sizes. Turning again to FIGS. 39A-D, the 1.5 pixel-wide line (FIG. 39B) and the 2 pixel-wide line (FIG. 39C) can be constructed by alpha-blending between the 1 pixel wide line (FIG. 39A) and the 3 pixel wide line (FIG. 39D). With reference to FIGS. 40A-C, a 1 pixel wide line (FIG. 40A), a 2 pixel wide line (FIG. 40B) and a 3 pixel wide line (FIG. 40C) are illustrated in an arbitrary orientation. The same principle applies to the arbitrary orientation of FIGS. 40A-C as to the case where the lines are aligned exactly to the pixel grid, although the spacing of the line widths between which to alpha-blend may need to be finer than two pixels for good results.
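A minimal sketch of this alpha-blending construction is given below, for the grid-aligned vertical-line case of FIGS. 39A-D. The array dimensions, the helper names, and the convention that intensity 1.0 means white are illustrative assumptions.

```python
# A minimal sketch of the alpha-blending construction for the grid-aligned
# vertical-line case of FIGS. 39A-D.
import numpy as np

def vertical_line(width_px, cols=5, rows=4):
    """Antialiased vertical black line on white, grid-aligned; valid for widths 1-3 px."""
    img = np.ones((rows, cols))
    center = cols // 2
    img[:, center] = 0.0                           # central column fully black
    flank = max(0.0, (width_px - 1.0) / 2.0)       # ink coverage of each flanking column
    img[:, center - 1] = img[:, center + 1] = 1.0 - flank
    return img

def alpha_blend(img_a, img_b, alpha):
    """(1 - alpha) * image A + alpha * image B, per pixel."""
    return (1.0 - alpha) * img_a + alpha * img_b

one_px = vertical_line(1.0)      # FIG. 39A
three_px = vertical_line(3.0)    # FIG. 39D
# alpha = 0.25 reproduces the 1.5-pixel line (FIG. 39B); alpha = 0.5 the 2-pixel line (FIG. 39C).
assert np.allclose(alpha_blend(one_px, three_px, 0.25), vertical_line(1.5))
assert np.allclose(alpha_blend(one_px, three_px, 0.5), vertical_line(2.0))
```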
In the context of the present map example, a set of images of different resolutions can be selected for pre-rendition with reference to the log-log plots of FIGS. 37-38. For example, reference is now made to FIG. 41, which is substantially similar to FIG. 37 except that FIG. 41 includes a set of horizontal lines and vertical lines. The horizontal lines indicate line widths between 1 and 10 pixels, in increments of one pixel. The vertical lines are spaced such that line width over the interval between two adjacent vertical lines changes by no more than two pixels. Thus, the vertical lines represent a set of zoom values suitable for pre-rendition, wherein alpha-blending between two adjacent such pre-rendered images will produce characteristics nearly equivalent to rendering the lines representing roads at continuously variable widths. Interpolation between the six resolutions represented by the vertical lines shown in FIG. 41 is sufficient to render the A1 highways accurately using the scaling curve shown at about nine meters/pixel and above. Rendition below about nine meters/pixel does not require pre-rendition, as such views are very zoomed-in and thus show very few roads, making it more computationally efficient (and more efficient with respect to data storage requirements) to render them vectorially than to interpolate between pre-rendered images. At resolutions of more than about 1000 meters/pixel (such views encompass large fractions of the Earth's surface), the final pre-rendered image alone can be used, as it is a rendition using 1 pixel wide lines. Lines that are thinner than a single pixel render the same pixels more faintly.
Hence, to produce an image in which the A1 lines are 0.5 pixels wide, the 1 pixel wide line image can be multiplied by an alpha of 0.5.
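The following sketch shows one way such a set of pre-rendition zoom levels could be chosen programmatically: walk outward in zoom and emit a new level whenever the rendered A1 width, computed from the continuity-derived scaling curve used in the earlier sketches, has changed by more than a fixed pixel tolerance. The greedy strategy, the step factor, and the zoom range are assumptions for illustration, not the method used to produce FIG. 41.

```python
# A sketch of choosing pre-rendition zoom levels so that the A1 width curve
# changes by no more than a fixed pixel tolerance between adjacent levels.
import math

def a1_width(z, z0=0.5, z1=3300.0, d0=16.0, d2=1650.0):
    a = -1.0 - math.log(d0 / d2) / math.log(z1 / z0)
    d1 = (d0 / z0) * z0 ** (-a)
    return d0 / z if z <= z0 else (d1 * z ** a if z < z1 else d2 / z)

def prerender_levels(z_start, z_end, max_width_change_px=2.0, step=1.01):
    """Greedily emit zoom levels so the rendered width never changes by more
    than max_width_change_px between adjacent levels."""
    levels = [z_start]
    z = z_start
    while z < z_end:
        z *= step
        if abs(a1_width(z) - a1_width(levels[-1])) > max_width_change_px:
            levels.append(z)
    levels.append(z_end)
    return levels

# About five or six levels over this range with a 2-pixel tolerance; the exact
# count depends on the tolerance and on the zoom range chosen.
print(prerender_levels(9.0, 1000.0))
```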
In practice, a somewhat larger set of resolutions is pre-rendered, such that over each interval between resolutions, none of the scaling curves of FIG. 38 varies by more than one pixel. Reducing the allowed variation to one pixel can result in improved rendering quality. Notably, the tiling techniques contemplated and discussed in the following co-pending application may be considered in connection with the present invention: U.S. Patent Application No. 10/790,253, entitled SYSTEM AND METHOD FOR EXACT RENDERING IN A ZOOMING USER INTERFACE, filed March 1, 2004, Attorney Docket No. 489/2, the entire disclosure of which is hereby incorporated by reference. This tiling technique may be employed for resolving an image at a particular zoom level, even if that level does not coincide with a pre-rendered image. If each image in the somewhat larger set of resolutions is pre-rendered at the appropriate resolution and tiled, then the result is a complete system for zooming and panning navigation through a roadmap of arbitrary complexity, such that all lines appear to vary in width continuously in accordance with the scaling equations disclosed herein. Additional details concerning other techniques for blending images, which may be employed in connection with implementing the present invention, may be found in U.S. Provisional Patent Application No. 60/475,897, entitled SYSTEM AND METHOD FOR THE EFFICIENT, DYNAMIC AND CONTINUOUS DISPLAY OF MULTI RESOLUTIONAL VISUAL DATA, filed June 5, 2003, the entire disclosure of which is hereby incorporated by reference. Still further details concerning blending techniques that may be employed in connection with implementing the present invention may be found in U.S. Provisional Patent Application Serial No. 60/453,897, filed March 12, 2003, entitled SYSTEM AND METHOD FOR FOVEATED, SEAMLESS, PROGRESSIVE RENDERING IN A ZOOMING USER INTERFACE, the entire disclosure of which is hereby incorporated by reference.
Advantageously, employing the above-discussed aspects of the present invention, the user enjoys the appearance of smooth and continuous navigation through the various zoom levels. Further, little or no detail abruptly appears or disappears when zooming from one level to another. This represents a significant advancement over the state of the art.
It is contemplated that the various aspects of the present invention may be applied in numerous products, such as interactive software applications over the Internet, automobile-based software applications and the like. For example, the present invention may be employed by an Internet website that provides maps and driving directions to client terminals in response to user requests. Alternatively, various aspects of the invention may be employed in a GPS navigation system in an automobile. The invention may also be incorporated into medical imaging equipment, whereby detailed information concerning, for example, a patient's circulatory system, nervous system, etc. may be rendered and navigated as discussed hereinabove. The applications of the invention are too numerous to list in their entirety, yet a skilled artisan will recognize that they are contemplated herein and fall within the scope of the invention as claimed. The present invention may also be utilized in connection with other applications in which the rendered images provide a means for advertising and otherwise advancing commerce. Additional details concerning these aspects and uses of the present invention may be found in U.S. Provisional Patent Application No. 60/553,803, entitled METHODS AND APPARATUS FOR EMPLOYING MAPPING TECHNIQUES TO ADVANCE COMMERCE, filed on even date herewith, Attorney Docket No. 489/7, the entire disclosure of which is hereby incorporated by reference.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.
EFFICIENT DATA CACHE
BACKGROUND
"MRU caching, " where MRU stands for "most recently used, " is a known concept for implementing a client-side memory in a client-server system. It is assumed that the server has access to and can serve to a client a large number of data objects, which in the aggregate may occupy a large amount of memory. The available bandwidth between client and server is limited, however, so client requests for data objects to be sent from the server take time. If access to data objects is reasonably "coherent," meaning that objects which the client needed recently are likely to be needed again in the near future, then MRU caching may increase the efficiency of the client-server system. Employing this approach, the client generally sets aside some limited amount of memory (generally much less than would be needed to store all of the objects on the server) , and stores in this memory (a cache) as many of the most recently requested objects as will fit. When a new object is sent from the server to the client and the client's cache space has run out, the least recently used (LRU) object is erased from the cache to create data storage space in which the new object may be stored.
Generally, when the client needs a data object, the cache is first examined to see if the object is cached. If it is cached, then the cached representation is used, obviating the need for a slow or computationally expensive server request.
Usually, making use of a cached representation also "promotes" that object to the MRU end of the cache. This approach generally provides substantial performance advantages over having to request data from the server for every data object accessed. The erasure of the least recently used object from a cache when a new object is accessed by a computing system and stored in the cache may cause inefficiency in cache usage. Even the erased, least-recently-used object in the cache may again be requested from the server. When this happens, the server may undertake the relatively slow or computationally expensive task of retrieving this object from a more remote source of data storage, such as a main memory or mass storage device. Given the finite size of cache memories, object erasure may occur with some frequency, thereby causing a server or other computing system to expend significant resources accessing more remote memories to get data that was once conveniently stored in a cache memory. Accordingly, there is a need in the art for a more efficient and flexible approach to cache memory management.
SUMMARY OF THE INVENTION
According to one aspect, the invention may provide a method, comprising providing a cache in a computing system having an initial group of cache objects, each cache object having an initial compression ratio and including stored data; decreasing an amount of data storage space occupied by at least one of the cache objects other than a given one of the cache objects; and increasing an amount of data storage space occupied by the given cache object. Preferably, the decreasing comprises decreasing the amount of data storage space by a given amount. Preferably, the increasing comprises increasing the amount of data storage space occupied by the given cache object by the given amount. Preferably, the decreasing comprises increasing the initial compression ratio of the at least one cache object. Preferably, the increasing comprises decreasing the initial compression ratio of the given cache object. Preferably, the given cache object is a most recently used cache object of the cache objects. Preferably, the at least one cache object undergoing the decreasing step comprises a least recently used cache object of the cache objects.
Preferably, the decreasing comprises removing a portion of the stored data for the at least one cache object. Preferably, the increasing comprises supplementing the stored data for the given cache object. Preferably, an amount of data storage space available for each of the cache objects may equal one of a finite number of discrete values. Preferably, the decreasing comprises reducing the amount of data storage space for at least one randomly selected cache object of the cache objects, other than the given cache object. Preferably, the reducing comprises reducing the amount of data storage space for the at least one randomly selected cache object to a randomly determined extent. Preferably, the randomly selected cache object is selected using one of a random method and pseudorandom method. Preferably, the selection of the randomly selected cache object is guided by a heuristic.
Preferably, the method further comprises storing the given cache object in a losslessly compressed form after the increasing. Preferably, the method further comprises storing the given cache object in uncompressed form after the increasing. Preferably, the decreasing comprises removing at least one of the cache objects other than the given cache object.
According to another aspect, the invention may provide an apparatus, comprising a computing system having at least one processor capable of operative communication with a main memory; and a cache in the computing system having an initial group of cache objects, each cache object having an initial compression ratio and including stored data; wherein the computing system is operable to decrease an amount of data storage space occupied by at least one of the cache objects other than a given one of the cache objects; and increase an amount of data storage space occupied by the given cache object. Preferably, the decreasing comprises decreasing the amount of data storage space by a given amount. Preferably, the increasing comprises increasing the amount of data storage space occupied by the given cache object by the given amount. Preferably, the decreasing comprises increasing the initial compression ratio of the at least one cache object. Preferably, the increasing comprises decreasing the initial compression ratio of the given cache object. Preferably, the given cache object is a most recently used cache object of the cache objects. Preferably, the decreasing comprises removing a portion of the stored data for the at least one cache object. Preferably, the increasing comprises supplementing the stored data for the given cache object. Preferably, an amount of data storage space available for each of the cache objects may equal one of a finite number of discrete values. Preferably, the decreasing comprises reducing the amount of data storage space for at least one randomly selected cache object of the cache objects, other than the given cache object. Preferably, the reducing comprises reducing the amount of data storage space for the at least one randomly selected cache object to a randomly determined extent.
According to another aspect, the invention provides a method, comprising: providing a cache in a computing system, the cache having an initial condition; if insufficient data storage space is present in the cache under the initial condition to store at least one new object in the cache, compressing at least one object in the cache to clear data storage space for the at least one new object; and storing the at least one new object in the cache. Preferably, the initial condition corresponds to the cache being empty. Preferably, the method further comprises continuing to store new objects in the cache without compressing the objects stored in the cache until insufficient data storage space remains in the cache to store any additional new object. Preferably, the method further comprises storing the at least one new object in the cache without the compressing, if sufficient space for storing the at least one new object is present in the cache under the initial condition.
Other aspects, features, advantages, etc. will become apparent to one skilled in the art when the description of the preferred embodiments of the invention herein is taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 42 is a graph of the data storage sizes of individual cache objects in a cache as a function of the cache objects' recency of use within the cache, in accordance with one or more embodiments of the present invention;
FIG. 43 is a graph of the cumulative sum of the data storage space occupied by cache objects in a cache plotted against the number, "N," of cache objects summed, in accordance with one or more embodiments of the present invention;
FIG. 44 is a block diagram of a data cache including a plurality of cache objects in accordance with one or more embodiments of the present invention;
FIG. 45 is a block diagram of the data cache of FIG. 44 in which cache objects have been resized in accordance with one or more embodiments of the present invention;
FIG. 46 is a block diagram of the data cache of FIG. 45 in which an accessed cache object has been resized and restored to the cache in accordance with one or more embodiments of the present invention;
FIG. 47 is a flow diagram of a method for accessing a cache object in a data cache and updating the data cache in accordance with one or more embodiments of the present invention; and
FIG. 48 is a block diagram of a computing system adaptable for use with one or more embodiments of the present invention .
For the purposes of illustrating the various aspects of the invention, there are shown in the drawings forms that are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
This disclosure makes reference to the LRU and MRU ends of the cache. Objects are generally added at the MRU end, and are generally erased from the LRU end. However, the present invention is not limited to such a scheme. It is noted that the physical layout of cache objects in a cache need not correspond to the LRU-MRU layout. The layout of a cache merely preferably enables a computing system to find, insert, and/or erase objects in the manner described herein. The linear LRU-MRU arrangement is a convenient mechanism for describing the operation of a cache, but represents only one of many possible implementations of a cache memory. Herein, the terms "cache" and "cache memory" are used interchangeably. It is noted that, although MRU caching and its extensions disclosed herein are discussed in the context of a client/server architecture, similar principles apply to many other scenarios, such as efficient hard disk access on a single computer (where access to the hard disk is slower than access to RAM, and RAM is thus used to cache the most recently used content on the hard disk). In one or more other embodiments, data are gathered from the environment or generated computationally rather than being loaded from a disk or sent across a network. In each case, the client has access to a small but fast temporary cache memory, and a larger but slower data source from which information is requested repeatedly. This slower data source is generally referred to herein as the "server."
The following discussion of convergent series is provided as an introduction to the cache memory apparatus and method disclosed herein. The infinite sum of the series y(n) = n^-p, with n going from 1 to infinity, and with p > 1, is finite. Similarly, the sum of y = 1/b^n is finite for b > 1. For example, in the latter case, if b = 2, the sum is exactly 2. The principles underlying such convergent series may be used to implement one or more embodiments of efficient data caching methods and apparatus as described herein.
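As a quick numerical illustration of these convergence claims (the index conventions and truncation points below are assumptions for illustration):

```python
# A quick numerical check of the convergence claims above. Starting the geometric
# series at n = 0 gives the limit 2 cited for b = 2.
partial_power_law = sum(n ** -2.0 for n in range(1, 100000))   # n^-p with p = 2: about 1.6449
partial_geometric = sum(0.5 ** n for n in range(0, 60))        # 1/2^n from n = 0: about 2.0
print(partial_power_law, partial_geometric)
```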
One or more embodiments of the methods and apparatus described herein may employ concepts related to the "Zeno paradox," which is described below. While this discussion provides a conceptual underpinning applicable to one or more embodiments described herein, the present invention is not limited by the conceptual aspects discussed below.
Zeno Caching Concept. Zeno is a runner who is so quick that in one step (which, for the sake of discussion, it is assumed he makes every second) he covers half the distance from his current position to the end of any racetrack. The paradox is that he never finishes the course, even though he moves forward with every step. This paradox is easily related to the 1/b^n series above with b = 2, and summing from n = 2 to infinity. This concept may be extended to the storage of cache objects (with the cache itself being analogized to the "racetrack") by enabling the cache objects to be compressed to a progressively greater extent with decreasing recency of use or access. Thus, in proceeding from the MRU end of a cache to the LRU end thereof, a theoretically infinite number of additional cache objects of ever decreasing size could be put in place, without ever running out of space. This principle is referred to herein as the Zeno cache concept.
Preferably, the cache objects concerned herein are compressible, which in this disclosure, corresponds to being amenable to lossy data compression techniques. Lossy data compression may be characterized by the ability to represent a data object with fewer bytes than the full representation of the data object. Higher compression ratios generally incur higher distortion of the data object and lower quality of an image rendered using the compressed data (where the object includes one or more image files) . Without limitation, lossy compression techniques may also be applicable to sound, video, and many other data types . In one or more embodiments, compressed versions of the data may be suitable as substitutes for the uncompressed data. Below a given level of distortion, the compressed representations of the data may be fully adequate, and above the given level of distortion, the compressed representations may be adequate as a temporary measure while the client waits for a higher quality version of the data. The higher quality version may merely be less compressed than the temporarily used version, or may be losslessly compressed or uncompressed. In one or more embodiments, lower quality representations may be subsets of higher quality representations, meaning that improving the representation quality at the client side may involve merely sending data to supplement lower quality representations, thereby providing the higher quality representation. Preferably, with this approach, there is no need to incur the burden of sending an entirely new set of data associated with the high quality representation. This approach preferably avoids redundancy and hence preferably substantially increases efficiency.
Consistent with the approach discussed above, the reverse process of lowering the representation quality of an object may involve merely removing a portion of the data employed for a high quality representation of an image, rather than requiring compression, or re-compression, of the data used for the high quality representation. This property preferably also enhances the efficiency of the caching apparatus and method disclosed herein.
In one or more embodiments, the compression technique may provide objects with compression levels that scale from lossy to lossless. This feature may allow a lossless representation of a data object to be built up in steps, from highly lossy to lossless, at little or no extra total cost relative to sending across a lossless version initially. An example of a data type and compression technology enabling the above features is the wavelet compression of images, as exemplified by the
JPEG2000 standard. However, the present invention is not limited to the use of the JPEG2000 standard.
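The sketch below illustrates the "subset" property described above, using plain byte-prefix truncation as a stand-in for an embedded, quality-scalable code stream such as a JPEG2000 stream. The class and method names are illustrative assumptions, not an implementation of any particular codec.

```python
# A sketch of the "subset" property: lowering quality removes a suffix of the
# stored bytes, and raising quality appends only the missing bytes rather than
# resending the whole object.
from dataclasses import dataclass

@dataclass
class ProgressiveObject:
    data: bytearray   # prefix of an embedded code stream currently held in the cache

    def squeeze(self, new_size: int) -> None:
        """Lower quality by discarding trailing bytes; no re-compression is needed."""
        del self.data[new_size:]

    def supplement(self, more_bytes: bytes) -> None:
        """Raise quality by appending the next bytes of the stream sent by the server."""
        self.data.extend(more_bytes)

obj = ProgressiveObject(bytearray(b"coarse+finer"))
obj.squeeze(6)              # keep only the coarse prefix ("coarse")
obj.supplement(b"+finer")   # the server sends only the missing suffix to restore quality
```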
Given the above properties, if memory were "continuous" (i.e. not discretized into bytes) then it would be possible in theory to cache an infinite number of data objects in a finite amount of memory, merely by enforcing the constraint that the compressed sizes of the objects conform to the rules of a convergent series as discussed earlier herein. The operation of a cache able to function in accordance with the theory discussed above is described below in connection with FIGS. 42 and 43.
In the graph of FIG. 42, the variable "N" preferably corresponds to the number of each cache object, the value of the number of each cache object representing the recency of use of each such cache object, with increasing values of N corresponding to decreasing recency of use of the cache object associated with that number. The variable "Y" preferably corresponds to the size of each cache object. For the "Y" variable, a value of "1" may correspond to the size of a cache object in its highest-quality condition, i.e. when it is not compressed at all. Most-recently-used objects may be represented with low distortion, and less recently used objects may be subject to an extent of compression consistent with the recency of the last use thereof. It may be seen in FIG. 42 that the value of the compressed cache-object size, Y, declines with decreasing recency of use of the pertinent cache object. Note that the "Y" variable may correspond to an absolute size of each object (whether compressed or not) in the cache (expressed in arbitrary units). Alternatively, in one or more other embodiments, "Y" may correspond to a compression ratio, with, for example, the value "1" corresponding to a full-size object, and the value "0.5" corresponding to an object occupying one half of its uncompressed size.
With reference to FIG. 43, the cumulative sum of the sizes Y of the objects numbered from 1 to N, for each value of
N, may still be a finite number, as shown in FIG. 43. The units of the variable "Y" may be units of data storage space corresponding to the size of (or data storage space needed by) one representative fully expanded (uncompressed) cache object. Since FIGS. 42 and 43 aid an understanding of the theory of one or more embodiments of the present invention, information describing the size of the cache objects in bits and/or bytes of data is not provided herein.
In one or more embodiments, the theoretical implementation described above is preferably modified for two reasons. First, in practice, memory storage is preferably composed of discrete storage units. Thus, for example, it is usually meaningless in practice to compress a cache object to occupy an amount of storage space that is smaller than one bit. Second, the total number of operations performed on the cache is preferably finite. In contrast, enforcing a continuous curve of compression ratios described by one of the convergent formulas above may involve reducing the size of every cache object in the cache every time additional cache storage space was needed. This would require an impractically large number of operations.
In one or more embodiments, the number of objects in the cache will in practice be finite. However, where the Zeno cache concept is employed, this number may be much larger than would be possible with conventional MRU caching. Further, cached objects may have the property that if recently used, they may be stored at a high quality level (anywhere from a low level of distortion, or compression lossiness, to lossless compression, to uncompressed data). The quality level of cache objects may become progressively worse (i.e. be subject to progressively higher levels of distortion or compression lossiness) with each successive cache memory access in which these cache objects are not accessed.
Because computer memory is discrete and there may be a minimum compressed size of a cache object below which a cache object may have no value to a user, cached representations may be subject to a maximum compression ratio that yields this minimum compressed size. Thus, in one or more embodiments, the maximum number of cache objects that can be stored in the cache may equal the total data storage space in the cache divided by the amount of data storage space occupied by a cache object having the above-described minimum compressed size, if the objects are all of equal size. However, the cache objects need not all be of equal size. There are many ways to design a series which is bounded by one of the equations discussed above (or any other convergent sum) , and which therefore has a finite sum. An additional constraint can also be introduced, specifically that the likelihood of any given value repeating in successive values of a series increases at higher values of N such that the number of different values of Y employed may be limited to a reasonable number.
An example of such a series is: 1, 1/4, 1/4, 1/16, 1/16, 1/16, 1/16, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/64, 1/256, etc.
Clearly the sum of the series 1, two quarters, four sixteenths, eight sixty-fourths, etc. is 2, just like y = 1/2^n, as discussed earlier herein. But, if we take the series out to n = 16000, only about log2(16000), or about 14, values of y (object data storage space size) may be used.
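A short sketch generating this series and checking both claims (the helper name and the object count are illustrative):

```python
# Generate the discrete size series quoted above (one object of size 1, two of
# 1/4, four of 1/16, eight of 1/64, ...) and verify the bounded sum and the
# small number of distinct sizes.
def zeno_sizes(n_objects):
    sizes, k = [], 0
    while len(sizes) < n_objects:
        sizes.extend([1.0 / 4 ** k] * (2 ** k))   # 2**k objects of size 1/4**k
        k += 1
    return sizes[:n_objects]

sizes = zeno_sizes(16000)
print(sum(sizes))        # bounded below 2
print(len(set(sizes)))   # 14 distinct sizes for n = 16000, roughly log2(16000)
```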
In one or more embodiments, the log function described above provides one way to cause the number of available values of Y (possible sizes of the cache objects) to grow much more slowly than the value of N. However, the present invention is not limited to the use of this log function, and other mathematical operations that cause the number of values of Y to grow more slowly than the value of N may be employed in connection with the present invention.
In one or more embodiments, when N = one million, as few as 20 values of Y may be used (as determined using the logarithm-based formula recited above) . This implies that when space has to be freed in the cache, only a small number of operations may be needed to establish a suitable allocation of data storage space among the cache objects, since the majority of the cache objects will occupy an amount of data storage space that preferably does not need to change.
Other mathematical series may also satisfy the desired criteria for use within a cache memory management system and method. Additionally, it is possible to use series that are not theoretically convergent (i.e. whose sums are infinite), since in practice a finite number of terms will be summed in any case.
In one or more embodiments, random algorithms may be used to improve the basic algorithm in a number of ways. In one or more embodiments, the 2 × 1/4, 4 × 1/16, etc. series described above may include only a small number of available cache object sizes, possibly leading to stark differences in compression ratios between different objects within a cache. Random choice may be used to "squeeze" (reduce the data storage space used by) a randomly selected subset of the cache objects in a weighted fashion until some target amount of space is made available for new cache objects. This approach may provide beneficial results because the exact position in the cache of a cache object may decrease in importance with an increasing number of objects in the cache. The amount by which each object is "squeezed" may also be at least partially randomized. Using randomization algorithms like those discussed herein may reduce obvious discontinuities or thresholds in cache-object quality, which may be perceived in images rendered using cache objects stored in the cache.
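One possible realization of this randomized, weighted squeezing is sketched below. The weighting toward the LRU end, the shrink fractions, and the minimum object size are all illustrative assumptions.

```python
# A sketch of randomized, weighted squeezing of cache objects to free space.
import random

def random_squeeze(objects, bytes_needed, min_size=64, rng=None):
    """objects: list of dicts with a 'size' field, ordered LRU -> MRU.
    Shrinks randomly chosen objects until bytes_needed have been freed (or no
    object can shrink further); returns the number of bytes actually freed."""
    rng = rng or random.Random()
    freed = 0
    while freed < bytes_needed:
        shrinkable = [i for i, o in enumerate(objects) if o["size"] > min_size]
        if not shrinkable:
            break
        # Weight selection toward the LRU end (lower indices).
        weights = [len(objects) - i for i in shrinkable]
        i = rng.choices(shrinkable, weights=weights, k=1)[0]
        # Shrink by a randomized fraction of the object's current size.
        cut = max(1, min(int(objects[i]["size"] * rng.uniform(0.25, 0.75)),
                         objects[i]["size"] - min_size))
        objects[i]["size"] -= cut
        freed += cut
    return freed
```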
In the following, an illustrative example of managing cache objects in a data cache in accordance with one or more aspects of the present invention is presented.
FIG. 44 is a block diagram of a data cache 300 including a plurality of cache objects 302-310 in accordance with one or more embodiments of the present invention. FIG. 45 is a block diagram of the data cache 300 of FIG. 44 in which cache objects 302, 304, 308, and 310 have been resized in accordance with one or more embodiments of the present invention. FIG. 46 is a block diagram of the data cache of FIG. 45 in which an accessed cache object 306 has been resized and restored to cache 300 in accordance with one or more embodiments of the present invention.
In one or more embodiments, including that shown in FIG. 44, cache 300 may include five cache objects (abbreviated "CO" in FIGS. 44-47 for the sake of brevity) CO 1 302, CO 2 304, CO 3 306, CO 4 308, and CO 5 310. The number of cache objects (5) shown in FIG. 44 has been selected for convenience in illustrating various concepts described herein. However, fewer or more than five cache objects may be included within cache 300. There is in principle no lower limit to the number of cache objects which may be included within cache 300. In principle, an upper limit of the number of cache objects that may be included within cache 300 may correspond to the total size of the cache divided by the smallest acceptable possible cache object size, which is discussed elsewhere herein.
In FIGS. 44-46, for the sake of describing various concepts disclosed herein, the width of each illustrated cache object is intended to be proportional to the data storage space used by that cache object. Also, in FIGS. 44-46, proceeding from the left-most cache object to the right-most cache object corresponds to increasing recency of access of the displayed cache objects, with the least recently used of the cache objects shown at the extreme left, and the most recently used of the cache objects shown at the extreme right.
FIG. 45 shows cache 300 after CO 3 306 has been accessed by a computing system, such as computing system 900, employing cache 300. In this example, CO 3 306 is not shown within cache 300 in its original position, since this position has been overwritten in the condition of cache 300 shown in FIG. 45. Moreover, free space 402 has been created to make room for an expanded version of CO 3 306 which may occupy more data storage space within cache 300 than did the original version of CO 3 in FIG. 44. In FIG. 46, an expanded version of CO 306 has been written into a portion of cache 300 which was occupied by free space 402 in the cache 300 condition shown in FIG. 45.
FIG. 47 is a flow diagram of a method 600 for accessing a cache object in data cache 300 and updating the data cache 300 in accordance with one or more embodiments of the present invention. Reference is made to FIGS. 3-6 in the following.
At step 602, cache objects 302, 304, 306, 308, and 310 are provided to a program having access to cache 300. The group of cache objects initially present in cache 300 is shown in FIG. 44. This initial condition of cache 300 may result from a default setting in one or more programs or from program steps previously executed within one or more programs. In any case, it is the changes made to cache 300 after establishment of the initial condition shown in FIG. 44 that are of interest in the following discussion. Although any one of many programs operating on any one of various computing systems may be in communication with cache 300, for the sake of convenience, software which may access cache 300 is referred to merely as "the program" in the following.
At step 604, an indication may be provided as to which cache object will be the next to be used by the program. At step 606, the indicated cache object, which in this example is CO 3 306, may be accessed by the program.
At step 608, a determination may be made as to whether the accessed cache object is expandable. Herein, a cache object is expandable when it may occupy more data storage space by being subject to a lower compression ratio. Such expansion may be accomplished by supplementing the data already present in the cache object rather than by providing a completely new set of data corresponding to the new compression ratio (or corresponding to a lack of compression) . If the accessed cache object is not expandable, it is preferably restored to cache 300 in step 610. Preferably, in step 610, the restored cache object occupies the same amount of data storage space after being accessed as it did prior to such access. Consistent with the principles of LRU-MRU caching, upon being restored to cache 300, the accessed cache object may be written to the rightmost, or MRU end, of cache 300. Alternatively, however, the accessed cache object could be written to any part of cache 300. Continuing with this branch of method 600, the method 600 preferably ends at step 612.
With reference to step 608, if the accessed cache object, such as cache object 306, is expandable, it is preferably expanded (step 614) in accordance with one or more embodiments of the present invention. As previously discussed herein, expanding a cache object as described above preferably helps provide an arrangement in which the most recently and/or the most frequently accessed cache objects are stored in cache 300 at the highest quality levels.
In one or more embodiments, where there are "N" cache objects in a cache, the number of possible sizes (as measured in data storage space) of such cache objects may be limited to the quantity equal to log2 (N) . Establishing a limited, finite number of possible cache object sizes, as described above, preferably limits the computational expense of determining a new, expanded size for a cache object, such as CO 306, to be expanded in step 614.
In one or more embodiments, the amount of data storage space needed for the expanded (or otherwise stated, less compressed) version of CO 306 may be calculated by a computing system (not shown) having access to cache 300. Where cache 300 is not yet ready to receive the expanded version of CO 306, the expanded version of CO 306 may be written to another memory storage location (not shown) for temporary storage therein.
At step 616, data storage space 402 needed for storing an expanded version of CO 306 is preferably made available within cache 300. If there is sufficient space present within cache 300 to store an expanded version of CO 306 without altering any cache objects within cache 300, then a reduction in size of one or more of the cache objects in cache 300 may be omitted. However, where all or substantially all of the storage space in cache 300 was occupied prior to CO 306 being accessed, one or more of the cache objects other than CO 306 may undergo a reduction in size to free up space in cache 300 for storage of an expanded version of CO 306. In one or more embodiments, the number of cache object size reduction operations may be reduced where there is a limited number of possible cache object sizes. Limiting the number of cache object size reduction operations preferably operates to reduce the computational burden on a computing system accessing cache 300 and preferably provides for overall computing system efficiency. In one or more embodiments, there may be various ways to achieve a desired amount of data storage space clearing. Herein, the term "clearing" may correspond to making data storage space available in cache 300 by reducing the data storage space allocated to one or more cache objects within cache 300.
In one or more embodiments, the amount of data storage space to be cleared may correspond to the amount of additional storage needed by the expanded cache object over and above the space it occupied prior to its most recent access by a computing system. However, in other embodiments, the amount of space to be cleared may be smaller or greater than the amount of space by which the most recently accessed cache object has increased in size.
In one or more embodiments, the space cleared for the most recently used, expanded cache object may be at one end of cache 300, as is illustrated in FIG. 46. However, in other embodiments, the cleared space could be placed at other locations within cache 300.
In one or more embodiments, the data storage space to be made available may be provided at the expense of one or more of the cache objects of FIG. 44 other than CO 3 306 (the most recently used cache object). Specifically, it may be possible to provide the needed additional space by reducing the size of just one remaining cache object or by reducing the size of all but the most recently used cache object. Moreover, any number of cache objects in between these two extremes may be caused to shed storage space in favor of the expanded, most recently used cache object. In the following, all of the cache objects other than the most recently accessed cache object are considered to be "eligible for size reduction."
In one or more embodiments, the extent of size reduction of the one or more cache objects eligible for size reduction may be selected according to one or more considerations. In one embodiment, the cache objects eligible for size reduction may shed an equal or substantially equal amount of storage space.
In another embodiment, the eligible cache objects may shed an equal or substantially equal proportion of their pre-reduction size to clear space for the expanded, most recently used cache object.
In one or more other embodiments, the extent of size reduction of each cache object may be based on how recently the cache object was last accessed. Specifically, cache objects eligible for size reduction may shed progressively more storage space with decreasing recency of the last access thereof. Thus, under this approach, the most recently used of the cache objects eligible for size reduction may shed a relatively small amount of storage space, and the least recently used cache object may shed a relatively large amount of data storage space, with those cache objects in between these two extremes shedding intermediate amounts of storage space. While the discussion of storage space reduction herein is directed primarily to merely reducing the size of cache objects that are not the most recently accessed, in one or more embodiments, one or more cache objects may be removed from cache 300 to clear data storage space. Moreover, such cache object removal may be practiced either alone, or in combination with cache object data storage space reduction of cache objects that will remain within cache 300. In the embodiment of FIG. 46, all four cache objects 302, 304, 308, and 310 remaining in cache 300 have been reduced in size to clear data storage space for the writing of CO 306 to the position shown at the rightmost end of cache 300. However, in alternative embodiments, three or fewer of the four cache objects 302, 304, 308, and 310 eligible for size reduction could undergo size reduction. Preferably, the method ends at step 620.
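A compact sketch of this access-and-resize flow (FIG. 47 steps 606, 614, and 616) is given below, using a discrete set of allowed object sizes as discussed earlier. The class name, the particular size levels, and the policy of repeatedly squeezing the least recently used object (and evicting it once it reaches the minimum size) are illustrative assumptions; insertion of brand-new objects and the non-expandable branch (steps 608-612) are omitted.

```python
# A sketch of the Zeno-cache access flow: expand the accessed object, squeeze
# or evict least-recently-used objects to make room, and re-insert at the MRU end.
from collections import OrderedDict

class ZenoCache:
    def __init__(self, capacity_bytes, size_levels=(1024, 4096, 16384, 65536)):
        self.capacity = capacity_bytes
        self.levels = sorted(size_levels)    # allowed per-object sizes, small -> large
        self.objects = OrderedDict()         # key -> current size; order is LRU -> MRU

    def used(self):
        return sum(self.objects.values())

    def access(self, key):
        """Access an object already in the cache: expand it and move it to the MRU end."""
        size = self.objects.pop(key)
        larger = [s for s in self.levels if s > size]
        new_size = larger[0] if larger else size          # expand one level if possible (step 614)
        # Clear space (step 616): repeatedly shrink, and eventually evict, the LRU object.
        while self.objects and self.used() + new_size > self.capacity:
            lru_key = next(iter(self.objects))
            smaller = [s for s in self.levels if s < self.objects[lru_key]]
            if smaller:
                self.objects[lru_key] = smaller[-1]       # drop one size level
            else:
                del self.objects[lru_key]                 # evict once at the minimum size
        self.objects[key] = new_size                      # re-insert at the MRU end
```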
In one or more embodiments, rather than managing objects in the cache employing only the recency of use of each cache object as a variable in determining cache object size, cache object management may also involve intelligent guessing about which objects might be needed next. Thus, objects less likely to be needed may be "squeezed" before objects with a higher likelihood of being needed in the future. In one or more embodiments, this guessing approach could be combined with an algorithm that may randomly select objects in the cache for squeezing and which may additionally generate a randomly varying amount of squeezing for the objects selected. FIG. 48 is a block diagram of a computing system 900 adaptable for use with one or more embodiments of the present invention. In one or more embodiments, central processing unit (CPU) 902 may be coupled to bus 904. In addition, bus 904 may be coupled to inventive cache 300, random access memory (RAM) 906, read only memory (ROM) 908, input/output (I/O) adapter 910, communications adapter 922, user interface adapter 916, and display adapter 918.
In one or more embodiments, RAM 906 and/or ROM 908 may hold user data, system data, and/or programs. I/O adapter 910 may connect storage devices, such as hard drive 912, a CD-ROM (not shown), or other mass storage device, to computing system 900. Communications adapter 922 may couple computing system 900 to a local, wide-area, or Internet network 924. User interface adapter 916 may couple user input devices, such as keyboard 926 and/or pointing device 914, to computing system 900. Moreover, display adapter 918 may be driven by CPU 902 to control the display on display device 920. CPU 902 may be any general purpose CPU.
It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above. One or more embodiments of the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.
SYSTEM AND METHOD FOR EFFICIENTLY ENCODING DATA
BACKGROUND OF THE INVENTION
Recently, image compression standards such as JPEG2000/JPIP (JPEG 2000 Interactive Protocol) have been introduced to meet a demanding engineering goal: to enable very large images (i.e. gigapixels in size) to be delivered incrementally or selectively from a server to a client over a low-bandwidth communication channel. (See David Taubman's implementation of Kakadu at www.kakadusoftware.com). When such images are being viewed at full resolution, only a limited region of each image can fit on a client's graphical display at any given time. The new standards are geared toward selectively accessing such regions and sending across the communication channel only data relevant to the region. If this "region of interest" or ROI changes continuously, then a continuous dialogue between a client and server over a low-bandwidth channel can keep the client's representation of the area inside the ROI accurate. However, existing approaches generally require the transmission of substantial amounts of data to significantly shift the location of the region of interest, thus limiting the speed at which such shifts may be implemented. Moreover, existing approaches generally depend on sequential access to a linear string of text, thereby imposing a significant burden on text navigation when a client seeks to significantly change the location of the region of interest. Accordingly, an improved method for transmitting data from a server to a client is needed.
SUMMARY OF THE INVENTION
According to one or more embodiments, the invention provides a method that may include converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values. Preferably, the plurality of visual features comprise at least one graphical symbol. Preferably, the visual feature identification data comprises ASCII codes. Preferably, the visual feature identification data comprises frequency-order positions of the visual features. Preferably, the visual feature identification data is determined based on a combination of a) a frequency of use of the visual features and b) transition probabilities between successive ones of the visual features. Preferably, the pixel characteristic data values comprise at least one data type selected from the group consisting of: pixel intensity data values, pixel color data values, and image contrast data values.
Preferably, the pixel characteristic data values comprise pixel color data values. Preferably, the pixel color data values comprise at least one color-data value type selected from the group consisting of: RGB (Red, Green, Blue) pixel color data values, HSV (Hue Saturation Value) pixel color data values, and CMYK (Cyan, Magenta, Yellow, Black) color data values. Preferably, the method further comprises losslessly compressing said image file. Preferably, the visual feature identification data comprises differences in values of the identification data for successive ones of the visual features. Preferably, the visual features comprise alphanumeric characters. Preferably, the method further comprises locating the pixels in said image file in the same order that said alphanumeric characters are ordered in, within an original document. Preferably, the method further comprises transmitting the image file to a client. Preferably, the method further comprises decoding at least a region of said image file into the alphanumeric characters for viewing by the client. Preferably, the method further comprises enabling the client to browse the decoded image file region in a first dimension.
Preferably, the method further comprises enabling the client to browse the decoded image file region in first and second dimensions. Preferably, browsing in said first dimension includes advancing along a line of the alphanumeric characters, and browsing in the second dimension preferably includes scrolling over a plurality of lines of the alphanumeric characters. Preferably, the browsing comprises browsing the decoded image file in a manner that emulates browsing within the original document. Preferably, the method further comprises correlating locations of the pixels in the image file with locations of the alphanumeric characters corresponding to said pixels in an original document containing the alphanumeric characters. Preferably, the pixels' locations within the image file at least substantially correspond to locations of the corresponding alphanumeric characters in the original document. Preferably, the pixels' locations within the image file are substantially the same as locations of the corresponding alphanumeric characters in the original document. Preferably, the method further comprises transmitting at least a region of the image file from a server to a client. Preferably, the method further comprises requesting a region of the image file from a server by a client.
Preferably, the method further comprises sending the requested region of the image file by the server. Preferably, the sending step comprises transmitting compressed data suitable for decoding by the client.
According to another aspect, one or more embodiments of the present invention provide an apparatus that may include a processor operating under the instructions of a software program, the software program causing the apparatus to perform actions, including converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values.
According to another aspect, one or more embodiments of the present invention may provide a storage medium containing a software program operable to cause an apparatus including a processor under the instructions of the software program to perform actions, comprising: converting data identifying a plurality of visual features into a plurality of pixel characteristic data values; and forming an image file with pixels having the respective pixel characteristic data values.
Other aspects, features, advantages, etc. will become apparent to one skilled in the art when the description of the preferred embodiments of the invention herein is taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
For the purposes of illustrating the various aspects of the invention, there are shown in the drawings forms that are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
FIG. 49 is a text image of the full text of "Ulysses," using raw ASCII encoding in accordance with one or more embodiments of the present invention, with the color White having a numerical value of 0 and the color Black having a numerical value of 255;
FIG. 50 is a text image of the first five chapters of Ulysses, encoded using raw ASCII in accordance with one or more embodiments of the present invention;
FIG. 51 is a text image of the first five chapters of Ulysses encoded using frequency-order encoding in accordance with one or more embodiments of the present invention; and
FIG. 52 is a block diagram of a computing system that may be adaptable for use with one or more embodiments of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
One or more embodiments of the present invention may involve extending selectively decompressable image compression and transmission technologies to textual or other data that may be identified using a binary representation.
Herein, binary data that identify and/or describe visual features may be converted from an initial format, such as ASCII (American Standard Code for Information Interchange) or another format, into a format suitable for incorporation into image data, such as but not limited to pixel intensity data. The visual features referred to above may include but are not limited to graphical symbols and/or alphanumeric characters. However, herein, visual features may include any visible image features that may be described and/or identified using binary data. Moreover, while such data may be encoded into pixel intensity data, the present invention is not limited to encoding in this format.
In one or more embodiments, the initial data (visual feature identification data) may be converted into several possible types of pixel characteristic data including but not limited to pixel intensity data, pixel color data, image contrast data, and/or other forms of image data. The above-mentioned pixel color data may include but is not limited to Red, Green, Blue (RGB) pixel color data, Hue Saturation Value (HSV) pixel color data, Cyan, Magenta, Yellow, Black (CMYK) pixel color data, and/or other forms of pixel color data. In the following discussion, a large text, specifically the book "Ulysses," by James Joyce, is considered. In one or more embodiments, this text may be formatted by putting each chapter in its own column, with columns for successive chapters arranged from left to right. However, it will be appreciated that other arrangements of the chapters may be implemented. Columns are assumed to have a maximum width in characters, such as 100. FIG. 49 shows the entire text of Ulysses encoded as an image in this fashion, with each pixel within FIG. 49 corresponding to a single text character (or a character position that does not include text, such as empty space) in the original text document. FIG. 50 shows a text image of the first five chapters of Ulysses, using the same encoding method as in FIG. 49.
In one or more embodiments, the intensity value of each pixel may be set equal to the ASCII code of the character being encoded in the pixel. Because grayscale pixels and ASCII characters may both be represented using 8-bit sequences (which may both have values in the range 0-255), the correspondence between a pixel value and a character may be readily implemented. In this disclosure, although textual and other characters may be represented with pixels using the ASCII code as a value for pixel intensity, it will be appreciated that other codes for textual or other characters may be employed for this purpose.
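The following sketch illustrates this character-to-pixel mapping. The 100-character column width and the use of NumPy arrays are assumptions made for illustration; any array representation with one byte per character position would serve equally well.

```python
# Sketch of building a character-pixel image ("text-image") in which each
# pixel intensity equals the ASCII code of the corresponding character.
import numpy as np

def text_to_image(text, width=100):
    lines = text.splitlines() or [""]
    img = np.zeros((len(lines), width), dtype=np.uint8)  # 0 = empty space
    for row, line in enumerate(lines):
        for col, ch in enumerate(line[:width]):
            img[row, col] = ord(ch) & 0xFF  # ASCII code as pixel intensity
    return img

def image_to_text(img):
    """Decode the text-image back into text, dropping the zero padding."""
    return "\n".join("".join(chr(v) for v in row if v != 0) for row in img)

sample = "Stately, plump Buck Mulligan came from the stairhead"
assert image_to_text(text_to_image(sample)) == sample
```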
Generally, the full text of Ulysses in ordinary ASCII representation (i.e. as a standard text file) occupies 1.5MB of storage space, which may be too large to transmit in its entirety over a narrowband communication channel. The pixel-characteristic-data representation of character data (which is known herein as the "character-pixel image" and also as the "text-image") of FIG. 49, encoded as a lossless JPEG2000 image, requires about 2.2MB (Megabytes) of data storage space. This storage space requirement could be significantly reduced if the chapters of the book were more equal in length, resulting in less empty space (encoded as zeros) in the character-pixel image (text-image). However, more important than the overall file size is the ability of an ordinary JPIP server to serve this file to a client selectively and incrementally. Specifically, of concern here is the ability of a server to serve selected portions of the file at controllable increments of resolution. In one or more embodiments, a client viewing the text at a zoom level sufficient to read the characters (which may require well over 1 pixel/character on the client-side display) can use JPIP (or other suitable protocol) to request only the relevant portion of the text. This operation is efficient, and adequate performance could be achieved for a reader of the text even with a very low bandwidth connection to the server, under conditions that would make it prohibitive to download the entire text, due to the magnitude of data involved in such a download. In one or more embodiments, similar effects may be achieved using a client/server technology specifically designed for selective access to large texts, but the character-pixel image approach described above has a number of advantages over conventional implementations, which are listed below.
1) The character-pixel image approach may use existing technology and protocols designed for very large data volume.
2) The character-pixel image approach may easily scale up to texts of many gigabytes in size, or even larger, without any degradation of performance.
3) Access to text or other visual features within a region of interest is preferably two-dimensional in one or more embodiments of the present invention. Thus, in many situations (for example, when text is to be viewed in columns as in the Ulysses case), the character-pixel image approach disclosed herein preferably enables more efficient browsing than is available using existing approaches that deal with text as a one-dimensional string.
4) Because wavelets are used in JPEG2000, the character-pixel image is preferably subject to a multi-resolution representation. Although the text, or other visual feature identified using visual feature identification data, will generally not be readable at other than the final (most detailed) resolution, the broader spatial support of lower-resolution wavelets preferably provides an intelligent client-side cache for text, and/or other visual features, near the region of interest. Moving the ROI over various regions of the original text, as may be done when scrolling through text in a traditional manner, may therefore be efficiently performed using one or more embodiments of the present invention.
In one or more embodiments, the above concepts may be readily extended to deal with formatted text, Unicode, or other metadata, as all such data can be represented using ASCII text strings, possibly with embedded escape sequences. In selected applications, JPEG2000 may be used as a lossy-compression format, meaning that the decoded image bytes are not necessarily identical to the original bytes. Herein, the term "decoding" refers to converting pixel data in a text image back into the original text data, or other visual feature data. If the image bytes represent text, lossy compression will generally not be acceptable. One of the design goals of JPEG2000 was, however, to support lossless compression efficiently, as this is important for certain imaging functions (such as for medical and scientific applications). Lossless compression ratios for photographic images are typically only around 2:1, as compared with visually acceptable lossy images, which can usually easily be compressed by 24:1. Image compression, both lossy and lossless, generally operates best on images that have good spatial continuity, meaning that the differences between the intensity values of adjacent pixels are low. The raw ASCII encoding is not optimal from this perspective, since successive text characters encoded in ASCII may have widely varying values. Thus, some alternatives are considered below.
FIG. 51 is a text image of the first five chapters of Ulysses encoded using frequency-order encoding in accordance with one or more embodiments of the present invention.
Frequency Encoding
In one or more embodiments of the present invention, the encoding efficiency may be improved by reordering characters according to their frequency of occurrence in the pertinent text, in the English language as a whole, or in another language as a whole, from highest frequency to lowest frequency. Thus, in one or more embodiments, empty space would have code zero, and a pixel value of zero in the character-pixel image. The "space" character could receive code "one" (with its corresponding value in the character-pixel image also being "1"). A sequence of characters such as e, t, a, o, i, n, s, r, h, l, etc. could be caused to convert to successive pixel values starting with "2" (corresponding to "e") and proceeding upward therefrom to the value 255.
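A sketch of such a frequency-order code assignment follows. The use of Python's Counter and the clamping of ranks above 255 are illustrative assumptions; a production encoder would need a policy for alphabets larger than 255 symbols.

```python
# Sketch of frequency-order encoding: code 0 is reserved for empty space
# (padding), code 1 for the space character, and remaining characters are
# numbered from 2 upward in order of decreasing frequency in the text.
from collections import Counter

def build_frequency_codes(text):
    counts = Counter(ch for ch in text if ch != " ")
    codes = {" ": 1}
    for rank, (ch, _) in enumerate(counts.most_common(), start=2):
        codes[ch] = min(rank, 255)  # clamp; a real encoder would handle overflow
    return codes

text = "the quick brown fox jumps over the lazy dog the end"
codes = build_frequency_codes(text)
pixels = [codes[ch] for ch in text]   # pixel values for the character-pixel image
print(sorted(codes.items(), key=lambda kv: kv[1])[:6])
```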
It is possible that, upon converting all the characters in a large text into pixel values using this approach, all 256 possible pixel values could end up being used. However, by the very nature of the text character (or other visual feature) to pixel value conversion contemplated herein, pixel value occurrences preferably become increasingly rare with increasing numerical values thereof.
The image of FIG. 51 illustrates this point. Pixel values tend to cluster near zero. At least as importantly, the difference in pixel values between successive character-pixel values in the embodiment of FIG. 51 is preferably lower than in the embodiment of FIG. 50.
Where all pixel values in the range 0-255 are equally likely, eight bits will generally be needed to represent each pixel value. However, in embodiments in which some pixel values occur much more frequently than others, the pixel values can preferably be represented with fewer bits, without losing any information. An example is considered to illustrate this point. In this extreme case, the pixel value equals zero 99% of the time, and has another value, somewhere between 1 and 255, the rest of the time. In this case, the encoding algorithm may encode the 0 value with a single "0" bit, and non-zero values with a leading "1" bit (to signal the presence of a non-zero value) followed by an 8-bit representation of the non-zero value. Thus, this approach conserves 7 bits per pixel for 99 out of 100 pixels, but uses one extra bit to represent a non-zero pixel value, which occurs only 1% of the time. The decoding algorithm corresponding to the above-described encoding algorithm thus preferably interprets a "0" bit as representing a pixel value of zero, and interprets a bit sequence starting with a "1" bit as having the value represented by the eight bits succeeding the leading "1" bit.
However, even in less extreme situations, the existence of pixel values that occur much more frequently than others may enable considerable savings in storage space, without incurring any loss of pixel data, and by logical extension without incurring any loss of the visual-feature data represented by the pixel data. In general, two or more categories of value occurrence frequency may be established, generally using a progressively increasing number of bits to represent values occurring with decreasing frequency. In this manner, smaller bit sequences may be used most of the time, thereby conserving data storage space and reducing data communication bandwidth requirements. In an intermediate example, five bits could be used to represent the most frequently occurring pixel values, and nine bits for the less frequently occurring values. For the most frequently occurring values, a leading bit, which may be a "0", may be provided, followed by four bits representing the actual value of the pixel. For the less frequently occurring pixel values, a leading bit, which may be a "1", may be provided, followed by eight bits representing the actual value of the pixel.
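The two-tier scheme just described can be sketched as follows. The bit-string representation is for clarity only; an actual implementation would pack the bits into bytes.

```python
# Sketch of the two-tier prefix code described above: values 0-15 are written
# as "0" plus four bits, all other values as "1" plus eight bits.
def encode_values(values):
    bits = []
    for v in values:
        if v < 16:
            bits.append("0" + format(v, "04b"))   # frequent values: 5 bits total
        else:
            bits.append("1" + format(v, "08b"))   # rarer values: 9 bits total
    return "".join(bits)

def decode_values(bits):
    values, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            values.append(int(bits[i + 1:i + 5], 2)); i += 5
        else:
            values.append(int(bits[i + 1:i + 9], 2)); i += 9
    return values

vals = [0, 1, 2, 14, 200, 3, 0, 0, 17]
assert decode_values(encode_values(vals)) == vals
```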
In one or more other embodiments, frequency encoding may benefit from spatial coherence to represent a text image using a reduced number of bits. Specifically, the image may be divided into 8X8 pixel blocks, thus providing 64 pixels in each block, with each pixel representing a frequency encoded visual feature (which may be a text character) . The encoding method may then review each block and determine the number of bits needed to represent the highest-valued pixel value in the block. This determined number of bits may then be used to represent all pixel values in that block.
For many of the blocks within any given image, the highest pixel value may be representable with four or fewer bits. Thus, considerable savings in data storage requirements may be obtained when employing this block-by-block approach.
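The block-based approach may be sketched as below. Only the bit budget is computed; the 4-bit per-block header recording each block's bit width is an assumption, as the description above leaves the header format open.

```python
# Sketch of the block-based scheme: the image is split into 8x8 blocks and
# every pixel in a block is stored with just enough bits to hold that block's
# largest value. The actual bit packing is omitted.
import numpy as np

def block_bit_cost(img, block=8):
    h, w = img.shape
    total_bits = 0
    for r in range(0, h, block):
        for c in range(0, w, block):
            tile = img[r:r + block, c:c + block]
            bits_per_pixel = max(1, int(np.max(tile)).bit_length())
            total_bits += 4                      # assumed header: block's bit width
            total_bits += bits_per_pixel * tile.size
    return total_bits

img = np.zeros((64, 64), dtype=np.uint8)
img[10, 10] = 200                                # one rare, high-valued pixel
print(block_bit_cost(img), "bits vs", img.size * 8, "bits at 8 bpp")
```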
In one or more embodiments, when the frequency-encoded text-image of Ulysses is compressed losslessly as a JPEG2000 image, the file size is 1.6MB, barely larger than the raw ASCII text file (1.5MB), and 37% smaller than the ASCII encoded text-image. With further optimizations of the letter encoding, the compressed file size can drop well below the ASCII text file size. The further optimizations can include, but are not limited to: using letter transition probabilities (Markov-1) to develop the encoding, instead of just frequencies (Markov-0); and/or encoding as pixels the delta or difference between one character and the next, rather than the characters themselves.
With these added optimizations, text ready to be served in this fashion may actually take up less data storage space than the raw ASCII.
One or more embodiments of the present invention discussed herein include using JPEG2000/JPIP as a selective image decompression technology, but the present invention is not limited to using this image compression technology. Other image compression formats and protocols may be employed in conjunction with the present invention, including but not limited to, for example, LizardTech's MrSID format and protocol, which has similar properties.
FIG. 52 is a block diagram of a computing system 1400 adaptable for use with one or more embodiments of the present invention. In one or more embodiments, central processing unit (CPU) 1402 may be coupled to bus 1404. In addition, bus 1404 may be coupled to random access memory (RAM) 1406, read only memory (ROM) 1408, input/output (I/O) adapter 1410, communications adapter 1422, user interface adapter 1416, and display adapter 1418.
In one or more embodiments, RAM 1406 and/or ROM 1408 may hold user data, system data, and/or programs. I/O adapter 1410 may connect storage devices, such as hard drive 1412, a CD-ROM (not shown), or other mass storage device, to computing system 1400. Communications adapter 1422 may couple computing system 1400 to a local, wide-area, or Internet network 1424. User interface adapter 1416 may couple user input devices, such as keyboard 1426 and/or pointing device 1414, to computing system 1400. Moreover, display adapter 1418 may be driven by CPU 1402 to control the display on display device 1420. CPU 1402 may be any general purpose CPU.
It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above. One or more embodiments of the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit. Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.
SYSTEM AND METHOD FOR MANAGING COMMUNICATION AND/OR STORAGE OF
IMAGE DATA
BACKGROUND OF THE INVENTION
Recently developed image compression and transmission standards such as JPEG2000/JPIP have enabled the interactive display of large images (i.e. gigapixels in size) over narrow bandwidth communication channels. However, these emerging standards and technologies do not provide means for achieving a more ambitious goal: to allow flexible visual interaction with a very large number of images simultaneously, each of which may also potentially be very large. Accordingly, there is a need in the art for an improved system and method for transmitting and/or storing image data.
SUMMARY OF THE INVENTION
According to one aspect, the present invention provides a method that may include establishing communication between a first computer and a second computer over a communication link, the second computer having an image collection stored therein in the form of compressed image data; selecting a plurality of images in the collection for communication to said first computer; and transmitting low-resolution image data for all of the selected images from the second computer to the first computer before transmitting full-resolution image data for any of the selected images.
Other aspects, features, advantages, etc. will become apparent to one skilled in the art when the description of the preferred embodiments of the invention herein is taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
For the purposes of illustrating the various aspects of the invention, there are shown in the drawings forms that are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
FIG. 53 is a block diagram of a system that may be connected to enable communication of image data between a plurality of computers in accordance with one or more embodiments of the present invention;
FIG. 54 is a block diagram of an image having at least two regions of interest therein in accordance with one or more embodiments of the present invention;
FIG. 55 is a block diagram of a "virtual book" that employs aspects of the technology disclosed herein in accordance with one or more embodiments of the present invention;
FIG. 56 is an illustration of a three-dimensional version of the virtual book of FIG. 55 in accordance with one or more embodiments of the present invention;
FIG. 57 is a block diagram of a system for managing image data communication between one or more portable devices and one or more other computers in accordance with one or more embodiments of the present invention;
FIG. 58A illustrates the results of an incomplete image data download employing an existing approach;
FIG. 58B illustrates the results of an incomplete image data download in accordance with one or more embodiments of the present invention;
FIG. 59 is a block diagram of a "common space" that may include a physical display (screen) and two virtual displays in accordance with one or more embodiments of the present invention;
FIG. 60 illustrates a collection of over one thousand images (a collection of digitized maps of various sizes) packed into a montage in accordance with one or more embodiments of the present invention;
FIG. 61 illustrates a snapshot of about three thousand images that have been dynamically re-arranged into a random configuration in accordance with one or more embodiments of the present invention; and
FIG. 62 is a block diagram of a computer system that may be adaptable for use with one or more embodiments of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIG. 53 is a block diagram of a system 170 that may be connected to enable communication of image data between a plurality of computers in accordance with one or more embodiments of the present invention. System 170 preferably includes client computer 182 which is connected to display 184 and data storage device 186. System 170 preferably also includes server computer 188 which may be connected to data storage device 190. Server computer 188 may also be connected to the Internet 180.
In one or more embodiments, image data may be communicated between a plurality of computers 182, 188 so as to enable viewing of large collections of potentially large images using a relatively low-bandwidth connection therebetween. For instance, desirable viewing and navigation of images stored at server computer 188 may be accomplished by transmitting selected portions of the image data stored at server computer 188 at controllable levels of resolution. The selectivity of image data 114 may be such as to select a particular image at high resolution, or even a selected portion of a particular image at high resolution.
Herein, various embodiments are discussed that include varying the types of devices used as client computer 182 and server 188, the types of image data 114 transmitted therebetween, and various applications of the ability to transmit selected image data at specified levels of resolution.
FIG. 54 is a block diagram of an image 250 having at least two regions of interest 252, 254 therein in accordance with one or more embodiments of the present invention. Image 250 could be a subset of image data 114. Alternatively, image data 114 could represent a subset of image 250, depending upon what image data is requested by client computer 182. In one or more embodiments, image 250 may be stored in compressed form on server computer 188, or within storage device 190. Preferably, when stored in this manner, data for a plurality of resolution levels for various regions of image 250 may be stored and may be requested for downloading by client computer 182.
In one or more embodiments, the resolution level at which a particular image or region of an image is stored on client computer 182 may be readily increased or decreased. Where a prior download results in storage of a region or image at a first resolution level (which may be less-than-full resolution), this first resolution level may be increased by adding data representing the next higher level of resolution, preferably without having to discard the data representing the first resolution, thereby avoiding redundancy and increasing the efficiency of the image data communication contemplated herein. Conversely, the resolution level of a region or image stored at client 182 may be decreased by discarding the highest level of resolution stored therein, without losing data corresponding to the lower levels of resolution for the same region or image. Such resolution reduction may be practiced at client 182 to clear data storage space needed for one or more regions or images other than the one for which data is being discarded.
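A minimal sketch of this add/drop behavior is given below. The per-region list of level payloads and the method names are assumptions made for illustration; in a JPEG2000/JPIP client the levels would correspond to wavelet decomposition levels.

```python
# Sketch of the behaviour described above: a client keeps, per image region,
# an ordered list of resolution-level payloads and can grow or shrink the
# stored resolution without retransmitting or losing the lower levels.
class RegionStore:
    def __init__(self):
        self.levels = {}          # region id -> [bytes for level 0, level 1, ...]

    def add_level(self, region_id, level_data):
        """Append the next-higher resolution level received from the server."""
        self.levels.setdefault(region_id, []).append(level_data)

    def drop_highest_level(self, region_id):
        """Free space by discarding only the finest stored level."""
        stored = self.levels.get(region_id)
        if stored:
            return stored.pop()   # lower levels remain usable for display

    def stored_bytes(self, region_id):
        return sum(len(d) for d in self.levels.get(region_id, []))

store = RegionStore()
store.add_level("image-250/roi-252", b"L0")       # coarse data
store.add_level("image-250/roi-252", b"L1-more")  # refinement, no re-download of L0
store.drop_highest_level("image-250/roi-252")     # back to the coarse view
```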
The pertinent image compression may be provided by, for instance, the use of JPEG2000 or another discrete wavelet transform-based image compression scheme. However, the present invention is not limited to the use of any particular compression format or image data representation. Other formats may be employed, including image formats whose sizes in bytes are not substantially smaller than the uncompressed image data. It is merely preferable that the selected image format be susceptible to multiscale representation and storage of image data.
In one or more embodiments, client computer 182 may seek to download one or more regions of image 250, where such regions may be portions of image 250. The one or more regions of interest 252, 254 may be the only ones that client computer 182 seeks to download. Alternatively, client computer (client) 182 may merely seek to download one or more selected regions at higher resolution than the resolution at which the remainder of image 250 is downloaded. In either case, client 182 may request a download by identifying both a specified region of image 250 to download and a resolution level at which this specified region will be provided by server computer (server) 188.
In the example of FIG. 54, client 182 preferably requests a download of all of image 250 at low resolution. (The exact resolution level at which the bulk of image 250 is downloaded is not pertinent to this discussion.) However, client 182 seeks to download region of interest 1 252 at a higher resolution, or even at full resolution. Accordingly, client 182 preferably specifies the coordinates and the desired resolution level of region of interest 1 252 to server 188. Thus, in addition to downloading the bulk (including that portion external to region of interest 1 252) of image 250 at low resolution, client 182 preferably downloads region of interest 1 252 at the specified higher resolution. In other situations, client 182 could seek to download only the region(s) of interest and omit a download of the remainder of image 250.
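A request of the kind described above might be sketched as follows. The field names and the JSON transport are assumptions; a JPIP client would express the same region-plus-resolution request in protocol-specific syntax.

```python
# Sketch of a region-of-interest request: the client names the image, the
# region's pixel coordinates, and the resolution level it wants.
import json
from dataclasses import dataclass, asdict

@dataclass
class RegionRequest:
    image_id: str
    x: int            # left edge of the region, in full-resolution pixels
    y: int            # top edge of the region
    width: int
    height: int
    level: int        # 0 = coarsest stored level, higher = finer

requests = [
    RegionRequest("image-250", 0, 0, 4096, 4096, level=0),     # whole image, low res
    RegionRequest("image-250", 512, 768, 256, 256, level=5),   # region of interest 1 252
]
payload = json.dumps([asdict(r) for r in requests])
print(payload)
```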
In this manner, a user of client computer 182 may view region of interest 1 252 at high resolution without having to download the entirety of image 250 at this high resolution. Thus, a relatively low-bandwidth data communication link between client 182 and server 188 could nevertheless transmit the entirety of image 250, while providing a region of particular interest (in this case, region of interest 1 252) at high resolution, thereby providing the viewer with the same viewing experience with respect to the region of interest that would have occurred had client 182 downloaded the entirety of image 250 at the high resolution, this latter option however demanding considerably more download time and data storage space at client computer 182 or data storage device 186.
Shifting the Region of Interest
In one or more embodiments, a user of client computer 182 may wish to pan across image 250. Normally, panning from one region of interest 252 to another 254 would involve having both regions downloaded at client 182 at the level of resolution at which the regions will be viewed. Moreover, generally, all image territory in between region of interest 1 252 and region of interest 2 254 would be stored at client computer 182 to enable the described panning to occur. As described in the following, in one or more embodiments of the present invention, viewing of such regions of interest 252, 254 may be accomplished by downloading much less data and using less storage space at client computer 182 than in the approach described above.
In one or more embodiments, client 182 may shift from a high resolution view of region of interest 1 252 to region of interest 2 254. Preferably, image data corresponding to a low-resolution representation of region of interest 2 254 is already present in client computer 182 from the download of image 250, discussed above. In this case, all that is needed is to supplement the existing image data for region of interest 2 254 with additional image data describing the pertinent higher levels of resolution to arrive at a high- resolution rendition of region of interest 2 254 at client computer 182. If needed, image data representing the higher resolution levels of region of interest 1 252 may be discarded or overwritten to make space in data storage device 186 or other data storage space for the additional image data to be downloaded for region of interest 2 254.
In one or more embodiments, the shift in view from region of interest 1 252 to region of interest 2 254 may be accomplished gradually, to provide a viewer of display 184 with a viewing experience that may closely simulate that available on a computer that has the entirety of image 250 downloaded at high resolution. Specifically, the level of resolution at which region of interest 1 252 is displayed may be reduced gradually to the resolution level at which most of image 250 is represented. Thereafter, the view on display 184 may present a gradual pan across the low-resolution territory in between region of interest 1 252 and region of interest 2 254. Finally, upon arriving at region of interest 2 254, the view on display 184 may increase toward a high-resolution rendition of region of interest 2 254 either after completing the panning across image 250 or concurrently with the latter portion of this panning operation. Preferably, at the conclusion of the described process, region of interest 2 254 may be stored in client computer 182 at high resolution and may be displayed on display 184 at this high resolution.
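The gradual shift described above can be sketched as a sequence of view states. The step counts, the linear pan path, and the numeric resolution levels are illustrative assumptions; the description leaves the pacing of the transition open.

```python
# Sketch of the gradual shift between two regions of interest: the view first
# steps down in resolution toward the base level, pans across the
# low-resolution image, then steps back up at the destination.
def transition(src_center, dst_center, full_level, base_level, pan_steps=4):
    states = []
    # 1) gradually reduce resolution over region of interest 1
    for level in range(full_level, base_level - 1, -1):
        states.append((src_center, level))
    # 2) pan across the low-resolution territory in between
    (sx, sy), (dx, dy) = src_center, dst_center
    for i in range(1, pan_steps + 1):
        t = i / pan_steps
        states.append(((sx + (dx - sx) * t, sy + (dy - sy) * t), base_level))
    # 3) gradually raise resolution over region of interest 2
    for level in range(base_level + 1, full_level + 1):
        states.append((dst_center, level))
    return states

for center, level in transition((600, 800), (2400, 1900), full_level=5, base_level=2):
    print(center, "at level", level)
```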
FIG. 55 is a block diagram of a "virtual book" 350 that employs aspects of the technology disclosed herein in accordance with one or more embodiments of the present invention. Virtual book 350 may include display 352, backward cache 354, and forward cache 356. While the caches 354, 356 are each shown having two pages stored therein, any number of pages may be stored in either of caches 354 and 356. In one or more embodiments, virtual book 352 employs the ability to present selected image data at controllable levels of resolution for the particular case of virtual book 350. In virtual book 350, each image may be a page within display 352 of virtual book 350. Display 352 may correspond to display 184 of FIG. 53 or may be a special purpose display that accommodates the specific features of virtual book 350. Virtual book 350 may correspond to client computer 182 of FIG. 53, or may be a special purpose computer that is substantially limited to communicating, storing, and displaying pages of books .
In one or more embodiments, virtual book 350 may include only one page that is stored and/or displayed at full resolution, with other pages, both earlier and later in the sequence of pages displayed, at a variety of other resolutions.
In one or more embodiments, the page currently displayed on display 184, i.e. the active page, is displayed at full resolution, which is "page 10" in FIG. 55. In such embodiments, other pages may be displayed at progressively lower resolutions with increasing distance in pages from the active page. More specifically, the resolution at which each page is stored may equal the resolution of the active page being displayed in display 352 divided by the quantity equal to 2 raised to a power equal to the number of pages between each stored page and the active page. Thus, applying this approach, page 11 (in forward cache 356) and page 9 (in backward cache 354) may each occupy one half the amount of data storage space occupied by the active page in display 352. Continuing with this approach, page 12 (in forward cache 356) and page 8 (in backward cache 354) may each occupy one quarter the amount of data storage space occupied by the active page in display 352.
While in the above discussion, the amount of data storage space allocated to each page differs by a factor of two with respect to its immediately neighboring page, it will be appreciated by those of skill in the art that a value greater than or less than two may be employed as the division factor. Moreover, arithmetic formulae other than division of the data storage space of the active page by a constant may be employed to determine the allocation of data storage space to a succession of pages stored in caches 354 and 356. In one or more embodiments, a new active page may be selected in place of page 10 which is shown displayed in FIG. 55. The new selected page may, but need not, be a page immediately adjacent page 10 (either page 9 or page 11). That is, any page from 1 to the last page in the pertinent book (or any other type of publication with discrete pages) may be the new active page. In one or more embodiments, upon selection of the new active page, a transition between the currently active page and the new active page is preferably conducted. This transition to a new active page may include acquiring additional image data for the new active page to enable the new active page to be stored and/or displayed at full resolution. If the new active page is "page 11", and the "factor-of-two" embodiment, discussed above, is employed, the amount of data storage space allocated to page 11 will preferably double. Continuing with an application of the "factor-of-two" embodiment, the data storage space allocated to page 10 will preferably be halved as part of the transition away from page 10 and toward page 11, as the active page. The data for the active version of page 10 that is not included in the post-transition page 10 may be discarded (which may include overwriting thereof). Alternatively however, this "surplus" data for page 10 may be stored in another cache. Such caching of the page-10 surplus data may provide efficiency if a transition to page 10 occurs soon after (i.e. within a reasonable number of page transitions) the transition away therefrom.
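The factor-of-two allocation and the effect of a page turn can be sketched as follows. The budget figures and the dictionary return type are illustrative assumptions.

```python
# Sketch of the "factor-of-two" allocation: the active page gets the full
# budget and every page at distance d from it gets the full budget divided
# by 2**d.
def page_budgets(active_page, first_page, last_page, full_budget):
    return {
        page: full_budget // (2 ** abs(page - active_page))
        for page in range(first_page, last_page + 1)
    }

budgets = page_budgets(active_page=10, first_page=8, last_page=12, full_budget=1024)
print(budgets)        # {8: 256, 9: 512, 10: 1024, 11: 512, 12: 256}

# Turning to page 11 doubles its budget and halves page 10's, as described:
budgets_after = page_budgets(active_page=11, first_page=8, last_page=12, full_budget=1024)
print(budgets_after)  # {8: 128, 9: 256, 10: 512, 11: 1024, 12: 512}
```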
In one or more embodiments, the transition from page 10 to page 11 (or other new active page) may include a gradual fade-out from page 10 and gradual fade-in of page 11, to provide an experience that is visually pleasing and/or reminiscent of a physical page transition to the user of virtual book 350. Optionally, a sequence of images showing the folding and turning of the old active page may be provided to make the virtual page transition look still more reminiscent of a physical turn of a page.
FIG. 56 is an illustration of a three-dimensional version of the virtual book of FIG. 55 in accordance with one or more embodiments of the present invention. The embodiment of FIG. 56 illustrates a method in which an alpha channel, for partial transparency (the rough edges) may be stored as image information in addition to the red, green and blue color components. While color components are discussed above, for the sake of convenience, only a black and white rendition of the image of FIG. 56 is provided herein.
In one or more embodiments, hardware-accelerated texture mapping may be employed to support an alpha channel. Another feature that may be practiced in connection with either two-dimensional or three-dimensional embodiments of the virtual book is dynamic deformation of images, e.g. bending the pages of this book as they turn, as illustrated in FIG. 56.
Managing Image Data in One or More Portable Devices
In this section, a number of mechanisms for storing and interacting with digital images, based on progressive and interactive visual collection transmission, are described. In one or more embodiments of the present invention, variations on the methods disclosed herein allow near-instantaneous viewing, on a desktop computer, a mobile device, or other devices, of a large collection of images stored on a second mobile device; the use of remote storage to augment a mobile device's local memory for the purpose of viewing images; and browsing of large image collections from a mobile device. The various permutations enabled by one or more embodiments of the present invention may rely on a common client/server imaging and collection representation architecture.
One or more embodiments of the present invention may provide a method which may include providing a collection of digital images or other visual objects on a server; establishing communication between a client and said server; and enabling efficient multi-scale navigation by the client of collections of visual objects residing on the server. In this disclosure, the term "digital image data" may include digital photographs, digital images, visual documents or other forms of visual content. Herein, the term "image" generally corresponds to the term "digital image," and either of these terms may correspond to a "digital photograph." Herein, the term "client" generally corresponds to the term "client side" and to the term "client device". Herein, the terms "portable device", "portable camera device", and "camera device" generally refer to digital image capturing devices and/or digital image storage devices. Herein, a "digital image capturing device" may include but is not limited to a digital camera, a camera-enabled mobile phone (which may be referred to as a camera-enabled cell phone), a personal digital assistant, and/or a digital video recorder able to record digital still images. A "digital image capturing device" may include devices that are capable of receiving image data by directly optically receiving and recording such data (such as with a standard digital camera) and may also include devices that are able to receive image data via a wired or wireless Internet or other network connection.
One or more embodiments of the methods described herein may use a multi-resolution approach to address the problems of storing, synchronizing, browsing, and organizing collections of digital image data, which may be visual documents. Digital photos, which may be represented as arrays of color pixels at a certain resolution (e.g. 1024x768 pixels = 0.75 megapixels, 2592x1944 pixels = about 5 megapixels, etc.), are a common visual document type that end-users may create in large numbers using digital cameras, camera-enabled mobile phones, and digital video recorders, among other devices.
One or more of the methods described herein may also apply to visual data objects other than images, such as the roadmap or other vector data of Applicant reference document 489/17NP (U.S. Patent Application Serial No. 11/082,556) or the textual data of Applicant reference document 489/13 (U.S. Provisional Patent Application Serial No. 60/617,485). (Both documents are identified in greater detail at the beginning of this document, and both documents are incorporated by reference herein.)
A problem facing users of existing systems is that the camera devices can quickly create large numbers of potentially large visual documents. However, these devices typically don't have sufficient memory or visual browsing facilities to allow satisfactory archiving, viewing, or organization of these documents.
Digital photographs or other digital image data stored in a camera or other portable device are generally downloaded periodically to a desktop or notebook computer, cleared from the camera's memory to allow more pictures to be taken, and organized and/or viewed on the desktop or notebook computer. Thereafter, digital photographs may be shared with friends by posting a selection of digital photographs to one or more Internet sites.
Traditional Approach to Managing Image Data on Portable Devices
The following steps may be followed when using a conventional approach to managing image data on portable devices. First, a mobile device, which may be a digital camera or other digital image data capturing device, takes pictures. Then, potentially after some culling of the pictures, the pictures are downloaded to the camera user's PC (personal computer) and deleted from the camera device. The camera device's local storage may be limited and, in this conventional approach, only holds images transiently, until they are safely stored on the PC. The PC may permanently retain in its memory (e.g. hard disk drive or other non-volatile storage) any subset of the digital photos. The user may in turn upload some further culled subset of those images to a web server which may be owned by a web photo publishing service, typically at reduced resolution. The images uploaded may be made publicly viewable by any third party using a web browser on a PC or other device, or by some subset of those users with restricted access. Limitations of the existing approach may include lengthy download times from the camera device to the PC. There is also usually poor management of persistent storage on the camera device. Camera devices typically have small color displays on which viewers could theoretically view persistently stored images of the same type people commonly carry in their wallets (such as of family and pets) and photos associated with callers or other contacts on a PDA (Personal Digital Assistant). However, the limitations on persistent storage in existing camera devices make the above task difficult to achieve.
Moreover, existing camera devices impose other limitations. In existing camera devices, navigation of images stored on the camera device is generally awkward and difficult. In existing camera devices, there is a lack of a unified visual interface to image collections which would give users a consistent experience either on the camera device or on a PC. Existing camera devices tend to impose very restrictive limits on the number of photos that can be stored thereon before downloading becomes necessary. Thus, when employing existing approaches, a lengthy series of steps is generally involved in making images available to a third party.
Managing Image Data According to One or More Embodiments of the Invention
FIG. 57 is a block diagram of a system 550 for managing image data communication between one or more portable devices 562, 572 and one or more other computers in accordance with one or more embodiments of the present invention. System 550 may include a client side 560 and a server side 570. However, in alternative embodiments, the client and server statuses of the groupings of devices shown in FIG. 57 may be reversed. In one or more embodiments, system 550 may include portable device 1 562, portable device 2 572, personal computer 182 (which may be substantially the same as client computer 182 of FIG. 53), server 188 (which may be substantially the same as server computer 188 of FIG. 53) and/or additional computers 574. Preferably, each of devices 562, 572 and computers 182, 188, and 574 have memory and one or more displays included therewith. Alternatively or additionally, the devices and computers of FIG. 57 could be in communication with memories and/or displays. FIG. 57 illustrates various possible data paths useable in accordance with one or more embodiments of the present invention. One or more embodiments may use less than all of the data paths shown in FIG. 57. The available data paths shown in FIG. 57 may have one or more of the following features in common: 1) the data paths may involve a server side 570 (the originator of the image data) and a client side 560 (the recipient of the image data); 2) bi-directional data paths (which are illustrated with lines having arrows at both ends) indicate that the devices pointed to by these arrows are capable of serving in either a client or a server capacity; 3) the connections may employ a hard-wired network (e.g. Universal Serial Bus (USB), Firewire or Ethernet) or a wireless network (e.g., for nearby devices, Bluetooth, and for more remote connections, WiFi or a wireless wide-area networking protocol); and/or 4) the illustrated connections may or may not be ad-hoc. In one or more embodiments, both the client side 560 and the server side 570 may include one or more digital computing and/or storage devices including but not limited to: camera devices, personal computers, and personal digital assistants.
In one or more embodiments, a client device (client) may have one or more displays. The client may browse a collection of documents residing on the server using one or more of the efficient multi-resolution browsing methods described in Applicant reference document 489/15P (U.S. Provisional Application Serial No. 60/619,118, entitled "Method for Efficiently Interacting with Dynamic, Remote Photo Albums with Large Numbers of Potentially Large Images", which is incorporated herein by reference). These methods allow large collections of large images or other visual documents to be navigated efficiently over low-bandwidth connections. Zooming, panning, and dynamic rearrangement of such image collections are described in the referenced document.
In one or more embodiments, one of the properties of this navigation method is that the display contents may gradually come into focus as information is sent from the server to the client. The rate at which this information comes into focus may be governed by the ratio of connection bandwidth to display pixels. When the user zooms, pans, or rearranges the documents on the client side 560 such that new content becomes visible, this content again appears blurred, then comes into focus.
Virtual Display
In one or more embodiments, a client's "display" need not necessarily be physical, or visible to an end-user. In one or more embodiments, this display can be a "virtual display", i.e. an abstract model of a display with a specified resolution. Such a "virtual display" might be represented as an array of pixel values in the client's memory, irrespective of whether those pixel values are rendered to a screen. A virtual display may include wavelet data that at least partially describes one or more images. The wavelet data is preferably able to represent an image at a range of possible resolutions. In one or more embodiments, the wavelet data may correspond to that employed using JPEG2000. In one or more embodiments, a virtual display may include enough wavelet data to completely describe one or more images.
For example, if it were desirable for a device to acquire thumbnails of all of the images in a collection at a specified resolution, then this device could create a "virtual display" of the appropriate size, establish a connection with a server, and request a view of the entire collection. The full set of thumbnails could then be transmitted to and rendered on this "virtual display". If transmission were interrupted before all of the relevant data were sent from the server to the client, then the client's virtual display would not yet have all of the thumbnail images in perfectly focused condition. However, all of the requested thumbnail images would preferably be stored within the client's virtual display with sufficient resolution to enable rendering visible versions of these images on a screen. The images rendered in the manner described would generally be of lower visual quality than if the transmission of the image had concluded without interruption. Thus, some image degradation may be present in the images rendered using data from an incomplete, interrupted transmission.
Nevertheless, the described degradation is preferable to the prior art approach to sending a set of thumbnails across a network, in which the complete image of each thumbnail is transmitted in turn. Under this prior art approach, a premature interruption of connectivity would result in some thumbnails being available in their entirety (i.e. at full resolution) and would result in other thumbnails being completely unavailable. FIG. 58 illustrates this difference. FIG. 58A illustrates the results of an incomplete image data download employing an existing approach; and FIG. 58B illustrates the results of an incomplete image data download in accordance with one or more embodiments of the present invention.
FIG. 55A shows a prior art scenario in which all of the data for three thumbnails (shown with square shapes) have been received, and in which the remaining nine thumbnails (shown with X's) have not been received at all. FIG. 55B illustrates a situation that may arise employing one or more embodiments of the present invention, in which all twelve thumbnails
(shown as cross-hatched square shapes) have been received at some level of resolution, which is preferably acceptable for viewing, but which is likely below the resolution that would be obtained after conclusion of a complete and uninterrupted data transmission.
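The ordering difference illustrated in FIGS. 58A and 58B can be sketched as a choice of transmission order. The following Python sketch is only an illustration; the image identifiers, level counts, and chunk budget are hypothetical, and a real transmission would carry wavelet data rather than placeholder tuples. Interleaving resolution levels across all thumbnails leaves every image partially refined if the transfer is interrupted, rather than leaving some images complete and others absent:

    def serial_order(image_ids, levels):
        # Prior-art ordering: send every level of one image before moving on.
        return [(img, lvl) for img in image_ids for lvl in range(levels)]

    def interleaved_order(image_ids, levels):
        # Send level 0 of all images, then level 1 of all images, and so on.
        return [(img, lvl) for lvl in range(levels) for img in image_ids]

    image_ids = ["img%02d" % i for i in range(12)]
    budget = 18  # suppose the connection drops after 18 chunks have been sent

    received_serial = serial_order(image_ids, 4)[:budget]
    received_interleaved = interleaved_order(image_ids, 4)[:budget]

    # Serial: a few images fully received, the rest missing entirely (FIG. 58A).
    print(sorted({img for img, _ in received_serial}))
    # Interleaved: all twelve images received at least at level 0 (FIG. 58B).
    print(sorted({img for img, _ in received_interleaved}))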
In one or more embodiments, a client may have a client-side cache that caches recently viewed visual content. A standard MRU (most-recently-used) cache may be employed for the caching needs of one or more embodiments of the present invention. However, a cache disclosed in U.S. Patent Application Serial No. 11/141,958 (Applicant reference document 489/10NP), entitled "Efficient Data Cache", which is incorporated herein by reference, may be beneficially employed to enable more sophisticated client-side caching. In either case, a given amount of client-side memory may be devoted to the cache. Thus, navigation back to a recently viewed image may permit using image data stored in the cache, rather than requiring that this image data be re-sent from the server.
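For the standard caching option just mentioned, the following minimal Python sketch (the key and value types are assumptions; the more sophisticated cache of the 489/10NP application is not reproduced here) keeps the most recently used tiles within a fixed number of entries, evicting the least recently used entry when the limit is reached:

    from collections import OrderedDict

    class TileCache:
        """Fixed-size cache that keeps the most recently used tiles."""

        def __init__(self, max_entries):
            self.max_entries = max_entries
            self._entries = OrderedDict()   # key -> tile bytes, oldest first

        def get(self, key):
            if key not in self._entries:
                return None                  # miss: data must be re-sent by the server
            self._entries.move_to_end(key)   # mark as most recently used
            return self._entries[key]

        def put(self, key, tile_bytes):
            self._entries[key] = tile_bytes
            self._entries.move_to_end(key)
            while len(self._entries) > self.max_entries:
                self._entries.popitem(last=False)  # evict least recently used entry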
A client may have multiple displays. A given display may be physical or virtual. A given display may be driven directly by user input, or it may be driven programmatically by software within a client computer such as computer 182. The total size in pixels of all of the displays may be fixed or bounded by some limit, and this limit may define a minimum amount of client-side memory needed for visual content. This client-side memory is preferably separate from storage space allocated to the cache memory.
An embodiment involving both a physical display and a virtual display is described in the following. Preferably, a physical display within a client device is visible to the user and allows zooming and panning navigation through, and rearrangement of, a collection of digitally stored images. The user may also select one or more images from the collection and send them to a "holding pen" which may serve as a place for storing user-selected images. The holding pen may be visualized in some way on the physical display. Adding an image to the holding pen preferably causes the image to be placed on the virtual display, which may be invisible to the user. As images are added to the holding pen, the virtual display representing the holding pen gradually fills up. This virtual display may increase in size (as measured in number of pixels) up to some limit, after which its size may remain fixed at this limit. The virtual display may be too small to display all of the images in the holding pen at full resolution. In this case, the data storage space needed for the images resident in the virtual display is preferably reduced as needed to fit the images into the virtual display. Hence, an off-screen view (the virtual display) preferably gets supplemented with images as the user puts viewable images into the holding pen. This supplementing of the off-screen view may occur invisibly to the user.
A method for browsing is disclosed in U.S. Patent Application Serial No. 10/790,253 (Applicant reference document 489/2NP) , entitled "System and Method for Exact
Rendering in a Zooming User Interface", which is incorporated by reference herein. The method disclosed in that document for determining the order in which information is sent from the server to the client based on the client's view may be modified for a multiple display scenario. The 489/2NP document discloses that visual information may be broken up into tiles, with each tile covering a region in space at a given resolution. Low-resolution tiles may then occupy large physical areas, while high-resolution tiles may occupy smaller physical areas, such that the amount of information in each tile may be substantially the same.
The 489/2NP document discloses methods for ordering tiles using criteria discussed in the following. One criterion may be tile resolution and tile position on the display. Sorting of tiles could be lexicographic, such that lower-resolution tiles always precede higher-resolution tiles, with spatial position only playing a role in resolving order within a resolution. (Lexicographic sorting is referenced here in the generalized tuple sense — for example, the lexicographic sort of the set of triplets {(1,2,3), (0,3,1), (4,0,0), (0,0,1), (0,3,2)} would be (0,0,1), (0,3,1), (0,3,2), (1,2,3), (4,0,0).) Alternatively, non-lexicographic sorting criteria may be employed. For instance, a linear combination of a plurality of properties could be used to sort tiles. Such properties may include but are not limited to: resolution (which could be expressed in logarithmic units) and distance of the tile from the center of the display. Herein, the term "sort key" corresponds to the term "sorting criterion."
In this embodiment, lower-resolution tiles may be sent in preference to higher-resolution tiles, and tiles near the center of the display may be sent in preference to tiles near the periphery, but these properties can trade off against each other.
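A minimal Python sketch of these two sorting strategies follows; the tile fields and the weights in the linear combination are hypothetical illustrations rather than values from this disclosure:

    def lexicographic_key(tile):
        # Lower-resolution tiles always precede higher-resolution tiles;
        # distance from the display center only breaks ties within a level.
        return (tile["level"], tile["dist_to_center"])

    def weighted_key(tile, w_level=1.0, w_dist=0.002):
        # Linear combination: resolution (in level units) trades off
        # against distance from the center of the display.
        return w_level * tile["level"] + w_dist * tile["dist_to_center"]

    tiles = [
        {"id": "a", "level": 2, "dist_to_center": 10.0},
        {"id": "b", "level": 1, "dist_to_center": 900.0},
        {"id": "c", "level": 1, "dist_to_center": 30.0},
    ]
    print([t["id"] for t in sorted(tiles, key=lexicographic_key)])  # ['c', 'b', 'a']
    print([t["id"] for t in sorted(tiles, key=weighted_key)])       # ['c', 'a', 'b']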
Preferably, minor changes may be implemented to adapt the above scheme for a multiple display scenario. In one embodiment, display number can be added as an extra lexicographic sort key. Thus, a first display might refine completely (in accordance with the other sort keys) before any tiles are sent relevant to a second display.
In another embodiment, display number can be an additional variable for inclusion in a linear combination, allowing display number to trade off in some fashion against resolution and proximity to the center of the display. In yet another embodiment, the displays can coexist in an imaginary
"common space", and the resolution and proximity-to-center sort keys can be used as before. The "common space" is a notional space establishing an imaginary spatial relationship between multiple displays, as if they were regions of a single, larger display. Defining this imaginary spatial relationship determines all parameters needed for prioritizing tiles in the multiple displays.
FIG. 59 is a block diagram of a "common space" 740 that may include a physical display (screen) 742 and two virtual
displays 744, 746 in accordance with one or more embodiments of the present invention. The physical display 742 is preferably in the center of "common space" 740 at normal size. Virtual displays V1 744 and V2 746 are preferably off to the side, and V2 is preferably scaled down, so that its pixels are preferably half the linear size of the physical display's pixels. This means that, assuming purely lexicographic tile sort order, the contents of each resolution level in V1 744 will preferably be sent from the server to the client after the corresponding resolution for the physical display (since V1 is farther from the center of the space than any point on the physical display). Resolutions in V2 746 may be sent after all the tiles at a resolution twice as fine have been sent for both the physical display 742 and V1 744. It is noted that it isn't necessary for the "common space" 740 to correspond to any real larger display or memory address space. The "common space" 740 is merely a conceptual convenience for establishing the relationships among tile priorities across different displays. Clearly many tradeoffs are possible. These tradeoffs can have the consequence, as in the lexicographic example above, of giving refinement of the physical display 742 the highest priority, while using any excess time and bandwidth not required for bringing the physical display into focus to continue refining the virtual display(s) 744, 746. The tradeoffs may alternatively begin refining the virtual display(s) after the physical display has largely, but not completely, come into focus. After the physical display 742 has largely come into focus, the physical and virtual displays 744, 746 can share bandwidth resources to refine in concert.
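A minimal sketch of this "common space" prioritization is shown below. The display offsets, scales, and tile fields are hypothetical stand-ins for the layout of FIG. 59; scaling V2 down by half is expressed as one extra resolution level:

    import math

    # Hypothetical common-space layout: the physical display 742 at the origin,
    # V1 744 off to the side at normal scale, and V2 746 further out and scaled
    # down so its pixels are half the linear size (one extra resolution level).
    DISPLAYS = {
        "physical": {"offset": (0.0, 0.0),      "level_offset": 0},
        "V1":       {"offset": (1500.0, 0.0),   "level_offset": 0},
        "V2":       {"offset": (1500.0, 900.0), "level_offset": 1},
    }

    def common_space_key(tile):
        # Sort key: effective resolution level first, then distance of the
        # tile (translated into the common space) from the space's center.
        disp = DISPLAYS[tile["display"]]
        x = tile["x"] + disp["offset"][0]
        y = tile["y"] + disp["offset"][1]
        return (tile["level"] + disp["level_offset"], math.hypot(x, y))

    def send_order(tiles):
        return sorted(tiles, key=common_space_key)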
If the images in a collection are JPEG2000 images, then any subset of the data for a given image can itself comprise a JPEG2000 image file. During navigation of an image, the client may progressively download image data from the server, thereby supplementing the quality of the client's subset of the image and giving the client the ability to create a JPEG2000 file that is an increasingly accurate approximation of the full image.
If a client has navigated everywhere in an image, or has viewed the entire image at full resolution for long enough that all of the image data has been transmitted, then the client can recreate the entire original JPEG2000 file for that image. If a client has zoomed in closely on only a part of a large image, then the client could still create a JPEG2000 file, but it would lack detail everywhere except where the client zoomed in. This property of JPEG2000 can be extended to other multi-resolution document types as well. If the client never zoomed in beyond a given resolution, then no information would be available regarding the image content beyond that given resolution. In this case, the version of the JPEG2000 image which may be created and/or stored by the client may have a lower overall resolution than the original version of that image.
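A codec-agnostic sketch of this bookkeeping is shown below; it only tracks which resolution levels of an image have been received so far, and does not manipulate actual JPEG2000 codestreams:

    class ProgressiveImage:
        """Accumulates received resolution levels for one image; whatever
        subset is present can be written out as a (possibly lower-resolution)
        approximation of the original."""

        def __init__(self, name, total_levels):
            self.name = name
            self.total_levels = total_levels
            self.level_data = {}  # level index -> bytes received so far

        def add_data(self, level, chunk):
            self.level_data[level] = self.level_data.get(level, b"") + chunk

        def max_usable_level(self):
            # Usable detail stops at the first resolution level with no data.
            level = 0
            while level < self.total_levels and level in self.level_data:
                level += 1
            return level

        def is_complete(self):
            return self.max_usable_level() == self.total_levels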
One application of the virtual display scenario described above is to ameliorate the problem of long download times for images from a camera. In one or more embodiments, the camera or camera-enabled mobile device may operate as the server, and a PC may operate as the client.
In one or more embodiments, rather than initiating a time-consuming batch download of all images to the PC, when the camera and PC are connected, the PC can rapidly browse through the complete set of images available on the camera. During navigation, a group of images can be selected and put in the holding pen. Note that if all images on the camera are to be downloaded to the PC in their entirety, then the total time needed to accomplish the transfer remains the same as in the prior art. However, as with the closely related problem of thumbnail transmission, this method can provide a number of advantages, listed and discussed below, over the conventional serial download of images. The present invention is not limited to the features listed below.
Image download and user navigation of the full image set on the camera or other mobile device may be concurrent and cooperative in their use of bandwidth (in effect, navigation merely influences the order in which tiles are sent from server to client) .
If the PC's display is larger than the mobile device's display, then better choices can be made about which images to download, which to leave on the mobile device, and which to discard, without incurring the delay of downloading the entire set before deciding.
The experiences of browsing on the PC and on the mobile device (assuming that it also has a display), respectively, are preferably simple and experientially similar, thereby increasing usability.
If lower-resolution versions of the images in the holding pen are desired, it is preferably straightforward to suitably limit the detail of downloaded data by reducing the size of the item on the virtual display. It is noted that reducing the image size in this manner may both speed up downloading by a large factor (i.e., by a factor of 4 per resolution level discarded) and reduce the space required on the PC; a small sketch of this arithmetic appears below.
By limiting the size of the virtual display and reducing the number of images therein as desired, the amount of memory allocated to photos on the PC can be bounded. Also, different constraints can be placed on different photos, and hence space can be allocated based on recency or one or more other criteria.
In one or more embodiments, premature loss of connectivity results in a degradation in the quality of some or all of the images to be downloaded, instead of completely removing some images from the download operation. (Note that the bulk of the data volume for an image is very high-resolution detail, some of which is camera noise, and all of which is less critical for ordinary viewing than the coarser image structure. Hence, it is preferable to conduct the transmission of the high-resolution image data for all images after the lower-resolution image data for all of the images has been fully transmitted.) Hybrid prioritization of the image data is also possible, for example, favoring the complete download of a subset of the photos before proceeding to refine a second set beyond thumbnail detail.
In one or more embodiments, one or more methods disclosed herein are resilient to intermittent connectivity, since any JPEG2000 object can continue to be augmented at any time with additional information while still allowing browsing and interaction with whatever visual data has already been received.
With regard to the above references to a) reducing the size of the item on the physical display and b) bounding the amount of memory allocated to photos on the PC, it is noted that typical home users may not want to discard any of their images (after an initial culling of such images) . If such users continue to add sufficient storage to their PC, then of course it should not be necessary to discard any content. The addition of storage can in itself increase the virtual display maximum size. Features of (a) and (b) above can therefore be omitted if a sufficiently large virtual display size can be created (i.e., if there is enough available client-side storage) .
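The arithmetic behind reducing image size and bounding PC storage, referred to above, can be made concrete with a small sketch; the file size and level count used here are hypothetical examples:

    def reduced_size(full_size_bytes, levels_discarded):
        # Each resolution level discarded roughly quarters the data volume,
        # since both linear dimensions are halved.
        return full_size_bytes / (4 ** levels_discarded)

    # A hypothetical 6 MB photo reduced by two resolution levels needs
    # roughly 375 KB of transfer time and PC storage.
    print(reduced_size(6_000_000, 2))  # 375000.0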
Because it may be unclear to the user on the client-side when the "holding pen" images are finished downloading, some form of visual indication for completion is desirable. As an example, checkmarks or green dots can appear next to images as they finish downloading. When all images in the "holding pen" include green dots, the connection can be broken without loss.
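A minimal sketch of such a completion indicator follows; the holding-pen record fields are hypothetical:

    def download_complete(image):
        # An image is complete when every expected resolution level has arrived.
        return all(level in image["received_levels"]
                   for level in range(image["expected_levels"]))

    def render_status(holding_pen):
        # Print a checkmark next to finished images; the connection can be
        # broken without loss once every image in the pen shows a checkmark.
        for image in holding_pen:
            mark = "✓" if download_complete(image) else " "
            print("[%s] %s" % (mark, image["name"]))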
Operations such as requesting that the camera discard some of its images using the client computer (which may be a PC) may benefit from some additional communication from the client to the server beyond that contemplated in Applicant reference document 489/15P. In one or more other embodiments, the client side could also instruct the server side (which may be a mobile device such as a digital camera or mobile phone) to launch its own client side, and create its own view to receive content from the PC.
This is similar to "push" methods developed in the context of the World Wide Web. The PC can render the camera/mobile phone's "view" of content on the PC, thus (for example) displaying the green completion dots described above for images uploaded from the PC to the camera. Each of the reciprocal arrows of FIG. 57 can be implemented using either a "push" or "pull" arrangement. Specifically, the viewport setting, the arrangement, and other navigation settings may be controlled from either the client side 560 ("pull") or from the server side 570 ("push") . A user interacting with one device can be connected reciprocally to another device, thereby enabling both "pulling" and "pushing" to occur simultaneously. We will now enumerate the potential client-server connections shown in FIG. 57, and describe briefly how they can be used and why they are useful.
A mobile device 562 which may be a camera or camera-enabled mobile phone may serve content to a user's PC
(personal computer) 182. This connection might typically take place over a USB cable or a Bluetooth ad-hoc wireless network.
The benefits are described above.
The PC 182 may serve content back to the mobile device 562. This can be useful for the following applications, among others.
"Wallet photos" can be sent from the PC to the camera or mobile phone, even if those photos weren't taken by the mobile device. The PC may be a home appliance without a display, and the mobile device may then be used as a primary visual interface to the archived visual material. The mobile device in this context may be a digital camera, a camera-enabled cell phone, a PDA, or a mobile tablet PC with a display. A first mobile device can be connected directly to, or form an ad-hoc network with, another mobile device (the "guest") . The two mobile devices can then view and share each others' photos.
The PC could upload images (via push) to a remote server. The server may be a photo sharing service, and may therefore implement the kind of space constraints envisioned in the above processes of reducing the size of the item on the physical display and bounding the amount of memory allocated to photos on the PC. The remote server could then serve its collection to one or more additional PCs. Typically this would be a broadband connection. However, other connection types could be employed. The remote server could also serve collections to mobile device users. Typically this would be a mobile wireless wide-area network.
Mobile devices could upload their images via "push" (that is, under control of the mobile devices) to a remote server. In one or more embodiments, the upload may be automatic, allowing the mobile device to transparently extend its apparent storage space by transferring content freely to a server and deleting it locally when transfers are complete. In connection with the last two items above, it is noted that local caching on the mobile device 562 may allow the mobile device 562 to support browsing through very large thumbnail collections using only local storage, even if the local storage is limited. Zooming in on details of recently viewed images may also be possible, if the relevant information is still in the mobile device's local cache.
Zooming in on images whose details are only available on a remote server could result in a blurry and un-detailed image. If the mobile device is on a network that includes the remote server 188, however, the blurry image can become progressively more refined as more and more detailed image data is downloaded to the mobile device 562. If the mobile device is not connected to a network that can supply additional image data, the image may not be presented with any greater detail than is available in the initial thumbnail image.
Montage of Low-Resolution Images
One or more embodiments of the present invention may define precomputed steps and interactive rendering algorithms which can be used in a variety of configurations to implement downloading of selected images and/or image regions at controllable levels of resolution for various applications. Many of these applications (such as focusing on regions of interest, virtual books, etc.) may involve user interaction with a "universe" of images. In one or more embodiments, the starting point for precomputation may therefore be a list of the filenames, URLs, or other strings referencing the individual images. When a user is zoomed out far enough to view all of these images at once, it is impractical for either the client or the server to traverse all of the image files, as there may be a very large number of them. For example, in the regime where individual images occupy 2x2=4 pixels onscreen, tens of thousands or hundreds of thousands of images may be in view. Even if these images support efficient low-resolution access, merely opening and closing 100,000 files involves a large overhead and could be impractical to accomplish on an interactive timescale. It may therefore be desirable to use a cached representation of low-resolution versions of these images, referred to herein as a "montage". In one or more embodiments, a montage may be a mosaic or collage of all of the images, rendered at low resolution and packed efficiently into a rectangular area, as shown in FIG. 60. Auxiliary metadata, which can be embedded in the montage image file or stored separately, may associate rectangular regions of the montage image with particular image files.
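A minimal sketch of building such a montage and its auxiliary metadata is shown below. It uses the Pillow imaging library and a simple fixed-grid packing, both of which are assumptions for illustration only; the disclosure does not prescribe an implementation, and a real packing would be more space-efficient:

    from PIL import Image  # Pillow

    def build_montage(paths, thumb=64, columns=100):
        """Pack low-resolution versions of many images into one montage,
        recording which rectangle of the montage belongs to which file."""
        rows = (len(paths) + columns - 1) // columns
        montage = Image.new("RGB", (columns * thumb, rows * thumb))
        metadata = {}  # source path -> (left, top, right, bottom) in the montage
        for i, path in enumerate(paths):
            img = Image.open(path)
            img.thumbnail((thumb, thumb))            # reduce to montage resolution
            x, y = (i % columns) * thumb, (i // columns) * thumb
            montage.paste(img, (x, y))
            metadata[path] = (x, y, x + img.width, y + img.height)
        return montage, metadata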
In one embodiment, the montage image itself can be navigated using a zooming and panning interface. When the user zooms in far enough to exhaust the resolution available in the montage version of one or more images within the montage, the metadata for that image may refer the client to one or more individual image files, and the client may use imagery from these image files to render the images at higher resolution.
In one or more embodiments, the overall size of the montage in pixels may be chosen such that its resolution is only exhausted when zooming in to a stage where only a small number of images, which may be referred to herein as a "set" of images, are visible simultaneously. Therefore, access to more than this small number of images at high resolution is preferably not needed at any given time. During subsequent zooming and panning, image streams may be opened and closed as needed to limit the number of high resolution images that are open at any given time.
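The decision of when the montage's resolution is exhausted can be sketched as a simple comparison; the pixel sizes below are hypothetical:

    def montage_exhausted(montage_thumb_px, onscreen_px):
        # Once an image is drawn on screen larger than the number of pixels
        # stored for it in the montage, the client should open the individual
        # image file referenced by the montage metadata.
        return onscreen_px > montage_thumb_px

    print(montage_exhausted(64, 40))   # False: zoomed out, the montage suffices
    print(montage_exhausted(64, 200))  # True: zoomed in, open the image file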
The above approach to navigating many images of high resolution incurs a limitation: the montage layout is preferably designed for packing efficiency, but the user may want a different arrangement of the images onscreen. Moreover, the user may want to be able to dynamically rearrange the layout of images on the screen.
In one or more embodiments, to enable such rearrangement, we can make use of a graphics rendering technique known as "texture mapping", which may be implemented in software but is, in general, hardware-accelerated on modern personal computers. Texture mapping allows a portion of a "texture", or source image, to be drawn on the display, optionally rescaling the image, rotating it, and/or performing a three-dimensional perspective transform. Other hardware-accelerated transformations are often supported, including color correction or alteration, full or partial transparency, lighting, occlusion, and coordinate remapping. A low-resolution version of the montage can be used as a "texture", so that when the user is zoomed out, the individual images within the montage can be dynamically remapped in any way, as in FIG. 61. More than one texture map may be used, in which case each texture map may be a montage containing a subset of the images. Transitions between arrangements may or may not be animated. It is noted that rearrangement can take place while the user is zoomed in, but because the rearrangement might result in a new zoomed-in view of an image which was previously off-screen, the new image may initially be very blurry.
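The dynamic rearrangement described above amounts to pairing each image's new on-screen rectangle with its source rectangle in the montage texture. A minimal sketch follows; the layout and metadata structures are hypothetical, and the actual drawing would be done through whatever texture-mapping API the platform provides:

    def uv_rect(montage_rect, montage_size):
        # Normalized texture coordinates for one image's rectangle in the montage.
        left, top, right, bottom = montage_rect
        w, h = montage_size
        return (left / w, top / h, right / w, bottom / h)

    def rearranged_quads(layout, metadata, montage_size):
        """Pair each image's new on-screen rectangle with its montage UV
        rectangle; a texture-mapping API can then draw the rearrangement."""
        return [(layout[name], uv_rect(metadata[name], montage_size))
                for name in layout]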
In another embodiment, the texture mapping technique may be used only during dynamic rearrangement of images. When the image arrangement is static, software compositing can be used to assemble all or part of a higher-definition rearranged montage on-screen. This software compositing method is especially valuable in combination with the multiresolution rendering techniques described in U.S. Patent Application Serial No. 10/790,253 (Applicant reference document 489/2NP), identified in detail earlier in this disclosure. This method may in effect create a new "display montage" by rearranging the imagery of the original montage. Texture mapping may also be used to display high resolution images, but in this case, rather than using textures containing montages of multiple images, textures are used that contain tiles of individual images. This technique is also described in U.S. Patent Application Serial No. 10/790,253 (Applicant reference document 489/2NP).
In one or more embodiments, montage rearrangement may be used to support reorganization of the images without recourse to texture mapping.
In one or more other embodiments, texture mapping, software rendering, or any combination of the two may be used to render imagery in three dimensions instead of on a two-dimensional plane. Dynamic rearrangement in three dimensions is also possible. Three-dimensional applications may include virtual galleries or other walk-through environments as well as virtual books. Virtual books are described herein and still further in Provisional Patent Application Serial No. 60/619,053 which is incorporated herein by reference.
FIG. 62 is a block diagram of a computing system 1000 adaptable for use with one or more embodiments of the present invention. In one or more embodiments, central processing unit (CPU) 1002 may be coupled to bus 1004. In addition, bus 1004 may be coupled to random access memory (RAM) 1006, read only memory (ROM) 1008, input/output (I/O) adapter 1010, communications adapter 1022, user interface adapter 1016, and display adapter 1018.
In one or more embodiments, RAM 1006 and/or ROM 1008 may hold user data, system data, and/or programs. I/O adapter 1010 may connect storage devices, such as hard drive 1012, a CD-ROM (not shown) , or other mass storage device to computing system 1000. Communications adapter 1022 may couple computing system 1000 to a local, wide-area, or Internet network 1024. User interface adapter 1016 may couple user input devices, such as keyboard 1026 and/or pointing device 1014, to computing system 1000. Moreover, display adapter 1018 may be driven by CPU 1002 to control the display on display device 1020. CPU 1002 may be any general purpose CPU. It is noted that the methods and apparatus described thus far and/or described later in this document may be achieved utilizing any of the known technologies, such as standard digital circuitry, analog circuitry, any of the known processors that are operable to execute software and/or firmware programs, programmable digital devices or systems, programmable array logic devices, or any combination of the above. One or more embodiments of the invention may also be embodied in a software program for storage in a suitable storage medium and execution by a processing unit.
Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.
METHODS AND APPARATUS FOR EMPLOYING IMAGE NAVIGATION TECHNIQUES TO ADVANCE COMMERCE
BACKGROUND OF THE INVENTION
The present invention is directed to methods and apparatus for the application of image navigation techniques in advancing commerce, for example, by way of providing new environments for advertising and purchasing products and/or services. Digital mapping and geospatial applications are a booming industry. They have been attracting rapidly increasing investment from businesses in many different markets—from companies like Federal Express, clothing stores and fast food chains. In the past several years, mapping has also become one of the very few software applications on the web that generate significant interest (so-called "killer apps"), alongside search engines, web-based email, and matchmaking.
Although mapping should in principle be highly visual, at the moment its utility for end users lies almost entirely in generating driving directions. The map images which invariably accompany the driving directions are usually poorly rendered, convey little information, and cannot be navigated conveniently, making them little more than window dressing. Clicking on a pan or zoom control causes a long delay, during which the web browser becomes unresponsive, followed by the appearance of a new map image bearing little visual relationship to the previous image. Although in principle computers should be able to navigate digital maps more effectively than we navigate paper atlases, in practice visual navigation of maps by computer is still inferior.
DESCRIPTION OF THE INVENTION
The present invention is intended to be employed in combination with a novel technology permitting continuous and rapid visual navigation of a map (or any other image), even over a low bandwidth connection. This technology relates to new techniques for rendering maps continuously in a panning and zooming environment. It is an application of fractal geometry to line and point rendering, allowing networks of roads (1D curves) and dots marking locations (0D points) to be drawn at all scales, producing the illusion of continuous physical zooming, while still keeping the "visual density" of the map bounded. Related techniques apply to text labels and iconic content. This new approach to rendition avoids such effects as the sudden appearance or disappearance of small roads during a zoom, an adverse effect typical of digital map drawing. The details of this navigation technology may be found in U.S. Patent Application No.: 10/803,010, filed on even date herewith, entitled METHODS AND APPARATUS FOR NAVIGATING AN IMAGE, Attorney Docket No.: 489/9, the entire disclosure of which is hereby incorporated by reference. This navigation technology may be referred to herein as "Voss."
The use of the Voss technology enables a number of novel and commercially valuable business models for Internet mapping. These models take as their point of departure the proven success of businesses like Yahoo! Maps and MapQuest, both of which generate revenue from geographical advertising. Our approach, however, goes well beyond advertising, capitalizing on the ability of new technology to add substantial value for both businesses and end users. The essential idea is to allow businesses and people to rent "real estate" on the map, normally at their physical address, in which they can embed zoomable content. This content can appear in an iconic form—i.e. golden arches for McDonalds—when viewed on a large-scale map, but then smoothly and continuously resolve into any kind of web-like content when viewed closely. By integrating our mapping application with dynamic user content and business/residential address data, we can thus enable a kind of "geographical world wide web" .
In addition to large horizontal markets for both consumers and retailers, synergies with existing geographical information service (GIS) providers and related industries would obtain, such as car navigation systems, cell phones and PDAs, real estate rentals or sales, classified advertising, and many more. Possible business relationships in these areas include technology licensing, strategic partnership, and direct vertical sales.
The capabilities of the new navigation techniques of the present invention are described in detail in the aforementioned U.S. patent application. For this application, the most relevant aspects of the base technology are:
- smooth zooming and panning through a 2D world with perceptual continuity and advanced bandwidth management;
- an infinite-precision coordinate system, allowing visual content to be nested without limit; and
- the ability to nest content stored on many different servers, so that spatial containment is equivalent to a hyperlink.
With respect to the latter two elements, additional details may be found in U.S. Provisional patent application
No.: 60/474,313, filed May 30, 2003, entitled SYSTEM AND METHOD FOR INFINITE PRECISION COORDINATES IN A ZOOMING USER INTERFACE, the entire disclosure of which is hereby incorporated by reference.
A map consists of many layers of information; ultimately, the Voss map application will allow the user to turn most of these layers on and off, making the map highly customizable. Layers include:
1. roads;
2. waterways;
3. administrative boundaries;
4. aerial photography-based orthoimagery (aerial photography which has been digitally "unwarped" such that it tiles a map perfectly);
5. topography;
6. public infrastructure locations, e.g. schools, churches, public telephones, restrooms;
7. labels for each of the above;
8. cloud cover, precipitation and other weather conditions;
9. traffic conditions;
10. advertising; and
11. personal and commercial user content; etc.
The most salient layers from the typical user's point of view are 1-4 and 7. The advertising/user content layers 10-11 are of particular interest in this patent application. Many of the map layers—including 1-7—are already available, at high quality and negligible cost, from the U.S. Federal Government. Value-added layers like 8-9 (and others) can be made available at any time during development or even after deployment. Several companies offer enhanced road data with annotations indicating one-way streets, entrance and exit ramps on highways, and other features important for generating driving directions but not for visual geography. The most relevant commercial geographic information service (GIS) offering for our application is geocoding, which enables the conversion of a street address into precise latitude/longitude coordinates. It has been determined that obtaining geocoding services will not be prohibitive. In addition to map data, national Yellow Pages/White Pages data may also be valuable in implementing the present invention. This information may also be licensed. National Yellow Pages/White Pages data may be used in combination with geocoding to allow geographical user searches for businesses, or filtering (e.g. "highlight all restaurants in Manhattan"). Perhaps most importantly, directory listings combined with geocoding will greatly simplify associating business and personal users with geographic locations, allowing "real estate" to be rented or assigned via an online transaction and avoiding the need for a large sales force.
National telephone and address databases designed for telemarketing can be obtained inexpensively on CDs, but these are not necessarily of high quality—their coverage is usually only partial, and they are often out of date. A number of companies offer robust directory servers with APIs designed for software-oriented businesses like ours. Among the best is W3Data (www.w3data.com), which offers what it calls "neartime" national telephone listings using an XML-based API for $500/month minimum, beginning at $0.10/hit and going down to $0.05/hit for volumes of 250,000/month, or $0.03/hit for volumes over 1,000,000/month. The entire U.S. and Canada are covered. Reverse queries are also possible, i.e. looking up a name given a telephone number. The "neartime" data are updated at least every 90 days. Combined with 90-day caching of entries already obtained on our end, this is a very economical way to obtain high-quality national listings. "Realtime" data, updated nightly, are also available, but are more expensive ($0.20/hit). The realtime data are identical to those used by 411 operators.
Service providers similar to W3Data who can produce national business listings by category also exist, with comparable business and pricing models.
Classic advertising-based business models, as well as "media player" models involving a proprietary data format and a downloadable plug-in (such as Flash, Adobe Acrobat, and Real Player) normally face a chicken-and-egg problem. An advertising venue only becomes worth advertising in when people are already looking; a plug-in (even if free) only becomes worth downloading when there is already useful content to view; and content only becomes attractive to invest in making if there is already an installed user base ready to view it.
Although the Voss mapping application both requires downloadable client software and generates revenue through advertising, it will not suffer the disadvantages of classic advertising-based business models. Even before any substantial commercial space has been "rented", the present invention will provide a useful and visually compelling way of viewing maps and searching for addresses—that is, similar functionality to that of existing mapping applications, but with a greatly improved visual interface. Furthermore, the approach of the present invention provides limited but valuable service to non-commercial users free of charge to attract a user base. The limited service consists of hosting a small amount (5-15 MB) of server space per user, at the user's geographical location—typically a house. The client software may include simple authoring capabilities, allowing users to drag and drop images and text into their "physical address", which can then be viewed by any other authorized user with the client software. (Password protection may be available.) Because the zooming user interface approach is of obvious benefit for navigating digital photo collections—especially over limited bandwidth—the photo album sharing potential alone may attract substantial numbers of users. Additional server space may be available for a modest yearly fee. This very horizontal market is likely to be a major source of revenue.
The usual chicken-and-egg problem is thus avoided by providing valuable service from the outset, an approach that has worked well for search engines and other useful (and now profitable) web services. This contrasts sharply with online dating services, for example, which are not useful until the user base is built up. In accordance with various aspects of the invention, the sources of revenue may include:
1. Commercial "rental" of space on the map corresponding to a physical address;
2. Fees for "plus services" (defined below) geared toward commercial users;
3. Fees for "plus services" geared toward noncommercial users;
4. Professional zoomable content authoring software;
5. Licensing or partnerships with PDA, cell phone, car navigation system, etc. vendors and service providers;
6. Information.
Basic commercial rental of space on a map can be priced using a combination of the following variables:
1. Number of sites on the map;
2. Map area ("footprint") per site, in square meters;
3. Desirability of the real estate, based on aggregate viewing statistics;
4. Server space needed to host content, in MB.
Plus services for commercial users are geared toward franchises, businesses wishing to conduct e-commerce or make other more sophisticated use of the web, and businesses wishing to increase their advertising visibility:
1. Greater visible height—several levels may be offered, allowing visible but unobtrusive icons or "flags" to indicate the locations of business sites from farther away (more zoomed out) than they would otherwise be visible.
2. Focusing priority—Voss brings areas of the image into focus during navigation as data become available. By default, all visual content is treated equally, and focusing goes from the center of the screen outward. Focusing priority allows commercial content to come into focus faster than it would otherwise, increasing its prominence in the user's "peripheral vision" (a small sketch of this bias appears below). This feature will be tuned to deliver commercial value without compromising the user's navigation experience.
3. Including a conventional web hyperlink in the zoomable content—these may be clearly marked (e.g., with the conventional underlined blue text) and, on the user's click, open a web browser. We can either charge for including such a hyperlink, or, like Google, charge per click.
4. Making the geographic area rented refer to an outside commercial server, which will itself host zoomable content of any type and size—this is a fancier version of #3, and allows any kind of e-business to be conducted via the map. We can once again either charge a flat fee or charge per user connecting to the outside server (though this no longer requires a click, just a zoom-in) .
5. Billboards—as in real life, many high-visibility areas of the map will have substantial empty space. Companies can buy this space and insert content, including hyperlinks and "hyperjumps", which if clicked will make the user jump through space to a commercial site elsewhere on the map. In contrast to ordinary commercial space, billboard space need not be rented at a fixed location; its location can be generated on the fly during user navigation.
This last service raises issues of the "ecology" or visual aesthetics and usability of the map. It is desirable that the map is attractive and remains a genuine service for users, which implies limits on advertising and tasteful "zoning regulations". If the map becomes too cluttered with billboards or other content which does not mirror real world geography, then users will be turned off and the value of the map as an advertising and e-commerce venue will go down.
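Returning to the focusing-priority service (item 2 above), a minimal sketch shows how such a priority could bias the tile ordering described earlier in this disclosure. The weight and boost values are hypothetical and would be tuned so as not to compromise the navigation experience:

    def focusing_key(tile, w_dist=0.002):
        # Default ordering: coarser levels first, then center-outward,
        # minus any purchased focusing-priority boost for commercial content.
        boost = tile.get("focus_boost", 0.0)
        return tile["level"] + w_dist * tile["dist_to_center"] - boost

    tiles = [
        {"id": "ad",    "level": 1, "dist_to_center": 400.0, "focus_boost": 1.0},
        {"id": "plain", "level": 1, "dist_to_center": 100.0},
    ]
    # The boosted commercial tile comes into focus before the nearer plain tile.
    print([t["id"] for t in sorted(tiles, key=focusing_key)])  # ['ad', 'plain']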
As usage statistics are gathered, the value of many of these commercial "plus services" may be quantitatively demonstrable. Quantitative evidence of a competitive advantage should increase sales of these extras.
Plus services geared toward non-commercial users will consist of some of the same products, but scaled, priced and marketed differently. Limited authoring of zoomable Voss content will be possible from within the free client. This will include inserting text, dragging and dropping digital photos, and setting passwords. Professional authoring software may be a modified version of the client designed to allow more flexible zoomable content creation, as well as facilities for making hyperlinks and hyperjumps, and inserting custom applets.
Use of the present invention may generate a great deal of aggregate and individual information on spatial attention density, navigation routes and other patterns. These data are of commercial value.

Claims

CLAIMS:
1. A method, comprising: accessing an Internet site by a proxy server; converting image data from said Internet site into a multiresolution representation; and transmitting said image data in said multiresolution representation to a client device.
2. The method of claim 1 further comprising: navigating a web page by said client device.
3. The method of claim 2 wherein said navigating comprises :
(a) navigating said Internet site image data by said proxy server; and
(b) navigating said multiresolution image data by said client device, wherein said navigating of step (a) and said navigating of step (b) occur substantially simultaneously.
4. The method of claim 2 wherein said navigating comprises : storing multiresolution image data for at least substantially the entirety of said web page at said proxy server; and navigating said stored multiresolution image data by said client device.
5. The method of claim 2 wherein said converting step comprises : continuously converting said Internet site image data as needed by said navigating by said client device.
6. The method of claim 2 wherein said converting step comprises: pre-converting selected image data from said Internet site by said proxy server prior to said selected image data being needed for said client device navigation; and storing said pre-converted image data at said proxy server.
7. The method of claim 6 further comprising: transmitting said stored image data to said client device as needed by said client device navigation.
8. The method of claim 6 further comprising: ensuring that said stored pre-converted image data is not stale prior to transmission thereof to said client device.
9. The method of claim 8 wherein said ensuring comprises: comparing one of a timestamp and a checksum of said pre-converted image data with a respective one of a timestamp and a checksum of corresponding image data originating from said Internet site.
10. The method of claim 6 wherein said pre-converting step comprises: converting image data for at least one web page by said proxy server; and storing said converted image data for said at least one web page at said proxy server.
11. The method of claim 2 further comprising: enabling bi-directional interactive communication between said client device and said Internet site.
12. The method of claim 11 wherein said interactive communication comprises: transmitting navigation commands from said client device to said proxy server.
13. The method of claim 2 further comprising: emulating dynamic Internet graphics for said client device by said proxy server.
14. The method of claim 13 wherein said emulating step comprises emulating at least one of: dynamic HTML (Hypertext Markup Language) ; and applets .
15. The method of claim 2 wherein said navigating comprises: selecting a resolution level at which to view at least a portion of said web page, at said client device.
16. The method of claim 2 wherein said navigating comprises : selecting a region of said web page for display on said client device.
17. The method of claim 16 wherein said navigating further comprises: selecting a resolution at which to view said region of said web page.
18. The method of claim 2 wherein said navigating comprises : selecting a plurality of portions of said web page for viewing at said client device.
19. The method of claim 18 wherein said navigating further comprises: selecting a plurality of respective resolutions at which to view said plurality of web page portions.
20. The method of claim 2 wherein said navigating said multiresolution image data comprises: panning said multiresolution image data.
21. The method of claim 2 wherein said navigating said multiresolution image data comprises : zooming in on said multiresolution image data.
22. An apparatus, comprising: a client device; and a proxy server in communication with, and serving as an intermediary between, said client device and an Internet site, wherein said proxy server is operable to convert image data from said Internet site into a multiresolution visual data format.
23. The apparatus of claim 22 wherein said multiresolution visual data format is JPEG2000.
24. The apparatus of claim 22 wherein said client device is a cell phone, and wherein said cell phone comprises: at least one mechanism for navigating image data in said multiresolution visual data format.
25. The apparatus of claim 24 wherein said mechanism is a touchpad and said touchpad enables panning image data in said multiresolution visual data format.
26. The apparatus of claim 24 wherein said mechanism enables zooming in on image data in said multiresolution visual data format.
27. The apparatus of claim 26 wherein said mechanism for zooming comprises a roller that is rotatable by a user of said client device.
28. The apparatus of claim 22 wherein said client device is one of the group consisting of: a cell phone, a PDA (Personal Digital Assistant) , a notebook computer, a desktop computer, a tablet computer, and a television set.
EP06748854A 2005-03-29 2006-03-29 System and method for transferring web page data Withdrawn EP1864222A4 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US66614205P 2005-03-29 2005-03-29
US67051005P 2005-04-12 2005-04-12
US11/141,958 US7546419B2 (en) 2004-06-01 2005-06-01 Efficient data cache
US11/247,513 US7912299B2 (en) 2004-10-08 2005-10-11 System and method for efficiently encoding data
US11/252,181 US7930434B2 (en) 2003-03-05 2005-10-17 System and method for managing communication and/or storage of image data
PCT/US2006/011405 WO2006105158A2 (en) 2005-03-29 2006-03-29 System and method for transferring web page data

Publications (2)

Publication Number Publication Date
EP1864222A2 true EP1864222A2 (en) 2007-12-12
EP1864222A4 EP1864222A4 (en) 2012-03-21

Family

ID=37054059

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06748854A Withdrawn EP1864222A4 (en) 2005-03-29 2006-03-29 System and method for transferring web page data

Country Status (5)

Country Link
EP (1) EP1864222A4 (en)
KR (1) KR20070116925A (en)
AU (1) AU2006230233B2 (en)
CA (1) CA2599357A1 (en)
WO (1) WO2006105158A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152573A (en) * 2013-03-15 2013-06-12 惠州Tcl移动通信有限公司 Method and system for transmitting image frame between mobile terminal and intelligent television set

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080144076A1 (en) * 2006-10-27 2008-06-19 Martin Boliek Systems and methods for serving documents from a multifunction peripheral
US7693953B2 (en) 2007-01-12 2010-04-06 Microsoft Corporation Providing Web services for wireless communication devices
KR101481512B1 (en) 2007-10-24 2015-01-20 엘지전자 주식회사 Mobile terminal and method for controlling the same
US8433747B2 (en) * 2008-02-01 2013-04-30 Microsoft Corporation Graphics remoting architecture
KR101009159B1 (en) * 2008-08-29 2011-01-18 주식회사 엘지유플러스 Method and system for providing moving picture file
KR101632309B1 (en) * 2009-07-07 2016-06-21 삼성전자주식회사 System and method for sharing web page of representing health information
KR101777347B1 (en) 2009-11-13 2017-09-11 삼성전자주식회사 Method and apparatus for adaptive streaming based on segmentation
KR101750049B1 (en) 2009-11-13 2017-06-22 삼성전자주식회사 Method and apparatus for adaptive streaming
KR101786051B1 (en) 2009-11-13 2017-10-16 삼성전자 주식회사 Method and apparatus for data providing and receiving
KR101750048B1 (en) 2009-11-13 2017-07-03 삼성전자주식회사 Method and apparatus for providing trick play service
KR101737084B1 (en) 2009-12-07 2017-05-17 삼성전자주식회사 Method and apparatus for streaming by inserting another content to main content
US9178913B2 (en) 2010-02-22 2015-11-03 Streetmeet Inc. System, apparatus and method for generation of content for distributed heterogenous computers
KR101777348B1 (en) 2010-02-23 2017-09-11 삼성전자주식회사 Method and apparatus for transmitting and receiving of data
KR20110105710A (en) 2010-03-19 2011-09-27 삼성전자주식회사 Method and apparatus for adaptively streaming content comprising plurality of chapter
KR101837687B1 (en) * 2010-06-04 2018-03-12 삼성전자주식회사 Method and apparatus for adaptive streaming based on plurality of elements determining quality of content
US9454607B1 (en) * 2010-12-10 2016-09-27 A9.Com, Inc. Image as database
JP5720787B2 (en) 2011-07-05 2015-05-20 日本電気株式会社 Content distribution system, cache server, and content distribution method
US8983861B2 (en) * 2011-10-19 2015-03-17 Microsoft Technology Licensing, Llc Bridge pages for mobile advertising
GB2497951A (en) * 2011-12-22 2013-07-03 Nokia Corp Method and System For Managing Images And Geographic Location Data
CA2763276C (en) * 2012-01-06 2013-03-19 Microsoft Corporation Input pointer delay and zoom logic
US10349304B2 (en) 2015-09-23 2019-07-09 Cloudflare, Inc. Software defined dynamic filtering
CN106156407A (en) * 2016-06-24 2016-11-23 国家电网公司交流建设分公司 Three-dimensional Multi-resolution cad model generates method
WO2018174868A1 (en) 2017-03-22 2018-09-27 Hewlett-Packard Development Company, L.P. Compressed versions of image data based on relationships of data
US20230140051A1 (en) * 2021-10-29 2023-05-04 Shopify Inc. Methods and devices for generating a blurred image

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6449639B1 (en) * 1998-12-23 2002-09-10 Doxio, Inc. Method and system for client-less viewing of scalable documents displayed using internet imaging protocol commands
US6563517B1 (en) * 1998-10-02 2003-05-13 International Business Machines Corp. Automatic data quality adjustment to reduce response time in browsing
US20030135649A1 (en) * 2002-01-11 2003-07-17 Xerox Corporation Method for document viewing

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6826593B1 (en) * 1998-09-01 2004-11-30 Lucent Technologies Inc. Computer implemented method and apparatus for fulfilling a request for information content with a user-selectable version of a file containing that information content
US6509892B1 (en) * 1999-12-17 2003-01-21 International Business Machines Corporation Method, system and program for topographical interfacing
US20050041858A1 (en) * 2003-08-21 2005-02-24 International Business Machines Corporation Apparatus and method for distributing portions of large web pages to fit smaller constrained viewing areas

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6563517B1 (en) * 1998-10-02 2003-05-13 International Business Machines Corp. Automatic data quality adjustment to reduce response time in browsing
US6449639B1 (en) * 1998-12-23 2002-09-10 Doxio, Inc. Method and system for client-less viewing of scalable documents displayed using internet imaging protocol commands
US20030135649A1 (en) * 2002-01-11 2003-07-17 Xerox Corporation Method for document viewing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Chi-Hung Chi ET AL: "Pervasive Web Content Delivery with Efficient Data Reuse", , 26 July 2004 (2004-07-26), pages 1-16, XP55017603, Retrieved from the Internet: URL:http://web.archive.org/web/20040726072942/http://2002.iwcw.org/papers/18500120.pdf [retrieved on 2012-01-26] *
J.P. Ortiz ET AL: "Improving the Remote Browsing of JPEG 2000 Images on the Web", Visualization, Imaging, and Image Processing (VIIP 2004), 1 January 2004 (2004-01-01), pages 1-7, XP55017692, ISBN: 978-0-88-986454-2 Retrieved from the Internet: URL:http://www.hpca.ual.es/~vruiz/papers/ORTIZ04c.pdf [retrieved on 2012-01-26] *
See also references of WO2006105158A2 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152573A (en) * 2013-03-15 2013-06-12 惠州Tcl移动通信有限公司 Method and system for transmitting image frame between mobile terminal and intelligent television set
US9584831B2 (en) 2013-03-15 2017-02-28 Huizhou Tcl Mobile Communication Co., Ltd. Image frame transmission method, mobile terminal, and smart television

Also Published As

Publication number Publication date
AU2006230233B2 (en) 2011-01-20
KR20070116925A (en) 2007-12-11
EP1864222A4 (en) 2012-03-21
WO2006105158A2 (en) 2006-10-05
WO2006105158A3 (en) 2009-04-16
CA2599357A1 (en) 2006-10-05
AU2006230233A1 (en) 2006-10-05

Similar Documents

Publication Publication Date Title
AU2006230233B2 (en) System and method for transferring web page data
CA2812008C (en) Methods and apparatus for navigating an image
WO2006052390A9 (en) System and method for managing communication and/or storage of image data
EP1756521A2 (en) Method for encoding and serving geospatial or other vector data as images
US7453479B2 (en) Image data displaying system and method
US7023456B2 (en) Method of handling context during scaling with a display
US7222306B2 (en) Methods, systems, and programming for computer display of images, text, and/or digital content
US7930434B2 (en) System and method for managing communication and/or storage of image data
US7219309B2 (en) Innovations for the display of web pages
US20060235941A1 (en) System and method for transferring web page data
CN101501664A (en) System and method for transferring web page data
US20070080958A1 (en) Single Gesture Map Navigation Graphical User Interface for a Thin Client
CA2518654C (en) Method for requesting and viewing a zoomed area of detail from an image attachment on a mobile communication device
WO2004109446A2 (en) A system and method for multiple node display
CA2558833C (en) Methods and apparatus for navigating an image
JP2008535098A (en) System and method for transferring web page data
US10630755B2 (en) Selective consumption of web page data over a data-limited connection
AU2005239619A1 (en) Spatial image cache

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070822

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK YU

RIN1 Information on inventor provided before grant (corrected)

Inventor name: WALKER, JULIANC/O MICROSOFT REDMOND

Inventor name: AGUERA Y ARCAS, BLAISEC/O MICROSOFT REDMOND

DAX Request for extension of the european patent (deleted)
R17D Deferred search report published (corrected)

Effective date: 20090416

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 15/16 20060101AFI20090427BHEP

A4 Supplementary search report drawn up and despatched

Effective date: 20120216

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 12/12 20060101ALI20120210BHEP

Ipc: G06F 17/30 20060101AFI20120210BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20120919