US9197864B1 - Zoom and image capture based on features of interest - Google Patents

Zoom and image capture based on features of interest

Info

Publication number
US9197864B1
Authority
US
United States
Prior art keywords
interest
image
feature
level
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US13/617,608
Inventor
Thad Eugene Starner
Joshua Weaver
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC filed Critical Google LLC
Priority to US13/617,608 priority Critical patent/US9197864B1/en
Assigned to GOOGLE INC. reassignment GOOGLE INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: WEAVER, Joshua, STARNER, THAD EUGENE
Priority to US14/885,763 priority patent/US9466112B1/en
Application granted granted Critical
Publication of US9197864B1 publication Critical patent/US9197864B1/en
Priority to US15/262,847 priority patent/US9852506B1/en
Assigned to GOOGLE LLC reassignment GOOGLE LLC CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: GOOGLE INC.
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/50Constructional details
    • H04N23/54Mounting of pick-up tubes, electronic image sensors, deviation or focusing coils
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/63Control of cameras or camera modules by using electronic viewfinders
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/64Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/69Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2628Alteration of picture size, shape, position or orientation, e.g. zooming, rotation, rolling, perspective, translation
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/0138Head-up displays characterised by optical features comprising image capture systems, e.g. camera
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/014Head-up displays characterised by optical features comprising information/image processing systems
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • G02B2027/0178Eyeglass type
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising

Definitions

  • Wearable systems can integrate various elements, such as miniaturized computers, cameras, input devices, sensors, detectors, image displays, wireless communication devices as well as image and audio processors, into a device that can be worn by a user.
  • Such devices provide a mobile and lightweight solution to communicating, computing and interacting with one's environment.
  • the captured media may include video, audio, and still frame images.
  • the media may be captured continuously. In other cases, the media may be captured based on inputs from the wearer.
  • Disclosed herein are systems and methods that may be implemented to provide an efficient and intuitive search for, and navigation of, stored information associated with a user's real-world experience.
  • a system with at least one processor and a non-transitory computer readable medium is provided.
  • Program instructions may be stored on the non-transitory computer readable medium and may be executable by the at least one processor to perform functions.
  • the functions include receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the first field of view based on a first interest criteria.
  • the functions further include causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest, and providing the image of the first feature of interest on a display.
  • the functions also include determining a level of interest in the first feature of interest, and capturing a second image based on the level of interest.
  • in a second example, a method includes receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the first field of view based on a first interest criteria. The method further includes causing a camera to zoom to and capture an image of a portion of the field of view that includes the first feature of interest. The method also includes causing the captured image of the first feature of interest to be stored in an image-attribute database including data for a set of images. The data for a given image of the set of images specifies one or more attributes from a set of attributes.
  • a non-transitory computer readable memory with instructions stored thereon is provided.
  • the instructions may be executable by a computing device to cause the computing device to perform functions.
  • the functions include receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the first field of view based on a first interest criteria.
  • the functions further include causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest.
  • the functions also include providing the first image of the first feature of interest on a display, determining a level of interest in the first feature of interest, and capturing a second image based on the level of interest.
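  • For illustration only, the following is a minimal, self-contained sketch of the flow described in the examples above (receive image data, determine a feature of interest, zoom to and capture it, then capture a further image based on a level of interest). The class and helper names (Feature, Camera, detect_feature, capture_flow) and the 0-10 interest scale are assumptions made for the sketch, not elements defined by this disclosure:
```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    label: str                        # e.g., "theater name"
    criteria: str                     # interest criteria that matched, e.g., "text"
    box: Tuple[int, int, int, int]    # (x, y, width, height) within the frame


class Camera:
    """Toy camera: 'captures' by returning a label at the current zoom level."""

    def __init__(self, features_in_view: List[Feature]):
        self.features_in_view = features_in_view
        self.zoom_level = 0           # initial level of zoom

    def receive_image_data(self) -> List[Feature]:
        # Stand-in for a wide-angle frame: the features visible in the view.
        return self.features_in_view

    def zoom_and_capture(self, feature: Feature) -> str:
        # Increase the level of zoom and "capture" the feature.
        self.zoom_level += 1
        return f"image[{feature.label}] @ zoom level {self.zoom_level}"


def detect_feature(frame: List[Feature], criteria: str) -> Optional[Feature]:
    # Determine a feature of interest based on a single interest criteria.
    matches = [f for f in frame if f.criteria == criteria]
    return matches[0] if matches else None


def capture_flow(camera: Camera, criteria: str, level_of_interest: int,
                 interest_threshold: int = 5) -> List[str]:
    captured = []
    frame = camera.receive_image_data()               # receive image data
    feature = detect_feature(frame, criteria)         # first feature of interest
    if feature is None:
        return captured

    first_image = camera.zoom_and_capture(feature)    # zoom to and capture
    captured.append(first_image)                      # (shown on the display)

    # Store and keep zooming only if the wearer's indicated level of
    # interest (a 0-10 gradient value here) meets the threshold.
    if level_of_interest >= interest_threshold:
        remaining = [f for f in frame if f is not feature]
        second = detect_feature(remaining, criteria)
        if second is not None:
            captured.append(camera.zoom_and_capture(second))
    return captured


# Usage: a marquee scene with two text features.
scene = [Feature("theater name", "text", (10, 10, 200, 40)),
         Feature("movie titles", "text", (10, 60, 180, 30))]
print(capture_flow(Camera(scene), "text", level_of_interest=7))
```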
  • FIG. 1A is a block diagram of an exemplary method for intelligently zooming and capturing images.
  • FIG. 1B is a block diagram of an exemplary method for storing a captured image in an image-attribute database.
  • FIG. 1C is a block diagram of an alternative exemplary method for intelligently zooming and capturing images.
  • FIG. 2 illustrates an environment in which an image within a first field of view may be captured.
  • FIG. 3A illustrates a second field of view for capturing an image of a first feature of interest determined according to a first interest criteria.
  • FIG. 3B illustrates a third field of view for capturing an image of a second feature of interest determined according to the first interest criteria.
  • FIG. 3C illustrates a fourth field of view for capturing an image of a third feature of interest determined according to the first interest criteria.
  • FIG. 4A illustrates a fifth field of view for capturing an image of a fourth feature of interest determined according to a second interest criteria.
  • FIG. 4B illustrates a sixth field of view for capturing an image of a fifth feature of interest determined according to the second interest criteria.
  • FIG. 5 illustrates an example presentation of captured images according to at least one attribute.
  • FIG. 6A illustrates an example system for receiving, transmitting, and displaying data.
  • FIG. 6B illustrates an alternate view of the system illustrated in FIG. 6A.
  • FIG. 7A illustrates another example system for receiving, transmitting, and displaying data.
  • FIG. 7B illustrates yet another example system for receiving, transmitting, and displaying data.
  • FIG. 8 illustrates a simplified block diagram of an example computer network infrastructure.
  • FIG. 9 illustrates a simplified block diagram depicting example components of an example computing system.
  • a person who is wearing a head-mountable camera may be surrounded by various sorts of activity and objects, and may wish to take a closer look at a series of related objects and capture images for each of the objects.
  • example embodiments may be implemented on the HMD to intelligently determine and display potentially interesting features of the wearer's surroundings (e.g., objects or people), and to progressively display images of related features and/or sub-features. This may allow the HMD wearer to explore multi-part features and/or related features in a logical and progressive manner, without having to explicitly indicate what features they are interested in and/or indicate how to navigate through related features, or navigate between sub-features of an interesting feature.
  • the HMD may include a point-of-view camera that is configured to capture a wide-angle image or video from the point of view of the HMD wearer.
  • the HMD may then process the captured image or video to determine objects within the point of view of the HMD wearer that may be of interest to the HMD wearer.
  • the HMD may determine objects of interest to the HMD wearer based on interests of the wearer, and may determine the progression of capturing images based on how close the object is or how interested the wearer might be in the object.
  • the wearer may be walking by a movie theater on the way somewhere else.
  • the HMD may determine that the wearer is interested in movies and may therefore zoom to and capture an image of the sign of the theater.
  • the wearer may indicate whether the object in the captured image is of interest. If the wearer is not interested in the theater sign, the HMD may intelligently look for a different object that the wearer may be interested in. If the wearer is interested in the theater sign, the HMD may intelligently zoom to a title of a movie being played at the theater shown on a sign and capture an image of the title. If the wearer is not interested in the movie, the HMD may intelligently zoom to the title of a different movie being played at the theater. If the wearer is interested in the movie, the HMD may intelligently zoom to a listing of show times for the movie of interest, and capture an image of the show time.
  • This use of the HMD may also be applied to other scenarios for capturing other elements of the user's life experiences, whether the experiences are expected, unexpected, memorable, or in passing. Further discussions relating to devices and methods for capturing images representing experiences from the perspective of a user may be found below in more detail. While the discussions herein generally refer to the capturing of an image based on a determined feature of interest, other content may be captured accordingly as well.
  • the other content may include audio content, video content without an audio component, or video content with an audio component.
  • FIG. 1A is a block diagram of an exemplary method 100 for intelligently zooming and capturing images. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results.
  • Method 100 may include one or more operations, functions, or actions as illustrated by one or more of blocks 102 - 112 . Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
  • each block may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by a processor for implementing specific logical functions or steps in the process.
  • the program code may be stored on any type of computer readable medium, for example, such as a storage device including a disk or hard drive.
  • the computer readable medium may include non-transitory computer readable medium, for example, such as computer-readable media that stores data for short periods of time like register memory, processor cache and Random Access Memory (RAM).
  • the computer readable medium may also include non-transitory media, such as secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example.
  • the computer readable media may also be any other volatile or non-volatile storage systems.
  • the computer readable medium may be considered a computer readable storage medium, for example, or a tangible storage device.
  • the method 100 involves receiving image data corresponding to a field of view of an environment.
  • a user may be wearing an HMD with a camera having a zoom function, while present in the environment.
  • image data may be received from the camera and may be in the form of a video or a series of still images.
  • the field of view of the environment may be the imaging field of view of the camera at an initial level of zoom of the camera.
  • FIG. 2 illustrates an environment 200 in which an image within a first field of view 250 may be captured.
  • the environment 200 may be an area outside a movie theater.
  • there may be signs with text indicating the name of the theater 202, titles for the movies that are playing 208, and times for the movies that are playing 210.
  • the environment may also include a sign for the box office 206 , and other movie goers such as a first person 232 and a second person 234 .
  • the first field of view 250 may be the imaging field of view of the HMD camera at the initial level of zoom, as mentioned above.
  • the method 100 involves determining a first feature of interest within the field of view 250 of the environment 200 based on a first interest criteria.
  • the environment 200 may include a number of features the user may be interested in. For example, the movie theater name 202, the titles for the movies that are playing 208, and the times for the movies that are playing 210 may all be features the user may be interested in. Further, the user may be interested in the first person 232 and the second person 234. In one case, the first person 232 and the second person 234 may be recognizable friends of the user.
  • Interest criteria may include text, human faces, colors, shapes, or any qualification that may be discerned through image processing by a computing device. Interest criteria may further include certain types of objects or object categories such as signs, animals, vehicles, buildings, or landscape, etc.
  • features of interest may be determined based on at least one interest criterion.
  • the at least one interest criterion may be predetermined. For instance, the user of the HMD may indicate that the user is interested in information presented in text form, and accordingly, the features of interest may be determined based on the interest criterion of text.
  • the interest criterion of text may further be combined with additional interest criteria for determining features of interest, such that if text recognition is available, features of interest may be determined based on text representing the additional interest criteria.
  • the user of the HMD may indicate that the user is interested in certain brands of cars.
  • features of interest may be determined based on text identifying a vehicle or advertisement related to the certain brands of cars the user is interested in.
  • features of interest may be determined based on the interest criteria of human faces and colors. For instance, the user may indicate an interest in people having unusual hair color (e.g., green, blue, pink, or purple hair). In this example, the features of interest may be determined based on recognizing human faces and determining a hair color of a person whose face has been recognized.
  • interest criteria may include a degree of motion, such that features of interest may be determined based on the degree of motion detected within the field of view 250 of the environment 200. For example, the degree of motion may indicate a movement speed and/or a measure of an object's presence in the field of view 250, which can be used to determine whether the object may be a feature of interest.
  • the HMD may further be configured to “learn” which interest criteria may be particularly applicable to the user based on the user's previous behaviors.
  • the interest criteria used to determine features of interest may be determined based on real-time input from the user. For instance, the user may indicate that images of features of interest determined based on the interest criteria of text are to be captured by selecting a “text” icon provided on the HMD or by saying “text.” Other example key words representing interest criteria may also be implemented.
  • the first field of view 250 of the environment 200 includes text indicating the name of the theater 202 , titles for the movies that are playing 208 , times for the movies that are playing 210 , as well as the sign for the box office 206 .
  • the first field of view 250 also includes the face of the first person 232 .
  • features of interest based on the first interest criteria may include each feature in the first field of view 250 indicated by text as well as the first person 232 .
  • the name of the theater 202 may be determined as the first feature of interest based on the first interest criteria of text, in this case because the name of the theater 202 has the largest font size of the different text elements.
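  • A small sketch of one way such a selection could be made is shown below: among detected text regions, the region with the tallest bounding box is treated as having the largest font size. The TextRegion structure and the pixel heights are illustrative assumptions:
```python
# Select the text feature with the largest font size by picking the
# detected text region with the tallest bounding box.

from dataclasses import dataclass

@dataclass
class TextRegion:
    text: str
    height_px: int   # bounding-box height, used as a proxy for font size

def first_feature_of_interest(regions):
    return max(regions, key=lambda r: r.height_px, default=None)

regions = [TextRegion("GRAND CINEMA", 120),      # name of the theater 202
           TextRegion("Now playing ...", 60),    # movie titles 208
           TextRegion("7:00  9:30", 35)]         # show times 210
print(first_feature_of_interest(regions).text)   # -> GRAND CINEMA
```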
  • the HMD may include an eye-tracking device configured to determine where an attention or gaze of the user is directed.
  • features of interest may be determined based on eye-tracking data from the eye-tracking device, in addition to interest criteria.
  • the eye-tracking data may further provide information regarding dwell time and pupil dilation changes that may also be used to determine features of interest.
  • the HMD may also include various physiological sensors for detecting physiological responses such as galvanic skin responses, pupillary responses, electrocardiographic responses, electroencephalographic responses, body temperature, blood pressure, and hemoglobin oxygenation responses. As such, the detected physiological responses may further be used to determine features of interest.
  • the method 100 involves causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest.
  • the name of the theater 202 may be determined as the first feature of interest. Accordingly, a first image of the name of the theater 202 may be captured.
  • zooming to the portion of the field of view that includes the first feature of interest may result in the camera zooming to a first level of zoom, which may be an increased level of zoom from the initial level of zoom.
  • the first image may be a zoomed-in image of the name of the theater 202 .
  • the first level of zoom may be determined based on a number of factors. For instance, the first level of zoom may be determined such that the entire name of the theater 202 may be included in the first image.
  • determining the first level of zoom may involve determining characteristics of the first feature of interest.
  • the characteristics of the first feature of interest may indicate at least a size of the first feature of interest.
  • an extent of zoom for the first level of zoom may be determined such that at least the entire first feature of interest (such as the entire marquee of the movie theater, rather than just the text representing the name of the theater) may be included in the image captured by the camera. This may further be applicable to determining subsequent levels of zoom for subsequent features of interest.
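  • As a rough sketch, the extent of zoom could be chosen as the largest magnification at which the feature's bounding box, plus a small margin, still fits within the captured frame. The sensor dimensions, margin, and maximum zoom below are illustrative assumptions:
```python
# Choose an extent of zoom so that the entire feature (e.g., the full
# marquee rather than just its text) still fits in the captured image.

def level_of_zoom(feature_w, feature_h, sensor_w=1920, sensor_h=1080,
                  margin=1.1, max_zoom=8.0):
    # Largest magnification at which the feature, plus a margin, fits both
    # the width and the height of the captured frame.
    fit_w = sensor_w / (feature_w * margin)
    fit_h = sensor_h / (feature_h * margin)
    return max(1.0, min(fit_w, fit_h, max_zoom))

# A 480x160 px marquee in the wide-angle frame can be magnified about 3.6x
# and still be fully contained in the captured image.
print(round(level_of_zoom(480, 160), 2))   # -> 3.64
```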
  • the camera may have optical, digital, or both types of zoom capabilities.
  • the camera may be adjusted to center on the first feature of interest to provide optimal optical zoom when capturing the image.
  • the camera may not be angularly adjustable and may therefore be configured to provide digital zoom when capturing the image.
  • the camera may be configured to provide both optical zoom and digital zoom. For instance, if the camera is unable to fully center on the first feature of interest, the camera may be configured to provide optical zoom to the extent that the entire first feature of interest is within the camera field of view, and subsequently provide digital zoom to capture the first image of a portion of the field of view that includes the first feature of interest.
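  • The following sketch illustrates one way a requested zoom could be split between optical and digital zoom as described above; the optical zoom limit and the centering check are illustrative assumptions:
```python
# Split a requested zoom between optical and digital zoom: use optical zoom
# as far as possible (it preserves resolution) and make up the rest digitally.

def split_zoom(requested_zoom, optical_max=3.0, can_center_on_feature=True):
    # If the camera cannot be adjusted to center on the feature, optical
    # zoom risks cropping it out, so rely on digital zoom alone.
    if not can_center_on_feature:
        return 1.0, requested_zoom
    optical = min(requested_zoom, optical_max)
    digital = requested_zoom / optical
    return optical, digital

print(split_zoom(6.0))                               # (3.0, 2.0)
print(split_zoom(6.0, can_center_on_feature=False))  # (1.0, 6.0)
```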
  • FIG. 3A illustrates a second field of view 352 for capturing the first image of the first feature of interest, which in this case may be the name of theater 202 .
  • the second field of view 352 of the first image may be a zoomed-in view from the first field of view 250 .
  • the second field of view 352 of the environment 200 may be narrower than the first field of view 250 of the environment 200 .
  • a feature of interest may further be determined based on a degree of motion.
  • capturing the first image of the portion of the field of view that includes the first feature of interest may involve capturing video of the portion of the field of view.
  • a duration of the captured video may be a predetermined duration, such as 5 seconds.
  • the duration of the captured video may be further based on the degree of motion of the feature of interest.
  • the feature of interest may be determined based on when a degree of motion of the feature of interest exceeds a threshold degree of motion. As such, the duration of the captured video may be as long as the degree of motion of the feature of interest is above the threshold degree of motion.
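  • A small sketch of this motion-based capture logic is given below; the per-second motion samples and the threshold value are illustrative assumptions:
```python
# Record for a fixed default duration, or for as long as the feature's
# degree of motion stays above a threshold.

def capture_duration(motion_samples, motion_threshold=0.2, default_secs=5):
    duration = 0
    recording = False
    for motion in motion_samples:
        if motion > motion_threshold:
            recording = True
            duration += 1
        elif recording:
            break                     # motion fell back below the threshold
    # If the threshold was never exceeded, fall back to the fixed duration.
    return duration if recording else default_secs

print(capture_duration([0.05, 0.5, 0.4, 0.1, 0.6]))  # -> 2 (seconds above threshold)
print(capture_duration([0.05, 0.1]))                 # -> 5 (default duration)
```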
  • the method 100 involves providing the image of the first feature of interest on a display.
  • the display may be a component of the HMD worn by the user, such that the user may view what the HMD may have determined to be a feature of interest upon capturing the first image of the first feature of interest.
  • the user may be prompted to indicate whether the user is interested in the determined feature of interest when the first image of the first feature of interest is displayed. In this case, if the user indicates interest in the determined feature of interest, the user may further be prompted to indicate whether the captured first image is to be stored. In another case, the user may simply be prompted to indicate whether the captured first image is to be stored.
  • the method 100 involves determining a level of interest in the first feature of interest.
  • determining the level of interest may involve acquiring interest input data indicating a level of interest in the first feature of interest.
  • the user may provide the interest input data by providing a gradient value indicating the level of interest in the first feature of interest.
  • the user may provide a numeric value between 0 and 10, with 10 indicating extremely high interest, 0 indicating absolutely no interest, and 1-9 representing the varying levels in between.
  • the user may provide the interest input data by providing a binary value indicating the level of interest.
  • the user may provide either a first value affirming interest in the first feature of interest, or a second value denying interest in the first feature of interest.
  • the interest input data provided by the user may not explicitly indicate the user's level of interest. For instance, the user may be prompted to indicate whether the first image is to be stored, as mentioned above. By indicating the first image is to be stored, the user may be implicitly affirming interest in the first feature of interest. On the other hand, the user may be implicitly denying interest in the first feature of interest by indicating the first image is not to be stored.
  • a predetermined duration of time may be implemented such that the user is assumed to affirm or deny interest in the first feature of interest if the user does not provide interest input data within the predetermined duration of time after the image of the first feature of interest has been provided on the display. For instance, the user may be given 3 seconds to provide interest input data after either the image of the first feature of interest has been provided on the display or the user has been prompted to provide interest input data. In one case, if no interest input is provided by the user after 3 seconds, the user may be considered to have denied interest in the first feature of interest. In another case, the user may be considered to have affirmed interest in the first feature of interest if no interest input is provided after the predetermined 3 seconds.
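  • For illustration, the sketch below resolves the wearer's interest input into a single level of interest, treating a missing response as an affirmation or denial once the timeout elapses. The 0-10 gradient scale and the 3-second timeout follow the examples above; the assume_interest_on_timeout policy flag is an assumption:
```python
# Resolve interest input: a 0-10 gradient value, a binary affirm/deny, or a
# default applied when no input arrives within the timeout.

def resolve_level_of_interest(user_input, seconds_elapsed, timeout_secs=3,
                              assume_interest_on_timeout=False):
    if user_input is None:
        if seconds_elapsed < timeout_secs:
            return None                          # still waiting for input
        return 10 if assume_interest_on_timeout else 0
    if isinstance(user_input, bool):             # binary affirm / deny
        return 10 if user_input else 0
    return max(0, min(10, int(user_input)))      # clamp gradient value to 0-10

print(resolve_level_of_interest(7, seconds_elapsed=1))     # -> 7
print(resolve_level_of_interest(True, seconds_elapsed=1))  # -> 10
print(resolve_level_of_interest(None, seconds_elapsed=4))  # -> 0 (timed out)
```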
  • interest input data may be provided by the user without explicit feedback from the user.
  • the HMD may include an eye-tracking device or other physiological sensors.
  • the interest input data may be eye-tracking data or physiological response data received upon providing the image of the first feature of interest on a display. The received eye-tracking data or physiological response data may accordingly be indicative of the user's level of interest in the first feature of interest.
  • the HMD may be configured to learn which interest criteria may be particularly applicable to the user based on the user's previous behaviors.
  • the interest criteria used to determine features of interest, and the interest input data from the user in response to images of features of interest being displayed, may be stored and processed for learning the applicable interest criteria of the user.
  • the method 100 involves storing the first image and capturing a second image based on the level of interest.
  • the user may indicate whether or not the user is in fact interested in the first feature of interest captured in the first image.
  • the user may indicate a level of interest above a predetermined interest threshold. For instance, in the case the user provides interest input data by providing a gradient value between 0 and 10, the predetermined interest threshold may be set at 5, such that any value at 5 or above may be interpreted as affirming interest in the feature of interest. On the flip side, any value below 5 may be interpreted as denying interest in the feature of interest. Additional predetermined interest thresholds may also be configured to further define the level of interest of the user with more precision.
  • the first image may be stored based on the indicated level of interest.
  • the first image may be stored on a data storage medium in communication with the HMD. This may include a data storage medium physically attached to the HMD or a data storage medium within a network of computing devices in communication with the HMD.
  • the first image may be stored in an image-attribute database.
  • the image-attribute database may be configured to store a set of images associated with the user wearing the HMD, and may include data for each of the images in the set of images specifying one or more attributes from a set of attributes indicating context associated with each of the images.
  • FIG. 1B is a block diagram of an exemplary method 130 for storing a captured image in an image-attribute database.
  • the method 130 includes steps for storing the captured images in an image-attribute database. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results.
  • Method 130 may include one or more operations, functions, or actions as illustrated by one or more of blocks 132 - 136 . Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
  • the method 130 involves determining one or more attributes indicating a context of the captured image.
  • the captured image may be the first image of the first feature of interest, which in this case is the name of the theater 202 .
  • attributes associated with the captured image may include a location of the movie theater, and a date and time when the first image was captured.
  • the HMD may include a global positioning system (GPS) configured to determine a geographic location associated with the HMD.
  • the geographic location of the HMD may be acquired from the GPS when the first image is captured.
  • the HMD may further include a system clock, and the date and time may be acquired from a system clock of the HMD when the first image is captured.
  • the time may be acquired from a server in communication with the HMD. In this instance, a local time may be acquired from the server based on the time zone the wearer is in, according to the geographic location of the HMD acquired from the GPS.
  • the method 130 involves associating the one or more attributes with the captured image.
  • the acquired location, date, and time may be associated with the first image.
  • additional context may be determined and associated with the first image. For instance, if the location, date, and time associated with the first image indicates the user may be on the way home from work when the first image is captured, context indicating that the user is “commuting” may be determined and associated with the first image as an attribute.
  • the method 130 involves causing the captured image and the one or more attributes to be stored in the image-attribute database. After relevant attributes have been associated with the captured image, the captured image may then be stored in the image-attribute database along with the determined one or more attributes. Continuing with the above example, the first image may be stored in the image-attribute database along with the location, date, and time acquired when capturing the first image.
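  • The sketch below is a minimal in-memory stand-in for such an image-attribute database, following blocks 132 - 136: determine attributes giving the image's context, associate them with the image, and store both together. The record layout, coordinates, and attribute names are illustrative assumptions:
```python
# Store each captured image alongside attributes describing its context.

from datetime import datetime

image_attribute_db = []   # list of {"image": ..., "attributes": {...}} records

def store_captured_image(image_bytes, latitude, longitude, extra_attributes=None):
    attributes = {
        "location": (latitude, longitude),        # e.g., from the HMD's GPS
        "timestamp": datetime.now().isoformat(),  # e.g., from the system clock
    }
    attributes.update(extra_attributes or {})     # derived context, if any
    image_attribute_db.append({"image": image_bytes, "attributes": attributes})

store_captured_image(b"<jpeg of the theater marquee>", 37.42, -122.08,
                     {"event": "movie night", "feature": "name of the theater"})
print(image_attribute_db[0]["attributes"])
```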
  • a second image may be captured based on the indicated level of interest.
  • the first interest criteria used to determine the first feature of interest may be validated.
  • the first interest criteria of text may be used to determine a second feature of interest within the second field of view 352 of the environment 200, and the image capture instructions may indicate that a second image should be captured of the second feature of interest.
  • the titles for the movies that are playing 208 may be determined as the second feature of interest within the second field of view 352 in a similar manner to how the name of the theater 202 may have been determined as the first feature of interest within the first field of view 250 .
  • the text of the titles for the movies that are playing 208 may be smaller than the text of the name of the theater 202 .
  • the second image of the second feature of interest may be captured at a second level of zoom, which may be higher than the first level of zoom.
  • FIG. 3B illustrates a third field of view 354 for capturing an image of the second feature of interest determined according to the first interest criteria of text.
  • the second image may be an image of the third field of view 354, which includes the second feature of interest, in this case the titles for the movies that are playing 208.
  • the third field of view 354 of the environment 200 may be narrower than the second field of view 352 of the environment 200 .
  • the second image may also be provided on a display for the user to view, and the user may then indicate a level of interest in the second feature of interest. Based on the level of interest indicated by the user, a third feature of interest within the third field of view 354 of the environment 200 may be determined, and a third image of the third feature of interest may be captured according to the same first interest criteria of text.
  • FIG. 3C illustrates a fourth field of view 356 for capturing an image of a third feature of interest determined according to the first interest criteria.
  • times for the movies that are playing 210 may be determined as the third feature of interest.
  • the size of text for the times for the movies that are playing 210 may be smaller than the size of the text for the titles for the movies that are playing 208.
  • the third image may be captured at a level of zoom higher than the second level of zoom.
  • the fourth field of view 356 of the environment 200 may be narrower than the third field of view 354 of the environment 200 .
  • the level of interest indicated by the user may also be used to determine image capture instructions relating to a quality of the image captured. For instance, if the user indicates a level of interest just above the predetermined interest threshold, the image may be captured at a relatively low resolution. On the other hand, if the user indicates a high level of interest, a high resolution image may be captured. In this case, the camera may be configured to capture a high-definition video if the indicated level of interest is sufficiently high. In other words, additional thresholds for different levels of capture qualities of images and videos may be implemented.
  • a range of image capture resolutions may be configured to correspond to different levels of interest within the range of gradient values.
  • the user may also be indicating the image resolution at which the image of the feature of interest is to be captured.
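  • One way such a mapping might look is sketched below: levels of interest below the threshold capture nothing, levels just above it capture a low-resolution still, and sufficiently high levels capture a high-resolution still or high-definition video. The tier boundaries and resolutions are illustrative assumptions:
```python
# Map the indicated level of interest (0-10) to a capture quality tier.

def capture_settings(level_of_interest, interest_threshold=5):
    if level_of_interest < interest_threshold:
        return None                      # interest denied: no capture
    if level_of_interest <= 6:
        return ("still", (640, 480))     # relatively low resolution
    if level_of_interest <= 8:
        return ("still", (1920, 1080))   # high-resolution image
    return ("video", (1920, 1080))       # high-definition video

for level in (3, 5, 7, 10):
    print(level, capture_settings(level))
```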
  • the discussions in connection to FIGS. 3A-3C have related to cases in which the user indicates interest in the determined features of interest.
  • the user may in fact indicate a level of interest below the predetermined interest threshold.
  • a fourth feature of interest may be determined within the first field of view 250 of the environment 200 .
  • the camera may pan out from the second field of view 352 back to the first field of view 250 and determine the fourth feature of interest according to a second interest criteria.
  • the second interest criteria may be the same as the first interest criteria of text.
  • the fourth feature of interest may be determined as the sign for the box office 206 .
  • the second interest criteria may be that of human faces.
  • the fourth feature of interest may be determined as the first person 232 .
  • the camera may zoom-in on the first person 232 such that a fourth image may be captured of the first person 232 .
  • FIG. 4A illustrates a fifth field of view 452 for capturing the fourth image of the fourth feature of interest determined according to the second interest criteria of human faces.
  • the fifth field of view 452 has a narrower view than the first field of view 250 of the environment 200 because the fourth image may have been captured at a fourth level of zoom higher than the initial level of zoom of the camera.
  • the fourth image of the first person 232 may then be provided for the user to view.
  • the user may indicate a level of interest above the predetermined interest threshold and accordingly, a fifth feature of interest may be determined based on the second interest criteria of human faces within the fifth field of view 452 .
  • the second person 234 may be determined as the fifth feature of interest.
  • FIG. 4B illustrates a sixth field of view 454 for capturing a fifth image of the fifth feature of interest determined according to the second interest criteria.
  • the level of zoom at which the fifth image is captured may be similar to the fourth level of zoom.
  • the sixth field of view 454 may be similar in scope to the fifth field of view 452 .
  • different interest criteria may be used to determine features of interest when no additional features of interest are determined within the field of view. For instance, referring back to FIG. 3C, no additional features of interest based on the interest criteria of text may be found. In such a case, the field of view may be widened by panning out, and additional features of interest, based on either the original or different interest criteria, may be determined within the widened field of view.
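  • This fallback could be sketched as follows: each interest criteria is tried within the current field of view, and if nothing is found the field of view is widened and the search repeats. The detect_feature helper, the record layout, and the criteria list are illustrative assumptions:
```python
# Widen the field of view and retry with the original or different criteria
# when no further feature of interest is found.

def detect_feature(frame_features, criteria):
    return next((f for f in frame_features if f["criteria"] == criteria), None)

def find_next_feature(fields_of_view, criteria_list):
    """fields_of_view is ordered from the narrowest (current) to the widest."""
    for frame_features in fields_of_view:           # widen by panning out
        for criteria in criteria_list:              # original criteria first
            feature = detect_feature(frame_features, criteria)
            if feature is not None:
                return feature
    return None

narrow = []                                                  # nothing left here
wide = [{"label": "first person 232", "criteria": "face"}]   # visible when widened
print(find_next_feature([narrow, wide], ["text", "face"]))
```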
  • the method 100 involves determining a feature of interest within a field of view, capturing an image of the feature of interest, providing the image of the feature of interest to the user, and determining storage and further image capture instructions according to feedback from the user.
  • the method 100 may be configured to store images and capture additional images without feedback from the user.
  • FIG. 1C is a block diagram of an alternative exemplary method 160 for intelligently zooming and capturing images. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results.
  • Method 160 may include one or more operations, functions, or actions as illustrated by one or more of blocks 162 - 168 . Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
  • block 162 involves receiving image data corresponding to a field of view of an environment
  • block 164 involves determining a feature of interest within the field of view based on a first interest criteria
  • block 166 involves causing a camera to zoom to and capture an image of a portion of the field of view that includes the first feature of interest
  • block 168 involves causing the captured image to be stored in an image-attribute database including data for a set of images.
  • block 168 of method 160 may cause the captured image to be stored according to method 130 of FIG. 1B
  • Blocks 162 - 166 of method 160 may be implemented similarly to blocks 102 - 106 of method 100.
  • method 160 causes the captured image to be stored in the image-attribute database automatically, and not based on a level of interest provided by the user. Nevertheless, method 160 may still involve one or more of providing the image of the first feature of interest on a display, determining a level of interest in the first feature of interest, and storing the first image and capturing another image based on the level of interest, as described in connection to blocks 108 - 112 of FIG. 1A.
  • the image of the first feature of interest may be provided on the display for a predetermined duration of time before zooming to a second feature of interest, without determining a level of interest in the first feature of interest.
  • method 160 may be implemented as method 100 configured to automatically store each captured image, regardless of the level of interest indicated by the user.
  • the resolution at which the to-be-automatically-stored images are captured and stored may still be determined based on the level of interest indicated by the user.
  • the determination of whether to zoom to and capture an image of a second feature of interest based on the first interest criteria or a second interest criteria may also be based on the level of interest indicated by the user.
  • a user may wish to view images captured previously and stored in an image-attribute database.
  • the user may be provided one image at a time on a display.
  • the user may be provided a subset of images from the image-attribute database on the display.
  • each image included in the subset of images may share one or more attributes.
  • FIG. 5 illustrates an example presentation 500 of captured images according to one or more attributes.
  • the presentation 500 may include a subset of images including images 502 , 504 , 506 , 508 , and 510 .
  • the images 502 - 510 may be tiled in the form of a “mosaic.”
  • each image in the subset of images may share one or more attributes.
  • the presentation 500 may also include subset tags 512 indicating the one or more attributes associated with each image in the subset of images.
  • the user may view all previously captured images associated with an event by indicating one or more attributes by which to view the captured images. For example, the user may wish to view images associated with a time the user attended a movie showing with friends at a particular movie theater. In this case, the user may indicate a movie theater name and the names of the friends, and accordingly a subset of images associated with that movie theater and the named friends may be provided.
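  • A sketch of such an attribute-based retrieval is shown below: the images sharing every requested attribute value (here a hypothetical theater name and a friend's name) are returned as the subset to be presented. The records, attribute names, and values are illustrative assumptions:
```python
# Return the subset of stored images whose attributes match every requested value.

def _matches(attributes, key, value):
    stored = attributes.get(key)
    if isinstance(stored, list):
        return value in stored
    return stored == value

def images_matching(db, **wanted):
    return [record["image"] for record in db
            if all(_matches(record["attributes"], key, value)
                   for key, value in wanted.items())]

db = [
    {"image": "img_502", "attributes": {"place": "Grand Cinema", "people": ["Ann", "Bo"]}},
    {"image": "img_504", "attributes": {"place": "Grand Cinema", "people": ["Ann"]}},
    {"image": "img_510", "attributes": {"place": "City Park", "people": ["Bo"]}},
]
print(images_matching(db, place="Grand Cinema", people="Ann"))  # ['img_502', 'img_504']
```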
  • FIG. 6A illustrates an example system 600 for receiving, transmitting, and displaying data.
  • the system 600 is shown in the form of a wearable computing device, which may be implemented as the HMD discussed above, to intelligently zoom to and capture an image of a feature of interest.
    • While FIG. 6A illustrates a head-mounted device 602 as an example of a wearable computing device, other types of wearable computing devices could additionally or alternatively be used.
  • the head-mounted device 602 has frame elements including lens-frames 604 , 606 and a center frame support 608 , lens elements 610 , 612 , and extending side-arms 614 , 616 .
  • the center frame support 608 and the extending side-arms 614 , 616 are configured to secure the head-mounted device 602 to a user's face via a user's nose and ears, respectively.
  • Each of the frame elements 604 , 606 , and 608 and the extending side-arms 614 , 616 may be formed of a solid structure of plastic and/or metal, or may be formed of a hollow structure of similar material so as to allow wiring and component interconnects to be internally routed through the head-mounted device 602 . Other materials may be possible as well.
  • each of the lens elements 610 , 612 may be formed of any material that can suitably display a projected image or graphic.
  • Each of the lens elements 610 , 612 may also be sufficiently transparent to allow a user to see through the lens element. Combining these two features of the lens elements may facilitate an augmented reality or heads-up display where the projected image or graphic is superimposed over a real-world view as perceived by the user through the lens elements 610 , 612 .
  • the extending side-arms 614 , 616 may each be projections that extend away from the lens-frames 604 , 606 , respectively, and may be positioned behind a user's ears to secure the head-mounted device 602 to the user.
  • the extending side-arms 614 , 616 may further secure the head-mounted device 602 to the user by extending around a rear portion of the user's head.
  • the system 600 may connect to or be affixed within a head-mounted helmet structure. Other possibilities exist as well.
  • the system 600 may also include an on-board computing system 618 , a video camera 620 , a sensor 622 , and a finger-operable touch pad 624 .
  • the on-board computing system 618 is shown to be positioned on the extending side-arm 614 of the head-mounted device 602 ; however, the on-board computing system 618 may be provided on other parts of the head-mounted device 602 or may be positioned remote from the head-mounted device 602 (e.g., the on-board computing system 618 could be connected by wires or wirelessly connected to the head-mounted device 602 ).
  • the on-board computing system 618 may include a processor and memory, for example.
  • the on-board computing system 618 may be configured to receive and analyze data from the video camera 620 , the sensor 622 , and the finger-operable touch pad 624 (and possibly from other sensory devices, user-interfaces, or both) and generate images for output by the lens elements 610 and 612 .
  • the on-board computing system 618 may additionally include a speaker or a microphone for user input (not shown).
  • An example computing system is further described below in connection with FIG. 9.
  • the video camera 620 is shown positioned on the extending side-arm 614 of the head-mounted device 602 ; however, the video camera 620 may be provided on other parts of the head-mounted device 602 .
  • the video camera 620 may be configured to capture images at various resolutions or at different frame rates. Video cameras with a small form-factor, such as those used in cell phones or webcams, for example, may be incorporated into an example embodiment of the system 600 .
  • FIG. 6A illustrates one video camera 620
  • more video cameras may be used, and each may be configured to capture the same view, or to capture different views.
  • the video camera 620 may be forward facing to capture at least a portion of the real-world view perceived by the user. This forward facing image captured by the video camera 620 may then be used to generate an augmented reality where computer generated images appear to interact with the real-world view perceived by the user.
  • the sensor 622 is shown on the extending side-arm 616 of the head-mounted device 602 ; however, the sensor 622 may be positioned on other parts of the head-mounted device 602 .
  • the sensor 622 may include one or more of a gyroscope or an accelerometer, for example. Other sensing devices may be included within, or in addition to, the sensor 622 or other sensing functions may be performed by the sensor 622 .
  • the finger-operable touch pad 624 is shown on the extending side-arm 614 of the head-mounted device 602 . However, the finger-operable touch pad 624 may be positioned on other parts of the head-mounted device 602 . Also, more than one finger-operable touch pad may be present on the head-mounted device 602 .
  • the finger-operable touch pad 624 may be used by a user to input commands.
  • the finger-operable touch pad 624 may sense at least one of a position and a movement of a finger via capacitive sensing, resistance sensing, or a surface acoustic wave process, among other possibilities.
  • the finger-operable touch pad 624 may be capable of sensing finger movement in a direction parallel or planar to the pad surface, in a direction normal to the pad surface, or both, and may also be capable of sensing a level of pressure applied to the pad surface.
  • the finger-operable touch pad 624 may be formed of one or more translucent or transparent insulating layers and one or more translucent or transparent conducting layers. Edges of the finger-operable touch pad 624 may be formed to have a raised, indented, or roughened surface, so as to provide tactile feedback to a user when the user's finger reaches the edge, or other area, of the finger-operable touch pad 624 . If more than one finger-operable touch pad is present, each finger-operable touch pad may be operated independently, and may provide a different function.
  • FIG. 6B illustrates an alternate view of the system 600 illustrated in FIG. 6A .
  • the lens elements 610 , 612 may act as display elements.
  • the head-mounted device 602 may include a first projector 628 coupled to an inside surface of the extending side-arm 616 and configured to project a display 630 onto an inside surface of the lens element 612 .
  • a second projector 632 may be coupled to an inside surface of the extending side-arm 614 and configured to project a display 634 onto an inside surface of the lens element 610 .
  • the lens elements 610 , 612 may act as a combiner in a light projection system and may include a coating that reflects the light projected onto them from the projectors 628 , 632 .
  • a reflective coating may be omitted (e.g., when the projectors 628 , 632 are scanning laser devices).
  • the lens elements 610 , 612 themselves may include a transparent or semi-transparent matrix display (such as an electroluminescent display or a liquid crystal display), one or more waveguides for delivering an image to the user's eyes, or other optical elements capable of delivering an in-focus near-to-eye image to the user.
  • a corresponding display driver may be disposed within the frame elements 604 , 606 for driving such a matrix display.
  • a laser or light emitting diode (LED) source and scanning system could be used to draw a raster display directly onto the retina of one or more of the user's eyes. Other possibilities exist as well.
  • FIG. 7A illustrates an example system 700 for receiving, transmitting, and displaying data.
  • the system 700 is shown in the form of a wearable computing device 702 , which may be implemented as the HMD discussed above, to intelligently zoom to and capture an image of a feature of interest.
  • the wearable computing device 702 may include frame elements and side-arms such as those described with respect to FIGS. 6A and 6B .
  • the wearable computing device 702 may additionally include an on-board computing system 704 and a video camera 706 , such as those described with respect to FIGS. 6A and 6B .
  • the video camera 706 is shown mounted on a frame of the wearable computing device 702 ; however, the video camera 706 may be mounted at other positions as well.
  • the wearable computing device 702 may include a single display 708 which may be coupled to the device.
  • the display 708 may be formed on one of the lens elements of the wearable computing device 702 , such as a lens element described with respect to FIGS. 6A and 6B , and may be configured to overlay computer-generated graphics in the user's view of the physical world.
  • the display 708 is shown to be provided in a center of a lens of the wearable computing device 702 ; however, the display 708 may be provided in other positions.
  • the display 708 is controllable via the computing system 704 that is coupled to the display 708 via an optical waveguide 710 .
  • FIG. 7B illustrates an example system 720 for receiving, transmitting, and displaying data.
  • the system 720 is shown in the form of a wearable computing device 722 .
  • the wearable computing device 722 may include side-arms 723 , a center frame support 724 , and a bridge portion with nosepiece 725 .
  • the center frame support 724 connects the side-arms 723 .
  • the wearable computing device 722 does not include lens-frames containing lens elements.
  • the wearable computing device 722 may additionally include an on-board computing system 726 and a video camera 728 , such as those described with respect to FIGS. 6A and 6B .

Abstract

Methods and systems for intelligently zooming to and capturing a first image of a feature of interest are provided. The feature of interest may be determined based on a first interest criteria. The captured image may be provided to a user, who may indicate a level of interest in the feature of interest. The level of interest, which may be a gradient value or a binary value, may be used to determine whether to store the captured image and whether to capture another image. The level of interest may also be used to determine, if the captured image is stored, a resolution at which the captured image is to be stored, as well as whether to zoom to and capture a second image of a second feature of interest based on the first interest criteria or a second interest criteria.

Description

CROSS-REFERENCE TO RELATED APPLICATION
The present application claims priority to U.S. Provisional Application Ser. No. 61/584,100, filed on Jan. 6, 2012, the entire contents of which are incorporated by reference.
BACKGROUND
Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Wearable systems can integrate various elements, such as miniaturized computers, cameras, input devices, sensors, detectors, image displays, wireless communication devices as well as image and audio processors, into a device that can be worn by a user. Such devices provide a mobile and lightweight solution to communicating, computing and interacting with one's environment. With the advance of technologies associated with wearable systems as well as miniaturized electronic components and optical elements, it has become possible to consider wearable compact cameras for capturing the wearer's experiences of the real world.
By orienting the wearable camera towards the same general direction as the wearer's point of view, media representing a real world experience of the user can be captured. The captured media may include video, audio, and still frame images. In some cases, the media may be captured continuously. In other cases, the media may be captured based on inputs from the wearer.
SUMMARY
Disclosed herein are systems and methods that may be implemented to provide an efficient and intuitive search for, and navigation of, stored information associated with a user's real-world experience.
In one example, a system with at least one processor and a non-transitory computer readable medium is provided. Program instructions may be stored on the non-transitory computer readable medium and may be executable by the at least one processor to perform functions. The functions include receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the field of view based on a first interest criteria. The functions further include causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest, and providing the image of the first feature of interest on a display. The functions also include determining a level of interest in the first feature of interest, and capturing a second image based on the level of interest.
In a second example, a method is provided that includes receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the field of view based on a first interest criteria. The method further includes causing a camera to zoom to and capture an image of a portion of the field of view that includes the first feature of interest. The method also includes causing the captured image of the first feature of interest to be stored in an image-attribute database including data for a set of images. The data for a given image of the set of images specifies one or more attributes from a set of attributes.
In a third example, a non-transitory computer readable memory with instructions stored thereon is provided. The instructions may be executable by a computing device to cause the computing device to perform functions. The functions include receiving image data corresponding to a field of view of an environment, and determining a first feature of interest within the field of view based on a first interest criteria. The functions further include causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest. The functions also include providing the first image of the first feature of interest on a display, determining a level of interest in the first feature of interest, and capturing a second image based on the level of interest.
These as well as other aspects, advantages, and alternatives, will become apparent to those of ordinary skill in the art by reading the following detailed description, with reference where appropriate to the accompanying drawings.
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1A is a block diagram of an exemplary method for intelligently zooming and capturing images.
FIG. 1B is a block diagram of an exemplary method for storing a captured image in an image-attribute database.
FIG. 1C is a block diagram of an alternative exemplary method for intelligently zooming and capturing images.
FIG. 2 illustrates an environment in which an image within a first field of view may be captured.
FIG. 3A illustrates a second field of view for capturing an image of a first feature of interest determined according to a first interest criteria.
FIG. 3B illustrates a third field of view for capturing an image of a second feature of interest determined according to the first interest criteria.
FIG. 3C illustrates a fourth field of view for capturing an image of a third feature of interest determined according to the first interest criteria.
FIG. 4A illustrates a fifth field of view for capturing an image of a fourth feature of interest determined according to a second interest criteria.
FIG. 4B illustrates a sixth field of view for capturing an image of a fifth feature of interest determined according to the second interest criteria.
FIG. 5 illustrates an example presentation of captured images according to at least one attribute.
FIG. 6A illustrates an example system for receiving, transmitting, and displaying data.
FIG. 6B illustrates an alternate view of the system illustrated in FIG. 6A.
FIG. 7A illustrates another example system for receiving, transmitting, and displaying data.
FIG. 7B illustrates yet another example system for receiving, transmitting, and displaying data.
FIG. 8 illustrates a simplified block diagram of an example computer network infrastructure.
FIG. 9 illustrates a simplified block diagram depicting example components of an example computing system.
DETAILED DESCRIPTION
In the following detailed description, reference is made to the accompanying figures, which form a part thereof. In the figures, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, figures, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are contemplated herein.
1. Overview
In an example scenario, a person who is wearing a head-mountable device (HMD) with a camera may be surrounded by various sorts of activity and objects, and may wish to take a closer look at a series of related objects and capture images for each of the objects. Accordingly, example embodiments may be implemented on the HMD to intelligently determine and display potentially interesting features of the wearer's surroundings (e.g., objects or people), and to progressively display images of related features and/or sub-features. This may allow the HMD wearer to explore multi-part features and/or related features in a logical and progressive manner, without having to explicitly indicate which features they are interested in, how to navigate through related features, or how to navigate between sub-features of an interesting feature.
For example, the HMD may include a point-of-view camera that is configured to capture a wide-angle image or video from the point of view of the HMD wearer. The HMD may then process the captured image or video to determine objects within the point of view of the HMD wearer that may be of interest to the HMD wearer. The HMD may determine objects of interest to the HMD wearer based on interests of the wearer, and may determine the progression of capturing images based on how close the object is or how interested the wearer might be in the object. In a more specific example, the wearer may be walking by a movie theater on the way somewhere else. The HMD may determine that the wearer is interested in movies and may therefore zoom to and capture an image of the sign of the theater. The wearer may indicate whether the object in the captured image is of interest. If the wearer is not interested in the theater sign, the HMD may intelligently look for a different object that the wearer may be interested in. If the wearer is interested in the theater sign, the HMD may intelligently zoom to a title of a movie being played at the theater shown on a sign and capture an image of the title. If the wearer is not interested in the movie, the HMD may intelligently zoom to the title of a different movie being played at the theater. If the wearer is interested in the movie, the HMD may intelligently zoom to a listing of show times for the movie of interest, and capture an image of the show time.
This use of the HMD may also be applied to other scenarios for capturing other elements of the user's life experiences, whether the experiences are expected, unexpected, memorable, or in passing. Further discussions relating to devices and methods for capturing images representing experiences from the perspective of a user may be found below in more detail. While the discussions herein generally refer to the capturing of an image based on a determined feature of interest, other content may be captured accordingly as well. The other content may include audio content, video content without an audio component, or video content with an audio component.
2. First Example Method for Intelligent Zoom and Image Capture
FIG. 1A is a block diagram of an exemplary method 100 for intelligently zooming and capturing images. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results. Method 100 may include one or more operations, functions, or actions as illustrated by one or more of blocks 102-112. Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
In addition, for the method 100 and other processes and methods disclosed herein, the flowchart shows functionality and operation of one possible implementation of present embodiments. In this regard, each block may represent a module, a segment, or a portion of program code, which includes one or more instructions executable by a processor for implementing specific logical functions or steps in the process. The program code may be stored on any type of computer readable medium, for example, such as a storage device including a disk or hard drive. The computer readable medium may include a non-transitory computer readable medium, for example, such as computer-readable media that store data for short periods of time like register memory, processor cache and Random Access Memory (RAM). The computer readable medium may also include non-transitory media, such as secondary or persistent long term storage, like read only memory (ROM), optical or magnetic disks, compact-disc read only memory (CD-ROM), for example. The computer readable media may also be any other volatile or non-volatile storage systems. The computer readable medium may be considered a computer readable storage medium, for example, or a tangible storage device.
At block 102, the method 100 involves receiving image data corresponding to a field of view of an environment. As mentioned previously, a user may be wearing an HMD with a camera having a zoom function while present in the environment. In this case, image data may be received from the camera and may be in the form of a video or a series of still images. As such, the field of view of the environment may be the imaging field of view of the camera at an initial level of zoom of the camera.
FIG. 2 illustrates an environment 200 in which an image within a first field of view 250 may be captured. As shown, the environment 200 may be an area outside a movie theater. Within the environment, there may be signs with text indicating the name of the theater 202, titles for the movies that are playing 208, and times for the movies that are playing 210. The environment may also include a sign for the box office 206, and other movie goers such as a first person 232 and a second person 234. In this case, the first field of view 250 may be the imaging field of view of the HMD camera at the initial level of zoom, as mentioned above.
At block 104, the method 100 involves determining a first feature of interest within the field of view 250 of the environment 200 based on a first interest criteria. The environment 200 may include a number of features the user may be interested in. For example, the movie theater name 202, the titles for the movies that are playing 208, and the times for the movies that are playing 210 may all be features the user may be interested in. Further, the user may be interested in the first person 232 and the second person 234. In one case, the first person 232 and the second person 234 may be recognizable friends of the user.
Features of interest to the user may be determined based on a plurality of interest criteria. Interest criteria may include text, human faces, colors, shapes, or any qualification that may be discerned through image processing by a computing device. Interest criteria may further include certain types of objects or object categories such as signs, animals, vehicles, buildings, or landscape, etc. Within the plurality of interest criteria, at least one interest criterion may be used to determine features of interest. The at least one interest criterion may be predetermined. For instance, the user of the HMD may indicate that the user is interested in information presented in text form, and accordingly, the features of interest may be determined based on the interest criterion of text. In another instance, the interest criterion of text may further be combined with additional interest criteria for determining features of interest, such that if text recognition is available, features of interest may be determined based on text representing the additional interest criteria. For example, the user of the HMD may indicate that the user is interested in certain brands of cars. In this example, features of interest may be determined based on text identifying a vehicle or advertisement related to the certain brands of cars the user is interested in.
In another example, features of interest may be determined based on the interest criteria of human faces and colors. For instance, the user may indicate an interest in people having unusual hair color (e.g., green, blue, pink, or purple hair). In this example, the features of interest may be determined based on recognizing human faces and determining a hair color of a person whose face has been recognized. In addition, interest criteria may include a degree of motion, such that features of interest may be determined based on the degree of motion detected within the field of view 250 of the environment 200. For example, the degree of motion may indicate a movement speed and/or a measure of presence in the field of view 250 of an object, which can be used to determine whether the object may be a feature of interest. In some cases, the HMD may further be configured to "learn" which interest criteria may be particularly applicable to the user based on the user's previous behaviors.
In yet another example, interest criteria based upon to determine features of interest may be determined based on real-time input from the user. For instance, the user may indicate that images of features of interest determined based on the interest criteria of text are to be captured by selecting a “text” icon provided on the HMD or by saying “text.” Other example key words representing interest criteria may also be implemented.
As shown in FIG. 2, the first field of view 250 of the environment 200 includes text indicating the name of the theater 202, titles for the movies that are playing 208, times for the movies that are playing 210, as well as the sign for the box office 206. The first field of view 250 also includes the face of the first person 232. Within the first field of view 250, features of interest based on the first interest criteria may include each feature in the first field of view 250 indicated by text, as well as the first person 232. In this case, the name of the theater 202 may be determined as the first feature of interest based on the first interest criteria of text because, of the different text elements, the name of the theater 202 has the largest font size.
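To make this selection step concrete, the following Python sketch shows one way a feature of interest could be chosen from already-detected candidates according to the active interest criteria. The Feature class, the choose_feature_of_interest function, and the sample detections are hypothetical names invented for illustration; the text recognition and face detection that produce the candidate list are assumed to run elsewhere. Ranking by bounding-box height loosely mirrors the example of picking the theater name because it has the largest font size.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Feature:
    kind: str                         # e.g. "text", "face", "motion"
    bbox: Tuple[int, int, int, int]   # (x, y, width, height) in frame pixels
    label: str = ""                   # recognized text, person identity, etc.

def choose_feature_of_interest(features: List[Feature],
                               criteria: List[str]) -> Optional[Feature]:
    """Pick the most prominent detected feature matching the active criteria.

    Prominence is approximated here by bounding-box height, which loosely
    mirrors selecting the theater name because it has the largest font.
    """
    candidates = [f for f in features if f.kind in criteria]
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.bbox[3])

if __name__ == "__main__":
    detected = [
        Feature("text", (40, 10, 400, 60), "GRAND CINEMA"),
        Feature("text", (40, 90, 300, 30), "movie titles"),
        Feature("face", (500, 200, 80, 80), "first person"),
    ]
    print(choose_feature_of_interest(detected, criteria=["text"]))
```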
In some cases, the HMD may include an eye-tracking device configured to determine where an attention or gaze of the user is directed. In such cases, eye-tracking data from the eye-tracking device may be used, in addition to interest criteria, to determine features of interest. In addition to gaze direction, the eye-tracking data may further provide information regarding dwell time and pupil dilation changes that may also be used to determine features of interest.
In addition to the eye-tracking device, the HMD may also include various physiological sensors for detecting physiological responses such as galvanic skin responses, pupillary responses, electrocardiographic responses, electroencephalographic responses, body temperature, blood pressure, and hemoglobin oxygenation responses. As such, the detected physiological responses may further be used to determine features of interest.
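As a rough illustration of how gaze data might be combined with the criteria-based ranking, the sketch below re-orders candidate features by their distance from the reported gaze point, and ignores the gaze unless it has dwelled for a minimum time. The function name, the dwell threshold, and the sample boxes are assumptions; the description only says such signals may be used, not how they are weighted.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)

def rank_features_by_gaze(features: List[Tuple[str, Box]],
                          gaze_point: Tuple[float, float],
                          dwell_seconds: float,
                          min_dwell: float = 0.5) -> List[Tuple[str, Box]]:
    """Order candidate features by proximity to the user's gaze point.

    If the gaze has not dwelled long enough to be meaningful, the original
    ordering (e.g. the criteria-based ranking) is kept unchanged.
    """
    if dwell_seconds < min_dwell:
        return list(features)
    gx, gy = gaze_point

    def distance(item: Tuple[str, Box]) -> float:
        _, (x, y, w, h) = item
        cx, cy = x + w / 2.0, y + h / 2.0
        return ((cx - gx) ** 2 + (cy - gy) ** 2) ** 0.5

    return sorted(features, key=distance)

if __name__ == "__main__":
    boxes = [("theater name", (40, 10, 400, 60)),
             ("box office sign", (700, 300, 120, 80))]
    print(rank_features_by_gaze(boxes, gaze_point=(750, 330), dwell_seconds=1.2))
```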
At block 106, the method 100 involves causing a camera to zoom to and capture a first image of a portion of the field of view that includes the first feature of interest. As discussed above, the name of the theater 202 may be determined as the first feature of interest. Accordingly, a first image of the name of the theater 202 may be captured. In one example, zooming to the portion of the field of view that includes the first feature of interest may result in the camera zooming to a first level of zoom, which may be an increased level of zoom from the initial level of zoom. In other words, the first image may be a zoomed-in image of the name of the theater 202. The first level of zoom may be determined based on a number of factors. For instance, the first level of zoom may be determined such that the entire name of the theater 202 may be included in the first image.
In one example, determining the first level of zoom may involve determining characteristics of the first feature of interest. In one case, the characteristics of the first feature of interest may indicate at least a size of the first feature of interest. In this case, an extent of zoom for the first level of zoom may be determined such that at least the entire first feature of interest (such as the entire marquee of the movie theater, rather than just the text representing the name of the theater) may be included in the image captured by the camera. This may further be applicable to determining subsequent levels of zoom for subsequent features of interests.
In a further example, the camera may have optical, digital, or both types of zoom capabilities. In one case, the camera may be adjusted to center on the first feature of interest to provide optimal optical zoom when capturing the image. In another case, the camera may not be angularly adjustable and may therefore be configured to provide digital zoom when capturing the image. In yet another case, the camera may be configured to provide both optical zoom and digital zoom. For instance, if the camera is unable to fully center on the first feature of interest, the camera may be configured to provide optical zoom to the extent that the entire first feature of interest is within the camera field of view, and subsequently provide digital zoom to capture the first image of a portion of the field of view that includes the first feature of interest.
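The split between optical and digital zoom described above can be sketched as a small calculation: zoom optically as far as the lens allows, then make up the remainder digitally, while keeping the whole feature (plus a margin of context, such as the entire marquee) inside the frame. The function name, the margin factor, and the example numbers are illustrative assumptions rather than values from the description.

```python
from typing import Tuple

def zoom_plan(feature_box: Tuple[int, int, int, int],
              frame_size: Tuple[int, int],
              max_optical_zoom: float,
              margin: float = 1.1) -> Tuple[float, float]:
    """Split the zoom needed to frame a feature into optical and digital parts.

    feature_box is (x, y, width, height) of the feature in the current frame,
    frame_size is (width, height) of the full field of view, and margin keeps
    some context around the feature.
    """
    _, _, fw, fh = feature_box
    frame_w, frame_h = frame_size
    # The limiting dimension decides how far we can zoom while keeping the
    # entire feature (plus margin) inside the frame.
    required = min(frame_w / (fw * margin), frame_h / (fh * margin))
    required = max(required, 1.0)
    optical = min(required, max_optical_zoom)
    digital = required / optical
    return optical, digital

if __name__ == "__main__":
    # A theater-name sign occupying a small part of a 1920x1080 frame.
    print(zoom_plan((600, 80, 320, 90), (1920, 1080), max_optical_zoom=3.0))
```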
FIG. 3A illustrates a second field of view 352 for capturing the first image of the first feature of interest, which in this case may be the name of the theater 202. As shown, the second field of view 352 of the first image may be a zoomed-in view from the first field of view 250. As such, the second field of view 352 of the environment 200 may be narrower than the first field of view 250 of the environment 200.
As discussed previously, a feature of interest may further be determined based on a degree of motion. In one example, capturing the first image of the portion of the field of view that includes the first feature of interest may involve capturing video of the portion of the field of view. In one case, a duration of the captured video may be a predetermined duration, such as 5 seconds. In another case, the duration of the captured video may be further based on the degree of motion of the feature of interest. For instance, the feature of interest may be determined when its degree of motion exceeds a threshold degree of motion. As such, the video may be captured for as long as the degree of motion of the feature of interest remains above the threshold degree of motion.
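A minimal sketch of such motion-bounded capture, assuming a hypothetical frame source and motion estimator: the clip lasts at least a preset minimum (5 seconds, as in the example above), ends once the motion estimate drops below the threshold, and is capped at a maximum length. All parameter values are assumptions.

```python
def capture_motion_clip(frames, motion_score, threshold=0.3,
                        min_seconds=5.0, max_seconds=30.0, fps=30.0):
    """Collect frames while the feature's degree of motion stays above threshold.

    The clip lasts at least min_seconds, never more than max_seconds, and
    otherwise ends once the motion estimate drops below the threshold.
    """
    clip = []
    for i, frame in enumerate(frames):
        elapsed = i / fps
        if elapsed >= max_seconds:
            break
        if motion_score(frame) < threshold and elapsed >= min_seconds:
            break
        clip.append(frame)
    return clip

if __name__ == "__main__":
    # Synthetic stand-in for a camera: motion fades out after ~8 seconds at 30 fps.
    fake_frames = range(30 * 20)
    motion = lambda i: 1.0 if i < 30 * 8 else 0.0
    clip = capture_motion_clip(fake_frames, motion)
    print(len(clip) / 30.0, "seconds captured")
```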
At block 108, the method 100 involves providing the image of the first feature of interest on a display. In one example, the display may be a component of the HMD worn by the user, such that the user may view what the HMD may have determined to be a feature of interest upon capturing the first image of the first feature of interest. In one case, the user may be prompted to indicate whether the user is interested in the determined feature of interest when the first image of the first feature of interest is displayed. In this case, if the user indicates interest in the determined feature of interest, the user may further be prompted to indicate whether the captured first image is to be stored. In another case, the user may simply be prompted to indicate whether the captured first image is to be stored.
At block 110, the method 100 involves determining a level of interest in the first feature of interest. In one case, determining the level of interest may involve acquiring interest input data indicating a level of interest in the first feature of interest. For example, the user may provide the interest input data by providing a gradient value indicating the level of interest in the first feature of interest. For instance, the user may provide a numeric value between 0 and 10, with 10 indicating extremely high interest, 0 indicating absolutely no interest, and 1-9 representing the varying levels in between.
In another example, the user may provide the interest input data by providing a binary value indicating the level of interest. In other words, the user may provide either a first value affirming interest in the first feature of interest, or a second value denying interest in the first feature of interest. In this example, the interest input data provided by the user may not explicitly indicate the user's level of interest. For instance, the user may be prompted to indicate whether the first image is to be stored, as mentioned above. By indicating the first image is to be stored, the user may be implicitly affirming interest in the first feature of interest. On the other hand, the user may be implicitly denying interest in the first feature of interest by indicating the first image is not to be stored.
In a further example, a predetermined duration of time may be implemented such that the user is assumed to affirm or deny interest in the first feature of interest if the user does not provide interest input data within the predetermined duration of time after the image of the first feature of interest has been provided on the display. For instance, the user may be given 3 seconds to provide interest input data after either the image of the first feature of interest has been provided on the display or the user has been prompted to provide interest input data. In one case, if no interest input is provided by the user after 3 seconds, the user may be considered to have denied interest in the first feature of interest. In another case, the user may be considered to have affirmed interest in the first feature of interest if no interest input is provided after the predetermined 3 seconds.
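The feedback handling described in this and the preceding paragraphs could be folded into a single helper such as the hypothetical interpret_interest below, which accepts a 0-10 gradient value, a binary yes/no, or no input at all (in which case a timeout default decides). The threshold of 5 follows the example discussed later; everything else is an assumption for illustration.

```python
def interpret_interest(raw_input, timeout_expired, default_on_timeout=False,
                       affirm_threshold=5):
    """Map user feedback onto an (affirmed, level) decision.

    raw_input may be None (no feedback), a bool (binary affirm/deny), or an
    int from 0 to 10 (gradient level of interest).
    """
    if raw_input is None:
        if not timeout_expired:
            raise ValueError("no input yet and the feedback window is still open")
        # The device may be configured to assume either answer on timeout.
        return default_on_timeout, (affirm_threshold if default_on_timeout else 0)
    if isinstance(raw_input, bool):
        return raw_input, (10 if raw_input else 0)
    level = max(0, min(10, int(raw_input)))
    return level >= affirm_threshold, level

if __name__ == "__main__":
    print(interpret_interest(7, timeout_expired=False))      # (True, 7)
    print(interpret_interest(False, timeout_expired=False))  # (False, 0)
    print(interpret_interest(None, timeout_expired=True))    # (False, 0)
```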
In yet another example, interest input data may be provided by the user without explicit feedback from the user. As discussed above, the HMD may include an eye-tracking device or other physiological sensors. In one instance, the interest input data may be eye-tracking data or physiological response data received upon providing the image of the first feature of interest on a display. The received eye-tracking data or physiological response data may accordingly be indicative of the user's level of interest in the first feature of interest.
As previously mentioned, the HMD may be configured to learn which interest criteria may be particularly applicable to the user based on the user's previous behaviors. In one case, the interest criteria used to determine features of interest, and the interest input data provided by the user in response to displayed images of features of interest, may be stored and processed for learning the applicable interest criteria of the user.
At block 112, the method 100 involves storing the first image and capturing a second image based on the level of interest. As mentioned above, the user may indicate whether or not the user is in fact interested in the first feature of interest captured in the first image. In one example, the user may indicate a level of interest above a predetermined interest threshold. For instance, in the case where the user provides interest input data by providing a gradient value between 0 and 10, the predetermined interest threshold may be set at 5, such that any value at 5 or above may be interpreted as affirming interest in the feature of interest. On the flip side, any value of 4 or below may be interpreted as denying interest in the feature of interest. Additional predetermined interest thresholds may also be configured to further define the level of interest of the user with more precision.
Continuing with the example in which the user may indicate a level of interest above the predetermined interest threshold, the first image may be stored based on the indicated level of interest. In one case, the first image may be stored on a data storage medium in communication with the HMD. This may include a data storage medium physically attached to the HMD or a data storage medium within a network of computing devices in communication with the HMD.
Within the data storage medium, the first image may be stored in an image-attribute database. In one example, the image-attribute database may be configured to store a set of images associated with the user wearing the HMD, and may include data for each of the images in the set of images specifying one or more attributes from a set of attributes indicating context associated with each of the images.
FIG. 1B is a block diagram of an exemplary method 130 for storing a captured image in an image-attribute database. In particular, the method 130 includes steps for storing the captured images in an image-attribute database. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results. Method 130 may include one or more operations, functions, or actions as illustrated by one or more of blocks 132-136. Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
At block 132, the method 130 involves determining one or more attributes indicating a context of the captured image. Continuing with the example above, the captured image may be the first image of the first feature of interest, which in this case is the name of the theater 202. In this case, attributes associated with the captured image may include a location of the movie theater, and a date and time when the first image was captured.
In one example, the HMD may include a global positioning system (GPS) configured to determine a geographic location associated with the HMD. In this case, the geographic location of the HMD may be acquired from the GPS when the first image is captured. The HMD may further include a system clock, and the date and time may be acquired from the system clock of the HMD when the first image is captured. In another example, the time may be acquired from a server in communication with the HMD. In this instance, a local time may be acquired from the server based on the time zone the wearer is in, according to the geographic location of the HMD acquired from the GPS.
At block 134, the method 130 involves associating the one or more attributes with the captured image. Continuing with the example above, the acquired location, date, and time may be associated with the first image. In addition to associating the acquired data with the first image, additional context may be determined and associated with the first image. For instance, if the location, date, and time associated with the first image indicates the user may be on the way home from work when the first image is captured, context indicating that the user is “commuting” may be determined and associated with the first image as an attribute.
At block 136, the method 130 involves causing the captured image and the one or more attributes to be stored in the image-attribute database. After relevant attributes have been associated with the captured image, the captured image may then be stored in the image-attribute database along with the determined one or more attributes. Continuing with the above example, the first image may be stored in the image-attribute database along with the location, date, and time acquired when capturing the first image.
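One plausible shape for such an image-attribute database is a pair of tables, one row per image and one row per name/value attribute, as in this SQLite sketch. The schema, table names, and function names are illustrative assumptions, not the implementation described above.

```python
import sqlite3
import time

def open_image_attribute_db(path=":memory:"):
    """Create a minimal image-attribute store: one row per image, one per attribute."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS images (
            id INTEGER PRIMARY KEY,
            file_path TEXT NOT NULL,
            captured_at TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS attributes (
            image_id INTEGER REFERENCES images(id),
            name TEXT NOT NULL,
            value TEXT NOT NULL
        );
    """)
    return db

def store_captured_image(db, file_path, attributes):
    """Insert a captured image along with its context attributes (location, date, ...)."""
    cur = db.execute(
        "INSERT INTO images (file_path, captured_at) VALUES (?, ?)",
        (file_path, time.strftime("%Y-%m-%d %H:%M:%S")),
    )
    image_id = cur.lastrowid
    db.executemany(
        "INSERT INTO attributes (image_id, name, value) VALUES (?, ?, ?)",
        [(image_id, name, str(value)) for name, value in attributes.items()],
    )
    db.commit()
    return image_id

if __name__ == "__main__":
    db = open_image_attribute_db()
    store_captured_image(db, "theater_name.jpg",
                         {"location": "37.4220,-122.0841", "context": "commuting"})
    print(db.execute("SELECT name, value FROM attributes").fetchall())
```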
Referring back to block 112 of FIG. 1A, and continuing with the example in which the user may indicate a level of interest above the predetermined interest threshold, a second image may be captured based on the indicated level of interest. In this case, because the user indicated a level of interest above the predetermined interest threshold, the first interest criteria used to determine the first feature of interest may be validated. As such, the first interest criteria of text may be used to determine a second feature of interest within the second field of view 352 of the environment 200, and the image capture instructions may indicate that a second image should be captured of the second feature of interest.
As shown in FIG. 3A, the titles for the movies that are playing 208 may be determined as the second feature of interest within the second field of view 352 in a similar manner to how the name of the theater 202 may have been determined as the first feature of interest within the first field of view 250. In this case, the text of the titles for the movies that are playing 208 may be smaller than the text of the name of the theater 202. Accordingly, the second image of the second feature of interest may be captured at a second level of zoom, which may be higher than the first level of zoom.
FIG. 3B illustrates a third field of view 354 for capturing an image of the second feature of interest determined according to the first interest criteria of text. As shown, the second image may be an image of the third field of view 354, which includes the second feature of interest, which in this case may be the titles for the movies that are playing 208. As such, because the second level of zoom may be higher than the first level of zoom, the third field of view 354 of the environment 200 may be narrower than the second field of view 352 of the environment 200.
Similar to the case of the first image, the second image may also be provided on a display for the user to view, and the user may then indicate a level of interest in the second feature of interest. Based on the level of interest indicated by the user, a third feature of interest within the third field of view 354 of the environment 200 may be determined, and a third image of the third feature of interest may be captured according to the same first interest criteria of text.
FIG. 3C illustrates a fourth field of view 356 for capturing an image of a third feature of interest determined according to the first interest criteria. As shown, times for the movies that are playing 210 may be determined as the third feature of interest. The size of the text for the times for the movies that are playing 210 may be smaller than the size of the text for the titles of the movies that are playing 208. As such, the third image may be captured at a level of zoom higher than the second level of zoom. Accordingly, the fourth field of view 356 of the environment 200 may be narrower than the third field of view 354 of the environment 200.
In a further embodiment, the level of interest indicated by the user may also be used to determine image capture instructions relating to a quality of the captured image. For instance, if the user indicates a level of interest just above the predetermined interest threshold, the image may be captured at a relatively low resolution. On the other hand, if the user indicates a high level of interest, a high resolution image may be captured. In this case, the camera may be configured to capture a high-definition video if the indicated level of interest is sufficiently high. In other words, additional thresholds for different levels of capture qualities of images and videos may be implemented.
Accordingly, in the case a gradient value is provided to indicate a level of interest, a range of image capture resolutions may be configured to correspond to different levels of interest within the range of gradient values. In other words, by indicating a level of interest within the range of gradient values, the user may also be indicating the image resolution at which the image of the feature of interest is to be captured.
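Such a mapping from level of interest to capture quality could be as simple as a few thresholds, as in the illustrative sketch below; the specific cut-off values, resolutions, and clip length are assumptions rather than values from the description.

```python
def capture_settings_for_interest(level):
    """Pick a capture mode from a 0-10 level of interest.

    The cut-off values and resolutions below are illustrative only; the idea
    is simply that higher interest buys a higher-resolution still, and very
    high interest a short high-definition video clip.
    """
    if level < 5:
        return None                                   # below the interest threshold
    if level < 7:
        return {"mode": "still", "resolution": (1280, 720)}
    if level < 9:
        return {"mode": "still", "resolution": (3264, 2448)}
    return {"mode": "video", "resolution": (1920, 1080), "seconds": 5}

if __name__ == "__main__":
    for level in (3, 5, 8, 10):
        print(level, capture_settings_for_interest(level))
```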
Thus far, the discussions in connection to FIGS. 3A-3C have related to cases in which the user indicates interest in the determined features of interest. Referring back to block 112 of FIG. 1A, the user may in fact indicate a level of interest below the predetermined interest threshold. In such a case, a fourth feature of interest may be determined within the first field of view 250 of the environment 200. In other words, the camera may pan out from the second field of view 352 back to the first field of view 250 and determine the fourth feature of interest according to a second interest criteria. In one case, the second interest criteria may be the same as the first interest criteria of text. In such a case, the fourth feature of interest may be determined as the sign for the box office 206.
In another case, the second interest criteria may be that of human faces. As such, the fourth feature of interest may be determined as the first person 232. Accordingly, the camera may zoom in on the first person 232 such that a fourth image may be captured of the first person 232. FIG. 4A illustrates a fifth field of view 452 for capturing the fourth image of the fourth feature of interest determined according to the second interest criteria of human faces. As shown, the fifth field of view 452 has a narrower view than the first field of view 250 of the environment 200 because the fourth image may have been captured at a fourth level of zoom higher than the initial level of zoom of the camera.
The fourth image of the first person 232 may then be provided for the user to view. In this case, the user may indicate a level of interest above the predetermined interest threshold and accordingly, a fifth feature of interest may be determined based on the second interest criteria of human faces within the fifth field of view 452. As such, the second person 234 may be determined as the fifth feature of interest.
FIG. 4B illustrates a sixth field of view 454 for capturing a fifth image of the fifth feature of interest determined according to the second interest criteria. In this case, because the second person 234 and the first person 232 may be similar in size, the level of zoom at which the fifth image is captured may be similar to the fourth level of zoom. As such, the sixth field of view 454 may be similar in scope to the fifth field of view 452.
Note that while the above example describes determining features of interest based on different interest criteria when the user indicates a level of interest below the predetermined interest threshold, different interest criteria may also be used when no additional features of interest are determined within the field of view. For instance, referring back to FIG. 3C, no additional features of interest based on the interest criteria of text may be found. In such a case, the field of view may be widened by panning out, and additional features of interest, based on either the original or different interest criteria, may be determined within the widened field of view.
3. Second Example Method for Intelligent Zoom and Image Capture
As discussed above, the method 100 involves determining a feature of interest within a field of view, capturing an image of the feature of interest, providing the image of the feature of interest to the user, and determining storage and further image capture instructions according to feedback from the user. In one example, the method 100 may be configured to store images and capture additional images without feedback from the user.
FIG. 1C is a block diagram of an alternative exemplary method 160 for intelligently zooming and capturing images. While examples described herein may refer specifically to the use of an HMD, those skilled in the art will appreciate that any wearable computing device with a camera with a zoom function may be configured to execute the methods described herein to achieve the desired results. Method 160 may include one or more operations, functions, or actions as illustrated by one or more of blocks 162-168. Although the blocks are illustrated in a sequential order, these blocks may also be performed in parallel, and/or in a different order than those described herein. Also, the various blocks may be combined into fewer blocks, divided into additional blocks, and/or removed, depending upon the desired implementation.
As shown in FIG. 1C, block 162 involves receiving image data corresponding to a field of view of an environment, block 164 involves determining a first feature of interest within the field of view based on a first interest criteria, block 166 involves causing a camera to zoom to and capture an image of a portion of the field of view that includes the first feature of interest, and block 168 involves causing the captured image to be stored in an image-attribute database including data for a set of images. In one case, block 168 of method 160 may cause the captured image to be stored according to method 130 of FIG. 1B.
Blocks 162-166 of method 160 may be implemented similarly to blocks 102-106 of method 100. However, note that method 160 causes the captured image to be stored in the image-attribute database automatically, and not based on a level of interest provided by the user. Nevertheless, method 160 may still involve one or more of providing the image of the first feature of interest on a display, determining a level of interest in the first feature of interest, and storing the first image and capturing another image based on the level of interest, as described in connection with blocks 108-112 of FIG. 1A. In one case, the image of the first feature of interest may be provided on the display for a predetermined duration of time before zooming to a second feature of interest, without determining a level of interest in the first feature of interest.
As such, method 160 may be implemented as method 100 configured to automatically store each captured image, regardless of the level of interest indicated by the user. In this case, the resolution at which the to-be-automatically-stored images are captured and stored may still be determined based on the level of interest indicated by the user. Similarly, while the captured image is automatically stored, the determination of whether to zoom to and capture an image of a second feature of interest based on the first interest criteria or a second interest criteria may also be based on the level of interest indicated by the user.
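Putting the pieces together, a loop for this automatic-store variant might look like the following sketch, where get_frame, find_feature, zoom_and_capture, store_image, and get_interest are hypothetical callables standing in for the camera, detector, storage, and feedback path described above. Every capture is stored; any feedback only steers which interest criteria drive the next capture.

```python
def auto_capture_loop(get_frame, find_feature, zoom_and_capture, store_image,
                      get_interest=None, criteria=("text", "face"),
                      max_captures=5, interest_threshold=5):
    """Automatic variant: every capture is stored; feedback (if any) only
    steers which interest criteria drive the next capture."""
    active = list(criteria)
    for _ in range(max_captures):
        frame = get_frame()
        feature = find_feature(frame, active[0])
        if feature is None:
            active = active[1:] + active[:1]       # no match: try the next criteria
            continue
        image = zoom_and_capture(feature)
        store_image(image)                         # stored regardless of feedback
        if get_interest is not None:
            level = get_interest(image)
            if level is not None and level < interest_threshold:
                active = active[1:] + active[:1]   # low interest: switch criteria

if __name__ == "__main__":
    stored = []
    auto_capture_loop(
        get_frame=lambda: "frame",
        find_feature=lambda frame, kind: {"kind": kind},
        zoom_and_capture=lambda feature: "image of " + feature["kind"],
        store_image=stored.append,
        get_interest=lambda image: 3,              # lukewarm feedback every time
        max_captures=3,
    )
    print(stored)
```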
4. Example Display of Captured Images
At some point, a user may wish to view images captured previously and stored in an image-attribute database. In one case, the user may be provided one image at a time on a display. In another case, the user may be provided a subset of images from the image-attribute database on the display. In this case, each image included in the subset of images may share one or more attributes.
FIG. 5 illustrates an example presentation 500 of captured images according to one or more attributes. As shown, the presentation 500 may include a subset of images including images 502, 504, 506, 508, and 510. In one case, the images 502-510 may be tiled in the form of a "mosaic." As mentioned above, each image in the subset of images may share one or more attributes. Accordingly, the presentation 500 may also include subset tags 512 indicating the one or more attributes associated with each image in the subset of images.
As such, the user may view all previously captured images associated with an event by indicating one or more attributes by which to view the captured images. For example, the user may wish to view images associated with a time the user attended a movie showing with friends at a particular movie theater. In this case, the user may indicate a movie theater name and the names of the friends, and accordingly, a subset of images associated with the movie theater and the named friends may be provided.
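Retrieving such a subset amounts to filtering stored records by shared attributes. The sketch below simplifies each image's attributes to a flat name/value dictionary, which is an assumption about the database's structure; the sample file names and attribute keys are likewise illustrative.

```python
def images_matching(records, required):
    """Return stored images whose attributes contain every requested name/value
    pair, e.g. a theater name plus a friend's name."""
    return [
        image for image, attrs in records
        if all(attrs.get(name) == value for name, value in required.items())
    ]

if __name__ == "__main__":
    records = [
        ("marquee.jpg",   {"theater": "Grand Cinema", "friend": "Alex"}),
        ("showtimes.jpg", {"theater": "Grand Cinema", "friend": "Alex"}),
        ("street.jpg",    {"theater": "Grand Cinema", "friend": "Sam"}),
    ]
    print(images_matching(records, {"theater": "Grand Cinema", "friend": "Alex"}))
```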
5. Example System and Device Architecture
FIG. 6A illustrates an example system 600 for receiving, transmitting, and displaying data. The system 600 is shown in the form of a wearable computing device, which may be implemented as the HMD discussed above, to intelligently zoom to and capture an image of a feature of interest. While FIG. 6A illustrates a head-mounted device 602 as an example of a wearable computing device, other types of wearable computing devices could additionally or alternatively be used. As illustrated in FIG. 6A, the head-mounted device 602 has frame elements including lens-frames 604, 606 and a center frame support 608, lens elements 610, 612, and extending side-arms 614, 616. The center frame support 608 and the extending side-arms 614, 616 are configured to secure the head-mounted device 602 to a user's face via a user's nose and ears, respectively.
Each of the frame elements 604, 606, and 608 and the extending side-arms 614, 616 may be formed of a solid structure of plastic and/or metal, or may be formed of a hollow structure of similar material so as to allow wiring and component interconnects to be internally routed through the head-mounted device 602. Other materials may be possible as well.
One or more of each of the lens elements 610, 612 may be formed of any material that can suitably display a projected image or graphic. Each of the lens elements 610, 612 may also be sufficiently transparent to allow a user to see through the lens element. Combining these two features of the lens elements may facilitate an augmented reality or heads-up display where the projected image or graphic is superimposed over a real-world view as perceived by the user through the lens elements 610, 612.
The extending side-arms 614, 616 may each be projections that extend away from the lens-frames 604, 606, respectively, and may be positioned behind a user's ears to secure the head-mounted device 602 to the user. The extending side-arms 614, 616 may further secure the head-mounted device 602 to the user by extending around a rear portion of the user's head. Additionally or alternatively, for example, the system 600 may connect to or be affixed within a head-mounted helmet structure. Other possibilities exist as well.
The system 600 may also include an on-board computing system 618, a video camera 620, a sensor 622, and a finger-operable touch pad 624. The on-board computing system 618 is shown to be positioned on the extending side-arm 614 of the head-mounted device 602; however, the on-board computing system 618 may be provided on other parts of the head-mounted device 602 or may be positioned remote from the head-mounted device 602 (e.g., the on-board computing system 618 could be connected by wires or wirelessly connected to the head-mounted device 602). The on-board computing system 618 may include a processor and memory, for example. The on-board computing system 618 may be configured to receive and analyze data from the video camera 620, the sensor 622, and the finger-operable touch pad 624 (and possibly from other sensory devices, user-interfaces, or both) and generate images for output by the lens elements 610 and 612. The on-board computing system 618 may additionally include a speaker or a microphone for user input (not shown). An example computing system is further described below in connection with FIG. 9.
The video camera 620 is shown positioned on the extending side-arm 614 of the head-mounted device 602; however, the video camera 620 may be provided on other parts of the head-mounted device 602. The video camera 620 may be configured to capture images at various resolutions or at different frame rates. Video cameras with a small form-factor, such as those used in cell phones or webcams, for example, may be incorporated into an example embodiment of the system 600.
Further, although FIG. 6A illustrates one video camera 620, more video cameras may be used, and each may be configured to capture the same view, or to capture different views. For example, the video camera 620 may be forward facing to capture at least a portion of the real-world view perceived by the user. This forward facing image captured by the video camera 620 may then be used to generate an augmented reality where computer generated images appear to interact with the real-world view perceived by the user.
The sensor 622 is shown on the extending side-arm 616 of the head-mounted device 602; however, the sensor 622 may be positioned on other parts of the head-mounted device 602. The sensor 622 may include one or more of a gyroscope or an accelerometer, for example. Other sensing devices may be included within, or in addition to, the sensor 622 or other sensing functions may be performed by the sensor 622.
The finger-operable touch pad 624 is shown on the extending side-arm 614 of the head-mounted device 602. However, the finger-operable touch pad 624 may be positioned on other parts of the head-mounted device 602. Also, more than one finger-operable touch pad may be present on the head-mounted device 602. The finger-operable touch pad 624 may be used by a user to input commands. The finger-operable touch pad 624 may sense at least one of a position and a movement of a finger via capacitive sensing, resistance sensing, or a surface acoustic wave process, among other possibilities. The finger-operable touch pad 624 may be capable of sensing finger movement in a direction parallel or planar to the pad surface, in a direction normal to the pad surface, or both, and may also be capable of sensing a level of pressure applied to the pad surface. The finger-operable touch pad 624 may be formed of one or more translucent or transparent insulating layers and one or more translucent or transparent conducting layers. Edges of the finger-operable touch pad 624 may be formed to have a raised, indented, or roughened surface, so as to provide tactile feedback to a user when the user's finger reaches the edge, or other area, of the finger-operable touch pad 624. If more than one finger-operable touch pad is present, each finger-operable touch pad may be operated independently, and may provide a different function.
FIG. 6B illustrates an alternate view of the system 600 illustrated in FIG. 6A. As shown in FIG. 6B, the lens elements 610, 612 may act as display elements. The head-mounted device 602 may include a first projector 628 coupled to an inside surface of the extending side-arm 616 and configured to project a display 630 onto an inside surface of the lens element 612. Additionally or alternatively, a second projector 632 may be coupled to an inside surface of the extending side-arm 614 and configured to project a display 634 onto an inside surface of the lens element 610.
The lens elements 610, 612 may act as a combiner in a light projection system and may include a coating that reflects the light projected onto them from the projectors 628, 632. In some embodiments, a reflective coating may be omitted (e.g., when the projectors 628, 632 are scanning laser devices).
In alternative embodiments, other types of display elements may also be used. For example, the lens elements 610, 612 themselves may include a transparent or semi-transparent matrix display (such as an electroluminescent display or a liquid crystal display), one or more waveguides for delivering an image to the user's eyes, or other optical elements capable of delivering an in-focus near-to-eye image to the user. A corresponding display driver may be disposed within the frame elements 604, 606 for driving such a matrix display. Alternatively or additionally, a laser or light-emitting diode (LED) source and scanning system could be used to draw a raster display directly onto the retina of one or more of the user's eyes. Other possibilities exist as well.
FIG. 7A illustrates an example system 700 for receiving, transmitting, and displaying data. The system 700 is shown in the form of a wearable computing device 702, which may be implemented as the HMD discussed above, to intelligently zoom to and capture an image of a feature of interest. The wearable computing device 702 may include frame elements and side-arms such as those described with respect to FIGS. 6A and 6B. The wearable computing device 702 may additionally include an on-board computing system 704 and a video camera 706, such as those described with respect to FIGS. 6A and 6B. The video camera 706 is shown mounted on a frame of the wearable computing device 702; however, the video camera 706 may be mounted at other positions as well.
As shown in FIG. 7A, the wearable computing device 702 may include a single display 708 which may be coupled to the device. The display 708 may be formed on one of the lens elements of the wearable computing device 702, such as a lens element described with respect to FIGS. 6A and 6B, and may be configured to overlay computer-generated graphics in the user's view of the physical world. The display 708 is shown to be provided in a center of a lens of the wearable computing device 702; however, the display 708 may be provided in other positions. The display 708 is controllable via the computing system 704 that is coupled to the display 708 via an optical waveguide 710.
FIG. 7B illustrates an example system 720 for receiving, transmitting, and displaying data. The system 720 is shown in the form of a wearable computing device 722. The wearable computing device 722 may include side-arms 723, a center frame support 724, and a bridge portion with nosepiece 725. In the example shown in FIG. 7B, the center frame support 724 connects the side-arms 723. The wearable computing device 722 does not include lens-frames containing lens elements. The wearable computing device 722 may additionally include an on-board computing system 726 and a video camera 728, such as those described with respect to FIGS. 6A and 6B.
The wearable computing device 722 may include a single lens element 730 that may be coupled to one of the side-arms 723 or the center frame support 724. The lens element 730 may include a display such as the display described with reference to FIGS. 6A and 6B, and may be configured to overlay computer-generated graphics upon the user's view of the physical world. In one example, the single lens element 730 may be coupled to a side of the extending side-arm 723. The single lens element 730 may be positioned in front of or proximate to a user's eye when the wearable computing device 722 is worn by a user. For example, the single lens element 730 may be positioned below the center frame support 724, as shown in FIG. 7B.
FIG. 8 shows a simplified block diagram of an example computer network infrastructure. In system 800, a device 810 communicates with a remote device 830 using a communication link 820 (e.g., a wired or wireless connection). The device 810 may be any type of device that can receive input data. For example, the device 810 may be a heads-up display system, such as the head-mounted device 602 or the wearable computing devices 702 and 722 described with reference to FIGS. 6A-7B.
Thus, the device 810 may include a display system 812 including a processor 814 and a display 816. The display 816 may be, for example, an optical see-through display, an optical see-around display, or a video see-through display. The processor 814 may receive data from the remote device 830, and configure the data for display on the display 816. The processor 814 may be any type of processor, such as a micro-processor or a digital signal processor, for example.
The device 810 may further include on-board data storage, such as memory 818 coupled to the processor 814. The memory 818 may store software that can be accessed and executed by the processor 814, for example.
The remote device 830 may be any type of computing device or transmitter, such as a laptop computer, a mobile telephone, or a tablet computing device, that is configured to transmit data to the device 810. The remote device 830 and the device 810 may contain hardware to enable the communication link 820, such as processors, transmitters, receivers, antennas, etc.
In FIG. 8, the communication link 820 is illustrated as a wireless connection; however, wired connections may also be used. For example, the communication link 820 may be a wired serial bus such as a universal serial bus or a parallel bus, among other connections. The communication link 820 may also be a wireless connection using, e.g., Bluetooth® radio technology, communication protocols described in IEEE 802.11 (including any IEEE 802.11 revisions), cellular technology (such as GSM, CDMA, UMTS, EV-DO, WiMAX, or LTE), or Zigbee® technology, among other possibilities. Such a wired or wireless connection may be a proprietary connection as well. The remote device 830 may be accessible via the Internet and may include a computing cluster associated with a particular web service (e.g., social-networking, photo sharing, address book, etc.).
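As a concrete, hypothetical example of moving captured data across such a link, the Python sketch below pushes an image from the device to a remote device over a plain TCP connection, prefixing the payload with its length. The host, port, and framing are assumptions; the description does not prescribe any particular protocol for the communication link 820.

# Hypothetical transfer of a captured image over communication link 820.
import socket
import struct

def send_image(host: str, port: int, jpeg_bytes: bytes) -> None:
    with socket.create_connection((host, port)) as link:
        link.sendall(struct.pack("!I", len(jpeg_bytes)))  # 4-byte length prefix
        link.sendall(jpeg_bytes)                          # image payload

# Usage (hypothetical endpoint):
# send_image("remote-device.local", 9000, open("feature.jpg", "rb").read())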
As described above in connection with FIGS. 6A-7B, an example wearable computing device may include, or may otherwise be communicatively coupled to, a computing system, such as computing system 618 or computing system 704. FIG. 9 shows a simplified block diagram depicting example components of an example computing system 900. One or both of the device 810 and the remote device 830 may take the form of computing system 900.
Computing system 900 may include at least one processor 902 and system memory 904. In an example embodiment, computing system 900 may include a system bus 906 that communicatively connects processor 902 and system memory 904, as well as other components of computing system 900. Depending on the desired configuration, processor 902 can be any type of processor including, but not limited to, a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. Furthermore, system memory 904 can be of any type of memory now known or later developed including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.) or any combination thereof.
An example computing system 900 may include various other components as well. For example, computing system 900 includes an A/V processing unit 908 for controlling graphical display 910 and speaker 912 (via A/V port 914), one or more communication interfaces 916 for connecting to other computing devices 918, and a power supply 920. Graphical display 910 may be arranged to provide a visual depiction of various input regions provided by user-interface module 922. For example, user-interface module 922 may be configured to provide a user-interface, and graphical display 910 may be configured to provide a visual depiction of the user-interface. User-interface module 922 may be further configured to receive data from and transmit data to (or be otherwise compatible with) one or more user-interface devices 928.
Furthermore, computing system 900 may also include one or more data storage devices 924, which can be removable storage devices, non-removable storage devices, or a combination thereof. Examples of removable storage devices and non-removable storage devices include magnetic disk devices such as flexible disk drives and hard-disk drives (HDD), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSD), and/or any other storage device now known or later developed. Computer storage media can include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. For example, computer storage media may take the form of RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium now known or later developed that can be used to store the desired information and which can be accessed by computing system 900.
According to an example embodiment, computing system 900 may include program instructions 926 that are stored in system memory 904 (and/or possibly in another data-storage medium) and executable by processor 902 to facilitate the various functions described herein including, but not limited to, those functions described with respect to FIG. 1. Although various components of computing system 900 are shown as distributed components, it should be understood that any of such components may be physically integrated and/or distributed according to the desired configuration of the computing system.
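For orientation, the following Python sketch outlines the kind of control flow that such program instructions might implement for zooming to and capturing a feature of interest: detect a feature based on an interest criteria, zoom to and capture it, score the level of interest as a gradient value, store the image at a resolution chosen from that value, and capture a second image based on the level of interest. The thresholds, resolution tiers, and helper objects (camera, detector, database, display) are assumptions made for this sketch and are not specified by the description.

# Hypothetical sketch of the zoom-and-capture control flow.
INTEREST_THRESHOLD = 0.5  # assumed predetermined interest threshold

def resolution_for(gradient_value):
    # Map a gradient value in a range of interest level values to a storage resolution.
    if gradient_value > 0.8:
        return (4000, 3000)  # high interest: store at full resolution
    if gradient_value > 0.5:
        return (2000, 1500)
    return (1000, 750)       # low interest: store a reduced-resolution copy

def capture_cycle(camera, detector, database, display, first_criteria, second_criteria):
    frame = camera.field_of_view()                          # image data for the environment
    feature = detector.find(frame, first_criteria)          # first feature of interest
    first_image = camera.zoom_and_capture(feature.region)   # zoom to and capture it
    display.show(first_image)

    level = detector.level_of_interest(feature)             # gradient value
    database.store(first_image, resolution=resolution_for(level))

    # Capture a second image based on the level of interest.
    if level >= INTEREST_THRESHOLD:
        second = detector.find(camera.field_of_view(), first_criteria)
    else:
        second = detector.find(frame, second_criteria)
    camera.zoom_and_capture(second.region)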
6. Conclusion
It should be understood that arrangements described herein are for purposes of example only. As such, those skilled in the art will appreciate that other arrangements and other elements (e.g. machines, interfaces, functions, orders, and groupings of functions, etc.) can be used instead, and some elements may be omitted altogether according to the desired results. Further, many of the elements that are described are functional entities that may be implemented as discrete or distributed components or in conjunction with other components, in any suitable combination and location.
While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope being indicated by the following claims, along with the full scope of equivalents to which such claims are entitled. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting.
Since many modifications, variations, and changes in detail can be made to the described example, it is intended that all matters in the preceding description and shown in the accompanying figures be interpreted as illustrative and not in a limiting sense. Further, it is to be understood that the following claims further describe aspects of the present description.

Claims (19)

The invention claimed is:
1. A system comprising:
at least one processor;
a non-transitory computer readable medium; and
program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
receiving image data corresponding to a field of view of an environment;
determining a first feature of interest within the first field of view based on a first interest criteria;
causing a camera to zoom to and capture a first image of a portion of the field of view that comprises the first feature of interest;
providing the first image of the first feature of interest on a display;
determining a level of interest in the first feature of interest, wherein determining a level of interest in the first feature of interest comprises acquiring a gradient value within a range of interest level values indicating the level of interest, and storing the first image, wherein storing the first image of the determined first feature of interest comprises:
based on the gradient value, determining an image resolution at which to store the first image;
causing the first image to be stored in an image-attribute database at the determined image resolution; and
capturing a second image based on the level of interest.
2. The system of claim 1, wherein the camera is attached to a head-mountable device (HMD), and wherein the image data is extracted from a point-of-view video captured by the camera.
3. The system of claim 1, wherein determining a level of interest in the first feature of interest comprises acquiring a gradient value within a range of interest level values indicating the level of interest.
4. The system of claim 1, wherein capturing a second image based on the level of interest further comprises:
determining a second feature of interest within a second field of view of the environment based on the first interest criteria; and
causing the camera to zoom to and capture a second image of the second feature of interest.
5. The system of claim 1, wherein the level of interest is below a predetermined interest threshold, and wherein capturing a second image based on the level of interest further comprises:
determining a second feature of interest within the field of view of the environment based on a second interest criteria; and
causing the camera to zoom to and capture a second image of the second feature of interest.
6. The system of claim 1, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
causing the first image of the first feature of interest to be stored in an image-attribute database, the image-attribute database comprising data for a set of images, wherein the data for a given image of the set of images specifies one or more attributes from a set of attributes.
7. The system of claim 6, wherein causing the first image of the first feature of interest to be stored in the image-attribute database further comprises:
determining one or more attributes indicating a context of the first image;
associating the one or more attributes with the first image; and
causing the first image and the one or more attributes to be stored in the image-attribute database.
8. The system of claim 7, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
providing a subset of images from the image-attribute database on a display, wherein the subset of images includes the first image, and wherein each image in the subset of images shares at least one of the one or more attributes associated with the first image.
9. The system of claim 1, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
receiving eye-tracking data indicating a gaze direction, wherein the eye-tracking data is received from an eye-tracking device; and
determining the first feature of interest within the first field of view based also on the obtained gaze direction.
10. The system of claim 1, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
receiving eye-tracking data indicating pupil dilation, wherein the eye-tracking data is received from an eye-tracking device; and
determining the first feature of interest within the first field of view based also on a degree of pupil dilation.
11. The system of claim 1, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising:
receiving physiological data indicating a physiological state, wherein the physiological data is received from one or more physiological sensors; and
determining the first feature of interest within the first field of view based also on the physiological state.
12. The system of claim 1, wherein causing a camera to zoom to and capture a first image of a portion of the field of view that comprises the first feature of interest further comprises:
determining characteristics of the first feature of interest; and
determining an extent of the zoom based on the characteristics of the first feature of interest.
13. The system of claim 1, wherein causing a camera to zoom to and capture a first image of a portion of the field of view that comprises the first feature of interest further comprises:
determining characteristics of the first feature of interest; and
determining a type of the zoom based on the characteristics of the first feature of interest.
14. A method comprising:
receiving image data corresponding to a field of view of an environment;
determining a first feature of interest within the first field of view based on a first interest criteria;
causing a camera to zoom to and capture a first image of a portion of the field of view that comprises the first feature of interest;
providing the first image of the first feature of interest on a display;
determining a level of interest in the first feature of interest, wherein determining a level of interest in the first feature of interest comprises acquiring a gradient value within a range of interest level values indicating the level of interest;
causing the first image of the first feature of interest to be stored in an image-attribute database comprising data for a set of images, wherein the data for a given image of the set of images specifies one or more attributes from a set of attributes, wherein storing the first image of the determined first feature of interest comprises:
based on the gradient value, determining an image resolution at which to store the first image; and
causing the first image to be stored in an image-attribute database at the determined image resolution; and
capturing a second image based on the level of interest.
15. The method of claim 14, wherein the level of interest is below a predetermined interest threshold, and wherein capturing a second image based on the level of interest comprises:
determining a second feature of interest within the field of view of the environment based on a second interest criteria; and
causing the camera to zoom to and capture the second image of the second feature of interest.
16. A non-transitory computer-readable medium having stored thereon instructions executable by a computing device to cause the computing device to perform functions comprising:
receiving image data corresponding to a field of view of an environment;
determining a first feature of interest within the first field of view based on a first interest criteria;
causing a camera to zoom to and capture a first image of a portion of the field of view that comprises the first feature of interest;
providing the first image of the first feature of interest on a display;
determining a level of interest in the first feature of interest, wherein determining a level of interest in the first feature of interest comprises acquiring a gradient value within a range of interest level values indicating the level of interest, and storing the first image, wherein storing the first image of the determined first feature of interest comprises:
based on the gradient value, determining an image resolution at which to store the first image;
causing the first image to be stored in an image-attribute database at the determined image resolution; and
capturing a second image based on the level of interest.
17. The non-transitory computer readable medium of claim 16, wherein the instructions further comprise:
determining one or more attributes indicating a context of the first image;
associating the one or more attributes with the first image;
causing the first image and the one or more attributes to be stored in the image-attribute database; and
providing a subset of images from the image-attribute database on a display, wherein the subset of images includes the first image, and wherein each image in the subset of images shares at least one of the one or more attributes associated with the first image.
18. The system of claim 8, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising providing the subset of images from the image-attribute database on a display in a tiled mosaic format.
19. The system of claim 1, further comprising program instructions stored on the non-transitory computer readable medium and executable by the at least one processor to perform functions comprising causing the camera to zoom based on both an optical zoom and a digital zoom of the camera, wherein the digital zoom is performed subsequent to the optical zoom.
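Claim 19 above, read together with claims 12 and 13, describes splitting the overall zoom between an optical zoom performed first and a digital zoom performed afterwards, with the extent of the zoom driven by characteristics of the feature of interest. The Python sketch below illustrates one such split under stated assumptions; the target frame fill, the maximum optical factor, and the angular-size input are hypothetical and not taken from the claims.

# Hypothetical split of a required magnification into optical and digital zoom.
def plan_zoom(feature_fraction, target_fill=0.8, max_optical=3.0):
    # feature_fraction: fraction of the frame the feature currently occupies.
    required = target_fill / feature_fraction  # total magnification needed
    optical = min(required, max_optical)       # optical zoom is applied first
    digital = max(required / optical, 1.0)     # remainder is handled digitally
    return optical, digital

# Example: a feature spanning 10% of the frame.
optical, digital = plan_zoom(feature_fraction=0.1)
print(optical, digital)  # 3.0 optical, about 2.67 digital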
US13/617,608 2012-01-06 2012-09-14 Zoom and image capture based on features of interest Active 2033-05-25 US9197864B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US13/617,608 US9197864B1 (en) 2012-01-06 2012-09-14 Zoom and image capture based on features of interest
US14/885,763 US9466112B1 (en) 2012-01-06 2015-10-16 Zoom and image capture based on features of interest
US15/262,847 US9852506B1 (en) 2012-01-06 2016-09-12 Zoom and image capture based on features of interest

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261584100P 2012-01-06 2012-01-06
US13/617,608 US9197864B1 (en) 2012-01-06 2012-09-14 Zoom and image capture based on features of interest

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/885,763 Continuation US9466112B1 (en) 2012-01-06 2015-10-16 Zoom and image capture based on features of interest

Publications (1)

Publication Number Publication Date
US9197864B1 true US9197864B1 (en) 2015-11-24

Family

ID=54543030

Family Applications (3)

Application Number Title Priority Date Filing Date
US13/617,608 Active 2033-05-25 US9197864B1 (en) 2012-01-06 2012-09-14 Zoom and image capture based on features of interest
US14/885,763 Active US9466112B1 (en) 2012-01-06 2015-10-16 Zoom and image capture based on features of interest
US15/262,847 Active US9852506B1 (en) 2012-01-06 2016-09-12 Zoom and image capture based on features of interest

Country Status (1)

Country Link
US (3) US9197864B1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021061479A1 (en) 2019-09-23 2021-04-01 Apple Inc. Rendering computer-generated reality text
GB2621083A (en) 2021-04-20 2024-01-31 Shvartzman Yosef Computer-based system for interacting with a baby and methods of use thereof

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2248501A (en) * 1999-12-17 2001-06-25 Promo Vu Interactive promotional information communicating system
JP2012003189A (en) * 2010-06-21 2012-01-05 Sony Corp Image display device, image display method and program
US20140347363A1 (en) * 2013-05-22 2014-11-27 Nikos Kaburlasos Localized Graphics Processing Based on User Interest

Patent Citations (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5347371A (en) 1990-11-29 1994-09-13 Hitachi, Ltd. Video camera with extraction unit for extracting specific portion of video signal
US5572343A (en) 1992-05-26 1996-11-05 Olympus Optical Co., Ltd. Visual display having see-through function and stacked liquid crystal shutters of opposite viewing angle directions
US7091928B2 (en) 2001-03-02 2006-08-15 Rajasingham Arjuna Indraeswara Intelligent eye
US20020140822A1 (en) 2001-03-28 2002-10-03 Kahn Richard Oliver Camera with visible and infra-red imaging
US7248294B2 (en) 2001-07-10 2007-07-24 Hewlett-Packard Development Company, L.P. Intelligent feature selection and pan zoom control
US7940299B2 (en) 2001-08-09 2011-05-10 Technest Holdings, Inc. Method and apparatus for an omni-directional video surveillance system
US20050063566A1 (en) 2001-10-17 2005-03-24 Beek Gary A . Van Face imaging system for recordal and automated identity confirmation
US20030179288A1 (en) 2002-01-23 2003-09-25 Tenebraex Corporation Method of creating a virtual window
US6972787B1 (en) * 2002-06-28 2005-12-06 Digeo, Inc. System and method for tracking an object with multiple cameras
US20060187305A1 (en) 2002-07-01 2006-08-24 Trivedi Mohan M Digital processing of video images
US7742077B2 (en) 2004-02-19 2010-06-22 Robert Bosch Gmbh Image stabilization system and method for a video camera
US20110018903A1 (en) 2004-08-03 2011-01-27 Silverbrook Research Pty Ltd Augmented reality device for presenting virtual imagery registered to a viewed surface
US7733369B2 (en) 2004-09-28 2010-06-08 Objectvideo, Inc. View handling in video surveillance systems
US7391907B1 (en) 2004-10-01 2008-06-24 Objectvideo, Inc. Spurious object detection in a video surveillance system
US20080192116A1 (en) 2005-03-29 2008-08-14 Sportvu Ltd. Real-Time Objects Tracking and Motion Capture in Sports Events
US8026945B2 (en) 2005-07-22 2011-09-27 Cernium Corporation Directed attention digital video recordation
US7884849B2 (en) 2005-09-26 2011-02-08 Objectvideo, Inc. Video surveillance system with omni-directional camera
US20070146484A1 (en) 2005-11-16 2007-06-28 Joshua Horton Automated video system for context-appropriate object tracking
US20070270215A1 (en) 2006-05-08 2007-11-22 Shigeru Miyamoto Method and apparatus for enhanced virtual camera control within 3d video games or other computer graphics presentations providing intelligent automatic 3d-assist for third person viewpoints
US20090097710A1 (en) 2006-05-22 2009-04-16 Rafael Advanced Defense Systems Ltd. Methods and system for communication and displaying points-of-interest
US20090259102A1 (en) 2006-07-10 2009-10-15 Philippe Koninckx Endoscopic vision system
US20080036875A1 (en) 2006-08-09 2008-02-14 Jones Peter W Methods of creating a virtual window
US20080129844A1 (en) 2006-10-27 2008-06-05 Cusack Francis J Apparatus for image capture with automatic and manual field of interest processing with a multi-resolution camera
US20080292140A1 (en) 2007-05-22 2008-11-27 Stephen Jeffrey Morris Tracking people and objects using multiple live and recorded surveillance camera video feeds
US20090087029A1 (en) 2007-08-22 2009-04-02 American Gnc Corporation 4D GIS based virtual reality for moving target prediction
US20100194859A1 (en) 2007-11-12 2010-08-05 Stephan Heigl Configuration module for a video surveillance system, surveillance system comprising the configuration module, method for configuring a video surveillance system, and computer program
US20090219387A1 (en) 2008-02-28 2009-09-03 Videolq, Inc. Intelligent high resolution video system
US20110043644A1 (en) * 2008-04-02 2011-02-24 Esight Corp. Apparatus and Method for a Dynamic "Region of Interest" in a Display System
US20090324010A1 (en) 2008-06-26 2009-12-31 Billy Hou Neural network-controlled automatic tracking and recognizing system and method
US20110057863A1 (en) 2009-09-10 2011-03-10 Ryohei Sugihara Spectacles-type image display device
WO2011062591A1 (en) 2009-11-21 2011-05-26 Douglas Peter Magyari Head mounted display device
US20110149072A1 (en) 2009-12-22 2011-06-23 Mccormack Kenneth Surveillance system and method for operating same
US20110213664A1 (en) 2010-02-28 2011-09-01 Osterhout Group, Inc. Local advertising content on an interactive head-mounted eyepiece
US20110310260A1 (en) 2010-06-18 2011-12-22 Minx, Inc. Augmented Reality
US20120019522A1 (en) 2010-07-25 2012-01-26 Raytheon Company ENHANCED SITUATIONAL AWARENESS AND TARGETING (eSAT) SYSTEM

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Miura et al., An Active Vision System for Real-Time Traffic Sign Recognition, Proc. 2000 IEEE.
Tack et al., Soldier Information Requirements Technology Demonstration (SIREQ-TD) Off-Bore Camera Display Characterization Study, Human Systems Incorporated, DRDC-Toronto CR-2005-025, May 2005.
Takacs et al., Feature Tracking for Mobile Augmented Reality Using Video Coder Motion Vectors, 2007 IEEE.

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180095527A1 (en) * 2012-12-06 2018-04-05 International Business Machines Corporation Dynamic augmented reality media creation
US10831263B2 (en) * 2012-12-06 2020-11-10 International Business Machines Corporation Dynamic augmented reality media creation
US20200012338A1 (en) * 2012-12-06 2020-01-09 International Business Machines Corporation Dynamic augmented reality media creation
US20180074577A1 (en) * 2012-12-06 2018-03-15 International Business Machines Corporation Dynamic augmented reality media creation
US10452129B2 (en) * 2012-12-06 2019-10-22 International Business Machines Corporation Dynamic augmented reality media creation
US20200012337A1 (en) * 2012-12-06 2020-01-09 International Business Machines Corporation Dynamic augmented reality media creation
US10452130B2 (en) * 2012-12-06 2019-10-22 International Business Machines Corporation Dynamic augmented reality media creation
US10831262B2 (en) * 2012-12-06 2020-11-10 International Business Machines Corporation Dynamic augmented reality media creation
US9916690B2 (en) * 2013-06-19 2018-03-13 Panasonic Intellectual Property Management Co., Ltd. Correction of displayed images for users with vision abnormalities
US20150235427A1 (en) * 2013-06-19 2015-08-20 Panasonic Intellectual Property Management Co., Ltd. Image display device and image display method
US20150350517A1 (en) * 2014-05-27 2015-12-03 François Duret Device for visualizing an interior of a patient's mouth
US11759091B2 (en) * 2014-05-27 2023-09-19 Condor Sas Device for visualizing an interior of a patient's mouth
US10437918B1 (en) * 2015-10-07 2019-10-08 Google Llc Progressive image rendering using pan and zoom
US10395428B2 (en) 2016-06-13 2019-08-27 Sony Interactive Entertainment Inc. HMD transitions for focusing on specific content in virtual-reality environments
CN109478095A (en) * 2016-06-13 2019-03-15 索尼互动娱乐股份有限公司 HMD conversion for focusing the specific content in reality environment
JP2019531550A (en) * 2016-06-13 2019-10-31 株式会社ソニー・インタラクティブエンタテインメント HMD transition to focus on specific content in virtual reality environment
US11568604B2 (en) * 2016-06-13 2023-01-31 Sony Interactive Entertainment Inc. HMD transitions for focusing on specific content in virtual-reality environments
WO2017218436A1 (en) * 2016-06-13 2017-12-21 Sony Interactive Entertainment Inc. Hmd transitions for focusing on specific content in virtual-reality environments
WO2018007779A1 (en) * 2016-07-08 2018-01-11 Sony Interactive Entertainment Inc. Augmented reality system and method
US20190005327A1 (en) * 2017-06-30 2019-01-03 International Business Machines Corporation Object storage and retrieval based upon context
US10713485B2 (en) * 2017-06-30 2020-07-14 International Business Machines Corporation Object storage and retrieval based upon context
JP2019040333A (en) * 2017-08-24 2019-03-14 大日本印刷株式会社 Information processing device, information processing method, computer program and time series data for display control
US11222632B2 (en) 2017-12-29 2022-01-11 DMAI, Inc. System and method for intelligent initiation of a man-machine dialogue based on multi-modal sensory inputs
US11468894B2 (en) * 2017-12-29 2022-10-11 DMAI, Inc. System and method for personalizing dialogue based on user's appearances
US11504856B2 (en) 2017-12-29 2022-11-22 DMAI, Inc. System and method for selective animatronic peripheral response for human machine dialogue
US11331807B2 (en) 2018-02-15 2022-05-17 DMAI, Inc. System and method for dynamic program configuration
US10747312B2 (en) 2018-03-14 2020-08-18 Apple Inc. Image enhancement devices with gaze tracking
WO2019177757A1 (en) * 2018-03-14 2019-09-19 Apple Inc. Image enhancement devices with gaze tracking
US11810486B2 (en) 2018-03-14 2023-11-07 Apple Inc. Image enhancement devices with gaze tracking
CN111602140A (en) * 2018-05-11 2020-08-28 三星电子株式会社 Method of analyzing an object in an image recorded by a camera of a head mounted device
CN111602140B (en) * 2018-05-11 2024-03-22 三星电子株式会社 Method of analyzing objects in images recorded by a camera of a head-mounted device
CN112188097A (en) * 2020-09-29 2021-01-05 Oppo广东移动通信有限公司 Photographing method, photographing apparatus, terminal device, and computer-readable storage medium
WO2022122117A1 (en) * 2020-12-07 2022-06-16 Viewpointsystem Gmbh Method for implementing a zooming function in an eye tracking system
US11582392B2 (en) 2021-03-25 2023-02-14 International Business Machines Corporation Augmented-reality-based video record and pause zone creation

Also Published As

Publication number Publication date
US9466112B1 (en) 2016-10-11
US9852506B1 (en) 2017-12-26

Similar Documents

Publication Publication Date Title
US9852506B1 (en) Zoom and image capture based on features of interest
US9420352B2 (en) Audio system
US10009542B2 (en) Systems and methods for environment content sharing
US10055642B2 (en) Staredown to produce changes in information density and type
US8941561B1 (en) Image capture
US10114466B2 (en) Methods and systems for hands-free browsing in a wearable computing device
US20190331914A1 (en) Experience Sharing with Region-Of-Interest Selection
US9684374B2 (en) Eye reflection image analysis
US8922481B1 (en) Content annotation
US9262780B2 (en) Method and apparatus for enabling real-time product and vendor identification
US9807291B1 (en) Augmented video processing
US8994613B1 (en) User-experience customization
US9058054B2 (en) Image capture apparatus
US9076033B1 (en) Hand-triggered head-mounted photography
US9274599B1 (en) Input detection
US20130021374A1 (en) Manipulating And Displaying An Image On A Wearable Computing System
US20150009309A1 (en) Optical Frame for Glasses and the Like with Built-In Camera and Special Actuator Feature
US20190227694A1 (en) Device for providing augmented reality service, and method of operating the same
US9794475B1 (en) Augmented video capture
US20150170418A1 (en) Method to Provide Entry Into a Virtual Map Space Using a Mobile Device's Camera
US9336779B1 (en) Dynamic image-based voice entry of unlock sequence
US20150193977A1 (en) Self-Describing Three-Dimensional (3D) Object Recognition and Control Descriptors for Augmented Reality Interfaces
US20170163866A1 (en) Input System
US9582081B1 (en) User interface
US8893247B1 (en) Dynamic transmission of user information to trusted contacts

Legal Events

Date Code Title Description
AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:STARNER, THAD EUGENE;WEAVER, JOSHUA;SIGNING DATES FROM 20120904 TO 20120914;REEL/FRAME:028962/0798

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:STARNER, THAD EUGENE;WEAVER, JOSHUA;SIGNING DATES FROM 20120904 TO 20120914;REEL/FRAME:043693/0103

AS Assignment

Owner name: GOOGLE LLC, CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044334/0466

Effective date: 20170929

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8