WO2006030995A1 - Index-based authoring and editing system for video contents - Google Patents

Index-based authoring and editing system for video contents Download PDF

Info

Publication number
WO2006030995A1
WO2006030995A1 PCT/KR2004/002390 KR2004002390W
Authority
WO
WIPO (PCT)
Prior art keywords
index
lecture
video
unit
information
Prior art date
Application number
PCT/KR2004/002390
Other languages
French (fr)
Other versions
WO2006030995A9 (en)
Inventor
Seong-Il Jin
Original Assignee
Realtimetech. Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Realtimetech. Inc. filed Critical Realtimetech. Inc.
Priority to PCT/KR2004/002390 priority Critical patent/WO2006030995A1/en
Publication of WO2006030995A1 publication Critical patent/WO2006030995A1/en
Publication of WO2006030995A9 publication Critical patent/WO2006030995A9/en

Links

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G11B27/32Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on separate auxiliary tracks of the same or an auxiliary record carrier

Definitions

  • the present invention relates to a system for recording and editing an index-based video content, in which a user such as a lecturer looks at a computer screen, inputs a note using an input device, and records and edits the lecture into a video using peripheral devices such as a camera and a microphone.
  • the present invention has been made in view of the above-mentioned problems, and it is an objective of the present invention to provide a system for recording and editing an index-based video content, in which when a lecture is recorded, a camera input such as a lecturer's figure, a lecturer's speech and a lecture content are automatically indexed and generated into a video in real time.
  • a system for recording and editing an index-based video content in which a lecturer looks at a screen of a computer, inputs a note using an input device, and records and edits the lecture into a video using a camera and a microphone.
  • the system comprises: a lecture data output unit for reading out lecture data, outputting the lecture data to the computer, and when an output of the lecture data is changed, transmitting index information on the changed output to an indexing unit in real time;
  • the indexing unit for storing index record information including the index information and a changed time when receiving the index information from the lecture data output unit; a recording unit for receiving image and voice recorded through the camera and the microphone and a scene on the screen of the computer, forming the received result in a synchronized format according to a recorded time and storing the formed result into a lecture video; an editing unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and editing the corresponding lecture video on the basis of each of the index items according to the index information; and a player unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and playing the corresponding lecture video according to the index information.
  • Fig. 1 shows a configuration of a lecture recording room according to an embodiment of the present invention
  • Fig. 2 shows a configuration of a system for recording and editing an index- based video content according to an embodiment of the present invention
  • Fig. 3 shows a configuration of lecture data according to an embodiment of the present invention
  • Fig. 4 shows a configuration of a lecture video according to an embodiment of the present invention
  • Fig. 5 shows a configuration of index record information according to an embodiment of the present invention.
  • Fig. 6 shows a configuration of a lecture video on the basis of each piece of index information according to an embodiment of the present invention
  • Fig. 1 shows a configuration of a lecture recording room according to an embodiment of the present invention.
  • the recording room is equipped with a computer 12, an input device 15, a camera 13, a microphone 16, etc. in order to record a lecture scene of a lecturer 11.
  • the lecturer 11 executes a system for recording and editing a lecture video, sets a session mode and environments of the camera 13, the microphone 16 etc., pushes a recording button, and starts a lecture together with the start of the recording.
  • the session mode includes a presentation mode and a general screen capture mode.
  • in the presentation mode, lecture data prepared in advance are stored as a presentation file, and the presentation is started by executing the presentation file. In the screen capture mode, which includes all modes other than the presentation mode, a screen either displaying programs executed on the computer 12 or taking a whiteboard form may be used.
  • a presentation program such as PowerPoint which has been already popularized may be used.
  • the lecturer 11 gives a lecture based on presentation data displayed on the screen of the computer 12, and inputs necessary items through the input device 15.
  • the input data is displayed on the screen of the computer 12.
  • the input device 15 may include a mouse, an electronic pen or so forth which may be white-boarded.
  • the white-boarded items are displayed on the screen of the computer 12 together with contents of the presentation data.
  • the lecturer 11 may convert the presentation mode into the screen mode or a whiteboard mode in order to make an auxiliary description of the presentation data.
  • in the whiteboard mode, a colored screen similar to a general blackboard may be used, and necessary contents may be white-boarded using an electronic blackboard function.
  • the screen capture mode may execute a desired program using a general computer screen mode.
  • when the whiteboard mode or the screen capture mode is ended, the system is converted back into the original mode, i.e. the presentation mode, and the lecturer 11 proceeds with the lecture.
  • the lecturer 11 may execute a specific application program or give the lecture using the whiteboard mode.
  • the lecturer 11 may perform the white boarding through a white boarding function of the electronic blackboard.
  • the computer 12 includes an output device for outputting or displaying the lecture data on its screen.
  • as the output device, any one of a monitor, a touch screen, a tablet monitor, etc. will do.
  • a scene in which the lecturer 11 gives the lecture is recorded by the camera 13, and a speech of the lecturer 11 is recorded through the microphone 16.
  • a sound generated when the lecturer 11 gives the lecture using a sound generator such as a speaker is recorded through the microphone 16.
  • the scene displayed on the computer 12 is captured into the video.
  • the image such as the lecture scene of the lecturer by the camera 13, the voice by the microphone 16, and the image displayed on the computer 12 are synchronized by a system 20 for recording and editing a video content, and then stored as one video file.
  • as long as the room is equipped with the foregoing devices, any one of a classroom, a conference room, a study room, etc., other than a dedicated recording space, will do.
  • contents of the paper may be recorded.
  • Fig. 2 shows a configuration of a system for recording and editing an index- based video content according to an embodiment of the present invention.
  • a system 20 for recording and editing an index-based video content comprises an indexing unit 210, an index manual input unit 211, a lecture data output unit 212, a recording unit 220, an editing unit 230, and a player unit 240.
  • the lecture data output unit 212 reads out lecture data and outputs them to the computer 12.
  • when an output of the lecture data is changed, index information on the changed output is transmitted to the indexing unit 210 in real time.
  • a configuration of the lecture data will be described in detail with reference to Fig. 3.
  • the lecture data 300 is typically divided by a system such as a table of contents.
  • the title Chapter 1 Understanding of word processor is again classified into: 1-1 What is a word processor?, 1-2 Start and End of word processor, 1-3 Screen configuration of word processor, ..., and 1-N Exercise.
  • the titles, such as 1-1 What is a word processor?, STEP 1, STEP 2, etc., are named index items 311.
  • when the respective index items 311 form a hierarchical structure as a whole, the whole of them is named index information 310. A content of the lecture data corresponding to each of the index items 311 is named a lecture page 321.
  • the whole of the lecture pages 321, i.e. the contents of the lecture data, is named lecture data information 320.
  • Outputs that are output to the computer 12 by the lecture data output unit 212 are mainly the lecture data information 320 of the lecture data 300.
  • the lecture data information 320 is output by sequentially displaying the lecture pages 321 on the screen of the computer 12.
  • a time point of being scrolled from a page (or a scene) to the next page (or the next scene) is generally determined by the input of the lecturer 11. However, if the time point is preset, the pages may be automatically scrolled when a preset time has lapsed.
  • the lecture data output unit 212 transmits the index item 311 corresponding to the lecture page 321 to the indexing unit 210 in real time, thereby informing that the event has occurred.
  • the indexing unit 210 receives the index item 311 for the lecture page 321 output from the lecture data output unit 212 in real time, thereby recognizing which lecture page of the lecture data 300 is currently in progress. Then, the indexing unit 210 stores, as index record information 500, the index item 311 received in real time together with the current system time. A configuration of the index record information 500 will be described in detail with reference to Fig. 5.
  • the index record information 500 is composed of two fields: one 510 for index items and the other 520 for index record times.
  • Index records 530 refer to individuals constituting the index item field and the index record time field of the index record information 500.
  • the indexing unit 210 sets the index item 311 corresponding to the lecture page 321 to a value of the index item field 510 of the index record information 500, and sets the current time to a value of the index record time field of the index record information 500, and then adds a record of the event to the index record information 500 as one of the index records 530.
  • the index manual input unit 211 transmits an event indicating that the index information is manually input to the indexing unit 210.
  • the indexing unit 210 sets a current time to a value of the index record time field 520, and then adds a record of the event to the index record information 500 as one of the records 530.
  • This index record 530 is inserted as the next index record after the index record 530 corresponding to the lecture page 321 currently being output.
  • the recording unit 220 receives a first image stream recorded through the camera 13, a second image stream recorded from the screen of the computer 12, and a voice stream recorded through the microphone 16.
  • the received streams are synchronized according to a recorded time and stored as one lecture video 400.
  • the lecture video 400 will be described in more detail with reference to Fig. 4.
  • the first image stream 410 is composed of frame units 411 arranged in time series.
  • the second image stream 420 is also composed of frame units 421.
  • the voice stream 430 is composed of voice units arranged in time series.
  • the units of the respective streams are synchronized with one another according to the recorded time.
  • the units of the respective streams are illustrated to be equal in the number per second. However, although they are not equal in the number per second, they may be synchronized on the basis of a recorded time.
  • a file of the lecture video 400 that is finally stored may have a standard video format such as AVI (Audio Video Interleave), MPEG (Moving Picture Experts Group)-1, MPEG-2, MPEG-3, etc., the WMV (Windows Media Video) format developed by Microsoft, or a self-developed video format.
  • the self-developed video format follows a WMF (Windows Media Format) for streaming and downloading services.
  • the WMF has the characteristics of a general-purpose coding standard for freely processing multimedia data, including audio and video, on the Internet and on computers.
  • the WMF is composed of Windows Media audio and video codecs, an integrated digital rights management system called DRM (Digital Rights Management), and a file container.
  • the file container may store audio, multiple bit transfer rate video, metadata, and index and script commands (URL etc.) in one file.
  • the file container may store three streams in one video file by storing two images (a figure of the lecturer and a lecture content of the computer) and the voice stream, wherein the two images are generated in the process of recording the video.
  • the editing unit 230 provides a function capable of editing the lecture video file on the basis of each index item.
  • the editing unit 230 makes it possible to read out the index record information 500 stored in the indexing unit 210 and the lecture video 400 stored in the recording unit 220 and to display the lecture video 400 according to the index information.
  • Fig. 6 shows a configuration of the lecture video 400 according to the index information.
  • parts of the three streams are allocated to one 641 of the index items. Each part corresponds to an interval between an index record time of the index item 641 and that of the next index item.
  • the first image stream has five frame units from F1 to F5
  • the second image stream has five frame units from G1 to G5
  • the voice stream has five voice units from P1 to P5.
  • the parts of the three streams corresponding to the index item 641 may be subjected to addition/deletion/modification together. This editing is called "lecture video editing based on index information (or an index item)."
  • the video editing provides a function of adding/deleting/modifying data of an actual file on the basis of each index item.
  • the index information of the video is automatically added, so that physical data of the actual file are added.
  • the index information of the video is automatically deleted, so that physical data of the actual file are deleted.
  • the index information of the video is automatically deleted and then added, so that physical data of the actual file are deleted and added.
  • the index items may be changed in title, combined, or reconstructed into a multi-stage hierarchical structure using chapters and sections.
  • the player unit 240 reads out the lecture video and the index record information that are generated at the recording unit 220 and the indexing unit 210 respectively, and outputs the lecture video to the computer 12 on the basis of each of the index information.
  • a process of playing the lecture video on the basis of each of the index information in the player unit 240 is equal to the process of editing the lecture video on the basis of each of the index information at the editing unit 230 and is carried out according to each index item.
  • the screen of the computer is configured of an image and a voice from the first image stream and the voice stream, an image from the second image stream, and index information (index items).
  • when the corresponding index item is clicked, the data of the clicked index item is played.
  • an image screen caused by the first image stream and an image screen caused by the second image stream may be played by interconversion, and the image screen caused by the second image stream may be magnified and played into the whole screen.
  • the present invention makes it possible to generate various image and voice inputs into one video when the lecture is recorded and edited, thereby easily editing and managing the lecture.
  • the present invention may edit and manage the lecture on the basis of the index information synchronized with the video, thereby making it possible for the lecturer to generate a good quality of lecture contents.
  • the lecture video and the index information which are generated may be variously applied to a video download service, a streaming service, a statistic service, a retrieval service, etc.

Abstract

Disclosed is a system for recording and editing an index-based video content, in which a lecturer looks at a screen of a computer, inputs a note through an input device, and records and edits the lecture into a video using a camera and a microphone. The system includes: a lecture data output unit for reading out lecture data, outputting the lecture data to the computer, and when an output of the lecture data is changed, transmitting index information on the changed output to an indexing unit in real time; the indexing unit for storing index record information including the index information and a changed time when receiving the index information from the lecture data output unit; a recording unit for receiving image and voice recorded through the camera and the microphone and a scene on the screen of the computer, forming the received result in a synchronized format according to a recorded time and storing the formed result into a lecture video; an editing unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and editing the corresponding lecture video on the basis of each of the index items according to the index information; and a player unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and playing the corresponding lecture video according to the index information. Thereby, the system makes it possible to generate various image and voice inputs into one video when the lecture is recorded and edited, and to easily edit and manage the lecture on the basis of the index information synchronized with the video.

Description

INDEX-BASED AUTHORING AND EDITING SYSTEM FOR VIDEO CONTENTS
Technical Field
The present invention relates to a system for recording and editing an index-based video content, in which a user such as a lecturer looks at a computer screen, inputs a note using an input device, and records and edits the lecture into a video using peripheral devices such as a camera and a microphone.
Background Art
Conventionally, when knowledge and data are authored, many lecture authoring techniques using hypertext markup language (HTML) document data and many system techniques capable of recording a video in real time have been developed at home and abroad. However, in the HTML-based lecture authoring techniques, which rely on synchronization between an HTML page and a speech or between an HTML page and an image of a lecturer, several lecture files are generated per lecture, so that it is difficult for a user to edit and manage the lecture. Further, white boarding, such as a compressed image, is added on an HTML page, and not all lecture scenes are recorded, so that it is impossible to provide a lecture service having a high learning effect.
Further, when the lecture is authored as a video, the camera and audio scenes are separated and stored into several video files. Thus, the user has difficulty in editing and managing the lecture files with ease. This is because each image and each speech are divided into many frame units and many voice units respectively, so that the work of determining the sequence of the frame and voice units and of synchronizing the video and the audio is very complicated.
Disclosure of the Invention
Technical Problem
Therefore, the present invention has been made in view of the above-mentioned problems, and it is an objective of the present invention to provide a system for recording and editing an index-based video content, in which when a lecture is recorded, a camera input such as a lecturer's figure, a lecturer's speech and a lecture content are automatically indexed and generated into a video in real time.
It is another objective to provide editing means capable of physically or logically editing (adding, deleting, modifying) videos on the basis of indexed content information.
Technical Solution
In order to accomplish the objectives, there is provided a system for recording and editing an index-based video content, in which a lecturer looks at a screen of a computer, inputs a note using an input device, and records and edits the lecture into a video using a camera and a microphone. The system comprises: a lecture data output unit for reading out lecture data, outputting the lecture data to the computer, and when an output of the lecture data is changed, transmitting index information on the changed output to an indexing unit in real time; the indexing unit for storing index record information including the index information and a changed time when receiving the index information from the lecture data output unit; a recording unit for receiving image and voice recorded through the camera and the microphone and a scene on the screen of the computer, forming the received result in a synchronized format according to a recorded time and storing the formed result into a lecture video; an editing unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and editing the corresponding lecture video on the basis of each of the index items according to the index information; and a player unit for reading out the index record information stored in the indexing unit and the lecture video stored in the recording unit and playing the corresponding lecture video according to the index information.
Brief Description of the Drawings
The foregoing and other objects, features and advantages of the present invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings in which:
Fig. 1 shows a configuration of a lecture recording room according to an embodiment of the present invention;
Fig. 2 shows a configuration of a system for recording and editing an index- based video content according to an embodiment of the present invention;
Fig. 3 shows a configuration of lecture data according to an embodiment of the present invention; Fig. 4 shows a configuration of a lecture video according to an embodiment of the present invention;
Fig. 5 shows a configuration of index record information according to an embodiment of the present invention; and
Fig. 6 shows a configuration of a lecture video on the basis of each piece of index information according to an embodiment of the present invention.
Best Mode for Carrying Out the Invention
Hereinafter, a system for recording and editing an index-based video content in accordance with the present invention will be described in detail with reference to the accompanying drawings.
Fig. 1 shows a configuration of a lecture recording room according to an embodiment of the present invention.
As shown in Fig. 1, the recording room is equipped with a computer 12, an input device 15, a camera 13, a microphone 16, etc. in order to record a lecture scene of a lecturer 11. The lecturer 11 executes a system for recording and editing a lecture video, sets a session mode and the environments of the camera 13, the microphone 16, etc., pushes a recording button, and starts a lecture together with the start of the recording. The session mode includes a presentation mode and a general screen capture mode. In the presentation mode, lecture data prepared in advance are stored as a presentation file, and the presentation is started by executing the presentation file. In the screen capture mode, which includes all modes other than the presentation mode, a screen either displaying programs executed on the computer 12 or taking a whiteboard form may be used. For the presentation file, a presentation program such as PowerPoint, which is already widely used, may be employed. When the presentation mode is selected, the lecturer 11 gives a lecture based on the presentation data displayed on the screen of the computer 12, and inputs necessary items through the input device 15. The input data are displayed on the screen of the computer 12. Here, the input device 15 may include a mouse, an electronic pen, or the like capable of white boarding. The white-boarded items are displayed on the screen of the computer 12 together with the contents of the presentation data. Further, the lecturer 11 may switch from the presentation mode into the screen capture mode or a whiteboard mode in order to make an auxiliary description of the presentation data. In the whiteboard mode, a colored screen similar to a general blackboard may be used, and necessary contents may be white-boarded using an electronic blackboard function. Further, in the screen capture mode, a desired program may be executed using a general computer screen. When the whiteboard mode or the screen capture mode is ended, the system is converted back into the original mode, i.e. the presentation mode, and the lecturer 11 proceeds with the lecture. On selecting the screen capture mode, the lecturer 11 may execute a specific application program or give the lecture using the whiteboard mode. In addition, the lecturer 11 may perform the white boarding through the white boarding function of the electronic blackboard.
The computer 12 includes an output device for outputting or displaying the lecture data on its screen. As the output device, any one of a monitor, a touch screen, a tablet monitor etc. will do.
A scene in which the lecturer 11 gives the lecture is recorded by the camera 13, and a speech of the lecturer 11 is recorded through the microphone 16. Of course, besides the speech of the lecturer 11, a sound generated when the lecturer 11 gives the lecture using a sound generator such as a speaker is also recorded through the microphone 16. Further, the scene displayed on the computer 12 is captured into the video. The image such as the lecture scene of the lecturer by the camera 13, the voice by the microphone 16, and the image displayed on the computer 12 are synchronized by a system 20 for recording and editing a video content, and then stored as one video file.
As long as the recording room is equipped with only the foregoing devices, any one of a classroom, a conference room, a study room, etc., other than a dedicated recording space, will do. Thus, if a person reads a paper in such a place, the contents of the paper may be recorded.
Fig. 2 shows a configuration of a system for recording and editing an index- based video content according to an embodiment of the present invention.
As shown in Fig. 2, a system 20 for recording and editing an index-based video content comprises an indexing unit 210, an index manual input unit 211, a lecture data output unit 212, a recording unit 220, an editing unit 230, and a player unit 240. The lecture data output unit 212 reads out lecture data and outputs them to the computer 12. When an output of the lecture data is changed, for example, in sequence, index information on the changed output is transmitted to the indexing unit 210 in real time. A configuration of the lecture data will be described in detail with reference to Fig. 3.
As shown in Fig. 3, the lecture data 300 is typically divided according to a scheme such as a table of contents. In the example of Fig. 3, there is a main title, Introduction of word processor, which is classified into: Chapter 1 Understanding of word processor, Chapter 2 .... The title Chapter 1 Understanding of word processor is again classified into: 1-1 What is a word processor?, 1-2 Start and End of word processor, 1-3 Screen configuration of word processor, ..., and 1-N Exercise. Here, there are two cases: one without detailed classification, as in 1-1 What is a word processor?, and the other with detailed classification, as in 1-2 Start and End of word processor, which is again classified into: STEP 1, STEP 2, ..., and STEP n. At this time, the titles, such as 1-1 What is a word processor?, STEP 1, STEP 2, etc., are named index items 311. When the respective index items 311 form a hierarchical structure as a whole, as shown in Fig. 3, the whole of them is named index information 310. A content of the lecture data corresponding to each of the index items 311 is named a lecture page 321. The whole of the lecture pages 321, i.e. the contents of the lecture data, is named lecture data information 320.
Outputs that are output to the computer 12 by the lecture data output unit 212 are mainly the lecture data information 320 of the lecture data 300. The lecture data information 320 is output by sequentially displaying the lecture pages 321 on the screen of the computer 12. A time point of being scrolled from a page (or a scene) to the next page (or the next scene) is generally determined by the input of the lecturer 11. However, if the time point is preset, the pages may be automatically scrolled when a preset time has lapsed.
Here, whenever a change of the displayed lecture page 321, i.e., an event occurs, the lecture data output unit 212 transmits the index item 311 corresponding to the lecture page 321 to the indexing unit 210 in real time, thereby informing that the event has occurred.
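As a rough, purely illustrative sketch of this event-driven behavior (the patent does not specify an implementation; class and method names such as LectureDataOutputUnit and notify are invented here), the lecture data output unit can be pictured as reporting the index item of each newly displayed page, together with the current time, to the indexing unit:

    import time

    class LectureDataOutputUnit:
        """Hypothetical sketch of the lecture data output unit (212)."""

        def __init__(self, lecture_pages, indexing_unit):
            # lecture_pages: ordered list of (index_item, page_content) pairs,
            # i.e. the lecture data information made up of lecture pages.
            self.lecture_pages = lecture_pages
            self.indexing_unit = indexing_unit
            self.current = -1

        def show_next_page(self):
            # Display the next lecture page on the computer screen and report
            # the page-change event to the indexing unit in real time.
            self.current += 1
            index_item, page = self.lecture_pages[self.current]
            print(page)  # stands in for outputting the page to the computer 12
            self.indexing_unit.notify(index_item, time.time())

A matching sketch of the indexing unit, which receives these notify calls, follows below.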
The indexing unit 210 receives the index item 311 for the lecture page 321 output from the lecture data output unit 212 in real time, thereby recognizing which lecture page of the lecture data 300 is currently in progress. Then, the indexing unit 210 stores, as index record information 500, the index item 311 received in real time together with the current system time. A configuration of the index record information 500 will be described in detail with reference to Fig. 5.
As shown in Fig. 5, the index record information 500 is composed of two fields: one 510 for index items and the other 520 for index record times. Index records 530 refer to individuals constituting the index item field and the index record time field of the index record information 500. Specifically, when the change of the lecture page 321, i.e., the event occurs at the computer 12, the indexing unit 210 sets the index item 311 corresponding to the lecture page 321 to a value of the index item field 510 of the index record information 500, and sets the current time to a value of the index record time field of the index record information 500, and then adds a record of the event to the index record information 500 as one of the index records 530.
When the lecturer 11 inputs index information in the course of the lecture, the index manual input unit 211 transmits an event indicating that the index information has been manually input to the indexing unit 210. For this event, the indexing unit 210 sets the current time as the value of the index record time field 520 and then adds a record of the event to the index record information 500 as one of the records 530. This index record 530 is inserted as the next index record after the index record 530 corresponding to the lecture page 321 currently being output.
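The index record information can be pictured as a simple two-column table of index items and index record times. The sketch below is only an illustration under the assumptions of the previous sketch (IndexRecord and IndexingUnit are invented names, not the patent's code): automatic page-change events append a record, and a manually entered index, being recorded at the current time, lands right after the record of the page currently being output.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class IndexRecord:
        item: str           # index item field (510)
        record_time: float  # index record time field (520)

    @dataclass
    class IndexingUnit:
        """Hypothetical sketch of the indexing unit (210)."""
        records: List[IndexRecord] = field(default_factory=list)

        def notify(self, index_item, now):
            # Page-change event from the lecture data output unit: add a record.
            self.records.append(IndexRecord(index_item, now))

        def notify_manual(self, index_item, now):
            # Manual index entered by the lecturer; because records are kept in
            # time order, appending places it directly after the record of the
            # lecture page currently being output.
            self.records.append(IndexRecord(index_item, now))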
The recording unit 220 receives a first image stream recorded through the camera 13, a second image stream recorded from the screen of the computer 12, and a voice stream recorded through the microphone 16. The received streams are synchronized according to the recorded time and stored as one lecture video 400. The lecture video 400 will be described in more detail with reference to Fig. 4.
As shown in Fig. 4, the first image stream 410 is composed of frame units 411 arranged in time series. The second image stream 420 is also composed of frame units 421. The voice stream 430 is composed of voice units arranged in time series. For the three streams, it may be seen from Fig. 4 that the units of the respective streams are synchronized with one another according to the recorded time. In Fig. 4, the units of the respective streams are illustrated as being equal in number per second. However, even if they are not equal in number per second, they may still be synchronized on the basis of the recorded time. The file of the lecture video 400 that is finally stored may have a standard video format such as AVI (Audio Video Interleave), MPEG (Moving Picture Experts Group)-1, MPEG-2, MPEG-3, etc., the WMV (Windows Media Video) format developed by Microsoft, or a self-developed video format. The self-developed video format follows the WMF (Windows Media Format) for streaming and downloading services. The WMF has the characteristics of a general-purpose coding standard for freely processing multimedia data, including audio and video, on the Internet and on computers. The WMF is composed of Windows Media audio and video codecs, an integrated digital rights management system called DRM (Digital Rights Management), and a file container. Among them, the file container may store audio, multiple-bit-rate video, metadata, and index and script commands (URLs etc.) in one file. Hence, the file container may store three streams in one video file by storing the two images (the figure of the lecturer and the lecture content of the computer) and the voice stream, wherein the two images are generated in the process of recording the video.
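A minimal sketch of this time-based synchronization is shown below (illustrative only; an actual implementation would write the streams into a container such as WMF/ASF rather than a Python dictionary). Units from the three streams are grouped by the second at which they were recorded, so even streams with unequal unit rates line up on a common timeline:

    def synchronize(camera_frames, screen_frames, voice_units):
        # Each argument is a list of (recorded_time, data) tuples.
        timeline = {}
        for name, stream in (("camera", camera_frames),
                             ("screen", screen_frames),
                             ("voice", voice_units)):
            for t, data in stream:
                timeline.setdefault(int(t), {}).setdefault(name, []).append(data)
        # timeline[t] now holds every unit of the three streams recorded at second t.
        return timeline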
The editing unit 230 provides a function capable of editing the lecture video file on the basis of each index item. The editing unit 230 makes it possible to read out the index record information 500 stored in the indexing unit 210 and the lecture video 400 stored in the recording unit 220 and to display the lecture video 400 according to the index information. Fig. 6 shows a configuration of the lecture video 400 according to the index information.
As shown in Fig. 6, parts of the three streams are allocated to one 641 of the index items. Each part corresponds to the interval between the index record time of the index item 641 and that of the next index item. Specifically, in the index record information 500 of Fig. 5, the index record time T1 of the index item 1-1 What is a word processor? is 1 (T1 = 1), and the index record time T2 of the next index item, Step 1, is 6 (T2 = 6). Therefore, the time belonging to the index item 1-1 What is a word processor? is from 1 to 5. Referring to Fig. 4, within this time, the first image stream has five frame units from F1 to F5, the second image stream has five frame units from G1 to G5, and the voice stream has five voice units from P1 to P5. In other words, the parts of the three streams corresponding to the index item 641 may be subjected to addition/deletion/modification together. This editing is called "lecture video editing based on index information (or an index item)."
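The interval computation can be sketched as follows, assuming the IndexRecord list and the timeline dictionary from the earlier sketches (these helpers are illustrative, not the patent's code): the units belonging to an index item are those whose recorded time falls between that item's record time and the next item's record time.

    def units_for_index(records, i, timeline):
        # records: IndexRecord list in time order; timeline: dict from synchronize().
        start = records[i].record_time
        end = records[i + 1].record_time if i + 1 < len(records) else float("inf")
        # With T1 = 1 and T2 = 6 this selects the units recorded at times 1..5.
        return {t: units for t, units in timeline.items() if start <= t < end}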
The video editing provides a function of adding/deleting/modifying data of the actual file on the basis of each index item. For addition to the video, when an index region selected by a user such as the lecturer is input between index regions of the original video which the user intends to edit, the index information of the video is automatically added, so that physical data of the actual file are added. For deletion from the video, when an index region which the user intends to delete is selected from the original video, the index information of the video is automatically deleted, so that physical data of the actual file are deleted. For modification of the video, in which deletion and addition are carried out sequentially, when an index region selected by the user is input between index regions of the original video which the user intends to modify, the index information of the video is automatically deleted and then added, so that physical data of the actual file are deleted and added. The index items may be changed in title, combined, or reconstructed into a multi-stage hierarchical structure using chapters and sections. When editing of the video is completed, the lecture video and the index information are automatically synchronized again. The player unit 240 reads out the lecture video and the index record information that are generated by the recording unit 220 and the indexing unit 210 respectively, and outputs the lecture video to the computer 12 on the basis of the index information. The process of playing the lecture video on the basis of the index information in the player unit 240 is the same as the process of editing the lecture video on the basis of the index information in the editing unit 230 and is carried out according to each index item.
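One possible way such an index-based deletion and re-synchronization could look is sketched below; this is an assumption about the mechanism, not the patent's implementation. The selected index region's record and its stream units are removed, and later record times and unit times are shifted so that the index information and the physical data stay aligned.

    def delete_index_region(records, timeline, i):
        start = records[i].record_time
        end = records[i + 1].record_time if i + 1 < len(records) else max(timeline, default=start) + 1
        span = end - start
        new_timeline = {}
        for t, units in timeline.items():
            if t < start:
                new_timeline[t] = units         # before the deleted region: unchanged
            elif t >= end:
                new_timeline[t - span] = units  # after it: shifted earlier in time
        del records[i]                          # drop the index record itself
        for r in records[i:]:
            r.record_time -= span               # keep later index times in sync
        return records, new_timeline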
The screen of the computer is configured with an image and a voice from the first image stream and the voice stream, an image from the second image stream, and the index information (index items). When the corresponding index item is clicked, the data of the clicked index item is played. Further, the image screen from the first image stream and the image screen from the second image stream may be switched with each other during playback, and the image screen from the second image stream may be magnified and played on the whole screen.
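Playback per index item can be pictured like this (the player object and its seek and play_until calls are placeholders invented for this sketch, not a real player API): clicking an index item seeks to its record time and plays only up to the next item's record time.

    def on_index_item_clicked(records, i, player):
        start = records[i].record_time
        end = records[i + 1].record_time if i + 1 < len(records) else None
        player.seek(start)       # jump to the clicked index item's portion
        player.play_until(end)   # play until the next index item (or the end)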
Industrial Applicability
As set forth above, the present invention makes it possible to generate various image and voice inputs into one video when the lecture is recorded and edited, thereby easily editing and managing the lecture.
Further, the present invention may edit and manage the lecture on the basis of the index information synchronized with the video, thereby making it possible for the lecturer to generate a good quality of lecture contents.
In addition, the lecture video and the index information which are generated may be variously applied to a video download service, a streaming service, a statistics service, a retrieval service, etc.
While this invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not limited to the disclosed embodiment and the drawings, but, on the contrary, it is intended to cover various modifications and variations within the spirit and scope of the appended claims.

Claims

Claims
1. A system for recording and editing an index-based video content, in which a lecturer looks at a screen of a computer (12), inputs a note through an input device (15), and records and edits the lecture into a video using a camera (13) and a microphone (16), the system comprising: a lecture data output unit (212) for reading out lecture data, outputting the lecture data to the computer (12), and when an output of the lecture data is changed, transmitting index information on the changed output to an indexing unit (210) in real time; the indexing unit (210) for, when receiving the index information from the lecture data output unit (212), storing index record information (500) including the index information and a changed time; a recording unit (220) for receiving image and voice recorded through the camera (13) and the microphone (16) and a scene on the screen of the computer (12), forming the received result in a synchronized format according to a recorded time and storing the formed result into a lecture video (400); an editing unit (230) for reading out the index record information (500) stored in the indexing unit (210) and the lecture video (400) stored in the recording unit (220) and editing the corresponding lecture video (400) on the basis of each of index items according to the index information; and a player unit (240) for reading out the index record information (500) stored in the indexing unit (210) and the lecture video (400) stored in the recording unit (220) and playing the corresponding lecture video (400) according to the index information.
2. The system according to claim 1, further comprising an index manual input unit (211) for receiving the index information input manually by the lecturer in the course of giving the lecture and transmitting the input index information to the indexing unit (210) in real time.
3. The system according to claim 1, wherein the index information (300) is composed of each of the index items (310), the index items (310) forming a hierarchical structure, each of the index items (310) corresponding to a part of the lecture data, and wherein the index record information (500) has an index item field (510) and an index record time field (520) for recording a changed time and has index records (530) stored in a changed time order.
4. The system according to claim 1 or 3, wherein: the editing unit (230) edits the lecture video (400) on the basis of each index item (310) of the index record information (500); and the player unit (240) plays the lecture video (400) on the basis of each index item (310) of the index record information (500).
5. The system according to claim 1, wherein: the lecture video (400) is composed of a first image stream recorded through the camera (13), a second image stream recorded through the screen of the computer (12), and a voice stream recorded through the microphone (16); and units of the respective streams are synchronized with each other according to a recorded time, and when being edited or played on the basis of each time interval of the index information, the three streams belonging to the time interval are edited and played together.
PCT/KR2004/002390 2004-09-17 2004-09-17 Index-based authoring and editing system for video contents WO2006030995A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2004/002390 WO2006030995A1 (en) 2004-09-17 2004-09-17 Index-based authoring and editing system for video contents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/KR2004/002390 WO2006030995A1 (en) 2004-09-17 2004-09-17 Index-based authoring and editing system for video contents

Publications (2)

Publication Number Publication Date
WO2006030995A1 (en) 2006-03-23
WO2006030995A9 WO2006030995A9 (en) 2006-07-27

Family

ID=36060225

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2004/002390 WO2006030995A1 (en) 2004-09-17 2004-09-17 Index-based authoring and editing system for video contents

Country Status (1)

Country Link
WO (1) WO2006030995A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010095149A1 (en) * 2009-02-20 2010-08-26 Indian Institute Of Technology, Bombay A device and method for automatically recreating a content preserving and compression efficient lecture video
US8165416B2 (en) 2007-06-29 2012-04-24 Microsoft Corporation Automatic gain and exposure control using region of interest detection
US8330787B2 (en) 2007-06-29 2012-12-11 Microsoft Corporation Capture device movement compensation for speaker indexing
US8526632B2 (en) 2007-06-28 2013-09-03 Microsoft Corporation Microphone array for a camera speakerphone
US10951859B2 (en) 2018-05-30 2021-03-16 Microsoft Technology Licensing, Llc Videoconferencing device and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20000007057A (en) * 1999-11-23 2000-02-07 김정현 wet site bookbinding structure of english diary service system through the internet
WO2001018731A1 (en) * 1999-09-10 2001-03-15 Dataplay, Inc. Writeable medium access control using a medium writeable area
KR20020072478A (en) * 2001-03-10 2002-09-16 블럭엠 주식회사 Streaming method by moving picture compression method using SPEG
KR20040040960A (en) * 2002-11-08 2004-05-13 김경중 Digital study apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001018731A1 (en) * 1999-09-10 2001-03-15 Dataplay, Inc. Writeable medium access control using a medium writeable area
KR20000007057A (en) * 1999-11-23 2000-02-07 김정현 wet site bookbinding structure of english diary service system through the internet
KR20020072478A (en) * 2001-03-10 2002-09-16 블럭엠 주식회사 Streaming method by moving picture compression method using SPEG
KR20040040960A (en) * 2002-11-08 2004-05-13 김경중 Digital study apparatus

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8526632B2 (en) 2007-06-28 2013-09-03 Microsoft Corporation Microphone array for a camera speakerphone
US8165416B2 (en) 2007-06-29 2012-04-24 Microsoft Corporation Automatic gain and exposure control using region of interest detection
US8330787B2 (en) 2007-06-29 2012-12-11 Microsoft Corporation Capture device movement compensation for speaker indexing
US8749650B2 (en) 2007-06-29 2014-06-10 Microsoft Corporation Capture device movement compensation for speaker indexing
WO2010095149A1 (en) * 2009-02-20 2010-08-26 Indian Institute Of Technology, Bombay A device and method for automatically recreating a content preserving and compression efficient lecture video
US10951859B2 (en) 2018-05-30 2021-03-16 Microsoft Technology Licensing, Llc Videoconferencing device and method

Also Published As

Publication number Publication date
WO2006030995A9 (en) 2006-07-27

Similar Documents

Publication Publication Date Title
KR101115701B1 (en) Method and apparatus for annotating video content with metadata generated using speech recognition technology
TWI317937B (en) Storage medium including metadata and reproduction apparatus and method therefor
US7181757B1 (en) Video summary description scheme and method and system of video summary description data generation for efficient overview and browsing
US20030124502A1 (en) Computer method and apparatus to digitize and simulate the classroom lecturing
US8275814B2 (en) Method and apparatus for encoding/decoding signal
US20060236219A1 (en) Media timeline processing infrastructure
JP2001028722A (en) Moving picture management device and moving picture management system
US9558784B1 (en) Intelligent video navigation techniques
KR20010050596A (en) A Video Summary Description Scheme and A Method of Video Summary Description Generation for Efficient Overview and Browsing
US6931201B2 (en) Video indexing using high quality sound
Bulterman et al. Socially-aware multimedia authoring: Past, present, and future
JP2007511858A (en) Recording medium on which meta information and subtitle information for providing an extended search function are recorded, and a reproducing apparatus thereof
WO2006030995A1 (en) Index-based authoring and editing system for video contents
CN101471115B (en) Photographing apparatus and photographing method
JP2012053855A (en) Content browsing device, content display method and content display program
KR101424625B1 (en) Chapter creation device, chapter creation method, and computer readable recording medium having chapter creation program thereon
KR100459668B1 (en) Index-based authoring and editing system for video contents
JP2004266578A (en) Moving image editing method and apparatus
JP7063351B2 (en) Information processing equipment, information processing methods, and information processing programs
KR102252522B1 (en) Method and system for automatic creating contents list of video based on information
JPH07319751A (en) Integrated management method for data files related to video, voice and text
Bush et al. Customized video playback: Standards for content description, customization, and personalization
Mu Decoupling the information application from the information creation: Video as learning objects in three-tier architecture
JP2006074514A (en) Image editing device, image reproducing device, file database, file distributing server, image editing method, image editing program, image reproducing method, image reproducing program, and computer readable storage medium
JP5605083B2 (en) Video playback device and video playback program

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase