WO2017114573A1 - Controlling processing of multimedia content - Google Patents

Controlling processing of multimedia content

Publication number
WO2017114573A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
audio
time interval
frame
useful
Application number
PCT/EP2015/081414
Other languages
English (en)
Inventor
Fabrizio MOGGIO
Nicola REALE
Andrea Varesio
Marco Vecchietti
Original Assignee
Telecom Italia S.P.A.
Application filed by Telecom Italia S.P.A.
Priority to PCT/EP2015/081414
Publication of WO2017114573A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/22 Means responsive to presence or absence of recorded information signals
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/147 Scene change detection
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/02 Arrangements for relaying broadcast information
    • H04H20/04 Arrangements for relaying broadcast information from field pickup units [FPU]

Definitions

  • the present invention relates to the field of systems and methods for the acquisition of multimedia content.
  • the present invention relates to a method for controlling the processing of multimedia content, for instance audio and video information, acquired by a capturing device such as a camcorder.
  • a system used for the acquisition of multimedia content may comprise a capturing device, such as a camcorder, and a portable processing device (such as a personal computer, a tablet or the like) connected to the capturing device.
  • the processing device allows processing the multimedia content, in particular audio and video information, captured by the capturing device and transmitting the multimedia content to a remote station, for, e.g., broadcasting the multimedia content over a television network or over the Internet.
  • the processing of the multimedia content is typically performed by a dedicated software application installed on the portable processing device.
  • the expression “switch on the capturing device” refers to the operation of starting to record (or capture), by means of the capturing device, the multimedia content from the surrounding environment.
  • the expression “switch off the capturing device” refers to the operation of stopping the recording of the multimedia content from the surrounding environment.
  • US20080313686 discloses a combination of a handheld camcorder or camcorder accessory and a designated webpage, wherein the camcorder records onto a designated medium, the camcorder or camcorder accessory has an integrated wireless transmitter and receiver unit and an integrated web browser, and the camcorder or camcorder accessory is pre-programmed, by default, to allow for webcasting to the designated web page; the designated webpage is programmed to receive and display the video transmission from the camcorder.
  • the handheld camcorder or camcorder accessory having wireless internet access allows for pre-programmed, one-button webcasting, whereby a user who is familiar only with the basic operation of the camcorder can effectively webcast.

Summary of the invention

  • when known systems for multimedia content acquisition are used, the human operator must manually use the control tool in order to start and stop processing the captured multimedia content.
  • starting the processing of the captured multimedia content requires the human operator to set aside the capturing device, take the portable processing device out of the bag or backpack where it is stored, activate the processing operation by using the control tool, and put the processing device back in the backpack. Only after all these operations can the human operator pick up the capturing device again and start capturing the multimedia content.
  • the operations above imply that the operator must divert his/her attention from the camcorder and focus on the processing device and the software application to be activated for the processing of the multimedia content. This may be inconvenient and may delay the transmission of the multimedia content.
  • the process for wireless webcasting of US20080313686 provides for integrating a wireless transmitter, a wireless receiver and a web browser in the handheld camcorder to allow for webcasting the multimedia content to a web page.
  • the camcorder of US20080313686 comprises a dedicated accessory comprising, for instance, a toggle switch for toggling the webcasting capabilities of the camcorder accessory on and off. Therefore, the method of US20080313686 requires a dedicated handheld camcorder or, at least, requires modifying a standard handheld camcorder in order to integrate the necessary components. Further, the camcorder operator must (manually) activate the toggle switch to start webcasting the images captured by the camcorder.
  • the Applicant has tackled the problem of providing a method for controlling the acquisition of multimedia content which allows overcoming the drawbacks underlined above.
  • the Applicant has tackled the problem of providing a method for controlling processing of multimedia content acquired by a handheld capturing device, such as a camcorder, which does not require altering the capturing device or its manner of use, and which does not require specific actions to be performed by the human operator (other than those typically performed by the operator for using the camcorder) in order to start and stop processing and transmitting the captured multimedia content.
  • the expression "operating state of the capturing device” will indicate either a state according to which the capturing device is put in condition to capture multimedia content from the surrounding environment, or a state according to which the capturing device is put in condition not to capture multimedia content from the surrounding environment.
  • the operating state of the capturing device may change by switching on the capturing device or by switching off the capturing device.
  • the present invention provides a method for controlling processing of multimedia content captured by a capturing device, the multimedia content being provided over a signal comprising a video signal or an audio signal, the method comprising the following steps:
  • the video signal is a digital video signal comprising a sequence of video frames
  • detecting comprises checking whether a current video frame of the sequence of video frames is a black frame.
  • checking comprises checking whether the value of the luma and the values of the chrominance components of a selected set of pixels within the video frame are comprised within, respectively, a luma interval of black color and a chrominance interval of black color.
  • the luma interval of black color is comprised between 0 and 16, and wherein the chrominance interval of black color is comprised between 127 and 129.
  • the pixels of the selected set of pixels are arranged along a horizontal axis and a vertical axis of the video frame subdividing the video frame into four quadrants, and wherein the selected set of pixels also comprises at least four further pixels, each located at the center of a respective quadrant.
  • the method further comprises, in case the current video frame of the sequence of video frames is not a black frame: incrementing a useful video frame time interval;
  • the method further comprises, in case the current video frame is a black frame
  • the value of the first threshold time interval and of the second threshold time interval is equal to 250 ms.
  • the audio signal is a digital audio signal comprising a sequence of audio frames, each audio frame comprising a sequence of audio samples, and wherein detecting comprises checking whether a current audio frame of the sequence of audio frames is a null audio frame indicating silence.
  • checking comprises checking whether the value of each audio sample of the current audio frame is comprised within a null audio interval of values indicating silence.
  • the values of the null audio interval range between -16 and 16.
  • the method further comprises, in case the current audio frame of the sequence of audio frames is not a null audio frame:
  • the method further comprises, in case the current audio frame is a null audio frame:
  • the value of the third threshold time interval and of the fourth threshold time interval is equal to 250 ms.
  • controlling comprises, in case the transition from absence of a useful signal to presence of the useful signal is determined, switching on the processing of the multimedia content.
  • controlling comprises, in case the transition from presence of a useful signal to absence of the useful signal is determined, switching off the processing of the multimedia content.
  • the present invention provides an analysis module for a processing device configured to process a multimedia content captured by a capturing device, the multimedia content being provided to the analysis module over a signal comprising a video signal or an audio signal, the analysis module being configured to :
  • the useful signal is a portion of the signal comprising the multimedia content
  • FIG. 1 schematically shows an exemplary system for the acquisition of multimedia content according to an embodiment of the present invention
  • FIG. 2 is a flow chart of the method for controlling processing of multimedia content according to an embodiment of the present invention
  • FIG. 3 shows an exemplary video frame in which the pixels selected for analysis are highlighted, according to an embodiment of the present invention.
  • FIGS. 4a and 4b show alternative ways of selecting a set of pixels for the analysis according to embodiments of the present invention.
  • Figure 1 schematically shows an exemplary system 1 for the acquisition of multimedia content suitable for implementing the method for controlling processing of multimedia content acquired by a capturing device according to an embodiment of the present invention.
  • the system 1 preferably comprises a capturing device 11, such as a camcorder, suitable to capture multimedia (e.g. audio and video) content, and a processing device 12 connected to the capturing device 11.
  • a (handheld) camcorder as capturing device.
  • the camcorder 11 may be used by a human operator for capturing in real time the audio and video content of a live broadcasting event such as a live sport event, which may be transmitted over an IP (Internet Protocol) network as an audiovisual streaming content.
  • the processing device 12 may be a portable personal computer, a tablet, a Mini PC (e.g. Intel® NUC), a Single Board Computer (e.g. Raspberry Pi) or the like.
  • the processing device 12 may be stored in a bag or a backpack by the human operator.
  • the processing device 12 is preferably connected to the camcorder 11 by means of a wireless connection or a wired connection.
  • the camcorder 11 may have digital output ports and/or analog output ports. As known, the camcorder 11 may output a digital audiovisual signal which is formatted according to a digital video format, such as DV (Digital Video).
  • the digital output ports of the camcorder 11 may comprise one or more of: a USB (Universal Serial Bus) output port, a FireWire output port, an HDMI (High Definition Multimedia Interface) output port, an SDI (Serial Digital Interface) output port.
  • Corresponding USB, FireWire, HDMI, or SDI input ports are present in the processing device 12 for receiving the digital audiovisual signal through a USB, FireWire, HDMI or SDI cable.
  • the processing device 12 is typically equipped with a FireWire card for the acquisition of the digital audiovisual signal by the processing device through a FireWire input port.
  • the processing device 12 may typically receive this signal through an analog input port connected to a video capture card which provides for converting the received signal into a digital audiovisual signal.
  • the camcorder 11, once switched on, outputs a digital audiovisual signal.
  • the system 1 further comprises a communication network 13 and a remote station 14.
  • the communication network 13 is preferably a wireless communication network connecting the processing device 12 and the remote station 14.
  • the communication network 13 may be an IP communication network.
  • the remote station 14 may be a server.
  • the camcorder 11 is configured to capture multimedia content in the form of an audiovisual data flow, output data in the form of a digital audiovisual signal and transmit it to the processing device 12.
  • the processing device 12 is configured to process the digital audiovisual signal, for instance by applying a coding operation, as will be described hereinafter, and to transmit it to the remote station 14 through the communication network 13.
  • the remote station 14 may be configured to receive the audiovisual signal, to further process it (for instance, by applying a decoding operation), and to broadcast it over, e.g., a television network or over the Internet.
  • the operation of the remote station 14 is not relevant to the present invention and hence it will not be further described hereinafter.
  • the processing device 12 comprises an analysis module 121 and a processing module 122. Both the analysis module 121 and the processing module 122 receive the digital audiovisual signal comprising the multimedia content captured by the camcorder 11.
  • the analysis module 121 is preferably configured to analyse the digital audiovisual signal and, on the basis of a result of the analysis, to control the activation of the processing module 122 or the deactivation thereof, as will be described in detail hereinafter.
  • the processing module 122 is preferably configured to, once activated by the analysis module 121, process the digital audiovisual signal in order to, for instance, apply an encoding operation to the digital video signal and the digital audio signal comprised within the digital audiovisual signal.
  • the processing module 122 may apply H.264/MPEG-4 AVC (Advanced Video Coding) encoding to the digital video signal, and AAC (Advanced Audio Coding) encoding to the digital audio signal.
  • alternatively, H.265/HEVC (High Efficiency Video Coding) encoding may be applied to the digital video signal.
  • the encoded signals may then be transmitted by the processing device 12 over the communication network 13 towards the remote station 14.
  • the operations of the processing module 122 will not be described in greater detail hereinafter as they are not relevant to the present description.
  • Both the analysis module 121 and the processing module 122 may be implemented as software modules running on the hardware of the processing device 12.
  • the processing device 12 is already active when the operator switches on the camcorder 11.
  • the camcorder 11 captures the multimedia content from the surrounding environment, generates a corresponding digital audiovisual signal and transmits it to the processing device 12.
  • the processing device 12 receives the digital audiovisual signal through a digital input port.
  • the signal is forwarded to the analysis module 121 and, in parallel, to the processing module 122.
  • the digital audiovisual signal received from the input port is preferably split into a digital video signal and a digital audio signal.
  • the digital audiovisual signal at the input of the analysis module 121, which carries the captured multimedia content, generally comprises portions of a "useful audiovisual signal" actually comprising the multimedia content, and portions indicating absence of such a useful signal, as will be described in greater detail hereinafter.
  • the digital video signal is acquired by the analysis module 121 as a stream of video frames by sampling the digital video signal at a predetermined frequency of acquisition of the video signal or video capture frequency, which may be equal to 50 Hz in Europe and 60 Hz in America and Asia.
  • Each video frame is composed of a number of lines of picture elements (pixels).
  • the pixels of a video frame are usually associated with a color model.
  • each pixel may be associated with brightness information (also called "luma") and color information (or "chrominance"), the latter being usually composed of two color-difference components, U (blue minus luma) and V (red minus luma).
  • the brightness information and the components of the color information have an 8-bit representation.
  • the luma information may have a value within the range from 0 to 255, the value 0 identifying the absence of light.
  • each of the chrominance components may have a value within the range from -128 to 127.
  • a value of 128 is added to the chrominance components, which may be positive or negative, so that these components assume only positive values.
  • the black color may be hence associated with a luma value equal to 0 and values of the chrominance components equal to 127.
  • the black color may be associated with a luma value ranging within a given interval (indicated in the following lines as "luma interval of black color") around value 0: the minimum value of the interval may be equal to 0 and the maximum value of the interval may be equal to 16.
  • the black color may be associated with values of the chrominance components ranging within a further given interval (indicated in the following lines as "chrominance interval of black color") around value 127: the minimum value of the interval may be equal to 127 and the maximum value of the interval may be equal to 129.
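The per-pixel test implied by the two intervals above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function and constant names are hypothetical, and treating both interval bounds as inclusive is an assumption.

```python
# Intervals quoted in the text; inclusive bounds are an assumption.
LUMA_BLACK = (0, 16)        # "luma interval of black color"
CHROMA_BLACK = (127, 129)   # "chrominance interval of black color"
CHROMA_OFFSET = 128         # offset added to the signed chrominance components


def to_unsigned_chroma(signed_value: int) -> int:
    """Map a signed chrominance component (-128..127) to the 0..255 range."""
    return signed_value + CHROMA_OFFSET


def is_black_pixel(y: int, u: int, v: int) -> bool:
    """True if luma and both (offset) chrominance values lie in the black intervals."""
    in_luma = LUMA_BLACK[0] <= y <= LUMA_BLACK[1]
    in_u = CHROMA_BLACK[0] <= u <= CHROMA_BLACK[1]
    in_v = CHROMA_BLACK[0] <= v <= CHROMA_BLACK[1]
    return in_luma and in_u and in_v
```

Using intervals rather than single exact values tolerates small deviations from the nominal black level, which is the purpose the text gives for defining them.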
  • the audio signal captured by the camcorder 11 may comprise, as known, one or two channels (left channel, L, and right channel, R). It is usually an analog audio signal which may then be digitized.
  • the digital audio signal is acquired by the analysis module 121 as one stream of audio samples for each channel L, R.
  • the samples of a given channel are obtained by sampling the corresponding audio signal at a predetermined frequency that may be equal to 48 kHz and quantizing the value of each sample.
  • each sample may have a value within a discrete range of signed integers. Silence is represented by the central value of the numeric range used for quantizing the audio signal. For instance, silence may be represented by a value nearly equal to 0.
  • an audio frame of a channel preferably comprises a number of audio samples of the channel spread over a given period which corresponds to the temporal length of the video frame that is defined by the video capture frequency.
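As a worked example of the relationship between audio frames and video frames, the number of samples in one audio frame follows from the sampling frequency and the frame duration. The sketch below assumes the 48 kHz sampling frequency quoted above and the 40 ms single-frame duration that the description uses for the video frames; the function name is hypothetical.

```python
def samples_per_audio_frame(sample_rate_hz: int, frame_duration_ms: int) -> int:
    """Number of audio samples of one channel spanning one video frame period."""
    return sample_rate_hz * frame_duration_ms // 1000


# 48 kHz audio and a 40 ms video frame give 1920 samples per audio frame.
n_samples = samples_per_audio_frame(48_000, 40)
```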
  • An audio frame comprising audio samples with values indicating silence will be indicated in the following description as a "null audio frame".
  • an audio sample indicates silence when its value is comprised within a null audio interval of values indicating silence, which may range between -16 and 16.
  • a single audio channel, for instance the left one, is analysed by the analysis module 121.
  • the digital video signal and the digital audio signal are independently analysed at the analysis module 121, for detecting a change in the operating state of the capturing device and triggering the processing of the captured multimedia content accordingly.
  • the change in the operating state of the capturing device is determined by detecting transitions from the absence of a useful video/audio signal to the presence of the useful video/audio signal in each stream, and vice versa.
  • the expression "useful video/audio signal" will indicate the portion of the video/audio stream containing the captured multimedia content.
  • the absence of the useful video signal means that black frames are detected for a given interval of time within the stream of video frames.
  • the absence of the useful audio signal means that null audio frames are detected for a given interval of time within the stream of audio frames. Detection of a transition involving either the digital video signal or the digital audio signal triggers the activation or deactivation of the signal processing at the processing module 122, as will be described in detail hereinafter.
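The statement that a transition on either stream triggers activation or deactivation can be illustrated with a small dispatcher. This is only a sketch: the class name, the callback interface and the choice to ignore a command that would not change the current state are assumptions, not details given in the text.

```python
from typing import Callable


class ProcessingSwitch:
    """Hypothetical helper turning stream transitions into on/off commands."""

    def __init__(self, switch_on: Callable[[], None], switch_off: Callable[[], None]):
        self._switch_on = switch_on
        self._switch_off = switch_off
        self.active = False  # assumed initial state: processing module inactive

    def handle_transition(self, useful_signal_present: bool) -> None:
        """Called when either the video or the audio analysis reports a transition."""
        if useful_signal_present and not self.active:
            self._switch_on()
            self.active = True
        elif not useful_signal_present and self.active:
            self._switch_off()
            self.active = False


log = []
switch = ProcessingSwitch(lambda: log.append("on"), lambda: log.append("off"))
switch.handle_transition(True)    # e.g. reported by the video analysis
switch.handle_transition(True)    # e.g. reported by the audio analysis, ignored
switch.handle_transition(False)   # e.g. reported by either analysis
```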
  • the stream of video frames of the digital video signal and the stream of audio frames of the digital audio signal are preferably analysed in parallel by the analysis module 121.
  • the analysis of the stream of video frames of the digital video signal will be described first in the following lines.
  • a check is performed by the analysis module 121 for determining whether the video frame is a black frame (step 203).
  • the generic video frame is indicated in the flow chart of Figure 2 as the i-th video frame, wherein i is an integer indexing number indicating the position of the video frame within the stream of video frames at the analysis module 121.
  • the analysis module 121 preferably checks the value of the luma and the values of the chrominance components of a selected set of pixels within the video frame (step 203).
  • the set of pixels to be analysed may be selected as follows. Reference is made to Figure 3, where the generic i-th video frame is represented as subdivided into pixels. According to an embodiment of the present invention, the video frame is split into four sections of equal area by a horizontal axis X and a vertical axis Y. The four sections will be referred to as quadrants.
  • the set of pixels which are analysed in the analysis module 121 at step 203 comprises the pixels of the i-th video frame which are positioned on the horizontal axis X and the vertical axis Y that divide the video frame into quadrants. Moreover, the set of pixels to be analysed further comprises a number of pixels for each quadrant, which are positioned at the center of each quadrant. This number may be four. In Figure 3, the pixels belonging to the selected set of pixels to be analysed by the analysis module 121 are represented as grey pixels.
  • other sets of pixels may be chosen for performing the check of step 203.
  • Two other exemplary sets of pixels that may be analysed at step 203 are the grey pixels in Figures 4a and 4b.
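One possible way to build the set of pixels described for Figure 3 is sketched below. The function name and the coordinate convention (column, row) are hypothetical, and the sketch takes a single pixel at the center of each quadrant, which is one reading of the text; an implementation could equally take a small group of pixels around each center.

```python
from typing import Set, Tuple


def select_pixels(width: int, height: int) -> Set[Tuple[int, int]]:
    """Return (column, row) coordinates on the two axes plus the quadrant centers."""
    cx, cy = width // 2, height // 2
    selected: Set[Tuple[int, int]] = set()
    # Pixels on the horizontal axis X (the middle row).
    for col in range(width):
        selected.add((col, cy))
    # Pixels on the vertical axis Y (the middle column).
    for row in range(height):
        selected.add((cx, row))
    # Assumption: one pixel at the center of each of the four quadrants.
    for col in (width // 4, (3 * width) // 4):
        for row in (height // 4, (3 * height) // 4):
            selected.add((col, row))
    return selected
```

Checking only this subset rather than every pixel keeps the per-frame cost proportional to width plus height instead of width times height.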
  • the check is performed as follows: for each pixel of the selected set, the value of the luma and the values of the chrominance components are checked.
  • the analysis module 121 preferably checks whether the luma value of each selected pixel is comprised within the luma interval of black color described above, for instance between 0 and 16.
  • the analysis module 121 preferably checks whether the value of each chrominance component of each selected pixel is comprised within the chrominance interval of black color described above, for instance between 127 and 129.
  • the analysis module 121 preferably determines that the i-th video frame is not a black frame. If, for each pixel of the selected set, the luma value is comprised within the luma interval of black color and the values of the chrominance components are comprised within the chrominance interval of black color, the analysis module 121 preferably determines that the i-th video frame is a black frame.
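The frame-level check of step 203 can be sketched as below. The frame layout (`frame[row][col]` holding a `(Y, U, V)` tuple), the function name and the inclusive bounds are assumptions made for illustration. The sketch also assumes that a single selected pixel outside the intervals is enough to classify the frame as not black.

```python
from typing import Iterable, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (luma, U, V) with offset chrominance values


def is_black_frame(
    frame: Sequence[Sequence[Pixel]],
    selected: Iterable[Tuple[int, int]],
    luma_max: int = 16,
    chroma_min: int = 127,
    chroma_max: int = 129,
) -> bool:
    """True if every selected (col, row) pixel lies in the black intervals."""
    for col, row in selected:
        y, u, v = frame[row][col]
        if not (0 <= y <= luma_max):
            return False
        if not (chroma_min <= u <= chroma_max and chroma_min <= v <= chroma_max):
            return False
    return True


# A 4x4 all-black frame, then the same frame with one bright pixel.
black = [[(0, 128, 128) for _ in range(4)] for _ in range(4)]
bright = [list(row) for row in black]
bright[2][1] = (200, 128, 128)
coords = [(1, 2), (0, 0), (3, 3)]
```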
  • if the i-th video frame is not a black frame, the analysis module 121 increments a useful video frame time interval Tuv (step 204) by a given amount tdu, according to the following equation:
  • $\mathrm{Tuv}(i) = \mathrm{Tuv}(i-1) + \mathrm{tdu}$ [2] wherein Tuv(i) is the useful video frame time interval for the i-th video frame, Tuv(i-1) is the useful video frame time interval in correspondence of the preceding video frame and tdu is the amount of the increment.
  • the increment tdu may be equal to 40 ms, which corresponds to the duration of a single video frame in case the frequency of acquisition of the video signal is 50 Hz (25 frames per second).
  • the expression "useful video frame" will indicate a video frame in which the pixels of the selected set have luma and chrominance values outside the luma and chrominance intervals of black color.
  • the analysis module 121 preferably checks whether the useful video frame time interval Tuv(i) computed at step 204 for the current i-th video frame is greater than a first threshold time interval Thv_on.
  • the first threshold time interval Thv_on may be equal to 250 ms.
  • if the threshold is exceeded, the analysis module 121 preferably determines that, in correspondence of the i-th video frame, a transition has occurred from a condition of absence of a useful video signal to a condition of presence of the useful video signal. In this case, at step 208, the analysis module 121 preferably generates and sends to the processing module 122 a command in order to switch on the processing module 122.
  • if the analysis module 121 determines that the useful video frame time interval Tuv(i) computed at step 206 is lower than the first threshold time interval Thv_on (step 207), the analysis module 121 preferably acquires the next video frame (steps 208 and 202) and repeats step 205.
  • the analysis module 121 determines that a transition has occurred from a condition of absence of the useful video signal to a condition of presence of the useful video signal only when the analysis module 121 acquires a given number of consecutive useful video frames, this number being determined by the frequency of acquisition of the video signal by the camcorder 11 and the first threshold time interval Thv_on. This situation arises when the operator switches on the camcorder 11 and starts capturing a video signal, so that the analysis module 121 starts acquiring a sequence of useful video frames.
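As a worked example, the number of consecutive useful video frames needed before the transition is declared follows from the increment and the threshold. The sketch below uses the 40 ms increment and 250 ms threshold quoted in the text and relies on the check being a strict "greater than" comparison; the function name is hypothetical.

```python
def frames_to_exceed(threshold_ms: int, frame_duration_ms: int) -> int:
    """Smallest n such that n * frame_duration_ms is strictly greater than threshold_ms."""
    return threshold_ms // frame_duration_ms + 1


# With 40 ms per frame and a 250 ms threshold: 6 frames give 240 ms (not enough),
# 7 frames give 280 ms, so the transition is declared at the 7th consecutive frame.
needed = frames_to_exceed(250, 40)
```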
  • the analysis module 121 may acquire a number of black frames (due, for instance, to the presence of an acquisition card in the processing device) before acquiring the useful video frames of the captured multimedia content and hence a transition may be detected when a 250 ms sequence of useful video frames is analysed at the analysis module 121 after the last black frame.
  • if the analysis module 121 determines that the current video frame is a black frame, it preferably increments a black frame time interval Tb (step 209) by a given amount tdb, according to the following equation:
  • $\mathrm{Tb}(i) = \mathrm{Tb}(i-1) + \mathrm{tdb}$ [2] wherein Tb(i) is the black frame time interval for the i-th video frame, Tb(i-1) is the black frame time interval in correspondence of the preceding video frame and tdb is the amount of the increment.
  • the increment tdb may be equal to 40 ms, which corresponds to the duration of a single video frame in case the frequency of acquisition of the video signal is 50 Hz (25 frames per second).
  • the analysis module 121 preferably checks whether the black frame time interval Tb(i) computed at step 209 for the current i-th video frame is greater than a second threshold time interval Thv_off.
  • the second threshold time interval Thv_off may have the same value as the first threshold time interval Thv_on, and it may be equal to e.g. 250 ms.
  • if the threshold is exceeded, the analysis module 121 preferably determines that, in correspondence of the i-th video frame, a transition has occurred from a condition of presence of the useful video signal to a condition of absence of the useful video signal.
  • in this case, the analysis module 121 preferably generates and sends to the processing module 122 a command in order to switch off the processing module 122.
  • if the analysis module 121 determines that the black frame time interval Tb(i) computed at step 209 is lower than the second threshold time interval Thv_off, the analysis module 121 preferably acquires the next video frame (steps 208 and 202) and repeats step 203.
  • the analysis module 121 determines that a transition has occurred from a condition of presence of the useful video signal to a condition of absence of the useful video signal only when it has acquired a given number of consecutive black video frames, the number being determined by the frequency of acquisition of the video signal by the camcorder 11 and the second threshold time interval Thv_off. This situation arises when the operator switches off the camcorder 11 and stops capturing a video signal.
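The video branch described above behaves like a detector with two timers and two thresholds. The following is a minimal sketch under stated assumptions: the class and attribute names are hypothetical, each timer is reset when a frame of the opposite kind arrives (the text only says that the frames must be consecutive), and a command is emitted only when the detected state actually changes.

```python
from typing import Optional


class TransitionDetector:
    """Hypothetical two-threshold detector for one stream of frames."""

    def __init__(self, frame_ms: int = 40, th_on_ms: int = 250, th_off_ms: int = 250):
        self.frame_ms = frame_ms      # increment per frame (tdu and tdb in the text)
        self.th_on_ms = th_on_ms      # first threshold time interval
        self.th_off_ms = th_off_ms    # second threshold time interval
        self.t_useful = 0             # useful frame time interval (Tuv)
        self.t_black = 0              # black frame time interval (Tb)
        self.present = False          # assumed start: no useful signal

    def update(self, is_black: bool) -> Optional[str]:
        """Feed one frame classification; return "on", "off" or None."""
        if is_black:
            self.t_useful = 0         # assumption: a black frame breaks the useful run
            self.t_black += self.frame_ms
            if self.present and self.t_black > self.th_off_ms:
                self.present = False
                return "off"
        else:
            self.t_black = 0          # assumption: a useful frame breaks the black run
            self.t_useful += self.frame_ms
            if not self.present and self.t_useful > self.th_on_ms:
                self.present = True
                return "on"
        return None


detector = TransitionDetector()
# 3 black frames, then 7 useful frames, then 7 black frames.
events = [detector.update(b) for b in [True] * 3 + [False] * 7 + [True] * 7]
```

Requiring a run longer than the threshold before switching acts as a debounce, so isolated black frames or brief dark shots do not toggle the processing module.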
  • the analysis of the digital audio signal by the analysis module 121 proceeds in parallel with the analysis of the digital video signal.
  • a check is performed by the analysis module 121 for determining whether the audio frame is a null audio frame (step 213).
  • the generic audio frame is indicated in the flow chart of Figure 2 as the i-th audio frame, wherein i is the index already used for the video frames and indicating the position of the audio frame within the stream of audio frames at the input of the analysis module 121.
  • the analysis module 121 preferably checks the values of the audio samples within the audio frame (step 213). In particular, at step 213, the analysis module 121 preferably checks whether the value of each audio sample is comprised within the null audio interval of values indicating silence, for instance between -16 and 16. If, for each audio sample of the current audio frame, the value of the audio sample is not comprised within the null audio interval of values indicating silence, the analysis module 121 preferably determines that the i-th audio frame is not a null audio frame. If, for each audio sample of the current audio frame, the value of the audio sample is comprised within the null audio interval of values indicating silence, the analysis module 121 preferably determines that the i-th audio frame is a null audio frame.
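The null audio frame check of step 213 can be sketched as follows. The function name and the inclusive bounds are assumptions, and the sketch treats a frame as not null as soon as one sample falls outside the interval, which is one way of handling frames where only some samples are outside it.

```python
from typing import Iterable


def is_null_audio_frame(samples: Iterable[int], low: int = -16, high: int = 16) -> bool:
    """True if every sample of the frame lies in the null audio interval."""
    return all(low <= s <= high for s in samples)


quiet = [0, 3, -16, 16, -2]       # all samples within [-16, 16]
speech = [0, 3, 540, -1200, 16]   # some samples well outside the interval
```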
  • if the current audio frame is not a null audio frame, the analysis module 121 increments a useful audio frame time interval Tua (step 214) by the given amount tdu, according to the following equation:
  • Tua(i) = Tua(i-1) + tdu [3]
  • wherein Tua(i) is the useful audio frame time interval for the i-th audio frame, Tua(i-1) is the useful audio frame time interval in correspondence of the preceding audio frame and tdu is the amount of the increment, which is preferably the same amount used to increment the useful video frame time interval according to equation [2].
  • the expression "useful audio frame" will indicate an audio frame whose audio samples have values outside the null audio interval.
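  • as a minimal sketch of the check of step 213, the classification of an audio frame could look as follows in Python. The function name, the constant names and the inclusive bounds are illustrative assumptions; the text only gives the example interval between -16 and 16. The text does not say how a frame containing both silent and non-silent samples is classified; this sketch treats such a frame as not null.

```python
# Example bounds of the null audio interval taken from the text;
# treating them as inclusive is an assumption of this sketch.
NULL_AUDIO_MIN = -16
NULL_AUDIO_MAX = 16


def is_null_audio_frame(samples, low=NULL_AUDIO_MIN, high=NULL_AUDIO_MAX):
    """Return True when every sample of the frame lies within [low, high].

    Assumption: a frame with at least one sample outside the interval is
    treated as not null. An empty frame is classified as null.
    """
    return all(low <= sample <= high for sample in samples)
```

  • for instance, `is_null_audio_frame([0, -3, 16])` evaluates to True, while `is_null_audio_frame([0, 1200])` evaluates to False.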
  • the analysis module 121 preferably checks whether the useful audio frame time interval Tua(i) computed at step 214 for the current i-th audio frame is greater than a third threshold time interval Tha_on.
  • the third threshold time interval Tha_on may be equal to the first threshold time interval Thv_on (e.g. equal to 250 ms).
  • if so, the analysis module 121 preferably determines that, in correspondence of the i-th audio frame, a transition has occurred from a condition of absence of the audio signal to a condition of presence of the audio signal.
  • the analysis module 121 preferably generates and sends to the processing module 122 a command in order to switch on the processing module 122.
  • if the analysis module 121 determines that the useful audio frame time interval Tua(i) computed at step 214 is lower than the third threshold time interval Tha_on (step 215), the analysis module 121 preferably acquires the next audio frame (steps 216 and 212) and repeats step 213.
  • the analysis module 121 determines that a transition has occurred from a condition of absence of a useful audio signal to a condition of presence of the useful audio signal only when it has acquired a given number of consecutive useful audio frames, the number being determined by the third threshold time interval Tha_on and by the frequency of acquisition of the audio signal by the camcorder 11, which, as described above, is assumed to be equal to the frequency of acquisition of the video signal. This situation arises when the operator switches on the camcorder 11 and starts capturing an audio signal.
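  • the "given number of consecutive useful audio frames" can be worked out from the increment and the threshold. The following Python sketch assumes that the interval grows by a fixed increment per frame and that the check is a strict "greater than" comparison, as in the description of step 215; the function name is hypothetical.

```python
def frames_needed(threshold_ms, increment_ms):
    """Smallest number n of consecutive frames with n * increment_ms > threshold_ms.

    Both arguments are positive integers expressed in milliseconds.
    """
    return threshold_ms // increment_ms + 1


# Example values from the text: 40 ms per frame, 250 ms threshold.
n = frames_needed(250, 40)
elapsed_ms = n * 40
```

  • with these example values, six frames accumulate 240 ms, which does not exceed 250 ms, so the transition is detected at the seventh frame, after 280 ms.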
  • if the analysis module 121 determines that the current audio frame is a null audio frame, it preferably increments a null audio frame time interval Tn (step 217) by a given amount tdn, according to the following equation:
  • Tn(i) = Tn(i-1) + tdn [4]
  • wherein Tn(i) is the null audio frame time interval for the i-th audio frame, Tn(i-1) is the null audio frame time interval in correspondence of the preceding audio frame and tdn is the amount of the increment.
  • the increment tdn may be equal to 40 ms, which corresponds to the duration of a single video frame in case the frequency of acquisition of the video signal is 50 Hz (25 frames per second).
  • the analysis module 121 preferably checks whether the null audio frame time interval Tn(i) computed at step 217 for the current i-th audio frame is greater than a fourth threshold time interval Tha_off.
  • the fourth threshold time interval Tha_off may have the same value as the third threshold time interval Tha_on, for instance 250 ms.
  • if so, the analysis module 121 preferably determines that, in correspondence of the i-th audio frame, a transition has occurred from a condition of presence of the useful audio signal to a condition of absence of the useful audio signal.
  • the analysis module 121 preferably generates and sends to the processing module 122 a command in order to switch off the processing module 122.
  • if the analysis module 121 determines that the null audio frame time interval Tn(i) computed at step 217 is lower than the fourth threshold time interval Tha_off, the analysis module 121 preferably acquires the next audio frame (steps 216 and 212) and repeats step 213.
  • the analysis module 121 determines that a transition has occurred from a condition of presence of the useful audio signal to a condition of absence of the useful audio signal only when it has acquired a given number of consecutive null audio frames, the number being determined by the frequency of acquisition of the audio signal by the camcorder 11 and by the fourth threshold time interval Tha_off. This situation arises when the operator switches off the camcorder 11 and stops capturing an audio signal.
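  • the audio branch described above (steps 213 to 219) can be sketched in Python as a small state machine. This is an illustration under stated assumptions, not the claimed implementation: the class and attribute names are hypothetical, each counter is reset when a frame of the opposite kind arrives (inferred from the word "consecutive"), and a command is returned only at the frame where the transition is detected, whereas the text describes the command being generated repeatedly and ignored downstream.

```python
class AudioTransitionDetector:
    """Tracks Tua and Tn and reports on/off transitions of the audio signal."""

    def __init__(self, tdu_ms=40, tdn_ms=40, tha_on_ms=250, tha_off_ms=250):
        self.tdu_ms = tdu_ms          # increment for useful frames (tdu)
        self.tdn_ms = tdn_ms          # increment for null frames (tdn)
        self.tha_on_ms = tha_on_ms    # third threshold time interval (Tha_on)
        self.tha_off_ms = tha_off_ms  # fourth threshold time interval (Tha_off)
        self.tua_ms = 0               # useful audio frame time interval (Tua)
        self.tn_ms = 0                # null audio frame time interval (Tn)
        self.present = False          # current belief: useful audio present?

    def update(self, frame_is_null):
        """Process one frame; return 'on', 'off' or None."""
        if frame_is_null:
            self.tua_ms = 0  # assumption: a null frame breaks the useful run
            self.tn_ms += self.tdn_ms                      # equation [4]
            if self.present and self.tn_ms > self.tha_off_ms:
                self.present = False
                return "off"
        else:
            self.tn_ms = 0   # assumption: a useful frame breaks the null run
            self.tua_ms += self.tdu_ms                     # equation [3]
            if not self.present and self.tua_ms > self.tha_on_ms:
                self.present = True
                return "on"
        return None
```

  • with the example values of 40 ms and 250 ms, feeding seven useful frames in a row yields "on" at the seventh frame, and seven null frames in a row afterwards yield "off" at the seventh null frame.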
  • when the processing module 122 is activated (step 206), it preferably processes the digital video signal and the digital audio signal and transmits them over the communication network 13 towards the remote station 14.
  • processing by the processing module 122 may comprise encoding the digital video signal according to the H.264/MPEG-4 AVC encoding technique and encoding the digital audio signal according to the AAC encoding technique.
  • alternatively, the H.265 (HEVC) encoding technique may be used.
  • the camcorder 11 may continue capturing a useful video signal and/or a useful audio signal until the operator switches off the camcorder 11. Therefore, the analysis module 121 cyclically acquires a video/audio frame (step 220) and repeats steps 202-211 and 212-219. At step 203 or 213 the check is repeatedly negative and the useful video frame time interval (and/or the useful audio frame time interval) is incremented on a frame-by-frame basis. The check at step 205 is hence repeatedly positive. In this situation, the analysis module 121 at step 206 continues generating the command above, which is however ignored by the processing module 122.
  • the analysis module 121 cyclically acquires a video/audio frame (step 220) and repeats steps 202-211 and 212-219.
  • as long as the check at step 203 or 213 continues to be positive, as well as the check at step 210 or 218, respectively, the analysis module 121 at step 211 or 219 continues generating the command to switch off the processing module 122, which is however ignored by the processing module 122. Otherwise, if, for instance, the digital video signal again comprises useful video frames, the processing module 122 is switched on again.
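  • the behaviour whereby repeated commands are ignored can be illustrated with a short Python sketch. The class below is a hypothetical stand-in for the processing module 122; its name, its method and the string commands "on" and "off" are assumptions made for illustration only.

```python
class ProcessingModule:
    """Hypothetical stand-in for the processing module 122."""

    def __init__(self):
        self.active = False
        self.state_changes = 0  # counts only commands that changed the state

    def handle(self, command):
        """Apply an 'on' or 'off' command; return True if the state changed."""
        if command not in ("on", "off"):
            raise ValueError(f"unknown command: {command!r}")
        wanted = command == "on"
        if wanted == self.active:
            return False  # repeated command: ignored
        self.active = wanted
        self.state_changes += 1
        return True


module = ProcessingModule()
# The analysis module keeps sending the same command on every frame.
results = [module.handle(c) for c in ["on", "on", "on", "off", "off", "on"]]
```

  • in this run only the first "on", the first "off" and the final "on" change the state; the other commands leave the module untouched.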
  • the check at step 205 is a check involving an "OR" logical operation.
  • in this case, the processing module 122 is switched on and off by detecting a transition within any one of the digital video signal and the digital audio signal.
  • other logical operations may be considered to perform the check of step 205, for instance, an AND operation.
  • with an AND operation, the check at step 205 is positive only when a transition is detected within both the digital video signal and the digital audio signal.
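  • the two variants of the check at step 205 can be compared with a minimal Python sketch. The function name and the `mode` parameter are hypothetical; the sketch only assumes that each branch reports, as a boolean, whether it has detected a transition.

```python
def step_205_check(video_transition, audio_transition, mode="or"):
    """Combine the per-signal results into the outcome of the check.

    mode="or":  positive when either signal shows a transition.
    mode="and": positive only when both signals show a transition.
    """
    if mode == "or":
        return video_transition or audio_transition
    if mode == "and":
        return video_transition and audio_transition
    raise ValueError(f"unsupported mode: {mode!r}")
```

  • for a transition detected in the video signal only, `step_205_check(True, False)` is True, while `step_205_check(True, False, mode="and")` is False.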
  • the method of the present invention applies also to situations in which only one of a digital video signal and a digital audio signal is available. In this case, steps 201-211 and 220 are performed for analyzing a digital video signal or, alternatively, steps 201 and 212-220 are performed for analyzing a digital audio signal.
  • the analysis module 121 analyses the digital video signal and the digital audio signal and, according to an outcome of the analysis, it provides for switching on the processing module 122, which is in charge of transmitting the digital video signal and the digital audio signal over the network.
  • the analysis module 121 which continues analysing the digital video signal and the digital audio signal, provides for switching off the processing module 122.
  • every time the human operator switches on the camcorder 11, the analysis module 121 activates the processing module 122, and every time the human operator switches off the camcorder 11, the analysis module 121 deactivates the processing module 122.
  • the analysis module 121 is capable of detecting the presence of a useful signal by determining whether a transition occurred from absence of signal to presence of signal and vice versa, for the digital video signal and/or the digital audio signal. This gives the human operator the possibility to voluntarily trigger the start and stop of the operations of processing and transmitting the multimedia content captured by the camcorder by merely using the camcorder in its normal mode of operation, in particular by merely changing the operating state of the camcorder.
  • the human operator may connect the processing device to the camcorder, switch on the processing device and put it in a backpack, and then he/she can decide to start and stop the processing and the transmission of the audiovisual signal captured by the camcorder simply by switching on and off the camcorder itself.
  • the operator may switch on the processing device, put it in a backpack and connect it to the camcorder before switching on the camcorder.
  • the application of the method of the present invention is independent of "how" and "when" the camcorder is connected to the processing device. Indeed, switching on and off the camcorder when it is connected to the processing device and connecting and disconnecting the camcorder when it is switched on are, according to the present invention, equivalent actions that determine the presence or absence of an incoming audiovisual signal into the processing device.
  • the method of the present invention may be implemented without altering the capturing device or its manner of use.
  • the method of the present invention may be applied to standard capturing devices as it does not require a specific control tool to be activated for starting and stopping the processing of the multimedia content.
  • the method of the present invention does not require specific actions to be performed by the human operator either on the capturing device or on the processing device to start and stop processing the multimedia content. Thanks to the present invention, the human operator may completely focus on his/her main activity, i.e. using the camcorder to capture the multimedia content, without the need to manage the processing and transmission of the captured content. This avoids any delay in transmitting the captured multimedia content.
  • the method of the present invention allows controlling the processing of multimedia content captured by a capturing device in all those cases in which the capturing device and the processing device (or, at least, the processing module responsible for the processing of the multimedia content) are located in remote positions or in those cases in which the processing module is not easily accessible.
  • the control may be advantageously triggered only by switching on and off the capturing device.
  • the change of the operating state of the camcorder 11 is triggered by the switching on and off of the camcorder 11 itself.
  • the method of the present invention may be similarly applied to other situations in which the change of the operating state of the capturing device 11 is determined by other actions/events.
  • the start and stop of the processing of the multimedia content may be triggered by the absence/presence of the lens cover on the camcorder, which actually determines a change in the operating state of the camcorder.
  • the method of the present invention provides for detecting a transition from absence of a useful video signal to presence of a useful video signal when the human operator removes the lens cover from the camcorder, and, conversely, it provides for detecting a transition from presence of a useful video signal to absence of the useful video signal when the human operator puts the lens cover on the camcorder.
  • the method of the present invention does not require altering the capturing device or its manner of use, and does not require specific actions to be performed by the human operator (other than those typically performed by the operator for using the camcorder) in order to start and stop processing (and transmitting) the captured multimedia content.

Abstract

The invention relates to a method for controlling the processing of multimedia content captured by a capturing device, the multimedia content being carried on a signal comprising a video signal or an audio signal. The method comprises: detecting in the signal a transition between absence of a useful signal and presence of the useful signal, the useful signal being a portion of the signal comprising the multimedia content; and determining a change in the operating state of the capturing device and controlling the processing of the multimedia content on the basis of the detected transition.
PCT/EP2015/081414 2015-12-30 2015-12-30 Control of processing of multimedia content WO2017114573A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2015/081414 WO2017114573A1 (fr) 2015-12-30 2015-12-30 Control of processing of multimedia content

Publications (1)

Publication Number Publication Date
WO2017114573A1 (fr) 2017-07-06

Family

ID=55083402

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/081414 WO2017114573A1 (fr) Control of processing of multimedia content 2015-12-30 2015-12-30

Country Status (1)

Country Link
WO (1) WO2017114573A1 (fr)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2243969A (en) * 1990-05-11 1991-11-13 British Broadcasting Corp Electronic clapperboard for television sound-vision synchronisation
WO2000007367A2 (fr) * 1998-07-28 2000-02-10 Koninklijke Philips Electronics N.V. Apparatus and method for locating a commercial within a video data stream
US20020030738A1 (en) * 2000-06-01 2002-03-14 Moreinis Joseph Daniel Web based monitoring system
US20040095377A1 (en) * 2002-11-18 2004-05-20 Iris Technologies, Inc. Video information analyzer
US20060230414A1 (en) * 2005-04-07 2006-10-12 Tong Zhang System and method for automatic detection of the end of a video stream
US20080313686A1 (en) 2007-06-13 2008-12-18 Matvey Thomas R Handheld camcorder accessory with pre-programmed wireless internet access for simplified webcasting and handheld camcorder with built-in pre-programmed wireless internet access for simplified webcasting and method of commercially supplying and supporting same
US20120095579A1 (en) * 2006-02-28 2012-04-19 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Data management of an audio data stream
CN102006499B (zh) * 2010-12-10 2012-10-24 北京中科大洋科技发展股份有限公司 Method for detecting the video and audio quality of digital television program files
US20150229980A1 (en) * 2014-02-11 2015-08-13 Disney Enterprises, Inc. Method and system for detecting commercial breaks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15823176

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15823176

Country of ref document: EP

Kind code of ref document: A1