US6618739B1 - Digital filter implementation suitable for execution, together with application code, on a same processor - Google Patents

Publication number
US6618739B1
Authority
US
United States
Prior art keywords
general purpose
processor
registers
vector data
accumulate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US09/790,281
Inventor
Mark Gonikberg
Haixiang Liang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Avago Technologies International Sales Pte Ltd
Original Assignee
AltoCom Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AltoCom Inc
Priority to US09/790,281 (US6618739B1)
Priority to US10/651,922 (US7398288B2)
Application granted
Publication of US6618739B1
Assigned to BROADCOM CORPORATION (merger: certificate of ownership evidencing merger of AltoCom, Inc. into Broadcom Corporation). Assignors: ALTOCOM, INC.
Assigned to BANK OF AMERICA, N.A., AS COLLATERAL AGENT (patent security agreement). Assignors: BROADCOM CORPORATION
Anticipated expiration
Assigned to AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD. (assignment of assignors interest; see document for details). Assignors: BROADCOM CORPORATION
Assigned to BROADCOM CORPORATION (termination and release of security interest in patents). Assignors: BANK OF AMERICA, N.A., AS COLLATERAL AGENT
Expired - Fee Related (current legal status)

Classifications

    • H: ELECTRICITY
    • H03: ELECTRONIC CIRCUITRY
    • H03H: IMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00: Networks using digital techniques
    • H03H17/02: Frequency selective networks
    • H03H17/06: Non-recursive filters

Definitions

  • This invention relates to software implementations of discrete time filters, and in particular to software implementations of a Finite Impulse Response (FIR) filter on a general purpose processor.
  • FIR Finite Impulse Response
  • DSP Digital Signal Processor
  • Such a DSP instruction is executed to perform a multiply-accumulate operation and to shift the delay line in a single cycle (assuming the delay line is entirely in zero-wait state memory or on-chip).
  • h[N] is the N-tap filter coefficient vector and x[K] is an input signal vector.
  • an efficient implementation of an FIR in accordance with the present invention allows a single general purpose processor (e.g., any of a variety of processors including MIPS R3000, R4000, and R5000 processors, processors conforming to the Sparc, PowerPC, Alpha, PA-RISC, or x86 processor architectures, etc.) to execute instructions encoded in a machine readable media to provide not only application-level functionality, but also the underlying signal processing functionality and digital filter structures for a communications device implementation.
  • multiprocessor embodiments i.e., embodiments including multiple general-purpose processors
  • an FIR filter implementation on a general purpose processor provides digital filter structures for a software implementation of a V.34 modem without use of a DSP.
  • a general purpose processor provides an instruction set architecture for loading data to and storing data from general purpose registers, for performing logical and scalar arithmetic operations on such data, and providing instruction sequence control.
  • Application programs, as well as operating systems and device drivers, are typically executed on such a general purpose processor.
  • a digital signal processor is optimized for vector operations on vector data, typically residing in large memory arrays or special purpose register blocks, and is not well suited to the demands of application programs or operating system implementations. Instead, a digital signal processor typically provides a vector multiply-accumulate operation which exploits highly-optimized vector addressing facilities.
  • a general purpose processor provides neither a vector multiply-accumulate operation nor vector addressing facilities necessary for computing the n-th output element, y_n, and shifting through vector data in a single cycle.
  • an N-tap filter implemented in a straightforward manner for execution on a general purpose processor computes each output vector element using 2N reads from memory to processor registers, N multiply-accumulates, and one write to memory.
  • To calculate K elements, such an N-tap filter implementation makes K(2N+1) memory accesses and KN multiply-accumulates. For each multiply-accumulate, more than two memory accesses are required.
  • Finite Impulse Response (FIR) filter can be implemented in software on a general purpose processor in a manner which reduces the number of memory accesses.
  • an efficient implementation for a general purpose processor having a substantial number of registers includes inner and outer loop code which together make $K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N+\frac{L_2}{L_1}+1\right]$ memory accesses
  • L1 is the number of output vector elements computed during each pass through the outer loop and where L2 is the number of taps per output vector element computed during each pass through the inner loop.
  • FIG. 1 is a flow chart of an implementation of a Finite Impulse Response (FIR) filter, in accordance with an exemplary embodiment of the present invention, for execution on a processor.
  • FIG. 2 is a data flow diagram for a multiply accumulate step of an implementation of a Finite Impulse Response (FIR) filter for execution on a system including a processor with general purpose registers and a memory, in accordance with an exemplary embodiment of the present invention.
  • FIG. 3 is a functional block diagram depicting functional modules and data flows for a software implementation of a modem incorporating instantiations of a Finite Impulse Response (FIR) filter implemented in accordance with an exemplary embodiment of the present invention.
  • FIG. 4 is a block diagram of an exemplary Personal Digital Assistant (PDA) system embodiment including a general purpose processor, registers, and memory for executing a software implementation of a modem including an implementation of a Finite Impulse Response (FIR) filter in accordance with an exemplary embodiment of the present invention.
  • PDA Personal Digital Assistant
  • An N-tap filter implemented as software for execution on a general purpose processor computes each output vector element using 2N reads from memory to processor registers, N multiply-accumulates, and one write to memory. To calculate K elements, such an N-tap filter implementation includes K(2N+1) memory accesses and KN multiply-accumulates. For each multiply-accumulate, more than two memory accesses are required.
  • the improved software implementation includes an inner loop 120 and an outer loop 110 which together include $K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N+\frac{L_2}{L_1}+1\right]$ memory accesses and KN multiply-accumulates
  • L1 is the number of output vector elements computed during each pass through outer loop 110 and where L2 is the number of taps per output vector element computed during each pass through inner loop 120.
  • the improved software implementation efficiently exploits L1+2L2 general purpose registers and significantly reduces the number of memory accesses performed.
  • for an exemplary embodiment wherein L1=L2=8, inner and outer loop code make $K\left(\frac{N}{4}+2\right)$ memory accesses
  • FIG. 1 depicts an exemplary embodiment of a nested loop implementation, including control flows (bold lines) and data flows (fine lines), of an N-tap filter design for a Finite Impulse Response filter (FIR).
  • Outer loop 110 includes K/L1 iterations to compute K output values of an output signal vector, OUT[K].
  • input registers 140 are loaded with L2 (of K) respective input values of an input signal vector, D[K], from memory (step 111).
  • Output registers 150 store L1 (of K) respective output values of the output signal vector, OUT[K], and are cleared in step 112.
  • Inner loop 120 includes N/L2 iterations to accumulate partial products into output registers 150 storing a subset of output values OUT[(iL1) . . . (iL1+L1−1)] of the output signal vector OUT[K], where i is the loop index variable for outer loop 110.
  • Loop index variable j is checked during each pass through inner loop 120 (illustratively, in step 128).
  • the subset of output values computed by inner loop 120 and accumulated into output registers 150 is stored to memory (step 113) and a subsequent iteration (if any) of outer loop 110 is initiated.
  • Coefficient registers 130 provide storage for L2 (of N) filter coefficients of a filter coefficient vector C[N]. During each iteration of inner loop 120 (in particular, during step 121), coefficient registers 130 are loaded with a subset C[(jL2) . . . (jL2+L2−1)] of the values from the filter coefficient vector, C[N], from memory. Inner loop 120 includes N/L2 iterations to accumulate partial products of filter coefficient values and input signal vector values into a subset of output points OUT[(iL1) . . . (iL1+L1−1)] of the output signal vector OUT[K].
  • Inner loop 120 also includes accumulation steps (e.g., accumulation steps 122, 124, and 126) and input data load steps (e.g., input data load steps 123, 125, and 127). After each accumulate step, processing of a particular element of the input signal vector, D[K], is complete and the register used for storage of that particular element is available for storage of an as-yet unloaded element of the input signal vector.
  • Each input data load step (e.g., input data load step 123, 125, or 127) loads a next successive element of the input signal vector into a corresponding input register location (illustratively, input register D0 141, D1 142, or D_{L2−1} 143) freed up during the prior accumulation step.
  • L2 partial products are accumulated into L1 output registers 150 (i.e., into the L1 output registers OUT0 151, OUT1 152, . . . OUT_{L1−1} 153).
  • FIG. 1 depicts an exemplary N-tap filter implementation 100 where the number of output vector elements computed and input vector elements consumed during each pass through outer loop 110 is L1 and the number of partial products of input vector elements and filter coefficients accumulated during each pass through inner loop 120 is L2.
  • the numbers L1 and L2 are independent, although L1 should be a multiple of L2 and the quantity (L1+2L2) should be less than or equal to the total number of registers allocable to the N-tap filter implementation 100 on a particular processor.
  • L1 and L2 are chosen so that the total number of general purpose registers allocated to storage of a partial input signal vector, a partial filter coefficient vector, and a partial output signal vector approaches the number of available general purpose registers on a general purpose processor.
  • L1 and L2 are preferably chosen so that the total number of general purpose registers allocated to storage of the partial input signal, partial filter coefficient, and partial output signal vectors approaches the number of available general purpose registers in a register set.
  • RISC Reduced Instruction Set Computer
  • FIG. 2 depicts the data flows associated with an accumulation step and an input data load step from an iteration of inner loop 120 .
  • Inner loop 120a code and outer loop 110 code each execute on processor 200, which illustratively includes a general purpose processor with at least 24 general purpose registers 210.
  • a first group (C0 131a, C1 132a, . . . C7 133a) of general purpose registers 210 is allocated to storage of a working set of eight (8) filter coefficient values from filter coefficient vector C[N].
  • a second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210 is allocated to storage of a working set of eight (8) input values from input signal vector D[K].
  • a third group (OUT0 151a, OUT1 152a, . . . OUT7 153a) of general purpose registers 210 is allocated to accumulative storage of partial convolutions for eight (8) output values of output signal vector OUT[K].
  • a third group (OUT0 151a, OUT1 152a, . . . OUT7 153a) of general purpose registers 210 is cleared in step 112 and stored to memory 220 in step 113 (both of outer loop 110).
  • Accumulation step instance 126a convolves the then-present contents of the first group (C0 131a, C1 132a, . . . C7 133a) of general purpose registers 210 with the then-present contents of the second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210.
  • (jL2+L2−1)] is convolved with a partial input signal vector D[(iL1+jL2−1), (iL1+jL2), . . . (iL1+jL2+L2−1)], as follows:
  • Input registers 140 respectively contain elements of the partial input signal vector D[(iL1+jL2−1), (iL1+jL2), . . . (iL1+jL2+L2−1)] where i is the loop index for outer loop 110 and where elements are stored as shown in Table 1.
  • Input data load step instance 127a loads the input register D7 143a with the next successive element, i.e., D[iL1+jL2+7], of input signal vector D[K].
  • second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210 is ready for the next pass through inner loop 120a.
  • software implementation 300 of a V.34 modem includes transmit and receive data paths.
  • the transmit data path includes encoder 320, modulator 330, and pre-emphasis and shaping filter 341.
  • the receive data path includes receive data module 350, decoder 360, demodulator 370, and receive front end module 380.
  • a transmit process 396 invokes an external data handler with data for transmission over line 395.
  • pre-emphasis and shaping filter 341 is implemented using a FIR filter 100 as described above in accordance with FIGS. 1 and 2.
  • echo interpolator 381, preliminary echo canceller 384, main echo canceller 371, and equalizer 373 are also implemented using a FIR filter 100 as described above in accordance with FIGS. 1 and 2.
  • pointers to an input signal vector, D[K], to a coefficient vector, C[N], and an output signal vector, OUT[K], are passed to a function, procedure, or method implementing FIR filter 100 .
  • Each of the submodules implemented using FIR filter 100 (i.e., shaping filter 341 along the transmit data path and echo interpolator 381, preliminary echo canceller 384, main echo canceller 371, and equalizer 373 along the receive data path) is invoked with input data passed from a predecessor in the respective data path and with coefficient data specific to the particular filter implementation. Both the input data and the filter-specific coefficient data are passed via memory 220.
  • Suitable filter coefficient vectors are specific to each of the particular filters and will be appreciated by persons of ordinary skill in the art. Certain filter implementations are adaptive and FIR filter 100 is instantiated or invoked with coefficient vectors which are updated to implement each of the respective adaptive filters. Each of the instantiations or invocations of FIR filter 100 code which implement a particular filter along the transmit or receive data path may independently define L 1 and L 2 values for efficient implementation thereof.
  • transmit process 396 supplies a bit stream to a V.34 implementation of encoder 320 .
  • Encoder 320 converts the input bit stream into a baseband sequence of complex symbols which is used as input to modulator 330 .
  • Encoder 320 performs shell mapping, differential encoding, constellation mapping, precoding and 4D trellis encoding, and nonlinear encoding, all as described in respective sections of ITU-T Recommendation V.34, A Modem Operating at Data Signalling Rates of up to 28 800 bits/s for Use on the General Switched Telephone Network and on Leased Point-to-Point 2-Wire Telephone-Type Circuits, dated September, 1994 (previously CCITT Recommendation V.34), which is hereby incorporated herein, in its entirety, by reference.
  • encoder 320 in accordance with the requirements of ITU-T Recommendation V.34 (hereafter the V.34 recommendation).
  • V.34 recommendation a variety of alternative configurations of encoder 320 suitable to modem implementations in accordance with other communications standards such as V.32, V.32bis, etc.
  • Modulator 330 converts the baseband sequence of complex symbols from the output of the encoder into a passband sequence of real samples.
  • the spectrum of the modulator output is sufficiently white, it can be used as an input to receiver echo cancellers, such as preliminary echo canceller 384 , described below.
  • Shaping and pre-emphasis filter 341 provides square-root-of-raised-cosine shaping as well as pre-emphasis filtering specified by section 5.4 of the V.34 recommendation, which is incorporated herein by reference.
  • Raised cosine complex shaping and pre-emphasis filtering are implemented using FIR filters 100 in accordance with FIGS. 1 and 2.
  • Persons of ordinary skill in the art will appreciate suitable coefficient vector definitions for providing raised cosine complex shaping and pre-emphasis filtering.
  • eleven pre-emphasis characteristics combined with four choices of carrier frequency (relative to symbol rate) result in a total of 44 separate filter definitions.
  • the output of shaping and pre-emphasis filter 341 is an output of the transmitter portion of software implementation 300 of a V.34 modem and is provided to D/A converter 391, typically via an I/O channel and codec. D/A converter 391 couples to transmission line 395.
  • receive front end module 380 receives the output of the A/D converter 392 as an input.
  • A/D converter 392 couples to transmission line 395 .
  • Preliminary echo canceller 384 is implemented as a real data/real coefficients adaptive filter using an FIR filter 100 in accordance with FIGS. 1 and 2. Persons of ordinary skill in the art will appreciate suitable coefficient vector definitions.
  • Preliminary echo canceller 384 receives as an input a white signal from the output of the modulator 330 .
  • Preliminary echo canceller 384 uses a stochastic gradient updating algorithm for adaptation during half duplex of V.34 training and is not updated during data mode.
  • This preliminary stage of echo cancellation is to reduce the echo level relative to the receive signal level so that subsequent stages such as clock recovery, signal detection, and automatic gain control (each not shown) will not be affected by the echo.
  • Final echo signal cancellation is performed by main echo canceller 371 at the output of equalizer 373 .
  • a passband phase-splitting adaptive T/3 equalizer 373 is used for channel equalization.
  • the input to equalizer 373 is the output of receive signal interpolator 389 and has a sampling rate of 3/T, i.e., three samples per symbol interval T, consistent with the T/3 spacing of the equalizer.
  • the output of equalizer 373 is downsampled by 3 to symbol rate.
  • Equalizer 373 is implemented using FIR filters 100 in accordance with FIGS. 1 and 2. Persons of ordinary skill in the art will appreciate suitable initial coefficient vector definitions for providing passband phase-splitting adaptive T/3 equalizer 373 .
  • Passband adaptive T/3 echo canceller 371 is used to subtract residual echo left over from preliminary echo canceller 384 .
  • the echo canceller input is the output signal from modulator 330 synchronized with the remote modem clock. Echo is subtracted at the output of equalizer 373 .
  • Demodulator 370 also converts the passband signal at the output of the equalizer to baseband. Demodulator 370 may optionally contain a phase locked loop to compensate for frequency offset and phase jitter on transmission line 395.
  • Decoder 360 converts the demodulated complex symbols into a bit stream which is supplied to receiver process 397. Transmit process 396 and receiver process 397 may be the same process. Decoder 360 performs nonlinear decoding, linear prediction, trellis decoding, constellation decoding, shell demapping, and data de-framing, all as described in respective sections of the V.34 recommendation, which is incorporated herein by reference. Persons of ordinary skill in the art will recognize a variety of alternative implementations of decoder 360, in accordance with the requirements of the V.34 recommendation. In addition, persons of ordinary skill in the art will recognize a variety of alternative configurations of decoder 360 suitable to modem implementations in accordance with other communications standards such as V.32, V.32bis, etc. Returning to the V.34 embodiment of FIG. 3, decoder 360:
  • deframes data to provide a single bit stream which is then passed (after descrambling) to receiver process 397 .
  • FIG. 4 depicts a Personal Digital Assistant (PDA) 400 incorporating a SoftModem library 410 of software modules (illustratively, V.34 SoftModem modules 300) for execution on a general purpose processor 420.
  • V.34 SoftModem modules 300 are implemented using an FIR filter implementation 100, as described above.
  • Input signal vectors D[K] and filter coefficient vectors C[N] suitable for providing the various FIR filter implementations of interpolators, phase splitting filters, linear predictors, etc. (which have been described above with reference to FIG. 3) are loaded from memory 430 and output signal vectors OUT[K] are stored to memory 430.
  • executable instructions implementing SoftModem library 410 (including FIR filter implementation 100) and suitable for execution on general purpose processor 420 are also stored in, and loaded from, memory 430.
  • general purpose processor 420 includes an R3000 RISC microprocessor, although a wide variety of alternative processor implementations are also suitable.
  • General purpose processor 420 includes general purpose registers 210 which are operated on by the executable instructions of FIR filter implementation 100 and includes a DMA channel 421 for interfacing to telecommunication circuits (illustratively, phone line 490) via codec 470 and Digital-to-Analog/Analog-to-Digital (DAA) converter 460.
  • DAA Digital-to-Analog/Analog-to-Digital
  • memory 430 may include both read/write memory 431 and read only memory 432 and persons of ordinary skill in the art will recognize code portions and data suitable for storage in each.
  • Removable media 480 provides a mechanism for supplying the executable instructions implementing SoftModem library 410 (including FIR filter implementation 100) as well as filter coefficient definitions.
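
The nested-loop structure described in the list above (outer loop 110, inner loop 120, coefficient registers 130, input registers 140, output registers 150) can be illustrated with a minimal Python sketch. It is not the patented code. Several things here are assumptions made for illustration: the function name `fir_blocked`, the restriction to L1 = L2 (as in the exemplary embodiment with eight registers per group), the indexing OUT[n] = Σ C[k]·D[n+k], the requirement that D hold at least K+N elements so the final look-ahead load stays in bounds, and the use of Python lists as stand-ins for general purpose registers. The counter tallies one access per simulated load or store.

```python
def fir_blocked(d, c, k_out, l=8):
    """Blocked FIR sketch with L1 = L2 = l (an assumption of this example).

    Computes out[n] = sum(c[k] * d[n + k]) for n in range(k_out) and
    returns (out, accesses), where accesses counts simulated memory
    loads and stores.
    """
    n_taps = len(c)
    assert k_out % l == 0 and n_taps % l == 0
    assert len(d) >= k_out + n_taps  # room for the last look-ahead load

    out = [0] * k_out
    accesses = 0
    for i in range(k_out // l):                    # outer loop: K/L1 passes
        d_reg = [d[i * l + t] for t in range(l)]   # step 111: load L2 inputs
        accesses += l
        out_reg = [0] * l                          # step 112: clear outputs
        for j in range(n_taps // l):               # inner loop: N/L2 passes
            c_reg = [c[j * l + t] for t in range(l)]  # step 121: load coefficients
            accesses += l
            base = i * l + j * l
            for m in range(l):
                # Accumulation step: L2 partial products into output register m.
                # Register (m + t) % l currently holds d[base + m + t].
                out_reg[m] += sum(c_reg[t] * d_reg[(m + t) % l] for t in range(l))
                # Input data load step: register m is now free, so it
                # receives the next successive input element.
                d_reg[m] = d[base + m + l]
                accesses += 1
        out[i * l:(i + 1) * l] = out_reg           # step 113: store outputs
        accesses += l
    return out, accesses
```

With l = 8 the sketch uses three groups of eight "registers" (24 in total), and the returned access count equals K(N/4+2) for any K and N that are multiples of 8.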

Abstract

A filter is implemented in software on a general purpose processor in a manner which reduces the number of memory accesses as compared to conventional methods. In some realizations, both application code and filter code are executed on a same general purpose processor. The filter code incrementally loads respective portions of input and coefficient vector data from addressable storage into respective registers of the processor and performs successive operations thereupon to accumulate output vector data into other respective registers of the processor. The filter code typically exhibits an execution ratio of less than two input and coefficient data loads per operation to accumulate. In some realizations, the filter code is callable from the application code and provides the application code with a signal processing facility without use of a digital signal processor (DSP).

Description

This application is a continuation of application Ser. No. 09/460,262, filed Dec. 13, 1999, now U.S. Pat. No. 6,209,013, which was itself a continuation of application Ser. No. 08/748,854, filed Nov. 14, 1996, now U.S. Pat. No. 6,018,755. The entirety of each is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to software implementations of discrete time filters, and in particular to software implementations of a Finite Impulse Response (FIR) filter on a general purpose processor.
2. Description of the Relevant Art
Traditional implementations of discrete time filters for signal processing applications have used a custom Digital Signal Processor (DSP) instruction to implement an N-tap filter. Such a DSP instruction is executed to perform a multiply-accumulate operation and to shift the delay line in a single cycle (assuming the delay line is entirely in zero-wait state memory or on-chip). For example, on a TMS320C50 DSP, a finite impulse response (FIR) filter is implemented by successive evaluations of an MACD instruction, each evaluation computing an element, y_n, of the filtered signal vector, i.e., of the output vector, y[K], such that:
$$y_n=\sum_{i=0}^{N-1} h_i\,x_{n-i} \qquad (1)$$
where h[N] is the N-tap filter coefficient vector and x[K] is an input signal vector.
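
Equation (1) can be evaluated directly, one output element at a time. The following Python sketch is only an illustration of the formula, not of any DSP instruction; the function name `fir_direct` and the convention that input samples before the start of the signal are treated as zero are assumptions of this example.

```python
def fir_direct(h, x):
    """Evaluate y[n] = sum_{i=0}^{N-1} h[i] * x[n - i] for each n.

    Samples x[n - i] with n - i < 0 are taken to be zero (an assumption
    of this sketch; the source does not specify initial conditions).
    """
    n_taps = len(h)
    y = []
    for n in range(len(x)):
        acc = 0
        for i in range(n_taps):
            if n - i >= 0:
                acc += h[i] * x[n - i]   # one multiply-accumulate per tap
        y.append(acc)
    return y
```

Feeding a unit impulse through the sketch returns the coefficient vector itself, which is the defining property of a finite impulse response.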
Unfortunately, for many portable device applications such as Personal Digital Assistants (PDAs), portable computers, and cellular phones, power consumption, battery life, and overall mass are important design figures of merit. In addition, very small part counts are desirable for extremely small, low-cost consumer devices. Signal processing capabilities are desirable in many such portable device applications, for example to provide a modem or other communications interface, for speech recognition, etc. However, traditional DSP implementations of such signal processing capabilities create increased power demands, increase part counts, and, because of the power consumption of a discrete DSP, typically require larger, heavier batteries.
SUMMARY OF THE INVENTION
An efficient implementation of a Finite Impulse Response (FIR) filter on a general purpose processor allows a discrete Digital Signal Processor (DSP), together with the cost, size, weight, and power implications thereof, to be eliminated in device configurations (such as communications device configurations) requiring signal processing functionality and digital filter structures. In particular, an efficient implementation of an FIR in accordance with the present invention allows a single general purpose processor (e.g., any of a variety of processors including MIPS R3000, R4000, and R5000 processors, processors conforming to the Sparc, PowerPC, Alpha, PA-RISC, or x86 processor architectures, etc.) to execute instructions encoded in a machine readable media to provide not only application-level functionality, but also the underlying signal processing functionality and digital filter structures for a communications device implementation. Of course, multiprocessor embodiments (i.e., embodiments including multiple general-purpose processors) which similarly eliminate a DSP are also possible. In one embodiment in accordance with the present invention, an FIR filter implementation on a general purpose processor provides digital filter structures for a software implementation of a V.34 modem without use of a DSP.
In general, a general purpose processor provides an instruction set architecture for loading data to and storing data from general purpose registers, for performing logical and scalar arithmetic operations on such data, and providing instruction sequence control. Application programs, as well as operating systems and device drivers, are typically executed on such a general purpose processor. In contrast, a digital signal processor is optimized for vector operations on vector data, typically residing in large memory arrays or special purpose register blocks, and is not well suited to the demands of application programs or operating system implementations. Instead, a digital signal processor typically provides a vector multiply-accumulate operation which exploits highly-optimized vector addressing facilities. A general purpose processor, by comparison, provides neither a vector multiply-accumulate operation nor vector addressing facilities necessary for computing the n-th output element, y_n, and shifting through vector data in a single cycle. Instead, an N-tap filter implemented in a straightforward manner for execution on a general purpose processor computes each output vector element using 2N reads from memory to processor registers, N multiply-accumulates, and one write to memory. To calculate K elements, such an N-tap filter implementation makes K(2N+1) memory accesses and KN multiply-accumulates. For each multiply-accumulate, more than two memory accesses are required.
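
The K(2N+1) figure can be checked by instrumenting a straightforward implementation. The Python sketch below is illustrative only: the function name `fir_naive_counted` is hypothetical, each list index is counted as one memory access, and the input is assumed to be supplied with N−1 history samples in front so that x[n + N − 1 − i] plays the role of x_{n−i}.

```python
def fir_naive_counted(h, x, k_out):
    """Straightforward N-tap FIR that tallies memory traffic.

    Returns (y, reads, writes, macs). The input x is assumed to carry
    N - 1 history samples before the first sample of interest.
    """
    n_taps = len(h)
    assert len(x) >= k_out + n_taps - 1
    y = [0] * k_out
    reads = writes = macs = 0
    for n in range(k_out):
        acc = 0
        for i in range(n_taps):
            coeff = h[i]                      # read a coefficient
            sample = x[n + n_taps - 1 - i]    # read an input sample
            reads += 2
            acc += coeff * sample             # multiply-accumulate
            macs += 1
        y[n] = acc                            # write one output element
        writes += 1
    return y, reads, writes, macs
```

For K = 4 outputs and N = 3 taps the sketch performs 24 reads and 4 writes, i.e., 28 = K(2N+1) memory accesses for 12 = KN multiply-accumulates.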
It has been discovered that a Finite Impulse Response (FIR) filter can be implemented in software on a general purpose processor in a manner which reduces the number of memory accesses. In particular, an efficient implementation for a general purpose processor having a substantial number of registers includes inner and outer loop code which together make
$$K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N+\frac{L_2}{L_1}+1\right]$$
memory accesses and KN multiply-accumulates, where L1 is the number of output vector elements computed during each pass through the outer loop and where L2 is the number of taps per output vector element computed during each pass through the inner loop. The efficient implementation exploits L1+2L2 general purpose registers. For an exemplary embodiment wherein L1=L2=8, i.e., using 24 general purpose registers, inner and outer loop code make
$$K\left(\frac{N}{4}+2\right)$$
memory accesses, which for filter implementations with large numbers of taps, approaches a 4× reduction in the number of memory accesses.
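
As a quick arithmetic check, the two access-count expressions can be evaluated side by side. The Python sketch below simply encodes the formulas stated above; the function names `naive_accesses`, `blocked_accesses`, and `registers_used` are hypothetical, and exact rational arithmetic is used so that the L1=L2=8 case can be compared against K(N/4+2) without rounding.

```python
from fractions import Fraction


def naive_accesses(k, n):
    """Memory accesses of the straightforward implementation: K(2N+1)."""
    return k * (2 * n + 1)


def blocked_accesses(k, n, l1, l2):
    """Memory accesses of the nested-loop implementation:
    K[((L1+L2)/(L1*L2))N + L2/L1 + 1]."""
    return k * (Fraction(l1 + l2, l1 * l2) * n + Fraction(l2, l1) + 1)


def registers_used(l1, l2):
    """General purpose registers exploited: L1 + 2*L2."""
    return l1 + 2 * l2
```

For example, with K = 64 output elements and N = 64 taps, `naive_accesses(64, 64)` gives 8256, while `blocked_accesses(64, 64, 8, 8)` gives 1152, which equals 64·(64/4+2), using `registers_used(8, 8)` = 24 registers.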
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to persons of ordinary skill in the art by referencing the accompanying drawings.
FIG. 1 is a flow chart of an implementation of a Finite Impulse Response (FIR) filter, in accordance with an exemplary embodiment of the present invention, for execution on a processor.
FIG. 2 is a data flow diagram for a multiply accumulate step of an implementation of a Finite Impulse Response (FIR) filter for execution on a system including a processor with general purpose registers and a memory, in accordance with an exemplary embodiment of the present invention.
FIG. 3 is a functional block diagram depicting functional modules and data flows for a software implementation of a modem incorporating instantiations of a Finite Impulse Response (FIR) filter implemented in accordance with an exemplary embodiment of the present invention.
FIG. 4 is a block diagram of an exemplary Personal Digital Assistant (PDA) system embodiment including a general purpose processor, registers, and memory for executing a software implementation of a modem including an implementation of a Finite Impulse Response (FIR) filter in accordance with an exemplary embodiment of the present invention.
The use of the same reference symbols in different drawings indicates similar or identical items.
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)
An N-tap filter implemented as software for execution on a general purpose processor computes each output vector element using 2N reads from memory to processor registers, N multiply-accumulates, and one write to memory. To calculate K elements, such an N-tap filter implementation includes K(2N+1) memory accesses and KN multiply-accumulates. For each multiply-accumulate, more than two memory accesses are required.
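The straightforward implementation described above can be sketched as follows. This is a minimal illustration, not code from the patent: the function name `fir_naive`, the access counters, and the sample data are assumptions, and the indexing OUT[n] = Σ C[k]·D[n+k] is chosen to match the pseudocode given later in this description. Python list reads stand in for memory reads.

```python
def fir_naive(d, c, k_out):
    """Straightforward N-tap FIR: OUT[n] = sum over k of C[k] * D[n + k]."""
    n_taps = len(c)
    out = [0] * k_out
    reads = writes = macs = 0
    for n in range(k_out):
        acc = 0
        for k in range(n_taps):
            coeff = c[k]         # one read from memory (coefficient)
            sample = d[n + k]    # one read from memory (input sample)
            reads += 2
            acc += coeff * sample
            macs += 1
        out[n] = acc             # one write to memory
        writes += 1
    return out, reads + writes, macs


# Illustrative data: K = 8 outputs, N = 4 taps, K + N - 1 = 11 input samples.
d = list(range(1, 12))
c = [1, 2, 3, 4]
out, accesses, macs = fir_naive(d, c, 8)
print(accesses, macs)   # K(2N+1) accesses and KN multiply-accumulates
```

With K = 8 and N = 4 the counters come out to K(2N+1) = 72 memory accesses and KN = 32 multiply-accumulates, i.e., slightly more than two accesses per multiply-accumulate.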
In contrast, an improved software implementation of the N-tap filter reduces the number of memory accesses. Referring to FIG. 1, the improved software implementation includes an inner loop 120 and an outer loop 110 which together include

$$K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N+\frac{L_2}{L_1}+1\right]$$

memory accesses and KN multiply-accumulates, where L1 is the number of output vector elements computed during each pass through outer loop 110 and where L2 is the number of taps per output vector element computed during each pass through inner loop 120. The improved software implementation efficiently exploits L1+2L2 general purpose registers and significantly reduces the number of memory accesses performed. In particular, for an exemplary embodiment wherein L1=L2=8, i.e., using 24 general purpose registers, inner and outer loop code make

$$K\left(\frac{N}{4}+2\right)$$

memory accesses, which for filter implementations with large numbers of taps, approaches a 4× reduction in the number of memory accesses.
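As a rough numerical illustration of these counts, the sketch below evaluates both access-count expressions and reports memory accesses per multiply-accumulate. The function names and the chosen values of K and N are assumptions made for the example; exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction


def naive_accesses(k_out, n_taps):
    """K(2N+1): accesses made by the straightforward implementation."""
    return k_out * (2 * n_taps + 1)


def blocked_accesses(k_out, n_taps, l1, l2):
    """K[((L1+L2)/(L1*L2))N + L2/L1 + 1]: accesses made by the nested-loop form."""
    per_output = Fraction(l1 + l2, l1 * l2) * n_taps + Fraction(l2, l1) + 1
    return k_out * per_output


k_out = 64
for n_taps in (64, 256, 1024):
    macs = k_out * n_taps
    naive = Fraction(naive_accesses(k_out, n_taps), macs)
    blocked = blocked_accesses(k_out, n_taps, 8, 8) / macs
    print(n_taps, float(naive), float(blocked))
```

For L1=L2=8 the second expression reduces to K(N/4+2), so the number of accesses per multiply-accumulate falls from just over two toward one quarter as N grows.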
FIG. 1 depicts an exemplary embodiment of a nested loop implementation, including control flows (bold lines) and data flows (fine lines), of an N-tap filter design for a Finite Impulse Response filter (FIR). Outer loop 110 includes K/L1 iterations to compute K output values of an output signal vector, OUT[K]. During each iteration of outer loop 110, input registers 140 are loaded with L2 (of K) respective input values of an input signal vector, D[K], from memory (step 111). Output registers 150 store L1 (of K) respective output values of the output signal vector, OUT[K], and are cleared in step 112. Inner loop 120 includes N/L2 iterations to accumulate partial products into output registers 150 storing a subset of output values OUT[(iL1) . . . (iL1+L1−1)] of the output signal vector OUT[K] where i is the loop index variable for outer loop 110. The structure of inner loop 120 is described below. Loop index variable j is checked during each pass through inner loop 120 (illustratively, in step 128). On inner loop exit (i.e., on j=N/L2 in the exemplary embodiment of FIG. 1), the subset of output values computed by inner loop 120 and accumulated into output registers 150 are stored to memory (step 113) and a subsequent iteration (if any) of outer loop 110 is initiated. In the exemplary embodiment of FIG. 1, outer loop exit is on i=K/L1.
Coefficient registers 130 provide storage for L2 (of N) filter coefficients of a filter coefficient vector C[N]. During each iteration of inner loop 120 (in particular, during step 121), coefficient registers 130 are loaded with a subset C[(jL2) . . . (jL2+L2−1)] of the values from the filter coefficient vector, C[N], from memory. Inner loop 120 includes N/L2 iterations to accumulate partial products of filter coefficient values and input signal vector values into a subset of output points OUT[(iL1) . . . (iL1+L1−1)] of the output signal vector OUT[K]. L2 element subsets of the filter coefficient vector and of the input signal vector are processed during each iteration through inner loop 120. Inner loop 120 also includes accumulation steps (e.g., accumulation steps 122, 124, and 126) and input data load steps (e.g., input data load steps 123, 125, and 127). After each accumulate step, processing of a particular element of the input signal vector, D[K], is complete and the register used for storage of that particular element is available for storage of an as-yet unloaded element of the input signal vector. Each input data load step (e.g., input data load step 123, 125, or 127) loads a next successive element of the input signal vector into a corresponding input register location (illustratively, input register D0 141, D1 142, or D(L2−1) 143) freed up during the prior accumulation step. During each iteration of inner loop 120, L2 partial products are accumulated into each of the L1 output registers 150 (i.e., into the L1 output registers OUT0 151, OUT1 152, . . . OUT(L1−1) 153).
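The memory access count stated earlier can be reconstructed from this loop structure. The derivation below is a sketch under one assumption, consistent with the L1=L2=8 pseudocode: each pass through inner loop 120 performs one input load after each of its L1 accumulation steps. Each of the K/L1 passes through outer loop 110 then loads L2 inputs (step 111), runs N/L2 inner passes that each load L2 coefficients (step 121) and L1 inputs (steps 123 through 127), and stores L1 outputs (step 113):

$$
\begin{aligned}
\text{accesses} &= \frac{K}{L_1}\left[L_2 + \frac{N}{L_2}\left(L_2 + L_1\right) + L_1\right] \\
&= K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N + \frac{L_2}{L_1} + 1\right]
\end{aligned}
$$

Setting L1=L2=8 gives K(N/4+2), in agreement with the exemplary embodiment.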
FIG. 1 depicts an exemplary N-tap filter implementation 100 where the number of output vector elements computed and input vector elements consumed during each pass through outer loop 110 is L1 and the number of partial products of input vector elements and filter coefficients accumulated during each pass through inner loop 120 is L2. The numbers L1 and L2 are independent, although L1 should be a multiple of L2 and the quantity (L1+2L2) should be less than or equal to the total number of registers allocable to the N-tap filter implementation 100 on a particular processor.
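The two constraints on L1 and L2 can be explored with a short sketch. Everything here is illustrative: the helper names are hypothetical, and ranking candidates by the coefficient of N in the access-count expression, (L1+L2)/(L1·L2), is an assumption made for the example rather than a selection rule stated in this description.

```python
from fractions import Fraction


def candidate_blockings(num_registers):
    """List (L1, L2) pairs with L1 a multiple of L2 and L1 + 2*L2 <= num_registers."""
    pairs = []
    for l2 in range(1, num_registers // 3 + 1):
        l1 = l2
        while l1 + 2 * l2 <= num_registers:
            pairs.append((l1, l2))
            l1 += l2
    return pairs


def loads_per_mac(l1, l2):
    """Coefficient of N in the access count: loads per multiply-accumulate for large N."""
    return Fraction(l1 + l2, l1 * l2)


pairs = candidate_blockings(24)
best = min(loads_per_mac(l1, l2) for l1, l2 in pairs)
best_pairs = [p for p in pairs if loads_per_mac(*p) == best]
print(best, best_pairs)
```

With a budget of 24 registers, the exemplary choice L1=L2=8 is among the pairs that minimize this metric. In practice the choice would also depend on factors this sketch ignores, such as whether N and K are multiples of L2 and L1 and how many registers are needed for pointers and loop counters.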
For an embodiment wherein L1=L2=8, the steps of the N-tap filter implementation of FIG. 1 correspond to the following pseudocode:
/* compute L1=8 output points per iteration */
OUTER_LOOP {
    clear 8 output registers OUT0, OUT1, . . . , OUT7;
    load 8 inputs from memory to registers D0, D1, . . . , D7;
    /* compute L2=8 partial outputs */
    INNER_LOOP {
        load 8 coefficients from memory to registers C0, C1, . . . , C7;
        OUT0 += C0*D0 + C1*D1 + . . . + C7*D7;
        load new input from memory to D0;
        OUT1 += C0*D1 + C1*D2 + . . . + C7*D0;
        load new input from memory to D1;
        OUT2 += C0*D2 + C1*D3 + . . . + C7*D1;
        load new input from memory to D2;
        . . .
        OUT7 += C0*D7 + C1*D0 + . . . + C7*D6;
        load new input from memory to D7;
    }
    store 8 outputs from registers OUT0, OUT1, . . . , OUT7;
}
A variety of source-code, assembly language, and machine language implementations consistent with the above pseudocode will be appreciated by persons of ordinary skill in the art. Alternative embodiments corresponding to different combinations of L1 and L2 values will also be appreciated by persons of ordinary skill in the art. Preferably, L1 and L2 are chosen so that the total number of general purpose registers allocated to storage of a partial input signal vector, a partial filter coefficient vector, and a partial output signal vector approaches the number of available general purpose registers on a general purpose processor. In an embodiment for execution on a Reduced Instruction Set Computer (RISC) processor providing overlapping register sets, L1 and L2 are preferably chosen so that the total number of general purpose registers allocated to storage of the partial input signal, partial filter coefficient, and partial output signal vectors approaches the number of available general purpose registers in a register set.
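One such implementation can be sketched in Python for the L1=L2=8 case. This is an illustrative sketch only: Python has no registers, so the short lists `acc`, `regs_d`, and `regs_c` stand in for the output, input, and coefficient registers, and the names `fir_blocked_8x8` and `fir_direct` are hypothetical. The sketch assumes that K and N are multiples of 8 and that at least K+N−1 input samples are available; the bounds check on the final load is an addition not present in the pseudocode.

```python
def fir_direct(d, c, k_out):
    """Reference result: OUT[n] = sum over k of C[k] * D[n + k]."""
    return [sum(c[k] * d[n + k] for k in range(len(c))) for n in range(k_out)]


def fir_blocked_8x8(d, c, k_out):
    """Nested-loop FIR with L1 = L2 = 8, mirroring the pseudocode's register usage."""
    L = 8
    n_taps = len(c)
    assert k_out % L == 0 and n_taps % L == 0
    assert len(d) >= k_out + n_taps - 1
    out = [0] * k_out
    for base in range(0, k_out, L):            # OUTER_LOOP
        acc = [0] * L                          # clear OUT0..OUT7
        regs_d = list(d[base:base + L])        # load D0..D7
        next_in = base + L                     # index of the next unloaded input
        for j in range(0, n_taps, L):          # INNER_LOOP
            regs_c = c[j:j + L]                # load C0..C7
            for m in range(L):
                # OUTm += C0*D(m) + C1*D(m+1) + ..., register indices taken mod 8
                acc[m] += sum(regs_c[k] * regs_d[(m + k) % L] for k in range(L))
                # Register Dm is no longer needed in this pass; refill it.
                if next_in < len(d):
                    regs_d[m] = d[next_in]
                next_in += 1
        out[base:base + L] = acc               # store OUT0..OUT7
    return out


# Illustrative data: K = 16 outputs, N = 16 taps, K + N - 1 = 31 inputs.
d = [(3 * i) % 7 - 3 for i in range(31)]
c = list(range(1, 17))
result = fir_blocked_8x8(d, c, 16)
print(result == fir_direct(d, c, 16))
```

Because each loaded coefficient is reused for eight outputs and each loaded input for up to eight taps, the number of loads per multiply-accumulate drops well below the two required by the direct form.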
FIG. 2 depicts the data flows associated with an accumulation step and an input data load step from an iteration of inner loop 120. In particular, FIG. 2 depicts the data flows associated with the final two steps in each iteration of inner loop 120 (i.e., accumulation step 126 and input data load step 127, as shown in FIG. 1) for an exemplary embodiment in which L1=8 and L2=8. Inner loop instance 120a, accumulation step instance 126a, and input data load step instance 127a correspond to this exemplary embodiment in which L1=8 and L2=8. The exemplary embodiment of FIG. 2 exploits twenty-four (24) general purpose registers 210 and is illustrative of the data flows for accumulation step instance 126a and input data load step instance 127a. The data flows associated with each of seven other preceding accumulation and input data load steps are analogous and will be appreciated by persons of ordinary skill in the art. In addition, persons of ordinary skill in the art will appreciate modification for alternate selections of L1 and L2 values.
Inner loop 120a code and outer loop 110 code (not shown) each execute on processor 200, which illustratively includes a general purpose processor with at least 24 general purpose registers 210. A first group (C0 131a, C1 132a, . . . C7 133a) of general purpose registers 210 is allocated to storage of a working set of eight (8) filter coefficient values from filter coefficient vector C[N]. A second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210 is allocated to storage of a working set of eight (8) input values from input signal vector D[K]. A third group (OUT0 151a, OUT1 152a, . . . OUT7 153a) of general purpose registers 210 is allocated to accumulative storage of partial convolutions for eight (8) output values of output signal vector OUT[K]. Initialization of the first group (C0 131a, C1 132a, . . . C7 133a) and the second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210 with values from memory, such as memory 220, is performed in steps 111 (of outer loop 110) and 121 (of inner loop 120), as indicated in FIG. 1. The third group (OUT0 151a, OUT1 152a, . . . OUT7 153a) of general purpose registers 210 is cleared in step 112 and stored to memory 220 in step 113 (both of outer loop 110).
Accumulation step instance 126a convolves the then-present contents of the first group (C0 131a, C1 132a, . . . C7 133a) of general purpose registers 210 with the then-present contents of the second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210. For the particular accumulate step performed by accumulation step instance 126a, a partial filter coefficient vector C[(jL2) . . . (jL2+L2−1)] is convolved with a partial input signal vector D[(iL1+jL2−1), (iL1+jL2), . . . (iL1+jL2+L2−1)], as follows:

$$\mathrm{OUT}_7 \mathrel{+}= C_0 D_7 + C_1 D_0 + C_2 D_1 + C_3 D_2 + C_4 D_3 + C_5 D_4 + C_6 D_5 + C_7 D_6 \qquad (2)$$

where j is the loop index for inner loop 120a and where C0 131a, C1 132a, . . . , and C7 133a respectively contain elements of the partial filter coefficient vector C[(jL2) . . . (jL2+L2−1)]. Input registers 140 (i.e., D7 143a, D0 141a, D1 142a, . . . , and D6) respectively contain elements of the partial input signal vector D[(iL1+jL2−1), (iL1+jL2), . . . (iL1+jL2+L2−1)] where i is the loop index for outer loop 110 and where elements are stored as shown in Table 1.
TABLE 1

Register                  Input Signal Vector Element
Input Register D0 141a    D[iL1 + jL2]
Input Register D1 142a    D[iL1 + jL2 + 1]
Input Register D2         D[iL1 + jL2 + 2]
Input Register D3         D[iL1 + jL2 + 3]
Input Register D4         D[iL1 + jL2 + 4]
Input Register D5         D[iL1 + jL2 + 5]
Input Register D6         D[iL1 + jL2 + 6]
Input Register D7 143a    D[iL1 + jL2 − 1]
Input data load step instance 127a loads the input register D7 143a with the next successive element, i.e., D[iL1+jL2+7], of input signal vector D[K]. In this way, the second group (D0 141a, D1 142a, . . . D7 143a) of general purpose registers 210 is ready for the next pass through inner loop 120a.
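The rotating use of the input registers can be made concrete with a small sketch. The helper names below are hypothetical. `operand_pairs` lists which input register is paired with each coefficient register when accumulating into OUTm, and `table1_offset` encodes the register contents of Table 1 as offsets relative to D[iL1+jL2].

```python
def operand_pairs(m, l2=8):
    """(coefficient register, input register) index pairs used to accumulate OUTm."""
    return [(k, (m + k) % l2) for k in range(l2)]


def table1_offset(register, l2=8):
    """Offset from D[iL1 + jL2] held by each input register, per Table 1."""
    return register if register < l2 - 1 else -1


pairs = operand_pairs(7)
offsets = [table1_offset(reg) for _, reg in pairs]
print(pairs)
print(offsets)
```

For OUT7 the pairs reproduce equation (2), and the corresponding offsets run consecutively from −1 to 6. Although the registers are accessed in rotated order, the coefficients C0 through C7 are therefore applied to eight consecutive input elements, and no data is moved between registers.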
Referring to FIG. 3, software implementation 300 of a V.34 modem includes transmit and receive data paths. The transmit data path includes encoder 320, modulator 330, and pre-emphasis and shaping filter 341. The receive data path includes receive data module 350, decoder 360, demodulator 370, and receive front end module 380. A transmit process 396 invokes an external data handler with data for transmission over line 395. Along the transmit data path, pre-emphasis and shaping filter 341 is implemented using a FIR filter 100 as described above in accordance with FIGS. 1 and 2. Along the receive data path, echo interpolator 381, preliminary echo canceller 384, main echo canceller 371, and equalizer 373 are also implemented using a FIR filter 100 as described above in accordance with FIGS. 1 and 2.
In an exemplary embodiment of software implementation 300 of a V.34 modem, pointers to an input signal vector, D[K], to a coefficient vector, C[N], and an output signal vector, OUT[K], are passed to a function, procedure, or method implementing FIR filter 100. Each of the submodules which are implemented using FIR filter 100, i.e., shaping filter 341 along the transmit data path and echo interpolator 381, preliminary echo canceller 384, main echo canceller 371, and equalizer 373 along the receive data path, are invoked with input data passed from a predecessor in the respective data path and with coefficient data specific to the particular filter implementation. Both the input data and the filter-specific coefficient data are passed via memory 220. Suitable filter coefficient vectors are specific to each of the particular filters and will be appreciated by persons of ordinary skill in the art. Certain filter implementations are adaptive and FIR filter 100 is instantiated or invoked with coefficient vectors which are updated to implement each of the respective adaptive filters. Each of the instantiations or invocations of FIR filter 100 code which implement a particular filter along the transmit or receive data path may independently define L1 and L2 values for efficient implementation thereof.
Referring now to the transmit data path of software implementation 300, transmit process 396 supplies a bit stream to a V.34 implementation of encoder 320. Encoder 320 converts the input bit stream into a baseband sequence of complex symbols which is used as input to modulator 330. Encoder 320 performs shell mapping, differential encoding, constellation mapping, precoding and 4D trellis encoding, and nonlinear encoding, all as described in respective sections of ITU-T Recommendation V.34, A Modem Operating at Data Signalling Rates of up to 28 800 bits/s for Use on the General Switched Telephone Network and on Leased Point-to-Point 2-Wire Telephone-Type Circuits, dated September, 1994 (previously CCITT Recommendation V.34), which is hereby incorporated herein, in its entirety, by reference. Persons of ordinary skill in the art will recognize a variety of alternative implementations of encoder 320, in accordance with the requirements of ITU-T Recommendation V.34 (hereafter the V.34 recommendation). In addition, persons of ordinary skill in the art will recognize a variety of alternative configurations of encoder 320 suitable to modem implementations in accordance with other communications standards such as V.32, V.32bis, etc. Returning to the V.34 embodiment of FIG. 3, encoder 320:
1. converts the input bit stream into a sequence of mapping frames as described in section 9.3 of the V.34 recommendation, which is incorporated herein by reference;
2. performs shell mapping as described in section 9.4 of the V.34 recommendation, which is incorporated herein by reference;
3. performs differential encoding as described in section 9.5 of the V.34 recommendation, which is incorporated herein by reference;
4. performs constellation mapping as described in section 9.1 of the V.34 recommendation, which is incorporated herein by reference;
5. performs precoding and 4D trellis encoding as described in section 9.6 of the V.34 recommendation, which is incorporated herein by reference; and
6. performs nonlinear encoding as described in section 9.7 of the V.34 recommendation, which is incorporated herein by reference.
A variety of suitable implementations in accordance with the requirements of respective sections of the V.34 recommendation will be appreciated by persons of ordinary skill in the art.
Modulator 330 converts the baseband sequence of complex symbols from the output of the encoder into a passband sequence of real samples. In particular, modulator 330:
1. multiplies the complex baseband sequence by the carrier frequency; and
2. converts the complex signal to real.
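The two modulator steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `modulate` is hypothetical, the baseband sequence is assumed to be already at the passband sample rate, and the carrier frequency and sample rate used in the example are illustrative values only, not parameters taken from this description.

```python
import cmath
import math


def modulate(symbols, carrier_hz, sample_rate_hz):
    """Multiply a complex baseband sequence by a complex carrier and keep the real part."""
    passband = []
    for n, symbol in enumerate(symbols):
        carrier = cmath.exp(2j * math.pi * carrier_hz * n / sample_rate_hz)
        passband.append((symbol * carrier).real)
    return passband


# Illustrative values: a constant symbol and a carrier at one quarter of the sample rate.
samples = modulate([1 + 0j] * 4, carrier_hz=1800.0, sample_rate_hz=7200.0)
print(samples)
```

With a carrier at one quarter of the sample rate, a constant symbol of 1 yields samples close to 1, 0, −1, 0, i.e., a sampled cosine at the carrier frequency.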
If the spectrum of the modulator output is sufficiently white, it can be used as an input to receiver echo cancellers, such as preliminary echo canceller 384, described below.
Shaping and pre-emphasis filter 341 provides square-root-of-raised-cosine shaping as well as pre-emphasis filtering specified by section 5.4 of the V.34 recommendation, which is incorporated herein by reference. Raised cosine complex shaping and pre-emphasis filtering are implemented using FIR filters 100 in accordance with FIGS. 1 and 2. Persons of ordinary skill in the art will appreciate suitable coefficient vector definitions for providing raised cosine complex shaping and pre-emphasis filtering. In the embodiment of FIG. 3, eleven pre-emphasis characteristics combined with four choices of carrier frequency (relative to symbol rate) result in a total of 44 separate filter definitions. Only one filter is used on any one connection, although other embodiments utilizing more than one filter definition per connection are also suitable. The output of shaping and pre-emphasis filter 341 is an output of the transmitter portion of software implementation 300 of a V.34 modem and is provided to D/A converter 391, typically via an I/O channel and codec. D/A converter 391 couples to transmission line 395.
Referring now to the receive data path of software implementation 300, receive front end module 380 receives the output of the A/D converter 392 as an input. A/D converter 392 couples to transmission line 395. Preliminary echo canceller 384 is implemented as a real data/real coefficients adaptive filter using an FIR filter 100 in accordance with FIGS. 1 and 2. Persons of ordinary skill in the art will appreciate suitable coefficient vector definitions. Preliminary echo canceller 384 receives as an input a white signal from the output of the modulator 330. Preliminary echo canceller 384 uses a stochastic gradient updating algorithm for adaptation during half duplex of V.34 training and is not updated during data mode. The purpose of this preliminary stage of echo cancellation is to reduce the echo level relative to the receive signal level so that subsequent stages such as clock recovery, signal detection, and automatic gain control (each not shown) will not be affected by the echo. Final echo signal cancellation is performed by main echo canceller 371 at the output of equalizer 373.
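A stochastic gradient update of the kind mentioned above can be sketched with the common least-mean-squares (LMS) rule. This is an assumption-laden illustration, not the canceller of FIG. 3: the description does not specify the update rule's details, so the function name, step size, tap count, echo response, and test signal are all hypothetical, and the filter is written as a simple tapped delay line rather than in the register-blocked form of FIGS. 1 and 2.

```python
import random


def lms_echo_canceller(reference, received, n_taps, step):
    """Adapt FIR coefficients so the filtered reference tracks the echo in `received`."""
    coeffs = [0.0] * n_taps
    history = [0.0] * n_taps           # most recent reference samples, newest first
    residual = []
    for x, r in zip(reference, received):
        history = [x] + history[:-1]
        echo_estimate = sum(c * h for c, h in zip(coeffs, history))
        err = r - echo_estimate        # signal remaining after echo subtraction
        # Stochastic gradient step: move each coefficient along err * input sample.
        coeffs = [c + step * err * h for c, h in zip(coeffs, history)]
        residual.append(err)
    return coeffs, residual


rng = random.Random(0)
reference = [rng.choice((-1.0, 1.0)) for _ in range(2000)]   # white training signal
echo_path = [0.5, -0.25, 0.125]                              # hypothetical echo response
received = [
    sum(echo_path[k] * reference[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(reference))
]
coeffs, residual = lms_echo_canceller(reference, received, n_taps=3, step=0.05)
print([round(c, 3) for c in coeffs])
```

In this noise-free example the adapted coefficients settle on the hypothetical echo response and the residual decays toward zero. Freezing the coefficients after training, as described for data mode, would amount to skipping the update line.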
The modem receiver implemented along the receive data path should be synchronized with the remote modem signal. An adaptive FIR filter (i.e., an FIR filter implementation 100 in accordance with FIGS. 1 and 2 with an adaptively updated set of filter coefficients) is used to perform the interpolation. Adaptive FIR filters implemented in this manner are used to interpolate the receive signal (at receive signal interpolator 389) as well as to interpolate the modulator output (at echo interpolator 381) used as input for main echo canceller 371. The filter coefficients are adjusted based on timing phase and frequency recovered from the remote modem signal. The adaptation algorithm is a two-stage combination of sin(x)/x and linear interpolations.
Referring now to demodulator 370, a passband phase-splitting adaptive T/3 equalizer 373 is used for channel equalization. The input to equalizer 373 is the output of receive signal interpolator 389 and has a sampling rate of 3T×S. The output of equalizer 373 is downsampled by 3 to symbol rate. Equalizer 373 is implemented using FIR filters 100 in accordance with FIGS. 1 and 2. Persons of ordinary skill in the art will appreciate suitable initial coefficient vector definitions for providing passband phase-splitting adaptive T/3 equalizer 373.
Passband adaptive T/3 echo canceller 371 is used to subtract residual echo left over from preliminary echo canceller 384. The echo canceller input is the output signal from modulator 330 synchronized with the remote modem clock. Echo is subtracted at the output of equalizer 373. Demodulator 370 also converts the passband signal at the output of the equalizer to baseband. Demodulator 370 may optionally contain a phase locked loop to compensate for frequency offset and phase jitter on transmission line 395.
Decoder 360 converts the demodulated complex symbols into a bit stream which is supplied to receiver process 397. Transmit process 396 and receiver process 397 may be the same process. Decoder 360 performs nonlinear decoding, linear prediction, trellis decoding, constellation decoding, shell demapping, and data de-framing, all as described in respective sections of the V.34 recommendation, which is incorporated herein by reference. Persons of ordinary skill in the art will recognize a variety of alternative implementations of decoder 360, in accordance with the requirements of the V.34 recommendation. In addition, persons of ordinary skill in the art will recognize a variety of alternative configurations of decoder 360 suitable to modem implementations in accordance with other communications standards such as V.32, V.32bis, etc. Returning to the V.34 embodiment of FIG. 3, decoder 360:
1. compensates for the effect of nonlinear encoding by applying an inverse nonlinear projection function to the symbols at the output of the demodulator;
2. performs linear prediction implemented as a 4-tap complex FIR filter which uses the same coefficients as the remote modem precoder. The purpose of the linear predictor (not shown), which is implemented as an FIR filter 100 in accordance with FIGS. 1 and 2, is to whiten the channel noise, thereby reducing the probability of errors;
3. performs the trellis search algorithm to determine, based on the received symbols, the best decoding decision for the current symbol;
4. performs constellation decoding;
5. performs an operation complementary to that performed by the shell mapper described above with reference to encoder 320; and
6. deframes data to provide a single bit stream which is then passed (after descrambling) to receiver process 397.
Other Embodiments
FIG. 4 depicts a Personal Digital Assistant (PDA) 400 incorporating a SoftModem library 410 of software modules (illustratively, V.34 SoftModem modules 300) for execution on a general purpose processor 420. In accordance with an embodiment of the present invention, certain of V.34 SoftModem modules 300 are implemented using an FIR filter implementation 100, as described above. Input signal vectors D[K] and filter coefficient vectors C[N] suitable for providing the various FIR filter implementations of interpolators, phase splitting filters, linear predictors, etc. (which have been described above with reference to FIG. 3) are loaded from memory 430 and output signal vectors OUT[K] are stored to memory 430. In addition, executable instructions implementing SoftModem library 410 (including FIR filter implementation 100) and suitable for execution on general purpose processor 420 are also stored in, and loaded from, memory 430. In a presently preferred embodiment, general purpose processor 420 includes an R3000 RISC microprocessor, although a wide variety of alternative processor implementations are also suitable. General purpose processor 420 includes general purpose registers 210 which are operated on by the executable instructions of FIR filter implementation 100 and includes a DMA channel 421 for interfacing to telecommunication circuits (illustratively, phone line 490) via codec 470 and Digital-to-Analog/Analog-to-Digital (DAA) converter 460. Of course, memory 430 may include both read/write memory 431 and read only memory 432 and persons of ordinary skill in the art will recognize code portions and data suitable for storage in each. Removable media 480 provides a mechanism for supplying the executable instructions implementing SoftModem library 410 (including FIR filter implementation 100) as well as filter coefficient definitions.
While the invention has been described with reference to various embodiments, it will be understood that these embodiments are illustrative and that the scope of the invention is not limited to them. Many variations, modifications, additions, and improvements of the embodiments described are possible. For example, complex inputs and/or complex coefficients can be accommodated to generate complex outputs. FIR filter implementations in accordance with the present invention are suitable for implementation of many other signal processing functions and can be incorporated in a wide variety of devices including modems, answering machines, cellular phones, voice/data compression systems, speech recognition systems, etc. Additionally, structures and functionality presented as hardware in the exemplary embodiment may be implemented as software, firmware, or microcode in alternative embodiments. These and other variations, modifications, additions, and improvements may fall within the scope of the invention as defined in the claims which follow.

Claims (36)

What is claimed is:
1. An apparatus comprising:
a general purpose processor having general purpose registers;
addressable memory coupled to the general purpose processor for storing input, coefficient and output vector data; and
software instructions executable on the general purpose processor and including a discrete-time filter implementation to incrementally load respective portions of the input and coefficient vector data into first and second sets of the general purpose registers and operate thereupon to accumulate the output vector data into a third set of the general purpose registers without use of a digital signal processor.
2. The apparatus of claim 1,
wherein memory access overhead for any single one of the incremental loads is amortized over multiple of the accumulations of the output vector data.
3. The apparatus of claim 1,
wherein the discrete-time filter implementation exhibits an execution ratio of less than two of the incremental loads per operation to accumulate.
4. The apparatus of claim 1,
wherein the discrete-time filter includes a Finite Impulse Response (FIR) filter.
5. The apparatus of claim 1,
wherein the operation upon respective portions of the input and coefficient vector data in first and second sets of the general purpose registers includes execution of successive multiply-accumulate operations.
6. The apparatus of claim 1,
wherein the operation upon respective portions of the input and coefficient vector data in first and second sets of the general purpose registers includes execution of successive multiply and accumulate operations.
7. The apparatus of claim 1,
wherein the signal processing functions at least partially implement a modem.
8. The apparatus of claim 1,
wherein the general purpose processor is a RISC processor.
9. The apparatus of claim 1,
wherein the general purpose processor provides a scalar multiply-accumulate instruction.
10. The apparatus of claim 1,
wherein only a partial portion of the input vector data is represented in the general purpose registers at any given time; and
wherein additional portions of at least the input vector data are loaded from the addressable memory into respective ones of the general purpose registers under control of the discrete-time filter implementation.
11. The apparatus of claim 1,
wherein the general purpose registers of the first, second and third sets are all allocated from an architecturally-defined set of registers available to a computational thread that executes the software instructions.
12. The apparatus of claim 11,
wherein the first, second and third sets each number 8 and the architecturally-defined set of available registers number at least 24.
13. The apparatus of claim 1,
wherein the software instructions of the discrete-time filter implementation are executable on the general purpose processor without use of an instruction that performs a multiply-accumulate operation and delay line shift in a single-cycle.
14. The apparatus of claim 1,
wherein the software instructions of the discrete-time filter implementation are executable on the general purpose processor without use of a vector multiply-accumulate operation.
15. The apparatus of claim 1,
wherein the software instructions of the discrete-time filter implementation are executable on the general purpose processor without use of a vector addressing facility.
16. The apparatus of claim 1,
wherein the general purpose processor does not provide an instruction that performs a multiply-accumulate operation and delay line shift in a single-cycle.
17. The apparatus of claim 1,
wherein the general purpose processor does not provide a vector multiply-accumulate operation.
18. The apparatus of claim 1,
wherein the general purpose processor does not provide a vector addressing facility.
19. The apparatus of claim 1,
wherein application software instructions are also executable on the general purpose processor.
20. The apparatus of claim 1, embodied as one or more of:
a personal digital assistant;
a portable computer; and
a phone.
21. The apparatus of claim 1, wherein the software instructions implement:
signal processing functions based on the discrete-time filter implementation; and
application functions,
wherein the software instructions that implement the signal processing functions and those that implement the application functions are both executable on the general purpose processor.
22. The apparatus of claim 21,
wherein the signal processing functions at least partially implement a modem.
23. The apparatus of claim 21, further comprising:
a communications interface coupled between a communications medium and the general purpose processor, wherein the input vector data corresponds to a signal received via the communications interface.
24. The apparatus of claim 23,
wherein the communications medium includes a phone line and the communications interface includes an analog-to-digital conversion.
25. The apparatus of claim 1, further comprising:
signal processing structures at least partially implemented by the discrete-time filter implementation, the signal processing structures including one or more of an interpolator, an echo canceller, and an equalizer.
26. The apparatus of claim 25,
wherein the signal processing structures at least partially implement one or more of a telephony feature, modem feature, answering machine feature, voice or data compression feature, and speech recognition system feature of the apparatus.
27. The apparatus of claim 1, further comprising:
receive path and transmit path signal processing structures at least partially implemented by the discrete-time filter implementation.
28. A method of providing a signal processing facility in a computing device without use of a digital signal processor (DSP), the method comprising:
executing both application code and FIR filter code on a same processor;
the FIR filter code incrementally loading respective portions of input and coefficient vector data into respective registers of the processor and performing successive operations thereupon to accumulate output vector data into other respective registers of the processor;
the FIR filter code exhibiting an execution ratio of less than two input and coefficient data loads per operation to accumulate.
29. The method of claim 28,
wherein the operations to accumulate include successive scalar multiply-accumulate operations.
30. The method of claim 28,
wherein L1 of the registers are allocated to the respective portions of the output vector data, L2 of the registers are allocated to the respective portions of the input vector data, and L2 of the registers are allocated to the respective portions of the coefficient vector data;
wherein the input and coefficient vector data loads number no more than approximately $K\left[\left(\frac{L_1+L_2}{L_1 L_2}\right)N+\frac{L_2}{L_1}+1\right]$ per KN scalar multiply-accumulate operations, where K is the number of elements in the output vector and N is the number of taps of the FIR filter.
31. An apparatus comprising:
a processor;
application code including instructions executable by the processor;
FIR filter code including instructions executable by the processor to load respective portions of input and coefficient vector data from addressable storage into respective registers of the processor and to operate thereupon to accumulate output vector data into other respective registers of the processor,
wherein the FIR filter code is callable from the application code and provides the application code with a signal processing facility without use of a digital signal processor (DSP).
32. The apparatus of claim 31,
wherein memory access overhead for any single one of the loads is amortized over multiple of the accumulations of the output vector data.
33. The apparatus of claim 31,
wherein the FIR filter code exhibits an execution ratio of less than two of the loads per operation to accumulate.
34. The apparatus of claim 31,
wherein the operation upon the input and coefficient vector data includes execution of successive multiply-accumulate operations.
35. The apparatus of claim 31,
wherein the signal processing facility includes a modem.
36. The apparatus of claim 31, configured as a personal digital assistant.
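As an illustration of the register-blocked accumulation recited in claims 28 to 30, the following C sketch keeps a small block of output accumulators, coefficients, and input samples in local variables and counts memory loads against multiply-accumulate operations. It is a minimal sketch under stated assumptions, not the patented implementation: the block sizes `L1` and `L2`, the names `fir_blocked`, `fir_naive`, `fir_stats`, and `claim30_bound`, the sliding-window load order, and the convention y[k] = sum over n of h[n] * x[k+n] are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block sizes: L1 output accumulators, L2 coefficient slots. */
enum { L1 = 4, L2 = 4 };

typedef struct { long loads; long macs; } fir_stats;

/* Reference form: two loads (one input, one coefficient) per MAC. */
static void fir_naive(const int *x, const int *h, long *y, size_t K, size_t N)
{
    for (size_t k = 0; k < K; k++) {
        long acc = 0;
        for (size_t n = 0; n < N; n++)
            acc += (long)h[n] * x[k + n];
        y[k] = acc;
    }
}

/* Blocked form: y[k] = sum_{n<N} h[n] * x[k+n] for k < K.
 * x must hold K+N-1 samples. To keep the sketch short, K must be a
 * multiple of L1 and N a multiple of L2. */
static fir_stats fir_blocked(const int *x, const int *h, long *y,
                             size_t K, size_t N)
{
    fir_stats st = {0, 0};
    assert(K % L1 == 0 && N % L2 == 0);

    for (size_t k0 = 0; k0 < K; k0 += L1) {
        long acc[L1] = {0};        /* output block held in "registers" */
        int win[L1 + L2 - 1];      /* input window x[k0+n0 .. k0+n0+L1+L2-2] */

        /* Prefill the first L1-1 window entries once per output block. */
        for (int i = 0; i < L1 - 1; i++) { win[i] = x[k0 + i]; st.loads++; }

        for (size_t n0 = 0; n0 < N; n0 += L2) {
            int c[L2];
            /* Load L2 coefficients and L2 new inputs for this tap group. */
            for (int j = 0; j < L2; j++) { c[j] = h[n0 + j]; st.loads++; }
            for (int j = 0; j < L2; j++) {
                win[L1 - 1 + j] = x[k0 + n0 + L1 - 1 + j];
                st.loads++;
            }
            /* Each loaded value is reused across several accumulations. */
            for (int i = 0; i < L1; i++)
                for (int j = 0; j < L2; j++) {
                    acc[i] += (long)c[j] * win[i + j];
                    st.macs++;
                }
            /* Slide the window: keep the last L1-1 inputs, no reload. */
            for (int i = 0; i < L1 - 1; i++) win[i] = win[i + L2];
        }
        for (int i = 0; i < L1; i++) y[k0 + i] = acc[i];
    }
    return st;
}

/* The load-count expression of claim 30, evaluated for these L1 and L2. */
static double claim30_bound(size_t K, size_t N)
{
    return (double)K * ((double)(L1 + L2) / (L1 * L2) * (double)N
                        + (double)L2 / L1 + 1.0);
}
```

With K = 8 outputs and N = 8 taps, this sketch performs 64 multiply-accumulate operations using 38 loads, against 128 loads for the naive loop; the claim 30 expression evaluates to 48 for the same parameters. Whether a compiler actually keeps `acc`, `c`, and `win` in processor registers depends on the target and optimization settings.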
US09/790,281 1996-11-14 2001-02-22 Digital filter implementation suitable for execution, together with application code, on a same processor Expired - Fee Related US6618739B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09/790,281 US6618739B1 (en) 1996-11-14 2001-02-22 Digital filter implementation suitable for execution, together with application code, on a same processor
US10/651,922 US7398288B2 (en) 1996-11-14 2003-08-29 Efficient implementation of a filter

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/748,854 US6018755A (en) 1996-11-14 1996-11-14 Efficient implementation of an FIR filter on a general purpose processor
US09/460,262 US6209013B1 (en) 1996-11-14 1999-12-13 Efficient implementation of a filter
US09/790,281 US6618739B1 (en) 1996-11-14 2001-02-22 Digital filter implementation suitable for execution, together with application code, on a same processor

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/460,262 Continuation US6209013B1 (en) 1996-11-14 1999-12-13 Efficient implementation of a filter

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US10/651,922 Continuation US7398288B2 (en) 1996-11-14 2003-08-29 Efficient implementation of a filter

Publications (1)

Publication Number Publication Date
US6618739B1 (en) 2003-09-09

Family

ID=25011213

Family Applications (4)

Application Number Title Priority Date Filing Date
US08/748,854 Expired - Fee Related US6018755A (en) 1996-11-14 1996-11-14 Efficient implementation of an FIR filter on a general purpose processor
US09/460,262 Expired - Lifetime US6209013B1 (en) 1996-11-14 1999-12-13 Efficient implementation of a filter
US09/790,281 Expired - Fee Related US6618739B1 (en) 1996-11-14 2001-02-22 Digital filter implementation suitable for execution, together with application code, on a same processor
US10/651,922 Expired - Fee Related US7398288B2 (en) 1996-11-14 2003-08-29 Efficient implementation of a filter

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US08/748,854 Expired - Fee Related US6018755A (en) 1996-11-14 1996-11-14 Efficient implementation of an FIR filter on a general purpose processor
US09/460,262 Expired - Lifetime US6209013B1 (en) 1996-11-14 1999-12-13 Efficient implementation of a filter

Family Applications After (1)

Application Number Title Priority Date Filing Date
US10/651,922 Expired - Fee Related US7398288B2 (en) 1996-11-14 2003-08-29 Efficient implementation of a filter

Country Status (1)

Country Link
US (4) US6018755A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7398288B2 (en) 1996-11-14 2008-07-08 Broadcom Corporation Efficient implementation of a filter
US9015219B2 (en) 2011-10-28 2015-04-21 Stmicroelectronics International N.V. Apparatus for signal processing

Families Citing this family (68)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW417082B (en) * 1997-10-31 2001-01-01 Yamaha Corp Digital filtering processing method, device and Audio/Video positioning device
FR2776093A1 (en) * 1998-03-10 1999-09-17 Philips Electronics Nv PROGRAMMABLE PROCESSOR CIRCUIT PROVIDED WITH A RECONFIGURABLE MEMORY FOR PRODUCING A DIGITAL FILTER
US6240128B1 (en) * 1998-06-11 2001-05-29 Agere Systems Guardian Corp. Enhanced echo canceler
US6510444B2 (en) * 1999-06-16 2003-01-21 Motorola, Inc. Data processor architecture and instruction format for increased efficiency
US6826279B1 (en) * 2000-05-25 2004-11-30 3Com Corporation Base band echo cancellation using laguerre echo estimation
US6804350B1 (en) * 2000-12-21 2004-10-12 Cisco Technology, Inc. Method and apparatus for improving echo cancellation in non-voip systems
US6877021B2 (en) * 2001-03-28 2005-04-05 Classic Solutions Pty Limited Calculation method and apparatus
WO2003009471A1 (en) * 2001-07-16 2003-01-30 Advanced Communications Technologies (Australia) Pty Ltd Forward link filter
JP4282286B2 (en) * 2002-08-22 2009-06-17 富士通株式会社 Digital filter device
US7492848B2 (en) * 2005-04-13 2009-02-17 Texas Instruments Incorporated Method and apparatus for efficient multi-stage FIR filters
US20070150697A1 (en) * 2005-05-10 2007-06-28 Telairity Semiconductor, Inc. Vector processor with multi-pipe vector block matching
US8620980B1 (en) 2005-09-27 2013-12-31 Altera Corporation Programmable device with specialized multiplier blocks
US8041759B1 (en) 2006-02-09 2011-10-18 Altera Corporation Specialized processing block for programmable logic device
US8266198B2 (en) * 2006-02-09 2012-09-11 Altera Corporation Specialized processing block for programmable logic device
US8301681B1 (en) 2006-02-09 2012-10-30 Altera Corporation Specialized processing block for programmable logic device
US8266199B2 (en) * 2006-02-09 2012-09-11 Altera Corporation Specialized processing block for programmable logic device
US7836117B1 (en) * 2006-04-07 2010-11-16 Altera Corporation Specialized processing block for programmable logic device
KR20090008317A (en) 2006-05-02 2009-01-21 슈퍼불브스, 인크. Plastic led bulb
CN101484964A (en) 2006-05-02 2009-07-15 舒伯布尔斯公司 Method of light dispersion and preferential scattering of certain wavelengths of light for light-emitting diodes and bulbs constructed therefrom
EP2021683A4 (en) 2006-05-02 2010-10-27 Superbulbs Inc Heat removal design for led bulbs
US7822799B1 (en) 2006-06-26 2010-10-26 Altera Corporation Adder-rounder circuitry for specialized processing block in programmable logic device
US8386550B1 (en) 2006-09-20 2013-02-26 Altera Corporation Method for configuring a finite impulse response filter in a programmable logic device
CN101163240A (en) * 2006-10-13 2008-04-16 国际商业机器公司 Filter arrangement and method thereof
US7930336B2 (en) 2006-12-05 2011-04-19 Altera Corporation Large multiplier for programmable logic device
US7814137B1 (en) 2007-01-09 2010-10-12 Altera Corporation Combined interpolation and decimation filter for programmable logic device
US8650231B1 (en) 2007-01-22 2014-02-11 Altera Corporation Configuring floating point operations in a programmable device
US7865541B1 (en) 2007-01-22 2011-01-04 Altera Corporation Configuring floating point operations in a programmable logic device
US8645450B1 (en) 2007-03-02 2014-02-04 Altera Corporation Multiplier-accumulator circuitry and methods
US7949699B1 (en) 2007-08-30 2011-05-24 Altera Corporation Implementation of decimation filter in integrated circuit device using ram-based data storage
WO2009045438A1 (en) 2007-10-03 2009-04-09 Superbulbs, Inc. Glass led light bulbs
CN101896766B (en) * 2007-10-24 2014-04-23 开关电灯公司 Diffuser for LED light sources
US8959137B1 (en) 2008-02-20 2015-02-17 Altera Corporation Implementing large multipliers in a programmable integrated circuit device
US20100057472A1 (en) * 2008-08-26 2010-03-04 Hanks Zeng Method and system for frequency compensation in an audio codec
US8307023B1 (en) 2008-10-10 2012-11-06 Altera Corporation DSP block for implementing large multiplier on a programmable integrated circuit device
US8468192B1 (en) 2009-03-03 2013-06-18 Altera Corporation Implementing multipliers in a programmable integrated circuit device
US8706790B1 (en) 2009-03-03 2014-04-22 Altera Corporation Implementing mixed-precision floating-point operations in a programmable integrated circuit device
US8645449B1 (en) 2009-03-03 2014-02-04 Altera Corporation Combined floating point adder and subtractor
US8650236B1 (en) 2009-08-04 2014-02-11 Altera Corporation High-rate interpolation or decimation filter in integrated circuit device
US8396914B1 (en) 2009-09-11 2013-03-12 Altera Corporation Matrix decomposition in an integrated circuit device
US8412756B1 (en) 2009-09-11 2013-04-02 Altera Corporation Multi-operand floating point operations in a programmable integrated circuit device
US8539016B1 (en) 2010-02-09 2013-09-17 Altera Corporation QR decomposition in an integrated circuit device
US7948267B1 (en) 2010-02-09 2011-05-24 Altera Corporation Efficient rounding circuits and methods in configurable integrated circuit devices
US8601044B2 (en) * 2010-03-02 2013-12-03 Altera Corporation Discrete Fourier Transform in an integrated circuit device
US8484265B1 (en) 2010-03-04 2013-07-09 Altera Corporation Angular range reduction in an integrated circuit device
US8510354B1 (en) 2010-03-12 2013-08-13 Altera Corporation Calculation of trigonometric functions in an integrated circuit device
US8539014B2 (en) * 2010-03-25 2013-09-17 Altera Corporation Solving linear matrices in an integrated circuit device
US8862650B2 (en) 2010-06-25 2014-10-14 Altera Corporation Calculation of trigonometric functions in an integrated circuit device
US8589463B2 (en) 2010-06-25 2013-11-19 Altera Corporation Calculation of trigonometric functions in an integrated circuit device
US8577951B1 (en) 2010-08-19 2013-11-05 Altera Corporation Matrix operations in an integrated circuit device
US8645451B2 (en) 2011-03-10 2014-02-04 Altera Corporation Double-clocked specialized processing block in an integrated circuit device
US9600278B1 (en) 2011-05-09 2017-03-21 Altera Corporation Programmable device using fixed and configurable logic to implement recursive trees
US8812576B1 (en) 2011-09-12 2014-08-19 Altera Corporation QR decomposition in an integrated circuit device
US8949298B1 (en) 2011-09-16 2015-02-03 Altera Corporation Computing floating-point polynomials in an integrated circuit device
US9053045B1 (en) 2011-09-16 2015-06-09 Altera Corporation Computing floating-point polynomials in an integrated circuit device
US8591069B2 (en) 2011-09-21 2013-11-26 Switch Bulb Company, Inc. LED light bulb with controlled color distribution using quantum dots
US8762443B1 (en) 2011-11-15 2014-06-24 Altera Corporation Matrix operations in an integrated circuit device
US8543634B1 (en) 2012-03-30 2013-09-24 Altera Corporation Specialized processing block for programmable integrated circuit device
US9098332B1 (en) 2012-06-01 2015-08-04 Altera Corporation Specialized processing block with fixed- and floating-point structures
US8996600B1 (en) 2012-08-03 2015-03-31 Altera Corporation Specialized processing block for implementing floating-point multiplier with subnormal operation support
US9207909B1 (en) 2012-11-26 2015-12-08 Altera Corporation Polynomial calculations optimized for programmable integrated circuit device structures
US9189200B1 (en) 2013-03-14 2015-11-17 Altera Corporation Multiple-precision processing block in a programmable integrated circuit device
US9348795B1 (en) 2013-07-03 2016-05-24 Altera Corporation Programmable device using fixed and configurable logic to implement floating-point rounding
US9792118B2 (en) * 2013-11-15 2017-10-17 Qualcomm Incorporated Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision filter vector processing operations with reduced sample re-fetching and power consumption, and related vector processor systems and methods
US9684488B2 (en) 2015-03-26 2017-06-20 Altera Corporation Combined adder and pre-adder for high-radix multiplier circuit
US10942706B2 (en) 2017-05-05 2021-03-09 Intel Corporation Implementation of floating-point trigonometric functions in an integrated circuit device
US11381257B2 (en) * 2018-05-23 2022-07-05 Maxlinear, Inc. Capacity achieving multicarrier modulation and coding systems and methods
CN114258647B (en) * 2019-08-26 2024-04-09 三菱电机株式会社 Receiver with a receiver body
CN111751037B (en) * 2020-05-15 2021-08-31 中国人民解放军军事科学院国防工程研究院 Electric measurement explosion test data compression method

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4809209A (en) 1985-08-26 1989-02-28 Rockwell International Corporation Hybrid charge-transfer-device filter structure
US5047972A (en) 1987-04-24 1991-09-10 Hitachi, Ltd. Digital signal processor with memory having data shift function
US5050118A (en) 1987-06-12 1991-09-17 Fanuc Ltd. PLC device having combined hardware and software input filtering
US5210705A (en) * 1990-02-28 1993-05-11 Texas Instruments Incorporated Digital filtering with single-instruction, multiple-data processor
US5307300A (en) * 1991-01-30 1994-04-26 Oki Electric Industry Co., Ltd. High speed processing unit
US5381355A (en) * 1993-12-17 1995-01-10 Elsag International N.V. Method for filtering digital signals in a pressure transmitter
US5548541A (en) 1994-08-08 1996-08-20 Interstate Electronics Corporation Finite impulse response filter for modulator in digital data transmission system
US5566101A (en) 1995-08-15 1996-10-15 Sigmatel, Inc. Method and apparatus for a finite impulse response filter processor
US5615234A (en) * 1994-05-19 1997-03-25 Spacelabs Medical, Inc. Digital high-pass filter having baseline restoration means
US5636151A (en) 1994-06-15 1997-06-03 Nec Corporation Adaptive filter capable of removing a residual echo at a rapid speed
US5636153A (en) 1993-10-20 1997-06-03 Yamaha Corporation Digital signal processing circuit
US5646983A (en) 1993-03-25 1997-07-08 U.S. Robotics Access Corp. Host computer digital signal processing system for communicating over voice-grade telephone channels
US5678059A (en) 1994-02-18 1997-10-14 Lucent Technologies Inc. Technique for time-sharing a microprocessor between a computer and a modem
US5799064A (en) 1995-08-31 1998-08-25 Motorola, Inc. Apparatus and method for interfacing between a communications channel and a processor for data transmission and reception
US5802153A (en) 1996-02-28 1998-09-01 Motorola, Inc. Apparatus and method for interfacing between a communications channel and a processor for data transmission and reception
US5801695A (en) 1994-12-09 1998-09-01 Townshend; Brent High speed communications system for analog subscriber connections
US5931950A (en) 1997-06-17 1999-08-03 Pc-Tel, Inc. Wake-up-on-ring power conservation for host signal processing communication system
US5940459A (en) 1996-07-09 1999-08-17 Pc-Tel, Inc. Host signal processor modem and telephone
US5960035A (en) 1995-09-29 1999-09-28 Motorola Inc. Method and apparatus for load balancing for a processor operated data communications device
US5982814A (en) 1996-08-01 1999-11-09 Pc-Tel, Inc. Dynamic control of processor utilization by a host signal processing modem
US5995540A (en) 1997-01-08 1999-11-30 Altocom, Inc. System and method for reducing processing requirements of modem during idle receive time
US6112266A (en) 1998-01-22 2000-08-29 Pc-Tel, Inc. Host signal processing modem using a software circular buffer in system memory and direct transfers of samples to maintain a communication signal

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6018755A (en) 1996-11-14 2000-01-25 Altocom, Inc. Efficient implementation of an FIR filter on a general purpose processor
US5862063A (en) * 1996-12-20 1999-01-19 Compaq Computer Corporation Enhanced wavetable processing technique on a vector processor having operand routing and slot selectable operations
US6411976B1 (en) * 1998-08-10 2002-06-25 Agere Systems Guardian Corp. Efficient filter implementation

Patent Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4809209A (en) 1985-08-26 1989-02-28 Rockwell International Corporation Hybrid charge-transfer-device filter structure
US5047972A (en) 1987-04-24 1991-09-10 Hitachi, Ltd. Digital signal processor with memory having data shift function
US5050118A (en) 1987-06-12 1991-09-17 Fanuc Ltd. PLC device having combined hardware and software input filtering
US5210705A (en) * 1990-02-28 1993-05-11 Texas Instruments Incorporated Digital filtering with single-instruction, multiple-data processor
US5307300A (en) * 1991-01-30 1994-04-26 Oki Electric Industry Co., Ltd. High speed processing unit
US5646983A (en) 1993-03-25 1997-07-08 U.S. Robotics Access Corp. Host computer digital signal processing system for communicating over voice-grade telephone channels
US6097794A (en) 1993-03-25 2000-08-01 U.S. Robotics Access Corp. Host computer digital signal processing system for communicating over voice-grade telephone channels
US5724413A (en) 1993-03-25 1998-03-03 U.S. Robotics, Inc. Host computer digital signal processing system for communicating over voice-grade telephone channels
US5872836A (en) 1993-03-25 1999-02-16 3Com Corporation Host computer digital signal processing system for communicating over voice-grade telephone channels
US5636153A (en) 1993-10-20 1997-06-03 Yamaha Corporation Digital signal processing circuit
US5381355A (en) * 1993-12-17 1995-01-10 Elsag International N.V. Method for filtering digital signals in a pressure transmitter
US5678059A (en) 1994-02-18 1997-10-14 Lucent Technologies Inc. Technique for time-sharing a microprocessor between a computer and a modem
US5615234A (en) * 1994-05-19 1997-03-25 Spacelabs Medical, Inc. Digital high-pass filter having baseline restoration means
US5636151A (en) 1994-06-15 1997-06-03 Nec Corporation Adaptive filter capable of removing a residual echo at a rapid speed
US5548541A (en) 1994-08-08 1996-08-20 Interstate Electronics Corporation Finite impulse response filter for modulator in digital data transmission system
US5801695A (en) 1994-12-09 1998-09-01 Townshend; Brent High speed communications system for analog subscriber connections
US5809075A (en) 1994-12-09 1998-09-15 Townshend; Brent High speed communications system for analog subscriber connections
US5835538A (en) 1994-12-09 1998-11-10 Townshend; Brent High speed communications system for analog subscriber connections
US5566101A (en) 1995-08-15 1996-10-15 Sigmatel, Inc. Method and apparatus for a finite impulse response filter processor
US5799064A (en) 1995-08-31 1998-08-25 Motorola, Inc. Apparatus and method for interfacing between a communications channel and a processor for data transmission and reception
US5960035A (en) 1995-09-29 1999-09-28 Motorola Inc. Method and apparatus for load balancing for a processor operated data communications device
US5802153A (en) 1996-02-28 1998-09-01 Motorola, Inc. Apparatus and method for interfacing between a communications channel and a processor for data transmission and reception
US5940459A (en) 1996-07-09 1999-08-17 Pc-Tel, Inc. Host signal processor modem and telephone
US6252920B1 (en) 1996-07-09 2001-06-26 Pc-Tel, Inc. Host signal processor modem and telephone
US5982814A (en) 1996-08-01 1999-11-09 Pc-Tel, Inc. Dynamic control of processor utilization by a host signal processing modem
US5995540A (en) 1997-01-08 1999-11-30 Altocom, Inc. System and method for reducing processing requirements of modem during idle receive time
US5931950A (en) 1997-06-17 1999-08-03 Pc-Tel, Inc. Wake-up-on-ring power conservation for host signal processing communication system
US6112266A (en) 1998-01-22 2000-08-29 Pc-Tel, Inc. Host signal processing modem using a software circular buffer in system memory and direct transfers of samples to maintain a communication signal

Also Published As

Publication number Publication date
US7398288B2 (en) 2008-07-08
US6018755A (en) 2000-01-25
US20040039764A1 (en) 2004-02-26
US6209013B1 (en) 2001-03-27

Similar Documents

Publication Publication Date Title
US6618739B1 (en) Digital filter implementation suitable for execution, together with application code, on a same processor
US6112218A (en) Digital filter with efficient quantization circuitry
US6260053B1 (en) Efficient and scalable FIR filter architecture for decimation
CA1211523A (en) Echo canceller for a baseband data signal
US5864545A (en) System and method for improving convergence during modem training and reducing computational load during steady-state modem operations
US4016410A (en) Signal processor with digital filter and integrating network
JP4445041B2 (en) Echo cancellation with adaptive dual filter
GB2064902A (en) Telephone line circuit
JPH0418808A (en) Automatic equalizer and semiconductor integrated circuit
US6304133B1 (en) Moving average filter
US6195386B1 (en) Parallel decision feedback equalizer
EP0054024B1 (en) Subscriber line audio processing circuit apparatus
US6411976B1 (en) Efficient filter implementation
JP2002164819A (en) Echo canceler
US4319360A (en) Predictor stage for a digit rate reduction system
FI72238C (en) INTERPOLATIVAL ANALOG-DIGITAL NETWORK.
Gay-Bellile et al. Architecture of a programmable FIR filter co-processor
Wesolowski et al. A simplified two-stage equalizer with a reduced number of multiplications for data transmission over voiceband telephone links
CA2282567C (en) Adaptive dual filter echo cancellation
Koh et al. Algorithms and architecture of a VLSI signal processor for ANSI standard ISDN transceiver
Okello et al. A new architecture for implementing pipelined ADF
Bonet et al. Architecture of the 2B1Q Symbol Receiver in the MC145472 ISDN U Transceiver
Chen Efficient multiplierless filter designs for fixed-coefficient and adaptive filtering
Mizuno et al. A high-throughput pipelined architecture for blind adaptive equalization with minimum latency
Tiwari et al. Linear Predictive Coding in a New Binary System

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: BROADCOM CORPORATION, CALIFORNIA

Free format text: MERGER;ASSIGNOR:ALTOCOM, INC.;REEL/FRAME:016097/0079

Effective date: 20040526

REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 4

SULP Surcharge for late payment
REMI Maintenance fee reminder mailed
FPAY Fee payment

Year of fee payment: 8

SULP Surcharge for late payment

Year of fee payment: 7

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20150909

AS Assignment

Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA

Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:037806/0001

Effective date: 20160201

AS Assignment

Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BROADCOM CORPORATION;REEL/FRAME:041706/0001

Effective date: 20170120

AS Assignment

Owner name: BROADCOM CORPORATION, CALIFORNIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS;ASSIGNOR:BANK OF AMERICA, N.A., AS COLLATERAL AGENT;REEL/FRAME:041712/0001

Effective date: 20170119