This web site covers the design and operation of the MELP and MELPe vocoders, summarizes the most up-to-date information about them, and provides useful resources and solutions related to MELP (MIL-STD-3005) and its later enhanced version, MELPe (STANAG-4591).
Introduction to MELP and MELPe Vocoders
Mixed-excitation linear prediction (MELP) is a United States Department of Defense (US DoD) speech coding standard used mainly in military applications, satellite communications, secure voice, and secure radio devices. Its standardization and later development were led and supported by the NSA and NATO. MELPe is the later "enhanced MELP" vocoder.
History of MELP and MELPe Vocoders
Early MELP Vocoder
The initial MELP was invented by Alan McCree around 1995 [1], and standardized in 1997 as MIL-STD-3005.[2] It surpassed other candidate vocoders in the US DoD competition, including: [3]
(a) Frequency Selective Harmonic Coder (FSHC),
(b) Advanced Multi-Band Excitation (AMBE),
(c) Enhanced Multiband Excitation (EMBE),
(d) Sinusoid Transform Coder (STC),
(e) Subband LPC Coder (SBC), and
(f) Waveform Interpolative (WI) Coder.
MELP achieved better quality than the first five candidates and, thanks to its lower complexity than the WI coder, won the DoD competition and was selected for MIL-STD-3005. [4]
US MIL-STD-3005, from MELP to MELPe Vocoder
Between 1998 and 2001, a new MELP-based vocoder was created at half the rate (i.e. 1200 bit/s), and substantial enhancements were added to MIL-STD-3005 by SignalCom (later acquired by Microsoft), AT&T Corporation, and Compandent. These included:
(a) additional new vocoder at half the rate (i.e. 1200 bit/s),
(b) substantially improved encoding (analysis),
(c) substantially improved decoding (synthesis),
(d) Noise-Preprocessing for removing background noise,
(e) transcoding between the 2400 bit/s and 1200 bit/s bitstreams, and
(f) new postfilter.
This fairly significant development aimed to create a new coder at half the rate that would be interoperable with the old MELP standard. This enhanced MELP (also known as MELPe) was adopted as the new MIL-STD-3005 in 2001, in the form of annexes and supplements to the original MIL-STD-3005, delivering the same quality as the old 2400 bit/s MELP at half the rate. One of the greatest advantages of the new 2400 bit/s MELPe is that it shares the same bit format as MELP, and hence can interoperate with legacy MELP systems while delivering better quality at both ends. MELPe provides much better quality than all older military standards, especially in noisy environments such as battlefields, vehicles, and aircraft.
NATO STANAG-4591 MELPe vocoder
In 2002, following extensive competition and testing, the 2400 and 1200 bit/s US DoD MELPe was also adopted as a NATO standard, known as STANAG-4591.[5] The NATO performance measurements covered voice intelligibility, voice quality, speaker recognition, language dependency, speaker dependency, 10 acoustic noise environments, a transmission channel with 1% BER, tandem operation with a 16 kbit/s CVSD vocoder, whispered speech, and real-time implementation. The test data comprised over 36,000 files, or 500 hours of speech, under various conditions and languages.
As part of NATO testing for the new standard, MELPe was tested against other candidates such as France's HSX (Harmonic Stochastic eXcitation) and Turkey's SB-LPC (Split-Band Linear Predictive Coding), as well as older secure voice standards such as FS1015 LPC-10e (2.4 kbit/s), FS1016 CELP (4.8 kbit/s) and CVSD (16 kbit/s). MELPe won the NATO competition as well, surpassing the quality of all other candidates and of all the older secure voice standards (CVSD, CELP and LPC-10e). The NATO competition concluded that MELPe substantially improved performance (in terms of speech quality, intelligibility, and noise immunity) while reducing throughput requirements. The NATO testing also included interoperability tests, used over 200 hours of speech data, and was conducted by 3 test laboratories worldwide.
Compandent Inc., as part of MELPe-based projects performed for NSA and NATO, provided both with a special test-bed platform known as the MELCODER device, which served as the golden reference for real-time implementation of MELPe. The low-cost FLEXI-232 Data Terminal Equipment (DTE) units made by Compandent, which are based on the MELCODER golden reference, are widely used for evaluating and testing MELPe in real time, over various channels and networks, and in field conditions.
The STANAG-4591 MELPe vocoder was tested along with other codecs in several acoustic noise environments, under 1% Bit Error Rate (BER), and CVSD-codec tandem. The Mean Opinion Score (MOS) subjective test results are summarized in the table below.
Condition \ Coder | MELPe 2400 | MELPe 1200 | CVSD | CELP | LPC10e |
Quiet | 3.88 | 3.47 | 2.93 | 3.86 | 2.78 |
1% BER | 3.86 | 3.32 | 2.91 | 3.80 | 2.60 |
Office Noise | 3.75 | 3.29 | 2.86 | 3.53 | 2.68 |
MCE Noise | 3.12 | 2.68 | 2.50 | 3.06 | 2.07 |
F15 Fighter | 3.86 | 3.52 | 3.03 | 3.62 | 2.63 |
Bradley | 3.85 | 3.50 | 3.00 | 3.60 | 2.65 |
Black Hawk | 3.60 | 3.20 | 2.97 | 3.40 | 1.80 |
Automobile | 3.61 | 3.15 | 2.97 | 3.42 | 1.81 |
HMMWV | 2.41 | 2.07 | 2.03 | 1.94 | 1.10 |
12dB Babble | 2.64 | 2.30 | 2.58 | 2.71 | 1.55 |
6dB Babble | 1.74 | 1.54 | 2.11 | 1.98 | 1.10 |
Tandem CVSD | 2.68 | 2.22 | 2.57 | 2.65 | 1.74 |
Table 1. Mean Opinion Score (MOS) for MELPe vocoders and prior military standard vocoders in different conditions (from NATO STANAG-4591 Phase 2 testing)
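To make Table 1 easier to compare across coders, the following Python sketch computes an unweighted mean MOS per coder and the top-scoring coder per condition. The values are copied from Table 1; the unweighted average and the names `MOS`, `mean_mos_per_coder`, and `best_coder_per_condition` are assumptions for illustration only, and this is not the combined performance index used in the NATO evaluation.

```python
# MOS values copied from Table 1; column order follows the table header.
CODERS = ["MELPe 2400", "MELPe 1200", "CVSD", "CELP", "LPC10e"]
MOS = {
    "Quiet":        [3.88, 3.47, 2.93, 3.86, 2.78],
    "1% BER":       [3.86, 3.32, 2.91, 3.80, 2.60],
    "Office Noise": [3.75, 3.29, 2.86, 3.53, 2.68],
    "MCE Noise":    [3.12, 2.68, 2.50, 3.06, 2.07],
    "F15 Fighter":  [3.86, 3.52, 3.03, 3.62, 2.63],
    "Bradley":      [3.85, 3.50, 3.00, 3.60, 2.65],
    "Black Hawk":   [3.60, 3.20, 2.97, 3.40, 1.80],
    "Automobile":   [3.61, 3.15, 2.97, 3.42, 1.81],
    "HMMWV":        [2.41, 2.07, 2.03, 1.94, 1.10],
    "12dB Babble":  [2.64, 2.30, 2.58, 2.71, 1.55],
    "6dB Babble":   [1.74, 1.54, 2.11, 1.98, 1.10],
    "Tandem CVSD":  [2.68, 2.22, 2.57, 2.65, 1.74],
}


def mean_mos_per_coder(table=MOS, coders=CODERS):
    """Unweighted mean MOS over all conditions (illustrative summary only)."""
    count = len(table)
    return {
        coder: sum(row[i] for row in table.values()) / count
        for i, coder in enumerate(coders)
    }


def best_coder_per_condition(table=MOS, coders=CODERS):
    """Name of the coder with the highest MOS in each condition."""
    return {cond: coders[row.index(max(row))] for cond, row in table.items()}


if __name__ == "__main__":
    for coder, mean in mean_mos_per_coder().items():
        print(f"{coder:11s} mean MOS = {mean:.2f}")
    for cond, coder in best_coder_per_condition().items():
        print(f"{cond:13s} best: {coder}")
```

Looking at each condition separately matters because the averages hide cases, such as the babble-noise rows, where an older coder scores higher than MELPe.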
The NATO STANAG-4591 MELPe competition's combined performance index is illustrated in the figure below.
In 2005, a new 600 bit/s MELPe variation by Thales Group (France) was added to the NATO standard STANAG-4591, without the extensive competition and testing performed for the 2400/1200 bit/s MELPe. [6] The following features were added for the 600 bit/s STANAG-4591:
(a) additional new vocoder at quarter the rate (i.e. 600 bit/s),
(b) transcoding between the 2400 bit/s and 600 bit/s bitstreams, and
(c) adjusted postfilter.
300 bit/s MELP Vocoder
In 2010, MIT Lincoln Laboratory, Compandent, BBN, and General Dynamics also developed a 300 bit/s MELP device for DARPA.[7] Its quality is better than that of the 600 bit/s MELPe, but its algorithmic delay is longer.
MELPe Vocoder Implementations
MELPe has been implemented in many applications, including secure radio devices, satellite communications, VoIP, and cellphone applications. Such applications require additional expertise in combating channel errors, packet loss, and synchronization loss, which in turn requires an understanding of how sensitive the individual MELPe bits are to errors. The 2400 bit/s and 1200 bit/s MELPe frames include a synchronization bit, which is useful in serial communications.
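As a rough illustration of how a per-frame synchronization bit can help a serial receiver find frame boundaries, the Python sketch below searches a bit stream for the bit position that toggles most consistently from one 54-bit frame to the next. Two things are assumptions made only for this sketch and should be checked against the standard: that the sync bit alternates between 0 and 1 on successive frames, and that it sits at index `SYNC_INDEX` within the frame. The helper names `make_stream` and `find_frame_offset` are hypothetical.

```python
import random

FRAME_BITS = 54  # bits per 2400 bit/s frame, including the sync bit
SYNC_INDEX = 0   # ASSUMPTION: position of the sync bit inside a frame


def make_stream(num_frames, skew, seed=1,
                frame_bits=FRAME_BITS, sync_index=SYNC_INDEX):
    """Build a synthetic bit stream: `skew` junk bits, then random frames
    whose sync bit is ASSUMED to alternate 0, 1, 0, 1, ..."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(skew)]
    for frame_number in range(num_frames):
        frame = [rng.randint(0, 1) for _ in range(frame_bits)]
        frame[sync_index] = frame_number % 2
        bits.extend(frame)
    return bits


def find_frame_offset(bits, frame_bits=FRAME_BITS):
    """Return (offset, score) for the bit position that toggles most
    consistently from frame to frame; score 1.0 means it always toggles."""
    best_offset, best_score = 0, -1.0
    for offset in range(frame_bits):
        column = bits[offset::frame_bits]
        if len(column) < 2:
            continue
        toggles = sum(a != b for a, b in zip(column, column[1:]))
        score = toggles / (len(column) - 1)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


if __name__ == "__main__":
    stream = make_stream(num_frames=40, skew=7)
    offset, score = find_frame_offset(stream)
    print("sync bit found at stream offset", offset, "with score", score)
```

A real receiver would also have to tolerate channel errors, for example by accepting a score slightly below 1.0 and re-checking alignment continuously.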
MELPe Intellectual property rights
Note that MELPe (and/or its derivatives) is subject to IPR licensing from the following companies: Texas Instruments (2400 bit/s MELP algorithm / source code), Microsoft (1200 bit/s transcoder), Thales Group (600 bit/s rate), AT&T (Noise Pre-Processor, NPP), and Compandent.
About the MELPe Vocoder Algorithm
MELPe, the Enhanced Mixed Excitation Linear Predictive (MELP) vocoder, known as military standard MIL-STD-3005 and NATO STANAG-4591, is a triple-rate, low-bit-rate coder that operates at 2400, 1200 and 600 bit/s. It improves on previous military standards including the earlier MIL-STD-3005 (MELP), FS-1016 (CELP), FS-1015 (LPC10e), and CVSD.
About MELP Vocoder (MIL-STD-3005)
General
The Mixed Excitation Linear Prediction coder is based on the traditional Linear Prediction Coder (LPC) parametric model, but also includes five additional features. They are mixed excitation, aperiodic pulses, adaptive spectral enhancement, pulse dispersion, and Fourier magnitude modeling. A MELP frame interval is 22.5 ms ± 0.01 percent in duration and contains 180 voice samples (8,000 samples/second). These features are illustrated in the MELP decoder block diagram shown in the figure below.
The mixed excitation is implemented using a multi-band mixing model. This model can simulate frequency dependent voicing strength using an adaptive filtering structure implemented with a fixed filter bank. The primary effect of this mixed excitation is to reduce the buzz usually associated with LPC vocoders, especially in broadband acoustic noise.
When the input speech is voiced, the MELP coder can synthesize using either periodic or aperiodic pulses. Aperiodic pulses are used most often during transition regions between voiced and unvoiced segments of the speech signal. This feature enables the decoder to reproduce erratic glottal pulses without introducing tonal sounds.
The adaptive spectral enhancement filter is based on the poles of the linear prediction synthesis filter. Its use enhances the formant structure of the synthetic speech and improves the match between the synthetic and natural bandpass waveforms. It also gives the synthetic speech a more natural quality.
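As a rough illustration of the multi-band mixing idea described above, the Python sketch below blends a pulse train and noise in two frequency bands according to per-band voicing strengths. The two-band split, the moving-average filter, and all names (`moving_average`, `mixed_excitation`, `v_low`, `v_high`) are assumptions chosen for brevity; they are not the filter bank or parameters defined in MIL-STD-3005.

```python
import random


def moving_average(x, taps=5):
    """Crude low-pass filter: centered moving average with zero padding."""
    half = taps // 2
    out = []
    for n in range(len(x)):
        window = [x[k] for k in range(n - half, n + half + 1)
                  if 0 <= k < len(x)]
        out.append(sum(window) / taps)
    return out


def mixed_excitation(num_samples, pitch_period, v_low, v_high, seed=0):
    """Blend a pulse train and noise separately in a low and a high band.

    v_low and v_high are voicing strengths in [0, 1]: 1.0 means the band is
    driven only by pulses, 0.0 means it is driven only by noise.
    """
    rng = random.Random(seed)
    pulses = [1.0 if n % pitch_period == 0 else 0.0
              for n in range(num_samples)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]

    # Per-band mix of the two sources, before band limiting.
    low_mix = [v_low * p + (1.0 - v_low) * w for p, w in zip(pulses, noise)]
    high_mix = [v_high * p + (1.0 - v_high) * w for p, w in zip(pulses, noise)]

    # Low band: low-pass output. High band: input minus its low-pass output,
    # so the two bands are complementary and sum back to the full band.
    low_band = moving_average(low_mix)
    high_band = [x - y for x, y in zip(high_mix, moving_average(high_mix))]
    return [a + b for a, b in zip(low_band, high_band)]


if __name__ == "__main__":
    # One 180-sample frame: strongly voiced low band, mostly noisy high band.
    frame = mixed_excitation(180, pitch_period=40, v_low=0.9, v_high=0.2)
    print(len(frame), "excitation samples")
```

Because the two bands are complementary, setting both strengths to 1.0 reproduces the plain pulse train and setting both to 0.0 reproduces the noise, which are the two excitation types of a classic LPC vocoder.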
Analog Specification
The recommended analog requirements for the MELP coder are for a nominal bandwidth ranging from 100 Hz to 3800 Hz. Although the MELP coder will operate with a more band limited signal, performance degradation will result. To ensure proper operation of the MELP coder, the A-D conversion process should produce peak values of (or near) -32768 and 32767 (16-bit signed samples). Additionally, the coder should have unity gain, which means that the output speech level should match that of the input speech.
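The level requirements above can be checked on captured audio with a few lines of Python. This is a minimal sketch assuming the speech is available as a list of 16-bit signed integer samples; the function names `peak_dbfs` and `gain_db` are hypothetical, and no pass/fail thresholds are implied because the text does not give any.

```python
import math

FULL_SCALE = 32768  # magnitude of the most negative 16-bit signed sample


def peak_dbfs(samples):
    """Peak level in dB relative to 16-bit full scale (0 dB = full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / FULL_SCALE)


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_db(input_samples, output_samples):
    """Level change from coder input to coder output; 0 dB is unity gain."""
    return 20.0 * math.log10(rms(output_samples) / rms(input_samples))


if __name__ == "__main__":
    # A 500 Hz tone sampled at 8000 samples/second at half of full scale.
    tone = [round(16384 * math.sin(2 * math.pi * 500 * n / 8000))
            for n in range(180)]
    print(f"peak level: {peak_dbfs(tone):.2f} dBFS")
    print(f"gain through an ideal unity-gain path: {gain_db(tone, tone):.2f} dB")
```

A peak level far below 0 dBFS suggests the A-D stage is not using the full 16-bit range, and a gain far from 0 dB between input and output suggests the unity-gain recommendation is not met.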
Parameter quantization and encoding
The MELP parameters which are quantized and transmitted are the final pitch; the bandpass voicing strengths; the two gain values; the linear prediction coefficients; the Fourier magnitudes; and the aperiodic flag. The use of the specified quantization procedures is required for interoperability among various implementations.
About MELPe Vocoder (STANAG-4591)
General
The Enhanced Mixed Excitation Linear Prediction (MELPe) coder block diagram is shown below. Substantial enhancements were added to MIL-STD-3005 by SignalCom (later acquired by Microsoft), AT&T Corporation, and Compandent. These included:
(a) additional new vocoders at half and quarter the rate (i.e. 1200 and 600 bit/s),
(b) substantially improved encoding (analysis),
(c) substantially improved decoding (synthesis),
(d) Noise-Preprocessing for removing background noise,
(e) transcoding between the 2400 bit/s and 1200 bit/s bitstreams, and 2400 bit/s and 600 bit/s bitstreams, and
(f) new postfilter.
The STANAG-4591 MELPe vocoder encoding operation is performed as follows.
- At 2400 bit/s, the MELPe encodes voice into 22.5 msec frames (180 samples per frame for speech sampled at 8000 samples/sec), using 54 bits per frame (including 1 synchronization bit).
- At 1200 bit/s, the MELPe encodes 3 analyzed speech frames into 67.5 msec super-frames (540 samples), using 81 bits per super-frame (including 1 synchronization bit).
- At 600 bit/s, the MELPe encodes 4 analyzed speech frames into 90 msec super-frames (720 samples), using 54 bits per super-frame.
That is illustrated in the figure below (from the STANAG-4591 standard) and summarized in the table below.
Rate | Frame Size (samples) | Frame Size (msec) | Bits / Frame |
2400 bit/s | 180 | 22.5 | 54 |
1200 bit/s | 540 | 67.5 | 81 |
600 bit/s | 720 | 90.0 | 54 |
Table 2. MELPe vocoder's frame sizes and the number of bits per frame for the three rates
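The figures in Table 2 are tied together by simple arithmetic: the frame duration is the number of samples divided by the 8000 samples/sec sampling rate, and the bit rate is the number of bits per frame divided by that duration. The short Python sketch below checks this; the names `FRAME_FORMATS`, `frame_duration_ms`, and `bit_rate` are chosen here for illustration.

```python
SAMPLE_RATE_HZ = 8000

# nominal rate (bit/s) -> (samples per frame or super-frame, bits per frame)
FRAME_FORMATS = {
    2400: (180, 54),
    1200: (540, 81),
    600: (720, 54),
}


def frame_duration_ms(samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of one frame (or super-frame) in milliseconds."""
    return 1000 * samples / sample_rate_hz


def bit_rate(samples, bits, sample_rate_hz=SAMPLE_RATE_HZ):
    """Bits per second implied by the frame size and the bits per frame."""
    return bits * sample_rate_hz / samples


if __name__ == "__main__":
    for nominal, (samples, bits) in FRAME_FORMATS.items():
        print(f"{nominal} bit/s: {frame_duration_ms(samples)} ms per frame, "
              f"{bit_rate(samples, bits)} bit/s computed")
```

For example, 54 bits every 22.5 msec gives 2400 bit/s, and the same 54 bits spread over a 90 msec super-frame gives 600 bit/s.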
Secure Communications using MELPe Vocoder
Secure Voice using Vocoders
Vocoders such as Linear Predictive Coding (LPC10e, also known as STANAG-4198 and FS-1015), Continuously Variable Slope Delta modulation (CVSD), the Code-Excited Linear Predictive (CELP) coder (FS-1016), and MELP have been used as part of secure communication systems. In such systems the vocoder frame bits are usually encrypted and then protected by FEC, and subsequently transmitted over a signalling protocol that may involve packetizing, synchronization, modulation, etc. For example, the past secure voice standard LPC10e vocoder (FS-1015 and NATO standard STANAG-4198) used encryption and FEC protocols for tactical secure voice as described in STANAG-4197. MELPe (STANAG-4591) is also used for secure communications over the signalling protocol known as the Secure Communication Interoperability Protocol (SCIP), and in a Variable Data Rate (VDR) vocoder, encryption, and FEC system known as the Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS), both described below.
SCIP-210
The Secure Communication Interoperability Protocol (SCIP) is an application signalling protocol used to transmit secure voice, using the STANAG-4591 MELPe codec and the G729D codec, over digital cellular systems such as GSM and CDMA and over digital mobile satellite systems.
SCIP is a multinational communications standard for an application-layer protocol, developed in 2006 by the National Security Agency (NSA) to enable interoperable secure communications among allies and partners around the globe. [8] The SCIP-210 Signaling Plan is the specification that defines the application-layer signaling used to negotiate a secure end-to-end session between two communication devices, independent of network transport. The channels supported by SCIP include digital cellular systems such as GSM and CDMA, digital mobile satellite systems, and a variety of other narrowband digital systems. Its Secure and Clear MELPe Voice applications use the MELPe codec, G729D, Voice Activity Detection based Discontinuous Voice (DTX Voice), and Comfort Noise; the latter two are implemented similarly to the GSM standard. Compandent's MELPe software under Android was also used and tested by NATO as part of the development of SCIP.
Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS)
The Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) is a variable data rate system that is based on the STANAG-4591 MELPe vocoder and also offers scalability to enhanced coding at higher rates such as 8000, 12000 and 16000 bit/s.
In 2009, TSVCIS was created by the National Security Agency (NSA) and the Naval Research Laboratory (NRL). [9] It includes the following two main voice categories:
- TSVCIS Narrowband (NB) Waveform which is based on STANAG-4591, the 600 bit/s, 1200 bit/s and 2400 bit/s NATO Interoperable Narrow Band Voice Coder, i.e. MELPe codec,
- TSVCIS Wideband (WB) Waveform, which operates at 8000, 12000, and 16000 bit/s and is based on both the STANAG-4591 codec and additional encoded wideband voice parameters that enable scalability of the 2400 bit/s MELPe to higher multi-rate speech coding.
Both categories may have different modes including Forward Error Correction (FEC) using different blocks of BCH (Bose-Chaudhuri-Hocquenghem). The FEC protection gives a tremendous advantage in highly degraded channels so that speech intelligibility will be maintained even at very high rates of channel bit errors.
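The text above does not give the BCH block sizes used by TSVCIS, so the following Python sketch only illustrates the principle using the smallest binary BCH code: the (7,4) code with generator polynomial x^3 + x + 1, which corrects one bit error per 7-bit block. The choice of this code and the names `poly_mod`, `bch_encode`, and `bch_decode` are assumptions for illustration, not the TSVCIS configuration.

```python
GENERATOR = 0b1011  # g(x) = x^3 + x + 1, as a bit mask
N, K = 7, 4         # codeword length and message length in bits
PARITY_BITS = N - K


def poly_mod(value, gen=GENERATOR):
    """Remainder of value(x) divided by gen(x), with coefficients in GF(2)."""
    gen_degree = gen.bit_length() - 1
    for shift in range(value.bit_length() - 1 - gen_degree, -1, -1):
        if value & (1 << (shift + gen_degree)):
            value ^= gen << shift
    return value


def bch_encode(message):
    """Systematic encoding: 4 message bits followed by 3 parity bits."""
    shifted = message << PARITY_BITS
    return shifted | poly_mod(shifted)


def bch_decode(word):
    """Correct up to one bit error and return the 4 message bits."""
    syndrome = poly_mod(word)
    if syndrome:
        # Each single-bit error position has its own nonzero syndrome.
        for position in range(N):
            if poly_mod(1 << position) == syndrome:
                word ^= 1 << position
                break
    return word >> PARITY_BITS


if __name__ == "__main__":
    message = 0b1101
    codeword = bch_encode(message)
    corrupted = codeword ^ (1 << 5)  # flip one bit "in the channel"
    print(f"sent {codeword:07b}, received {corrupted:07b}, "
          f"decoded {bch_decode(corrupted):04b}")
```

Longer BCH codes follow the same encode-by-remainder and decode-by-syndrome pattern but correct several errors per block, at the cost of more parity bits and a more involved decoder.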
TSVCIS also includes a special WB Voice 16 Gateway mode on a 16000 bit/s channel, using band FEC, which is used when NB to WB crossbanding has occurred.
References
- "A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding," Alan V. McCree, Thomas P. Barnwell, 1995 in IEEE Trans. Speech and Audio Processing (Original MELP).
- "Analog-to-Digital Conversion of Voice by 2,400 Bit/Second Mixed Excitation Linear Prediction (MELP)," US DoD (MIL-STD-3005, Original MELP).
- M.R. Bielefeld, L.M. Supplee, "Developing a test program for the DoD 2400 bps vocoder selection process", Acoustics Speech and Signal Processing 1996. ICASSP-96. Conference Proceedings. 1996 IEEE International Conference on, vol. 2, pp. 1141-1144 vol. 2, 1996.
- L.M. Supplee, R.P. Cohn, J.S. Collura, A.V. McCree, "MELP: the new Federal Standard at 2400 bps", Acoustics Speech and Signal Processing 1997. ICASSP-97. 1997 IEEE International Conference on, vol. 2, pp. 1591-1594 vol.2, 1997.
- "The 1200 and 2400 bit/s NATO Interoperable Narrow Band Voice Coder," STANAG-4591, NATO.
- "MELPe Variation for 600 bit/s NATO Narrow Band Voice Coder, STANAG-4591," NATO.
- Alan McCree, “A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 2006, pp. I 705–708, Toulouse, France.
- "SCIP Signaling Plan," Revision 3.6 January 8, 2013.
- “Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) Version 2.1,” July 2, 2012.