SSL-Workbench

SSL-Workbench is a suite of stand-alone software tools for specific applications. The workbench is stand-alone in the sense that the Speech Science Lab (SSL) software is not required to run it. The tools can be used either for teaching or for running small-scale research projects. Users can choose one or more of them, depending on their area of interest. SSL-Workbench can be installed on any PC running Windows XP/7.1. It works with mono, 16-bit linear PCM files sampled at 8 or 16 kHz.
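
Before loading recordings into the workbench, it can help to confirm that they match the stated file format. The sketch below is not part of SSL-Workbench; it uses Python's standard `wave` module, and the function name `is_workbench_compatible` is a hypothetical helper introduced only for illustration.

```python
import wave

def is_workbench_compatible(path, allowed_rates=(8000, 16000)):
    """Return True if the file is uncompressed PCM, mono, 16-bit, 8 or 16 kHz."""
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getcomptype() == "NONE"      # uncompressed linear PCM
                and wav.getnchannels() == 1      # mono
                and wav.getsampwidth() == 2      # 2 bytes = 16 bits per sample
                and wav.getframerate() in allowed_rates
            )
    except wave.Error:
        # The wave module rejects files it cannot parse as PCM WAV.
        return False

# Usage (hypothetical file name):
# print(is_workbench_compatible("speaker01_session1.wav"))
```

For the Speech Coding module, which accepts 8000 Hz input only, the same check can be called with `allowed_rates=(8000,)`.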

The available SSL-Workbench modules are:

  • Speaker Recognition/Verification
  • Articulatory Synthesis
  • Speech Coding
  • Speech Recognition

SSL-Workbench for Semi-automatic Speaker Recognition/Verification

Typical Application: To build a semi-automatic, vocabulary-dependent speaker recognition/verification system and to test its performance for a choice of acoustic parameters and distance metrics.

Features:

  • Facility to design the speaker recognition test – No. of speakers, phone being compared, No. of repetitions per speaker, No. of sessions recorded
  • Facility to record a signal
  • Facility for segmentation – to mark the beginning and ending locations of the chosen phone – No need to segment the signal into a separate file
  • Facility to select any one of the acoustic features, such as mel scale log spectra, MFCCs, linear cepstral coefficients, parcor coefficients, autocorrelation coefficients, F0 based on cepstrum or autocorrelation, etc.
  • Facility to suppress any one or more coefficients, such as those with large standard deviation, and thus to test recognition on a subset
  • Facility to print out the feature vectors
  • Facility to compare telephone quality speech with full-band speech
  • Facility to compute the distance matrix – Euclidean distance, weighted Euclidean distance or k-nearest neighbor rule
  • Facility to print out the distance matrix
  • Facility to compute speaker recognition or verification score
  • Facility for pair-wise graphic comparison of the feature vectors in terms of spectrograms, log spectra, autocorrelation function, MFCCs, etc.
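
The distance-matrix and scoring steps listed above can be illustrated with a short sketch. It is not the workbench's own code: the function names, the `keep` argument (which models suppressing coefficients by listing the indices to retain), and the toy feature vectors are all assumptions made for this example. How the workbench chooses the weights for the weighted Euclidean distance is not stated here, so they are simply passed in.

```python
import math

def euclidean(a, b, weights=None, keep=None):
    """Euclidean (or weighted Euclidean) distance over an optional coefficient subset."""
    indices = range(len(a)) if keep is None else keep
    total = 0.0
    for i in indices:
        w = 1.0 if weights is None else weights[i]
        total += w * (a[i] - b[i]) ** 2
    return math.sqrt(total)

def distance_matrix(tests, refs, weights=None, keep=None):
    """Rows are test vectors, columns are reference vectors."""
    return [[euclidean(t, r, weights, keep) for r in refs] for t in tests]

def recognition_score(tests, test_labels, refs, ref_labels, weights=None, keep=None):
    """Percentage of test vectors whose nearest reference has the same speaker label."""
    matrix = distance_matrix(tests, refs, weights, keep)
    correct = 0
    for row, label in zip(matrix, test_labels):
        nearest = min(range(len(row)), key=row.__getitem__)
        correct += ref_labels[nearest] == label
    return 100.0 * correct / len(tests)

# Toy data (made up): two speakers, three coefficients per feature vector.
refs = [[1.0, 2.0, 0.5], [4.0, 0.0, 0.4]]
ref_labels = ["spk1", "spk2"]
tests = [[1.2, 1.8, 9.0], [3.9, 0.3, -8.0]]
test_labels = ["spk1", "spk2"]

print(recognition_score(tests, test_labels, refs, ref_labels))
# Suppress the third coefficient, which varies widely across repetitions.
print(recognition_score(tests, test_labels, refs, ref_labels, keep=[0, 1]))
```

Here recognition uses the single nearest reference; the k-nearest neighbor rule would instead vote among the k smallest entries in each row.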

 

SSL-Workbench for the Development of a Text-to-Speech Synthesis System

Typical Application: To interactively develop a semi-automatic, moderate-vocabulary text-to-speech synthesis system using an articulatory model. The program is especially useful for perception experiments in which the articulatory dynamics, rate of transitions, duration of segments, etc. are manipulated. (Nasals and laterals are under development.)

Features:

  • Vocal Tract Filter
    • A database with a set of default articulatory positions along with phone symbols exists.
    • Facility to change the default positions of the articulators for any given phone using an interactive articulatory model and compute formant data.
    • Tune the positions to obtain the desired reference acoustic (formant) data.
    • Save the positions in a database along with a phone symbol.
    • A text file exists for the articulatory dynamics rules with default values and with a specific syntax. The text file may be edited to generate new rules.
  • Source
    • A text file exists for generating intensity contours and duration of segments. The text file may be edited to generate new rules.
    • Select a default intonation contour (declination, hat).
    • Pitch level and slopes may be edited.
    • The user can specify an existing acoustic signal file with analyzed parameters from which to copy the supra-segmentals.
    • Copy stop bursts into the synthesized signal to make it natural sounding.
  • Typical usage:
    • The user enters a sequence of phone symbols and synthesizes the utterance.
    • The synthesized signal and its spectrogram are displayed.
    • The synthesized speech signal can be saved.
    • An animation of dynamic articulatory movement can be displayed.
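
To make the idea of an editable intonation contour concrete, the sketch below generates an F0 track as a declining straight line and optionally adds a "hat" (a rise, a plateau and a fall) on top of it. This is only one simple way to model such contours and is not taken from the workbench; the function names, default pitch level, slope, excursion size and ramp time are all assumptions chosen for illustration.

```python
def declination_contour(duration_s, start_hz=120.0, slope_hz_per_s=-10.0, step_s=0.01):
    """F0 values (Hz), one every step_s seconds, along a straight declining line."""
    n = int(round(duration_s / step_s)) + 1
    return [start_hz + slope_hz_per_s * (i * step_s) for i in range(n)]

def hat_contour(duration_s, rise_at_s, fall_at_s, excursion_hz=30.0, ramp_s=0.1,
                start_hz=120.0, slope_hz_per_s=-10.0, step_s=0.01):
    """Declination line plus a 'hat': linear rise, plateau, then linear fall."""
    base = declination_contour(duration_s, start_hz, slope_hz_per_s, step_s)
    contour = []
    for i, f0 in enumerate(base):
        t = i * step_s
        # Each ramp goes from 0 to 1 over ramp_s seconds, then stays at 1.
        up = min(max((t - rise_at_s) / ramp_s, 0.0), 1.0)
        down = min(max((t - fall_at_s) / ramp_s, 0.0), 1.0)
        contour.append(f0 + excursion_hz * (up - down))
    return contour

# One-second utterance: pitch rises at 0.2 s and falls back at 0.7 s.
f0_track = hat_contour(1.0, rise_at_s=0.2, fall_at_s=0.7)
```

Editing the pitch level corresponds to changing `start_hz`, and editing the slope corresponds to changing `slope_hz_per_s`.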

 

SSL-Workbench for Speech Coding (LPC based)

(Applies to input speech signal files in ‘wav’ format: linear PCM, 16-bit, mono, sampled at 8000 Hz)

Typical Application:
To interactively synthesize LP-coded speech at various bit rates and to evaluate the quality of the LP vocoder (with scalar and vector quantization) and of the segment LP vocoder.

Features:

  • Facility to record a signal and manually segment the desired part.
  • Facility to perform acoustic analysis to extract the parameters.
  • Facility for graphic display of the parameters, with correction of the source parameters in case of errors.
  • Facility to apply quantization rules to the parcor coefficients – scalar, vector, segmental.
  • Facility to re-synthesize at various bit rates and prepare synthesized speech stimuli.
  • Facility to run a perception test.
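
The link between parcor quantization and bit rate can be sketched as follows. This is an illustrative example, not the workbench's implementation. It derives parcor (reflection) coefficients from autocorrelation values with the Levinson-Durbin recursion, applies a plain uniform scalar quantizer, and computes the resulting bit rate. The sign convention for the coefficients, the uniform quantizer, the frame length and the number of side-information bits (pitch, gain, voicing) are all assumptions. Vector and segmental quantization are not covered.

```python
def parcor_from_autocorr(r, order):
    """Levinson-Durbin recursion: autocorrelation r[0..order] -> parcor coefficients."""
    a = [1.0] + [0.0] * order          # predictor polynomial, a[0] is always 1
    err = r[0]                         # prediction error energy
    parcor = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        parcor.append(k)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return parcor

def quantize_uniform(k, bits):
    """Uniform scalar quantizer on [-1, 1]: returns (index, reconstructed value)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((k + 1.0) / step), levels - 1)
    return index, -1.0 + (index + 0.5) * step

def bit_rate(order, bits_per_coeff, frame_ms, side_bits=0):
    """Bits per second for one set of coefficients plus side information per frame."""
    frames_per_second = 1000.0 / frame_ms
    return frames_per_second * (order * bits_per_coeff + side_bits)

# Example: 10 coefficients at 5 bits each, 13 side bits, one frame every 20 ms.
print(bit_rate(order=10, bits_per_coeff=5, frame_ms=20, side_bits=13))
```

Lowering `bits_per_coeff` or lengthening the frame reduces the bit rate at the cost of coarser coefficients, which is the trade-off a perception test is meant to evaluate.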

 

SSL-Workbench for Semi-automatic Speech Recognition (Distance based)

(Applies to input speech signal files in ‘wav’ format: linear PCM, 16-bit, mono, sampled at 16000 or 8000 Hz)

Typical Application:
To test the performance of a semi-automatic segment based speech recognition system for a choice of a variety of acoustic parameters and distance metrics.

Features:

  • Facility to design the speech recognition test – No. of speakers, phone being compared, No. of repetitions per speaker, No. of sessions recorded.
  • Facility to record a signal.
  • Facility for segmentation – to mark the beginning and ending locations of the chosen phone – No need to segment the signal into a separate file.
  • Facility to select any one of the acoustic features, such as mel scale log spectra, MFCCs, linear cepstral coefficients, parcor coefficients, autocorrelation coefficients.
  • Facility to suppress any one or more coefficients.
  • Facility to print out the feature vectors.
  • Facility to compute the distance matrix – Euclidean distance, weighted Euclidean distance or k-nearest neighbor rule.
  • Facility to run the speech recognition task and obtain the score.
  • Facility to print out the distance matrix.
  • Facility to compare telephone quality speech with full-band speech.
  • Facility for pair-wise graphic comparison of the signal in terms of spectrograms, log spectra, autocorrelation function, MFCCs, etc.
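
As an illustration of the k-nearest neighbor rule in a segment-based recognition test, the sketch below labels a test segment's feature vector by majority vote among its closest reference segments. It is a minimal example under stated assumptions, not the workbench's code: the function name `knn_classify`, the two-dimensional feature vectors and the phone labels are made up, and plain Euclidean distance is used.

```python
import math
from collections import Counter

def knn_classify(test_vec, ref_vecs, ref_labels, k=3):
    """Label a test feature vector by majority vote among its k nearest references."""
    ranked = sorted(
        (math.dist(test_vec, ref), label)
        for ref, label in zip(ref_vecs, ref_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy reference set (made up): feature vectors for two phones.
ref_vecs = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
ref_labels = ["a", "a", "i", "i", "i"]

print(knn_classify([0.2, 0.3], ref_vecs, ref_labels))
print(knn_classify([5.0, 5.2], ref_vecs, ref_labels))
```

The recognition score is then the percentage of test segments whose predicted label matches the phone that was actually spoken.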