Using wav2png to generate waveform and spectrogram images

Frederic Font edited this page Jun 14, 2018 · 3 revisions

Freesound generates the waveform and spectrogram images shown in its sound players with the code in the utils.audioprocessing module, specifically the function utils.audioprocessing.processing.create_wave_images. A wrapper around this function, utils/audioprocessing/wav2png.py, can be used independently of the rest of the Freesound code as a command-line utility that generates spectrogram and waveform images from an audio file. The input must be uncompressed PCM audio in a format that scikits.audiolab can read.
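
Because the input has to be uncompressed PCM, it can help to check a file before handing it to the script. The following is a minimal sketch, not part of Freesound: the helper name is_pcm_wav is hypothetical, it uses Python's standard-library wave module as a stand-in check, and it only covers WAV files, so it will reject other formats that scikits.audiolab may still be able to read.

```python
import wave


def is_pcm_wav(path):
    """Return True if `path` opens as an uncompressed PCM WAV file.

    Assumption: the standard-library `wave` module is used as a stand-in
    check; this is not what wav2png.py or scikits.audiolab do internally.
    """
    try:
        wav_file = wave.open(path, "rb")
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, a non-PCM encoding, or a truncated header.
        return False
    try:
        # 'NONE' is the compression type the wave module reports for PCM.
        return wav_file.getcomptype() == "NONE"
    finally:
        wav_file.close()
```

A missing file still raises the usual I/O error, so only unreadable or non-PCM content is reported as False.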

Installing and running wav2png

  1. Check out the Freesound code: git clone git@github.com:MTG/freesound.git
  2. From the repository root, change to the audioprocessing module directory: cd utils/audioprocessing
  3. Install the Python requirements: pip install pillow numpy scikits.audiolab (scikits.audiolab may require additional system libraries; see https://pypi.org/project/scikits.audiolab/)
  4. You can now run it with python wav2png.py; the --help option prints the following usage information:
Usage: wav2png.py [options] input-filename

Options:
  --help                show this help message and exit
  -a OUTPUT_FILENAME_W, --waveout=OUTPUT_FILENAME_W
                        output waveform image (default input filename +
                        _w.png)
  -s OUTPUT_FILENAME_S, --specout=OUTPUT_FILENAME_S
                        output spectrogram image (default input filename +
                        _s.jpg)
  -w IMAGE_WIDTH, --width=IMAGE_WIDTH
                        image width in pixels (default 500)
  -h IMAGE_HEIGHT, --height=IMAGE_HEIGHT
                        image height in pixels (default 171)
  -f FFT_SIZE, --fft=FFT_SIZE
                        fft size, power of 2 for increased performance
                        (default 2048)
  -c COLOR_SCHEME, --color_scheme=COLOR_SCHEME
                        name of the color scheme to use (one of: 'Freesound2'
                        (default), 'FreesoundBeastWhoosh', 'Cyberpunk',
                        'Rainforest')
  -p, --profile         run profiler and output profiling information

Example:

python wav2png.py -c FreesoundBeastWhoosh /path/to/audiofile.wav -w 900 -h 301
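
When many files need images, the same command line can be assembled from Python. The sketch below is an illustration only: the helper names build_wav2png_command and render_images are hypothetical, it emits only the options listed in the usage text above, and it assumes the current directory is utils/audioprocessing and that python is the interpreter with the requirements installed.

```python
import subprocess


def build_wav2png_command(input_filename, width=None, height=None,
                          fft_size=None, color_scheme=None,
                          wave_out=None, spec_out=None, python="python"):
    """Build the argument list for one wav2png.py run.

    Only options listed in the usage text are emitted. Anything left as
    None is omitted so that wav2png.py falls back to its own default.
    """
    command = [python, "wav2png.py"]
    # Pair each documented short option with the value supplied for it.
    for flag, value in (("-a", wave_out), ("-s", spec_out),
                        ("-w", width), ("-h", height),
                        ("-f", fft_size), ("-c", color_scheme)):
        if value is not None:
            command += [flag, str(value)]
    command.append(input_filename)
    return command


def render_images(input_filename, **options):
    """Run wav2png.py in the current directory; raises if it exits non-zero."""
    subprocess.check_call(build_wav2png_command(input_filename, **options))
```

For instance, render_images("/path/to/audiofile.wav", width=900, height=301, color_scheme="FreesoundBeastWhoosh") runs the equivalent of the example command above.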