soumith/lua---audio: Module for torch to support audio i/o as well as do common ...

Open-source project name:

soumith/lua---audio

Project URL:

https://github.com/soumith/lua---audio

Programming language:

C 68.3%

Project introduction:

Audio Library for Torch

Audio library for Torch-7

  • Support audio I/O (Load files, save files)
  • Common audio operations (Short-time Fourier transforms, Spectrograms)

Load the following formats into a torch Tensor

  • mp3, wav, aac, ogg, flac, avr, cdda, cvs/vms,
  • aiff, au, amr, mp2, mp4, ac3, avi, wmv,
  • mpeg, ircam and any other format supported by libsox.

Calculate Short-time Fourier transforms with

  • window types - rectangular, hamming, hann, bartlett
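For reference, the four window types can be written with their textbook definitions. The sketch below is plain Lua and does not call into this library; the helper name `make_window` and the symmetric form (dividing by n - 1) are assumptions, and the library's own implementation may differ in such details.

```lua
-- Textbook window functions; illustrative only, not the library's own code.
-- make_window is a hypothetical helper name.
local function make_window(kind, n)
  assert(n >= 2, "window needs at least 2 points")
  local w = {}
  for i = 0, n - 1 do
    local x = i / (n - 1)              -- position in [0, 1]
    if kind == 'rect' then
      w[i + 1] = 1
    elseif kind == 'hamming' then
      w[i + 1] = 0.54 - 0.46 * math.cos(2 * math.pi * x)
    elseif kind == 'hann' then
      w[i + 1] = 0.5 * (1 - math.cos(2 * math.pi * x))
    elseif kind == 'bartlett' then
      w[i + 1] = 1 - math.abs(2 * x - 1)   -- triangle peaking at the center
    else
      error("unknown window type: " .. tostring(kind))
    end
  end
  return w
end
```

Each window is multiplied sample by sample with a frame of audio before the Fourier transform; the tapered windows (hamming, hann, bartlett) reduce spectral leakage compared with the rectangular one.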

Generate spectrograms

Dependencies

  • libsox v14.3.2 or above
  • libfftw3

Quick install on OSX (Homebrew):

$ brew install sox
$ brew install fftw

Linux (Ubuntu):

$ sudo apt-get install libfftw3-dev
$ sudo apt-get install sox libsox-dev libsox-fmt-all

Installation

This project can be installed with luarocks like this:

$ luarocks install https://raw.githubusercontent.com/soumith/lua---audio/master/audio-0.1-0.rockspec

On Ubuntu 13.04 64-bit, I had to modify the command slightly because of new library directory structures not picked up by luarocks.

$ sudo luarocks install https://raw.githubusercontent.com/soumith/lua---audio/master/audio-0.1-0.rockspec LIBSOX_LIBDIR=/usr/lib/x86_64-linux-gnu/ LIBFFTW3_LIBDIR=/usr/lib/x86_64-linux-gnu

Or, if you have downloaded this repository on your machine, and you are in its directory:

$ luarocks make

Usage

audio.load

 Loads an audio file into a torch.Tensor.
 usage:
 audio.load(
     string                              -- path to file
 )

Returns a torch.Tensor of size NSamples x NChannels, followed by the sample rate.

audio.save

 Saves a tensor to an audio file. The extension of the given path determines the output format.
 usage:
 audio.save(
     string                              -- path to file
     tensor                              -- NSamples x NChannels 2D tensor
     number                              -- sample rate to save the audio with
 )
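Because `audio.save` picks the format from the file extension, loading and saving back to back acts as a format converter. The sketch below relies only on the two signatures documented above; `convert` and the file names are hypothetical, and the `audio` module is passed in as an argument so the function can be exercised without Torch installed.

```lua
-- convert is a hypothetical helper; it assumes only the documented
-- audio.load(path) and audio.save(path, tensor, sample_rate) signatures.
local function convert(audio, src_path, dst_path)
  -- samples: NSamples x NChannels tensor; sample_rate: number
  local samples, sample_rate = audio.load(src_path)
  -- the extension of dst_path selects the output format
  audio.save(dst_path, samples, sample_rate)
  return samples, sample_rate
end

-- With Torch and this package installed (file names are placeholders):
-- convert(require 'audio', 'voice.wav', 'voice.flac')
```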

audio.compress

 Compresses a tensor in memory and returns a CharTensor. The given extension selects the encoding format. The result can be decompressed with audio.decompress.
 usage:
 audio.compress(
     tensor                              -- NSamples x NChannels 2D tensor
     number                              -- sample rate of the audio
     extension                           -- format to compress to. Example: mp3, ogg, flac, sox etc.
 )

audio.decompress

 Decompresses a tensor in memory and returns raw audio. The given extension selects the decoding format and should match the one used for compression.
 usage:
 audio.decompress(
     CharTensor                          -- 1D CharTensor that was returned by audio.compress
     extension                           -- format that was used to compress. Example: mp3, ogg, flac, sox etc.
 )

audio.stft

Calculates the short-time Fourier transform (STFT) of an audio signal. Returns a 3D tensor of size number_of_windows x (window_size/2+1) x 2, where the last dimension holds the real and imaginary parts of each complex value.
usage:
audio.stft(
    torch.Tensor                        -- input single-channel audio
    number                              -- window size
    string                              -- window type: rect, hamming, hann, bartlett
    number                              -- stride
)
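To make the output shape concrete, the helper below computes the dimensions of the returned tensor for a given input length. It is plain Lua and does not call the library; `stft_shape` is a hypothetical name, and the window count formula assumes that frames are never padded past the end of the signal, which the description above does not state.

```lua
-- stft_shape is a hypothetical helper, not part of the audio package.
-- Assumption: only complete windows are taken (no padding at the end).
local function stft_shape(nsamples, window_size, stride)
  assert(nsamples >= window_size, "input shorter than one window")
  local nwindows = math.floor((nsamples - window_size) / stride) + 1
  local nbins = math.floor(window_size / 2) + 1   -- non-redundant bins of a real FFT
  return nwindows, nbins, 2                       -- 2 = real and imaginary parts
end

-- One second of 16 kHz audio, window of 1024 samples, stride of 512
local nw, nb, nc = stft_shape(16000, 1024, 512)
print(nw, nb, nc)
```

A smaller stride increases the number of windows (finer time resolution), while a larger window increases the number of frequency bins.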

audio.spectrogram

Generates the spectrogram of an audio signal. Returns a 2D tensor of size number_of_windows x (window_size/2+1), where each value is the magnitude of one frequency bin in dB.
usage:
audio.spectrogram(
    torch.Tensor                        -- input single-channel audio
    number                              -- window size
    string                              -- window type: rect, hamming, hann, bartlett
    number                              -- stride
)
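The relationship between a complex STFT value and a dB magnitude can be sketched in plain Lua. The 20*log10 convention and the small floor `eps` that avoids taking the logarithm of zero are assumptions; the library may use a different reference level or scaling. `magnitude_db` is a hypothetical name.

```lua
-- magnitude_db is a hypothetical helper, not part of the audio package.
-- Assumptions: amplitude dB (20 * log10) and a floor of eps to avoid log(0).
local function magnitude_db(re, im, eps)
  eps = eps or 1e-10
  local mag = math.sqrt(re * re + im * im)          -- magnitude of re + i*im
  return 20 * math.log(math.max(mag, eps)) / math.log(10)
end
```

Applied to every (real, imaginary) pair of an STFT result, this yields one value per window and frequency bin, which matches the 2D layout described above.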

Example Usage

Generate a spectrogram

require 'audio'
require 'image' -- to display the spectrogram
voice = audio.samplevoice()                          -- sample recording bundled with the package
spect = audio.spectrogram(voice, 8192, 'hann', 512)  -- window size 8192, hann window, stride 512
image.display(spect)