NoteLab
NoteLab is a scientific computing platform for the analysis of animal vocal behavior. It lets you perform computationally intensive tasks on large numbers of recorded sounds, run complex selection queries, and visualize results. It is best seen as a data mining tool for vocal behavior.

NoteLab is based on the very high-level, object-oriented language Python, and can be used interactively, in scripts, or as an extensible basis for developing your own software for animal communication research. NoteLab is developed by me, and it is intended to be free and open software. I am currently using alpha versions in my own research. I welcome collaborations on this project.

Features

  • Mass storage of up to millions of sound events ('notes') with microsecond precision in absolute time, all in a single file.
  • Efficient retrieval of subsets of sounds, and/or information about them, with the possibility of complex selection criteria.
  • Microphone array-aware (for multi-channel recordings, to enable, e.g., localization, source separation).
  • Signal analyses, statistical measurements or visualization in the Python scientific computing environment (SciPy/NumPy).
  • Advanced statistics (e.g. clustering, discriminant function analysis) also possible through an interface to R, or Python modules such as MDP and Orange.
  • Multi-platform (runs on Windows, GNU/Linux, Mac, etc.).

 

Key technologies

NoteLab glues together and builds upon state-of-the-art, proven, and well-maintained technology. It thus avoids reinventing the wheel, which saves a lot of time and prevents many bugs.

  • Sound and meta-info (e.g. measurements) are stored, through PyTables, using HDF5, the newest standard for scientific data storage in data-intensive computing environments, designed by the National Center for Supercomputing Applications.
  • Scientific computing with Python extensions SciPy/NumPy.
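
To give an idea of what storing sound events through PyTables and selecting a subset of them can look like, here is a minimal sketch. It is not NoteLab's actual code or schema: the `Note` row layout, the column names, the file name and the helper functions are all assumptions made up for illustration. Only the PyTables calls themselves (`open_file`, `create_table`, `Table.where`) are real.

```python
import os
import tempfile

import tables  # PyTables


class Note(tables.IsDescription):
    """Hypothetical row layout for one sound event (not NoteLab's real schema)."""
    onset_us = tables.Int64Col()     # absolute onset time, microseconds since the Unix epoch
    duration_us = tables.Int64Col()  # note duration in microseconds
    channel = tables.Int32Col()      # microphone channel index


def write_notes(path, notes):
    """Write (onset_us, duration_us, channel) tuples to a 'notes' table in one HDF5 file."""
    with tables.open_file(path, mode="w", title="example dataset") as h5:
        table = h5.create_table("/", "notes", Note, "note events")
        row = table.row
        for onset_us, duration_us, channel in notes:
            row["onset_us"] = onset_us
            row["duration_us"] = duration_us
            row["channel"] = channel
            row.append()
        table.flush()


def select_onsets(path, condition):
    """Return the onsets of all notes that match a PyTables condition string."""
    with tables.open_file(path, mode="r") as h5:
        table = h5.root.notes
        # The condition is evaluated inside PyTables, row block by row block,
        # so the whole table never has to be loaded into memory.
        return [int(r["onset_us"]) for r in table.where(condition)]


# Four made-up notes: (onset, duration, channel), times in microseconds.
example_notes = [
    (1_000_000, 80_000, 0),
    (1_200_000, 30_000, 0),
    (1_500_000, 120_000, 1),
    (2_000_000, 95_000, 0),
]
example_path = os.path.join(tempfile.mkdtemp(), "notes_example.h5")
write_notes(example_path, example_notes)

# Notes longer than 50 ms that were recorded on channel 0.
long_on_channel_0 = select_onsets(example_path, "(duration_us > 50000) & (channel == 0)")
```

Storing times as 64-bit integer microseconds is one way to keep microsecond precision in absolute time; whether NoteLab does it this way is not stated here.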

FAQ

- Why provide this information if NoteLab is not publicly available yet?
- Is NoteLab an alternative to Sound Analysis Pro?
- Is NoteLab an alternative to Matlab®?
- Why Python?
- Why PyTables/HDF5?
- What is the practical maximum size of a NoteLab dataset?
- Can I make graphs, spectrograms etc?
- Can I view / access my data without NoteLab?

Why provide this information if NoteLab is not publicly available yet? If you are interested in this project and are proficient in Python and/or data-intensive computing and/or have particularly good ideas about the functionality of such software for research, you may want to help in the further development of NoteLab before its first release. I would be happy to involve anyone who is able to contribute one way or another.

Is NoteLab an alternative to Sound Analysis Pro? No, absolutely not. NoteLab is intended as a programming environment for mass storage and computing of animal vocalizations. That is: you can efficiently store and retrieve huge amounts of sound events, and perform computations at a command prompt or in a script. Sound Analysis Pro is much more tailored to end users with specific needs, and involves no programming / scripting by the user.

Is NoteLab an alternative to Matlab®? As far as the above-mentioned purpose is concerned, yes, absolutely. Like Matlab® it is intended for scientific computing / programming. The main advantage of NoteLab is that it does the storage and retrieval of sound events for you. Moreover, its programming language / environment (Python/NumPy/SciPy) has many advantages over Matlab®'s. It has a completely open development model, and, in contrast to Matlab®, it is free, which saves you and/or society the costs of an expensive commercial package with yearly renewal costs and a vendor lock-in. I therefore encourage you to also consider Python-based NumPy/SciPy as an alternative to Matlab® for general scientific computing tasks (for a comparison see: NumPy for Matlab® Users).

Why Python? Python is an interactive high-level language that allows rapid software development and interactive debugging. It includes a wide variety of well supported software libraries for tasks such as data analysis, statistical measurements, and visualization. Python is a complete, well-defined, mature language with a strong user base. The powerful object-oriented and dynamic typing features of the language greatly aid in the development of flexible and intuitive data structures and user interface elements. Its interactive mode of user interface follows the same principle as that of Matlab, and shares the same advantages for quick prototyping, querying and scripting of complex object manipulations and computations. Python is open and free, which is in line with scientific principles.

Why PyTables/HDF5? See this page.

What is the practical maximum size of a NoteLab dataset? Theoretically there is a maximum of 2,147,483,648 notes per dataset. This is a lot. It corresponds to all vocal behavior of a hypothetical animal that produces 10 notes per second, 24 hours per day, for about 7 years. Practically speaking, using regular hardware: I don't know. Storage and I/O are handled by PyTables and HDF5, which are extremely well-designed pieces of software that deal with huge datasets efficiently. Of course there are limits to what you can do with your hardware. I use datasets with at least 500,000 2-channel birdsong syllables (file size about 20 GB) on a daily basis on a regular PC (Pentium 4, 512 MB memory) without any problems. This corresponds to the complete recording, at CD quality, of all sounds that two zebra finches make in 10 days.
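
The numbers above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch. It assumes that the theoretical maximum is 2^31 (which equals 2,147,483,648), that "CD quality" means uncompressed 16-bit samples at 44.1 kHz, and that 20 GB means 20 × 10^9 bytes with no allowance for metadata or compression.

```python
# Theoretical maximum: how long would an animal need to fill a dataset?
MAX_NOTES = 2**31                       # 2,147,483,648 notes
NOTES_PER_SECOND = 10                   # the hypothetical animal from the text
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years_to_fill = MAX_NOTES / NOTES_PER_SECOND / SECONDS_PER_YEAR  # roughly 6.8 years

# Practical example: 500,000 two-channel syllables in about 20 GB.
SAMPLE_RATE_HZ = 44_100                 # assumed CD-quality sample rate
BYTES_PER_SAMPLE = 2                    # assumed 16-bit samples
CHANNELS = 2
FILE_SIZE_BYTES = 20 * 10**9            # assumed decimal gigabytes
N_SYLLABLES = 500_000

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS   # 176,400 bytes/s
bytes_per_syllable = FILE_SIZE_BYTES / N_SYLLABLES                # 40,000 bytes
seconds_per_syllable = bytes_per_syllable / bytes_per_second      # roughly 0.23 s
hours_of_sound = FILE_SIZE_BYTES / bytes_per_second / 3600        # roughly 31 hours

print(f"years to reach the maximum: {years_to_fill:.1f}")
print(f"average stored sound per syllable: {seconds_per_syllable:.2f} s")
print(f"total stored sound: {hours_of_sound:.1f} h")
```

Under these assumptions the file holds on the order of 30 hours of actual sound, that is, only the vocalizations themselves and not 10 days of continuous recording.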

Can I make graphs, spectrograms etc? Yes. Although this is beyond the scope of NoteLab itself, one can use one of the many graphing / visualization modules that are available for Python. I use Matplotlib myself, a Python package that features Matlab-like graphing capabilities.

Can I view / access my data without NoteLab? Yes, datasets are saved in pure PyTables / HDF5 format, which is an open standard that has nothing specifically to do with NoteLab. If you want, you can access your datasets graphically (for instance with ViTables (highly recommended!), or other HDF5 viewers) or use publicly available tools to export your data to pretty much any format and output device you like.
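
As an illustration of what access without NoteLab can look like from a script, the sketch below lists every array and table in an arbitrary HDF5 file using plain PyTables. The helper name `list_leaves`, the example file and its `/recordings/waveform` node are hypothetical and say nothing about how a real NoteLab dataset is laid out.

```python
import os
import tempfile

import tables  # PyTables


def list_leaves(path):
    """Return (node path, class name, shape) for every array or table in an HDF5 file."""
    with tables.open_file(path, mode="r") as h5:
        return [
            (leaf._v_pathname, leaf.__class__.__name__, leaf.shape)
            for leaf in h5.walk_nodes("/", classname="Leaf")
        ]


# Build a tiny stand-in file so the example runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with tables.open_file(demo_path, mode="w") as h5:
    group = h5.create_group("/", "recordings")
    h5.create_array(group, "waveform", [0, 3, -2, 5])  # four made-up samples

leaves = list_leaves(demo_path)
for node_path, class_name, shape in leaves:
    print(node_path, class_name, shape)
```

Pointing `list_leaves` at one of your own files should give a quick overview of what is stored in it, without any NoteLab code involved.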


MATLAB® is a registered trademark of The MathWorks.