Toolbox
Information Technologies
alpine
Fast, easy to use email client suitable for both inexperienced email users
and the most demanding power users. Successor of Pine, also
developed at the University of Washington.
http://www.washington.edu/alpine/
alsa-base,alsa-utils
The Advanced Linux Sound Architecture is used to provide audio and MIDI
functionality to the GNU/Linux OS. Since I work in the Speech Recognition
field, its importance is evident.
http://www.alsa-project.org/main/index.php/Main_Page
apache ant
Another Neat Tool. Software tool for automating software build processes.
Similar to make but implemented using the Java language. It
requires the Java platform, and is best suited to building Java projects.
The build process is described with an XML file.
http://ant.apache.org/
argouml
Open source UML modeling tool that supports all standard UML 1.4 diagrams. It runs on any Java platform and is available in ten languages.
http://argouml.tigris.org/
bash
It is the shell, or command language interpreter, that will appear in the
majority of GNU/Linux distributions.
Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) and C shell (csh).
http://www.gnu.org/software/bash/
bbkeys
Provides keyboard shortcuts to the Blackbox WM. This useful program
takes me one step closer to being a Keyboard Jedi. I may use it several
times per minute without even noticing.
http://bbkeys.sourceforge.net/
beamer
LaTeX class for creating slides for presentations. It works together with pdflatex, dvips and LyX.
http://latex-beamer.sourceforge.net/
[See LaTeX]
beowulf
A classic clustering solution. A very mature project on clusters of
computers.
http://www.beowulf.org/
blackbox
Fast and lightweight window manager for the X Window System built with C++.
This WM provides a nice look-and-feel without using lots
of memory. It matches my goal of having a light but still powerful
workstation on my low-resource laptop.
http://blackboxwm.sourceforge.net/AboutBlackbox
blender
3D animation studio. It includes tools for modeling, sculpting, texturing, UV mapping, rigging and constraints, weight painting, particle systems, simulation, rendering, node-based compositing, and non-linear video editing, as well as an integrated game engine for real-time interactive 3D, and game creation and playback with cross-platform compatibility.
http://www.blender.org/
boinc
Open-source software for volunteer computing and grid computing. A massive
worldwide cluster that takes advantage of idle computer time by using
the processor to help cure
diseases, study global warming, discover pulsars and do many other
types of scientific research.
http://boinc.berkeley.edu/
build-essential
This Debian package installs everything needed to compile software.
It basically pulls in GCC (the GNU Compiler Collection), GNU Make and
additional libraries.
http://gcc.gnu.org/
http://www.gnu.org/software/make/
cmake
Cross Platform Make. CMake is a family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files. CMake generates native makefiles and workspaces that can be used in the compiler environment of your choice.
http://www.cmake.org/
conky
Free, lightweight system monitor for X that can display any kind of information on your desktop.
http://conky.sourceforge.net/
cups
Common UNIX Printing System. The name says it all. The
packages for Debian are: cupsys, cupsys-server, cupsys-client and
cupsys-bsd.
http://www.cups.org/index.php
darcs
VCS. A different open source source-code management system. It is used,
for example, by the DokuWiki developers.
http://darcs.net/
debian gnu/linux
As described in the project's website, it is The Universal Operating
System. It's one of the most stable GNU/Linux distributions, widely
used on servers (a familiar example is the much beloved
cygnus.salle.url.edu), and with over 18733 packages it's almost
sure to be suitable for any application or need.
Debian is highly scalable, a feature I do appreciate very much since I
have quite an old laptop with limited resources. In order to obtain
a fairly good performance with such an old machine I installed the OS
with no graphical environment and with the special laptop utils, apart
from the base system. These
options are available with the tasksel application. The resulting
box is small and swift, ready to grow into a powerful workstation.
http://www.debian.org/
http://www.iberprensa.com/todolinux/articulos/TL80_Portatildebian.pdf
dia
GTK+ based diagram creation program released under the GPL license.
A useful tool for creating high quality diagrams, such as the ones that
can be found in exams ;)
http://live.gnome.org/Dia
dokuwiki
Complete Wiki. Written in PHP and based on text files. Excellent wiki engine.
Permissions, revisions, RSS, search engine... A swift
alternative to MediaWiki, upon which Wikipedia is based.
http://wiki.splitbrain.org/wiki:dokuwiki
http://www.wikimatrix.org/
dosbox
An x86 emulator with DOS. Ideal for running those rusty old apps (or
games) that needed DOS: sentimental software.
http://www.dosbox.com/
doxygen
Documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.
http://www.doxygen.org/
drupal
CMS. One of the biggest and most complete open source CMSs around.
http://drupal.org/
http://www.cmsmatrix.org/
flash-player
An app required for many Internet widgets, such as embedded videos on
websites.
http://www.adobe.com/products/flashplayer/
flip
Newline conversion between Unix, Macintosh and MS-DOS ASCII files.
ASCII text files can contain different forms of newlines, depending on which operating system is being used. Converting between these formats is often necessary if you use several operating systems. The flip program will convert the newlines to any format.
http://ccrma-www.stanford.edu/~craig/utility/flip/
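To make the idea concrete, here is a minimal Python sketch of newline conversion. It is not flip's code and does not mirror its options: the function name `convert_newlines` and the labels "unix", "dos" and "mac" are my own assumptions, with "mac" meaning the lone carriage return used by classic Mac OS.

```python
def convert_newlines(data: bytes, target: str) -> bytes:
    """Convert every newline in data to the requested convention."""
    # Hypothetical labels for the three classic conventions.
    endings = {"unix": b"\n", "dos": b"\r\n", "mac": b"\r"}
    # Normalize to LF first. CRLF must be replaced before lone CR,
    # otherwise each CRLF would turn into two newlines.
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return normalized.replace(b"\n", endings[target])


if __name__ == "__main__":
    mixed = b"one\r\ntwo\rthree\n"
    for name in ("unix", "dos", "mac"):
        print(name, convert_newlines(mixed, name))
```

Working on bytes rather than text avoids any surprise from Python's own newline translation when reading files.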
ghex
Simple binary editor. It lets users view and edit a binary file in both hex and ASCII with a multiple level undo/redo mechanism.
http://live.gnome.org/Ghex
gimp
GNU Image Manipulation Program. A raster graphics editor. Ideal for
tweaking photographs. Sometimes taken for the free software replacement
for Photoshop.
http://www.gimp.org/
git
VCS. Open source version control system designed to handle very large projects with speed and efficiency, but just as well suited for small personal repositories; it is especially popular in the open source community, serving as a development platform for projects like the Linux Kernel, Ruby on Rails, WINE or X.org.
http://git.or.cz/
gnu make
Tool which controls the generation of executables and other non-source files of a program from the program's source files.
Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the non-source files and how to compute it from other files. When you write a program, you should write a makefile for it, so that it is possible to use Make to build and install the program.
http://www.gnu.org/software/make/
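The rebuild decision behind a makefile can be sketched in a few lines of Python. This is only a toy model, not how GNU Make is implemented: the `plan` function, the `rules` dictionary and the integer timestamps are all assumptions for illustration. It relies on the well-known rule that a target is rebuilt when it is missing or older than one of its prerequisites.

```python
def plan(target, rules, mtimes, order=None):
    """Return the targets that must be rebuilt, prerequisites first.

    rules  maps a target to the list of files it is computed from.
    mtimes maps a file name to its modification time (missing = absent).
    """
    if order is None:
        order = []
    prereqs = rules.get(target, [])
    # Decide on the prerequisites before deciding on the target itself.
    for p in prereqs:
        plan(p, rules, mtimes, order)
    if target in rules:
        stale = (
            target not in mtimes
            or any(p in order or mtimes.get(p, 0) > mtimes[target] for p in prereqs)
        )
        if stale and target not in order:
            order.append(target)
    return order


if __name__ == "__main__":
    rules = {"prog": ["main.o", "util.o"], "main.o": ["main.c"], "util.o": ["util.c"]}
    # main.c was edited after main.o was last built.
    mtimes = {"main.c": 5, "util.c": 1, "main.o": 3, "util.o": 2, "prog": 4}
    print(plan("prog", rules, mtimes))
```

Only the object file whose source changed, and the program that depends on it, end up in the plan; everything else is left alone.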
hoz
Hacha Open Zource. File splitter.
http://hoz.sourceforge.net/
hddtemp
Displays the present temperature of the HDD passed as a parameter.
http://www.guzu.net/linux/hddtemp.php
See [Debian GNU/Linux]
icedtea
Free implementation of the Java Virtual Machine.
The IcedTea project provides a harness to build the source code from
OpenJDK using Free Software build tools and provides replacements for the binary plugs with code from the GNU Classpath project.
http://icedtea.classpath.org/wiki//Main_Page
http://openjdk.java.net/
http://www.gnu.org/software/classpath/
http://java-source.net/
http://www.javaranch.com/
iceweasel
Debian's own Mozilla Firefox build, having passed the sieve of
the free software guidelines. Anyway, this Internet browser is practically
unbeatable. It's very complete and customizable. The only drawback is
the large amount of memory that it requires.
http://www.geticeweasel.org/
inkscape
Vector Graphics Editor. With this program I became aware of the value
and usefulness of vector images. Ideal for designing Web 2.0 icons
with Free Software tools.
http://www.inkscape.org/
jabref
Open source bibliography reference manager. The native file format used by JabRef is BibTeX, the standard LaTeX bibliography format. JabRef runs on the Java VM (version 1.5 or newer).
http://jabref.sourceforge.net/
jacwiki
Wiki engine. The smallest I have ever seen. Written in PHP and text-files
based. This homepage is based upon this project.
To my mind, the big advantage of being a small program is that
it can be thoroughly read and
understood, and then hacked and adapted to the needs and preferences of
every developer. I do like it a lot.
http://jacwiki.jacroe.com/
javadoc
Documentation generator.
Tool for generating API documentation in HTML format from doc comments in source code. It can be downloaded only as part of the Java 2 SDK.
http://java.sun.com/j2se/javadoc/
jnode
Java New Operating System Design Effort. Simple to use & install Java operating system for personal use.
It runs on modern devices.
http://www.jnode.org/
jquery
A JavaScript library for including nice effects on websites.
http://jquery.com/
http://www.noupe.com/jquery/50-amazing-jquery-examples-part1.html
jswat
Graphical Java debugger front-end, written to use the Java Platform
Debugger Architecture and based on the NetBeans Platform.
http://code.google.com/p/jswat/
k3b
Feature-rich and easy to handle CD burning application aimed at the
KDE graphical environment.
http://k3b.plainblack.com/
libqt3-mt
Trolltech Qt library, version 3. Required library for the applications
that link against libqt-mt.so.3, like all KDE apps and Opera browser.
http://trolltech.com/products/qt/
librarian
Free tool for self-creation of virtual annotated library of PDF articles, designed for small trusted groups, e.g. science labs.
Librarian is written in PHP and thus produces standard HTML output that can be read by IE5 or NN4 compatible internet browsers.
http://www.bioinformatics.org/librarian/
linux-headers-2.6.18-6-686
The kernel headers. These are used for building extra kernel modules. In my case, I used them for my laptop to support the proprietary Nvidia driver. The Debian package holds the same name.
memstat
This Debian package discovers
what libraries and programs are using up memory.
http://debaday.debian.net/2008/10/19/memstat-identify-what-is-using-up-virtual-memory/
memtest86
A RAM tester. Useful for checking recently bought memory.
http://www.memtest86.com/
mldonkey-server
My favorite P2P client. It accesses lots of different file-sharing
networks. It has a GUI, a TUI and a WUI.
http://mldonkey.sourceforge.net/Main_Page
modconf
Provides a terminal-based interface for installing and configuring device
driver modules. I used it to set the cpufreq module in order to
control the working frequency of the processor thus obtaining an optimal
speed/consumption ratio.
See [Debian GNU/Linux]
nmap
Scans a network in order to determine what hosts are available, what
services (application name and version) those hosts are offering, what
operating systems (and OS versions) they are running, what type of
packet filters/firewalls are in use, and dozens of other characteristics.
It is a Swiss Army knife for crackers when used malevolently. Useful for
network administrators.
http://nmap.org/
openoffice.org
Office suite provided by Sun Microsystems. It has nothing to envy in the
proprietary MS Office that most people still stubbornly use.
http://www.openoffice.org/
openssh-server
Runs a daemon on the host that accepts remote connections via SSH. I find
it useful/necessary for controlling the PC remotely, especially when
a problem occurs and all other peripherals are dead: there's always an
open port (usually TCP/22) available to save the computer
from a crude reboot.
http://www.openssh.com/
opera browser
A very nice, fully standards-compliant Internet browser with a low memory
footprint that suits my low-resource laptop. Although it is not free
software, the company that develops it offers a free binary distribution
for personal computers and mobile phones.
http://www.opera.com/
pdftk
PDF Toolkit. Simple tool for doing everyday things with PDF documents such as merging, splitting,
rotations, etc.
http://www.pdfhacks.com/pdftk/
pmd
PMD scans Java source code and looks for potential problems like
possible bugs, dead code, suboptimal code, overcomplicated expressions
and duplicate code.
http://pmd.sourceforge.net/
pptp-linux
Client for establishing a VPN with La Salle (for example) through
PPTP. This is a security hole (remember that MS is
behind it). If I need to surf the Internet with a La Salle IP address
to access scientific literature, I prefer to use wget over an
SSH connection.
http://pptpclient.sourceforge.net/
python
Dynamic object-oriented programming language that can be used for many kinds of software development. It offers strong support for integration with other languages and tools, comes with extensive standard libraries, and can be learned in a few days. Many Python programmers report substantial productivity gains and feel the language encourages the development of higher quality, more maintainable code.
http://www.python.org/
qiv
Quick Image Viewer. A CLI tool to display images. Handy and swift.
http://www.klografx.net/qiv/
rcconf
Debian admin tool for configuring system services according to system runlevels.
rdesktop
Open source client for Windows Terminal Services, capable of natively speaking Remote Desktop Protocol (RDP) in order to present the user's Windows desktop. Supported servers include Windows 2000 Server, Windows Server 2003, Windows Server 2008, Windows XP, Windows Vista and Windows NT Server 4.0.
http://www.rdesktop.org/
rdiff-backup
Easy incremental backups from the command line.
rdiff-backup is a Python script that helps with local and remote incremental backups.
http://debaday.debian.net/2008/10/26/rdiff-backup-easy-incremental-backups-from-the-command-line/
remind
Sophisticated calendar and alarm program with a Text User Interface. Ideal for combining with alpine (see above).
http://www.roaringpenguin.com/products/remind
simple php blog
Flat file blog written in PHP. Easy to install and run.
http://www.simplephpblog.com/
sox
Sound eXchange. The Swiss Army knife of sound processing programs, as
described in the project's homepage. SoX is a cross-platform
command line utility that can convert various formats of computer audio files into other formats. It can also apply various effects to these sound files and play and record audio files on many major platforms.
http://sox.sourceforge.net/
subversion
VCS. This is one of those tools one gets used to
doing without until one
becomes aware of its existence, then tries it and ends up finding it impossible
to do coding tasks without it. I'm not the only one who holds this
opinion.
http://subversion.tigris.org/
tjws
Tiny Java Web Server. The server is quite small, both in Java source code and in the resulting bytecode. Its general purpose is running and debugging servlets. However, it can be used as a regular web server for sites with low to medium load.
http://tjws.sourceforge.net/
unace
Program for extracting, testing and viewing ACE archives. The Debian package holds the same name.
unzip
Obvious usefulness. The Debian package holds the same name.
vim
Vi improved. Text editor. Console-based, light, customizable... For me,
an essential tool. It has advanced features for programming tasks such
as syntax highlighting, auto indentation and line numbering. I use it
for almost everything.
http://www.vim.org/
http://fprintf.net/vimCheatSheet.html
http://www.viemu.com/a_vi_vim_graphical_cheat_sheet_tutorial.html
virtualbox
Easy virtualization program. It provides a generic hardware emulation that
is used to install and run a guest OS inside a host OS. Ideal for
running a guest Win box with all those apps that are still tied to that
platform.
http://www.virtualbox.org/
Free virtualization tools for GNU/Linux systems
visualvm
Visual tool integrating several commandline JDK tools and lightweight profiling capabilities. Designed for both production and development time use, it further enhances the capability of monitoring and performance analysis for the Java SE platform.
https://visualvm.dev.java.net/
vlc
Video LAN Client. A media player. Supports the majority of the encodings
used nowadays. Streaming also available.
http://www.videolan.org/
vlock
Locks the current terminal (local or remote), or locks the entire
virtual console system, completely disabling all console access.
A nice way to keep nosy people at bay.
http://linux.maruhn.com/sec/vlock.html
vncviewer
The VNC client most compatible and compliant with the original implementation.
wine
Wine Is Not an Emulator. It is an implementation of the Win16 and Win32
API for Unix-like systems under the Intel platforms. A means of having
Windows software running on a GNU/Linux box without virtualizing the
whole system.
http://www.winehq.org/
wodim
Write Optical Disk Media. A command line tool that allows you to create CDs or DVDs on a CD/DVD recorder.
http://www.cdrkit.org/
wordpress
Blog publishing system written in PHP. Runs along with a MySQL database.
Very usable and customizable.
http://wordpress.org/
wxwidgets
GUI cross-platform library which can be used from languages such as C++, Python and Perl.
http://www.wxwidgets.org/
x11vnc
A most complete VNC server which runs and is configured through the
command line.
http://www.karlrunge.com/x11vnc/
xchm
A viewer for MS Compiled HTML Help files.
http://xchm.sourceforge.net/
xpaint
A tiny paint program for X. Ideal for those of us who have not had
time to learn a good application like Gimp but still need to hack
images from time to time.
http://sourceforge.net/projects/sf-xpaint/
xserver-xorg
Graphical server based on the open source implementation of the X
Window System provided by the X.Org project. Yes, GUIs eventually drive
you crazy. Joking aside, the installation of this package is a must.
We engineers do have a lot of PDF reading.
http://www.x.org/wiki/
yafc
Yet Another FTP Client. A very nice one. This is a sort of mixture
between a plain
FTP client and an SSH client, with all the advantages that this
implies.
http://yafc.sourceforge.net/
zip
Obvious.
Engineering
adaptive resonance theory for unsupervised learning
This software package includes the ART algorithms for unsupervised learning only. It is a family of four programs based on different ART algorithms (ART 1, ART 2A, ART 2A-C and ART distance). All of them are clustering algorithms and they are command-line programs. Written in C++.
http://www.fi.muni.cz/~xhudik/art/
adriane
Audio Desktop Reference Implementation and Networking Environment.
It is an implementation of an easy-to-use desktop system, which can be used entirely without vision oriented output devices. Especially access to standard internet services like email, www, chat, and using mobile phone extension services like SMS and MMS (over the users own mobile phone via bluetooth) are supported.
http://www.knopper.net/knoppix-adriane/index-en.html
alchemy
Software package providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on the Markov logic representation. Alchemy allows you to easily develop a wide range of AI applications. Coded in C++.
http://alchemy.cs.washington.edu/
aleph
Multi-platform machine learning framework aimed at simplicity and
performance, and library of selected state-of-the-art algorithms.
Aleph is coded in the Java programming language.
http://aleph-ml.sourceforge.net/
ann
Library for Approximate Nearest Neighbor Searching.
ANN is a library written in C++, which supports data structures and algorithms for both exact and approximate nearest neighbor searching in arbitrarily high dimensions.
http://www.cs.umd.edu/~mount/ANN/
anna
The Artificial Neural Network Architecture. It is a backpropagation neural network C++ class designed to fit well with the FLTK library.
http://eetorres.googlepages.com/anna
apache mahout
Library aimed at delivering scalable machine learning tools under the Apache license.
http://lucene.apache.org/mahout/
antlr
ANother Tool for Language Recognition.
Language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages. ANTLR provides excellent support for tree construction, tree walking, translation, error recovery, and error reporting.
http://www.antlr.org/
ardour
DAW. Used to record, edit and mix multi-track audio. Ardour strives
to meet the needs of professional users.
http://ardour.org/
http://parumi.org/curso_produccion_musical_linux/capitulo3.html
http://parumi.org/curso_produccion_musical_linux/capitulo4.html
http://parumi.org/curso_produccion_musical_linux/capitulo5.html
arduino
An open source electronics prototyping platform for developing projects
with an ATMEL microcontroller. Ideal for small university projects.
http://www.arduino.cc/
armadillo
Linear algebra library (matrix and vector maths) aiming towards a good balance between speed and ease of use. It's distributed under a license that is useful in both commercial and open-source contexts.
This library is useful if C++ has been decided as the language of choice (due to speed and/or integration capabilities).
http://arma.sourceforge.net/
audacity
Sound editor. Free, open source software for recording and editing
sounds. It's quite complete for my taste. Also extensible through
LADSPA plugins in order to obtain a bigger collection of sound effects.
http://audacity.sourceforge.net/
http://parumi.org/curso_produccion_musical_linux/capitulo1.html
See [LADSPA]
autobi
Tool for the automatic analysis of Standard American English prosody. AuToBI is a Java toolkit that hypothesizes pitch accents and phrase boundaries. The toolkit includes an acoustic feature extraction frontend, and a classification backend that is heavily supported by the Weka machine learning toolkit.
http://eniac.cs.qc.cuny.edu/andrew/autobi/
bison
Bison is a general-purpose parser generator that converts an annotated context-free grammar into an LALR(1) or GLR parser for that grammar.
Bison is upward compatible with Yacc: all properly-written Yacc grammars ought to work with Bison with no change. Anyone familiar with Yacc should be able to use Bison with little trouble. You need to be fluent in C or C++ programming in order to use Bison.
http://www.gnu.org/software/bison/
See [yacc]
brian
Simulator for spiking neural networks written in Python.
http://brian.di.ens.fr/
brill tagger
PoS Tagger. Uses Transform-Based Learning. Implemented in C.
http://research.microsoft.com/~brill/
chestnut machine learning suite
Collection of machine learning algorithms written in Python with some code written in C for efficiency. Most algorithms are called with a simple, functional API with input data encoded as arrays.
http://www.soe.ucsc.edu/~eads/chestnut/
ci-bayes
Bayesian Classifiers for Java. This project contains two Bayesian classifiers for Java: a naive implementation and a Fisher implementation. It's merely a port of Toby Segaran's Python code for Bayesian analysis from his book "Programming Collective Intelligence."
The only requirement for this library is javolution.
It's licensed under the Artistic License.
https://ci-bayes.dev.java.net/
cicero tts
Small, Fast and Free Text-To-Speech Engine.
http://www.cam.org/~nico/cicero/
cilib
Computational Intelligence Library written in Java.
It is a collaborative component
based framework for developing Computational Intelligence software in
swarm intelligence, evolutionary computing, neural networks,
artificial immune systems, fuzzy logic and robotics. Developed at the
University of Pretoria.
http://www.cilib.net/
circuit simulator
A Java-based circuit simulator. A great way to simulate simple circuits using a plain Java enabled browser.
http://www.falstad.com/circuit/
clam
C++ Library for Audio and Music. CLAM is a full-fledged software framework for research and application development in the Audio and Music Domain. It offers a conceptual model as well as tools for the analysis, synthesis and processing of audio signals. It also provides a Faust integration.
http://clam.iua.upf.edu/index.html
See [Faust]
cln
Class Library for Numbers. CLN is a C++ library for efficient computations
with all kinds of numbers in arbitrary precision.
http://www.ginac.de/CLN/
cognitive foundry
Modular Java software library for the research and development of cognitive systems. It contains many reusable components for machine learning, statistics, and cognitive modeling. It is primarily designed to be easy to plug into applications to provide adaptive behaviors.
http://foundry.sandia.gov/
corpus building for minority languages
A web crawling software.
It exploits the vast quantities of text freely available on the web as a way of bringing the benefits of statistical NLP to languages with small numbers of speakers and/or limited computational resources.
http://borel.slu.edu/crubadan/index.html
cvx
Matlab Software for
Disciplined Convex Programming.
Matlab-based modeling system for convex optimization. CVX turns Matlab into a modeling language, allowing constraints and objectives to be specified using standard Matlab expression syntax.
http://www.stanford.edu/~boyd/cvx/
databases for machine learning experiments
An experiment database is a database designed to store learning experiments in full detail, aimed at providing a convenient platform for the study of learning algorithms.
http://expdb.cs.kuleuven.be/expdb/index.php
http://www.statcan.ca/cgi-bin/downpub/freepub.cgi
dbpedia
Community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to make sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data.
http://wiki.dbpedia.org/About
debellor
Open source extensible data mining platform which provides common architecture for data processing algorithms of various types. The algorithms can be combined together to build data processing networks of large complexity. The unique feature of Debellor is data streaming, which enables efficient processing of large volumes of data. Written in Java.
http://www.mimuw.edu.pl/~mwojnars/debellor/
deltalda
Implements the DeltaLDA model, which is a modification of the Latent Dirichlet Allocation (LDA) model. DeltaLDA can use multiple topic mixing weight priors to jointly model multiple corpora with a shared set of topics. The inference method is Collapsed Gibbs sampling. The program can also be used to do "standard" LDA as a special case, and is implemented as a Python C extension module.
http://pages.cs.wisc.edu/~andrzeje/research/delta_lda.html
dlib
A modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes.
It covers threading, networking, GUIs, numerical algorithms,
ML algorithms, image processing, data compression, integrity algorithms and
testing.
http://dclib.sourceforge.net/
dl-learner
Tool for supervised Machine Learning in OWL and Description Logics.
The goal of DL-Learner is to provide a DL/OWL based machine learning tool to solve supervised learning tasks and support knowledge engineers in constructing knowledge and learning about the data they created.
http://dl-learner.org/Projects/DLLearner
dysii
C++ library for distributed probabilistic inference and learning in large-scale dynamical systems. It provides methods such as the Kalman, unscented Kalman and particle filters and smoothers, as well as useful classes such as common probability distributions and stochastic processes.
http://www.indii.org/software/dysii
eblearn
Object-oriented C++ library that implements various machine learning models, including energy-based learning and gradient-based learning for machines composed of multiple heterogeneous modules. In particular, the library provides a complete set of tools for building, training, and running convolutional networks.
http://eblearn.sourceforge.net/
ejml
Efficient Java Matrix Library (EJML) is a linear algebra library for manipulating dense matrices. Its design goals are: 1) to be as computationally and memory efficient as possible for both small and large matrices, and 2) to be accessible to both novices and experts.
http://code.google.com/p/efficient-java-matrix-library/
elefant
Efficient Learning, Large-scale Inference, and Optimization Toolkit.
An open source library for machine learning licensed under the Mozilla Public License. Written in Python.
http://elefant.developer.nicta.com.au/
ephi
C++ physics simulation software to simulate static magnetic fields and movement of charged particles in those fields (using the Lorentz force). Coulomb forces are also accounted when simulating particle paths. So Ephi allows you to model and visualize magnetic fields through current elements and also to visualize electron paths within those fields. Magnetic fields are calculated using numeric integration over the Biot-Savart law.
http://www.mare.ee/indrek/ephi/
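The two physical laws mentioned above are easy to try out in Python. The following is a minimal sketch and has nothing to do with Ephi's own C++ code: the function names, the choice of a circular loop in the z=0 plane and the midpoint summation are all my assumptions. It sums the Biot-Savart contributions of short current elements and also evaluates the magnetic part of the Lorentz force, F = q v x B.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (classical value)


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def loop_field(point, radius, current, segments=2000):
    """Field (Bx, By, Bz) at point from a circular loop centred at the origin."""
    total = [0.0, 0.0, 0.0]
    dphi = 2 * math.pi / segments
    for k in range(segments):
        phi = (k + 0.5) * dphi
        # Midpoint of the current element and its direction vector dl.
        src = (radius * math.cos(phi), radius * math.sin(phi), 0.0)
        dl = (-radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi, 0.0)
        r = tuple(p - s for p, s in zip(point, src))
        dist3 = sum(c * c for c in r) ** 1.5
        for i, c in enumerate(cross(dl, r)):
            total[i] += c / dist3
    scale = MU0 * current / (4 * math.pi)
    return tuple(scale * t for t in total)


def lorentz_force(charge, velocity, b_field):
    """Magnetic part of the Lorentz force on a moving charge."""
    return tuple(charge * c for c in cross(velocity, b_field))


if __name__ == "__main__":
    b = loop_field((0.0, 0.0, 0.0), radius=0.1, current=2.0)
    print("numeric Bz :", b[2])
    print("analytic Bz:", MU0 * 2.0 / (2 * 0.1))
```

At the centre of the loop the sum can be checked against the textbook result B = mu0 I / (2R), which is what the last two lines print side by side.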
epos
Language independent rule-driven Text-to-Speech (TTS) system primarily designed to serve as a research tool. Epos is (or tries to be) independent of the language processed, linguistic description method, and computing environment.
http://epos.ure.cas.cz/
espeak
TTS. Compact open source software speech synthesizer for English and other languages, for Linux and Windows.
http://espeak.sourceforge.net/
extjwnl
Extended Java WordNet Library is a Java API for creating, reading and updating dictionaries in WordNet format. extJWNL is an upgraded version of JWNL.
http://extjwnl.sourceforge.net/
fann
Fast Artificial Neural Network Library. Implements multilayer artificial
neural networks in C with support for both fully connected and sparsely
connected networks. Cross-platform execution in both fixed and floating
point are supported. It includes a framework for easy handling of training
data sets. It is easy to use, versatile, well documented, and fast, with
many bindings to different languages.
http://leenissen.dk/fann/
faust
Functional AUdio STream.
A compiled language for real-time audio signal processing.
Its programming model combines two approaches : functional programming and block diagram composition. You can think of FAUST as a structured block diagram language with a textual syntax.
http://faust.grame.fr/
festival
TTS. Speech synthesis system. Developed at the Centre for Speech Technology
Research at the University of Edinburgh and written in C++, Festival is
one of the most important free software speech synthesis systems
available today. It is related to the Festvox project.
http://www.cstr.ed.ac.uk/projects/festival/
HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)
See [Festvox]
festvox
Aims to make the building of new synthetic voices more systematic
and better documented, making it possible for anyone to build a new voice
for Festival.
Developed by Carnegie Mellon University's speech group.
http://festvox.org/index.html
http://gps-tsc.upc.es/veu/festcat/
See [Festival]
figtree
Library for fast computation of Gauss transforms in multiple dimensions, using the Improved Fast Gauss Transform and Approximate Nearest Neighbor searching.
http://www.umiacs.umd.edu/~morariu/figtree/
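For reference, this is the plain sum that a fast Gauss transform approximates, as a minimal Python sketch. It is not FIGTree's API: the function name and the bandwidth convention exp(-d^2 / h^2) are assumptions on my side. The direct version costs one exponential per source/target pair, which is exactly what the fast methods try to avoid.

```python
import math


def gauss_transform(sources, weights, targets, bandwidth):
    """Direct Gauss transform: G(y) = sum_i q_i * exp(-|x_i - y|^2 / h^2)."""
    result = []
    for y in targets:
        total = 0.0
        for x, q in zip(sources, weights):
            # Squared Euclidean distance in any number of dimensions.
            d2 = sum((a - b) ** 2 for a, b in zip(x, y))
            total += q * math.exp(-d2 / bandwidth ** 2)
        result.append(total)
    return result


if __name__ == "__main__":
    sources = [(0.0, 0.0), (1.0, 0.0)]
    weights = [1.0, 0.5]
    targets = [(0.0, 0.0), (0.5, 0.5)]
    print(gauss_transform(sources, weights, targets, bandwidth=1.0))
```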
flanagan java scientific library
Java scientific and numerical library to support both research and undergraduate programming courses and projects.
http://www.ee.ucl.ac.uk/~mflanaga/java/index.html
flite
Flite (festival-lite) is a small, fast run-time synthesis engine developed at CMU and primarily designed for small embedded machines and/or large servers. Flite is designed as an alternative synthesis engine to Festival for voices built using the FestVox suite of voice building tools.
http://www.speech.cs.cmu.edu/flite/
flowdesigner
Data flow oriented development environment. It can be used to build complex applications by combining small, reusable building blocks. In some ways, it is similar to both Simulink and LabView, but is hardly a clone of either.
Quite fast and written in C++, it features a plugin mechanism that allows plugins/toolboxes to be easily added.
http://flowdesigner.sourceforge.net/wiki/index.php/Main_Page
fms
Fully Modular Synthesizer. Tool to generate all kinds of sounds.
http://fmsynth.sourceforge.net/
freefem++
Language written in C++ dedicated to the
finite element method. It enables solving
Partial Differential Equations (PDE) easily.
http://www.freefem.org/ff++/
freehdl
VHDL simulator. Used by Qucs for digital simulation.
http://www.freehdl.seul.org/
freeling
Open Source Suite of Language Analyzers developed in C++.
Includes a larger Spanish dictionary,
a debugged English dictionary,
more WN-based semantic information access,
a more expressive rule language for dependency parsing,
Machine Learning functionalities moved to the external omlet+fries library for clearer organization,
support for 64-bit processors and
an extended Java API.
http://garraf.epsevg.upc.es/freeling/
freemat
Free environment for rapid engineering and scientific prototyping and
data processing. Similar to Matlab.
http://freemat.sourceforge.net/
freertos
RTOS. Portable open source mini Real Time Kernel for time-critical
applications. The project provides ports to many processor
architectures.
http://www.freertos.org/
freetts
A speech synthesizer written entirely in the Java programming language. It is based upon Flite: a small run-time speech synthesis engine developed at Carnegie Mellon University. Flite is derived from the Festival Speech Synthesis System from the University of Edinburgh and the FestVox project from Carnegie Mellon University.
http://freetts.sourceforge.net/
frink
A calculating tool and programming language that tracks units of measure
through all calculations, which makes it well suited to physical
calculations. The tool is named after a character from The Simpsons:
the brilliant Professor John Frink.
http://futureboy.homeip.net/frinkdocs/
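To get a feeling for what "tracking units of measure through all calculations" means, here is a minimal sketch in Python. It is not Frink code and has nothing to do with how Frink is implemented; the Quantity class and its helper functions are hypothetical names I made up for the illustration. Each value carries the exponents of its units, multiplication and division combine those exponents, and addition refuses to mix different units.

```python
from dataclasses import dataclass


def _normalize(units):
    """Drop zero exponents and return a sorted tuple of (unit, exponent)."""
    return tuple(sorted((u, e) for u, e in dict(units).items() if e != 0))


def _combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) the exponents of b to those of a."""
    merged = dict(a)
    for unit, exp in b:
        merged[unit] = merged.get(unit, 0) + sign * exp
    return _normalize(merged)


@dataclass(frozen=True)
class Quantity:
    """A number together with its unit exponents, e.g. (('m', 1), ('s', -1))."""
    value: float
    units: tuple = ()

    @staticmethod
    def of(value, **units):
        # Hypothetical convenience constructor: Quantity.of(3.0, m=1, s=-1)
        return Quantity(value, _normalize(units))

    def __mul__(self, other):
        return Quantity(self.value * other.value,
                        _combine(self.units, other.units, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value,
                        _combine(self.units, other.units, -1))

    def __add__(self, other):
        # Adding metres to seconds is a bug, so fail loudly.
        if self.units != other.units:
            raise TypeError(f"unit mismatch: {self.units} vs {other.units}")
        return Quantity(self.value + other.value, self.units)


distance = Quantity.of(100.0, m=1)
duration = Quantity.of(20.0, s=1)
speed = distance / duration  # carries the units m^1 s^-1
```

With this in place, distance + duration raises a TypeError instead of silently producing a meaningless number, which is the kind of mistake a unit-aware tool is meant to catch.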
galib
Set of C++ genetic algorithm objects. The library includes tools for using genetic algorithms to do optimization in any C++ program using any representation and genetic operators.
http://lancet.mit.edu/ga/
gaussian process resources
Resources concerned with probabilistic modeling, inference and learning based on Gaussian processes.
Literature, software and more.
http://www.gaussianprocess.org/
geda
GPL'd suite of Electronic Design Automation tools. Another application
I would have liked to know about when doing electronic designs. It includes
schematic capture, simulation, prototyping and production. A true
alternative to commercial proprietary software like OrCAD.
http://www.geda.seul.org/
ghdl
Complete VHDL simulator built on GCC technology. Its results are
dumped to a text file, which can then be inspected visually with the
GTKWave program. That said, Altera already offers a free binary distribution of
its IDE for working with its FPGAs.
http://ghdl.free.fr/
ghmm
General Hidden Markov Model library.
C library implementing efficient data structures and algorithms for basic and extended HMMs.
Coded at the Max Planck Institute for Molecular Genetics.
http://ghmm.org/
ginnet
Graphical Interface for Neural Networks.
A decision-making platform written in Java. It has been developed to encourage the development and use of neural networks.
Classical neural network models are already available (Multi-layer perceptron, Kohonen self-organizing maps, neural gas, growing neural gas, etc.).
http://ginnet.gforge.inria.fr/
gjrand
Pseudo-random number generator for the purpose of simulations,
Monte-Carlo integration, computer games and the like.
http://gjrand.sourceforge.net/
gpalta
Simple and fast Genetic Programming toolbox written in Java.
http://gpalta.berlios.de/doku/doku.php
gp music composition
Genetic Programming techniques to allow computers to compose music.
Genetic Programming is an Artificial Intelligence technique that evolves "fit" individual programs from an initially random population of programs. In the case of music, fitness can be defined as how pleasing it is to listen to a particular sequence.
http://graphics.stanford.edu/~bjohanso/gp-music/gp_music-old.html
gsl
GNU Scientific Library (GSL) is a numerical library for C and C++
programmers. The library provides a wide range of mathematical routines
such as random number generators, special functions and least-squares
fitting. There are over 1000 functions in total with an extensive test
suite.
http://www.gnu.org/software/gsl/gsl.html
gstreamer
A library for constructing graphs of media-handling components.
Multimedia framework written in C. GStreamer serves a host of multimedia applications, such as video editors, streaming media broadcasters, and media players.
http://gstreamer.freedesktop.org/
gtkwave
A waveform viewer for interpreting the results dumped by ghdl.
http://home.nc.rr.com/gtkwave/
hapi
HTK Application Programming Interface.
http://www.ee.uwa.edu.au/~roberto/research/speech/local/entropic/HAPIBook/hapibook.html
See [HTK]
hmmpak
Java HMM toolkit implemented at the Arizona State University, aimed at
building a gesture recognition system.
http://www.public.asu.edu/~tmcdani/hmm.htm
hotbits
Genuine random numbers, generated by radioactive decay.
An Internet resource that brings genuine random numbers, generated by a process fundamentally governed by the inherent uncertainty in the quantum mechanical laws of nature, directly to your computer in a variety of forms. HotBits are generated by timing successive pairs of radioactive decays detected by a Geiger-Müller tube interfaced to a computer. Includes Java
code to query the server.
http://fourmilab.ch/hotbits/
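The idea of turning decay timings into bits can be sketched in a few lines of Python. This is only my reading of "timing successive pairs of radioactive decays", not the actual HotBits implementation: the function names are hypothetical, and since I have no Geiger-Müller tube at hand the intervals are simulated with a pseudo-random generator, so the output here is not genuinely random.

```python
import random


def bits_from_intervals(intervals):
    """Compare non-overlapping pairs of intervals and emit one bit per pair."""
    bits = []
    for t1, t2 in zip(intervals[0::2], intervals[1::2]):
        if t1 == t2:
            continue  # a tie carries no information, so the pair is skipped
        bits.append(1 if t1 < t2 else 0)
    return bits


def simulated_intervals(n, rate=1.0, seed=None):
    """Stand-in for the detector: exponentially distributed waiting times.

    Assumption: decay events are modelled as a Poisson process, so the
    time between two events follows an exponential distribution.
    """
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]


bits = bits_from_intervals(simulated_intervals(16, seed=42))
```

Because both intervals of a pair come from the same distribution, either one is equally likely to be the shorter, which is why the comparison gives an unbiased bit even when the decay rate itself is unknown.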
htk
Hidden Markov Model Toolkit. Excellent toolkit for HMM-based speech
recognition applications, among many others. Written in C by
the Cambridge University Engineering Department, it has been
adopted by many universities for research projects.
http://htk.eng.cam.ac.uk/
hts
TTS. HMM-based Speech Synthesis System, developed
at the Nagoya Institute of Technology. Makes use of HTK. The voices it
produces can be used with Festival.
http://hts.sp.nitech.ac.jp/?Home
See [HTK]
See [Festival]
hydrogen
Advanced drum machine for GNU/Linux. Its main goal is to bring
professional yet simple and intuitive pattern-based drum programming.
http://www.hydrogen-music.org/
http://parumi.org/curso_produccion_musical_linux/capitulo2.html
imagej
Public domain, Java-based image processing program developed at the National Institutes of Health.
ImageJ was designed with an open architecture that provides extensibility via Java plugins and recordable macros.
http://rsbweb.nih.gov/ij/
isip asr
ASR developed at the Mississippi State University. Written in C++ and
aimed at research activities.
http://www.ece.msstate.edu/research/isip/projects/speech/index.html
it++
C++ library of mathematical, signal processing, speech processing
and communications classes and functions.
Its main use is in simulation of communication systems and for
performing research in the area of communications.
Developed at the Chalmers University of Technology.
http://itpp.sourceforge.net/
http://www.netlib.org/blas/
http://www.netlib.org/lapack/
http://math-atlas.sourceforge.net/
http://www.fftw.org/
http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm
http://developer.amd.com/cpu/libraries/acml/Pages/default.aspx
jack
Low-latency audio server, written for POSIX conformant operating systems
such as GNU/Linux. It can connect a number of different applications to
an audio device, as well as allowing them to share audio between
themselves. JACK is an essential tool for audio plumbing.
http://jackit.sf.net/
jahmm
Java HMM library written with code readability in mind. Designed to be easy
to use and general purpose.
http://www.run.montefiore.ulg.ac.be/~francois/software/jahmm/
jamin
JACK Audio Connection Kit (JACK) Audio Mastering interface. JAMin is an open source application designed to perform professional audio mastering of stereo input streams. It uses LADSPA for digital signal processing (DSP).
http://jamin.sourceforge.net/en/about.html
See [LADSPA]
javacc
Java Compiler Compiler. The most popular parser generator for use with Java
applications.
A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar.
https://javacc.dev.java.net/
java-ml
Java Machine Learning Library. Library of ML algorithms and related
datasets. Machine learning techniques include: clustering, classification,
feature selection, regression, data pre-processing, ensemble learning,
voting...
http://java-ml.sourceforge.net/
jawbone
Java WordNet API library (hence the "Jaw" portion of the name - an acronym for Java API for WordNet). It makes it very easy to search the WordNet data files for terms, either all terms or just those terms matching some search criteria.
http://mfwallace.googlepages.com/jawbone.html
jaws
Java API for WordNet Searching.
API that provides Java applications with the ability to retrieve data from the WordNet database. It is a simple and fast API that is compatible with both the 2.1 and 3.0 versions of the WordNet database files and can be used with Java 1.4 and later.
http://lyle.smu.edu/~tspell/jaws/index.html
[See WordNet]
jblas
Fast linear algebra library for Java. jblas is based on BLAS and LAPACK, the de-facto industry standard for matrix computations, and uses state-of-the-art implementations like ATLAS for all its computational routines, making jblas very fast.
http://jblas.org/
jclec
Software system for Evolutionary Computation (EC) research, developed in the Java programming language. It provides a high-level software environment to do any kind of Evolutionary Algorithm (EA), with support for genetic algorithms (binary, integer and real encoding), genetic programming (Koza style, strongly typed, and grammar based) and evolutionary programming.
http://jclec.sourceforge.net/
jena
Java framework for building Semantic Web applications. It provides a programmatic environment for RDF, RDFS, OWL and SPARQL, and includes a rule-based inference engine.
http://jena.sourceforge.net/
jess
Rule engine and scripting environment written entirely in Java. Using Jess, you can build Java software that has the capacity to "reason" using knowledge you supply in the form of declarative rules. Jess is small, light, and one of the fastest rule engines available.
http://www.jessrules.com/
jhapi
Java HAPI.
http://www.ee.uwa.edu.au/~roberto/research/speech/local/entropic/HAPIBook/node182.html#SECTION06200000000000000000
See [HAPI]
jinsect
Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
http://users.iit.demokritos.gr/~ggianna/#Tools%20JInsect
jlab
A scientific open-source programming environment coded in Java.
https://jlab.dev.java.net/
jmathlib
A Java Clone of Octave, SciLab, Freemat and Matlab.
http://www.jmathlib.de/
jmonkey
Java scenegraph API. Its primary focus is high-performance 3D gaming. jME itself is written entirely in Java and uses an abstraction layer for communicating natively with the platform's hardware.
http://www.jmonkeyengine.com/
jncc2
Java Implementation of Naive Credal Classifier 2.
NCC2 constitutes an extension of the traditional Naive Bayes Classifier (NBC) towards imprecise probabilities; it is designed to return robust classification, even on small and/or incomplete data sets. A peculiar feature of NCC2 is that it returns set-valued (or imprecise) classifications (i.e., more than one class) when faced with doubtful instances.
http://www.idsia.ch/~giorgio/jncc2.html
joone
Java Object Oriented Neural Engine. A framework to create, train
and test artificial neural networks.
http://www.jooneworld.com/
jprogram
PRObabilistic GRAphical Models in Java. Open-source Java library which can be used for learning the following probabilistic models from data: Bayesian networks, Markov random fields, hybrid random fields, probabilistic decision trees, dependency networks, Gaussian mixture models, and Parzen windows.
http://www.dii.unisi.it/~freno/JProGraM.html
julius
High-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers.
Based on word N-grams and context-dependent HMMs, it can perform almost real-time decoding on most current PCs in a 60k-word dictation task.
It is written in C and the main platform is Linux.
http://julius.sourceforge.jp/en_index.php
jwnl
Java WordNet Library. API for accessing WordNet-style relational dictionaries. It also provides functionality beyond data access, such as relationship discovery and morphological processing.
http://jwordnet.sourceforge.net/
[See WordNet]
jwordnet
Pure Java standalone object-oriented interface to the WordNet database of lexical relationships. It is intended for Java programmers who wish to write portable Java applications that use a local copy of the WordNet files, or who find JWordNet's object-oriented interface preferable to the procedural interface that the C library (and native method interfaces built on top of it) provide.
http://jwn.sourceforge.net/
[See WordNet]
keel
Knowledge Extraction based on Evolutionary Learning.
Spanish national project providing a software tool to assess evolutionary algorithms for Data Mining problems including regression, classification, clustering, pattern mining and so on. It contains a large collection of classical knowledge extraction algorithms, preprocessing techniques (instance selection, feature selection, discretization, imputation methods for missing values, etc.), Computational Intelligence based learning algorithms, including evolutionary rule learning algorithms based on different approaches (Pittsburgh, Michigan and IRL, ...), and hybrid models such as genetic fuzzy systems, evolutionary neural networks, etc.
http://www.keel.es/
kernlab
Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods kernlab includes Support Vector Machines, Spectral Clustering, Kernel PCA and a QP solver.
http://cran.r-project.org/web/packages/kernlab/index.html
ladspa
Linux Audio Developer's Simple Plugin API. It is a standard that allows
software audio processors and effects to be plugged into a wide range of
audio synthesis and recording packages.
http://www.ladspa.org/
http://tap-plugins.sourceforge.net/
http://plugin.org.uk/ladspa-swh/docs/ladspa-swh.html
http://javiervalcarce.es/wiki/Como_escribir_un_plugin_LADSPA
http://javiervalcarce.es/wiki/RTLADSPA
lame
High quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.
http://lame.sourceforge.net/index.php
latex
A document preparation system. It is a high quality typesetting system.
It is oriented to scientific and technical productions, although
I have read that humanities faculties have begun to use it for the splendid
quality that it yields. Prestigious scientific publishers, such
as the IEEE, provide
the templates required for their publications.
Apart from articles and books, LaTeX can prepare presentations, calendars,
drawings... it's really powerful. The Debian packages for having the
application ready are: tetex-base, tetex-bin, tetex-doc and tetex-extra.
http://www.latex-project.org/
http://www.ieee.org/web/publications/authors/transjnl/index.html
[See Beamer]
libann
Library that supports all kinds of Neural Nets, including ARTs and more.
Currently it has a Multi-Layer Perceptron network, Kohonen network, a
Boltzmann machine and a Hopfield network.
The library is written in an object-oriented style using the C++ Standard Template Library.
http://www.nongnu.org/libann/
See [STL]
libocas
Library implementing OCAS solver
for training linear SVM classifiers from large-scale data. Coded in C.
http://cmp.felk.cvut.cz/~xfrancv/ocas/html/index.html
libsvm
A C++ and Java Library for Support Vector Machines.
LIBSVM is an integrated software for support vector classification,
regression and distribution estimation.
It supports multi-class classification.
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
linsmith
Smith charting program for GNU/Linux, mainly designed for educational use.
http://jcoppens.com/soft/linsmith/index.en.php
lwpr
Locally Weighted Projection Regression (LWPR) is a recent algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it uses locally linear models, spanned by a small number of univariate regressions in selected directions in input space. A locally weighted variant of Partial Least Squares (PLS) is employed for doing the dimensionality reduction. A C-library with wrappers for C++, Matlab/Octave, and Python.
http://www.ipab.inf.ed.ac.uk/slmc/software/lwpr/index.html
mallet
MAchine Learning for LanguagE Toolkit.
Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
http://mallet.cs.umass.edu/
marvin
Extensible image processing framework developed in Java. It is open source and distributed under GPL License.
Algorithms to manipulate images are externally implemented as plug-ins. The framework provides an interface to manipulate plug-ins.
http://marvinproject.sourceforge.net/en/index.html
mary
TTS. Speech synthesis system for German, English and Tibetan. Produced by
the Institute of Phonetics at Saarland University using the Java
technology.
http://mary.dfki.de/
matpack
Numerics and
Graphics Library written in C++.
http://www.matpack.de/
maxent
PoS Tagger. Mature Java package for training and using maximum entropy models.
http://maxent.sourceforge.net/
maxima
CAS. System for the manipulation of symbolic and numerical expressions,
including differentiation, integration, Taylor series, Laplace transforms,
ordinary differential equations, systems of linear equations, polynomials,
and sets, lists, vectors, matrices, and tensors. A direct competitor of
Derive.
http://maxima.sourceforge.net/
mdp
Modular toolkit for Data Processing. Data processing framework written in Python.
MDP consists of a collection of trainable supervised and unsupervised algorithms or other data processing units (nodes) that can be combined into data processing flows and more complex feed-forward network architectures.
http://mdp-toolkit.sourceforge.net/
meep
MIT Electromagnetic Equation Propagation. FDTD simulator for modeling
electromagnetic systems.
http://ab-initio.mit.edu/wiki/index.php/Meep
mlpack
Comprehensive scalable machine learning library.
Developed by the Fundamental Algorithmic and Statistical Tools laboratory (FASTlab), MLPACK and its core functions library FASTlib fill a much-needed gap.
http://mloss.org/software/view/152/0.1
mlpy
High-performance Python/NumPy based package for machine learning.
Includes classification, feature weighting, feature ranking, resampling
methods, metric functions, feature list analysis and landscaping tools.
https://mlpy.fbk.eu/
morphix-nlp
Live CD Linux distribution with a rich collection of Natural Language Processing (NLP) applications.
http://morphix-nlp.berlios.de/
mulan
Open-source Java library for learning from multi-label datasets. Multi-label datasets consist of training examples of a target function that has multiple binary target variables. This means that each item of a multi-label dataset can be a member of multiple categories or annotated by many labels (classes).
http://mulan.sourceforge.net/
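A small Python sketch may help to picture what a multi-label dataset looks like. It does not use Mulan (which is a Java library) or its file format; the function names and the toy documents are made up for the illustration. Each example carries a set of labels, and the set is unfolded into one binary target variable per label.

```python
def to_indicator_matrix(label_sets):
    """Turn a list of label sets into (sorted label names, 0/1 matrix)."""
    labels = sorted(set().union(*label_sets))
    matrix = [[1 if label in labels_of_item else 0 for label in labels]
              for labels_of_item in label_sets]
    return labels, matrix


def label_cardinality(label_sets):
    """Average number of labels per example."""
    if not label_sets:
        return 0.0
    return sum(len(s) for s in label_sets) / len(label_sets)


# Hypothetical toy data: three documents, each tagged with zero or more topics.
documents = [
    {"sports", "politics"},  # belongs to two categories at once
    {"music"},
    set(),                   # an item with no label at all
]
labels, targets = to_indicator_matrix(documents)
```

Here labels comes out in alphabetical order (music, politics, sports) and every row of targets is one example with one binary column per label, so the first document gets the row 0, 1, 1.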
multiwordnet
Multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6.
http://multiwordnet.itc.it/english/home.php
mxpost
Java PoS Tagger based on the model of maximum entropy.
ftp://ftp.cis.upenn.edu/pub/adwait
nas
Network Audio System.
Network transparent, client/server audio transport system. It can be described as the audio equivalent of an X server.
http://www.radscan.com/nas.html
netkit-srl
Network Learning toolkit for statistical relational learning. It is written in Java 1.5 and was designed with a plug-and-play architecture to enable the mix-and-match between different components in the relational learning process. It integrates seamlessly with the Weka machine learning toolkit, making it possible to use any of Weka's learning classifiers in the context of relational learning.
http://netkit-srl.sourceforge.net/
See [Weka]
neurobjects
A set of C++ library classes for neural networks development.
The main goal of the library consists in supporting researchers and
practitioners in developing new neural network methods and applications,
exploiting the potentialities of object-oriented design and programming.
NEURObjects provides also general purpose applications for classification
problems and can be used for fast prototyping of inductive machine
learning applications.
http://www.disi.unige.it/person/ValentiniG/NEURObjects/
ngramj
Java library for language recognition. It uses language profiles (counts of character sequences) to guess what language some arbitrary text is.
http://ngramj.sourceforge.net/
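The language-profile idea is easy to try out. The following Python sketch is not NGramJ code: the function names are hypothetical, the profiles are built from one toy sentence per language instead of a real corpus, and the cosine similarity is my own choice of comparison, which may well differ from the measure NGramJ actually uses.

```python
from collections import Counter
from math import sqrt


def profile(text, n=3):
    """Count the character n-grams of a text (lowercased, spaces normalized)."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def similarity(p, q):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(count * q[gram] for gram, count in p.items())
    norm = (sqrt(sum(c * c for c in p.values()))
            * sqrt(sum(c * c for c in q.values())))
    return dot / norm if norm else 0.0


def guess_language(text, profiles, n=3):
    """Return the key of the profile that is most similar to the text."""
    target = profile(text, n)
    return max(profiles, key=lambda lang: similarity(target, profiles[lang]))


# Toy training sentences; a real profile would be built from much more text.
profiles = {
    "en": profile("the quick brown fox jumps over the lazy dog and the cat"),
    "es": profile("el rapido zorro marron salta sobre el perro perezoso y el gato"),
}
guess = guess_language("the dog and the fox", profiles)
```

Character n-grams work well for this task because frequent short sequences such as "the" or " el" are very typical of a language and show up even in short snippets of text.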
nico ann toolkit
General purpose toolkit for constructing artificial neural networks
and training with the back-propagation learning algorithm.
It is written in C and originally
developed for speech recognition applications.
http://nico.nikkostrom.com/
nieme
Machine learning library for large-scale classification, regression and ranking. It relies on the framework of energy-based models which unifies several learning algorithms. This framework also unifies batch and stochastic learning which are both seen as energy minimization problems. Nieme is released under the GPL license. It is efficiently implemented in C++.
http://nieme.lip6.fr/
nist math resources
Mathematical and statistical engineering resources from the National
Institute of Standards and Technology.
http://math.nist.gov/
nltk
Natural Language Toolkit.
Suite of open source Python modules, data and documentation for research and development in natural language processing. NLTK contains code supporting dozens of NLP tasks, along with 40 popular corpora and extensive documentation including a 375-page online book.
http://nltk.sourceforge.net/index.php/Main_Page
octave
A command line program intended for numerical computations. Its high level
language is mostly compatible with Matlab, which is worth considering
when lab assignments have to be handed in at school.
Octave is available on the servers of La Salle.
This application has a lot of community support under the Octave-Forge
project.
http://www.gnu.org/software/octave/
http://octave.sourceforge.net/packages.html
Introduccion Informal a Matlab y Octave
opencog
Common platform to build and share artificial intelligence programs. The long-term goal of OpenCog is acceleration of the development of beneficial AGI, a goal which includes developing tools and protocols for AGI safety.
The OpenCog Framework provides an OS-like infrastructure and stable APIs. Written in C++.
http://www.opencog.org/wiki/The_Open_Cognition_Project
opencyc
Open source version of the Cyc technology, the world's largest and most complete general knowledge base and commonsense reasoning engine. OpenCyc can be used as the basis of a wide variety of intelligent applications such
as rapid development of an ontology in a vertical area,
email prioritizing, routing, summarization, and annotating,
expert systems and games.
http://www.opencyc.org/
openhtmm
Hidden Topic Markov Model. Application to model the topics of words
in a document as a Markov chain. Written in C++.
http://code.google.com/p/openhtmm/
open mind speech
Free speech recognition for GNU/Linux. Tools and applications.
http://freespeech.sourceforge.net/
opennlp
Organizational center for open source projects related to natural language processing. Its primary role is to encourage and facilitate the collaboration of researchers and developers on such projects.
http://opennlp.sourceforge.net/
[See Maxent]
orca
Free, open source scriptable screen reader. Using various combinations of speech, braille, and magnification, Orca helps provide access to applications and toolkits that support the AT-SPI (e.g., the GNOME desktop). The development of Orca has been led by the Accessibility Program Office of Sun Microsystems, Inc. with contributions from many community members.
http://live.gnome.org/Orca
pebl-project
Python library and command line application for learning the structure of a Bayesian network given prior knowledge and observations. Pebl has been developed at the Systems Biology lab at the University of Michigan and is available with a permissive MIT-style license.
http://code.google.com/p/pebl-project/
petsc
Portable, Extensible Toolkit for Scientific Computation.
Suite of data structures and routines for the scalable (parallel)
solution of scientific applications modeled by partial differential
equations. It implements bindings for Python.
http://www-unix.mcs.anl.gov/petsc/petsc-as/index.html
phet
Physics Education Technology. Interactive Physics Simulator. Fun, interactive, research-based simulator of physical phenomena from the Physics Education Technology project at the University of Colorado.
Written in Java.
http://phet.colorado.edu/index.php
phmm
Python Hidden Markov Models. Implemented at the University of Bologna.
http://www.biocomp.unibo.it/piero/PHMM/
phoenix
Connected speech recognition system.
Phoenix is a speaker dependent (user trained) connected word recognition
system. Phoenix is designed as a real-time recognition system in that
recognition takes place in parallel to utterance input and partial
results are available before the end of utterance is encountered.
ftp://svr-ftp.eng.cam.ac.uk/comp.speech/recognition/
pikdev
Graphical IDE for the development of PIC-based applications.
Developed in C++ under Linux and based on the KDE environment.
PiKdev can drive parallel port programmers or serial port programmers.
The project's page provides the schematics needed for cheap
development.
http://pikdev.free.fr/
piklab
IDE for applications based on Microchip PIC and dsPIC microcontrollers
similar to the MPLAB environment. It integrates with several compiler
and assembler toolchains (like gputils, sdcc, c18) and with the GPSim
simulator. It supports the most common programmers (serial, parallel,
ICD2, Pickit2, PicStart+), the ICD2 debugger, and several bootloaders
(Tiny, Pickit2, and Picdem).
http://piklab.sourceforge.net/
pinguino
Arduino-like board based on a PIC Microcontroller. The goal of this project is to build an easy-to-use IDE for Linux, Windows and Mac OS X.
The IDE of Pinguino is built with Python. An integrated preprocessor translates specific Arduino instructions directly into C. This preprocessor reduces the code length and the execution time. Pinguino hardware is based on an 18F2550. This chip has an integrated native USB module and a UART for serial link.
http://www.hackinglab.org/pinguino/index_pinguino.html
See [Arduino]
pspp
Program for statistical analysis of sampled data. It is a Free replacement for the proprietary program SPSS, and appears very similar to it with a few exceptions.
PSPP can perform descriptive statistics, T-tests, linear regression and non-parametric tests. Its backend is designed to perform its analyses as fast as possible, regardless of the size of the input data. You can use PSPP with its graphical interface or the more traditional syntax commands.
http://www.gnu.org/software/pspp/
pulse audio
Sound server written in C. It allows you to do advanced operations on your sound data as it passes between your application and your hardware. Things like transferring the audio to a different machine, changing the sample format or channel count and mixing several sounds into one are easily achieved using a sound server.
According to Paul Davis, main developer of Ardour and Jack, PulseAudio is
One Audio System To Bind Them All, adapting the famous quote from The
Lord of the Rings.
http://pulseaudio.org/
pure data
Real-time graphical programming environment for audio, video, and graphical processing.
A very complete and complex application.
http://puredata.info/
pybrain
Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library.
PyBrain is a modular Machine Learning Library for Python. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for Machine Learning Tasks and a variety of predefined environments to test and compare your algorithms.
http://www.pybrain.org/
pyro
Python Robotics. The goal of the project is to provide a programming environment for easily exploring advanced topics in artificial intelligence and robotics without having to worry about the low-level details of the underlying hardware.
http://pyrorobotics.org/?page=Pyro
python numeric and scientific
A rich set of numerical tools for scientific computations with the
python programming language.
http://wiki.python.org/moin/NumericAndScientific
qfsm
A graphical tool for designing finite state machines. Written in C++
using Qt.
http://qfsm.sourceforge.net/about.html
qtag
Probabilistic part-of-speech tagger. That means it's a program that reads text and for each token in the text returns the part of speech (e.g. noun, verb, punctuation, etc.). It works using statistical methods, hence the "probabilistic". As a result it does make mistakes (as does every POS tagger), but it is fairly robust and (from informal evaluation) tags texts with good accuracy.
http://www.english.bham.ac.uk/staff/omason/software/qtag.html
qucs
Quite Universal Circuit Simulator. This is one of the tools I would have
liked to know about when I took the Bachelor's degree in Electronic
Engineering. It is a circuit simulator with a GUI that supports various
kinds of simulations including DC, AC, S-parameter, Harmonic Balance
analysis, etc.
http://qucs.sourceforge.net/
rapidminer
Environment for machine learning and data mining experiments. It allows experiments to be made up of a large number of arbitrarily nestable operators, described in XML files which are created with RapidMiner's graphical user interface. RapidMiner is used for both research and real-world data mining tasks. Written in Java.
http://rapidminer.com/
rlab
Interactive interpreted scientific programming environment. Rlab is a very high level language intended to provide fast prototyping and program development, as well as easy data-visualization, and processing.
It focuses on creating a good experimental environment (or laboratory) in which to do matrix math, which is why it can be called "Matlab-like".
http://rlab.sourceforge.net/
http://rlabplus.sourceforge.net/
rl-glue
A set of common guidelines for the reinforcement learning community to follow to allow us to share and compare agents and environments with greater ease.
The software implementation of RL-Glue is the reusable glue to connect the basic parts of an experiment.
RL-Glue is functionally a harness to "plug in" agents and environments and experiment without having to continually rewrite the connecting code.
http://glue.rl-community.org/
r-project
Free software environment for statistical computing and graphics that
is used frequently for Machine Learning applications and research.
http://www.r-project.org/
rtems
A most complete RTOS for multiprocessor systems.
http://www.rtems.com/
scilab
Open source platform for numerical computation developed at INRIA, the
French national institute for research in computer science and automation.
It has a command line console and a dynamic systems simulator.
Its interface resembles Matlab, the tool used par excellence at La Salle.
Some universities, though, have switched to Scilab because of its good
performance and features.
Scilab is under constant development, so I prefer downloading the
tarball from the Internet instead of dealing with the non-free Debian repos.
http://www.scilab.org/
http://gforge.inria.fr/
sensus
A 70,000-node terminology taxonomy, as a framework into which additional knowledge can be placed. SENSUS is an extension and reorganization of WordNet (built at Princeton University).
At the top level, nodes from the Penman Upper Model have been added, and the major branches of WordNet have been rearranged to fit. In addition, nodes based on work with other ontologies have also been added.
http://www.isi.edu/natural-language/projects/ONTOLOGIES.html
sfml
Free multimedia C++ API that provides low- and high-level access to graphics, input, audio, etc.
http://www.sfml-dev.org/index.php
shapelogic
Toolkit for declarative programming, image processing and computer vision.
ShapeLogic is a library for declarative programming and lazy computations in Java,
for image processing and computer vision, and it includes a particle analyzer for medical image processing.
http://code.google.com/p/shapelogic/
shark
Modular C++ library for the design and optimization of adaptive systems. It provides methods for linear and nonlinear optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning algorithms and neural networks, and various other machine learning techniques.
http://shark-project.sourceforge.net/
shogun
Machine Learning toolbox focused on large scale kernel methods and
especially on Support Vector Machines. Written in C++ it interfaces
Matlab, R, Octave and Python.
http://www.shogun-toolbox.org/
simbad
Java 3D robot simulator for scientific and educational purposes. It is mainly dedicated to researchers/programmers who want a simple basis for studying Situated Artificial Intelligence, Machine Learning, and more generally AI algorithms, in the context of Autonomous Robotics and Autonomous Agents.
http://simbad.sourceforge.net/
snack sound toolkit
Designed to be used with a scripting language such as Tcl/Tk or Python. Using Snack you can create powerful multi-platform audio applications with just a few lines of code. Snack has commands for basic sound handling, such as playback, recording, file and socket I/O. Snack also provides primitives for sound visualization, e.g. waveforms and spectrograms. It was developed mainly to handle digital recordings of speech, but is just as useful for general audio. Snack has also successfully been applied to other one-dimensional signals.
http://www.speech.kth.se/snack/
snow
Sparse Network of Winnows learning architecture. Multi-class classifier that is specifically tailored for large scale learning tasks and for domains in which the potential number of features taking part in decisions is very large, but may be unknown a priori. It learns a sparse network of linear functions in which the target concepts (class labels) are represented as linear functions over a common feature space.
http://l2r.cs.uiuc.edu/~danr/snow.html
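A minimal sketch of the idea, assuming the classic Winnow update rule (multiply the weights of the active features on a missed positive, divide them on a false alarm): one linear function per class label, all sharing a feature space that is stored sparsely and grows as new features appear. The class and function names, the threshold, the promotion factor and the toy data are assumptions made for illustration, not SNoW's actual implementation or defaults.

```python
class WinnowNode:
    """One target concept: a sparse linear function over the active features."""
    def __init__(self, threshold=3.0, alpha=2.0, initial_weight=1.0):
        self.threshold = threshold
        self.alpha = alpha
        self.initial_weight = initial_weight
        self.weights = {}  # only features touched by an update are stored

    def activation(self, features):
        return sum(self.weights.get(f, self.initial_weight) for f in features)

    def update(self, features, is_target):
        predicted = self.activation(features) >= self.threshold
        if predicted == is_target:
            return  # mistake-driven: correct predictions change nothing
        factor = self.alpha if is_target else 1.0 / self.alpha
        for f in features:
            self.weights[f] = self.weights.get(f, self.initial_weight) * factor

def train(examples, epochs=5):
    """Train one node per class label over the common feature space."""
    nodes = {}
    for _, label in examples:
        nodes.setdefault(label, WinnowNode())
    for _ in range(epochs):
        for features, label in examples:
            for node_label, node in nodes.items():
                node.update(features, node_label == label)
    return nodes

def classify(nodes, features):
    """Return the label whose linear function has the highest activation."""
    return max(nodes, key=lambda label: nodes[label].activation(features))

# Toy data (hypothetical): each example is a set of active features and a label.
examples = [
    ({"fur", "barks"}, "dog"),
    ({"fur", "meows"}, "cat"),
    ({"feathers", "sings"}, "bird"),
]
nodes = train(examples)
print(classify(nodes, {"fur", "meows"}))  # cat
```

Because the features are plain dictionary keys, the feature space never has to be declared in advance, which mirrors the "unknown a priori" property mentioned above.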
snowball
Small string processing language designed for creating stemming algorithms for use in Information Retrieval.
http://snowball.tartarus.org/
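As an illustration of what a stemming rule looks like, here is a sketch of step 1a of the well-known Porter stemmer (plural removal), written in Python rather than in the Snowball language itself. The function and constant names are assumptions; a complete stemmer has several further steps.

```python
# Porter step 1a: the first matching suffix in this order is rewritten.
STEP_1A_RULES = [("sses", "ss"), ("ies", "i"), ("ss", "ss"), ("s", "")]

def stem_step_1a(word):
    """Apply the first matching plural-removal rule to a lowercased word."""
    word = word.lower()
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word

for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", stem_step_1a(w))
# caresses -> caress
# ponies -> poni
# caress -> caress
# cats -> cat
```

In Information Retrieval the stems do not need to be real words: it is enough that "pony" and "ponies" end up under the same index term once the remaining steps are applied.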
speakup
Screen review package for the Linux operating system.
Speakup allows you to interact with applications and the GNU/Linux operating system with audible feedback from the console using a synthetic speech device.
http://www.linux-speakup.org/speakup.html
speech dispatcher
Device independent layer for speech synthesis, developed with the goal of making the usage of speech synthesis easier for application programmers. It takes care of most of the tasks that speech-enabled applications need to solve.
The architecture is based on a proven client/server model. The basic means of client communication is through a TCP connection using the Speech Synthesis Independent Protocol (SSIP), or through an interface library.
http://www.freebsoft.org/speechd
sphinx
ASR. Speech recognition application and set of tools for speech
recognition developed at Carnegie Mellon University. Originally it
was implemented in C, but its latest release, Sphinx4, has been
programmed in Java.
http://cmusphinx.sourceforge.net/html/cmusphinx.php
spro
Free speech signal processing toolkit which provides runtime commands implementing standard feature extraction algorithms for speech related applications and a C library to implement new algorithms and to use SPro files within your own programs.
http://www.irisa.fr/metiss/guig/spro/
sptk
The Speech Signal Processing Toolkit (SPTK) is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
http://sp-tk.sourceforge.net/
ssj
Stochastic Simulation in Java. It provides facilities for generating uniform and nonuniform random variates, computing different measures related to probability distributions, performing goodness-of-fit tests, applying quasi-Monte Carlo methods, collecting (elementary) statistics, and programming discrete-event simulations with both events and processes.
http://www.iro.umontreal.ca/~simardr/ssj/indexe.html
stanford log-linear part-of-speech tagger
A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other tokens), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like 'noun-plural'. This software is a Java implementation of the log-linear part-of-speech tagger.
http://nlp.stanford.edu/software/tagger.shtml
stl
Standard Template Library. A C++ library of container classes, algorithms,
and iterators. It provides many of the basic algorithms and data
structures used in computer science.
http://www.sgi.com/tech/stl/stl_introduction.html
svm-light
Implementation of Support Vector Machines (SVMs) in C
for pattern recognition, regression, and learning a ranking function.
The algorithm has scalable memory requirements and can handle problems with many thousands of support vectors efficiently.
http://svmlight.joachims.org/
texai.org
A knowledge-based software project to create artificial intelligence.
The first approach is to construct an English dialog system and then
let it acquire linguistic and common sense skills for representing
its own behavior in the knowledge base. Implemented in Java.
http://texai.org/blog/about/texai-project/
text-analysis
Java implementation, with an easy to use API and full unit-test coverage, of some techniques to perform Text Language Detection, Keywords and keyphrases extraction, Text Classification, Text Clustering, Document Summarization (single or multiple documents) and Plagiarism Detection.
http://code.google.com/p/text-analysis/
theora
Video compression. Theora is a free and open video compression format from the Xiph.org Foundation.
Theora scales from postage stamp to HD resolution, and is considered particularly competitive at low bitrates. It is in the same class as MPEG-4/DivX, and like the Vorbis audio codec it has lots of room for improvement as encoder technology develops.
http://www.theora.org/
tnt tagger
Trigrams'n'Tags. Very efficient statistical part-of-speech tagger that is trainable on different languages and virtually any tagset. The component for parameter generation trains on tagged corpora. The system incorporates several methods of smoothing and of handling unknown words.
Written in C.
http://www.coli.uni-saarland.de/~thorsten/tnt/
torch
Matlab-like environment for state-of-the-art machine learning algorithms.
http://torch5.sourceforge.net/
treetagger
PoS Tagger. Language independent part-of-speech tagger.
TreeTagger is a tool for annotating text with part-of-speech and lemma information.
It has been successfully used to tag German, English, French, Italian, Dutch, Spanish, Bulgarian, Russian, Greek, Portuguese, Chinese and Old French texts and is easily adaptable to other languages if a lexicon and a manually tagged training corpus are available.
http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html
tsearch2
Full-text search engine, fully integrated into the PostgreSQL RDBMS.
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
ujmp
Universal Java Matrix Package. UJMP is an open source Java library that provides sparse and dense matrix classes, as well as a large number of calculations for linear algebra such as matrix multiplication or matrix inverse. Operations such as mean, correlation, standard deviation, replacement of missing values or the calculation of mutual information are also supported.
http://www.ujmp.org/
universvm
Support Vector Machine with Large Scale CCCP Functionality.
UniverSVM is an SVM implementation written in C/C++. Its functionality comprises large scale transduction via CCCP optimization, sparse solutions via CCCP optimization and data-dependent regularization with a Universum.
http://www.kyb.mpg.de/bs/people/fabee/universvm.html
voxforge
Collection of transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
http://voxforge.org/
voximp
Desktop control. Written in Python, Voximp is an application with which programs can be spawned and key/mouse presses simulated, all by speaking just a few words.
http://ardoris.wordpress.com/2008/08/09/speech-recognition-desktop-control-voximp/
waffles
Collection of C++ classes and tools for researchers in machine learning, AI, data mining, pattern recognition, and related fields.
http://waffles.sourceforge.net/
watchmaker
Java Framework for Evolutionary Computation.
Extensible, high-performance, object-oriented framework for implementing platform-independent evolutionary algorithms (EAs) in Java. The framework provides type-safe, non-invasive evolution for arbitrary representations.
Watchmaker project's home
wavesurfer
Sound visualization and manipulation tool. WaveSurfer has a simple and logical user interface that provides functionality in an intuitive way and which can be adapted to different tasks. It can be used as a stand-alone tool for a wide range of tasks in speech research and education. Typical applications are speech/sound analysis and sound annotation/transcription.
http://www.speech.kth.se/wavesurfer/
weka
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
http://www.cs.waikato.ac.nz/ml/weka/
http://www.scms.waikato.ac.nz/~fracpete/projects/bmvw/
http://cran.r-project.org/web/packages/RWeka/index.html
http://www.scms.waikato.ac.nz/~fracpete/projects/kepler_and_ptolemy/
wireshark
Data Sniffer. Network protocol analyzer used in the university's labs.
It's a mature project, useful and complete.
http://www.wireshark.org/
wordnet
Large lexical database of English.
Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations.
http://wordnet.princeton.edu/
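The structure described above can be pictured with a small sketch: each synset groups the words that express one concept, a word may belong to several synsets, and synsets point to each other through relations (here only "is a kind of"). The class, the helper functions and the three hand-made entries are assumptions for illustration; they are not WordNet's file format, its API, or data extracted from it.

```python
from dataclasses import dataclass, field

@dataclass
class Synset:
    name: str
    lemmas: tuple          # the synonyms that express this concept
    gloss: str             # short definition
    hypernyms: list = field(default_factory=list)  # more general synsets

def build_index(synsets):
    """Map each word to every synset (sense) it belongs to."""
    index = {}
    for synset in synsets:
        for lemma in synset.lemmas:
            index.setdefault(lemma, []).append(synset)
    return index

def hypernym_path(synset):
    """Follow the first hypernym link upwards until a root is reached."""
    path = [synset.name]
    while synset.hypernyms:
        synset = synset.hypernyms[0]
        path.append(synset.name)
    return path

# Hand-made toy entries (hypothetical, not taken from the WordNet database).
vehicle = Synset("vehicle", ("vehicle",), "a conveyance that transports people or objects")
car = Synset("car", ("car", "auto", "automobile"), "a motor vehicle with four wheels", [vehicle])
cable_car = Synset("cable_car", ("cable car", "car"), "a compartment suspended from a cable", [vehicle])

index = build_index([vehicle, car, cable_car])
print([s.name for s in index["car"]])  # ['car', 'cable_car']
print(hypernym_path(car))              # ['car', 'vehicle']
```

The word "car" appearing in two synsets shows how one word form can express two distinct concepts, each with its own place in the network.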
wsmt
The Web Service Modeling Toolkit (WSMT) is a collection of tools for Semantic Web Services intended for use with the Web Service Modeling Ontology (WSMO), the Web Service Modeling Language (WSML) and the Web Service Execution Environment (WSMX).
http://sourceforge.net/projects/wsmt
yacc
Yet Another Compiler Compiler.
Parser generator developed by Stephen C. Johnson at AT&T for the Unix operating system.
It generates a parser (the part of a compiler that tries to make syntactic sense of the source code) based on an analytic grammar written in a notation similar to BNF. Yacc generates the code for the parser in the C programming language.
http://dinosaur.compilertools.net/#yacc
yasr
Yet Another Screen Reader. General-purpose console screen reader for GNU/Linux and other Unix-like operating systems.
http://yasr.sourceforge.net/
yorick
Interpreted programming language, designed for postprocessing or steering large scientific simulation codes.
The language features a compact syntax for many common array operations, so it processes large arrays of numbers very efficiently.
http://web.mit.edu/afs/athena/software/yorick_v1.5.12/yorick/1.5/doc/

