Code
This page collects software projects, code snippets and scripts that I have written and that may be of interest to browse. You may also check my GitHub account.
Ants
Ant foraging simulation coded in JavaScript. Each ant is implemented as a state machine: it first performs a random walk and then fetches food indefinitely. The foraging behaviour is cast as a biologically-inspired search algorithm for an optimisation procedure, where the objective is a cost function to be minimised. You may check out its demo and review its source code.
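The two-state behaviour described above can be sketched as follows. This is a minimal illustration in Python, not the JavaScript code of the project: the grid world, the names `State`, `Ant` and `step_towards`, and the rule that an ant switches state when it steps onto a food cell are all assumptions made for the example, and the cost function is not modelled.

```python
import random
from enum import Enum, auto


class State(Enum):
    RANDOM_WALK = auto()  # wander until a food cell is reached
    FETCH_FOOD = auto()   # shuttle between nest and food indefinitely


MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def step_towards(pos, target):
    """Move one cell towards target, along x first and then along y."""
    x, y = pos
    tx, ty = target
    if x != tx:
        x += 1 if tx > x else -1
    elif y != ty:
        y += 1 if ty > y else -1
    return (x, y)


class Ant:
    def __init__(self, nest, size, rng):
        self.nest = nest
        self.pos = nest
        self.size = size          # side length of the square grid (assumed)
        self.rng = rng
        self.state = State.RANDOM_WALK
        self.food = None          # location of the food once discovered
        self.carrying = False
        self.delivered = 0        # completed nest deliveries

    def step(self, food_cells):
        if self.state is State.RANDOM_WALK:
            dx, dy = self.rng.choice(MOVES)
            # Clamp to the grid so the ant never leaves the world.
            x = min(max(self.pos[0] + dx, 0), self.size - 1)
            y = min(max(self.pos[1] + dy, 0), self.size - 1)
            self.pos = (x, y)
            if self.pos in food_cells:
                self.food = self.pos
                self.carrying = True
                self.state = State.FETCH_FOOD
        else:
            target = self.nest if self.carrying else self.food
            self.pos = step_towards(self.pos, target)
            if self.pos == target:
                if self.carrying:
                    self.delivered += 1
                self.carrying = not self.carrying


if __name__ == "__main__":
    ant = Ant(nest=(0, 0), size=8, rng=random.Random(1))
    for _ in range(5000):
        ant.step(food_cells={(6, 5)})
    print(ant.state.name, ant.delivered)
```

Keeping the transition inside `step` means the simulation loop only has to call one method per ant per tick, whatever state the ant is in.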
TwitterScraper.js
Script bundle that builds a text corpus from tweet IDs. Given a corpus of tweet IDs in XML (à la SEPLN TASS), twitterScraper.js extracts the tagged IDs and queries Twitter to obtain the text body of each tweet. It is coded in JavaScript (Node.js with the jQuery and xml2js modules). You may review its source code.
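The ID extraction step can be illustrated with a short sketch. It is written in Python with the standard library rather than in Node.js with xml2js, and the element name `tweetid` and the sample document are assumptions made for the example; the actual corpus schema may differ. The query to Twitter is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real corpus layout may use other element names.
SAMPLE = """
<tweets>
  <tweet><tweetid>1001</tweetid><polarity>P</polarity></tweet>
  <tweet><tweetid>1002</tweetid><polarity>N</polarity></tweet>
</tweets>
"""


def extract_tweet_ids(xml_text, tag="tweetid"):
    """Return the text of every element named `tag`, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text]


if __name__ == "__main__":
    print(extract_tweet_ids(SAMPLE))
```

Each extracted ID would then be used to request the corresponding tweet text.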
NLP-Tools
Text processing framework (mainly based on classification) that analyses natural language by performing operations and tasks on corpus data. It therefore follows the statistical/quantitative track of Natural Language Processing (NLP). It is coded in the PHP programming language. You may check its website, its demos, its RESTful API service and/or review its source code.
L-system
Parallel string rewriting system, namely a variant of a formal grammar (both deterministic and stochastic grammars are allowed), most famously used to model plant growth and development. A Logo code interpreter is used for its graphical representation. It is coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
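As an illustration of parallel rewriting, the sketch below applies every production to the whole string at once in each iteration. It is a Python sketch rather than the C/C++ code of the project; the function name `rewrite` and the rule format (a plain string for a deterministic rule, a list of weighted successors for a stochastic one) are assumptions chosen for brevity.

```python
import random


def rewrite(axiom, rules, iterations, rng=None):
    """Rewrite all symbols of the string in parallel, `iterations` times.

    rules maps a symbol either to a successor string (deterministic)
    or to a list of (successor, weight) pairs (stochastic).
    Symbols without a rule are copied unchanged.
    """
    rng = rng or random.Random()
    current = axiom
    for _ in range(iterations):
        out = []
        for symbol in current:
            options = rules.get(symbol)
            if options is None:
                out.append(symbol)
            elif isinstance(options, str):
                out.append(options)
            else:
                successors, weights = zip(*options)
                out.append(rng.choices(successors, weights=weights)[0])
        current = "".join(out)
    return current


if __name__ == "__main__":
    # Lindenmayer's algae system: A -> AB, B -> A
    print(rewrite("A", {"A": "AB", "B": "A"}, 4))
    # A stochastic rule with two equally likely successors
    print(rewrite("F", {"F": [("F+F", 1), ("F-F", 1)]}, 3, random.Random(0)))
```

The resulting string is what a turtle-graphics interpreter would then read symbol by symbol to draw the figure.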
Natural Language Generator
The Natural Language Generator (NLG) shows how a simple n-gram-based Language Model can be used to learn from textual data and generate language instances. The order "n" of the model is a configurable parameter that determines the "amount of linguistic knowledge" (i.e., complexity) that can be learnt from textual data. It is coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
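To make the role of the order "n" concrete, here is a minimal n-gram sketch in Python (not the C/C++ code of the project). The function names `train` and `generate` and the boundary markers `<s>` and `</s>` are assumptions made for the example. The model counts which word follows each context of n-1 words and then samples from those counts.

```python
import random
from collections import Counter, defaultdict


def train(tokens, n):
    """Count, for each context of n-1 tokens, the tokens that follow it."""
    model = defaultdict(Counter)
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    for i in range(len(padded) - n + 1):
        context = tuple(padded[i:i + n - 1])
        model[context][padded[i + n - 1]] += 1
    return model


def generate(model, n, rng, max_len=50):
    """Sample tokens until the end marker or max_len is reached."""
    context = ("<s>",) * (n - 1)
    out = []
    while len(out) < max_len:
        counts = model.get(context)
        if not counts:
            break  # unseen context: nothing to sample from
        words = list(counts)
        word = rng.choices(words, weights=[counts[w] for w in words])[0]
        if word == "</s>":
            break
        out.append(word)
        # Slide the context window; for n == 1 it stays empty.
        context = (context + (word,))[1:]
    return out


if __name__ == "__main__":
    text = "the cat sat on the mat and the cat ate the fish".split()
    model = train(text, 2)
    print(" ".join(generate(model, 2, random.Random(0))))
```

With a larger n the contexts become longer and more specific, so the output stays closer to the training text; with n = 1 the words are drawn independently of each other.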
Pipeline Skeleton
The Pipeline Skeleton provides a base framework to support a data processing project without compromising its future growth. To that end, it focuses on modularity, configurability and extensibility by making use of the pipes and filters design pattern. It is aimed at speech and language processing applications and coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
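The pipes and filters pattern can be sketched in a few lines. The example below is in Python rather than C/C++, and the `Pipeline` class and the three filters are hypothetical names invented for the illustration, not the interfaces of the Pipeline Skeleton. Each filter consumes a stream and yields a new one, so filters can be added, removed or reordered without touching the others.

```python
from typing import Callable, Iterable, Iterator

Filter = Callable[[Iterable[str]], Iterator[str]]


def lowercase(items: Iterable[str]) -> Iterator[str]:
    for item in items:
        yield item.lower()


def tokenize(items: Iterable[str]) -> Iterator[str]:
    for item in items:
        yield from item.split()


def drop_short(min_len: int) -> Filter:
    """Build a configurable filter that drops tokens shorter than min_len."""
    def _filter(items: Iterable[str]) -> Iterator[str]:
        for item in items:
            if len(item) >= min_len:
                yield item
    return _filter


class Pipeline:
    def __init__(self, *filters: Filter) -> None:
        self.filters = list(filters)

    def add(self, new_filter: Filter) -> "Pipeline":
        self.filters.append(new_filter)
        return self

    def run(self, source: Iterable[str]) -> Iterator[str]:
        stream: Iterator[str] = iter(source)
        for f in self.filters:
            stream = f(stream)  # the output of one filter is piped into the next
        return stream


if __name__ == "__main__":
    pipeline = Pipeline(lowercase, tokenize).add(drop_short(3))
    print(list(pipeline.run(["The cat SAT on a mat"])))
```

Because the filters are generators, items flow through the chain one at a time instead of being materialised between stages.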
EmoLib: Emotional Library
Java library that extracts the affect from an incoming text by tagging it according to the feeling that it expresses or conveys. Although it has primarily been designed for feature extraction and text categorisation, its flexible architecture allows EmoLib to be reconfigured to accomplish many diverse Natural Language Processing tasks. You may check its docs and its demo. I am not allowed to deliver its source code due to copyright restrictions.
Magnus: Mouse Advanced GNU Speech
Computer mouse pointer controller driven by Catalan voice commands. This speech recognition application aims to provide voice-based accessibility for people with reduced mobility. It is coded in Java. Additionally, it provides a set of simulation code snippets (mainly for speech enhancement) for the Scilab numerical computation platform. You may read more about it in the Magnus project's weblog and/or check out its forge page.

