This page collects software projects, code snippets and scripts that I have written and that may be of interest. You may also check my GitHub account.
Text processing framework, mainly based on classification, that analyses natural language by running operations and tasks over corpus data. It therefore follows the statistical (quantitative) approach to Natural Language Processing (NLP). It is coded in the PHP programming language. You may check its website, its demos, its RESTful API service and/or review its source code.
Parallel string rewriting system, namely a variant of a formal grammar (both deterministic and stochastic grammars are allowed), most famously used to model plant growth. A Logo code interpreter is used to render its output graphically. It is coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
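To illustrate the idea, here is a minimal Python sketch of parallel rewriting (a system of this kind is commonly called an L-system). It is not the project's C/C++ code: the function names `rewrite` and `to_logo`, the rule format, and the mapping of `F`, `+` and `-` to Logo commands are all assumptions made for this example.

```python
import random

def rewrite(axiom, rules, steps, rng=None):
    """Rewrite every symbol of the string in parallel, `steps` times.

    `rules` maps a symbol either to a replacement string (deterministic)
    or to a list of (probability, replacement) pairs (stochastic).
    This rule format is an assumption made for the example.
    """
    rng = rng or random.Random()
    current = axiom
    for _ in range(steps):
        produced = []
        for symbol in current:
            options = rules.get(symbol)
            if options is None:
                produced.append(symbol)        # no rule: copy the symbol
            elif isinstance(options, str):
                produced.append(options)       # deterministic production
            else:
                weights = [p for p, _ in options]
                choices = [r for _, r in options]
                produced.append(rng.choices(choices, weights=weights)[0])
        current = "".join(produced)
    return current

def to_logo(symbols, step=10, angle=90):
    """Translate a symbol string into Logo-style turtle commands.

    Assumed convention: F draws forward, + turns left, - turns right;
    any other symbol is skipped.
    """
    commands = []
    for symbol in symbols:
        if symbol == "F":
            commands.append(f"FORWARD {step}")
        elif symbol == "+":
            commands.append(f"LEFT {angle}")
        elif symbol == "-":
            commands.append(f"RIGHT {angle}")
    return commands
```

For example, with the rules `{"A": "AB", "B": "A"}` and the axiom `"A"`, four rewriting steps give `"ABAABABA"`, because both rules are applied to every symbol at once in each step.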
Natural Language Generator
The Natural Language Generator (NLG) shows how a simple n-gram-based Language Model can be used to learn from textual data and generate language instances. The order "n" of the model is a configurable parameter that determines the "amount of linguistic knowledge" (i.e., complexity) that can be learnt from textual data. It is coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
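As a rough illustration of how such a model can learn and generate, here is a small Python sketch. It is not the project's C/C++ implementation: the names `train` and `generate` and the `<s>` and `</s>` padding tokens are assumptions made for this example.

```python
import random
from collections import defaultdict

def train(tokens, n):
    """Map each context of n-1 tokens to the tokens observed after it."""
    model = defaultdict(list)
    # Padding marks the start and the end of the text (assumed convention).
    padded = ["<s>"] * (n - 1) + list(tokens) + ["</s>"]
    for i in range(len(padded) - n + 1):
        context = tuple(padded[i:i + n - 1])
        model[context].append(padded[i + n - 1])
    return model

def generate(model, n, max_tokens=20, rng=None):
    """Sample tokens one at a time, each conditioned on the last n-1."""
    rng = rng or random.Random()
    context = ("<s>",) * (n - 1)
    output = []
    for _ in range(max_tokens):
        followers = model.get(context)
        if not followers:
            break                      # unseen context: stop generating
        token = rng.choice(followers)  # frequent followers are drawn more often
        if token == "</s>":
            break
        output.append(token)
        context = (context + (token,))[1:]  # slide the context window
    return output
```

With a small order such as n = 2, the generator recombines word pairs freely. With a larger n, each context becomes rarer, so the output follows the training text more closely.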
Pipeline Skeleton
The Pipeline Skeleton provides a base framework on which a data processing project can be built without compromising its future growth. To that end, it focuses on modularity, configurability and extensibility by using the pipes and filters design pattern. It is aimed at speech and language processing applications and coded in the C/C++ programming language. You may check its release announcement, its docs and/or download the source code distribution tarball.
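The pipes and filters pattern can be sketched in a few lines of Python. This is only an illustration under assumed names (`tokenize`, `lowercase`, `drop_short`, `build_pipeline`), not the project's C/C++ code: each filter consumes a stream and yields a new one, and the pipeline simply chains them.

```python
def tokenize(lines):
    """Filter: split each input line into a list of tokens."""
    for line in lines:
        yield line.split()

def lowercase(token_lists):
    """Filter: lowercase every token."""
    for tokens in token_lists:
        yield [t.lower() for t in tokens]

def drop_short(min_length):
    """Configurable filter: keep tokens of at least `min_length` characters."""
    def stage(token_lists):
        for tokens in token_lists:
            yield [t for t in tokens if len(t) >= min_length]
    return stage

def build_pipeline(*filters):
    """Connect the filters so that each one reads the previous one's output."""
    def run(source):
        stream = source
        for f in filters:
            stream = f(stream)
        return stream
    return run
```

A pipeline built as `build_pipeline(tokenize, lowercase, drop_short(3))` can be extended by adding another filter to the argument list, without modifying the existing ones.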
EmoLib: Emotional Library
Java library that extracts the affect of an input text by tagging it with the feeling it expresses or conveys. Although it was primarily designed for feature extraction and text categorisation, its flexible architecture allows EmoLib to be reconfigured for many other Natural Language Processing tasks. You may check its docs and its demo. I am not allowed to distribute its source code due to copyright restrictions.
Magnus: Mouse Advanced GNU Speech
Mouse pointer controller driven by Catalan voice commands. This speech recognition application aims to provide voice-based accessibility for people with reduced mobility. It is coded in Java. Additionally, it provides a set of simulation code snippets (mainly for speech enhancement) for the Scilab numerical computation platform. You may read more about it in the Magnus project's weblog and/or check out its forge page.