Alexandre Trilla, PhD - Data Scientist
 

Blog

-- Thoughts on data analysis, software development and innovation management. Comments are welcome


Post 69

On using Hacker News to validate a product idea involving NLP and PHP

12-Jul-2012

The first step in creating a valuable product is to discover exactly what the target customers want or need. The Lean Startup process states it plainly, and the Pragmatic Programmer even provides a means of finding out: asking Hacker News (HN). HN is a vibrant community of tech people, hackers in the broadest sense... and entrepreneurs (these groups need not be disjoint), which can provide a lot of insight into the value of a product idea.

Now, my product idea: a general-purpose Natural Language Processing (NLP) toolkit coded in PHP. This is certainly a long-wanted product (note that the two links date back to 2008), and for a sensible reason: the Internet is bloated with textual content, so let's develop an NLP tool that is focused on processing text on the web. In this sense, the PHP programming language, i.e., by definition, the Hypertext Preprocessor, should be a practical choice for the job. Moreover, PHP is the platform that is typically available by default on a web server. So all the elements seem to be in the right place, and the problem seems to be addressed logically this way, but the idea still needs positive feedback from the end users (the developers) to succeed. Note that none of the currently available NLP toolkits in the Wikipedia list has been developed in PHP, so either there is a niche to fill here, or there is something wrong going on. Why is this so? Perhaps the product was not interesting a few years ago, maybe it did not catch on because of marketing issues, or using the many bindings and wrappers available was simply enough compared with the effort of doing it all again from scratch... Therefore, the question naturally arises: is it really interesting to the community? If so, to what extent? Is it worth the bother? Will this be a profitable project? Would it be nuts to rely solely on Ian Barber's opinion?

These questions require some scientific experimentation, so I built a prototype (mainly based on text classification; it has 24 GitHub watchers at the time of writing, thanks for your interest indeed) and submitted it to HN. What I found was contrary to what I expected: the general interest in this kind of product is essentially nonexistent, in line with what had already happened with the previous approaches. I failed. OK. At least I now know for myself that it's nonsense to invest in this product. I'd better do something else. Fine. Let's keep engineering. The upside is that I practised some PHP (my skills with this language were getting a little rusty) and, more importantly, I learnt that businesses need solutions, not tools to develop solutions (this conclusion is derived directly from the only, ironic, comment that appeared on HN, which was prompted by the demo app I provided, where I trained the classifier with a popular research dataset only as a proof of concept). That's awesome! If I had dismissed the so-valuable Lean Startup directive, assuming that the world was just how I saw it, I would have "wasted" (please note the quotation marks) a whole lot of time developing something nobody would pay for (I'm being rather like Edison here, I know). This is undoubtedly a good "lesson learned". Needless to say, though, if I ever obtain economic support for its development, I will gladly resume the coding phase!



All contents © Alexandre Trilla 2008-2024