Part 1: Different types of data science projects

In this article, I am going to share my experience on how to turn your data science project into a successful product. Far too often, we see great projects get stuck in the Proof of Concept (PoC) or prototype phase, because going live globally would mean redoing much of the work, changing the programming language or frameworks the model was built with, or even starting from scratch. Even the quality of the data science model, measured by metrics such as precision and recall, may differ heavily from the results obtained during the prototype phase. The first part of this article is all about which flavors of data science projects exist; the second part will share some guidelines on how to turn a PoC into an actual AI product.

To start with, we should take a closer look at which types of data science projects are suitable at all to be turned into productive solutions, or even products. It doesn’t always make sense to strive for production-grade solutions. In my experience, data science projects usually come in one of the following flavors:

  1. The one-time analysis: This type of project aims at answering a particular business question for a one-shot decision. For example, an insurance company may want to pay out a special dividend for its 100th anniversary. Given its capital base, portfolio of contracts, and open claims, it needs to decide what an acceptable dividend payout could be without affecting its S&P or Moody’s rating, since a high dividend payout decreases capital and the rating is strongly linked to that capital. Such a one-time analysis can be done using structured methods and Natural Language Processing (NLP) technology. It boils down to skimming through all text documents. In this case, no special software or application needs to be developed to repeat the process. It is a one-time question that needs to be answered once. There is no point in complicating matters by planning for a product.
  2. The repetitive analysis aka regular reporting: The more common type of analysis is the repetitive one, where a certain question needs an answer on a regular basis. The answer may not be trivial and can require complex machine learning models to predict future outcomes. For example, a company may want to contact the customers who are most likely to churn and make them a better offer to keep them engaged. Special dashboards are set up for the regional managers. They get an overview of their sales area and the clients they will most probably lose if no action is taken. In this kind of analysis, it is of utmost importance that data flows in continuously, that dashboards have been set up, and that reports are sent to the right audience so that decisions can be made in time.
  3. The AI product or solution: The AI product is much more than an algorithm, a one-off analysis, or regular reporting. Here, AI models are embedded into processes for automatic decision-making. This is done to reduce risk, improve processing efficiency, decrease the end-to-end processing time, or just deal with the sheer amount of data flowing into the system. The product also contains a mechanism to collect feedback data, show model results to the administrators, and continuously maintain and improve the prediction quality. An example may be a system that extracts the claimant name, loss details, and loss date from a claim and assesses the urgency of that claim. The backend modules of this system make it possible to override results and to intervene when extraction quality falls below a certain threshold, and they show performance metric dashboards to its maintainer. Such a system aims at tackling the problem holistically, optimizing for throughput and quality while letting humans override the system when required.
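
To make the human-override idea from the third flavor more tangible, here is a minimal Python sketch of how extraction results could be routed either to automatic processing or to a human reviewer. Everything in it is an assumption for illustration only: the `Extraction` fields, the `confidence` score, the `REVIEW_THRESHOLD` value of 0.80, and the function names are hypothetical and do not describe any particular system.

```python
from dataclasses import dataclass

# Hypothetical threshold: the article does not name a value, 0.80 is only an example.
REVIEW_THRESHOLD = 0.80


@dataclass
class Extraction:
    """One extracted claim record (illustrative fields only)."""
    claimant_name: str
    loss_date: str
    confidence: float  # assumed model confidence in the range [0, 1]


def route(extraction: Extraction, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return 'auto' when the model is confident enough, else 'human_review'."""
    return "auto" if extraction.confidence >= threshold else "human_review"


def summarize(extractions: list[Extraction]) -> dict[str, int]:
    """Count how many records went each way, e.g. as input for a metrics dashboard."""
    counts = {"auto": 0, "human_review": 0}
    for extraction in extractions:
        counts[route(extraction)] += 1
    return counts


if __name__ == "__main__":
    batch = [
        Extraction("A. Example", "2021-03-01", 0.95),
        Extraction("B. Example", "2021-03-02", 0.55),
    ]
    for item in batch:
        print(item.claimant_name, "->", route(item))
    print(summarize(batch))
```

The point of the sketch is the design choice rather than the code itself: the threshold is an explicit, adjustable parameter, and the share of records sent to human review is itself a metric worth showing to the maintainer.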

All three types of data science projects can bring tremendous benefits to a company. But what does it take to turn an analysis into a product? Which steps and considerations are involved, and how can this be explained to key decision-makers?

In the next part of this article, I will address these questions and share guidelines on how to successfully turn a data science project into an AI product.

Marc Giombetti

Marc Giombetti is the Chief Product Architect at TwoImpulse. He has 15+ years of international experience in IT, mainly within the re/insurance sector. His professional interests include AI, Cloud, SaaS, IoT, Time Series, Software Quality, and Agile development.