Of course, this is not the ultimate goal. The goal is to create value from data. But just like in investments, you want to minimize the duds: the ones that will go nowhere or, worse, down! You may not know which ones will make you money, but you have to be disciplined, cut your losses, and manage your risks.
One of the things I learned in agile projects is to maximize the work I don't do. When applying data science, it is easy to waste time and wind up nowhere. Knowing this, we want to figure out how to filter out projects as early as possible, before too much money has been burned.
So, is there a way to prevent some of that waste? Of course, there is. Ask these three questions:
What is the question we are trying to answer with data? The more precise we are here, the better. The question may have already been answered. Cool technology and buzzwords are fine, but knowing what you are trying to get at will filter out a lot of ideas that will not bring any value to your organization.
Do we actually have the data to answer the question? Again, without data that is pertinent to the question, there isn't much we can do.
Finally, if you could answer the question with the data, could you use the answer? You would be surprised at the problems we can run into here, like ethics, lack of computing resources, or extreme difficulty in putting the data pipeline into production.
Asking these simple questions will help you focus effort and money where they have a better chance of paying off, rather than on projects that are merely cool.
Focus is always the best way to make sure we deliver value. How can we accomplish this easily when it comes to data science (or big data) projects?