Data Moats
I’d like to think that over the last 6.5 years at Upstart I’ve learned a little bit about machine learning. I’ve even built a very simple neural network that performed pitifully in a Kaggle competition. One thing I’ve heard repeatedly is that a mediocre algorithm with great data will beat a great algorithm with mediocre data. Of course, having both great models and robust data is the best option…
However, this focus on the quality and quantity of data represents a real opportunity. There will be a tremendous number of industries where ML techniques can be extremely valuable and where an early entrant could build a substantial moat by having the best data sources. This is a virtuous cycle: as you win new customers, you accumulate even more data than your competitors, which leads to a better product (i.e., more accurate machine learning models), which leads to more customer wins.
This is a similar dynamic to what has helped Google maintain a lead in search quality. Having more search data results in a better search algorithm. The better algorithm attracts more users, resulting in even more data. And on and on the cycle goes, and it’s working pretty well for Google.
As a B2B guy by background, I see an interesting opportunity to marry this type of virtuous cycle with a SaaS product. Such a SaaS company could ultimately build a moat around its best product by having the most and best data sources to feed into its ML algorithms. You can imagine this happening in a number of areas. Image processing of various types is a natural fit (medical, law enforcement, etc.), but there will be interesting opportunities in almost every industry for ML technologies to make huge improvements to existing processes, and for entrepreneurs to build great businesses.
Originally published at https://jeffkeltner.com on January 31, 2019.