12 June 2023

Cervin Founder Spotlight: Anshumali Shrivastava of ThirdAI

Anshumali Shrivastava, Founder & CEO of ThirdAI, sat down with Cervin partner Shirish Sathaye to share his experience as a founder and offer advice for the next generation of entrepreneurs.

Shirish Sathaye, General Partner, Cervin: 

Hi Anshu, thanks for taking the time to talk with me today. We are excited to be invested in ThirdAI. We were among the first investors in the company, which is an exciting opportunity to change the economics of artificial intelligence and machine learning. First things first, tell us a bit about yourself and ThirdAI. 


Anshumali Shrivastava, Founder and CEO, ThirdAI:

I’m Anshumali Shrivastava. I’m the founder and CEO of ThirdAI. I’m also a tenured faculty member in Computer Science at Rice, where I’ve been since 2015. My specialization is training really large neural networks, and I have been working on that since 2010.


SS: So you’re a professor at Rice. You are established there, you have graduate students. What inspired you to start a company?


AS: For a bit of background, I was a math major as an undergrad. Then I worked for a few years at FICO, building credit score models, and that is how I got interested in machine learning. With my math background and this renewed interest in data-driven sciences, I started my Ph.D. at Cornell, where I looked at fundamental information retrieval problems. Around this time, we solved an open problem and got the NeurIPS best paper award, and we showed that certain theoretical information retrieval problems could be solved efficiently. This was the direction I was headed in when I joined Rice. Then I focused on how to use similar ideas to make neural networks faster. I’ll not bore you with all of the technical details, but we figured out a non-trivial way to train neural networks with the same accuracy while requiring 1,000 or even 10,000 times fewer operations. In some of the early papers we put out in the academic community, we talked about a neural network that, on some CPUs, beat GPUs by 5X or 10X in speed. Now that excited everybody. At this point, my co-founder Tharun and I were convinced that we’d created a very valuable thing because AI demand is shortly going to surpass any computing capabilities. What we have is a technology that can dramatically increase the performance and efficiency of AI. That's when we decided we would do the hard work and take this journey.


SS: So obviously, there is a lot of excitement around AI and large language models. Now along with that excitement, there are some fears about it turning sentient and turning on humanity, but let’s put that aside for now. Besides that, there are also many challenges that have to do with the economics of training models, including environmental problems, cost issues, and availability issues. Can you speak to that? I’d love to hear from an expert.


AS: I recently gave a talk about what it really takes to build a large language model and the cost of it. Unfortunately, there is a lack of information about this topic. 


So imagine you are an enterprise and want to build a large language model; you have two options. One is to go with the usual suspects, take your data, send it to a special private cloud, and train the model. Now we are talking about terabytes of your information. So if you are an enterprise that deals with payroll or something like that, let's say you need all the payroll information to be sitting there to train. If you trust the cloud, you may think it's fine to have one more copy in an AI-ready cloud, but you should also know that having two or three copies in the cloud doubles or triples the privacy risk. Mind you, to train this large language model, you need a dedicated infrastructure, so it cannot be where your data resides. So you are transferring your data.


The alternative is to use an open source model. Now that's a viable route, but it requires that you first build the infrastructure yourself. This infrastructure requires a large hardware cluster. If you are relying on cloud infrastructure, you still need to send your data to the cloud. You need specialized engineers to build this pipeline. For getting the model to work, you are relying on the open source community to progress in your desired direction, which can be a challenge.


The whole friction arises because of hardware tension. Data and AI cannot sit together. AI sits on its own hardware, and data sits on its own hardware. And you have to move the data from its hardware to the AI hardware to train it. Also, AI building is a constant process, so you have to constantly deal with the hardware barrier and friction. Sooner or later, enterprises will realize that, and that is where ThirdAI comes in. We are bringing AI to the data, which is much easier. 


SS: Now let’s talk about your transition from professor to entrepreneur. How does the world of academia compare to the startup world? 


AS: There are quite a few differences, but one big one is that academia is more focused on the rigor of an idea and startups and companies are more focused on the rigor of execution. 


Another difference comes from how you interact with people. When you are a professor, every Ph.D. candidate is like a small company with their own objectives and goals they are working towards. Whereas in a startup, you have a team all working towards the same goal. And I think adjusting to this is a hard part of the transition from academia to the startup world. It’s also the fun part of the transition.

SS: Now let’s talk about starting and building a company. There are a lot of fun things about it, but there are also a lot of difficult things. So what do you think is the most difficult thing about starting your own company?

AS: The most difficult part is always asking what not to do, and keeping ourselves away from new ideas. Especially in the world of AI right now, if you ask what we can do, the answer is that we can do so many things, and all are exciting in their own way. So keeping ourselves away from all those tantalizing possibilities and focusing on a very few compelling ones is the hard part we struggle with. And not only do I personally struggle with this, but the team does too.

SS: What is the most important trait for entrepreneurs? 

AS: To be honest, I’ve been thinking about this a lot recently. I think for entrepreneurs, it's very hard to put down a rule that there are certain traits that are most important. It’s basically the magic of the environment, the mindset, the team. It is an ecosystem. But what takes an ecosystem to the next level, the level that goes beyond success, is consistency. Whatever you believe is correct, be consistent in that.

SS: And finally, as you continue on this journey, I’m sure entrepreneurs will come and ask you for advice. What advice would you give them?

AS: Do it for the fun of it. There are lots of ups and downs, lots of uncertainty, and lots of unknowns. But there is one reliable thing, and that is the fun meter. If it’s not fun, if that fun meter is down, then something somewhere is wrong. Change it. If the fun meter is high, keep going. Another thing I’ll say: startups are way more fun because of the thrill. But, of course, there is no thrill if there is no uncertainty. And uncertainty means there is no limit to what you can achieve, but also that you could lose it all.