Episode #63: Meet the Start-Up at the Center of the Voice-Computer Revolution

Published on October 15, 2024

2 min read

In the Alumni Ventures Tech Optimist Podcast, host Naren Ramaswamy interviews Tanay Kothari, CEO of Wispr AI, about the transformative potential of their AI-powered voice dictation platform, Flow. They discuss how Flow aims to enhance communication by making technology more intuitive and efficient, paving the way for a screenless future in human-computer interaction.

In this Meet the Startup episode of the Alumni Ventures Tech Optimist Podcast, Naren Ramaswamy hosts Tanay Kothari, CEO and co-founder of Wispr AI, to discuss the future of voice technology. Wispr’s breakthrough product, Flow, is an AI-powered voice dictation platform designed to transform how people interact with their devices, making communication faster and more intuitive. Tanay shares his vision for creating technology that fades into the background, allowing users to focus on being present.

Watch Time ~31 minutes

The show is produced by Alumni Ventures, which has been recognized as a “Top 20 Venture Firm” by CB Insights (’24) and as the “#1 Most Active Venture Firm in the US” by Pitchbook (’22 & ’23).

Creators and Guests

HOST

Naren Ramaswamy

Senior Principal, Spike & Deep Tech Fund, Alumni Ventures

Naren combines a technical engineering background with experience at startups and VC firms. Before joining AV, he worked with the investing team at venture firm Data Collective (DCVC) looking at frontier tech deals. Before that, he was a Program Manager at Apple and Tesla and has worked for multiple consumer startups. Naren received a BS and MS in mechanical engineering from Stanford University and an MBA from the Stanford Graduate School of Business. In his free time, he enjoys teaching golf to beginners and composing music.

GUEST

Tanay Kothari

CEO & Founder, Wispr AI

Tanay Kothari is the CEO and Founder of Wispr AI, a company that has developed an innovative brain-computer interface (BCI) that enhances productivity by allowing users to complete tasks through articulated thoughts, facilitated by a Bluetooth-like headset.

Important Disclosure Information

The Tech Optimist Podcast is for informational purposes only. It is not personalized advice and is neither an offer to sell, nor a solicitation of an offer to purchase, any security. Such offers are made only to eligible investors, pursuant to the formal offering documents of appropriate investment funds. Please consult with your advisors before making any investment with Alumni Ventures. For more information, please see here.

One or more investment funds affiliated with AV may have invested, or may in the future invest, in some of the companies featured on the Podcast. This circumstance constitutes a conflict of interest. Any testimonials or endorsements regarding AV on the Podcast are made without compensation but the providers may in some cases have a relationship with AV from which they benefit. All views expressed on the Podcast are the speaker’s own. Any testimonials or endorsements expressed on the Podcast do not represent the experience of all investors or companies with which AV invests or does business.

The Podcast includes forward-looking statements, generally consisting of any statement pertaining to any issue other than historical fact, including without limitation predictions, financial projections, the anticipated results of the execution of any plan or strategy, the expectation or belief of the speaker, or other events or circumstances to exist in the future. Forward-looking statements are not representations of actual fact, depend on certain assumptions that may not be realized, and are not guaranteed to occur. Any forward-looking statements included in this communication speak only as of the date of the communication. AV and its affiliates disclaim any obligation to update, amend, or alter such forward-looking statements whether due to subsequent events, new information, or otherwise.

Episode Transcript

Sam:
Do you prefer whispering over typing? Then, listen up.

Tanay Kothari:
The thing that makes Flow very unique as compared to other dictation tools that have come up in the past is that every other tool tries to write everything you say word for word, which is not what you want. You speak very differently than you write. A buddy of mine and I built what was one of the world’s first voice assistants. This is before Siri, Alexa, any of those. And to people, it just felt like magic. You can break through the standard limitations of human-computer interaction.

Naren Ramaswamy:
A new version of a keyboard that doesn’t require a keyboard basically. So, yeah, just using your voice. That’s cool.

Sam:
Hello, everyone. Welcome back to this episode of the Tech Optimist. We have another Meet the Startup episode for you today, and the startup that we’re talking to is Wispr. Behind the play with an AV jersey on is Naren Ramaswamy, Senior Principal here on the Alumni Ventures team. And our guest today is Tanay Kothari. He’s the co-founder and CEO of Wispr Flow. And you all recognize my voice—I’m Sam, the guide and editor for this show.

Honestly, when it comes to introducing this episode, I don’t have much to add because Naren did it for me. So, I’m going to cut to the chase here and get us right into the episode. Sit back, relax—we so, so hope you enjoy this episode. You’ll hear from me in a few minutes, so don’t go anywhere and enjoy.

As a reminder, the Tech Optimist podcast is for informational purposes only. It is not personalized advice and it’s not an offer to buy or sell securities. For additional important details, please see the text description accompanying this episode.

Naren Ramaswamy:
Hi, everyone. Welcome to this episode of the Tech Optimist, a podcast hosted by Alumni Ventures. My name is Naren Ramaswamy. I’m a Senior Principal at the firm. Today, we’re excited to chat with Tanay Kothari, the CEO of portfolio company Wispr AI. Before we begin, a brief background on Wispr:

Wispr is redefining how we interact with technology using our voice. The company has launched a new platform called Flow, which is a smart voice dictation platform using AI. The platform uses context from your screen as well as AI algorithms to create a much more efficient way to interface with your computer, since our speech is three times faster than our thumbs typing into a keyboard.

In this chat with Tanay, we’re going to discuss how Wispr is blurring the human-machine interface, what he’s seeing in the AI space, and what the future of voice technology looks like. Hope you enjoy it.

Sam:
All right. Naren set us up perfectly for this episode. I’m going to take the baton here for a few seconds and get everyone on the same page about Wispr Flow—what they do as a company, their values, and everything—before we hop into the conversation with Naren and Tanay. I think it’s powerful to know what the company is doing and what their values are before we hear from them. Let their work speak for itself.

This is right on their About page, right on their website. I did the dive so you don’t have to, and I’m just going to read it to you:

“Building voice intelligence that understands you. We are a team of designers, AI researchers, and engineers who step away from the status quo to rethink the fundamental layer of computing—how humans interact with technology.

We want to craft voice interfaces that are both useful, so you trust them, and ubiquitous, so you can use them everywhere. For us, it’s the only way we move from screen-first technologies to voice-first experiences and create a future where we aren’t stuck looking at screens all day.

Our first product, Flow, makes voice dictation delightful. We focus on the biggest use case for technology—letting people communicate their thoughts and AI. Over the last few months, Flow became the first consumer voice dictation platform that makes people want to use voice more than their keyboards. And we’re just getting started.

We care about designing with incredible attention to detail. We care about building intelligence that mimics humans. We care about building experiences that feel intuitive. We care about building software that seamlessly fits into your life, and we care about building magic. If this excites you, we’d love to work with you.”

All right, let’s hop in.

Naren Ramaswamy:
With that introduction out of the way, let’s jump in. Tanay, thank you for joining us today.

Tanay Kothari:
Naren, thanks for having me.

Naren Ramaswamy:
You’ve had a really interesting background in linguistics and computer science. Could you start with a little bit of that background and how it led you to founding Wispr?

Tanay Kothari:
Yeah, of course. I started building in this space about 15 years ago, back when the first Iron Man movie came out, because I wanted to build Jarvis. A buddy of mine and I built what was one of the world’s first voice assistants—this was before Siri, Alexa, any of those. To people, it just felt like magic. At its peak, we had about two and a half million users, and then we got shut down—that’s a story for another time.

What always drove me to this space was that when I thought about how we interact with technology, it just felt very mechanical and cold. Given that technology was going to be a much bigger part of our lives, I wanted to make interaction with it feel as natural as interacting with other people—that’s where all of this came from.

After that, my love for languages grew. I was part of the Linguistics Olympiad team from India, which honestly had less to do with this and more to do with the fact that it was entertaining. It was fun problems to solve, and that’s just something I enjoy doing in my free time.

Naren Ramaswamy:
That’s awesome. And then, you spent some time at Stanford studying computer science. When did you have the vision for something like Wispr, and how did that evolve over time?

Tanay Kothari:
It’s honestly been the same vision for the last 15 years: how do you make technology fade into the background so you can focus on being more present? That, to me, is a very unique part of voice interfaces in general.

When you think about current technology interfaces, they’re all screen-first. This means you’re always distracted, looking down at a device. With voice interfaces that are both useful—so you trust them—and ubiquitous—so you can use them everywhere—that’s the only way I imagine we can step away from screens and actually make technology feel more seamless. It would work for us rather than us working for it.

Naren Ramaswamy:
Yeah, I think that’s a fascinating vision. I know you talk a lot about efficiency as it relates to today’s technology. We use our thumbs to type into a keyboard, but our brain processes thoughts much faster than that. I’d love for you to educate our listeners a little about the productivity lost as a result of that, what opportunity you see there, and how something like Wispr can become a breakthrough technology in terms of productivity.

Tanay Kothari:
Yeah, for sure. The average person types at about 40 words per minute. Less than 1% of people in the world type faster than 80 words per minute. The fastest typists cap out around 130 to 140 words per minute.

If you think about how fast we speak, on average, it’s about 120 to 140 words per minute—already three to four times faster than typing. And when you’re thinking in your mind, your brain runs even faster than that.
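As a rough illustration of the arithmetic above, the short Python sketch below compares the quoted rates (about 40 words per minute typing, 120 to 140 speaking). The 600-word message length is a hypothetical figure chosen only for the example, and the function names are illustrative.

```python
# Rates quoted in the conversation, in words per minute.
TYPING_WPM = 40
SPEAKING_WPM_RANGE = (120, 140)


def minutes_to_produce(words: int, wpm: float) -> float:
    """Time in minutes needed to produce `words` at a steady `wpm`."""
    return words / wpm


def speedup(speaking_wpm: float, typing_wpm: float = TYPING_WPM) -> float:
    """How many times faster speaking is than typing."""
    return speaking_wpm / typing_wpm


if __name__ == "__main__":
    words = 600  # hypothetical message length, not a figure from the episode
    print(f"Typing: {minutes_to_produce(words, TYPING_WPM):.1f} min")
    for wpm in SPEAKING_WPM_RANGE:
        print(
            f"Speaking at {wpm} wpm: {minutes_to_produce(words, wpm):.1f} min "
            f"({speedup(wpm):.1f}x faster)"
        )
```

With these numbers the ratio works out to between 3x and 3.5x, which matches the "three times faster" figure used earlier in the episode.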

So, when you’re trying to do any sort of work, you know in your mind what you want to say or write. The biggest bottleneck is your fingers moving over keyboards—an archaic piece of technology that’s existed since the 1800s and still used because nothing else has worked as reliably yet.

What ends up happening—and we clearly see this when we’ve given Flow to users—is that you just don’t realize how much time and mental bandwidth goes into making sure things are formatted correctly, that there are no typos, and that punctuation is properly placed.

Sam:
More on Flow’s technology right after this.

Matt Caspari:
Hey, everyone, just taking a quick break so I can tell you about the Deep Tech Fund from Alumni Ventures. AV is one of the only VC firms focused on making venture capital accessible to individual investors like you. In fact, AV is one of the most active and best-performing VCs in the U.S., and we co-invest alongside renowned lead investors.

With our Deep Tech Fund, you’ll have the opportunity to invest in innovative solutions to major technical and scientific challenges—companies that have the potential to redefine industries, create a more sustainable future, and deliver significant financial returns. If you’re interested, visit us at av.vc/funds/deeptech. Now, back to the show.

Tanay Kothari:
…versus if you can just let your thoughts flow, you get a lot more out. It lets you be way more creative and removes that bottleneck we have with keyboards reducing our typing speed. What we eventually found is that people on average with Flow are outputting at 120 words per minute. Some people go above 200, even 250 words per minute, which is unfeasible to do with a standard keyboard.

Once they start doing this for everything in their workflow—sending emails, Slack messages, writing long documents—the productivity gains just compound.

In terms of output, as a manager, you’re way more responsive to your team. If you’re doing customer support or sales, you’re reaching out to a lot more people. If you’re a developer using ChatGPT, Claude, or Cursor to get your work done, you can now work much faster and produce more. At the heart of it, you can break through the standard limitations of human-computer interaction.

Naren Ramaswamy:
Yeah, absolutely. It’s a new paradigm in how we communicate with technology. Just to back up a little bit for our listeners—I know you mentioned Flow.

Tanay Kothari:
Yup.

Naren Ramaswamy:
We haven’t actually introduced Flow to the audience yet. I know that’s your new product. Could you share what Flow is, how it works, and perhaps talk about the upcoming public launch?

Tanay Kothari:
Oh, yeah, of course. Flow is super simple: you speak, and it writes for you in every application in your style. Our first product is a Mac app that runs in the background. You just hold down a key, speak naturally, and Flow understands where you are, what you’re trying to do, and writes the messages as you would have written them.

What makes Flow very unique compared to other dictation tools is that every other tool tries to write everything you say word for word, which is not what you want. You speak very differently than you write.

The second thing is that these systems often optimize for the wrong metric. Most speech recognition systems try to make the word error rate really low. But that’s a technical metric—not what matters to the end user.

The metric we care about is the percentage of zero-edit messages—that is, of all the messages you write with Flow, how often do you have to go back and edit something before you send it? With Apple Dictation, Google, or OpenAI’s Whisper, only about 5% of messages are ready to send without edits. With Flow, that number shoots up to 50% to 70%, depending on where you are. That’s our biggest differentiator, driven by rethinking what we wanted Flow to do.
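To make the two metrics concrete, here is a minimal Python sketch. It is not Wispr's implementation: the word error rate is the standard word-level edit distance divided by the reference length, and `zero_edit_rate` is a hypothetical helper that assumes each message is logged as a pair of (text the tool produced, text the user actually sent).

```python
from typing import Sequence, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by the
    number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def zero_edit_rate(messages: Sequence[Tuple[str, str]]) -> float:
    """Fraction of messages sent exactly as the tool produced them."""
    if not messages:
        return 0.0
    unchanged = sum(1 for produced, sent in messages if produced == sent)
    return unchanged / len(messages)


if __name__ == "__main__":
    spoken = "let's meet at 5 actually let's do 6 pm"
    verbatim = spoken  # a word-for-word transcript of what was said
    # A perfect score on the technical metric...
    print(word_error_rate(spoken, verbatim))
    # ...yet the user still had to rewrite the first message before sending.
    log = [(verbatim, "let's meet at 6 pm"), ("see you soon", "see you soon")]
    print(zero_edit_rate(log))
```

The example shows why the two can diverge: a word-for-word transcript can have a word error rate of zero and still need editing before it is sent.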

Naren Ramaswamy:
That’s fascinating. Thanks for sharing. What’s the core technology that powers this voice interface, and what makes Flow unique compared to existing solutions?

Tanay Kothari:
The biggest part is how we think about it. Flow aims to do two things really well: first, users must trust Flow, and second, users must feel understood by Flow.

For example, if I say, “Naren, let’s meet at 5:00. Actually, you know what? Let’s do 6:00 PM,” a standard dictation system would write everything word for word. Flow would write, “Naren, let’s meet at 6:00 PM.” When users see that, they feel Flow is competent. They think, “Oh, I can make mistakes and it’ll be fine because Flow will correct them,” and that drives trust.

Under the hood is a set of models we built internally at Wispr that handle personalization, context understanding, reducing hallucinations, making you sound concise, and reasoning if something is a command (like changing the time to 6:00 PM) versus just part of your message.

At the core, people think of Flow as another person, and they have human-like expectations of it. If Slack has a bug, you think “Slack bugged out.” If Flow makes a mistake, people say, “Flow didn’t understand me” or “Flow didn’t remember this.” Since users treat Flow like a person, everything we do to architect the system aims to meet those human expectations.

Sam:
In a few seconds, you’ll hear a video demo of Flow that Wispr has put together. It’s awesome—really cool and witty. To see the visuals, hop over to our YouTube channel, but if you’re just hanging with us on audio, you’ll get the audio version. Hang tight.

Tanay Kothari:
And here’s how: Hey, Sehaj, you want to meet at 5:00? Or actually, you know what? Let’s do 6:00 PM. Yo, that’s sick—fire emoji. Hey, listen, here are three things that make Flow unique:
- It gets names right, even uncommon ones like yours.
- It lets you edit, especially if you change your mind while speaking.
- It formats your messages just like you would.
And when you’re around other people, plus you can use Flow commands to access AI wherever you’re working.

Flow, let’s actually make this for our Spanish-speaking audience. Most Flow users actually use Flow more than their keyboards. Don’t believe me? Go to flowvoice.ai to try it yourself and download the app to use it everywhere. Cheers.

Steve Wozniak:
For those creative types like myself with very personal speaking styles, this is a godsend. I’m glad to see the launch of Flow today. This is what computers were meant to do for people.

Naren Ramaswamy:
Yeah, absolutely. The blur between humans and machines is getting smaller and smaller. The distance between the two is being reduced through these platforms, and that’s a sneak peek into what the future will look like.

Bringing it back to the present—I know that in the Wispr office, all of your employees use Flow regularly. I think every email that I send you, I get a response from you that’s written in Flow. For our audience, would you be able to share some of the common use cases and applications for which people use the platform?

Tanay Kothari:
Yeah, of course. One of the big ones right now is that everybody who uses Cursor is switching to Flow. This is primarily for writing code and building products. The same goes for v0 and Replit. And when people are programming agents, as engineers, they’re often early adopters of technology, and everything they do is now more natural language-driven. For that, Flow has become their default go-to.

Another use case is that many of our users are highly ambitious people whose jobs involve a lot of communication. This could be with your team on Slack or other messaging platforms. It could also be external to your company if you’re messaging clients or potential customers.

Personally, I use it for all of our customer support, all of my investor communications, and to respond to my entire team on Slack all day. These are the two biggest use cases.

Another emerging use case is when people are interacting with AI agents like ChatGPT or Claude. What we’ve seen happen is that if you’re typing, you tend to write the shortest query possible, like “toilet broken, how fix”—similar to what you’d type into Google. But if you can speak, you might say, “Hey, my toilet is broken, and when I pull the flush, it doesn’t actually flush properly…” You can go on a rant for 30 seconds—it takes the same amount of time—but now the system has much more context and gives you better results, which is what you really want.

We’re still understanding why people prefer using Flow over ChatGPT’s voice or other voice interfaces built into these products. Part of it has to do with how we think about the whole interface, and part of it is habit—people start using Flow in one application and then slowly realize, “Why use a keyboard at all?” Over time, we see people using Flow more than their keyboards in terms of daily output.

Naren Ramaswamy:
Yeah, it’s the new version of a keyboard that doesn’t require a keyboard.

Tanay Kothari:
Yeah.

Naren Ramaswamy:
Just using your voice—that’s cool.

Tanay Kothari:
Mm-hmm.

Naren Ramaswamy:
Moving to AI tools like this, there’s always a conversation around privacy and data security. What’s your stance on that, and how are you tackling it?

Tanay Kothari:
Great question. This is very front and center for the entire product. People use Flow for their most sensitive messages—whether it’s to loved ones, to their team professionally, or for personal journaling and notes.

By default, all your data stays locally on your computer. No one else can access it but you. We don’t save anything on our servers. Most AI tools today are data-sharing opt-out—they opt you in by default, and you have to manually turn it off later. A lot of users just don’t know about this, so they aren’t well-educated on it.

With Flow, by default, data sharing is off. You can turn it on to help us improve our models, but it’s off initially. This has helped us build a lot of trust with our early users and allows them to use the product for everything without having to think twice.

Now, this raises the question: if we’re not collecting all this data, how do we make our models better? We have a few methods for that, including federated learning that’s privacy-preserving and completely anonymous. But for the most part, we rely on using just the context available locally on a person’s computer to deliver the best personalized experience.

Naren Ramaswamy:
Yeah, that’s fascinating. Thanks for going into detail about Flow. Let’s zoom out a little bit and talk about the AI space. I’d love to get your thoughts. The jury is still out on how AI will transform our lives—there are tools for enterprise and tools for consumers. What are some of the challenges you’re seeing as an AI company today?

Tanay Kothari:
A big one that’s emerging is how people think about generative AI. The standard conversation right now is around large language models (LLMs) and their architecture. LLMs are fundamentally non-deterministic.

This is great because they can be creative, reason, and display different emergent behaviors. But non-determinism means you can’t use them for tasks that need to be 100% accurate. They’re pretty much like humans in that way.

The implication is that many companies are trying to replace everything they do with AI-only solutions instead of thinking of them as AI-first solutions.

For example, if you’re doing accounting, would you want a smart person to do it manually on paper, or would you rather give them a calculator to help?

I think a fundamental idea often missed is that when building AI-based solutions, AI doesn’t have to be the only component. You need to back it up with business logic, heuristics, and algorithms you develop yourself. Use the LLM component for what it does well—reasoning, creativity, and knowledge about the world—and combine it with other systems.

I hope that over the next couple of years, we’ll see more solutions built this way. Because as it stands, non-determinism makes it hard for many business use cases to adopt LLMs. There’s just not enough guarantee that one day it won’t hallucinate or write an extra zero on an invoice.
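One way to picture the "AI-first, not AI-only" idea is to let a model draft the output and let ordinary deterministic code verify the parts that must be exact. The Python sketch below is a hedged illustration only: `LineItem`, `expected_total`, and `check_model_total` are hypothetical names, and the "model total" is a plain value standing in for whatever an LLM might return; no real model or library API is called.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Sequence


@dataclass(frozen=True)
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal


def expected_total(items: Sequence[LineItem]) -> Decimal:
    """Deterministic arithmetic: the same input always gives the same total."""
    return sum((item.quantity * item.unit_price for item in items), Decimal("0"))


def check_model_total(items: Sequence[LineItem], model_total: Decimal) -> Decimal:
    """Return the computed total, raising if the model's figure disagrees."""
    total = expected_total(items)
    if model_total != total:
        raise ValueError(f"model total {model_total} != computed total {total}")
    return total


if __name__ == "__main__":
    items = [LineItem("Annual seats", 3, Decimal("400.00"))]
    # Hypothetical model output with an extra zero on the invoice.
    try:
        check_model_total(items, Decimal("12000.00"))
    except ValueError as err:
        print(f"Rejected: {err}")
```

In this arrangement the model can still handle the wording and reasoning, while the invoice arithmetic is checked by code that cannot hallucinate.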

Naren Ramaswamy:
And you hit the nail on the head. I think, for me, that gets me excited as a VC because there’s tremendous potential for this technology. But widespread adoption will require this determinism. In the enterprise, you can’t afford to make those kinds of mistakes often. So, what new tools can help AI tools be deterministic? Because not every company is a Wispr AI where you have AI engineers on your team that can build those systems internally.

Definitely an exciting opportunity for us as we continue to look for startups. Just to close out, tell us a little bit more about your public launch. How can users try Flow? And is there anything you’d like to leave with users before we close out?

Tanay Kothari:
Yeah, of course. So far, the biggest complaint from Flow users has been that it’s only available on Macs, meaning people can’t experience the magic of Flow on their phones or other laptops they might have.

One of the big things we’re launching is actually a web version of Flow that you can try out on our website, flowvoice.ai. Once you go there, you can test how fast you are when you speak, and you can also download the application directly and get started.

We’re going to have a launch promo for all users who sign up in the first two weeks of October, giving them three months off on our annual plan. The promo code is phlaunchday—just add that when you sign up, and you’ll be ready to go.

Naren Ramaswamy:
That’s awesome. Thanks a lot, Tanay. I’m really excited about what you’re building. As someone who’s had the chance to try Flow and use it in some of my daily tasks, I believe the old technology of keyboards and the graphical user interface from decades ago is going to be transformed in the AI era. You guys are pioneers in this space, and I’m excited about what you’re creating.

Tanay Kothari:
Thank you. Always love hearing from happy users.

Naren Ramaswamy:
That’s great. Well, thanks a lot and good luck.

Tanay Kothari:
Thank you.

Naren Ramaswamy:
That was our podcast with Tanay Kothari, the CEO of Wispr AI. Hope you enjoyed it. For those of you interested in learning more about becoming an investor in our Deep Tech Fund, I encourage you to view the fund materials or book time with our team. You can learn more at av.vc/funds/deeptech.

We’re actively raising our fifth fund and just completed our first investment from that fund, which was into Groq, a pioneering AI chip company. By investing in the fund, you’ll have exposure to that company. Thank you for tuning in, and we hope you enjoyed the discussion.

Sam:
Thanks again for tuning into the Tech Optimist. If you enjoyed this episode, we’d really appreciate it if you’d give us a rating on whichever podcast app you’re using, and remember to subscribe to keep up with each episode. The Tech Optimist welcomes any questions, comments, or segment suggestions. Please email us at [email protected] with any of those, and be sure to visit our website at av.vc. As always, keep building.

Written by

Alumni Ventures

Alumni Ventures is a network-powered VC firm that helps accredited individuals invest in venture capital.
