My mom popped a question after dinner: “Do self-driving cars really work?” Before I could reply, my dad, a retired bank manager at sixty-five, interjected with an “obviously not, traffic is such a mess even for a human.” Meanwhile, my mind was wandering, pondering Stephen Wolfram’s recent write-up[^1] about ChatGPT. The creator of Mathematica used a Transformer’s failure to generate balanced parentheses to illustrate how deep learning models, or any machine learning models for that matter, can be incredibly unreliable, and how there isn’t much we can do to improve them.
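Part of what makes Wolfram’s example so striking is that checking balanced parentheses deterministically takes only a few lines of code. Here is a minimal Python sketch of the classic counter-based check, just for contrast with a billion-parameter model that can’t reliably do the same:

```python
def is_balanced(s: str) -> bool:
    """Return True iff every '(' in s is matched by a later ')'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closed a parenthesis we never opened
                return False
    return depth == 0

assert is_balanced("(()())")
assert not is_balanced("(()")
```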
Based on my years of experience in machine learning, I agree with my dad, albeit for a different reason. If we were to have AI models take an elementary school arithmetic exam covering addition, subtraction, multiplication, and division of integers ranging from 0 to 100[^2], it is unlikely that even our most advanced models would score 100/100. Inevitably, there will be cases where they fail.
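To make the thought experiment concrete, here is a rough sketch of such an exam in Python. `ask_model` is a hypothetical stand-in for however you would query the model, not a real API; division problems are built to have exact integer answers, as they would on an elementary school exam:

```python
import random

def make_exam(n: int = 100, seed: int = 0):
    """Generate (question, answer) pairs over integers in [0, 100]."""
    rng = random.Random(seed)
    exam = []
    for _ in range(n):
        kind = rng.choice(["+", "-", "*", "/"])
        if kind == "/":
            # Construct an exact division so the answer is an integer.
            b = rng.randint(1, 10)
            q = rng.randint(0, 10)
            a, ans = b * q, q
        else:
            a, b = rng.randint(0, 100), rng.randint(0, 100)
            ans = {"+": a + b, "-": a - b, "*": a * b}[kind]
        exam.append((f"{a} {kind} {b}", ans))
    return exam

def score(ask_model, exam):
    """`ask_model` is a hypothetical callable: question string -> int."""
    return sum(ask_model(q) == ans for q, ans in exam)
```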
Currently, there is no theoretical foundation for building an error-free model. In fact, model failure is a feature rather than a bug, as Terence Tao has pointed out[^3]. The limits of today’s learning theory do not let us know precisely where and how models will fail, because our goal is only to have them work on average. Sure, we can modify the loss function to focus on certain cases, but not on all cases at once. The only knobs we have today might force the model to get 13 + 27 right while introducing a new, random mistake on 1 + 91. This issue applies to self-driving as well, and in my humble opinion, it’s the fundamental dealbreaker. How can you trust a machine to take the steering wheel when there’s so much uncertainty?
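The standard version of that knob is per-example weighting of the loss, roughly like the PyTorch sketch below. Upweighting known failure cases redistributes where the errors land; nothing about it guarantees zero errors overall:

```python
import torch
import torch.nn.functional as F

def weighted_cross_entropy(logits, targets, sample_weight):
    """Cross-entropy with a per-example weight, so training
    emphasizes the cases we care about (e.g. known failure modes)."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    return (sample_weight * per_example).mean()

# Example: upweight the examples the model currently gets wrong.
#   weights = torch.where(preds != targets, 5.0, 1.0)
# This shifts where the mistakes occur; it does not eliminate them.
```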
I would rather not take any chances, even if the chance of an accident is one in ten million miles on average for every single car. That number might seem small, but I still worry about whether the risk is uniformly distributed across locations. What if the Bay Area is more dangerous than other areas? And what about the unknown unknowns?
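A quick back-of-the-envelope calculation shows why the fleet-wide average alone doesn’t reassure me. The local multiplier and annual mileage below are made-up numbers, purely for illustration:

```python
AVG_RATE = 1 / 10_000_000   # accidents per mile, fleet-wide average
LOCAL_MULTIPLIER = 5        # hypothetical: a region 5x riskier than average
ANNUAL_MILES = 12_000       # a typical commuter's yearly mileage

local_rate = AVG_RATE * LOCAL_MULTIPLIER
# Probability of at least one accident in a year, treating each mile
# as an independent Bernoulli trial (a strong simplification):
p_year = 1 - (1 - local_rate) ** ANNUAL_MILES
print(f"{p_year:.4%}")  # ~0.60% per year under these made-up numbers
```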
Despite my reservations, I do believe that we’ll have self-driving cars one day (hopefully within the next thirty years) once we’ve made serious advances in learning theory. But until then, I can still make good use of recent progress by using the stochastic ChatGPT to write short pieces like this one.
[^1]: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
[^2]: Not including real numbers, which are relatively easy for computers to handle but too challenging for AI models running on those same computers.
[^3]: https://mathstodon.xyz/@tao/109971907648866712