There are a lot of meaningful and consequential ideas in mathematics that your average person on the street does not or cannot appreciate, to his or her loss. One of them is optimization. Your author is, in fact, a mathematician who gets excited when he hears about things like “class field theory”, but even he gets a bit sleepy when someone starts talking about minimization/maximization problems. It’s a shame, because optimization is one of those far-reaching principles that affects how we relate to each other, how we organize our lives, and even how we think about the future. The neat thing about mathematics is that it gives us a language to describe our observations so precisely that new insights about whatever one is observing can pop out merely from symbols written on a page. Sometimes that same language appears in the mathematical descriptions of several distinct phenomena (for example, complex analysis, fluid mechanics, and electromagnetism), and that’s when we need to pay especially close attention. Optimization is one of those mathematical languages, and we use it to describe our societal interactions, how learning happens, and how fundamental physics works. Let’s count some of the ways.
Capitalism and the optimization of markets
Capitalism, for whatever that word means, attracts its fair share of detractors. But even its detractors would begrudgingly admit1 that it optimizes for low cost and for more desirable products and services, which eventually raise standards of living. The ruthless efficiency of the market drives down costs while unleashing “creative destruction” in the search for new commodities, cheaper ways to produce them, and more consumers to buy them. Its defenders argue that any government intervention in free markets would cause higher prices, lower wages, less innovation, widespread shortages, and generally lower standards of living.2 In their ideal form, markets search for the best way to extract money from consumers, creating new technologies and innovations along the way. Every interaction we have – from meeting our basic needs to meeting friends for an overdue catch-up – requires some participation in the market. And so markets end up optimizing all of society for capitalist production. We study, and encourage other people to go to university, to increase lifetime earnings, which is to say to optimize how the markets value us and how much other people can extract something desirable from us. We pick places to live based on how easy it would be to commute to work, which is to say that we optimize for our participation in the labour market. In this way, markets, and capitalism in general, become an optimization force for all of society.
At some point in the above we started to abuse the word “optimize”, but it’s not an abuse at all to speak about how markets optimize prices, or commodity production, or labour markets. Indeed, where we write “optimize labour markets” someone else might write “exploit labour”; the meaning is the same. Whatever one calls it, roughly one third of our lives is spent participating in the labour market, which means that market optimization is a major force in our lives.
Markets are such great optimizers that we use them as metaphors for solutions in other mathematical optimization problems, such as in machine learning3. In fact, we could argue that learning itself is an optimization problem.
Learning as optimization
It’s a renaissance for Artificial Intelligence right now. AI models like GPT-34 can write entire articles almost as well as a human can. DALL-E5 can take a description of a picture and then produce a picture that matches. This doesn’t even touch on the advances in speech processing, auto-translations, self-driving cars, et cetera. So it is no surprise that people are abuzz with thoughts of an “artificial general intelligence”, the idea that an AI algorithm would be good at everything rather than just the specific task that it was designed for.
It is debatable whether such an “artificial general intelligence” is possible, or whether our current AI research could lead there, but we can sketch out an idea of how we might get there. Today’s advances in AI come chiefly from machine learning, and the core idea in machine learning is an optimization problem.
Machine learning means looking for the set of variables that minimizes the value of some “loss function” relative to some dataset. It’s a mathematical optimization problem, like the ones that markets try to solve, but the method for solving it is different. Markets “search” for the lowest price by rewarding people who try new, successful things and withdrawing resources from unsuccessful ideas. Machine learning algorithms optimize the loss function via “gradient descent”: they make small changes to the current values of those variables, measure how the loss function changes in response, and then step the variables in the direction that decreases the loss. Repeat this enough times and the loss settles toward a minimum. This mathematical idea led directly to GPT-3, DALL-E, and other AI models.
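The loop described above can be sketched in a few lines of code. This is a minimal, hypothetical example – fitting a line to four made-up data points with a mean-squared-error loss – not how GPT-3 is actually trained, but the principle is the same: compute the gradient of the loss, step the variables against it, repeat.

```python
# A tiny made-up dataset, roughly following y = 2x + 1.
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]

def loss(w, b):
    """Mean squared error of the model y_hat = w*x + b on the dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

w, b = 0.0, 0.0   # start with an arbitrary guess
lr = 0.05         # learning rate: how big each small change is

for step in range(2000):
    # Gradients of the mean squared error with respect to w and b.
    dw = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    db = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    # Step against the gradient: the direction that decreases the loss.
    w -= lr * dw
    b -= lr * db

# After training, w and b are close to the slope and intercept
# that generated the data, and the loss is far below where it started.
```

Real machine-learning systems do exactly this, only with billions of variables, automatic differentiation instead of hand-written gradients, and far larger datasets.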
The bridge between these models and “general intelligence” could go something like this: learning involves building some model of the world, and coming up with a loss function that quantifies how wrong the predictions of that model are from the information you get from reality. Learning happens when you start to optimize that model for the lowest possible value of that loss function, id est, you optimize for the smallest error between your mental model of the world and your experience in reality.6
One idea of consciousness is that the human mind contains many such models, each one trying to predict what one would feel (or see, or hear, or smell, et cetera) based on what it felt (or saw, et cetera) in the past and what the other models are also predicting. Consciousness could have emerged as a way to improve the seamless functioning of these multiple models.7 A related idea is that these models lead to various hallucinations, which ancient people attributed to deities, until, eventually, they evolved the ability to introspect into these voices and become self-aware. This is the idea behind the “bicameral mind”8. Your author’s own mental models are still trying to decide if this idea is brilliant or absurd9.
Leaving aside these theories of consciousness: if this theory of learning is correct, then both human learning and the predominant mode of organizing human societies are solutions to optimization problems. It’s almost as if optimization were a fundamental idea in the human experience. Maybe even in the universe itself.
The universe as an optimization problem
One of the most powerful descriptions of physics is also an optimization problem. Many people learn about Newton’s laws, and how they are just approximations for quantum mechanics. But there is another powerful mathematical tool whose language helps bridge the gap between “Newtonian” physics and the quantum world: Lagrangian mechanics10.
We won’t go through the details here11, but the basic idea is this: if you want to model the motion of a particle, you start with a mathematical description of all the paths you can imagine it taking as it moves through space. Obviously most of these paths (all but one) never actually happen. How do you find the path it will actually take? You write down a Lagrangian function which, at each moment along a given path, measures the difference between the particle’s kinetic energy and its potential energy. You then have another optimization problem: the path the particle takes is the one that minimizes the total of this energy difference accumulated over the whole path (physicists call this total the “action”).
Put another way, the motion of everything, from stars to electrons, can be described as the universe solving an optimization problem: of all conceivable paths, it takes the one that minimizes the accumulated difference between kinetic and potential energy.
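You can even check this numerically. The sketch below (a toy example with made-up numbers, not a general mechanics solver) discretizes the action for a ball thrown straight up under gravity, then compares the true parabolic trajectory against two paths that never happen: staying still, and a wiggled version of the parabola. The physical path accumulates the least action.

```python
import math

# Hypothetical setup: a 1 kg ball leaves y = 0 at t = 0 and
# returns to y = 0 at t = 2 s, under gravity g = 9.8 m/s^2.
m, g, T, N = 1.0, 9.8, 2.0, 1000
dt = T / N

def action(path):
    """Discretized action: sum of (kinetic - potential energy) * dt."""
    S = 0.0
    for i in range(N):
        v = (path[i + 1] - path[i]) / dt        # velocity on this segment
        y = 0.5 * (path[i] + path[i + 1])       # height at the midpoint
        S += (0.5 * m * v ** 2 - m * g * y) * dt
    return S

ts = [i * dt for i in range(N + 1)]
true_path = [0.5 * g * t * (T - t) for t in ts]   # the physical parabola
flat_path = [0.0 for t in ts]                     # sit at y = 0 forever
wiggle_path = [y + 0.5 * math.sin(math.pi * t / T)
               for y, t in zip(true_path, ts)]    # parabola plus a bump

# action(true_path) comes out smaller than the action of either
# non-physical alternative with the same start and end points.
```

Of all paths connecting the same two endpoints, the one nature picks is the one the minimization selects; this is the “principle of least action” behind Lagrangian mechanics.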
Optimization as a grand unifying theory
What a coincidence that capitalism, learning, and fundamental physics all work via the same principle of optimizing some system to minimize some function that describes the system! How strange that all of these things could be described by the same language. Why is that?
Perhaps it really is a fundamental law of the universe. After all, capitalism and classical mechanics are both consequences of the same intellectual flourishing in Europe. Adam Smith published The Wealth of Nations just twelve years before Joseph-Louis Lagrange published his Mécanique Analytique. Maybe human society was observing a fundamental law of the universe in different guises; maybe the universe wants us to become a “paperclip maximizer”12 or die. If the universe optimizes the difference between kinetic and potential energy, and our learning optimizes the difference between our actual and predicted experience, then perhaps human society must organize itself to optimize something similar.
This is quite a tenuous and absurd proposition. But let’s leave the absurdity and imprecision of this argument aside for a second – if something like it is true, then it bodes well for the idea that capitalism, or some force like it, is the “end of history”13. What are the consequences for human society if the universe wants to turn us into “paperclip maximizers”? This is a useful question even if our absurd premise is false, as the absurdity of our premise doesn’t alter the fact that capitalism is still the predominant mode of organizing human society, and it is an optimizing force. The answer is not a comforting thought.
Whether it’s the markets or some hitherto undefined force that optimizes society, it means that people will continue to engage in competition for status, wealth, jobs, and resources. And that competition will only get more fierce. Since almost all of us need to work for a living, we will all have to engage in the competition to find a good job. Which will lead us to spend more time differentiating ourselves through labour-oriented education or other skills acquisition (undoubtedly this will get more efficient, too – no room for fun learning here). We can already see the effects in certain countries, where the pressure to do well in university admission exams leads to suicides.14 This will come to all countries, and in more extreme versions. But it wouldn’t stop there. Obviously more time spent preparing for work means less time spent on cultivating relationships, strengthening families, building communities, or anything else that makes life seem hopeful, meaningful, or beautiful.15 Even if some section of society were walled off from such competition so that it could cultivate such a garden, those optimizing forces would chip away at it, alter the ethics of the people involved, and drive it out of existence. There would be ever decreasing amounts of time to focus on art, or ethics, or more generally to consider what we would want human society to look like, unless such pondering led to a marketable commodity.
Those who wouldn’t have to work for a living would be pressured to make things worse for those who do. The elites would still be competing, and ultimately they would need to compete to find new ways to optimize the labour force. We have already seen capitalism produce new ways to create a workforce without the labourers realizing it – social media turns human attention into a fungible commodity16. This is because social media’s business model relies entirely on advertising products and services to specific users, which means these companies must collect and use detailed information about those users, creating a market for others to buy and sell the attention of specific groups of people in ways that earlier forms of advertising never could. Social media users produce the commodity that tech companies sell, becoming an unwitting labour force. If one imagined how this new commodity and workforce could be optimized, one could foresee a world where those who couldn’t find a decent job would find solace in virtual reality, their minds experiencing a parasocial, virtual holiday while their bodies toil away at whatever physical labour the elites need done and cannot yet get robots to do.
Optimization as an example of how economics influences our thought
Trying to predict the future is dangerous, and working under the assumption that capitalism, or some other optimizing force, will continue to shape society doesn’t make it any easier. The future could be much worse than what we just sketched out. It could also be a lot better. Capitalism, or some force like it, could not “optimize” humans into pits of attention to be mined, because it would still require people with the time, money, and attention to buy whatever products are being advertised. It couldn’t destroy art or philosophy, because it would still require a managerial class with enough creativity to outperform their peers, and thus with enough time, money, and attention to consider art and philosophy. Capitalism cannot optimize away humanity because it needs humanity to know what to optimize for. The universe itself cannot be an optimization machine, because the system with the smallest difference between kinetic and potential energy is the system where nothing exists and nothing happens. Even the concept of learning as an optimization problem has difficulties, as one still needs to answer the question of why such mental models would exist at all.
It could be true that optimization is a fundamental fact of the universe, but it is more likely that it’s an artifact of the way we perceive the universe. That perception might be a consequence of our first example: the predominance of capitalism in our society. Markets dominate our society to such an extent that they color our perception of how the universe works, in the same way that they give us metaphors for machine learning models. Since we are so subject to this “invisible hand” shaping our lives, perhaps we start to see this “invisible hand” optimizing everything else. It is an underlying metaphor in our daily lives, so it would not be surprising if it were an underlying metaphor in our mathematics and scientific research. In fact, it is not even true that our example from fundamental physics, Lagrangian mechanics, is the “most correct” description of the universe. The mathematical formulation of quantum mechanics, for instance, relies on describing “observables” (like position or speed) as mathematical operators on a set that describes all possible states of the system we wish to study. There is another formulation of classical mechanics that uses that same language: Koopman–von Neumann classical mechanics17. Indeed, we already know that our models for fundamental physics are not the end of history, because they cannot explain how gravity interacts with quantum physics18. It is very likely that machine learning is not the end of AI, nor capitalism the end of history. Much as ancient, agrarian societies never saw the industrial revolution coming, we probably won’t predict whatever is coming next, either.