In Silicon Valley, some of the brightest minds believe that a universal basic income (UBI) guaranteeing people unrestricted cash payments will help them survive and thrive as advanced technologies wipe out careers as we know them, from administrative and creative positions (lawyers, journalists, artists, software engineers) to labor jobs. The idea has gained enough traction that dozens of guaranteed income programs have started in US cities since 2020.
However, even Sam Altman, the CEO of OpenAI and one of UBI’s highest-profile proponents, doesn’t think it’s a complete solution. As he said during a sit-down earlier this year, “I think it’s a small part of the solution. I think it’s great. I think as (advanced artificial intelligence) participates more and more in the economy, we should distribute wealth and resources much more than we have, and that will be important over time. But I don’t think that’s going to solve the problem. I don’t think that’s going to give people meaning, I don’t think it means people are going to entirely stop trying to create and do new things and whatever else. So I would consider it an enabling technology, but not a plan for society.”
That raises the question of what a plan for society should look like, and computer scientist Jaron Lanier, a pioneer in the field of virtual reality, writes in this week’s New Yorker that “data dignity” could be one solution, if not the answer.
Here’s the basic premise: Right now, we mostly give our data away for free in exchange for free services. Lanier argues that it will be more important than ever that we stop doing this, and that the “digital stuff” we depend on (social media in part, but also, increasingly, AI models like OpenAI’s GPT-4) instead “be connected with the humans” who give them so much to ingest in the first place.
The idea is that people “get paid for what they create, even when it’s filtered and recombined through big models.”
The concept isn’t new. Lanier first introduced the notion of data dignity in a 2018 Harvard Business Review article titled “A Blueprint for a Better Digital Society.” As he wrote at the time with co-author and economist Glen Weyl, the “rhetoric of the tech sector suggests a coming wave of underemployment due to artificial intelligence (AI) and automation” and a future where people are “increasingly treated as worthless and devoid of economic agency.”
But the “rhetoric” of universal basic income advocates “leaves room for only two outcomes,” and they are quite extreme, Lanier and Weyl noted. “Either there will be mass poverty despite technological advances, or a lot of wealth will have to be brought under central national control through a social wealth fund to provide citizens with a universal basic income.”
But both “hyper-concentrate power and undermine or ignore the value of data creators,” the two wrote.
Unravel My Mind
Of course, giving people the right amount of credit for their myriad contributions to everything that exists in the world is no small challenge (even if one can imagine AI auditing startups springing up to tackle the problem). Lanier acknowledges that even data dignity researchers can’t agree on how to disentangle everything that AI models have sucked up, or how detailed an accounting should be attempted.
But he thinks, perhaps optimistically, that it could be done gradually. “The system wouldn’t necessarily account for the billions of people who have made ambient contributions to big models — those who have added to a model’s simulated competence with grammar, for example. (It) might attend only to the small number of special contributors who emerge in a given situation.” Over time, however, “more people might be included, as intermediate rights organizations (unions, guilds, professional groups, and so on) start to play a role.”
Of course, the most immediate challenge is the black-box nature of current AI tools, says Lanier, who believes that “systems need to be made more transparent. We need to get better at saying what’s going on inside them and why.”
While OpenAI had published at least some of its training data in previous years, it has since stopped sharing it entirely. Indeed, Greg Brockman told TechCrunch last month that GPT-4, the company’s latest and most powerful large language model to date, draws its training data from a “variety of licensed, created, and publicly available data sources, which may include publicly available personal information,” but he declined to offer anything more specific.
As OpenAI stated upon GPT-4’s release, there is too much downside for the team to reveal more than it does. “Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.”
The same is true of every large language model today. Google’s Bard chatbot, for example, is based on the LaMDA language model, which was trained on a dataset of Internet content called Infiniset. But little else is known about it beyond what Google’s research team wrote a year ago: that, at some point in the past, it incorporated 2.97 billion documents and 1.12 billion dialogs containing 13.39 billion utterances.
Regulators are grappling with what to do. OpenAI, whose technology in particular is spreading like wildfire, is already in the crosshairs of a growing number of countries, including Italy, whose data protection authority has blocked the use of ChatGPT. French, German, Irish, and Canadian data regulators are also investigating how it collects and uses data.
But as Margaret Mitchell, an AI researcher who was previously Google’s co-head of AI ethics, tells Technology Review, it might be nearly impossible right now for these companies to identify people’s data and remove it from their models.
As the outlet explained: OpenAI “could have saved a huge headache by building in robust data record keeping from the start (according to Mitchell). Instead, it is common in the AI industry to create data sets for AI models by scraping the web indiscriminately and then outsourcing the work of removing duplicate or irrelevant data points, filtering out unwanted stuff, and fixing typos.”
How to Save a Life
That these tech companies have only a limited understanding of what is now inside their models is an obvious challenge to the “data dignity” proposition of Lanier, who calls Altman a “colleague and friend” in his New Yorker piece.
Whether it makes it impossible is something that only time will tell.
There’s certainly merit in wanting people to have ownership of their work, and frustration over the issue could well grow as more of the world is reshaped by these new tools.
Whether or not OpenAI and others had the right to crawl the entire Internet to feed their algorithms is already at the center of numerous wide-ranging copyright infringement lawsuits against them.
But the so-called dignity of data could also go a long way toward preserving human sanity over time, Lanier suggests in his fascinating New Yorker article.
As he sees it, universal basic income “amounts to putting everyone on the dole in order to preserve the idea of black-box artificial intelligence.” Meanwhile, ending the “black-box nature of our current AI models” would make it easier to account for people’s contributions, making it more likely that they will continue to make them.
Importantly, Lanier adds, it might also help to “establish a new creative class instead of a new dependent class.” Which one would you rather belong to?