De cerca, nadie es normal

On Natural Language Processing, Game Theory, and Diplomacy

Posted: April 11th, 2023 | Filed under: Artificial Intelligence

Beyond GPT in its different evolutions, there are other LLMs -as stated in Large Language Models (LLMs): an Ontological Leap in AI- developed with a perfectly defined industry focus in mind. This is the case of CICERO.

In November 2022, the Meta Fundamental AI Research Diplomacy Team (FAIR) and researchers from other academic institutions published the seminal paper Human-level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning, laying the foundations for CICERO. 

CICERO is an AI agent that can use language to negotiate, persuade, and work with people to achieve strategic goals, much as humans do. It was the first AI to achieve human-level performance in the strategy game Diplomacy.

Diplomacy is a complex strategy game, involving both cooperation and competition, that has served as a benchmark for multi-agent AI research; in its no-press variant, players cannot exchange messages. It is a 7-player zero-sum board game, featuring simultaneous moves and a heavy emphasis on negotiation and coordination. In the game, a map of Europe is divided into 75 provinces. 34 of these provinces contain supply centers (SCs), and the goal of the game is for a player to control a majority (18) of the SCs. Each player begins the game controlling three or four supply centers and an equal number of units. Importantly, all actions occur simultaneously: players write down their orders and then reveal them at the same time. This makes Diplomacy an imperfect-information game in which an optimal policy may need to be stochastic in order to prevent predictability.
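To see why a simultaneous-move game can call for a stochastic policy, here is a minimal sketch using rock-paper-scissors as a stand-in (an assumption made purely for illustration; Diplomacy itself is far larger). The function name worst_case_value is hypothetical. It measures what a policy earns against an opponent who knows it and best-responds.

```python
# Payoff to the row player in rock-paper-scissors: +1 win, 0 tie, -1 loss.
# Rows are our action, columns are the opponent's action.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def worst_case_value(policy):
    """Expected payoff of `policy` against an opponent who best-responds to it."""
    n = len(PAYOFF)
    return min(
        sum(policy[a] * PAYOFF[a][b] for a in range(n))
        for b in range(n)
    )


deterministic = [1.0, 0.0, 0.0]   # always plays rock
uniform = [1 / 3, 1 / 3, 1 / 3]   # mixes evenly over the three actions

print("always rock:", worst_case_value(deterministic))
print("uniform mix:", worst_case_value(uniform))
```

The predictable policy is fully exploited, while the evenly mixed one cannot be pushed below a draw in expectation.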

Diplomacy is a game about people rather than pieces. It is designed in such a way that cooperation with other players is almost essential to achieve victory, even though only one player can ultimately win. It requires players to master the art of understanding other people’s motivations and perspectives; to make complex plans and adjust strategies; and then to use natural language to reach agreements with other people and to persuade them to form partnerships and alliances.

How Was Cicero Developed by FAIR?

In two-player zero-sum (2p0s) settings, principled self-play algorithms ensure that a player will not lose in expectation regardless of the opponent’s strategy, as shown by John von Neumann in 1928 in his work Zur Theorie der Gesellschaftsspiele.

Theoretically, any finite 2p0s game -such as chess, Go, or poker- can be solved via self-play given sufficient computing power and memory. However, in games involving cooperation, self-play alone no longer guarantees good performance when playing with humans, even with infinite computing power and memory. The clearest example of this is language. A self-play agent trained from scratch without human data in a cooperative game involving free-form communication channels would almost certainly not converge to using English, for instance, as the medium of communication. For this reason, the researchers developed a self-play reinforcement learning algorithm, named RL-DiL-piKL, that provides a model of human play while simultaneously training an agent that responds well to this human model. RL-DiL-piKL was used to train an agent named Diplodocus. In a 200-game No-press Diplomacy tournament involving 62 human participants, two Diplodocus agents both achieved a higher average score than all other participants who played more than two games, and ranked first and third according to an Elo rating system -a method for calculating the relative skill levels of players in zero-sum games.
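The idea of staying close to a model of human play while still seeking high reward can be illustrated with a generic one-step KL-regularized objective. This is only a sketch under stated assumptions, not the RL-DiL-piKL algorithm itself: the function name, the action values, the human "anchor" policy, and the values of lam are all hypothetical. Maximizing expected value minus lam times the KL divergence from the anchor has the closed-form solution pi(a) proportional to anchor(a) * exp(Q(a) / lam).

```python
import math


def kl_regularized_policy(q_values, anchor_policy, lam):
    """Return pi(a) proportional to anchor(a) * exp(Q(a) / lam).

    This is the closed-form maximizer of E_pi[Q] - lam * KL(pi || anchor).
    """
    # Subtract the max Q-value for numerical stability; it cancels on normalization.
    q_max = max(q_values)
    weights = [
        p * math.exp((q - q_max) / lam)
        for q, p in zip(q_values, anchor_policy)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical numbers: the human-like anchor prefers action 1,
# while the estimated values favor action 0.
q_values = [1.0, 0.0, 0.5]
anchor = [0.1, 0.7, 0.2]

close_to_human = kl_regularized_policy(q_values, anchor, lam=1000.0)
close_to_greedy = kl_regularized_policy(q_values, anchor, lam=0.01)
print(close_to_human)
print(close_to_greedy)
```

A large lam keeps the policy near the human-like anchor; a small lam lets the estimated values dominate.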

What Are the Implications of This Breakthrough?

Although it has been almost drowned out by the advent of GPT in its different versions, this is, first, an astonishing advance in the field of negotiation, and more particularly in the realm of diplomacy. Never before has an AI model performed so brilliantly in a fuzzy environment marked by information asymmetries, common-sense reasoning, ambiguous natural language, and statistical modeling. Second, and more importantly, this is further evidence that we are in a completely new AI era in which machines can scale knowledge, and are doing so.

These LLMs have caused a deep shift: we went from attempting to encode human-distilled insights into machines to delegating the learning process itself to machines. AI is ushering in a world in which decisions are made in three primary ways: by humans (which is familiar), by machines (which is becoming familiar), and by collaboration between humans and machines (which is not only unfamiliar but also unprecedented). We will begin to give AI fewer specific instructions about how exactly to achieve the goals we assign it. Much more frequently we will present AI with ambiguous goals and ask: “How, based on your conclusions, should we proceed?”

AI promises to transform all realms of human experience. And the core of its transformations will ultimately occur at the philosophical level, transforming how humans understand reality and our roles within it. In an age in which machines increasingly perform tasks only humans used to be capable of, what, then, will constitute our identity as human beings?

With the rise of AI, the definition of the human role, human aspirations, and human fulfillment will change. For humans accustomed to a monopoly on complex intelligence, AI will challenge self-perception. To make sense of our place in this world, our emphasis may need to shift from the centrality of human reason to the centrality of human dignity and autonomy. Human-AI collaboration does not occur between peers. Our task will be to understand the transformations that AI brings to human experience, the challenges it presents to human identity, and which aspects of these developments require regulation or counterbalancing by other human commitments.

The AI revolution has come to stay. Unless we develop new concepts to explain, interpret, and organize its consequent transformations, we will be unprepared to navigate them. We must rely on our most solid resources -reason, moral and ethical values, tradition…- to adapt our relationship with reality so that it remains human.


On Social Justice and AI Automation & Augmentation

Posted: January 29th, 2023 | Filed under: Artificial Intelligence

The year has gotten off to a bad start for many families.

On January 5th, Amazon announced that it would lay off 18,000 employees. Days later Google stated it would lay off 12,000 employees; and the last to join the merry-go-round was Microsoft, which announced on January 18th that it would lay off 10,000 people. Twitter kicked things off when, in November last year, it announced the layoff of almost 4,000 employees.

What’s going on in the industry? I am not going to be the one to do an in-depth analysis -which has already been done- of the economic and financial reasons that have led these companies to make these decisions. What is clear is that, sad as it may seem, some positions made little or no sense at all from a business point of view (Chief Happiness Officer!), and the labor market in this sector was totally “overheated” concerning salaries with all the cash volume that both governments and central banks -directly or indirectly- had pumped into the economy.

However, let’s move on to a reflection that has gone somewhat unnoticed these days and which is the one that interests me: has the progressive implementation of AI in these companies had, or will it have, anything to do with these layoffs? Before pondering it and answering… Blue pill or red pill? As always, red pill.

As happened in the first and second industrial revolutions with the steam engine, electricity, the telephone or the radio, we have before us a new and likely the most general of all general-purpose technologies: artificial intelligence. AI is not only an innovation itself, but also one that triggers cascades of complementary innovations, from new products to new production systems.

In both the first and the second industrial revolution, there were initial phases of adaptation that meant job losses for thousands of workers, since their jobs and skills no longer made any sense in the new economic scenario. And this is where we begin to go deeper into the analysis: automation versus augmentation.

Let’s be positive, at least from the outset: both automation and augmentation can boost labor productivity. Nevertheless, with automation, as machines become better substitutes for human labor, workers lose economic and political bargaining power and become increasingly dependent on those who control the technology and on their financial business plans.

How are we envisioning AI nowadays? Towards automation or augmentation? Many hold that AI should be focused on augmenting humans rather than mimicking them. Augmentation through AI creates new capabilities and new products and services, ultimately generating far more value than merely automating human tasks. In this approach humans and machines become complements. Complementarity implies that people remain indispensable for value creation and, when humans are indispensable, economic power and political decision-making tend to be more decentralized and democratized.

Nonetheless, there are currently excess incentives for automation rather than augmentation among technologists, business executives, and policy-makers. When AI replicates and automates existing human capabilities, it tends to reduce the marginal value of workers’ contributions, and more of the gains go to the owners, entrepreneurs, inventors, and architects of the new systems. Entrepreneurs and executives who have access to those AI models can and often will replace humans in those tasks. 

Some voices defend a fully automated economy, which could, in principle, be structured to redistribute the benefits from production widely, even to those people who are no longer strictly necessary for value creation. However, the beneficiaries’ incomes would depend on the decisions of those in control of the technology. This opens the door to increased concentration of wealth and power.

What is the solution to this dilemma? Clearly it is not slowing down technology but rather, from my standpoint, eliminating the excess incentives for automation over augmentation. Think, for instance, of US tax legislation: it encourages capital investment over investment in labor through effective tax rates that are much higher on labor than on plants and equipment. The US tax code treats labor income more harshly than capital income.

In conclusion, the more technology is used to replace rather than augment labor, the worse the disparity may become. At the same time, automating a whole job is often extremely complex. Every job involves multiple different tasks, including some that are really challenging to automate. Think of industries such as health, legal, domestic security…

As mentioned once in a workshop, human beings and AI models should be -to borrow an image from Greek mythology- centaurs: a perfectly coordinated and unbeatable mix of wisdom and power.

Let’s see if, for once, we can think of the general benefit.


Large Language Models (LLMs): an Ontological Leap in AI

Posted: December 27th, 2022 | Filed under: Artificial Intelligence, Natural Language Processing

More than the quasi-human interaction and the practically infinite use cases that could be covered with it, OpenAI’s ChatGPT has provided an ontological jolt of a depth that transcends the realm of AI itself.

Large language models (LLMs), such as GPT-3, YUAN 1.0, BERT, LaMDA, Wordcraft, HyperCLOVA, Megatron-Turing Natural Language Generation, or PanGu-Alpha, represent a major advance in artificial intelligence and, in particular, toward the goal of human-like artificial general intelligence. LLMs have been called foundation models; that is, the infrastructure that made LLMs possible –the combination of enormously large data sets, pre-trained transformer models, and the requirement of significant computing power– is likely to be the basis for the first general purpose AI technologies.

In May 2020, OpenAI released GPT-3 (Generative Pre-trained Transformer 3), an artificial intelligence system based on deep learning techniques that can generate text. Text is analyzed by a neural network, each layer of which analyzes a different aspect of the samples it is provided with; e.g., meanings of words, relations of words, sentence structures, and so on. It assigns arbitrary numerical values to words and then, after analyzing large amounts of text, calculates the likelihood that one particular word will follow another. Amongst other tasks, GPT-3 can write short stories, novels, news reports, scientific papers, code, and mathematical formulas. It can write in different styles and imitate the style of the text prompt. It can also answer content-based questions; i.e., it learns the content of texts and can articulate this content. And it can also provide concise summaries of lengthy passages.
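The notion of calculating the likelihood that one word will follow another can be made concrete with a toy bigram counter. This is only an illustrative sketch: GPT-3 is a transformer network that conditions on long contexts, not a table of word-pair counts, and the function names and the tiny corpus below are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the text."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probability(counts, prev, nxt):
    """Estimate P(nxt | prev) as a relative frequency; 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0


corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)

print(next_word_probability(model, "the", "cat"))
print(next_word_probability(model, "the", "mat"))
```

In this corpus "the" is followed by "cat" twice and by "mat" once, so the estimated probabilities are two thirds and one third.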

OpenAI and its peers endow machines with structuralist equipment: a formal logical analysis of language as a system, in order to let machines participate in language. GPT-3 and other transformer-based language models stand in direct continuity with the linguist Saussure’s work: language comes into view as a logical system to which the speaker is merely incidental. These LLMs give rise to a new concept of language, implicit in which is a new understanding of human and machine. OpenAI, Google, Facebook, and Microsoft are in effect catalysts, triggering a disruption in the old concepts we have been living by so far: a machine with linguistic capabilities is simply a revolution.

Nonetheless, critiques of LLMs have appeared as well. The usual one is that no matter how good they may appear to be at using words, they do not have true language; drawing on the pioneering work of the philologist Zipf, critics have stated that they are just technical systems made up of data, statistics, and predictions.

According to the linguist Emily Bender, “a language model is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.” By contrast, we human beings are intentional subjects who can make things into objects of thought by inventing and endowing meaning.

Machine learning engineers in companies like OpenAI, Google, Facebook, or Microsoft have experimentally established a concept of language at the center of which does not need to be the human. According to this new concept, language is a system organized by an internal combinatorial logic that is independent of whoever speaks (human or machine). They have undermined one of the most deeply rooted axioms in Western philosophy: humans have what animals and machines do not have, language and logos.

Some data: monthly, on average, humans publish about seventy million posts on the content management platform WordPress. Humans produce about fifty-six billion words a month, or 1.8 billion words a day on this content management platform. GPT-3 -before its scintillating launch- was producing around 4.5 billion words a day, more than twice what humans on WordPress were doing collectively. And that is just GPT-3; there are other LLMs. We are exposed to a flood of non-human words. What will it mean to be surrounded by a multitude of non-human forms of intelligence? How can we relate to these astonishingly powerful content-generator LLMs? Do machines require semantics or even a will to communicate with us?

These are philosophical questions that cannot be solved with an engineering approach alone. The scope is much wider and the stakes are extremely high. Besides learning and mastering our human languages, LLMs can make us reflect on and question the nature of language, knowledge, and intelligence. Large language models illustrate, for the first time in the history of AI, that language understanding can be decoupled from all the sensorial and emotional features we, human beings, share with each other. It seems we are gradually entering a new epoch in AI.


Explainable Artificial Intelligence: A Main Foundation in Human-centered AI

Posted: March 30th, 2022 | Filed under: Artificial Intelligence, Human-centered explainable AI

Human-centered explainable AI (HCXAI) is an approach that puts the human at the center of technology design. It develops a holistic understanding of who the human is by considering the interplay of values, interpersonal dynamics, and the socially situated nature of AI systems.

Explainable AI (XAI) refers to artificial intelligence -and particularly machine learning techniques- that can provide human-understandable justification for their output behavior. Much of the previous and current work on explainable AI has focused on interpretability, which can be viewed as a property of machine-learned models that dictates the degree to which a human user -AI expert or non-expert user- can come to conclusions about the performance of the model given specific inputs.

An important distinction between interpretability and explanation generation is that explanation does not necessarily elucidate precisely how a model works, but aims to provide useful information for practitioners and users in an accessible manner. The challenges of designing and evaluating “black-boxed” AI systems depend crucially on who the human in the loop is. Understanding the who is crucial because it governs what the explanation requirements are. It also scopes how the data are collected, what data can be collected, and the most effective way of describing the why behind an action.

Explainable AI (XAI) techniques can be applied to AI blackbox models in order to obtain post-hoc explanations, based on the information that they provide. For Prof. Dr. Corcho, rule extraction belongs to the group of post-hoc XAI techniques. This group of techniques is applied to an already trained ML model -generally a blackbox one- in order to explain the decision frontier it has inferred, using the input features to obtain the predictions. Rule extraction techniques are further differentiated into two subgroups: model-specific and model-agnostic. Model-specific techniques generate the rules based on specific information from the trained model, while model-agnostic ones only use the input and output information from the trained model, hence they can be applied to any other model. Post-hoc XAI techniques in general are then differentiated depending on whether they provide local explanations -explanations for a particular data point- or global ones -explanations for the whole model. Most rule extraction techniques have the advantage of providing explanations for both cases at the same time.
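As a rough illustration of model-agnostic rule extraction, the sketch below queries an opaque model only through its inputs and outputs and searches for the single threshold rule that best reproduces its predictions. Everything here is hypothetical: the black_box function, its loan-style features (income, debt), the sample grid, and the restriction to one "IF feature > t THEN 1" rule. Practical techniques extract far richer rule sets.

```python
def black_box(x):
    """Hypothetical opaque model: we only call it, never inspect it."""
    income, debt = x
    return 1 if 0.7 * income - 1.3 * debt > 10 else 0


def extract_threshold_rule(model, samples):
    """Fit 'IF feature_j > t THEN 1 ELSE 0' to the model's own predictions.

    Returns (feature index, threshold, fraction of samples where the rule
    agrees with the model).
    """
    labels = [model(x) for x in samples]
    best = None
    for j in range(len(samples[0])):
        for t in sorted({x[j] for x in samples}):
            agree = sum(
                (1 if x[j] > t else 0) == y for x, y in zip(samples, labels)
            )
            fidelity = agree / len(samples)
            if best is None or fidelity > best[2]:
                best = (j, t, fidelity)
    return best


# Query points on a coarse grid over both (hypothetical) features.
samples = [(income, debt) for income in range(0, 101, 10) for debt in range(0, 51, 10)]
feature, threshold, fidelity = extract_threshold_rule(black_box, samples)
print(f"IF feature[{feature}] > {threshold} THEN 1 (fidelity {fidelity:.2f})")
```

Because the search never looks inside black_box, the same function could be pointed at any other model that maps a feature tuple to 0 or 1.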

The researchers Carvalho, Pereira, and Cardoso have defined a taxonomy of properties that should be considered in the individual explanations generated by XAI techniques:

  • Accuracy: It relates to how well the explanations predict the output for data unseen by the model.
  • Fidelity: It refers to how well the explanations approximate the underlying model. The explanations will have high fidelity if their predictions are consistently similar to the ones obtained by the blackbox model.
  • Consistency: It refers to the similarity of the explanations obtained over two different models trained over the same input data set. High consistency appears when the explanations obtained from the two models are similar. However, a low consistency may not be a bad result since the models may be extracting different valid patterns from the same data set due to the Rashomon Effect -seemingly contradictory information is in fact telling the same story from different perspectives.
  • Stability: It measures how similar the explanations obtained are for similar data points. As opposed to consistency, stability measures the similarity of explanations using the same underlying model.
  • Comprehensibility: This metric is related to how well a human will understand the explanation. Due to this, it is a very difficult metric to define mathematically, since it is affected by many subjective elements related to human perception such as context, background, prior knowledge, etc. However, there are some objective elements that can be considered in order to measure comprehensibility, such as whether the explanations are based on the original features (or based on synthetic ones generated after them), the length of the explanations (how many features they include), or the number of explanations generated (e.g. in the case of global explanations).
  • Certainty: It refers to whether the explanations include the certainty of the model about the prediction or not (e.g. a metric score).
  • Importance: Some XAI methods that use features for their explanations include a weight associated with the relative importance of each of those features.
  • Novelty: Some explanations may include whether the data point to be explained comes from a region of the feature space that is far away from the distribution of the training data. This is something important to consider in many cases, since the explanation may not be reliable due to the fact that the data point to be explained is very different from the ones used to generate the explanations.
  • Representativeness: It measures how many instances are covered by the explanation. Explanations can range from explaining a whole model (e.g. weights in linear regression) to explaining only one data point.
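Some of the properties above lend themselves to simple measurements. The sketch below is one possible reading, not the taxonomy's own definitions: accuracy as agreement between the explanation's predictions and the true labels, fidelity as agreement with the blackbox model's predictions, and representativeness as the share of instances the explanation covers. All label lists are made-up data for illustration.

```python
def agreement(a, b):
    """Fraction of positions where two label sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical data for eight instances.
true_labels       = [1, 0, 1, 1, 0, 0, 1, 0]
black_box_preds   = [1, 0, 1, 0, 0, 1, 1, 0]
explanation_preds = [1, 0, 1, 0, 0, 1, 1, 1]
rule_applies      = [True, True, False, True, True, False, True, True]

accuracy = agreement(explanation_preds, true_labels)      # explanation vs. ground truth
fidelity = agreement(explanation_preds, black_box_preds)  # explanation vs. blackbox model
coverage = sum(rule_applies) / len(rule_applies)          # share of instances covered

print(accuracy, fidelity, coverage)
```

Here the explanation mimics the blackbox model well (high fidelity) even though both are often wrong about the true labels (lower accuracy), which is why the two properties are listed separately.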

In the realm of psychology there are three kinds of views of explanations: 

  • The formal-logical view: an explanation is like a deductive proof, given some propositions.
  • The ontological view: events or states of affairs explain other events.
  • The pragmatic view: an explanation needs to be understandable by the demander. 

Explanations that are sound from a formal-logical or ontological view, but leave the demander in the dark, are not considered good explanations. For example, a very long chain of logical steps or events (e.g. hundreds) without any additional structure can hardly be considered a good explanation for a person, simply because he or she will lose track. 

On top of this, the level of explanation refers to whether the explanation is given at a high-level or more detailed level. The right level depends on the knowledge and the need of the demander: he or she may be satisfied with some parts of the explanation happening at the higher level, while other parts need to be at a more detailed level. The kind of explanation refers to notions like causal explanations and mechanistic explanations. Causal explanations provide the causal relationship between events but without explaining how they come about: a kind of why question. For instance, smoking causes cancer. A mechanistic explanation would explain the mechanism whereby smoking causes cancer: a kind of how question.

As said, a satisfactory explanation does not exist by itself, but depends on the demander’s need. In the context of machine learning algorithms, several typical demanders of explainable algorithms can be distinguished:

  • Domain experts: those are the professional users of the model, such as medical doctors who have a need to understand the workings of the model before they can accept and use the model.
  • Regulators, external and internal auditors: like the domain experts, those demanders need to understand the workings of the model in order to certify its compliance with company policies or existing laws and regulations. 
  • Practitioners: professionals who use the model in the field, where they take the users’ input, apply the model, and subsequently communicate the result to the users in situations such as loan applications.
  • Redress authorities: the designated competent authority to verify that an algorithmic decision for a specific case is compliant with the existing laws and regulations. 
  • Users: people to whom the algorithms are applied and that need an explanation of the result. 
  • Data scientists, developers: technical people who develop or reuse the models and need to understand the inner workings in detail.


Summing up, for explainable AI to be effective, the final consumers (people) of the explanations need to be duly considered when designing HCXAI systems. AI systems are only truly regarded as “working” when their operation can be narrated in intentional vocabulary, using words whose meanings go beyond the mathematical structures. When an AI system “works” in this broader sense, it is clearly a discursive construction, not just a mathematical fact, and the discursive construction succeeds only if the community assents.


China: Techno-socialism Seasoned with Artificial Intelligence

Posted: March 18th, 2022 | Filed under: Artificial Intelligence, Book Summaries, Realpolitik

People take the great ruler for granted and are oblivious to his presence. The good ruler is loved and acclaimed by his subjects. The mediocre ruler is universally feared. The bad ruler is generally despised. Because he lacks credibility, the subjects do not trust him. On the other hand, the great ruler seldom issues orders. Yet he appears to accomplish everything effortlessly. To his subjects everything he does is just a natural occurrence.

Tao Te Ching, Lao Tzu

Anyone who wants to learn something about China today, to know its strategic plan between now and 2050, the means to achieve it, and what drives this country in this titanic effort, should read the book El gran sueño de China: tecno-socialismo y capitalismo de estado by Claudio F. González.

Claudio F. González, PhD in engineering and economist, lived in China for six years as director of Asia for the Polytechnic University of Madrid (UPM). During this time he was involved in the fields of education, entrepreneurship, research and innovation in the Asian giant. From this privileged vantage point he has been able to observe, analyze, and understand the complexity of this country.

According to the author, throughout the 20th century, the Western world looked at China with the condescension that is due to a former empire in decline and mired in chaos, power struggles, and poverty, and only in the last decades of the past century, as a market of great potential and a cheap manufacturer of limited quality. Nonetheless, China had -and has- its plan, the ultimate goal of which is returning the “Empire of the Center” to the place it has held for most of human history. Namely: being the most socially and technologically advanced nation and, from there, regaining world leadership in the economic, commercial, and cultural spheres.

In 2015, the government announced the first of its grand plans, Made in China 2025, with the goal of making China by that date a leader in industries such as robotics, semiconductor manufacturing, electric vehicles, renewable energy and, of course, artificial intelligence.

Initiatives such as the Belt and Road Initiative (BRI) or institutions such as the Asian Infrastructure Investment Bank (AIIB) are nothing more than instruments through which China wants to reshape the international order into one more favorable to its new interests. One of China’s stated goals is to be, by 2035, the country that globally sets the next standards in areas such as AI, 5G or the Internet of Things.

China’s successes in the digital economy are based on three main factors:

1. A market that is both huge in size and young, which allows for the rapid commercialization of new business models and equally allows for a high level of experimentation.

2. An increasingly rich and varied innovation ecosystem that goes far beyond a few large and famous companies.

3. Strong government support, which provides favorable economic and regulatory conditions, acts as a venture capital investor and as a consumer of products based on new technologies and produced by local companies, and allows access to data that are key to developing new solutions in conditions that are unthinkable in other regions.

Professor F. González calls this model techno-socialism or state capitalism.

What are the defining characteristics of this techno-socialist model?

China intends to harness the interest in technological development of its own industry to align it with government interests. The overall goal is, starting from what the Chinese Communist Party (CCP) calls a moderately prosperous socialist society, to catch up with and surpass the most developed Western countries, ideally by the 100th anniversary of the founding of the People’s Republic of China (2049). Socialism in the sense of the Chinese regime is no longer socialism in the traditional sense of collective ownership and management of the means of production, a political conception definitively defeated after Mao’s demise, but rather their control and coordination to achieve social objectives.

The features that characterize this techno-socialism are complete physical security for people and things, the absence of extreme poverty, full employment, and the possibility for the most industrious to obtain economic and prestige rewards for their efforts, as long as they are aligned with the objectives established by the party and do not put its dominion at the least risk. This techno-socialism tries to lead society as a whole towards a centrality of thought that avoids extremisms that destroy peace and social security, and that does not call into question the leadership and omnipotent dominance of the party.

The alignment between business interests -or those of other institutions- and public interests, as interpreted by the party, creates a unique innovation ecosystem in which companies capable of promoting solutions for a broad user base become champions of an industrial policy. Once this status is achieved, and always within the logic of interest alignment, they will gain access to a whole arsenal of measures -subsidies, tax reductions, preferential treatment- to maintain this position and, if possible, extend it internationally, since they are no longer merely companies, but ambassadors of a new model. In the particular case of artificial intelligence, the government has contributed the necessary conditions -strategies, plans, regulation, space for experimentation- and practical support -venture capital, public procurement, permissions to access data- for innovations in this field to follow. Alibaba, Tencent, and Baidu set up research centers, deploy applications, enroll human capital, and support CCP policies.

Will techno-socialism be able to generate enough disruptive innovations to give the technology created in China an entity of its own?

Between 2015 and 2018, venture capital invested more than 1 trillion in new technology start-ups in China. China has more unicorns -companies less than ten years old with a valuation above $1 billion- than any other country. In terms of research, China is already the country with the most scientific articles, surpassing the US; although their impact is still smaller, the gap is rapidly closing. It has been the state which, with its research grants, scholarships, and universities, has generated ideas that, because of their risk, private initiative would never have dared to finance. Hence, in this sense, public authorities that nurture alternative ways of thinking are the true engine of progress.

Professor F. González calls the innovation paradigm applicable to China the asymmetric triple helix model, in which the national government controls the overall innovation context through its top-down policies and plans but, at the same time, allows a certain level of autonomy for district, local and regional governments to conduct their own experiments and accommodate innovations that emerge from the bottom up. Large companies, start-ups, and finance companies are aligned with government interests. And universities and research centers similarly align themselves with government objectives in producing new knowledge and generating talent in the form of human capital.

And eventually, when will China achieve and assume the role of world leader?

From the author’s standpoint, China, due to a set of inconsistencies and structural gaps, is neither ready nor willing to assume global leadership in the foreseeable future. However, it does aspire to be the most powerful and influential economy, with the most cohesive society and the least contested domestic leadership, which will enable it to become something like the best country in a fragmented world. China’s current strength lies in the existence of a long-term plan: a sense of destiny that ties in with its imperial past. There is a deep conviction in Chinese society, a determination, which is the key force to achieve these strategic objectives.

China does not merely want to be a powerful nation; it believes it deserves to be one.