Human-like systematic generalization through a meta-learning neural network

What is symbolic artificial intelligence?

symbolic learning

Other ways of handling more open-ended domains included probabilistic reasoning systems and machine learning to learn new concepts and rules. McCarthy’s Advice Taker can be viewed as an inspiration here, as it could incorporate new knowledge provided by a human in the form of assertions or rules. For example, experimental symbolic machine learning systems explored the ability to take high-level natural language advice and to interpret it into domain-specific actionable rules. Optimization for the copy-only model closely followed the procedure for the algebraic-only variant.

Last but not least, the Deep Symbolic Network (DSN) model is more amenable to unsupervised learning than deep neural networks (DNNs). We present the details of the model and the algorithm powering its automatic learning ability, and describe its usefulness in different use cases. The purpose of this paper is to generate broad interest in developing the model within an open source project centered on DSN, towards the development of general AI. Implementations of symbolic reasoning are called rules engines, expert systems or knowledge graphs.

Problems with Symbolic AI (GOFAI)

All operations are executed in an input-driven fashion, so sparsity and dynamic computation per sample are naturally supported. This complements recent popular ideas of dynamic networks and may enable new types of hardware acceleration. We experimentally show on CIFAR-10 that it can perform flexible visual processing, rivaling the performance of a ConvNet, but without using any convolution. Furthermore, it can generalize to novel rotations of images that it was not trained on. The Symbolic AI paradigm led to seminal ideas in search, symbolic programming languages, agents, multi-agent systems, the semantic web, and the strengths and limitations of formal knowledge and reasoning systems.

Thirty participants in the United States were recruited using Mechanical Turk and psiTurk. The participants produced output sequences for seven novel instructions consisting of five possible words. The participants also approved a summary view of all of their responses before submitting.

The query input sequence (shown as ‘jump twice after run twice’) is copied and concatenated to each of the m study examples, leading to m separate source sequences (3 shown here). A shared standard transformer encoder (bottom) processes each source sequence to produce latent (contextual) embeddings. The contextual embeddings are marked with the index of their study example, combined with a set union to form a single set of source messages, and passed to the decoder. The standard decoder (top) receives this message from the encoder, and then produces the output sequence for the query. Each box is an embedding (vector); input embeddings are light blue and latent embeddings are dark blue.
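The input construction described above can be sketched at the token level. This is a minimal illustration, not the authors' implementation: the separator tokens `|` and `->`, the function names, and the study examples are all hypothetical, and the real model merges contextual embeddings produced by the transformer encoder rather than raw tokens.

```python
def build_source_sequences(query, study_examples, sep="|", arrow="->"):
    """Copy the query and concatenate it to each study example.

    Returns one source sequence (a list of tokens) per study example.
    The `sep` and `arrow` separator tokens are assumptions for illustration.
    """
    query_tokens = query.split()
    sources = []
    for study_input, study_output in study_examples:
        source = query_tokens + [sep] + study_input.split() + [arrow] + study_output.split()
        sources.append(source)
    return sources


def tag_and_merge(sources):
    """Mark each token with the index of its study example and merge
    everything into a single collection, standing in for the set union
    of contextual embeddings that is passed to the decoder."""
    return [(index, token) for index, source in enumerate(sources) for token in source]


# Hypothetical study examples (m = 3), with the query from the figure description.
study = [
    ("run twice", "RUN RUN"),
    ("jump", "JUMP"),
    ("walk after run", "RUN WALK"),
]
sources = build_source_sequences("jump twice after run twice", study)
messages = tag_and_merge(sources)
print(len(sources), len(messages))
```

In a full model, each source sequence would pass through the shared encoder before tagging, so that `messages` holds vectors rather than tokens.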

Thus, for episodes with a small number of study examples chosen (0 to 5, that is, the same range as in the open-ended trials), the model cannot definitively judge the episode type on the basis of the number of study examples. Each study phase presented the participants with a set of example input–output mappings. For the first three stages, the study instructions always included the four primitives and two examples of the relevant function, presented together on the screen. For the last stage, the entire set of study instructions was provided together to probe composition.

After considering seven different models, we found that, in contrast to perfectly systematic but rigid probabilistic symbolic models, and perfectly flexible but unsystematic neural networks, only MLC achieves both the systematicity and flexibility needed for human-like generalization. MLC also advances the compositional skills of machine learning systems in several systematic generalization benchmarks. Our results show how a standard neural network architecture, optimized for its compositional skills, can mimic human systematic generalization in a head-to-head comparison.

Furthermore, MLC derives its abilities through meta-learning, where both systematic generalization and the human biases are not inherent properties of the neural network architecture but, instead, are induced from data. MLC optimizes the transformers for systematic generalization through high-level behavioural guidance and/or direct human behavioural examples. To prepare MLC for the few-shot instruction task, optimization proceeds over a fixed set of 100,000 training episodes and 200 validation episodes. Extended Data Figure 4 illustrates an example training episode and additionally specifies how each MLC variant differs in terms of access to episode information (see right hand side of figure). Each episode constitutes a seq2seq task that is defined through a randomly generated interpretation grammar (see the ‘Interpretation grammars’ section).
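To make the notion of an episode concrete, the following sketch samples a seq2seq task from a randomly generated grammar. It is a heavy simplification under stated assumptions: the grammar here only maps each input word to one output symbol, whereas the interpretation grammars referred to above are richer. The word list, symbol list, function names and episode sizes are all illustrative choices, not values from the source.

```python
import random

# Illustrative vocabularies (assumptions, not taken from the text above).
WORDS = ["dax", "wif", "lug", "zup"]
SYMBOLS = ["RED", "GREEN", "BLUE", "YELLOW"]


def sample_grammar(rng):
    """Randomly assign each input word to a distinct output symbol."""
    symbols = SYMBOLS[:]
    rng.shuffle(symbols)
    return dict(zip(WORDS, symbols))


def apply_grammar(grammar, instruction):
    """Translate an instruction word by word (simplified interpretation)."""
    return [grammar[word] for word in instruction.split()]


def sample_episode(rng, n_study=4, n_query=2, max_len=3):
    """Build one episode: study examples and queries that share a hidden grammar."""
    grammar = sample_grammar(rng)

    def sample_pair():
        length = rng.randint(1, max_len)
        instruction = " ".join(rng.choice(WORDS) for _ in range(length))
        return instruction, apply_grammar(grammar, instruction)

    return {
        "grammar": grammar,  # hidden from the learner; kept here for inspection
        "study": [sample_pair() for _ in range(n_study)],
        "query": [sample_pair() for _ in range(n_query)],
    }


episode = sample_episode(random.Random(0))
for instruction, output in episode["study"]:
    print(instruction, "->", " ".join(output))
```

Because every episode draws a fresh grammar, a learner cannot memorize a fixed word-to-symbol mapping and has to infer it from the study examples, which is the pressure meta-learning relies on.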


SCAN involves translating instructions (such as ‘walk twice’) into sequences of actions (‘WALK WALK’). COGS involves translating sentences (for example, ‘A balloon was drawn by Emma’) into logical forms that express their meanings (balloon(x1) ∧ draw.theme(x3, x1) ∧ draw.agent(x3, Emma)). COGS evaluates 21 different types of systematic generalization, with a majority examining one-shot learning of nouns and verbs. To encourage few-shot inference and composition of meaning, we rely on surface-level word-type permutations for both benchmarks, a simple variant of meta-learning that uses minimal structural knowledge, described in the ‘Machine learning benchmarks’ section of the Methods.
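As a rough illustration of the SCAN-style mapping, the sketch below interprets a small subset of instructions: four primitives, the modifiers ‘twice’ and ‘thrice’, and a single ‘and’ or ‘after’ conjunction. It is not the full SCAN grammar, and the `permute_primitives` helper is only a hypothetical stand-in for the idea of permuting word meanings across episodes, not the procedure used in the Methods.

```python
import random

# Default lexicon for a small subset of SCAN-style primitives.
PRIMITIVES = {"walk": "WALK", "run": "RUN", "jump": "JUMP", "look": "LOOK"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret(instruction, lexicon=PRIMITIVES):
    """Translate an instruction into a list of actions."""
    words = instruction.split()
    if "after" in words:
        # 'x after y' means: perform y first, then x.
        i = words.index("after")
        first, second = " ".join(words[:i]), " ".join(words[i + 1:])
        return interpret(second, lexicon) + interpret(first, lexicon)
    if "and" in words:
        # 'x and y' means: perform x first, then y.
        i = words.index("and")
        first, second = " ".join(words[:i]), " ".join(words[i + 1:])
        return interpret(first, lexicon) + interpret(second, lexicon)
    if len(words) == 2 and words[1] in REPEATS:
        return [lexicon[words[0]]] * REPEATS[words[1]]
    if len(words) == 1:
        return [lexicon[words[0]]]
    raise ValueError(f"unsupported instruction: {instruction!r}")


def permute_primitives(rng):
    """Hypothetical helper: reassign primitive words to actions at random."""
    words = list(PRIMITIVES)
    actions = [PRIMITIVES[word] for word in words]
    rng.shuffle(actions)
    return dict(zip(words, actions))


print(interpret("walk twice"))                  # ['WALK', 'WALK']
print(interpret("jump twice after run twice"))  # ['RUN', 'RUN', 'JUMP', 'JUMP']
```

Calling `interpret("walk twice", permute_primitives(random.Random(1)))` evaluates the same instruction under a shuffled lexicon, so the correct output depends on the meanings in force for that episode rather than on a memorized mapping.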
