Definition

non-uniform memory access (NUMA)

Rahul Awati

By

Rahul Awati

What is non-uniform memory access (NUMA)?

Non-uniform memory access, or NUMA, is a method of configuring a cluster of microprocessors in a multiprocessing system so they can share memory locally. The idea is to improve the system's performance and allow it to expand as processing needs evolve.

In a NUMA setup, the individual processors in a computing system share local memory and can work together. Data can flow smoothly and quickly since it goes through intermediate memory instead of a main bus.

NUMA can be thought of as a microprocessor cluster in a box. The cluster typically consists of four microprocessors interconnected on a local bus to a shared memory on a single motherboard. The bus may be a peripheral component interconnect bus, the shared memory is called an L3 cache and the motherboard is often referred to as a card.

This unit can be added to similar units to form a symmetric multiprocessing system (SMP) that can contain 16 00 256 microprocessors with a common SMP bus interconnecting the clusters. In an SMP system, all the individual processor memories look like a single memory to an application program.

How non-uniform memory access works

When a processor looks for data at a certain memory address, it first looks in the L1 cache on the microprocessor. Then it moves to the larger L2 cache chip and finally to a third level of cache (L3). The NUMA configuration provides this third level. If the processor still cannot find the data, it will look in the remote memory located near the other microprocessors.

Each of these clusters is viewed by NUMA as a node in the interconnection network. NUMA maintains a hierarchical view of the data on all nodes. Data is moved on the bus between the clusters using a scalable coherent interface. SCI coordinates cache coherence (consistency) across the nodes of the multiple clusters in the NUMA architecture.

NUMA and symmetric multiprocessing

NUMA is commonly used in a symmetric multiprocessing system. An SMP system is a tightly coupled, share-everything system in which multiple processors work under a single operating system and access each other's memory over a common bus or interconnect path. These microprocessors work on a single motherboard connected by a bus.

One limitation of SMP is that as microprocessors are added, it overloads the shared bus or data path, creating a Performance bottleneck. NUMA adds an intermediate level of memory that is shared among a few microprocessors so that all accessed data doesn't travel on the main bus. This helps address performance bottleneck issues.

SMP and NUMA systems are typically used in data mining applications and decision support systems. In these types of applications, processing can be parceled out to multiple processors that collectively work on a common database.

NUMA node architecture

The NUMA architecture is common in multiprocessing systems. These systems include multiple hardware resources including memory, input/output devices, chipset, networking devices and storage devices (in addition to processors). Each collection of resources is a node. Multiple nodes are linked via a high-speed interconnect or bus.

Every NUMA system contains a coherent global memory and I/O address space that can be accessed by all processors in the system. The other components can vary, although at least one node must have memory, one must have I/O resources, and one must have processors.

In this type of memory architecture, a processor is assigned a specific local memory for its own use, and this memory is placed close to the processor. The signal paths are shorter, which is why these processors can access local memory faster than non-local memory. Also, since there is no sharing of non-local memory, there is an appreciable drop in delays (latency) when multiple access requests come in for the same memory location.

Advantages and disadvantages of NUMA

One of the biggest advantages of NUMA is the fast movement of data and lower latency in the multiprocessing system. Additionally, NUMA reduces data replication and simplifies programming. And the parallel computers in a NUMA architecture are highly scalable and responsive to data allocation in local memories.

One disadvantage of NUMA is that it can be expensive. And the lack of programming standards for larger configurations can make implementation challenging.

Difference between NUMA and UMA

There are three shared-memory models in multiprocessor systems.

Uniform memory access (UMA)
Non-uniform memory access (NUMA)
Cache-only memory access/architecture (COMA)

Figure 1 shows the differences between UMA and NUMA.

uniform memory access (UMA) vs. non-uniform memory access (NUMA) — Figure 1

Like UMA and NUMA, COMA is also a shared-memory model used in multiprocessors. In a COMA machine, multiple processing nodes are connected by an interconnection network. Each node includes a processor and cache and is allocated as part of the global shared memory. The local memory (typically DRAM) at each node is used as the cache. This feature differentiates COMA from NUMA in which the local memory is used as the main memory.

COMA generally enables more efficient use of memory resources. The drawback is that it also makes it harder to find a particular piece of data since there is no home node in COMA (unlike NUMA). Moreover, it can be challenging to address memory shortage problems once the local memory fills up.

See also: vNUMA (virtual NUMA), IT Acronyms at your fingertips

This was last updated in September 2022

Continue Reading About non-uniform memory access (NUMA)

Podcast: Storage the key bottleneck for AI processing

High-performance interconnects and storage performance

How the CXL interconnect will affect enterprise storage

Single-socket servers offer cost, utilization benefits to admins

Adjust these 4 server BIOS settings to improve performance

local area network (LAN)
A local area network (LAN) is a group of computers and peripheral devices that are connected together within a distinct ...
TCP/IP
TCP/IP stands for Transmission Control Protocol/Internet Protocol and is a suite of communication protocols used to interconnect ...
firewall as a service (FWaaS)
Firewall as a service (FWaaS), also known as a cloud firewall, is a service that provides cloud-based network traffic analysis ...

identity management (ID management)
Identity management (ID management) is the organizational process for ensuring individuals have the appropriate access to ...
fraud detection
Fraud detection is a set of activities undertaken to prevent money or property from being obtained through false pretenses.
single sign-on (SSO)
Single sign-on (SSO) is a session and user authentication service that permits a user to use one set of login credentials -- for ...

CIO

IT budget
IT budget is the amount of money spent on an organization's information technology systems and services. It includes compensation...
project scope
Project scope is the part of project planning that involves determining and documenting a list of specific project goals, ...
core competencies
For any organization, its core competencies refer to the capabilities, knowledge, skills and resources that constitute its '...

recruitment management system (RMS)
A recruitment management system (RMS) is a set of tools designed to manage the employee recruiting and hiring process. It might ...
core HR (core human resources)
Core HR (core human resources) is an umbrella term that refers to the basic tasks and functions of an HR department as it manages...
HR service delivery
HR service delivery is a term used to explain how an organization's human resources department offers services to and interacts ...

Customer Experience

martech (marketing technology)
Martech (marketing technology) refers to the integration of software tools, platforms, and applications designed to streamline ...
transactional marketing
Transactional marketing is a business strategy that focuses on single, point-of-sale transactions.
customer profiling
Customer profiling is the detailed and systematic process of constructing a clear portrait of a company's ideal customer by ...

Close