Definition

latency

Alexander S. Gillis, Technical Writer and Editor

Latency is a synonym for delay. In telecommunications, low latency is associated with a positive user experience (UX) while high latency is associated with poor UX.

In computer networking, latency is an expression of how much time it takes for a data packet to travel from one designated point to another. Ideally, latency will be as close to zero as possible. Network latency can be measured by determining the round-trip time (RTT) for a packet of data to travel to a destination and back again.

High network latency can dramatically increase webpage load times, interrupt video and audio streams, and render an application unusable. Depending on the application, even a relatively small increase in latency can ruin UX.

One of the major reasons for poor latency is geography. Highly distributed Internet Protocol (IP) networks traverse vast distances, adding transmission time that can derail an application. In any situation where the latency between sensing and responding needs to be extremely low -- for example, certain actions in autonomous driving -- it makes sense to put the computer that's processing the data as close to the source of the data as possible, a concept known as edge computing.

Causes of latency in telecom

Latency can be caused by many things, including the following:

Transmission media. Latency can be affected by the type of media being used to transmit data, voice and video. For example, packets traveling over a fiber optic link can be expected to experience lower latency than packets relayed through a geostationary satellite.
Packet size. A large packet will take longer to send round trip than a small one.
Packet loss and jitter. Latency can also be introduced when a high percentage of packets fail to reach their destination or when there is too much variation in the time it takes for packets to travel from one system to another.
Signal strength. If the signal is weak and has to be boosted by a repeater, this can introduce latency.
Processing delays. If each gateway node has to take time to examine and possibly change the header in a packet -- for example, decrementing the hop count in the time-to-live (TTL) field -- latency will be higher.
Other computer and storage delays. If a packet is subject to storage and hard disk access delays at an intermediate device, such as a switch, latency is likely to be higher.
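
The causes listed above can be combined into a rough per-hop estimate. The following is a minimal Python sketch, not a measurement tool: the class name LinkHop, its fields and the default of 4.9 microseconds per kilometer (an assumed figure for optical fiber) are illustrative assumptions, and real links vary.

```python
from dataclasses import dataclass


@dataclass
class LinkHop:
    """One hop on a network path. All names here are illustrative assumptions."""
    distance_km: float         # physical length of the link
    bandwidth_bps: float       # link capacity in bits per second
    processing_s: float = 0.0  # time the node spends examining the packet
    queuing_s: float = 0.0     # time the packet waits in buffers

    def one_way_delay(self, packet_bytes: int, us_per_km: float = 4.9) -> float:
        """Return the estimated one-way delay across this hop, in seconds."""
        # Time for the signal to cover the distance (assumed fiber default).
        propagation = self.distance_km * us_per_km * 1e-6
        # Time to push every bit of the packet onto the link.
        transmission = packet_bytes * 8 / self.bandwidth_bps
        return propagation + transmission + self.processing_s + self.queuing_s


def path_delay(hops, packet_bytes: int) -> float:
    """Sum the estimated one-way delay over every hop on a path."""
    return sum(hop.one_way_delay(packet_bytes) for hop in hops)
```

In this model, a larger packet raises only the transmission term, while a longer link raises only the distance-dependent term, which is why the two causes are listed separately.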

Types of latency

Interrupt latency is the length of time between a device raising an interrupt -- a signal that tells the host operating system (OS) to pause what it is doing and respond to an event -- and the moment the computer begins acting on it.

Fiber optic latency is how long it takes for light to travel a specified distance through a fiber optic cable. For every kilometer (km) covered, a latency of about 3.33 microseconds (μs) occurs at the speed of light in a vacuum. In practice, however, the per-kilometer latency of fiber optic cables is about 4.9 μs because light travels more slowly through glass. Bends or other imperfections in the cable could make the latency higher.
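
The per-kilometer figures can be reproduced from the speed of light. The sketch below assumes a refractive index of 1.468 for the fiber core, a typical value chosen for illustration; actual cables differ slightly, and the function names are hypothetical.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in a vacuum


def latency_us_per_km(refractive_index: float = 1.0) -> float:
    """Microseconds for light to cover 1 km in a medium with the given index."""
    return refractive_index / C_VACUUM_KM_PER_S * 1e6


def fiber_one_way_latency_ms(distance_km: float,
                             refractive_index: float = 1.468) -> float:
    """Estimated one-way latency over a fiber run, in milliseconds.

    The default index of 1.468 is an assumption, not a property of any
    specific cable. Bends, splices and equipment delays are ignored.
    """
    return distance_km * latency_us_per_km(refractive_index) / 1000
```

With these assumptions, latency_us_per_km() comes to roughly 3.34 μs and latency_us_per_km(1.468) to roughly 4.9 μs, in line with the figures quoted above.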

Internet latency times are dependent upon distance. The longer a packet has to travel across a global wide area network (WAN), the higher the latency.

WAN latency can be an important factor in determining internet latency. A WAN that is busy directing other traffic will produce a delay, whether the resource is being requested from a server on the local area network (LAN), another computer on that network or elsewhere on the internet.

Audio latency is the delay between sound being created and heard. In sound created in the physical world, this delay is determined by the speed of sound, which varies depending on the medium the sound wave travels through. Sound generally travels fastest through solids, less quickly through liquids and slowest through air. In audio, the acceptable midrange of latency is around 8 to 12 milliseconds (ms). Latencies of 30 ms are generally noticed by the listener.
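
For sound in open air, the delay follows directly from distance and the speed of sound. The sketch below assumes 343 meters per second, an approximate value for dry air at about 20 degrees Celsius; the function names and the use of the 30 ms threshold as a default are illustrative.

```python
SPEED_OF_SOUND_AIR_M_PER_S = 343.0  # assumed: dry air at roughly 20 degrees C


def acoustic_delay_ms(distance_m: float,
                      speed_m_per_s: float = SPEED_OF_SOUND_AIR_M_PER_S) -> float:
    """Milliseconds for sound to travel distance_m through the medium."""
    return distance_m / speed_m_per_s * 1000


def is_noticeable(delay_ms: float, threshold_ms: float = 30.0) -> bool:
    """True if the delay meets the roughly 30 ms level listeners tend to notice."""
    return delay_ms >= threshold_ms
```

Under this assumption, a listener about 10 meters from the source already experiences close to 30 ms of delay from propagation alone.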

Operational latency can be defined as the total time of all operations when they are performed in a linear workflow. In parallel workflows, the latency is determined by the slowest operation performed by a single task worker.
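
The difference between linear and parallel workflows can be shown with a few lines of Python. This is a simplified sketch that assumes the operations are independent and that a parallel workflow has one worker per operation; the function names are hypothetical.

```python
def linear_latency(durations):
    """Total latency when operations run one after another."""
    return sum(durations)


def parallel_latency(durations):
    """Latency when every operation runs at once: the slowest one decides."""
    return max(durations)


# Hypothetical operation times in seconds.
steps = [0.25, 0.5, 1.0]
linear_total = linear_latency(steps)      # 1.75
parallel_total = parallel_latency(steps)  # 1.0
```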

Mechanical latency is the delay from input into a mechanical system or device to the desired output. This delay is determined by the limits that Newtonian physics places on the mechanism.

Computer and OS latency is the combined delay between an input or command and the desired output. Contributors to increased computer latency include insufficient data buffers and mismatches in data speed between the microprocessor and input/output (I/O) devices.

Latency testing and measurement

Latency testing can vary from application to application. In some applications, measuring latency requires special and complex equipment or knowledge of special computer commands and programs; in other cases, latency can be measured with a stopwatch. Network managers have a number of tools to choose from to do this, including Traceroute, My traceroute (MTR) and Ping.

Ping commands are used to determine if a host computer the user is trying to reach is operating. To evaluate latency, a network administrator sends an Internet Control Message Protocol (ICMP) echo request to a specified interface on the network and waits for a reply.
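
Sending ICMP echo requests from a script usually requires raw sockets and elevated privileges, so the sketch below swaps in a different technique: it times a TCP connection handshake, which takes about one round trip, as a rough stand-in for a ping. The function names are hypothetical, and the result includes local connection setup overhead, so it only approximates RTT.

```python
import socket
import time


def tcp_connect_rtt_ms(host: str, port: int, timeout: float = 2.0) -> float:
    """Time one TCP handshake to host:port and return it in milliseconds.

    This is a proxy for round-trip time, not an ICMP ping.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000


def sample_rtt(host: str, port: int, count: int = 5) -> dict:
    """Take several samples and summarize them, as ping does."""
    samples = [tcp_connect_rtt_ms(host, port) for _ in range(count)]
    return {
        "min": min(samples),
        "avg": sum(samples) / len(samples),
        "max": max(samples),
    }
```

A call such as sample_rtt("example.com", 443) would return the minimum, average and maximum handshake time to that placeholder host, provided the port is reachable.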

Information about latency can also be gathered by using a traceroute command. Traceroute shows the path that packets take across an IP network, recording the latency to each hop on the route. MTR blends elements of both Ping and Traceroute to track both the delay between devices on the path and the total transit time.

To evaluate mechanical latency, high-speed cameras can be used to capture the minute differences in response times from input to mechanical action.

Reducing latency

Latency can be reduced by tuning, tweaking and upgrading computer hardware, software and mechanical systems. Within a computer, latency can be removed or hidden by techniques such as prefetching -- anticipating the need for data before it is requested -- and multithreading, which uses parallelism across multiple execution threads.
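
To illustrate how multiple threads can hide latency, the sketch below overlaps several slow requests instead of waiting for each in turn. The fetch function is a hypothetical stand-in that simply sleeps to simulate an I/O wait; the timings it produces are illustrative only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item: int, delay_s: float = 0.05) -> int:
    """Hypothetical slow request: sleep to simulate waiting on I/O."""
    time.sleep(delay_s)
    return item * 2


def run_serial(items):
    """Issue requests one at a time; the waits add up."""
    start = time.perf_counter()
    results = [fetch(item) for item in items]
    return results, time.perf_counter() - start


def run_threaded(items):
    """Issue all requests at once; the waits overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max(1, len(items))) as pool:
        results = list(pool.map(fetch, items))  # map preserves input order
    return results, time.perf_counter() - start
```

Each individual request still takes just as long; the threaded version only hides the waiting by overlapping it, so the total elapsed time approaches that of the slowest request.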

Other steps to reduce latency and increase performance include uninstalling unnecessary programs, optimizing networking and software configurations, and upgrading or overclocking hardware.

Latency vs. throughput

Both throughput and latency are commonly used to measure network performance and improve load times.

Latency can be thought of as the time it takes to perform an action, while throughput can be thought of as the number of actions that can be executed in one unit of time. In other words, latency measures how long data takes to be transferred, while throughput measures how much data can be sent in a given period.

Bandwidth is another concept that is often associated with latency. Bandwidth describes the maximum capacity of a network or internet connection. When a connection has too little bandwidth for the traffic it carries, packets have to wait their turn and latency increases.

To understand bandwidth's relationship with latency, picture bandwidth as the width of a pipe and throughput as how much water the pipe actually carries over a specified time. Latency becomes the time it takes for the water to reach its destination. The smaller the pipe, the longer it takes for a given volume of water to arrive; by the same token, the bigger the pipe, the sooner it arrives. In this way, constrained bandwidth can cause higher latency.
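
The interplay of the three terms can be sketched with a deliberately simplified model: total transfer time is one round trip of latency plus the time needed to push the data through the available bandwidth. The model ignores protocol overhead, congestion and packet loss, and the function names are hypothetical.

```python
def transfer_time_s(size_bytes: int, bandwidth_bps: float, rtt_s: float) -> float:
    """Simplified transfer time: one round trip plus serialization time."""
    return rtt_s + size_bytes * 8 / bandwidth_bps


def effective_throughput_bps(size_bytes: int, bandwidth_bps: float,
                             rtt_s: float) -> float:
    """Bits actually delivered per second once latency is included."""
    return size_bytes * 8 / transfer_time_s(size_bytes, bandwidth_bps, rtt_s)


# Hypothetical 1 KB request over a link with 100 ms RTT.
slow_link = transfer_time_s(1_000, 1e6, 0.1)  # 1 Mbps
fast_link = transfer_time_s(1_000, 1e7, 0.1)  # 10 Mbps
```

In this model, a tenfold bandwidth increase shortens the small transfer only slightly because the round trip dominates, while large transfers benefit far more from the wider pipe.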

This was last updated in January 2020