Simian Army

Contributor(s): Matthew Haughn

The Simian Army is a collection of open source cloud testing tools created by the online video streaming company Netflix. The tools allow engineers to test the reliability, security, resiliency and recoverability of the cloud services that Netflix runs on Amazon Web Services (AWS) infrastructure.

Netflix engineers started creating the autonomous software agents, which are called monkeys, soon after moving to the cloud with AWS. Each monkey is designed to help make Netflix's service less fragile and better able to support continuous service, with minimal degradation, when parts of the cloud experience random failures.

Members of the Simian Army include:

  • Chaos Monkey - randomly shuts down virtual machines (VMs) to ensure that small disruptions will not affect the overall service.
  • Latency Monkey - simulates a degradation of service and checks to make sure that upstream services react appropriately.
  • Conformity Monkey - detects instances that do not conform to best practices and shuts them down, giving the service owner the opportunity to re-launch them properly.
  • Security Monkey - searches out security weaknesses and terminates the offending instances. It also ensures that SSL and DRM certificates are not expired or close to expiration.
  • Doctor Monkey - performs health checks on each instance and monitors other external signs of process health such as CPU and memory usage.
  • Janitor Monkey - searches for unused resources and discards them.
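
To make the Chaos Monkey idea concrete, here is a minimal sketch in Python. It is not Netflix's implementation and makes no AWS calls: the names `Instance`, `ServiceGroup` and `unleash_chaos`, and the `probability` parameter, are hypothetical and exist only for illustration. The sketch assumes each service runs several redundant instances, randomly marks at most one instance per group as terminated, and then checks whether the group can still serve traffic.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A simulated VM; hypothetical stand-in for a real cloud instance."""
    instance_id: str
    running: bool = True


@dataclass
class ServiceGroup:
    """A simulated group of redundant instances behind one service."""
    name: str
    instances: list = field(default_factory=list)

    def is_available(self) -> bool:
        # The service survives as long as at least one instance is running.
        return any(i.running for i in self.instances)


def unleash_chaos(groups, probability, rng):
    """Terminate at most one random running instance per group.

    `probability` is the chance that a given group is hit on this run.
    Returns the IDs of the instances that were terminated.
    """
    terminated = []
    for group in groups:
        running = [i for i in group.instances if i.running]
        if running and rng.random() < probability:
            victim = rng.choice(running)
            victim.running = False  # simulated shutdown, no real API call
            terminated.append(victim.instance_id)
    return terminated


def make_groups():
    """Build two example groups with three instances each."""
    return [
        ServiceGroup(name, [Instance(f"{name}-{n}") for n in range(3)])
        for name in ("api", "streaming")
    ]


groups = make_groups()
killed = unleash_chaos(groups, probability=1.0, rng=random.Random(0))
print("terminated:", killed)
for g in groups:
    print(g.name, "available:", g.is_available())
```

Because each simulated group starts with three instances and loses at most one per run, every group remains available afterward, which is the property a Chaos Monkey-style test is meant to confirm.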

Together, these tools help make a cloud service less fragile: potential problems can be detected and addressed, and induced failures provide knowledge that can help prevent future failures and guide the response to any that do occur.

Netflix engineers continue to conceptualize and develop new monkeys and invite the community to do so as well.

This was last updated in August 2013
