HashiCorp Nomad

Krupakar Reddy
4 min read · Oct 26, 2023

HashiCorp Nomad can be the key to efficient workload management. In this article, we’ll look at the simplicity and power of Nomad and show how it can transform your application deployments.

What is Nomad?

Nomad is a flexible workload orchestrator for deploying and managing various applications easily, whether they’re in containers or not. It simplifies the process by offering a unified workflow, supporting Docker, non-containerized, microservice, and batch applications.
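
To make the "containers or not" point concrete, here is a minimal sketch of a non-containerized batch job. The job, group, and task names are made up for illustration, and it assumes Linux clients where Nomad's exec driver is available:

```hcl
# Hypothetical batch job that runs a plain shell command, no container involved.
job "nightly-report" {
  # "batch" jobs run to completion instead of being kept alive like services.
  type = "batch"

  group "report" {
    task "generate" {
      # The exec driver runs a command directly on the client with isolation.
      driver = "exec"

      config {
        command = "/bin/sh"
        args    = ["-c", "echo generating report"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 64  # MB
      }
    }
  }
}
```

Swapping the driver to "docker" and the config to an image is all it takes to run a containerized task through the same workflow.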

Nomad’s bin-packing algorithm optimizes job scheduling and resource utilization, making your operations efficient and hassle-free.

Nomad is widely adopted and used in production by PagerDuty, Target, Citadel, Trivago, SAP, Pandora, Roblox, eBay, Deluxe Entertainment, and more.

Key Features:

  • Simplicity and Reliability: Nomad is simple and easy to use, with a straightforward setup process and a user-friendly interface.
    It is a self-contained, single-binary system that combines resource management and scheduling, which helps use resources efficiently and reduce costs. It operates without external dependencies and automatically handles failures, ensuring reliability even in distributed environments.
  • Flexible workload support: Nomad’s support for a wide range of workloads, such as containerized applications, standalone applications, and virtual machines, makes it a versatile solution suitable for a wide array of use cases.
  • GPU Support: Nomad natively supports GPU workloads like machine learning and AI, utilizing device plugins to detect and utilize GPU and other hardware resources effectively.
  • Multi-Region support: Nomad offers native support for multi-region deployments, allowing clusters to link and enabling job deployment across regions. It also automates replication of ACL policies, namespaces, resource quotas, and Sentinel policies across all clusters.
  • Multi-Datacenter Support: Even if you have geographically distributed datacenters within a region, you can use Nomad to manage all the clients effectively.
  • Scalability: Nomad’s scheduler is optimistically concurrent, which increases throughput and reduces latency. It has demonstrated scalability in real-world production environments, accommodating clusters with 10K+ nodes.
  • Seamless Integration: Nomad seamlessly integrates with other HashiCorp tools like Terraform, Consul, and Vault, offering comprehensive provisioning, service discovery, and secrets management capabilities.
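
As an illustration of the GPU support mentioned above, a task can request a GPU through a device block inside resources. This is only a sketch: the job and task names and the image tag are assumptions, and it presumes the clients have the NVIDIA device plugin and drivers set up:

```hcl
# Hypothetical job that asks the scheduler for one NVIDIA GPU.
job "gpu-example" {
  type = "batch"

  group "train" {
    task "cuda" {
      driver = "docker"

      config {
        # Assumed image tag; use whichever CUDA image fits your environment.
        image   = "nvidia/cuda:12.2.0-base-ubuntu22.04"
        command = "nvidia-smi"
      }

      resources {
        cpu    = 500 # MHz
        memory = 512 # MB

        # Only clients that advertise a matching device are eligible.
        device "nvidia/gpu" {
          count = 1
        }
      }
    }
  }
}
```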

Nomad Architecture:

Nomad has a client-server architecture.

Client Nodes:

  • Nomad clients are the machines (or nodes) in your infrastructure where workloads run. These nodes can be physical servers, virtual machines, or cloud instances.
  • On each client node, the Nomad agent is installed. The agent communicates with the Nomad server and executes tasks or jobs as instructed.
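
A client agent is typically configured with a small HCL file. The sketch below assumes a data directory and three server IP addresses that are purely illustrative:

```hcl
# client.hcl (hypothetical file name)
data_dir = "/opt/nomad/data" # assumed path

client {
  enabled = true

  # Assumed server addresses; 4647 is Nomad's default RPC port.
  servers = ["10.0.0.10:4647", "10.0.0.11:4647", "10.0.0.12:4647"]
}
```

The agent would then be started with nomad agent -config=client.hcl.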

Server Nodes:

  • Nomad server nodes are responsible for the control plane operations, including job scheduling, cluster management, and maintaining the state of the Nomad cluster.
  • In a production environment, it is recommended to have multiple server nodes for high availability and fault tolerance. Nomad uses leader election and state replication to ensure availability even if one or more server nodes fail.
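
A minimal server configuration for a highly available setup might look like the following. The paths and the region and datacenter names are assumptions; bootstrap_expect tells the servers to wait until three of them are present before electing a leader:

```hcl
# server.hcl (hypothetical file name)
data_dir   = "/opt/nomad/data" # assumed path
region     = "global"          # assumed region name
datacenter = "dc1"             # assumed datacenter name

server {
  enabled = true

  # Wait for three servers before bootstrapping the cluster.
  bootstrap_expect = 3
}
```

An odd number of servers (commonly three or five) lets the cluster keep a majority when one server fails.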

Cluster Communication:

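  • Nomad agents on client nodes communicate with Nomad servers over RPC to register themselves, send heartbeats, receive work, and report the status of running tasks. Job specifications themselves are submitted to the servers by users through the CLI, HTTP API, or UI.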
  • Servers communicate with each other for leader election and to replicate the state of the cluster. This ensures that all servers have consistent information about the cluster’s configuration and running jobs.

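Nomad servers use a gossip protocol (built on HashiCorp’s Serf library) to manage membership. Gossip is a peer-to-peer communication mechanism in which nodes periodically exchange state information about themselves and the other nodes they know about. Client nodes do not take part in gossip; they talk to the servers over RPC.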
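
To show where this appears in practice, servers can be told which peers to contact when they start, after which membership spreads through gossip. The addresses below are illustrative assumptions:

```hcl
# Fragment of a server agent configuration (addresses are assumed).
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    # Peers to contact on startup; the join is retried until it succeeds.
    retry_join     = ["10.0.0.10", "10.0.0.11", "10.0.0.12"]
    retry_interval = "15s"
  }
}
```

Once a server reaches any one peer, it learns about the rest of the members from that peer rather than needing a complete list up front.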

Nomad vs. Kubernetes (K8s):

Differences between Nomad and Kubernetes.

Understanding Jobs and Task Groups:

Jobs in Nomad:

Workload Definitions: Jobs are high-level definitions for tasks or task groups, serving as blueprints for workload management.

Task Grouping: Within a job, task groups bundle related tasks that are deployed together on the same client node.

Declarative Configuration: Define jobs using declarative language (HCL or JSON) to specify workload states.

Scaling and Replication: Jobs control task instances, allowing replication across nodes for redundancy or load balancing.

Dependencies: Within a task group, lifecycle hooks (prestart, poststart, poststop) let helper tasks run before or after the main task, so tasks start in the proper order.
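
The sketch below pulls several of these ideas together: a declarative job, a group replicated three times, and a helper task ordered before the main task with a lifecycle block. All names and the busybox image tag are assumptions for illustration:

```hcl
# Hypothetical service job illustrating count and task ordering.
job "example" {
  type = "service"

  group "web" {
    # Run three instances of this group; the scheduler places each one.
    count = 3

    # Runs to completion before the main task starts.
    task "init" {
      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      driver = "docker"

      config {
        image   = "busybox:1.36" # assumed tag
        command = "sh"
        args    = ["-c", "echo preparing"]
      }
    }

    # Main task; tasks in the same group always share a client node.
    task "app" {
      driver = "docker"

      config {
        image = "nginx:latest"
      }
    }
  }
}
```

No static host port is used here, since three instances could otherwise compete for the same port on one node.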

Tasks in Nomad:

Executable Workloads: Tasks are specific executable workloads within a job.

Task Types: Nomad supports various task types (e.g., Docker containers, binaries) within a single job.

Resource Allocation: Tasks define resource requirements (CPU, memory) for proper execution.

Health Checks: Implement health checks to monitor task health and trigger recovery on failure.

Logs and Metrics: Nomad captures each task’s stdout and stderr logs and exposes metrics for task monitoring.

Lifecycle Management: Tasks can be managed through start, stop, restart, and updates.
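
As a rough sketch of how health checks, restarts, and resources fit together, the job below registers a service with an HTTP check and sets a restart policy. The job and service names are hypothetical, and it assumes a Nomad version with built-in service discovery (provider = "nomad") that supports checks:

```hcl
# Hypothetical job with a health check and a restart policy.
job "web-checked" {
  group "web" {
    network {
      # Dynamic host port mapped to port 80 in the container.
      port "http" {
        to = 80
      }
    }

    service {
      name     = "web"
      port     = "http"
      provider = "nomad" # assumes Nomad's built-in service discovery

      # Poll the web root every 10 seconds to report task health.
      check {
        type     = "http"
        path     = "/"
        interval = "10s"
        timeout  = "2s"
      }
    }

    # Restart a failed task up to 2 times within 5 minutes, then mark it failed.
    restart {
      attempts = 2
      interval = "5m"
      delay    = "15s"
      mode     = "fail"
    }

    task "app" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 100 # MHz
        memory = 128 # MB
      }
    }
  }
}
```

Here the check reports health for the service, while the restart block governs what happens when the task itself fails.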

Here’s a simplified sample job specification file:


job "web-app2" {
  group "web" {
    network {
      # Map static host port 9009 to port 80 inside the container.
      port "http" {
        static = 9009
        to     = 80
      }
    }

    task "app" {
      driver = "docker"

      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      # CPU is in MHz, memory is in MB.
      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

Job Submission: Submit the job specification file to Nomad using the nomad job run <file.hcl> command.

Here are a few visuals that show the Nomad UI with running job information, including CPU utilization by container.

The job applied with the specification from the code snippet above.
Job deployment details.
Resource utilization graphs.

Concluding this initial discussion on HashiCorp Nomad, we’ve explored its core functionalities and its significance in managing containerized workloads.
