Operating Systems Fundamentals: Processes, Memory, and Scheduling
An operating system is the foundational software layer that arbitrates access to hardware resources across all running programs, making the coordination of processes, memory allocation, and CPU scheduling the three most consequential subsystems any OS must implement correctly. Failures in these subsystems produce the most visible class of system pathologies — deadlocks, memory exhaustion, starvation, and priority inversion — that surface in production environments from embedded controllers to cloud hypervisors. This page provides a reference-grade treatment of each subsystem: how it is defined, how its components interact, where classification boundaries fall, and where engineering tradeoffs remain genuinely contested. The material draws on foundational treatments from the ACM and IEEE as well as NIST guidance relevant to OS-level security properties.
- Definition and scope
- Core mechanics or structure
- Causal relationships or drivers
- Classification boundaries
- Tradeoffs and tensions
- Common misconceptions
- Checklist or steps
- Reference table or matrix
- References
Definition and scope
An operating system (OS) is a privileged software layer that manages hardware resources — CPU cycles, physical memory, storage, and I/O devices — and provides a controlled interface through which user-space programs execute. The three primary abstractions the OS exposes are processes (units of execution), virtual memory (isolated address spaces), and scheduled CPU time (deterministic or fair allocation of processor access).
NIST Special Publication 800-123, Guide to General Server Security, frames the OS as the enforcement boundary for all higher-level security controls, meaning that process isolation and memory protection are not merely performance concerns but security-critical properties. The scope of OS fundamentals therefore spans computer science curricula, systems programming practice, and security hardening frameworks simultaneously.
The OS kernel operates in a privileged execution mode — Ring 0 on x86 architectures — while user processes run at Ring 3, a hardware-enforced boundary that prevents direct hardware manipulation by unprivileged code. This two-level privilege model is the structural basis for every isolation guarantee the OS provides.
Core mechanics or structure
Processes and the Process Control Block
A process is a program in execution, distinguished from a passive program binary by the addition of a program counter, a register set, and an associated stack. The OS represents each process internally through a Process Control Block (PCB), a kernel data structure that stores the process ID (PID), execution state, memory map pointers, open file descriptors, scheduling priority, and CPU register snapshots. When the OS context-switches between processes, it saves the outgoing process's register state into its PCB and loads the incoming process's saved state — a sequence that typically costs between 1 and 100 microseconds depending on architecture and cache effects ([Silberschatz, Galvin & Gagne, Operating System Concepts, 10th ed., Wiley]).
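The PCB and the save/restore sequence can be sketched in miniature. This is an illustrative model, not any real kernel's layout: the `PCB` fields and the `context_switch` helper are simplified stand-ins for the structures described above.

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    """Minimal Process Control Block: a small subset of what a real kernel keeps."""
    pid: int
    state: str = "new"                       # new / ready / running / waiting / terminated
    program_counter: int = 0
    registers: dict = field(default_factory=dict)
    open_files: list = field(default_factory=list)
    priority: int = 0

def context_switch(outgoing: PCB, incoming: PCB, cpu: dict) -> dict:
    """Save the running process's register state into its PCB, load the next one's."""
    outgoing.registers = dict(cpu)           # snapshot outgoing CPU state into its PCB
    outgoing.state = "ready"
    incoming.state = "running"
    return dict(incoming.registers)          # restore incoming snapshot onto the CPU

a = PCB(pid=1, registers={"rax": 7}, state="running")
b = PCB(pid=2, registers={"rax": 42}, state="ready")
cpu = dict(a.registers)
cpu = context_switch(a, b, cpu)
print(cpu["rax"])   # 42 — process b's saved state is now on the CPU
```

The real operation additionally switches address spaces and flushes or tags TLB entries, which is where most of the 1–100 µs cost comes from.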
Process states follow a defined finite-state machine. The canonical five-state model recognized in the ACM/IEEE Computer Science Curricula 2013 guidelines (CS2013) includes: New, Ready, Running, Waiting (blocked), and Terminated. A process transitions from Running to Waiting when it issues an I/O request; it moves from Waiting back to Ready when the I/O completes, not directly back to Running — a distinction with important scheduling consequences.
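The five-state machine can be encoded as a transition table, which makes the Waiting-to-Ready rule explicit. The event names (`admit`, `dispatch`, and so on) are illustrative labels, not standardized terminology.

```python
# Legal transitions in the canonical five-state process model.
TRANSITIONS = {
    ("new", "admit"): "ready",
    ("ready", "dispatch"): "running",
    ("running", "timeout"): "ready",        # preempted by a timer interrupt
    ("running", "io_request"): "waiting",
    ("waiting", "io_complete"): "ready",    # back to Ready — never straight to Running
    ("running", "exit"): "terminated",
}

def step(state: str, event: str) -> str:
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in state {state}")

s = "new"
for ev in ("admit", "dispatch", "io_request", "io_complete", "dispatch", "exit"):
    s = step(s, ev)
print(s)  # terminated
```

Note that `("waiting", "dispatch")` is absent from the table: a process whose I/O completes must re-enter the ready queue and compete for the CPU like any other ready process.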
Memory Management
Virtual memory creates the abstraction that each process owns a contiguous private address space, regardless of physical memory layout. The OS implements this through paging, which divides both virtual and physical memory into fixed-size units called pages (4 KiB is the default page size on x86-64, though Linux supports 2 MiB and 1 GiB "huge pages"). The page table maps virtual page numbers to physical frame numbers; a hardware component called the Memory Management Unit (MMU) performs this translation on every memory access.
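The translation arithmetic is worth making concrete. The sketch below models a single-level page table as a dictionary (real x86-64 uses a four-level radix tree); the mappings are made up for illustration.

```python
PAGE_SIZE = 4096                            # 4 KiB pages, the x86-64 default
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # 12 bits of offset within a page

# Hypothetical page table: virtual page number -> physical frame number
page_table = {0: 5, 1: 9, 7: 2}

def translate(vaddr: int) -> int:
    vpn = vaddr >> OFFSET_BITS              # high bits select the virtual page
    offset = vaddr & (PAGE_SIZE - 1)        # low 12 bits pass through unchanged
    if vpn not in page_table:
        raise LookupError(f"page fault at vpn={vpn}")   # 'present' bit clear
    return (page_table[vpn] << OFFSET_BITS) | offset

print(hex(translate(0x1ABC)))   # vpn=1 maps to frame 9: 0x9abc
```

The key property: translation replaces only the page-number bits, so the byte offset within the page is identical in the virtual and physical addresses.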
The Translation Lookaside Buffer (TLB) is a hardware cache of recent page table entries. A TLB miss forces a page table walk, which may require 4 sequential memory accesses in a four-level page table structure (as used in x86-64 Linux). TLB miss penalties directly govern the performance cost of context switches and large working-set applications.
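The cost of TLB misses can be quantified with a standard effective-access-time calculation. The latency figures below are illustrative placeholders, not measurements of any specific processor.

```python
def effective_access_time(hit_ratio, tlb_ns=1, mem_ns=100, walk_levels=4):
    """Average memory access latency with a TLB in front of a multi-level page table.

    On a hit: one TLB lookup plus the data access. On a miss: the TLB lookup,
    one memory access per page-table level, then the data access.
    """
    hit_cost = tlb_ns + mem_ns
    miss_cost = tlb_ns + walk_levels * mem_ns + mem_ns
    return hit_ratio * hit_cost + (1 - hit_ratio) * miss_cost

print(effective_access_time(1.00))   # 101 ns with a perfect TLB
print(effective_access_time(0.99))   # 105 ns — even a 1% miss rate is measurable
```

With a four-level walk, each percentage point of miss rate adds roughly 4 ns per access under these numbers, which is why context switches that flush the TLB are expensive out of proportion to their direct cost.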
Demand paging extends virtual memory to secondary storage: pages that are not currently in physical memory reside on disk and are loaded only when accessed, triggering a page fault. The OS page fault handler resolves the fault by allocating a free frame, reading the page from disk, updating the page table, and resuming the faulting instruction.
CPU Scheduling
The scheduler selects which ready process receives CPU time next. Classical designs distinguish three scheduling contexts: the short-term scheduler (also called the CPU scheduler), which selects from the ready queue on each context switch; the long-term scheduler (job scheduler), which controls the degree of multiprogramming by admitting processes from the job pool; and the medium-term scheduler, which temporarily removes processes from memory (swapping) to reduce contention.
Causal relationships or drivers
Process proliferation drives memory pressure: as the count of concurrent processes increases, aggregate working-set size eventually exceeds physical RAM, increasing page fault frequency. When page fault handling consumes more CPU time than productive computation, the system enters thrashing — a condition described formally by Peter Denning's working-set model (Denning, "The Working Set Model for Program Behavior," Communications of the ACM, 1968).
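Denning's working set W(t, Δ) — the set of distinct pages a process referenced in the last Δ memory accesses — can be computed directly from a reference string. The reference string below is made up; the point is that thrashing begins when the working sets of all active processes together exceed the available frames.

```python
def working_set(reference_string, t, delta):
    """W(t, Δ): distinct pages referenced in the window of Δ accesses ending at time t."""
    window = reference_string[max(0, t - delta + 1): t + 1]
    return set(window)

refs = [1, 2, 1, 3, 2, 4, 4, 4, 5, 1]
print(working_set(refs, t=9, delta=4))  # pages touched in the last 4 references
print(working_set(refs, t=4, delta=3))
```

A larger Δ gives a more conservative (larger) working-set estimate; the OS uses such estimates to decide how many processes it can keep resident without thrashing.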
The OS scheduler's choice of algorithm determines latency characteristics downstream. A purely efficiency-maximizing scheduler such as Shortest Job First minimizes average waiting time but starves long jobs indefinitely whenever shorter jobs keep arriving. Introducing aging — incrementally raising the priority of waiting processes — is the standard mitigation, as documented in [Silberschatz et al., Operating System Concepts].
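Aging can be sketched in a few lines. This is an illustrative policy, not a specific kernel's implementation: each round, the highest-priority job runs and resets to its static base priority, while every waiting job creeps upward.

```python
def schedule_with_aging(jobs, base, age_boost=1):
    """Pick the highest-priority job; age the rest; reset the winner to its base.

    jobs: current dynamic priorities; base: static priority each job resets to
    when it runs. Higher number = higher priority.
    """
    chosen = max(jobs, key=lambda name: jobs[name])
    for name in jobs:
        jobs[name] = base[name] if name == chosen else jobs[name] + age_boost
    return chosen

base = {"long_batch": 0, "short_a": 5, "short_b": 5}
jobs = dict(base)
order = [schedule_with_aging(jobs, base) for _ in range(8)]
print(order)   # long_batch eventually runs despite starting at the bottom
```

With `age_boost=0` the same loop starves `long_batch` forever, which is exactly the pathology aging exists to bound.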
I/O-bound processes exhibit short CPU bursts followed by long I/O waits. CPU-bound processes exhibit the inverse. This behavioral distinction is the primary driver behind the design of multilevel feedback queues, which dynamically demote CPU-bound processes to lower-priority queues while keeping I/O-bound processes near the top of the scheduling hierarchy — improving interactive responsiveness without static priority assignment.
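The demotion mechanic that drives an MLFQ can be sketched as follows. This is a deliberately simplified model — no I/O blocking, no periodic priority boost — with made-up quanta: a job that burns its entire quantum is treated as CPU-bound and demoted; a job that finishes within its quantum never falls.

```python
from collections import deque

def mlfq_run(jobs, quanta=(2, 4, 8)):
    """Run (name, burst) jobs through a multilevel feedback queue.

    One queue per quantum, highest priority first. A job that exhausts its
    full quantum is demoted one level; the bottom queue recycles in place.
    Returns the completion order.
    """
    queues = [deque() for _ in quanta]
    for name, burst in jobs:
        queues[0].append([name, burst])
    done = []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)   # highest non-empty queue
        name, remaining = queues[level].popleft()
        run = min(remaining, quanta[level])
        remaining -= run
        if remaining == 0:
            done.append(name)
        else:
            dest = min(level + 1, len(queues) - 1)           # used full quantum: demote
            queues[dest].append([name, remaining])
    return done

print(mlfq_run([("interactive", 1), ("cpu_hog", 10)]))
```

The short job finishes at the top level immediately, while the CPU-bound job sinks through the hierarchy — the adaptive behavior described above, with no static priority assignment.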
Classification boundaries
Operating system schedulers fall into two primary classes based on whether the OS can forcibly remove CPU control from a running process:
Non-preemptive (cooperative) scheduling — the running process retains the CPU until it voluntarily yields, issues an I/O request, or terminates. Windows 3.x used cooperative multitasking; modern kernels do not use this model for user processes, though it remains common in microcontroller RTOS environments.
Preemptive scheduling — the OS can interrupt a running process at any timer interrupt interval (a time quantum or time slice, typically 1–100 milliseconds in Linux and Windows). All modern general-purpose operating systems — Linux, macOS, Windows NT and later — use preemptive scheduling.
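A round-robin simulation makes the preemptive time-slice mechanic concrete and shows its quantum sensitivity. The burst values are arbitrary; all jobs are assumed to arrive at t=0.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Return completion times under round-robin. bursts: name -> CPU burst."""
    queue = deque(bursts.items())
    t, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(remaining, quantum)
        t += run
        if remaining > run:
            queue.append((name, remaining - run))   # preempted at quantum expiry
        else:
            finish[name] = t                        # job completed
    return finish

print(round_robin({"A": 5, "B": 3, "C": 1}, quantum=2))    # C finishes first
print(round_robin({"A": 5, "B": 3, "C": 1}, quantum=100))  # degenerates to FCFS
```

With a small quantum the short job C finishes early; with a quantum longer than any burst, round-robin collapses into non-preemptive FCFS — the quantum-length sensitivity noted in the reference table below.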
Memory management strategies divide along two axes:
| Axis | Option A | Option B |
|---|---|---|
| Allocation unit | Fixed-size pages | Variable-size segments |
| Residence policy | All pages in memory | Demand paging (pages on disk) |
Pure segmentation (used in early Intel 8086 real-mode) has been largely supplanted by segmented paging, which combines both models, or pure paging. The Intel 64 and IA-32 Architectures Software Developer's Manual documents the hardware mechanisms that support both models on x86-64.
Process classification distinguishes heavyweight processes (full virtual address space, separate OS resources) from threads (lightweight execution contexts that share the address space of their parent process). The POSIX standard (IEEE Std 1003.1) defines the thread API used across Unix-like systems, including the pthread interface.
Tradeoffs and tensions
Scheduling quantum length creates a direct tension between responsiveness and throughput. A short quantum (1 ms) improves interactive latency but increases context-switch overhead as a fraction of total CPU time. A long quantum (100 ms) reduces overhead but degrades responsiveness for interactive processes. Linux's Completely Fair Scheduler (CFS), introduced in kernel 2.6.23, sidesteps fixed quanta by targeting a configurable target latency (default 6 ms for ≤8 runnable processes) and dividing it proportionally among runnable tasks (Linux kernel documentation, kernel/sched/fair.c).
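The CFS-style latency division can be expressed in one function. The 6 ms target matches the default mentioned above; the 0.75 ms minimum granularity floor is an assumed value mirroring the historical kernel default, and real CFS additionally weights slices by nice level, which is omitted here.

```python
def cfs_style_slice(nr_running, target_latency_ms=6.0, min_granularity_ms=0.75):
    """Divide a target latency evenly among runnable tasks, with a floor.

    The floor prevents slices from shrinking toward zero as the run queue
    grows, at the cost of letting total latency exceed the target.
    """
    return max(target_latency_ms / nr_running, min_granularity_ms)

print(cfs_style_slice(3))    # 2.0 ms each: latency target divided three ways
print(cfs_style_slice(24))   # clamped at 0.75 ms; total latency now exceeds 6 ms
```

This shows the scheme's limit: once `nr_running` exceeds `target / min_granularity`, the latency target is silently abandoned in favor of bounded per-switch overhead.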
Memory overcommitment allows the OS to promise more virtual memory than physically exists, relying on the statistical improbability that all processes demand their full allocation simultaneously. Linux uses overcommitment by default (controlled via /proc/sys/vm/overcommit_memory). The tension is between memory utilization efficiency and the risk of the OOM killer (Out-of-Memory killer) terminating processes when physical pages are genuinely exhausted.
Real-time versus general-purpose scheduling presents a fundamental design conflict. General-purpose schedulers optimize average-case metrics (throughput, fairness). Real-time schedulers optimize worst-case latency bounds. The two objectives are structurally incompatible: the preemption and fairness mechanisms that improve average-case behavior introduce unpredictable latency spikes. POSIX defines SCHED_FIFO and SCHED_RR real-time scheduling policies (IEEE Std 1003.1) as separate from the standard SCHED_OTHER policy, acknowledging the incompatibility by isolating them.
Security versus performance in memory isolation surfaces in the Spectre and Meltdown hardware vulnerabilities disclosed in January 2018. Mitigations — particularly Kernel Page Table Isolation (KPTI) — restore the security property that user processes cannot infer kernel memory contents through timing side channels, but the mitigation increases system call overhead by 5–30% on workloads with high syscall frequency (Red Hat Performance Analysis, 2018).
Common misconceptions
Misconception: A process and a program are the same thing.
A program is a static file on disk. A process is a dynamic instance of that program executing with its own PID, memory space, and state. The same program binary can spawn 10 simultaneous processes, each independent.
Misconception: More RAM always prevents thrashing.
Thrashing is caused by the total working-set size of active processes exceeding available physical frames. Adding RAM helps only if it raises capacity above total working-set demand; otherwise the structural fix is reducing the degree of multiprogramming — the number of active processes competing for frames.
Misconception: The CPU scheduler runs continuously.
The scheduler is invoked only at specific events: timer interrupts, I/O completion interrupts, process creation/termination, and blocking system calls. Between these events, the currently running process executes without scheduler involvement.
Misconception: Threads are faster than processes because they avoid the OS.
Threads still require OS kernel involvement for creation, synchronization (via mutexes, semaphores), and scheduling. The performance advantage of threads over processes comes from shared address space (no address space switch on context switch) and cheaper creation cost — not from bypassing the OS.
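The shared-address-space property is easy to demonstrate: multiple threads mutate one variable directly, with no inter-process communication, but the mutex they coordinate through is still a kernel-backed primitive.

```python
import threading

counter = 0
lock = threading.Lock()          # kernel-backed mutex: threads do not bypass the OS

def worker(n):
    global counter
    for _ in range(n):
        with lock:               # serialize the read-modify-write
            counter += 1         # every thread mutates the same address space

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 40000 — one shared variable, no IPC required
```

Separate processes attempting the same thing would each see a private copy of `counter` and need explicit shared memory or message passing, which is the real source of the thread/process cost difference.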
Misconception: Virtual memory is synonymous with swap space.
Virtual memory is the abstraction of a private address space. Swap space is one mechanism (disk-backed paging) used to extend the physical backing of that address space. A system can provide virtual memory without any swap space, as long as physical RAM covers all active working sets.
Checklist or steps
The following sequence describes the phases of a complete page fault resolution cycle as implemented in demand-paging operating systems:
- The CPU generates a memory access for a virtual address not currently mapped to a physical frame (indicated by the "present" bit being 0 in the page table entry).
- The MMU raises a page fault exception, transferring control to the OS page fault handler in kernel space.
- The handler determines whether the address falls within the process's valid virtual address space. If not, the access is illegal and the OS delivers a segmentation fault signal (SIGSEGV on POSIX systems).
- If the address is valid, the handler locates the backing store location for the required page (swap partition, memory-mapped file, or zero-filled demand allocation).
- A free physical frame is selected. If no free frame exists, the page replacement algorithm (e.g., LRU, Clock, NRU) selects a victim frame to evict.
- If the victim frame is dirty (modified since it was loaded), it is written back to the backing store before the frame is reused.
- The required page is read from backing store into the selected physical frame.
- The page table entry for the faulting virtual address is updated with the new physical frame number and the present bit is set to 1.
- The TLB entry for the faulting address is invalidated or updated.
- The faulting instruction is restarted; it now completes without a page fault.
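The steps above can be sketched as a toy demand pager. This is an illustrative model only — FIFO replacement rather than LRU or Clock, a dictionary standing in for the page table, and counters in place of actual disk I/O.

```python
from collections import OrderedDict

class DemandPager:
    """Toy demand pager following the checklist: validity check, FIFO victim
    selection, dirty-page write-back, and present-bit update."""

    def __init__(self, num_frames, valid_pages):
        self.num_frames = num_frames
        self.valid = set(valid_pages)      # the process's legal virtual pages
        self.frames = OrderedDict()        # resident pages in FIFO order -> {"dirty": bool}
        self.faults = 0
        self.writebacks = 0

    def access(self, vpn, write=False):
        if vpn not in self.valid:          # step 3: outside the valid address space
            raise MemoryError("SIGSEGV: illegal access")
        if vpn not in self.frames:         # present bit is 0: page fault
            self.faults += 1
            if len(self.frames) >= self.num_frames:
                victim, meta = self.frames.popitem(last=False)   # FIFO victim
                if meta["dirty"]:
                    self.writebacks += 1   # step 6: write back before frame reuse
            self.frames[vpn] = {"dirty": False}   # page read in, present bit set
        if write:
            self.frames[vpn]["dirty"] = True
        # step 10: the faulting instruction restarts and completes

pager = DemandPager(num_frames=2, valid_pages={0, 1, 2})
pager.access(0, write=True)
pager.access(1)
pager.access(2)   # evicts dirty page 0, forcing one write-back
print(pager.faults, pager.writebacks)   # 3 1
```

Three cold-start faults and one write-back: the dirty bit is what saves the pager from writing clean pages back to disk on every eviction.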
Reference table or matrix
The table below compares the six major CPU scheduling algorithms across four operational dimensions. This reference maps to algorithms covered in the ACM/IEEE CS2013 Systems Fundamentals knowledge area.
| Algorithm | Preemptive? | Optimizes For | Risk / Weakness | Typical Use Context |
|---|---|---|---|---|
| First-Come, First-Served (FCFS) | No | Implementation simplicity | Convoy effect; poor average wait time | Batch systems, print queues |
| Shortest Job First (SJF) | No | Minimum average waiting time | Requires burst-time prediction; starvation of long jobs | Batch, theoretical baseline |
| Shortest Remaining Time First (SRTF) | Yes | Minimum average waiting time (preemptive) | High context-switch overhead; requires prediction | Theoretical; preemptive SJF variant |
| Round Robin (RR) | Yes | Fairness; interactive responsiveness | Quantum-length sensitivity; poor for CPU-bound | Time-sharing, desktop OS |
| Priority Scheduling | Both variants | Differentiated service levels | Starvation of low-priority processes without aging | Real-time, embedded OS |
| Multilevel Feedback Queue (MLFQ) | Yes | Adaptive balance of responsiveness and throughput | Configuration complexity (queue count, quantum per level) | General-purpose OS (Windows scheduler, classic Unix time-sharing schedulers) |
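The FCFS-versus-SJF rows in the table can be verified with a short calculation over the classic convoy-effect example (one 24-unit burst ahead of two 3-unit bursts, all arriving at t=0):

```python
def avg_wait_fcfs(bursts):
    """Average waiting time under FCFS; jobs run in list order from t=0."""
    t, waits = 0, []
    for burst in bursts:
        waits.append(t)      # each job waits for everything ahead of it
        t += burst
    return sum(waits) / len(waits)

def avg_wait_sjf(bursts):
    """Non-preemptive SJF is FCFS over the sorted burst list."""
    return avg_wait_fcfs(sorted(bursts))

bursts = [24, 3, 3]
print(avg_wait_fcfs(bursts))   # 17.0 — the long job at the head delays everyone
print(avg_wait_sjf(bursts))    # 3.0 — provably minimal average waiting time
```

The 17.0-versus-3.0 gap is the convoy effect listed as FCFS's weakness, and the minimal SJF result is why the table calls it the theoretical baseline despite its impractical need for burst-time prediction.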
References
- ACM/IEEE Computer Science Curricula 2013 (CS2013) — Canonical academic scope definition for OS fundamentals, including Systems Fundamentals knowledge area.
- NIST Special Publication 800-123: Guide to General Server Security — NIST framework positioning the OS as a security enforcement boundary.
- IEEE Std 1003.1 (POSIX) — Defines process, thread, and real-time scheduling APIs (SCHED_FIFO, SCHED_RR, SCHED_OTHER, pthread).
- Intel 64 and IA-32 Architectures Software Developer's Manual — Authoritative hardware reference for paging, segmentation, privilege rings, and TLB behavior on x86-64.
- Peter Denning, "The Working Set Model for Program Behavior," Communications of the ACM, 1968 — Original formal treatment of thrashing and working-set theory.
- Linux Kernel Documentation — Completely Fair Scheduler (kernel/sched/fair.c) — In-tree design documentation for the default Linux process scheduler.
- Silberschatz, Galvin & Gagne, Operating System Concepts, 10th ed., Wiley — Standard textbook treatment of processes, scheduling, and memory management cited throughout this page.