\documentclass[12pt]{llncs}

\usepackage{indentfirst}
\markright{Power management in Linux}
\begin{document}

\title{Power management in Linux}
\author{Pavel Machek}

\institute{SuSE CR, s r.o., Drahobejlova 27, Praha 9, Czech Republic\\
\email{pavel@suse.cz}}

\maketitle              % typeset the title of the contribution

\begin{abstract}
As CPUs get more powerful (both in MIPS and in Watts), and notebooks
get smaller, power and thermal management gets more and more
important. Modern machines usually support several sleep states
(suspend to RAM, suspend to disk), a few CPU halt states (hlt, C1, C2,
C3), CPU throttling, CPU frequency and voltage scaling, various disk
sleep states, and control over the backlight. A modern operating system
is expected to keep the processor cool and keep the battery from
exploding. Linux is expected to deal with all this mess, and (around
2.6.5) does quite a good job in most areas, with the exceptions of
suspend to RAM and backlight control. I will talk in more detail about
swsusp (aka suspend to disk), because of its ``interesting'' design,
which keeps confusing users and developers. I'll also present a brief
overview of the APM, ACPI and cpufreq modules.
\end{abstract}

\sloppy

\section{Overview}

Several things are necessary for an operating system to run smoothly
on a notebook computer: slowing the CPU down when idle, switching
between different CPU speeds and voltages, thermal management,
spinning the disk down when it is not in use, suspending the machine
to RAM and to disk, controlling the LCD backlight, reading battery
status, and handling the extra keys notebooks like to introduce.

Much of this stuff is beneficial to desktop and server PCs, too,
although it is not strictly necessary there. Desktop machines mostly
work to reduce noise and power consumption. Servers care most about
thermal output; battery life on a UPS is also important.

The old method of doing power management on PCs is called APM
(Advanced Power Management). Most things get done directly by
hardware. Backlight control actually tends to work, and slowing the
CPU on idle is simplistic but usually works. Suspend-to-disk and
suspend-to-RAM are either not present or done by hardware, and they
usually have various glitches. There's a design problem with APM
suspends: the BIOS needs to know about all the hardware in the machine
for this to work correctly, but newer notebooks have replaceable
mini-PCI (usually wifi) cards and cardbus slots. APM does not scale
CPU frequency/voltage, or does so in hardware in a very simplistic
way. Battery status is only reported as percent left.

There's a newer method of PC power management, called ACPI (Advanced
Configuration and Power Interface). It is better than APM, but I'm not
sure if it is the right way to do things. The kernel interprets byte
code from the ACPI BIOS (yes, we have a language interpreter in the
kernel!). The ACPI standard is very long, and pretty hard to read. ACPI
handles all of the above aspects; we still have problems with
backlight control (a standard exists but not everyone uses it) and
suspend-to-RAM.

On non-PC machines, the kernel usually knows all the hardware details,
and power management tends to just work.

\section{swsusp}

swsusp (originally SoftWare SUSPend, now also SWap SUSPend) is a kernel
subsystem that is able to suspend the machine to disk even without BIOS
support. It started as a 2.4 patch from Gabor Kuti; now I'm working
with Nigel Cunningham and Patrick Mochel on the 2.6 version. swsusp was
merged into Linus' kernel in the middle of the 2.5 series. If you plan
to use swsusp, please use the 2.6.8.1-mm version (swsusp is merged with
pmdisk there, and the code is way cleaner; it should be merged to Linus
soon).

Actually, Nigel also has a second code base called suspend2, which
contains many additional features over swsusp. It is faster, supports
compressed images, provides a nice progress bar, and can be aborted by
pressing ``Esc''. It is unfortunately also quite big, and needs some
cleanups. Nigel is working on it.

How swsusp works: to ease implementation, suspend starts by stopping
all the user processes. Then it frees at least 50\% of memory, so it
can do an atomic copy, and suspends devices, so that no DMA is running
while the snapshot is taken. Then it can disable interrupts and do the
atomic copy. Before the atomic copy can be written to disk, devices
have to be resumed. The image is then written to disk, devices are
suspended once again, and the machine is powered down. Seeing devices
transition three times during suspend is strange to users, but it is
okay. During resume, a normal boot is done, up to the point just before
mounting the file system. Then devices are suspended, memory is copied
back atomically with interrupts disabled, devices are resumed and the
machine continues normally.
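
The suspend half of that sequence can be written down as a toy model.
This is only a sketch of the ordering described above; the step names
and helper functions are made up for illustration and are not kernel
functions, and the explicit ``enable interrupts'' step is my assumption
(devices can hardly be resumed without it).

```python
# Toy model of the swsusp suspend sequence described in the text.
# All names are illustrative; none of them are kernel functions.

SUSPEND_STEPS = [
    "freeze user processes",
    "free memory (at least 50%)",
    "suspend devices",        # 1st transition: no DMA during the snapshot
    "disable interrupts",
    "atomic copy of memory",
    "enable interrupts",      # assumption: implied, not stated in the text
    "resume devices",         # 2nd transition: the disk is needed for writing
    "write image to disk",
    "suspend devices",        # 3rd transition: just before power-down
    "power down",
]


def device_transitions(steps):
    """Count how many times devices change state in a list of steps."""
    return sum(1 for s in steps if s in ("suspend devices", "resume devices"))


def devices_quiet_during_copy(steps):
    """Return True if devices are suspended when the atomic copy is taken."""
    suspended = False
    for step in steps:
        if step == "suspend devices":
            suspended = True
        elif step == "resume devices":
            suspended = False
        elif step == "atomic copy of memory":
            return suspended
    return False


if __name__ == "__main__":
    for number, step in enumerate(SUSPEND_STEPS, start=1):
        print(f"{number:2d}. {step}")
    print("device transitions:", device_transitions(SUSPEND_STEPS))
```

Counting the transitions in this list gives the three that users find
so strange, and the check confirms that the snapshot is taken while
devices are quiet.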

\section{Suspend-to-RAM}

At first sight, everyone assumes that suspend-to-RAM is easier
than suspend-to-disk. That is actually not the case, because drivers
need to work way better for suspend-to-RAM to work. In the swsusp case,
devices are already initialised by the normal boot. In the
suspend-to-RAM case, devices are in some rather weird state. The problem
is especially bad with video drivers, because video cards have ROMs
that need to be run...

There's an easy-to-use pseudo-suspend-to-RAM mode called ACPI S1. In
that mode, the machine is not running, but its state is preserved. It
was very common in the early days when Windows could not do
suspend-to-RAM properly. Unfortunately, it does not save much power.

\section{Power management}

Modern computers eat lots of power... which is quite a problem for
notebooks. CPUs are especially power-hungry. To keep battery life
reasonable, there are three different mechanisms. First, there are CPU
sleep states (ACPI Cx). When the operating system is not doing any
computation (waiting for disk or for the user), it can place the CPU
into one of the CPU sleep states. Deeper sleep (bigger x in Cx) means
better power saving, but also takes longer to enter and exit,
potentially hurting performance. Second, there are CPU power states:
modern CPUs can run at various voltage/frequency combinations. Of
course, lower frequency means slower computation, but the advantage in
battery life is worth it, and some batteries cannot support high
frequencies. Third, there's throttling; it is independent of the
previous two mechanisms and it modifies the duty cycle of the CPU. It
is mostly useful in emergency overheat conditions, because its energy
savings are only linear.
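
To see why throttling savings are ``only linear'', a common first-order
model of CMOS dynamic power can help. This is a textbook approximation,
not something measured on any particular machine. The symbols $C$
(switched capacitance), $V$ (core voltage), $f$ (clock frequency) and
$d$ (duty cycle, between 0 and 1) are introduced here for illustration
only, and leakage is ignored:
\begin{eqnarray*}
P_{\mathrm{run}} & \approx & C V^2 f \\
P_{\mathrm{throttled}} & \approx & d \, C V^2 f
\end{eqnarray*}
Throttling leaves $V$ and $f$ alone, so power falls only in proportion
to $d$. Frequency and voltage scaling lowers $f$ and $V$ together, so
power falls faster than linearly in $f$; if $V$ could be lowered in
proportion to $f$, power would fall roughly as $f^3$.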

Linux supports CPU sleep states. Unfortunately, C3 can't be used
when there's DMA going on; that's all the time if you use USB (but
some workarounds are possible). Power states and throttling are
supported, too, but you have to control them from userland.
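
As an illustration of what ``control from userland'' can look like,
here is a minimal sketch that talks to the cpufreq sysfs files. It
assumes the usual cpufreq directory for the first CPU, a driver that
exports \texttt{scaling\_available\_frequencies} (not all of them do),
and the \texttt{userspace} governor being available. The
\texttt{pick\_frequency} policy is made up for this example and is not
what any real daemon does.

```python
import os

# Assumed cpufreq sysfs directory for the first CPU; adjust for your system.
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def read_available_frequencies(cpufreq_dir=CPUFREQ_DIR):
    """Return the frequencies (in kHz) the driver advertises, ascending."""
    path = os.path.join(cpufreq_dir, "scaling_available_frequencies")
    with open(path) as f:
        return sorted(int(word) for word in f.read().split())


def pick_frequency(frequencies, load):
    """Pick the lowest frequency that still covers the current load.

    `load` is the fraction (0.0 to 1.0) of the top frequency in use.
    Hypothetical policy, for illustration only.
    """
    needed = load * max(frequencies)
    for freq in sorted(frequencies):
        if freq >= needed:
            return freq
    return max(frequencies)


def set_frequency(freq_khz, cpufreq_dir=CPUFREQ_DIR):
    """Request a fixed frequency; needs root and the 'userspace' governor."""
    with open(os.path.join(cpufreq_dir, "scaling_governor"), "w") as f:
        f.write("userspace")
    with open(os.path.join(cpufreq_dir, "scaling_setspeed"), "w") as f:
        f.write(str(freq_khz))
```

A small daemon could call \texttt{read\_available\_frequencies()} once,
then periodically measure load and pass the result of
\texttt{pick\_frequency()} to \texttt{set\_frequency()}.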

Then there are the backlight and the hard drive. Effort on controlling
the backlight with ACPI is ongoing; laptop mode is available in 2.6
kernels to make spinning the disk down useful.

\section{Thermal management}

Modern machines allow the operating system to control machine cooling
up to a certain point; if the machine overheats badly and the operating
system fails to react, the hardware must protect itself from damage. It
even works on some systems ;-). The original specification allows the
operating system to control fans, but most manufacturers do not allow
that these days. Fans are often controlled by hardware, but software
can still do ``passive cooling'', which basically means slowing things
down. That is useful to keep the machine running even with failed CPU
fans. Unfortunately, manufacturers often get the limits wrong, so it
does not always do the right thing.
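
The idea behind passive cooling can be sketched in a few lines. This is
a simplified illustration, not the algorithm from the ACPI
specification or the kernel; the trip temperatures, the step size and
the function names are all made-up values for the example.

```python
def cooling_action(temp_c, passive_trip_c=85.0, critical_trip_c=100.0):
    """Decide what to do at a given temperature (hypothetical trip points)."""
    if temp_c >= critical_trip_c:
        return "shutdown"   # too hot: give up before the hardware has to
    if temp_c >= passive_trip_c:
        return "throttle"   # passive cooling: slow things down
    return "none"


def next_performance_limit(limit, temp_c, passive_trip_c=85.0, step=0.125):
    """Move the allowed CPU performance one step down when hot, up when cool.

    `limit` is a fraction of full performance and stays within [step, 1.0].
    """
    if temp_c >= passive_trip_c:
        limit -= step
    else:
        limit += step
    return min(1.0, max(step, limit))
```

Called once per polling interval, \texttt{next\_performance\_limit()}
slowly backs off while the machine is hot and recovers once it cools
down. If the manufacturer gets the trip points wrong, a loop like this
dutifully does the wrong thing.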

Thermal information can be accessed either the ACPI
(hardware-independent) way, or by directly accessing the i2c bus (more
info, but highly hardware-specific). Locking should be introduced to
prevent ACPI/i2c conflicts, but it is not there, so do not use
LM-sensors in ACPI mode.

\section{Conclusion}

Notebooks get more and more important these days, and include lots of
advanced stuff to conserve power. Linux still fails in some important
areas (suspend-to-RAM), but things are getting better and Linux is
already very usable on notebooks.

\end{document}

