x86 Fast Reboot & Panic Reboot

x86 Fast Reboot & Panic Reboot in OpenSolaris. Slides from a presentation given to the NYC OpenSolaris user group June 2009 by Sherry Moore.

Uploaded by notpeter
Copyright: © Attribution Non-Commercial (BY-NC)

x86 Fast Reboot & Panic Reboot


Sherry Q. Moore
[email protected]

Agenda
• Background
• Objectives
• Technical Details
• Demo
• Summary
• Advanced Features

Background

Typical Reboot Time (x86)


Measured from “rebooting...” to the Solaris banner being logged in /var/adm/messages.

Platform  Code Name  # of CPUs  Memory (GB)  Time from reboot to banner (minutes:seconds)
x4140     Dorado     8          64           02:10
x4150     Doradi     8          64           01:58
x4270     Lynx       16         74           03:02
x4450     Tucani     16         128          02:08
x4500     Thumper    4          16           02:12
x4600M2   Galaxy 4F  32         256          02:00
x6250     Wolf       4          8            02:25
x6270     Virgo      16         8            01:37

Background

Motivation
• Boot/reboot time is down time. Reducing down time will improve:
- RAS rating
- User experience
- Efficiency
- Quality of life

Background

Typical Boot Steps

x86:   Hardware Reset -> BIOS -> GRUB -> dboot -> Kernel

SPARC: Hardware Reset -> POST -> OBP -> dboot -> Kernel

Background

Enabling Technology
• Dynamic Device Discovery
• Fault Management Architecture
- Component failures not limited to firmware testing time
• Memory Scrubber
• ZFS
• ...

Objectives
• Bypass BIOS and boot loader
• From “rebooting...” to banner in a few seconds

[Diagram: dboot -> Kernel, with the in-kernel boot loader in place of BIOS and GRUB]

Technical Details

Traditional Reboot Flow Control


reboot(1M) / init(1M)
  -> libc, libscf
  -> uadmin(2)
  -> kadmin()
  -> mdpreboot()
  -> mdboot()
  -> pc_reset()

Technical Details

Fast Reboot Flow Control


reboot(1M) / init(1M)
  -> libc, libscf
  -> uadmin(2)
  -> kadmin()
  -> mdpreboot() -> fastboot_load_kernel()
  -> mdboot() -> quiesce_devices(), fast_reboot()
  -> fast_reboot() -> fb_swtch()
Technical Details

Software Components
• Userland code to process fast reboot options
• In-kernel Boot Loader
• Device quiesce(9E)
• Fast Reboot Switcher

Technical Details/Userland

Invocation
• reboot(1M)
• init(1M)
• uadmin(1M)
• uadmin(2)

Technical Details/Userland

Default Settings
• Controlled by system/boot-config:default
• Fast Reboot is enabled by default
• Change fastreboot_default to change behavior

# svccfg -s system/boot-config:default \
    setprop config/fastreboot_default=false
# svcadm refresh system/boot-config:default

# svccfg -s system/boot-config:default \
    setprop config/fastreboot_default=true
# svcadm refresh system/boot-config:default

Technical Details/Userland

reboot(1M) Examples
• Reboot to the default entry in the GRUB menu

# reboot

• Reboot to the <n>th entry in the GRUB menu

# bootadm list-menu
# reboot n

• Other cool options

# reboot -- '-kvd'
# reboot -- '/platform/i86pc/mykernel/amd64/unix -k'
# reboot -- 'rpool/ROOT/opensolaris-1/'
# reboot -- '/dev/dsk/c1t0d0s3'
# reboot -- '/mnt/platform/i86pc/kernel/amd64/unix'
# reboot -e SecondBE

Technical Details/Userland

Other Invocations
• init(1M)

# init 6

• uadmin(1M)

# uadmin 2 8

Technical Details/In-kernel Boot Loader

In-Kernel Boot Loader

• Process reboot command line arguments
• Load kernel and boot archive into high memory
- Must be above the end of boot_archive
- Could fail if there is insufficient free memory
• Load fast reboot switcher to low memory
• Construct physical memory list (actually PTE list)
- For new kernel and boot archive
• Construct page tables
- 1-1 mapping for first 1GB
- Tables to handle mapping in new kernel and boot archive
• Construct multiboot data structure
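
The "1-1 mapping for first 1GB" step can be pictured with a small Python model. This is only a sketch, not the kernel's code: it assumes 2 MB large pages held in one 512-entry table, and the names build_identity_map and translate are invented for illustration. The slide itself states only that the first 1 GB is mapped 1-1.

```python
PAGE_SIZE_2M = 2 * 1024 * 1024   # assumption: 2 MB large pages
ONE_GB = 1024 * 1024 * 1024

# x86 page-table entry flag bits.
PTE_PRESENT = 0x001
PTE_WRITABLE = 0x002
PTE_LARGE = 0x080                # PS bit: the entry maps a large page directly


def build_identity_map(limit=ONE_GB, page_size=PAGE_SIZE_2M):
    """Return a table in which entry i maps VA i*page_size to the same PA."""
    flags = PTE_PRESENT | PTE_WRITABLE | PTE_LARGE
    return [(index * page_size) | flags for index in range(limit // page_size)]


def translate(table, va, page_size=PAGE_SIZE_2M):
    """Walk the toy table: find the entry, strip the flags, add the offset."""
    index = va // page_size
    if index >= len(table) or not table[index] & PTE_PRESENT:
        raise LookupError(f"VA {va:#x} is not mapped")
    frame = table[index] & ~(page_size - 1)
    return frame | (va % page_size)


if __name__ == "__main__":
    table = build_identity_map()
    print(len(table), "entries;", hex(translate(table, 0x5000)))
```

With an identity map like this, code running at a low address such as 0x5000 keeps working when paging is switched on, because every virtual address below 1 GB resolves to the same physical address.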

Technical Details/In Kernel Boot Loader

[Diagram: memory layout, high memory at top, low memory at bottom. The in-kernel boot loader reads the new kernel and new boot archive from the root dataset or disk into high memory, above the running kernel and boot archive. The Fast Reboot Switcher sits in low memory at 0x5000.]

Technical Details/Device Quiescent

Driver Structure: Bird's-Eye View


modlinkage -> modldrv -> mod_driverops (provided by Solaris)
  modlinkage is passed to mod_install() in _init() and mod_uninstall() in _fini()

dev_ops:
  attach, detach, probe, getinfo, reset, quiesce
  data: devo_cb_ops -> cb_ops

cb_ops:
  open, close, read, write, ioctl, strategy, dump, devmap, segmap, mmap, prop_op, aread, awrite

Some or all implementations are provided by the device driver.

Technical Details/Device Quiescent

quiesce(9E)
• Stop DMA
• Stop generating interrupts
• Leave device in such a state that it can be correctly configured by the driver's attach routine without being reconfigured by firmware
• Implementation must be lock-free
- Necessary to support panic reboot
- Obsolete reset(9E) is not lock-free and causes random hangs
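
Why lock-freedom matters on the panic path can be shown with a toy Python model. It is a sketch under stated assumptions, not driver code: a real quiesce(9E) entry point is a C function in the driver, and ToyDevice, legacy_reset and quiesce here are hypothetical names. A non-blocking lock attempt stands in for the hang a real kernel would hit.

```python
import threading


class ToyDevice:
    """Hypothetical device: two control bits guarded by a driver lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.dma_enabled = True
        self.interrupts_enabled = True


def legacy_reset(dev):
    """Lock-taking shutdown path. Returns False where a real kernel would hang."""
    if not dev.lock.acquire(blocking=False):
        return False   # after a panic the lock owner never runs again
    try:
        dev.dma_enabled = False
        dev.interrupts_enabled = False
        return True
    finally:
        dev.lock.release()


def quiesce(dev):
    """Lock-free path: clear the control bits directly and never wait."""
    dev.dma_enabled = False
    dev.interrupts_enabled = False
    return True


if __name__ == "__main__":
    dev = ToyDevice()
    dev.lock.acquire()   # simulate a panic while the driver lock is held
    print("legacy_reset:", legacy_reset(dev), "quiesce:", quiesce(dev))
```

In the model, the lock-taking path cannot make progress once the lock is held at panic time, while the lock-free path still stops DMA and interrupts.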

Technical Details/Fast Reboot Switcher

Fast Reboot Switcher

• Copy fastboot_info of new image to low memory
• Switch to 32-bit legacy mode
• Enable paging if PAE is set
• Copy kernel and boot archive from high memory to low memory, one page at a time
- Rewrite PTE for fake VA 0x80000000
- Reload CR3
- Copy one page from high memory to low memory
• Enter new unix via dboot
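
The page-at-a-time copy loop above can be modelled in a few lines of Python. This is a simulation only, assuming 4 KB pages and a dictionary standing in for physical memory; ToyMachine, copy_image and the destination frame number are hypothetical. Only the window address 0x80000000 and the three sub-steps come from the slide.

```python
PAGE_SIZE = 4096          # assumption: 4 KB pages
FAKE_VA = 0x80000000      # window address named on the slide


class ToyMachine:
    """Physical memory as a dict of page-frame number -> bytearray."""

    def __init__(self):
        self.phys = {}
        self.window_pfn = None   # the single PTE the switcher keeps rewriting
        self.cr3_reloads = 0

    def page(self, pfn):
        return self.phys.setdefault(pfn, bytearray(PAGE_SIZE))

    def map_window(self, pfn):
        """Rewrite the PTE behind FAKE_VA, then reload CR3 so the old
        translation for the window is flushed."""
        self.window_pfn = pfn
        self.cr3_reloads += 1

    def read_window(self):
        """Read one page through the window at FAKE_VA."""
        return bytes(self.page(self.window_pfn))


def copy_image(machine, source_pfns, dest_start_pfn):
    """Copy scattered source pages into contiguous destination pages."""
    for offset, src_pfn in enumerate(source_pfns):
        machine.map_window(src_pfn)
        machine.page(dest_start_pfn + offset)[:] = machine.read_window()
    return len(source_pfns)
```

Reusing one window means the switcher needs only a single spare virtual address, however scattered the source pages are in high memory.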

Technical Details/Fast Reboot Switcher

[Diagram: the Fast Reboot Switcher, in low memory, copies the new kernel and new boot archive from high memory down to low memory.]

Technical Details/How to debug fb_swtch()?

DTrace doesn't help, so leDTrace was invented

[Image: an example of a leD]

Demo

Fast Reboot Demo

Summary

Typical Reboot Time (x86)


Measured from “rebooting...” to the Solaris banner being logged in /var/adm/messages.

Platform  Code Name  # of CPUs  Memory (GB)  Regular Reboot  Fast Reboot
x4140     Dorado     8          64           02:10           :05
x4150     Doradi     8          64           01:58           :05
x4270     Lynx       16         74           03:02           :10
x4450     Tucani     16         128          02:08           :06
x4500     Thumper    4          16           02:12           :06
x4600M2   Galaxy 4F  32         256          02:00           :08
x6250     Wolf       4          8            02:25           :03
x6270     Virgo      16         8            01:37           :05
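
To put the table in terms of time saved, the arithmetic can be sketched in Python. The figures are copied from the table above; the helper names to_seconds and speedup are invented for this example, and ":05" is read as five seconds.

```python
# (regular reboot, fast reboot) as mm:ss strings, copied from the table.
TIMES = {
    "x4140": ("02:10", ":05"),
    "x4150": ("01:58", ":05"),
    "x4270": ("03:02", ":10"),
    "x4450": ("02:08", ":06"),
    "x4500": ("02:12", ":06"),
    "x4600M2": ("02:00", ":08"),
    "x6250": ("02:25", ":03"),
    "x6270": ("01:37", ":05"),
}


def to_seconds(mmss):
    """Convert 'mm:ss' (or ':ss') to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes or 0) * 60 + int(seconds)


def speedup(platform):
    """Ratio of regular reboot time to fast reboot time."""
    regular, fast = TIMES[platform]
    return to_seconds(regular) / to_seconds(fast)


if __name__ == "__main__":
    for name, (regular, fast) in TIMES.items():
        saved = to_seconds(regular) - to_seconds(fast)
        print(f"{name:8s} saves {saved:3d} s ({speedup(name):.1f}x faster)")
```

For example, the x4140 drops from 130 seconds to 5 seconds, a factor of 26.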

Advanced Features

Panic Fast Reboot

• Known good kernel loaded after multiuser
- Controlled by boot-config service
- Enabled by default
• If fast reboot capable, reboot to new kernel on panic
• To debug panics between vfs_mountroot() and multiuser
- Set fastreboot_onpanic=1 in /etc/system
• What happens if the panic CPU is not the boot CPU?
• When can't it work?

Lives Made Easier by Fast Reboot
• Debugging early boot problems
• Anonymous Tracing
• Debugging non-Unloadable Modules
• Replacing/Upgrading Modules
• Resetting Software States
- Changing networking settings
• Clearing Software Problems
- Clearing hosed networking stacks
- Clearing nfsv4 states
• Performance Tuning
- Reduce time spent from banner to login prompt
• Work around BIOS and GRUB bugs

Additional Information
• Documentation
- System Administration Guide
  http://dlc.sun.com/osol/docs/content/SYSADV1/ghsbc.html
- What's New
  http://www.opensolaris.com/learn/features/whatsnew/200811
- PSARC cases
  http://opensolaris.org/os/community/arc/caselog/2008/382
  http://opensolaris.org/os/community/arc/caselog/2008/760
  http://opensolaris.org/os/community/arc/caselog/2009/091
  http://opensolaris.org/os/community/arc/caselog/2009/092
• Blogs
- http://blogs.sun.com/sherrym
