Ticket #9564 (closed enhancement: fixed)

Opened 4 years ago

Last modified 23 months ago

XO-1: RTC anti-rollback

Reported by: wmb@…
Owned by: martin.langhoff
Priority: normal
Milestone: 1-firmware-security
Component: ofw - open firmware
Version: 1.0 Hardware
Keywords:
Cc: martin@…, richard@…
Action Needed: test in release
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description

The idea is to record boot timestamps in SPI FLASH to guard against clock-rollback attacks on the XO security.

It could be done without FLASH wearout by using several thousand locations in the mfg data page, advancing to the next location on each boot. Erasure would be very infrequent. For example, if 32K were used, with 4-byte-plus-parity-byte timestamps (5 bytes per record), that would be about 6.5K reboots before an erase/rewrite is needed. That's about 3.6 reboots per day, every day, for 5 years.
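
The estimate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the constant names are made up for illustration, and it assumes "32K" means 32 * 1024 bytes with records packed back to back.

```python
# Back-of-the-envelope check of the wear estimate (names are illustrative).
REGION_BYTES = 32 * 1024   # assumed meaning of "32K"
RECORD_BYTES = 4 + 1       # 4-byte timestamp plus one parity byte

# Number of boots that fit before the region must be erased and rewritten.
slots = REGION_BYTES // RECORD_BYTES

# Boots per day if those slots are spread evenly over five years.
boots_per_day = slots / (5 * 365)

print(slots)                    # 6553
print(round(boots_per_day, 2))  # 3.59
```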

The current idea is for OFW to convert the RTC date and time to a Unix-style seconds timestamp and write it to the next available location in the mfg data page of SPI FLASH. This would happen in the OFW secure startup sequence before disabling indexed IO. A new EC feature (already prototyped) permits writing to SPI FLASH without having to reboot.
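
As a rough model of that step (in Python rather than Forth, and not the OFW implementation), the sketch below converts RTC fields to Unix seconds and appends a record at the first erased slot. Everything about the layout is an assumption made for illustration: little-endian 4-byte timestamps, an XOR parity byte, erased flash reading as 0xFF, and the RTC holding UTC. The function names are hypothetical.

```python
import calendar
import struct

RECORD_BYTES = 5                    # 4-byte timestamp + 1 parity byte
ERASED = b"\xff" * RECORD_BYTES     # assumption: erased flash reads as 0xFF

def rtc_to_unix(year, month, day, hour, minute, second):
    """Convert RTC calendar fields (assumed to be UTC) to Unix seconds."""
    return calendar.timegm((year, month, day, hour, minute, second))

def xor_parity(payload):
    """Assumed parity scheme: XOR of the payload bytes."""
    parity = 0
    for byte in payload:
        parity ^= byte
    return parity

def encode_record(timestamp):
    """Pack a timestamp as 4 little-endian bytes followed by its parity byte."""
    payload = struct.pack("<I", timestamp)
    return payload + bytes([xor_parity(payload)])

def next_free_offset(region):
    """Return the offset of the first still-erased slot, or None if full."""
    for off in range(0, len(region) - RECORD_BYTES + 1, RECORD_BYTES):
        if bytes(region[off:off + RECORD_BYTES]) == ERASED:
            return off
    return None

def append_timestamp(region, timestamp):
    """Write one record into the next free slot; False means erase is needed."""
    off = next_free_offset(region)
    if off is None:
        return False
    region[off:off + RECORD_BYTES] = encode_record(timestamp)
    return True

# Usage: a tiny 3-slot "flash" region held in a bytearray.
flash = bytearray(b"\xff" * 15)
append_timestamp(flash, rtc_to_unix(2010, 1, 1, 0, 0, 0))
```

With XOR parity, an erased slot (all 0xFF) never passes the parity check, so a scan can tell unused slots from valid records without a separate marker.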

OFW will make the latest timestamp available to the OS via a property in the device tree - details TBD.

OFW will only write increasing timestamps. If the RTC time is less than the last valid (good parity) timestamp, OFW will not write a new timestamp, and the fact that the RTC is too early will be exported to the OS via another device tree property - but the OS will be booted anyway in order to permit the initrd to fix the RTC.
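
The decision described above could look roughly like the following sketch. It is illustrative only: the record layout (little-endian 4-byte timestamp plus XOR parity byte), the function names, and the dictionary keys standing in for device tree properties are all assumptions, since the property details are still TBD. The case where the RTC equals the last timestamp is not covered by the description; here it is treated as neither a rollback nor a reason to write.

```python
import struct

RECORD_BYTES = 5  # assumed layout: 4-byte timestamp + 1 parity byte

def xor_parity(payload):
    """Assumed parity scheme: XOR of the payload bytes."""
    parity = 0
    for byte in payload:
        parity ^= byte
    return parity

def last_valid_timestamp(region):
    """Return the last record (by position) with good parity, or None."""
    latest = None
    for off in range(0, len(region) - RECORD_BYTES + 1, RECORD_BYTES):
        record = bytes(region[off:off + RECORD_BYTES])
        if xor_parity(record[:4]) == record[4]:
            latest = struct.unpack("<I", record[:4])[0]
    return latest

def check_rtc(region, rtc_now):
    """Decide whether to log rtc_now and what to report to the OS."""
    last = last_valid_timestamp(region)
    rolled_back = last is not None and rtc_now < last
    return {
        "last_timestamp": last,                            # exported to the OS
        "rtc_rollback": rolled_back,                       # RTC is too early
        "write_new": last is None or rtc_now > last,       # increasing only
    }

# Usage: one valid record (1000), one corrupt record, one erased slot.
good = struct.pack("<I", 1000)
region = bytearray(good + bytes([xor_parity(good)]))
region += b"\x01\x02\x03\x04\xaa"   # bad parity, ignored by the scan
region += b"\xff" * RECORD_BYTES    # erased slot, also fails parity
result = check_rtc(region, 900)     # RTC is behind the last good record
```

In this model the OS is booted regardless of the outcome; the flags only tell the initrd that the RTC needs fixing.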

Change History

  Changed 4 years ago by wmb@…

  • status changed from new to assigned
  • next_action changed from never set to code
  • summary changed from RTC anti-rollback to XO-1: RTC anti-rollback

Added XO-1 to summary line for ease of sorting. Also applies to XO-1.5, but the immediate need is for XO-1 machines.

  Changed 4 years ago by mikus

How about a clock-rollforward attack -- suppose I see your XO sitting there, and surreptitiously, as root, change the time (and RTC) to say it is now the year 2015 - then reboot. Would your XO from then on keep showing the wrong date?

  Changed 4 years ago by wmb@…

Replying to mikus:

> How about a clock-rollforward attack -- suppose I see your XO sitting there, and surreptitiously, as root, change the time (and RTC) to say it is now the year 2015 - then reboot. Would your XO from then on keep showing the wrong date?

That is indeed a risk. Locks can be used against the owner.

The feature will have to be enabled with a special manufacturing data tag - I forgot to mention that in the description above.

  Changed 4 years ago by martin.langhoff

Good. What will the workflow be to actually roll back the clock?

In deployments that activate this, we'll start hearing whenever our clocks run fast, and about every NTP bug ;-)

  Changed 4 years ago by Quozl

  • version changed from not specified to 1.0 Hardware
  • milestone changed from Not Triaged to Future Release

triage.

  Changed 4 years ago by wmb@…

  • milestone changed from Future Release to 1.0-firmware-security

  Changed 4 years ago by martin.langhoff

Mitch,

there are a couple of changes I am hoping to get into the plan. Basically, a recovery mechanism for clock-forward situations. Will post it ASAP.

  Changed 4 years ago by martin.langhoff

  Changed 4 years ago by wmb@…

This has all the earmarks of a feature that will be the bane of my existence for years to come. A can of steroid-enhanced worms. Sigh.

  Changed 4 years ago by martin.langhoff

I gather the proposed approach ain't your favourite... alternatives...?

  Changed 4 years ago by wmb@…

There is nothing particularly wrong with the approach. It's just that the overall complexity is likely to result in a lot of corner cases, some of which may not present until much later. I'm starting to get the sinking feeling I had when working through all the possible error cases for the keyjector - and see http://dev.laptop.org/ticket/10022 for the latest chapter in that saga.

  Changed 4 years ago by martin.langhoff

I had assumed that the plan you outlined for saving the last-known-good RTC would be atomic and resilient in the face of power loss (which is the core issue at #10022). Resiliency is a major issue for a value that will be updated on every boot.

Other than that, I have been studying the possible corner cases for the OFW side and I cannot find any. We have exactly 3 variables:

  • RTC
  • Last-known-good RTC in SPI flash
  • /security/rtc-reset

So the main logic is

  • if LKGRTC == rtc-reset and rtc-reset is signed; then LKGRTC = RTC = rtc-reset; fi
  • if LKGRTC < RTC; then LKGRTC = RTC; fi
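
For readers who prefer code, here is a literal transcription of the two rules above as a Python sketch. It is not the OFW implementation: signature verification of /security/rtc-reset is reduced to a boolean argument, the function and parameter names are invented for illustration, and all values are plain Unix-style seconds.

```python
def reconcile(rtc, lkgrtc, rtc_reset=None, reset_signed=False):
    """Apply the two rules in order and return the new (rtc, lkgrtc) pair.

    rtc          -- current RTC reading
    lkgrtc       -- last-known-good RTC stored in SPI flash
    rtc_reset    -- value from /security/rtc-reset, or None if absent
    reset_signed -- stand-in for "rtc-reset carries a valid signature"
    """
    # Rule 1, as written: a signed rtc-reset matching LKGRTC resets both.
    if rtc_reset is not None and reset_signed and lkgrtc == rtc_reset:
        lkgrtc = rtc = rtc_reset
    # Rule 2: the stored value only ever moves forward.
    if lkgrtc < rtc:
        lkgrtc = rtc
    return rtc, lkgrtc

# Usage examples.
normal_boot = reconcile(rtc=100, lkgrtc=50)    # clock advanced: LKGRTC follows
rolled_back = reconcile(rtc=40, lkgrtc=50)     # clock behind: nothing changes
forced_reset = reconcile(rtc=500, lkgrtc=200, rtc_reset=200, reset_signed=True)
```

The sketch only models the decision logic; making the resulting flash write survive power loss is a separate problem.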

The real bastard is to make those writes to the SPI Flash in a failsafe way. On that track, I am 100% clueless about practical atomicity / checksum / validation strategies for dealing with low-level HW.

  Changed 3 years ago by martin.langhoff

Mitch points out that fixing #4397 would save us from a double-reboot in warm-boot sequences (and perhaps cold-boot sequences too?).

  Changed 3 years ago by wmb@…

  • owner changed from wmb@… to martin.langhoff
  • status changed from assigned to new
  • next_action changed from code to test in release

This feature has been released in some test versions and even in some production builds. Reassigning to Martin for final testing.

  Changed 23 months ago by wmb@…

  • status changed from new to closed
  • resolution set to fixed

Having heard no complaints for a year, I'm closing this ticket.
