Ticket #10999 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

XO-1.75 A3 idt1338 chip loses oscillator over laptop power cycle

Reported by: Quozl
Owned by: wad
Priority: normal
Milestone: 1.75-hardware
Component: hardware
Version: 1.75-A2
Keywords:
Cc: saadia
Action Needed: test in build
Verified: no
Deployments affected:
Blocked By:
Blocking: #10914

Description

Method:

  • start OpenFirmware and stop at ok prompt,
  • query the oscillator stop flag of the idt1338 chip,
    dev /rtc 7 rtc@ 20 and .
    
    and expect result 20, which means the bit is set,
  • clear the bit,
    7 rtc@ 20 xor 7 rtc!
    
  • check the bit,
    7 rtc@ 20 and .
    
    and expect result 0, which means the bit is cleared,
  • reboot with
    reboot
    
    and again obtain the ok prompt,
  • check the bit,
    dev /rtc 7 rtc@ 20 and .
    
    and expect result 0, which means the RTC oscillator was not stopped by the reboot, a nominal finding,
  • power off with
    power-off
    
    then use the power button to turn the laptop back on,
  • check the bit,
    dev /rtc 7 rtc@ 20 and .
    

Expected result: 0.

Observed result: 20, an abnormal finding indicating that the RTC oscillator was stopped by the power cycle.
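
The bit tests in the method can be modelled away from the firmware. This is a minimal Python sketch, assuming only what the steps above show: register 7 is read as one byte and mask 20 (hex) is the oscillator stop flag. The names OSF_MASK, osf_is_set, clear_osf and toggle_osf are hypothetical, and the register read and write themselves are not modelled.

```python
OSF_MASK = 0x20  # the "20" (hex) used in the steps above


def osf_is_set(reg7: int) -> bool:
    """Mirror of `7 rtc@ 20 and .`: a non-zero result means the flag is set."""
    return (reg7 & OSF_MASK) != 0


def clear_osf(reg7: int) -> int:
    """Clear the flag with AND-NOT, leaving the other seven bits untouched."""
    return reg7 & ~OSF_MASK & 0xFF


def toggle_osf(reg7: int) -> int:
    """Mirror of `20 xor`: this clears the flag only if it was already set."""
    return reg7 ^ OSF_MASK
```

The xor used in the method toggles the bit, so it clears the flag only when the preceding check printed 20. The AND-NOT form gives the same result in that case and is also safe to repeat.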

Attachments

i2c_log (224 bytes) - added by wad 3 years ago.
log of I2C accesses made to the RTC chip

Change History

Changed 3 years ago by Quozl

Does not reproduce on XO-1.75 A2 unit 39.

Changed 3 years ago by saadia

  • cc saadia added

Changed 3 years ago by saadia

In set-time, "100 /mod" puts 0 on the stack but leaves the previous six numbers (s m h d m y). The next instruction does "8 bcd!". So two things don't make sense. First, it writes 0 to byte 8, when byte 8 doesn't need to be written to. Second, when you write to byte 0, maybe you need to make sure the top bit (CH) is not 1 for some reason. Yesterday I saw very large values (1ff, 578) being read from byte 0 in the seconds value.
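
To make the concern about byte 0 concrete, here is a rough Python sketch of packing a seconds value as BCD while keeping the top bit clear, plus a sanity check for values read back. It assumes, following the comment above, that the top bit of byte 0 is CH. The helper names are hypothetical, and this is not the firmware's set-time.

```python
CH_MASK = 0x80  # top bit of byte 0, called CH in the comment above


def to_bcd(n: int) -> int:
    """Pack a value 0..99 as two BCD digits in one byte."""
    if not 0 <= n <= 99:
        raise ValueError("a BCD byte holds 0..99")
    tens, ones = divmod(n, 10)
    return (tens << 4) | ones


def from_bcd(b: int) -> int:
    """Unpack two BCD digits from one byte."""
    return ((b >> 4) & 0x0F) * 10 + (b & 0x0F)


def seconds_byte(seconds: int) -> int:
    """BCD seconds for byte 0 with the CH bit forced to 0."""
    if not 0 <= seconds <= 59:
        raise ValueError("seconds must be 0..59")
    return to_bcd(seconds) & ~CH_MASK & 0xFF


def looks_valid_seconds(raw: int) -> bool:
    """True if raw fits in one byte and decodes to 0..59 once CH is masked."""
    if not 0 <= raw <= 0xFF:
        return False
    value = raw & ~CH_MASK & 0xFF
    tens, ones = value >> 4, value & 0x0F
    return tens <= 5 and ones <= 9
```

Under these assumptions, readings such as 1ff or 578 (hex) fail the check simply because they do not fit in a single byte.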

Changed 3 years ago by wad

You don't have to power-off for this to happen.

Clear the bit, and then wait for a while. It will be reset.

A first check of Vbat shows that it doesn't glitch or droop during a power cycle, or when this bit sets itself. The supposition is that this is actually due to noise impacting the RTC oscillator itself.

Changed 3 years ago by Quozl

(Cache flush ... Symptom reproduced by Mitch on an A3, and by Saadia on an A3, and by John. Saadia's question resolved in IRC; byte 8 is first non-RTC RAM location of RTC chip. James confirmed on conference call that RTC value is being cleared at the same time as Oscillator Stop Flag is being set.)

Changed 3 years ago by wad

  • owner set to wad
  • status changed from new to assigned
  • next_action changed from never set to diagnose
  • component changed from not assigned to hardware
  • version changed from not specified to 1.75-A2

A look on the hardware side is at: http://dev.laptop.org/~wad/10999/

No glaring problems, just some abnormalities...

Changed 3 years ago by wad

Changed the crystal layout, and cleaned up the high frequency noise on the crystal input. It had no effect on this problem.

Changed 3 years ago by wad

Interesting result --- wiring up Vdd to a voltage divider tied to +3.3VSUS fixed the problem. On its own this doesn't sound interesting, as +3.3VSUS is not turned off when the laptop is powered off, but removing all power from the motherboard (forcing the IDT1338 to switch to the Vbat input) didn't trigger the problem either.

I believe the problem is due to the power-off waveform of Vdd. I would note that the registers of the IDT1338 always report the amount of time since the last power-off!

Changed 3 years ago by wad

Using +1.8V_PMIC (which has much nicer rise and fall than +1.8V_GPIO) didn't have any effect --- it still loses the time when the laptop is powered off.

Double-checked again with Vdd powered from a voltage divider, and it doesn't lose the time.

Changed 3 years ago by wad

Attachment i2c_log added: log of I2C accesses made to the RTC chip

Changed 3 years ago by Quozl

Turning off square wave output mode of the chip reduced probability of failure.

(Tested by disabling the square wave output mode in the chip, using " dev /rtc 0 7 rtc! " ... this made it possible for a short power cycle " reboot " in OpenFirmware to preserve the time, but a long power cycle in Linux or with the power button still lost the time. The time shown by the RTC registers corresponded to the wall clock time since power off, not the wall clock time since power on. This shows the stored value is lost when the power falls.)

Changed 3 years ago by wad

When the RTC is powered from a divider from +3.3VSUS (and maintains the time), Vdd falls more slowly during power-off --- about 0.5 V/ms as opposed to 1 V/ms. The glitch on the oscillator is correspondingly smaller.

This is shown by the trace at the bottom of: http://dev.laptop.org/~wad/10999/

Changed 3 years ago by wad

And bingo, the data sheet implies that the Vdd fall rate on power off must be less than 1 V/ms when passing through the power fail region (1.7 to 1.4 V), via the little-noticed tVCCF parameter.

I'll propose a quick fix and test it out tomorrow.
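
The relation between the fall rates quoted in this thread and the time spent in the power fail region is simple arithmetic. Here it is as a small Python sketch, assuming a linear ramp through the 1.7 to 1.4 V window. The function name and constants are hypothetical labels, not data sheet symbols.

```python
V_PF_HIGH = 1.7  # top of the power fail region, in volts (from the comment above)
V_PF_LOW = 1.4   # bottom of the power fail region, in volts


def fall_time_ms(slew_v_per_ms: float) -> float:
    """Time in ms to ramp linearly through the power fail region."""
    if slew_v_per_ms <= 0:
        raise ValueError("slew rate must be positive")
    return (V_PF_HIGH - V_PF_LOW) / slew_v_per_ms


fast = fall_time_ms(1.0)  # the failing case: about 0.3 ms
slow = fall_time_ms(0.5)  # the divider-fed case: about 0.6 ms
```

Under that linear assumption, the 1 V/ms limit corresponds to about 0.3 ms in the window, and the divider-fed supply at about 0.5 V/ms spends roughly twice as long there.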

Changed 3 years ago by Quozl

Yes, I just saw that too and was about to draw your attention to it.

Also noticed in your most recent trace that the divided supply voltage was below the minimum specified, which is V(pf)(max), so wondered if the device had already transitioned to battery.

Changed 3 years ago by dsd

  • milestone changed from Not Triaged to 1.75-hardware

Changed 3 years ago by wad

  • next_action changed from diagnose to test in build

The data sheet is off by a little bit, as the minimum fall time actually needed is greater than the 300 us it implies. But using a fall time of 760 us works. Tested on two laptops so far.

Waveforms are at: http://dev.laptop.org/~wad/10999/

The ECO is to replace R29 with a 330 ohm resistor, and place a 10 uF, 6.3V cap in parallel with C227.
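
As a rough cross-check of the ECO values, the sketch below estimates how long a bare RC discharge takes to cross the 1.7 to 1.4 V power fail region. It rests on loud assumptions that the ticket does not confirm: that R29 and the added capacitor form a single-pole RC, that the node decays exponentially toward 0 V, and that the existing value of C227 (not given here) can be ignored. The function name is hypothetical.

```python
import math


def rc_crossing_time_s(r_ohm: float, c_farad: float,
                       v_start: float, v_end: float) -> float:
    """Time for an RC discharge toward 0 V to fall from v_start to v_end.

    Assumes v(t) = v0 * exp(-t / (R * C)), so the crossing time is
    R * C * ln(v_start / v_end).
    """
    if not (v_start > v_end > 0):
        raise ValueError("need v_start > v_end > 0")
    return r_ohm * c_farad * math.log(v_start / v_end)


# ECO values from the comment above; C227 itself is ignored (assumption).
t = rc_crossing_time_s(330.0, 10e-6, 1.7, 1.4)
```

With these inputs the estimate comes out at roughly 0.64 ms, the same order as the 760 us that was measured to work. Given the assumptions, it is an order-of-magnitude check, not a model of the actual board.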

Changed 3 years ago by Quozl

  • blocking 10914 added

(In #10914) Adding B1 exit criteria as at 2011-08-25.

Changed 3 years ago by Quozl

  • status changed from assigned to closed
  • resolution set to fixed

This was fixed.

Changed 3 years ago by wad

Just a final note. When tested over a large sample of devices in B1 runin, it was determined that a 2 ms delay was actually required for some devices. The part datasheet was re-issued with this correction.

Instead of using a simple RC delay (which either requires a large R, lowering VCC at the part, or a large C, which is expensive), a switch and a capacitor are used in C1 and beyond. This easily provides turn-off times of 4 ms and cheaply fixes the problem.
