Ticket #12150 (closed defect: fixed)

Opened 22 months ago

Last modified 19 months ago

First-boot resize incomplete on XO-4 B1

Reported by: martin.langhoff Owned by: dsd
Priority: high Milestone: 13.1.0
Component: initramfs Version: not specified
Keywords: Cc: rsmith
Action Needed: no action Verified: no
Deployments affected: Blocked By:
Blocking:

Description

13.1.0 build 4: the bug seems to have a 100% hit rate over a small number of tries and units.

Partition resize succeeds, but rereading the partition fails, so no ext fs resize happens during first boot, nor in later boots.

As a workaround for developers, running resize2fs on the partition on the 2nd boot (or any subsequent boot) works.
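A minimal sketch of that workaround, for anyone scripting it. The device path /dev/mmcblk0p2 is an assumption based on the root filesystem seen in the dmesg excerpts on this ticket, and the helper names are made up for illustration. When given no explicit size, resize2fs grows the filesystem to fill its partition.

```python
import subprocess

# Assumption: the root filesystem is on /dev/mmcblk0p2 (taken from the dmesg
# excerpts on this ticket); adjust for the unit at hand.
ROOT_PARTITION = "/dev/mmcblk0p2"


def resize2fs_command(partition=ROOT_PARTITION):
    """Return the argv that grows an ext filesystem to fill its partition."""
    # No size argument: resize2fs defaults to the size of the partition.
    return ["resize2fs", partition]


def grow_root_filesystem(partition=ROOT_PARTITION, dry_run=True):
    """Run resize2fs, or only return the command when dry_run is set."""
    cmd = resize2fs_command(partition)
    if not dry_run:
        # Needs root; raises CalledProcessError if resize2fs fails.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling grow_root_filesystem(dry_run=False) as root on the second or a later boot would perform the resize.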

Attachments

first-dmesg (23.0 kB) - added by martin.langhoff 22 months ago.
dmesg output after first boot
second-dmsg (18.0 kB) - added by martin.langhoff 22 months ago.
dmesg output after second boot

Change History

  Changed 22 months ago by martin.langhoff

Relevant section of first boot dmesg

Successfully wrote the new partition table

Re-reading the partition table ...
[    5.213961] BLKRRPART: Device or resource busy
[    5.223694] sfdisk: The command to re-read the partition table failed.
Run partprobe(8), kpartx(8) or reboot your system now,
before using mkfs

[    5.259083] If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
[    5.293988] sfdisk returned 1
[    5.458798] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)

Later boots always show

[    4.806024] EXT4-fs (mmcblk0p1): mounted filesystem without journal. Opts: (null)
[    4.848336] Dsize 15515648 Psize 15376384 Pstart 139264 Pend 15515648
[    5.404058] dcon_freeze_store: 0
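The geometry line from later boots can be checked by hand: 139264 + 15376384 = 15515648, so Pstart + Psize equals both Pend and Dsize. Reading Dsize as the disk size and Psize/Pstart/Pend as the partition size, start and end is an assumption from the field names, but under that reading the partition already spans the whole disk, which matches the report that the partition resize itself succeeded. A small sketch of the same check (function names are hypothetical):

```python
import re

# Matches the geometry line printed during later boots.
GEOMETRY_RE = re.compile(r"Dsize (\d+) Psize (\d+) Pstart (\d+) Pend (\d+)")


def parse_geometry(line):
    """Extract the four geometry numbers from a dmesg line."""
    match = GEOMETRY_RE.search(line)
    if match is None:
        raise ValueError("no geometry fields found in line")
    dsize, psize, pstart, pend = (int(value) for value in match.groups())
    return {"dsize": dsize, "psize": psize, "pstart": pstart, "pend": pend}


def partition_fills_disk(geom):
    """True when the partition ends exactly at the end of the disk.

    Assumes Dsize is the disk size and Pend is the partition end,
    both in the same units.
    """
    return geom["pstart"] + geom["psize"] == geom["pend"] == geom["dsize"]


line = "[    4.848336] Dsize 15515648 Psize 15376384 Pstart 139264 Pend 15515648"
print(partition_fills_disk(parse_geometry(line)))  # True
```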

  Changed 22 months ago by martin.langhoff

Installed OS5 on two XO-4 units: one showed this bug, the other did not.

  Changed 21 months ago by erikos

  • next_action changed from never set to diagnose

I just ran into this again on my XO-4: I had 1.7G used and 116M available for '/'. Flashed again, same result. Flashed another time to make sure, and now I have 1.7G used and 5.6G available on '/'.

Is this a device dependent issue? Or just something we see from time to time on any XO-4 device?

  Changed 21 months ago by dsd

It just happens from time to time, according to other reporters. I haven't seen it yet, even when I ran it in a loop. But I have some more things to try.

  Changed 21 months ago by rsmith

Replying to dsd:

It just happens from time to time, according to other reporters. I haven't seen it yet, even when I ran it in a loop. But I have some more things to try.

Seems it's gotten worse. In the os10 runin tests that I ran this weekend, 50% of the laptops failed runin due to running out of disk space. Perhaps a first boot right into runin makes it more likely to happen?

C1 Assembly/Runin is currently scheduled for 11/19 which gives us this week to sort this out before we have a very unhappy runin experience.

  Changed 21 months ago by dsd

  • cc rsmith added
  • next_action changed from diagnose to add to build

Thanks for the reminder.

The problem is described here: http://lists.freedesktop.org/archives/systemd-devel/2012-November/007375.html

The solution is to use "sfdisk --no-reread". This skips the first BLKRRPART, which sfdisk issues only to check whether anyone is using the device before making changes, and so avoids the interfering udev device open described in that thread. Conveniently, the final BLKRRPART, which asks the kernel to re-read the partition table after the changes have been made, still happens.

I tested this in a loop that was previously failing - it doesn't fail any more.

Fixed in dracut-modules-olpc commit 8fd211913e1f15606dba89a30ead6413aee9f59c
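For illustration only, a sketch of how the flag changes the sfdisk invocation. This is not the code from the commit; the helper names and the way the partition description is passed in are assumptions, and only the "--no-reread" option itself comes from this ticket.

```python
import subprocess


def sfdisk_command(device, no_reread=True):
    """Build the sfdisk argv, optionally skipping the initial in-use check."""
    cmd = ["sfdisk"]
    if no_reread:
        # Skip the first BLKRRPART (the "is anyone using this device?" check).
        cmd.append("--no-reread")
    cmd.append(device)
    return cmd


def rewrite_partition_table(device, script, no_reread=True):
    """Feed a partition description to sfdisk on stdin and return its exit code.

    `script` is whatever sfdisk input the caller has prepared; its contents
    are outside the scope of this sketch.
    """
    result = subprocess.run(sfdisk_command(device, no_reread), input=script, text=True)
    return result.returncode
```

A non-zero return code here would correspond to the "sfdisk returned 1" line in the first-boot dmesg.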

  Changed 21 months ago by dsd

  • next_action changed from add to build to test in build

Test in 13.1.0 build 12

  Changed 19 months ago by greenfeld

  • status changed from new to closed
  • next_action changed from test in build to no action
  • resolution set to fixed

I do not think we have seen this in a while.

Closing on XO-4 with 13.1.0 os20.
