Opened 2 years ago

Closed 22 months ago

#12150 closed defect (fixed)

First-boot resize incomplete on XO-4 B1

Reported by: martin.langhoff
Owned by: dsd
Priority: high
Milestone: 13.1.0
Component: initramfs
Version: not specified
Keywords:
Cc: rsmith
Blocked By:
Blocking:
Deployments affected:
Action Needed: no action
Verified: no

Description

Seen on 13.1.0 build 4. It seems to have a 100% hit rate, over a small number of tries and units.

The partition resize succeeds, but re-reading the partition table fails, so no ext filesystem resize happens during first boot, nor in later boots.

As a workaround for developers, running resize2fs on the partition on the 2nd boot (or any subsequent boot) works.
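
A minimal sketch of that workaround, for anyone scripting it on a test unit. The partition path /dev/mmcblk0p2 is an assumption taken from the EXT4-fs line in the first-boot dmesg, not something the ticket states as the root device; the helper names are hypothetical. Run with no size argument, resize2fs grows the filesystem to fill its partition, and it needs root.

```python
import subprocess

# Assumption: the root filesystem lives on this partition (inferred from
# the "EXT4-fs (mmcblk0p2)" line in the first-boot dmesg).
ROOT_PARTITION = "/dev/mmcblk0p2"


def resize2fs_command(partition=ROOT_PARTITION):
    """Build the resize2fs invocation; no size argument means 'fill the partition'."""
    return ["resize2fs", partition]


def grow_filesystem(partition=ROOT_PARTITION, run=subprocess.run):
    """Run resize2fs on the partition, raising if it exits non-zero."""
    return run(resize2fs_command(partition), check=True)
```

Call grow_filesystem() as root on the second or any later boot. The run parameter exists only so the command can be inspected without touching a real device.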

Attachments (2)

first-dmesg (23.0 KB) - added by martin.langhoff 2 years ago.
dmesg output after first boot
second-dmsg (18.0 KB) - added by martin.langhoff 2 years ago.
dmesg output after second boot


Change History (10)

Changed 2 years ago by martin.langhoff

dmesg output after first boot

Changed 2 years ago by martin.langhoff

dmesg output after second boot

comment:1 Changed 2 years ago by martin.langhoff

Relevant section of first boot dmesg

Successfully wrote the new partition table

Re-reading the partition table ...
[    5.213961] BLKRRPART: Device or resource busy
[    5.223694] sfdisk: The command to re-read the partition table failed.
Run partprobe(8), kpartx(8) or reboot your system now,
before using mkfs

[    5.259083] If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
[    5.293988] sfdisk returned 1
[    5.458798] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
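
To check a batch of saved first-boot logs for this failure, something like the following could be used. It is only a sketch: the marker strings are copied from the excerpt above, and the function name is hypothetical.

```python
import re

# Signatures copied from the first-boot dmesg excerpt in this ticket.
FAILURE_MARKERS = (
    "BLKRRPART: Device or resource busy",
    "sfdisk: The command to re-read the partition table failed.",
)
SFDISK_EXIT = re.compile(r"sfdisk returned (\d+)")


def reread_failed(dmesg_text):
    """Return True if the log shows the partition table re-read failing."""
    if any(marker in dmesg_text for marker in FAILURE_MARKERS):
        return True
    # Fall back to the exit status line printed after sfdisk runs.
    match = SFDISK_EXIT.search(dmesg_text)
    return bool(match) and int(match.group(1)) != 0
```

For example, reread_failed(open("first-dmesg").read()) should be true for the attached first-boot log if it contains the lines quoted above.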

Later boots always show

[    4.806024] EXT4-fs (mmcblk0p1): mounted filesystem without journal. Opts: (null)
[    4.848336] Dsize 15515648 Psize 15376384 Pstart 139264 Pend 15515648
[    5.404058] dcon_freeze_store: 0
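
The Dsize line can be checked arithmetically. The sketch below assumes, from the field names only, that Pstart and Psize are the partition start and size, Pend its end, and Dsize the disk size, all in the same unit (the ticket does not say which). Under that reading the partition already ends at the end of the disk on later boots, which would be consistent with the partition resize having succeeded and only the filesystem resize being missing.

```python
import re

FIELD = re.compile(r"\b(Dsize|Psize|Pstart|Pend) (\d+)")


def parse_layout(line):
    """Extract the four size fields from a 'Dsize ... Pend ...' log line."""
    return {name: int(value) for name, value in FIELD.findall(line)}


def partition_fills_disk(layout):
    """True if the partition's computed end equals both Pend and Dsize."""
    end = layout["Pstart"] + layout["Psize"]
    return end == layout["Pend"] == layout["Dsize"]


line = "[    4.848336] Dsize 15515648 Psize 15376384 Pstart 139264 Pend 15515648"
layout = parse_layout(line)
print(partition_fills_disk(layout))  # 139264 + 15376384 == 15515648, so True
```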

comment:2 Changed 2 years ago by martin.langhoff

Installed OS5 on two XO-4 units, one showed this bug, the other one did not.

comment:3 Changed 2 years ago by erikos

  • Action Needed changed from never set to diagnose

I just ran into this again on my XO-4: I had 1.7G used and 116M available for '/'. Flashed again, same result. Flashed another time to make sure, and now I have 1.7G used and 5.6G available on '/'.

Is this a device-dependent issue? Or just something we see from time to time on any XO-4 device?

comment:4 follow-up: Changed 2 years ago by dsd

It's just from time to time, according to other reporters. I haven't seen it yet, even when I ran it in a loop. But I have some more things to try.

comment:5 in reply to: ↑ 4 Changed 2 years ago by rsmith

Replying to dsd:

> It's just from time to time, according to other reporters. I haven't seen it yet, even when I ran it in a loop. But I have some more things to try.

Seems it's gotten worse. In the os10 runin tests that I ran this weekend, 50% of the laptops failed runin due to running out of disk space. Perhaps a first boot right into runin makes it more likely to happen?

C1 Assembly/Runin is currently scheduled for 11/19 which gives us this week to sort this out before we have a very unhappy runin experience.

comment:6 Changed 2 years ago by dsd

  • Action Needed changed from diagnose to add to build
  • Cc rsmith added

Thanks for the reminder.

The problem is described here: http://lists.freedesktop.org/archives/systemd-devel/2012-November/007375.html

The solution is to use "sfdisk --no-reread". That means that the first BLKRRPART is skipped (not checking if anyone is using the device before making changes), avoiding the interfering udev device open described above. The final BLKRRPART (to ask the kernel to re-read the partition table after the changes have been made) still happens, conveniently.

I tested this in a loop that was previously failing - it doesn't fail any more.

Fixed in dracut-modules-olpc commit 8fd211913e1f15606dba89a30ead6413aee9f59c
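
For illustration only, the change amounts to adding one flag to the sfdisk invocation. The sketch below is not the code from the commit above: the helper names are hypothetical, the device path /dev/mmcblk0 is inferred from the dmesg excerpts, and the partition description fed to sfdisk on stdin is left to the caller because the ticket does not show it.

```python
import subprocess


def sfdisk_command(device, no_reread=True):
    """Build the sfdisk argv, optionally skipping the initial re-read check."""
    cmd = ["sfdisk"]
    if no_reread:
        # Per the comment above: skips the first BLKRRPART (the in-use check).
        cmd.append("--no-reread")
    cmd.append(device)
    return cmd


def repartition(device, layout_script, run=subprocess.run):
    """Feed a caller-supplied partition description to sfdisk on stdin."""
    return run(sfdisk_command(device), input=layout_script, text=True, check=True)
```

With check=True a non-zero exit from sfdisk (such as the "sfdisk returned 1" seen in the first-boot log) raises an exception instead of being silently ignored.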

comment:7 Changed 2 years ago by dsd

  • Action Needed changed from add to build to test in build

Test in 13.1.0 build 12

comment:8 Changed 22 months ago by greenfeld

  • Action Needed changed from test in build to no action
  • Resolution set to fixed
  • Status changed from new to closed

I do not think we have seen this in a while.

Closing on XO-4 with 13.1.0 os20.
