Ticket #12164 (closed defect: fixed)

Opened 22 months ago

Last modified 20 months ago

XO-4 os5 yum Invalid GPG Key due to branch prediction bug

Reported by: Quozl
Owned by: dsd
Priority: normal
Milestone: 13.1.0
Component: distro
Version: not specified
Keywords:
Cc: cjl, shawnl
Action Needed: no action
Verified: no
Deployments affected:
Blocked By:
Blocking:

Description

bash-4.2# http_proxy=http://10.0.0.1:3128/ yum install firefox
Resolving Dependencies
--> Running transaction check
---> Package firefox.armv7hl 0:15.0.1-1.fc18 will be installed
--> Processing Dependency: xulrunner(armv7hl-32) >= 15.0.1-1 for package: firefox-15.0.1-1.fc18.armv7hl
--> Processing Dependency: system-bookmarks for package: firefox-15.0.1-1.fc18.armv7hl
--> Processing Dependency: libxul.so for package: firefox-15.0.1-1.fc18.armv7hl
--> Processing Dependency: libxpcom.so for package: firefox-15.0.1-1.fc18.armv7hl
--> Processing Dependency: libmozalloc.so for package: firefox-15.0.1-1.fc18.armv7hl
--> Running transaction check
---> Package fedora-bookmarks.noarch 0:15-0.3 will be installed
---> Package xulrunner.armv7hl 0:15.0.1-1.fc18 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                Arch          Version                Repository    Size
================================================================================
Installing:
 firefox                armv7hl       15.0.1-1.fc18          fedora        23 M
Installing for dependencies:
 fedora-bookmarks       noarch        15-0.3                 fedora       5.9 k
 xulrunner              armv7hl       15.0.1-1.fc18          fedora        12 M

Transaction Summary
================================================================================
Install  1 Package (+2 Dependent packages)

Total size: 35 M
Installed size: 54 M
Is this ok [y/N]: y
Downloading Packages:
warning: /var/cache/yum/armhfp/18/fedora/packages/firefox-15.0.1-1.fc18.armv7hl.rpm: Header V3 RSA/SHA1 Signature, key ID a4d647e9: NOKEY
Retrieving key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-armhfp


Invalid GPG Key from file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-armhfp: bad checksum on pgp message
bash-4.2# 

Change History

Changed 22 months ago by Quozl

Workaround: pass --nogpgcheck to yum (the abbreviation --nogpg also works).

Changed 22 months ago by cjl

I saw the same thing with every yum install I tried on XO4-b1 OS5. I could only get past it with the --nogpg flag (at jnettlet's suggestion).

Changed 21 months ago by cjl

  • cc cjl added

Changed 21 months ago by tonyforster

  • milestone changed from Not Triaged to 13.1.0

Changed 21 months ago by tonyforster

As a workaround, maybe edit /etc/yum.repos.d/*.repo and set gpgcheck=0? OOB could do that until the issue is resolved upstream.

Jerry

Changed 21 months ago by dsd

  • owner set to dsd
  • component changed from not assigned to distro

This is an XO4-specific issue that doesn't affect XO-1.75: tested 13.1.0 build 8, "yum -y install patch" to install patch-2.6.1-13.fc18 from the "fedora" repository, failed on XO-4 as above, succeeded on XO-1.75.

Changed 21 months ago by shawnl

workaround:

rpm --import /etc/pki/rpm-gpg/*

(as root)

Changed 21 months ago by dsd

This is a bizarre bug. The following python code produces bad results on XO-4:

def crc24(msg) :
    crc24_init = 0xb704ce
    crc24_poly = 0x1864cfb

    crc = crc24_init
    for i in list(msg) :
        crc = crc ^ (ord(i) << 16)
        for j in range(0, 8) :
            crc = crc << 1
            if crc & 0x1000000 :
                crc = crc ^ crc24_poly
    print "ret", crc
    return crc & 0xffffff

print crc24("aaaaa")

If I insert a print statement anywhere in the loop, or make the following change, things start working.

-            if crc & 0x1000000 :
+            if (crc & 0x1000000) != 0 :

The unmodified code works on XO-1.75 on the same OS build (i.e. most binaries are identical). Very odd.
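
For comparing results against a known-good machine, here is a minimal Python 3 sketch of the same CRC-24. The init and polynomial constants are the ones used for the OpenPGP ASCII-armor checksum (RFC 4880), which is presumably the check behind yum's "bad checksum on pgp message" error. This is a port, not the code from this ticket: it takes bytes instead of a Python 2 str, and armor_checksum is a hypothetical helper name added only for illustration.

```python
import base64

CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB


def crc24(data: bytes) -> int:
    """Bitwise CRC-24 with the OpenPGP init value and polynomial."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            # Bit 24 set means the shifted-out bit was 1: reduce by the polynomial.
            if crc & 0x1000000:
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF


def armor_checksum(data: bytes) -> str:
    """Hypothetical helper: format the CRC as an armor-style '=XXXX' line."""
    return "=" + base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")


if __name__ == "__main__":
    # Standard CRC check input; the CRC-24/OPENPGP check value is 0x21cf02.
    print(hex(crc24(b"123456789")))
    print(armor_checksum(b"aaaaa"))
```

If the first line printed on the machine under test is anything other than 0x21cf02, the interpreter is miscomputing the loop, independent of yum or the key file.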

Changed 21 months ago by dsd

  • cc shawnl added

Changed 21 months ago by wmb@…

I've done a fair amount of exploration of this program too, and have seen some bizarre results. For example, if you insert the line

     foo = crc & 0x1000000

before the if statement, the answer changes. Worse yet, the answer sometimes changes from run to run of the same program text!

I have looked at the implementation of the Python interpreter to see what happens when you say "if crc & 0x1000000 :". It turns out to be surprisingly complex, due to the fact that Python internally represents long integer values in sign-magnitude format, converting to two's-complement for bitwise operations, and then back to sign-magnitude for the result. Furthermore, the interpretation of an integer value as a boolean (i.e. for "if") depends on a size optimization where a zero-valued integer is represented as an object of size 0. All of this results in allocation of new temporary objects, so there are many opportunities for memory or cache corruption to change values.

This one has me worried...
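
To see how the two spellings of the condition differ at the interpreter level, the following sketch disassembles both with the standard dis module. It assumes Python 3 (the ticket used Python 2, and opcode names vary between versions), and the function names implicit_test, explicit_test and opnames are made up for this illustration. It only shows that the explicit "!= 0" form goes through a comparison opcode while the bare form relies on truth-testing the integer object. It does not reproduce the hardware fault.

```python
import dis


def implicit_test(crc):
    # Truthiness of the integer result decides the branch.
    if crc & 0x1000000:
        return True
    return False


def explicit_test(crc):
    # An explicit comparison produces a bool before the branch.
    if (crc & 0x1000000) != 0:
        return True
    return False


def opnames(func):
    """Return the opcode names of func in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    for func in (implicit_test, explicit_test):
        print(func.__name__, opnames(func))
```

Both functions return the same results on working hardware. The different instruction sequences are one plausible reason why the rewritten condition happened to avoid the failure.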

Changed 21 months ago by shawnl

is the bug present when running the armel soft float python interpreter?

Changed 21 months ago by dsd

Shawn: not sure, interested in trying it out? I wonder if the following would be a quick way to set up a softfp chroot:

# yum install mock
# /usr/bin/mock -r fedora-18-arm --init
# /usr/bin/mock -r fedora-18-arm --install python
# /usr/bin/mock -r fedora-18-arm --shell

Going back to the problem in question, after some thinking I was reminded of #11763.

So I made a new kernel with CONFIG_CPU_BPREDICT_DISABLE=y and now suddenly the test case and yum start working fine.

So this may be another case of branch prediction screwing up execution flow. Or it could be another random obscuring effect similar to inserting a print statement...but with #11763 under our fingernails as well, I'm suspicious of the former.
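
When repeating this experiment, it helps to confirm that the running kernel really was built with the option. A minimal Python 3 sketch follows. It assumes the kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC and may not be the case on these builds), and the function names option_state and running_kernel_config are hypothetical.

```python
import gzip
import re


def option_state(config_text, option):
    """Return the value assigned to option ('y', 'm', ...) or None if not set."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def running_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config; path is an assumption, see above."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        text = running_kernel_config()
    except OSError as exc:
        print("kernel config not available:", exc)
    else:
        print(option_state(text, "CONFIG_CPU_BPREDICT_DISABLE"))
```

A result of y means branch prediction was disabled at build time. None means the option is unset or absent from the config.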

Changed 21 months ago by dsd

A quick discussion in #python suggests that there isn't really an interactive way of stepping through python bytecode, viewing the stack, etc. The next closest thing would be using gdb, setting some breakpoints in PyEval_EvalFrameEx, using gdb to access stack values via C variables, and maybe using the gdb python extensions which can print a python backtrace on demand.

Changed 21 months ago by dsd

After applying PJ4B_ERRATA_6409, the python code works.

Changed 21 months ago by dsd

  • next_action changed from never set to test in build
  • summary changed from XO-4 os5 yum Invalid GPG Key to XO-4 os5 yum Invalid GPG Key due to branch prediction bug

The latest PJ4B errata documents from Marvell describe this hardware bug and suggest disabling static branch prediction. The errata docs for the PJ4B-B1 version (to be included in the XO-4 C1) also include this erratum. So I've pushed an arm-3.5 kernel hack to disable static branch prediction: bc0e5a9bf0306e2fb9235efd9747f1b45f1cadbb

Changed 21 months ago by dsd

  • next_action changed from test in build to add to build

Changed 21 months ago by dsd

  • next_action changed from add to build to test in build

Test in 13.1.0 build 12. Bert, can you provide a test case showing how etoys was affected by this breakage?

Changed 20 months ago by greenfeld

  • status changed from new to closed
  • next_action changed from test in build to no action
  • resolution set to fixed

Fixed in 13.1.0 os14.