Opened 6 years ago

Closed 6 years ago

#8052 closed defect (fixed)

audio hardlocks the machine

Reported by: mstone
Owned by: wmb@…
Priority: high
Milestone: 8.2.0 (was Update.2)
Component: ofw - open firmware
Version: not specified
Keywords: blocks?:8.2.0 8.2.0-753:-
Cc: dsaxena, dilinger, cjb, cscott, dsd, ChristophD
Blocked By:
Blocking:
Deployments affected:
Action Needed: qa signoff
Verified: no

Description

On 8.2.0-753, Pippy-25,

  1. Start pippy and run the 'Bounce' example.
  2. Swirl the mouse cursor in a circle. For me, initial swirls are smooth and become increasingly bumpy over a period of ~ 3-10 seconds until the cursor becomes unresponsive.
  3. The screen is NOT redrawn, e.g. with a pygame window.

Change History (17)

comment:1 Changed 6 years ago by mstone

  • Cc dsd added

It seems that Pippy's 'Play sine' and cahalan's '/dev/audio' examples lock the machine in question even more quickly. So this may be a dupe of #7885, or it may be a new hang.

cjb: does Bounce do any audio work?

comment:2 Changed 6 years ago by cjb

No, bounce does no audio work. It uses pygame, though, so I can't guarantee that the audio device isn't mutated at all.

comment:3 Changed 6 years ago by thomaswamm

I could not reproduce the Bounce bug with Pippy-25 in joyride-2301. There is no cursor displayed while "Hello!" bounces slowly around the screen.

comment:4 Changed 6 years ago by dsd

I have Michael's laptop. The crash is entirely reproducible and also happens on the update1 2.6.22 kernel. One line appears on the serial console at the time of the crash:

[  148.708691] snd-malloc: invalid device type 0

comment:5 Changed 6 years ago by dsd

The warning comes from snd_dma_free_pages. The trace is:

[  164.510615]  [<c05652e7>] snd_cs5535audio_hw_free+0x42/0x59                  
[  164.511034]  [<c0556585>] snd_pcm_release_substream+0x32/0x63                
[  164.514740]  [<c05565f0>] snd_pcm_release+0x3a/0x81                          
[  164.521449]  [<c0459544>] __fput+0xab/0x17a                                  
[  164.525977]  [<c0448bb0>] remove_vma+0x34/0x45                               
[  164.534159]  [<c04498cc>] do_munmap+0x1ba/0x1d4                              
[  164.541571]  [<c0449915>] sys_munmap+0x2f/0x3d                               
[  164.546369]  [<c04037d2>] sysenter_past_esp+0x5f/0x85    

comment:6 Changed 6 years ago by dsd

  • Summary changed from "Pippy's Bounce example hardlocks the machine." to "audio hardlocks the machine"

Tricky bug. It's not just Bounce, it's any audio-producing application. Instant hard crash. It's unrelated to the above message, which is harmless.

comment:7 Changed 6 years ago by erikos

Yeah, absolutely reproducible with Memorize in 2301. I did not have that issue a few builds ago. /versions/boot/alt/boot/olpc_build tells me the old build I used was 2273, so the devil must be somewhere in between.

comment:8 Changed 6 years ago by dsd

Here's the trace:

SND_PCM_IOCTL_HWSYNC
snd_pcm_update_hw_ptr()
snd_pcm_update_hw_ptr_pos()
snd_cs5535audio_pcm_pointer()
cs5535audio_playback_read_dma_pntr()
cs_readl(cs5535au, ACC_BM0_PNTR)
<hang>

It hangs on inb(0x14e0). The port address is correct; I confirmed it on a machine that doesn't hang. Very strange. Will save-nand and try 2273 now.
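
As an illustration of how the last step of the trace above arrives at port 0x14e0, here is a minimal sketch of the base-plus-offset arithmetic behind a cs_readl()-style register read. The 0x60 offset for ACC_BM0_PNTR is an assumption taken from the Linux cs5535audio driver header, not from this ticket, and the I/O base is merely derived from it. The helper name register_port is hypothetical.

```python
# Sketch only: models the address arithmetic, performs no real port I/O.

# ASSUMPTION: offset of the bus master 0 pointer register, as defined in
# the Linux cs5535audio driver header. It is not stated in this ticket.
ACC_BM0_PNTR = 0x60

# Port on which the read hangs, as reported in the comment above.
HANGING_PORT = 0x14E0


def register_port(io_base: int, reg_offset: int) -> int:
    """Return the absolute I/O port of a register (hypothetical helper)."""
    return io_base + reg_offset


# Derived value, valid only if the offset assumption holds.
derived_base = HANGING_PORT - ACC_BM0_PNTR

print(hex(derived_base))
print(hex(register_port(derived_base, ACC_BM0_PNTR)))
```

If the offset assumption holds, the address the driver computes is consistent with the reported port, which agrees with the observation that the same address works on a machine that doesn't hang.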

comment:9 Changed 6 years ago by dsd

2273 exhibits the same problem on this laptop, as does 708!

comment:10 Changed 6 years ago by cjb

Wow. Should we start reflashing older OFWs?

comment:11 Changed 6 years ago by dsd

  • Cc wmb@… added

Q2D16 works. Seems like a Q2E regression.

comment:12 Changed 6 years ago by dsd

Q2D16 works, Q2E10 fails.

comment:13 Changed 6 years ago by dsd

  • Cc ChristophD added; wmb@… removed
  • Component changed from kernel to ofw - open firmware
  • Owner changed from dilinger to wmb@…

This is because the volume in OpenFirmware was turned all the way down, causing OFW to not initialize the sound chip prior to Linux boot. Mitch acknowledges that this is an OFW bug and is working on a fix.

Workaround: during boot hit the "volume up" key a few times so that you can hear the boot jingle.

comment:14 Changed 6 years ago by wmb@…

Fixed by svn 886 - OFW was leaving the AC97 registers disabled. The fix is to
enable all of the virtualized PCI command registers before starting the OS.

The fix still needs to be rolled into an official OFW release.
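
To make the idea of the fix concrete, here is a rough model of enabling the PCI command register for every function before handing off to the OS. It is a sketch, not the OFW change itself (svn 886 is not reproduced here). The bit values are the standard PCI command register bits. The device names, the starting register values, and the choice to set all three enable bits are assumptions made for illustration.

```python
# Sketch only: a dictionary stands in for the virtualized PCI config space.

# Standard PCI command register bits (configuration header offset 0x04).
PCI_COMMAND_IO = 0x1       # device responds to I/O space accesses
PCI_COMMAND_MEMORY = 0x2   # device responds to memory space accesses
PCI_COMMAND_MASTER = 0x4   # device may act as a bus master (DMA)

# ASSUMPTION: "enable" is modeled as setting all three bits.
ENABLE_ALL = PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER


def enable_all_functions(command_regs):
    """Return a copy with the enable bits set in every command register."""
    return {name: value | ENABLE_ALL for name, value in command_regs.items()}


def io_decoded(command):
    """True if the function would respond to I/O port accesses."""
    return bool(command & PCI_COMMAND_IO)


# Hypothetical pre-boot state: the audio function was left disabled.
before = {"audio": 0x0000, "usb": 0x0006}
after = enable_all_functions(before)

print(io_decoded(before["audio"]), io_decoded(after["audio"]))
```

In this model the audio function does not decode I/O accesses until its command register is enabled, which would plausibly explain why a read of an audio port misbehaved when the firmware skipped sound initialization.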

comment:15 Changed 6 years ago by wmb@…

  • Action Needed changed from "diagnose" to "add to build"

comment:16 Changed 6 years ago by wmb@…

  • Action Needed changed from "add to build" to "qa signoff"

The fix is in q2e14.

comment:17 Changed 6 years ago by mstone

  • Resolution set to fixed
  • Status changed from new to closed

Thanks for the prompt diagnosis and fix!
