Ticket #4184 (closed defect: wontfix)

Opened 7 years ago

Last modified 7 years ago

JFFS2 Dirent Anomaly

Reported by: wmb@… Owned by: krstic
Priority: blocker Milestone: 8.2.0 (was Update.2)
Component: kernel Version:
Keywords: Cc: mstone, cscott, wmb, dwmw2, bertl
Action Needed: Verified: no
Deployments affected: Blocked By:
Blocking:

Description

We have a laptop with a JFFS2 filesystem with lots of bogus dirent nodes. OFW q2c28 cannot read from this filesystem, and although Linux can read it, certain operations are very slow. Analysis shows that there is a directory /versions/b0rked that contains 6 million dirent nodes for the file "joydev.ko".

David thinks that the problem is caused by a bug in the JFFS2 garbage collector; instead of cleaning out the garbage, it sometimes just keeps making clones of the same dirent node.

There are two things that need fixing:

a) OFW needs to cope with this pathology.

b) JFFS2 needs to stop making it.

I am filing this bug against the kernel because the OFW fix is about to be checked in, so there is little point in tracking it against open firmware.

Attachments

modprobe.patch2 (1.8 kB) - added by gnu 7 years ago.
Patch to modprobe to avoid opening module file read/write.

Change History

  Changed 7 years ago by wmb@…

The OFW workaround for this is svn 674, which will appear in q2d01.

  Changed 7 years ago by jg

  • milestone changed from Never Assigned to First Deployment, V1.0

  Changed 7 years ago by dwmw2

Analysis so far shows that it is garbage collection going wrong -- somehow there are _two_ 'joydev.ko' entries in the dirent list for the offending directory, which should never happen. When GC tries to obsolete one of them, it removes the _other_ one from the list instead -- so it doesn't make any progress at all, and writes out a dirent for the same 'joydev.ko' over and over again, because that's always the next node to be obsoleted.

Still investigating how this happens....

  Changed 7 years ago by wmb@…

Another example of this problem just surfaced, per the message below. I gave him a special firmware with analysis tools, and we found 8,975,987 duplicate dirents. At least one of them was named "joydev.ko".

scan-nand showed that the entire NAND was filled with JFFS2 blocks, except for two clean blocks. The JFFS2 blocks all had summaries, except for one.

By my calculations, a raw dirent + a summary dirent for "joydev.ko" consumes a total of 86 bytes, so the total space occupied by these dirents would be 771,934,882 bytes.

Considering the size of the OS, this means that nearly everything other than the OS is filled with the junk dirents.
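The space estimate above can be re-derived in a few lines. This is only a check of the arithmetic; the two constants are the figures quoted in this comment (the dirent count reported by the analysis tools and the 86-byte cost per raw-plus-summary dirent), and the variable names are mine.

```python
# Figures quoted in the comment above.
DUPLICATE_DIRENTS = 8_975_987   # duplicate dirents found by the analysis tools
BYTES_PER_DIRENT = 86           # raw dirent + summary dirent for "joydev.ko"

total_bytes = DUPLICATE_DIRENTS * BYTES_PER_DIRENT
print(f"{total_bytes:,} bytes")  # 771,934,882 bytes
```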

Yoshiki Ohshima wrote:
>   Hello,
>
>   My B4 with 616 build went into some interesting state.  I was
> copying some executable files to a directory (under /usr/local/lib/)
> from my USB memory and playing with it (for several iteration) from
> the Sugar console.  But I terminate the executable and left the system
> idle for a night.  That was yesterday.
>
>   Today, I was trying to remove the file I copied by "rm" command but
> get an error that says "No space left on device".  I try to reboot my
> machine (perhaps a bad idea) but it went into the "launching X loop".
>
>   I did force poweroff and now the unit doesn't boot.  I got OFW's
> "ok" prompt and typed:
>
> 	dir nand:\boot
>
> but it says:
>
>         jffs2-file-system
>         jffs2:bad read
>
> I saw some trac items about nand corruption and it might have happened
> to me.  I don't know if there is a way to salvage some useful
> information from my unit at this point, but if somebody has an idea
> (or just say it is a known problem) please let me know.  Otherwise,
> I'd just reinstall a build...
>
> -- Yoshiki

  Changed 7 years ago by dwmw2

This is the offending directory entry:

1c616bd0  .................................... 85 19 01 e0  |...:@....G;0....|
1c616be0  32 00 00 00 f3 76 37 50  ef b2 02 00 12 00 00 00  |2....v7P........|
1c616bf0  68 be 02 00 32 26 05 47  0a 08 00 00 0f 40 39 10  |h...2&.G.....@9.|
1c616c00  a3 38 58 41 6a 6f 79 64  65 76 2e 6b 6f 00 ff ff  |.8XAjoydev.ko...|

Note the recorded name length of ten bytes, and the tenth byte being zero. When we try to replace this directory entry, we end up creating a new dirent "joydev.ko" with a _different_ hash value, which doesn't succeed in displacing the offending "joydev.ko\0" from the list.

There are other strange dirents too, for example:

1c6157a0  .......................  85 19 01 e0 32 00 00 00  |.....P......2...|
1c6157b0  f3 76 37 50 ef b2 02 00  11 00 00 00 68 be 02 00  |.v7P........h...|
1c6157c0  32 26 05 47 0a 08 00 00  fd f4 f1 39 ef 23 52 ee  |2&.G.......9.#R.|
1c6157d0  6a 6f 79 64 65 76 2e 6b  6f a9 ff ff ...........  |joydev.ko.......|

This time, the length was extended to 10 bytes and one '0xa9' byte was appended.
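For anyone who wants to check the first dump by hand, here is a small sketch that decodes it. The field layout (a 40-byte little-endian header followed by the name) is my reading of the JFFS2 raw dirent structure and should be treated as an assumption; the bytes are copied from the first hex dump above, minus the trailing ff padding. The function and dictionary key names are illustrative only.

```python
import struct

# Assumed layout of the fixed dirent header (little-endian, 40 bytes):
# magic, nodetype, totlen, hdr_crc, pino, version, ino, mctime,
# nsize, type, 2 unused bytes, node_crc, name_crc
DIRENT_HEADER = struct.Struct("<HHIIIIIIBB2xII")

# Bytes from the first hex dump above (0x1c616bdc onward, padding dropped).
RAW = bytes.fromhex(
    "85 19 01 e0"
    "32 00 00 00 f3 76 37 50 ef b2 02 00 12 00 00 00"
    "68 be 02 00 32 26 05 47 0a 08 00 00 0f 40 39 10"
    "a3 38 58 41 6a 6f 79 64 65 76 2e 6b 6f 00"
)


def decode_dirent(raw: bytes) -> dict:
    """Split a raw dirent into a few header fields and the recorded name."""
    (magic, nodetype, totlen, _hdr_crc, pino, version, ino, _mctime,
     nsize, dtype, _node_crc, _name_crc) = DIRENT_HEADER.unpack_from(raw)
    start = DIRENT_HEADER.size
    return {
        "magic": magic,
        "nodetype": nodetype,
        "totlen": totlen,
        "pino": pino,
        "version": version,
        "ino": ino,
        "nsize": nsize,
        "type": dtype,
        "name": raw[start:start + nsize],  # exactly nsize bytes, NUL included
    }


dirent = decode_dirent(RAW)
print(dirent["nsize"], dirent["name"])
```

Under that layout the recorded length is 10 and the tenth name byte is the zero, which matches the description above.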

I don't know why that would have happened in the first place -- I'll continue to investigate, and add a sanity check to catch it at least when the name contains zero bytes, at the time of creation.

The symptoms can be addressed in two ways -- firstly, we can abandon GC immediately when we attempt to GC and find that the node we were attempting to obsolete _isn't_ now obsolete. And secondly, we can check for zeroes in existing nodes and cope with them so that the name hash _does_ start to match.
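To make the failure mode and the first fix concrete, here is a toy model. It is not the kernel code: the list handling is deliberately simplified, zlib.crc32 stands in for the real name hash, and every name in it (DirentList, collect, abandon_if_stuck) is hypothetical. It only illustrates the mechanism described above: the hash covers the full recorded length, the rewritten dirent uses the C-string length, so the rewrite never displaces the victim.

```python
import zlib


def name_hash(name: bytes) -> int:
    # Stand-in hash (assumption): covers every recorded byte, NUL included.
    return zlib.crc32(name)


def c_str(name: bytes) -> bytes:
    # What a C string comparison sees: everything before the first NUL.
    return name.split(b"\0", 1)[0]


class DirentList:
    """Toy in-memory dirent list; entries are (hash, name, node_id)."""

    def __init__(self):
        self.entries = []
        self.next_node = 0

    def add(self, raw_name: bytes) -> int:
        node = self.next_node
        self.next_node += 1
        h = name_hash(raw_name)
        for i, (old_hash, old_name, _) in enumerate(self.entries):
            # An older entry is displaced only if hash AND C-string name match.
            if old_hash == h and c_str(old_name) == c_str(raw_name):
                self.entries[i] = (h, raw_name, node)
                return node
        self.entries.append((h, raw_name, node))
        return node

    def is_live(self, node: int) -> bool:
        return any(n == node for _, _, n in self.entries)


def collect(dirents, victim_node, victim_name, max_writes, abandon_if_stuck):
    """Rewrite the victim until it is obsolete; return flash writes used."""
    writes = 0
    while dirents.is_live(victim_node) and writes < max_writes:
        dirents.add(c_str(victim_name))  # rewritten with the C-string length
        writes += 1
        if abandon_if_stuck and dirents.is_live(victim_node):
            break  # sanity check: the victim did not get obsoleted, give up
    return writes


healthy = DirentList()
good = healthy.add(b"mousedev.ko")
print(collect(healthy, good, b"mousedev.ko", 1000, False))   # 1

broken = DirentList()
bad = broken.add(b"joydev.ko\0")
print(collect(broken, bad, b"joydev.ko\0", 1000, False))     # 1000

guarded = DirentList()
bad2 = guarded.add(b"joydev.ko\0")
print(collect(guarded, bad2, b"joydev.ko\0", 1000, True))    # 1
```

In the model the unguarded collector keeps writing until the cap, which corresponds to flash filling with clones; the guard stops after one write but leaves the bad entry in place, which is why the second fix (coping with the zero byte) is also wanted.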

  Changed 7 years ago by dwmw2

  • status changed from new to assigned

Three fixes committed to mtd-2.6.git tree and cherry-picked into OLPC 'master' and 'stable' trees:

1. Sanity check in GC, to make sure the victim node actually got obsoleted.

2. Sanity check in three places for embedded NUL in names on the medium.

3. Sanity check in dirent creation, for embedded NUL in what we're asked to write.

The first two should suffice to cope with this problem, and similar classes of problem, in existing file systems -- while the third should highlight this particular problem when it first happens and hopefully give us a clue how we got into this situation in the first place.
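As a rough illustration of what the second and third checks amount to (this is a sketch, not the committed kernel patches, and all function names here are made up): on read, a zero byte inside the recorded length is treated as the end of the name, and on write, such a name is refused outright.

```python
def find_embedded_nul(name: bytes, nsize: int) -> int:
    """Return the index of the first zero byte within nsize bytes, or -1."""
    return name[:nsize].find(b"\0")


def effective_name(name: bytes, nsize: int) -> bytes:
    """Read side: cope with a bad node by ending the name at the first NUL."""
    pos = find_embedded_nul(name, nsize)
    return name[:nsize] if pos == -1 else name[:pos]


def check_name_for_write(name: bytes, nsize: int) -> None:
    """Write side: refuse to create a dirent whose name contains a NUL."""
    if find_embedded_nul(name, nsize) != -1:
        raise ValueError("name contains zero bytes")


print(effective_name(b"joydev.ko\0", 10))  # b'joydev.ko'
check_name_for_write(b"joydev.ko", 9)      # accepted silently
```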

If anyone has been able to reproduce this problem, I'd very much appreciate them trying to do so again, with the third of my patches (commit 69ca4378aa376cf2c49657d4f6951da56c27cd3a in the mtd-2.6.git tree, commit 8a4160c1c4579d1519845b95e0891b0ee4c8e209 in the OLPC stable tree).

  Changed 7 years ago by mstone

  • cc mstone, cscott, wmb, dwmw2, Bertl added

This sounds like it might be related to CoW link-breaking to me. Just as speculation, if the reporter used olpc-update and rsync got caught in some kind of loop when it encountered joydev.ko, it could have produced lots of copies of that file as it repeatedly tried to download the file but failed due to a link-breaking failure.

Can we confirm or deny any parts of this speculation?

  Changed 7 years ago by wmb@…

I don't think the loop hypothesis is necessary. dwmw has already figured out how the infinite copying happens, as a result of an issue with the way garbage collection was working.

The question we have now is "how did a name with a null included as part of the length get created in the first place?"

I have my own speculation to add to the mix: Look for some place in some code where the file name is represented as address, length instead of C-style "address of string implicitly assumed to be null-terminated". Only with the addr,len representation could a trailing null possibly survive. Then work upwards until you find where the addr,len representation is generated. I suspect something like an inappropriate "length = strlen(s)+1".

  Changed 7 years ago by wmb@…

At the kernel syscall interface, the filename representation is the C string format - address of null-terminated string. That format cannot transport a string that *contains* a null - the extra null would just terminate the string.

It is therefore likely that the problem is inside the kernel, rather than a userspace app passing in a bogus string.
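Both points can be seen from userspace with a short sketch. It uses Python's ctypes to mimic how a C string ends at the first null, and shows that the Python runtime itself refuses to pass a path with an embedded null to the OS. The helper names are mine, and this says nothing about where in the kernel the bad length actually arose.

```python
import ctypes
import os


def as_c_string(data: bytes) -> bytes:
    """Return what a C consumer sees: the bytes before the first NUL."""
    return ctypes.create_string_buffer(data).value


def path_rejected(path: bytes) -> bool:
    """True if the runtime refuses to hand this path to the OS at all."""
    try:
        os.stat(path)
    except ValueError:   # raised for an embedded null byte
        return True
    except OSError:      # e.g. file not found: the path itself was accepted
        return False
    return False


name = b"joydev.ko"
print(as_c_string(name + b"\0"))    # b'joydev.ko'
print(path_rejected(name + b"\0"))  # True

# An addr,len representation computed as strlen(s)+1 would carry the
# terminator along as part of the name: 9 + 1 = 10 bytes.
bogus_len = len(name) + 1
print(bogus_len)                    # 10
```

The length of 10 matches the recorded name length in the offending dirent, which is at least consistent with the strlen(s)+1 guess.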

  Changed 7 years ago by dilinger

  • owner changed from dwmw2 to mstone
  • status changed from assigned to new

We have a c1 machine where this bug apparently triggered during init starting up udev (rc.sysinit calling start_udev, specifically). It happened on oct 6, with a kernel that included all of the latest vserver (0.4.7) patches. The only code that appends 0xa9 to filenames is in cow_break_link, which is provided by vserver.

We need to follow up w/ Bertl.

  Changed 7 years ago by wmb@…

I have found the problem on my stock 616 installation too. The easy way to check for the symptom is:

ls /versions/run/616/lib/modules/2.6.22-20071009.1.olpc.ec54a65da6de043/kernel/drivers/input/

If "joydev.ko" appears twice, the problem exists. The second copy of joydev.ko is actually "joydev.ko\0", but you can't see the null in the name.

Here is the sequence of operations that leads to the problem:

a) The Activation startup process creates "/versions/run/616" as a "shallow copy" (tree of links) of "/versions/pristine/616" and reboots with "versions/run/616" as the virtual root.

b) During the execution of /etc/rc.d/rc.sysinit (Linux startup), around the time that /sbin/start_udev is running, several kernel modules are loaded, specifically cs5535_gpio, serio_raw, psmouse, ieee80211_crypt, ieee80211, libertas, usb8xxx, joydev, mousedev, and i2c_dev.

c) For some unknown reason, reading those modules causes "copy-on-write link breakage", i.e. the vserver code decides that it cannot leave those modules as links to the pristine copies, but rather must create writable copies of them.

d) For some other unknown reason, the copy process appears to happen twice for joydev.ko (it happens only once for the other modules), at roughly the same time (same 1-second timestamp on the JFFS2 dirents, dirents are close together in the same JFFS2 erase block, with only the mousedev.ko dirents intervening).

e) The copy-on-write process inside vserver involves creating a temp file named "joydev.ko\251" (octal 251 is hex a9), copying the pristine file to it, then renaming the temp file to "joydev.ko". See fs/namei.c:cow_break_link().

f) The second time that this happens on the same file, the temp file is renamed not to "joydev.ko", but instead to "joydev.ko\0", with a spurious extra null on the end.

g) The JFFS2 garbage collector cannot scavenge the file with the null on the end, because it sometimes uses strcmp() to match filenames (thus treating the null as a terminator), and other times uses a hash over the entire string length (thus including the null in the name). That confuses it about whether or not the node has been scavenged. The end result is that, instead of scavenging "joydev.ko\0", it creates an "infinite" number of copies of the dirent, with the copies named "joydev.ko" without the null.

h) There are four aspects to this bug, fixing any of which would probably make the bad effect (JFFS2 filling up with garbage) go away:

1) There is a bug in vserver whereby "simultaneous" attempts to break the same link race, and the second one to finish creates a bad name. Bertl has verified that this race exists, using a different, simpler test script. Fixing that bug would eliminate the bogus filename, thus the JFFS2 bug would not be triggered. (Note that the race does not always result in the appending of a null to the file name; sometimes the name gets garbled in other ways.)

2) There should be no need to break the links for these modules since they are only read, not written. Eliminating that link-breaking would suppress this particular manifestation of the problem - but the race condition would still exist and might bite us in some other context.

3) We don't need the joydev module anyway, so it should be eliminated from the kernel build. That too would hide the problem for now, but it might come back later in another context.

4) JFFS2 garbage collection should be improved to be stable in the face of such malformed filenames. Either that, or JFFS2 should refuse to create dirents with embedded or trailing nulls, since they cannot be garbage collected successfully.

5) It would be interesting to know why the link-breaking for joydev.ko happens twice. It could be due to asynchronous modloading, or something more subtle.

In any case, problem (1) must be fixed, because it could cause filesystem corruption of many different flavors, including files going "missing" because their names got mangled.

  Changed 7 years ago by wmb@…

I reloaded os616 with copy-nand, booted, and the bogus joydev.ko\0 was not present. I did the whole thing twice, the first time with a USB disk attached so the backup procedure happened, and the second time without a USB disk.

I looked at the JFFS2 raw data, searching for evidence of a duplicate cow_break_link on joydev.ko, and found none. So this problem, while common enough (at least 4 confirmed cases), is not 100% repeatable. That is consistent with a race condition - if a second thread were to enter may_open() after the first had completed the copy, then it would see the already-broken link and not attempt to rebreak it.
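The timing argument can be illustrated with a deterministic toy. Each opener is modeled as two steps, "check whether the link is still unbroken" and "copy and rename", and the schedule string says which opener runs next. This is a generic check-then-act model with made-up names, not the vserver code, and it does not model how the name gets mangled.

```python
def count_link_breaks(schedule: str) -> int:
    """Run two-step openers in the given order; return how many times
    the link-break (copy and rename) was actually carried out."""
    link_broken = False
    breaks = 0
    wants_break = {}   # opener id -> decision made at its check step
    finished = set()
    for opener in schedule:
        if opener not in wants_break:
            # Step 1: check. Only an unbroken link needs breaking.
            wants_break[opener] = not link_broken
        elif opener not in finished:
            # Step 2: act on the (possibly stale) decision from step 1.
            if wants_break[opener]:
                link_broken = True
                breaks += 1
            finished.add(opener)
    return breaks


# Opener B arrives after A has finished: it sees the broken link, no rebreak.
print(count_link_breaks("AABB"))  # 1
# Both check before either acts: the link is broken twice.
print(count_link_breaks("ABAB"))  # 2
```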

follow-up: ↓ 15   Changed 7 years ago by wmb@…

Regarding (c) above, i.e. "c) For some unknown reason, reading those modules causes "copy-on-write link breakage", ..."

Well, it turns out that modprobe opens .ko files in Read/Write mode.

Why does it do that? So it can support the --force* options, in which it attempts to remove version information from the module file to override mismatches with the kernel version. It opens R/W regardless of whether any --force* option is present.

IMHO, that is a bad idea. Munging a module file should be done with a separate command.

insmod does not do that bad thing - it opens the module file read-only.

  Changed 7 years ago by wmb@…

Here is a test script from Bertl that will reproduce the race condition issue:

rm -f za zb zc zd
echo five >za
ln za zb; ln za zc; ln za zd
setattr --iunlink za
sysctl -w vserver.debug_misc=7
touch ze
touch za zb zc & touch zc zb za

If you run that whole sequence about 10 times, then reboot and do an "ls", you will usually see

ls: cannot access zb: No such file or directory (or za or zc)

Then, if you do "rm z*", then another "ls", one or more of the zN files will not have been deleted.

in reply to: ↑ 13   Changed 7 years ago by cscott

Replying to wmb@firmworks.com:

> Well, it turns out that modprobe opens .ko files in Read/Write mode.
>
> Why does it do that? So it can support the --force* options, in which it attempts to remove version information from the module file to override mismatches with the kernel version. It opens R/W regardless of whether any --force* option is present.
>
> IMHO, that is a bad idea. Munging a module file should be done with a separate command.

A-ha! Thanks for this find; this was one of the puzzling issues remaining from trac #3581. I'll add 'fixing modprobe' to my to-do list.

  Changed 7 years ago by dilinger

I tried the latest vserver patch, along with the test script for reproducing the bug. Note that with 2.6.22-20071015.2.olpc.d6e22ac24d4182d (which includes jffs2 fixes), I don't get the same error as Mitch; the file never gets created improperly because jffs2 sanity checking catches it. Instead, I get an "Error in jffs2_write_dirent() -- name contains zero bytes!" both with and without the latest vserver patch. Here's what happens with the VS 0.4.8 patch:

-bash-3.2# ./test                                                               
vserver.debug_misc = 7                                                          
[   87.444815] vxD: cow_break_link(�za�)                                        
[   87.449334] vxD: path_lookup(old): 0                                         
[   87.453556] vxD: old path �/root/t/za� [�za�:2]                              
[   87.458994] vxD: temp copy �/root/t/za��                                     
[   87.463609] vxD: path_lookup(new): 0                                         
[   87.467831] vxD: lookup_create(new): cbc6d918 [�za��:3]                      
[   87.474112] vxD: vfs_create(new): 0                                          
[   87.478228] vxD: dentry_open(old): c10d90c0                                  
[   87.488775] vxD: dentry_open(new): cbef99a0                                  
[   87.494051] vxD: do_splice_direct: 5                                         
[   87.498385] vxD: vfs_rename: [�za��:3] -> [�za�:2]                           
[   87.504252] vxD: vfs_rename: 0                                               
[   87.508238] vxD: cow_break_link(�zb�)                                        
[   87.512793] vxD: path_lookup(old): 0                                         
[   87.517126] vxD: old path �/root/t/zb� [�zb�:2]                              
[   87.522690] vxD: temp copy �/root/t/zb��                                     
[   87.527425] vxD: path_lookup(new): 0                                         
[   87.531846] vxD: lookup_create(new): cb98ba38 [�zb��:3]                      
[   87.538121] vxD: vfs_create(new): 0                                          
[   87.542452] vxD: dentry_open(old): cc157ca0                                  
[   87.547468] vxD: dentry_open(new): cbef9760                                  
[   87.552640] vxD: do_splice_direct: 5                                         
[   87.556965] vxD: vfs_rename: [�zb��:3] -> [�zb�:2]                           
[   87.562818] vxD: vfs_rename: 0                                               
[   87.566677] vxD: cow_break_link(�zc�)                                        
[   87.571206] vxD: path_lookup(old): 0                                         
[   87.575539] vxD: old path �/root/t/zc� [�zc�:2]                              
[   87.581607] vxD: cow_break_link(�zc�)                                        
[   87.586044] vxD: path_lookup(old): 0                                         
[   87.590652] vxD: temp copy �/root/t/zc��                                     
[   87.595383] vxD: path_lookup(new): 0                                         
[   87.599817] vxD: lookup_create(new): cb98b9a8 [�zc��:3]                      
[   87.606088] vxD: vfs_create(new): 0                                          
[   87.610428] vxD: dentry_open(old): c10d90c0                                  
[   87.615447] vxD: dentry_open(new): ce5246c0                                  
[   87.620613] vxD: do_splice_direct: 5                                         
[   87.641221] vxD: vfs_rename: [�zc��:3] -> [�zc�:2]                           
[   87.663048] vxD: vfs_rename: 0                                               
[   87.682516] vxD: old path �/root/t/zc� [� zc�:3]                             
[   87.705706] vxD: temp copy �/root/t/zc��                                     
[   87.725877] vxD: path_lookup(new): 0                                         
[   87.745148] vxD: lookup_create(new): cbf2eac8 [�zc��:3]                      
[   87.766440] vxD: vfs_create(new): 0                                          
[   87.785794] vxD: dentry_open(old): cbef96a0                                  
[   87.805994] vxD: dentry_open(new): cc0bce60                                  
[   87.825865] vxD: do_splice_direct: 5                                         
[   87.844845] vxD: vfs_rename: [�zc��:3] -> [� zc�:3]                          
[   87.865387] Error in jffs2_write_dirent() -- name contains zero bytes!       
[   87.888397] Directory inode #36694, name at *0xcbc6d594 "zc"->ino #36968, name_crc 0x17b6d5e2
[   87.914379] WARNING: at fs/jffs2/write.c:226 jffs2_write_dirent()            
[   87.938186]  [<c04c3675>] jffs2_write_dirent+0x99/0x3c4                      
[   87.961468]  [<c04c1974>] jffs2_reserve_space+0x17e/0x1a6                    
[   87.985049]  [<c04c3e76>] jffs2_do_link+0x115/0x168                          
[   88.008112]  [<c04be9ba>] jffs2_rename+0xc9/0x245                            
[   88.030962]  [<c0466cf4>] vfs_rename+0x2d0/0x428                             
[   88.053660]  [<c04694d9>] cow_break_link+0x48f/0x5b8                         
[   88.076851]  [<c0629723>] _spin_unlock+0xf/0x23                              
[   88.099488]  [<c04e391a>] _atomic_dec_and_lock+0x22/0x2c                     
[   88.123013]  [<c0629723>] _spin_unlock+0xf/0x23                              
[   88.145806]  [<c04e391a>] _atomic_dec_and_lock+0x22/0x2c                     
[   88.169787]  [<c06280bd>] __sched_text_start+0x7ad/0x7dd                     
[   88.193920]  [<c046fc6d>] __d_lookup+0x178/0x1a1                             
[   88.217286]  [<c04698de>] open_namei+0x282/0x5c6                             
[   88.240689]  [<c045dea3>] do_filp_open+0x2a/0x3e                             
[   88.264178]  [<c045dc31>] get_unused_fd+0x120/0x12a                          
[   88.288071]  [<c04e7ee7>] strncpy_from_user+0x33/0x4c                        
[   88.312174]  [<c045defd>] do_sys_open+0x46/0xcd                              
[   88.335723]  [<c045dfbd>] sys_open+0x1c/0x1e                                 
[   88.359063]  [<c0403cfe>] sysenter_past_esp+0x5f/0x85                        
[   88.383323]  [<c0620000>] ip6fl_seq_next+0xf/0x1d                            
[   88.407331]  =======================                                         
[   88.429993] vxD: vfs_rename: -5                                              
-bash-3.2# [  115.958444] JFFS2 notice: (1334) check_node_data: wrong data CRC in data node at 0x2c774604: read 0x881a82a8, calculated 0x5d770f97.
                                                                           

  Changed 7 years ago by dilinger

  • cc bertl added; Bertl removed

  Changed 7 years ago by bertl

assumed fixed by the following patch:

http://vserver.13thfloor.at/Stuff/OLPC/delta-cow-fix17.diff

rationale: the issue was caused by a stale (unhashed) dentry which resulted in incorrect data being passed down to the actual filesystem layer.

  Changed 7 years ago by cscott

  • owner changed from mstone to dilinger

dilinger, can you review this/get this into a build for testing?

  Changed 7 years ago by dilinger

Should I be seeing these errors with the latest patch, when running the test script above?

vserver.debug_misc = 7                                                          
[  474.207326] vxD: cow_break_link(�za�)                                        
[  474.228040] vxD: path_lookup(old): 0                                         
[  474.247393] vxD: old path �/root/t/za� [�za�:2]                              
[  474.273551] vxD: cow_break_link(�zc�)                                        
[  474.293427] vxD: path_lookup(old): 0                                         
[  474.314180] vxD: lookup_create(new): c6808f48 [�zb��:3]                      
[  474.335955] vxD: vfs_create(new): 0                                          
[  474.356592] vxD: cow_break_link(�/root/t/zb�)                                
[  474.377548] vxD: path_lookup(old): 0                                         
[  474.398370] vxD: old path �/root/t/zb� [�zb�:2]                              
[  474.419429] vxD: temp copy �/root/t/zb��                                     
[  474.440689] vxD: path_lookup(new): 0                                         
[  474.460664] vxD: temp copy �/root/t/za��                                     
[  474.482204] vxD: path_lookup(new): 0                                         
[  474.502297] vxD: lookup_create(new): c6800478 [�za��:3]                      
[  474.525286] vxD: vfs_create(new): 0                                          
[  474.545424] vxD: dentry_open(old): c7ee42e0                                  
[  474.567238] vxD: dentry_open(new): c6e791a0                                  
[  474.588203] vxD: do_splice_direct: 5                                         
[  474.608915] vxD: vfs_rename: [�za��:3] -> [�za�:2]                           
[  474.630242] vxD: vfs_rename: 0                                               
[  474.650861] vxD: cow_break_link(�zb�)                                        
[  474.671043] vxD: old path �/root/t/zc� [�zc�:2]                              
[  474.693225] vxD: temp copy �/root/t/zc��                                     
[  474.713561] vxD: path_lookup(new): 0                                         
[  474.734240] vxD: lookup_create(new): c6808e28 [�zc��:3]                      
[  474.756295] vxD: vfs_create(new): 0                                          
[  474.778459] vxD: dentry_open(old): cc0163e0                                  
[  474.799731] vxD: dentry_open(new): c7e2c740                                  
[  474.821581] vxD: do_splice_direct: 5                                         
[  474.841700] vxD: vfs_rename: [�zc��:3] -> [�zc�:2]                           
[  474.864217] vxD: vfs_rename: 0                                               
[  474.883761] vxD: path_lookup(old): 0                                         
[  474.904760] vxD: old path �/root/t/zb� [�zb�:2]                              
[  474.925674] vxD: temp copy �/root/t/zb��                                     
[  474.946794] vxD: path_lookup(new): 0                                         
[  474.967140] vxD: cow_break_link(�zb�)                                        
[  474.988194] vxD: path_lookup(old): 0                                         
[  475.008255] vxD: old path �/root/t/zb� [�zb�:2]                              
[  475.030193] vxD: temp copy �/root/t/zb��                                     
[  475.050699] vxD: path_lookup(new): 0                                         
[  475.071663] vxD: lookup_create(new): c6808f48 [�zb��:3]                      
[  475.094295] vxD: vfs_create(new): 0                                          
[  475.115631] vxD: dentry_open(old): c7ee42e0                                  
[  475.136695] vxD: dentry_open(new): c88b8e60                                  
[  475.158358] vxD: do_splice_direct: 5                                         
[  475.178227] vxD: vfs_rename: [�zb��:3] -> [�zb�:2]                           
[  475.200449] vxD: vfs_rename: 0                                               
[  475.220014] vxD: lookup_create(new): c6d7c628 [�zb��:3]                      
[  475.242846] vxD: vfs_create(new): 0                                          
[  475.262775] vxD: lookup_create(new): c6d7c628 [�zb��:3]                      
[  475.285707] vxD: vfs_create(new): -17                                        
[  475.306114] vxD: temp copy �/root/t/zb��                                     
[  475.327935] vxD: path_lookup(new): 0                                         
[  475.348287] vxD: lookup_create(new): c6808ac8 [�zb��:3]                      
[  475.371360] vxD: vfs_create(new): 0                                          
touch: cannot touch `zb': No such file or directory             
[  476.438096] vxD: cow_break_link(�za�)                                        
[  476.459348] vxD: path_lookup(old): 0                                         
[  476.479231] vxD: old path �/root/t/za� [�za�:2]                              
[  476.501474] vxD: temp copy �/root/t/za��                                     
[  476.526369] vxD: cow_break_link(�zc�)                                        
[  476.547407] vxD: path_lookup(old): 0                                         
[  476.567248] vxD: path_lookup(new): 0                                         
[  476.587763] vxD: lookup_create(new): c6808d08 [�za��:3]                      
[  476.609789] vxD: vfs_create(new): 0                                          
[  476.630446] vxD: dentry_open(old): c6e842c0                                  
[  476.651004] vxD: dentry_open(new): c6e796e0                                  
[  476.672377] vxD: do_splice_direct: 5                                         
[  476.692067] vxD: vfs_rename: [�za��:3] -> [�za�:2]                           
[  476.714303] vxD: vfs_rename: 0                                               
[  476.733728] vxD: cow_break_link(�zb�)                                        
[  476.754265] vxD: path_lookup(old): 0                                         
[  476.773558] vxD: old path �/root/t/zc� [�zc�:2]                              
[  476.794783] vxD: temp copy �/root/t/zc��                                     
[  476.814957] vxD: path_lookup(new): 0                                         
[  476.835401] vxD: lookup_create(new): c6800f28 [�zc��:3]                      
[  476.856934] vxD: vfs_create(new): 0                                          
[  476.877445] vxD: dentry_open(old): c6e797a0                                  
[  476.897974] vxD: dentry_open(new): c9614200                                  
[  476.919455] vxD: do_splice_direct: 5                                         
[  476.939073] vxD: vfs_rename: [�zc��:3] -> [�zc�:2]                           
[  476.961550] vxD: vfs_rename: 0                                               
[  476.980635] vxD: old path �/root/t/zb� [�zb�:2]                              
[  477.002369] vxD: temp copy �/root/t/zb��                                     
[  477.022849] vxD: path_lookup(new): 0                                         
[  477.043930] vxD: cow_break_link(�zb�)                                        
[  477.063967] vxD: path_lookup(old): 0                                         
[  477.084617] vxD: old path �/root/t/zb� [�zb�:2]                              
[  477.105116] vxD: temp copy �/root/t/zb��                                     
[  477.126048] vxD: path_lookup(new): 0                                         
[  477.145772] vxD: lookup_create(new): c6808d98 [�zb��:3]                      
[  477.168441] vxD: vfs_create(new): 0                                          
[  477.188280] vxD: dentry_open(old): c6e842c0                                  
[  477.209770] vxD: dentry_open(new): c7ee4160                                  
[  477.230389] vxD: do_splice_direct: 5                                         
[  477.250777] vxD: vfs_rename: [�zb��:3] -> [�zb�:2]                           
[  477.271860] vxD: vfs_rename: 0                                               
[  477.313713] vxD: lookup_create(new): c6808f48 [�zb��:3]                      
[  477.336881] vxD: vfs_create(new): 0                                          
touch: cannot touch `zb': No such file or directory                             
ln: creating hard link `zc': File exists                               

There are times when the script runs without any errors at all.

  Changed 7 years ago by krstic

  • owner changed from dilinger to krstic

follow-up: ↓ 25   Changed 7 years ago by wmb@…

The current status:

a) The bug no longer causes JFFS2 to go crazy with garbage collection failures, due to a JFFS2 patch that prevents it from creating a bogus dirent when passed a bad filename containing nulls. (Need to confirm that the JFFS2 patch is in a released kernel.)

b) joydev.ko is no longer being config-ed in the kernel, so this particular trigger should no longer exist. (Need to confirm that this config change has made it into a release.)

c) We know why the .ko files are getting link-broken - it is because modprobe is stupid. We need to either change modprobe to open the files R/O when --force is not present, or use insmod instead, or compile-in several drivers instead of them being modules.

d) The vserver patch above seems to prevent the multiple-link-breaking, but has another non-atomicity. The patch needs to be reviewed by qualified kernel/filesystem hackers. Ideally, we would like link-breaking to be atomic, but failing that, we might be able to convince ourselves that the window of vulnerability is acceptable. In any case, the patch needs to be sanity-checked.

  Changed 7 years ago by gnu

Modprobe is opening the files read/write so it can use fcntl(, F_SETLKW) to avoid race conditions when two modprobes run simultaneously. The --force options don't appear to need to open the file read/write; they modify its image in malloc'd or private mmap'd memory. I wonder if something about the fcntl operation is giving the file system fits?

I have a patch that replaces fcntl with flock(, LOCK_EX), which doesn't require that the file is open read/write, but I'm still testing it.
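The difference between the two locking calls can be tried out with a short sketch. It opens a scratch file read-only and attempts an exclusive lock both ways: a POSIX record lock (Python's fcntl.lockf, which goes through fcntl) and a BSD-style flock. It uses non-blocking requests so the demo cannot hang, whereas modprobe uses the blocking F_SETLKW. The helper names are mine, and the result noted below is what I would expect on Linux; other systems may differ.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(lock_call, fd):
    """Attempt a non-blocking exclusive lock; return 'ok' or the errno name."""
    try:
        lock_call(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    lock_call(fd, fcntl.LOCK_UN)   # release again so the attempts are independent
    return "ok"


def compare_locks():
    """Try both lock styles on a descriptor that is open read-only."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            return {
                "fcntl_write_lock": try_lock(fcntl.lockf, fd),
                "flock_exclusive": try_lock(fcntl.flock, fd),
            }
        finally:
            os.close(fd)


results = compare_locks()
print(results)
```

On Linux the POSIX write lock should be refused with EBADF because the descriptor is not open for writing, while the flock request should succeed, which is the property the replacement relies on.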

Changed 7 years ago by gnu

Patch to modprobe to avoid opening module file read/write.

  Changed 7 years ago by gnu

I made the attached patch to modutils-3.3-pre3. It applies without trouble to modutils-3.3-pre11. I don't know what version of modutils Fedora and/or OLPC are using; I wasn't willing to sit through yum to find out, when "apt-get sources module-init-utils" was done in seconds. I tested this on Ubuntu 7.04, including forking hundreds of racing processes (yes MODULENAME | xargs -P 500 -n 2 ./modprobe). The patch is straightforward, and should be suitable for upstream.

in reply to: ↑ 22   Changed 7 years ago by bertl

Replying to wmb@firmworks.com:

> The current status:
>
> d) The vserver patch above seems to prevent the multiple-link-breaking, but has another non-atomicity. The patch needs to be reviewed by qualified kernel/filesystem hackers. Ideally, we would like link-breaking to be atomic, but failing that, we might be able to convince ourselves that the window of vulnerability is acceptable. In any case, the patch needs to be sanity-checked.

It is quite hard to make a possibly long-lasting 'copy' operation atomic without locking down the entire filesystem, but I think what we really want is userspace consistency rather than atomicity, so I will propose a patch to achieve that shortly.

  Changed 7 years ago by jg

  • status changed from new to closed
  • resolution set to wontfix

Closing for now, as we're not doing vserver at this time.
