Escalated to Bugzilla from IssueTracker
This is a continuation of IT#127712. Please make this issue visible to NetApp, as they are involved in the ongoing discussion of the RHEL4/5 performance problems. This issue is about the increase in LOOKUP operations. Once again our co-worker Allan Soeby has come up with a test case:

------------------- Allan's mail ------------------------------------------
It seems that the RHEL4/5 kernel is very keen on invalidating its dentry cache when changes happen to a directory. I have created a test program that shows the behavior.

The test program (createunlink) runs 10 iterations of:
1. Create a file
2. Try to get status of 10 non-existing files
3. Get status of 10 existing files
4. Remove the file from item 1.

RHEL4/5 shows significantly different behavior when changes occur in a directory, issuing additional NFS ops:

For the first iteration the two are largely comparable. Where RHEL4/5 uses ACCESS calls, RHEL3 uses GETATTR calls. That part is conceptually OK.

For consecutive runs RHEL3 effectively uses cached information, whereas RHEL4/5 continues to produce the same operations as in the first iteration.

For these 10 iterations RHEL4/5 produces approx. 20 times more LOOKUPs and 10 times more ACCESS/GETATTR.
------------------- Allan's mail ------------------------------------------

I've attached the archive so that your engineers can verify the problem in your environment.

This event sent from IssueTracker by fleite [Support Engineering Group] issue 133433
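The four steps described in Allan's mail can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the attached createunlink program (which is not reproduced in this report): the function name `create_unlink`, the file names, and the returned counters are all hypothetical and chosen only for illustration.

```python
import os
import tempfile


def create_unlink(directory, iterations=10, n_files=10):
    """Hypothetical stand-in for the createunlink test case.

    Returns (found, not_found): how many stat() calls succeeded or failed.
    """
    # Set-up: the files that step 3 expects to exist already.
    existing = [os.path.join(directory, "existing_%d" % i) for i in range(n_files)]
    for path in existing:
        open(path, "w").close()
    # Names that are never created, used by step 2.
    missing = [os.path.join(directory, "missing_%d" % i) for i in range(n_files)]
    scratch = os.path.join(directory, "scratch")

    found = not_found = 0
    for _ in range(iterations):
        # 1. Create a file (this changes the directory).
        open(scratch, "w").close()
        # 2. Try to get status of non-existing files.
        for path in missing:
            try:
                os.stat(path)
                found += 1
            except FileNotFoundError:
                not_found += 1
        # 3. Get status of existing files.
        for path in existing:
            os.stat(path)
            found += 1
        # 4. Remove the file from step 1 (changes the directory again).
        os.unlink(scratch)

    # Clean up the set-up files.
    for path in existing:
        os.unlink(path)
    return found, not_found


if __name__ == "__main__":
    # Dry run in a local temporary directory; use a directory on an
    # NFS mount instead to generate the traffic pattern described above.
    with tempfile.TemporaryDirectory() as tmp:
        print(create_unlink(tmp))
```

To observe the effect on the wire, one would run the loop against a directory on an NFS mount and compare the client-side counters (for example `nfsstat -c`) taken before and after the run.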
SEG escalation -- The customer is reporting this problem as a follow-up to another very similar case with GETATTR calls (IT#127712). They have attached a test case which shows that RHEL 4/5 also issue many more LOOKUP calls. We would appreciate comments on whether this behaviour is normal and whether it can be alleviated. Thanks, -Imed Issue escalated to Support Engineering Group by: ichihi. Internal Status set to 'Waiting on SEG'
Hi Imed, we installed the latest FC6 kernel (Linux XXX 2.6.22.7-57.fc6 #1 SMP Fri Sep 21 20:23:24 EDT 2007 i686 athlon i386 GNU/Linux) on the RHEL5 test machine and found that the RHEL5 user space works with this later kernel. Running the test case revealed that the LOOKUP problem is also visible on the 2.6.22 kernel. This indicates that this is a new problem and not something already fixed in later kernel development. Unfortunately, the latest development kernel, 2.6.23-rc8 from Fedora 8, doesn't boot because the mkinitrd on RHEL5 is too old, so I can't check the latest developments.
Hello Stefan, Thank you for those additional results. Please keep posting findings of this sort; they should help us get a more accurate understanding of the problem. -Imed
Hi Imed, we received a patch proposal from Trond. It applies cleanly to the RHEL5 kernel. Running the test case indicates that the patch fixes the problem. It also looks as though the patched RHEL5 is better than RHEL3 (our current platform):

             RHEL5 | RHEL5-patched | ratio
getattr :       18 |            12 | 0.66
lookup  :      454 |            88 | 0.19
access  :      214 |            26 | 0.12

             RHEL3 | RHEL5-patched | ratio
getattr :       36 |            12 | 0.33
lookup  :       84 |            88 | 1.04
access  :       54 |            26 | 0.48

We'll rerun our full build tests with the patched kernel next to see whether this improvement also translates into decreased build times on RHEL5. I guess you'll probably wait for the official commit to the kernel development tree before initiating your processes.
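The third number in each row above appears to be the patched count divided by the baseline count, truncated to two decimals. That reading is an inference from the figures, not something stated in the comment. A short Python sketch that reproduces the numbers under that assumption (the dictionary and function names are illustrative only):

```python
import math

# Operation counts copied from the comment above.
RHEL5 = {"getattr": 18, "lookup": 454, "access": 214}
RHEL3 = {"getattr": 36, "lookup": 84, "access": 54}
RHEL5_PATCHED = {"getattr": 12, "lookup": 88, "access": 26}


def ratios(baseline, patched):
    """Return patched/baseline per operation, truncated to two decimals."""
    return {
        op: math.floor(100 * patched[op] / baseline[op]) / 100
        for op in baseline
    }


if __name__ == "__main__":
    for name, baseline in (("RHEL5", RHEL5), ("RHEL3", RHEL3)):
        for op, ratio in ratios(baseline, RHEL5_PATCHED).items():
            print("%-6s %-8s %.2f" % (name, op, ratio))
```

A ratio below 1 means the patched RHEL5 kernel issued fewer operations of that type than the baseline; only LOOKUP against RHEL3 comes out slightly above 1.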
Hello Stefan, Thanks a lot for sharing this patch with us. We have reported this issue to the same team which worked on the previous case. We will keep you posted. -Imed
Hi Imed, Is this a RHEL-5 issue? It keeps mentioning RHEL-4/5, but the Product selected is RHEL-4, and the patch is for RHEL-5. If they want the patch to be considered for RHEL-4 and RHEL-5 we'll need two tickets, each considering only one specific product. Thanks! Fabio Internal Status set to 'Waiting on Support'
To SEG -- Fabio, You're right. I should have made this more specific; my apologies. I will fork a separate RHEL 4 case now. Let's address RHEL 5 only here. -Imed Product changed from 'Red Hat Enterprise Linux 4' to 'Red Hat Enterprise Linux 5'
Hello Stefan, I will turn this case into a RHEL 5-specific case and create another one for RHEL 4. -Imed Summary edited.
Hi Imed, while the proposed patch resolves the test case, the results from the full build show that the patched RHEL5 kernel is even worse than the original kernel. We're currently studying tcpdumps taken with the new kernel.
Hello Stefan, This is a bit strange indeed. Do you think it's still worth checking with Engineering whether the patch can be included and whether it effectively resolves an existing problem? I don't know whether Trond is planning to submit the patch upstream; if and when he does, we will get feedback from additional reviewers. -Imed
Hi Imed, I don't think it is worth including in its current state. We'll have to see what we come up with in further analysis.
Stefan, Thank you for notifying us about this. I will put this into Long Term state until we can decide what to do about it. -Imed Internal Status set to 'Waiting on Customer' Status set to: Long Term
Hi Imed, the patch on its own was incomplete. Trond has now provided a set of 19 patches for the RHEL5 kernel. With these patches applied, the RHEL5 kernel is close to or better than the RHEL3 kernel on NFS operations for our customer application. We would kindly request that these be considered for inclusion in RHEL5 kernel development.
Stefan, Thanks for the patches. Let me consult with Engineering about including them. Cheers, Paul on behalf of Imed.
Fabio, Nokia have now presented us with a whole patch series that they want included in RHEL5, as the previous single patch apparently was not complete. I am unsure how we want to handle this. What are your thoughts? Cheers, Paul on behalf of Imed. Internal Status set to 'Waiting on SEG'
File uploaded: RHEL-5_devel.tar.bz2 (it_file 104136)
Hi Imed, Trond provided an updated patch set: "I've reordered some of the patches in order to try to put NFSv4-related optimisations towards the end, and fixes towards the beginning. This should hopefully make it easier for RedHat to figure out what they want to keep.."
Fabio, Customer has provided an updated patchset. Cheers, Paul
I studied all patches today and, as far as I can see, they are all benign and either fix or optimize things. All questions I had during the analysis were answered by the subsequent patches in the series. I'll send them to Engineering for another pair of eyes and to decide how we want to bring them in. Of course we'll have to do extensive QA before applying such lengthy changes. Thanks, Fabio
Created attachment 226021 [details] Proposed patch set written by the upstream NFS maintainer (Trond Myklebust)
Opening this bug up to NetApp per Nokia's instructions...
Created attachment 234341 [details] NFSv2: Ensure that the directory metadata gets revalidated on file create Testing with the new patchset revealed a missing revalidation in the case of NFSv2 file creation. The problem could potentially trigger with the old attribute revalidation code too.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.
Reopening and putting on the list for RHEL 5.3.
this bug has been tagged for inclusion in the RHEL5.2 release notes. please post the necessary release notes text for it. thanks!
Created attachment 290270 [details] Proposed patch Here is the combined proposed patch, ported from the 37 patches previously posted.
Created attachment 290272 [details] nfsstat output from vanilla RHEL-5 kernel Here is the nfsstat output from running 5 passes of the Connectathon testsuite (v2/udp, v2/tcp, v3/udp, v3/tcp, and v4) on the vanilla RHEL-5 .62 kernel.
Created attachment 290273 [details] nfsstat output from the patched RHEL-5 kernel Here is the nfsstat output from running 5 passes of the Connectathon testsuite (v2/udp, v2/tcp, v3/udp, v3/tcp, and v4) on the patched RHEL-5 .62 kernel.
So far, the changes seem to be a net negative. There are some small reductions in RPC counts, but also a large increase in the number of NFSv4 READ operations. Some more work needs to be done.
A question -- where are the 19 patches mentioned in Comment #14?
Created attachment 291741 [details] nfsstat output from vanilla RHEL-5 kernel Here is the current "nfsstat -c" output from the vanilla RHEL-5 .68 kernel.
Created attachment 291742 [details] nfsstat output from the patched RHEL-5 kernel Here is the "nfsstat -c" output from the patched RHEL-5 .68 kernel.
Created attachment 291744 [details] Proposed patch Here is the current proposed set of changes. They include a small bug fix to nfs4_atomic_open which was required in order to stabilize the system during Connectathon runs. The "nfsstat -c" results from the RHEL-5 .68 kernels show dramatic reductions in NFSv2 READ operations and also in NFSv4 GETATTR and LOOKUP operations. However, they show increases in NFSv3 LOOKUP and ACCESS operations.
Created attachment 293197 [details] "before" nfsstat -c statistics Here are the "nfsstat -c" statistics from a vanilla i686 RHEL-5, b75.
Created attachment 293198 [details] "after" nfsstat -c statistics Here are the "nfsstat -c" statistics from a patched i686 RHEL-5, b75.
Created attachment 293201 [details] Proposed patch Here is a tested proposed patch. It contains a couple of fixes for bugs which were discovered along the way. One was a cut-and-paste error in nfs_atomic_open(). This was leading to a system crash from use of a NULL pointer, because the wrong pointer was being assigned. The other problem was a hang caused by an rpciod needing to make a synchronous call to release a delegation. This change was constructed by Trond and backported to the RHEL-5 kernel. The performance increase seems significant, especially in the reduction in the number of over-the-wire LOOKUP and READ operations. The number of over-the-wire GETATTR operations increased, but this was not unexpected. The cache validation has to happen somehow, whether by GETATTR or by some other operation which returns attributes. Fortunately, GETATTR is generally the cheapest operation for the server.
One other note: the nfsstat statistics were gathered by running five passes of the Connectathon testsuite against three different servers. These servers were running RHEL-4, Solaris 10, and RHEL-5. The Connectathon testsuite does not generate a load which is typical of any real operation mix, so the changes in the numbers of over-the-wire operations may vary depending upon the application load and behavior.
*** Bug 431092 has been marked as a duplicate of this bug. ***
in 2.6.18-78.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
I have built a package for CentOS that addresses the NFS issues of 2.6.18-53.x.x and includes everything up to the 2.6.18-51.1.13 security patch as well: http://people.centos.org/~hughesjr/kernel/5/
I confirm that the kernel offered by Johnny fixed the NFS problem (at least in my test runs using an x86_64 client).
Does kernel 2.6.18-53.1.14.el5 that's just been released address this issue? There are a few NFS items mentioned in the errata. This introduced NFS slowness is really hurting us.
My understanding from earlier comments is that the NFS fix will be in 5.2.
Yes, the changes are currently in the RHEL-5.2 beta kernel. Comment #54 implied that there was a CentOS kernel which contains the patch as well.
The CentOS plus kernel (2.6.18-53.1.13.el5.centos.plus) contains the patch. However, staying within Red Hat patching would seem to be the correct thing to do on a Red Hat system (so that it's supported under our contracts). I'm (very) surprised that Red Hat feel this should wait until 5.2; after all, it was broken by a Red Hat patch release of the kernel! Rather like Glenn Morris in the duplicate bug (http://bugzilla.redhat.com/show_bug.cgi?id=431092), I'm left with a nasty choice between fast NFS and insecure systems.
It seems that you want stable products, but without giving us the chance to create and attempt to ensure those stable products. We have processes which are designed to give us the most stable products that we can at any given point. Ignoring these processes will not lead to better products. Clearly, the NFS implementation was not in the shape that we all would have liked. However, rushing out another patch is not the right way to address this situation. The patch in question here does much more than just undo the previous patch, so it needs time for testing and experience to tell us whether it is acceptable or not.
I appreciate the need for thorough testing of a new kernel, but I don't understand why "The patch in question here does much more than just undo the previous patch". Why were the previous NFS changes made in the first place? Can somebody who knows more about the kernel than I do (i.e., just about anybody) explain what caused the NFS slowdown, and what changes need to be done to fix it? Just trying to get my head around the problem, so I can explain it to my users.
The performance issue was introduced by a fix for another bug. That bug was causing system failures under certain circumstances, so it was a must-fix. The problem with pure performance issues, such as this one, is that they are not detectable via normal functionality testing. There are no standard tests for measuring NFS clients. Thus, when we ran our gamut of tests, the system functionality appeared to be good and we missed the performance impact. Explain to your users the dangers of making changes to a system as complex as a Linux kernel and how such changes can sometimes have unexpected side effects.
There was a serious bug in the kernel giving anyone with access to a shell the ability to become root (in my case about 12000 people; in total for my university about 60000). You should have patched the then-current kernel for this and this only. My department can't go back because of the serious exploit, so we have to suffer poor performance. Rule of thumb: fix one thing at a time. If you have to do an emergency fix, fix only that and don't include anything else. Know what you fixed and be prepared to roll back. It makes your life easier(tm). And it's easy to be a smart ass afterwards as well :-)
Sigurd -- they did do that. As you can see from the dates of the comments above, the bug in question here was introduced with a _previous_ update.
Just a quick query, what is the current expected timescale for the 5 update 2 release? I'm assuming that 5.2 will probably include fixes for this (and #429109 if we are lucky). If the expected wait is comparable to the QA-time then there probably isn't any point in pressing for the fixes to be added to the .1.z series (or whatever the name for current EL5 update 1 should be).
Jonathan, I can't give you expected GA dates without proper NDA, but you can probably do some extrapolation assuming the 5.2 Public Beta is being released very soon.
re: #60 for the record ... the centosplus kernel in centos does have that patch rolled in, but we also have created a kernel-2.6.18-53.1.13.el5.bz321111.src.rpm that has ONLY the NFS patches to improve performance. I will be updating that to kernel-2.6.18-53.1.14 later today. You can test that and use it if it works for you (instead of using a .80ish kernel) ... or use it to build your own from RedHat sources if you don't want to use a CentOS kernel. See the link from comment #54. CentOS will be maintaining that kernel updated in that location until this is fixed in a released and production kernel. It is, of course, just our best effort at fixing the issue.
Thanks Johnny. This is a great example of how CentOS benefits the RHEL ecosystem as a whole -- CentOS users using the kernel with only the above fix deviating from the RHEL kernel can provide real-world feedback on the effects.
Many thanks, Johnny. I have installed the updated 2.6.18-53.1.14.el5.bz321111 kernel you rebuilt: http://people.centos.org/~hughesjr/kernel/5/ and confirm that it does not have the NFS issue. I am also running on other machines the centosplus kernel 2.6.18-53.1.14 and find no NFS problem there either. Akemi
Very much appreciated Johnny. We first noticed this issue because of the sheer amount of NFS traffic that was being generated compared to identical machines running FC7.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html