---
title: 'NFS Out Of Sync: Apache Outage'
date: '2018-12-08T21:11:41+01:00'
author: greenday
twitter: gruunday
description: NFS Out Of Sync
tags:
- Apache
- NFS
---
# Redbrick Web Server Outage 07/12/2018 - 08/12/2018
## Alert Received
* A raintank alert was received @ 23:38 reporting that the website was down
* A customer reported that the site was down @ 00:38
## Alert Validation
* Visiting the site confirmed that Apache was returning an error
* The error was a 403: Apache could not read the files
* An interesting note is that the web server also could not read the custom Redbrick error page, another hint that this was bigger than just one folder
## Fix
* Error logs were investigated
* The Apache error logs showed the following
```
[Sat Dec 08 11:53:33 2018] [error] [client 66.249.81.152] (1)Operation not permitted: file permissions deny server access: /webtree/redbrick/rb_custom_error/403.html
[Sat Dec 08 11:53:33 2018] [crit] [client 66.249.81.150] (1)Operation not permitted: /home/member/m/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable
[Sat Dec 08 11:53:33 2018] [error] [client 66.249.81.150] (1)Operation not permitted: file permissions deny server access: /webtree/redbrick/rb_custom_error/403.html
[Sat Dec 08 11:53:33 2018] [crit] [client 66.249.81.154] (1)Operation not permitted: /home/member/m/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable
[Sat Dec 08 11:53:33 2018] [error] [client 66.249.81.154] (1)Operation not permitted: file permissions deny server access: /webtree/redbrick/rb_custom_error/403.html
[Sat Dec 08 11:53:33 2018] [error] [client 66.249.75.33] PHP Warning: Unknown: failed to open stream: Operation not permitted in Unknown on line 0
[Sat Dec 08 11:53:33 2018] [crit] [client 46.229.168.140] (1)Operation not permitted: /webtree/w/wiki/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable
[Sat Dec 08 11:53:33 2018] [error] [client 46.229.168.140] (1)Operation not permitted: file permissions deny server access: /webtree/redbrick/rb_custom_error/403.html
[Sat Dec 08 11:53:38 2018] [crit] [client 157.55.39.210] (1)Operation not permitted: /webtree/p/pubsoc/.htaccess pcfg_openfile: unable to check htaccess file, ensure it is readable
[Sat Dec 08 11:53:38 2018] [error] [client 157.55.39.210] (1)Operation not permitted: file permissions deny server access: /webtree/redbrick/rb_custom_error/403.html
```
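
To see at a glance how widespread the failures are, the Apache log lines above can be grouped by the top-level directory of the path that was denied. The following is a minimal Python sketch, not part of the original investigation: the function name `denied_paths_by_root` and the regular expression are illustrative assumptions based only on the excerpt shown here.

```python
import re
from collections import Counter

# Matches the path in lines such as
#   "(1)Operation not permitted: /home/member/m/.htaccess pcfg_openfile: ..."
#   "(1)Operation not permitted: file permissions deny server access: /webtree/..."
# The pattern is an assumption derived from the excerpt, not a general Apache log parser.
DENIED_RE = re.compile(
    r"\(1\)Operation not permitted: "
    r"(?:file permissions deny server access: )?(/\S+)"
)


def denied_paths_by_root(lines):
    """Count 'Operation not permitted' errors per top-level directory."""
    counts = Counter()
    for line in lines:
        match = DENIED_RE.search(line)
        if match:
            # "/webtree/w/wiki/.htaccess" -> "/webtree"
            root = "/" + match.group(1).split("/")[1]
            counts[root] += 1
    return counts
```

Run against the excerpt, both `/webtree` and `/home` show up, which matches the observation that more than one folder was affected.
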
* The Apache errors led us to check the kernel log with dmesg
```
[36821900.601330] NFS: Server 192.168.0.24 reports our clientid is in use
[36821900.605982] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821905.612160] NFS: Server 192.168.0.24 reports our clientid is in use
[36821905.616701] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821910.622881] NFS: Server 192.168.0.24 reports our clientid is in use
[36821910.626795] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821915.633815] NFS: Server 192.168.0.24 reports our clientid is in use
[36821915.637714] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821920.644780] NFS: Server 192.168.0.24 reports our clientid is in use
[36821920.648684] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821925.655444] NFS: Server 192.168.0.24 reports our clientid is in use
[36821925.660511] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821930.666309] NFS: Server 192.168.0.24 reports our clientid is in use
[36821930.670822] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821935.677022] NFS: Server 192.168.0.24 reports our clientid is in use
[36821935.680605] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821940.687986] NFS: Server 192.168.0.24 reports our clientid is in use
[36821940.691938] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821945.698937] NFS: Server 192.168.0.24 reports our clientid is in use
[36821945.702396] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821950.709790] NFS: Server 192.168.0.24 reports our clientid is in use
[36821950.713700] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821955.720501] NFS: Server 192.168.0.24 reports our clientid is in use
[36821955.724923] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821960.731372] NFS: Server 192.168.0.24 reports our clientid is in use
[36821960.735952] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821965.742345] NFS: Server 192.168.0.24 reports our clientid is in use
[36821965.746246] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821970.753027] NFS: Server 192.168.0.24 reports our clientid is in use
[36821970.756539] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821975.763974] NFS: Server 192.168.0.24 reports our clientid is in use
[36821975.767870] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821980.774846] NFS: Server 192.168.0.24 reports our clientid is in use
[36821980.779018] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821985.785629] NFS: Server 192.168.0.24 reports our clientid is in use
[36821985.790880] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821990.796508] NFS: Server 192.168.0.24 reports our clientid is in use
[36821990.800403] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36821995.807262] NFS: Server 192.168.0.24 reports our clientid is in use
[36821995.811159] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822000.818190] NFS: Server 192.168.0.24 reports our clientid is in use
[36822000.822107] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822005.828955] NFS: Server 192.168.0.24 reports our clientid is in use
[36822005.833709] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822010.839894] NFS: Server 192.168.0.24 reports our clientid is in use
[36822010.845658] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822015.850727] NFS: Server 192.168.0.24 reports our clientid is in use
[36822015.854476] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822020.861622] NFS: Server 192.168.0.24 reports our clientid is in use
[36822020.865539] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
[36822025.872404] NFS: Server 192.168.0.24 reports our clientid is in use
[36822025.876325] NFS: state manager: lease expired failed on NFSv4 server 192.168.0.24 with error 1
```
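
The repeating pair of kernel messages can also be picked out programmatically. The Python sketch below counts the "clientid is in use" messages per NFS server and records the first and last kernel timestamps. It is an illustration only: `summarize_clientid_errors` is a hypothetical helper, and the pattern is written against the exact message format in the excerpt, which may differ between kernel versions.

```python
import re

# Assumed format, taken from the excerpt:
#   "[36821900.601330] NFS: Server 192.168.0.24 reports our clientid is in use"
CLIENTID_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\] NFS: Server (\S+) reports our clientid is in use"
)


def summarize_clientid_errors(lines):
    """Return {server: {"count", "first", "last"}} for 'clientid is in use' lines."""
    summary = {}
    for line in lines:
        match = CLIENTID_RE.match(line.strip())
        if not match:
            continue  # e.g. the "lease expired failed" lines
        timestamp, server = float(match.group(1)), match.group(2)
        entry = summary.setdefault(
            server, {"count": 0, "first": timestamp, "last": timestamp}
        )
        entry["count"] += 1
        entry["first"] = min(entry["first"], timestamp)
        entry["last"] = max(entry["last"], timestamp)
    return summary
```

In the excerpt the message repeats roughly every five seconds for the same server, 192.168.0.24, which suggests the client kept retrying and was refused each time.
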
* From the dmesg output an admin identified that the error "clientid is in use" can mean that the NFS (Network File System) server and the web server were out of sync
* This would explain why the Apache error messages were about permissions
* The next step was to try to unmount the NFS share and remount it
* Apache was using the NFS share and would not let it unmount
* Apache was stopped
* Something was still stopping the NFS share from unmounting
* It was decided that the safest option, rather than forcing an unmount, was to reboot the machine
* The machine was rebooted and the NFS share mounted successfully
* The website and files were restored to their working state
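
When an unmount is refused because the filesystem is busy, it helps to know which processes still hold open files or a working directory on it. Tools such as `lsof` and `fuser` report this; the Python sketch below does a similar check by reading the symlinks under `/proc` on Linux. It is an illustration rather than what the admin team ran: `find_holders` is a hypothetical helper, and `/webtree` is used as the mount point only because it appears in the logs above.

```python
import os


def find_holders(mount_point, proc_root="/proc"):
    """Return {pid: [paths]} for processes whose cwd or open files are under mount_point.

    mount_point is assumed to be an absolute, already-canonical path. It is
    deliberately not resolved here, so the function never touches the
    (possibly hung) filesystem itself; it only reads links under proc_root.
    """
    mount = mount_point.rstrip("/") or "/"
    prefix = mount if mount == "/" else mount + "/"
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount or target.startswith(prefix):
                holders.setdefault(int(entry), []).append(target)
    return holders


if __name__ == "__main__":
    # Needs root to see other users' processes.
    for pid, paths in sorted(find_holders("/webtree").items()):
        print(pid, paths)
```

This only shows userspace holders. Something held inside the kernel, such as another mount stacked on top of the share, would not appear, which is one way an unmount can still fail after every listed process has been stopped.
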

On behalf of the admin team we apologise for the outage
Regards,
greenday && The Admin Team