WSUS and high CPU and memory usage caused by update metadata

The post High CPU/High Memory in WSUS following Update Tuesdays, dated August 18, 2017, reports that spikes in CPU and memory usage have been observed on WSUS servers after the updates published on Update Tuesdays. At the time of writing, Microsoft is still investigating the problem.

“Recently, we’ve seen an increase in the number of high CPU/High Memory usage problems with WSUS, including WSUS in a System Center Configuration Manager environment – these have mostly corresponded with Update Tuesdays.

“Microsoft support has determined that the issue is driven primarily by the Windows 10 1607 updates, for example KB4022723, KB4022715, KB4025339, etc. See here for the list of Windows 10 1607 updates

“Microsoft is also aware of a known issue with KB4034658 that will cause Windows 10 1607 clients to run a full scan after install – Microsoft is investigating and the latest information is available here.

“These updates have large metadata payloads for the dependent (child) packages because they roll up a large number of binaries. Windows 10, versions 1507 (Windows 10 RTM) and 1511 updates can also cause this, though to a lesser extent. Windows 10, version 1703 is still recent enough that the metadata is not that large yet (but will continue to grow).”

The following symptoms may be observed when the issues described above occur:

  • High CPU on your WSUS server – 70-100% CPU in w3wp.exe hosting WsusPool
  • High memory in the w3wp.exe process hosting the WsusPool – customers have reported memory usage approach 24GB
  • Constant recycling of the W3wp.exe hosting the WsusPool (identifiable by the PID changing)
  • Clients failing to scan with 8024401c (timeout) errors in the WindowsUpdate.log
  • Mostly 500 errors for the /ClientWebService/Client.asmx requests in the IIS logs
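
The last symptom can be checked directly in the IIS logs. The following is a minimal sketch, assuming the logs use the W3C extended format with a `#Fields:` header that includes `cs-uri-stem` and `sc-status`. The function name and the sample lines are hypothetical and synthetic, written only to illustrate the format; they are not taken from a real server.

```python
from collections import Counter

# Path of the WSUS client web service named in the symptom list
# (compared case-insensitively).
CLIENT_ASMX = "/clientwebservice/client.asmx"


def count_client_asmx_statuses(log_lines):
    """Count HTTP status codes of Client.asmx requests in W3C-format IIS log lines."""
    fields = []
    statuses = Counter()
    for raw in log_lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed or truncated entries
        row = dict(zip(fields, values))
        if row.get("cs-uri-stem", "").lower() == CLIENT_ASMX:
            statuses[row.get("sc-status", "unknown")] += 1
    return statuses


# Synthetic sample data (assumption): not real log output.
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2017-08-18 09:00:01 POST /ClientWebService/Client.asmx 500 110234",
    "2017-08-18 09:00:02 POST /ClientWebService/Client.asmx 200 512",
    "2017-08-18 09:00:03 POST /ClientWebService/Client.asmx 500 110871",
    "2017-08-18 09:00:04 GET /Content/placeholder.cab 200 40",
]

statuses = count_client_asmx_statuses(SAMPLE_LOG)
total = sum(statuses.values())
print(f"Client.asmx requests: {total}, HTTP 500: {statuses['500']}")
```

A high share of 500 responses for Client.asmx, together with the other symptoms, would be consistent with the problem described in the article.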

The cause appears to be the growth of the update metadata. The issues have been observed with Windows 10 1507, 1511, and 1607 clients, whereas Windows 10 1703 does not currently cause the problem, but only because its metadata is still small.

“For large metadata packages and many simultaneous requests, it can take longer than ASP.NET’s default timeout of 110 seconds to retrieve all of the metadata the client needs.”

“When the thread abort happens, all of the metadata that has been retrieved to that point is discarded and is not cached. As a result, WSUS enters a continuous cycle where the data isn’t cached, the clients can never complete the scan and continue to rescan.”

“Another issue that can occur is the WSUS application pool keeps recycling because it exceeds the private memory threshold (which it is very likely to do if the limit is still the default of 1843200). This recycles the app pool, and thus the cached updates, and forces WSUS to go back through retrieving updates from the database and caching them.”
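
To put the default limit in perspective, here is a small arithmetic sketch. It assumes the value 1843200 is expressed in kilobytes (the unit IIS uses for the application pool private memory limit) and treats the 24GB peak from the symptom list as binary gigabytes; both points are assumptions, since the quoted text does not state the units.

```python
# Default private memory limit quoted in the article (assumed to be in KB).
PRIVATE_MEMORY_LIMIT_KB = 1843200
# Peak w3wp.exe memory usage reported in the symptom list (assumed binary GB).
OBSERVED_PEAK_GB = 24

limit_mb = PRIVATE_MEMORY_LIMIT_KB / 1024
limit_gb = limit_mb / 1024
ratio = OBSERVED_PEAK_GB / limit_gb

print(f"Default limit: {limit_mb:.0f} MB ({limit_gb:.2f} GB)")
print(f"Reported peak is about {ratio:.1f} times the default limit")
```

Under these assumptions the default limit is well below 2 GB, so a pool that needs many gigabytes to cache the metadata would hit the threshold and recycle long before the cache is complete.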

The article High CPU/High Memory in WSUS following Update Tuesdays describes several settings that can be configured in IIS, on the site used by WSUS, to mitigate the issues:

  • Configure IIS to stop recycling the App Pool
    The goal is to stop the app pool recycling since a recycle clears the cache. The defaults in IIS for Private Memory limit of 1843200 will cause the pool to constantly recycle. We want to make sure it doesn’t recycle unless we intentionally restart the app pool.
  • Limit the number of inbound connections to WSUS (the article suggests applying this setting when more than 1000 connections are observed; if the problem occurs, it may be appropriate to set the limit to roughly half the number of clients)
    Reducing the number of allowed connections will cause clients to receive 503 errors (service not available), but they will retry.
  • Increase the ASP.NET timeout
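
As an illustration of the last setting, the sketch below raises the `executionTimeout` attribute of the `httpRuntime` element, which is where ASP.NET reads the request timeout (110 seconds by default) in a web.config file. The function name, the simplified sample configuration, and the value 7200 are assumptions chosen for illustration, not values taken from the article; the file to edit and the value to use should come from the article itself.

```python
import xml.etree.ElementTree as ET


def _child(parent, tag):
    """Return the first direct child with the given tag, creating it if missing."""
    for element in parent:
        if element.tag == tag:
            return element
    return ET.SubElement(parent, tag)


def set_execution_timeout(web_config_xml, seconds):
    """Return the web.config text with httpRuntime executionTimeout set to `seconds`."""
    root = ET.fromstring(web_config_xml)
    system_web = _child(root, "system.web")
    http_runtime = _child(system_web, "httpRuntime")
    http_runtime.set("executionTimeout", str(seconds))
    return ET.tostring(root, encoding="unicode")


# Simplified stand-in for a web.config file (assumption, not the real WSUS file).
SAMPLE_CONFIG = (
    "<configuration><system.web>"
    '<httpRuntime maxRequestLength="4096" />'
    "</system.web></configuration>"
)

# 7200 seconds is an illustrative value only.
updated = set_execution_timeout(SAMPLE_CONFIG, 7200)
print(updated)
```

Note that this parser drops XML comments when rewriting the document, so on a real server it would be safer to back up the file first or to make the change by hand.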

For details on these configurations, and on how to monitor WSUS to verify that the problem has been resolved, see the article High CPU/High Memory in WSUS following Update Tuesdays.

[Update 01]

To resolve the issue, Microsoft released the following updates on August 28, 2017: