Older Intel Atom C2000 Series Server Chips May Stop Working After a While, and There’s no Fix

It takes time and effort to debug hardware and software and get a product right, but some bugs are hard to reproduce or only show up over time. It appears that some Intel Atom C2000 series processors for microservers may stop working after about 18 months, with the likelihood of failure increasing over time, because their clock signals stop functioning.

Atom C2000 Block Diagram

This is documented in the Intel Atom Processor C2000 Product Family Specification Update, where erratum AVR54 explains the issue:

AVR54. System May Experience Inability to Boot or May Cease Operation

Problem: The SoC LPC_CLKOUT0 and/or LPC_CLKOUT1 signals (Low Pin Count bus clock outputs) may stop functioning.
Implication: If the LPC clock(s) stop functioning the system will no longer be able to boot.
Workaround: A platform level change has been identified and may be implemented as a workaround for this erratum.
Status: For the steppings affected, see Table 1, “Errata Summary Table” on page 9.

The table on page 9 shows that stepping “B0” suffers from this problem. The issue affects existing motherboards and servers based on the Atom C2000, and companies like Cisco will provide replacements:

Recently, Cisco became aware of an issue related to a component manufactured by one supplier that affects some Cisco products. In some units, we have seen the clock signal component degrade over time. Although the Cisco products with this component are currently performing normally, we expect product failures to increase over the years, beginning after the unit has been in operation for approximately 18 months. Once the component has failed, the system will stop functioning, will not boot, and is not recoverable. This component is also used by other companies.

We have identified all Cisco products that have this component and worked with the supplier to quickly put a fix in place. All products shipping currently do not have this issue. To support our customers and partners, Cisco will proactively provide replacement products under warranty or covered by any valid services contract dated as of November 16, 2016, which have this component. Due to the age-based nature of the failure and the volume of replacements, we will be prioritizing orders based on the products’ time in operation.

The good news is that a new revision of the chip fixes the issue for new processors, but there’s no fix for older ones. So if you own any such systems and they have suddenly stopped working or become unstable, this may be the reason. You may also want to check whether you can get a replacement while the system is still under warranty, whether it currently works or not.
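If you are not sure whether a Linux machine is built around one of these chips, a quick look at /proc/cpuinfo can help. The snippet below is only a minimal sketch: the helper names (`parse_cpuinfo`, `describe_c2000`) are made up for this example, and it assumes the model name string contains something like "Atom(TM) CPU  C2750". It reports the raw stepping number as shown by the kernel, and mapping that number to the “B0” stepping named in the errata table is left to Intel's Specification Update.

```python
import re


def parse_cpuinfo(text):
    """Return the key/value fields of the first processor entry
    in /proc/cpuinfo-style text."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor entry; stop after the first.
            if fields:
                break
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_c2000(text):
    """Hypothetical helper: flag Atom C2000 parts by model name and
    return the raw stepping value for manual comparison."""
    info = parse_cpuinfo(text)
    name = info.get("model name", "")
    # Assumption: C2000 parts identify themselves as "Atom(TM) CPU  C2xxx".
    is_c2000 = re.search(r"Atom\(TM\) CPU\s+C2\d{3}", name) is not None
    return {
        "model_name": name,
        "is_c2000": is_c2000,
        "stepping": info.get("stepping"),
    }


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(describe_c2000(f.read()))
    except OSError:
        print("Could not read /proc/cpuinfo (not a Linux system?)")
```

A match only tells you the system uses a C2000 series part. Whether a given unit is affected, and whether the board vendor applied the platform level workaround, still has to be confirmed with the manufacturer.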

Thanks to Mike for the tip.
