Bitmain changed its ASIC design. Miners need to be ready


Disclosure: The following is an open discussion of newer-generation Bitmain Antminer ASICs, conducted by the Compass Mining operations team.

Thanks to Fred Gimino for his help in writing this report.

Beginning in 2022, Bitmain, the largest single designer and manufacturer of Bitcoin mining ASICs, started making production changes to several versions of its S19 series. The scope and possible repercussions of these changes are still being understood, as are the motivations behind them. Our purpose in writing this report is to help shed light on ASIC manufacturing, one of the most opaque and misunderstood parts of the Bitcoin industry, and perhaps to help miners improve their operational efficiency.

To date, Compass Mining’s operations team has identified three issues affecting two versions of the Antminer S19 series: the S19 90T and the S19 XP. Other versions could be affected as well, including the recently announced S19j Pro+, because the manufacturing changes this report assumes likely apply to them too.

Summary

We’ve noted the following changes in Antminer ASICs:

  1. Lack of a peripheral interface controller (PIC) on units, meaning individual hashboards cannot be easily controlled.
  2. Use of aluminum plating on one side of a hashboard, causing increased chances of units overheating.
  3. Consolidation of all components onto one side of the board, causing increased likelihood of hashboard errors.

From the manufacturing changes above, we can identify a few possible outcomes for Bitcoin miners running newer-generation units.

  1. Without a PIC, units will not be able to keep running at reduced hashrate (underhashing) on one or two boards.
  2. Aluminum-plated hashboards will have a significantly higher failure rate in warmer climates (e.g., Texas) than those built on laminated multilayer printed circuit boards (PCBs).
  3. Aluminum-plated hashboards will have higher servicing needs due to the increased chance of traces, chips or other parts tripping an error.
  4. Non-Bitmain ASIC repair shops will find these units harder to repair, because the aluminum plate makes replacing malfunctioning chips substantially more difficult.

Deployment context

Facility: We first became aware of performance issues with the S19 XP series after deployment to our Texas 4 facility partner, located outside Houston, Texas. Humidity and heat are concerns due to the site’s location, but nothing miners haven’t handled before. Given how new the site is, we can’t rule out that the facility is amplifying the issues. We continue to look into this angle.

Manufacturer: Another possible cause could be the batch of units from Bitmain. It’s common knowledge in Bitcoin mining not to buy the first batch of a new ASIC. Unknown errors are often only revealed over time, so it’s best to let your competitors find them first.

Manufacturers occasionally ship a ‘bad batch,’ with lower-quality hardware than the preceding or following units. The ASIC chips themselves can also come from a sub-optimal ‘bin,’ leading to lower performance. Lastly, the control boards and other electronics within the unit can be poorly assembled.

As we’ll show in a moment, our team concluded that we may indeed have received a bad batch from Bitmain, the kind of problem Bitmain resolves through its RMA program. However, our operations team also uncovered design changes made within the last six months that acutely affect a unit’s performance: the lack of a peripheral interface controller (PIC), the use of an aluminum backing board on one side of the hashboard, and a shift from a laminated double-sided PCB to a single-sided PCB.

Let’s look at each of these individually.

Lack of PIC

The PIC is a type of microcontroller that contains a processor core, memory, and various input/output peripherals on a single chip. PICs are used to control and monitor a wide range of devices and systems, including industrial equipment, automotive systems, and consumer electronics.

In ASICs, a PIC sits at the top of a hashboard and allows each hashboard to be addressed individually. Without it, the miner has to be addressed as a single unit rather than as three separate hashboards.

The PIC in the center, marked with a red dot, on an older-model S19 series machine.
An S19 XP without a PIC.

Although the PIC is a small part, its absence means a miner may not be able to underhash by running on fewer boards. (For reference, most ASICs have three hashboards.)

Instead, a miner fails completely. We’ve found this to be the case with our S19 XP 141 TH units at Texas 4, which have failed completely when only one board is having issues. Typically, a miner would prefer to continue accruing Bitcoin until the bad board can be addressed.
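
To make the difference concrete, here is a minimal Python sketch of how a unit's output degrades with and without per-board control. It assumes the 141 TH/s rating is split evenly across three hashboards. The function name and the per_board_control flag are illustrative only and are not part of any Bitmain firmware or API.

```python
def unit_hashrate(board_ok, rated_ths=141.0, per_board_control=True):
    """Return the expected hashrate (TH/s) of one miner.

    board_ok: list of booleans, one per hashboard (True = healthy).
    rated_ths: nameplate hashrate, assumed split evenly across boards.
    per_board_control: True models a unit whose boards can be addressed
        individually (PIC present); False models a unit that must be
        treated as a whole.
    """
    per_board = rated_ths / len(board_ok)
    if per_board_control:
        # Healthy boards keep hashing; faulty boards are simply dropped.
        return per_board * sum(board_ok)
    # Without per-board control, any faulty board takes the unit offline.
    return rated_ths if all(board_ok) else 0.0


boards = [True, True, False]  # one bad board out of three
with_pic = unit_hashrate(boards, per_board_control=True)
without_pic = unit_hashrate(boards, per_board_control=False)
print(f"with per-board control:    {with_pic:.1f} TH/s")
print(f"without per-board control: {without_pic:.1f} TH/s")
```

Under these assumptions, one bad board costs about a third of the output on a unit with per-board control, but all of it on a unit without.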

The lack of a PIC is compounded by the abrupt change to aluminum plates on all hashboards. If a board overheats because of the way aluminum spreads heat, the entire unit will go down instead of just one board.

Cumbersome software workarounds with high programmatic overhead, such as a virtual microcontroller, may allow the hashboards to be addressed and controlled independently of each other. However, doing so could violate the Bitmain warranty, so that risk must be weighed.

Aluminum boards

Perhaps the largest issue is the swap to an aluminum-backed, single-sided PCB on some Antminer hashboards, a design that is common in Avalon and Whatsminer units.

Previously, almost all Antminers used laminated multilayer PCBs, with traces built into the layers of laminate material. One side of the board carried the chips (signal side), connected by traces, and the other side carried the heatsinks (power side). Recent manufacturing changes swapped the laminated multilayer PCB for an aluminum-backed, single-layer PCB, which puts both power and signal on the same side of the board.

This change increases overall trace density, which raises the probability of an irreparably broken or blown trace, as confirmed by Bitmain technicians Compass spoke with. (Traces connect the ASIC chips together and are protected by trace covers. Typically, the traces are built into the layers of the PCB.)

There are positives and negatives to this approach. On the positive side, cheaper materials decrease production costs. Newer S19 XPs also have one large heatsink, which may transfer heat better than the multiple heatsinks baked into the PCB.

However, the approach has two possible drawbacks: poor heat transfer, due to the aluminum material in conjunction with the current chassis design, and crowding of components onto one layer of PCB.

Chassis

The chassis itself has also been criticized when compared to the tubular Antminer S9 or Whatsminer M20/M30 series units, which operate comfortably at higher temperatures. In the Compass operations team's opinion, swapping to an aluminum-plated board without changing the chassis design will likely result in higher failure rates for machines.

The issue here is that the aluminum acts as a vehicle for heat, spreading it across the entire, thinner single-sided PCB. We therefore expect to see higher PCB temperatures on these boards in areas that would not normally receive as much heat on a double-sided board without the aluminum backing.

Moreover, the thermal emissivity of aluminum, especially in an oxidized form, will contribute to heating the space inside the chassis. In an air-cooled environment, one solution would be to increase the mass airflow enough to dissipate the heat stored within the miner, whether through a different design or stronger fans.
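
For a rough sense of scale, the airflow needed to carry away a given heat load follows from the steady-state energy balance: heat removed equals mass flow times the specific heat of air times the air temperature rise. The Python sketch below applies that relation. The 3,000 W heat load and the temperature rises are assumed round numbers for illustration, not measurements from our units, and the air properties are approximate values near room temperature.

```python
AIR_DENSITY = 1.2         # kg/m^3, approximate for air near 20 C at sea level
AIR_SPECIFIC_HEAT = 1005  # J/(kg*K), approximate for dry air
M3S_TO_CFM = 2118.88      # cubic metres per second -> cubic feet per minute


def required_airflow_cfm(heat_watts, delta_t_c):
    """Airflow (CFM) needed to remove heat_watts with a delta_t_c air temperature rise."""
    if delta_t_c <= 0:
        raise ValueError("delta_t_c must be positive")
    mass_flow = heat_watts / (AIR_SPECIFIC_HEAT * delta_t_c)  # kg/s
    volume_flow = mass_flow / AIR_DENSITY                      # m^3/s
    return volume_flow * M3S_TO_CFM


# Assumed heat load of 3,000 W at three hypothetical temperature rises.
for delta_t in (20, 15, 10):
    cfm = required_airflow_cfm(3000, delta_t)
    print(f"dT = {delta_t:2d} C -> about {cfm:.0f} CFM")
```

The relation is inversely proportional: halving the allowable air temperature rise, as can happen when intake air is already warm, doubles the airflow required to remove the same heat.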

So, what to do?

We've identified the following possible solutions for the issues described above.

  1. Third-party firmware that may allow the miner to run on one or two boards.
  2. Third-party firmware that allows for lowering the frequency and voltage to a level that keeps the miner cooler.
  3. Specifying the optimal environmental conditions for these miners, including the upside of immersion, optimal temperature and humidity settings, and the optimal voltage range for machine longevity.
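
The second option rests on a well-known rule of thumb for CMOS logic: dynamic power scales roughly with voltage squared times frequency. Below is a minimal Python sketch of that trade-off. It assumes hashrate scales linearly with frequency and ignores static (leakage) power, both of which are simplifications, and the scaling factors are hypothetical values, not recommended settings.

```python
def relative_power(voltage_scale, frequency_scale):
    """Dynamic power relative to stock, using P ~ V^2 * f."""
    return voltage_scale ** 2 * frequency_scale


def relative_efficiency(voltage_scale, frequency_scale):
    """Energy per hash relative to stock (lower is better).

    Assumes hashrate is proportional to frequency, so J/TH ~ V^2.
    """
    return relative_power(voltage_scale, frequency_scale) / frequency_scale


# Example: drop voltage by 5% and frequency by 10% (hypothetical values).
v, f = 0.95, 0.90
print(f"hashrate:   {f:.0%} of stock")
print(f"heat/power: {relative_power(v, f):.0%} of stock")
print(f"J/TH:       {relative_efficiency(v, f):.0%} of stock")
```

Under these assumptions, giving up 10% of hashrate alongside a 5% voltage reduction cuts heat output by roughly 19%, which is why underclocking and undervolting are attractive for units that run hot.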

We view the design decision to swap to aluminum plating on hashboards as a net negative, one that will increase ASIC failure and underhashing while raising service and maintenance costs. Given the lack of a PIC and the increased difficulty of swapping out bad chips, we encourage miners to double down on their repair game as they onboard next-generation units into their fleets.

Thanks to Elijah Hendrickson, Taylor Monnig and Brett Hiley.
