Nvidia's highly anticipated Blackwell AI chips have run into another hurdle, with a report saying the chips are overheating in the servers built to house them. According to the report by The Information, the problems have raised concerns among some customers, who fear delays in getting their new data centres up and running.
The report by The Information (via news agency Reuters) says that the Blackwell graphics processing units (GPUs) – dubbed a 'superchip' and expected to be available by the end of 2024 – overheat when deployed in high-density server racks designed to hold up to 72 chips.
Citing sources, the report also notes that Nvidia has repeatedly requested design modifications from its suppliers to address the overheating, but a definitive solution has not been found.
What Nvidia has to say
Nvidia acknowledged the challenges but downplayed the issue.
“Nvidia is working with leading cloud service providers as an integral part of our engineering team and process. The engineering iterations are normal and expected,” a company spokesperson was quoted as saying.
The ongoing problems may add to delays in the Blackwell rollout, potentially affecting major customers such as Meta, Google, and Microsoft, which are eager to use the chips for their AI applications, the report highlighted.
“...100% Nvidia's fault”: CEO Jensen Huang
Last month, Nvidia CEO Jensen Huang said that a design flaw which had hurt production of the Blackwell chips and caused delays had been fixed.
“It was functional, but the design flaw caused the yield to be low. It was 100% Nvidia's fault. In order to make a Blackwell computer work, seven different types of chips were designed from scratch and had to be ramped into production at the same time,” he said.
"What TSMC did was to help us recover from that yield difficulty and resume the manufacturing of Blackwell at an incredible place," the CEO added.
Despite these setbacks, the Blackwell chip represents a significant leap forward in AI processing power. By combining two silicon dies into a single component, Nvidia claims it delivers a 30-fold increase in speed on tasks such as generating chatbot responses.