Two-Phase Liquid Cooling – The Future of High-End GPUs


Author: Yulin Wang, Senior Technology Analyst at IDTechEx

 

Thermal Design Power Trend

 

As of 2024, single-phase direct-to-chip (D2C) cooling dominates the high-end GPU thermal management market. However, as thermal design power (TDP) increases, two-phase D2C cooling will be required, and it is expected to arrive in large volume no earlier than 2026 or 2027. IDTechEx has interviewed a large number of players across the data center value chain, ranging from chip makers and cold plate suppliers to system integrators. Despite different opinions on the exact timeline, the consensus is that around 1500W is the TDP at which single-phase D2C starts to struggle, and 2000W might be its limit. According to IDTechEx’s analysis of the historic trend of GPU thermal design power, the take-off of two-phase direct-to-chip will happen soon. IDTechEx also projects the future trend of GPU TDP, based on the historic trend and the roadmaps of leading chip suppliers interviewed by IDTechEx, such as Nvidia. More details are included in IDTechEx’s report, “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities”.

 


TDP of data center GPUs: historic data and forecast of value in 2025. Source: IDTechEx – “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities”

 

D2C cooling challenges: Single and two-phase

 

With the potential adoption of two-phase liquid cooling, IDTechEx foresees some advantages and barriers, both technical and commercial. Single-phase direct-to-chip (D2C) cooling is a relatively simple and widely adopted solution. It uses a liquid coolant, typically a water-glycol mixture, to absorb heat from the chips via convection without undergoing a phase change. However, it faces significant technical challenges, such as potential coolant leakage, which poses risks to IT equipment, and the mechanical stress caused by high flow rates. To cool a chip with a TDP of 1000W, a flow rate of approximately 1.5L per minute is needed, which is fairly significant. The high flow rate also creates a risk of erosion-corrosion and requires quick disconnects with larger diameters, which quickly adds to the total cost.
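As a rough sanity check on the single-phase figures above, the steady-state energy balance (heat = mass flow × specific heat × temperature rise) can be sketched in a few lines of Python. The TDP and flow rate are the values quoted in the text; the density and specific heat of the water-glycol mixture are illustrative assumptions, not IDTechEx data, and the function name is hypothetical.

```python
def coolant_temperature_rise(heat_w, flow_l_per_min, density_kg_per_l, cp_j_per_kg_k):
    """Steady-state coolant temperature rise (K) for single-phase cooling.

    Energy balance: heat = mass_flow * cp * delta_T, assuming all chip
    heat is absorbed by the coolant as sensible heat.
    """
    mass_flow_kg_per_s = flow_l_per_min / 60.0 * density_kg_per_l
    return heat_w / (mass_flow_kg_per_s * cp_j_per_kg_k)


# Figures quoted in the article
TDP_W = 1000.0
FLOW_L_PER_MIN = 1.5

# Assumed, illustrative water-glycol properties (not from the article)
ASSUMED_DENSITY_KG_PER_L = 1.03
ASSUMED_CP_J_PER_KG_K = 3800.0

delta_t = coolant_temperature_rise(
    TDP_W, FLOW_L_PER_MIN, ASSUMED_DENSITY_KG_PER_L, ASSUMED_CP_J_PER_KG_K
)
print(f"Coolant temperature rise: {delta_t:.1f} K")
```

Under these assumed properties, 1.5L per minute at 1000W corresponds to a coolant temperature rise of roughly 10 K across the cold plate. Because the rise scales inversely with flow, holding the same rise at a higher TDP requires proportionally more flow.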

 

The complexity of plumbing in data centers, especially in tight spaces, adds to the maintenance burden. Additionally, the high capital expenditure (CAPEX) required for installation, particularly when retrofitting older data centers, makes cold plate cooling a costly option upfront, even though it is more energy efficient and therefore saves costs over the long run. As an indication, a cold plate system including quick disconnects (QDs), the fluid distribution manifold inside servers, hoses, etc. costs US$200-US$400.

 

On the other hand, two-phase D2C cooling offers higher efficiency by using the phase change of the coolant, which allows for better heat dissipation and lower cooling costs per watt. It also reduces mechanical stress because it operates at lower flow rates than single-phase systems. For instance, a two-phase cold plate needs a flow rate of around 0.3L/min to cool a chip with a TDP of 1000W. However, two-phase systems come with their own challenges. The use of fluorinated liquids can lead to environmental hazards if these fluids escape and form aerosols, raising concerns about safety and global warming potential (GWP). Additionally, these systems are expensive to implement, with higher CAPEX for cold plate setups and additional fluid recycling and disposal costs. Despite its efficiency, these environmental and commercial hurdles make two-phase cooling a more complex choice. With careful design, some of the challenges can be mitigated. IDTechEx’s “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities” report also quantifies the CAPEX of single and two-phase cooling technologies with costs per component.
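The lower flow rate follows from the coolant absorbing heat mainly as latent heat of vaporization rather than as a temperature rise. The Python sketch below estimates what fraction of the coolant would evaporate at the quoted 0.3L/min and 1000W, and compares that flow rate with the 1.5L/min quoted for single-phase. The fluid density and latent heat are illustrative assumptions for a generic dielectric refrigerant, not IDTechEx data, and the function name is hypothetical.

```python
def exit_vapor_quality(heat_w, flow_l_per_min, density_kg_per_l, latent_heat_j_per_kg):
    """Fraction of coolant evaporated in a two-phase cold plate.

    Assumes the coolant enters as saturated liquid and that all chip
    heat goes into latent heat: heat = mass_flow * h_fg * quality.
    """
    mass_flow_kg_per_s = flow_l_per_min / 60.0 * density_kg_per_l
    return heat_w / (mass_flow_kg_per_s * latent_heat_j_per_kg)


# Figures quoted in the article
TDP_W = 1000.0
TWO_PHASE_FLOW_L_PER_MIN = 0.3
SINGLE_PHASE_FLOW_L_PER_MIN = 1.5

# Assumed, illustrative dielectric refrigerant properties (not from the article)
ASSUMED_DENSITY_KG_PER_L = 1.25
ASSUMED_LATENT_HEAT_J_PER_KG = 190_000.0

quality = exit_vapor_quality(
    TDP_W, TWO_PHASE_FLOW_L_PER_MIN, ASSUMED_DENSITY_KG_PER_L, ASSUMED_LATENT_HEAT_J_PER_KG
)
flow_ratio = SINGLE_PHASE_FLOW_L_PER_MIN / TWO_PHASE_FLOW_L_PER_MIN

print(f"Estimated exit vapor quality: {quality:.2f}")
print(f"Single-phase to two-phase flow ratio: {flow_ratio:.1f}x")
```

With these assumed properties, most but not all of the coolant evaporates at 0.3L/min, which is consistent with the quoted flow rate being about one fifth of the single-phase figure. Real fluids and operating points will differ, so the result should be read as an order-of-magnitude check only.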

 


Technical and commercial challenges of data center direct-to-chip cooling. Source: IDTechEx – “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities”

 

In summary, while single-phase cooling is simpler and more established, it has higher maintenance and technical risks. Two-phase cooling is more efficient but faces environmental concerns and higher initial costs, making it a less straightforward solution despite its advantages. However, given the upcoming thermal design power trend, IDTechEx believes that two-phase cold plates have potential, especially considering that they are easier to retrofit into existing data centers than immersion cooling. In “Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities”, IDTechEx has also listed the technical and commercial barriers of single and two-phase immersion, along with the roadmap and timeline for different cooling technologies based on primary and secondary research.

 

To find out more about this IDTechEx report, including downloadable sample pages, please visit www.IDTechEx.com/TMDC.

 

For the full portfolio of thermal management market research available from IDTechEx, please see www.IDTechEx.com/Research/Thermal.

 

 

Upcoming free-to-attend webinar

Enabling Emerging Industries Through Thermal Management in 2024 and Beyond

 

IDTechEx will be hosting a webinar on the topic on Wednesday 13 November 2024 – Enabling Emerging Industries Through Thermal Management in 2024 and Beyond.

 

This webinar will summarize some of the key trends seen in 2024 and what can be expected for the future of thermal management, including:

  • Thermal management fluids and greater component integration in EVs
  • EV power electronics and its emerging thermal management needs
  • Adoption of new thermal management systems in data centers
  • How the above trends impact thermal interface materials and their application

 

We will be holding exactly the same webinar three times in one day. Please click here to register for the session most convenient for you.

 

If you are unable to make the date, please register anyway to receive the links to the on-demand recording (available for a limited time) and webinar slides as soon as they are available.

 
