Business Circle

The first exascale supercomputer has a hardware failure every day

By Business Circle Team | October 10, 2022 | Updated: August 21, 2025

In brief: Frontier, the world’s most powerful supercomputer, is online but still far from operational. Its director has confirmed reports that it’s experiencing a system failure every few hours, but insists that this is par for the course.

Frontier is in a class of its own. It has 9,408 HPE Cray EX235a nodes, each powered by an AMD Trento 7A53 Epyc 64-core CPU equipped with 512 GB of DDR4, and four AMD Instinct MI250X GPUs / accelerators, each equipped with 128 GB of HBM2e. Summed, the system has 602,112 CPU cores and 8,138,240 GPU cores in total, and 4.6 PB of both DDR4 and HBM2e.
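The CPU core and memory totals follow directly from the per-node figures. The short sketch below reproduces them; it assumes that "PB" is counted in binary steps (1 PB = 1024 × 1024 GB), which is the reading under which the quoted 4.6 PB comes out. The variable names are illustrative only.

```python
# Per-node figures as quoted in the article.
NODES = 9_408
CPU_CORES_PER_NODE = 64
DDR4_GB_PER_NODE = 512
GPUS_PER_NODE = 4
HBM2E_GB_PER_GPU = 128

cpu_cores = NODES * CPU_CORES_PER_NODE
gpus = NODES * GPUS_PER_NODE
ddr4_gb = NODES * DDR4_GB_PER_NODE
hbm2e_gb = gpus * HBM2E_GB_PER_GPU

# Assumption: "PB" is read in binary steps (1 PB = 1024 * 1024 GB).
GB_PER_PB = 1024 * 1024
ddr4_pb = ddr4_gb / GB_PER_PB
hbm2e_pb = hbm2e_gb / GB_PER_PB

print(f"CPU cores: {cpu_cores:,}")   # 602,112
print(f"GPUs:      {gpus:,}")        # 37,632
print(f"DDR4:      {ddr4_pb:.1f} PB")   # 4.6 PB
print(f"HBM2e:     {hbm2e_pb:.1f} PB")  # 4.6 PB
```

The DDR4 and HBM2e totals come out equal because 512 GB per node is the same as four GPUs with 128 GB each.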

In May, Frontier joined the TOP500 as the first supercomputer to break the exascale barrier after it completed the HPL benchmark with a score of 1.102 Exaflop/s. Since then, the Oak Ridge National Laboratory in Tennessee, which manages the supercomputer, has been readying it for scientific research scheduled to start in January.

However, there have been reports that the launch of Frontier could be waylaid by excessive hardware failures. Looking for answers, Inside HPC arranged an interview with the Program Director at Oak Ridge, Justin Whitt. In the interview, he confirmed Frontier was experiencing daily system failures but asserted that this was inevitable in such a large system.

“Mean time between failure on a system this size is hours, it’s not days,” he said. “So you need to make sure you understand what those failures are and that there’s no patterns to those failures that you need to be concerned with.” Whitt added that going a day without a failure “would be excellent.”

“Our goal is still hours.”

says Justin Whitt, Program Director at the OLCF
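Why would a machine this large fail every few hours even if each part is reliable? A minimal sketch, assuming independent components that fail at a constant rate (the standard exponential model), shows the scaling: the system's mean time between failures is the per-component figure divided by the number of components. The 50,000-hour node figure below is purely hypothetical and chosen for illustration; it is not a measured value for Frontier.

```python
import math


def system_mtbf_hours(component_mtbf_hours: float, n_components: int) -> float:
    """MTBF of a system in which any single component failure counts.

    Assumes independent components with constant failure rates, so the
    failure rates simply add up.
    """
    return component_mtbf_hours / n_components


def prob_failure_free(hours: float, mtbf_hours: float) -> float:
    """Probability of seeing no failure during `hours` (exponential model)."""
    return math.exp(-hours / mtbf_hours)


NODES = 9_408                             # node count quoted in the article
HYPOTHETICAL_NODE_MTBF_HOURS = 50_000.0   # assumption, NOT a Frontier measurement

mtbf = system_mtbf_hours(HYPOTHETICAL_NODE_MTBF_HOURS, NODES)
p_day = prob_failure_free(24.0, mtbf)

print(f"System MTBF: {mtbf:.1f} hours")
print(f"Chance of a failure-free day: {p_day:.1%}")
```

Under these assumptions, a node that fails on average once in more than five years still yields a system-level failure roughly every five hours, which is consistent in spirit with Whitt's "hours, not days" remark.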

There were rumors that the hardware problems were being caused by the new AMD Instinct MI250X, but Whitt refuted them. The MI250X is AMD’s most powerful GPU/accelerator, and AMD sells it only to select partners. It has 220 CUs containing 14,080 cores clocked at 1700 MHz in a 500 W package.

“The issues span lots of different categories, the GPUs are just one,” Whitt remarked. “It’s been a pretty good spread among common culprits of parts failures that have been a big part of it. I don’t think that at this point that we have a lot of concern over the AMD products,” he added.

“We’re dealing with a lot of the early-life kind of things we’ve seen with other machines that we’ve deployed, so it’s nothing too out of the ordinary.”

Whitt conceded that the unprecedented scale of Frontier had made fine-tuning it “a little bit harder” but said they were still following the schedule set back in 2018-19 despite delays caused by the pandemic.

Head over to Inside HPC to read the full interview.


