# MI300X vs H100

***

## Raw Performance Comparison

<figure><img src="/files/Ofcl0pjuCtWtVAHSyYpe" alt=""><figcaption><p>Raw performance comparison of AMD's MI300X vs NVIDIA's H100</p></figcaption></figure>

The current go-to provider of GPUs, NVIDIA, has a long history in developing graphics accelerators and related hardware, so adding a line of AI-focused GPUs was not a great leap for them. Their flagship AI GPU, the H100, is in such high demand that customers must wait for a year or more for their orders to be filled.

Meanwhile, Advanced Micro Devices (AMD), better known as a competitor to Intel in the PC and server CPU market, has introduced its own GPU product line, called Instinct. The Instinct MI300X, introduced in late 2023, is causing a stir in the AI development community.

*Let's break down their individual capabilities to determine which best fits your use case.*

***

## Technical Specifications Comparison

### Architecture

The H100 and MI300X have quite different architectures. The H100 is implemented on a single large (814 square millimeters) chip of silicon, with all the components in the same plane. This architecture is the same tried-and-true approach used in almost all integrated circuits. The advantage is that the manufacturing process is mature, although the large size pushes the limits of what can be manufactured using standard processes.

The MI300X, in contrast, is assembled as a three-dimensional stack. It has eight separate GPU integrated circuits surrounded by high-bandwidth memory in one layer, which is placed on top of a layer of input-output circuitry. This approach packs more transistors into a smaller area, with shorter distances between the computing modules and memory. However, the manufacturing process is entirely new and more complex: the layers must line up with nanometer precision for the device to work.

### Memory

The H100 comes with 80 GB of GPU memory, whereas the MI300X has 192 GB. Memory bandwidth, the speed at which the chip can move data between memory and the computing modules, is an important contributor to overall performance, and it is also greater for the MI300X (5.2 TB/s vs. 3.35 TB/s).
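To make the capacity difference concrete, the following minimal sketch estimates how many GPUs are needed just to hold a model's weights. It uses only the capacities quoted above; the bytes-per-parameter values and the helper names `weights_gb` and `min_gpus` are illustrative assumptions, and the estimate ignores the KV cache, activations, and framework overhead, so real deployments need more headroom.

```python
import math

# Memory capacities quoted in the text above, in GB.
GPU_MEMORY_GB = {"H100": 80, "MI300X": 192}

# Assumption: storage cost per parameter for two common precisions.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weights_gb(params_billions: float, precision: str) -> float:
    """Approximate weight footprint in GB (weights only)."""
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * BYTES_PER_PARAM[precision]


def min_gpus(params_billions: float, precision: str, gpu: str) -> int:
    """Smallest number of GPUs whose combined memory holds the weights."""
    return math.ceil(weights_gb(params_billions, precision) / GPU_MEMORY_GB[gpu])


if __name__ == "__main__":
    # Hypothetical example: a 70-billion-parameter model stored in FP16.
    for gpu in GPU_MEMORY_GB:
        print(f"{gpu}: at least {min_gpus(70, 'fp16', gpu)} GPU(s)")
```

Under these assumptions, a 70-billion-parameter model in FP16 (about 140 GB of weights) would need at least two H100s but could fit on a single MI300X.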

***

## Performance Benchmarks

{% hint style="warning" %}
At the time of writing, independent comparisons are not yet available. All we have are the performance claims published by each vendor, and the exact environments in which those claims were measured are not known.
{% endhint %}

### Inference Performance

AMD claims a 20% advantage over the H100 in inference performance (that is, using a trained AI model to perform tasks) on the Llama 2 LLM with 13 billion parameters.

### Floating Point Operations

For eight-bit floating-point precision (known as FP8), AMD claims 2,614.9 trillion floating-point operations per second (TFLOPS) for the MI300X vs. 1,978.9 TFLOPS for the H100.
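As a quick sanity check on what these two figures imply, the sketch below converts them into a relative advantage. The function name `relative_advantage_pct` is an illustrative assumption, and the inputs are the vendor-claimed numbers from the text, not measured throughput.

```python
# Vendor-claimed FP8 figures quoted in the text above, in TFLOPS.
MI300X_FP8_TFLOPS = 2614.9
H100_FP8_TFLOPS = 1978.9


def relative_advantage_pct(a: float, b: float) -> float:
    """Percentage by which a exceeds b."""
    return (a / b - 1.0) * 100.0


if __name__ == "__main__":
    pct = relative_advantage_pct(MI300X_FP8_TFLOPS, H100_FP8_TFLOPS)
    print(f"Claimed FP8 advantage of MI300X over H100: {pct:.1f}%")
```

On the claimed numbers this works out to roughly a 32% advantage for the MI300X; real workloads will typically see a different gap, since it depends on software and workload characteristics.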

### Latency

AMD claims a 40% advantage over the H100 in inference latency on Llama 2 with 70 billion parameters. The higher memory bandwidth of the MI300X has a strong influence on this performance metric.
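One way to see why memory bandwidth matters for latency is a rough bandwidth-bound estimate: when generating one token at a time, the GPU has to read every weight from memory at least once, so the time per token cannot be lower than the weight size divided by the memory bandwidth. The sketch below applies that bound using the bandwidth figures quoted in the Memory section. It is a simplified model under stated assumptions (batch size 1, weights only, no compute or interconnect cost) and not a reproduction of AMD's benchmark; the 70 GB weight size assumes 70 billion parameters at one byte each, and `min_ms_per_token` is a hypothetical helper name.

```python
# Memory bandwidths quoted in the Memory section, in TB/s.
BANDWIDTH_TBS = {"H100": 3.35, "MI300X": 5.2}


def min_ms_per_token(weights_gb: float, bandwidth_tbs: float) -> float:
    """Lower bound on per-token latency (ms) if every weight is read once."""
    bytes_to_read = weights_gb * 1e9
    bytes_per_second = bandwidth_tbs * 1e12
    return bytes_to_read / bytes_per_second * 1e3


if __name__ == "__main__":
    # Assumption: 70 billion parameters at 1 byte each (FP8) = 70 GB.
    weights = 70.0
    for gpu, bw in BANDWIDTH_TBS.items():
        print(f"{gpu}: >= {min_ms_per_token(weights, bw):.1f} ms per token")
```

In this simplified model the bound scales inversely with bandwidth, so the MI300X's lower bound is about 1.55 times smaller than the H100's. Measured latency also depends on batch size, kernels, and software stack, so this only illustrates the direction of the effect.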

{% hint style="info" %}
For more information or to discuss your specific requirements, [contact TensorWave today](https://www.tensorwave.com/contact).
{% endhint %}

***

