Hugging Face Quickstart
Estimated time: 7 minutes (9 minutes with buffer).
Hugging Face is an AI/ML platform for the entire model pipeline. For this quickstart, we'll walk you through accelerated inference using a pretrained model.
Installing Dependencies
Because PyTorch with ROCm comes preloaded on your device, you do not need to install it separately. However, you will still need the Transformers library to run the quickstart script. Install it with the following command:
pip install transformers

This should take no more than a few minutes.
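If you want to confirm that the preinstalled PyTorch build can see your GPU before continuing, a quick check like the one below should print True followed by your GPU's name. This is a minimal sketch and assumes a standard ROCm-enabled PyTorch install, which exposes the GPU through the usual CUDA-style API:

python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"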
Creating and Running Inference Script
Next, create a new directory for your script and navigate into it:
mkdir hf-hello-world
cd hf-hello-world

Then, create a new script:
nano hello-world.py

Within this file, paste the following code, then save and exit:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import time
# Load model without quantization
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
# Move model to the GPU (ROCm builds of PyTorch expose the GPU via the "cuda" device string)
model = model.to("cuda")
# Warm up the model with a short prompt so the timed run below is not skewed by one-time startup costs
print("Warming up model...")
input_text = "Hello, my name is"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
warmup = model.generate(**inputs, max_new_tokens=20)
print("Preparing text...")
input_text = "According to all known laws of aviation, there is no way that a bee should be able to fly. Its wings are too small to get its fat little body off the ground. The bee, of course, flies anyway because"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
print("Starting inference...")
start = time.time()
outputs = model.generate(
    **inputs,
    max_new_tokens=50,        # generate up to 50 new tokens
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.7,          # soften the token distribution
    top_k=50,                 # sample only from the 50 most likely tokens
    top_p=0.95,               # nucleus sampling threshold
    no_repeat_ngram_size=2    # avoid repeating any 2-gram
)
t = time.time() - start
print(f"Inference time: {t:.2f} seconds")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

After doing so, you may run the script using the following:
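python hello-world.py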
This runs a small model on a single GPU, but feel free to swap in a different model and prompt, then map the model to the appropriate devices. The output should show the measured inference time followed by the prompt and roughly 50 newly generated tokens.
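If you want to try a larger model, Transformers can also place the weights across your available GPUs for you. The sketch below is illustrative rather than part of the quickstart: it assumes the accelerate package is installed (required for device_map="auto") and uses facebook/opt-1.3b purely as an example substitute:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a larger model in half precision and let Transformers
# distribute its layers across the available GPUs.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Place the inputs on the same device as the model's first layer
inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))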
Teardown
Navigate back to your base directory and remove your hf-hello-world folder:
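cd ..
rm -rf hf-hello-world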