Hugging Face is an AI/ML platform for the entire model pipeline. For this quickstart, we'll walk you through accelerated inference using a pretrained model.
Because PyTorch with ROCm comes preloaded on your device, you will not need to install it. However, you will still need the transformers library to run the quickstart script. Install it using the following command:
pip install transformers
This should take no more than a few minutes.
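Before moving on, it can help to confirm that both libraries are importable and that PyTorch can see a GPU. The following is a minimal sketch, not part of the original quickstart: the helper name check_environment is hypothetical, and it assumes that ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API and report a version string in torch.version.hip.

```python
import importlib.util


def check_environment(packages=("torch", "transformers")):
    """Report which packages are importable and whether a GPU is visible.

    Hypothetical helper for illustration; not part of the quickstart script.
    """
    # find_spec returns None when a top-level package is not installed.
    report = {name: importlib.util.find_spec(name) is not None for name in packages}

    gpu_visible = False
    if report.get("torch"):
        import torch

        # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
        gpu_visible = torch.cuda.is_available()
        # Assumption: torch.version.hip is a string on ROCm builds, None otherwise.
        report["hip_version"] = getattr(torch.version, "hip", None)
    report["gpu_visible"] = gpu_visible
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If gpu_visible prints False, the later call to model.to("cuda") in the script would be expected to fail, so it is worth resolving that first.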
Creating and Running Inference Script
Next, create a new directory for your script and navigate into it:
mkdir hf-hello-world
cd hf-hello-world
Then, create a new script:
nano hello-world.py
Within this script, paste the following code, then save and exit:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import time

# Load model without quantization
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Move model to GPU
model = model.to("cuda")

# Input text
print("Warming up model...")
input_text = "Hello, my name is"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")
warmup = model.generate(**inputs, max_new_tokens=20)

print("Preparing text...")
input_text = "According to all known laws of aviation, there is no way that a bee should be able to fly. Its wings are too small to get its fat little body off the ground. The bee, of course, flies anyway because"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

print("Starting inference...")
start = time.time()
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, no_repeat_ngram_size=2)
t = time.time() - start
print(f"inference time: {t}")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
After doing so, you may run the script using the following:
python3 hello-world.py
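This runs a small model on a single GPU. Feel free to swap in a different model and prompts, and move the model and inputs to the appropriate devices. Because sampling is enabled, the generated text and the timing will vary between runs. The output should be similar to: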
Warming up model...
Preparing text...
Starting inference...
inference time: 0.4770219326019287
According to all known laws of aviation, there is no way that a bee should be able to fly. Its wings are too small to get its fat little body off the ground. The bee, of course, flies anyway because it can.
Well, if you fly with a little fat of your body, you can fly pretty damn well. You just have to be careful.
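The script times a single call after one warm-up generation. If you want a steadier number, one option is to average several timed calls. The sketch below is an illustration under stated assumptions, not part of the original quickstart: time_call and tokens_per_second are hypothetical helper names, and the optional sync argument is meant for a function such as torch.cuda.synchronize that waits for queued GPU work to finish.

```python
import time


def time_call(fn, warmup=1, repeats=3, sync=None):
    """Return the mean wall-clock seconds per call of fn, after warm-up calls.

    Hypothetical helper for illustration; not part of the quickstart script.
    """
    # Warm-up calls are run but not timed.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued GPU work before stopping the clock
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def tokens_per_second(new_tokens, seconds):
    """Convert a generated-token count and a duration into throughput."""
    return new_tokens / seconds
```

Inside hello-world.py, this could be used as time_call(lambda: model.generate(**inputs, max_new_tokens=50), sync=torch.cuda.synchronize). For the sample run above, if all 50 new tokens were generated in about 0.477 seconds, throughput would be roughly 105 tokens per second; generation can stop early, so count the new tokens in outputs rather than assuming 50.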
Teardown
Navigate back to your base directory and remove your hf-hello-world folder: