Checkpoint for my implementation of Andrej Karpathy's nanoGPT. The model was trained on a single RTX 4090 for 48 hours.

For more details, check out my GitHub repository.

Usage:

```python
import torch
from transformers import GPT2Tokenizer
from gpt2 import GPT, GPTConfig

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')

# Prefer CUDA, then Apple Silicon (MPS), otherwise fall back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model = GPT.from_pretrained('nanoGPT2-124m.pth')
model = model.to(device)

prompt = "Hello, I'm a mathematics teacher"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
generated_ids = model.generate(input_ids.to(device))
# generate() returns a batch of shape (1, T); decode the first sequence.
print(tokenizer.decode(generated_ids[0]))
```
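For intuition about what `generate()` does at each step, here is a minimal, dependency-free sketch of top-k sampling, the filtering strategy nanoGPT-style generation loops typically apply to the model's logits before drawing the next token. This is an illustrative assumption about the sampling scheme, not the exact code inside this checkpoint's `generate()` method; `top_k_sample` is a hypothetical helper.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample one token id from `logits` (a plain list of floats, one per
    vocabulary entry), restricted to the k highest-scoring entries."""
    # Keep the k largest logits; mask the rest with -inf so softmax zeroes them.
    threshold = sorted(logits, reverse=True)[k - 1]
    masked = [x if x >= threshold else float("-inf") for x in logits]
    # Softmax over the surviving logits (subtract the max for stability).
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting distribution.
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

With `k=1` this reduces to greedy decoding (always the argmax token); larger `k` trades determinism for diversity.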