Hugging Face: 10 Code-Along Examples in Python

The Hugging Face article explains the landscape: a hub of pretrained models and datasets, the pipeline function that turns them into one-liners, and the Auto classes underneath when you need control. This workbook makes each idea runnable. You will classify sentiment, generate text, search the Hub from code, run zero-shot labels, summarise, load and filter a dataset, look inside a tokenizer, assemble a pipeline by hand, answer questions over a passage, and finish by asking questions of a PDF. Most of the models used here are small enough to run comfortably on an ordinary laptop CPU; the two BART-large checkpoints in Examples 4 and 5 are bigger downloads (over a gigabyte each) but still run on CPU.

1. Your first pipeline: sentiment analysis

The pipeline function hides every step (tokenisation, the model forward pass, and decoding) behind one call. Give it a task name and it picks a sensible default model.

			
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
reviews = [
    "The delivery was quick and the keyboard feels fantastic.",
    "Two weeks late and the box arrived crushed."
]
for review in reviews:
    result = classifier(review)[0]
    print(f"{result['label']} ({result['score']:.3f}) - {review}")
# POSITIVE (1.000) - The delivery was quick and the keyboard feels fantastic.
# NEGATIVE (0.999) - Two weeks late and the box arrived crushed.

		

Two lines of setup and you have a working sentiment model. The returned score is the model’s confidence, and printing it alongside the label is a good habit: a 0.55 is a shrug, a 0.999 is a verdict.
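One way to act on that habit is to separate confident predictions from shrugs. The sketch below assumes results shaped like the pipeline's output (a dict with label and score keys). The split_by_confidence helper, the 0.80 cut-off, and the sample scores are illustrative assumptions, not something the library provides.

```python
def split_by_confidence(results, threshold=0.80):
    """Split pipeline-style results into confident ones and ones to review.

    `results` is a list of dicts with 'label' and 'score' keys, the shape
    returned by classifier(text)[0]. The threshold is an arbitrary choice.
    """
    confident = [r for r in results if r["score"] >= threshold]
    needs_review = [r for r in results if r["score"] < threshold]
    return confident, needs_review


# Hand-written stand-ins for real pipeline output (the scores are made up).
sample_results = [
    {"label": "POSITIVE", "score": 0.999},
    {"label": "NEGATIVE", "score": 0.55},
]

confident, needs_review = split_by_confidence(sample_results)
print("confident:", [r["label"] for r in confident])
print("needs review:", [r["label"] for r in needs_review])
```

Where to put the cut-off depends on what a wrong label costs you, so treat 0.80 as a starting point to tune against a handful of hand-checked examples.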

2. Text generation with GPT-2

Swap the task name and the same interface generates text. max_new_tokens caps the length and num_return_sequences asks for several continuations of the same prompt.

			
from transformers import pipeline
generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    "The most useful skill for a data analyst is",
    max_new_tokens=30,
    num_return_sequences=2
)
for i, out in enumerate(outputs, 1):
    print(f"--- Continuation {i} ---")
    print(out["generated_text"])

		

GPT-2 is small and dated, which is exactly why it is ideal for learning: it downloads fast and runs anywhere. The structure you practise here, prompt in, generated_text out, is identical for far larger models.
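One detail of that structure is easy to miss: by default the text-generation pipeline returns the prompt followed by the continuation, so generated_text starts with your own words. If you only want the new part, a small helper such as the one below can strip the echo. The continuation_only name and the sample string are assumptions for illustration; real GPT-2 output will differ on every run.

```python
def continuation_only(prompt, generated_text):
    """Return the part of generated_text that follows the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text  # prompt was not echoed, so there is nothing to strip


prompt = "The most useful skill for a data analyst is"
# Made-up stand-in for out["generated_text"]; not real model output.
generated = prompt + " asking a clear question before opening the data."

print(continuation_only(prompt, generated))
```

The pipeline also accepts return_full_text=False at call time, which should give the same result without post-processing; the helper is mainly useful when the outputs have already been stored.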

3. Searching the Hub from code

The Hub is not just a website. The huggingface_hub library lets you search its models programmatically, which is how you find candidates without leaving your script.

			
from huggingface_hub import HfApi
api = HfApi()
models = api.list_models(
    task="summarization",       # filter by task
    sort="downloads",           # most-used first
    limit=5
)
for model in models:
    print(f"{model.id}  |  downloads: {model.downloads}")

		

The result is the five most-downloaded summarisation models with their ids, and any id can be dropped straight into a pipeline(model=...) call. Download counts are a useful proxy for “battle-tested” when choosing between similar models.

4. Zero-shot classification

Zero-shot models classify text into categories they were never trained on. You supply the candidate labels at call time, which makes this the fastest way to prototype a classifier with no training data at all.

			
from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
ticket = "My invoice shows the wrong VAT amount for last month."
departments = ["billing", "technical support", "sales", "complaints"]
result = classifier(ticket, candidate_labels=departments)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
# billing: 0.812
# complaints: 0.102 ...

		

The model ranks every label by how well it fits the text. Change the departments list and rerun; there is no retraining step, which is what “zero-shot” means and why it is so useful for routing, tagging, and triage prototypes.
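Why no retraining is needed can be sketched in a few lines. A model such as bart-large-mnli is trained for natural language inference, and the zero-shot pipeline reframes classification in those terms: each candidate label becomes a hypothesis sentence, and the model scores how strongly the text entails it. The snippet below only builds those text pairs. build_nli_pairs is a hypothetical helper, and the template string is assumed to match the pipeline's default hypothesis_template.

```python
def build_nli_pairs(text, candidate_labels, template="This example is {}."):
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, template.format(label)) for label in candidate_labels]


ticket = "My invoice shows the wrong VAT amount for last month."
departments = ["billing", "technical support", "sales", "complaints"]

pairs = build_nli_pairs(ticket, departments)
for premise, hypothesis in pairs:
    print(f"premise: {premise!r}")
    print(f"  hypothesis: {hypothesis!r}")
```

Because labels are just words slotted into a sentence, descriptive label names ("billing question" rather than "cat_3") tend to give the model more to work with.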

5. Summarisation with length constraints

Summarisation pipelines accept min_new_tokens and max_new_tokens to box in the output length. This one condenses a paragraph to a couple of sentences.

			
from transformers import pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
report = """
Quarterly traffic grew 18 percent, driven mainly by organic search after the
site migration completed in April. Paid channels stayed flat, with social
declining slightly as budgets moved to search. Conversion rate improved from
2.1 to 2.6 percent, which the team attributes to the new checkout flow.
Average order value was unchanged. The main risk flagged for next quarter is
tracking loss from the upcoming consent banner changes.
"""
summary = summarizer(report, min_new_tokens=20, max_new_tokens=50)
print(summary[0]["summary_text"])

		

BART-large-CNN is an abstractive summariser, meaning it writes new sentences rather than copying existing ones. The length constraints matter: without them, summaries drift long or truncate awkwardly, and tuning those two numbers is most of the craft.

6. Loading and filtering a dataset

The datasets library gives you the same one-liner convenience for data. Load a dataset from the Hub, then shape it with .filter and .select.

			
from datasets import load_dataset
reviews = load_dataset("imdb", split="test")
print(reviews)                      # 25,000 labelled movie reviews
# keep only short reviews
short = reviews.filter(lambda row: len(row["text"]) < 500)
print(f"Short reviews: {short.num_rows}")
# take the first three for a look
sample = short.select(range(3))
for row in sample:
    print(row["label"], row["text"][:80], "...")

		

filter takes a function that returns True for rows to keep, and select slices by position. Both return a new dataset rather than modifying the original, so you can chain them into a small preprocessing recipe.

7. Inside the tokenizer

Models never see words; they see token ids. AutoTokenizer loads the exact tokenizer a model was trained with, and inspecting its output demystifies what a pipeline does before the model runs.

			
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
text = "Tokenizers split text into pieces."
tokens = tokenizer.tokenize(text)
ids = tokenizer.encode(text)
print(tokens)   # ['token', '##izer', '##s', 'split', 'text', 'into', 'pieces', '.']
print(ids)      # [101, 19204, 17629, 2015, 3975, 3793, 2046, 4109, 1012, 102]
print(tokenizer.decode(ids))   # [CLS] tokenizers split text into pieces. [SEP]

		

Notice that tokenizers became three pieces, with ## marking continuations, and that special [CLS] and [SEP] markers were added around the sentence. Much of the odd model behaviour you will debug (truncation, weird splits, length limits) starts here.
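The three-piece split is not arbitrary. WordPiece tokenizers like this one break a word by greedily taking the longest vocabulary entry that matches, then repeating on the remainder. Here is a minimal sketch of that matching loop, assuming a tiny hand-made vocabulary; the real tokenizer has roughly 30,000 entries and also handles lowercasing and punctuation, which this sketch skips.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one lowercase word into WordPiece tokens by greedy longest match."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


# Toy vocabulary: an assumption for illustration only.
toy_vocab = {"token", "##izer", "##s", "split", "text", "into", "pieces"}

print(wordpiece("tokenizers", toy_vocab))  # ['token', '##izer', '##s']
print(wordpiece("split", toy_vocab))       # ['split']
print(wordpiece("zebra", toy_vocab))       # ['[UNK]']
```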

8. Assembling a pipeline by hand with Auto classes

A pipeline is just a tokenizer plus a model plus some post-processing. Building one manually shows exactly what the convenience wrapper does, and it is the pattern you need when you want logits or custom behaviour.

			
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
inputs = tokenizer("A genuinely delightful little film.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
labels = model.config.id2label
for i, p in enumerate(probs):
    print(f"{labels[i]}: {p:.3f}")
# NEGATIVE: 0.000
# POSITIVE: 1.000

		

Tokenise, run the model, softmax the logits, map ids to labels: that is the whole pipeline, spelled out. model.config.id2label is the lookup that turns position 1 into “POSITIVE”, and having the raw probabilities lets you set your own thresholds.
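The softmax and label-lookup steps are simple enough to write without torch, which can help when you want to see the arithmetic. The sketch below uses made-up logits in place of real model output; the id2label mapping mirrors the one printed by the example above.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for one sentence; a real model would supply these.
logits = [-4.2, 4.5]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

probs = softmax(logits)
best = max(range(len(probs)), key=lambda i: probs[i])  # index of the top class

for i, p in enumerate(probs):
    print(f"{id2label[i]}: {p:.3f}")
print("prediction:", id2label[best])
```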

9. Question answering over a context

Extractive QA models find the answer span inside a passage you provide. Give the pipeline a question and a context, and it returns the exact substring plus a confidence.

			
from transformers import pipeline
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")
context = """
The onboarding project kicked off on 3 February and is led by Priya Shah.
The first milestone, the data audit, is due on 28 March, with the full
rollout planned for July. Budget approval sits with the finance committee.
"""
questions = [
    "Who leads the onboarding project?",
    "When is the data audit due?",
]
for q in questions:
    result = qa(question=q, context=context)
    print(f"{q} -> {result['answer']} ({result['score']:.2f})")
# Who leads the onboarding project? -> Priya Shah (0.98)
# When is the data audit due? -> 28 March (0.97)

		

The model does not generate an answer; it points at one, returning a span lifted verbatim from the context. That makes extractive QA easy to verify, since the answer is always traceable to the source text, although the model can still point at the wrong span, so keep an eye on the score.
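That traceability can be checked in code. Alongside answer and score, the question-answering pipeline's result also carries start and end character offsets into the context. The sketch below uses a hand-built result dict with a made-up score so it runs without a model; verify_span and show_in_context are hypothetical helper names.

```python
def verify_span(result, context):
    """Check that the reported offsets really point at the answer text."""
    return context[result["start"]:result["end"]] == result["answer"]


def show_in_context(result, context, margin=25):
    """Return the answer with a little surrounding text for a human to read."""
    left = max(0, result["start"] - margin)
    right = min(len(context), result["end"] + margin)
    return context[left:right].replace("\n", " ")


context = (
    "The onboarding project kicked off on 3 February and is led by Priya Shah.\n"
    "The first milestone, the data audit, is due on 28 March."
)

# Hand-built stand-in for a pipeline result; the score is made up.
answer = "Priya Shah"
start = context.find(answer)
result = {"answer": answer, "start": start, "end": start + len(answer), "score": 0.98}

print(verify_span(result, context))
print(show_in_context(result, context))
```

Showing the surrounding text is a cheap way to let a reader confirm that the span answers the question that was actually asked.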

10. Putting it together: ask questions of a PDF

The final example combines a PDF parser with the QA pipeline from Example 9: extract the text, then query it. This is the skeleton of every “chat with your document” tool.

			
from pypdf import PdfReader
from transformers import pipeline
# 1. extract text from the PDF
reader = PdfReader("report.pdf")          # any text-based PDF you have
# join pages with a newline so words at page boundaries do not run together
pages = [page.extract_text() or "" for page in reader.pages]
document_text = "\n".join(pages)
# 2. feed it to a QA pipeline as the context
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")
question = "What was the total revenue?"
result = qa(question=question, context=document_text)
print(f"Answer: {result['answer']}")
print(f"Confidence: {result['score']:.2f}")

		

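Point it at any text-based PDF and adjust the question to match. Two honest caveats: scanned PDFs need OCR first, since extract_text only reads embedded text, and the model itself reads only a few hundred tokens at a time. The question-answering pipeline copes by sliding overlapping windows over a long context, but that gets slow and less dependable on very long documents, which is when you would chunk the text yourself and query each chunk. Both are natural next steps once this skeleton works.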
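The chunk-and-query step might look like the sketch below. It splits on characters rather than tokens to stay dependency-free, and it uses a stand-in fake_qa function so it runs without downloading a model. The helper names, the chunk sizes, and the sample document (including the revenue figure) are all assumptions for illustration.

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into character windows that overlap their neighbours."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def best_answer(qa, question, chunks):
    """Ask the question of every chunk and keep the highest-scoring answer."""
    results = [qa(question=question, context=chunk) for chunk in chunks]
    return max(results, key=lambda r: r["score"])


def fake_qa(question, context):
    """Stand-in for a real QA pipeline so the sketch runs without a model."""
    hit = "4.2 million" in context
    return {"answer": "4.2 million" if hit else "", "score": 0.9 if hit else 0.1}


# Made-up document: filler text with one relevant sentence in the middle.
document_text = (
    "Filler sentence. " * 150
    + "Total revenue was 4.2 million. "
    + "More filler. " * 150
)

chunks = chunk_text(document_text)
result = best_answer(fake_qa, "What was the total revenue?", chunks)
print(f"{len(chunks)} chunks, best answer: {result['answer']}")
```

To use it for real, pass the qa pipeline from the example above in place of fake_qa. The overlap is there so that an answer straddling a chunk boundary still appears whole in at least one window. Scores from different chunks are only roughly comparable, so treat the winner as a candidate to check rather than a final answer.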

Work through these and you will have used the whole toolkit from the article: pipelines for five different tasks, the Hub API, datasets with filtering, tokenizers, and the Auto classes that let you build it all by hand. The selection habit worth keeping is the one the guide closes on: start with a pipeline and a small default model, prove the task works, and only then trade up to bigger models or manual control when the results, or your requirements, demand it.

See you soon.
