# How Large Language Models Work
This chapter provides an introduction to large language models (LLMs) and their applications in finance.
> **Note:** This chapter is under construction.
## Topics to Cover
- What is a language model?
- From n-grams to neural networks
- The transformer architecture
- Attention mechanisms
- Pre-training and fine-tuning
- Tokenization and embeddings
- Prompt engineering basics
- Using LLMs via APIs
- Applications in finance
  - Document summarization
  - Information extraction
  - Code generation
- Limitations and considerations
## Agentic Engineering
Simon Willison has put together a guide on agentic engineering patterns — how to work effectively with AI coding agents like Claude Code, GitHub Copilot, and OpenAI Codex. This is essential reading for anyone using these tools.
- Agentic Engineering Patterns — the full guide.
- How Coding Agents Work — a detailed look at the internals of coding agents.
### Guide Overview
The guide covers several key topics:
**What is Agentic Engineering?** Agentic engineering is the practice of developing software with coding agents — AI systems that write and execute code in loops to achieve goals. Rather than replacing developers, it shifts the focus from writing code to the more strategic work of defining problems, choosing solutions, and verifying results.

**Writing Code is Cheap Now.** While AI coding agents have made writing code nearly free, producing good code still requires substantial developer effort to ensure correctness, testing, documentation, and maintainability. Developers should experiment more freely with agent-generated code, since the cost of failure has dropped significantly.

**Knowledge Retention.** Willison advocates “hoarding” working code examples and solutions as a core professional skill. Having proven implementations lets you spot applications others might miss. These curated collections become powerful inputs for coding agents, which can rapidly combine multiple examples to solve new problems.

**Improving Code Quality.** Coding agents should be leveraged to improve code quality rather than reduce it. Agents excel at tedious refactoring tasks — fixing API designs, renaming concepts across codebases — that are simple but time-consuming, thereby eliminating technical debt without sacrificing quality.

**Traps to Avoid.** The key anti-pattern is submitting unreviewed agent-generated code to collaborators. Developers must personally verify code functionality and quality before creating pull requests, rather than delegating the review burden to others.

**Red/Green Testing.** Red/green test-driven development is an effective pattern for coding agents: write automated tests first, confirm they fail (red phase), then have the agent implement code until the tests pass (green phase). This prevents common agent mistakes like producing non-functional code.
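As a sketch of the red/green pattern, consider a hypothetical `slugify()` task (the function name and behavior here are invented for illustration): the test is written and observed to fail before any implementation exists, and the agent's job is to make it pass.

```python
# Red/green sketch with a hypothetical slugify() task.
# Phase 1 (red): the test exists before the implementation and must fail.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"

try:
    test_slugify()
except NameError:
    print("red: test fails, as expected")  # slugify is not implemented yet

# Phase 2 (green): the agent implements code until the test passes.
import re

def slugify(title: str) -> str:
    """Lowercase, strip punctuation, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

test_slugify()
print("green: test passes")
```

The point of the ordering is verification: because the agent saw the test fail first, a passing run demonstrates the implementation actually did the work, rather than the test being vacuous.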
**Understanding Code.** Agents can produce structured walkthroughs of codebases — useful for understanding projects you built with AI assistance without fully grasping every detail. Willison warns about the cognitive debt that accumulates when AI-generated code is incomprehensible, and advocates interactive explanations that help developers understand complex algorithms.
### How Coding Agents Work
A coding agent is a piece of software that acts as a harness for an LLM, extending it with additional capabilities through tools. The key components are:
**LLMs.** The foundation: machine learning models that predict the next token in a sequence. They process text by converting it into integer tokens, which matters because providers charge per token. Vision LLMs can also accept images as input.
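The text-to-integer-tokens step can be illustrated with a toy tokenizer — note this is not a real tokenizer (real ones use subword vocabularies of tens of thousands of entries), and the vocabulary and price below are invented for the sketch:

```python
# Toy illustration (not a real tokenizer): map text to integer tokens
# the way an LLM's tokenizer does, and estimate per-token API cost.
# The vocabulary, token IDs, and price are all made up for this sketch.

vocab = {"the": 0, "model": 1, "predicts": 2, "next": 3, "token": 4, "<unk>": 5}

def tokenize(text: str) -> list[int]:
    """Convert text to integer token IDs; unknown words map to <unk>."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]

tokens = tokenize("The model predicts the next token")
print(tokens)       # [0, 1, 2, 0, 3, 4]
print(len(tokens))  # 6 tokens -> the unit the provider bills for

# Hypothetical price: $3 per million input tokens
cost = len(tokens) * 3 / 1_000_000
print(f"${cost:.8f}")
```

Because billing is per token, anything that inflates token count — verbose prompts, replayed history, pasted files — directly inflates cost.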
**Chat-Templated Prompts.** Modern agents use a simulated conversation format. Since LLMs are stateless, the harness must replay the entire conversation with each new input, making longer conversations progressively more expensive.
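The replay cost can be made concrete with a small sketch — the message format loosely mirrors common chat APIs, but the token counting here is a crude one-token-per-word stand-in, not a real tokenizer:

```python
# Sketch: because the LLM is stateless, every new user message requires
# replaying the whole conversation. Token accounting here is a crude
# one-token-per-word approximation, for illustration only.

history: list[dict] = [{"role": "system", "content": "You are a coding agent."}]

def count_tokens(message: dict) -> int:
    # Stand-in for a real tokenizer: one token per whitespace-split word.
    return len(message["content"].split())

def send(user_text: str) -> int:
    """Append a user turn and return the tokens billed for this request,
    i.e. the size of the *entire* replayed conversation."""
    history.append({"role": "user", "content": user_text})
    return sum(count_tokens(m) for m in history)

print(send("Fix the failing test"))    # bills system prompt + turn 1
print(send("Now update the docs"))     # bills system prompt + turns 1 and 2
print(send("Thanks, run the linter"))  # bills everything yet again
```

Each call re-bills every earlier turn, which is why the cost of a conversation grows roughly quadratically with its length.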
**Tool Calling.** The distinguishing feature of agents is their ability to invoke tools — functions the harness makes available. Powerful agents include execution tools like `Bash()` or `Python()` for running code. The LLM generates tool-call requests in special markup, which the harness extracts and executes.

**System Prompts.** Lengthy initial instructions (sometimes hundreds of lines) inform the model about available tools and desired behavior, though users never see them.
**Reasoning.** Recent advances enable models to generate internal “thinking” text before responding, spending additional tokens on problem-solving — particularly valuable for debugging complex code.
Building a basic agent loop requires just a few dozen lines of code, though production-quality implementations demand substantially more effort.