What ActiveRecord is (and isn’t)
Active Record (the pattern) maps a database row to an in-memory object where persistence methods live on the object itself. ActiveRecord (Rails’ implementation) is a rich ORM built on top of:
- Adapters (PostgreSQL, MySQL, SQLite, etc.)
- A type system (casting and serialization)
- Relations (chainable, lazy query builders)
- Arel (a lower-level relational algebra builder that produces SQL)
Mental model: Table ⇄ Class, Row ⇄ Instance, Column ⇄ Attribute.
Where people go wrong:
- Treating ActiveRecord as a black box (leading to accidental N+1s, bad locking, etc.)
- Pushing app/business logic into callbacks/validations instead of explicit services/transactions
- Relying on validations for integrity instead of database constraints
The lifecycle: from Ruby call → SQL → Ruby objects
ActiveRecord turns your Ruby method calls into database queries through a layered process. Understanding these layers helps you predict behavior, optimize performance, and debug weird issues.
- You build a Relation by chaining query methods (no SQL yet).
When you write something like this:
users = User.where(active: true).order(created_at: :desc)
ActiveRecord doesn’t query the database right away. It builds an ActiveRecord::Relation: a lazy, composable object that stores information about what query will be executed.
You can see that by calling:
puts users.to_sql
# => SELECT "users".* FROM "users" WHERE "users"."active" = TRUE ORDER BY "users"."created_at" DESC
At this point, no SQL has been executed. You can chain more scopes and filters:
recent_admins = users.where(role: "admin").limit(5)
puts recent_admins.to_sql
# => SELECT "users".* FROM "users" WHERE "users"."active" = TRUE AND "users"."role" = 'admin' ORDER BY "users"."created_at" DESC LIMIT 5
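To make the laziness concrete, here is a minimal plain-Ruby sketch of the same idea. LazyQuery, its describe method, and the in-memory rows are all hypothetical stand-ins invented for illustration; this is not how ActiveRecord is implemented, only a toy that behaves the way the text describes: chaining builds a new description, and nothing runs until you iterate.

```ruby
# Toy model of a lazy, chainable query. LazyQuery is hypothetical and only
# imitates the behaviour described above; it is not ActiveRecord's code.
class LazyQuery
  include Enumerable

  def initialize(rows, filters: [], limit: nil, log: [])
    @rows = rows
    @filters = filters
    @limit = limit
    @log = log # shared record of every "query" that actually ran
  end

  # Chain methods return a new object and leave the receiver untouched.
  def where(conditions)
    LazyQuery.new(@rows, filters: @filters + [conditions], limit: @limit, log: @log)
  end

  def limit(count)
    LazyQuery.new(@rows, filters: @filters, limit: count, log: @log)
  end

  # Describes the query without running it (the toy counterpart of to_sql).
  def describe
    parts = @filters.flat_map { |f| f.map { |key, value| "#{key} = #{value.inspect}" } }
    text = parts.empty? ? "all rows" : "rows where #{parts.join(' AND ')}"
    @limit ? "#{text} limit #{@limit}" : text
  end

  # Iteration is the moment the query runs.
  def each(&block)
    @log << describe
    matches = @rows.select { |row| @filters.all? { |f| f.all? { |key, value| row[key] == value } } }
    matches = matches.first(@limit) if @limit
    matches.each(&block)
  end
end

rows = [
  { id: 1, active: true,  role: "admin" },
  { id: 2, active: true,  role: "member" },
  { id: 3, active: false, role: "admin" }
]
log = []
active = LazyQuery.new(rows, log: log).where(active: true)
admins = active.where(role: "admin").limit(5)

puts admins.describe                        # inspecting does not run anything
puts admins.map { |row| row[:id] }.inspect  # iterating does
```

Because every chain call returns a fresh object, `active` can still be reused for other queries after `admins` has been built from it.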
- On access (iteration, to_a, first, each, pluck, etc.), Rails compiles the relation to SQL (via Arel) and hits the DB.
When you finally need the data, for example by calling:
recent_admins.each do |user|
  puts user.email
end
Rails compiles your Relation into SQL using Arel and executes it through the configured adapter.
Under the hood:
- The relation is transformed into an Arel AST (Abstract Syntax Tree).
- The AST is compiled into SQL.
- The adapter sends the SQL over the database connection.
- The database executes it, and the driver (such as pg or mysql2) returns the raw result rows.
You can observe the SQL in your logs:
User Load (1.2ms) SELECT "users".* FROM "users" WHERE "users"."active" = TRUE AND "users"."role" = 'admin' ORDER BY "users"."created_at" DESC LIMIT 5
- Results are type-cast and materialised into model instances (or scalars for pluck).
Once the SQL runs, ActiveRecord receives a raw result set - usually an array of hashes like:
[{"id"=>1, "email"=>"admin@example.com", "role"=>"admin", "active"=>true}, ...]
It then:
- Instantiates model objects (User instances) without running validations or save callbacks; only after_find and after_initialize fire.
- Casts each value using ActiveModel::Type (for example, converts timestamps to ActiveSupport::TimeWithZone, JSON strings to Ruby hashes, booleans to true/false).
- Caches them in the relation’s internal @records array.
Example:
user = recent_admins.first
user.class # => User
user.email # => "admin@example.com"
user.active # => true (Boolean, not string)
At this stage, these are detached Ruby objects that mirror DB state at the time of query.
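The casting step can be pictured with a small plain-Ruby sketch. Everything here is an assumption made for illustration: CASTERS, cast_row, and USER_COLUMN_TYPES are hypothetical names, and the sketch assumes a driver that hands back every value as a string. It stands in for the role ActiveModel::Type plays and does not use its real API.

```ruby
require "json"
require "time"

# Hypothetical casters keyed by column type; a simplified stand-in for the
# job ActiveModel::Type does, not its actual interface.
CASTERS = {
  integer:  ->(raw) { Integer(raw) },
  boolean:  ->(raw) { %w[t true 1].include?(raw.to_s) },
  datetime: ->(raw) { Time.iso8601(raw) },
  json:     ->(raw) { JSON.parse(raw) },
  string:   ->(raw) { raw.to_s }
}.freeze

# Casts one raw row into Ruby values, leaving NULLs (nil) untouched.
def cast_row(raw_row, column_types)
  raw_row.to_h do |column, raw|
    caster = CASTERS.fetch(column_types.fetch(column))
    [column, raw.nil? ? nil : caster.call(raw)]
  end
end

# Assumed column types for an illustrative users table.
USER_COLUMN_TYPES = {
  "id" => :integer, "email" => :string, "active" => :boolean,
  "created_at" => :datetime, "settings" => :json
}.freeze

raw_row = {
  "id" => "1", "email" => "admin@example.com", "active" => "t",
  "created_at" => "2024-01-15T10:30:00Z", "settings" => '{"theme":"dark"}'
}
user_attributes = cast_row(raw_row, USER_COLUMN_TYPES)
puts user_attributes["active"].inspect   # a real Boolean, not the string "t"
puts user_attributes["settings"].inspect # a Ruby Hash parsed from JSON
```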
- Subsequent writes (save, update, destroy) produce DML with validations/callbacks/transactions as configured.
If you later call a write method like:
user.update!(last_login_at: Time.current)
ActiveRecord will:
- Run validations and callbacks.
- Generate an UPDATE SQL statement.
- Execute it via the same adapter.
- Refresh timestamps and dirty tracking.
Meanwhile, the query cache (enabled per-request) ensures repeated identical queries reuse cached results without another DB hit:
User.where(id: user.id).first # hits DB first time
User.where(id: user.id).first # served from cache
You can bypass the cache for benchmarking:
ActiveRecord::Base.uncached do
  User.where(id: user.id).first
end
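If the query cache feels like magic, a toy version may help. The sketch below is plain Ruby under stated assumptions: FakeConnection and its methods are hypothetical, the "database" just returns a string, and the cache is keyed by the SQL text. It mirrors the behaviour described above (identical reads are served from memory, writes clear the cache, an uncached block skips it) without claiming to match Rails internals.

```ruby
# Toy query cache keyed by SQL text. FakeConnection is hypothetical.
class FakeConnection
  attr_reader :db_hits

  def initialize
    @cache = {}
    @cache_enabled = true
    @db_hits = 0
  end

  # Reads go through the cache when it is enabled.
  def select_all(sql)
    return run(sql) unless @cache_enabled

    @cache.fetch(sql) { @cache[sql] = run(sql) }
  end

  # Any write invalidates everything that was cached.
  def execute_write(sql)
    @cache.clear
    run(sql)
  end

  # Temporarily bypasses the cache for the duration of the block.
  def uncached
    previous = @cache_enabled
    @cache_enabled = false
    yield
  ensure
    @cache_enabled = previous
  end

  private

  def run(sql)
    @db_hits += 1
    "result of: #{sql}"
  end
end

conn = FakeConnection.new
conn.select_all("SELECT * FROM users WHERE id = 1") # goes to the "database"
conn.select_all("SELECT * FROM users WHERE id = 1") # served from the cache
puts conn.db_hits
```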
TL;DR
Think of ActiveRecord as a pipeline: Relation -> SQL -> ResultSet -> Model objects. Each layer is inspectable - use to_sql, explain, and logs to peek inside.
The Relation API — the beating heart
A Relation is: (a) a reusable, immutable query descriptor, (b) chainable, (c) lazy.
Composing queries
paid = Order.where(status: :paid)
recent = Order.where("created_at > ?", 2.weeks.ago)
# Intersection (AND)
paid_recent = paid.merge(recent)
# Union-like patterns (OR)
urgent = Order.where(priority: :high)
# Both sides of `or` must be structurally compatible (same select, joins, limit, ...)
combined = paid.or(urgent)
Selecting only what you need
# Avoid SELECT *
Order.select(:id, :total_cents).where(status: :paid)
# Pluck for scalars (no model instantiation)
Order.where(status: :paid).pluck(:id)
# Aggregates
Order.where(status: :paid).sum(:total_cents)
Order.group(:status).count
Batching
Order.where(status: :paid).find_each(batch_size: 1_000) do |order|
  # efficient for large tables
end
Rule: find_each/in_batches stream rows ordered by primary key, so don’t combine them with a custom ORDER BY; an order set on the relation is ignored.
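Why does the primary-key order matter so much? A rough plain-Ruby sketch of the batching idea shows it: each batch starts where the previous one ended, which only works if rows come back sorted by id. The helper each_in_batches is hypothetical (not a Rails API) and assumes positive integer ids in an in-memory array.

```ruby
# Sketch of primary-key batching over an in-memory "table".
# each_in_batches is a hypothetical helper; it assumes positive integer ids.
def each_in_batches(rows, batch_size:)
  last_id = 0
  loop do
    # In spirit: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = rows.select { |row| row[:id] > last_id }
                .sort_by { |row| row[:id] }
                .first(batch_size)
    break if batch.empty?

    yield batch
    last_id = batch.last[:id] # the cursor for the next batch
  end
end

table = [5, 1, 9, 3, 7].map { |id| { id: id } }
each_in_batches(table, batch_size: 2) do |batch|
  puts batch.map { |row| row[:id] }.inspect
end
```

If the batches were sorted by anything other than id, the `id > last_id` cursor would skip or repeat rows, which is the reason a custom order has no place here.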
Associations: modeling relationships
The essentials
- belongs_to (child): required by default; pass optional: true to allow nulls.
- has_many, has_one, has_many :through, has_one :through
- Polymorphic associations: flexible, but consider indexing/storage cost.
class Order < ApplicationRecord
  belongs_to :account
  has_many :line_items, dependent: :destroy
  has_one :invoice
end
class LineItem < ApplicationRecord
  # counter_cache: true expects a line_items_count column on orders
  belongs_to :order, inverse_of: :line_items, counter_cache: true
end
Tips
- Always add foreign keys and null: false at the DB level when appropriate.
- Use inverse_of to help Rails avoid extra queries when building graphs.
- dependent: :destroy runs callbacks; :delete_all is faster (no callbacks) - choose intentionally.
Eager loading without surprises
- includes → decides between separate queries or LEFT OUTER JOIN based on usage.
- preload → always separate queries.
- eager_load → always uses JOINs (single SQL).
# N+1 fix
orders = Order.includes(:line_items).where(status: :paid)
orders.each { |o| o.line_items.map(&:sku) }
# Need to filter/order by joined table? Use references or eager_load
orders = Order.includes(:line_items).references(:line_items)
              .where(line_items: {sku: %w[AAA BBB]}).order("line_items.sku")
# or
orders = Order.eager_load(:line_items).where(line_items: {sku: %w[AAA BBB]})
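To see what an N+1 costs and what separate-query preloading saves, here is a plain-Ruby sketch that simply counts "queries". FakeStore, skus_n_plus_one, and skus_preloaded are hypothetical names, and the data lives in arrays; the point is only the shape of the two access patterns, not Rails' actual preloader.

```ruby
# Toy data store that counts how many "queries" it serves.
# FakeStore and both helper methods are hypothetical.
class FakeStore
  attr_reader :query_count

  def initialize(orders, line_items)
    @orders = orders
    @line_items = line_items
    @query_count = 0
  end

  def orders
    @query_count += 1
    @orders
  end

  # One query per parent: WHERE order_id = ?
  def line_items_for(order_id)
    @query_count += 1
    @line_items.select { |item| item[:order_id] == order_id }
  end

  # One query for all parents: WHERE order_id IN (...)
  def line_items_for_all(order_ids)
    @query_count += 1
    @line_items.select { |item| order_ids.include?(item[:order_id]) }
  end
end

# N+1: one query for the orders, then one more per order.
def skus_n_plus_one(store)
  store.orders.to_h do |order|
    [order[:id], store.line_items_for(order[:id]).map { |item| item[:sku] }]
  end
end

# Preload-style: two queries total, then stitch the results in memory.
def skus_preloaded(store)
  orders = store.orders
  items = store.line_items_for_all(orders.map { |order| order[:id] })
  by_order = items.group_by { |item| item[:order_id] }
  orders.to_h do |order|
    [order[:id], by_order.fetch(order[:id], []).map { |item| item[:sku] }]
  end
end
```

With three orders, the first approach issues four queries and the second issues two; the gap grows linearly with the number of parents.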
Validations, callbacks, and the object lifecycle
Validations
class User < ApplicationRecord
  validates :email, presence: true, uniqueness: true
end
But remember: application validations ≠ database integrity. Reinforce with:
add_index :users, :email, unique: true
Callbacks: powerful and dangerous
Common ones:
- before_validation
- before_save
- after_commit
Example:
class Order < ApplicationRecord
  after_commit :send_receipt, on: :create
end
Keep callbacks idempotent. For complex workflows, use service objects.
Dirty tracking
user.email = "new@example.com"
user.changed? # => true
user.save!
user.saved_change_to_email? # => true
user.email_before_last_save # => "old@example.com"
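Dirty tracking boils down to keeping two copies of the attributes and comparing them. The sketch below is a minimal plain-Ruby illustration of that idea; TrackedRecord and its method names are hypothetical and deliberately simpler than what ActiveModel::Dirty provides.

```ruby
# Minimal dirty-tracking sketch. TrackedRecord is hypothetical.
class TrackedRecord
  def initialize(attributes)
    @saved = attributes.dup    # values as of the last load or save
    @current = attributes.dup  # values as edited in memory
    @previous_changes = {}     # what the most recent save changed
  end

  def [](name)
    @current[name]
  end

  def []=(name, value)
    @current[name] = value
  end

  # Unsaved edits as { attribute => [old, new] }.
  def changes
    @current.each_with_object({}) do |(name, value), diff|
      diff[name] = [@saved[name], value] unless @saved[name] == value
    end
  end

  def changed?
    !changes.empty?
  end

  # "Persists" by promoting current values and remembering what changed.
  def save
    @previous_changes = changes
    @saved = @current.dup
    true
  end

  def saved_change_to?(name)
    @previous_changes.key?(name)
  end

  def value_before_last_save(name)
    @previous_changes.key?(name) ? @previous_changes[name].first : @saved[name]
  end
end

record = TrackedRecord.new(email: "old@example.com")
record[:email] = "new@example.com"
puts record.changes.inspect
record.save
puts record.value_before_last_save(:email)
```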
Transactions & locking
Transactions
ActiveRecord::Base.transaction do
  order.save!
  payment.capture!
end
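If any statement raises, every database change made inside the block rolls back atomically. Side effects outside the database (such as a call to an external payment API) are not undone.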
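The all-or-nothing behaviour can be sketched in a few lines of plain Ruby. TinyStore is a hypothetical in-memory store that snapshots its data before the block and restores it if anything raises; a real database achieves the same effect very differently, so treat this purely as a mental model.

```ruby
# Hypothetical in-memory store with snapshot-based rollback.
class TinyStore
  attr_reader :data

  def initialize
    @data = {}
  end

  # Runs the block; on any error, restores the pre-block state and re-raises.
  def transaction
    snapshot = Marshal.load(Marshal.dump(@data)) # deep copy
    yield self
  rescue StandardError
    @data = snapshot
    raise
  end
end

store = TinyStore.new
store.data[:order_status] = "pending"

begin
  store.transaction do |tx|
    tx.data[:order_status] = "paid"
    tx.data[:payment_id] = 42
    raise "payment capture failed" # simulated failure midway
  end
rescue RuntimeError => e
  puts "rolled back: #{e.message}"
end

puts store.data.inspect # both writes from the block are gone
```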
Locking
Optimistic:
# Requires `lock_version` column
order.update!(status: :paid)
Raises ActiveRecord::StaleObjectError if another process updated the record after this copy was loaded.
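The mechanism behind optimistic locking is a version check at write time. Here is a plain-Ruby sketch of that check; VersionedTable and StaleRecordError are hypothetical names, and the table is just a hash. The comment shows the kind of conditional UPDATE the check corresponds to.

```ruby
class StaleRecordError < StandardError; end

# Hypothetical in-memory table whose rows carry a lock_version.
class VersionedTable
  def initialize
    @rows = {}
  end

  def insert(id, attributes)
    @rows[id] = attributes.merge(lock_version: 0)
  end

  # Returns a copy, like loading a record into memory.
  def find(id)
    @rows.fetch(id).dup
  end

  # In spirit: UPDATE ... SET ..., lock_version = v + 1
  #            WHERE id = ? AND lock_version = v
  def update(id, loaded_row, changes)
    current = @rows.fetch(id)
    unless current[:lock_version] == loaded_row[:lock_version]
      raise StaleRecordError, "row #{id} changed since it was loaded"
    end

    @rows[id] = current.merge(changes).merge(lock_version: current[:lock_version] + 1)
  end
end

table = VersionedTable.new
table.insert(1, status: "pending")
first_copy = table.find(1)
second_copy = table.find(1)

table.update(1, first_copy, status: "paid") # succeeds, version becomes 1
begin
  table.update(1, second_copy, status: "cancelled") # stale copy
rescue StaleRecordError => e
  puts e.message
end
```

Nothing is locked while the copies are in memory; the conflict is only detected when the second writer tries to save, which is why this approach suits low-contention data.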
Pessimistic:
Order.transaction do
  o = Order.lock.find(1)
  o.update!(status: :paid)
end
Use locks to prevent race conditions in concurrent environments.
Query performance playbook
- Select less: Don’t SELECT *.
- Batch work: Use find_each for processing.
- Avoid N+1s: Use eager loading.
- Count smartly: count always runs a COUNT query, size reuses already-loaded records (or a counter cache) when it can, length loads every record.
- Profile queries: Use .explain or Rails logs.
- Cache results where appropriate.
Order.where(status: :paid).explain
Arel and raw power
Arel builds SQL through composable Ruby objects:
users = Arel::Table.new(:users)
q = users.project(users[:id], users[:email]).where(users[:created_at].gt(2.weeks.ago))
User.find_by_sql(q.to_sql)
Useful for:
- Window functions
- Complex joins
- Vendor-specific SQL
Prefer pure ActiveRecord where possible for readability.
Debugging & instrumentation
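ActiveRecord::Base.logger = Logger.new($stdout) # log every query to stdout
relation.to_sql  # the SQL string for this relation
relation.explain # the database's query plan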
Monitor SQL performance:
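ActiveSupport::Notifications.subscribe("sql.active_record") do |event|
  # A one-argument block receives an ActiveSupport::Notifications::Event;
  # the duration lives on the event, not in the payload
  puts "SQL: #{event.payload[:sql]} (#{event.duration.round(1)}ms)"
end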
Schema & migrations best practices
- Use constraints: null: false, foreign_key: true
- Add indexes for lookup columns
- Add check constraints for numeric invariants
Example:
create_table :orders do |t|
  t.references :account, null: false, foreign_key: true
  t.integer :total_cents, null: false, default: 0
  # `check:` is not a column option; declare the constraint separately (Rails 6.1+)
  t.check_constraint "total_cents >= 0"
end
Bulk operations & upserts
User.insert_all([{ email: "a@x.com" }, { email: "b@x.com" }])
User.upsert_all([{ email: "a@x.com" }], unique_by: :index_users_on_email)
Both bypass callbacks and validations, which makes them a good fit for imports, sync jobs, and ETL tasks.
Advanced locking and concurrency
Advisory locks (Postgres)
conn = ActiveRecord::Base.connection # lock and unlock on the same connection
conn.execute("SELECT pg_advisory_lock(12345)")
begin
  # critical section
ensure
  conn.execute("SELECT pg_advisory_unlock(12345)") # release even if the section raises
end
Connection pools
production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS", 5) %>
  timeout: 5000
Monitor pool exhaustion to prevent timeouts under load.
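Pool exhaustion is easier to reason about with a toy pool in front of you. The sketch below is plain Ruby with hypothetical names (TinyPool, PoolExhaustedError). One deliberate simplification: a real pool waits for a connection to be returned before giving up, whereas this one fails immediately when nothing is free.

```ruby
class PoolExhaustedError < StandardError; end

# Hypothetical fixed-size pool. Unlike a real pool, it does not wait for a
# free connection; it raises as soon as none is available.
class TinyPool
  def initialize(size)
    @available = Queue.new
    size.times { |index| @available << "conn-#{index}" }
  end

  # Checks a connection out for the block and always returns it afterwards.
  def with_connection
    conn = checkout
    yield conn
  ensure
    @available << conn if conn
  end

  private

  def checkout
    @available.pop(true) # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    raise PoolExhaustedError, "all connections are in use"
  end
end

pool = TinyPool.new(1)
pool.with_connection do |conn|
  puts "working with #{conn}"
  begin
    pool.with_connection { |other| puts other } # the only connection is busy
  rescue PoolExhaustedError => e
    puts e.message
  end
end
```

The same thing happens in production when more threads want a connection than the pool holds, which is why the pool size is usually tied to the thread count as in the config above.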
Real-world example: an order system
Schema:
create_table :orders do |t|
  t.references :account, null: false, foreign_key: true
  t.integer :status, default: 0
  t.integer :total_cents, default: 0
  t.timestamps
end
create_table :line_items do |t|
  t.references :order, null: false, foreign_key: true
  t.string :sku
  t.integer :qty, default: 1
  t.integer :price_cents, default: 0
end
Model:
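class Order < ApplicationRecord
  belongs_to :account
  has_many :line_items
  # Rails 7+ also accepts the positional form: enum :status, { pending: 0, ... }
  enum status: { pending: 0, paid: 1, cancelled: 2 }

  # Sums qty * price_cents in SQL and stores the result on the order
  def recalc_total!
    update!(total_cents: line_items.sum("qty * price_cents"))
  end
end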
Usage:
order = account.orders.create!(status: :pending)
order.line_items.create!(sku: "AAA", qty: 2, price_cents: 500)
order.recalc_total!
order.update!(status: :paid)
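As a sanity check on the arithmetic, here is the same total computed in plain Ruby. The helper total_cents is hypothetical and works on hashes rather than records; recalc_total! does the equivalent sum inside the database.

```ruby
# Same arithmetic as SUM(qty * price_cents), on plain hashes.
# total_cents is a hypothetical helper for illustration.
def total_cents(line_items)
  line_items.sum { |item| item[:qty] * item[:price_cents] }
end

items = [{ sku: "AAA", qty: 2, price_cents: 500 }]
puts total_cents(items) # 2 * 500 = 1000
```

Keeping money in integer cents, as the schema does, avoids floating-point rounding in sums like this one.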
Production checklist
- Database constraints match model validations
- Bullet gem to detect N+1s
- Query plans reviewed with EXPLAIN
- Connection pool tuned
- Background jobs wrapped in transactions
- Clear rules for delete vs destroy, update_all vs update
Final thoughts
ActiveRecord is a beautifully layered ORM. Treat it as a transparent abstraction, not a black box:
- Inspect your SQL
- Enforce integrity at the DB
- Compose clean, lazy relations
- Measure before optimizing
Master these and you’ll think in relations, not just records — the mark of a real Rails engineer.