Ruby on Rails / October 9, 2025 / 7 mins read / By Wagner Matos

ActiveRecord, Deconstructed: A Deep Dive into Rails’ ORM

What ActiveRecord is (and isn’t)

Active Record (the pattern) maps a database row to an in-memory object where persistence methods live on the object itself. ActiveRecord (Rails’ implementation) is a rich ORM built on top of:

  • Adapters (PostgreSQL, MySQL, SQLite, etc.)
  • A type system (casting and serialization)
  • Relations (chainable, lazy query builders)
  • Arel (a lower-level relational algebra builder that produces SQL)

Mental model: Table ⇄ Class, Row ⇄ Instance, Column ⇄ Attribute.

Where people go wrong:

  • Treating ActiveRecord as a black box (leading to accidental N+1s, bad locking, etc.)
  • Pushing app/business logic into callbacks/validations instead of explicit services/transactions
  • Relying on validations for integrity instead of database constraints

The lifecycle: from Ruby call → SQL → Ruby objects

ActiveRecord turns your Ruby method calls into database queries through a layered process. Understanding these layers helps you predict behavior, optimize performance, and debug weird issues.

  1. You build a Relation by chaining query methods (no SQL yet).

When you write something like this:

users = User.where(active: true).order(created_at: :desc)

ActiveRecord doesn’t query the database right away. It builds an ActiveRecord::Relation - a lazy, composable object that stores information about what query will be executed.

You can see that by calling:

puts users.to_sql
# => SELECT "users".* FROM "users" WHERE "users"."active" = TRUE ORDER BY "users"."created_at" DESC

At this point, no SQL has been executed. You can chain more scopes and filters:

recent_admins = users.where(role: "admin").limit(5)
puts recent_admins.to_sql
# => SELECT "users".* FROM "users" WHERE "users"."active" = TRUE AND "users"."role" = 'admin' ORDER BY "users"."created_at" DESC LIMIT 5

  2. On access (iteration, to_a, first, each, pluck, etc.), Rails compiles the relation to SQL (via Arel) and hits the DB.

When you finally need the data - for example by calling:

recent_admins.each do |user|
  puts user.email
end

Rails compiles your Relation into SQL using Arel and executes it through the configured adapter.

Under the hood:

  1. The relation is transformed into an Arel AST (Abstract Syntax Tree).
  2. The AST is compiled into SQL.
  3. The adapter sends the SQL to the database connection.
  4. The DB driver (like pg or mysql2) executes it and returns the raw result rows.
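The AST-then-compile idea in the steps above can be sketched in plain Ruby. This is a toy model only: the node names (Column, Eq, And, Select) and the compile method are hypothetical, and Arel's real classes and quoting rules are considerably more involved.

```ruby
# Toy AST and compiler, loosely inspired by the Arel idea. The node names
# (Column, Eq, And, Select) are hypothetical and are not Arel's classes.
Column = Struct.new(:table, :name)
Eq     = Struct.new(:left, :right)
And    = Struct.new(:left, :right)
Select = Struct.new(:table, :where, :limit)

# Walks the tree and emits SQL text for each node type.
def compile(node)
  case node
  when Select
    sql = "SELECT \"#{node.table}\".* FROM \"#{node.table}\""
    sql += " WHERE #{compile(node.where)}" if node.where
    sql += " LIMIT #{Integer(node.limit)}" if node.limit
    sql
  when And    then "#{compile(node.left)} AND #{compile(node.right)}"
  when Eq     then "#{compile(node.left)} = #{compile(node.right)}"
  when Column then "\"#{node.table}\".\"#{node.name}\""
  when String then "'#{node.gsub("'", "''")}'" # naive quoting, illustration only
  when true   then "TRUE"
  when false  then "FALSE"
  else raise ArgumentError, "unknown node: #{node.inspect}"
  end
end

ast = Select.new(
  "users",
  And.new(
    Eq.new(Column.new("users", "active"), true),
    Eq.new(Column.new("users", "role"), "admin")
  ),
  5
)

puts compile(ast)
# SELECT "users".* FROM "users" WHERE "users"."active" = TRUE AND "users"."role" = 'admin' LIMIT 5
```

Keeping the query as a tree until the last moment is what lets later calls add conditions without any string surgery.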

You can observe the SQL in your logs:

User Load (1.2ms)  SELECT "users".* FROM "users" WHERE "users"."active" = TRUE AND "users"."role" = 'admin' ORDER BY "users"."created_at" DESC LIMIT 5

  3. Results are type-cast and materialised into model instances (or scalars for pluck).

Once the SQL runs, ActiveRecord receives a raw result set - usually an array of hashes like:

[{"id"=>1, "email"=>"admin@example.com", "role"=>"admin", "active"=>true}, ...]

It then:

  • Instantiates model objects through an internal path (not User.new) without running validations; of the callbacks, only after_find and after_initialize fire.
  • Casts each value using ActiveModel::Type (for example, converts timestamps to ActiveSupport::TimeWithZone, JSON strings to Ruby hashes, booleans to true/false).
  • Caches them in the relation’s internal @records array.
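To make the casting step concrete, here is a small plain-Ruby sketch. It assumes the driver hands back every value as a string and that the schema is known up front; CASTERS, SCHEMA, and cast_row are hypothetical names, and the real work is done by ActiveModel::Type.

```ruby
require "json"
require "time"

# Toy type casters keyed by column type. ActiveRecord's real casting lives in
# ActiveModel::Type; CASTERS and cast_row are hypothetical names.
CASTERS = {
  integer:  ->(v) { Integer(v) },
  boolean:  ->(v) { %w[t true 1].include?(v.to_s) },
  datetime: ->(v) { Time.iso8601(v) },
  json:     ->(v) { JSON.parse(v) },
  string:   ->(v) { v.to_s }
}.freeze

# Assumed schema for a users table (column name => column type).
SCHEMA = { "id" => :integer, "email" => :string, "active" => :boolean,
           "created_at" => :datetime, "settings" => :json }.freeze

# Casts each raw value with the caster for its column; NULLs stay nil.
def cast_row(raw)
  raw.to_h do |col, value|
    [col, value.nil? ? nil : CASTERS.fetch(SCHEMA.fetch(col)).call(value)]
  end
end

raw = { "id" => "1", "email" => "admin@example.com", "active" => "t",
        "created_at" => "2025-10-09T12:00:00Z", "settings" => '{"theme":"dark"}' }
row = cast_row(raw)

p row["id"]               # 1
p row["active"]           # true
p row["created_at"].class # Time
p row["settings"]["theme"] # "dark"
```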

Example:

user = recent_admins.first
user.class # => User
user.email # => "admin@example.com"
user.active # => true (Boolean, not string)

At this stage, these are detached Ruby objects that mirror DB state at the time of query.

  4. Subsequent writes (save, update, destroy) produce DML with validations/callbacks/transactions as configured.

If you later call a write method like:

user.update!(last_login_at: Time.current)

ActiveRecord will:

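  1. Open a transaction, then run validations and the before_* callbacks.
  2. Set updated_at and generate an UPDATE statement for the changed columns only.
  3. Execute it via the same adapter.
  4. Move the pending changes into saved_changes, run the after_* callbacks, commit, then fire after_commit.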

Meanwhile, the query cache (enabled per request) lets repeated identical queries within that request reuse cached results without another DB hit. Any write through the connection clears it:

User.where(id: user.id).first # hits DB first time
User.where(id: user.id).first # served from cache

You can bypass the cache for benchmarking:

ActiveRecord::Base.uncached do
  User.where(id: user.id).first
end
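The caching behaviour described above (reuse on an identical query, bypass on demand, invalidate on write) can be modelled in a few lines of plain Ruby. ToyQueryCache is a hypothetical class for illustration, keyed by SQL text only; it is not the real query cache implementation.

```ruby
# Toy per-request query cache keyed by SQL text. ToyQueryCache is a
# hypothetical class, not ActiveRecord's query cache.
class ToyQueryCache
  attr_reader :reads # number of SELECTs that actually reached the "database"

  def initialize(&database)
    @database = database # block that "runs" SQL and returns rows
    @cache = {}
    @enabled = true
    @reads = 0
  end

  def select(sql)
    return run(sql) unless @enabled

    @cache.fetch(sql) { @cache[sql] = run(sql) }
  end

  # Any write invalidates everything, since it may affect cached rows.
  def execute_write(sql)
    @cache.clear
    @database.call(sql)
  end

  # Temporarily bypasses the cache for the duration of the block.
  def uncached
    previous = @enabled
    @enabled = false
    yield
  ensure
    @enabled = previous
  end

  private

  def run(sql)
    @reads += 1
    @database.call(sql)
  end
end

cache = ToyQueryCache.new { |_sql| [{ "id" => 1 }] }
sql = "SELECT * FROM users WHERE id = 1"

cache.select(sql)
cache.select(sql)
puts cache.reads # 1, the second call was served from the cache

cache.uncached { cache.select(sql) }
puts cache.reads # 2, the cache was bypassed

cache.execute_write("UPDATE users SET active = FALSE WHERE id = 1")
cache.select(sql)
puts cache.reads # 3, the write cleared the cache
```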

TL;DR

Think of ActiveRecord as a pipeline: Relation -> SQL -> ResultSet -> Model objects. Each layer is inspectable - use to_sql, explain, and logs to peek inside.

The Relation API — the beating heart

A Relation is: (a) a reusable, immutable query descriptor, (b) chainable, (c) lazy.
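Those three properties are easy to see in a toy model. The sketch below is plain Ruby, not ActiveRecord: ToyRelation and its executor block are hypothetical, and the SQL it builds is deliberately simplistic.

```ruby
# Toy model of a lazy, immutable query builder. NOT ActiveRecord's code:
# ToyRelation and its executor block are hypothetical names for illustration.
class ToyRelation
  attr_reader :wheres, :limit_value

  def initialize(table, wheres: [], limit_value: nil, &executor)
    @table = table
    @wheres = wheres.freeze
    @limit_value = limit_value
    @executor = executor
    @records = nil # nothing loaded yet
  end

  # Each query method returns a NEW relation; the receiver is untouched.
  def where(condition)
    ToyRelation.new(@table, wheres: wheres + [condition], limit_value: limit_value, &@executor)
  end

  def limit(count)
    ToyRelation.new(@table, wheres: wheres, limit_value: count, &@executor)
  end

  def to_sql
    sql = +"SELECT * FROM #{@table}"
    sql << " WHERE " << wheres.join(" AND ") unless wheres.empty?
    sql << " LIMIT #{limit_value}" if limit_value
    sql
  end

  # Execution happens only here, and the result is memoized.
  def to_a
    @records ||= @executor.call(to_sql)
  end
end

executed = []
base = ToyRelation.new("users") { |sql| executed << sql; [:row] }
admins = base.where("role = 'admin'").limit(5)

puts base.to_sql    # SELECT * FROM users
puts admins.to_sql  # SELECT * FROM users WHERE role = 'admin' LIMIT 5
puts executed.size  # 0, nothing has run yet
admins.to_a
admins.to_a
puts executed.size  # 1, the second call reused the loaded records
```

Because every chained call returns a fresh object, a base relation can be shared safely between scopes without one caller's filters leaking into another's.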

Composing queries

paid = Order.where(status: :paid)
recent = Order.where("created_at > ?", 2.weeks.ago)

# Intersection (AND)
paid_recent = paid.merge(recent)

# Union-like patterns (OR)
urgent = Order.where(priority: :high)
combined = paid.or(urgent)

Selecting only what you need

# Avoid SELECT *
Order.select(:id, :total_cents).where(status: :paid)

# Pluck for scalars (no model instantiation)
Order.where(status: :paid).pluck(:id)

# Aggregates
Order.where(status: :paid).sum(:total_cents)
Order.group(:status).count

Batching

Order.where(status: :paid).find_each(batch_size: 1_000) do |order|
  # efficient for large tables
end

Rule: find_each/in_batches page through rows ordered by primary key, so a custom ORDER BY on the relation is ignored (or raises, depending on configuration); don’t combine the two.
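Why the primary-key ordering matters becomes clearer with a sketch of the underlying cursor pattern. The plain-Ruby example below runs over an in-memory array; each_batch is a hypothetical helper that mimics "WHERE id > last_id ORDER BY id LIMIT n" and is not how Rails implements batching internally.

```ruby
# Toy keyset batching over an in-memory "table". each_batch is a hypothetical
# helper that mimics the id > last_id ORDER BY id LIMIT n pattern.
ROWS = [7, 2, 9, 4, 1, 8].map { |id| { id: id } }.freeze

def each_batch(rows, batch_size:)
  last_id = 0
  loop do
    # Equivalent of: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = rows.select { |r| r[:id] > last_id }
                .sort_by { |r| r[:id] }
                .first(batch_size)
    break if batch.empty?

    yield batch
    last_id = batch.last[:id] # the cursor only works because rows are id-ordered
  end
end

each_batch(ROWS, batch_size: 4) { |batch| p batch.map { |r| r[:id] } }
# [1, 2, 4, 7]
# [8, 9]
```

If the batches were sorted by anything other than id, the last row's id would no longer be a valid cursor and rows could be skipped or repeated.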

Associations: modeling relationships

The essentials

  • belongs_to (child) - required by default; optional: true to allow nulls.
  • has_many, has_one, has_many :through, has_one :through
  • Polymorphic associations: flexible, but consider indexing/storage cost.

class Order < ApplicationRecord
  belongs_to :account
  has_many :line_items, dependent: :destroy
  has_one :invoice
end

class LineItem < ApplicationRecord
  # counter_cache: true expects an integer line_items_count column on orders
  belongs_to :order, inverse_of: :line_items, counter_cache: true
end

Tips

  • Always add foreign keys and {null: false} at the DB level when appropriate.
  • Use inverse_of to help Rails avoid extra queries when building graphs.
  • dependent: :destroy runs callbacks; :delete_all is faster (no callbacks) - choose intentionally.

Eager loading without surprises

  • includes → decides between separate queries or LEFT OUTER JOIN based on usage.
  • preload → always separate queries.
  • eager_load → always uses JOINs (single SQL).

# N+1 fix
orders = Order.includes(:line_items).where(status: :paid)
orders.each { |o| o.line_items.map(&:sku) }

# Need to filter/order by joined table? Use references or eager_load
orders = Order.includes(:line_items).references(:line_items)
             .where(line_items: {sku: %w[AAA BBB]}).order("line_items.sku")
# or
orders = Order.eager_load(:line_items).where(line_items: {sku: %w[AAA BBB]})
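The difference in query counts is worth seeing in isolation. The following plain-Ruby sketch uses a fake in-memory database that only counts calls; FakeDB and its methods are hypothetical and merely mimic the shape of the SQL that a lazy association versus a preload-style load would issue.

```ruby
# Toy in-memory "database" that counts queries. FakeDB and its methods are
# hypothetical; they only mimic the shape of the SQL that would be issued.
class FakeDB
  attr_reader :query_count

  ORDERS = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze
  LINE_ITEMS = [
    { order_id: 1, sku: "AAA" }, { order_id: 1, sku: "BBB" },
    { order_id: 2, sku: "CCC" }, { order_id: 3, sku: "AAA" }
  ].freeze

  def initialize
    @query_count = 0
  end

  def orders
    @query_count += 1 # SELECT * FROM orders
    ORDERS
  end

  def line_items_for(order_ids)
    @query_count += 1 # SELECT * FROM line_items WHERE order_id IN (...)
    LINE_ITEMS.select { |li| order_ids.include?(li[:order_id]) }
  end
end

# N+1: one query for orders, then one more per order.
db = FakeDB.new
n_plus_one = db.orders.to_h do |o|
  [o[:id], db.line_items_for([o[:id]]).map { |li| li[:sku] }]
end
puts db.query_count # 4

# Preload style: one query for orders, one for all of their line items.
db2 = FakeDB.new
orders = db2.orders
grouped = db2.line_items_for(orders.map { |o| o[:id] }).group_by { |li| li[:order_id] }
preloaded = orders.to_h { |o| [o[:id], grouped.fetch(o[:id], []).map { |li| li[:sku] }] }
puts db2.query_count # 2
```

Both approaches return the same data; only the number of round trips changes, and that number grows with the row count in the first case but stays constant in the second.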

Validations, callbacks, and the object lifecycle

Validations

class User < ApplicationRecord
  validates :email, presence: true, uniqueness: true
end

But remember: application validations ≠ database integrity. Reinforce with:

add_index :users, :email, unique: true

Callbacks: powerful and dangerous

Common ones:

  • before_validation
  • before_save
  • after_commit

Example:

class Order < ApplicationRecord
  after_commit :send_receipt, on: :create
end

Keep callbacks idempotent. For complex workflows, use service objects.

Dirty tracking

user.email = "new@example.com"
user.changed?                 # => true
user.save!
user.saved_change_to_email?   # => true after save
user.email_before_last_save   # => "old@example.com"

Transactions & locking

Transactions

ActiveRecord::Base.transaction do
  order.save!
  payment.capture!
end

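If any statement in the block raises, every database change made inside it rolls back atomically and the exception propagates. Side effects outside the database, such as a call to an external payment API, are not undone.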
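As a rough mental model, a transaction behaves like "snapshot, run, restore on error". The plain-Ruby sketch below illustrates that shape over an in-memory hash. ToyStore is hypothetical and ignores isolation, nesting, and concurrency; it also shows that work done outside the store is not rolled back.

```ruby
# Toy transaction over an in-memory hash: snapshot, run the block, restore on
# error. ToyStore is hypothetical and ignores isolation and concurrency.
class ToyStore
  attr_reader :data

  def initialize(data = {})
    @data = data
  end

  def transaction
    snapshot = Marshal.load(Marshal.dump(@data)) # deep copy of current state
    yield self
  rescue StandardError
    @data = snapshot # roll back every change made inside the block
    raise
  end
end

store = ToyStore.new(order: "pending", payment: "none")
emails_sent = []

begin
  store.transaction do |s|
    s.data[:order] = "paid"
    emails_sent << "receipt" # side effect outside the store
    raise "card declined"    # simulates a failing payment capture
  end
rescue RuntimeError => e
  puts "rolled back: #{e.message}" # rolled back: card declined
end

puts store.data[:order] # pending
puts emails_sent.size   # 1, the external side effect was NOT undone
```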

Locking

Optimistic:

# Requires `lock_version` column
order.update!(status: :paid)

Raises ActiveRecord::StaleObjectError if another process updated the row after this object was loaded.
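Optimistic locking boils down to a compare-and-swap on the version column: the UPDATE only matches if lock_version is still the value the object was loaded with. Here is a plain-Ruby sketch of that check; StaleError and ToyTable are hypothetical stand-ins for ActiveRecord::StaleObjectError and a real table.

```ruby
# Toy compare-and-swap on lock_version. StaleError and ToyTable are
# hypothetical stand-ins for ActiveRecord::StaleObjectError and a real table.
class StaleError < StandardError; end

class ToyTable
  def initialize
    @rows = { 1 => { status: "pending", lock_version: 0 } }
  end

  def find(id)
    @rows.fetch(id).dup # each caller gets its own snapshot
  end

  # Mimics: UPDATE ... SET status = ?, lock_version = lock_version + 1
  #         WHERE id = ? AND lock_version = ?
  def update(id, snapshot, status:)
    current = @rows.fetch(id)
    unless current[:lock_version] == snapshot[:lock_version]
      raise StaleError, "row #{id} changed since it was loaded"
    end

    @rows[id] = { status: status, lock_version: current[:lock_version] + 1 }
  end
end

table = ToyTable.new
alice = table.find(1)
bob   = table.find(1)

table.update(1, alice, status: "paid") # succeeds, lock_version becomes 1
begin
  table.update(1, bob, status: "cancelled") # bob still holds lock_version 0
rescue StaleError => e
  puts "conflict: #{e.message}" # conflict: row 1 changed since it was loaded
end
```

The usual recovery is to reload the record and retry, or to surface the conflict to the user.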

Pessimistic:

Order.transaction do
  o = Order.lock.find(1)
  o.update!(status: :paid)
end

Use locks to prevent race conditions in concurrent environments.


Query performance playbook

  1. Select less: Don’t SELECT *.
  2. Batch work: Use find_each for processing.
  3. Avoid N+1s: Use eager loading.
  4. Count smartly: count always runs a COUNT query; size uses the loaded records if present and falls back to COUNT; length loads every record, then counts in Ruby.
  5. Profile queries: Use .explain or Rails logs.
  6. Cache results where appropriate.

Order.where(status: :paid).explain

Arel and raw power

Arel builds SQL through composable Ruby objects:

users = Arel::Table.new(:users)
q = users.project(users[:id], users[:email]).where(users[:created_at].gt(2.weeks.ago))
User.find_by_sql(q.to_sql)

Useful for:

  • Window functions
  • Complex joins
  • Vendor-specific SQL

Prefer pure ActiveRecord where possible for readability.


Debugging & instrumentation

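ActiveRecord::Base.logger = Logger.new($stdout) # log every SQL statement
relation.to_sql   # the SQL a relation would run
relation.explain  # the database's query plan for it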

Monitor SQL performance:

ActiveSupport::Notifications.subscribe("sql.active_record") do |event|
  puts "SQL: #{event.payload[:sql]} (#{event.duration.round(1)}ms)"
end

Schema & migrations best practices

  • Use constraints: null: false, foreign_key: true
  • Add indexes for lookup columns
  • Add check constraints for numeric invariants

Example:

create_table :orders do |t|
  t.references :account, null: false, foreign_key: true
  t.integer :total_cents, null: false, default: 0
  t.check_constraint "total_cents >= 0" # check constraints need Rails 6.1+
end

Bulk operations & upserts

User.insert_all([{ email: "a@x.com" }, { email: "b@x.com" }])
User.upsert_all([{ email: "a@x.com" }], unique_by: :index_users_on_email)

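Both bypass callbacks and validations, which makes them a good fit for imports, sync jobs, and ETL tasks. insert_all skips rows that would violate a unique index; upsert_all updates them instead.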
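The skip-versus-overwrite distinction between the two calls can be shown with a toy table in plain Ruby. ToyUsers is a hypothetical in-memory stand-in where a hash keyed by email plays the role of the unique index; real databases do this with ON CONFLICT or an equivalent clause.

```ruby
# Toy table with a unique index on email. insert_all skips conflicting rows and
# upsert_all overwrites them; ToyUsers is a hypothetical in-memory stand-in.
class ToyUsers
  def initialize
    @by_email = {} # plays the role of the unique index
  end

  # Like ON CONFLICT DO NOTHING: an existing row wins.
  def insert_all(rows)
    rows.each { |row| @by_email[row[:email]] ||= row }
  end

  # Like ON CONFLICT DO UPDATE: the incoming columns win.
  def upsert_all(rows)
    rows.each do |row|
      existing = @by_email.fetch(row[:email], {})
      @by_email[row[:email]] = existing.merge(row)
    end
  end

  def all
    @by_email.values
  end
end

users = ToyUsers.new
users.insert_all([{ email: "a@x.com", name: "A" }, { email: "b@x.com", name: "B" }])
users.insert_all([{ email: "a@x.com", name: "ignored" }]) # duplicate, skipped
users.upsert_all([{ email: "b@x.com", name: "B2" }])      # duplicate, updated

p users.all.map { |u| u[:name] } # ["A", "B2"]
```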


Advanced locking and concurrency

Advisory locks (Postgres)

ActiveRecord::Base.connection.execute("SELECT pg_advisory_lock(12345)")
begin
  # critical section
ensure
  ActiveRecord::Base.connection.execute("SELECT pg_advisory_unlock(12345)")
end

Connection pools

production:
  pool: <%= ENV.fetch("RAILS_MAX_THREADS", 5) %>
  timeout: 5000

Monitor pool exhaustion to prevent timeouts under load.
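What "pool exhaustion" means is easiest to see with a toy pool. The plain-Ruby sketch below hands out a fixed number of fake connections and raises when none are left. ToyPool and PoolExhausted are hypothetical; the real pool waits for a configurable checkout timeout before giving up, whereas this one fails immediately.

```ruby
# Toy fixed-size pool built on Queue. ToyPool and PoolExhausted are
# hypothetical; the real pool waits up to a checkout timeout before raising.
class PoolExhausted < StandardError; end

class ToyPool
  def initialize(size)
    @available = Queue.new
    size.times { |i| @available << "conn-#{i}" }
  end

  def checkout
    @available.pop(true) # non-blocking pop raises ThreadError when empty
  rescue ThreadError
    raise PoolExhausted, "all connections are in use"
  end

  def checkin(conn)
    @available << conn
  end

  # Checks a connection out for the block and always returns it afterwards.
  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end
end

pool = ToyPool.new(2)
a = pool.checkout
b = pool.checkout

begin
  pool.checkout
rescue PoolExhausted => e
  puts e.message # all connections are in use
end

pool.checkin(a)
pool.with_connection { |conn| puts conn } # conn-0
```

Sizing the pool to at least the number of threads that touch the database avoids this failure mode.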


Real-world example: an order system

Schema:

create_table :orders do |t|
  t.references :account, null: false, foreign_key: true
  t.integer :status, default: 0
  t.integer :total_cents, default: 0
  t.timestamps
end

create_table :line_items do |t|
  t.references :order, null: false, foreign_key: true
  t.string :sku
  t.integer :qty, default: 1
  t.integer :price_cents, default: 0
end

Model:

class Order < ApplicationRecord
  belongs_to :account
  has_many :line_items
  enum :status, { pending: 0, paid: 1, cancelled: 2 }

  def recalc_total!
    update!(total_cents: line_items.sum("qty * price_cents"))
  end
end

Usage:

order = account.orders.create!(status: :pending)
order.line_items.create!(sku: "AAA", qty: 2, price_cents: 500)
order.recalc_total!
order.update!(status: :paid)

Production checklist

  • Database constraints match model validations
  • Bullet gem to detect N+1s
  • Query plans reviewed with EXPLAIN
  • Connection pool tuned
  • Background jobs wrapped in transactions
  • Clear rules for delete vs destroy, update_all vs update

Final thoughts

ActiveRecord is a beautifully layered ORM. Treat it as a transparent abstraction, not a black box:

  • Inspect your SQL
  • Enforce integrity at the DB
  • Compose clean, lazy relations
  • Measure before optimizing

Master these and you’ll think in relations, not just records — the mark of a real Rails engineer.