Firestore Data Model Explained: Collections, Documents, and Schema Design
The Firestore data model is built on three things: collections, documents, and subcollections. Understanding how they connect is the foundation. The more important lesson is that Firestore rewards developers who design their data structure around their query patterns from the start. Get this right and reads are fast and cheap. Get it wrong and you are restructuring data in production.
This page covers how Firestore organises data, how to model real application structures like users and orders, why denormalisation is expected, and the decisions that affect what you can query and what it costs. If you are new to Firestore, the Firestore Overview covers when the service fits and how per-operation billing works.
The simple explanation
Think of Firestore as a filing cabinet. Each drawer is a collection. Each folder inside a drawer is a document. Each folder contains data fields: a name, an email address, a status, a timestamp. Some folders have a mini-cabinet attached called a subcollection, which holds more folders of its own.
There is no template every folder must follow. Two folders in the same drawer can hold completely different things. That flexibility is real but it is also a responsibility: the database will not stop you from storing data inconsistently.
The complexity in Firestore does not come from the structure itself. It comes from understanding how structure decisions control what you can query later, and what those queries cost per read.
How Firestore organises data
Collections
A collection is a container for documents. It has no properties of its own. Collections come into existence when you create the first document inside them, and disappear when the last document is removed. You do not create or delete collections directly.
A typical application has a small number of top-level collections: users, products, orders. Each one holds all documents of that type.
Documents
A document is the unit of data in Firestore. It is a set of named values similar to a JSON object, living inside a collection. Every document has a unique ID within its collection. IDs can be auto-generated by Firestore or set explicitly by your application.
Two documents in the same collection can have completely different fields. There is no enforced schema. This is useful for evolving data models, but it means you are responsible for field consistency at the application layer.
Fields
Each field in a document has a name and a typed value. Firestore supports more types than plain JSON: String, Integer, Float, Boolean, Timestamp, Array, Map, Reference, GeoPoint, Bytes, and Null.
Always use the Timestamp type for dates and times, never a string. Timestamp fields support ordering and range filters correctly. A string like “2026-03-08” cannot be reliably range-queried unless every client that ever writes to that field uses exactly the same format, which is hard to guarantee once multiple clients are writing.
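The string problem can be shown without a database at all. The sketch below is plain Python, not Firestore code: the four sample values are hypothetical, and it assumes string fields are compared character by character, which is how ASCII strings like these are ordered.

```python
from datetime import datetime, timezone

# Hypothetical values for the same field, written by clients that disagree
# on the date format. All four dates fall in March 2026.
date_strings = ["2026-03-08", "08/03/2026", "2026-3-18", "2026-03-10"]


def in_march_by_string(value: str) -> bool:
    # Lexicographic comparison: characters are compared left to right.
    return "2026-03-01" <= value < "2026-04-01"


string_matches = [s for s in date_strings if in_march_by_string(s)]
print(string_matches)  # ['2026-03-08', '2026-03-10']

# The same four dates stored as real date-time values.
timestamps = [
    datetime(2026, 3, 8, tzinfo=timezone.utc),
    datetime(2026, 3, 8, tzinfo=timezone.utc),
    datetime(2026, 3, 18, tzinfo=timezone.utc),
    datetime(2026, 3, 10, tzinfo=timezone.utc),
]
start = datetime(2026, 3, 1, tzinfo=timezone.utc)
end = datetime(2026, 4, 1, tzinfo=timezone.utc)

timestamp_matches = [t for t in timestamps if start <= t < end]
print(len(timestamp_matches))  # 4
```

The string range filter silently misses two of the four March dates, while the typed comparison finds all of them.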
Document paths
Every document has a path that uniquely identifies it in the database. Paths alternate between collection names and document IDs:
- /users/alice points to document alice in the users collection
- /products/widget-42 points to document widget-42 in the products collection
- /users/alice/orders/order-001 points to a document inside the orders subcollection of user alice
Document paths always have an even number of segments. An odd number of segments points to a collection; an even number points to a document. So /users is a collection, /users/alice is a document, and /users/alice/orders is a subcollection.
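The segment-counting rule is easy to express in code. The helper below is a hypothetical utility written for illustration, not part of any Firestore SDK; it only counts segments and does not check that the path exists.

```python
def classify_path(path: str) -> str:
    """Return 'collection' or 'document' based on the number of path segments."""
    segments = [s for s in path.strip("/").split("/") if s]
    if not segments:
        raise ValueError("path has no segments")
    # Odd segment count -> collection, even segment count -> document.
    return "document" if len(segments) % 2 == 0 else "collection"


print(classify_path("/users"))               # collection
print(classify_path("/users/alice"))         # document
print(classify_path("/users/alice/orders"))  # collection
```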
Subcollections
A document can contain a subcollection, a collection nested inside the document. Subcollections model one-to-many relationships where child documents belong to a specific parent.
A users collection where each user has an orders subcollection looks like this:
users/
  alice/
    name: "Alice Smith"
    email: "alice@example.com"
    orders/ ← subcollection
      order-001/
        total: 42.50
        status: "shipped"
        createdAt: 2026-03-01
      order-002/
        total: 15.00
        status: "pending"
        createdAt: 2026-03-18
  bob/
    name: "Bob Jones"
    email: "bob@example.com"
    orders/
      order-003/
        total: 99.00
        status: "delivered"
        createdAt: 2026-03-10
Each order is its own document. You can read, query, and paginate orders independently without touching the parent user document. The user document stays small regardless of how many orders accumulate.
Subcollections support up to 100 levels of nesting. In practice one or two levels covers almost every real application. Deeper nesting makes queries harder to write without adding any performance benefit.
How it works in practice
Here is a realistic e-commerce structure using three top-level collections and two subcollections, each one level deep:
users/
  user-abc123/
    displayName: "Alice Smith"
    email: "alice@example.com"
    createdAt: Timestamp
    orders/
      order-001/
        items: [{ productId: "widget-42", qty: 2, priceAtPurchase: 21.25 }]
        total: 42.50
        status: "shipped"
        createdAt: Timestamp
products/
  widget-42/
    name: "Widget Pro"
    price: 21.25
    category: "tools"
    stock: 148
    updatedAt: Timestamp
support_tickets/
  ticket-789/
    userId: "user-abc123"
    subject: "Order not arrived"
    status: "open"
    createdAt: Timestamp
    messages/
      msg-001/
        body: "My order has not arrived."
        authorId: "user-abc123"
        sentAt: Timestamp
Notice that each order item stores priceAtPurchase directly rather than relying on the product’s current price. Product prices change; order records should reflect what the customer actually paid at the time. This is a deliberate denormalisation decision.
The support_tickets collection uses a messages subcollection for the conversation thread. Each message is individually addressable, and the ticket document stays compact as the conversation grows.
To control who can read or write these documents from a client application, see Firestore Security Rules.
Designing your data model around queries
In a relational database you normalise data first and write queries afterwards. In Firestore, the process is reversed: you start with the queries your application needs, then design documents and indexes to serve them.
This is not optional. Firestore can only run queries against indexed fields. There are no joins. If a query needs data from two unrelated collections at once, Firestore cannot combine them in a single operation. You must restructure your data, denormalise, or accept multiple reads. Understanding how Firestore queries and indexes work before finalising your schema will save you from expensive rewrites later.
Denormalisation is expected
Denormalisation means storing the same data in more than one place to make reads faster. In a relational database this is usually something to avoid. In Firestore it is standard practice.
Think of a paper receipt. When you buy something, the receipt records the price at that moment. It does not store a live reference to the product’s current price. If the price changes next week, your receipt still shows what you paid. Firestore data modelling works the same way: you write values into documents when they are needed, rather than looking them up from a shared source on every read.
If every order screen needs to display the buyer’s name, you have two options: read the user document on every order display (an extra billable read), or store the display name directly on the order document at write time. Most Firestore applications choose the second option. The tradeoff is that if the user changes their display name later, existing order documents will show the old value. Whether that matters is a product decision, not a database limitation.
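A minimal sketch of the second option, using plain dictionaries in place of real Firestore documents. The function name and the buyerDisplayName field are assumptions made for this example; the other field names follow the e-commerce structure shown earlier on this page.

```python
from datetime import datetime, timezone


def build_order_document(user: dict, cart: list, products: dict) -> dict:
    """Copy the values an order screen needs into the order at write time."""
    items = [
        {
            "productId": line["productId"],
            "qty": line["qty"],
            # Snapshot of the price now, not a live lookup at read time.
            "priceAtPurchase": products[line["productId"]]["price"],
        }
        for line in cart
    ]
    return {
        "buyerDisplayName": user["displayName"],  # hypothetical field name
        "items": items,
        "total": sum(i["qty"] * i["priceAtPurchase"] for i in items),
        "status": "pending",
        "createdAt": datetime.now(timezone.utc),
    }


user = {"displayName": "Alice Smith", "email": "alice@example.com"}
products = {"widget-42": {"name": "Widget Pro", "price": 21.25}}
order = build_order_document(user, [{"productId": "widget-42", "qty": 2}], products)

# Later changes to the source documents do not alter the stored order.
products["widget-42"]["price"] = 25.00
user["displayName"] = "Alice Jones"
print(order["buyerDisplayName"], order["total"])  # Alice Smith 42.5
```

The last lines show the tradeoff in both directions: the price snapshot is what you want, while the stale display name is acceptable only if the product decision says so.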
Structure follows your most frequent read
If your home screen always shows a user’s profile with their three most recent orders, you can either fetch the user document and query the orders subcollection separately, or store those three recent orders as an embedded field in the user document and read everything in one operation.
Embedding works well for high-traffic screens where reducing read count matters for latency and cost. The subcollection works better for full order history, filtering, and pagination. Many applications use both: an embedded recentOrders field for the home screen, and a full subcollection for the order history page.
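Keeping an embedded recentOrders field bounded is the part that is easy to get wrong. The sketch below uses a plain dictionary as a stand-in for the user document; the helper name and the summary shape are assumptions, and in a real application the update to the user document would be written alongside the new order document.

```python
RECENT_LIMIT = 3


def with_recent_order(user_doc: dict, order_summary: dict, limit: int = RECENT_LIMIT) -> dict:
    """Return a copy of user_doc with order_summary at the front of recentOrders."""
    recent = [order_summary] + list(user_doc.get("recentOrders", []))
    # Trim so the embedded list never grows beyond the limit.
    return {**user_doc, "recentOrders": recent[:limit]}


user_doc = {"displayName": "Alice Smith"}
for order_id in ["order-001", "order-002", "order-003", "order-004"]:
    user_doc = with_recent_order(user_doc, {"orderId": order_id})

print([o["orderId"] for o in user_doc["recentOrders"]])
# ['order-004', 'order-003', 'order-002']
```

Because the list is capped, the user document stays small while the full history continues to live in the subcollection.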
Firestore sustains roughly one write per second to a single document. This is a soft limit rather than a hard cap, and short bursts above it are tolerated, but a global counter, a live inventory field, or any value updated continuously by many concurrent users will run into write contention, higher latency, and failed writes under load. Use sharded counter patterns or move high-frequency aggregations to a separate service.
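The idea behind a sharded counter is to spread increments across several documents and sum them on read. The class below is an in-memory illustration only: each list slot stands in for one shard document, and the class and method names are invented for this sketch rather than taken from an SDK.

```python
import random


class ShardedCounter:
    """In-memory stand-in: each shard represents one Firestore document."""

    def __init__(self, num_shards: int = 10) -> None:
        self.shards = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Pick a random shard so concurrent writers rarely hit the same document.
        index = random.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self) -> int:
        # Reading the counter means reading every shard and summing the values.
        return sum(self.shards)


counter = ShardedCounter(num_shards=10)
for _ in range(1000):
    counter.increment()

print(counter.total())  # 1000
```

The design trades write throughput for read cost: with ten shards, the per-document write pressure drops to about a tenth, but reading the total takes ten document reads instead of one.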
When this data model fits
Firestore’s document model works well when:
- Records vary in structure: user profiles, product catalogues, or content pages where some items have fields others do not
- You need real-time push updates to clients as data changes, without polling
- Your app works offline: the mobile SDK caches data locally and syncs when connectivity returns
- You are building a chat app, social feed, or notification system where each user’s data is accessed independently
- Your schema is still evolving: adding a new field to documents requires no migration
It is a poor fit when:
- You need complex joins across many collections: use Cloud SQL for relational workloads
- You need analytical queries over millions of documents: BigQuery is built for that
- You have high-throughput time-series writes at scale: Bigtable handles those patterns more efficiently
For a full comparison of GCP storage options, see Choosing the Right Storage Service.
Common mistakes
Storing dates as strings instead of Timestamps. A string field like “2026-03-08” cannot be reliably sorted or range-filtered unless every client that ever writes to that field uses exactly the same format. Timestamps are a first-class type with correct ordering and range filter support built in. Use Timestamp for every field representing a point in time.
Growing arrays inside documents instead of using subcollections. Appending to an array rewrites the entire document and counts as one billable write. The array also grows inside the document toward the 1 MB size limit. For lists that accumulate over time, including orders, comments, messages, and log entries, use a subcollection. Each item becomes its own document, individually addressable and pageable.
Using a single document as a shared counter. Firestore sustains approximately one write per second per document. Any field updated more often than that by many concurrent clients will cause write contention and failures under load. Use sharded counters or aggregate through a separate service.
Designing the schema before thinking about queries. The most common Firestore mistake is modelling data relationally and then discovering that the queries you need are not supported. Define your key read patterns first: what does the main screen load, what does a list page display, what filters does search require. Those answers define the document structure and indexes needed to serve them.
Ignoring document size as data accumulates. A document that stores a growing event history or message thread as a single array will eventually hit the 1 MB limit, often in production under real load. If a field is written to repeatedly, model it as a subcollection from the start.
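A back-of-the-envelope estimate shows how soon an embedded array can approach the limit. The sketch below uses JSON-encoded length as a rough proxy, which is an assumption: Firestore computes stored size with its own rules, so treat the result as an order of magnitude, not an exact figure. The sample message mirrors the support ticket example above.

```python
import json
from datetime import datetime, timezone

MAX_DOCUMENT_BYTES = 1_048_576  # 1 MiB


def approx_size_bytes(doc: dict) -> int:
    """Rough size proxy: UTF-8 length of the JSON encoding (not Firestore's own formula)."""
    return len(json.dumps(doc, default=str).encode("utf-8"))


message = {
    "body": "My order has not arrived.",
    "authorId": "user-abc123",
    "sentAt": datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc),
}

per_message = approx_size_bytes(message)
approx_capacity = MAX_DOCUMENT_BYTES // per_message
print(per_message, approx_capacity)
```

Short messages like this one leave room for several thousand entries, which is why the limit tends to surface late; longer bodies or extra fields per item bring it forward quickly.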
Firestore versus relational databases
Developers coming from SQL databases often apply the same modelling instincts to Firestore. Those instincts are useful background, but the practical rules are different enough to cause real problems if you treat them as equivalent.
No joins, no foreign key constraints
In a relational database you store users in one table, orders in another, and join them at query time. In Firestore there is no equivalent of a JOIN. If you want to display a user’s name alongside an order, you either store the name in the order document at write time, or make a second read to the user document at display time. Each dereference is a separate billable read and a separate network call.
The reference field type lets you store a path to another document, but Firestore does not automatically fetch the referenced document. You fetch it explicitly when you need it.
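The cost difference between the two approaches can be made concrete by counting reads. The store below is an in-memory stand-in that increments a counter on every fetch; the class, the userRef and buyerDisplayName field names, and both render functions are hypothetical and exist only for this illustration.

```python
class CountingStore:
    """In-memory stand-in for a document store that counts reads."""

    def __init__(self, documents: dict) -> None:
        self._documents = documents
        self.reads = 0

    def get(self, path: str) -> dict:
        self.reads += 1
        return self._documents[path]


documents = {
    "users/alice": {"displayName": "Alice Smith"},
    "users/alice/orders/order-001": {
        "total": 42.50,
        "userRef": "users/alice",           # path to another document
        "buyerDisplayName": "Alice Smith",  # denormalised copy
    },
}


def render_with_reference(store: CountingStore, order_path: str) -> str:
    order = store.get(order_path)
    user = store.get(order["userRef"])  # explicit second fetch
    return f"{user['displayName']}: {order['total']:.2f}"


def render_denormalised(store: CountingStore, order_path: str) -> str:
    order = store.get(order_path)
    return f"{order['buyerDisplayName']}: {order['total']:.2f}"


by_reference = CountingStore(documents)
print(render_with_reference(by_reference, "users/alice/orders/order-001"), by_reference.reads)
# Alice Smith: 42.50 2

denormalised = CountingStore(documents)
print(render_denormalised(denormalised, "users/alice/orders/order-001"), denormalised.reads)
# Alice Smith: 42.50 1
```

Both functions render the same line, but following the reference doubles the read count for every order displayed.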
Subcollections versus arrays
When you need to store a list of related items, you have two options:
Array in the parent document: all items are read in one operation together with the parent. Simple to access, but items cannot be queried, sorted, or paginated independently of the parent. Grows inside the document toward the 1 MB limit. Best for small, stable lists always read with the parent.
Subcollection: each item is stored as its own document, queryable, sortable, and pageable on its own. The parent stays small. Requires knowing the parent’s path. Best for lists that grow or need individual addressing.
Default to subcollections for any list that grows. Use arrays only for small, bounded lists you always read together with the parent document, such as a product’s two or three category tags.
Flexible schema versus enforced schema
Relational databases enforce a schema at the table level. Every row must match the column definitions. Firestore has no equivalent constraint. Documents in the same collection can have completely different fields. This is genuinely useful for evolving data models, but inconsistencies accumulate without application-level validation or well-written security rules that require specific fields on writes.
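Application-level validation can be as small as a table of required fields checked before every write. The sketch below assumes a user document shaped like the earlier example; the REQUIRED_USER_FIELDS table and validate_user function are illustrative names, and Python's datetime stands in for the Timestamp type.

```python
from datetime import datetime, timezone

# Field name -> expected Python type for a user document (assumed shape).
REQUIRED_USER_FIELDS = {"displayName": str, "email": str, "createdAt": datetime}


def validate_user(doc: dict) -> list:
    """Return a list of problems; an empty list means the document is acceptable."""
    errors = []
    for name, expected in REQUIRED_USER_FIELDS.items():
        if name not in doc:
            errors.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            errors.append(
                f"{name} should be {expected.__name__}, got {type(doc[name]).__name__}"
            )
    return errors


good = {
    "displayName": "Alice Smith",
    "email": "alice@example.com",
    "createdAt": datetime(2026, 3, 1, tzinfo=timezone.utc),
}
bad = {"displayName": "Bob Jones", "createdAt": "2026-03-10"}

print(validate_user(good))  # []
print(validate_user(bad))
# ['missing field: email', 'createdAt should be datetime, got str']
```

A check like this catches the string-instead-of-Timestamp mistake before it reaches the database, which is where it becomes expensive to fix.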
Summary
- Data is organised as: collection → document → (optional) subcollection → document
- Paths alternate collection and document segments; an even number of segments points to a document
- Always use Timestamp for date and time fields, not strings, to support ordering and range queries
- Subcollections are better than arrays for growing lists; arrays count toward the 1 MB document size limit
- Firestore has no joins; design your data model around query patterns, not normalisation
- Denormalisation is normal in Firestore: storing data in more than one place to avoid extra reads is standard
- Plan for a sustained rate of about one write per second per document; use sharded counters for high-write aggregation fields
- There is no enforced schema; field consistency is your application’s responsibility
Frequently asked questions
Is Firestore schema-less?
Firestore has no enforced schema at the database level. You can store any fields in any document without declaring them first. But flexible schema does not mean consequence-free. Your query capability depends on what fields exist and whether they are indexed. If some documents store dates as strings and others as Timestamps, range queries will behave inconsistently across those documents. Treat field consistency as an application responsibility, not something the database enforces for you.
When should I use subcollections instead of arrays?
Use subcollections for any list that will grow over time or that needs to be queried, paginated, or addressed individually. Arrays work well for small, fixed-size lists you always read together with the parent document, such as a list of three category tags on a product. The key limit: every document has a 1 MB maximum size, and arrays grow inside the document. A user with thousands of activity records stored as an array will eventually hit that limit. In a subcollection, each record is its own document and the parent stays small.
Can documents in the same collection have different fields?
Yes. Firestore does not enforce a uniform structure across documents in a collection. One user document can have a phoneNumber field while another does not. This is useful for evolving schemas without migrations, but it means your application code must handle missing fields gracefully. Never assume a field is present just because it exists in most documents.
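Handling an optional field gracefully usually means reading it with a default instead of indexing into the document directly. A minimal sketch with plain dictionaries standing in for fetched documents; the contact_line helper is hypothetical.

```python
def contact_line(user_doc: dict) -> str:
    """Build a display string without assuming phoneNumber is present."""
    name = user_doc.get("displayName", "Unknown user")
    phone = user_doc.get("phoneNumber")  # None when the field is absent
    if phone is None:
        return f"{name} (no phone on file)"
    return f"{name}, {phone}"


print(contact_line({"displayName": "Alice Smith", "phoneNumber": "+44 20 7946 0000"}))
# Alice Smith, +44 20 7946 0000
print(contact_line({"displayName": "Bob Jones"}))
# Bob Jones (no phone on file)
```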
How does the Firestore data model affect what you can query?
Directly and significantly. Firestore only runs queries against indexed fields. Single-field indexes are created automatically, but multi-field queries require composite indexes configured in advance. You cannot query across two unrelated top-level collections simultaneously because there are no joins. If you need to show a user their recent orders alongside their profile in a single read, you must either store some order data inside the user document or accept two separate reads. Your data structure determines your query options.
What is the maximum size of a Firestore document?
A single Firestore document can be at most 1 MB. This includes all field names and their values. For large binary data like images or files, store the content in Cloud Storage and keep only a reference URL in the Firestore document. The limit is rarely hit for typical user or product data, but a document approaches it quickly if it stores arrays that accumulate items over time. Chat messages, event logs, and activity histories are common culprits. Use subcollections for any list expected to grow.