What is a Data Dictionary and How to Build Yours


Is your company struggling with data management? You're not alone. Many companies need help with data quality and alignment across departments. When how your data is structured does not match how it interacts with your other systems, it's difficult to use that data for strategy, reporting, and daily activities. These inefficiencies have significant cost and time consequences. That's where a Data Dictionary comes in.

Some of what you include in your data dictionary might seem like common sense, but it is only "common" once it is documented somewhere - like your data dictionary!

We regularly build data dictionaries for clients, and have implemented several improvements to our own document over time. Let's explore what a data dictionary is, how to begin to build one, and how frequently they should be updated.

What is a Data Dictionary?

A Data Dictionary is a comprehensive document that ensures all departments are aligned and working towards the same goals. "The data dictionary is a foundational RevOps component that the team at Remotish emphasizes in client engagements. Without a shared understanding of how a company uses a CRM, it's easy for the quality of your data to decline. And of course, having clean data is crucial for ensuring efficient and revenue-generating business operations," explains Camille Balhorn, HubSpot Strategist at Remotish.

In other words, it provides crucial information about essential data points, their purpose, usage, and relationships with other data. Data dictionaries also provide a common reference point for those who build the systems and manage the applications that support your data. 

Your data dictionary is a foundational revenue operations (RevOps) component crucial for ensuring efficient and revenue-generating business operations.

What goes into a Data Dictionary?

So, what goes into a Data Dictionary? There are many appropriate elements for a data dictionary that include, but aren't limited to:

Data Types

Data Types can be defined as objects in HubSpot or how objects are defined in any integrations you use.

HubSpot objects represent different types of relationships and processes your business has. All HubSpot accounts have four standard objects: Contacts, Companies, Deals, and Tickets. Depending on which HubSpot subscription your company pays for, you may be able to utilize custom objects, Calls, Payments, and more.

"Documenting how the objects work with other integrations and how they interact with your overall system is crucial to constructing your company's data dictionary," Balhorn notes.

Property Definitions 

There are several best practices to remember when it comes to documenting property definitions:

  • Use the "Description" area in the Property settings. Detailing how the property is used here means anyone can easily view its purpose in the portal. Additionally, this field comes through in the .csv file when you download all of your properties for an audit. 
  • Organize your property documentation. Whether in a spreadsheet or document format, by HubSpot object or department, document how your properties are used and what they mean for your team and company. 
  • Define how your team and company report using these properties, especially if your methods are a little uncommon. When initially documenting this usage, you may find that some teams perform their analysis slightly differently with the same property. Use this opportunity to align and report in the same way. 
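One way to keep property documentation organized is to give every entry the same set of fields. The sketch below is a minimal Python illustration, not a HubSpot feature. The field names (`object_type`, `used_by`, `reporting_notes`) and the sample entry's values are assumptions chosen for this example, so adapt them to whatever your team actually tracks.

```python
import csv
import io
from dataclasses import dataclass, asdict, field


@dataclass
class PropertyEntry:
    """One row of a data dictionary (field names are illustrative assumptions)."""
    object_type: str       # e.g. Contacts, Companies, Deals, Tickets
    property_name: str
    description: str       # the same text you would put in the property's Description area
    used_by: list = field(default_factory=list)  # departments that rely on the property
    reporting_notes: str = ""


def to_csv(entries):
    """Render entries as CSV text so they can be pasted into a spreadsheet."""
    columns = ["object_type", "property_name", "description", "used_by", "reporting_notes"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        row["used_by"] = "; ".join(entry.used_by)  # flatten the list into one cell
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical entry: the object type, departments, and notes are made up for illustration.
renewal_date = PropertyEntry(
    object_type="Deals",
    property_name="Renewal Date",
    description="Date the client contract renews.",
    used_by=["Sales", "Customer Success"],
    reporting_notes="Reported by renewal month.",
)

print(to_csv([renewal_date]))
```

Keeping the departments in their own column makes it easy to filter for properties shared by several teams, which are usually the ones that most need an agreed definition.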

While documenting the use of properties in your HubSpot portal may at first feel unnecessary, remember that properties store the very data that fuels your company. They are vital to your business operations and are integral to every team that utilizes the portal. As you document your properties, make time for the inevitable process discussions. 

One example of a custom property you should document is a Renewal Date property, which tracks renewal dates so you can follow up at a critical time in the client relationship.

"How are properties used? And how many different departments use the same properties? These are just a few questions to consider when documenting properties within HubSpot or your CRM instance. Key properties to include would be those used by multiple departments, especially if they use that property to report," said Balhorn.


Data Sources

For Data Sources broadly, when discussing Data Dictionary efforts, a company should document its lead sources and what each source means. Lead sources include organic social, email marketing, paid search, and offline sources. For example, email marketing means the lead came from an email sent by your marketing team in HubSpot.

You should also document and define how data can be brought into each system.

For example, many different sources fall under Offline Sources, such as a data import, a push from an integration, or an API. Companies should define each additional way that data can flow in and out of each system. Companies should also account for their Qualitative Lead Sources.

"Qualitative lead source is how a lead tells you how they came to your company. Usually, this is submitted via a form on your website," said Adam Stahl, Sr. HubSpot Strategist at Remotish. "This is incredible information that is sometimes overlooked when documenting lead sources because of the unstructured nature of the data. There are ways, though, to harness, leverage, automate, and structure that data to fit better in a quantitative way but at first making sure it's accounted for and documented is a strong first step."

Relationships with Other Data

How does your data interact?

An Entity Relationship Diagram, or ERD, is an illustration that depicts relationships among people, objects, places, concepts, or events within a system or database. Reporting and illustrating this relationship in HubSpot is easy and efficient with the HubSpot Data Model tool. Check out the screenshot below!

[Screenshot: HubSpot data model overview]

Integration mapping reconciles the differences between two systems so that when data moves from one system to the other, it stays accurate and usable. If you leverage integrations, we recommend documenting the data flow!
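An integration mapping can be written down as a simple lookup from one system's field names to the other's. The Python sketch below shows the idea under stated assumptions: the field names in `FIELD_MAP` and the "billing system" are hypothetical and do not come from any real integration.

```python
# Hypothetical mapping from CRM field names to a billing system's field names.
FIELD_MAP = {
    "email": "customer_email",
    "company": "account_name",
    "renewal_date": "contract_renewal",
}


def map_record(record, field_map):
    """Translate a source record into the target system's field names.

    Returns the mapped record plus a list of source fields that have
    no documented mapping, so gaps in the documentation are visible.
    """
    mapped = {}
    unmapped = []
    for source_field, value in record.items():
        target_field = field_map.get(source_field)
        if target_field is None:
            unmapped.append(source_field)
        else:
            mapped[target_field] = value
    return mapped, unmapped


crm_record = {"email": "pat@example.com", "company": "Example Co", "fax": "555-0100"}
mapped, unmapped = map_record(crm_record, FIELD_MAP)
print(mapped)
print("Fields with no documented mapping:", unmapped)
```

Reporting the unmapped fields instead of silently dropping them is a deliberate choice here: those fields are exactly the ones the data dictionary still needs to account for.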

Data Validation Rules 

It helps to take an automated approach to keeping your data clean, and data validation rules do just that. 

For example, a data validation rule can restrict a property to single-line text or to a numerical value. "As much as you try to communicate to folks how to interact with data, those rules might not always be completely followed. It can be for internal and external users to your HubSpot instance; external users fill out forms within HubSpot that become part of your CRM data," said Balhorn.

Validation rules can apply to both internal and external users. Examples include a minimum or maximum character limit, a minimum or maximum value, and a limit on the number of decimal places. Validation rules are essential to the integrity of your data and the overall structure of the CRM.
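To make these rule types concrete, here is a minimal Python sketch of the logic behind a character limit, a value limit, and a decimal-place limit. It only illustrates the checks themselves. It is not how HubSpot implements validation, and the function names and error messages are assumptions made for this example.

```python
from decimal import Decimal, InvalidOperation


def check_text(value, min_chars=None, max_chars=None):
    """Return a list of problems with a text value (empty list means valid)."""
    errors = []
    if min_chars is not None and len(value) < min_chars:
        errors.append(f"shorter than {min_chars} characters")
    if max_chars is not None and len(value) > max_chars:
        errors.append(f"longer than {max_chars} characters")
    return errors


def check_number(value, min_value=None, max_value=None, max_decimals=None):
    """Return a list of problems with a numeric value (empty list means valid)."""
    try:
        number = Decimal(str(value))
    except InvalidOperation:
        return ["not a number"]
    if not number.is_finite():
        return ["not a number"]

    errors = []
    if min_value is not None and number < Decimal(str(min_value)):
        errors.append(f"below minimum value {min_value}")
    if max_value is not None and number > Decimal(str(max_value)):
        errors.append(f"above maximum value {max_value}")
    if max_decimals is not None:
        # A negative exponent is the number of digits after the decimal point.
        decimals = max(0, -number.as_tuple().exponent)
        if decimals > max_decimals:
            errors.append(f"more than {max_decimals} decimal places")
    return errors


print(check_text("ab", min_chars=3))
print(check_number("12.345", min_value=0, max_value=100, max_decimals=2))
```

Returning a list of problems rather than a single pass/fail flag makes it easier to tell a form submitter or a teammate exactly which rule was broken.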

Learn more about setting data validation rules for a property in HubSpot.

Data Update Frequency

It isn't enough to create a data dictionary; it is also important to keep it updated.

Our best practice recommendation is to make intentional additions to the data dictionary on a monthly basis and to review your data dictionary in full for cross-department alignment on a quarterly basis.

For example, if you're heading up a new process and have created new custom properties for it, make sure they find their way into your data dictionary! At the quarterly review, you may identify legacy processes whose properties can be removed.
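Part of this review can be supported by comparing a downloaded list of properties against the names already in your data dictionary. The Python sketch below assumes the export is a CSV with `Name` and `Description` columns. Those column names are assumptions, so check them against your actual file before relying on this.

```python
import csv
import io


def compare_with_dictionary(export_csv_text, documented_names,
                            name_column="Name", description_column="Description"):
    """Compare exported properties with the names in the data dictionary.

    The column names are assumptions about the export's layout.
    """
    rows = list(csv.DictReader(io.StringIO(export_csv_text)))
    exported = {row[name_column].strip() for row in rows}
    documented = set(documented_names)
    return {
        # In the portal but not yet in the data dictionary.
        "undocumented": sorted(exported - documented),
        # In the data dictionary but no longer in the portal.
        "no_longer_in_portal": sorted(documented - exported),
        # In the portal with an empty description.
        "missing_description": sorted(
            row[name_column].strip()
            for row in rows
            if not (row.get(description_column) or "").strip()
        ),
    }


# Made-up sample data standing in for a real export.
sample_export = (
    "Name,Description\n"
    "Renewal Date,Date the contract renews\n"
    "Legacy Score,\n"
)
report = compare_with_dictionary(sample_export, ["Renewal Date", "Old Region"])
for label, names in report.items():
    print(label, names)
```

The report only flags candidates. Deciding whether a property should be documented, described, or retired is still a conversation for the data dictionary's owner and the teams that use it.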

We recommend having an internal owner of the data dictionary. Similar to how user permissions in HubSpot should be limited so that only certain users can create new properties, only certain internal owners should be responsible for updating the data dictionary. Who those owners are will depend primarily on how departments are structured and the size of the company.

Check out Establishing a Clean-Up Process In HubSpot for tips on maintaining your growing database. 


Running a full portal audit every quarter is recommended to clean data, identify inefficiencies, and highlight areas for improvement. If you use your portal, you could benefit from an audit! 

CRMs can get messy over time and accumulate bad data, eventually impacting your bottom line. Bad data undermines the integrity of your database, creates inefficiencies for your team, and wastes time and resources. Audits help ensure that the maintenance processes put in place are effective and will highlight if adjustments are needed.

Audits are a complete analysis of your current instance in HubSpot, examining your account and marketing efforts to help you get the most out of the platform.  The areas listed below are suggestions on where to get started.  

  • Lists
  • Workflows
  • Sequences
  • Forms
  • Landing Pages
  • Properties
  • Manage Duplicates
  • Establish a Naming Convention
  • Create A Data Health Dashboard

Learn more about When and How To Audit Your HubSpot Portal

Check out this article to learn more about Managing Duplicates in HubSpot.

See a quick demo below on how to create and use a Data Health Dashboard

In Summary

Building and maintaining a data dictionary can be time-consuming. You'll need to define your use cases, assign ownership to individuals from each department, analyze existing data elements, and document those elements. Be sure to review and refine your approach, educate your team, adopt the changes, and make periodic improvements as well.

Remember, your data is your most valuable asset: take the time to invest in its management. Documentation is at the heart of everything we do at Remotish, which is why we are so passionate about data dictionaries. Not sure where to start? Book a chat to get a jumpstart on creating this essential document!

Ãndrea Peck
Client Services Manager

