Do you shudder when you hear "VLOOKUP'? Don't worry, you're not alone!
But it doesn't have to be that way. VLOOKUP formulas are actually a simple and effective way to map data between two unrelated objects that have a matching unique identifier living on both records.
In this video, Liam Redding goes step-by-step through everything you need to know to end with a clean spreadsheet of data that is ready for import into HubSpot.
After the data was uploaded, it enabled Liam to automate the associations between the deal records and the new custom objects using Assoic8.
Hey LinkedIn, happy Monday! I made a post earlier today talking about how Excel and Google Sheets are something that we should be leveraging regularly as HubSpot administrators. Something that I leverage regularly in my advanced data modeling projects.
So in this example, I'm going to be going over one of the tips that I use very frequently to solve a complicated issue as we were looking to migrate custom objects out of Salesforce and into HubSpot. So historically, there was never any sync setup between HubSpot and Salesforce for this custom object. But there was a sync setup between HubSpot and Salesforce for the opportunity records. So what you ended up happening, what ended up happening was you had an instance of HubSpot that represented a piece of Salesforce. Now this client is looking to get out of Salesforce entirely. They want to migrate their entire operations out of Salesforce and into HubSpot. And in order to do so, we had to recreate that data model of this custom objects inside of HubSpot. I'm not going to get too into what that particular data model was or what those custom objects were, because that's private to that client.
In this example, we're going to go over a free trial, like a trial object. So for example, in Salesforce, there's a custom object set up that tracks trials in association to opportunities and the status of the trial, the type of trial, things like that there might be more data points than the few we have here. Just kind of put together something very simple that allows us to get a good idea for you know what this could look like in the real world. But in HubSpot in this example, there is no trial data so they have synced over the opportunity, all opportunity records. But no other data has been sent. So that's our job.
We've gone through and we've actually already created that custom object inside of HubSpot. And now what we need to do is associate the deal records inside of HubSpot to the deal records or to those new custom object records that we're gonna be importing into HubSpot. So let's say in this example, we've already imported all of that custom object data into HubSpot. And now we're looking to make the associations. And one of the things that you could potentially run into as you're working with a large volume workflow at scale, is you want to be able to go through like a step by step process to be able to be in control of the the upload and the associations, so that you're not just trying to do everything in bulk at once. And that's one of the things that you're going to run into.
So actually, instead of in this example, it was a very complicated model. Because you could have, for example, a one to one relationship between your custom object, or whatever data point you're trying to map in HubSpot, and a standard object that you have in HubSpot, for example, in this example, we're only going to have one trial for one opportunity or deal. But in the real world, that's not always gonna be the case, right? You could have somebody who has a one to many relationship. And that's the case that we had here where you could have one deal, or one opportunity that is associated to multiple trials. In that instance, things get a little bit more complicated. So we actually, instead of having an Associ8 automation run from the custom object, we tried that at first, some things broke, we figured out that we needed to pivot and be in more control of it.
So we actually needed to map that data from the custom object to that deal record, so that we could create a deal based workflow with the Associ8 to action, which allowed us to, to more easily update the records. And we were also trying to associate multiple custom objects, it wasn't just one. So it got very complicated very quick.
As you can see, the best way to map that data was just to map it over to the deal object itself. The issue is the custom object that we exported out of Salesforce had the opportunity ID on it, the 18-character opportunity ID. The deal that is in HubSpot has 18 character opportunity ID because it's synced through the Salesforce integration. But in order to get that custom object data to map over to the deal object and import it into HubSpot, we needed the record ID–right, we can't use the 18 character opportunity ID.
When we're mapping that data in HubSpot because HubSpot has to rely on its own independent record ID. Since those two things were not mapped together organically, we actually had to go through the process of doing that in a spreadsheet.
So what I'm gonna show you right now is a tip on how to use VLOOKUP formulas. I know everybody shudders when they hear that word. But VLOOKUP formulas are actually very simple. showing you guys how to use VLOOKUP formulas when you're trying to cross reference two completely separate objects. To map data from one to the other, that shares a unique identifier. In this example, and in many of those examples, when you're dealing with Salesforce, that's going to be your 18 character opportunity ID.
So in this example, we have imported or we created a spreadsheet. And in that spreadsheet, we have both our deal data, as we can see over here, as well as our custom object data, which lives on the right side. As we can see, we have the record ID here. And then we also have the 18 character ID, that's all we have on the deal. That's all we need for the deal. And then over here for the trial object, we have, again, the 18, character opportunity ID. And then we have a couple of different properties that we're going to be mapping over to that deal record, right, because that 18 character opportunity ID already lives on the deal. So you're going to use a formula. And I don't even know how to get there from the menu. Let's see the insert function. I just copied and pasted it, we'll see the look up there it is, I think, okay. So VLOOKUP seems really, really confusing when you first are looking at it. But it's actually very simple. And it's broken down into a few key parts, which we'll go over. And in this example, we can see on this side, what I actually had to go through the process of dealing with export all deals from HubSpot, right because we needed to cross reference every deal that could exist. Luckily, there was only about I think, 50,000 deal records or so and HubSpot, that could then become a lot more complicated though. Because if you were trying to do that from the contact record, because the contact record in that example, the index had over 4 million contacts. So when you're doing this process that can be one complicated thing that you run into is the volume of records that you're exporting out of HubSpot, but you need to export all of them. Or you need to have another way of doing it. That makes sense logically, for example, we couldn't export all the contact records because there were over 4 million that would break my computer.
So instead, we made the association first to the deal, right, there's less of those we knew we could handle 50,000 deals, we as we made those associations, and then we through automation passed over a value into a custom property, which then allowed us to export only a sub segment of those 4 million, which ended up being like 15,000 contacts.
But as you can see, here, we have exported all there's only 100 deals, not all 100 of those deals will have one of these custom objects associated to them. That's one of the things that makes this tricky, right? There's only 10 or so 11, I think, I guess nine of these custom object data points. So that should only match up with nine of these deals. So when you're doing the VLOOKUP formula, you're going to use a search key. So that's going to be the property that you're searching for. So in this example, we're going to go with E3 is going to be our search key. And then we are going to the other thing you're gonna want to do is add in that money sign. And what that does is that locks the value. So right now I locked E, so E would not change. And I could also lock three as well, then three would not change, but I need it to stay on E and just as it goes down. This net number needs to change. So from E three, and then if I drag and drop this down, we'll go to E four e five, E six, that becomes the search value, we're going to look through a range. So your range needs to be and I'm about to delete this F in a second. But your range is going to be E, it's going to be G and then you're going to do three as well. We're going to go all the way to the end, which would be J and that letter. And then you're going to do an index. So that's basically essentially saying, so if we start at G column G, what is the index where we want to pull the next value. So in this example, we're looking to pull as we can see, we have the 18 character trial ID, which we're trying to map over to the deal, which lives on column J. So if column G is column one, you go 1234 to column J, so column J is four columns away, and then is sorted and this example is going to be false in that sorted. So if it's not associated with anything, it's going to give you an N/A.
So what I'm going to do is… I'm actually going to drag and drop and this is apparently messed up, hold on. Yeah, so in this example, as well, we want to lock G3, you want to lock J11. Because we want that to remain the same all the way down, we don't want that to move as we're moving down. So it's still gonna say N/A, it's always gonna say N/A, if there's not a match, that's going to happen. And we'll scroll all the way down to the 100th deal. You'll see here we're starting to see some populate, so we have nine. So if we went, look at our, it's going to be 9H0, a be scrolled up here, nine, H zero, A, B. And then if we were to move over, we have to do this for all three of them, you would look at the same, we're actually gonna, we're going to be looking at right now, that's still the same. So that would still be E3. And then yet, because this is going to always be the same, and then the only thing that's actually changing here is two, so it's sort of column G would be one, and trial status, which we're trying to go into apologies there.
And then so for example, you can see here that seven, so if we just drag and drop this down, seven should be closed, so seven is closed. So you can continue to drag and drop or drag and drop down to eight, eight is closed as well drag and drop down to one, which is a different value that's open, we get open as well. So now map the active status as well. Nine all the way to this should be in progress. And then we're going to copy and paste for trial type as well, which should then become column three. And it's going to be an A, this should be free, seven, paid, yep. Eight is free. One is free. Can't see what that is. Three to two, it's free. And we will drag and drop the rest. Okay. So now we have all our values mashed up, this is how you would do this process. If you have a one to one relationship, what you can do is you can just map over and you can create a set of customer deal properties, and just map them over because you know that there's not going to be duplicates. We'll go over in one second what you need to do when there's duplicates.
But in this example, one other tip that I would give you is to always copy and paste or create different versions of these spreadsheets as you're going through them or different tabs in your spreadsheets that allow you to keep historical versions of them as you're going through this process. Because what could happen is that you figure out, you know, 17 steps into the process that you did something wrong, and instead of having to go back to square one, you can start at square five, where you left off with that error and pick back up from there. So that would definitely be one of the recommendations that I would make. Second recommendation would be to create non-formula, hard coded versions, like plain text versions, without any formatting of the actual data points.
So as we can see here, this is using a VLOOKUP formula. But if I were to copy and paste this without formatting which is Command Shift or Ctrl, Shift V. So we'll copy which is Ctrl C, that copied with formatting. If I were to cut + paste this without format, or if I were to paste this with formatting, it would paste with those formulas. But I'm going to use Ctrl Shift V, that's going to populate that as plain text. And then I don't have to worry about any formulas, right, that's a seven, that's a hard coded seven, I don't want that value to change. The issue with formulas is that if you were to try to change something and the formulas are still in there, because of the way the formula is written, it could break the way that that data is interpreted.
So next, one of the things that I would do is I'd say okay, now I want to add in a filter for this. And then we're going to look at A to Z. And then that will give us all the records that are associated and we can clear out everything else. And now we have a clean segment of data that we can import into HubSpot. So this would just be now I can import this into the deals.
So because I have the unique identifier that's related to that deal. And then I have all the values that lived on the custom object, now living on the deal when I import that, and I can run the associate automation from the deal object, which is much easier instead of custom objects, excuse me.
What to do with VLOOKUPS when there are duplicates
Now, what do you do, when there's duplicates, you actually kind of want to flip this process on its head. So what you're going to try to do is–none of this would matter, you would actually add a column, say what added on the end here to j, and I'll move this. So add a column one to j, right, we want to insert one, right? It's gonna be a VLOOKUP. And this is going to actually, that's not where we were gonna put… deal ID. We can change that to HubSpot ID, so we know which system that belongs to.
Okay, and then what we're going to do is we're going to do the same exact thing, I'm actually it's not probably not going to let me. So instead I have this merged column here, and Excel hates merged columns. So I'm actually just going to add the record ID a second time. Actually intersect and merge. Yeah. So what I'm gonna actually do is delete this column, because we already know what these data points these different sections stand for. Then I'll get rid of this. And then I'm gonna move this over.
Alright, so now what we have is the record ID next to the opportunity ID, and it is kind of important that they're next to each other. That's one of the things I noticed when I was going through this process is that like the order of operations for how columns stack up matter. So what we want to do is actually pull this HubSpot deal ID, so we're going to do the VLOOKUP. And this will allow us to have a one to many relationship, because we are now pulling in, say we have multiple matches of the G column to the E column, it's going to pull in those matches here. So we would get if there's, for example, five different deals that were associated to five different trials, it would pull in all five of them. So what do you VLOOKUP, our search key is going to be… let's see. That might mess this up as a matter. All right, search key is G, we're going through and do E2 through E100, I think. And then the index is going to be one, we actually want to reference that very first one, and false and this might break, we'll see. So we're actually going to swap these out. So that opportunity ID comes first. And now see, the order of operations is important. The thing that you're trying to match too, needs to be the first column in the lookup field that you're looking out. For the range, I think is what they call it. So here now we have pulling into that ID. So if we were to drag and drop, now we're going to get here, it's making a suggestion for me. It did not find that value. And that might be because we're not using… again, yeah, it's very important that you lock these. So we're always going to be looking at G and then we want to move that down and then these get locked 100%. But I'm gonna show you what happens is that so now it's D3 through 101. We wanted that to be D2 through D100.
And now it's moving on me and because it's moving, for example, here, it's not taking into consideration this one. So what happened when we got down here is that it's looking in the range D eight through D 106. And if I look at, if I look this up, it's gonna be before that range, so it's right here. And because we're only considering things after that, it's not able to find it. So now what we need to do is just re-drag and drop this with the one that has a locked column and that should be… interesting. 10, D2. Oh yeah, that's what's wrong. Okay. There's a couple things wrong here. And this is part of the process, right? These are, like I said, they're, they're not confusing, but you’ve got to make sure you have it all figured out. And we're pulling in the HubSpot ID so… still saying in N/A. If we look this value up, and F, two of them, so one is here, the other is down here that should pull in, I guess it'd be 100. That's a very low value. So that might be what's happening here is you need to expand the range beyond the last field. So that's an important thing.
Okay, so now we have this all sorted out here. And what I can show you is okay, so we have, for example, DL 28. So if I copy this value, and then I go to DL 28, which is right here, Al, you can see the AE L right here. So dl 28, if I were to update that to the same one as the old one. And then were to rerun it. Yeah, and then actually, I did that reverse. So if I were to change this, here, we're going to pull a 16 twice. So that's why it's important. If you have a one to many relationship, that's how you handle it, is by building it out this way. And then now you have the same thing, right, if we were to go over to this sheet, we have HubSpot deal ID, a record ID matched up with those custom object properties. We have that here. It's just done in the reverse order.
But then that that then allows us to do is let me copy and paste this real quick and show you some cool, it's my last final tip. And we're gonna log off. But this is pretty cool. And this is the process that I went through. Because I had to, to make sure that the data was clean when it went in. And these are a lot of the things that associate by itself from that custom object couldn't handle, because there was just too many intricacies, there's too many layers to the data, that it needed somebody to go through the process manually and actually map everything out in a very clean way in order to know what to do in order to know how to execute. So I'm actually going to paste this here, I need to again, I didn't even take my own advice hard paste it. So these are hard values. And then in this example, I want the deal ID always to be first, I'm gonna add a column to the right. I'm actually going to paste this so that we have two separate deal IDs, and I'm gonna add a third column that's just yes, column or one, or whatever you want to add here. Doesn't really matter. Okay, so then we're going to have, we'll call this equal check. So let's say like in this example, we only have like 10 records, right. But in the real world, you're dealing with 10s of 1000s of records, you don't want to have to go through and manually first of all, you're going to miss something, if you go through it that way. But you don't want to have to go through and try to manually find your duplicates, you want to create a process that allows you to do that automatically. So in this example, we're going to again, do a BNI lookup formula, and we are going to use the search key. It is going to be a two right where you use the deal ID, our search range, it's going to be really interesting. We're going to do B then B three through C with this 1011 Right we want to go one value beyond it. So what that's gonna allow us to do and the reason why we do B3 instead of a B2, which would be this one we start one layer down, is because if we were to try to match up a two to b two, in 100% of the instances, it's always going to come up true, because those values match each other. So you always want to look one row below. And then what that allows us to do is check everything else. So we'll say I did something wrong here, a two, okay, and then our index is going to be two and then false. And that's gonna populate a yes here should populate a yes, it's going to do the one below, it's going to get an N/A. So if I were to, let's say, say this is 35, as well, and it's not. And that's going to populate, this one's going to populate as Yes, as well. Let's say this was just due to two. Now we have our duplicate check. And now we can use this to sort and find where our duplicates are. Which is really, really cool. And what you would do then is you can sort this A through Z. It's been strange. Yeah, because that's the reason why. So you would again, want to copy paste hard paste that. And then what I would go through is, instead of filtering this, you'd actually want to go through because you know, that the one after is going to show up as an N/A, which, if you were to filter a through z, what's going to happen is all the yeses are going to group together, all the n A's are going to group together. So instead of going through and just just, you know, sorting or filtering them, you actually want to go through and identify that this is a match, this is a match, these are two analyzed back to back, those are not matches. And then we have a match here, we can get rid of this last one. And what you would want to do is you would want to handle those two segments separately, right? If you had a one to one relationship, you only need one set of properties to upload this data onto to then run that automation from, if you have multiple multiples, then you need to create multiple versions of those properties. For example, trial status, one, trial status two, trial status three, all the way up to however many iterations of that particular object you could have, or how many different associations a single in this example opportunity record could have with that custom object record.
And then you would need to create a clean version of that spreadsheet where you're actually going through a hand by hand copy and pasting those values over into the properties that would be your final clean import. At this point in this example, we've gone through three or four different spreadsheets at this point, to get it to a clean version, which we can then import into HubSpot. It sounds like a very complicated drawn out process. But it needs to be done in order to make sure that data goes in correct and it goes in correctly the very first time. The worst thing about data imports is when something goes wrong in the import, you didn't take everything into consideration when you're going through that cleansing process. And now you have to circle back and try to figure out what you did wrong, sometimes clearing out all the data that you imported. And like I said, starting from square one. So you want to have those safeguards built in there. And you want to be very, very smart about how you go through this process, especially when you're dealing with a huge volume of records. Those are just a couple of my tips that I have regarding how to clean data and how to use VLOOKUP formulas when you are mapping data into HubSpot. Hopefully everybody found this video very useful. And I will be back soon with more content. Thank you very much.
CommentsShare a thought or two on this post