From the course: Excel Business Intelligence: Power Pivot, DAX and Data Modeling

Add index and conditional columns with Power Query

All right. Can I show you two more types of column calculations in Power Query? The first is super simple. It's called an Index Column. So in the Add Column menu, you'll see it right here, Index Column. All it does is it creates a list of sequential values that you can use to identify each unique row in a table. And you can determine whether you want that to start with a zero or a one. These are often used to create unique IDs that can be used to form relationships between tables. So we've been hinting at that quite a bit. Trust me, we're going to talk a lot about it in the Data Modeling 101 section. And these should look pretty familiar. We had ID columns that were formatted just like this for each of the lookup tables that we've already loaded. So the Customer_Lookup, the Store_Lookup, the Product_Lookup, and an Index Column calculation or tool is another way to create those just from scratch. The second tool I want to show you is an interesting one. It's a Conditional Column. Now, you'll find that in the same Add Column menu, it's called Conditional Column. And these allow you to create new fields based on logical rules and conditions that you set. So if...then kinds of statements. So in this case, this is the demo that we're going to do hands-on, but we can create essentially different buckets or different categorizations based on the values of a field. So is quantity greater than five, then create a new column called order size that takes the value of large. Is quantity between 2 and 5, set order size equal to medium. Is quantity one, set size equal to small. And then you've got this catchall otherwise statement which usually takes something like other or false or NA. So both pretty simple. Let's go ahead and open up our data model and give these a shot. Okay. Back to the FoodMart_Data_Model. Now, we've got four queries with four tables. We're going to go ahead and add a new query to get a fifth. 
Again, from CSV, this time we're going to open up the file called FoodMart_Transactions_1997. Let's edit it, and open up the query editor window. Cool. So let's go into those Add Column tools. Here, you'll see the Conditional Column option and the Index Column option. So why don't we start with index? If you drop down, you've got the option to start from zero or one. Or if you're crazy, you can start with a custom number. Generally speaking, it's going to be zero or one. I like starting with one. And since that's the format of all the other tables we've been using, let's go ahead and set that the same way to be consistent. So created a new column called index. It formatted it as a number. Doesn't really matter, but let's just change it to a whole number. And instead of index, I'd like to name this transaction_id. And why don't we drag it as the first column in the table. Now, as usual, let's look at the table name FoodMart_Transactions_1997. That's okay. Let's just pull FoodMart out to shorten it up a bit, Transactions_1997. Now, you'll notice this one doesn't say lookup. And that's because this is not a lookup table. This is a data table which we'll talk about in the next section. But that is intentional that we're removing or not including that lookup label in this table name. And then our quick pass through the column headers. We just created transaction_id, we've got transaction_date, stock_date, and then a bunch of IDs that map this transaction data to products, to customers, and to stores with a quantity column here at the end. So let's go ahead and practice a conditional column now. And we're going to create a condition based on the quantity field. So let's select this column. And now in our Add Column tab, we can launch the Conditional Column dialog box. And we're going to create this new conditional column. And let's give it a name like order_size. And here's where we create the rules or the if statements of our logic. 
We can say, okay, if the quantity field is greater than any value we choose, say five, then the output of this order_size column should be large. And if that was the only thing we put in here, we would see a value of large in this new column for any transactions where quantity is greater than five, and we'd see a null for everything else. But we want to create more rules here, so we can say, add another rule. So it's going to check the top rule first. And then if that rule does not pass, if the criteria are not met, then Power Query is going to move on to the else if statement next. So the else if will be, okay, if quantity is not greater than five, then let's check if it's greater than or equal to two. In other words, if it's two, three, four, or five. And if that's the case, let's call transactions with a quantity of two through five, medium. And then we'll add one more rule. This is really the only other possible outcome here, since we're dealing with positive whole numbers, but couldn't hurt to add it. The only other option is if quantity is equal to one, then we'll call this a small order size. Now, these do make up a mutually exclusive and fully comprehensive set of outcomes, so there shouldn't be any value in the quantity field that does not match one of these three. Even so, it doesn't hurt just to put a value in this otherwise box, you know, in case you run into something silly like a mistaken value or an error or something like that. So it's going to flow through each of these statements and then return the proper value in this new order_size column. So press "OK." There you go. We've got order_size with small, medium, and large. And you see this icon, sometimes it says list may be incomplete. You can just press "Load More." Sometimes it just looks at a small sample of rows just to save some processing and memory, but this just confirms, yep, we're only seeing those values, large, medium, or small, for this new order_size column. So there you have it.
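If it helps to see that rule order written out, here is a minimal Python sketch of the same if, else if, otherwise flow. This is only an illustration of the logic, not the formula Power Query writes behind the scenes. The function name order_size, the "other" fallback value, and the sample quantities are assumptions made up for the example.

```python
def order_size(quantity):
    """Bucket a quantity by checking the rules in order, top rule first."""
    if quantity > 5:
        # Rule 1: anything greater than five
        return "large"
    elif quantity >= 2:
        # Rule 2: only reached when rule 1 failed, so this covers 2 through 5
        return "medium"
    elif quantity == 1:
        # Rule 3: exactly one
        return "small"
    else:
        # The "otherwise" box: catches anything unexpected (assumed value)
        return "other"


# Hypothetical sample quantities, not taken from the FoodMart file
sample_quantities = [1, 3, 5, 6, 0]
sizes = [order_size(q) for q in sample_quantities]
print(sizes)
```

Because the rules are checked from the top down, the second rule never needs to say "and not greater than five"; any row that reaches it has already failed the first rule.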
Just one example. But the same process applies no matter what conditions you're using or what column you're creating conditions against. So, pretty user-friendly tools for the most part. So that just about does it. I think this table is good to go. So we'll go ahead and choose Close & Load To, select connection only, add it to the data model, and press "Load." And we'll see how many 1997 transactions we're dealing with, 86,837. So you may have noticed we're not dealing with huge, huge files in this course, even though we certainly could. Power Query can handle it. The data model can store hundreds of millions of rows. But just for the sake of keeping things quick and snappy and avoiding lagging calculations or potential crashes, we're dealing with some relatively small tables. So there you have it. Check our data model: Transactions_1997, with our new order_size column and our transaction IDs. All set. So close that window, give our file a save, and there you go, Index and Conditional Columns.
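For anyone who wants to see what the Index Column step amounts to, here is a small Python sketch of the same idea: number the rows sequentially, starting from one, and place that ID as the first column. It is an illustration under assumptions, not Power Query's own implementation. The helper name add_index_column and the sample rows are hypothetical.

```python
def add_index_column(rows, name="transaction_id", start=1):
    """Return new rows with a sequential ID placed as the first column."""
    indexed = []
    for i, row in enumerate(rows, start=start):
        # Build a new dict so the ID comes first and the input rows stay untouched
        indexed.append({name: i, **row})
    return indexed


# Hypothetical sample rows; the real transactions table has more columns
transactions = [
    {"product_id": 10, "quantity": 3},
    {"product_id": 22, "quantity": 1},
    {"product_id": 7, "quantity": 6},
]

indexed = add_index_column(transactions)
for row in indexed:
    print(row)
```

Passing start=0 gives the zero-based variant; starting from one just matches the ID format of the lookup tables.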
