This post has been republished via RSS; it originally appeared at: Excel Blog articles.
|Round 1 Post||Round 2 Post||Round 3 Post|
Hello Excel Fans! I’m excited to kick off the first in a series of posts reviewing the tests of the #ExcelWorldChamp competition!
For those of you that have not seen any information on the competition, you can learn more here. In short, from October 2016-November 2016 we ran a series of Excel challenges in many countries to find local Excel Champions. In 2017, these country champs will compete for the world champ title!
Before I go any further I want to stress a few points for this post and all posts to follow...
- There are many ways to solve a problem in the real world, and Excel is no different.
I’ll share an approach to completing each challenge of each test that hopefully teaches you something new or at least gives you something to think about. I am in no way trying to say that my way is the only way to get the job done. This community is all about discussion. So feel free to share your own ideas.
- Data Visualization is a subjective art
I’ll share some thoughts on the data and point to some of the charts posted to the community, but I will not spend a lot of time discussing the visualization challenges. Personal tastes can have a strong influence in this space. There has been a lot of fantastic work shared on the community site, and I encourage you to explore those posts.
- Videos and Files included
Where possible I include videos that walk through the steps outlined in each solution, so you can see how to accomplish the tasks in Excel. When I write a time at the end of a step, the step is demonstrated at approximately that time of the video. Sorry, I know i talk a lot in the videos. I try to give as detailed an explanataion as i can so that a person with little Excel experience can follow along. I've also attached a copy of the solution workbook below (link), so you can inspect methods in detail.
- Shameless Plug for “Tell Me” feature
“Tell Me” was added to Office 2016 applications as simplified approach to help. Search words related to the task you want to accomplish and Tell Me provides some options, including the ability to click directly into an option and start using the feature – in other words you don’t have to navigate the ribbon to find the feature. This can be helpful even if you are an Excel pro and just can’t remember where to find a command, like the “Sort” button for example. In some of the videos that follow I will share some cases where you could use Tell Me.
Challenge 01-01 (Data Relays part 1)
The intent of the question was to introduce tables.
The question COULD be solved using Get & Transform (Power Query) tools to produce a query that easily updates results if your data changes. This is one reason why I love Get & Transform (and we will explore this in a later round).
For now, we’ll approach this question as having a table of data that we need to quickly manipulate once to get to an answer and not worry about needing to repeat the steps in the future if the data changes. I’ll also avoid the use of any formulas in this solution, so that we stick to just manipulating the data. This approach will be a little slow and a bit tedious, but it’s starting point for playing with data in Excel. This is not the best practice for tackling a similar problem if you encounter it in the real world…better to follow methods that will adjust dynamically if your data changes. Again, we’ll get a chance to see how much easier modern Excel can make our lives in later rounds.
- Filter the Original Data Table 1 (01:49) – For this question we are only concerned with Round 3 competitors, so let’s do a quick filtering of the results to get my list of Round 3 competitors
- Sort Table 1 (02:18) – To keep ourselves organized as we collect the data, let’s sort the competitors by their name. The question asks us to sort by Result Time, but we can do that later. In the next steps you will see why sorting by name will be helpful. Don’t know how to sort data? Search Sort in “Tell Me”
- Copy/Paste the Athlete Names (02:45) – We will use this list as the starting point for building our final answer. We now have an easy reference point of the names that should be in our final response.
- Structure Our Data Table (03:50) – We will add some place holder columns for the additional data we need for our answer. Following the question requirements, we’ll add columns for Country, Age, and the 3 rounds of scores. The order we put these columns in does not matter.
- Refilter and Sort Table 1 (06:30) – Since we now know the names of the Round 3 competitors, we can refilter the original data to just see the 3 rounds of data for those Round 3 competitors. By sorting the data by Name and Round number, we can make our next job (copying and pasting of data) easier. The data in the original table will be closer to how we have our new data table structured.
- Copy/Paste the Result Times (08:10)– We need to copy the results for our 8-Round 3 competitors over to our answer. This can be tedious, but the ordering of our data will help speed this up a bit – the results times are ordered in the same manner as our in-progress answer in structured. (This is a step we will especially appreciate skipping when we start using Get & Transform)
- Filter and Sort Table 2 (11:55) – We need the countries of our Round 3 competitors. By filtering this table down to Athlete names we want, we get the list of countries we need. If we then sort Table 2 by Athlete Name, the data will be sorted the same way as our answer table, which makes for another simple copy and paste….
- Copy/Paste the Countries (13:20) – We could use formulas to grab the countries in a less manual way, but for now we’ll stick to the manual process and keep our formula talk to the Formula Wrestling Challenge
- Filter and Sort Table 3 (13:35) – Just like step 7, we can quickly grab the ages for the athletes we are interested in
- Copy/Paste the Ages (15:00)
- Convert our Answer to a Table (15:18) – Tables are more than just convenient formatting. They are great for keeping data organized, making references more readable, and building dynamic formulas that expand with your data among other things. You can check out this article to learn more about tables and why they should be used. The video shows how to create a table via the ribbon, but you can also use the keyboard shortcut CTRL + T
- Sort our Answer (16:35) – The last step is to sort our data the way the question asked: Fastest round 3 Result Times first, with ties broken using the Fastest round 2 Result Time. Since the “fastest” time is the smaller time, we want to do ascending sorting (i.e. sort numbers so the smallest number is first and get bigger as you move down the list). We need to do 2 “levels” of sorting to accomplish the tie breaking outlined by the question. The video shows how to accomplish this more advanced sorting.
Challenge 01-02 (Data Relays part 2)
The previous question set us up well to tackle this challenge. We’ll leverage some of the strategies we used but layer on some more of Excel’s built-in data tools. Again, I COULD use formulas or Get & Transform to help make this process easily repeatable, but I am going to take a more simplistic approach. I am confident my data will not change nor will I need to repeat the process later, so I am going let some of Excel’s data tools, make this challenge just a few button clicks….
- Copy Paste Table 1 (00:35) – We will take a copy of the data table over to the response section and use this copy for all the data manipulation that gets us to the final answer. By doing this as a copy paste of the whole table, our response is already a formatted table.
- Sort by Countries and Time (01:53) – We want to group all the countries together and then sort by result time (two level sorting like what we did in the first challenge). Why do this? Sorting by country gets all the rows for each country together. It doesn’t matter what order (ascending or descending) we use, we just want the same countries grouped together in the table so that there is a little “section” (set of contiguous rows) for each country. Then sorting by Result Time (ascending) means we push the fastest time for each country (the time we are interested) to the top its section.
- Remove Duplicate Countries (03:00) – We will use Excel’s Remove Duplicate capability (found in the Data ribbon) to delete rows from the table. By running this command on the “Country” column, Excel will start at the top of the table and travel down deleting every row where the country has already appeared in the list. In other words, the first time the country appears, that row will be saved; all other times the row is deleted. Because of the sorting we did in step 2, this step will save the fastest times for each country!
- Sort our Answer (04:28) – The last step is to sort our table the way the question asked, by time (fastest first = ascending sort) and athlete number (smallest first = ascending sort)
Challenge 02 (Chart Gymnastics)
As mentioned before data visualization is highly subjective. That said, I was very impressed with the creativity and the thorough data analysis people incorporated into their final visualization. On the latter point especially, the community really blew me away.From my view, the challenging pieces of this data set were that it included:
- data over time
- Data over time makes a great candidate for line and column charts, where time is represented on the horizontal axis.
- data that had very different values (large and small numbers)
- Data with many different values can often be tackled by using a combo chart with 2 vertical axis with different scales on them. For example, a column chart using the primary axis to display large numbers and a line chart using the secondary axis to display large numbers.
- many different types of data (5 columns of data, not to mention the possibility to combine
- Many different types of data is another good reason this data set might be best served by multiple charts.
A very common pitfall when people make (I am now talking about data professionals, not just contest submissions) in this situation is try to cram all their data into one chart – which can often make charts unreadable and a message totally lost. Even when trying to convey a correlation between data, 2 well positioned charts with similar formatting and some shared component (e.g. the same time frame on the horizontal axis) can be effective.
- Many different types of data is another good reason this data set might be best served by multiple charts.
I’ll point out 3 of the most liked charts by the community from Round 1 courtesy of
|link to post||link to post||link to post|
You’ll notice that all three of them employed the tactics mentioned above. And when they really wanted to land a message, they spelled it out in plain text with arrows or other indicators telling us where to look. It doesn’t hurt that they also did some slick formatting to provide very clean but strikingly different outputs. Another one of my favorite things about Excel – it is really an open canvas where you can explore your creative side as well as any data you might have. For those looking for general pointers on charting in Excel, check out these 2 blog posts on Office.com. There are plenty more articles on charting tips and tricks if you search “Excel Charts” on the blog too.https://blogs.office.com/2013/10/02/telling-a-story-with-charts-in-excel-2013/ Challenge 03 (Formula Wrestling) This challenge was an introduction to formulas. The questions were intended to be straight forwarded and most can be solved with a single function, assuming you know the functions exist. If you don’t know about the functions, Tell Me can be a lifesaver. Things also get a little more complicated as we get down to questions 4 and 5. For the questions below I will share a solution that I think has a clear enough logic flow, especially for someone less familiar with Excel functions. Where I think it could be useful, I’ll also share the most streamlined (least number of functions/operations) solution submitted – often times these 2 approaches are the same. I recommended checking the video and the solution file to get the best understanding of these answers…
- What was the average result time across all rounds of the competition? (00:30)
We can use the AVERAGE function. As an intro to functions, the video also demonstrates how to use the SUM function to add the results times, and the COUNT function to count the number of times that are in the list. The sum of a set divided by the number elements in the set gets you the average. But SUM divided by COUNT is 3 functions/operations. AVERAGE is 1 function/operation.
- What was the age of the youngest athlete? (03:50)
“Youngest” means smallest age, so we can use the MINIMUM function to find that smallest age
- How many athletes were age 25 or younger? (04:33)
We need to count all ages less than or equal to 25. Luckily the COUNTIF function makes that easy. In the video I walk through an introduction to conditionals (e.g. the IF function) to show how we could construct a solution by doing an IF on every age to check if it is 25 or younger and then counting our matches, but again when we optimize for fewest functions/operations (and ease of creation J) COUNTIF is our winner
- What was the fastest result time (any round) for an athlete under 25 years of age? (09:12)
Here’s where we start to get into more complex questions. We need to find the fastest result time but only among athletes with an age less than 25. In the video I share 2 approaches. First, we can use the IF function to conditionally check athlete ages and generate the list of result times for those under 25. Finding the MINIMUM from this reduced list of times gives us our answer. Alternatively, users with Office 365 have access to new functions added in January 2016 including MAXIFS and MINIFS. They work in a similar way as COUNTIF. So we can do all of the work as the first approach with fewer functions.
- How many different countries participated? (16:44)
To put it simply, we are trying to count unique values in a list. If you think, Excel should have a UNIQUE function, a lot of people would agree with you :smileyhappy: (Don’t forget you can always go to http://excel.uservoice.com/ to make feature requests.) There are a lot of different approaches to take. In the video I show how we can leverage our trusty COUNTIF function plus some mathematical know how to generate our unique count. Some contestants out there used this same concept and deep knowledge of Excel to condense this COUNTIF approach into a very compact formula. I did not know this approach was possible until I saw these submissions, so THNAK YOU!
Step 1: We use the COUNTIF function to create a list that shows how many times each country appears in the list. But we do this for the entire list without caring if we calculate counts for a country multiple times. So if “JAM” appears 3 times, we generate the answer “3” 3 times as we run down the list.
Step 2: We take the inverse of our counts, so the 3 “3”s for “JAM” become 3 “1/3”s
Step 3: We SUM our list of inverse counts. The 3 “1/3”s for “JAM” add up to 1 – in other words 1 country represented. This same math works out for all the countries in our list regardless of how many times they appear (e.g. if a country appears 5 times in the list, it will produce 5 “1/5”s which still adds up to 1!)
STEP 1: We can condense our list of COUNTIFs into a single formula. This produces an “array” of counts. This doesn’t get represented well in a cell, because cells only store 1 value, but that’s ok for now. This is just an interim step
STEP 2: Take the inverse again. This time we get an array of fractions just like the many cells containing fractions like approach 1 - again this is not well represented in a single cell
STEP 3: Add up our inverse counts again. The SUM function will not work in this case. But we can use SUMPRODUCT. SUMPRODUCT is intended to multiply corresponding elements across many data sets and then add up the results of all those multiplication problems (in other words SUM up the PRODUCTs). By feeding SUMPRODUCT a single array, there are no products to calculate. So Excel returns the sum.
Approach 2 is they exact same mathematical logic as approach 1, but shrunk down into just 3 functions/operations (1 COUNTIF, 1 division, and 1 SUMPRODUCT)Now that just scratched the surface. Like I said before, there are many other possible approaches to take. But I hoped this help gives you ideas on how to tackle your own problems with the power of Excel.
To those of you that made it to the end of this post – thank you! We really appreciate your interest in the #ExcelWorldChamp, this community, and Excel in general. I’ll be following up with posts on the test solutions to the other rounds of the competition shortly. Thanks again!
|Round 1 Post||Round 2 Post||Round 3 Post|