Data Visualization – a real world example

In this post we work through a real world example of a data visualization. We’ve chosen an example that involves Operations data; it is not tied to any specialist domain, so hopefully it can demonstrate some important points. The first, and most important, point is that you have to define your audience.

We receive many questions about “what is the best chart for this situation” or “what colour should I use for emphasis”. These questions usually attack the problem from the wrong angle. The one question you need to ask before anything else is “who is going to see this visualization, and how?” Is it in a boardroom on a printed sheet, or across a trading floor on a plasma screen? Are the consumers domain experts?

This example features data about an investment bank’s operations processing, the audience being the clients of the Operations department.

Starting Point

Initially the project started out as simply trying to record what operational problems were encountered on a daily basis across different product lines. A reporting system was built and various generic reports produced:

DVBlog1

Unfortunately the reports either didn’t contain data at a granular enough level or made it difficult for the product managers to see where the issues were occurring and what the trends were. In reality the reports showed what the major problems had been. Unfortunately this was already known: when something major goes wrong, you remember getting shouted at!

What was requested

The client wanted a report that showed where the problems were occurring across business lines (rather than operational units) and how they were doing historically, on a single page that could be included in a weekly MIS pack. They currently had four pages for each of the 8 product lines, so a total of 32 pages. As a first pass they simply wanted an Excel worksheet they could update manually:

DVBlog2

We felt this solution lacked clarity and it was very difficult to spot trends across products.

What we proposed

We designed a solution using MicroCharts to allow small multiples of charts to show a variety of views:

DVBlog3

This solution allowed the user to view the data simply as a cumulative set of data by Product (top line) or by Root Cause (vertically), and then look deeper into historical trends in the centre of the chart. For example, it’s fairly easy to see spikes in the Root Cause data historically and to see that the overall trend has improved over time. By ranking the Products and Root Causes you immediately give some sense of scale to the data. For example, you can see that there are many more Application failures than any other type of problem, but the majority of root causes are otherwise fairly evenly distributed.
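The ranking idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how the MicroCharts report was built: the incident records, product names and root cause labels below are all invented.

```python
from collections import Counter

# Hypothetical incident log: (product, root_cause) pairs, invented for illustration.
incidents = [
    ("Equities", "Application"),
    ("Equities", "Application"),
    ("Equities", "Static data"),
    ("FX", "Application"),
    ("FX", "Manual error"),
    ("Bonds", "Application"),
]

by_product = Counter(product for product, _ in incidents)
by_cause = Counter(cause for _, cause in incidents)

# Rank both axes by total so the largest problem areas come first.
# Ties are broken alphabetically to keep the ordering deterministic.
products = sorted(by_product, key=lambda p: (-by_product[p], p))
causes = sorted(by_cause, key=lambda c: (-by_cause[c], c))

# One row per root cause, one column per product, both in ranked order.
cells = Counter(incidents)
matrix = [[cells[(p, c)] for p in products] for c in causes]

print("Products:", products)
for cause, row in zip(causes, matrix):
    print(f"{cause:<14}", row)
```

With both axes sorted by their totals, the busiest product and the most frequent root cause always sit in the first column and first row, which is what gives the reader an immediate sense of scale.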

One other point worth noting was that the original colour scheme was much more muted, but the client got very upset that it looked like a competitor’s corporate colour and wanted it to be “louder”.

What was the user reaction…

Ecstatic. One page replaced 34, and they could see at a glance how the entire (large) organisation was working, but also quickly find detail for a particular area and identify trends.

Cube Design – meeting the business needs

Following on from our previous blog post on a couple of the common cube performance issues we’ve seen this last month, I thought I’d mention some of the non-technical issues we see quite often. In one case, once we’d made a few tweaks and sorted out the cube performance issues, we had to ask: is the cube doing what it needs to? (Of course we did ask this first, but the priority was sorting out the current cube performance!) Does it meet the business requirement? There’s no point in having the most complex cube that uses all the greatest features if it can’t answer the users’ queries.

In reports, we’ve seen examples where clients have nested four or five attributes to build up the effect of a hierarchy, run huge queries and then used VLOOKUPs on them to get the data they need, or brought back 12 columns of data and manually worked out year to date. We’ve also seen cubes with no hierarchies reflecting commonly used groupings of members, and member names not formatted in the way the business needs. To us this just isn’t right.

The users might not seem to care too much if they don’t know how the cube could work, or if it runs fast enough to bring back huge result sets they can manipulate themselves. But doesn’t that negate the point of having a cube and your investment in it? Consumers of the cube should have fast, timely, accurate and, importantly, appropriate data made available to them in a manner that makes sense.

Cube design and build is about understanding the business and the users’ needs, and then building the cube and associated processes. That’s before even starting to build the reports and convey the information using good data visualisation practices.

All too often we’re seeing a drive to use the latest tech: the flashiest widgets and cool looking 3D and shading effects on reports, through to cubes and databases with every conceivable hierarchy or type of measure, none of it bearing much resemblance to what the users need to see.

I won’t hide the fact that we’re very proud of our skills and experience in ensuring our clients get not just a technically excellent system but also one that fits their needs. If you want to talk to one of the team about how they can help, you can find our contact details here.

Common Analysis Services Performance Issues

A quick blog post from the Services team here at XLCubed on some performance problems with SSAS that we’ve seen again recently. With the processing power and memory available it’s pretty easy to build a fast cube, both for query performance and processing time. It is also easy to be lax in cube design, ignore the warnings and best practice guidelines, and end up with a cube that looks concise, neat and clever but performs terribly for end users.

We’ve come across a couple of examples of this at client sites in the last month, and there are some common issues that always seem to jump out – rectifying these normally has a very positive impact. The three most common culprits we see are:

Parent-Child dimensions – Parent-Child dimensions are nice and easy to build and use. However, as you can’t build aggregations that include a parent-child dimension, it can make for a badly performing cube! Evaluate exactly why a parent-child dimension is required and how it is being used, and try to flatten dimensions out where possible. They are not the only option.
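As a rough illustration of what flattening means, here is a minimal Python sketch of the kind of step that could run in the ETL layer. The account names and the `parent_of` table are hypothetical, and padding short paths by repeating the leaf is just one common way of dealing with ragged hierarchies, not the only one.

```python
# Hypothetical parent-child table: member -> parent (None marks the root).
parent_of = {
    "All Accounts": None,
    "Revenue": "All Accounts",
    "Costs": "All Accounts",
    "Other": "All Accounts",
    "Product Sales": "Revenue",
    "Salaries": "Costs",
}


def path_from_root(member, parent_of):
    """Return the chain of members from the root down to the given member."""
    path = []
    while member is not None:
        path.append(member)
        member = parent_of[member]
    return path[::-1]


def flatten(parent_of):
    """Turn a parent-child table into fixed level columns, one row per leaf.

    Leaves that sit higher up the tree are padded by repeating the leaf
    itself so that every row has the same number of level columns.
    """
    parents = set(parent_of.values())
    leaves = sorted(m for m in parent_of if m not in parents)
    paths = [path_from_root(leaf, parent_of) for leaf in leaves]
    depth = max(len(p) for p in paths)
    return [p + [p[-1]] * (depth - len(p)) for p in paths]


rows = flatten(parent_of)
for row in rows:
    print(row)
```

Each row now has a fixed set of level columns, which is the shape a regular dimension with ordinary levels can be built on.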

Unary operators, custom rollups – we’ve seen cases where these have been included in every dimension in a cube by default. If there isn’t a need for them, leave them out! If you can get around using a custom rollup or unary operator by some simple work in the ETL process, it may be better to do that first.

If your query performance is bad, try removing all unary operators and custom rollups, then re-test the cube. How’s the performance now? It should be significantly faster. Evaluate and review the need for the unary operators and custom rollups, and see if the same effect can be achieved differently (e.g. in the ETL layer).
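To make the ETL alternative concrete, here is a minimal sketch in Python, assuming the simplest case where each account either adds to or subtracts from a single total. The account names, the fact rows and the `signed_amount` column are invented for the example.

```python
# Hypothetical mapping of accounts to the unary operator they would otherwise
# carry in the cube: "+" adds to the parent, "-" subtracts from it.
unary_operator = {"Sales": "+", "Cost": "-"}

# Hypothetical fact rows as extracted from the source system.
facts = [
    {"account": "Sales", "amount": 120.0},
    {"account": "Cost", "amount": 70.0},
    {"account": "Sales", "amount": 30.0},
]

SIGN = {"+": 1.0, "-": -1.0}


def apply_signs(rows, operators):
    """Return new fact rows with a signed amount, so a plain sum rolls up correctly."""
    return [
        {**row, "signed_amount": SIGN[operators[row["account"]]] * row["amount"]}
        for row in rows
    ]


signed = apply_signs(facts, unary_operator)
net = sum(row["signed_amount"] for row in signed)
print(net)  # 120 - 70 + 30 = 80.0
```

The trade-off is that Cost now appears as a negative number at the leaf level, so it is worth checking that this suits the reports before replacing the operators.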

Cache vs. Non-Cache Data – Basically, is the cube recalculating and re-querying numbers over and over again, or can it re-use results? Use Profiler to check for cache or non-cache data when your queries are running. So many times we’ve seen queries not using the cache at all because Analysis Services hasn’t been given enough available memory, or because volatile functions such as Now() have been used in MDX calculations.

Resolving the issues above had a massive impact: reports taking up to 3 minutes to run were down to a few seconds, and users could begin to use the application properly for the first time. However, fixing the performance may be only part of the task. The cube of course needs to have been designed to meet the business requirements, but that’s another blog.

2009 Excel Dashboard Competition Winners

Thanks to everyone who entered this year’s competition. Again the standard was very high, and it’s always great to see the product being used so effectively. The entries were extremely varied in both their style and subject matter, and made for a difficult decision. However I’m pleased to be able to announce the winners:

1) Ajay V Singh – Operations Dashboard for a Debt Collections Company.

The target audience is the CXO level execs of the business, and the dashboard aims to provide a view of all the nerve points of the organization in a single unified interface that is portable and yet comprehensive.

The dashboard layout is dense but uncluttered and well thought through. Colours are well balanced, and allow the reds to draw the reader’s attention as intended.

Ajay’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

Collections Dashboard Screenshot

2) John Munoz – Insights into Unemployment in the United States.

Using data from the Bureau of Labor Statistics, the dashboard gives a deep glimpse into the unemployment situation in the US. A large volume of disparate and tabular information is brought together in a single concise view, which aids understanding and adds real insight. The trends and demographic splits come through very well, and make for easy comparison.

John’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

unemployment_dashboard_munoz

3) Lisa Cunningham – Anti-Social Behaviour Dashboard

The dashboard is produced by the Research and Information Team at Leicestershire County Council as part of a suite of dashboards produced for the Crime and Disorder Reduction Partnerships. It is available to the public through the local web portal, which makes readability, and the contact information provided, vital. The dashboard aims to provide an at-a-glance view of the level and trend of ASB, and does an excellent job.

Lisa’s background summary of the dashboard, with larger screen shots, will be available on our web site in the coming week.

ASBDashboard

Excel Dashboard Competition – deadline extended

We have decided to extend the entry deadline through the holiday period, to 28th August.

As a reminder, the competition is for real world solutions (no sample data set), and judging criteria include:

  • Clean and clear organization
  • Effective table and chart design
  • A single-screen display, properly designed for the web, screen or print outs

See the competition page for more detail.

Thanks to all of you who have already entered. The quality has again been good, and will doubtless lead to an interesting debate when it comes to choosing the winners. As we’ve extended the deadline, if there are any additional tweaks you’d like to incorporate you can of course send revised versions.

Microsoft Business Intelligence roundup

There have been a few key announcements in the Microsoft BI world recently. We’ve gathered and summarised them below in case any of our readers missed them.

SQL Server 2008 R2 – (CTP Summer 09)

“SQL Server 2008 R2 expands on the value delivered in SQL Server 2008 by providing a wealth of new features and capabilities that can benefit your entire organization. This release will further improve IT Efficiency with new and enhanced management capabilities and empower business users to access, integrate, analyze and share information using business intelligence tools they already know.” Read more here.

So what does this mean for you? In R2 there will be a number of new features, ranging from Gemini and Master Data Services to support for more than 64 processors and extended functionality in Management Studio. We’re all looking forward to Gemini and the potential it has to offer. Rest assured the XLCubed development team are working closely to ensure that the product is compatible straight out of the box. If you have any questions, contact support@xlcubed.com.

Service Pack 1 for SQL Server 2008 Available (April 09)

Microsoft announced the release of SP1 for SQL Server 2008 earlier this month. For many this marks the psychological point at which they’ll take interest in and investigate the product in depth. With a large uptake of the product already in the market place and the fastest OLAP engine we’ve seen from Microsoft, there is now no excuse not to evaluate upgrading or migrating to SQL Server.

Contact our services team for more information, or to find out how we can help you with SQL Server 2008.

SQL Server 2008 SP1

Service Pack 1 for SQL Server 2008 is now available for customers. The Service pack is available via download here and is primarily a roll-up of cumulative updates 1 to 3, quick fix engineering updates and minor fixes made in response to requests reported through the SQL Server community. While there are no new features in this service pack, customers running SQL Server 2008 should download and install SP1 to take advantage of the fixes which increase supportability and stability of SQL Server 2008.

Customers have no reason to wait to upgrade to SQL Server 2008 and many are already taking advantage of SQL Server 2008 as a smart IT investment. In fact, there have been over 3 million downloads of SQL Server 2008 since the RTM in August. With this Service pack, Microsoft is introducing 80% fewer changes to customer configurations compared to previous SQL Server Service Pack releases. This remarkable decrease is a testament to a revised product development process and updated servicing strategy that is focused on ease of deployment while keeping customer environments stable.

Microsoft BI Conference moves to every second year

The Microsoft BI Conference, last held in October 2008 in Seattle, WA, is becoming a biennial event. Citing global economic constraints on travel budgets, Microsoft has moved the conference originally scheduled for October 2009 to October 2010 in Seattle, WA, and all further BI Conferences will be held every second year on an ongoing basis. Content until then will be covered at the SQL PASS Summit, TDWI and SharePoint conferences.

If you were looking forward to seeing the XLCubed product team at the BI Conference this year, don’t worry: you can still contact them at xlsales@xlcubed.com.

SQL Server Fast Track Data Warehouse (Feb 09)

Microsoft announced SQL Server® Fast Track Data Warehouse, a new set of Reference Architectures for SQL Server 2008 that enables customers to accelerate their Data Warehouse deployments and reduce cost.  In addition, customers can further jump start their Data Warehouse design with new industry solution templates provided by System Integrators – Avanade, Hitachi Consulting, Cognizant and HP.

Seven new Reference Architectures with storage capacities from 4 to 32 TB were unveiled in partnership with HP, Dell and Bull. Developed and tested by Microsoft, these architectures use balanced hardware optimized for Data Warehousing. As a result customers will get:

  • Better price performance than competitive solutions.  Fast Track Data Warehouse offers similar performance to the competition at 1/5th the price
  • Faster time to value and lower cost to setup and configure
  • Better performance out of box through pre-tested hardware. 

Customers can also choose the right Fast Track Data Warehouse with the right performance, storage capacity and pricing to suit their business needs.   Unlike Appliance Vendors with proprietary solutions, the new reference configurations use industry standard hardware from Dell, HP and Bull giving flexibility and cost savings to customers.

Fast Track Data Warehouse is available from today: customers will buy their SQL Server 2008 licenses through their preferred Microsoft Partner and the hardware from Dell, HP or Bull. If you’re looking to implement a data warehouse, contact the services team to see how we can help.

Demise of PerformancePoint Planning (Jan 09)

It’s been a few months now since Microsoft announced the demise of PerformancePoint Planning and the rebranding of the Monitoring and Analytics elements as PerformancePoint Services. The announcement back in January (09) caught many by surprise; for us, however, it’s provided a useful segue into the new XLCubed Planning application. Many customers were waiting to see what was coming next and when PerformancePoint would be ready to compete with existing players with proven planning technology (i.e. in-memory OLAP), and the tempting announcements around Gemini certainly added confusion. Looking back now at the conversations we had in Seattle and at the Microsoft presentations, perhaps the announcement isn’t as big a surprise as it felt at the time.

Our long term commitment to an Excel front ended planning application continues. The demise of PerformancePoint Planning has simply increased the market for us and in many ways freed clients from the constraints of using purely Microsoft technology. Augmenting the Microsoft toolkit and providing our clients with the functionality they need to build effective planning, budgeting and forecasting applications remains at the forefront of our product set and services.

If you want to know more about our products and services (consulting team) just send an email to services@xlcubed.com and someone in your region and market sector will get back to you straight away.

Augmenting the MS BI Stack

Here at XLCubed we’re often asked how the product sits in relation to the Microsoft Business Intelligence tools. The answer is that we add to and augment the features and functionality that Microsoft has to offer. Excel is a fantastically powerful and flexible spreadsheet engine, and this is exactly what it should be used for. However, all too often Excel is used as a database, with linked spreadsheets and huge data extracts.

XLCubed have a number of products designed to take advantage of the functionality available with Microsoft Business Intelligence tools, these include XLCubed Excel edition, MicroCharts, and XLCubed

2009 Excel Dashboard Competition

We are pleased to announce the 2009 Excel Dashboard Competition:

The Competition

Like last year, the competition is for real world solutions. We are not providing a sample data set, and we’re looking forward to seeing some great examples of reports, charts and dashboards.

The dashboards are judged on the clarity and effectiveness of their design, particularly

  • Clean and clear organization
  • Effective table and chart design
  • A single-screen display, properly designed for the web, screen or print outs

We’ll also consider technical aspects of the dashboard: did it use effective techniques for

  • The Dashboard layout
  • Data management, data logic and calculation: YTD figures, variances, etc.
  • Dashboard delivery: Sharing the dashboard via PDF, the web or as an Excel Workbook

There will be prizes for the top 3 entries, with the winner having first choice of prize from:

The Rules
We’ve kept the rules simple:

  • The solution must be in Excel 2000 or more recent, and not require additional software other than Excel, Chart Tamer, MicroCharts and XLCubed.
  • Entries can use any combination of tables, Excel charts, bullet graphs and MicroCharts (sparklines). Each has its strengths and role to play in an effective dashboard.
  • We will publish the top 3 dashboards on our website, so please ensure this is not problematic for any of your submissions.
  • Please change names and data as appropriate in the dashboards to protect the innocent.
  • Final entries by 19 July 2009. The judges’ decision is final!

XLCubed Video Library

We have begun to compile a series of videos to help all users of our products. We will keep adding instructional videos on a regular basis; if you are a current customer and there is a particular element you would like to see, let us know at support@xlcubed.com. These videos provide a quick introduction to the possibilities and functionality. To fully explore and understand the capabilities of the product set, contact the team to arrange an evaluation or a comprehensive training course.

XLCubed Videos:

Video 1 – Getting started with XLCubed. In this video we show how to get started with XLCubed, creating a connection to your analysis cube and building a sample report.

Video 2 – XLCubed Grid Navigation. In this video we show how to navigate around an XLCubed grid in Excel and utilise some more of the functions.

Video 3 – Introduction to formula reporting with XLCubed Excel edition. In this video we show how to use the formula mode in XLCubed to query an Analysis Services cube.

Video 4 – Publishing a dashboard to the web using XLCubed. In this video we show you how you can publish your XLCubed report/dashboard to the web using XLCubed web edition.

Video 5 – Ad-hoc reporting using XLCubed Web. Using the XLCubed web edition we demonstrate ad-hoc reporting against an Analysis Services cube in a thin client web browser.

Video 6 – XLCubed user defined calculations. This video shows how you can use XLCubed Excel edition to create user defined calculations.

Video 7 – Extending an XLCubed report with Excel functionality. A first brief introduction to extending an XLCubed report using Excel functionality. In this video we show how you can drive a report from an Excel range.

Video 8 – XLCubed Relational database support. The first video using XLCubed to connect to a relational database, querying the dataset using SQL.

Video 9 – XLCubed Report Templates. How to use XLCubed templates to provide a starting point for end users.

Video 10 – coming soon, using parameters in a URL to drive an XLCubed report

Video 11 – coming soon, visual grids – XLCubed grids with integrated MicroCharts

Channel: http://www.youtube.com/xlcubed

MicroCharts Videos:

Video MC1 – Introduction to MicroCharts – Sparklines. An introduction to creating Sparklines with MicroCharts for Excel, in cell charts for your Excel spreadsheet or dashboard.

Video MC2 – Bullet Graphs in Information Dashboards. An overview of Bullet Graphs, and how to build them using MicroCharts in Excel.

XLCubed V5

Yesterday we released version 5 of XLCubed. V5 offers continued enhancements across the product set including a completely new web interface, embedded PDF printing, and further extension of our interactive ‘Visual Grids’ to include all MicroChart chart types.

Version 5 is joined by a new website, which now includes a series of ‘how do I’ YouTube videos, available on the individual product pages or on the XLCubed YouTube channel. The content will be expanded across additional areas in the coming months, and we hope it will become a key resource for customers and evaluators alike.

The basic structure of a cube

This week we’ll take a look at the basic structure of a cube from an end user perspective, as opposed to the architectural underpinnings. This is intended as a high level overview focused on Microsoft Analysis Services cubes, and for brevity it contains some generalizations.

An OLAP cube consists of several key elements, the most fundamental of which are dimensions and measures.

1) Dimensions

Dimensions are the business elements by which the data can be queried. They can be thought of as the ‘by’ part of reporting. For example “I want to see sales by region, by product, by time”. In this case region, product and time would be three dimensions within the cube, and sales would be a measure (described below). A cube based environment allows the user to easily navigate and choose elements or combinations of elements within the dimensional structure.

2) Measures

Measures are the units of numerical interest, the values being reported on. Typical examples would be unit sales, sales value and cost.

Note that there are modelling techniques which develop cubes with only one pseudo measure, typically called ‘value’ or similar, and implement what the user would think of as the measures through a dimension. There are performance and navigational reasons which can make this a good approach, but it is not one we’ll cover here in our introduction.

The diagram below shows a very simple cube which we’ll use for discussion.

cube

This particular cube is for an exports business, and consists of three dimensions: Source, Route and Time. The two measures are Packages, being the number of packages shipped, and Last, being the last shipped date.

Very few real world cubes will have just three dimensions, but I’ve yet to learn how to draw a 12 dimensional cube! The diagram above is enough to illustrate the fundamental principle: at every intersection of the different dimensions, a value is stored for each of the measures. In larger real world cubes the principle is the same; just the number of intersections is larger.
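One way to picture this is as a lookup table keyed by one member from each dimension. The Python sketch below is a mental model only, not how Analysis Services stores data, and the member names and figures are invented.

```python
from datetime import date

# Keys are (Source, Route, Time) coordinates; each cell holds the two measures.
# All member names and numbers are invented for illustration.
cube = {
    ("Africa", "Air", "2008-H1"): {"Packages": 120, "Last": date(2008, 6, 28)},
    ("Africa", "Sea", "2008-H1"): {"Packages": 300, "Last": date(2008, 6, 15)},
    ("Asia", "Air", "2008-H1"): {"Packages": 80, "Last": date(2008, 5, 30)},
    ("Asia", "Air", "2008-H2"): {"Packages": 95, "Last": date(2008, 12, 20)},
}


def cell(source, route, time):
    """Look up the measures stored at one intersection of the three dimensions.

    Returns None when nothing was shipped for that combination.
    """
    return cube.get((source, route, time))


print(cell("Asia", "Air", "2008-H2"))
print(cell("Asia", "Sea", "2008-H1"))
```

Adding a fourth dimension would simply mean a fourth element in each key, which is why the principle carries over unchanged to cubes that are impossible to draw.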

The diagram highlights a few additional features of dimensions which need to be understood.

  • Hierarchies

A dimension can contain one or more hierarchies. Hierarchies are really navigation or drill paths through the dimension. They are structured like a family tree, and use some of the same naming conventions (children / parent / descendant). Hierarchies are what bring much of the power to OLAP reporting, because they allow the user to easily select data at different granularity (day / month / year), and to drill down through data to additional levels of detail.

    • Hierarchies consist of different levels. For example a time dimension would typically have a year, a month and a day level. A customer hierarchy may consist of Country, State, City, and Name levels.
    • The levels are either implied in the case of dates, or exist as ‘attributes’ in the source data. So for example customer number 12324, John Brown, would have additional information recorded such as his address, broken into house number & street, city, state, country. Each of these is an attribute.
    • Hierarchies are really ordered navigation paths through the attributes.
    • In Analysis Services 2005, the user’s view of a dimension will typically consist of both the defined Hierarchies, and also the Attributes. Attributes are ‘flat’, i.e. contain no ordered drill path.
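The relationship between attributes and hierarchies can be sketched in Python as follows. This is a simplified model under our own assumptions: the customer rows are invented, and `children` is a hypothetical helper rather than anything Analysis Services exposes.

```python
# Hypothetical customer rows; each key is an attribute of the customer.
customers = [
    {"Name": "John Brown", "City": "Austin", "State": "Texas", "Country": "USA"},
    {"Name": "Mary Jones", "City": "Dallas", "State": "Texas", "Country": "USA"},
    {"Name": "Ann Lee", "City": "Leeds", "State": "West Yorkshire", "Country": "UK"},
]

# A hierarchy is just an ordered list of attributes, coarsest level first.
geography = ["Country", "State", "City", "Name"]


def children(rows, hierarchy, path):
    """Return the members one level below the given drill path.

    `path` holds one chosen member per level already drilled through,
    e.g. ["USA", "Texas"]; an empty path returns the top level members.
    """
    level = hierarchy[len(path)]
    matching = [
        row for row in rows
        if all(row[attr] == value for attr, value in zip(hierarchy, path))
    ]
    return sorted({row[level] for row in matching})


print(children(customers, geography, []))
print(children(customers, geography, ["USA", "Texas"]))
```

The same rows could support a second hierarchy simply by listing the attributes in a different order, whereas a single attribute on its own offers no drill path at all.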

  • Members

A member is any single element within a hierarchy. For example in a standard Time hierarchy, 1st January 2008 would be a member, as would 20th February 2008. However January 2008, or 2008 itself, could also be members. The latter two would be aggregations of the days which belong to them. Members can be physical or calculated. Calculated members mean that common business calculations and metrics can be encapsulated into the cube, and are available for easy selection by the user, for example in the simplest case Profit = Sales – Cost.
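The idea of a calculated member can be illustrated with a small Python sketch. The figures are invented and the `value` function is a hypothetical stand-in for the cube’s query engine; the point is only that the formula is defined once and evaluated on request.

```python
# Physical measures stored for each Time member (hypothetical figures).
stored = {
    "January 2008": {"Sales": 1000.0, "Cost": 650.0},
    "February 2008": {"Sales": 1200.0, "Cost": 700.0},
}

# Calculated members are defined once, as formulas over the stored measures.
calculated = {
    "Profit": lambda measures: measures["Sales"] - measures["Cost"],
}


def value(member, measure):
    """Return a stored measure, or evaluate a calculated one on demand."""
    measures = stored[member]
    if measure in calculated:
        return calculated[measure](measures)
    return measures[measure]


print(value("January 2008", "Profit"))   # 1000 - 650 = 350.0
print(value("February 2008", "Sales"))   # stored value, 1200.0
```

From the user’s side Profit is selected exactly like Sales or Cost, without anyone having to rebuild the formula in each report.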

  • Aggregation

Aggregation is a key part of the speed of cube based reporting. The reason a cube can be very fast when, for example, selecting data for an entire year is that it has already calculated the answer. Whereas a typical relational database would potentially sum millions of day level records on the fly to get an annual total, Analysis Services cubes calculate these aggregations during the cube build, and hence a well designed cube can return the answer quickly.

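Sum is the most common aggregation method, but it’s also possible to use average, max and others. For example, if storing dates as measures (like the Last measure in the cube above), it makes no sense to sum them; max would be the natural choice for a last shipped date.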
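A toy version of pre-aggregation, written in Python under simplifying assumptions, looks like this. The daily figures are invented and `build_aggregations` is a hypothetical stand-in for the cube build; it sums the package counts and takes the max of the shipped dates.

```python
from collections import defaultdict
from datetime import date

# Hypothetical day level facts: (shipped date, packages shipped that day).
daily = [
    (date(2008, 1, 1), 10),
    (date(2008, 1, 15), 5),
    (date(2008, 2, 20), 7),
]


def build_aggregations(rows):
    """Pre-compute month and year cells once, at 'cube build' time."""
    packages = defaultdict(int)  # aggregated with sum
    last = {}                    # aggregated with max
    for day, count in rows:
        # Each daily row contributes to its month cell and its year cell.
        for key in ((day.year, day.month), (day.year,)):
            packages[key] += count
            last[key] = max(last.get(key, day), day)
    return dict(packages), last


packages, last = build_aggregations(daily)

# Queries are now dictionary lookups rather than scans over the daily rows.
print(packages[(2008,)], last[(2008,)])
print(packages[(2008, 1)], last[(2008, 1)])
```

The work of scanning the daily rows happens once at build time, so the cost of answering an annual query no longer depends on how many days of data sit underneath it.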

The cube brings together a number of dimensions, hierarchies and measures that model the business of interest, all of which are available to the end user to quickly and easily select, drill, and slice and dice. With a well designed cube the user benefits from a reporting environment which is highly flexible, contains the pre-calculated business metrics they regularly use, and is fast in terms of data retrieval.