Adaptive Modelling - An Item-based Approach

Introduction

Working on projects over the years, I saw a confluence of “classical” TM1 model design and an approach that was often used to implement Cognos Planning. This occurred when TM1 Contributor was released, and TM1 was quietly ushered in to become the core engine for all of IBM’s analytics products.

This caused an interesting issue. When designing for Contributor, the typical TM1 model design often made it impossible to use cube views for simple data input. And recall that in the first version of Contributor, cube views were all we had! It did not yet support Websheets at that point.

To overcome this, I began experimenting with cube designs that were intended to appear more friendly in Contributor, and this led to a model design I referred to as the “item-based approach”.

Over time, the necessity that gave rise to it disappeared, but I kept returning to it, for the many reasons I will discuss in this article. The “item-based” approach has many advantages, and a few gotchas, that are worth considering in any implementation you might be involved with.

The Approach

The item-based design approach has the goal of allowing data input in a flat, tabular format without giving up the browsability and analytic capability of highly dimensional cubes. It also separates input from reporting in a very elegant fashion.

You begin with an input cube, which should have only the basic dimensions, plus a measures dimension to represent the columns of the input table, and an item dimension to represent an arbitrary number of rows.

The measures dimension will include many string elements which map to other dimension elements. Thanks to the picklist feature in TM1 9.5+, these lists can even be restricted to ensure invalid entry does not occur.

A separate reporting cube is then created that maps the string elements to actual dimensions, for reporting and analysis, usually via rules. This cube has no data entry and populates itself entirely from data in the input cube. You could also use TI to populate such a cube without too much trouble, for implementations that have higher data volumes.
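As a rough illustration of the mapping described above, here is a minimal Python sketch. It is not TM1 rule or TI syntax, only a model of the idea. The measure names `Region`, `Distributor`, `Product` and `Amount`, the sample values, and the helper `build_reporting_cube` are all assumptions invented for this example.

```python
from collections import defaultdict

# Input cube: one record per item slot. The string measures name the
# target dimension elements; "Amount" is the numeric measure.
# All names and values here are hypothetical.
input_cube = {
    "Item 1": {"Region": "North", "Distributor": "Acme", "Product": "Widget", "Amount": 100.0},
    "Item 2": {"Region": "North", "Distributor": "Acme", "Product": "Widget", "Amount": 50.0},
    "Item 3": {"Region": "South", "Distributor": "Bolt", "Product": "Gadget", "Amount": 75.0},
    "Item 4": {},  # unused slot
}

MAPPED_DIMENSIONS = ("Region", "Distributor", "Product")


def build_reporting_cube(items, dimensions=MAPPED_DIMENSIONS):
    """Aggregate item rows into cells keyed by a tuple of dimension elements."""
    cube = defaultdict(float)
    for row in items.values():
        # Skip empty or incomplete slots: they map to no reporting cell.
        if any(not row.get(d) for d in dimensions):
            continue
        key = tuple(row[d] for d in dimensions)
        cube[key] += row.get("Amount", 0.0)
    return dict(cube)


reporting_cube = build_reporting_cube(input_cube)
```

Two item rows that share the same Region, Distributor and Product land in the same reporting cell, which is the behaviour you would want from the equivalent rules or TI process.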

I call it item-based because this approach naturally requires an item dimension with arbitrary element names. Most of the time we just call the elements “Item 1”, “Item 2”, etc., up to a fixed maximum. Because this maximum is imposed, it is important that the efficiency of the model is not affected by the number of elements in the item dimension. More about that below.

Advantages

There are many advantages to such an approach.

Data-entry simplicity

New users of TM1 are often uninitiated, and, dare I say it, sometimes under-prepared by standard TM1 training courses, to understand the full advantages of a multi-dimensional database model. It doesn’t matter what you do, some users have spent way too much time in Excel or Access and simply think in terms of tables.

And why should they bother? Many of these users are simply data contributors, and do not have any interest in performing extensive analysis on their data.

The flat input approach allows such users to contribute their data in a way that makes sense to them.

It also allows them to adjust manual inputs and correct errors without cutting the data from one intersection and pasting it in another, an operation which can be error prone and, let’s face it, slightly buggy in the TM1 Perspectives cube viewer, and difficult in the TM1 Contributor front-end.

Maintainability & Agility

TM1 implementations are naturally agile and flexible. Developers with an IT background, like myself, might fight against it and try to impose strict, inflexible business requirements and a rigid change request process to protect against scope creep, but that really undermines one of TM1’s key advantages in the marketplace: agility.

Imagine a retail sales model, which has Region, Sales Rep and Distributor as data points of interest. Sales reps and other users contribute data from the field using their laptops.

In a typical TM1 design, you’d create a cube with Region, Distributor and Product as dimensions. The input form would ask the user to select elements from each of those 3 dimensions and would write the sales/inventory data to the intersection of the elements chosen.

All is good, and the managers and finance staff can browse the cube and get the insight they need.

However, imagine, after months of data has been collected, someone in head office decides they would like to also track the data by Customer Type. The data already exists in the point of sale system, as each customer is tracked by credit cards and loyalty cards they use when making a purchase.

With the typical design, you don’t have much choice but to redesign from scratch and create a new cube with the additional dimension. You might choose to keep the existing cube for backward compatibility, in which case you’d have two sources of the same data, which could lead to synchronization issues, since the original data is manually contributed by Sales Reps in the field.

It’s your basic nightmare, and if you were halfway through the implementation, you’d probably tell your customer that it’s a change in scope and that it would have to be left to phase 2.

With an item-based approach, you don’t have these issues. You can take the new data from the POS systems, import the Customer Type field via TI (while creating the new Customer Type dimension on the fly), then update your reporting cube and rules.

Yes, you still have to do some basic redesign, but there is no requirement for a complex and error-prone data migration.
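To make the point concrete, here is a small Python sketch of the idea (again, a conceptual model rather than TM1 code). The row data, the `aggregate` helper and the customer type values are hypothetical. The existing input rows are left where they are; the new field is filled in alongside them and the reporting view is rebuilt with one more dimension.

```python
from collections import defaultdict


def aggregate(rows, dimensions, measure="Amount"):
    """Roll flat rows up into cells keyed by the chosen dimensions."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


# Flat input rows already collected from the field (hypothetical data).
rows = [
    {"Region": "North", "Distributor": "Acme", "Product": "Widget", "Amount": 100.0},
    {"Region": "North", "Distributor": "Acme", "Product": "Widget", "Amount": 50.0},
]

before = aggregate(rows, ("Region", "Distributor", "Product"))

# The POS import fills in a new string measure on the existing rows...
customer_types = ["Loyalty", "Walk-in"]  # hypothetical values from the POS system
for row, customer_type in zip(rows, customer_types):
    row["Customer Type"] = customer_type

# ...and the reporting cube is rebuilt with the extra dimension.
after = aggregate(rows, ("Region", "Distributor", "Product", "Customer Type"))
```

The totals are unchanged; only the reporting side gains a dimension, and none of the contributed input had to be moved to new intersections.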

Contributor & Insight-friendly

TM1 Contributor (or “Applications” as it’s now known) and Cognos Insight are great front-end tools for data contribution. They are a little weak, however, when it comes to customizing views to be friendly for the end user. A highly dimensional input cube forces the view designer to choose between unworkably large grids and many laborious title element selectors, which make cutting and pasting data difficult.

A flat, item-based input cube is much simpler to work with, supports multi-level cut and paste, and presents itself in a more logical fashion for quick data input. String values can be typed in as well as selected from a list, then copied down as necessary.

Downsides and Gotchas

Performance

If you’re not careful, this design approach can tempt you into inefficient rules and over-feeding. Performance can suffer with large data volumes.

However, with better design and a clean rule-based approach this can be avoided. Over-feeding is not necessary and rules can be structured logically and efficiently.
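The difference between over-feeding and targeted feeding can be pictured with a simple count. The Python sketch below is only a counting illustration under assumed data. It does not reproduce how TM1 evaluates feeders, and the dimension contents and function names are hypothetical. A naive design flags every reporting cell for every populated item, while a targeted design flags only the one cell that the item's string measures point to.

```python
from itertools import product

# Hypothetical reporting dimensions.
regions = ["North", "South", "East", "West"]
products = ["Widget", "Gadget", "Gizmo"]

# Hypothetical input cube: two populated item slots and one empty slot.
items = {
    "Item 1": {"Region": "North", "Product": "Widget", "Amount": 100.0},
    "Item 2": {"Region": "South", "Product": "Gadget", "Amount": 75.0},
    "Item 3": {},
}


def overfed_cells(items):
    """Naive feeding: each populated item flags every (region, product) cell."""
    return {
        (name, r, p)
        for name, row in items.items()
        if row.get("Amount")
        for r, p in product(regions, products)
    }


def targeted_cells(items):
    """Targeted feeding: each populated item flags only the cell it maps to."""
    return {
        (name, row["Region"], row["Product"])
        for name, row in items.items()
        if row.get("Amount")
    }


overfed_count = len(overfed_cells(items))
targeted_count = len(targeted_cells(items))
```

With these assumed sizes the naive version flags 24 cells and the targeted version flags 2, and the gap widens as dimensions and item slots grow.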

As always, TM1 has its quirks, but once you understand the gotchas associated with this design approach, they are easy to avoid or work around.

I’m planning several follow-up articles that will go through these issues in detail, and how to make sure they don’t have you pulling your hair out.

Complexity

Rules in this design approach can appear more complex and be harder for another developer to understand. I have first-hand experience of handing over such a design to very capable developers and having them screw up their noses and replace my cubes with a more standard TM1 design.

I believe this is partially a cultural issue, as TM1 is taught in a particular way, and that has become accepted as “correct”. Once a developer adjusts to this kind of thinking, it’s actually very difficult to go back!

Well-formatted rules and code comments also go a long way toward alleviating this issue.

Limitations

The item-based approach imposes a natural limitation: the number of elements in the item dimension sets the maximum number of “slots” available for data input.

To avoid the situation where a user does not have enough “slots” to input their data, a developer might be tempted to include a large number of elements in their item dimension, and, if the rules and feeders are designed poorly, this could lead to poor performance.

However, a well-designed cube won’t need a lot of input slots, as the average person is not able to navigate, or even usefully perceive, thousands of rows of data!

In our retail sales example above, there may be many sales items entered, and at first glance it may appear to require a form with thousands of visible items. With a bit of thought, it is usually possible to group input tasks meaningfully so that only a useful number of items need to be shown for the current input task. For instance, the Sales Rep could enter only the data for the particular store they are visiting, as they most likely wouldn’t be entering data for several stores simultaneously.
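A quick Python sketch shows how grouping changes the number of slots needed. It assumes, for illustration only, that Store is one of the regular dimensions of the input cube, so that each store gets its own set of item slots. The entries are made-up data.

```python
from collections import Counter

# (store, product) rows entered over some period; hypothetical data.
entries = [
    ("Store A", "Widget"), ("Store A", "Gadget"), ("Store A", "Gizmo"),
    ("Store B", "Widget"), ("Store B", "Gadget"),
    ("Store C", "Widget"),
]

# Without grouping, a single item dimension must hold every row.
slots_without_grouping = len(entries)

# With Store as a regular dimension of the input cube (an assumption),
# the item dimension only needs as many slots as the busiest store.
rows_per_store = Counter(store for store, _ in entries)
slots_with_grouping = max(rows_per_store.values())
```

Here six rows in total need only three item slots once they are grouped by store, because each store reuses the same “Item 1”, “Item 2”, “Item 3” elements.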

And, either way, a more dimensional approach does not mitigate this problem!

Conclusion

With a bit of planning and thought, an item-based approach to TM1 development offers many advantages and rewards.