By Paul Hausser, Envisn, Inc.
The term Big Data is used everywhere today, but does it really apply to your Cognos Content Store? That depends on what you mean by Big Data. The consensus definition runs something like this:
Big Data is a term for data sets that are so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, disparity, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy. It’s a long list.
To this we need to add another important dimension: accessibility. Data that can't be readily accessed or made available for use in any meaningful way is just a blob.
This thing has no edges!
For many Cognos administrators the Content Store is accessible only to Cognos itself, which reads the metadata within it to serve users' needs when processing a report or query. That's what it was designed for, and it does that job very well. It simply wasn't created for the kind of access administrators need.
Yes, you can use the SDK to get some basic things from the Content Store. But tasks like analyzing security, documenting reports, or tracing dependencies require far more work to get what you need. Even then, if it's a one-off effort that can't easily be repeated, it will look like wasted time the next time the need arises. Without a way to access the Content Store routinely for multiple needs and tasks, this is beyond the reach of most administrators. Every administrator eventually comes to the same realization about their Content Store: simple needs are not so simple.
What’s inside your Content Store?
A large Content Store in an environment that has a few thousand users could easily have:
- 50,000 to 70,000 objects or more: reports, queries, and other public and personal content.
- Hundreds of individual properties on each of these objects, covering jobs, schedules, dependencies, distributions, status, and more.
- Security and all its permutations, including inheritance, denials, groups, roles, etc.
- Multiple data sources that can be used in multiple ways.
- Relationships and dependencies between objects, properties and everything else in the Content Store.
- Saved output from users, which typically accounts for 90 percent of the physical size of a mature Content Store.
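To put those figures in perspective, here is a rough back-of-the-envelope sketch. The object, property, and dependency counts are illustrative assumptions drawn from the ranges above, not measurements of any actual Content Store:

```python
# Illustrative estimate of the metadata volume in a large Content Store.
# All figures below are assumptions, not measured values.
objects = 60_000          # reports, queries, etc. (midpoint of the 50-70k range)
avg_properties = 200      # "hundreds" of properties per object
avg_dependencies = 5      # assumed references per object (jobs, sources, ...)

property_rows = objects * avg_properties
dependency_edges = objects * avg_dependencies

print(f"~{property_rows:,} property values")      # ~12,000,000 property values
print(f"~{dependency_edges:,} dependency edges")  # ~300,000 dependency edges
```

Even with conservative assumptions, the metadata alone runs into the millions of individual values, which is why ad hoc, one-off extraction quickly stops scaling.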
Deal with it
This is a lot of stuff, and you're looking at some very large numbers. The first thing that needs to be done is to find a way to organize it so that it can be used. We faced this issue when we created our NetVisn product and found we needed a very large database with some unique properties for handling data that is primarily hierarchical. This, along with some good architectural design, enabled us to bring order to this mass of data.
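To make the idea of organizing hierarchical metadata concrete, here is a minimal sketch. The object model, search paths, and `uses` property are hypothetical illustrations, not NetVisn's actual schema or the Content Store's internal format:

```python
# Minimal sketch: organizing hierarchical metadata so that dependency
# questions ("what depends on this query?") become simple lookups.
# The data below is invented for illustration only.
from collections import defaultdict

# Each object is keyed by its path; the hierarchy is implied by the path.
objects = {
    "/content": {"type": "folder"},
    "/content/Sales": {"type": "folder"},
    "/content/Sales/Revenue Report": {"type": "report",
                                      "uses": ["/content/Sales/Revenue Query"]},
    "/content/Sales/Revenue Query": {"type": "query", "uses": []},
}

# Build a reverse index: for each object, which objects depend on it?
used_by = defaultdict(list)
for path, props in objects.items():
    for dep in props.get("uses", []):
        used_by[dep].append(path)

print(used_by["/content/Sales/Revenue Query"])
# ['/content/Sales/Revenue Report']
```

Once the forward references are inverted into an index like this, impact analysis (what breaks if this query changes?) is a direct lookup instead of a full scan.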
Do you really need all of it?
Yes. Just because you don't know now what you may need doesn't mean you won't need it later on, so find a way to store it for later use. Storage today is cheap.
Plus, IBM Cognos continues to expand the breadth of its products and what they can do. TM1, dynamic models and cubes, Workspaces, and more all add to the list of what needs to be managed within the environment. It really is Big Data, and it just keeps getting bigger and more complex.
The good news is that it can be conquered with the right structure and tools. A lot of good insight into how everything in the Content Store relates really helps too. The payoff is huge.