
Story of Metadata

Metadata is data about data. Metadata defines the architecture of the data that is being stored. Successfully executing your Information Agenda begins with Getting Your Arms Around What You Have Today. This is one of the most difficult challenges that companies continue to struggle with. It means understanding and confirming what data you have, learning how to use and structure that data to optimize your business, and working out how you can implement a repeatable process to manage this information over its lifecycle and leverage it across your enterprise. It means creating a blueprint that accurately represents your information.
For successful governance of the data that is being generated and is filling up the warehouse, a thorough understanding of the metadata plays an important role. There are many aspects to the governance of the data warehouse, including:
1. Ensure the data is trusted and strategic.
2. Manage the metadata.


As cross-department information demands continue to grow, the need to address this key challenge becomes even more important. Every new IT project brings the daunting task of locating the information, validating its content, reconciling definitions between sources and ensuring proper usage. The problems are further exacerbated by the increasing number of new IT projects that require information across siloed sources, with less time to respond. CXOs today still suffer from not being able to get the right information at the right time. Just as information is often created in the context of specific projects and applications, so are the tools that understand and control the information.
Companies need a unified way to inventory, understand, define and optimize their information separate from applications and technologies. IBM’s unique set of InfoSphere Foundation Tools does exactly that. Only IBM has the complete set of foundational tools for your Information Agenda to help you do this across your existing data and content.
The InfoSphere Foundation Toolkit was created to work in any environment and can be deployed at any point, with multiple configuration options on any given project, for complete flexibility to match your organization’s needs. This unique industry-leading offering allows companies to understand disparate data spread across their heterogeneous systems, govern it as business information over time, and design trusted information structures for business optimization. Foundation Tools combine discovery & understanding, data modeling & mapping, creation of business rule specifications, data stewardship, business vocabulary management, lineage of information and metadata management, all with a shared repository. Foundation Tools work with any data integration, business intelligence, or data warehouse tools, or in conjunction with the comprehensive set of IBM Foundation engines for complete end-to-end data integration processing. The Foundation Tools are a perfect entry point for your Information Agenda.




How to build confidence on data

We have sourced the data from various tools into a warehouse. From the data warehouse, business analysts create data models that represent the data in a form suitable for making decisions.

The analysts' analysis is based on the assumption that the sourced data is clean and free of duplicates.

What is clean data?

The picture shows unclean data that needs to be cleaned.

Unclean data
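
As a rough illustration of what unclean data can look like, here is a minimal Python sketch. The customer rows, the field names (`name`, `email`) and the matching rule (trimmed, lower-cased name and email) are all assumptions made for this example. They are not taken from any particular tool, and real cleansing rules are usually richer.

```python
import re

def normalize(record):
    """Build a comparison key: collapse whitespace and ignore letter case."""
    name = re.sub(r"\s+", " ", record["name"]).strip().lower()
    email = record["email"].strip().lower()
    return (name, email)

def deduplicate(records):
    """Keep the first record seen for each normalized key."""
    seen = set()
    clean = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            clean.append(record)
    return clean

# Hypothetical rows, as they might arrive from different source systems.
raw = [
    {"name": "Asha Rao", "email": "asha.rao@example.com"},
    {"name": "  asha   rao ", "email": "Asha.Rao@Example.com"},
    {"name": "John Smith", "email": "john.smith@example.com"},
]

clean = deduplicate(raw)
print(len(raw), "rows in,", len(clean), "rows out")
```

The second row differs from the first only in spacing and letter case, so it is treated as a duplicate under this simple rule.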


We will discuss how to clean the data in the next post.


Here are some trainings coming up:

2014 Global Cost of Data Breach Study
How much can a data breach cost? And are you doing enough to prevent it from happening to you? The 2014 Global Cost of Data Breach Study from Ponemon Institute, sponsored by IBM, provides benchmark data based on actual experiences of more than 300 organizations. Read this critical research to know more:

How to Calculate Data Confidence
Register for this webinar to learn about compelling new research that identifies the critical criteria to measure and score confidence levels in customer data to make better business decisions:



The 2 scenarios make the case for why data is required. But just providing data as it is will not be sufficient. It should be presented in a way that lets the end user draw conclusions. Data presented this way is categorized as information, and decisions are made on the basis of it.
Various wiki sites give definitions of data and information.

Information, when it matures, leads to knowledge. This gives the end user the confidence on which a firm judgement is made.
We have a few tools that help users arrive at decisions with confidence.

Information, Information and Information

Welcome all,

In the current world, information is the buzzword. The more one is informed, the better positioned one is to make decisions. But why do we need information? Where do we need to use it?

Let's look at a couple of scenarios where we have to use information.
1. I need to buy a car. I research on the web against the criteria I am interested in, i.e.
a. Budget available (down payment, EMI – allowable funds)
b. Type of car
c. Number of miles travelled per day
d. Number of people driving and travelling
e. Fuel efficiency
f. Specifications
g. Alternatives available
h. Security features

2. A retailer wants to promote sales and wants to identify the commodities that boost sales
What are all the commodities/items that the retailer sells
What are the exclusive commodities in terms of range, competitive pricing, stock
Margin of profit
Fast-moving items and period of sales
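
The retailer's questions can be sketched in a few lines of Python. This is only an illustration: the sales rows, item names, prices and the `summarize` helper are hypothetical, invented here to show how raw rows (data) become margin, fast-moving items and peak sales period (information).

```python
from collections import defaultdict

# Hypothetical sales rows: (item, month, units_sold, unit_cost, unit_price)
sales = [
    ("rice", "2014-01", 120, 40.0, 50.0),
    ("rice", "2014-02", 150, 40.0, 50.0),
    ("umbrella", "2014-01", 5, 100.0, 180.0),
    ("umbrella", "2014-02", 8, 100.0, 180.0),
]

def summarize(rows):
    """Aggregate units, revenue and profit per item, then derive
    the profit margin and the month with the most units sold."""
    totals = {}
    for item, month, units, cost, price in rows:
        t = totals.setdefault(
            item,
            {"units": 0, "revenue": 0.0, "profit": 0.0,
             "by_month": defaultdict(int)},
        )
        t["units"] += units
        t["revenue"] += units * price
        t["profit"] += units * (price - cost)
        t["by_month"][month] += units
    for t in totals.values():
        t["margin"] = t["profit"] / t["revenue"] if t["revenue"] else 0.0
        t["peak_month"] = max(t["by_month"], key=t["by_month"].get)
    return totals

summary = summarize(sales)

# Items ordered from fastest to slowest moving (by units sold).
fast_movers = sorted(summary, key=lambda i: summary[i]["units"], reverse=True)

for item in fast_movers:
    s = summary[item]
    print(item, s["units"], round(s["margin"], 2), s["peak_month"])
```

In this made-up data the fast-moving item has the lower margin, which is the kind of trade-off the retailer would need to see before deciding what to promote.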

From the above 2 scenarios, it is evident that information does make sense and is in fact required to make decisions. The decisions will not necessarily be correct, but they will be informed judgements.

But is it sufficient just to be informed?
Should the information be qualified?
How qualified is the information?
How confident are you in the information?
Who are the sources of the information?

Again, these questions lead only to the information sources and the confidence in the information.

But how is the information generated?
What is information and data?
Are any standards followed in the data acquisition?
What standards are followed, and are there any certifications for the data acquisition?
Is the data clean and standardized?

Let's discuss these things in the following posts.