William: August 2010

Sunday, August 22, 2010

OLAP and data mining

Many senior managers and executives say they like business intelligence systems/EIS. But the majority of data warehouse users are still using Excel, reporting and data analysis tools, or their own customized applications to draw data from warehouses and transform it into business reports and charts. In general, these approaches work fine for static analysis of small amounts of data.

Multidimensional databases and reporting systems usually generate attractive sales presentations and demonstrations. Sales information can be viewed along various dimensions such as region, product type, time and salesperson. OLAP enables users to explore enterprise data from different perspectives.
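To make the idea of viewing sales along different dimensions concrete, here is a minimal sketch in plain Python. The sales rows, the dimension names and the `roll_up` function are all made up for illustration; a real OLAP server would precompute and store such aggregates rather than scan the rows each time.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product_type, quarter, salesperson, amount)
SALES = [
    ("North", "Laptop", "2010-Q1", "Ann", 1200.0),
    ("North", "Phone", "2010-Q1", "Bob", 300.0),
    ("South", "Laptop", "2010-Q1", "Cy", 800.0),
    ("South", "Laptop", "2010-Q2", "Cy", 500.0),
    ("North", "Phone", "2010-Q2", "Ann", 200.0),
]

# Names of the dimension columns, in the same order as in each row.
DIMENSIONS = ("region", "product_type", "quarter", "salesperson")


def roll_up(facts, dims):
    """Sum the sales amount grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the amount is the last column
    return dict(totals)


# The same facts seen from two perspectives.
by_region = roll_up(SALES, ["region"])
by_region_quarter = roll_up(SALES, ["region", "quarter"])
```

Grouping by `region` alone gives the coarse view, and adding `quarter` breaks each region's total down further, which is the kind of change of perspective the paragraph above describes.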

OLAP servers and desktop tools support high-speed analysis of data. Many vendors provide OLAP tools, including Microsoft, SAP, Oracle and Cognos. Data manipulation in multidimensional databases can be very fast because they store the data in denormalized structures optimized for speed. But multidimensional databases can take a huge amount of time to update. Software developers are attempting to deal with the update issue through the use of partitioning.

Data is a valuable asset to organizations because it enables decision making. OLAP and data mining, when incorporated into data warehousing products, help decision makers analyze historical data and extract hidden patterns from it. OLAP provides drill-down and roll-up data analysis. Data mining tools enable supervised and unsupervised learning. Multidimensional analysis requires users to interact with the database to find the information they are looking for. A data mining tool, by contrast, does not require users to specify in advance the problem to be examined. For example, a data mining tool applied to a supermarket database can find out which products customers usually buy together. The supermarket can then provide special offers on these products or put them on adjacent shelves to generate more sales.
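The supermarket example can be sketched as simple pair counting. This is only a rough illustration, not how a commercial data mining tool works: the baskets are invented, and `frequent_pairs` and its `min_support` threshold are names I made up. Real tools use more scalable algorithms such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of products per transaction.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"bread", "butter", "chips"},
    {"milk", "chips"},
]


def frequent_pairs(baskets, min_support):
    """Return product pairs whose share of baskets is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


# Pairs that appear together in at least half of the baskets.
pairs = frequent_pairs(BASKETS, min_support=0.5)
```

With this toy data, only bread and butter appear together often enough to pass the threshold, so they would be the candidates for a joint special offer or adjacent shelves.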

Sunday, August 8, 2010

week 3 some thoughts after studying BI development cases

We examined a few case studies of the development of business intelligence systems in week 3. Unlike subjects in science, which usually have refined theories, the case studies suggest that there is no ultimate approach to developing a BI system. Different firms use different methods to develop BI, including in-house development, outsourcing, adaptive development and many others which are not popular at the moment. The rationale behind companies’ choices is very complex, and some companies don’t even have good reasons for choosing one approach rather than another. As a student, I have been looking at all sorts of BI system development, hoping that I can extract some principles or rules from real cases to guide future practice. There are many similarities among the technical details, but when I take social, cultural, people and external environment factors into consideration, it is extremely difficult to figure out what leads to successful BI development. The social paradigm suggests that we should look at a problem from different perspectives and try to understand each of them to improve our overall understanding of the situation. But the human brain can process only a limited amount of information at any given time. When I look at a case study from different angles, I don’t feel my understanding of the problem has significantly improved. Sometimes it is even counterproductive, because I feel the case is too complicated. We are living in an information age where information is at our fingertips. But how should we process and organize the information so that the human brain can handle it? In the lecture, we looked at the case studies one by one. It seems that we have learned about successes and failures from each of them. But I want to know the connections between them so that we can build knowledge to help us deal with a wider range of BI system development.
In other words, knowledge specific to one company’s BI system does not have great value for other BI systems. I hope I can gain more insight into the connections among a variety of BI systems by the end of the semester.

Sunday, August 1, 2010

week 2 data mining and BI application

I changed one of my unit enrollments from distributed databases to knowledge discovery and data mining this week. One reason is that the distributed database unit has not changed its prescribed textbook for many years. It’s still using a book which was published in 1999. Technology keeps changing all the time, and I don’t believe distributed database technology has not evolved over the last few years. Another important reason is that data mining has a close relationship with business intelligence applications. Studying these two units can give me more insight into data analysis in a business context.

Business intelligence is information about a company's past performance that is used to help predict the company's future performance. Data mining allows users to sift through the enormous amount of information available in data warehouses, and it is from this sifting process that business intelligence gems may be found. Data mining is intuitive, allowing for increased insight beyond data warehousing. An implementation of data mining in an organization will serve as a guide to uncovering unknown trends in historical data. It will also allow for statistical predictions, groupings and classifications of data. Data mining software allows users to analyze large databases to solve business decision-making problems. Data mining tools can help predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. They can also answer business questions that traditionally were too time-consuming to resolve. As the semester progresses, it will be interesting to see how data mining technology is built into BI applications and how these two units support one another.

week 1 introduction to BI application

Business Intelligence application is a fascinating unit. Rob gave a broad description of BI in the week one lecture, but there were things I don’t agree with. Rob said some BI applications were powerful enough to be built directly on top of existing systems. It would be fantastic if we could plug and play BI just like we use a USB drive, but there are issues around BI which need to be addressed if we want to reach its full potential. Many organizations today have dozens of different information systems, and each of those systems may be built with different languages and data formats. Those systems usually hold multiple versions of the same fact and don’t talk to each other. In other words, data integrity and consistency are not guaranteed. A BI application draws data from a wide range of information systems to support decision making. The garbage in, garbage out principle suggests that the quality of information produced by computer applications is only as good as the data being used. If the input data is poor, no matter how good a BI application is, it will produce wrong information which has negative value to decision makers. So before a company implements BI, it must cleanse the data and integrate its information systems. One way to do this is to implement an ERP system in which a centralized database system and web services are used to extract and manage data from multiple systems.
I didn’t ask Rob why BI can even be built directly on top of existing systems. But I guess it’s probably because many vendors lie about what their BI applications can do in order to generate more sales.
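The "multiple versions of the same fact" problem mentioned above can be shown with a small check. This is only a sketch under my own assumptions: the three source systems, the customer records and the `find_conflicts` function are hypothetical, and a real data cleansing step would deal with far messier data.

```python
from collections import defaultdict

# Hypothetical records pulled from three systems: (source_system, customer_id, email)
RECORDS = [
    ("crm", "C001", "ann@example.com"),
    ("billing", "C001", "Ann@Example.com "),
    ("crm", "C002", "bob@example.com"),
    ("billing", "C002", "robert@example.com"),
    ("support", "C003", "cy@example.com"),
]


def find_conflicts(records):
    """Return the keys for which the source systems disagree on the value."""
    values_by_key = defaultdict(dict)
    for system, key, value in records:
        # Light normalization so case and stray spaces do not count as conflicts.
        values_by_key[key][system] = value.strip().lower()
    return {
        key: by_system
        for key, by_system in values_by_key.items()
        if len(set(by_system.values())) > 1
    }


# Customers whose email differs between systems and needs cleansing.
conflicts = find_conflicts(RECORDS)
```

Here customer C001 is consistent once the formatting is normalized, but C002 has two different emails, so a BI report built directly on both systems would have no single correct answer for that customer.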

William