What is Data Warehousing?
A data warehouse is the main
repository of an organization's historical data, it is corporate memory. It
contains the raw material for management's decision support system. The
critical factor leading to the use of a data warehouse is that a data analyst
can perform complex queries and analysis, such as data mining, on the
information without slowing down the operational systems. Data warehousing
collection of data designed to support management decision making. Data warehouses
contain a wide variety of data that present a coherent picture of business
conditions at a single point in time. It is a repository of integrated
information, available for queries and analysis.
What is Business Intelligence (BI)?
Business Intelligence (BI) refers to technologies,
applications and practices for the collection, integration, analysis, and
presentation of business information and sometimes to the information itself.
The purpose of business intelligence is to support better business decision
making. Thus, BI is also described as a decision support system (DSS).
BI systems provide historical,
current, and predictive views of business operations, most often using data
that has been gathered into a data warehouse or a data mart and occasionally
working from operational data.
What is Dimension table?
Dimensional table contains textual
attributes of measurements stored in the facts tables. Dimensional table is a
collection of hierarchies, categories and logic which can be used for user to
traverse in hierarchy nodes.
What is Dimensional Modeling?
Dimensional data model concept
involves two types of tables and it is different from the 3rd normal form. This
concept uses Facts table which contains the measurements of the business and Dimension table which contains the context
(dimension of calculation) of the measurements.
What is Fact table?
Fact table contains measurements of
business process. Fact table contains the foreign keys for the dimension
tables. Example, if you are business process is "paper production",
"average production of paper by one machine" or "weekly
production of paper" will be considered as measurement of business
process.
What are fundamental stages of Data
Warehousing?
There are four different
fundamental stages of Data Warehousing.
Offline Operational Databases:
Data warehouses in this initial
stage are developed by simply copying the database of an operational system to
an off‐line server where
the processing load of reporting does not impact on the operational system's
performance.
Offline Data Warehouse:
Data warehouses in this stage of
evolution are updated on a regular time cycle (usually daily, weekly or
monthly) from the operational systems and the data is stored in an integrated
reporting‐oriented data structure
Real Time Data Warehouse:
Data warehouses at this stage are
updated on a transaction or event basis, every time an operational system
performs a transaction (e.g. an order or a delivery or a booking etc.)
Integrated Data Warehouse:
Data warehouses at this stage are
used to generate activity or transactions that are passed back into the
operational systems for use in the daily activity of the organization.
What are the Different methods of
loading Dimension tables?
There are two different ways to
load data in dimension tables.
Conventional (Slow):
All the constraints and keys are
validated against the data before, it is loaded, this way data integrity is
maintained.
Direct (Fast):
All the constraints and keys are
disabled before the data is loaded. Once data is loaded, it is validated
against all the constraints and keys. If data is found invalid or dirty it is
not included in index and all future processes are skipped on this data.
Describes the foreign key columns
in fact table and dimension table?
Foreign keys of dimension tables
are primary keys of entity tables.
Foreign keys of facts tables are
primary keys of Dimension tables.
What is Data Mining?
Data Mining is the process of
analyzing data from different perspectives and summarizing it into useful
information.
What is the difference between view
and materialized view?
A view takes the output of a query and makes it appear like a virtual
table and it can be used in place of tables.
A materialized view provides indirect access to table data by storing the results of a
query in a separate schema object.
What is OLTP?
OLTP is abbreviation of On‐Line Transaction Processing. This
system is an application that modifies data the instance it receives and has a
large number of concurrent users.
What is OLAP?
OLAP is abbreviation of Online
Analytical Processing. This system is an application that collects, manages,
processes and presents multidimensional data for analysis and management
purposes.
What is the difference between OLTP
and OLAP?
Data Source
OLTP: Operational data is from
original data source of the data
OLAP: Consolidation data is from
various sources.
Process Goal
OLTP: Snapshot of business
processes which does fundamental business tasks
OLAP: Multi‐dimensional views of business
activities of planning and decision making
Queries and Process Scripts
OLTP: Simple quick running queries
ran by users.
OLAP: Complex long running queries
by system to update the aggregated data.
Database Design
OLTP: Normalized small database. Speed
will be not an issue due to smaller database and normalization will not degrade
performance. This adopts entity relationship(ER) model and an application‐oriented database design.
OLAP: De‐normalized large database. Speed is
issue due to larger database and de‐normalizing will improve performance as there will be lesser
tables to scan while performing tasks. This adopts star, snowflake or fact
constellation mode of subject‐oriented database
design.
Back up and System Administration
OLTP: Regular Database backup and
system administration can do the job.
OLAP: Reloading the OLTP data is
good considered as good backup option.
What are normalization forms?
There are different types of
normalization forms like,
1NF: Eliminate Repeating Groups
Make a separate table for each set
of related attributes, and give each table a primary key. Each field contains
at most one value from its attribute domain.
2NF: Eliminate Redundant Data
If an attribute depends on only
part of a multi‐valued key,
remove it to a separate table.
3NF: Eliminate Columns Not
Dependent On Key
If attributes do not contribute to
a description of the key, remove them to a separate table. All attributes must
be directly dependent on the primary key
BCNF: Boyce‐Codd Normal Form
If there are non‐trivial dependencies between
candidate key attributes, separate them out into distinct tables.
4NF: Isolate Independent Multiple
Relationships
No table may contain two or more
1:n or n:m relationships that are not directly related.
5NF: Isolate Semantically Related
Multiple Relationships
There may be practical constrains
on information that justify separating logically related many‐to‐many relationships.
ONF: Optimal Normal Form
A model limited to only simple
(elemental) facts, as expressed in Object Role Model notation.
DKNF: Domain‐Key Normal Form A model free from all modification
anomalies.
Remember, these normalization
guidelines are cumulative. For a database to be in
3NF, it must first fulfill all the
criteria of a 2NF and 1NF database.
What is ODS?
ODS is abbreviation of Operational
Data Store. A database structure that is a repository for near real‐time operational data rather than
long term trend data. The
ODS may further become the
enterprise shared operational database, allowing operational systems that are
being re‐engineered to use
the ODS as there operation databases.
What is ER Diagram?
Entity Relationship Diagrams are a
major data modeling tool and will help organize
the data in your project into
entities and define the relationships between the
entities. This process has proved
to enable the analyst to produce a good database
structure so that the data can be
stored and retrieved in a most efficient manner.
An entity‐relationship (ER) diagram is a
specialized graphic that illustrates the
interrelationships between entities
in a database. A type of diagram used in data
modeling for relational data bases.
These diagrams show the structure of each table
and the links between tables.
What is ETL?
ETL is abbreviation of extract,
transform, and load. ETL is software that enables
businesses to consolidate their
disparate data while moving it from place to place,
and it doesn't really matter that
that data is in different forms or formats. The data
can come from any source.ETL is
powerful enough to handle such data disparities.
First, the extract function reads
data from a specified source database and extracts a desired subset of data.
Next, the transform function works with the acquired data using rules or lookup
tables, or creating combinations with other data ‐ to convert it to the desired state. Finally, the load function is
used to write the resulting data to a target database.
What is VLDB?
VLDB is abbreviation of Very Large
Database. A one terabyte database would
Normally be considered to be a
VLDB. Typically, these are decision support systems
Or transaction processing
applications serving large numbers of users.
Is OLTP database is design optimal
for Data Warehouse?
No. OLTP database tables are
normalized and it will add additional time to queries to return results.
Additionally OLTP database is smaller and it does not contain longerperiod
(many years) data, which needs to be analyzed. A OLTP system is basically ER model
and not Dimensional Model. If a complex query is executed on a OLTP system, it
may cause a heavy overhead on the OLTP server that will affect the normal
business processes.
If de‐normalized is improves data
warehouse processes, why fact table is in normal form?
Foreign keys of facts tables are
primary keys of Dimension tables. It is clear that fact table contains columns
which are primary key to other table that itself make normal form table.
What are lookup tables?
A lookup table is the table placed on
the target table based upon the primary key of
The target, it just updates the
table by allowing only modified (new or updated)
Records based on the lookup
condition.
What are Aggregate tables?
Aggregate table contains the
summary of existing warehouse data which is grouped
to certain levels of dimensions. It
is always easy to retrieve data from aggregated
tables than visiting original table
which has million records. Aggregate tables reduce the load in the database
server and increase the performance of the query and can retrieve the result
quickly.
What is real time data‐warehousing?
Data warehousing captures business
activity data. Real‐time data
warehousing
captures business activity data as
it occurs. As soon as the business activity is
Complete and there is data about
it, the completed activity data flows into the data
Warehouse and becomes available
instantly.
Factless fact table?
What
is a Factless Fact table?
A
Fact table is a table which consists of measurements, metrics or facts of a
business process. A
Factless
fact table is a fact table
that doesn't contain any facts. They contain information to capture
Information
but not for calculation
Example:
Tracking a
student or employee attendance.
What is factless facts table?
A fact table which does not contain
numeric fact columns it is called factless facts
Table.
What are conformed dimensions?
Conformed dimensions mean the exact
same thing with every possible fact table to
which they are joined. They are
common to the cubes.
A Dimension which exists in more than
one fact table is known as conformed dimensions.
What is conformed fact?
Conformed dimensions are the
dimensions which can be used across multiple Data
Marts in combination with multiple
facts tables accordingly.
How do you load the time dimension?
Time dimensions are usually loaded
by a program that loops through all possible
Dates that may appear in the data.
100 years may be represented in a time
Dimension, with one row per day.
What is a level of Granularity of a
fact table?
Level of granularity means level of
detail that you put into the fact table in a data
Warehouse. Level of granularity
would mean what detail are you willing to put for
Each transactional fact.
What are non‐additive facts?
Non‐additive facts are facts that cannot be summed up for any of the
dimensions
Present in the fact table. However
they are not considered as useless. If there are
Changes in dimensions the same
facts can be useful.
What are slowly changing dimensions
(SCD)?
SCD is abbreviation of slowly
changing dimensions. SCD applies to cases where the
Attribute for a record varies over
time. There are three different types of SCD.
1) SCD1: The new record replaces
the original record. Only one record exist in
Database ‐ current data.
2) SCD2: A new record is added into
the customer dimension table. Two records
Exist in database ‐ current data and previous history data.
3) SCD3: The original data is
modified to include new data. One record exist in
Database ‐ new information are attached with old information in same row.
What is hybrid slowly changing dimension?
Hybrid SCDs are combination of both
SCD 1 and SCD 2. It may happen that in a table, some columns are important and
we need to track changes for them i.e. capture the historical data for them
whereas in some columns even if the data changes, we don't care.
What is BUS Schema?
BUS Schema is composed of a master
suite of confirmed dimension and standardized definition if facts.
What is a Star Schema?
Star schema is a type of organizing
the tables such that we can retrieve the result from the database quickly in
the warehouse environment.
What Snow Flake Schema?
Snowflake Schema, each dimension
has a primary dimension table, to which one or more additional dimensions can
join. The primary dimension table is the only table that can join to the fact
table.
Differences between star and
snowflake schema?
Star schema: A single fact table with N number of Dimension, all dimensions
will be linked directly with a fact table. This schema is denormalized and results in simple
join and less complex query as well as faster results.
Snow schema: Any dimensions with extended dimensions are known as Snowflake
schema, dimensions maybe interlinked or may have one to many relationships with
other tables. This schema is normalized and results in complex join and very
complex query as well as slower results.
What is Difference between ER
Modeling and Dimensional Modeling?
ER modeling is used for normalizing
the OLTP database design. Dimensional
Modeling is used for de‐normalizing the ROLAP/MOLAP design.
What is degenerate dimension table?
If a table contains the values,
which is neither dimension nor measures is called
degenerate dimensions.
Why is Data Modeling Important?
Data modeling is probably the most
labor intensive and time consuming part of the
development process. The goal of
the data model is to make sure that the all data
objects required by the database
are completely and accurately represented.
Because the data model uses easily
understood notations and natural language, it
can be reviewed and verified as
correct by the end‐users.
In computer science, data modeling
is the process of creating a data model by
applying a data model theory to
create a data model instance. A data model theory
is a formal data model description.
When data modeling, we are structuring and
organizing data. These data
structures are then typically implemented in a database management system. In
addition to defining and organizing the data, data modelling will impose
(implicitly or explicitly) constraints or limitations on the data placed within
the structure.
Managing large quantities of
structured and unstructured data is a primary function
of information systems. Data models
describe structured data for storage in data
management systems such as
relational databases. They typically do not describe
unstructured data, such as word
processing documents, email messages, pictures,
digital audio, and video.
What is surrogate key?
Surrogate key is a substitution for
the natural primary key. It is just a unique
Identifier or number for each row
that can be used for the primary key to the table.
The only requirement for a
surrogate primary key is that it is unique for each row in
The table. It is useful because the
natural primary key can change and this makes
updates more difficult. Surrogated
keys are always integer or numeric.
What is junk dimension?
A number of very small dimensions
might be lumped together to form a single
Dimension, a junk dimension ‐ the attributes are not closely related. Grouping of
Random flags and text attributes in
a dimension and moving them to a separate sub dimension is known as junk
dimension.
What is Data Mart?
A data mart (DM) is a specialized version
of a data warehouse (DW). Like data
Warehouses, data marts contain a
snapshot of operational data that helps business
people to strategize based on
analyses of past trends and experiences. The key
Difference is that the creation of
a data mart is predicated on a specific, predefined
Need for a certain grouping and
configuration of select data. A data mart
Configuration emphasizes easy
access to relevant information (Reference : Wiki).
Data Marts are designed to help
manager make strategic decisions about their
Business.
What is the difference between OLAP
and data warehouse?
Datawarehouse is the place where
the data is stored for analyzing where as OLAP is
the process of analyzing the data,
managing aggregations, partitioning information
into cubes for in depth
visualization.
What is a Cube and Linked Cube with
reference to data warehouse?
Cubes are logical representation of
multidimensional data. The edge of the cube
contains dimension members and the
body of the cube contains data values. The
linking in cube ensures that the
data in the cubes remain consistent.
What is snapshot with reference to
data warehouse?
You can disconnect the report from
the catalog to which it is attached by saving the
report with a snapshot of the data.
What is active data warehousing?
An active data warehouse provides
information that enables decision‐makers within
an organization to manage customer
relationships nimbly, efficiently and
proactively.
What is the difference between data
warehousing and business intelligence?
Data warehousing deals with all
aspects of managing the development,
Implementation and operation of a
data warehouse or data mart including meta
data management, data acquisition,
data cleansing, data transformation, storage
management, data distribution, data
archiving, operational reporting, analytical
reporting, security management,
backup/recovery planning, etc. Business
intelligence, on the other hand, is
a set of software tools that enable an organization to analyze measurable
aspects of their business such as sales performance, profitability, operational
efficiency, effectiveness of marketing campaigns, market penetration among
certain customer groups, cost trends, anomalies and exceptions, etc. Typically,
the term “business intelligence” is used to encompass OLAP, data visualization,
data mining and query/reporting tools. (Reference: Les Barbusinski)
Explain paradigm of Bill Inmon and
Ralph Kimball.
Bill Inmon's paradigm: Data warehouse is one part of the overall business
intelligence system. An enterprise
has one data warehouse, and data marts source
their information from the data
warehouse. In the data warehouse, information is
stored in 3rd normal form.
Ralph Kimball's paradigm: Data warehouse is the conglomerate of all data marts
within the enterprise. Information is always stored in
the dimensional model.
Maximum of Connection pools
The maximum number of connections
allowed for this connection pool. The default is 10. This value should be
determined by the database make and model and the configuration of the hardware
box on which the database runs as well as the number of concurrent users who
require access.
NOTE: For deployments with Intelligence Dashboard pages, consider
estimating this value at 10% to 20% of the number of simultaneous users
multiplied by the number of requests on a dashboard. This number may be
adjusted based on usage. The total number of all connections in the repository
should be less than 800. To estimate the maximum connections needed for a
connection pool dedicated to an initialization block, you might use the number
of users concurrently logged on during initialization block execution.
Define pipeline. Did you use it
in your projects?
Yes, pipelines are the stages in a particular transaction. Assessment, finance etc.
Yes, pipelines are the stages in a particular transaction. Assessment, finance etc.
How do you create filter on
repository?
Where condition on content tab.
Where condition on content tab.
How do you work in a multi user
environment? What are the steps?
Create a shared directory on the
network for Multi-user Development (MUD).
Open the RPD to use in MUD. From Tools->Options, setup the MUD directory to point to the above directory.
Define projects within the RPD to allow multiple users to develop within their subject area or Facts.
Save and move the RPD to the shared directory setup in point 1.
When users work in the MUD mode, they open the admin tool and start with
MUD ->Checkout to check out the project they need to work on (not use the File open as you would usually do).
After completely the development, user checking the changes back to the network and merge the changes.
Open the RPD to use in MUD. From Tools->Options, setup the MUD directory to point to the above directory.
Define projects within the RPD to allow multiple users to develop within their subject area or Facts.
Save and move the RPD to the shared directory setup in point 1.
When users work in the MUD mode, they open the admin tool and start with
MUD ->Checkout to check out the project they need to work on (not use the File open as you would usually do).
After completely the development, user checking the changes back to the network and merge the changes.
Where are passwords for userid?
LDAP, external table authentication stored respectively?
Passwords for userid are in OBIEE server repository LDAP authentication in LDAP server external database in a table in external database.
Passwords for userid are in OBIEE server repository LDAP authentication in LDAP server external database in a table in external database.
Can you bypass OBIEE server
security? if so how?
Yes you can by-pass by setting authententication type in NQSCONFIG file in the security section as: authentication_type=bypass_nqs.instanceconfig.xml and nqsconfig.ini are the 2 places.
Yes you can by-pass by setting authententication type in NQSCONFIG file in the security section as: authentication_type=bypass_nqs.instanceconfig.xml and nqsconfig.ini are the 2 places.
Where can you add new groups and
set permissions?
You can add groups by going to manage>security>add new groups> You can give permissions to a group for query limitation and filter conditions.
You can add groups by going to manage>security>add new groups> You can give permissions to a group for query limitation and filter conditions.
What are the things you can do in
the BMM layer?
Aggregation navigation, level base metrics, time series wizard, create new logical column, complex join.
Aggregation navigation, level base metrics, time series wizard, create new logical column, complex join.
What is ragged hierarchy and how
do you manage it?
Ragged Hierarchy is one of the different kinds of hierarchy.
A hierarchy in which each level has a consistent meaning, but the branches have inconsistent depths because at least one member attributes in a branch level is unpopulated. A ragged hierarchy can represent a geographic hierarchy in which the meaning of each level such as city or country is used consistently, but the depth of the hierarchy varies.
For example, a geographic hierarchy that has Continent, Country, Province/State, and City levels defined. One branch has North America as the Continent, United States as the Country, California as the Province or State, and San Francisco as the City. However, the hierarchy becomes ragged when one member does not have an entry at all of the levels. For example, another branch has Europe as the Continent, Greece as the Country, and Athens as the City, but has no entry for the Province or State level because this level is not applicable to Greece for the business model in this example. In this example, the Greece and United States branches descend to different depths, creating a ragged hierarchy.
Ragged Hierarchy is one of the different kinds of hierarchy.
A hierarchy in which each level has a consistent meaning, but the branches have inconsistent depths because at least one member attributes in a branch level is unpopulated. A ragged hierarchy can represent a geographic hierarchy in which the meaning of each level such as city or country is used consistently, but the depth of the hierarchy varies.
For example, a geographic hierarchy that has Continent, Country, Province/State, and City levels defined. One branch has North America as the Continent, United States as the Country, California as the Province or State, and San Francisco as the City. However, the hierarchy becomes ragged when one member does not have an entry at all of the levels. For example, another branch has Europe as the Continent, Greece as the Country, and Athens as the City, but has no entry for the Province or State level because this level is not applicable to Greece for the business model in this example. In this example, the Greece and United States branches descend to different depths, creating a ragged hierarchy.
What is the difference between
Single Logical Table Source and Multiple Logical Table Sources?
If a logical table in BMM layer has only one Table as the source table then it is Single LTS.
If the logical table in BMM layer has more than one table as the sources to it then it is called Multiple LTS.
Ex: Usually Fact table has Multiple LTS’, for which sources will be coming from different Physical tables.
If a logical table in BMM layer has only one Table as the source table then it is Single LTS.
If the logical table in BMM layer has more than one table as the sources to it then it is called Multiple LTS.
Ex: Usually Fact table has Multiple LTS’, for which sources will be coming from different Physical tables.
Can you let me know how many
aggregate tables you have in your project? On what basis have you created them?
As per resume justification document
As per resume justification document
How do you bring/relate the
aggregate tables into the OBIEE Logical layer?
One way of bringing the Aggregate Tables into the BMM layer is by bringing them as Logical Table sources for the corresponding Fact table.
This is done by dragging and dropping the aggregate table into the corresponding fact table. After doing that establish the column mappings and the set the aggregation levels.
One way of bringing the Aggregate Tables into the BMM layer is by bringing them as Logical Table sources for the corresponding Fact table.
This is done by dragging and dropping the aggregate table into the corresponding fact table. After doing that establish the column mappings and the set the aggregation levels.
How do you know which report is
hitting which table, either the fact table or the aggregate table?
After running the report, go to Administration tab and go to click on Manage Sessions. There you can find the queries that are run and in the View Log option in the Session Management you can find which report is hitting which table.
After running the report, go to Administration tab and go to click on Manage Sessions. There you can find the queries that are run and in the View Log option in the Session Management you can find which report is hitting which table.
Suppose I have report which is
running for about 3 minutes typically. What is the first step you take to
improve the performance of the query?
o Find the sql query of the report in Admin->manage Session-> run the sql query on toad ->read the explain plan output ->modify the SQL based on the explain plan output
o Find the sql query of the report in Admin->manage Session-> run the sql query on toad ->read the explain plan output ->modify the SQL based on the explain plan output
Suppose you have a report which
has the option of running on aggregate table. How does the tool know to hit the
Aggregate table and for that what the steps you follow to configure them?
o Explain the process of Aggregate navigation
Have you heard of Implicit Facts? If, so what are they?
O An implicit fact column is a column that will be added to a query when it contains columns from two or more dimension tables and no measures. You will not see the column in the results. It is used to specify a default join path between dimension tables when there are several possible alternatives.
o For example, there might be many star schemas in the database that have the Campaign dimension and the Customer dimension, such as the following stars:-
Campaign History star. Stores customers targeted in campaign.
Campaign Response star. Stores customer responses to a campaign.
Order star. Stores customers who placed orders as a result of a campaign.
In this example, because Campaign and Customer information might appear in many segmentation catalogs, users selecting to count customers from the targeted campaigns catalog would be expecting to count customers that have been targeted in specific campaigns.
To make sure that the join relationship between Customers and Campaigns is through the campaign history fact table, a campaign history implicit fact needs to be specified in Campaign History segmentation catalog. The following guidelines should be followed in creating
segmentation catalogs:
Each segmentation catalog should be created so that all columns come from only one physical star.
Because the Marketing module user interface has special features that allow users to specify their aggregations, level-based measures typically should not be exposed to segmentation users in a segmentation catalog.
o Explain the process of Aggregate navigation
Have you heard of Implicit Facts? If, so what are they?
O An implicit fact column is a column that will be added to a query when it contains columns from two or more dimension tables and no measures. You will not see the column in the results. It is used to specify a default join path between dimension tables when there are several possible alternatives.
o For example, there might be many star schemas in the database that have the Campaign dimension and the Customer dimension, such as the following stars:-
Campaign History star. Stores customers targeted in campaign.
Campaign Response star. Stores customer responses to a campaign.
Order star. Stores customers who placed orders as a result of a campaign.
In this example, because Campaign and Customer information might appear in many segmentation catalogs, users selecting to count customers from the targeted campaigns catalog would be expecting to count customers that have been targeted in specific campaigns.
To make sure that the join relationship between Customers and Campaigns is through the campaign history fact table, a campaign history implicit fact needs to be specified in Campaign History segmentation catalog. The following guidelines should be followed in creating
segmentation catalogs:
Each segmentation catalog should be created so that all columns come from only one physical star.
Because the Marketing module user interface has special features that allow users to specify their aggregations, level-based measures typically should not be exposed to segmentation users in a segmentation catalog.
What is aggregate navigation? How
do you configure the Aggregate tables in OBIEE?
Aggregate tables store pre-computed results, which are measures that have been aggregated (typically summed) over a set of dimensional attributes. Using aggregate tables is a very popular technique for speeding up query response times in decision support systems.
If you are writing SQL queries or using a tool that only understands what physical tables exist (and not their meaning), taking advantage of aggregate tables and putting them to good use becomes more difficult as the number of aggregate tables increases. The aggregate navigation capability of the OBIEE Server, however, allows queries to use the information stored in aggregate tables automatically, without query authors or query tools having to specify aggregate tables in their queries. The OBIEE Server allows you to concentrate on asking the right business question; the server decides which tables provide the fastest answers.
Aggregate tables store pre-computed results, which are measures that have been aggregated (typically summed) over a set of dimensional attributes. Using aggregate tables is a very popular technique for speeding up query response times in decision support systems.
If you are writing SQL queries or using a tool that only understands what physical tables exist (and not their meaning), taking advantage of aggregate tables and putting them to good use becomes more difficult as the number of aggregate tables increases. The aggregate navigation capability of the OBIEE Server, however, allows queries to use the information stored in aggregate tables automatically, without query authors or query tools having to specify aggregate tables in their queries. The OBIEE Server allows you to concentrate on asking the right business question; the server decides which tables provide the fastest answers.
(Assume you are in BMM layer) We
have 4 dimension tables, in that, 2 tables need to have hierarchy, then in such
a case is it mandatory to create hierarchies for all the dimension tables?
No, it’s not mandatory to define hierarchies to other Dimension tables.
No, it’s not mandatory to define hierarchies to other Dimension tables.
Can you have multiple data
sources in OBIEE?
Yes.
Yes.
How do you deal with case
statement and expressions in OBIEE?
Use expression builder to create case when…then... end statement
Use expression builder to create case when…then... end statement
Do you know about Initialization
Blocks? Can you give me an example where you used them?
Init blocks are used for instantiating a session when a user logs in.
To create dynamic variable you have to create IB to write sql statement.
Init blocks are used for instantiating a session when a user logs in.
To create dynamic variable you have to create IB to write sql statement.
What is query repository tool?
It is utility of Seibel/OBIEE Admin tool
allows you to examine the repository metadata tool
for example: search for objects based on name, type.
Examine relationship between metadata objects like which column in the presentation layer maps to which table in physical layer.
It is utility of Seibel/OBIEE Admin tool
allows you to examine the repository metadata tool
for example: search for objects based on name, type.
Examine relationship between metadata objects like which column in the presentation layer maps to which table in physical layer.
What is JDK and why do we need
it?
Java Development Kit (JDK), a software package that contains the minimal set of tools needed to write, compile, debug, and run Java applets.
Java Development Kit (JDK), a software package that contains the minimal set of tools needed to write, compile, debug, and run Java applets.
Oracle doesn’t recommend Opaque
Views because of performance considerations, so why/when do we use them?
An opaque view is a physical layer table that consists of select statement. An opaque view should be used only if there is no other solution.
An opaque view is a physical layer table that consists of select statement. An opaque view should be used only if there is no other solution.
Can you migrate the presentation
layer to a different server?
No we have to migrate the whole web & RPD files
No we have to migrate the whole web & RPD files
How do you identify what are the
dimension tables and how do you decide them during the Business/Data modeling?
Dimension tables contain descriptions that data analysts use as they query the database. For example, the Store table contains store names and addresses; the Product table contains product packaging information; and the Period table contains month, quarter, and year values. Every table contains a primary key that consists of one or more columns; each row in a table is uniquely identified by its primary-key value or values.
Dimension tables contain descriptions that data analysts use as they query the database. For example, the Store table contains store names and addresses; the Product table contains product packaging information; and the Period table contains month, quarter, and year values. Every table contains a primary key that consists of one or more columns; each row in a table is uniquely identified by its primary-key value or values.
What is the full form of RPD?
There is no full form for RPD as such, it is just a repository file (Rapid file Database)
There is no full form for RPD as such, it is just a repository file (Rapid file Database)
How do i disable cache for only 2
particular tables?
In the physical layer, right
click on the table there we will have the option which says cacheable
How do you split a table in the
RPD given the condition? (The condition given was Broker and customer in the
same table) Split Broker and customer.
We need to make an alias table in
the physical layer.
What type of protocol did you use
in SAS (Statistical Analysis System)?
TCP/IP
TCP/IP
Did you create any new logical
column in BMM layer, how?
Yes. We can create new logical column in BMM layer.
Example: Right click on fact table -new logical column-give name for new logical column like Total cost.
Now in fact table source, we have one option column mapping, in that we can do all calculation for that new column.
Yes. We can create new logical column in BMM layer.
Example: Right click on fact table -new logical column-give name for new logical column like Total cost.
Now in fact table source, we have one option column mapping, in that we can do all calculation for that new column.
Can you use physical join in BMM
layer?
Yes we can use physical join in BMM layer. When there is SCD type 2 we need complex join in BMM layer.
Yes we can use physical join in BMM layer. When there is SCD type 2 we need complex join in BMM layer.
Can you use outer join in BMM
layer?
Yes we can. When we are doing
complex join in BMM layer, there is one option type, outer join is there.
What are other ways of improving
summary query reports other than Aggregate Navigation and Cache Management
Indexes
Join algorithm
Mat/view query rewrite
Web proper report design it’s optimal by making sure that it is not getting any addition column or rows
Indexes
Join algorithm
Mat/view query rewrite
Web proper report design it’s optimal by making sure that it is not getting any addition column or rows
What is level-based metrics?
Level-based metrics means, having a measure pinned at a certain level of the dimension. For Example, if you have a measure called Dollars, you can create a Level Based Measure called Yearly Dollars which (you guessed it) is Dollars for a Year. This measure will always return the value for the year even if you drill down to a lower level like quarter, month… etc. To create a level based measure, create a new logical column based on the original measure (like Dollars in the example above). Drag and drop the new logical column to the appropriate level in the Dimension hierarchy (in the above example you will drag and drop it to Year in Time Dim
o A LBM is a metric that is defined for a specific level or intersection of levels.
o Monthly Total Sales or Quarterly Sales are the examples.
o You can compare monthly sales with quarterly sales. You can compare customer orders this quarter to orders this year.
Level-based metrics means, having a measure pinned at a certain level of the dimension. For Example, if you have a measure called Dollars, you can create a Level Based Measure called Yearly Dollars which (you guessed it) is Dollars for a Year. This measure will always return the value for the year even if you drill down to a lower level like quarter, month… etc. To create a level based measure, create a new logical column based on the original measure (like Dollars in the example above). Drag and drop the new logical column to the appropriate level in the Dimension hierarchy (in the above example you will drag and drop it to Year in Time Dim
o A LBM is a metric that is defined for a specific level or intersection of levels.
o Monthly Total Sales or Quarterly Sales are the examples.
o You can compare monthly sales with quarterly sales. You can compare customer orders this quarter to orders this year.
What is logging level? Where can
you set logging levels?
You can enable logging level for individual users; you cannot configure a logging level for a group.
Set the logging level based on the amount of logging you want to do. In normal operations, logging is generally disabled (the logging level is set to 0). If you decide to enable logging, choose a logging
level of 1 or 2. These two levels are designed for use by OBIEE Server administrators.
You can enable logging level for individual users; you cannot configure a logging level for a group.
Set the logging level based on the amount of logging you want to do. In normal operations, logging is generally disabled (the logging level is set to 0). If you decide to enable logging, choose a logging
level of 1 or 2. These two levels are designed for use by OBIEE Server administrators.
Set Logging Level
1. In the Administration Tool, select Manage > Security.
2. The Security Manager dialog box appears.
3. Double-click the user’s user ID.
4. The User dialog box appears.
5. Set the logging level by clicking the Up or down arrows next to the Logging Level field.
1. In the Administration Tool, select Manage > Security.
2. The Security Manager dialog box appears.
3. Double-click the user’s user ID.
4. The User dialog box appears.
5. Set the logging level by clicking the Up or down arrows next to the Logging Level field.
What is variable in Siebel?
You can use variables in a repository to streamline administrative tasks and modify metadata content dynamically to adjust to a changing data environment. The Administration Tool includes a Variable Manager for defining variables.
You can use variables in a repository to streamline administrative tasks and modify metadata content dynamically to adjust to a changing data environment. The Administration Tool includes a Variable Manager for defining variables.
What is system variable and
non-system variable?
System variables
System variables are session variables that the OBIEE Server and OBIEE Web use for specific purposes. System variables have reserved names, which cannot be used for other kinds of variables (such as static or dynamic repository variables, or for non-system session variables).
When using these variables in the Web, preface their names with NQ_SESSION. For example, to filter a column on the value of the variable LOGLEVEL set the filter to the Variable NQ_SESSION.LOGLEVEL.
System variables
System variables are session variables that the OBIEE Server and OBIEE Web use for specific purposes. System variables have reserved names, which cannot be used for other kinds of variables (such as static or dynamic repository variables, or for non-system session variables).
When using these variables in the Web, preface their names with NQ_SESSION. For example, to filter a column on the value of the variable LOGLEVEL set the filter to the Variable NQ_SESSION.LOGLEVEL.
Non-system variables.
A common use for non-system session variables is setting user filters. For example, you could define a non-system variable called SalesRegion that would be initialized to the name of the user’s sales region. You could then set a security filter for all members of a group that would allow them to see only data pertinent to their region.
o When using these variables in the Web, preface their names with NQ_SESSION. For example, to filter a column on the value of the variable SalesRegion set the filter to the Variable NQ_SESSION.SalesRegion.
What are different types of
variables? Explain each.
There are two classes of variables:
1. Repository variables
2. Session variables.
Repository variables.
There are two classes of variables:
1. Repository variables
2. Session variables.
Repository variables.
A repository variable has a single value at any point in time. There are two types of repository variables:
static: This value persists, and does not change until a OBIEE Server administrator decides to change it.
Dynamic: The values are refreshed by data returned from queries. When defining a dynamic repository variable, you will create an initialization block or use a preexisting one that contains a SQL query. You will also set up a schedule that the OBIEE Server will follow to execute the query and periodically refresh the value of the variable.
Session Variables
Session variables are created and assigned a value when each user logs on.
There are two types of session
variables:
1.system
2.nonsystem.
1.system
2.nonsystem.
What are the cache management?
Name all of them and their uses. For Event polling table do u need the table in
your physical layer?
Monitoring and managing the cache is cache management.There are three ways to do that.
Disable caching for the system. (INI NQ config file), Cashe persistence time for specified physical tables and Setting event polling table.
Disable caching for the system. (INI NQ config file :
You can disable caching for the whole system by setting the ENABLE parameter to NO in the NQSConfig.INI file and restarting the OBIEE Server. Disabling caching stops all new cache entries and stops any new queries from using the existing cache. Disabling caching allows you to enable it at a later time without losing any entries already stored in the cache.
Cache persistence time for specified physical tables:
You can specify a cacheable attribute for each physical table; that is, if queries involving the specified table can be added to the cache to answer future queries. To enable caching for a particular physical table, select the table in the Physical layer of the Administration Tool and select the option Make table cacheable in the General tab of the Physical Table properties dialog box. You can also use the Cache Persistence Time settings to specify how long the entries for this table should persist in the query cache. This is useful for OLTP data sources and other data sources that are updated frequently, potentially down to every few seconds.
Setting event polling table:
Monitoring and managing the cache is cache management.There are three ways to do that.
Disable caching for the system. (INI NQ config file), Cashe persistence time for specified physical tables and Setting event polling table.
Disable caching for the system. (INI NQ config file :
You can disable caching for the whole system by setting the ENABLE parameter to NO in the NQSConfig.INI file and restarting the OBIEE Server. Disabling caching stops all new cache entries and stops any new queries from using the existing cache. Disabling caching allows you to enable it at a later time without losing any entries already stored in the cache.
Cache persistence time for specified physical tables:
You can specify a cacheable attribute for each physical table; that is, if queries involving the specified table can be added to the cache to answer future queries. To enable caching for a particular physical table, select the table in the Physical layer of the Administration Tool and select the option Make table cacheable in the General tab of the Physical Table properties dialog box. You can also use the Cache Persistence Time settings to specify how long the entries for this table should persist in the query cache. This is useful for OLTP data sources and other data sources that are updated frequently, potentially down to every few seconds.
Setting event polling table:
OBIEE Server event polling tables
store information about updates in the underlying databases. An application
(such as an application that loads data into a data mart) could be configured
to add rows to an event polling table each time a database table is updated.
The Analytics server polls this table at set intervals and invalidates any
cache entries corresponding to the updated tables.
For event polling table, it is a standalone table and doesn’t require to be joined with other tables in the physical layer
For event polling table, it is a standalone table and doesn’t require to be joined with other tables in the physical layer
What is Authentication? How many types
of authentication.
Authentication is the process, by which a system verifies, through the use of a user ID and password, that a user has the necessary permissions and authorizations to log in and access data. The OBIEE Server authenticates each connection request it receives.
Operating system authentication
External table authentication
Database authentication
LDAP authentication.
Authentication is the process, by which a system verifies, through the use of a user ID and password, that a user has the necessary permissions and authorizations to log in and access data. The OBIEE Server authenticates each connection request it receives.
Operating system authentication
External table authentication
Database authentication
LDAP authentication.
What is object level security?
There are two types of object level security: Repository level and Web level
Repository level: In presentation layer we can set Repository level security by giving permission or deny permission to users/groups to see particular table or column.
Web level: this provides security for objects stored in the OBIEE web catalog, such as dashboards, dashboards pages, folder, and reports you can only view the objects for which you are authorized. For example, a mid-level manager may not be granted access to a dashboard containing summary information for an entire department.
There are two types of object level security: Repository level and Web level
Repository level: In presentation layer we can set Repository level security by giving permission or deny permission to users/groups to see particular table or column.
Web level: this provides security for objects stored in the OBIEE web catalog, such as dashboards, dashboards pages, folder, and reports you can only view the objects for which you are authorized. For example, a mid-level manager may not be granted access to a dashboard containing summary information for an entire department.
What is data level security?
This controls the type an amount of data that you can see in a report. When multiple users run the same report the results that are returned to each depend on their access rights and roles in the organization. For example a sales vice president sees results for all regions, while a sales representative for a particular region sees only data for that region.
This controls the type an amount of data that you can see in a report. When multiple users run the same report the results that are returned to each depend on their access rights and roles in the organization. For example a sales vice president sees results for all regions, while a sales representative for a particular region sees only data for that region.
What is the difference between
Data Level Security and Object Level Security?
Data level security controls the type and amount of data that you can see in a reports. Security provides security for objects stored in the OBIEE web catalog, like dashboards, dashboards pages, folder, and reports.
Data level security controls the type and amount of data that you can see in a reports. Security provides security for objects stored in the OBIEE web catalog, like dashboards, dashboards pages, folder, and reports.
How do you implement security
using External Tables and LDAP?
Instead of storing user IDs and passwords in a OBIEE Server repository, you can maintain lists of users and their passwords in an external database table and use this table for authentication purposes. The external database table contains user IDs and passwords, and could contain other information, including group membership and display names used for OBIEE Web users. The table could also contain the names of specific database catalogs or schemas to use for each user when querying data
Instead of storing user IDs and passwords in a OBIEE Server repository, you can have the OBIEE Server pass the user ID and password entered by the user to an LDAP(Lightweight Directory Access Protocol ) server for authentication. The server uses clear text passwords in LDAP authentication. Make sure your LDAP servers are set up to allow this.
Instead of storing user IDs and passwords in a OBIEE Server repository, you can maintain lists of users and their passwords in an external database table and use this table for authentication purposes. The external database table contains user IDs and passwords, and could contain other information, including group membership and display names used for OBIEE Web users. The table could also contain the names of specific database catalogs or schemas to use for each user when querying data
Instead of storing user IDs and passwords in a OBIEE Server repository, you can have the OBIEE Server pass the user ID and password entered by the user to an LDAP(Lightweight Directory Access Protocol ) server for authentication. The server uses clear text passwords in LDAP authentication. Make sure your LDAP servers are set up to allow this.
If you have 2 fact and you want
to do report on one with quarter level and the other with month level how do
you do that with just one time dimension?
Using level base metrics.
Using level base metrics.
Did you work on a standalone
Siebel system or was it integrated to other platforms?
Deploying the OBIEE platform without other Siebel applications is called OBIEE Stand -Alone .If your deployment includes other OBIEE Application it called integrated analytics -You can say Stand-Alone OBIEE
Deploying the OBIEE platform without other Siebel applications is called OBIEE Stand -Alone .If your deployment includes other OBIEE Application it called integrated analytics -You can say Stand-Alone OBIEE
How to sort columns in RPD and
web?
Sorting on web column, sort in the RPD its sort order column
Sorting on web column, sort in the RPD its sort order column
If you want to create new logical
column where will you create (in repository or dashboard) why?
I will create new logical column in repository. Because if it is in repository, you can use for any report. If you create new logical column in dashboard then it is going to effect on those reports, which are on that dashboard. You cannot use that new logical column for other dashboard(or request)
I will create new logical column in repository. Because if it is in repository, you can use for any report. If you create new logical column in dashboard then it is going to effect on those reports, which are on that dashboard. You cannot use that new logical column for other dashboard(or request)
What is complex join, and where
it is used?
We can join dimension table and fact table in BMM layer using complex join. When there is SCD type 2 we have to use complex join in BMM layer.
We can join dimension table and fact table in BMM layer using complex join. When there is SCD type 2 we have to use complex join in BMM layer.
If you have dimension table like
customer, item, time and fact table like sale and if you want to find out how
often a customer comes to store and buys a particular item, what will you do?
write a query as SELECT
customer_name, item_name, sale_date, sum(qty) FROM customer_dim a, item_dim b,
time_dim c, sale_fact d WHERE d.cust_key = a.cust_key AND d.item_key =
b.item_key AND d.time_key = c.time_key GROUP BY customer_name, item_name,
sale_date.
You worked on standalone or
integrated system?
Standalone.
Standalone.
If you want to limit the users by
the certain region to access only certain data, what would you do?
Using data level security.
OBIEE Administrator: go to Manage -> Security in left hand pane u will find the user, groups, LDAP server, Hierarchy
What you can do is select the user and right click and go to properties, you will find two tabs named as users and logon, go to user tab and click at permission button in front of user name you have selected as soon as u click at permission you will get a new window with user group permission having three tabs named as general ,query limits and filter and you can specify your condition at filter tab, in which you can select presentation table ,presentation columns ,logical table and logical columns where you can apply the condition according to your requirement for the selected user or groups.
Using data level security.
OBIEE Administrator: go to Manage -> Security in left hand pane u will find the user, groups, LDAP server, Hierarchy
What you can do is select the user and right click and go to properties, you will find two tabs named as users and logon, go to user tab and click at permission button in front of user name you have selected as soon as u click at permission you will get a new window with user group permission having three tabs named as general ,query limits and filter and you can specify your condition at filter tab, in which you can select presentation table ,presentation columns ,logical table and logical columns where you can apply the condition according to your requirement for the selected user or groups.
If there are 100 users accessing
data, and you want to know the logging details of all the users, where can you
find that?
To set a user’s logging level
1. In the Administration Tool, select Manage > Security.
The Security Manager Dialog box appears.
2. Double-click the user’s user ID. The User dialog box appears.
3. Set the logging level by clicking the Up or Down arrows next to the Logging Level field.
To set a user’s logging level
1. In the Administration Tool, select Manage > Security.
The Security Manager Dialog box appears.
2. Double-click the user’s user ID. The User dialog box appears.
3. Set the logging level by clicking the Up or Down arrows next to the Logging Level field.
How do implement event polling table?
OBIEE Server event polling tables store information about updates in the underlying databases. An application (such as an application that loads data into a data mart) could be configured to add rows to an event polling table each time a database table is updated. The Analytics server polls this table at set intervals and invalidates any cache entries corresponding to the updated tables.
OBIEE Server event polling tables store information about updates in the underlying databases. An application (such as an application that loads data into a data mart) could be configured to add rows to an event polling table each time a database table is updated. The Analytics server polls this table at set intervals and invalidates any cache entries corresponding to the updated tables.