Wednesday, April 26, 2017

IQ

Experience Level ~3
Informatica

Qn1 – load file containing 100 records in a target file, following below approach
1-10 à file 1
11-20à file 2 .. and so on
Qn2 - Active and passive transformations
Qn3 – Is sorter and Union active transformation if yes, why?
Qn4 – I am having a file as below
EID, ENAME,SAL
1,Robert, 30
2,William, 40 .. until 10 records
I want a footer record in a flat file target with the above records which would make the file look as below
EID, ENAME,SAL
1,Robert, 30
2,William, 40 .. until 10 records
The sum of Salary is 70
(considering only 2 rows given in the example above) . This question was most frequently asked.
Qn5 – Which transformation create Caches.
Qn6 – I have a file with some duplicate records. I want all the unique records pushed in a file and repeated records pushed to a different file.
Src
1,1,2,3,4,5,5,6,7,2,1,6,3,5
Tgt 1- unique
1,2,3,4,5,6,7
Static and Dynamic lookup cache
Qn7 – Lookup is an active or passive transformation
Qn8 – SCD 2 implementation
Qn9 – Different caches in Lookup Transformation.
Qn10 – Difference between Static and Dynamic lookup cache
Qn11- Different type of ports and their order of execution of ports in Informatica expression transformation.


UNIX
Qn1 – command used to replace string
Qn2 – Command to fetch third column from a CSV file
Qn3 – Status of last executed command

SQL
Qn1– delete all remaining records except the latest record
DELETE FROM student
WHERE ApplicationDateTime <> (SELECT max(ApplicationDateTime)
                              FROM student s2
                              WHERE s2.StudentID  = student.StudentID)

Qn2 – second highest salary
Qn3 – Analytical functions. This was asked in a twisted manner. The question asked was I have data as below,
EmpID
Dept
Sal
1
10
100
2
20
400
3
10
800
4
30
400
5
20
300
6
20
300

I want the output to look like
EmpID
Dept
Sal
Total Sal
1
10
100
900
2
20
400
1000
3
10
800
900
4
30
400
400
5
20
300
1000
6
20
300
1000

Where total Sal comprises of data summed up dept wise.

Experience level 5+
Informatica
Qn1 - I have a file in the below format
USD
50,55,40,23
60,34,20,36
INR
1,3,4,6
5,3,6,5
EUR
80,84,85,88
85,86,88,89

I want the output in the below tabular format

Currency
Val1
Val2
Val3
Val4
USD
50
55
40
23
USD
60
34
20
36
INR
1
3
4
6
INR
5
3
6
5
EUR
80
84
85
88
EUR
85
86
88
89


Qn2 - 3 Flat files having same structure are to be loaded in the target table.. We need to load the file in the target table ad ensure no duplicates are loaded.
2 scenarios here
1)      There can be duplicates in the file which needs to be avoided.
2)      The files can have duplicates between them i.e  File 1 and file 2 can have same record. We need to avoid these as well.
Qn3 – This was asked as a follow up on other question. If my joiner transformation receives inputs from 2 pipelines coming from the same source, is there a specific property I need to set to allow that?
Qn4 – Cache files created in informatica and what will the files contain.

SQL
Qn1 – I have data in a table as below
Seq
Emp ID
Start Date
End Date
1
1
1/1/2016

2
2
2/20/2016

3
3
3/11/2016

4
1
1/1/2017

5
2
8/31/2016

6
1
3/1/2017


The above table has some data discrepancies, wherein the end dates have been deleted accidently. We need to write a single update statement, which would fill the end dates as below
Seq
Emp ID
Start Date
End Date
1
1
1/1/2016
12/31/2016
2
2
2/20/2016
8/30/2016
3
3
3/11/2016
NULL
4
1
1/1/2017
2/28/2017
5
2
8/31/2016
NULL
6
1
3/1/2017


So end date in seq 1 is start date of seq 4 minus 1 day.


Most of the questions asked to 3+ were repeated to 5+ as well. Also there were a few adhoc questions asked as people were solving scenario based qns trying to understand the reason behind a particular implementation like, why a sorter is used here, order of ports when writing the variable logic, etc.

Sunday, April 16, 2017

Variables in OBIEE 11g

Variables in Oracle OBIEE 11g :-

There are basically 4 different types of variables in OBIEE 11g.
1. Session Variables
2. Repository Variables
3. Presentation Variables

4. Request Variables.


Session Variables:
1. As the name suggests, session variables are created during the creation of session
i.e., as soon as a user logs into the BI server. So, Every login has its own session
variable.

2. There are two types of session variables
    1. System (which are defined by OBIEE and are reserved)
    2. Non System
    which are defined by developers.

3. Session Variables can be created only through Oracle BI Administration Tool.

Referencing session variable:

For displaying session variables, we should use
@{biServer.variables['NQ_SESSION.VariableName']} .
For using session variables in expression, we should use
VALUEOF(NQ_SESSION.“VariableName”).

System Session Variable
EXP :-
NQ_SESSION.LOGLEVEL
USER,USERGUID,GROUP,ROLES,PERMISSIONS,PROXY

Repository Variables:

1. A repository variable is a variable that has a single value at any point in time.

2. There are two types of repository variables
    1. Static (which changes only if admin or developer changes it value).
    2. Dynamic ( value is refreshed using a query).

3. Repository variables can be created only through Oracle BI Administration Tool.

Referencing repository variable:

For displaying repository variables, we should use @{biServer.variables.VariableName} or
@{biServer.variables['VariableName']}.
For using repository variables in expression, we should use VALUEOF(“VariableName”) for
static variable and for dynamic variable VALUEOF(“Dynamic Initialization Block

Name”.“VariableName”).

Variables should be used as arguments of the function VALUEOF(). This happens automatically when you double-click the variables to paste them into the expression.

VALUEOF("prime_end")

CASE WHEN "Hour" >= VALUEOF("prime_begin")AND "Hour" < VALUEOF("prime_end") THEN 'Prime Time' WHEN ... ELSE...END
Presentation Variable:

1. A presentation variable is a variable which can be created as a part of creation of
dashboard prompts. Dashboard prompts must be either Column Prompt or Variable
Prompt.

2. The value of presentation variable is set by the prompt for which it is created (upon
user selection).

Referencing presentation variable:

For displaying presentation variables, we should use either
1. @{variables.VariableName}[Format]{DefaultValue} or

1. Format and DefaultValue are optional


1. Scope should be used if you create variables with same name.

2. Scope can be analyses, dashboard etc.

3. Order of precedence is analyses, dashboard pages, dashboards.

For using presentation variables in expression, we should use @{“VariableName”}
{DefaultValue}. Default value is optional.

Request Variable:

1. Request Variable is used to overwrite the value of session variable and it happens only
during request initiation to the database from column prompt.

2. Can be created only during the creation of column prompt.
Referencing request variable:

Same as presentation variable.

In an answer, in the advanced tab, you have the section “Advanced Sql Clauses”.

In the prefix field, you can add a “SET VARIABLE MYVARIABLE=MyValue;”.
Example with DISABLE_CACHE_HIT (to disable a cache hit):

SET VARIABLE DISABLE_CACHE_HIT=1;SELECT "Customer"."Name" FROM SH ...
3.2.1.2 - With a presentation variable

You can then use the value of a presentation variable to initialize a request variable.

Example of prefix with multiple presentation variables and request variables:

SET VARIABLE MyRequestVariable1='@{MyPresentationVariableName1}{DefaultValue}', 
MyRequestVariable2='@{MyPresentationVariableName2}{DefaultValue}';


In a Dashboard prompt

SET VARIABLE DISABLE_CACHE_HIT=1, MY_SESSION_VARIABLE=MyValue;
SELECT Calendar."Calendar Year" 
FROM SH

Performance improvement Tips using Hint in OBIEE 11g RPD or Other ways..

Performance improvement Tips using Hint in OBIEE 11g RPD or Other ways..

We can improve the performance of reports by doing tuning using Hint. This can be done by the following ways:

Tuning in RPD using Hint

Tuning in Report using Hint
In this article I will explain how to add Hint in RPD for tuning report performance.  Even also before using Hint either in RPD or report, we need to check whether the performance can be improved by developing proper inner versus outer joins, modeling efficient star schemas. Report performance tuning can trigger action across RPD, server and DB.

Hints allow us to change sub optimal SQL execution plans. They simply allow us to provide more information to the optimizer and influence the plan executed.


For slower performance of OBIEE Reports Hints are added in RPD to speed up the queries of those reports. In OBIEE we can add a database hint to a table object in the Physical layer, whenever this table is referenced in a query the hint will be applied. Importantly a hint should use the table alias whenever a query specifies an alias for a table, the table itself should not be used. Hints can be added by two ways:


Hints can be added in the Physical table itself

Or it can be added on the join of two tables in Physical Layer
Sometimes OBIEE report runs very slowly because of the main fact table holds huge volume data. So it takes some time to fetch data from the fact table for a specific period also. For this reason we need to use Hint in RPD in Physical Layer. There are multiple Hint functions available in database: Index, Hash, Parallel, Merge, Full.

Hint in RPD


Hint in Physical Object:


Open the RPD and check out the physical table where the Hint needs to be applied

Click on “General” tab of the table and apply the Hint
Here I am applying Parallel Hint in the table W_DAY_D.
Parallel Hint instructs the optimizer to use the specified number of concurrent servers for a parallel operation.
Syntax is:
/*+ PARALLEL (Table Name counter) */
Save the RPD and Run the report.

After adding the Hint in table “W_DAY_D” RPD will look like:




Hint in Joining Condition of Physical Objects:


We can Use USE_HASH function or PARALLEL function on the joining condition of fact (say W_GL_REVN_F) and Dimension (say W_GL_ACCOUNT_D).


Open the RPD and check out the physical join where the Hint needs to be applied

Double click on the joining condition and apply the Hint
Here I am applying Hash Hint in the joining condition between the tables Fact_W_GL_REVN_F and Dim_W_GL_ACCOUNT_D.
This USE_HASH hint instructs the optimizer to join each specified table with another row source using a hash join.
Syntax is:
/*+ USE_HASH (Table Name1 Table Name2 ) */


After adding the hints in Rpd it should look like:--




Performance improvement Tips :-

1. Use the log level judiciously. Switch it off for all users except Administrator. You can use the ‘Act as’ functionality to run the
request as a different user. Once you have authorized a user to act as a proxy user, he or she can then use the Settings > Act
As option in Oracle BI Presentation Services to select the target user to act as.

2. Use filter functions instead of case statements.

3. Avoid cast function.

4. Split huge dimensions.

5. Creating the Query Workload: This is a process of identifying the worst queries using either OBIEE’s log or database data
dictionary views like v$longops etc or using reports like AWR. Once the problematic queries are identified the SQL summary
advisor can be used to create materialized views which can have the most significant impact on the execution of these
problematic queries.

6. Populating Logical Level Counts Automatically: Rightclick
one or more business models and dimension objects, and choose
Estimate Levels. This helps OBIEE to gather some information that can help OBIEE to pick the most apt path which
answering queries related to hierarchies(or dimensions).

7. Intelligent join: Within a logical table source, the joins are always executed. When between logical tables, the joins are only
performed when required.
The tables snowflaked off the dimensions have parentchild
relationships with each other that mimic the dimensional
hierarchies.

8. Eliminate outer joins: Eliminating outer joins results in a more consistent record set, a simplified business model, and
improved performance. Make changes in the data model and change the ETL such that most of the joins in the BMM layer
are inner joins
Outer joins in logical table sources are always included in a query, even if the table source is not used. If possible, create one
logical table source without the outer join and another with the outer join. Order the logical table source with the outer join
after the nonouter
join so that it will be used only when necessary.

9. An opaque view (a physical layer table that consists of a Select statement) should be used only if there is no other solution.
Ideally, a physical table should be created, or alternatively a materialized view. A traditional database view is not needed
because it is identical to the opaque view.

10. OCI should be used for connecting to Oracle. Oracle Call Interface (OCI) applications can utilize client memory to take
advantage of the OCI result cache. A result cache stores the results of queries shared across all sessions. When these queries
are executed repeatedly, the results are retrieved directly from the cache memory, resulting in faster query response time. The
query results stored in the cache become invalid when data in the database objects being accessed by the query is modified.
The clientside
result cache is a separate feature from the serverside
result cache. Unlike the server result cache, the OCI
result cache does not cache results in the server/

11. Use Multithreaded Connections checkbox in the connection pool properties: When the check box is select ed, Oracle BI
Server terminates idle physical queries (threads). When not selected, one thread is tied to one database connection (number of
threads = maximum connections). Even if threads are idle, they consume memory.
The parameter DB_GATEWAY_THREAD_RANGE in the Server section of NQSConfig.ini establishes when Oracle BI Server
terminates idle threads. The lower number in the range is the number of threads that are kept open before Oracle BI Server
takes action. If the number of open threads exceeds the low point in the range, Oracle BI Server terminates idle threads. For
example, if DB_GATEWAY_THREAD_RANGE is set to 40200
and 75 threads are open, Oracle BI Server terminates any
idle threads.

12. Leading Hint: This helps you set the driving table in a join. This table will be accessed before any other table while executing
the join. Use this when you have an equijoin and one of the tables in the join has very few records compared to the other
tables. Ideally, in such a scenario, CBO should do nested loop join with the smaller table as the driving table. If this doesnot
happen then try the leading hint. Test the performance of the query before implementing it in the rpd
An almost similar technique is specifying the driving table in the joins in the BMM layer.
When you specify a driving table, the Oracle BI Server will use it if the query plan determines that its use will optimize query
processing. The small table (the driving table) is scanned, and parameterized queries are issued to the large table to select
matching rows. The other tables, including other driving tables, are then joined together.
In general, driving tables can be used with inner joins, and for outer joins when the driving table is the left table for a left outer
join, or the right table for a right outer join. Driving tables are not used for full outer joins.
MAX_PARAMETERS_PER_DRIVE_JOIN: This is a performance tuning parameter. In general, the larger its value, the fewer
parameterized queries will be generated. Values that are too large can result in parameterized queries that fail due to backend
database limitations. Setting the value to 0 (zero) turns off drive table joins.
MAX_QUERIES_PER_DRIVE_JOIN: This is used to prevent runaway drive table joins. If the number of parameterized
queries exceeds its value, the query is terminated and an error message is returned to the user.

13. Uncheck DISTINCT_SUPPORTED and ORDERBY_SUPPORTED check boxes from the database in the physical layer. Apply explicit ordering if required.

14. Don’t forget to mention the level of fact table sources, OBIEE picks tables to join based on this. logical dim sources should not be specified with level.

15. Push as much processing to the database as possible. This includes tasks such as filtering, string manipulation, and additive measures.

16. Feed cache for the broadest possible result. If you feed your cache for some aggregated result then it will be useful only for that result. So while making a plan to seed the cache, pick a report that has a more granular data to be displayed. If possible,
make reports that will exclusively be used for caching. The cached result of such reports should benefit other reports. These reports can then be scheduled to feed cache using ibots.

17. Use usage tracking to identify the candidates of aggregation.

18. Exchanging metadata with oracle database: By exchanging Oracle Business Intelligence metadata from the Oracle BI Server with your Oracle Database, you enable the database to accelerate the performance of data warehouse queries. You use the
Oracle BI Server utility sametaexport to exchange the metadata.
19. Have a data purging policy in place to remove unwanted data from the warehouse
20. Use star transformation.

21. Indexes and table partitioning are usually beneficial in datawarehouses. Make sure to avoid full table scans on tables that fetch 15% or less data of big tables. Small dimensions are better without index. Partitioning should be done wisely so that it helps most of your answers. Partitioning not only improves performance but also makes the data purging and data maintenance easier. Partitioning a global index allows partition pruning to take place within an index access, which results in reduced I/Os. By definition of good range or list partitioning, fast index scans of the correct index partitions can result in very
fast query times.
V$SQL_MONITOR and V$SQL_PLAN_MONITOR can help you do real time performance monitoring.

22. The SQL Performance Analyzer enables you to forecast the impact of system changes on SQL performance by testing these changes using a SQL workload on a test system.

23. SQL Plan management can be used for plan stability.

24. Use of the with clause – Check WITH_CLAUSE_SUPPORTED in database properties
It lets you reuse the same query block in a SELECT statement when it occurs more than once within a complex query. Oracle
Database retrieves the results of a query block and stores them in the user's temporary table space.

25. Database Resource Manager: The Database Resource Manager provides the ability to prioritize work within the Oraclesystem. Users with higher priority jobs get resources in order to minimize response time for online work, for example, while
users with lower priority jobs, such as batch jobs or reports, might encounter slower response times.
You can specify the maximum number of concurrently active sessions for each consumer group. When this limit is reached,
the Database Resource Manager queues all subsequent requests and runs them only after existing active sessions complete.

26. Using data compression: It reduces I/O. Both indexes and tables can be compressed. Check the trace to find out if high I/O
has been the reason for slow performance. If yes, compression can help. Compression can help improve performance for queries that scan large amounts of data, by reducing the amount of I/O required to scan that data.

27. The SQL Access Advisor is primarily responsible for making schema modification recommendations, such as adding or dropping indexes and materialized views. It also recommends a partitioning strategy. The SQL Tuning Advisor makes other
types of recommendations, such as creating SQL profiles and restructuring SQL statements. In some cases where significant performance improvements can be gained by creating a new index, the SQL Tuning Advisor may recommend doing so.
However, these recommendations must be verified by running the SQL Access Advisor with a SQL workload that contains a
set of representative SQL statements. In 11G, The SQL Access Advisor has been enhanced to include partition advice. It
recommends the right strategy to partition tables, indexes, and materialized views to get best performance from an application.
SQL tuning advisor can also be scheduled to get advisories

28. Using parallel hints for queries that involve large table scans, joins, or partitioned index scans. Parallelism might spoil
performance on over utilized systems or systems with small I/O bandwidth.

29. Use bitmap index for low cardinality columns. For example, on a column with one million rows, 10,000 distinct values.
Unlike most other types of indexes, bitmap indexes include rows that have NULL values. So queries with ‘is null’ will use
bitmap indexes. Do not use bitmap index in systems which have huge DML activity. Bitmap join index is also an interesting strategy.

30. Local indexes are better compared to global indexes for an OLAP system because search for data is at the partition level.

31.The size of the redo log files can influence performance, because the behavior of the database writer and archiver processes depend on the redo log sizes.

32.In case of nested loop joins it is very important to ensure that inner table is driven from (dependent on) the outer table. If the inner table's access path is independent of the outer table, then the same rows are retrieved for every iteration of the outer loop, degrading performance considerably. In such cases, hash joins joining the two independent row sources perform better.

Thanks

Location of the Startup folder in Windows 10/8


The Current Users Startup folder in Windows 10/8 is located at:

These programs start up for the current logged in user only. To directly access this folder, open Run, type shell:startup and hit Enter.

C:\ProgramData\Microsoft\Windows\Start Menu\Programs

Saturday, April 15, 2017

Checking Operation system certificate in obiee11g installation error


Checking Operation system certificate in obiee11g installation error






I faced the above issue while installing OBIEE11g in my local machine.

Issue:
OBIEE 11g 11.1.1.7.0 was recently certified on Windows Server 2012 64‑bit. When installing, the pre‑requisite check for the Operating
System fails.
Cause:
The software is certified; however, the Oracle Installer is not yet updated to reflect the certification. This can sometimes occur when a
certification happen post software release.

Resolution:
You can resolve this error by updating the pre‑requisite check file in the install media.
Step 1:
Navigate to where you have unzipped, with 7‑zip, the OBIEE 11g (11.1.1.7.0) installation media (locally or on a mount).
Step 2:
Backup the following file:
bishiphome\Disk1\stage\prereq\win64\refhost.xml
Step 3:
Update the refhost.xml file to add the new operating system at the end of the certified systems section




Step 4: Save the file
Step 5: Re‑start the Installer to re‑start and complete the install.
Step 6: The installer will then proceed without the error.

Thanks 

Friday, April 14, 2017

loopback adapter on windows 10

How to install Microsoft Loopback Adapter on Windows 10To install the microsoft loopback adapter on win 10 you must:

  1. right click on window start menu icon and select Device manager. Device manager window will immediatel open (or you may use any other way how to open device manager window)
  2. click on Action, and select Add legacy hardware
  3. click Next on welcome screen
  4. choose "Install the hardware that i manually select from a list" and click on Next
  5. scroll down and select Network adapters from offered common hardware types and click on Next
  6. select Microsoft as the manufacturer, and then select Microsoft KM-TEST Loopback adapter card model, click on Next
  7. click on Next
  8. click on Finish

Data engineering Interview Questions

1)  What all challenges you have faced and how did you overcome from it? Ans:- Challenges Faced and Overcome As a hypothetical Spark develop...