Informatica Interview Questions
1. What are the advantages of Informatica?
Informatica has several advantages over other data integration systems. Some of them are:
It is faster than other available platforms.
You can easily monitor your jobs with Informatica Workflow Monitor.
It makes data validation, iteration, and project development easier.
If you experience failed jobs, it is easy to identify the failure and recover from it. The same applies to jobs that are running slowly.
2. In what real situations can Informatica be used?
Informatica has a wide range of applications that cover areas such as:
3. What are some examples of Informatica ETL programs?
Some basic Informatica programs are:
Mappings: A mapping is designed in the Designer and defines the ETL logic. It reads data from the original sources, applies transformation logic to that data, and writes the transformed data to the targets.
Workflows: A workflow is a collection of tasks that describes the runtime ETL process. Workflows are designed in the Workflow Manager.
Tasks: A task is an executable set of actions, commands, or functions. A sequence of tasks defines how an ETL process behaves at runtime.
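The read, transform, and write flow of a mapping can be sketched in plain Python. This is only an illustration and not Informatica code; the function names (read_source, transform, run_mapping) and the sample rows are hypothetical.

```python
# Illustrative sketch only: all names and sample data here are hypothetical.

def read_source():
    """Extract: stand-in for reading rows from a source table or file."""
    return [
        {"id": 1, "name": " alice ", "amount": "10.50"},
        {"id": 2, "name": "BOB", "amount": "3.25"},
    ]

def transform(row):
    """Transform: tidy the name and convert the amount to a number."""
    return {
        "id": row["id"],
        "name": row["name"].strip().title(),
        "amount": float(row["amount"]),
    }

def run_mapping(source_rows, target):
    """Load: write each transformed row to the target (a list here)."""
    for row in source_rows:
        target.append(transform(row))
    return target

target_rows = run_mapping(read_source(), [])
print(target_rows)
```

In this sketch the list passed as target plays the role of the target table; a workflow would correspond to the code that decides when run_mapping is called.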
4. Which development components of Informatica have the highest usage?
There are many development components in Informatica. However, these are the most widely used of them:
Expression: Used to transform data row by row using functions.
Lookup: Used extensively to join data.
Sorter and Aggregator: Used to sort data and to aggregate it.
Java transformation: Developers choose this when they want to invoke variables, Java methods, third-party APIs, and built-in Java packages.
Source Qualifier: Commonly used to convert source data types to the equivalent Informatica data types.
Transaction Control: Used to create transactions and to control commits and rollbacks.
5. What are the uses of ETL tools?
ETL tools are used to perform actions such as:
Extracting data from sources such as database tables or files.
Transforming the data received from different sources in an organized way. Notable sources include SAP solutions, Teradata, and web services.
Loading the data into a data warehouse, which is known as the target.
6. What are the types of data warehouses?
There are three types of data warehouses:
· Enterprise data warehouse
· ODS (operational data store)
· Data mart
7. What is a data mart?
A data mart is a subset of a data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data is derived from an enterprise-wide data warehouse. In an independent data mart, data is collected directly from sources.
8. What is a data warehouse?
According to Bill Inmon, known as the father of data warehousing, “A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.”
9. What is a star schema?
A star schema is the simplest form of data warehouse schema. It consists of one or more fact tables that reference a set of dimension tables.
10. What is a snowflake schema?
A snowflake schema has a fact table connected to a number of dimension tables, where the dimension tables are themselves normalized into further related tables. Both the snowflake and star schemas are methods of storing data that is multidimensional in nature.
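A small star schema can be sketched with Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_store) and the sample data are made up for illustration.

```python
import sqlite3

# Illustrative star schema: one fact table referencing two dimension tables.
# All table names, column names, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store   (store_key   INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE fact_sales  (
        product_key INTEGER REFERENCES dim_product(product_key),
        store_key   INTEGER REFERENCES dim_store(store_key),
        amount      REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_store   VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5);
""")

# A typical query joins the fact table to a dimension and aggregates a measure.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(sales_by_product)
```

A snowflake variant of this sketch would move city out of dim_store into its own table and have dim_store reference it by key.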
11. What are ETL tools?
ETL stands for Extraction, Transformation, and Loading of data into the data warehouse for decision making. ETL refers to the methods involved in accessing and manipulating source data and loading it into a target database.
12. What is a dimension table?
Dimension tables contain attributes that describe fact records in the fact table.
13. What is data modeling?
Data modeling is the representation of real-world data structures or entities and their relationships in the form of data models required for a database. Data modeling includes various types, such as:
Conceptual data modeling
Logical data modeling
Physical data modeling
Enterprise data modeling
Relational data modeling
Dimensional data modeling.
14. What is a surrogate key?
A surrogate key is a substitute for the natural primary key. It is simply a unique identifier or number for each row that can be used as the primary key of the table.
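As a rough sketch, a surrogate key can be produced by handing out the next integer the first time a natural key is seen. The class name SurrogateKeyGenerator and the sample natural keys are hypothetical; real warehouses usually rely on a sequence or identity column for this.

```python
from itertools import count

class SurrogateKeyGenerator:
    """Hands out one warehouse key per distinct natural key (hypothetical helper)."""

    def __init__(self, start=1):
        self._next = count(start)
        self._assigned = {}

    def key_for(self, natural_key):
        # Reuse the key if this natural key was seen before, else assign the next one.
        if natural_key not in self._assigned:
            self._assigned[natural_key] = next(self._next)
        return self._assigned[natural_key]

keys = SurrogateKeyGenerator()
# Natural keys here are (source system, source id) pairs, purely as an example.
source_keys = [("CRM", "C-100"), ("ERP", "0042"), ("CRM", "C-100")]
surrogates = [keys.key_for(nk) for nk in source_keys]
print(surrogates)
```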
15. What is data mining?
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.
16. What is an operational data store?
An ODS (operational data store) comes as a second layer in a data warehouse architecture. It has the characteristics of both OLTP and DSS systems.
17. What is the difference between OLTP and OLAP?
OLTP stands for Online Transaction Processing; it contains normalized tables.
OLAP (Online Analytical Processing) contains the history of OLTP data, which is non-volatile, and acts as a decision support system.
18. How many types of dimensions are available in Informatica?
There are three types of dimensions:
19. What are the designer tools for creating transformations?
20. What is a mapplet?
A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings.
21. What are sessions and batches?
Session: A session is a set of instructions that tells the server how to move data to the target.
Batch: A batch is a set of one or more tasks (session, event wait, email, command, etc.).
22. What are slowly changing dimensions?
Dimensions that change over time are called Slowly Changing Dimensions (SCD).
Slowly Changing Dimension Type 1: Keeps only current records.
Slowly Changing Dimension Type 2: Keeps current records plus historical records.
Slowly Changing Dimension Type 3: Keeps current records plus one previous record.
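The three types can be contrasted with a minimal Python sketch. The column names (city, previous_city, start_date, end_date) and the helper functions are assumptions chosen for illustration, not an Informatica implementation.

```python
from datetime import date

def apply_type1(dim, key, new_city):
    """Type 1: overwrite in place, so only the current value survives."""
    dim[key] = {"city": new_city}

def apply_type2(dim_rows, key, new_city, change_date):
    """Type 2: close the current row and append a new current row."""
    for row in dim_rows:
        if row["key"] == key and row["end_date"] is None:
            row["end_date"] = change_date
    dim_rows.append({"key": key, "city": new_city,
                     "start_date": change_date, "end_date": None})

def apply_type3(dim, key, new_city):
    """Type 3: keep the current value plus one previous value."""
    row = dim[key]
    row["previous_city"] = row["city"]
    row["city"] = new_city

# The same change (customer 101 moves from Austin to Boston) under each type.
type1 = {101: {"city": "Austin"}}
apply_type1(type1, 101, "Boston")

type2 = [{"key": 101, "city": "Austin",
          "start_date": date(2020, 1, 1), "end_date": None}]
apply_type2(type2, 101, "Boston", date(2024, 6, 1))

type3 = {101: {"city": "Austin", "previous_city": None}}
apply_type3(type3, 101, "Boston")

print(type1, type2, type3, sep="\n")
```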
23. What are the two modes of data movement in the Informatica Server?
There are two modes of data movement:
Normal mode, in which a separate DML statement is prepared and executed for every record.
Bulk mode, in which a DML statement is prepared and executed for multiple records at once, which improves performance.
24. What is the difference between Active and Passive transformation?
Active transformation: An active transformation can change the number of rows that pass through it from source to target, i.e., it eliminates rows that do not meet the condition in the transformation.
Passive transformation: A passive transformation does not change the number of rows that pass through it, i.e., it passes all rows through the transformation.
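The distinction can be shown with two small Python functions: one that may drop rows (active) and one that always returns one row per input row (passive). The function names and the order data are hypothetical.

```python
def filter_rows(rows, condition):
    """Active: may return fewer rows than it receives."""
    return [row for row in rows if condition(row)]

def add_total(rows):
    """Passive: returns exactly one output row per input row."""
    return [{**row, "total": row["qty"] * row["price"]} for row in rows]

orders = [
    {"qty": 2, "price": 5.0},
    {"qty": 0, "price": 9.0},
    {"qty": 1, "price": 4.0},
]
kept = filter_rows(orders, lambda row: row["qty"] > 0)
priced = add_total(orders)
print(len(orders), len(kept), len(priced))
```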
25. What is the difference between connected and unconnected transformation?
Connected transformation: A connected transformation is connected to other transformations or directly to the target table in the mapping.
Unconnected transformation: An unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation and returns a value to that transformation.
26. What are various types of Aggregation?
Various types of aggregation are
27. What is the Aggregator transformation?
Aggregator transformation is an Active and Connected transformation. This transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups).
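The idea of calculating sums and averages per group can be sketched in Python as follows. The aggregate helper and the sales rows are hypothetical and only mirror what the transformation does conceptually.

```python
from collections import defaultdict

def aggregate(rows, group_key, value_key):
    """Group rows by group_key and compute the sum and average of value_key."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_key]].append(row[value_key])
    return {
        group: {"sum": sum(values), "avg": sum(values) / len(values)}
        for group, values in buckets.items()
    }

sales = [
    {"region": "East", "amount": 10.0},
    {"region": "West", "amount": 4.0},
    {"region": "East", "amount": 20.0},
]
summary = aggregate(sales, "region", "amount")
print(summary)
```

Three input rows produce two output groups, which is why the transformation is classed as active.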
28. What is the Expression transformation?
Expression transformation is a Passive and Connected transformation. This can be used to calculate values in a single row before writing to the target.
29. What is the Filter transformation?
Filter transformation is an Active and Connected transformation. It is used to filter out rows in a mapping that do not meet the condition.
30. What is the Joiner transformation?
Joiner transformation is an Active and Connected transformation. It can be used to join two sources coming from two different locations or from the same location.
31. Why do we use Lookup transformations?
Lookup Transformations can access data from relational tables that are not sources in mapping.
32. What is the Normalizer transformation?
Normalizer Transformation is an Active and Connected transformation. It is used mainly with COBOL sources where most of the time data is stored in denormalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data.
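Creating multiple rows from a single denormalized row can be sketched like this. The normalize helper and the quarterly sales record are hypothetical examples.

```python
def normalize(record, repeating_fields, output_name):
    """Emit one output row per repeating field of a denormalized record."""
    fixed = {k: v for k, v in record.items() if k not in repeating_fields}
    return [
        {**fixed, "occurrence": index, output_name: record[field]}
        for index, field in enumerate(repeating_fields, start=1)
    ]

# One denormalized row holding four quarterly values.
store_record = {"store": "S1", "q1_sales": 100, "q2_sales": 120,
                "q3_sales": 90, "q4_sales": 150}
normalized = normalize(
    store_record, ["q1_sales", "q2_sales", "q3_sales", "q4_sales"], "sales"
)
for row in normalized:
    print(row)
```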
33. What is the Rank transformation?
Rank transformation is an Active and Connected transformation. It is used to select the top or bottom rank of data.
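Selecting the top or bottom ranked rows can be sketched with Python's heapq module. The helper names top_n and bottom_n and the employee data are hypothetical.

```python
import heapq

def top_n(rows, n, key):
    """Return the n rows with the largest values for key."""
    return heapq.nlargest(n, rows, key=lambda row: row[key])

def bottom_n(rows, n, key):
    """Return the n rows with the smallest values for key."""
    return heapq.nsmallest(n, rows, key=lambda row: row[key])

employees = [
    {"name": "Ann", "salary": 90},
    {"name": "Raj", "salary": 120},
    {"name": "Lee", "salary": 70},
    {"name": "Mia", "salary": 105},
]
highest_paid = top_n(employees, 2, "salary")
lowest_paid = bottom_n(employees, 1, "salary")
print(highest_paid, lowest_paid)
```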
34. What is the Router transformation?
Router transformation is an Active and Connected transformation. It is similar to the Filter transformation. The only difference is that the Filter transformation drops the data that does not meet the condition, whereas the Router has an option to capture the data that does not meet the condition. It is useful for testing multiple conditions.
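A rough Python sketch of the routing idea follows. It assumes that a row is sent to every group whose condition it meets and that rows meeting no condition go to a default group; the group names and order data are hypothetical.

```python
def route(rows, groups):
    """Send each row to every matching group; unmatched rows go to DEFAULT."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            # Unlike a filter, the rows that fail every condition are kept.
            routed["DEFAULT"].append(row)
    return routed

order_rows = [
    {"id": 1, "amount": 5},
    {"id": 2, "amount": 50},
    {"id": 3, "amount": 500},
    {"id": 4, "amount": -1},
]
routed = route(order_rows, {
    "SMALL": lambda row: 0 <= row["amount"] < 10,
    "LARGE": lambda row: row["amount"] >= 100,
})
print({name: [row["id"] for row in rows] for name, rows in routed.items()})
```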
35. What is the Sorter transformation?
Sorter transformation is a Connected and an Active transformation. It sorts data in either ascending or descending order according to a specified field.
36. Name four output files that the Informatica server creates while a session is running.
37. What is the difference between lookup cache and lookup index?
The lookup cache contains an index cache and a data cache.
Index cache: Contains the columns used in the lookup condition.
38. What is the difference between static cache and dynamic cache?
A dynamic cache decreases performance in comparison to a static cache.
A static cache does not check for such things; it simply inserts data as many times as it arrives.
39. Define mapping and session.
Mapping: It is a set of source and target definitions linked by transformation objects that define the rules for transformation.
Session: It is a set of instructions that describe how and when to move data from sources to targets.
40. Which command is used to run a batch?
pmcmd is used to start a batch.
41. What is Data Driven?
The Informatica server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject.
42. What is the PowerCenter repository?
The PowerCenter repository allows you to share metadata across repositories to create a data mart domain.
43. What is a parameter file?
A parameter file is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file:
44. What are the types of lookup caches?
45. What is the Stored Procedure transformation?
Stored Procedure transformation is a Passive transformation that can be Connected or Unconnected. It is useful for automating time-consuming tasks, and it is also used for error handling, for dropping and recreating indexes, for determining the space in a database, and for specialized calculations.
46. What is a fact table?
The central table in a star schema is called the fact table. Fact tables are of three types:
47. What transformations are used for variable port?
48. What is the Data Transformation Manager (DTM)?
After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run.
49. How can you define a transformation?
A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions.
50. What is the Lookup transformation?
Lookup transformation is Passive and can be either Connected or Unconnected. It is used to look up data in a relational table, view, or synonym. The lookup definition can be imported from either source or target tables.
51. What is the Source Qualifier transformation?
Source Qualifier transformation is an Active and Connected transformation. When you add a relational or flat file source definition to a mapping, you must connect it to a Source Qualifier transformation. The Source Qualifier performs tasks such as overriding the default SQL query, filtering records, and joining data from two or more tables.
52. What is the difference between a mapplet and a reusable transformation?
A mapplet consists of a set of transformations that is reusable.
A reusable transformation is a single transformation that can be reused.
53. What is the Update Strategy transformation?
Update Strategy transformation is an Active and Connected transformation. It is used to update data in the target table, either to maintain a history of the data or to keep only recent changes. You can specify how to treat source rows: insert, update, delete, or data driven.
54. How many types of dimensions are available in Informatica?
There are three types of dimensions.
55. What is the difference between a mapplet and a reusable transformation?
Mapplet: A set of one or more transformations that is reusable.
Reusable transformation: A single transformation that is reusable.
56. What are different types of parsing?
57. What are lookup and fact tables?
A lookup (dimension) table contains information about the entities. In general, the dimension and detail objects are derived from lookup tables. A fact table contains the statistical information about transactions.
58. What is Designer?
Designer is the Business Objects product that is intended for developing universes. A universe is the semantic layer of the database structure that isolates users from technical issues.
59. What is a surrogate key?
Surrogate keys are keys that are maintained within the data warehouse instead of keys taken from source data systems.
60. What are the pitfalls of DWH?
Limited value of data (Historical data not current data)
DW solutions complicate business processes
DW solutions may have too long a learning curve
Costs of cleaning, capturing and delivering data
61. How do you handle large datasets?
By using bulk utility mode at the session level and, if possible, by disabling constraints after consulting with the DBA. Using bulk utility mode means that nothing is written to the rollback segment, so loading is faster. However, the pitfall is that recovery is not possible.
62. What are the limitations of handling long datatypes?
When the length of a datatype (e.g. varchar2(4000)) goes beyond 4000, Informatica treats it as varchar2(2000).
63. What are the types of OLAP?
ROLAP (Relational OLAP) - Users see their data organized in cubes and dimensions, but the data is really stored in an RDBMS. It is a storage mode that uses tables in a relational database to store multidimensional structures. Performance is slow.
MOLAP (Multidimensional OLAP) - Users see their data organized in cubes and dimensions, but the data is really stored in an MDBMS. Query performance is fast.
HOLAP (Hybrid OLAP) - It is a combination of ROLAP and MOLAP, e.g. HOLOs. It supports queries on aggregated data as well as detailed data.
64. What is the difference between data mart and data warehouse?
A data mart is used at the business division or department level, whereas a data warehouse is used at the enterprise level.
65. What is metadata?
Metadata is data about data. It contains the location and description of data warehouse system components such as names, definitions, and end-user views.
66. How does recovery mode work in Informatica?
In case of load failure, an entry is made in the OPB_SERV_ENTRY(?) table, from which the extent of loading can be determined.
67. What is Aggregate Awareness?
Aggregate awareness is a feature of DESIGNER that makes use of aggregate tables in a database. These are tables that contain pre-calculated data. The purpose of these tables is to enhance the performance of SQL transactions; they are thus used to speed up the execution of queries.
68. What is the difference between OLTP and OLAP?
OLTP:
It focuses on day-to-day transactions.
Access frequency: high.
OLAP:
It focuses on future predictions and decisions.
Data is static until refreshed.
Denormalized and replicated data.
Access frequency: medium to low.
69. When should you use a star schema and when a snowflake schema?
A star schema is the simplest data warehouse schema. A snowflake schema is similar to the star schema, but it normalizes the dimension tables to save storage space. It can be used to represent hierarchies of information.
70. What parameters can be tweaked to get better performance from a session?
DTM shared memory, index cache memory, and data cache memory can be tuned. Indexing, using a persistent cache, and increasing the commit interval also help.
71. What are the benefits of DWH?
Immediate information delivery
Data integration from across, and even outside, the organization
Future vision of historical trends
Tools for looking at data in new ways
Enhanced customer service.
72. Is it possible to invoke an Informatica batch or session outside the Informatica UI?
73. Why do we use surrogate keys?
Data tables in various source systems may use different keys for the same entity.
Keys may change or be reused in the source data systems.
Changes in organizational structures may move keys in the hierarchy.
74. Is it more convenient to join in the database or in Informatica?
Definitely at the database level, in the Source Qualifier query itself, rather than using a Joiner transformation.
75. How do you measure session performance?
By selecting the Collect Performance Data check box.
76. What is a dimension table?
It contains data used to reference data stored in the fact table.
Primarily character data
One primary key (dimensional key)
77. What is a database connection?
A connection is a set of parameters that provides access to an RDBMS. These parameters include system information such as the data account, user identification, and the path to the database. Designer provides three types of connections: secured, shared, and personal.
78. What are the parts of Informatica Server?
Data Transformation manager
79. What is the functionality of the Repository Manager?
The Repository Manager is used to navigate through multiple folders and perform the basic repository tasks.
80. What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. It is a file created with a text editor such as WordPad or Notepad.
81. What is the difference between Aggregator and Expression?
The Aggregator transformation is used to perform aggregate calculations on a group of records, whereas the Expression transformation is used to perform calculations on a single record.
82. What is the default source option for update strategy transformation?
83. What is the difference between a sequential batch and a concurrent batch?
In a sequential batch, one session ends before the next begins, whereas in a concurrent batch the sessions run simultaneously, depending on CPU availability.
84. What are the types of tracing level?
85. What is the Update Strategy transformation?
This transformation is used to maintain history data, or just the most recent changes, in the target table.
86. What is the PowerCenter repository?
The PowerCenter repository allows you to share metadata across repositories to create a data mart domain. In a data mart domain, you can create a single global repository to store metadata used across an enterprise, and a number of local repositories to share the global metadata as needed.
87. What are designer objects?
88. Can you generate reports in Informatica?
Yes. By using the Metadata Reporter, we can generate reports in Informatica.
89. What is a reusable transformation?
A reusable transformation is developed in the Transformation Developer. It can be invoked from more than one mapping or mapplet.
90. What are the different types of facts?
There are three types of facts:
Additive : Additive facts are facts that can be summed up through all of the dimensions in the fact table.
Semi–Additive : Semi–additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
Non–Additive: Non–additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
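The difference between additive and semi-additive facts can be seen in a small Python sketch. The account snapshot data is made up: deposits is treated as an additive fact and balance as a semi-additive one.

```python
# Hypothetical daily account snapshots.
snapshots = [
    {"day": "Mon", "account": "A", "deposits": 100, "balance": 500},
    {"day": "Mon", "account": "B", "deposits": 50,  "balance": 200},
    {"day": "Tue", "account": "A", "deposits": 20,  "balance": 520},
    {"day": "Tue", "account": "B", "deposits": 0,   "balance": 200},
]

# Additive: deposits can be summed across every dimension (day and account).
total_deposits = sum(row["deposits"] for row in snapshots)

# Semi-additive: balances can be summed across accounts for a single day...
tuesday_balance = sum(row["balance"] for row in snapshots if row["day"] == "Tue")

# ...but summing balances across days counts the same money more than once.
misleading_total = sum(row["balance"] for row in snapshots)

print(total_deposits, tuesday_balance, misleading_total)
```

A non-additive fact, such as a ratio or percentage, would have to be recomputed from its components rather than summed along any dimension.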
91. What are the types of designer tools?
The Designer provides the following tools:
92. What is a subquery?
A subquery is a query whose return values are used in the filtering conditions of the main query.
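A minimal example using Python's built-in sqlite3 module: the inner query computes the average salary, and the outer query uses that value in its WHERE clause. The employees table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, salary REAL);
    INSERT INTO employees VALUES ('Ann', 90), ('Raj', 120), ('Lee', 70), ('Mia', 100);
""")

# The subquery in parentheses returns one value that filters the main query.
above_average = conn.execute("""
    SELECT name
    FROM employees
    WHERE salary > (SELECT AVG(salary) FROM employees)
    ORDER BY name
""").fetchall()
print(above_average)
```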
93. What is tracing level?
Tracing level represents the amount of information that the Informatica server writes to a log file.
94. Why do you use repository connectivity?
Each time you edit or schedule a session, the Informatica server communicates directly with the repository to check whether the session and users are valid. All the metadata of sessions and mappings is stored in the repository.
95. What are the target options in the session for the Update Strategy transformation?
· Update as update
· Update as insert
· Update else insert
· Truncate table
96. What is the difference between ER modeling and dimensional modeling?
ER modeling is used for normalizing the OLTP database design.
Dimensional modeling is used for de-normalizing the ROLAP / MOLAP design.
97. What is the advantage of a persistent lookup cache?
When the lookup cache is saved by the Lookup transformation, it is called a persistent cache. The first time the session runs, the cache is saved to disk, and it is reused in subsequent runs of the session. It is used when the lookup table is static, i.e., does not change frequently.
98. What are all the types of dimensions?
99. Which transformation do you need when using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.
100. In how many ways can you create ports?
Drag the port from another transformation.
Click the Add button on the Ports tab.
101. What are different types of transformations available in Informatica?
There are various types of transformations available in Informatica :
Application Source Qualifier
XML Source Qualifier
102. How can you improve session performance in the Aggregator transformation?
Use sorted input.
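One way to see why sorted input helps: when rows arrive already ordered by the group key, each group can be totaled and emitted as soon as the key changes, without holding every group in memory. The Python sketch below illustrates that general idea with itertools.groupby; it is an assumption about the underlying reasoning, not a description of Informatica internals, and the data is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_sorted(rows, group_key, value_key):
    """Yield (group, total) pairs; assumes rows are already sorted by group_key."""
    for group, members in groupby(rows, key=itemgetter(group_key)):
        # Only the rows of the current group are consumed before emitting a result.
        yield group, sum(row[value_key] for row in members)

sorted_sales = [
    {"region": "East", "amount": 10.0},
    {"region": "East", "amount": 20.0},
    {"region": "West", "amount": 4.0},
]
totals = list(aggregate_sorted(sorted_sales, "region", "amount"))
print(totals)
```

If the input were not sorted, groupby would emit the same region more than once, which is why the sketch depends on the sort order.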
103. Why do we use the Stored Procedure transformation?
A Stored Procedure transformation is an important tool for populating and maintaining databases.