Google
Information Storage and Retrieval: SQL, PL/SQL and DataWarehousing Interview Questions

Pages

Friday, January 22, 2010

SQL, PL/SQL and DataWarehousing Interview Questions

Q1. What are collections? What are the advantages of using Collections?

PL/SQL collection types let you declare high-level datatypes similar to arrays, sets, and hash tables found in other languages. In PL/SQL, array types are known as varrays (short for variable-size arrays), set types are known as nested tables, and hash table types are known as associative arrays.Each kind of collection is an ordered group of elements, all of the same type. Each element has a unique subscript that determines its position in the collection. When declaring collections, you use a TYPE definition. The adantage of collections is that Collections can be passed as parameters, so that subprograms can process arbitrary numbers of elements.You can use collections to move data into  and out of database tables using high-performance language features known as bulk SQL.

Q2. Tell some new features of Oracle 10g from a developer's point of view.


1. UNIX-style regular expressions  can be used  while performing queries and string manipulations. You use the REGEXP_LIKE operator in SQL queries, and the REGEXP_INSTR, REGEXP_REPLACE, and REGEXP_SUBSTR functions anywhere you would use INSTR, REPLACE, and SUBSTR.

2. Flashbach query functions introduced. The functions SCN_TO_TIMESTAMP and TIMESTAMP_TO_SCN let you translate between a date and time, and the system change number that represents the database state at a point in time.

3. Nested tables defined in PL/SQL have many more operations than previously. You can compare nested tables for equality, test whether an element is a member of a nested table, test whether one nested table is a subset of another, perform set operations such as union and intersection, and much more.

4. New datatypes BINARY_FLOAT and BINARY_DOUBLE represent floating-point numbers in IEEE 754 format. These types are useful for scientific computation where you exchange data with other programs and languages that use the IEEE 754 standard for floating-point.

5. Staring with Oracle 10g release 1, the BINARY_INTEGER datatype was changed to be identical to PLS_INTEGER so the datatypes can be used interchangeably.

Q3. What is high water mark?

Each table's data is stored in its own data segment. The high water mark is the boundary between used and unused space in a segment. When a table is deleted, the high water mark remains unchanged but when a table is truncated, the high water mark drops.

Q4. What are mutating tables?

Mutating means changing.A mutating table is a table that is currently being modified by an update, delete or insert statement. When a trigger tries to reference a table that is in state of flux(being changed), it is considered mutating and raises an error since oracle should never return inconsistenet data.


Q5 How does PL/SQL offers performance benefits over SQL.

Without PL/SQL, Oracle must process SQL statements one at a time. Programs that
issue many SQL statements require multiple calls to the database, resulting in significant network and performance overhead. With PL/SQL, an entire block of statements can be sent to Oracle at one time. This can drastically reduce network traffic between the database and an application.You can use PL/SQL blocks and subprograms to group SQL statements before sending them to the database for execution

Q6 Define %TYPE and %ROWTYPE


Ans The %TYPE attribute provides the datatype of a variable or database column. This is particularly useful when declaring variables that will hold database values. For example, assume there is a column named last_name in a table named employeesTo declare a variable named v_last_name that has the same datatype as column title, use dot notation and the %TYPE attribute, as follows:

v_last_name employees.last_name%TYPE;

Declaring v_last_name with %TYPE has two advantages. First, you need not know the exact datatype of last_name. Second, if you change the database definition of last_name, perhaps to make it a longer character string, the datatype of v_last_name changes accordingly at run time.

%ROWTYPE

In PL/SQL, records are used to group data. A record consists of a number of related fields in which data values can be stored. The %ROWTYPE attribute provides a record type that represents a row in a table. The record can store an entire row of data selected from the table or fetched from a cursor or cursor variable.

Q7. Define NOT NULL constraint

Ans:  NOT NULL constraint puts a constraint on the value of a varaible. It can't be NULL. If NULL is tried to be inserted into such a variable, PL/SQL raises the predefined exception VALUE_ERROR.

Q8 Whats the difference between RANK() and DENSE_RANK() functions?
Ans RANK() will leave a gap in the ranking sequence when there are ties e.g rather than 1,2,2,3 RANK() will return 1,2,2,4.

Q9 Whats the difference between CASE and DECODE?

DECODE works only on equality operator. CASE can handle inquality operators as well  e.g <,>,<=,>=,!=


Q10 Take us through the life-cycle of an SQL query when fired against the database.


These are the following steps during the execution of the SQL query:

  • A cursor is established to hold the context of the SQL statement. A cursor is a connection to a specific area in the Program Global Area (PGA) that contains information about the specific SQL statement.
  • The SQL statement is now parsed. Parsing is the process of checking an SQL statement to ensure that it is valid(syntax checking), as well as creating an optimal execution plan for the statement.
  • Preparation for the return of result set is done.
  • Bind variables are supplied with run time values.
  • The final statement is now executed using the execution plan that has been prepared.
  • Oracle returns the result set.

Q11 What is an Oracle Instance?


An oracle instance is a software service that act as an intermediary between application requests and Oracle database. It is a set of some background processes and  shared memory. An Oracle instance is either started as part of the process of booting a server or can be started explicitly with commands. There are 3 steps in the start-up process:

  • Starting the instance process itself
  • Mounting the database, which consist of opening the control files for the instance
  • Opening the database which makes the database available for user requests.
An instance can mount and open only a single database ever. However, a database can ne mounted and opened by one or more instances(using Real Application Clusters)



(I like to imagine instance as a 'database armed and ready for action)


Q12 Define SGA and PGA


These are memory areas used by an Individual oracle instance.


SGA:
System Global Area is an area of memory that is accessible to all user processes of an Oracle instance. 3 main areas use by SGA are :

  • Redo log buffer which holds information used for recovery until the information can be written to the redo log
  • The shared pool which holds information that can be shared across user processes such as execution plans for SQL statements, compiled stored procedures and information retrieved from the Oracle data dictionary.
  • The database buffer pools which are used to hold data blocks.
PGA:
Program Global Area is an area of memory that is just available to a single server process. PGA contains items like user varaibles and cursor information for an individual user's SQL statement, such as number of rows that have been retreived so far. The PGA is also used for sorting data for an individual user process.

Q13 What is hard and soft parsing?

During parsing, after validating an SQL statement, the Oracle database computes a hash alogorithm to check whether a version of statement exists in the shared pool of SGA. If Oracle can not find the statement in shared pool, the instance prepares the execution plan for it. This is called hard parsing. If Oracle finds the statement in the shared pool, it simply retrieves the execution plan for the statment. This is called soft parsing. Performance is improved by increasing the chances for soft parsing

Q14 What are bind variables and how they improve performance of a SQL code?

A bind variable is a value in an SQL statement that is not known until the statement is run. In an SQL statement , the bind variables act as a placeholder for this subsequent data. In an Oracle SQL syntax, a bind variable is indicated by a colon (:) before the variable name.

e.g Select ename from emp where emp_id = :n_empid

The use of bind variables helps Oracle to do soft parsing of the SQL statements which enhances performance. 

Consider following 2 statements :

Select ename from emp where emp_id = 7

Select ename from emp where emp_id = 5

Both these statements should use identical optimizer plans. If Oracle finds the execution plan for first statement, it should use the same for the second. However it will not since, the hash algorithm generates the execution plans based on characters in the SQL statement. Using bind variables, this problem can be resolved.

Select ename from emp where emp_id = :n_empid

Using this syntax, the execution plan can be retrieved for n number of such statements.

Q15 What is the difference between 'DELETE' and 'TRUNCATE' command?

DELETE is used to delete the rows from a table. However it retains the rows to be deleted in redo log buffers and if ROLLBACK is issued, the rows restored back into the table. Only when COMMIT is executed after DELETE, the rows are permanently deleted.

TRUNCATE requires no COMMIT command to delete the rows. As soon as TRUNCATE is fired, rows are permnently deleted. Thats why its also called as DDL statement. TRUNCATE causes the movement of High Water Mark.

TRUNCATE is faster than DELETE since TRUNCATE do not requires to hold the rows in redo log buffers.


Q16 What is a surrogate key? How it is different from a natural key? What are the advantages of a surrogate key?


surrogate key (also known as artificial or identity key) is a system generated key with no business value. A surrogate key is a substitution for the natural primary key. 
It is just a unique identifier or number for each row that can be used for the primary key to the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. Data warehouses typically use a surrogate key for the dimension tables primary keys.

A natural key is a value that has a business meaning and people use them e.g SSN, Customer Id 

Advantages of Surrogate Keys:
  • Natural primary key (i.e. Customer Number in Customer table) can change and this makes updates more difficult.And in many cases natural primary key is a combination of more than 1 columns. In such cases, the usage of surrogate key simplifies the design.
  • Surrogate keys are numeric and hence indexing on them is faster.
  • Proper tracking the slowly changing dimension. e.g if an employee E1 belongs to business unit BU1 on 1st June 2009 but migrates to Business Unit BU2 on 1st November 2009. If natural key Employee_Id E1 is used, every other attribute will now be moved to BU2. In this case, if surrogate key is used, a new record can be created for E1 to show new data that belongs to BU2.
Q17 What are external tables?

External Tables are a way to query data in a flatfile as if the file were an Oracle table. Thus, it provides a convenient way to move data into/out of database. No DML operations(other than CREATE TABLE) are permitted on External tables. Hence, we can not create indexes on External Tables.

Internally External table uses ORACLE_LOADER access driver to move data from flat file into database. It uses DATA PUMP access driver to move data out of the db into a file. 

No comments: