Tuesday, July 19, 2016

SSIS Interview Question and Answers Part 2




1. Logging. Different types of Logging files
Logging is used to log the information during the execution of package.
A log provider can be a text file, the SQL Server Profiler, a SQL Server relational database, a Windows event log, or an XML file. If necessary, you can define a custom log provider (e.g., a proprietary file format).
2. How to debug a package
For debugging a package, we have 3 options:
-Setting breakpoints in a package, container or Task
-Using Data Viewer in the Data flow task
-Setting Error redirection in the Data Flow task

3. What is the LoggingMode property?
SSIS packages, tasks and containers have a property called LoggingMode. This property accepts 3 possible values:
Enabled - to enable logging of that component
Disabled - to disable logging of that component
UseParentSetting - to use parent's setting of that component to decide whether or not to log the data.

4. How to configure Error Output in SSIS
We have 3 options to configure error output in ssis.
1. Ignore Failure
2. Redirect Row
3. Fail Component
For Example:
I have comma-separated value in a flat file with two columns (code, Name). Code is an integer value and name is a varchar(20) data type configured in the flat file connection manager. Some of the codes  in the flat files are characters. So, flat file reader component will fail reading the character value. But, I want to redirect the error data to separate table.

5. What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data(wrong type of data ) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.

6. Incremental Load in SSIS
Using Slowly Changing Dimension
Using Lookup and Cache Transformation

7. Difference between Synchronous and Asynchronous Transformation
Synchronous T/F process the input rows and passes them onto the data flow one row at a time.
When the output buffer of Transformation created a new buffer, then it is Asynchronous transformation. Output buffer or output rows are not sync with input buffer.

8. What are Row Transformations, Partially Blocking Transformation, Fully Blocking Transformation with examples.
In Row Transformation, each value is manipulated individually. In this transformation, the buffers can be re-used for other purposes like following:
OLEDB Datasource,         OLEDB Data Destinations
Other Row transformation within the package,     Other partially blocking transformations within the package.
examples of Row Transformations: Copy Column, Audit, Character Map
Partially Blocking Transformation:
These can re-use the buffer space allocated for available Row transformation and get new buffer space allocated exclusively for Transformation.
examples: Merge, Conditional Split, Multicast, Lookup, Import, Export Column
Fully Blocking Transformation:
It will make use of their own reserve buffer and will not share buffer space from other transformation or connection manager.
examples: Sort, Aggregate, Cache Transformation

9. Difference between FileSystem and Sql server Deployment
File System Deployment: We can save the package on a physical location on hard drive or any shared folder with this option, and we should provide a full qualified path to stored package in the FileSystem option.
Sql Server Deployment: SSIS packages will be stored in the sysssispackages table of MSDB Database.

10. Difference between Lookup and Fuzzy Lookup transformation
Lookup Transformation finds the exact match.
Fuzzy Lookup transformation matches input table with reference table. It finds the closest match and indicates the quality of the match.

Thursday, July 14, 2016

SSIS Interview Question and Answers Part 1

1.In BIDS, the SSIS project contain 10 packages. But How to deploy only 5 packages in Destination machine even though Manifest file contains all 10 packages after the Build?
-Open the Manifest file in any editor like BIDS or Notepad, keep the required 5 packages, remove remaining 5 packages.
-Save and Close the Manifest file.
-Double click on Manifest file to deploy the required 5 package.
2.What are the different types of Transformations you have worked
AGGEGATE -The Aggregate transformation applies aggregate functions to column values and copies the results to the transformation output. Besides aggregate functions, the transformation provides the GROUP BY clause, which you can use to specify groups to aggregate across.
The Aggregate Transformation supports following operations:
Group By, Sum, Average, Count, Count Distinct, Minimum, Maximum
AUDIT - Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
CHARACTER MAP - When it comes to string formatting in SSIS, Character Map transformation is very useful, used to convert data lower case, upper case.
CONDITIONAL SPLIT – used to split the input source data based on condition.
COPY COLUMN - Add a copy of column to the output, we can later transform the copy keeping the original for auditing.
DATA CONVERSION - Converts columns data types from one to another type. It stands for Explicit Column Conversion.
DATA MINING QUERY – Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
DERIVED COLUMN - Create a new (computed) column from given expressions.
EXPORT COLUMN – Used to export a Image specific column from the database to a flat file.
FUZZY GROUPING – Groups the rows in the dataset that contain similar values.
FUZZY LOOKUP - Used for Pattern Matching and Ranking based on fuzzy logic.
IMPORT COLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a data source. It is used to find exact matches only.
MERGE - Merges two sorted data sets of same column structure into a single output.
MERGE JOIN - Merges two sorted data sets into a single dataset using a join.
MULTI CAST - is used to create/distribute exact copies of the source dataset to one or more destination datasets.
ROW COUNT - Stores the resulting row count from the data flow / transformation into a variable.
ROW SAMPLING - Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.
UNION ALL - Merge multiple data sets into a single dataset.
PIVOT – Used for Normalization of data sources to reduce analomolies by converting rows into columns
UNPIVOT – Used for demoralizing the data structure by converts columns into rows incase of building Data Warehouses.
3. What are the different types of Transaction Options
Required: If a transaction already exists at the upper level, the current executable will join the transaction. If No transaction at the upper level, a new transaction is created automatically.
Supported:In any executable, if there is a transaction at upper level, the executable join the transaction else do not create a new transaction.
Not Supported:The executable of the package do not honour any transaction ie do not join other transaction nor creates new transaction.
4. Explain about Checkpoints with properties
Checkpoint is used to restart the package execution from the point of failure rather than from initial start.
Set the following Properties:
   CheckpointFileName: Specifies the name of the checkpoint file.
   CheckpointUsage: Never, IfExists, Always
   SaveCheckpoints: indicates whether the package needs to save checkpoints.  This property must be set to True to restart a package from a point of failure.

FailPackageOnFailure: property needs to be set to True for enabling the task in the checkpoint.
Checkpoint mechanism uses a Text File to mark the point of package failure.
These checkpoint files are automatically created at a given location upon the package failure and automatically deleted once the package ends up with success.

5. Package configuration? Different types of Configuration Files
The package can be transferred across various environments like development and unit testing, system testing, UAT and production. Most packages will have environment specific variables like connection string to a database or path to a flat file, or user defined variables etc. that would be impacted while moving the package across environments as part of deployment process. Hence, it is mandatory to change these environment dependent variables when the package is transferred across environments. Package configurations help in managing such changes without actually opening and editing the SSIS package in Business Intelligence Development Studio (BIDS). After deploying the package to a different machine (using SQL Server or file system deployment mode) it is mandatory to copy the related package configuration files on to that machine. If the package is scheduled to run through a SQL Agent job, the configuration file should be added while creating the job so that package will read the information from the configuration file. While executing the job, SQL Agent will take the design time values for connection strings if the package configuration file is not supplied.
There are 5 configuration types available with package configurations. 

Sunday, July 3, 2016

Important Interview Questions in Sql Server.



1.Ho to print 1-100 numbers without using loop?


SOL:
  with cte
  as
  (
  select 1 as count
    union all
 select count + 1 from cte where count <100
 )
select * from cte



Monday, May 30, 2016

How to get unmatched rows from two tables

Let Create  two tables

create table t1 (Id int, name varchar(50))
create table t2 (Id int, name varchar(50))

insert t1 values (1,       'avi')
insert t1 values (2,      'ravi')
insert t1 values (3,       'sun')
insert t1 values (5,       'naveen')
insert t1 values (7,       'rachin')

insert t2 values (1,       'avi')
insert t2 values (2,       'ravi')
insert t2 values (4,       'naveen')
insert t2 values (6,       'raja')


Query to find unmatched rows 

Option 1:
Select Id, Name
from T1
where Id not in (select Id from T2)
UNION
Select Id, Name
from t2

where Id not in (select Id from t1)

Option 2:

select isnull(t1.id, t2.id), isnull(t1.name,t2.name)  from t1
full outer join t2 on t1.id = t2.id
where t2.Id is null or t1.id is null



Monday, May 16, 2016

How to calculate running total

How to calculate running total ?

create table store(Id int primary key,productname varchar(20), sales int)

insert into store values (101,'grapes', 4000),
(102,'apples', 10000),
(103,'carrot', 14000),
(104,'watermelon', 9000),
(105,'mangoes', 12000)

select * from  store

Below is the query to find the running total sales 


 select ID,  productname,sales
 ,(select sum(sales) from store 
 where ID <= S.ID)
 'Running Total'
from store S

How to find a Leap year by using User Defined functions

Leap year by using User Defined functions



Function:


CREATE  FUNCTION [dbo].[LeapYear] ( @Date    DATETIME )
RETURNS VARCHAR(100)
AS
BEGIN

    IF (YEAR( @Date ) % 4 = 0 AND YEAR( @Date ) % 100 != 0) OR
        YEAR( @Date ) % 400 = 0

        RETURN 'Leap Year'

    RETURN 'Non-Leap Year'

END
GO

To execute the above function 

select [dbo].[LeapYear] ('2016')


Once you practice delete all these stuff 


 drop function [dbo].[LeapYear]


Monday, April 4, 2016

Stored Procedure in Sql Server


what is a stored procedure ?
 A stored procedure is group of T-SQL statements. whenever we want write  a same query over and over again,we can save the specific query as  a stored procedure and call just by its name.

Lets take a simple  example 

we have table named employee .we want to get name and department of employee every time .

for that we need to write a query and run every time

query :select name, department from employee 

Instead of  writing the query every time, we can wrap up this query inside stored procedure and call that .
To create a stored procedure

create procedure getemployeedtls
as 
begin 

select name, department from employee
end 



and execute the stored procedure by using its name  getemployeedtls or else 

 execute getemployeedtls



we can execute the stored procedure whenever we want the details of employee table no need to run the query.

Actually stored procedure has many uses

In reality the stored procedure will be long ,there are queries 2000 to 3000 lines.it is very difficult to run query every time . by using the stored procedure we can speed up the process.
  It will reduce the network traffic .
security also main aspect of stored procedure.

Stored procedure with parameters


create procedure employeeiddept
@department varchar(20)
as 
begin 
select name ,salary,department from employee where department=@department
end 

to execute the above procedure we should pass parameter

 employeeiddept  'finance'


we get all the names and salary of employee table belongs to finance department.

We can alter the stored procedure by using the alter command

For example we want the result order by salary

Alter  procedure employeeiddept
@department varchar(20)
as 
begin 
select name ,salary,department from employee where department=@department order by salary
end 


Thursday, March 31, 2016

Performance Tuning of SQL SERVER

Performance Tuning of SQL SERVER

Best performance is the main concern to develop a successful application. Like a coin database is the tail side (back-end) of an application. A good database design provides best performance during data manipulation which results into the best performance of an application.
During database designing and data manipulation we should consider the following key points:
1.          Choose Appropriate Data Type
Choose appropriate SQL Data Type to store your data since it also helps in to improve the query performance. Example: To store strings use varchar in place of text data type since varchar performs better than text. Use text data type, whenever you required storing of large text data (more than 8000 characters). Up to 8000 characters data you can store in varchar.
2.          Avoid nchar and nvarchar
Practice to avoid nchar and nvarchar data type since both the data types takes just double memory as char and varchar. Use nchar and nvarchar when you required to store Unicode (16-bit characters) data like as Hindi, Chinese characters etc.
3.          Avoid NULL in fixed-length field
Practice to avoid the insertion of NULL values in the fixed-length (char) field. Since, NULL takes the same space as desired input value for that field. In case of requirement of NULL, use variable-length (varchar) field that takes less space for NULL.
4.          Avoid * in SELECT statement
Practice to avoid * in Select statement since SQL Server converts the * to columns name before query execution. One more thing, instead of querying all columns by using * in select statement, give the name of columns which you required.
1.   -- Avoid
2.  SELECT * FROM tblName
3.  --Best practice
4.  SELECT col1,col2,col3 FROM tblName
5.          Use EXISTS instead of IN
Practice to use EXISTS to check existence instead of IN since EXISTS is faster than IN.
1.   -- Avoid
2.  SELECT Name, Price FROM tblProduct
3.  where ProductID IN (Select distinct ProductID from tblOrder)
4.  --Best practice
5.  SELECT Name,Price FROM tblProduct
6.  where ProductID EXISTS (Select distinct ProductID from tblOrder)
6.          Avoid Having Clause
Practice to avoid Having Clause since it acts as filter over selected rows. Having clause is required if you further wish to filter the result of an aggregations. Don't use HAVING clause for any other purpose.

7.          Create Clustered and Non-Clustered Indexes
Practice to create clustered and non-clustered index since indexes helps in to access data fastly. But be careful, more indexes on a tables will slow the INSERT, UPDATE, DELETE operations. Hence try to keep small no of indexes on a table.
8.          Keep clustered index small
Practice to keep clustered index as much as possible since the fields used in clustered index may also used in nonclustered index and data in the database is also stored in the order of clustered index. Hence a large clustered index on a table with a large number of rows increase the size significantly.
9.          Avoid Cursors
Practice to avoid cursor since cursor are very slow in performance. Always try to use SQL Server cursor alternative.  
10.      Use Table variable inplace of Temp table
Practice to use Table varible in place of Temp table since Temp table resides in the TempDb database. Hence use of Temp tables required interaction with TempDb database that is a little bit time taking task.
11.      Use UNION ALL inplace of UNION
Practice to use UNION ALL in place of UNION since it is faster than UNION as it doesn't sort the result set for distinguished values.
12.      Use Schema name before SQL objects name
Practice to use schema name before SQL object name followed by "." since it helps the SQL Server for finding that object in a specific schema. As a result performance is best.
1.   --Here dbo is schema name
2.  SELECT col1,col2 from dbo.tblName
3.  -- Avoid
4.  SELECT col1,col2 from tblName
13.      Keep Transaction small
Practice to keep transaction as small as possible since transaction lock the processing tables data during its life. Sometimes long transaction may results into deadlocks.
14.      SET NOCOUNT ON
Practice to set NOCOUNT ON since SQL Server returns number of rows effected by SELECT, INSERT, UPDATE and DELETE statement. We can stop this by setting NOCOUNT ON like as:
1.   CREATE PROCEDURE dbo.MyTestProc
2.  AS
3.  SET NOCOUNT ON
4.  BEGIN
5.  .
6.  .
7.  END
15.      Use TRY-Catch
Practice to use TRY-CATCH for handling errors in T-SQL statements. Sometimes an error in a running transaction may cause deadlock if you have no handle error by using TRY-CATCH.
  
16.      Use Stored Procedure for frequently used data and more complex queries
Practice to create stored procedure for quaery that is required to access data frequently. We also created stored procedure for resolving more complex task.
17.      Avoid prefix "sp_" with user defined stored procedure name

Practice to avoid prefix "sp_" with user defined stored procedure name since system defined stored procedure name starts with prefix "sp_". Hence SQL server first search the user defined procedure in the master database and after that in the current session database. This is time consuming and may give unexcepted result if system defined stored procedure have the same name as your defined procedure.

Tuesday, March 29, 2016

What is a subquery, Nested Subqueries&Correlated Subqueries?


What is a subquery?

A subquery is a SELECT statement with another SQL statement, like in the example below:

select *
from employee
where id in
           (select employee_id
            from employee
            where id = 15);

Subqueries are further classified as either a correlated sub query or a nested sub query

Nested Subqueries
Nested sub-queries are sub queries that don't rely on an outer query. In other words, both queries in a nested sub query may be run as separate queries.
This type of sub query could be used almost everywhere



Select *
from employee
where city in (select city
               from employer);

The example subquery returns all clients that are from the same city as the product providers.
The IN operator checks if the value is within the table and retrieves the matching rows.

Correlated Subqueries
Subqueries are correlated when the inner and outer queries are interdependent, that is, when the outer query is a query that contains a subquery and the subquery itself is an inner query. Users that know programming concepts may compare it to a nested loop structure.

Let's start with a simple example. The inner query calculates the average value and returns it. In the outer query’s where clause, we filter only those purchases which have a value greater than the inner query’s returned value.

select id
from employee E1
where joiningdate > '2016-01-15'
and salary > (select avg(salary)
             from employee E2
             where E1.date = E2.date)

Wednesday, March 16, 2016

What is the use of COALESCE in SQL Server

 Use of COALESCE in SQL Server:


Let us understand this by taking an example





create table candidate
( id int identity ,primaryemail varchar(20),secondaryemail varchar(20))

insert into candidate values ('avi@gmail.com',null)
insert into candidate values ('siva@gmail.com',null)
insert into candidate values ('ravii@gmail.com',null)
insert into candidate values ('bavi@gmail.com',null)
insert into candidate values (null,'chiull@gmail.com')
insert into candidate values (null,'boht@gmai.com')
insert into candidate values ('pandu@gmail.com','kaya@gmail.com')


QUERY:
select coalesce(primaryemail,secondaryemail) from candidate 




Here in this example, the Candidate table is shown to include three columns with information about a Candidate: 

1. id
2. PrimaryEmail
3. SecondaryEmail

COALESCE in the SELECT statement , selects the PrimaryEmail if it is not null. If the PrimaryEmail is null then SecondaryEmail will be selected. If both PrimaryEmail and SecondaryEmail is present then only PrimaryEmail is selected. So, COALESCE returns the first nonnull column among the list of columns passed. If both PrimaryEmail and SecondaryEmail is NULL, COALESCE returns NULL.


Differenece between isnull and coalesce ?


Coalesce is use for more than two columns where as Isnull works only for two column. Also you can use coalesce function in joining to get first not null value from more than two different table.



Reference:
http://venkatsqlinterview.blogspot.in/

Monday, March 14, 2016

How to find the Nth highest salary

To find the Nth highest salary

Step 1: Create a schema of a table named "Employee" in your database as in the following:

create table Employee
(
ID int identity,
Name varchar(20),
Salary money,
Department varchar(20)
)
 Step 2:Insert some values in the table according to the column behaviour like Name and Data Type.
Insert into Employee(Name,Salary,Department)values('Avi',20000,'Finance')
Insert into Employee(Name,Salary,Department)values('Bhar',10000,'Finance')
Insert into Employee(Name,Salary,Department)values('Chait',28000,'IT')
Insert into Employee(Name,Salary,Department)values('Dhar',15000,'Finance')
Insert into Employee(Name,Salary,Department)values('Einstein',39000,'Finance')
Insert into Employee(Name,Salary,Department)values('Fedarer',12000,'Finance')
Insert into Employee(Name,Salary,Department)values('Giri',40000,'IT')
Insert into Employee(Name,Salary,Department)values('Hari',32000,'Finance')
Insert into Employee(Name,Salary,Department)values('Irfa',56000,'IT')
Insert into Employee(Name,Salary,Department)values('Janaki',29000,'Finance'


Select *  from Employee




To find the 3 rd highest salary

Query : By using Top 
SELECT TOP 1 salary
FROM (
      SELECT DISTINCT TOP 3 salary
      FROM employee
      ORDER BY salary DESC
      ) a
ORDER BY salary


method 2 : By Using MIN 

SELECT MIN(Salary)
FROM employee
WHERE Salary IN(SELECT TOP 3 Salary FROM employee ORDER BY Salary DESC) 



Method 3 BY using Dense rank                                                                                                           

SELECT Salary,Name
FROM
(
SELECT Salary,Name,DENSE_RANK() OVER(ORDER BY Salary DESC) Rno from EMPLOYEE
) tbl
WHERE Rno=3
By using above examples you can find any nth highest salary .You can get any highest value by  replace 3 with your value .