Category: Advanced SQL

  • Database Tuning

    SQL Database Tuning

    Database Tuning in SQL is a set of activities performed to optimize a database and prevents it from becoming a bottleneck.

    There are various techniques with which you can configure the optimal performance of a particular database. Database tuning overlaps with query tuning; so, good indexing and avoiding improper queries help in increasing the database efficiency. In addition, increasing storage, updating to latest database versions and investing in a more powerful CPU (if needed) are also some of the general techniques.

    Database Tuning Techniques

    We can implement the following techniques to optimize the performance of a database −

    Database Normalization

    Normalization is the process of removing of duplicate data from a database. We can normalize a database by breaking down larger tables into smaller related tables. This increases the performance of database as it requires less time to retrieve data from small tables instead of one large table.

    Proper Indexes

    In SQL, indexes are the pointers (memory address) to the location of specific data in database. We use indexes in our database to reduce query time, as the database engine can jump to the location of a specific record using its index instead of scanning the entire database.

    Avoid Improper Queries

    Choosing the correct query to retrieve data efficiently also improves the performance of a database. For example, choosing to retrieve an entire table when we only need the data in a single column will unnecessarily increase query time. So, query the database wisely.

    Let us discuss some of the common improper queries made and how to rectify them to optimize the database performance.

    1. Use SELECT fields instead of SELECT (*)

    In large databases, we should always retrieve only the required columns from the database instead of retrieving all the columns, even when they are not needed. We can easily do this by specifying the column names in the SELECT statement instead of using the SELECT (*) statement.

    Example

    Assume we have created a table with name CUSTOMERS in MySQL database using CREATE TABLE statement as shown below −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Following query inserts values into this table using the INSERT statement −

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00),(5,'Hardik',27,'Bhopal',8500.00),(6,'Komal',22,'Hyderabad',4500.00),(7,'Muffy',24,'Indore',10000.00);

    Let us say we only want the data in ID, NAME and SALARY columns of the CUSTOMERS table. So, we should only specify those three columns in our SELECT statement as shown below −

    SELECT ID, NAME, SALARY FROM CUSTOMERS;

    Output

    The output obtained is as shown below −

    IDNAMESALARY
    1Ramesh2000.00
    2Khilan1500.00
    3Kaushik2000.00
    4Chaitali6500.00
    5Hardik8500.00
    6Komal4500.00
    7Muffy10000.00

    2. Use Wildcards

    Wildcards (%) are characters that we use to search for data based on patterns. These wildcards paired with indexes only improves performance because the database can quickly find the data that matches the pattern.

    Example

    If we want to retrieve the names of all the customers starting with K from the CUSTOMERS table, then, the following query will provide the quickest result −

    SELECT ID, NAME FROM CUSTOMERS WHERE NAME LIKE'K%';

    Output

    Following is the output of the above query −

    IDNAME
    2Khilan
    3Kaushik
    6Komal

    3. Use Explicit Join

    SQL JOINs are used to combine two tables based on a common column. There are two ways of creating a JOIN implicit join and explicit join. Explicit Join notation use the JOIN keyword with the ON clause to join two tables while the implicit join notation does not use the JOIN keyword and works with the WHERE clause.

    Performance wise, they are both on the same level. However, in more complicated cases, the implicit join notation might produce completely different results than intended. Therefore, Explicit Joining is preferred.

    4. Avoid using SELECT DISTINCT

    The DISTINCT operator in SQL is used to retrieve unique records from the database. And on a properly designed database table with unique indexes, we rarely use it.

    But, if we still have to use it on a table, using the GROUP BY clause instead of the DISTINCT keyword shows a better query performance (at least in some databases).

    5. Avoid using Multiple OR

    The OR operator is used to combine multiple conditions when filtering a database. Whenever we use OR in a filter condition each statement is processed separately. This degrades database performance as the entire table must be scanned multiple times to retrieve the data that matches the filter condition.

    Instead, we can use a more optimized solution; by breaking the different OR conditions into separate queries, which can be processed parallelly by the database. Then, the results from these queries can be combined using UNION.

    Example

    For example, let us say we have a requirement of getting the details of all the customers whose age is greater than 25 or whose salary is greater than 2,000. The optimized query would be as show below −

    SELECT ID, NAME FROM CUSTOMERS WHERE AGE >25UNIONSELECT ID, NAME FROM CUSTOMERS WHERE SALARY >2000;

    Output

    After executing the above code, we get the following output −

    IDNAME
    1Ramesh
    5Hardik
    4Chaitali
    6Komal
    7Muffy

    6. Use WHERE instead of HAVING

    The WHERE and HAVING clause are both used to filter data in SQL. However, WHERE clause is more efficient than HAVING. With WHERE clause, only the records that match the condition are retrieved. But with HAVING clause, it first retrieves all the records and then filters them based on a condition. Therefore, the WHERE clause is preferable.

    Database Defragmentation

    When the data is stored in a database, they are placed in contiguous physical locations. In this case, both logical locations and physical locations are in the same order.

    However, when the database tables are altered by deleting or updating the records, indexes are also altered to suit the changes made. This will cause the indexes to be scattered across the storage. Physical locations also lose the contiguous allocation. Thus, reducing the database performance.

    Defragmentation is the solution this problem. It will reorganize/rebuild the logical ordering of the indexes to match the physical ordering. But, this process first analyses the indexes and chooses whether they only need to be reorganized or rebuilt completely.

    Built-In Tuning Tools

    Some databases provide built-in tuning tools to monitor the database performance. For instance, the Oracle database provides the following tuning tools −

    • EXPLAIN − In SQL, the EXPLAIN command give us the order in which a query is executed along with the estimated cost of each step. We can use this to find the query the least cost to optimize the database.
    • tkprof − tkprof is a command that gives us various statistics, such as CPU and I/O usage of a query. By using these statistics, we can tune our queries to reduce CPU and I/O utilization to increase the efficiency of our database.
  • IN vs EXISTS

    In SQL, we use the IN operator to simplify queries and reduce the need for multiple OR conditions. It allows us to match a value against a list of values. On the other hand, the EXISTS operator checks whether one or more rows exist in a subquery and returns either true or false based on this condition. If the subquery finds at least one row, the EXISTS operator returns true; otherwise, it returns false.

    The SQL IN Operator

    The IN operator in SQL is used to check if a particular value matches any within a given set. This set of values can be specified individually or obtained from a subquery. We can use the IN operator with the WHERE clause to simplify queries and reduce the use of multiple OR conditions.

    Suppose we have a table named CUSTOMERS and we want to retrieve customer details based on their IDs. In this scenario, we can use the IN operator with the WHERE clause to fetch the details of these specific IDs.

    Syntax

    Following is the syntax of the SQL IN operator −

    SELECT column_name
    FROM table_name
    WHERE column_name
    IN(value1, value2, valueN);

    In the above syntax, the column_name matches every value (value1, value2, … valueN). If the matches occur, The IN operators returns true; otherwise, false.

    Example

    First of all, let us create a table named CUSTOMERS using the following query −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Now, add records into the above created table using the INSERT INTO statement as shown below −

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00),(5,'Hardik',27,'Bhopal',8500.00),(6,'Komal',22,'Hyderabad',4500.00),(7,'Muffy',24,'Indore',10000.00);

    The table will be created as follows −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    The following query retrieves the NAME and SALARY columns from the CUSTOMERS table for rows where the ID is 1, 2, or 3.

    SELECT NAME, SALARY FROM CUSTOMERS WHERE ID IN(1,2,3);

    Output

    On executing the query above, the output is displayed as −

    NameSalary
    Ramesh2000.00
    Khilan1500.00
    Kaushik2000.00

    The SQL EXISTS Operator

    The EXISTS operator is used to look for the existence of a row in a given table that satisfies a set of criteria. It is a Boolean operator that compares the result of the subquery to an existing record and returns true or false.

    The returned value is true, if the subquery fetches single or multiple records; and false, if no record is matched. EXISTS operator follows the querys efficiency features, i.e. when the first true event is detected, it will automatically stop processing further.

    We can use the EXISTS operator with the SELECTUPDATEINSERT and DELETE queries.

    Syntax

    Following is the basic syntax of SQL EXISTS operator −

    SELECT column_name FROM table_name
    WHEREEXISTS(SELECT column_name FROM table_name
       WHERE condition 
    );

    Example

    First of all, consider the CUSTOMERS table, and create another table named EMPLOYEES using the following query −

    CREATETABLE EMPLOYEES (
       EID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       CITY CHAR(25),
       CONTACT INT,PRIMARYKEY(EID));

    Now, let us insert some records into the EMPLOYEES table using the INSERT INTO statement as shown below −

    INSERTINTO EMPLOYEES VALUES(1,'Varun',32,'Ahmedabad',12345),(2,'Mahesh',22,'Kashmir',34235),(3,'Suresh',43,'Kerala',12355);

    The table will be created as follows −

    EIDNAMEAGECITYCONTACT
    1Varun32Ahmedabad12345
    2Mahesh22Kashmir34235
    3Suresh43Kerala12355

    In the following query, we are using the EXISTS operator to fetch the names and ages of CUSTOMERS whose AGE is same as the AGE in the EMPLOYEES table.

    SELECT NAME, AGE 
    FROM CUSTOMERS 
    WHEREEXISTS(SELECT*FROM EMPLOYEES 
       WHERE CUSTOMERS.AGE = EMPLOYEES.AGE
    );

    Output

    Following is the output of the above query −

    NAMEAGE
    Ramesh32
    Komal22

    IN vs EXISTS

    Following table summarizes all the differences between IN and EXISTS −

    S.No.INEXISTS
    1It is applied to the SQL query to remove the multiple OR conditions.It is used to find whether the data in the subquery truly exist.
    2It executes all values contained within the IN block.If the value is matched, displays the details of the given value. It will terminate the further process if the condition is met.
    3It can be used for the comparison of a null value because it returns true, false, and a null value.It cannot be used for the comparison of a null value because it returns only true and false values.
    4It can be used with subqueries as well as with values.It can be used only with subqueries.
    5It executes faster when the subquery is smaller.It executes faster when the subquery is larger. Because it is more efficient than IN and returns only a Boolean value.

  • Group By vs Order By

    In SQL, we have two commonly used clauses that help us to manipulate data; Group By clause and Order By clause.

    A Group By clause is used to arrange the identical data/records into groups and the Order By clause is used to sort the data in ascending or descending order.

    The SQL Group By Clause

    Using the GROUP BY clause we can organize the data in a table into groups (based on a column) and perform required calculations on them.

    This clause is often used with the aggregate functions such as MIN(), MAX(), SUM(), AVG(), and COUNT() etc.

    It is often used with the SELECT statement, and it is placed after the WHERE clause or before the HAVING clause. If we use the Order By clause, the Group By clause should precede the Order By clause.

    Syntax

    Following is the syntax of the SQL Group By clause −

    SELECT column_name, aggregate_function()FROM table_name
    WHERE condition GROUPBY column_name;

    The aggregate_function() and the WHERE clause are optional in the above syntax.

    Example

    Assume we have created a table named CUSTOMERS that contains records of customers such as NAME, AGE, ADDRESS, and SALARY etc.., using the following CREATE statement −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Now, we are inserting values into this table using the INSERT statement as follows −

    Open Compiler

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00),(5,'Hardik',27,'Bhopal',8500.00),(6,'Komal',22,'Hyderabad',4500.00),(7,'Muffy',24,'Indore',10000.00);

    The CUSTOMERS table will be created as −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    In the SQL query below, we are using the Group by clause to group the rows based on their salaries from the CUSTOMERS table and counting the number of records in each group −

    SELECT SALARY,COUNT(SALARY)from CUSTOMERS GROUPBY SALARY;

    Output

    When we execute the above query, the following result will be displayed −

    SALARYMAX(SALARY)
    2000.002
    1500.001
    6500.001
    8500.001
    4500.001
    10000.001

    The SQL Order By Clause

    The ORDER BY clause is used to sort the query results. This clause is used at the end of a SELECT statement, following the WHERE, HAVING and GROUP BY clauses. We can sort the table column in ascending or descending order with the by specifying the sort order as ASC and DESC respectively. If we do not specify any order, it defaults to ascending order.

    Syntax

    Following is the syntax to sort the column value in ascending/descending order using the SQL ORDER BY clause −

    SELECT column_name FROM table_name ORDERBYASC/DSC;

    Example

    In the following query, we are retrieving the ID and NAME from the CUSTOMERS table and using the ORDER BY clause, we are sorting the names in ascending order.

    Open Compiler

    SELECT ID, NAME FROM CUSTOMERS ORDERBY NAME;

    Output

    When we run the above query, we can see that the resultant table is sorted by name in ascending order.

    IDNAME
    4Chaitali
    5Hardik
    3Kaushik
    2Khilan
    6Komal
    7Muffy
    1Ramesh

    Example

    In the following example, we are retrieving the NAME, calculating the AVG SALARY, and using the GROUP BY clause to group the table by NAME.

    Open Compiler

    SELECT NAME,AVG(SALARY)FROM CUSTOMERS GROUPBY NAME;

    Output

    When we run the above query, we get the name and average salary. The average salary is the same as the actual salary because there are no two or more than two records with the same name. As a result, the average salary is the same as the actual salary, and the table is grouped by name. as shown in the table below.

    NAMEAVG(SALARY)
    Ramesh2000.000000
    Khilan1500.000000
    Kaushik2000.000000
    Chaitali6500.000000
    Hardik8500.000000
    Komal4500.000000
    Muffy10000.000000

    Example

    In the following example, we are retrieving, NAME, AGE, and SALARY and using the ORDER BY clause to sort the AGE in the ascending order.

    Open Compiler

    SELECT NAME, AGE, SALARY FROM customers ORDERBY AGE;

    Output

    The table generated by the above query is as shown below −

    NAMEAGESALARY
    Komal224500.00
    Kaushik232000.00
    Muffy2410000.00
    Khilan251500.00
    Chaitali256500.00
    Hardik278500.00
    Ramesh322000.00

    Group by vs Order by

    Following table summarizes the differences between the Group By clause and Order by clause −

    S.No.Group ByOrder By
    1It is applied to group rows with same values.It sorts the columns in either ascending or descending order.
    2It could be allowed in the create view statement.It is not allowed to create view statement.
    3The attribute cannot be assigned to the aggregate function in the Group By statement.The attribute can be assigned to the aggregate function in the Order By statement.
    4It is always used before the Order by clause in the select statement.It is always used after the Group by clause in the select statement.
    5Here grouping is done based on the similarity among the rows attribute value.Here, the result-set is sorted based on the columns attribute value either ascending or descending order.
    6It controls the presentation of the rowIt controls the presentation of the column.
    7We can use the aggregate function in the Group by.Here its not mandatory to use the aggregate function in the Order by.
  • Common Table Expression

    A Common Table Expression (CTE) can make it easier to manage and write complex queries by making them more readable and simple, like database views and derived tables. We can reuse or rewrite the query by breaking down the complex queries into simple blocks.

    The SQL Common Table Expression

    The WITH clause in MySQL is used to specify a Common Table Expression.

    A Common Table Expression (CTE) in SQL is a one-time result set, i.e. it is a temporary table that exists only during the execution of a single query. It allows us to work with data specifically within that query, such as using it in SELECTUPDATEINSERTDELETECREATEVIEW, OR MERGE statements.

    CTE is temporary because it cannot be stored anywhere for later use; once the query is executed, it is lost.

    The MySQL WITH Clause

    To specify common table expressions, we use WITH clause that consists of one or more comma-separated subclauses. Within each subclause, we can present a subquery that produces a result set and assigns a name to this subquery.

    You cannot use the WITH clause in MySQL versions before 8.0.

    Syntax

    Following is the syntax to create a CTE using WITH clause −

    WITH CTE_NAME (column_name)AS(query)SELECT*FROM CTE_NAME;

    Where,

    • CTE_NAME − It is the name assigned to the CTE.
    • column_name − It is the column names for the CTE, which can be useful for improving query readability.
    • query − It defines the CTE and it can be any valid SQL query.
    • After defining the CTE, you can reference it in subsequent queries within the same session.

    Example

    Assume we have created a table named CUSTOMERS in MySQL database using CREATE TABLE statement as shown below −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Now, we are inserting some records into the above created table −

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00),(5,'Hardik',27,'Bhopal',8500.00),(6,'Komal',22,'Hyderabad',4500.00),(7,'Muffy',24,'Indore',10000.00);

    The table created is as shown below −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    Here, we are creating a Common Table Expression (CTE) named CUSTOMER_AGE that selects all customers with an age of 23. We are then retrieving the ID, NAME, and AGE of these customers from the CTE.

    WITH CUSTOMER_AGE AS(SELECT*FROM customers WHERE AGE =23)SELECT ID, NAME, AGE FROM CUSTOMER_AGE;

    Output

    Following is the output of the above query −

    IDNAMEAGE
    3Kaushik23

    CTE from Multiple Tables

    We can also create a Common Table Expression (CTE) that combines data from multiple tables by using JOIN operations within the CTE’s subquery. To do this, we need to use the comma operator to separate each CTE definition, effectively merging them into a single statement.

    Syntax

    Following is the basic syntax for multiple Common Table Expression (CTE) −

    WITH
       CTE_NAME1 (column_name)AS(query),
       CTE_NAME2 (column_name)AS(query)SELECT*FROM CTE_NAME1
    UNIONALLSELECT*FROM CTE_NAME2;

    We can use multiple Common Table Expressions (CTEs) with various SQL operations, such as UNION, UNION ALL, JOIN, INTERSECT, or EXCEPT.

    Example

    In here, we are defining two CTEs namely ‘CUSTOMERS_IN_DELHI’ and ‘CUSTOMERS_IN_MUMBAI’ to segregate customers based on their addresses in Delhi and Mumbai. Then, we are using the UNION ALL operator to combine the results from both CTEs into a single result set, retrieving customer information from both cities.

    WITH
    CUSTOMERS_IN_DELHI AS(SELECT*FROM CUSTOMERS WHERE ADDRESS ='Delhi'),
    CUSTOMERS_IN_MUMBAI AS(SELECT*FROM CUSTOMERS WHERE ADDRESS ='Mumbai')SELECT ID, NAME, ADDRESS FROM CUSTOMERS_IN_DELHI
    UNIONALLSELECT ID, NAME, ADDRESS FROM CUSTOMERS_IN_MUMBAI;

    Output

    Output of the above query is as shown below −

    IDNAMEADDRESS
    2KhilanDelhi
    4ChaitaliMumbai

    Recursive CTE

    A common table expression is a query that keeps referring back to its own result in a loop repeatedly until it returns an empty result.

    A recursive query continually iterates across a subset of the data during its execution, and defines itself in a self-referencing manner. This self-referencing mechanism allows it to repeatedly process and expand its results until a stopping condition is met.

    To make a CTE recursive, it must include a UNION ALL statement and provide a second definition of the query that utilizes the CTE itself. This allows the CTE to repeatedly reference to its own results, creating a recursive behaviour in the query.

    Example

    Now, we are using a recursive CTE named recursive_cust to retrieve data from the ‘CUSTOMERS’ table created above. Initially, we are selecting customers with salaries above 3000 and then recursively appending customers older than 25 to the result set using the UNION ALL operator −

    WITH recursive_cust (ID, NAME, ADDRESS, AGE)AS(SELECT ID, NAME, ADDRESS, AGE
       FROM CUSTOMERS
       WHERE SALARY >3000UNIONALLSELECT ID, NAME, ADDRESS, AGE
       FROM CUSTOMERS
       WHERE AGE >25)SELECT*FROM recursive_cust;

    Output

    When the above query is executed, all data from the customers table whose age is greater than 25 or salary is greater than 3000 will be displayed recursively as shown below −

    IDNAMEADDRESSAGE
    4ChaitaliMumbai25
    5HardikBhopal27
    6KomalHyderabad22
    7MuffyIndore24
    1RameshAhmedabad32
    5HardikBhopal27

    Example

    In the following query, we are using a recursive CTE named Numbers to generate and display numbers from 1 to 5. The recursive part continually adds 1 to the previous value until it reaches 5, creating a sequence −

    WITH RECURSIVE Numbers AS(SELECT1AS N
      UNIONALLSELECT N +1FROM Numbers WHERE N <5)SELECT n FROM Numbers;

    Output

    After executing the above query, we get the following output −

    N
    1
    2
    3
    4
    5

    Advantages of CTE

    Following are the advantages of the CTE −

    • CTE makes the code maintenance easier.
    • It increases the readability of the code.
    • It increases the performance of the query.
    • CTE allows for the simple implementation of recursive queries.

    Disadvantages of CTE

    Following are the disadvantages of the CTE −

    • CTE can only be referenced once by the recursive member.
    • We cannot use the table variables and CTEs as parameters in a stored procedure.
    • A CTE can be used in place of a view, but a CTE cannot be nested while views can.
  • Cursors

    A database cursor solves the problem of impedance mismatch. It acts as a filter between the result of a SQL query and the statements that process this result.

    Cursors in SQL

    Cursor is a temporary memory that is allocated by the database server at the time of performing the Data Manipulation Language operations on a table, such as INSERT, UPDATE and DELETE etc. It is used to retrieve and manipulate data stored in the SQL table.

    In MySQL, you cannot declare a cursor directly outside of a stored procedure or function. Cursors are generally declared within stored procedures, functions, or blocks of SQL code in MySQL database.

    Using cursors, we can perform multiple operations on each row of a result set, with or without returning the original data.

    Properties of Cursors

    Following are the properties of MySQL Cursors −

    • READ ONLY − We cannot update or modify any records in the table using the MySQL cursors. We can just fetch and process data from a table.
    • Non-Scrollable − We can retrieve records from a table in a single direction, i.e. from the first record or the last. We cannot move backward or jump to a specific position within the result set.
    • Asensitive Cursor − An asensitive cursor operates directly on the actual data in the database, it does not create a copy of the data. If any change is made to the data by other connections, it can affect the data that the cursor is working with.

    In addition to the Asensitive cursor there is another type known as Insensitive Cursor. An insensitive cursor uses a temporary copy of the data. Therefore, these cursors are insensitive (not affected) to the changes that are made in the table.

    Life Cycle of the Cursor

    There are four steps to manage these cursors. Following diagram illustrates the lifecycle of an SQL cursor −

    Cursor Lifecycle

    Now, let us discuss the phases of life cycle of the cursor one-by-one.

    Declare Cursor Statement

    In MySQL we can declare a cursor using the DECLARE statement and associate it with a SELECT statement to retrieve records from a database table.

    However, this SELECT statement associated with a cursor does not use the INTO clause, as it’s purpose is to fetch and process rows rather than assigning values to variables.

    Syntax

    Following is the syntax to declare a cursor in MySQL database −

    DECLARE cursor_name CURSORFOR select_statement;

    Open Cursor Statement

    After declaring a cursor in MySQL, the next step is to open the cursor using the OPEN statement. It initializes the result-set, allowing us to fetch and process rows from the associated SELECT statement in the cursor.

    Syntax

    Following is the syntax to open a cursor in MySQL database −

    OPEN cursor_name;

    Fetch Cursor Statement

    Then, we can use the FETCH statement to retrieve the current row pointed by the cursor, and with each FETCH, the cursor moves to the next row in the result set. This allows us to process each row one by one.

    Syntax

    Following is the syntax to fetch a cursor in MySQL database −

    FETCH cursor_name INTO variable_list;

    Close Cursor Statement

    Once all the rows are fetched, we must close the cursor to release the memory associated with it. We can do this using the CLOSE statement.

    Syntax

    Following is the syntax to close a cursor in MySQL database −

    CLOSE cursor_name;

    Example

    In this example, let us see how to manage a cursor in a stored procedure.

    Assume we have created a table with the name CUSTOMERS using the CREATE TABLE statement as follows −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Now, let us insert some records into the CUSTOMERS table using the INSERT statement as follows −

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00);

    Now, we will create a backup table named ‘CUSTOMERS_BACKUP’ to store customer data −

    CREATETABLE CUSTOMERS_BACKUP (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,PRIMARYKEY(ID));

    Here, we are creating a stored procedure named FetchCustomers to fetch customer names from the CUSTOMERS table and inserting them one by one into the BACKUP table. We are using a cursor to iterate through the rows and a handler to detect the end of the result-set, ensuring all names are processed −

    DELIMITER//CREATEPROCEDURE FetchCustomers()BEGINDECLARE done INTDEFAULTFALSE;DECLARE customer_id INT;DECLARE customer_name VARCHAR(255);DECLARE auto_id INT;-- Declare cursorDECLARE MY_CURSOR CURSORFORSELECT id, name FROM CUSTOMERS;-- Declare exit handlerDECLARECONTINUEHANDLERFORNOT FOUND SET done =TRUE;-- Open cursorOPEN MY_CURSOR;-- Fetch and insert rows
       read_loop: LOOPFETCH MY_CURSOR INTO customer_id, customer_name;IF done =1THENLEAVE read_loop;ENDIF;-- Insert the fetched data into the backup tableINSERTINTO customers_backup VALUES(customer_id, customer_name);-- Get the last auto-generated ID used in the insertionSET auto_id = LAST_INSERT_ID();ENDLOOP;-- Close cursorCLOSE MY_CURSOR;END//DELIMITER;

    Once we create the procedure successfully, we can execute it using the CALL statement as shown below −

    CALL FetchCustomers();

    Verification

    You can verify the contents of the CUSTOMERS_BACKUP table using the SELECT statement as shown below −

    SELECT*FROM CUSTOMERS_BACKUP;

    The contents of the table would be −

    IDNAME
    1Ramesh
    2Khilan
    3Kaushik
    4Chaitali
  • Date & Time

    SQL provides multiple datatypes and functions to handle Date and Time values in a database. This is because Date and Time values are represented in various formats. For instance, there are two common ways to represent a date value: DD/MM/YYYY and MM/DD/YYYY. Similarly, there is more than a single way to represent time values.

    For a database to recognize such data given in any format, we make use of multiple datatypes and functions.

    The only tricky part about storing the Date and Time data in a database is making sure that the values are inserted in the tables with the same format as the datatype.

    Different database systems use different datatypes and functions to store and handle the Date and Time data.

    Date & Time Datatypes in SQL

    Date and time datatypes are used in SQL to store date and time values in various formats. The datatypes available in SQL are listed below.

    S.No.Datatype & DescriptionStorage
    1datetimeIt stores date and time both from January 1, 1753 to December 31, 9999 with an accuracy of 3.33 milliseconds; with the format: YYYY-MM-DD HH:MI:SS.8 bytes
    2datetime2It stores date and time both from January 1, 0001 to December 31, 9999 with an accuracy of 100 nanoseconds.6 – 8 bytes
    3smalldatetimeIt stores date and time both from January 1, 1900 to June 6, 2079 with an accuracy of 1 minute. It is stored in the format of YYYY-MM-DD HH:MI:SS.4 bytes
    4dateIt stores date only from January 1, 0001 to December 31 9999, in the format: YYYY-MM-DD.3 bytes
    5timeIt store time only to an accuracy of 100 nanoseconds.3 – 5 bytes
    6datetimeoffsetIt is the same of the datetime2 with the addition of the time zone offset.8 – 10 bytes
    7timestampIt stores the unique number that gets updated every time a row gets created or modified. It does not correspond to real time and is based on internal time. Each table may have only one timestamp variable.

    Example

    In the following example, let us create a table named SALES_DETAILS which accepts only date and time values in different formats.

    CREATETABLE SALES_DETAILS(
       orderDate DATE, 
       shippingDate DATETIME, 
       deliveredDate TIMESTAMP,timeTIME);

    To insert values into this table, use the following query −

    INSERTINTO SALES_DETAILS VALUES('2023-02-01','2023-02-01 :10:00','2023-02-03 :18:00','18:00');

    Output

    The table will be created as follows −

    orderDateshippingDatedeliveredDatetime
    2023-02-012023-02-01 :10:002023-02-03 :18:0018:00

    Date & Time Functions in SQL

    SQL also provides multiple functions to handle date and time values.

    For instance, there are different functions to retrieve the current timestamp in different formats. Let us see some of such functions below −

    CURDATE() Function

    To get the current date, we use the CURDATE() function in MySQL. The format of the resultant date will be ‘YYYY-MM-DD’ (string) or YYYYMMMDD (numeric).

    SELECT CURDATE();

    Output

    When we execute the above query, we get the current days date −

    CURDATE()
    2023-08-22

    NOW() Function

    The MySQL NOW() function will retrieve the current date and time value as a timestamp based on the context and, the value returned will be in either of the two formats: ‘YYYY-MM-DD hh:mm:ss’ and ‘YYYYMMDDhhmmss’.

    SELECTNOW();

    Output

    When we execute the above SQL query, we get the current date with time as follow −

    NOW()
    2023-08-22 15:30:25

    CURRENT_TIMESTAMP() Function

    The MySQL CURRENT_TIMESTAMP() function is used to get the current timestamp. The value returned will be in ‘YYYY-MM-DD hh:mm:ss’ (string) or YYYYMMDDhhmmss (numeric) format. This function is a synonym for NOW().

    SELECTCURRENT_TIMESTAMP();

    Output

    When we run the above SQL query, we get the following output −

    CURRENT_TIMESTAMP()
    2023-08-22 15:31:32

  • Auto Increment

    The SQL Auto Increment is used to automatically add unique sequential values into a column of a table.

    We usually define the Auto Increment on a column while creating a table. And when we insert new records into the table, the unique values are added to them.

    When we use Auto Increment on a table column, there is no need to insert NOT NULL values to that column. If we provide such values, they will overwrite the unique identities and the increment will be continued (only) on the NULL values (if any); causing ambiguity on the data.

    Different RDBMS support the Auto Increment feature in different ways.

    Auto Increment in MySQL

    In MySQL, you can add the auto-increment feature to a column of a table using the attribute named AUTO_INCREMENT.

    By default, when we define the AUTO_INCREMENT attribute on a column, the unique values are generated from “1”; and for each new record we enter into the table, the values in the column will increment by 1. Thus, the first record inserted will have a value of 1, the second record will have a value of 2, and so on.

    Syntax

    Following is the syntax to add AUTO_INCREMENT attribute to a column of a table in MySQL −

    CREATETABLE table_name(
       column1 datatype AUTO_INCREMENT,
       column2 datatype,
       column3 datatype,.....
       columnN datatype
    );

    Example

    In the query to we are creating a table named CUSTOMERS and adding the AUTO_INCREMENT to the column named ID −

    CREATETABLE CUSTOMERS(
       ID INTPRIMARYKEYAUTO_INCREMENT,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2));

    Now, let us insert values into the CUSTOMERS table using the INSERT statement −

    INSERTINTO CUSTOMERS (NAME, AGE, ADDRESS, SALARY)VALUES("Ramesh",32,"Ahmedabad",2000.00),("Khilan",25,"Delhi",1500.00),("Kaushik",23,"Kota",2000.00),("Chaitali",25,"Mumbai",6500.00);

    Verification

    To verify this, you need to retrieve the contents of the CUSTOMERS using the SELECT query as −

    SELECT*FROM CUSTOMERS;

    Output

    Following is the output of the above query, here you can observe that the ID values are generated automatically −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00

    AUTO_INCREMENT on Existing Columns

    MySQL also allows you to implement the AUTO_INCREMENT attribute on an existing table, using the ALTER TABLE statement.

    Following query starts incrementing the ID values from 5 in the CUSTOMERS table CUSTOMERS −

    ALTERTABLE CUSTOMERS AUTO_INCREMENT=100;

    Now, let us insert more records to see if the ID values are auto incremented.

    INSERTINTO CUSTOMERS(NAME, AGE, ADDRESS, SALARY)VALUES("Hardik",27,"Bhopal",8500.00),("Komal",22,"MP",4500.00),("Muffy",24,"Indore",10000.00);

    To view the above table data, we use the following SELECT query −

    SELECT*FROM CUSTOMERS;

    Output

    The output of the above query is shown below. It shows the auto increment in action. We are getting the ID values of the newly inserted records begins at 100.

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    100Hardik27Bhopal8500.00
    101Komal22Hyderabad4500.00
    102Muffy24Indore10000.00

    Auto Increment in SQL Server

    In SQL Server, there is no direct command/query to perform Auto Increment. Instead, we use the IDENTITY() property. This property works similar to the AUTO_INCREMENT attribute in MySQL. It generates unique, sequential numbers automatically and it is mostly used on the PRIMARY KEY constraint.

    Syntax

    Following is the basic syntax of IDENTITY() property in SQL Server −

    CREATETABLE table_name (
       column1 datatype IDENTITY[(seed, increment)],
       column2 datatype,
       column3 datatype,.....
       columnN datatype
    );

    This property accepts two parameters. The same are described below:

    • seed: It sets the starting value for the auto-incrementing column.
    • increment: It specifies how much the value increases by for each new row.

    Example

    In the following example, we are using the IDENTITY() property on the ID column of table named CUSTOMERS −

    CREATETABLE CUSTOMERS(
       ID INTPRIMARYKEYIDENTITY(1,1),
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2));

    After creating the table, we are inserting some records using the following query −

    INSERTINTO CUSTOMERS(NAME, AGE, ADDRESS, SALARY)VALUES('Ramesh',32,'Ahmedabad',2000.00),('Khilan',25,'Delhi',1500.00),('Kaushik',23,'Kota',2000.00),('Chaitali',25,'Mumbai',6500.00);

    To view the table data, we use the following SELECT query −

    SELECT*FROM CUSTOMERS;

    Output

    Following is an output of the above query, where ID values are generated automatically −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
  • Using Sequences

    Sequences in SQL are database objects that generate a sequence of unique integer values. They are frequently used in databases because many applications require that each row in a table must contain unique values and sequences provide an easy way to generate them.

    Sequences are a feature of many SQL database management systems, such as Oracle, PostgreSQL, SQL server, and IBM DB2.

    MySQL does not support the CREATE SEQUENCE statement to create sequences for table rows or columns. Instead, we can use AUTO_INCREMENT attribute.

    Sequences in MySQL

    In MySQL, we use the AUTO_INCREMENT attribute to generate unique integer values (sequences) for a column. By default, the sequence starts with an initial value of 1 and increments by 1 for each new row.

    Syntax

    Following is the syntax of AUTO_INCREMENT attribute in MySQL −

    CREATETABLE table_name (
    	column1 datatype AUTO_INCREMENT,
    	column2 datatype,
    	column3 datatype,...
    	columnN datatype
    );

    Example

    In the following example, we are creating a table named CUSTOMERS. In addition to that, we are defining AUTO_INCREMENT on ID column of the table.

    CREATETABLE CUSTOMERS (
       ID INTAUTO_INCREMENT,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Here, we are adding some records into the above created table −

    INSERTINTO CUSTOMERS VALUES(NULL,'Ramesh',32,'Ahmedabad',2000.00),(NULL,'Khilan',25,'Delhi',1500.00),(NULL,'Kaushik',23,'Kota',2000.00),(NULL,'Chaitali',25,'Mumbai',6500.00),(NULL,'Hardik',27,'Bhopal',8500.00),(NULL,'Komal',22,'Hyderabad',4500.00),(NULL,'Muffy',24,'Indore',10000.00);

    The table will be created as follows −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    As we can see in the above table, the values in the ID column are auto incremented.

    Starting a Sequence at a Particular Value in MySQL

    By default, MySQL sequences start from 1. To start a sequence with a different value, we use the AUTO_INCREMENT in combination with the ALTER statement.

    Syntax

    Following is the syntax to start the sequence with different value −

    ALTERTABLE table_name AUTO_INCREMENT=value;

    In the following query, we are creating a table named BUYERS with AUTO_INCREMENT defined on the ID column.

    CREATETABLE BUYERS (
       ID INTAUTO_INCREMENT,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Here, we are making the sequence start with 100 using the ALTER Statement as shown below −

    ALTERTABLE BUYERS AUTO_INCREMENT=100;

    Now, we are adding records into the BUYERS table using the INSERT INTO statement −

    INSERTINTO BUYERS (NAME, AGE, ADDRESS, SALARY)VALUES('Ramesh',32,'Ahmedabad',2000.00),('Khilan',25,'Delhi',1500.00),('Kaushik',23,'Kota',2000.00),('Chaitali',25,'Mumbai',6500.00),('Hardik',27,'Bhopal',8500.00),('Komal',22,'Hyderabad',4500.00),('Muffy',24,'Indore',10000.00);

    The table will be created as −

    IDNAMEAGEADDRESSSALARY
    100Ramesh32Ahmedabad2000.00
    101Khilan25Delhi1500.00
    102Kaushik23Kota2000.00
    103Chaitali25Mumbai6500.00
    104Hardik27Bhopal8500.00
    105Komal22Hyderabad4500.00
    106Muffy24Indore10000.00

    As observed in the table above, the values in the “ID” column begin with 100 instead of 1.

    Sequences in SQL Server

    In SQL server, a sequence can be created using the CREATE SEQUENCE statement. The statement specifies the name of the sequence, the starting value, the increment, and other properties of the sequence.

    Syntax

    Following is the syntax to create a sequence in SQL −

    CREATE SEQUENCE Sequence_Name
    STARTWITH Initial_Value
    INCREMENT BY Increment_Value
    MINVALUE Minimum_Value
    MAXVALUE Maximum_Value
    CYCLE|NOCYCLE;

    Here,

    • Sequence_Name − This specifies the name of the sequence.
    • Initial_Value − This specifies the starting value from where the sequence should start.
    • Increment_Value − This specifies the value by which the sequence will increment by itself. This can be valued positively or negatively.
    • Minimum_Value − This specifies the minimum value of the sequence.
    • Maximum_Value − This specifies the maximum value of the sequence.
    • Cycle − When the sequence reaches its Maximum_Value, it starts again from the beginning.
    • Nocycle − An exception will be thrown if the sequence exceeds the Maximum_Value.

    Example

    First of all, let us create a table named CUSTOMERS using the following query −

    CREATETABLE CUSTOMERS (
       ID INT,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),);

    We are inserting some records in the above-created table using INSERT INTO statement as shown in the query below −

    INSERTINTO CUSTOMERS VALUES(NULL,'Ramesh',32,'Ahmedabad',2000.00),(NULL,'Khilan',25,'Delhi',1500.00),(NULL,'Kaushik',23,'Kota',2000.00),(NULL,'Chaitali',25,'Mumbai',6500.00),(NULL,'Hardik',27,'Bhopal',8500.00),(NULL,'Komal',22,'Hyderabad',4500.00),(NULL,'Muffy',24,'Indore',10000.00);

    The table is successfully created in the SQL database.

    IDNAMEAGEADDRESSSALARY
    NULLRamesh32Ahmedabad2000.00
    NULLKhilan25Delhi1500.00
    NULLKaushik23Kota2000.00
    NULLChaitali25Mumbai6500.00
    NULLHardik27Bhopal8500.00
    NULLKomal22Hyderabad4500.00
    NULLMuffy24Indore10000.00

    Now, create a sequence using the following query −

    CREATE SEQUENCE My_Sequence ASINTSTARTWITH1
    INCREMENT BY1
    MINVALUE 1
    MAXVALUE 7CYCLE;

    In the above query, the sequence is named My_Sequence and it starts with the value 1 and increments by 1 each time a value is generated. The sequence has a maximum value of 5 and cycles back to the starting value when it reaches the maximum value.

    Once the sequence is created, it can be used to generate unique integer values. Now, let us update the data in the ID column of the CUSTOMERS table using the following query −

    UPDATE CUSTOMERS SET ID =NEXTVALUEFOR my_Sequence;

    Output

    When you execute the above query, the output is obtained as follows −

    (7 rows affected)
    

    Verification

    Let us verify whether is sequence is updated in the ID column of the table or not using the following query −

    SELECT*FROM CUSTOMERS;

    The table will be displayed as −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3Kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00
  • Handling Duplicates

    Sometimes, tables or result sets contain duplicate records. While in most cases, duplicates are allowed, there are situations where it is necessary to prevent duplicate records and remove them from a database table.

    Why is Handling Duplicates in SQL Necessary?

    Handling duplicates in an SQL database becomes necessary to prevent the following consequences −

    • The existence of duplicates in an organizational database will lead to logical errors.
    • Duplicate data occupies space in the storage, which leads to decrease in usage efficiency of a database.
    • Due to the increased use of resources, the overall cost of the handling resources rises.
    • With increase in logical errors due to the presence of duplicates, the conclusions derived from data analysis in a database will also be erroneous.

    This chapter will describe how to prevent the occurrence of duplicate records in a table and how to remove the already existing duplicate records.

    Preventing Duplicate Entries

    To prevent the entry of duplicate records into a table, we can define a PRIMARY KEY or a UNIQUE Index on the relevant fields. These database constraints ensure that each entry in the specified column or set of columns is unique.

    Example

    Let us create a CUSTOMERS table using the following query −

    CREATETABLE CUSTOMERS (
       FIRST_NAME CHAR(20),
       LAST_NAME CHAR(20),
       SEX CHAR(10));

    As we have not defined any constraints on the table, duplicate records can be inserted into it. To prevent such cases, add a PRIMARY KEY constraint on relevant fields (say LAST_NAME and FIRST_NAME together) −

    ALTERTABLE CUSTOMERS 
    ADDPRIMARYKEY(LAST_NAME, FIRST_NAME);

    Using INSERT IGNORE Query:

    Alternatively, we can use the INSERT IGNORE statement to insert records without generating an error for duplicates as shown below −

    INSERTIGNOREINTO CUSTOMERS (LAST_NAME, FIRST_NAME)VALUES('Jay','Thomas'),('Jay','Thomas');

    As you can see below, the table will only consist of a single record (ignoring the duplicate value).

    FIRST_NAMELAST_NAMESEX
    ThomasJayNULL

    Using REPLACE Query:

    Or, use the REPLACE statement to replace duplicates as shown in the following query −

    REPLACEINTO CUSTOMERS (LAST_NAME, FIRST_NAME)VALUES('Ajay','Kumar'),('Ajay','Kumar');

    The table will contain the following records −

    FIRST_NAMELAST_NAMESEX
    KumarAjayNULL
    ThomasJayNULL

    The choice between the INSERT IGNORE and REPLACE statements should be made based on the desired duplicate-handling behaviour. The INSERT IGNORE statement retains the first set of duplicate records and discards any subsequent duplicates. Conversely, the REPLACE statement preserves the last set of duplicates and erases any earlier ones.

    Using UNIQUE Constraint:

    Another way to enforce uniqueness in a table is by adding a UNIQUE constraint rather than a PRIMARY KEY constraint −

    CREATETABLE BUYERS (
       FIRST_NAME CHAR(20)NOTNULL,
       LAST_NAME CHAR(20)NOTNULL,
       SEX CHAR(10),UNIQUE(LAST_NAME, FIRST_NAME));

    Counting and Identifying Duplicates

    To count and identify duplicate records based on specific columns, we can use the COUNT function and GROUP BY clause.

    Example

    Following is the query to count duplicate records with FIRST_NAME and LAST_NAME in the BUYERS −

    SELECTCOUNT(*)as repetitions, LAST_NAME, FIRST_NAME
    FROM BUYERS
    GROUPBY LAST_NAME, FIRST_NAME
    HAVING repetitions >1;

    This query will return a list of all the duplicate records in the PERSON_TABLE table. To identify sets of values that are duplicated, follow the steps given below −

    • Determine which columns contain the values that may be duplicated.
    • List those columns in the column selection list, along with the COUNT(*).
    • List the columns in the GROUP BY clause as well.
    • Add a HAVING clause that eliminates the unique values by requiring the group counts to be greater than one.

    Eliminating Duplicates from a Table

    We can use the DISTINCT keyword along with the SELECT statement to retrieve unique records from a table.

    SELECTDISTINCT LAST_NAME, FIRST_NAME
    FROM BUYERS
    ORDERBY LAST_NAME;

    Alternatively, you can include a GROUP BY clause specifying the columns you are selecting to eliminate duplicates −

    SELECT LAST_NAME, FIRST_NAME
    FROM BUYERS
    GROUPBY LAST_NAME, FIRST_NAME;

  • Sub Queries

    SQL Subqueries

    An SQL Subquery, is a SELECT query within another query. It is also known as Inner query or Nested query and the query containing it is the outer query.

    The outer query can contain the SELECT, INSERT, UPDATE, and DELETE statements. We can use the subquery as a column expression, as a condition in SQL clauses, and with operators like =, >, <, >=, <=, IN, BETWEEN, etc.

    Rules to be followed

    Following are the rules to be followed while writing subqueries −

    • Subqueries must be enclosed within parentheses.
    • Subqueries can be nested within another subquery.
    • A subquery must contain the SELECT query and the FROM clause always.
    • A subquery consists of all the clauses an ordinary SELECT clause can contain: GROUP BY, WHERE, HAVING, DISTINCT, TOP/LIMIT, etc. However, an ORDER BY clause is only used when a TOP clause is specified. It can’t include COMPUTE or FOR BROWSE clause.
    • A subquery can return a single value, a single row, a single column, or a whole table. They are called scalar subqueries.

    Subqueries with the SELECT Statement

    Subqueries are most frequently used with the SELECT statement. The basic syntax is as follows −

    SELECT column_name [, column_name ]FROM table1 [, table2 ]WHERE  column_name 
    OPERATOR (SELECT column_name [,column_name ]FROM table1 [, table2 ][WHERE]);

    Example

    In the following query, we are creating a table named CUSTOMERS −

    CREATETABLE CUSTOMERS (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Here, we are inserting records into the above-created table using INSERT INTO statement −

    INSERTINTO CUSTOMERS VALUES(1,'Ramesh',32,'Ahmedabad',2000.00),(2,'Khilan',25,'Delhi',1500.00),(3,'Kaushik',23,'Kota',2000.00),(4,'Chaitali',25,'Mumbai',6500.00),(5,'Hardik',27,'Bhopal',8500.00),(6,'Komal',22,'Hyderabad',4500.00),(7,'Muffy',24,'Indore',10000.00);

    The table is displayed as −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    Now, let us check the following subquery with a SELECT statement.

    SELECT*FROM CUSTOMERS 
    WHERE ID IN(SELECT ID FROM CUSTOMERS WHERE SALARY >4500);

    This would produce the following result −

    IDNAMEAGEADDRESSSALARY
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    7Muffy24Indore10000.00

    Subqueries with the INSERT Statement

    We can also use the subqueries along with the INSERT statements. The data returned by the subquery is inserted into another table.

    The basic syntax is as follows −

    INSERTINTO table_name [(column1 [, column2 ])]SELECT[*|column1 [, column2 ]FROM table1 [, table2 ][WHEREVALUE OPERATOR ]

    Example

    In the following example, we are creating another table CUSTOMERS_BKP with similar structure as CUSTOMERS table −

    CREATETABLE CUSTOMERS_BKP (
       ID INTNOTNULL,
       NAME VARCHAR(20)NOTNULL,
       AGE INTNOTNULL,
       ADDRESS CHAR(25),
       SALARY DECIMAL(18,2),PRIMARYKEY(ID));

    Now to copy the complete records of CUSTOMERS table into the CUSTOMERS_BKP table, we can use the following query −

    INSERTINTO CUSTOMERS_BKP 
    SELECT*FROM CUSTOMERS 
    WHERE ID IN(SELECT ID FROM CUSTOMERS);

    The above query produces the following output −

    Query OK, 7 rows affected (0.01 sec)
    Records: 7  Duplicates: 0  Warnings: 0
    

    Verification

    Using the SELECT statement, we can verify whether the records from CUSTOMERS table have been inserted into CUSTOMERS_BKP table or not −

    SELECT*FROM CUSTOMERS_BKP;

    The table will be displayed as −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad2000.00
    2Khilan25Delhi1500.00
    3kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal8500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    Subqueries with the UPDATE Statement

    A subquery can also be used with the UPDATE statement. You can update single or multiple columns in a table using a subquery.

    The basic syntax is as follows −

    UPDATEtableSET column_name = new_value 
    [WHERE OPERATOR [VALUE](SELECT COLUMN_NAME FROM TABLE_NAME [WHERE]);

    Example

    We have the CUSTOMERS_BKP table available which is backup of CUSTOMERS table. The following example updates SALARY by 0.25 times in the CUSTOMERS table for all the customers whose AGE is greater than or equal to 27.

    UPDATE CUSTOMERS 
    SET SALARY = SALARY *0.25WHERE AGE IN(SELECT AGE FROM CUSTOMERS_BKP WHERE AGE >=27);

    Following is the output of the above query −

    Query OK, 2 rows affected (0.01 sec)
    Rows matched: 2  Changed: 2  Warnings: 0
    

    Verification

    This would impact two rows and if you verify the contents of the CUSTOMERS using the SELECT statement as shown below.

    SELECT*FROM CUSTOMERS;

    The table will be displayed as −

    IDNAMEAGEADDRESSSALARY
    1Ramesh32Ahmedabad500.00
    2Khilan25Delhi1500.00
    3kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    5Hardik27Bhopal2125.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00

    Subqueries with the DELETE Statement

    The subquery can be used with the DELETE statement as well; like with any other statements mentioned above.

    The basic syntax is as follows −

    DELETEFROM TABLE_NAME
    [WHERE OPERATOR [VALUE](SELECT COLUMN_NAME FROM TABLE_NAME)[WHERE)];

    Example

    We have a CUSTOMERS_BKP table available which is a backup of the CUSTOMERS table. The following example deletes the records from the CUSTOMERS table for all the customers whose AGE is greater than or equal to 27.

    DELETEFROM CUSTOMERS 
    WHERE AGE IN(SELECT AGE FROM CUSTOMERS_BKP WHERE AGE >=27);

    The above query generate the following output −

    OK, 2 rows affected (0.01 sec)
    

    Verification

    If you verify the contents of the CUSTOMERS table using the SELECT statement as shown below.

    SELECT*FROM CUSTOMERS;

    The table will be displayed as −

    IDNAMEAGEADDRESSSALARY
    2Khilan25Delhi1500.00
    3kaushik23Kota2000.00
    4Chaitali25Mumbai6500.00
    6Komal22Hyderabad4500.00
    7Muffy24Indore10000.00