Structured Query Language (SQL) is the standard language used for retrieving and manipulating database information. Today’s blog, I will be focusing on how we can use this language to simply get an organized result set back that answers a specific question. This will be accomplished by using the Select, From, Where, Group By, Having, and Order By clauses
We will be using the EMPLOYEES table in the HR schema from Oracle live as our practice table. You can find the structure of the table in the image below. This can be accessed for free. All you need is an oracle account and login with those credentials. Go to this link (Oracle Live) and select “Start Coding Now”.
SELECT: All SQL statements start with the SELECT. This is how the query knows what information you want to return. The SELECT is where you place the column names, how many columns, column alias, and aggregations you want to see in your result. If you don’t know the structure of the table in advance you can place an asterisk (*) after the SELECT and this will return every column in the table. However the column is named in the table is exactly how it needs to be type in the SELECT statement in order for it to be retrieved. This is before you choose to give it an alias.
FROM: FROM is required in all select statements. This tells the query what table the columns are coming from. If the table you select doesn’t have the specified columns, you will get an error.
SELECT EMAIL, JOB_ID, SALARY FROM HR.EMPLOYEES;
SELECT * FROM HR.EMPLOYEES;
SELECT FIRST_NAME as FNAME FROM HR.EMPLOYEES;
WHERE: WHERE comes after the FROM and acts as a filter for the query. You specify what column you want to filter and provide an operator (=,<,>,=>,=<,<>, in, between, like, is null, and, or) and a value that may or may not be in the table. If the value isn’t in the table, the query will succeed but yield no results. For the scope of this blog I will focus on the equal, in, like, and, and or operators.
SELECT SALARY FROM EMPLOYEES WHERE SALARY = 2400;
SELECT * FROM HR.EMPLOYEES WHERE FIRST_NAME in ('TJ','Ki','Guy','Funk');
SELECT * FROM HR.EMPLOYEES WHERE LAST_NAME like 'C%';
SELECT * FROM HR.EMPLOYEES WHERE LAST_NAME like '%o';
SELECT * FROM HR.EMPLOYEES WHERE LAST_NAME like '%i%';
SELECT * FROM HR.EMPLOYEES WHERE LAST_NAME like '%i%' AND FIRST_NAME in ('TJ','Ki','Guy','Funk');
The result can have either a salary equal to 24000 or Jobs with names of AD_VP, or FI_MGR.
SELECT * FROM HR.EMPLOYEES WHERE SALARY = 24000 OR JOB_ID in ('AD_VP','FI_MGR');
GROUP BY: GROUP BY is strictly used when there are aggregations present in the select statement (I.E: SUM, COUNT, MIN, MAX). The purpose of GROUP BY is to take the columns not affected by the aggregate and inform the query that these columns are to be grouped based on their matching result set.
SELECT COUNT(*) FROM HR.EMPLOYEES;
SELECT MAX(SALARY), MIN(SALARY) FROM HR.EMPLOYEES;
SELECT MAX(SALARY), MIN(SALARY) FROM HR.EMPLOYEES GROUP BY DEPARTMENT_ID;
SELECT JOB_ID, SUM(SALARY) FROM HR.EMPLOYEES GROUP BY JOB_ID;
SELECT DEPARTMENT_ID ,COUNT(DEPARTMENT_ID) ,COUNT(EMPLOYEE_ID) ,AVG(SALARY) FROM HR.EMPLOYEES GROUP BY DEPARTMENT_ID;
HAVING: HAVING is only to be used when a GROUP BY is also used. This acts as a filter for the aggregations. The aggregations used in the HAVING clause don’t need to be in the select statement and the query only returns records that meet the filter.
SELECT JOB_ID, SUM(SALARY) FROM HR.EMPLOYEES GROUP BY JOB_ID HAVING SUM(SALARY) > 10000;
This is an extension of an example in the GROUP BY section. The HAVING clauses makes sure my results only have JOB_ID’s with SALARIES over 10,000 dollars.
ORDER BY: ORDER BY, as the name implies, is a way of ordering the data based on the column you specify. Whether it be descending (abbreviated by desc), ascending (abbreviated by asc), nulls first, nulls last, column position and column name. Ascending is the default and is not necessary for it be in the query, but can be for transparency of the query. The column dictated to the ORDER BY doesn’t have to be in the select statement for it to work, but it does need to exist in the table. The ORDER BY clause is the only place in the select statement that can accept the alias name of a column you put in the SELECT statement. Everywhere else would need the original column name.
SELECT FIRST_NAME, LAST_NAME, EMAIL FROM HR.EMPLOYEES ORDER BY LAST_NAME ASC;
Show the result ordered by last_name A to Z
SELECT JOB_ID, SALARY, DEPARTMENT_ID FROM HR.EMPLOYEES ORDER BY EMPLOYEE_ID DESC;
Show the result ordered by Employee_ID largest number first.
Notice that Employee_ID is not in the select statement but the order or records would change is removed.
SELECT MANAGER_ID, DEPARTMENT_ID FROM HR.EMPLOYEES ORDER BY MANAGER_ID DESC NULLS FIRST;
This result makes sure all records with NULL Manager_ID’s are first, followed by the largest manager_id.
SELECT EMPLOYEE_ID, EMAIL, COMMISSION_PCT FROM HR.EMPLOYEES ORDER BY COMMISSION_PCT ASC NULLS LAST;
Have all records with Null Commission_PCT appear last and order it smallest to largest.
SELECT * FROM HR.EMPLOYEES ORDER BY 1;
I am selecting every column but ordering it based on the first column in the table. In this case Employee_ID. 2 would’ve been First_Name, 3 Last_Name, and so on.
SELECT * FROM HR.EMPLOYEES ORDER BY EMPLOYEE_ID;
Same as the previous example but with the name spelled out.
SELECT EMPLOYEE_ID AS EMP_NUMBER FROM HR.EMPLOYEES ORDER BY EMP_NUMBER;
This Order by accepts the alias of the Employee_Id column as EMP_NUMBER
Sample question: What is the average salary between all departments in the employees table? Order by the average salary
SELECT DEPARTMENT_ID, AVG(SALARY) FROM HR.EMPLOYEES GROUP BY DEPARTMENT_ID ORDER BY AVG(SALARY);
Sample question: Find the max salary in all departments, except for and department id greater than 90, only salaries less than 10,000, order by the salary column descending and rename it as Max_Salary.
SELECT DEPARTMENT_ID ,MAX(SALARY) AS MAX_SALARY FROM HR.EMPLOYEES WHERE DEPARTMENT_ID < 90 GROUP BY DEPARTMENT_ID HAVING MAX(SALARY) < 10000 ORDER BY MAX_SALARY DESC;