" 白毛——浮绿水,红掌—— 好吃!"

[2] SQL Notes – Clause

Basic Clause

SELECT

SELECT prod_name
FROM Products;

For multiple columns, be sure to use a comma.

SELECT prod_name,prod_id,prod_price
FROM Products;

The following expression is more clearly.

SELECT   prod_name
        ,prod_id
        ,prod_price
FROM Products;

Retrieving multiple columns using a wildcard

Request all columns by using a asterisk(*) wildcard character.

SELECT *
FROM Products;

Limit results

If the database is very large, we can use LIMIT to limit the number of retrieved results.

SELECT *
FROM Products
LIMIT 5;

NULL and Primary Keys

Every column is either NULL or NOT NULL.

Primary keys cannot be null, they must have a value.

INSERT

INSERT INTO Shoes
(Id
,Brand
,Price
,Desc
)
VALUES
('12345'
,'Gucci'
,'695.00'
,NULL
);

CREATE

CREATE TEMPORARY TABLE Sandals AS
(
SELECT *
FROM Shoes
WHERE shoe_type = 'sandals'
);

Comments

Single line

SELECT shoe_id
--,brand_id
,shoe_name
FROM Shoes;

Section

SELECT shoe_id
/*,brand_id
,shoe_name
*/
FROM Shoes;

Filtering, Sorting, and Calculating

Introductions

Clauses and Operators in SQL

  • WHERE
  • BETWEEN
  • IN
  • OR
  • NOT
  • LIKE
  • ORDER BY
  • GROUP BY

Wildcards

*,%

Math Operators

  • AVERAGE
  • COUNT
  • MAX
  • MIN

 

Why?

  • Be specific about the data you want to retrieve
  • Reduce the number of records you retrieve
  • Increase query performance
  • Reduce the strain on the client application
  • Governance the limitations

WHERE

SELECT column_name,column_name
FROM table_name
WHERE column_name operator value;

 

Single condition

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE ProductName =  'Tofu';

Single Value

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE UnitPrice >=  75;

Checking for Non-Matches

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE ProductName <>  'Alice'; --excluding 'Alice'

Range of Values(Between…and…)

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE UnitsInStock BETWEEN 15 AND 80;

No Value (NULL)

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE ProductName IS NULL;

IN, OR, and NOT

IN

  • Specifies a range of conditions
  • Comma delimited list of values
  • Enclosed in ()

Example

SELECT ProductID
,UnitPrice
,SupplierID
FROM Products
WHERE SupplierID IN (9,10,11);

OR operator

DBMS will not evaluate the second conditions in WHERE clause if the first condition is met.

Use for any rows matching the specific conditions.

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE ProductName =  'Tofu' OR 'Konbu';

IN vs. OR

IN works the same as OR.

Benefits of IN:

  1. Long list of options,
  2. IN executes faster than OR,
  3. Don’t have to think about the order with IN,
  4. Can contain another SELECT5

OR with AND

SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE SupplierID =  9 OR SupplierID =  11 AND UnitPrice > 15;
SELECT ProductName
,UnitPrice
,SupplierID
FROM Products
WHERE (SupplierID =  9 OR SupplierID =  11) AND UnitPrice > 15;

Don’t rely on the default order, it’s better to have the habit of using the ().

NOT

SELECT *
FROM Employees
WHERE NOT City = 'London' AND NOT City = 'Seattle';

Wildcards

  • Special character used to match parts of a value,
  • Use LIKE Operator

What are wildcards?

  • Can only be used with strings.
  • Cannot be used for non-text datatype.
  • Helpful for data scientists as they explore string variables.

Using % wildcards

  • % wildcards will NOT match NULLS, NULL represents no value in a column.

Using underscore(_) wildcards

  • Matches a single character.
  • Is not supported by DB2.

Downsides of wildcards

  • Takes longer to run.
  • Better to use another operator (=, <, >=, and etc.)
  • Statements with wildcards will take longer to run if used at the end of search pattern.
  • Placement of wildcards is important.

ORDER BY

SELECT something
FROM database
ORDER BY characteristic;
  • Can sort by a column not retrieved
  • Must always be the last clause in a select statement

Sorting by column position

ORDER BY 2,3;

It means data sorted by 2nd and 3rd column.

Sort direction

DESC: descending order

ASC: ascending order

Only applies to the column names it directly precedes.

Math operators

Operator Description
+ Addition
Subtraction
* Multiplication
/ Division

Order of operators

  • Parentheses ()
  • Exponents ^
  • Multiplication *
  • Division /
  • Addition +
  • Subtraction –

Parentheses

SELECT ProductID
,Quantity
,UnitPrice
,Discount
,(UnitPrice - Discount) / Quantity AS Total_Cost
FROM Orders;

 

 

Aggregate function

 

Average

SELECT AVG(UnitPrice) AS avg_price

FROM products

 

Count

Counts all the row in table (including NULL)

SELECT COUNT(*) AS total_customers

FROM Customers

Counts all the row in specific column (excluding NULL)

SELECT COUNT(cumtomerID) AS total_customers

FROM Customers

 

MIN & MAX

NULL

SELECT MAX(UnitPrice) AS max_prod_price,
,MIN(UnitPrice) AS min_prod_price

FROM products

 

SUM

SELECT SUM(UnitPrice) AS sum_prod_price

FROM products

 

SELECT SUM(UnitPrice*UnitInStock) AS total_price

FROM products

WHERE SupplierID = 25;

 

DISTINCT(is not aggregate function)

SELECT COUNT(DISTINCT SupplierID) 

FROM customers
  • If DISTINCT is not specified, ALL is assumed.
  • Cannot use DISTINCT on COUNT(*).
  • No value to use with MIN and MAX function.

 

Group by

Every column(including NULLs) will be present except for aggregated calculations

WHERE filter on rows

WHERE HAVING
filters before data is grouped filters after data is grouped
Rows eliminated by WHERE clause will not be included in the group  
Group by Order by
NOT sorts data sorts data

Notice

  • GROUP BY clause can contain multiple columns.
  • Every column in your SELECT statement must be present in a GROUP BY clause (except for aggregate functions).
  • NULLs will be grouped together if your GROUP BY column contains NULLs.

HAVING

WHERE does not work for GROUP.

WHERE filters on rows.

Instead using HAVING clause to filter for groups.

SELECT 
CustomerID
,COUNT(*) AS orders
FROM Orders
GROUP BY CustomerID
HAVING COUNT(*) >=2;

WHERE vs. HAVING

WHERE filters before data is grouped.

HAVING filters after data is grouped.

Rows eliminated by WHERE clause will not be included in the group.

SELECT 
SupplierID
,COUNT(*) AS Num_prod
FROM Products
WHERE UnitPrice >= 4
GROUP BY SupplierID
HAVING COUNT(*) >=2;

 

Reference:https://shipengfei92.github.io/2018/02/05/SQL-for-data-science/

YOU MIGHT ALSO LIKE

0 0 vote
Article Rating
Subscribe
提醒
guest
0 评论
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x