[1] SQL Notes – Introduction


Introduction
What is SQL?
Structured Query Language (SQL) is a standard computer language for relational database management and data manipulation.
Used to query, insert, update and modify data.
Pronounced as “sequel” or S-Q-L.
Used to communicate with databases;
SQL is non-procedural language:
- Cannot write complete applications
- Simple, but powerful
How is SQL used?
Read/retrieve data
Write data – add data to a table;
Update data – insert new data;
Who uses SQL?
- Backend Developer
- Data Architect
- QA Engineer
- ETL Developer
- Database Admin (DBA)
- Data analyst
- System Admin
- System Engineer
- Data Scientist
DBA or Data Scientist
DBA | Data Scientist |
---|---|
Manages/governs entire database | End user of a database |
Gives permissions to users | Use SQL to query and retrieve data |
Determines access to data | |
Manages and creates tables | |
Uses SQL to query and retrieve data |
How do Data Scientists use SQL?
Retrieve data;
May create their own table or test environments;
Combine multiple sources together;
Write complex queries for analysis.
Relational Database Management Systems
- SQL Server
- IBM DB2 Oracle
- Sybase ASE
- PostgreSQL
- MySQL
- Microsoft SQL Server
- Apache Open Office Base
- SQLite
Write syntax depends on what DBMS you are using.
Each DBMS has its own dialect.
SQL can translate.
Tweak based on the dialect your DBMS speaks.
Data Models
Thinking about your data
Understand the business process or subject matter the data is modeled after
Know the business rules
Understand how your data is organized and structured in the table(modeled).
Why this is worthwhile
Get more accurate results,
Speed up your work,
Have less rework.
Some concepts
Name | Description |
---|---|
Database | A container to store organized data |
Tables | A structured list of data or a specific type |
Column | A single field in a table |
Row | A record in a table |
What is data modeling?
Organizes and structures information into multiple, related tables.
Can represent a business process or show relationships between business processes.
Should closely represent real world.
Types of data modeling
Models for prediction built by data scientist.
Data model as data tables represented and organized in a database.
Evolution of data models
Time | Types |
---|---|
1960 | Hierarchical |
1969 | Network |
1970 | Relational |
1976 | Entity relationship |
1978 | Semantic |
1985 | Object-oriented |
1990 | Extended relational(O/R DBMS) |
2009 | NoSQL – Not only SQL |
Relational vs. Transactional Models
Relational Model | Transactional Model |
---|---|
Allows for easy querying and data manipulation in an easy, logical and intuitive way. | Operational database – insurance claims within a healthcare database |
Data model building blocks
Name | Description |
---|---|
Entity | Person, place thing or event, Distinguishable, unique and distinct |
Attribute | A characteristic of an entity |
Relationship | Describes association among entities. One-to-many,Many-to-many,One-to-one |
- One-to-many: customer to invoices.
- Many-to-many: student to classes.
- One-to-one: manager to store.
ER Diagrams
ER Model
- Show relationships
- Business process
- Represented visually
- Show links (Primary keys)
Some examples of ER Diagram
Reference:https://shipengfei92.github.io/2018/02/05/SQL-for-data-science/