" -大海为什么是蓝色的?
-因为每条鱼吐泡泡的时候 都是blue blue的 "

[1] SQL Notes – Introduction

Introduction

What is SQL?

Structured Query Language (SQL) is a standard computer language for relational database management and data manipulation.

Used to query, insert, update and modify data.

Pronounced as “sequel” or S-Q-L.

Used to communicate with databases;

SQL is non-procedural language:

  • Cannot write complete applications
  • Simple, but powerful

How is SQL used?

Read/retrieve data

Write data – add data to a table;

Update data – insert new data;

Who uses SQL?

  • Backend Developer
  • Data Architect
  • QA Engineer
  • ETL Developer
  • Database Admin (DBA)
  • Data analyst
  • System Admin
  • System Engineer
  • Data Scientist

DBA or Data Scientist

DBA Data Scientist
Manages/governs entire database End user of a database
Gives permissions to users Use SQL to query and retrieve data
Determines access to data  
Manages and creates tables  
Uses SQL to query and retrieve data  

How do Data Scientists use SQL?

Retrieve data;

May create their own table or test environments;

Combine multiple sources together;

Write complex queries for analysis.

Relational Database Management Systems

  • SQL Server
  • IBM DB2 Oracle
  • Sybase ASE
  • PostgreSQL
  • MySQL
  • Microsoft SQL Server
  • Apache Open Office Base
  • SQLite

Write syntax depends on what DBMS you are using.

Each DBMS has its own dialect.

SQL can translate.

Tweak based on the dialect your DBMS speaks.

Data Models

Thinking about your data

Understand the business process or subject matter the data is modeled after

Know the business rules

Understand how your data is organized and structured in the table(modeled).

Why this is worthwhile

Get more accurate results,

Speed up your work,

Have less rework.

Some concepts

Name Description
Database A container to store organized data
Tables A structured list of data or a specific type
Column A single field in a table
Row A record in a table

What is data modeling?

Organizes and structures information into multiple, related tables.

Can represent a business process or show relationships between business processes.

Should closely represent real world.

Types of data modeling

Models for prediction built by data scientist.

Data model as data tables represented and organized in a database.

Evolution of data models

Time Types
1960 Hierarchical
1969 Network
1970 Relational
1976 Entity relationship
1978 Semantic
1985 Object-oriented
1990 Extended relational(O/R DBMS)
2009 NoSQL – Not only SQL

Relational vs. Transactional Models

Relational Model Transactional Model
Allows for easy querying and data manipulation in an easy, logical and intuitive way. Operational database – insurance claims within a healthcare database

Data model building blocks

Name Description
Entity Person, place thing or event, Distinguishable, unique and distinct
Attribute A characteristic of an entity
Relationship Describes association among entities. One-to-many,Many-to-many,One-to-one
  • One-to-many: customer to invoices.
  • Many-to-many: student to classes.
  • One-to-one: manager to store.

ER Diagrams

ER Model

  • Show relationships
  • Business process
  • Represented visually
  • Show links (Primary keys)

Some examples of ER Diagram

ER Diagram

Reference:https://shipengfei92.github.io/2018/02/05/SQL-for-data-science/

YOU MIGHT ALSO LIKE

0 0 vote
Article Rating
Subscribe
提醒
guest
0 评论
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x