SQLAlchemy¶
SQLAlchemy es una librería para Python que facilita el acceso a una base de datos relacional, así como las operaciones a realizar sobre la misma.
Es independiente del motor de base de datos a utilizar, es decir, en principio, es compatible con la mayoría de bases de datos relacionales conocidas: PostgreSQL, MySQL, Oracle, Microsoft SQL Server, Sqlite, …
Conección a bases de datos¶
Lo primero que hay que hacer para trabajar con SQLAlchemy es crear un engine
. El engine
es el punto de entrada a la base de datos, es decir, el que permite a SQLAlchemy comunicarse con esta.
El motor se usa principalmente para manejar dos elementos:
pools de conexiones: Cuando se usa una conexión a la base de datos, esta ya está creada previamente y es reutilizada por el programa.
dialectos de base de datos: A pesar de que el lenguaje SQL es universal, cada motor de base de datos introduce ciertas variaciones propietarias sobre dicho lenguaje.
En la carpeta data se encuentra el archivo chinook.db
, que es básicamente una base de datos sql.
Las 11 tablas se definen de la siguiente forma (en inglés):
employees
table stores employees data such as employee id, last name, first name, etc. It also has a field named ReportsTo to specify who reports to whom.customers
table stores customers data.invoices
&invoice_items
tables: these two tables store invoice data. Theinvoices
table stores invoice header data and theinvoice_items
table stores the invoice line items data.artists
table stores artists data. It is a simple table that contains only artist id and name.albums
table stores data about a list of tracks. Each album belongs to one artist. However, one artist may have multiple albums.media_types
table stores media types such as MPEG audio file, ACC audio file, etc.genres
table stores music types such as rock, jazz, metal, etc.tracks
table store the data of songs. Each track belongs to one album.playlists
&playlist_track tables
:playlists
table store data about playlists. Each playlist contains a list of tracks. Each track may belong to multiple playlists. The relationship between theplaylists
table andtracks
table is many-to-many. Theplaylist_track
table is used to reflect this relationship.
Nuestra primera consulata consiste en ver todas las tablas de el esquema de datos.
from sqlalchemy import create_engine
import pandas as pd
import os
# Crear conector
engine = create_engine(os.path.join('sqlite:///','data', 'chinook.db'))
# Crear query
query ="""
SELECT name
FROM sqlite_master
WHERE type='table'
"""
df = pd.read_sql(query, con=engine)
df
name | |
---|---|
0 | albums |
1 | sqlite_sequence |
2 | artists |
3 | customers |
4 | employees |
5 | genres |
6 | invoices |
7 | invoice_items |
8 | media_types |
9 | playlists |
10 | playlist_track |
11 | tracks |
12 | sqlite_stat1 |
Comandos en SQL¶
Manipulación de datos¶
Los comandos básicos de manipulación de datos son:
Comando |
Descripción |
|
---|---|---|
SELECT |
Utilizado para consultar registros de la base de datos que satisfagan un criterio determinado |
|
INSERT |
Utilizado para cargar lotes de datos en la base de datos en una única operación. |
|
UPDATE |
Utilizado para modificar los valores de los campos y registros especificados Utilizado para modificar las tablas agregando campos o cambiando la definición de los campos. |
|
DELETE |
Utilizado para eliminar registros de una tabla |
De estos cuatro comandos, esta seción estará centrada en el comando SELECT
(útil para hacer consultas en SQL)
Clausulas¶
Las cláusulas son condiciones de modificación utilizadas para definir los datos que desea seleccionar o manipular.
Comando |
Descripción |
---|---|
FROM |
Utilizada para especificar la tabla de la cual se van a seleccionar los registros |
GROUP BY |
Utilizada para separar los registros seleccionados en grupos específicos |
HAVING |
Utilizada para expresar condición que debe satisfacer cada grupo |
ORDER BY |
Utilizada para ordenar los registros seleccionados de acuerdo con un orden específico |
WHERE |
Utilizada para determinar los registros seleccionados en la clausula FROM |
Operadores¶
Operadores Lógicos¶
Operador |
Descripción |
---|---|
AND |
Es el “y” lógico. Evalúa dos condiciones y devuelve un valor de verdad sólo si ambas son ciertas. |
OR |
Es el “o” lógico. Evalúa dos condiciones y devuelve un valor de verdad si alguna de las dos es cierta. |
NOT |
Negación lógica. Devuelve el valor contrario de la expresión. |
Operadores de comparación¶
Operador |
Descripción |
---|---|
< |
Menor que |
> |
Mayor que |
<> |
Distinto de |
<= |
Menor o igual que |
>= |
Mayor o igual que |
BETWEEN |
Intervalo |
LIKE |
Comparación |
In |
Especificar |
Funciones de agregado¶
Las funciones de agregado se usan dentro de una cláusula SELECT en grupos de registros para devolver un único valor que se aplica a un grupo de registros.
Comando |
Descripción |
---|---|
AVG |
Utilizada para calcular el promedio de los valores de un campo determinado |
COUNT |
Utilizada para devolver el número de registros de la selección |
SUM |
Utilizada para devolver la suma de todos los valores de un campo determinado |
MAX |
Utilizada para devolver el valor más alto de un campo especificado |
MIN |
Utilizada para devolver el valor más bajo de un campo especificado |
Consultas¶
Las consultas de selección se utilizan para indicar al motor de datos que devuelva información de las bases de datos, esta información es devuelta en forma de conjunto de registros. Este conjunto de registros es modificable.
Lo primero es crear una función para poder conectarse a la base de datos y realizar las distintas consultas.
Consultas básicas¶
Se realizarán distintas consultas ocupando los operadores básicos y de comparación.
# funcion de consultas
def consulta(query,conn):
return pd.read_sql_query(query, con=conn)
# Crear conector
conn = create_engine(os.path.join('sqlite:///','data', 'chinook.db'))
# basicas
query = """
SELECT EmployeeId,LastName, FirstName , Title FROM employees
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | |
---|---|---|---|---|
0 | 1 | Adams | Andrew | General Manager |
1 | 2 | Edwards | Nancy | Sales Manager |
2 | 3 | Peacock | Jane | Sales Support Agent |
3 | 4 | Park | Margaret | Sales Support Agent |
4 | 5 | Johnson | Steve | Sales Support Agent |
5 | 6 | Mitchell | Michael | IT Manager |
6 | 7 | King | Robert | IT Staff |
7 | 8 | Callahan | Laura | IT Staff |
# Ordenar los registros: una columna
query = """
SELECT EmployeeId,LastName, FirstName , Title FROM employees ORDER BY LastName
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | |
---|---|---|---|---|
0 | 1 | Adams | Andrew | General Manager |
1 | 8 | Callahan | Laura | IT Staff |
2 | 2 | Edwards | Nancy | Sales Manager |
3 | 5 | Johnson | Steve | Sales Support Agent |
4 | 7 | King | Robert | IT Staff |
5 | 6 | Mitchell | Michael | IT Manager |
6 | 4 | Park | Margaret | Sales Support Agent |
7 | 3 | Peacock | Jane | Sales Support Agent |
# Ordenar los registros: varias columnas
query = """
SELECT EmployeeId,LastName, FirstName , Title FROM employees ORDER BY LastName,Title
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | |
---|---|---|---|---|
0 | 1 | Adams | Andrew | General Manager |
1 | 8 | Callahan | Laura | IT Staff |
2 | 2 | Edwards | Nancy | Sales Manager |
3 | 5 | Johnson | Steve | Sales Support Agent |
4 | 7 | King | Robert | IT Staff |
5 | 6 | Mitchell | Michael | IT Manager |
6 | 4 | Park | Margaret | Sales Support Agent |
7 | 3 | Peacock | Jane | Sales Support Agent |
# Ordenar los registros: especificar el orden
query = """
SELECT EmployeeId,LastName, FirstName , Title FROM employees ORDER BY LastName DESC , Title ASC;
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | |
---|---|---|---|---|
0 | 3 | Peacock | Jane | Sales Support Agent |
1 | 4 | Park | Margaret | Sales Support Agent |
2 | 6 | Mitchell | Michael | IT Manager |
3 | 7 | King | Robert | IT Staff |
4 | 5 | Johnson | Steve | Sales Support Agent |
5 | 2 | Edwards | Nancy | Sales Manager |
6 | 8 | Callahan | Laura | IT Staff |
7 | 1 | Adams | Andrew | General Manager |
# operador *
query = """
SELECT * FROM employees;
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | ReportsTo | BirthDate | HireDate | Address | City | State | Country | PostalCode | Phone | Fax | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Adams | Andrew | General Manager | NaN | 1962-02-18 00:00:00 | 2002-08-14 00:00:00 | 11120 Jasper Ave NW | Edmonton | AB | Canada | T5K 2N1 | +1 (780) 428-9482 | +1 (780) 428-3457 | andrew@chinookcorp.com |
1 | 2 | Edwards | Nancy | Sales Manager | 1.0 | 1958-12-08 00:00:00 | 2002-05-01 00:00:00 | 825 8 Ave SW | Calgary | AB | Canada | T2P 2T3 | +1 (403) 262-3443 | +1 (403) 262-3322 | nancy@chinookcorp.com |
2 | 3 | Peacock | Jane | Sales Support Agent | 2.0 | 1973-08-29 00:00:00 | 2002-04-01 00:00:00 | 1111 6 Ave SW | Calgary | AB | Canada | T2P 5M5 | +1 (403) 262-3443 | +1 (403) 262-6712 | jane@chinookcorp.com |
3 | 4 | Park | Margaret | Sales Support Agent | 2.0 | 1947-09-19 00:00:00 | 2003-05-03 00:00:00 | 683 10 Street SW | Calgary | AB | Canada | T2P 5G3 | +1 (403) 263-4423 | +1 (403) 263-4289 | margaret@chinookcorp.com |
4 | 5 | Johnson | Steve | Sales Support Agent | 2.0 | 1965-03-03 00:00:00 | 2003-10-17 00:00:00 | 7727B 41 Ave | Calgary | AB | Canada | T3B 1Y7 | 1 (780) 836-9987 | 1 (780) 836-9543 | steve@chinookcorp.com |
5 | 6 | Mitchell | Michael | IT Manager | 1.0 | 1973-07-01 00:00:00 | 2003-10-17 00:00:00 | 5827 Bowness Road NW | Calgary | AB | Canada | T3B 0C5 | +1 (403) 246-9887 | +1 (403) 246-9899 | michael@chinookcorp.com |
6 | 7 | King | Robert | IT Staff | 6.0 | 1970-05-29 00:00:00 | 2004-01-02 00:00:00 | 590 Columbia Boulevard West | Lethbridge | AB | Canada | T1K 5N8 | +1 (403) 456-9986 | +1 (403) 456-8485 | robert@chinookcorp.com |
7 | 8 | Callahan | Laura | IT Staff | 6.0 | 1968-01-09 00:00:00 | 2004-03-04 00:00:00 | 923 7 ST NW | Lethbridge | AB | Canada | T1H 1Y8 | +1 (403) 467-3351 | +1 (403) 467-8772 | laura@chinookcorp.com |
# operador LIMIT
query = """
SELECT LastName FROM employees LIMIT 5;
"""
consulta(query,conn)
LastName | |
---|---|
0 | Adams |
1 | Edwards |
2 | Peacock |
3 | Park |
4 | Johnson |
# operador DISTINCT
query = """
SELECT DISTINCT LastName FROM employees
"""
consulta(query,conn)
LastName | |
---|---|
0 | Adams |
1 | Edwards |
2 | Peacock |
3 | Park |
4 | Johnson |
5 | Mitchell |
6 | King |
7 | Callahan |
# operadores logicos
query1 = """
SELECT * FROM employees WHERE ReportsTo>1 AND ReportsTo<4
"""
query2 = """
SELECT * FROM employees WHERE ReportsTo>1 OR City='Calgary'
"""
query3 = """
SELECT * FROM employees WHERE NOT City = 'Calgary'
"""
consulta(query1,conn)
EmployeeId | LastName | FirstName | Title | ReportsTo | BirthDate | HireDate | Address | City | State | Country | PostalCode | Phone | Fax | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3 | Peacock | Jane | Sales Support Agent | 2 | 1973-08-29 00:00:00 | 2002-04-01 00:00:00 | 1111 6 Ave SW | Calgary | AB | Canada | T2P 5M5 | +1 (403) 262-3443 | +1 (403) 262-6712 | jane@chinookcorp.com |
1 | 4 | Park | Margaret | Sales Support Agent | 2 | 1947-09-19 00:00:00 | 2003-05-03 00:00:00 | 683 10 Street SW | Calgary | AB | Canada | T2P 5G3 | +1 (403) 263-4423 | +1 (403) 263-4289 | margaret@chinookcorp.com |
2 | 5 | Johnson | Steve | Sales Support Agent | 2 | 1965-03-03 00:00:00 | 2003-10-17 00:00:00 | 7727B 41 Ave | Calgary | AB | Canada | T3B 1Y7 | 1 (780) 836-9987 | 1 (780) 836-9543 | steve@chinookcorp.com |
# Operador BETWEEN
query = """
SELECT * FROM employees WHERE ReportsTo BETWEEN 1 AND 4
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | ReportsTo | BirthDate | HireDate | Address | City | State | Country | PostalCode | Phone | Fax | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | Edwards | Nancy | Sales Manager | 1 | 1958-12-08 00:00:00 | 2002-05-01 00:00:00 | 825 8 Ave SW | Calgary | AB | Canada | T2P 2T3 | +1 (403) 262-3443 | +1 (403) 262-3322 | nancy@chinookcorp.com |
1 | 6 | Mitchell | Michael | IT Manager | 1 | 1973-07-01 00:00:00 | 2003-10-17 00:00:00 | 5827 Bowness Road NW | Calgary | AB | Canada | T3B 0C5 | +1 (403) 246-9887 | +1 (403) 246-9899 | michael@chinookcorp.com |
2 | 3 | Peacock | Jane | Sales Support Agent | 2 | 1973-08-29 00:00:00 | 2002-04-01 00:00:00 | 1111 6 Ave SW | Calgary | AB | Canada | T2P 5M5 | +1 (403) 262-3443 | +1 (403) 262-6712 | jane@chinookcorp.com |
3 | 4 | Park | Margaret | Sales Support Agent | 2 | 1947-09-19 00:00:00 | 2003-05-03 00:00:00 | 683 10 Street SW | Calgary | AB | Canada | T2P 5G3 | +1 (403) 263-4423 | +1 (403) 263-4289 | margaret@chinookcorp.com |
4 | 5 | Johnson | Steve | Sales Support Agent | 2 | 1965-03-03 00:00:00 | 2003-10-17 00:00:00 | 7727B 41 Ave | Calgary | AB | Canada | T3B 1Y7 | 1 (780) 836-9987 | 1 (780) 836-9543 | steve@chinookcorp.com |
# Operador LIKE
query = """
SELECT * FROM employees WHERE PostalCode LIKE '%T2P%'
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | ReportsTo | BirthDate | HireDate | Address | City | State | Country | PostalCode | Phone | Fax | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | Edwards | Nancy | Sales Manager | 1 | 1958-12-08 00:00:00 | 2002-05-01 00:00:00 | 825 8 Ave SW | Calgary | AB | Canada | T2P 2T3 | +1 (403) 262-3443 | +1 (403) 262-3322 | nancy@chinookcorp.com |
1 | 3 | Peacock | Jane | Sales Support Agent | 2 | 1973-08-29 00:00:00 | 2002-04-01 00:00:00 | 1111 6 Ave SW | Calgary | AB | Canada | T2P 5M5 | +1 (403) 262-3443 | +1 (403) 262-6712 | jane@chinookcorp.com |
2 | 4 | Park | Margaret | Sales Support Agent | 2 | 1947-09-19 00:00:00 | 2003-05-03 00:00:00 | 683 10 Street SW | Calgary | AB | Canada | T2P 5G3 | +1 (403) 263-4423 | +1 (403) 263-4289 | margaret@chinookcorp.com |
# Operador IN
query = """
SELECT * FROM employees WHERE ReportsTo IN (1,6)
"""
consulta(query,conn)
EmployeeId | LastName | FirstName | Title | ReportsTo | BirthDate | HireDate | Address | City | State | Country | PostalCode | Phone | Fax | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | Edwards | Nancy | Sales Manager | 1 | 1958-12-08 00:00:00 | 2002-05-01 00:00:00 | 825 8 Ave SW | Calgary | AB | Canada | T2P 2T3 | +1 (403) 262-3443 | +1 (403) 262-3322 | nancy@chinookcorp.com |
1 | 6 | Mitchell | Michael | IT Manager | 1 | 1973-07-01 00:00:00 | 2003-10-17 00:00:00 | 5827 Bowness Road NW | Calgary | AB | Canada | T3B 0C5 | +1 (403) 246-9887 | +1 (403) 246-9899 | michael@chinookcorp.com |
2 | 7 | King | Robert | IT Staff | 6 | 1970-05-29 00:00:00 | 2004-01-02 00:00:00 | 590 Columbia Boulevard West | Lethbridge | AB | Canada | T1K 5N8 | +1 (403) 456-9986 | +1 (403) 456-8485 | robert@chinookcorp.com |
3 | 8 | Callahan | Laura | IT Staff | 6 | 1968-01-09 00:00:00 | 2004-03-04 00:00:00 | 923 7 ST NW | Lethbridge | AB | Canada | T1H 1Y8 | +1 (403) 467-3351 | +1 (403) 467-8772 | laura@chinookcorp.com |
# Operadores matematicos
query = """
SELECT SUM(ReportsTo),COUNT(ReportsTo),AVG(ReportsTo) FROM employees
"""
consulta(query,conn)
SUM(ReportsTo) | COUNT(ReportsTo) | AVG(ReportsTo) | |
---|---|---|---|
0 | 20 | 7 | 2.857143 |
Cosultas mediante GROUP BY¶
Combina los registros con valores idénticos, en la lista de campos especificados, en un único registro:
Ejemplo: En la base datos chinook existe la tabla employees. ¿Cómo agrupar por la columna Title?
query = """
SELECT Title, SUM(ReportsTo),COUNT(ReportsTo),AVG(ReportsTo) FROM employees GROUP BY Title
"""
consulta(query,conn)
Title | SUM(ReportsTo) | COUNT(ReportsTo) | AVG(ReportsTo) | |
---|---|---|---|---|
0 | General Manager | NaN | 0 | NaN |
1 | IT Manager | 1.0 | 1 | 1.0 |
2 | IT Staff | 12.0 | 2 | 6.0 |
3 | Sales Manager | 1.0 | 1 | 1.0 |
4 | Sales Support Agent | 6.0 | 3 | 2.0 |
Cosultas mediante JOIN¶
La sentencia SQL JOIN se utiliza para relacionar varias tablas. Nos permitirá obtener un listado de los campos que tienen coincidencias en ambas tablas.
Ejemplo: En la base datos chinook existen las tablas albums y artists. ¿Cómo agregar el nombre del artista a la tabla de álbumes?
# EJEMPLO 01
query = """
SELECT * FROM artists JOIN albums ON artists.ArtistId=albums.ArtistId
"""
consulta(query,conn)
ArtistId | Name | AlbumId | Title | ArtistId | |
---|---|---|---|---|---|
0 | 1 | AC/DC | 1 | For Those About To Rock We Salute You | 1 |
1 | 2 | Accept | 2 | Balls to the Wall | 2 |
2 | 2 | Accept | 3 | Restless and Wild | 2 |
3 | 1 | AC/DC | 4 | Let There Be Rock | 1 |
4 | 3 | Aerosmith | 5 | Big Ones | 3 |
... | ... | ... | ... | ... | ... |
342 | 226 | Eugene Ormandy | 343 | Respighi:Pines of Rome | 226 |
343 | 272 | Emerson String Quartet | 344 | Schubert: The Late String Quartets & String Qu... | 272 |
344 | 273 | C. Monteverdi, Nigel Rogers - Chiaroscuro; Lon... | 345 | Monteverdi: L'Orfeo | 273 |
345 | 274 | Nash Ensemble | 346 | Mozart: Chamber Music | 274 |
346 | 275 | Philip Glass Ensemble | 347 | Koyaanisqatsi (Soundtrack from the Motion Pict... | 275 |
347 rows × 5 columns
# EJEMPLO 02
query = """
SELECT A.ArtistId, A.Name, B.AlbumId,B.Title FROM artists AS A JOIN albums AS B ON A.ArtistId=B.ArtistId
"""
consulta(query,conn)
ArtistId | Name | AlbumId | Title | |
---|---|---|---|---|
0 | 1 | AC/DC | 1 | For Those About To Rock We Salute You |
1 | 2 | Accept | 2 | Balls to the Wall |
2 | 2 | Accept | 3 | Restless and Wild |
3 | 1 | AC/DC | 4 | Let There Be Rock |
4 | 3 | Aerosmith | 5 | Big Ones |
... | ... | ... | ... | ... |
342 | 226 | Eugene Ormandy | 343 | Respighi:Pines of Rome |
343 | 272 | Emerson String Quartet | 344 | Schubert: The Late String Quartets & String Qu... |
344 | 273 | C. Monteverdi, Nigel Rogers - Chiaroscuro; Lon... | 345 | Monteverdi: L'Orfeo |
345 | 274 | Nash Ensemble | 346 | Mozart: Chamber Music |
346 | 275 | Philip Glass Ensemble | 347 | Koyaanisqatsi (Soundtrack from the Motion Pict... |
347 rows × 4 columns