Adding a million rows from a CSV file to a MySQL database table using Python and pandas!
Do you have a lot of CSV files, sometimes with many rows of data, that you need to import into a MySQL database?
You could go the route of MySQL's various "LOAD DATA INFILE" statements, which most versions of MySQL support, but I always feel that Python is the most obvious road to take.
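For comparison, the LOAD DATA route looks roughly like this. This is a minimal sketch, assuming the target table (mytable is just a placeholder name here) already exists and that local_infile is enabled on both the client and the server:

import pymysql

# Sketch of the LOAD DATA LOCAL INFILE alternative.
# Assumes the table already exists and local_infile is allowed.
conn = pymysql.connect(host="mydbhost", user="mydbuser",
                       password="mydbpass", database="mydatabase",
                       local_infile=True)
try:
    with conn.cursor() as cur:
        cur.execute("""
            LOAD DATA LOCAL INFILE 'mydata.csv'
            INTO TABLE mytable
            FIELDS TERMINATED BY ','
            IGNORE 1 LINES
        """)
    conn.commit()
finally:
    conn.close()

It works, but you have to get the table definition, field separators and header handling right yourself, which is exactly the kind of thing pandas does for you.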
Look at this code:
from sqlalchemy import create_engine
import pandas as pd
import pymysql
import sys

# Connect to the server; pandas/SQLAlchemy open connections as needed
engine = create_engine('mysql+pymysql://mydbuser:mydbpass@mydbhost/mydatabase')

filename = "mydata.csv"
db_table = "mytable"

# If the CSV doesn't have column names you must give them as an argument, e.g.:
# reader = pd.read_csv(filename, chunksize=chunksize,
#                      names=["some-header-1", "some-header-2", "some-header-3"])

print("Processing", end='')
chunksize = 10000
total_rows = 0
with pd.read_csv(filename, chunksize=chunksize) as reader:
    for chunk in reader:
        total_rows += len(chunk)
        print(".", end='')
        sys.stdout.flush()
        # Append this chunk to the table; index=False keeps the DataFrame
        # index from being written as an extra column
        chunk.to_sql(con=engine, name=db_table, if_exists="append", index=False)
print("")
print("* Total rows processed: " + str(total_rows))
engine.dispose()
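If you want to sanity-check the import afterwards, you can compare the script's total_rows with a quick COUNT(*) against the table. This is a small self-contained sketch; the connection string and the table name mytable are the same placeholders used above:

from sqlalchemy import create_engine
import pandas as pd

# Count the rows now in the table and compare with total_rows from the import
engine = create_engine('mysql+pymysql://mydbuser:mydbpass@mydbhost/mydatabase')
count = pd.read_sql("SELECT COUNT(*) AS n FROM mytable", engine)
print("Rows now in table:", int(count["n"].iloc[0]))
engine.dispose()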
Pretty neat, huh?