Here’s another post based on a question from Quora:
Can I export the contents of an HTML table to Excel or MySQL via Selenium/Python?
No, you can’t export a table from HTML to Excel or MySQL using Selenium with Python.
But you’re in luck! You just asked the wrong question.
It’s like if you had asked: “Can I build a bookshelf using my DeWalt Table Saw?”
The answer is, of course, still no. A reasonable person would say that your table saw is one of the tools you could use, but that you’re also going to need a hammer and some nails (or a screwdriver and screws), and maybe some paint or varnish. And a couple of clamps to hold it all together while you’re building it.
Unfortunately, I am not a reasonable person. But I’ll show you how to do it anyway.
You can use Selenium with Python to extract content from an HTML table. You can then use other tools (with Python) to import that content into an Excel spreadsheet or MySQL database.
For example, I’ll fetch cryptocurrency prices from CoinMarketCap using Selenium:
# get the HTML table data using Selenium
from selenium import webdriver

url = "https://coinmarketcap.com/"
table_locator = "xpath", "//table"

driver = webdriver.Chrome()
driver.get(url)
table_webelement = driver.find_element(*table_locator)
table_html = table_webelement.get_attribute("outerHTML")
In addition to Selenium, I’d recommend using Pandas DataFrames to export to Excel, both because it’s easier than working with openpyxl directly (or xlwt, or xlsxwriter, or one of several other lower-level libraries) and because pandas has all sorts of other great data manipulation features that might come in handy. (And it looks great on a resume!)
Here’s how you can use Python to read an HTML table directly into a Pandas DataFrame and then export it to a Microsoft Excel spreadsheet using DataFrame.to_excel():
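# load the HTML table into a Pandas DataFrame
from io import StringIO

import pandas

# read_html returns a list of DataFrames, one per table it finds
# (wrapping the string in StringIO avoids a deprecation warning in newer pandas)
dataframes = pandas.read_html(StringIO(table_html))

# get the first and only table in the extracted HTML
table_dataframe = dataframes[0]

# export data to Excel
table_dataframe.to_excel("data.xlsx")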
Here is the resulting spreadsheet:
Or you can export the DataFrame to a database.
Here, we use MySQL Connector with SQLAlchemy to append our results from the HTML table to a table named “prices” in the MariaDB “test” database. If the table does not exist, it is created.
Using Pandas DataFrame.to_sql():
# export data to MySQL (or MariaDB)
import sqlalchemy
from mysql import connector

conn = sqlalchemy.create_engine("mysql+mysqlconnector://username:password@localhost/test")
table_dataframe.to_sql(con=conn, name="prices", if_exists='append', index=False)
And our resulting table will look like this:
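If you want a feel for what if_exists='append' amounts to without a running MySQL server, here is a minimal sketch using Python’s built-in sqlite3 module as a stand-in. The append_rows helper, the two column names, and the sample values are all made up for illustration. Pandas generates its own SQL, so treat this as a rough analogy and not as what to_sql() actually executes.

```python
import sqlite3


def append_rows(conn, rows):
    """Create the prices table if it is missing, then append rows to it."""
    conn.execute("CREATE TABLE IF NOT EXISTS prices (name TEXT, price TEXT)")
    conn.executemany("INSERT INTO prices (name, price) VALUES (?, ?)", rows)
    conn.commit()


# In-memory database, so nothing is written to disk (stand-in for MySQL)
conn = sqlite3.connect(":memory:")

# Hypothetical sample rows, not real prices
sample = [("Bitcoin", "$100.00"), ("Ethereum", "$10.00")]

append_rows(conn, sample)  # first call creates the table
append_rows(conn, sample)  # second call appends to the existing table

row_count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(row_count)
```

The takeaway is that appending never replaces anything: run the export twice and you get two copies of each row.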
There are also other tools you may be able to use:
requests HTTP library instead of Selenium if you don’t need to manipulate the browser to fetch data
# get prices using requests
import requests

response = requests.get("https://coinmarketcap.com/")
BeautifulSoup4 to parse the HTML into a list of lists
# parse data using Beautiful Soup
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.content, "html.parser")
table_soup = soup.find("table")

headings = [th.text.strip() for th in table_soup.find_all("th")]

rows = []
# limit=11 stops after the first 11 <tr> elements
for tr in table_soup.find_all('tr', limit=11):
    row = []
    for td in tr.find_all('td'):
        row.append(td.text.strip())
    if row:  # skip rows without <td> cells, such as a header row
        rows.append(row)
csv to write the data to a CSV file instead of Excel format (a CSV file can also be loaded directly into MySQL without Pandas).
# export data to csv file
import csv

# newline="" keeps the csv module from writing extra blank lines on Windows
with open("data.csv", mode="w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(headings)
    writer.writerows(rows)
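One detail worth checking: prices often contain commas (something like “$1,234.56”), and the comma is also the CSV delimiter. The csv module quotes such fields by default, so they should survive a round trip. Here is a small sketch that uses an in-memory buffer and made-up sample values in place of the scraped data:

```python
import csv
import io

# Hypothetical sample data standing in for the scraped headings and rows
headings = ["Name", "Price"]
rows = [["Bitcoin", "$1,234.56"], ["Ethereum", "$78.90"]]

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(headings)
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)

# Read it back: the comma inside the price stays part of a single field
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
print(parsed[1])
```

If you load the file into a database afterwards, make sure the import step is told that fields may be enclosed in double quotes.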
Of course, you can also load your list of rows into pandas:
table_dataframe = pandas.DataFrame(rows)
Here is a gist showing all the code together: