How to get cryptocurrency prices using Python (and various tools)

Here’s another post based on a question from Quora:

Can I export the contents of an HTML table to Excel or MySQL via Selenium/Python?

No, you can’t export a table from HTML to Excel or MySQL using Selenium with Python.

But you’re in luck! You just asked the wrong question.

It’s like if you had asked: “Can I build a bookshelf using my DeWalt Table Saw?”

The answer is, of course, still no, but a reasonable person would say that your table saw is one of the tools you could use, but you’re also going to need a hammer and some nails (or a screwdriver and screws), and maybe some paint or varnish. And a couple of clamps to hold it all together while you’re building it.

Unfortunately, I am not a reasonable person. But I’ll show you how to do it anyway.

You can use Selenium with Python to extract content from an HTML table. You can then use other tools (with Python) to import that content into an Excel spreadsheet or MySQL database.

For example, I’ll fetch cryptocurrency prices from CoinMarketCap using Selenium:

# get the HTML table data using Selenium
from selenium import webdriver

url = "https://coinmarketcap.com/"
table_locator = "xpath", "//table"

driver = webdriver.Chrome()
driver.get(url)

table_webelement = driver.find_element(*table_locator)
table_html = table_webelement.get_attribute("outerHTML")

In addition to Selenium, I’d recommend using Pandas DataFrames to export to Excel, both because it’s easier than working with openpyxl directly (or xlwt, or xlsxwriter, or one of several other lower level libraries) and because pandas has all sorts of other great data manipulation features that might come in handy. (And it looks great on a resume!)

Here’s how you can use Python to read an HTML table directly into a Pandas DataFrame and then export it to a Microsoft Excel spreadsheet using DataFrame.to_excel():

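# load the HTML table to Pandas DataFrame
from io import StringIO

import pandas

# wrap the HTML in StringIO; newer pandas versions deprecate passing a raw HTML string
dataframes = pandas.read_html(StringIO(table_html))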

# get the first and only table on the page
table_dataframe = dataframes[0] 

# export data to Excel
table_dataframe.to_excel("data.xlsx")

Here is the resulting spreadsheet:

Excel Spreadsheet with Most Recent Cryptocurrency Pricing

Or you can export the DataFrame to a database.

Here, we use MySQL Connector with SQLAlchemy to append our results from the HTML table to a table named “prices” in the MariaDB “test” database. If the table does not exist, it is created.

Using Pandas DataFrame.to_sql()

# export data to MySQL (or MariaDB)
import sqlalchemy
from mysql import connector

conn = sqlalchemy.create_engine("mysql+mysqlconnector://username:password@localhost/test")

table_dataframe.to_sql(con=conn, name="prices", if_exists='append', index=False)

And our resulting table will look like this:

MySQL Table with Most Recent Cryptocurrency Pricing

There are also other tools you may be able to use:

requests HTTP library instead of Selenium if you don’t need to manipulate the browser to fetch data

# get prices using requests
import requests

response = requests.get("https://coinmarketcap.com/")

BeautifulSoup4 to parse the HTML into a List of Lists

# parse data using Beautiful Soup
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.content, "html.parser")
table_soup = soup.find("table")

headings = [th.text.strip() for th in table_soup.find_all("th")]

rows = []
for tr in table_soup.find_all('tr', limit=11):
    row = [td.text.strip() for td in tr.find_all('td')]
    if row:  # skip the header row, which has <th> cells and no <td> cells
        rows.append(row)

csv to write the data to a CSV file instead of Excel format (a CSV file can also be loaded directly into MySQL without Pandas).

# export data to csv file
import csv

# newline="" lets the csv module control line endings (avoids blank rows on Windows)
with open("data.csv", mode="w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(headings)
    writer.writerows(rows)
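To illustrate the point that CSV data can go into a database without Pandas, here is a minimal sketch using only the standard library. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made so the example runs anywhere), and the sample rows are made-up values, not real prices. With MySQL you would swap in a MySQL connection and its placeholder style.

```python
import csv
import io
import sqlite3

# made-up sample data standing in for the contents of data.csv
csv_text = "Name,Price\nBitcoin,100.0\nEthereum,10.0\n"

reader = csv.reader(io.StringIO(csv_text))
csv_headings = next(reader)   # first line holds the column names
csv_rows = list(reader)       # remaining lines hold the data

# SQLite is only a stand-in here; the table name "prices" matches the post
connection = sqlite3.connect(":memory:")
columns = ", ".join(f'"{heading}"' for heading in csv_headings)
placeholders = ", ".join("?" for _ in csv_headings)
connection.execute(f"CREATE TABLE IF NOT EXISTS prices ({columns})")
connection.executemany(f"INSERT INTO prices VALUES ({placeholders})", csv_rows)
connection.commit()

stored = connection.execute("SELECT * FROM prices").fetchall()
print(stored)
```

The values are stored as text because the sketch does not declare column types; a real table would probably want numeric columns for prices.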

Of course, you can also load your list of rows into pandas:

table_dataframe = pandas.DataFrame(rows) 

Here is a gist showing all the code together:

pip install selenium
pip install pandas
pip install openpyxl
pip install sqlalchemy
pip install mysql-connector-python
pip install beautifulsoup4
pip install requests
# get prices using requests
import requests

response = requests.get("https://coinmarketcap.com/")

# parse data using Beautiful Soup
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.content, "html.parser")
table_soup = soup.find("table")

headings = [th.text.strip() for th in table_soup.find_all("th")]

rows = []
for tr in table_soup.find_all('tr', limit=11):
    row = [td.text.strip() for td in tr.find_all('td')]
    if row:  # skip the header row, which has <th> cells and no <td> cells
        rows.append(row)

# save data using CSV
import csv

with open("data.csv", mode="w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(headings)
    writer.writerows(rows)

# get the HTML table data using Selenium
from selenium import webdriver

url = "https://coinmarketcap.com/"
table_locator = "xpath", "//table"

driver = webdriver.Chrome()
driver.get(url)

table_webelement = driver.find_element(*table_locator)
table_html = table_webelement.get_attribute("outerHTML")

# load the HTML table to Pandas DataFrame
from io import StringIO

import pandas

dataframes = pandas.read_html(StringIO(table_html))
table_dataframe = dataframes[0]  # get the first and only table on the page

# export data to Excel
table_dataframe.to_excel("data.xlsx")

# export data to MySQL (or MariaDB)
import sqlalchemy
from mysql import connector

conn = sqlalchemy.create_engine("mysql+mysqlconnector://username:password@localhost/test")
table_dataframe.to_sql(con=conn, name="prices", if_exists='append', index=False)

A story of performance optimization and refactoring

Optimizing string to date format conversion in Go?

Faster Time Parsing in Go – a story in 3 acts – by Phil Pearl at Ravelin

A great story of performance refactoring and profiling.

But why go to the effort?

Well, when you process billions of records in a NoSQL database, and you made a decision early on to store dates as strings (because why worry about typing when you’re prototyping?), the cost of parsing those strings adds up.

The takeaway lesson from Ravelin is:

Friends don’t let friends store dates in the database as strings

Phil Pearl’s blog is often about performance tuning for big data and definitely a good read worth checking out.

Bitcoin doesn’t work – and what can we do about it?

Bitcoin doesn’t work.

Two problems:

  1. Technical – Blockchain doesn’t scale.
  2. Social – Why should Bitcoin believers (and speculators) profit at the cost of everyone else?

If Bitcoin was the solution then there would be no need of the FOMO. Everyone could adopt it when it is stable. But nobody interested in Bitcoin wants it to stabilize — because they wouldn’t profit.

And alternatives, from Ethereum to everything else, show both these flaws in Bitcoin.

Bitcoin forks try to address issues with Bitcoin but primarily want to take a piece of the pie. And other cryptocurrencies do the same, even though there are interesting, diverse ideas being tried, like proof of stake, blockchain contracts, non-fungible tokens, and many others I’m sure I’m not even aware of.

The Lightning Network seeks to offload the technical limitation, but at the cost of losing the blockchain and its guarantee of trust.

But even though there are technical improvements and social innovations in the crypto / blockchain space — I’m not sure what to call it because neither term is exclusive or all encompassing — both the ultimate technical limit and unavoidable social problem remain.

The two challenges are:

  1. Greed
  2. Trust

Because of greed there is a proliferation (and thus devaluation) of coins and networks. And because of lack of trust, no one network can be relied on implicitly.

Bitcoin was created to combat both the greed engendered by the corrupt legacy monetary system and the lack of trust engendered by that greed.

But as I said in my opening statement, it failed to address both. Blockchain and cryptography are used to eliminate the need for explicit trust, but the way Bitcoin proposes to do so is with an infinite ledger that you can’t read.

Bitcoin’s proof of work was primarily meant as a way to give the profit to the technical class — and it did so very well at first, but it turns out that like PCs and Windows, once someone figures out how to sell a computer with software to someone else, it doesn’t take a technical person to use it.

Bitcoin miners are primarily not technical people. No more so than the average PC gamer. Their grasp of the technical aspects of blockchain is very limited. And their understanding of the social and economic aspects is non-existent. And of course, they’ve primarily cashed out to institutional investors looking for a way to expand inflation. Or they are hodling small stakes.

Privacy is non-existent

I, as a Bitcoin user, have zero visibility into transactions, but large organizations (and governments) can track your every purchase and paper trail — in a way they could never dream about with paper money.

Security is no better than physical assets.

As has been shown in several non-hypothetical scenarios, all it takes for a thief to get your bitcoin is to point a gun at your head, and then it becomes his: irreversible, and (nominally) un-trackable. And exchanges are as open to hacking and corruption as banks, perhaps more so, because the government won’t step in and defend an independent exchange or punish an offender. The mainstream financial system has the benefit of being defended by the mainstream legal system. And lastly, that same mainstream system can do the same thing as the thief, but more elegantly, and take your bitcoin from you.

But, it was a good idea. Just flawed. There are lots of good engineering ideas that have come from perpetual motion pipe-dreams, and Bitcoin suffers the same type of flaws — however much you try to reduce friction (in a mechanical, or financial system) you still need to have inputs — and you will still, inevitably have outputs.

You can’t dismiss the laws of nature any more than you can dismiss the laws of human nature.

What you still need in a financial system is trust — trust in transactions, in contracts, and in the institutions that enforce them. And you can have that, just not with a technical solution.

But here’s how you get trust — you earn it.

Once upon a time, banks and governments earned trust. You put your money in the bank, and you got it back out when you asked for it. The money was verified to be real and not depreciated. And the government enforced laws and punished criminals. And did not inflate the money supply. To some reasonable degree, at least; no system is perfect. And you can’t replace a good system with a perfect one, only with a bad one.

When the American monetary system (and the English before it) decided (reasonably) that inflation was bad, they tried to stop it. Only to learn that when the monetary supply doesn’t increase — but production of real value does, that deflation happens — in other words, money becomes more valuable. And those who have more of it become proportionally richer than those who have less of it. Folks like Benjamin Franklin and Alexander Hamilton saw the problem with this — the rich got richer — and kept a hold of their money because merely hodling it increased their wealth proportionally.

But the opposite, monetary creation, means that holding onto money has negative value — it decreases in value because it is diluted by the inflation of currency. So the wealthy are encouraged to invest it — and their investment in real properties (and imaginary ones like stocks) goes up in value by the very expansion of money.

So it looks like a win-win (or lose-lose, depending on which end of the stick you get), but the idea was that a modest inflation that tracks real growth reduces (but does not really eliminate) the disparities. And based on the growth of the past couple centuries, I’d say that was right — or at least it wasn’t as bad as the alternatives.

I don’t think the growth of the past couple centuries was founded — or even primarily based upon — liberal monetary policy, rather that the policy of modest inflation was possible because of the two factors that are cropping up as issues in Bitcoin: greed and trust.

Or rather, honesty and integrity. Because you could trust your bank and government not to be too bad — and because you could trust your neighbor, or customer, or employer, to fulfill their contracts, the existing monetary system allowed for fluidity and growth.

It doesn’t anymore. And it’s not the systems that are broken, it’s the people.

So what do you do in a broken system, or in a system full of (or infiltrated by) greedy and dishonest people?

You work with honest people.

But how do you do that — how can you trust people? How can you build a system of trust?

A-ha! That’s where encryption, digital signatures, and public ledgers can come in.

That’s all Bitcoin really is: a digitally signed ledger, with a cryptographic hash (block) of transactional history (chain). Well, that and a few gimmicks, like the proof-of-work for mining (which provided an incentive for people to enter the system) and the limited supply (to combat runaway inflation).
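To make the “hash of transactional history” idea concrete, here is a minimal sketch of a hash-chained ledger. It is an illustration only, not how Bitcoin is actually implemented: it leaves out digital signatures, proof-of-work, and networking, and the function names and sample transactions are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, previous_hash, transactions):
    """Hash a block's contents together with the hash of the block before it."""
    payload = json.dumps(
        {"index": index, "previous_hash": previous_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Add a block that commits to the whole history so far via previous_hash."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "previous_hash": previous_hash,
        "transactions": transactions,
        "hash": block_hash(index, previous_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash; any edit to past transactions breaks the chain."""
    previous_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["previous_hash"] != previous_hash:
            return False
        if block["hash"] != block_hash(index, previous_hash, block["transactions"]):
            return False
        previous_hash = block["hash"]
    return True


ledger = []
append_block(ledger, [{"from": "alice", "to": "bob", "amount": 5}])
append_block(ledger, [{"from": "bob", "to": "carol", "amount": 2}])
print(is_valid(ledger))
```

Because each block's hash covers the previous block's hash, changing an old transaction invalidates every block after it unless all of those hashes are recomputed.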

These actually worked to encourage adoption, but as I argued (incompletely) above, there are both social and technical problems that are perhaps insurmountable. And there is also the economic problem of potential deflation, which results in speculators and hodlers.

And finally, there is the inflation cycle of infinite alt-coins much like the government backed banking system of infinite debt.

Again, what can you do to solve all this?

You can build a network of trust by honoring transactions, not debasing your monetary system, and providing transparency. Independent banks did this by printing their own currencies — or receipts, really — and by building trust between each other — exchanges, and clearinghouses — and governments enforced contracts and punished thievery.

So, we can scrap the blockchain, the proof of work, the hocus-pocus, and the competition by providing real value — a safe, secure, transparent transaction system, and clearinghouse between competing systems.

There is an economy of scale challenge here. It does take millions to develop, and billions to promote. Maybe a bitcoin-like land grab is actually essential to get it off the ground. And it does take an open, benevolent government to both allow such a system to exist, and to enforce its contracts. Maybe that is insurmountable too, with governments and banks in each other’s pockets.

But we don’t have to fall back to a barter system or wagons full of gold and silver coins and armed guards to deter robbers. We can use cryptographically signed receipts and public ledgers to transfer money and reconcile between competing systems — the competition between systems will build trust. Those who provide the service we need — transactions we can trust — will get our business.

There’s one more challenge — other than needing people and institutions to become honest and benevolent (not greedy) — and that is the tendency towards monopoly. If one system of exchange becomes more efficient, has more integrity, and hence gains more market share, and thus power, it will tend to corrupt.

But it’s a workable system, and it looks like it only needs a revolution every hundred years or so.

AWS CLI command completion in BASH

Sometimes you just have to do something simple in AWS: launch an EC2 instance, create an S3 bucket, run a Lambda function, insert some data into DynamoDB, etc.

You could do this via the AWS Management Console, but that means logging in and using Amazon Web Services’ very user-unfriendly user interface.

You could write a script that uses the Amazon SDK API via a library like Python Boto3.

Or you can install the AWS CLI and run a simple shell script.

But Amazon’s API is far from clear. And while the CLI is easier for simple tasks than Boto, for example, it still means you need to know just the right command sequence and arguments to get it right.

For example:

aws s3api create-bucket --region us-west-2 --acl public-read --bucket mybucket

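It can be tricky to figure out the right flags (the above example is not right: for one thing, in a region other than us-east-1, create-bucket also expects --create-bucket-configuration LocationConstraint=us-west-2).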
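For comparison, here is a rough sketch of the same bucket creation using Boto3, the scripted option mentioned earlier. The helper names are hypothetical. It assumes boto3 is installed and AWS credentials are configured before create_bucket() is called, and it defaults the ACL to private because newer accounts may reject public ACLs.

```python
def create_bucket_arguments(bucket, region, acl="private"):
    """Build the keyword arguments for the S3 create_bucket call.

    Outside us-east-1 the API expects a LocationConstraint; in us-east-1
    it must be left out.
    """
    arguments = {"Bucket": bucket, "ACL": acl}
    if region != "us-east-1":
        arguments["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return arguments


def create_bucket(bucket, region, acl="private"):
    """Create the bucket (requires boto3 and configured AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    client = boto3.client("s3", region_name=region)
    return client.create_bucket(**create_bucket_arguments(bucket, region, acl))


# only builds the arguments; nothing is sent to AWS here
print(create_bucket_arguments("mybucket", "us-west-2"))
```

Keeping the argument building separate from the network call makes the region rule easy to check without touching AWS.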

One nice thing to have is command line completion. In addition to the aws command, the AWS CLI installs aws_completer. To use the AWS completer at your Bash prompt, add the following to your .bashrc or .bash_profile file:

# Add aws command completion to .bashrc file

complete -C '/usr/local/bin/aws_completer' aws

Thanks to the following post for showing me how to do this:

View at Medium.com

Taking back the internet from the Behemoth

The internet has pretty much become a walled garden for a few major corporations such as Google, Facebook, Amazon, etc.

It takes luck (and skill) to find anything outside these gatekeepers.

Whether it’s searching for news, videos, music, or products & services for sale… good luck finding content on your own.

Independent voices are stifled (or promoted) to suit the whims of these behemoths — who work in collusion.

But what if you want an independent web, where you can find the content and commentary of independent creators based on your interests, not theirs?

Even though the majority of content is published within these walled gardens — and people do so because that’s where they can be found — it’s actually easier than ever to publish your own content. It’s just harder to find.

And we’ve gotten lazy. Why set up your own blog, or sell your own products, when a big corporation lets you use their tools for free, as long as you put it in their walled garden?

I think I’ve found a way to beat the system.

Social networks and search engines promote their own content, algorithmically and deliberately. It only makes sense: if it’s a big network, more people are interested in it, so more people must be searching for content from big networks.

It’s circular logic.

But what if you could search for a topic, and your search results would promote independent content by demoting (or distinguishing it from) major content networks?

That way you could more easily find products from independent sellers, news from independent sources, people who share your interests outside social networks, and yes, even cat videos from independent publishers.

So what I’m proposing is a search engine with a reverse Google algorithm – that promotes results with fewer inbound links, or more specifically, separates results from major media and networks from independently published content.

It would also provide incentives for people to own their own content, whether it’s cat videos, political commentary, pictures of your grandkids, or crafts for sale.
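The “reverse Google” ranking could be sketched roughly like this. Everything here is an assumption for illustration: the list of major networks, the result format (URL plus inbound link count), and the sample URLs are all hypothetical, and a real search engine would need far more than a sort.

```python
from urllib.parse import urlparse

# hypothetical list of "walled garden" domains to demote
MAJOR_NETWORKS = {"google.com", "youtube.com", "facebook.com", "amazon.com"}


def is_major_network(url):
    """True if the URL's host is a major network domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in MAJOR_NETWORKS)


def rerank(results):
    """Sort (url, inbound_links) pairs: independent sites first,
    and within each group, fewer inbound links first."""
    return sorted(results, key=lambda result: (is_major_network(result[0]), result[1]))


results = [
    ("https://www.youtube.com/watch?v=1", 900000),
    ("https://example.org/cats", 12),
    ("https://smallblog.example/post", 3),
]
reranked = rerank(results)
for url, inbound_links in reranked:
    print(inbound_links, url)
```

Sorting on the tuple keeps the two ideas from the proposal separate: the first element splits major networks from independent publishers, and the second inverts the usual popularity ordering.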

We don’t need the walled gardens if we are willing to plant our own seeds and — here’s the tricky part — pull out our own weeds, whether it’s offensive content (that we don’t want, not just what they don’t want) or just spam.

When should you use JavaScriptExecutor in Selenium?

When you want to execute JavaScript on the browser :)

This was my answer to a question on Quora

https://www.quora.com/When-should-I-use-JavaScriptExecutor-in-Selenium-WebDriver/answer/Aaron-Evans

JavascriptExecutor (spelled with a lowercase “s” in the actual Java API) is an interface that defines 2 methods.

In Java (and similarly in C#):

Object executeScript(String script, Object... args)

and

Object executeAsyncScript(String script, Object... args)

Both take a string containing the JavaScript code you want to execute in the browser and (optionally) one or more arguments. The arguments are exposed to the script through JavaScript’s magic arguments variable, which represents the values passed to a function. If an argument is a WebElement, the script receives the corresponding HTML element. If the executed code returns a value, it is returned to your Selenium code.

Each driver is responsible for implementing it for the browser.

RemoteWebDriver implements it as well.

But the time *you*, as a Selenium user, need to use JavascriptExecutor is when you assign a driver to the base type WebDriver, which does not declare these methods.

In this case, you cast your driver instance (which really does implement executeScript() and executeAsyncScript()) to JavascriptExecutor.

For example:

WebDriver driver = new ChromeDriver();

// The base type 'WebDriver' does not define executeScript(), although our instance (which extends RemoteWebDriver) actually does implement it.

// So we need to cast it to 'JavascriptExecutor' to let the Java compiler know.

JavascriptExecutor js = (JavascriptExecutor) driver;

js.executeScript("alert('hi from Selenium');");

If you keep your instance’s concrete type, you do not need to cast to JavascriptExecutor.

RemoteWebDriver driver = new RemoteWebDriver(url, capabilities);

// information about our type is not lost, so the Java compiler knows our object implements executeScript()

WebElement element = driver.findElement(By.id("mybutton"));

driver.executeScript("arguments[0].click();", element);

// in the above case it adds the element to arguments and performs a click() (in JavaScript in the browser) on our element

String htmlsnippet = (String) driver.executeScript("return document.querySelector('#myid').outerHTML");

// this time we use native JavaScript on the browser to find an element and return its HTML, bypassing Selenium's ability to do so.

The above two examples illustrate ways you can accomplish in JavaScript what you would normally use Selenium for.

Why would you do this?

Well, sometimes the driver has a bug, or it can be more efficient (or reliable) to do in JavaScript, or you might want to combine multiple actions in 1 WebDriver call.
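In the Python bindings no cast is needed, because every driver exposes execute_script() directly. Here is a small sketch of the same two ideas. The helper names are hypothetical, and RecordingDriver is a stand-in so the example runs without a browser; with a real driver you would pass a webdriver instance and a real WebElement instead.

```python
class RecordingDriver:
    """Stand-in for a real WebDriver: records scripts instead of running them."""

    def __init__(self):
        self.scripts = []

    def execute_script(self, script, *args):
        # same call shape as Selenium's execute_script(script, *args)
        self.scripts.append((script, args))
        return None


def click_with_javascript(driver, element):
    """Click an element from JavaScript instead of calling element.click()."""
    driver.execute_script("arguments[0].click();", element)


def outer_html(driver, css_selector):
    """Find an element in the browser with querySelector and return its HTML."""
    return driver.execute_script(
        "return document.querySelector(arguments[0]).outerHTML;", css_selector
    )


driver = RecordingDriver()
click_with_javascript(driver, "<element placeholder>")
outer_html(driver, "#myid")
print(driver.scripts)
```

Passing the selector as an argument, rather than pasting it into the script string, avoids quoting problems inside the JavaScript.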

Use Python @contextmanager decorator to start and stop WebDriver

One of the most frustrating simple annoyances with using Selenium is the need to manage the creation and destruction of your WebDriver instances.

If your configuration isn’t correct, it won’t start. And if your test does not complete successfully (or if you forget to close it down properly) you will have an orphaned browser window and a stray webdriver process, for example chromedriver or geckodriver.

I can’t count the number of times I’ve had to go into the command line and type:

(on Windows)

taskkill /F /IM chromedriver.exe /T

(on Linux or Mac)

killall chromedriver

or

ps -ef | grep '[c]hromedriver' | awk '{print $2}' | xargs kill -9

When starting a Remote WebDriver instance, you need the Selenium Server URL (or command executor) and Desired Capabilities:

from selenium import webdriver

selenium_url = "http://localhost:4444"
capabilities = {"browserName": "chrome"}
driver = webdriver.Remote(command_executor=selenium_url, desired_capabilities=capabilities)

driver.quit()

And when you’re done, you need to make sure that you call

driver.quit()

The obvious solution is to make sure that you set up and tear down your webdriver instance. But in practice this means wrapping your code in try/except/finally blocks or some other mechanism:

from selenium import webdriver
from selenium.common.exceptions import WebDriverException

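driver = None
try:
    driver = webdriver.Chrome()
    # ... use the driver here ...
except WebDriverException as e:
    print(e)
finally:
    # driver is still None if the browser never started
    if driver is not None:
        driver.quit()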

PyTest has a cool fixture mechanism, so you can do this:

from selenium import webdriver
import pytest

@pytest.fixture
def driver():
    print("starting webdriver")
    driver = webdriver.Remote(command_executor="http://localhost:4444", desired_capabilities={"browserName": "chrome"})
    
    yield driver # passes execution control to your test code

    print("stopping webdriver")
    driver.quit()

And your test can look as simple as this, by passing in the fixture as an argument to your test:

def test_with_webdriver(driver):
    driver.get("https://fijiaaron.wordpress.com")
    print(driver.title)

But what if you’re not using Pytest? What if you’re not actually testing, but using Selenium for process automation or data scraping?

Python has a couple of very useful tools that can help you manage your driver instance.

Using a decorator (much like the pytest fixture, which is, in fact, a decorator), you can wrap a function and return another function. What Pytest is doing is driving a generator function: the yield statement hands a value (here, the driver) to the caller and pauses the function, much like a return statement, except that execution can resume after the yield to run the teardown. This means you can create a decorator that turns your simple setup and teardown generator into something your automation can step into and out of.

It’s possible to do this yourself, but this is such a common pattern, that Python already has a library built in for doing this, contextlib: https://docs.python.org/3/library/contextlib.html
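For the curious, a hand-rolled version might look roughly like the sketch below. This is a simplified illustration, not how contextlib is actually implemented, and the names simple_contextmanager and fake_driver are hypothetical. A string stands in for the browser so the example runs without Selenium.

```python
import functools


class _GeneratorContext:
    """Drive a one-yield generator through the with-statement protocol."""

    def __init__(self, generator):
        self._generator = generator

    def __enter__(self):
        # run the setup code up to the yield and hand back the yielded value
        return next(self._generator)

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            if exc_value is None:
                # resume after the yield so the teardown code runs
                next(self._generator)
            else:
                # re-raise the error inside the generator so its finally block runs
                self._generator.throw(exc_value)
        except StopIteration:
            pass  # the generator finished normally
        except BaseException as error:
            if error is not exc_value:
                raise  # a new error raised during teardown
        return False  # never swallow an exception from the with block


def simple_contextmanager(generator_function):
    """Decorator that turns a setup/yield/teardown generator into a context manager."""

    @functools.wraps(generator_function)
    def wrapper(*args, **kwargs):
        return _GeneratorContext(generator_function(*args, **kwargs))

    return wrapper


events = []


@simple_contextmanager
def fake_driver(name):
    events.append(f"starting {name}")
    try:
        yield name
    finally:
        events.append(f"quitting {name}")


with fake_driver("browser") as driver:
    events.append(f"using {driver}")

print(events)
```

The real contextlib version handles more edge cases, which is a good reason to use it instead of rolling your own.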

Specifically, we want to use the contextmanager decorator:

from contextlib import contextmanager

@contextmanager
def managed_resource(*args, **kwds):
    # Code to acquire resource, e.g.:
    resource = acquire_resource(*args, **kwds)
    try:
        yield resource
    finally:
        # Code to release resource, e.g.:
        release_resource(resource)

>>> with managed_resource(timeout=3600) as resource:
...     # Resource is released at the end of this block,
...     # even if code in the block raises an exception

You’ve probably seen something like this before when you read a file:

with open("file.txt") as file:
    for line in file.readlines():
        print(line)

This handles the opening and closing of the file handle.

You can do the same thing with a WebDriver instance:

from contextlib import contextmanager
from selenium import webdriver 

url = "https://fijiaaron.wordpress.com"

@contextmanager
def chromedriver(*args, **kwargs):
    print("starting webdriver")
    driver = webdriver.Chrome()    
    try: 
        yield driver
    finally:
        print("quitting webdriver")
        driver.quit()

with chromedriver() as driver:
    driver.get(url)
    print(driver.title)

You can even pass in arguments including your configuration capabilities to webdriver:

@contextmanager
def browser(*args, **kwargs):
    print(f"starting webdriver with {kwargs}")
    driver = webdriver.Remote(**kwargs)    
    try: 
        yield driver
    finally:
        print("quitting webdriver")
        driver.quit()

with browser(command_executor="http://localhost:4444", desired_capabilities={"browserName": "firefox"}) as driver:
    driver.get(url)
    print(driver.title)

Should employers require a college degree?

This was my response to a thread on Code Mentor about hiring developers without a degree.

If a company requires a degree and you don’t have one — it seems like a mutually beneficial filter.

Personally, I’d rather hire someone based on skills and ability to learn and contribute. And I’d rather work for somewhere that values the same thing.

A university degree has traditionally meant someone who comes from the upper classes who is willing to conform. For a brief period (in the late 1900s) that probably changed to include people with intellectual curiosity and academic excellence, and so employers thought that a degree was a reasonable proxy indicator for that.

Then it was thought that everyone should have a degree so that everyone can have “good” jobs, but it doesn’t work like that. You can’t assume that because outcome B indicates attribute A, giving everyone B will make them all possess A by association. So degrees, being worthless, became undervalued.

And now we’re back to the condition that a degree primarily indicates inherited social status (but for a larger group), and it doesn’t indicate academic effort at all, but rather social conformity. In fact, it rigorously filters for it. So those with intellectual curiosity are not only turned off by, but shunned by the university system.

Ergo, I don’t want to work for a company that requires a college degree, and they probably don’t want to hire someone like me, anyways, even if I do have a degree.