Visualization

Fortune 100

I used Selenium to web-scrape the Fortune 100 website and drew a radial chart based on revenue and market value.

Selenium

D3.js

Data Collection with Selenium

I usually use BeautifulSoup in Python to scrape data from a page’s HTML. However, the Fortune 100 page doesn’t render the full HTML until you click a button to see more, so BeautifulSoup alone isn’t enough. Someone else had faced (and solved) this problem before me, so I used this blog post to learn how to use Selenium.

I used Selenium’s webdriver, which drives a real browser and lets me do things like wait for the next page to load. To use it, I downloaded the Chrome webdriver (chromedriver) and opened the browser with this code.

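from selenium import webdriver

# Open browser (the path is a placeholder for wherever chromedriver was downloaded)
# Selenium 3 style call; Selenium 4 passes the driver path through a Service object
driver = webdriver.Chrome('/path_to_chromedriver/chromedriver')
url = 'https://fortune.com/fortune500/2019/search/'
driver.get(url)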

To handle the delayed rendering, my program waited up to 10 seconds for the rows to appear before reading them from the page.

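from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Find all the rows on the page
# Wait up to 10 seconds before throwing a TimeoutException
# The expected condition is polled every 500 milliseconds until it succeeds
rows = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'rt-tr-group')))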

Finally, this block of code clicks the “Next” button on the page, which renders the next 10 companies and adds them to the HTML.

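import time

# Go to next page
page += 1
# find_element_by_xpath is the Selenium 3 call; Selenium 4 uses driver.find_element(By.XPATH, ...)
next_button = driver.find_element_by_xpath("//div[contains(@class,'-next')]")
next_button.click()
time.sleep(5)  # give the next rows time to render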
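Put together, the snippets above form a simple loop: read the rows, click “Next”, repeat. Here is a minimal sketch of that loop. The function name `collect_pages` and its parameters are my own placeholders, not part of Selenium, and the two browser steps are passed in as plain functions so the loop can be tried without opening a browser.

```python
from typing import Callable, List


def collect_pages(read_rows: Callable[[], List[str]],
                  click_next: Callable[[], None],
                  max_pages: int) -> List[str]:
    """Read the rows on each page, advancing between pages, for max_pages pages."""
    collected: List[str] = []
    for page in range(1, max_pages + 1):
        # Rows currently rendered on the page
        collected.extend(read_rows())
        # Do not click "Next" after the last page we want
        if page < max_pages:
            click_next()
    return collected
```

With Selenium, `read_rows` could wrap the `WebDriverWait` call and return `[row.text for row in rows]`, and `click_next` could wrap the button click followed by `time.sleep(5)`. For 100 companies at 10 per page, `max_pages` would be 10.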

Visualization

My first chart was based on a very nice visual I saw on Tableau Public. Tableau can draw polygons from a list of vertices, similar to filling in shapes in CSS using paths. I coded this chart using the same concept and D3’s arc generator.
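To make the polygon idea concrete, here is a rough sketch in Python (not the D3 code I used) of how one arc segment can be turned into a list of vertices. The function name `arc_polygon` and the `steps` parameter are placeholders of mine. I assume angles in radians measured clockwise from 12 o'clock and a y-axis that points down, which is the convention D3's arc generator follows in SVG.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def arc_polygon(inner_r: float, outer_r: float,
                start_angle: float, end_angle: float,
                steps: int = 16) -> List[Point]:
    """Vertices of a ring segment: along the outer edge, then back along the inner edge."""
    def point(radius: float, angle: float) -> Point:
        # Angle 0 points up; y grows downward as in SVG
        return (radius * math.sin(angle), -radius * math.cos(angle))

    angles = [start_angle + (end_angle - start_angle) * i / steps
              for i in range(steps + 1)]
    outer = [point(outer_r, a) for a in angles]
    inner = [point(inner_r, a) for a in reversed(angles)]
    return outer + inner
```

More `steps` give a smoother curve. Tableau would take these vertices as polygon points, while D3 produces the equivalent shape as a single path string.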

I also learned to use transitions, which reorder the companies and recolor the appropriate level based on which metric is selected.
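The reordering itself is just a sort followed by handing out angles. Below is a small Python sketch of that step, assuming each company is a dict with a `name` key and one key per metric. The function name `angular_slots` and the key names are hypothetical, and the actual chart did this in D3 with a transition animating between the old and new angles.

```python
import math
from typing import Dict, List, Tuple


def angular_slots(companies: List[dict], metric: str) -> Dict[str, Tuple[float, float]]:
    """Rank companies by the chosen metric (largest first) and give each an equal slice."""
    ranked = sorted(companies, key=lambda c: c[metric], reverse=True)
    step = 2 * math.pi / len(ranked)
    # Each company gets (start_angle, end_angle) in radians around the circle
    return {c["name"]: (i * step, (i + 1) * step) for i, c in enumerate(ranked)}
```

Calling it once per metric gives the start and end state for each company's arc, which is what a transition needs in order to animate the move.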

My second chart was based on a college tuition ranking I saw on Visual Capitalist. Like the first chart, it used scales and arcs, but it added circles to show revenue.
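One thing to watch when sizing circles by a value is whether the value maps to the radius or to the area. A common choice, which is what `d3.scaleSqrt` is for, is to make the area proportional to the value. The Python sketch below shows that mapping; the function name `circle_radius` and its parameters are placeholders, and I am presenting this as one reasonable option rather than the exact scale used in the chart.

```python
import math


def circle_radius(value: float, max_value: float, max_radius: float) -> float:
    """Radius such that circle area is proportional to value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Area grows with radius squared, so take the square root of the share
    return max_radius * math.sqrt(max(value, 0.0) / max_value)
```

With this mapping, a company with a quarter of the top revenue gets a circle with half the top radius, so its area is a quarter of the largest circle.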

Fun Fact: Technology companies (Apple, Amazon, Google, Microsoft, Facebook, Oracle) tend to have the highest top-line sales.