I used Selenium to web-scrape the Fortune 100 website and drew a radial chart based on revenue and market value.
I usually use BeautifulSoup in Python to scrape data from a page's HTML. However, the Fortune 100 page does not render the full table in its HTML until you click a button to see more, which makes BeautifulSoup useless by itself. I used this blog post to learn how to use Selenium, because someone else had already faced (and solved) this problem before me.
Selenium controls the browser through a webdriver, which allows me to use features like waiting a few seconds to let the next page load. To use one, I downloaded a Chrome webdriver and loaded it with this code.
    from selenium import webdriver

    # Open browser
    driver = webdriver.Chrome('/path_to_chromedriver/chromedriver')
    url = 'https://fortune.com/fortune500/2019/search/'
    driver.get(url)
To overcome the delayed page rendering, my program waits up to 10 seconds for the table rows to appear before reading them off the website.
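    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Find all the rows on the page.
    # Wait up to 10 seconds before throwing an exception.
    # The expected condition is checked every 500 milliseconds until it succeeds.
    rows = WebDriverWait(driver, 10).until(
        EC.presence_of_all_elements_located((By.CLASS_NAME, 'rt-tr-group')))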
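To make the "check every 500 milliseconds until it succeeds" idea concrete, here is a rough pure-Python sketch of that polling pattern. It is only an illustration of the concept, not Selenium's actual implementation; `wait_until` and `rows_present` are hypothetical names I made up, and the fake condition simply succeeds on its third call.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Raises TimeoutError if `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for "the rows are on the page": succeeds on the third check.
attempts = []


def rows_present():
    attempts.append(1)
    return ["row"] * 10 if len(attempts) >= 3 else []


found_rows = wait_until(rows_present, timeout=2.0, poll_interval=0.01)
print(len(found_rows), "rows found after", len(attempts), "checks")
```

The point of polling instead of a fixed sleep is that the program continues as soon as the rows show up, and only fails if they never do.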
Finally, this block of code allows me to click the “Next” button directly on the website, which will render the next 10 companies, adding them to the HTML code.
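    import time

    # Go to next page
    page += 1
    # find_element(By.XPATH, ...) works in Selenium 3 and 4;
    # the older find_element_by_xpath helper was removed in newer Selenium 4 releases.
    next_button = driver.find_element(By.XPATH, "//div[contains(@class,'-next')]")
    next_button.click()
    time.sleep(5)  # give the next page time to render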
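The wait, read, and click steps above still need a loop around them. Below is one possible way to structure that loop, written so it can run without a browser. The names `collect_pages`, `read_rows`, and `go_to_next` are hypothetical, and the sketch assumes each page shows a fresh set of rows. With Selenium, `read_rows` would wrap the `WebDriverWait` call and `go_to_next` would wrap the click and the sleep.

```python
def collect_pages(read_rows, go_to_next, max_pages):
    """Read up to `max_pages` pages, moving to the next page after each one.

    `read_rows` returns the rows currently shown; `go_to_next` returns
    False when there is no further page to load.
    """
    all_rows = []
    for page in range(1, max_pages + 1):
        all_rows.extend(read_rows())
        if page == max_pages:
            break  # page limit reached, no need to click again
        if not go_to_next():
            break  # no more pages available
    return all_rows


# Fake three-page "website" standing in for the browser (placeholder data).
fake_pages = [["row 1", "row 2"], ["row 3", "row 4"], ["row 5"]]
state = {"index": 0}


def read_fake_rows():
    return list(fake_pages[state["index"]])


def go_to_fake_next():
    if state["index"] + 1 >= len(fake_pages):
        return False
    state["index"] += 1
    return True


scraped = collect_pages(read_fake_rows, go_to_fake_next, max_pages=10)
print(scraped)
```

Keeping the loop separate from the browser calls also makes it easy to test the paging logic without opening Chrome.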
My first chart was based on a very nice visual I saw on Tableau Public. Tableau can create polygons from a list of vertices, similar to filling in shapes in CSS using paths. I coded this chart using the same concept and d3-arc.
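To show what "polygon from vertices" means for an arc, here is a small Python sketch that computes the outline of one ring segment. It is my own illustration, not the chart's actual code: `arc_polygon` is a hypothetical helper, and it assumes angles in radians measured clockwise from 12 o'clock with the y axis pointing up.

```python
import math


def arc_polygon(inner_radius, outer_radius, start_angle, end_angle, steps=16):
    """Return (x, y) vertices outlining a ring segment (annular sector).

    Walks along the outer edge from start to end, then back along the
    inner edge, so the points form a closed shape when connected in order.
    """
    def point(radius, angle):
        return (radius * math.sin(angle), radius * math.cos(angle))

    angles = [start_angle + (end_angle - start_angle) * i / steps
              for i in range(steps + 1)]
    outer = [point(outer_radius, a) for a in angles]
    inner = [point(inner_radius, a) for a in reversed(angles)]
    return outer + inner


# A quarter-ring between radius 1 and 2, from 12 o'clock to 3 o'clock.
vertices = arc_polygon(1.0, 2.0, 0.0, math.pi / 2, steps=4)
for x, y in vertices:
    print(round(x, 3), round(y, 3))
```

More steps give a smoother curve; a tool that draws true arcs does not need the intermediate points at all.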
I was also introduced to transitions, which reorder the companies and color the appropriate level based on which metric is selected.
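The reordering boils down to sorting the companies by the selected metric and giving each one a new angular slot; the transition then animates between the old and new slots. Here is a rough sketch of that data step in Python. The function name `angular_slots` is hypothetical, and the companies and numbers are made-up placeholders, not scraped values.

```python
import math


def angular_slots(companies, metric):
    """Sort by `metric` (largest first) and assign equal angular slices."""
    ordered = sorted(companies, key=lambda c: c[metric], reverse=True)
    if not ordered:
        return []
    slice_width = 2 * math.pi / len(ordered)
    return [
        {"name": c["name"], "start": i * slice_width, "end": (i + 1) * slice_width}
        for i, c in enumerate(ordered)
    ]


# Placeholder data for illustration only.
placeholder_companies = [
    {"name": "A", "revenue": 300, "market_value": 100},
    {"name": "B", "revenue": 200, "market_value": 400},
    {"name": "C", "revenue": 100, "market_value": 250},
]

by_revenue = [slot["name"] for slot in angular_slots(placeholder_companies, "revenue")]
by_value = [slot["name"] for slot in angular_slots(placeholder_companies, "market_value")]
print(by_revenue, by_value)
```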
My second chart was based on a college tuition ranking I saw on Visual Capitalist. Like the first chart, it used scales and arcs, but it also included circles to show revenue.
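A scale is just a function that maps a data value to a pixel value. The sketch below shows two common kinds in plain Python, roughly in the spirit of D3's `scaleLinear` and `scaleSqrt`. The helper names and the example numbers are hypothetical. A square-root scale is one common choice for circles because it makes the circle's area, rather than its radius, proportional to the value; I am showing it as an option, not as a description of exactly what this chart does.

```python
import math


def linear_scale(domain, output_range):
    """Map values from `domain` (min, max) linearly onto `output_range`."""
    d0, d1 = domain
    r0, r1 = output_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def sqrt_radius_scale(domain_max, radius_max):
    """Map a value to a circle radius so that area grows linearly with value."""
    def scale(value):
        return radius_max * math.sqrt(value / domain_max)

    return scale


arc_length = linear_scale((0, 100), (0, 50))
circle_radius = sqrt_radius_scale(100, 10)

print(arc_length(50))     # halfway through the domain
print(circle_radius(25))  # a quarter of the maximum value
```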
Fun Fact: Technology companies (Apple, Amazon, Google, Microsoft, Facebook, Oracle) tend to have the highest top-line sales.