• Claire Matuka

Automating Python Scripts

Updated: Oct 2, 2021

If you are anything like me, you have probably read a million blog posts trying to figure out how to automate your Python scripts. As I fumbled around the internet trying to figure this out, I went through a lot of trial and error, and a few "AHA!" moments. This article will save you from the fumbling: it is a step-by-step tutorial on actually getting this done.





Before we get started, here are a few things to note about this particular tutorial:

  • The methods described apply to Windows users only. Sorry, Mac and Linux :( Better luck next time

  • I use Anaconda, but I will also show how to do this with a standard Python installation

  • I ran this locally on my laptop (no fancy infrastructure needed)

Fun Fact: A cron job is a scheduled task run by cron, the time-based job scheduler in Unix-like operating systems (Linux, macOS). The Windows equivalent is Task Scheduler.

Web Scraping

My goal is to scrape a stocks website and store the information in an Excel sheet on my laptop. The Task Scheduler will allow me to scrape the website several times a day and keep the information up to date.


I hope you are excited. Let's get started!



1. Creating an environment


Quick disclaimer: I personally love having different Anaconda environments for different kinds of tasks or projects. If you use one environment for everything (base), or are using standard Python instead of Anaconda Python, feel free to skip to step 2.


After launching the Anaconda Prompt, I typed in the following command to create my web scraping environment:

conda create --name myenv

Where "myenv" is the name of the environment.
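
Once the environment is activated (conda activate myenv), it helps to confirm that Python is really running from it, because the same python.exe location is needed again in step 5. Here is a minimal sketch of such a check. The function name describe_interpreter is my own, and CONDA_DEFAULT_ENV is the variable conda sets on activation, so it will be empty under standard Python.

```python
import os
import sys


def describe_interpreter():
    """Return where the running Python lives and which conda env (if any) is active."""
    return {
        # Full path to the interpreter running this code (python.exe on Windows)
        "executable": sys.executable,
        # Root folder of the environment that interpreter belongs to
        "prefix": sys.prefix,
        # Set by `conda activate`; None when conda is not in use
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the "executable" line does not point inside the environment's folder, the environment is probably not activated in that prompt.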



2. Installing Packages


Anaconda has made it so easy to install packages, thanks to the Anaconda Navigator. Simply navigate to the Environments tab, select your environment and install all the packages you need.


For those who simply prefer to work using the Anaconda Prompt, or are working with Python, you can easily install packages using the good ol' pip install command.


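For this particular project, I installed the following packages into my new environment: jupyter, pandas and bs4. The script also uses urllib, but that ships with Python's standard library and needs no installation. Note that pandas relies on a separate package, openpyxl, to write .xlsx files, so install it too if it is not already present.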
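
To check that the packages actually landed in the environment you plan to schedule, you can ask Python whether it can find them. This is a small sketch using only the standard library; the function name missing_modules is made up for illustration, and the list of names is my assumption about what the scraping script needs.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found here."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the scraping script; openpyxl is what pandas
    # uses by default to write .xlsx files.
    wanted = ["pandas", "bs4", "openpyxl"]
    missing = missing_modules(wanted)
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Run it with the same python.exe that the scheduled task will use, otherwise the answer describes a different environment.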



3. Write code


As mentioned earlier, my goal is to scrape a stocks website, in my case Yahoo Finance. Because this tutorial is mainly focused on automating the script, I chose to get information on only 3 stocks: Apple, Amazon and Tesla. This information is then exported to an Excel file that I can easily access.


After writing the script, I stored it as a .py file (yahooscrape.py). In case you are curious, here is my code:

from bs4 import BeautifulSoup
from time import sleep
from random import randint
from urllib.request import Request, urlopen
import pandas as pd

apple = "https://finance.yahoo.com/quote/AAPL?p=AAPL"
amazon = "https://finance.yahoo.com/quote/AMZN?p=AMZN"
tesla = "https://finance.yahoo.com/quote/TSLA?p=TSLA"
all_links = [apple, amazon, tesla]


# creating empty lists to store the information
symbol = []
last_price = []
change = []
pct_change = []

# loop over all links
for link in all_links:

    # open the link, sending a browser-like User-Agent header
    req = Request(link, headers={'User-Agent': 'Mozilla/5.0'})
    webpage = urlopen(req).read()

    # pause 2 to 10 seconds between requests
    sleep(randint(2, 10))

    # parse the HTML
    soup = BeautifulSoup(webpage, 'html.parser')

    # find the necessary data (the class names come from the page's HTML)
    link_symbol = soup.find(class_='D(ib) Fz(18px)').get_text()
    link_title = soup.find(class_='D(ib) Mend(20px)')
    link_last_price = link_title.find_all('span')[0].get_text()
    # the second span holds the change and the percentage change, separated by a space
    link_changes = link_title.find_all('span')[1].get_text()
    link_change = link_changes.split(" ")[0]
    link_pct_change = link_changes.split(" ")[1]

    # add the data to the respective lists
    symbol.append(link_symbol)
    last_price.append(link_last_price)
    change.append(link_change)
    pct_change.append(link_pct_change)


mydict = {'Symbol': symbol, 'LastPrice': last_price, 'Change': change, 'PctChange': pct_change}
mydf = pd.DataFrame(mydict)

mydf.to_excel('D:/scrapped data/MajorStocks.xlsx', sheet_name='data')
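
One fragile spot in a script that runs unattended is the step that splits the change text in two: if the page ever returns something unexpected, the script stops with an IndexError. Below is a hedged sketch of a more forgiving helper. The name split_change is hypothetical, the sample format "+1.23 (+0.85%)" is my assumption about what the text looks like, and unlike the script above it also strips the parentheses around the percentage.

```python
def split_change(text):
    """Split a string such as '+1.23 (+0.85%)' into ('+1.23', '+0.85%').

    Returns (None, None) when the text does not have exactly two parts,
    so one odd page does not stop the whole run.
    """
    parts = text.split()
    if len(parts) != 2:
        return None, None
    change, pct_change = parts
    # Drop the parentheses that wrap the percentage
    return change, pct_change.strip("()")


if __name__ == "__main__":
    print(split_change("+1.23 (+0.85%)"))
    print(split_change(""))
```

In the loop, the two split lines could then be replaced by a single call such as link_change, link_pct_change = split_change(link_changes).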



4. Adding Anaconda or Python to PATH variable in Windows


Now, this right here is one of the most crucial steps in this tutorial, because it tells Windows where to find the Python interpreter that will run the script you have just created. When installing Anaconda or Python, we often select the option of adding it to our PATH variable. In case you did not do this, no worries, you can still add it.

  • On your PC, search "environment variables" and open "Edit the system environment variables"

  • Select the Path variable and click on Edit

  • Confirm that Python and Anaconda are present. In my case, both appear at the top of the list. If they are missing, simply click on "New" and enter the location of your Anaconda or Python installation
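
After editing the Path variable, open a new Command Prompt (so the change is picked up) and check what Windows can now find. The following is a minimal sketch using the standard library; the function name path_entries_containing and the keywords passed to it are my own choices, not anything Windows or Anaconda prescribes.

```python
import os
import shutil


def path_entries_containing(keywords, path_value=None):
    """Return PATH folders whose name mentions any of `keywords` (case-insensitive)."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return [
        entry for entry in entries
        if any(keyword.lower() in entry.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    # shutil.which returns the first match on PATH, or None if there is none
    print("python resolves to:", shutil.which("python"))
    print("conda resolves to:", shutil.which("conda"))
    for entry in path_entries_containing(["python", "anaconda"]):
        print("PATH entry:", entry)
```

If typing python in the new Command Prompt does not even start the interpreter, that by itself tells you the PATH entry is still missing.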


5. Create a .bat executable file


Open Notepad and enter the location of your python.exe, followed by the location of the Python script you just created. I made sure to select the python.exe file located in the Anaconda environment I used (very, very, very important). Enclose both paths in double quotes and leave a space between them, so the file contains a single line of this form:

"<location of python.exe>" "<location of yahooscrape.py>"

For normal Python users, simply enter the location of your python.exe file in place of the Anaconda one.


Then save the file as a .bat file, preferably in the same location as your Python script. To do this, just enter the name of the file, add .bat, and save (yahooscrape.bat).
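
If you would rather not type the paths by hand, Python can write the .bat file for you. This is only a sketch under a few assumptions: the function names bat_line and write_bat are made up, the D:\scripts location in the example is hypothetical, and sys.executable is used on the assumption that you run the snippet from the same environment the scheduled task should use.

```python
import sys
from pathlib import Path


def bat_line(python_exe, script_path):
    """Build the single line the .bat file needs: both paths quoted, one space between."""
    return f'"{python_exe}" "{script_path}"'


def write_bat(script_path, python_exe=sys.executable):
    """Write <script name>.bat next to the script and return its path."""
    script_path = Path(script_path)
    bat_path = script_path.with_suffix(".bat")
    bat_path.write_text(bat_line(python_exe, script_path) + "\n")
    return bat_path


if __name__ == "__main__":
    # Hypothetical location; replace it with wherever yahooscrape.py is saved.
    # This only prints the line; call write_bat(...) to create the file.
    print(bat_line(sys.executable, r"D:\scripts\yahooscrape.py"))
```

Pass the full path of the script to write_bat, since Task Scheduler will not start the .bat file from the folder you happen to be working in.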


You can now easily run the script by simply double-clicking on the .bat file you just created. This is important to note because the Task Scheduler will simply be performing this task.
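
When the Task Scheduler runs the .bat file, nobody is watching the console, so an error can go unnoticed for days. One way around this is to have the script write to a log file. Here is a rough sketch with Python's logging module; make_logger and run_logged are hypothetical names, and it assumes the scraping code has been wrapped in a function that can be passed in as job.

```python
import logging


def make_logger(log_file):
    """Return a logger that appends timestamped messages to `log_file`."""
    logger = logging.getLogger("yahooscrape")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        # The folder that holds log_file must already exist
        handler = logging.FileHandler(log_file, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def run_logged(job, logger):
    """Run job(); log success or the full traceback, and return True or False."""
    try:
        job()
    except Exception:
        logger.exception("Scrape failed")
        return False
    logger.info("Scrape finished")
    return True
```

With the scraping code inside a function called, say, scrape, the last line of the script would become run_logged(scrape, make_logger(...)) with a log file path of your choosing.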



6. Create a task on Task Scheduler

  • On your PC, search "Task Scheduler" and open the Task Scheduler app

  • On the right side menu, click on "Create Task"

  • Name your task

  • On the top bar, select "Triggers" and click on "New". This is where you decide how often you want the script to run. In my case, I set it to run every hour. Feel free to play around with the settings and adjust them to your preferences.

  • On the top bar, select "Actions" and click on "New", then browse to the location of the .bat file you created and click OK

  • You can adjust the settings on the different tabs displayed to suit your needs. When done, click on OK and just like that, you have automated your Python Script!!!!
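
For the curious, Windows also ships a command-line tool, schtasks, that can create the same kind of task without clicking through the app. This is not the method used in this tutorial, just a sketch of the alternative: the function names, the task name "YahooScrape" and the D:\scripts path are all hypothetical, and wrapping the .bat path in an extra pair of quotes is my assumption for keeping paths with spaces in one piece.

```python
import subprocess
import sys


def build_schtasks_command(task_name, bat_path, every_hours=1):
    """Return the schtasks argument list for a task that repeats every N hours."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,          # task name shown in Task Scheduler
        "/TR", f'"{bat_path}"',    # program to run (quoted; see the note above)
        "/SC", "HOURLY",           # schedule type
        "/MO", str(every_hours),   # modifier: run every N hours
    ]


def create_task(task_name, bat_path, every_hours=1):
    """Create the task; only meaningful on Windows, where schtasks exists."""
    if sys.platform != "win32":
        raise OSError("schtasks is only available on Windows")
    subprocess.run(build_schtasks_command(task_name, bat_path, every_hours), check=True)


if __name__ == "__main__":
    # Only prints the arguments; call create_task(...) to register the task.
    print(build_schtasks_command("YahooScrape", r"D:\scripts\yahooscrape.bat"))
```

Whichever route you take, the task shows up in the Task Scheduler app, where its triggers and settings can still be edited.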


**********


Thank you for making it this far. I hope this tutorial has been of help to you. It sure was a lot of fun for me to write. Let's keep coding.