Walmart.ca Ramen Tracker

[Image: Indomie]

Introduction

Alright, so this is a weird one, but it’s a great example of the type of thing I use programming for day to day. There’s this ramen from Walmart that I really enjoy called Indomie. If you haven’t had it, seriously, give it a go.

I’ve recently moved, and where I am now Indomie is a popular commodity, so the local Walmart is almost always out of stock. This is a problem. How can we solve it?

Brainstorming

As usually happens, the first thing that pops into my head is “I could definitely make a Python script to handle this”. Let’s make a simple script that, when executed, downloads the Indomie product page and checks the stock. If there’s new stock available, I’ll send a notification to myself somehow.

Many people have told me that Python is a terrible choice to program in because it’s slow and has high memory consumption, and they aren’t necessarily wrong. But in this instance it’s a great choice: I don’t care how fast the program is, and with Python I can write a quick and easy script without making the task more complicated than it needs to be. When I publish my production build for millions we'll start talking about porting to C. /s

Intelligence gathering

So before we can start pulling down all this amazing data, we need to find out where it lives. I’ll begin with the tool loved by all: “Inspect Element” in a web browser.

From the inspector we can see that the stock count, currently “Out of Stock”, sits within a span with an obviously generated class name. In fact, this entire site looks pre-generated, which isn’t surprising seeing how its entire purpose is to host dynamic content.

So what we could do is use an XPath expression for this element to pull it out of the page in Python. It might look something like this:

//div[@data-automation="find-in-store-table-grid-item"]//div[2]//span
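
To make that concrete, here is a minimal sketch of the XPath approach using only Python’s standard library. The HTML fragment is made up for illustration (the real page markup is not reproduced here), and `xml.etree.ElementTree` only copes with well-formed markup and a subset of XPath, so a real scraper would more likely reach for a dedicated HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the store table markup.
SAMPLE_HTML = """
<html><body>
  <div data-automation="find-in-store-table-grid-item">
    <div><span>Guelph Supercentre</span></div>
    <div><span>Out of Stock</span></div>
  </div>
</body></html>
"""

# Same path as above, made relative (".//") because ElementTree
# does not accept absolute paths on an element.
STOCK_XPATH = (
    ".//div[@data-automation='find-in-store-table-grid-item']"
    "//div[2]//span"
)


def extract_stock_text(markup):
    """Return the text of the first span matched by STOCK_XPATH, or None."""
    root = ET.fromstring(markup)
    span = root.find(STOCK_XPATH)
    return span.text if span is not None else None


print(extract_stock_text(SAMPLE_HTML))
```

The obvious weakness is that the path is tied to the page layout: if the markup shifts even slightly, the lookup quietly returns nothing.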

Three years ago that’s exactly what I would have done: download the page and use XPath to get the element value I want. But this isn’t the best solution.

Most modern websites use APIs behind the main page that allow them to quickly fetch and update dynamic content on the fly. If it’s possible, it’s much better to download a tiny data file with the information we need than to fetch an entire web page.

To investigate this, let’s have a look at the “Network” tab in the Firefox (or Chrome) developer console. Once you open it, hit the “Reload” button and you’ll see all the requests the page makes in a neat list. I’ll start from the bottom, looking for data files like JSON or XML until I see something that looks interesting.

Sure enough there’s a request with a JSON response at an endpoint called “find-in-store”. Here’s a snippet of what the response data looks like:

{
    "distance": 1.59325056,
    "id": 3144,
    "displayName": "Guelph Supercentre",
    "intersection": "Woodlawn Ave & Woolwich St",
    "sellPrice": 1.97,
    "availableToSellQty": 0
}

And there we have it, that’s exactly the data we are looking for.

So we’ve successfully found our data. Now we just have to worry about getting it into Python so we can send notifications. We can start by copying the request URL from the Firefox developer console:

[Image: copy api url]

Before downloading directly with Python, we need to take care of some gotchas. It’s not uncommon for website developers to block direct downloads of certain files unless you have some sort of cookie, which is meant to stop exactly what we are doing.

To check if this is the case I’ll use the command-line utility “curl” to fetch the data at our copied URL:

[Image: curl error]

Yep, it’s blocked. Well, that’s unfortunate. This is the part where we have to start trying things to make the website happy. The first step of most anti-bot measures is to deny untrusted User-Agents. If you haven’t heard of them before, a User-Agent is a string the browser sends with each request that tells the website which browser and platform you are running.

I’m going to try setting a User-Agent string telling the website I am on a Windows machine using Firefox, when I’m actually downloading it through curl on Linux.

Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0

[Image: curl good]

And there we go! It’s not usually this easy so I’m thankful it worked.
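
The same trick carries over to Python. As a rough sketch using only the standard library (the script further down uses Requests instead), the header can be attached to a request object before anything is sent. The URL here is a placeholder, not the real endpoint.

```python
import urllib.request

# User-Agent string from the curl experiment above.
FIREFOX_UA = (
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:59.0) "
    "Gecko/20100101 Firefox/59.0"
)


def build_request(url, user_agent=FIREFOX_UA):
    """Build a GET request that presents itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder URL for illustration only; nothing is sent over the network here.
req = build_request("https://example.com/api/find-in-store")
print(req.get_header("User-agent"))

# Actually sending it would look like this:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     body = resp.read().decode("utf-8")
```

Note that `urllib` stores header names with only the first letter capitalized, which is why the lookup uses `"User-agent"`.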

Implementing our Python script

Now I’m going to create a Python script with the name walmart_notifier.py and put in my general boilerplate. I need to pass headers to the website when downloading the data, so I’ll use the popular Python library Requests to perform the web fetching. We should also define some constants for the information in the web request. In this request we pass our latitude and longitude (likely so the Walmart API can give us our local stores) as well as the product’s UPC.

Here’s what our simple download script will look like:

#!/usr/bin/env python

import requests

# Product UPC and our coordinates, as passed in the find-in-store request
upc = 8968617079
coords = {'lat': '43.5588', 'lon': '-80.3004'}
url = "https://www.walmart.ca/api/product-page/find-in-store?latitude=" + str(coords['lat']) + "&longitude=" + str(coords['lon']) + "&lang=en&upc=" + str(upc)

# Pretend to be a regular browser so the request isn't rejected
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0'
}

# Fetch the raw JSON text from the endpoint
jsonResp = requests.get(url, headers=headers).text

print(jsonResp)

[Image: test script]
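
The string concatenation in the script works, but it gets hard to read as parameters pile up. One possible alternative, sketched here with the standard library’s `urllib.parse.urlencode`, builds the query string from a dictionary. The helper name `build_url` is made up for this example.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.walmart.ca/api/product-page/find-in-store"


def build_url(lat, lon, upc, lang="en"):
    """Return the find-in-store URL for a location and product UPC."""
    query = urlencode({
        "latitude": lat,
        "longitude": lon,
        "lang": lang,
        "upc": upc,
    })
    return BASE_URL + "?" + query


print(build_url("43.5588", "-80.3004", 8968617079))
```

Requests can also do this itself if you pass the dictionary through the `params` argument of `requests.get`.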

Once we have basic downloading down, we can start working on parsing the JSON. The Walmart API returns a few of the closest stores, but in my case I only care about one store. It would be nice to specify the stores we care about. Each object in the JSON array has a store ID, so let’s just filter by that.

#!/usr/bin/env python

import requests
import json

upc = 8968617079
coords = {'lat': '43.5588', 'lon': '-80.3004'}
store_ids = [1199]
url = "https://www.walmart.ca/api/product-page/find-in-store?latitude=" + str(coords['lat']) + "&longitude=" + str(coords['lon']) + "&lang=en&upc=" + str(upc)

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0'
}

jsonResp = requests.get(url, headers=headers).text

# Parse the response text into Python objects
obj = json.loads(jsonResp)

# Only report the stores listed in store_ids
for item in obj['info']:
    if item['id'] in store_ids:
        print("--- Store: " + item['displayName'] + " @ " + item['intersection'])
        print("⇒   " + str(item['availableToSellQty']) + " Units available")

[Image: real script]
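
Because the filtering is just a loop over dictionaries, it can be tried out without touching the network. The sketch below wraps it in a function and feeds it a hand-written response. The first record reuses fields from the JSON snippet shown earlier, while the second store (ID, name, intersection, and quantity) is invented purely for illustration.

```python
import json

# Hand-written stand-in for the API response. The 'info' key matches the
# script above; the second store record is invented for illustration.
SAMPLE_RESPONSE = """
{
  "info": [
    {"id": 3144, "displayName": "Guelph Supercentre",
     "intersection": "Woodlawn Ave & Woolwich St", "availableToSellQty": 0},
    {"id": 9999, "displayName": "Example Store",
     "intersection": "Main St & First Ave", "availableToSellQty": 12}
  ]
}
"""


def stores_of_interest(response_text, store_ids):
    """Return the store records whose 'id' is in store_ids."""
    data = json.loads(response_text)
    return [item for item in data["info"] if item["id"] in store_ids]


for store in stores_of_interest(SAMPLE_RESPONSE, [3144]):
    print(store["displayName"], store["availableToSellQty"])
```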

Sending notifications

Okay, so we finally have our data. Now we just need to send ourselves some notifications. There’s a ridiculous number of ways to do this, and luckily it relates to another problem I’m having.

I would really like to have an app on my phone that will accept push notifications from multiple sources so that multiple scripts can send me updates. I used to use a custom Android app that implemented Firebase push notifications, but this was clunky and annoying to use. I would also like to avoid using a proprietary service if possible.

This is where I found Gotify:

[Image: gotify ui]

Gotify allows you to host a custom push notification server that will pair up with their Android app which can be found on F-Droid.

I won’t go over the full setup as that could be an entire article in itself, but I’ll show you how to send notifications from your scripts using it. Their documentation is really simple and easy to follow if you want to give it a try though: Gotify Docs

Once the Gotify server is set up, it will have a REST API running which you can use to send notifications from your code. Following the format in the documentation, I’ll include this new function and message template in my script:

message = "{units} available at {storename}." # notification message, wildcards: "{units}, {storename}, {upc}"

def notify(units, store):
    # str(upc) because upc is defined as an integer in the script above
    messageStr = message.replace("{units}", str(units)).replace("{storename}", store).replace("{upc}", str(upc))
    url = "http://localhost:8080/message?token=APbQ97X6xX-z._t"
    resp = requests.post(url, data={'title': 'Walmart product available', 'message': messageStr, 'priority': 10})
    print(resp.status_code)

When my script is run and the JSON file is read, I’ll use an if statement to check the value of item['availableToSellQty']. If it’s numeric and larger than zero we call our notify() function.
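
That check could look something like the following. This is only a sketch: the helper name `has_stock` is hypothetical, and treating booleans and non-numeric values as "no stock" is an assumption about how a malformed response should be handled.

```python
def has_stock(quantity):
    """Return True only for a real number greater than zero."""
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(quantity, bool):
        return False
    if not isinstance(quantity, (int, float)):
        return False
    return quantity > 0


# Example with a record shaped like the API response.
item = {"displayName": "Guelph Supercentre", "availableToSellQty": 0}
if has_stock(item["availableToSellQty"]):
    print("Would call notify() here")
else:
    print("No units available")
```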

If I give this a go with some test data I get this:

[Image: gotify notification]

And there we have it! Now if I just add my script as a cron job on my desktop, it’ll run and check the stock once every hour:

0 * * * * /usr/bin/env python /home/shane/walmart_notifier.py

Conclusion

So I’ve been running this script for a few weeks now and it seems to work pretty well. I usually only enable it a few days before I run out of ramen, and it gives me a good wake-up call to go do my weekly shopping. If you’re interested in these types of simple data-driven scripts, they are really easy to implement once you have some experience. We live in an age where data is almost infinite and heavily accessible, so there are lots of fun, crazy things you can do. A great example is Tom Scott making a Twitter bot that tweets when someone in the British Parliament edits Wikipedia.

Although every website is different, the methods I used in this post can be used practically anywhere. So if you make anything cool, send me an email or post a comment! :D

Here’s the full script I’m using now if you want to have a look:

#!/usr/bin/env python
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
# File: walmart_notifier.py
# Author: Shane Brown <contact at shanebrown dot ca>
# Description: Watch for a walmart product to become available at your local store
#||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

import sys
import json
import argparse
import requests

# USER VARIABLES - START
message = "{units} available at {storename}." # notification message, wildcards: "{units}, {storename}, {upc}"
upc = "8968617079" # change the UPC to use a different walmart product
store_ids = [1199] # Add whichever store IDs you want notifications for
coords = {'lat': '43.5588', 'lon': '-80.3004'} # Set your local coordinates so walmart knows what stores to send you
# USER VARIABLES - END

def notify(units, store):
    messageStr = message.replace("{units}", str(units)).replace("{storename}", store).replace("{upc}", upc)
    url = "<YOUR_GOTIFY_SERVER>"
    resp = requests.post(url, data={'title': 'Walmart product available', 'message': messageStr, 'priority': 10})

parser = argparse.ArgumentParser(description="Walmart product availability notifier")
parser.add_argument("-q", "--quiet", action="store_true", help="Don't print to stdout")
parser.add_argument("-d", "--dryrun", action="store_true", help="Don't send notification")
args = parser.parse_args()

url = "https://www.walmart.ca/api/product-page/find-in-store?latitude=" + coords['lat'] + "&longitude=" + coords['lon'] + "&lang=en&upc=" + upc
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:83.0) Gecko/20100101 Firefox/83.0'
}

jsonResp = requests.get(url, headers=headers).text

if not jsonResp:
    print("Invalid response")
    sys.exit(1)

obj = None
try:
    obj = json.loads(jsonResp)
except json.decoder.JSONDecodeError as err:
    print("Failed to parse json of contents:")
    print(jsonResp)
    with open("error_file.txt", "w") as handle:
        handle.write(jsonResp)
    print("Error: " + str(err))
    sys.exit(1)


available = False
for item in obj['info']:
    if item['id'] in store_ids and item['availableToSellQty'] > 0:
        available = True
        if not args.quiet:
            print("--- Store: " + item['displayName'] + " @ " + item['intersection'])
            print("⇒   " + str(item['availableToSellQty']) + " Units available")

        if not args.dryrun:
            notify(item['availableToSellQty'], item['displayName'] + " @ " + item['intersection'])

if not available and not args.quiet:
    print("No units available")