Automating the apartment hunt via Bostadsformedlingen

2018-07-29

As you may or may not know, finding an apartment in Stockholm is not easy. Your alternatives are basically to buy an apartment or to rent one, either through a subletting contract or through the public utility company Bostadsformedlingen. It turns out that neither of the latter alternatives is optimal: when subletting you often end up with an overpriced, subpar apartment (despite rent control), whereas with Bostadsformedlingen you face an unrealistically long queue.

I’ve personally been in the queue for an apartment with Bostadsformedlingen for approximately 10 years, with no luck finding an apartment so far. That is mostly because it’s a time-consuming task (mind you, the user experience on their website is terrible), as apartments are listed for only a few days at a time. Hence I thought: why not (semi-)automate this task?

While casually browsing Bostadsformedlingen I found that it loads a resource called AllaAnnonser (all listings) at the following URL: https://bostad.stockholm.se/Lista/AllaAnnonser.

[Screenshot: browser devtools on Bostadsformedlingen]

This conveniently GETs all the currently listed apartments, which gave me the idea to automate the task of downloading and filtering them. I decided to write a simple Python program (available on GitHub) to do just this.

The AllaAnnonser resource conveniently wraps the data in a JSON list, which can easily be parsed in Python.

[
   {
      "AnnonsId":144964,
      "Stadsdel":"Flemingsberg",
      "Gatuadress":"Rontgenvagen 5",
      "Kommun":"Huddinge",
      "Vaning":3,
      "AntalRum":1,
      "Yta":28,
      "Hyra":4055,
      "AnnonseradTill":"2018-07-31",
      "AnnonseradFran":"2018-07-28",
      "KoordinatLongitud":17.937958782410504,
      "KoordinatLatitud":59.223580253558971,
      "Url":"/Lista/Details/?aid=144964",
      "Antal":1,
      "Balkong":false,
      "Hiss":true,
      "Nyproduktion":false,
      "Ungdom":false,
      "Student":true,
      "Senior":false,
      "Korttid":false,
      "Vanlig":false,
      "Bostadssnabben":false,
      "Ko":"Bostadskon",
      "KoNamn":"Bostadskon",
      "Lagenhetstyp":"Studentlagenhet",
      "HarAnmaltIntresse":false,
      "KanAnmalaIntresse":false,
      "HarBraChans":false,
      "HarInternko":false,
      "Internko":false,
      "Externko":false,
      "Omraden":[
         {
            "Id":306,
            "PlatsTyp":2
         },
         {
            "Id":8,
            "PlatsTyp":1
         },
         {
            "Id":76,
            "PlatsTyp":0
         }
      ],
      "ArInloggad":false,
      "LiknadeLagenhetStatistik":{
         "KotidFordelningQ1":3,
         "KotidFordelningQ3":6
      }
   },
   ...
]

In my case I decided to build a class to handle all the data management, but the download and parsing can be captured by a simple program.

import requests

# Download all current listings; the endpoint returns a JSON list
response = requests.get('https://bostad.stockholm.se/Lista/AllaAnnonser')
if response.status_code == 200:
    data = response.json()  # list of dicts, one per listing
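
Once the list is parsed, filtering is plain Python. The following is only a sketch: the function name, its parameters, and the choice of criteria are assumptions made for illustration, and the sample is a trimmed-down pair of listings rather than real API output (the `Student` value on the second one is made up).

```python
def filter_listings(listings, max_rent=None, min_rooms=None, municipalities=None,
                    exclude_flags=("Student", "Senior", "Ungdom")):
    """Return the listings that satisfy every criterion that was given."""
    selected = []
    for listing in listings:
        rent = listing.get("Hyra")
        rooms = listing.get("AntalRum")
        # Skip listings above the rent ceiling (or with no rent at all)
        if max_rent is not None and (rent is None or rent > max_rent):
            continue
        # Skip listings with too few rooms (or with no room count)
        if min_rooms is not None and (rooms is None or rooms < min_rooms):
            continue
        if municipalities is not None and listing.get("Kommun") not in municipalities:
            continue
        # Skip restricted categories such as student or senior housing
        if any(listing.get(flag, False) for flag in exclude_flags):
            continue
        selected.append(listing)
    return selected


# Hypothetical, trimmed-down sample in the shape of the JSON above
sample = [
    {"AnnonsId": 144964, "Kommun": "Huddinge", "AntalRum": 1, "Hyra": 4055, "Student": True},
    {"AnnonsId": 145023, "Kommun": "Stockholm", "AntalRum": 2, "Hyra": 9163, "Student": False},
]
matches = filter_listings(sample, max_rent=10000, min_rooms=2, municipalities={"Stockholm"})
print([listing["AnnonsId"] for listing in matches])  # [145023]
```

Keeping every criterion optional means the same function can serve several saved searches.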

The next steps are rather straightforward. I store the data as a pandas.DataFrame and save it to disk by “pickling” it, so that I can determine which listings are “new” since the last time the program ran. My initial idea was to store the data as CSV, but the map coordinates, stored as floating-point numbers, ran into precision issues that caused slight differences between otherwise identical listings, and I wanted to get away with as little data manipulation as possible.
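
A minimal sketch of that "what is new since last time" step could look like the following. Comparing on the `AnnonsId` column is an assumption here (the post does not say how the comparison is done), and the function name and file name are hypothetical. Matching on the ID alone also avoids comparing floating-point coordinates at all.

```python
import os
import tempfile

import pandas as pd


def find_new_listings(current, pickle_path, id_column="AnnonsId"):
    """Return the rows of `current` whose id was absent in the previous run,
    then overwrite the pickle with `current` for the next run."""
    if os.path.exists(pickle_path):
        previous = pd.read_pickle(pickle_path)
        new = current[~current[id_column].isin(previous[id_column])]
    else:
        # First run: nothing to compare against, so everything counts as new
        new = current
    current.to_pickle(pickle_path)
    return new


# Simulate two consecutive runs in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "listings.pkl")
    first = pd.DataFrame({"AnnonsId": [144961, 144965], "Hyra": [5949, 5105]})
    second = pd.DataFrame({"AnnonsId": [144965, 145023], "Hyra": [5105, 9163]})
    new_first_run = find_new_listings(first, path)
    new_second_run = find_new_listings(second, path)

print(new_second_run["AnnonsId"].tolist())  # [145023]
```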

Now I had the option to host the full program on my local desktop computer (which is not always on) or to use some sort of cloud server. I opted for the latter and spun up a Compute Engine instance on Google Cloud. My idea was simply to email myself an HTML table version of the data every day. It turns out that cloud providers are very much subject to abuse, where people use the services to send mass emails with malware or ads, and Google has hence blocked the standard SMTP ports. It’s all well documented by Google, and they offer workarounds using third-party email providers that you can call via standard HTTP requests. I decided to go with MailJet, as the sign-up process was simple (even for the free tier) and they offer a very simple Python module for authenticating and sending emails.
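
For reference, sending an HTML email through Mailjet's Python package (`mailjet_rest`, installed with `pip install mailjet-rest`) can be sketched roughly as below. This is not necessarily what my program does: the helper names, the addresses, and the use of version 3.1 of the Send API are assumptions, and the actual send call is left commented out because it needs real credentials and network access.

```python
def build_message(sender, recipient, subject, html):
    """Build a request body in the shape expected by Mailjet's Send API v3.1."""
    return {
        "Messages": [
            {
                "From": {"Email": sender},
                "To": [{"Email": recipient}],
                "Subject": subject,
                "HTMLPart": html,
            }
        ]
    }


def send_message(payload, api_key, api_secret):
    """Send the payload over HTTPS using the mailjet_rest client."""
    # Imported lazily so that build_message works without the package installed
    from mailjet_rest import Client

    client = Client(auth=(api_key, api_secret), version="v3.1")
    return client.send.create(data=payload)


# Placeholder addresses and body, for illustration only
payload = build_message(
    "me@example.com", "me@example.com", "New apartment listings", "<table>...</table>"
)
# result = send_message(payload, "MY_API_KEY", "MY_API_SECRET")
# print(result.status_code)
```

Because the request goes over HTTPS rather than SMTP, the blocked ports do not matter.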

I created a simple HTML template and convert the pandas DataFrame to HTML with pandas.DataFrame.to_html. The final result looks something like:

        id  district      municipality   sqm  rooms  type                rent    Q3  fromDate    toDate
54  144961  Johanneshov   Stockholm     37.0    1.0  Hyresratt         5949.0  13.0  2018-07-27  2018-07-31
55  144965  Kristineberg  Stockholm     29.0    1.0  Korttidskontrakt  5105.0  12.0  2018-07-28  2018-07-31
68  144956  Norrmalm      Stockholm     37.0    1.0  Hyresratt         7795.0  18.0  2018-07-28  2018-07-31
79  144952  Stadshagen    Stockholm     40.0    1.5  Korttidskontrakt  5920.0   NaN  2018-07-28  2018-07-31
81  145023  Sodermalm     Stockholm     89.0    2.0  Hyresratt         9163.0  23.0  2018-07-31  2018-08-01
83  145002  Sodermalm     Stockholm     40.0    1.0  Hyresratt         6775.0  18.0  2018-07-31  2018-08-01
86  144966  Sodermalm     Stockholm     73.0    2.0  Korttidskontrakt  6871.0  21.0  2018-07-28  2018-07-31
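
To get from the raw JSON to a compact table like the one above, the nested fields have to be flattened and the columns renamed. A sketch of one way to do that with `pandas.json_normalize` (pandas 1.0 or later) follows. The mapping from JSON fields to column names is a guess based on the sample data, not taken from the actual program, and `summary_table` is a hypothetical name.

```python
import pandas as pd

# Assumed mapping from JSON field names to the short column names in the table
COLUMNS = {
    "AnnonsId": "id",
    "Stadsdel": "district",
    "Kommun": "municipality",
    "Yta": "sqm",
    "AntalRum": "rooms",
    "Lagenhetstyp": "type",
    "Hyra": "rent",
    "LiknadeLagenhetStatistik.KotidFordelningQ3": "Q3",
    "AnnonseradFran": "fromDate",
    "AnnonseradTill": "toDate",
}


def summary_table(listings):
    """Flatten the listings and keep only the renamed summary columns."""
    # Nested dicts become dotted column names, e.g. "LiknadeLagenhetStatistik.KotidFordelningQ3"
    flat = pd.json_normalize(listings)
    # reindex keeps the column order and fills fields a listing lacks with NaN
    flat = flat.reindex(columns=list(COLUMNS))
    return flat.rename(columns=COLUMNS)


# One listing, trimmed from the JSON sample earlier in the post
sample = [
    {
        "AnnonsId": 144964,
        "Stadsdel": "Flemingsberg",
        "Kommun": "Huddinge",
        "Yta": 28,
        "AntalRum": 1,
        "Lagenhetstyp": "Studentlagenhet",
        "Hyra": 4055,
        "AnnonseradFran": "2018-07-28",
        "AnnonseradTill": "2018-07-31",
        "LiknadeLagenhetStatistik": {"KotidFordelningQ1": 3, "KotidFordelningQ3": 6},
    }
]
table = summary_table(sample)
print(table.to_string(index=False))
```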

It probably doesn’t look its best on this page, as it uses dynamic sizing within a width-limited blog post, but I will probably end up repurposing the function to present further data on this blog in the future. I had to strip out some junk produced by to_html(), but the basics go something like:

import pandas

def html_table(df):
    # Load table template
    with open('table_template.html', 'r') as f:
        template = f.read().replace('\n', '')

    # Replace formatting options that come with to_html()
    html_str = df.to_html()\
        .replace('\n', '')\
        .replace('<table border=\"1\" class=\"dataframe\">', '')\
        .replace('</table>', '')
    return template % html_str
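
One detail worth noting about `template % html_str`: the template needs exactly one `%s` placeholder, and any literal percent sign in it (for example a CSS width) has to be written as `%%`. The sketch below shows this with an inline template string instead of a file. The template contents and the `render` name are made up for illustration.

```python
import pandas as pd

# Hypothetical template: a single %s placeholder, literal percent signs doubled
TEMPLATE = '<table style="width: 100%%">%s</table>'


def render(df, template=TEMPLATE):
    """Insert the body of df.to_html() into the template."""
    inner = (
        df.to_html()
        .replace("\n", "")
        # Drop the default wrapper so the template's own <table> tag is used
        .replace('<table border="1" class="dataframe">', "")
        .replace("</table>", "")
    )
    return template % inner


df = pd.DataFrame({"district": ["Norrmalm"], "rent": [7795]})
html = render(df)
print(html[:30])
```

A template with an unescaped `%` would typically make the `%` operator raise an error, so doubling them up front is the easy fix.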

I’ll happily answer any questions or help out if you want to implement something similar via my email address below.