Twitter cards

We added Twitter meta tags to our pages so you can get Twitter cards describing traffic events in your Twitter feed. They are much more readable than what our road watchers were tweeting before.

See ?

Don't forget to follow @BeRoads_NL, @BeRoads_FR, @BeRoads_DE or @BeRoads_EN for real-time traffic event updates in the language of your choice.

And don't forget that sharing is caring ;-)

Q

Detecting dead webcams with Python

We have received a lot of feedback from our users since the launch of BeRoads in October. A big thanks to everybody! There was a common pattern in all these mails, posts and tweets: users were complaining about unavailable webcam feeds.

We explained to users that the problem was not on our side but on the providers' side. After weeks of watching our dashboard, we gradually got a taste of what users might experience: dead webcams everywhere.

Okay, so lots of webcam feeds are broken. What should we do about it? What about detecting broken feeds and tagging them as unavailable?

Our webcam downloader is a Python job that runs in 3 steps:

  • scrape webcam links from providers
  • download webcams feeds
  • update MySQL database

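We spotted 3 different kinds of broken feeds: feeds that are no longer updated, feeds that return a 404 image, and feeds that are completely blacked out. Each one gets its own check.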

Checking last update with requests

The HTTP protocol provides a way to tell when a resource was last updated: the 'last-modified' header. It's pretty straightforward to access it with requests:

import requests
response = requests.get('http://foo.bar/img.jpg')
# Header names are case-insensitive in requests
if 'last-modified' in response.headers:
    print(response.headers['last-modified'])

The last-modified value is a date expressed in GMT (e.g. 'Tue, 15 Nov 1994 12:45:26 GMT').

The simplest way to check if a resource has been modified in the last < insert time span here > is to use timestamps. To convert a last-modified value to a timestamp, you will have to use the calendar and datetime modules:

import calendar
import datetime
# Parse the header value, then convert the UTC time tuple to a Unix timestamp
timestamp = calendar.timegm(datetime.datetime.strptime(
    response.headers['last-modified'],
    '%a, %d %b %Y %H:%M:%S %Z'
).utctimetuple())

We use this technique to check if the webcam feed has been updated since the last time we downloaded it.
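As a rough sketch of that check, the conversion and the comparison can be wrapped in two small helpers. The names `last_modified_timestamp` and `updated_since` are made up for this example, and we assume the provider sends the date in the standard GMT format and that we stored the timestamp of our last download somewhere.

```python
import calendar
import datetime

# Format of an HTTP date such as 'Tue, 15 Nov 1994 12:45:26 GMT'
HTTP_DATE_FORMAT = '%a, %d %b %Y %H:%M:%S %Z'


def last_modified_timestamp(header_value):
    """Convert a last-modified header value to a Unix timestamp (hypothetical helper)."""
    parsed = datetime.datetime.strptime(header_value, HTTP_DATE_FORMAT)
    return calendar.timegm(parsed.utctimetuple())


def updated_since(header_value, last_download_timestamp):
    """Return True if the resource changed after our last download (hypothetical helper)."""
    return last_modified_timestamp(header_value) > last_download_timestamp
```

With something like this, a feed whose `updated_since(...)` result is False could simply be skipped instead of downloaded again.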

Comparing feeds with 404 images

There are two kinds of 404 images that our providers return when a feed is not available. This one and this one. We compare our images against these 404 images with a similarity function and write the image to disk only if the similarity is lower than a predefined ratio.

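import requests

def should_write(url, ratio):    # function name is illustrative
    response = requests.get(url)
    # Read the reference 404 image as raw bytes
    with open('404.jpg', 'rb') as f1:
        c1 = f1.read()
    # Share of bytes identical, position by position, to the 404 image
    similarity = float(sum(a == b for a, b in zip(c1, response.content))) / len(c1)
    if similarity > ratio:
        return False    # we don't write on disk
    else:
        return True     # we write on disk

# e.g. should_write('http://foo.bar/img.jpg', ratio)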

The ratio value depends on the 404 image that we compare against.
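To illustrate the idea of one ratio per 404 image, here is a minimal sketch that works on raw bytes only, so it can be tried without any download. The helper names `byte_similarity` and `looks_like_404` are hypothetical, and so are the ratio values you would pass in; they would have to be tuned for each reference image.

```python
def byte_similarity(reference, candidate):
    """Share of bytes in `reference` matched at the same position in `candidate`."""
    if not reference:
        return 0.0  # nothing to compare against
    matches = sum(a == b for a, b in zip(reference, candidate))
    return float(matches) / len(reference)


def looks_like_404(candidate, references):
    """`references` is a list of (reference_bytes, ratio) pairs, one per known 404 image."""
    return any(
        byte_similarity(reference, candidate) > ratio
        for reference, ratio in references
    )
```

In a job like ours, `candidate` would be `response.content` and each reference would be the bytes of one 404 image read from disk.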

Checking if an image is completely blacked out

To verify that our webcam feed is not like this one, we use the Python Imaging Library (PIL).

We simply load the RGB matrix, walk through every pixel and count the black ones. We then check the share of black pixels against a predefined ratio.

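from PIL import Image

def is_available(path, ratio):    # function name is illustrative
    im = Image.open(path)         # e.g. 'webcam.jpg'
    R, G, B = im.convert('RGB').split()
    r = R.load()
    g = G.load()
    b = B.load()
    w, h = im.size
    pixels = 0
    for i in range(w):
        for j in range(h):
            # A pixel counts as black when all three channels are very dark
            if r[i, j] < 10 and g[i, j] < 10 and b[i, j] < 10:
                pixels += 1
    # Share of black pixels in the image (assumed meaning of the ratio check)
    black = float(pixels) / (w * h)
    if black > ratio:
        return False    # we tag it as unavailable
    else:
        return True     # we tag it as available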
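If you want to play with the counting logic without an image at hand, here is a small sketch that works on a plain list of (r, g, b) tuples. The names `black_ratio` and `is_blacked_out` and the default threshold of 10 are assumptions for this example, mirroring the check above rather than reproducing our production code.

```python
def black_ratio(pixels, threshold=10):
    """Share of pixels whose three channels are all below `threshold`."""
    pixels = list(pixels)
    if not pixels:
        return 0.0  # an empty image has no black pixels
    black = sum(
        1 for r, g, b in pixels
        if r < threshold and g < threshold and b < threshold
    )
    return float(black) / len(pixels)


def is_blacked_out(pixels, ratio, threshold=10):
    """Return True when the share of black pixels exceeds `ratio`."""
    return black_ratio(pixels, threshold) > ratio
```

For instance, an image made of nine black pixels and one white pixel would be flagged with a ratio of 0.8 but not with a ratio of 0.95.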

And that's it! The API now provides an 'enabled' attribute for webcams, and we even created a use case for it: the Dead webcams everywhere project :)

Quentin