more data musings

(the advantage of traveling by public transport once in a while is that you can sit and faff on a laptop)

The only guarantee about user-entered data is that, given enough entries, it'll be inconsistent :-(

Take, for example, an OpenStreetMap XAPI query to pull out '/api/0.6/*[amenity=post_box]'

That gives a nice dataset of ~85k entries, which I'll use for some simple analysis.

So, the UK has ~40k postboxes, and according to draco the entries in that count break down by source as follows:
13.5k - OSM, 26.7k - website.

So, of those 13,504 UK postboxes in OSM, how many are run by Royal Mail? (Hint: most of them!)
Does the data match?

$ grep "operator" ~/Downloads/data.osm | sort | uniq -c | grep -i royal
1 <tag k='operator' v='Post Office: Royal Mail'/>
1 <tag k='operator' v='royal mail'/>
1 <tag k='operator' v='Royal mail'/>
5065 <tag k='operator' v='Royal Mail'/>
1 <tag k='operator' v='RoyalMail'/>
1 <tag k='operator' v='Royal MAil'/>
1 <tag k='operator' v='Royal Mail Warwick'/>
2 <tag k='operator' v='Royal York'/>

Not bad - only a few CaSe sEnsiTive issues to sort out.
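One way to fold those case variants together before counting, as a rough sketch only. The function name `operator_counts` is made up for illustration, and it assumes the file keeps one tag per line in the single-quoted `<tag k='operator' v='...'/>` form shown in the output above.

```shell
# operator_counts FILE
# Hypothetical helper: prints "count value" for each operator value,
# lower-cased so 'Royal Mail', 'royal mail' and 'Royal MAil' collapse
# into one row. Assumes tags look like <tag k='operator' v='...'/>.
# Note: tr may only fold ASCII letters, so accented characters can
# still show up as separate variants.
operator_counts() {
    sed -n "s/.*<tag k='operator' v='\([^']*\)'.*/\1/p" "$1" |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c | sort -rn
}
```

Something like `operator_counts ~/Downloads/data.osm | grep royal` would then list the Royal Mail variants with the case differences merged. Entries such as 'RoyalMail' or 'Royal Mail Warwick' would still need fixing by hand.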

What about other operators, say La Poste?

$ grep "operator" ~/Downloads/data.osm | sort | uniq -c | grep -i poste
1 <tag k='operator' v='Bureau de poste'/>
1 <tag k='operator' v='De Post - La Poste'/>
7 <tag k='operator' v='la poste'/>
21 <tag k='operator' v='la Poste'/>
12 <tag k='operator' v='La poste'/>
917 <tag k='operator' v='La Poste'/>
1 <tag k='operator' v='La Poste Belgique'/>
6 <tag k='operator' v='La Poste - De Post'/>
1 <tag k='operator' v='La Poste Suisse'/>
1 <tag k='operator' v='Le Poste'/>
1 <tag k='operator' v='poste'/>
5 <tag k='operator' v='Poste'/>

Again - it's the 'long tail' problem. So, out of the ~85k entries, how many unique operators?
404 (how apt for a web service).

And of those, how many are singles? 222 - OVER HALF!
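The unique and single counts can be pulled out in one pass. This is a sketch under assumptions rather than the exact commands used for the numbers above: `operator_summary` is a hypothetical name, values are compared exactly as written (so 'La Poste' and 'la poste' count as two operators), and the tag layout is assumed to match the grep output shown earlier.

```shell
# operator_summary FILE
# Hypothetical helper: prints two numbers on one line,
#   <distinct operator values> <values that appear exactly once>
# Comparison is case-sensitive, matching a plain sort | uniq -c.
operator_summary() {
    sed -n "s/.*<tag k='operator' v='\([^']*\)'.*/\1/p" "$1" |
        sort | uniq -c |
        awk '{ total++; if ($1 == 1) singles++ }
             END { printf "%d %d\n", total, singles }'
}
```

Run as `operator_summary ~/Downloads/data.osm`. With the figures quoted above, 222 singles out of 404 operators works out to roughly 55%.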


qu1j0t3 said…
What, nobody put in "Consignia" as a joke? :)
