Between classic business transactions, social interactions, and machine-generated observations, the digital data tap has been turned on and it will never be turned off. The flow of data is everlasting. Which is why you see a lot of things in the loop around real-time frameworks and streaming frameworks. – Mike Hoskins, CTO, Actian
From Mike Hoskins to Mike Richards (yes we can do that kind of leap in logic, it’s the weekend)…
Oh, Joel Miller, you just found the marble in the oatmeal! You’re a lucky, lucky, lucky little boy – because you know why? You get to drink from… the firehose! Okay, ready? Open wide! – Stanley Spadowski, UHF
Firehose of Terror
I think you get the picture – a potentially frightening picture for those unprepared to handle the torrent of data coming down the pipe. Unfortunately, the unprepared will not merely be overwhelmed by the disaster. Quite the contrary – I believe they will be consumed by irrelevancy.
It’s not every day that you receive snail mail with life-changing information in it, but when it does come, it can come from the unlikeliest sources.
A year ago, when doing a simple change of health insurance vendors, I had to give the requisite blood sample. I knew the drill… nurse comes to the house, takes blood, a month later I get new insurance documents in the mail.
But this time the package included something new: the results of my tests.
The report was a list of 13 metrics and their values, including a brief description of what each one meant and what my scores should be. One in particular was out of the norm: my ALT score, an indicator of liver damage, was about 50% higher than the expected range.
Simple Data Can Be Valuable
Here is the key point: I then followed up with my family doctor, with data in hand. I did not have to wait to see symptoms of a systemic issue and get him to figure it out. We had a number, right there, in black and white. Something was wrong.
Naturally, I had a follow-up test to see if it was just a blip. However, my second test showed even worse results, twice as high in fact! This led to an ultrasound and more follow-up tests.
In the end, the diagnosis was (non-alcoholic) fatty liver disease. The condition is most commonly seen in alcoholics, so it came as a surprise, as I don't drink. It was due solely to my diet and the weight I had put on over several years.
It was a breaking point for my system and the data was a big red flag calling me to change before it was too late.
Not impressed with my weight or my other scores, I made simple but dramatic changes to improve my health.* The changes were so dramatic that my healthcare provider was very curious about my methods.
By making changes to my diet alone, I was able to get my numbers back to healthy levels in just a few months. In the process I lost 46 pounds in 8 months and recovered from various other symptoms. The pending train wreck was averted.
Long Term Value in Sharing Healthcare Data
It’s been one year this week, so I’m celebrating, and I have Manulife (or whoever runs their lab tests) to thank for taking the initiative to send me my results.
It doesn’t take long to see the business value in doing so, does it? I took action on the information and now I’m healthier than I have been in almost 20 years. I have fewer health issues, will use their systems less, will cost them less money, etc.
Ideally it benefits the group plan I’m in too, since I’m now a lower-cost user of the system. I hope both insurers and employers take this to heart and follow suit, giving their people the data they need to make life-changing, cost-reducing decisions like this.
One final thought: how many people are taking these tests right now? Just imagine what you could do with a bit of data analysis of their results. With these kinds of test results, companies could be making health predictions for their customers and health professionals to review. That’s why I’m jumping onto “biohacking” sites like WellnessFX.com to track all my scores these days and to get expert advice on next steps, or access to additional services.
I’m happy with any data sharing, but why give me just the raw data when I still have to interpret it myself? I took the initiative to act on the results, but what if I had needed more incentive? If I had been told “Lower your ALT or your premiums will be 5% higher,” I would have appreciated that.
What’s your price? If your doctor or insurer said “do this and save $100” – would you do it? What if they laid the data out before you and showed you where your quality of life was headed, would it make a difference to you?
I’m glad I had this opportunity to improve my health, but at this point I just say thanks for the data … and pass the salad please!
Tiles, in a Tile Map Server (TMS) context, are raster map data broken into small pre-rendered tiles for maximum web client loading efficiency. GDAL's gdal2tiles.py Python script can chop up your input raster into the folder/file naming and numbering structure that TMS-compliant clients expect.
The bonus with this utility is that it also creates a basic web mapping application that you can start using right away.
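To give a rough picture of what that structure looks like (the folder name here is illustrative; the zoom/column/row layout is the important part), a gdal2tiles output folder typically contains something like:

output_folder/
    openlayers.html        (the basic web mapping application)
    tilemapresource.xml    (TMS metadata describing the tile set)
    0/0/0.png              (tiles organized as zoom/column/row)
    1/0/0.png
    1/1/0.png
    ...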
The script is designed for georeferenced rasters; however, any raster should also work with the right options. The (georeferenced) Natural Earth raster dataset is used in the first examples, with a non-georeferenced raster at the end.
There are many options to tweak the output and setup of the map services; see the complete gdal2tiles chapter for more information.
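As a minimal sketch (the input file name, output folder and zoom range below are assumptions, not the book's exact command), tiling the Natural Earth raster could look like this:

gdal2tiles.py -z 0-5 NE1_50M_SR_W.tif ne1_tiles

# For a non-georeferenced image, the raster profile option is the one to reach for, e.g.:
# gdal2tiles.py -p raster -z 0-3 scanned_map.png scanned_tiles

The output folder then holds the tile pyramid along with the openlayers.html viewer used below.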
Open the openlayers.html file in a web browser to see the results.
The default map loads a Google Maps layer and will complain that you do not have an appropriate API key set up in the file; ignore this and switch to the OpenStreetMap layer in the right-hand layer listing.
The resulting map should show your nicely coloured world map image from the Natural Earth dataset. The TMS Overlay option appears in the layer listing, so you can toggle it on and off to confirm that it truly is loading. Figure 5.2 (above) shows the result of our gdal2tiles command.
Geospatial Power Tools is 350+ pages long – 100 of those pages cover these kinds of workflow topic examples. Each copy includes a complete (edited!) set of the GDAL/OGR command line documentation as well as the following topics/examples:
Use the SQL-style -where clause option to return only the features that match an expression. In this case, return only the populated places features where NAME = 'Shanghai':
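A sketch of the command, assuming the Natural Earth populated places shapefile is the datasource (the file and layer names are assumptions):

ogrinfo -where "NAME = 'Shanghai'" ne_10m_populated_places.shp ne_10m_populated_places

The datasource comes first, followed by the name of the layer to report on.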
Building on the above, you can also query across all available layers by using the -al option and removing the specific layer name. Keep the same -where syntax and it will be applied to each layer in turn. Where a layer does not have the specified attribute, it will tell you, but will continue to process the other layers:
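For example, assuming a folder of Natural Earth shapefiles as the datasource (the folder name is an assumption), the command drops the layer name and adds -al; layers without a NAME field then report an error like the one below:

ogrinfo -al -where "NAME = 'Shanghai'" natural_earth/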
ERROR 1: 'NAME' not recognised as an available field.
NOTE: More recent versions of ogrinfo appear not to support this and will likely give FAILURE messages instead.