Data Visualization

A Picture is Worth a Thousand Data Points

Summer means vacations, and vacation means pictures—lots of them! Nowadays they are almost exclusively digital, and those kinds of pictures contain Exchangeable Image File Format (EXIF) data like camera type, camera orientation, exposure time, geographical location, and much more. If you could extract this data, you could aggregate and analyze it. If you work for MicroStrategy, analyzing and visualizing everything is not just a possibility, it's an obligation.

I store all my pictures on my home PC in two folders: SLR Camera and Mobile Pictures (with many subfolders). I wrote a Python script that crawled through this folder structure, analyzed every picture it found, and saved that data in a CSV file. Then I used MicroStrategy Desktop to analyze all 43,000 digital pictures I had taken since 2002, when I bought my first digital camera, the Canon PowerShot A40. I had already forgotten the name and look of it, but, as stated above, EXIF data stores this information.
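The crawl-and-export step could look roughly like the sketch below. It is not the actual script: the column names in `FIELDS`, the JPEG-only filter, and the `read_exif` callback are assumptions made for illustration. Reading the EXIF tags themselves is left to that callback, since it normally requires a third-party imaging library.

```python
import csv
import os

# Assumption: only JPEG files are of interest; the real script may handle more formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg"}
# Hypothetical column names for the output file.
FIELDS = ["path", "model", "taken_at", "width", "height"]


def find_pictures(roots):
    """Yield the path of every JPEG below the given root folders, subfolders included."""
    for root in roots:
        for folder, _subfolders, files in os.walk(root):
            for name in sorted(files):
                if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                    yield os.path.join(folder, name)


def write_catalog(roots, csv_path, read_exif):
    """Write one CSV row per picture and return the number of rows written.

    read_exif is a caller-supplied function: path -> dict of EXIF values.
    """
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        # Missing fields are written as empty cells, unknown ones are ignored.
        writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="", extrasaction="ignore")
        writer.writeheader()
        for path in find_pictures(roots):
            row = {"path": path}
            row.update(read_exif(path))
            writer.writerow(row)
            count += 1
    return count
```

A call such as `write_catalog(["SLR Camera", "Mobile Pictures"], "pictures.csv", my_exif_reader)` would then produce a file that MicroStrategy Desktop can import, where `my_exif_reader` stands for whatever EXIF-reading function you plug in.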

 

Devices

Since 2002 I’ve owned several digital cameras, and as you can see from the chart below I’ve become a member of the Canon clan. My first real digital SLR camera was the EOS 400D, which I bought in 2008. Then I upgraded to the EOS 650D (2012), and most recently to the EOS 80D (2017).

At first, I was surprised to see this extremely high bar for photos from 2008. I realized that was the year I took my brand-new EOS 400D on an unforgettable, three-week road trip through rural Ukraine, Romania, and Moldova. I encourage you to travel there and find out why I took so many shots.

Like everyone else nowadays, I take a lot of pictures with my smartphone. My upgrade path can be observed on the bar chart below: Blackberry 8900 > iPhone 4 > iPhone 5 > iPhone SE > iPhone X.

The trend is clear: the share of mobile photos is on the rise, and in 2018 they already account for 74% of all the photos I have taken this year.
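The yearly share is a simple ratio of mobile photos to all photos. A minimal sketch, assuming each picture has been reduced to a `(year, source)` pair where `source` is either `"mobile"` or `"slr"` (both labels are made up for this example):

```python
from collections import Counter


def mobile_share_by_year(rows):
    """Return {year: percentage of that year's photos taken with a mobile device}."""
    totals = Counter()
    mobile = Counter()
    for year, source in rows:
        totals[year] += 1
        if source == "mobile":
            mobile[year] += 1
    return {year: 100.0 * mobile[year] / totals[year] for year in sorted(totals)}
```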

Mobile devices are easy to use and always at hand. If your kid does something unusual, you don’t tell her to hold still until you find a “real” camera and mount a proper lens. You just pull out your phone and snap a photo or two, or 23.

However, there is one more factor that makes mobile photography so popular: improving picture quality. The average resolution of my mobile pictures has risen steadily (due to growing sensor sizes), from 4.5 megapixels in 2011 to 11.5 megapixels in 2018. Pictures from my iPhone X look very good, and the second lens for portraits is great. Pictures taken at night are still behind those from SLR cameras, but I believe that will improve in the future.
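Resolution in megapixels is just width times height divided by one million, so the yearly average can be derived from the pixel dimensions stored with each picture. A small sketch, assuming rows of `(year, width, height)` (a layout chosen for this example):

```python
from collections import defaultdict


def megapixels(width, height):
    """Convert pixel dimensions to megapixels."""
    return width * height / 1_000_000


def average_megapixels_by_year(rows):
    """Return {year: mean megapixels} for rows of (year, width, height)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, width, height in rows:
        sums[year] += megapixels(width, height)
        counts[year] += 1
    return {year: sums[year] / counts[year] for year in sorted(sums)}
```

For instance, a 4032 x 3024 picture comes out at roughly 12.2 megapixels.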

Types of photos

One of the most surprising outcomes of this exercise was the set of statistics on image orientation. I’ve read some books about photography and this general advice repeats often: horizontal (landscape) photo orientation is better than vertical, because it’s the way humans see the surrounding world.

Because of the way we usually hold a phone, the emergence of mobile photography has led many people to take vertical photos. I was afraid I’d become a victim of this too. To my surprise, it’s the opposite! Only 18% of my mobile pictures are vertical, while the figure is as high as 30% for my SLR photos. I knew I took a lot of vertical pictures with my SLR camera (I love taking portraits), but I did not expect such a difference.
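One way to classify orientation is sketched below. It relies on the standard EXIF Orientation tag, where values 5 to 8 mean the stored image is rotated by 90 degrees, so the displayed width and height are swapped. Whether the original script works this way is not known; the function names are illustrative.

```python
# EXIF Orientation values that imply a 90-degree rotation of the stored pixels.
ROTATED_ORIENTATIONS = {5, 6, 7, 8}


def displayed_size(width, height, exif_orientation=1):
    """Return (width, height) as the viewer sees the picture."""
    if exif_orientation in ROTATED_ORIENTATIONS:
        return height, width
    return width, height


def classify_orientation(width, height, exif_orientation=1):
    """Label a picture as 'vertical', 'horizontal' or 'square'."""
    shown_width, shown_height = displayed_size(width, height, exif_orientation)
    if shown_height > shown_width:
        return "vertical"
    if shown_width > shown_height:
        return "horizontal"
    return "square"
```

Ignoring the Orientation tag would undercount vertical photos, because many cameras and phones always store the pixels in landscape layout and only record the rotation in EXIF.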

I was a bit disappointed to find out that I am a “weekend photographer.”

But one thing is obvious: I don’t take too many selfies (5.4%) and I don’t overuse flash (1.75% of mobile photos).
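The "weekend photographer" finding comes down to the day of the week on which each picture was taken. EXIF stores timestamps as text in the form `YYYY:MM:DD HH:MM:SS`, so a sketch of the calculation could look like this (the function names are illustrative):

```python
from datetime import datetime

EXIF_TIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def is_weekend(exif_timestamp):
    """True if the EXIF timestamp falls on a Saturday or Sunday."""
    taken = datetime.strptime(exif_timestamp, EXIF_TIME_FORMAT)
    return taken.weekday() >= 5  # Monday is 0, Saturday is 5, Sunday is 6


def weekend_share(exif_timestamps):
    """Return the percentage of pictures taken on a weekend."""
    stamps = list(exif_timestamps)
    if not stamps:
        return 0.0
    weekend = sum(1 for stamp in stamps if is_weekend(stamp))
    return 100.0 * weekend / len(stamps)
```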

Geographical data

Over 7,000 of my mobile photos contain altitude EXIF data. I live near Warsaw in Poland, and the majority of my mobile pictures were taken at altitudes between 80 and 150 meters. It is no surprise that the altitude histogram is concentrated around the 100 m mark. What is strange is that the X axis extends to 11,000 meters.

What could cause that? After excluding the 0-1000 meters range, I realized that a few times I forgot to enable airplane mode on my phone while taking pictures through the window of the plane…
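The same two steps, binning altitudes and then excluding the 0-1000 meters range to expose the outliers, can be sketched in a few lines. The 50-meter bin width is an arbitrary choice for this example.

```python
from collections import Counter


def altitude_histogram(altitudes_m, bin_width_m=50):
    """Count pictures per altitude bin; keys are the lower edge of each bin."""
    counts = Counter()
    for altitude in altitudes_m:
        lower_edge = int(altitude // bin_width_m) * bin_width_m
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


def outside_range(altitudes_m, low_m=0, high_m=1000):
    """Keep only altitudes outside [low_m, high_m], e.g. pictures taken in flight."""
    return [altitude for altitude in altitudes_m if not (low_m <= altitude <= high_m)]
```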

It’s also visible on the map visualization. Below you can see the path of my plane approaching Copenhagen airport in April 2018. It must have been a plane, since I don’t recollect boarding a ship this year.
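To plot pictures on a map, the GPS position has to be converted first: EXIF stores latitude and longitude as degrees, minutes and seconds plus a hemisphere reference (N/S or E/W), while map tools generally expect signed decimal degrees. A minimal sketch of that conversion (the function name is illustrative):

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style GPS coordinate to signed decimal degrees.

    ref is the hemisphere letter: 'N' or 'E' give positive values,
    'S' or 'W' give negative ones.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value
```

For example, 55° 37' 5" N becomes roughly 55.618.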

With map visualizations, I found point clustering a very useful feature. I can quickly select a cluster and see the list of pictures from this area (with thumbnails). It is also possible to click on it, and the original picture will automatically pop up in the system picture browser.
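How MicroStrategy clusters points internally is not described here. As a rough illustration of the idea only, the sketch below groups pictures by snapping their coordinates to a grid; the cell size of 0.01 degrees is an arbitrary assumption.

```python
import math
from collections import defaultdict


def grid_clusters(points, cell_deg=0.01):
    """Group (name, latitude, longitude) points by the grid cell they fall into."""
    clusters = defaultdict(list)
    for name, latitude, longitude in points:
        cell = (math.floor(latitude / cell_deg), math.floor(longitude / cell_deg))
        clusters[cell].append(name)
    return dict(clusters)
```

Each resulting list corresponds to the pictures you would see after selecting one cluster on the map.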

Analyzing pictures

My Python script analyzes and aggregates the color of each pixel (each channel has a range of 0-255). By setting the filter to [Brightness<10], I could quickly identify all underexposed photos.
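The per-picture aggregation might be sketched as follows. The pixels are taken as plain `(red, green, blue)` tuples of 8-bit values so the example stays free of imaging libraries, and brightness is assumed to be the plain mean of the three channel averages; the actual script may define it differently.

```python
def average_color(pixels):
    """Return the mean (red, green, blue) over an iterable of 0-255 RGB tuples."""
    totals = [0, 0, 0]
    count = 0
    for red, green, blue in pixels:
        totals[0] += red
        totals[1] += green
        totals[2] += blue
        count += 1
    if count == 0:
        raise ValueError("picture has no pixels")
    return tuple(total / count for total in totals)


def brightness(mean_rgb):
    # Assumption: brightness is the plain mean of the three channel averages.
    return sum(mean_rgb) / 3


def is_underexposed(pixels, threshold=10):
    """Mirror the [Brightness<10] filter for a single picture."""
    return brightness(average_color(pixels)) < threshold
```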

If you set the filter to [blue>180 & green<120 & red<120], so that all three conditions must hold, you will be able to find all blue pictures, usually of water or sky.
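Expressed in Python, the blue-picture test is a predicate over the average channel values of each picture. The sketch treats a picture as blue only when all three conditions hold together, and it assumes rows with `path`, `red`, `green` and `blue` keys (column names invented for this example).

```python
def is_blue(red, green, blue):
    """True when blue dominates and the other two channels stay low."""
    return blue > 180 and green < 120 and red < 120


def blue_pictures(rows):
    """Return the paths of all rows whose average color passes the blue test."""
    return [row["path"] for row in rows if is_blue(row["red"], row["green"], row["blue"])]
```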

A few thoughts on privacy

I am sure some of you will check the EXIF data of the attached pictures. That attempt would be fruitless, since I’ve removed it. I did that for privacy and security reasons: there have been real cases of burglars and other criminals using this information. On the other hand, law enforcement organizations can use EXIF data to catch ignorant criminals (link 1, link 2, link 3). You should also be aware of the existence of EXIF data when you post your photos online. Some services, like Facebook, remove it automatically, but some don’t. Here’s how you can remove it yourself.
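For JPEG files, EXIF data lives in an APP1 segment near the start of the file, so removing it amounts to copying the file without that segment. The sketch below shows the idea on raw bytes. It assumes a well-formed JPEG without padding bytes between segments, and it leaves other metadata containers (such as XMP or IPTC) untouched, so treat it as an illustration rather than a complete privacy tool.

```python
import struct


def strip_exif(jpeg):
    """Return a copy of the JPEG bytes with all EXIF (APP1 'Exif') segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("corrupt JPEG: expected a marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg[pos:]
            return bytes(out)
        # The two-byte length counts itself but not the two marker bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    raise ValueError("corrupt JPEG: no image data found")
```

Reading a file in binary mode, passing its contents through `strip_exif`, and writing the result to a new file leaves the image data itself unchanged.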

MicroStrategy Desktop is a great tool that you can download for free to analyze and visualize all kinds of data, for business or personal use. The Python script and MicroStrategy Dossier I’ve used can be downloaded from the MicroStrategy Community. You can reuse them to analyze your own photo library. If you do, please share your experience and insights with the community!
