Privacy risks from iOS photo metadata

There’s a ton of very personal information associated with a photo that you take with your smartphone. By default, the phone captures all of the camera settings (aperture, shutter speed, focal length), but it also captures your location and a timestamp. The timestamp and location from a photo, or a series of photos, can be used by a domestic violence perpetrator to infer the places a victim frequents and their patterns of travel. When this information is posted via a photo sharing service or social media account, it can be an unexpected, and even silent, privacy breach.
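
To make the exposure concrete, here’s a minimal sketch in Swift, using Apple’s ImageIO framework, of how little code it takes to read the location and timestamp back out of a shared photo. The file name is hypothetical; any photo shared at full quality will do.

    import Foundation
    import ImageIO

    // "vacation.jpg" is a hypothetical path to a photo shared at full quality.
    let url = URL(fileURLWithPath: "vacation.jpg") as CFURL

    if let source = CGImageSourceCreateWithURL(url, nil),
       let properties = CGImageSourceCopyPropertiesAtIndex(source, 0, nil) as? [CFString: Any] {

        // Location: the latitude and longitude written by the camera.
        if let gps = properties[kCGImagePropertyGPSDictionary] as? [CFString: Any] {
            print("Latitude:  \(gps[kCGImagePropertyGPSLatitude] ?? "unknown")")
            print("Longitude: \(gps[kCGImagePropertyGPSLongitude] ?? "unknown")")
        }

        // Timestamp: when the photo was taken.
        if let exif = properties[kCGImagePropertyExifDictionary] as? [CFString: Any] {
            print("Taken: \(exif[kCGImagePropertyExifDateTimeOriginal] ?? "unknown")")
        }
    }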

A Twitter conversation the other night prompted this post. A very senior graphics engineer was surprised to see how much of her personal information and travel history was exposed to a stalker ex-partner via photo sharing. The location history revealed by someone’s photo stream is at least as rich (and as dangerous) as the direct location history determined from GPS. That’s a serious privacy breach. If it caught a senior engineer by surprise, imagine how many non-technical smartphone customers are at risk!

All of this photo reference information is commonly referred to as metadata, but that’s an imprecise technical buzzword. A properly written messaging or photo sharing app will educate the customer about what’s being captured, shared, and posted: “It’s not just the photo; we’re also going to tell the world where you were and when you were there. And once we post it, that information will be available forever, and indexed by all of your favorite search engines.” Many apps won’t be quite that honest. And many customers won’t pay attention. Even if they do pay attention, they might not remember years later, when domestic violence becomes a possibility or a reality, that they ever granted permission.

Apple can help here by making an iOS app’s photo sharing permissions more granular.

At the moment, there are three levels of permission for access to the camera and the photo library, defined in Cocoa Keys: NSPhotoLibraryAddUsageDescription (write-only access to the photo library); NSCameraUsageDescription (direct capture of the camera image); and NSPhotoLibraryUsageDescription (full read-write access to the photo library’s images and metadata).
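
For context, an app declares whichever of these keys it needs in its Info.plist, and iOS displays the description string when it asks the customer for permission. A minimal example (the description strings here are my own):

    <key>NSCameraUsageDescription</key>
    <string>Used to take photos from within the app.</string>
    <key>NSPhotoLibraryAddUsageDescription</key>
    <string>Used to save the photos you take into your library.</string>
    <key>NSPhotoLibraryUsageDescription</key>
    <string>Used to read your photos, including their metadata.</string>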

An additional level of granularity, call it NSPhotoLibraryImagesUsageDescription, would help. This proposed new setting would allow an app to read the images in the photo library. It would not allow photo editing, metadata editing, or metadata viewing. If a customer grants NSPhotoLibraryImagesUsageDescription access to an app, that app cannot (deliberately or inadvertently) share the customer’s location history via photos. The privacy fence would be enforced by the operating system. And that’s exactly what we want an operating system to do.
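
Adopting it would be a one-line addition to an app’s Info.plist. To be clear, this key is my proposal; it does not exist in iOS today:

    <key>NSPhotoLibraryImagesUsageDescription</key>
    <string>Used to view your photos, but not their location or other metadata.</string>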

I’ve filed this as rdar://33421676 with Apple. Dupe freely!

I have no idea what the analogous answer for Android is. Drop me a note if you know, and I’ll update this post.

Options for Full Text Search in Core Data

Last weekend Chris Olds and I were discussing text search engines, and in particular how to take advantage of them to speed up searches of free-form text in Core Data. Here’s a summary of what we found. I haven’t tested or implemented any of these ideas; this is simply a survey of what’s out there.

I’m not including techniques that deal with fast searches of short text fields: normalizing your query strings and searchable text, using case-insensitive searches, etc. That’s all well documented by Apple and in the usual Core Data reference books.

I did run across one very cool article outlining a profiling method I hadn’t seen before. The Art & Logic blog goes one step beyond the typical use of com.apple.CoreData.SQLDebug: take advantage of the fact that you have SQLite installed on your Mac. You can paste the SQL query being logged by your iOS app into the sqlite3 command-line tool on your Mac, and use EXPLAIN QUERY PLAN there to understand the search plan.
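
In outline, the workflow looks like this (the table and column names are hypothetical, following Core Data’s usual Z-prefix naming):

    # In the Xcode scheme, under "Arguments Passed On Launch":
    #   -com.apple.CoreData.SQLDebug 1

    # Then, in Terminal on the Mac, against a copy of the store file,
    # paste the logged query after EXPLAIN QUERY PLAN:
    $ sqlite3 MyApp.sqlite
    sqlite> EXPLAIN QUERY PLAN
       ...> SELECT 0, t0.Z_PK FROM ZNOTE t0 WHERE t0.ZBODY LIKE '%lodge%';
    0|0|0|SCAN TABLE ZNOTE AS t0

That SCAN TABLE line is SQLite telling you it will visit every row.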

Full Text Search

Full text search (FTS) is about finding search terms within large bodies of text. This is different from matching someone’s last name to the lastName attribute in a Core Data entity. Imagine instead that your Core Data database contains notes, or newspaper articles, or patent descriptions, or travel resort reviews, and you want to search within the text of those articles. The brute force method is to scan all of the text of each article, searching for matches to the search term. That takes a very long time, and doesn’t always give you the results you want.
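
In Core Data terms, brute force is a single CONTAINS predicate, which SQLite can only satisfy by visiting every row. A sketch, with hypothetical entity and attribute names:

    import CoreData

    // Naive full-text match: this becomes SQL LIKE '%lodge%', which
    // cannot use an index and forces a full table scan.
    func articlesMatching(_ term: String,
                          in context: NSManagedObjectContext) throws -> [NSManagedObject] {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Article")
        request.predicate = NSPredicate(format: "body CONTAINS[cd] %@", term)
        return try context.fetch(request)
    }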

Ideally, your FTS within Core Data will respond as quickly as Google or Bing does when you enter a search term. The results will be ranked by relevance, and the search will handle word stemming correctly: if I enter a search for “lodge”, I probably want to see results containing “lodges” and “lodging”, too. Core Data handles none of these needs.

Roll Your Own

Michael Heyeck wrote an eight-part series of blog articles describing how to build your own FTS capability directly within Core Data, using only Core Data tools and constructs. It’s a very comprehensive series, and it’s a shame it isn’t more widely known. He doesn’t just teach you how to do FTS in Core Data; he also shows you how to read and understand the SQL queries that are generated on your behalf, and how to modify your NSPredicates and data model design to make the queries fast.

The series includes source code for a Notes application with FTS, under a BSD license.
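
The core idea, greatly simplified (this sketch is mine, not Heyeck’s code, and the entity names are hypothetical): maintain a separate Keyword entity populated with normalized words whenever a note is saved, then search the indexed keywords instead of scanning the note bodies.

    import CoreData

    // Assumes a model where Note has a string attribute "body" and a
    // to-many relationship "keywords" to a Keyword entity whose "word"
    // attribute is indexed.
    func indexNote(_ note: NSManagedObject, in context: NSManagedObjectContext) {
        let body = (note.value(forKey: "body") as? String) ?? ""
        let words = Set(body.lowercased()
            .components(separatedBy: CharacterSet.alphanumerics.inverted)
            .filter { $0.count > 2 })
        let keywords = words.map { word -> NSManagedObject in
            let keyword = NSEntityDescription.insertNewObject(forEntityName: "Keyword",
                                                              into: context)
            keyword.setValue(word, forKey: "word")
            return keyword
        }
        note.setValue(NSSet(array: keywords), forKey: "keywords")
    }

    // BEGINSWITH on an indexed attribute stays fast, unlike CONTAINS
    // against the full note body.
    func notes(matchingPrefix prefix: String,
               in context: NSManagedObjectContext) throws -> [NSManagedObject] {
        let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
        request.predicate = NSPredicate(format: "ANY keywords.word BEGINSWITH %@",
                                        prefix.lowercased())
        return try context.fetch(request)
    }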

Search Kit

When you type something into the Spotlight search bar on your Mac, you’re using FTS. Mac OS X has already built an FTS index of the files on your system, and Spotlight queries that index. Search Kit is the Core Services framework that Apple uses to deliver those search results, and it’s available to you, too. The catch? It’s Mac only, and not integrated into Core Data.

When we were chatting, I mentioned to Chris that Search Kit would make a terrific NSHipster topic. The next day, that’s what happened! The NSHipster article also summarizes the technical issues in Full Text Search nicely.

Indragie Karunaratne has a project on GitHub that uses Search Kit to back Core Data searches. I’ve only read over the source, and haven’t tried it, but it looks solid. His approach is to build a Search Kit index that returns the NSManagedObjectIDs of Core Data objects matching a particular full text search.
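
The mapping is straightforward to sketch (this is my reading of the general approach, not Indragie’s actual code): use each object’s objectID URI as the Search Kit document URL, then resolve matching document URLs back into NSManagedObjectIDs.

    import CoreData
    import CoreServices

    // Create an on-disk Search Kit index. (Use SKIndexOpenWithURL on
    // subsequent launches; creation fails if the file already exists.)
    let indexURL = URL(fileURLWithPath: "notes.skindex") as CFURL
    let index = SKIndexCreateWithURL(indexURL, nil, kSKIndexInverted, nil)
        .takeRetainedValue()

    // Index one object's text, keyed by its objectID URI.
    func add(_ object: NSManagedObject, body: String) {
        let uri = object.objectID.uriRepresentation() as CFURL
        let document = SKDocumentCreateWithURL(uri).takeRetainedValue()
        SKIndexAddDocumentWithText(index, document, body as CFString, true)
        SKIndexFlushDocuments(index)
    }

    // Search, then translate hits back into NSManagedObjectIDs.
    func objectIDs(matching query: String,
                   coordinator: NSPersistentStoreCoordinator) -> [NSManagedObjectID] {
        let search = SKSearchCreate(index, query as CFString,
                                    SKSearchOptions(kSKSearchOptionDefault))
            .takeRetainedValue()
        var documentIDs = [SKDocumentID](repeating: 0, count: 20)
        var scores = [Float](repeating: 0, count: 20)
        var found: CFIndex = 0
        SKSearchFindMatches(search, 20, &documentIDs, &scores, 1.0, &found)

        var urls = [Unmanaged<CFURL>?](repeating: nil, count: 20)
        SKIndexCopyDocumentURLsForDocumentIDs(index, found, &documentIDs, &urls)
        return (0..<found).compactMap { i -> NSManagedObjectID? in
            guard let url = urls[i]?.takeRetainedValue() else { return nil }
            return coordinator.managedObjectID(forURIRepresentation: url as URL)
        }
    }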

Commercial Library

Locayta makes their FTS mobile search engine available to iOS developers: free for non-commercial use, $1000 per commercial app. It’s not integrated with Core Data. An approach similar to the one Indragie Karunaratne took with Search Kit integration would probably work, though.

Hackery

The backing store most commonly used with Core Data, SQLite, includes FTS support. It’s just not exposed in any Core Data API (at least, not as of iOS 6.1).

Wolfert de Kraker describes a technique for using the SQLite FTS4 engine alongside Core Data. It involves creating a virtual table within the same SQLite database that Core Data uses. He then uses FMDB to implement a search method that runs the FTS4 query in response to UISearchDisplayController delegate calls. The raw SQLite search results are NSManagedObjectIDs, which Core Data then uses to retrieve the matching objects.
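
A condensed sketch of the technique, using SQLite’s C API directly rather than FMDB (the table and column names are hypothetical):

    import SQLite3

    // Open the store file. (The posts described here bundle their own
    // copy of SQLite; the idea is the same either way.)
    var db: OpaquePointer?
    sqlite3_open("MyApp.sqlite", &db)

    // A shadow FTS4 virtual table holding the searchable text plus the
    // objectID URI of the corresponding Core Data object.
    sqlite3_exec(db,
        "CREATE VIRTUAL TABLE IF NOT EXISTS note_fts USING fts4(objectid, body);",
        nil, nil, nil)

    // MATCH queries hit the full-text index instead of scanning rows;
    // 'lodg*' also picks up "lodges" and "lodging".
    var statement: OpaquePointer?
    sqlite3_prepare_v2(db,
        "SELECT objectid FROM note_fts WHERE body MATCH 'lodg*';",
        -1, &statement, nil)
    while sqlite3_step(statement) == SQLITE_ROW {
        // Each row is an objectID URI string, resolvable with
        // NSPersistentStoreCoordinator.managedObjectID(forURIRepresentation:).
        if let uri = sqlite3_column_text(statement, 0) {
            print(String(cString: uri))
        }
    }
    sqlite3_finalize(statement)
    sqlite3_close(db)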

This 2010 Stack Overflow answer describes a similar approach. A different answer a few months later offers a sideways variation: instead of storing NSManagedObjectIDs in the shadow SQLite table, store SQLite row IDs as Core Data attributes.

These solutions include a custom copy of SQLite in the project. Although they are iOS projects, I see no reason you couldn’t use the same approach on OS X.

I found two other blog posts describing other implementations of this approach, one from Regular Rate & Rhythm and one from Long Weekend Mobile, both from 2010.

I have to say that it makes me very nervous to think of mucking around in Core Data’s SQLite file. Call me superstitious.

Open Source FTS

We looked at two long-established open source FTS engines, Xapian and Lucene.

Lucene is a Java-based search engine, part of the Apache project. A port to GNUstep, LuceneKit, was begun in 2005 and seems to have languished for a while. The most current version I found is https://github.com/zbowling/LuceneKit, which was active as recently as 2012.

Xapian is a C++ search engine, and the one that Chris uses in his production code. It is presently licensed under the GPL, which would complicate including it in an iOS project. There was some mention on the Xapian forum of writing an Objective-C binding; the conclusion was that it should be straightforward, but that no one had done it yet.