Big thanks to Mieszko and Kelton for invaluable planning/design/production input & help.
This whole year, we’re lucky enough to have collaborator & all-round awesome guy Alan Mislove spending his sabbatical connected to my group. Alan is an associate professor in the College of Computer and Information Science at Northeastern University. His research concerns distributed systems and networks, with a focus on using social networks to enhance the security, privacy, and efficiency of newly emerging systems. He is a recipient of an NSF CAREER Award (2011), and his work has been covered by the Wall Street Journal, the New York Times, and the CBS Evening News.
In October, Alan will give a talk about recent work that has been widely covered in the media – and that I think will be interesting to many of you – the details are here:
- Date & Time: October 9th, 2015, 11am
- Venue: DTU, Building 321, first floor lab space
- Title: Measuring personalization of online services
- Abstract: Today, many web services personalize their content, including Netflix (movie recommendations), Amazon (product suggestions), and Yelp (business reviews). In many cases, personalization provides advantages for users: for example, when a user searches for an ambiguous query such as “router,” Amazon may be able to suggest the woodworking tool instead of the networking device. However, personalization is rarely transparent (or even labeled), and has the potential to be used to the user’s disadvantage. For example, on e-commerce sites, personalization could be used to manipulate the set of products shown (price steering) or to customize the prices of products (price discrimination). Unfortunately, today, we lack the tools and techniques necessary to be able to detect when personalization is occurring, as well as what inputs are used to perform personalization.
In this talk, I discuss my group’s recent work that aims to address this problem. First, we develop a methodology for accurately measuring when web services are personalizing their content. While conceptually simple, there are numerous details that our methodology must handle in order to accurately attribute differences in results to personalization (as opposed to other sources of noise). Second, we apply this methodology to two domains: Web search services (e.g., Google, Bing) and e-commerce sites (e.g., BestBuy.com, Expedia). We find evidence of personalization for real users on both Google search and nine of the popular e-commerce sites. Third, using fake accounts, we investigate the effect of user attributes and behaviors on personalization; we find that the choice of browser, logging in, and a user’s previously viewed content can significantly affect the results presented.
- Date: September 3rd, 2015
- Time: 11am
- Location: Technical University of Denmark, Building 321, 1st floor lab space
- Title: The Spontaneous Emergence of Social Conventions: An Experimental Study of Cultural Evolution
- Abstract: How do shared conventions emerge in complex decentralised social systems? This question engages fields as diverse as linguistics, sociology, and cognitive science. Previous empirical attempts to solve this puzzle all presuppose that formal or informal institutions, such as incentives for global agreement, coordinated leadership, or aggregated information about the population, are needed to facilitate a solution. Evolutionary theories of social conventions, by contrast, hypothesise that such institutions are not necessary in order for social conventions to form. However, empirical tests of this hypothesis have been hindered by the difficulties of evaluating the real-time creation of new collective behaviours in large decentralised populations. Here, I will present experimental results—replicated at several scales—that demonstrate the spontaneous creation of universally adopted social conventions and show how simple changes in a population’s network structure can direct the dynamics of norm formation, driving human populations with no ambition for large scale coordination to rapidly evolve shared social conventions. I will also show that a simple model describes well the experimental results on different classes of social networks.
This week my good friend & collaborator James Bagrow (assistant professor at University of Vermont) is visiting the group. He’s an excellent speaker, and we’re lucky enough that he’s agreed to give a talk as part of his visit. If you’re anywhere near Copenhagen, his talk is worth the trip out to DTU. Here are the details:
- Time: Friday June 19th, 2015. 10:00am
- Location: Technical University of Denmark, Building 321. First floor “Lab Space”. If you need directions, click here.
- Title: Data-driven approaches to studying human dynamics
- Abstract: Research on human dynamics and computational social science has been revolutionized by new data taken from online social networks. These modern datasets capture activity patterns across very large populations. Using these records, new results have been discovered and existing hypotheses have been tested. But what is the fundamental limit of social information stored in these data? These data also have sampling biases and other issues that make uncertainty quantification crucial. Along these lines, I will discuss current projects related to inferring hidden structure in partially observed networks and using large-scale Twitter data to estimate how information is stored and flows through social networks.
(And Vedran Sekara’s PhD defense is that same afternoon).
When I started working on understanding social systems, privacy really wasn’t on my mind. (I generally want to write down equations, understand the universe and all that). But one of the central realizations arising from our SensibleDTU experiment is that privacy needs to be an important part of this kind of research. I’ve written about this at length elsewhere. One of the things we noticed while digging into terabytes of social data is that data-channels are highly correlated. Information “bleeds through” … something which has serious implications for privacy. Case in point: My group has just released a new preprint (get it here) that shows how the WiFi information routinely collected by your smartphone can easily be converted to precise information about your location. WiFi routers reveal where you live, work, and spend your leisure time. While your phone may have told you that WiFi helps “improve location accuracy”, it may come as a surprise that
- A majority of apps in the store have access to the list of routers around you (scanned every 20 seconds).
- Your Android smartphone by default scans for WiFi routers even if you disable WiFi.
Our research shows
- How to easily convert WiFi information into geographical position.
- That although it sounds like all WiFi scans might be a lot of data to process, your mobility can be described using just a few access points. And we have built an Android app which only requires WiFi data to illustrate how this works for your own mobility: Download here.
- That if someone knows these routers at some point in time, they will still know a lot about your mobility six months later.
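To make the "just a few access points" point concrete, here is a minimal sketch. It is not the method from our preprint, just a simple greedy illustration under stated assumptions: scans are grouped into time bins, each bin holds the set of router identifiers seen in it, and we keep picking the router that explains the most still-unexplained bins. The function name `top_access_points`, the `coverage` parameter, and the toy router labels are all made up for this example.

```python
from collections import defaultdict


def top_access_points(scans, coverage=0.9):
    """Greedily pick routers until `coverage` of the time bins contain a pick.

    scans: iterable of (time_bin, set_of_router_ids) pairs (hypothetical format).
    Returns the chosen router ids in the order they were picked.
    """
    bins_by_ap = defaultdict(set)  # router id -> time bins in which it was seen
    all_bins = set()
    for time_bin, router_ids in scans:
        all_bins.add(time_bin)
        for router_id in router_ids:
            bins_by_ap[router_id].add(time_bin)

    uncovered = set(all_bins)
    allowed_uncovered = (1 - coverage) * len(all_bins)
    chosen = []
    while len(uncovered) > allowed_uncovered and bins_by_ap:
        # Pick the router seen in the most still-uncovered bins
        # (ties are broken by router id so the result is deterministic).
        best = max(bins_by_ap, key=lambda ap: (len(bins_by_ap[ap] & uncovered), ap))
        gain = bins_by_ap[best] & uncovered
        if not gain:
            break  # the remaining bins contain no router at all
        chosen.append(best)
        uncovered -= gain
    return chosen


# Toy data: six time bins with made-up router labels.
scans = [
    (0, {"home", "neighbor"}),
    (1, {"home"}),
    (2, {"home", "neighbor"}),
    (3, {"work", "cafe"}),
    (4, {"work"}),
    (5, {"cafe"}),
]
print(top_access_points(scans, coverage=0.8))
```

In this toy data, two routers are enough to account for five of the six time bins. If the position of each chosen router is known (an assumption, for instance from a public router database), the same short list becomes a coarse location trace.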
Thus, while WiFi networks are intended for enabling connectivity, they are also a de facto location tracking infrastructure. More generally, our world is becoming increasingly enclosed in a web of infrastructures supporting communication, mobility, payments, and advertising. Logs from mobile phone networks (call detail records, CDRs) constitute a global database of human mobility and communication networks. Credit card records form high-resolution traces of our spending behaviors.
Update June 3rd, 2015 (maybe-our-paper-played-a-role-in-this edition)
Yesterday, while scouring Google I/O for details on the updated permissions (and to see if anyone mentioned our work), we found that a Google engineer (Ben Poiesz) was asked about the issue of WiFi tracking during the session discussing the new permission model. The session took place on May 29th – the clip is here:
In the video, the friendly Google engineer notes that – under the new system – apps without the location permission will no longer be able to see the MAC addresses of WiFi and Bluetooth devices around … because that’s equivalent to location.
No one is claiming (least of all us) that our work caused the change, but we would like to point out a couple of things about the way Google chose to announce it, which might indicate that the choice of fixing WiFi is a recent decision on Google’s part:
- The published source code [find it here] (lines 99-114) and documentation [find it here] do not yet indicate that WiFi information is to be treated as location.
- When you install the current Android M beta on your phone, our “WiFi Watchdog” app still works … and WiFi is not treated as location. And a technical point: this is not just because of the “legacy mode”. According to the same presentation (https://youtu.be/f17qe9vZ8RM?t=13m), “WiFi Watchdog” should just receive empty data on Android M, but instead it continues to receive the same data as on Lollipop.
- The announcement of this arguably major change (80% of apps on the market would potentially be affected) was not a part of the main presentation … but an answer during the Q&A session.
Now, it is probably just a coincidence, and maybe a fix for the WiFi permissions has been in the works for months. But it’s quite striking that Google decided to fix WiFi permissions 7 years after the existing scheme was introduced (and just days after we published our paper).
I just love this quote which uses a Tiger Woods anecdote to illustrate what it means to be a professional. It’s from The War of Art by Steven Pressfield (a great read, btw).
With four holes to go on the final day of the 2001 Masters (which Tiger went on to win, completing the all-four-majors-at-one-time Slam), some chucklehead in the gallery snapped a camera shutter at the top of Tiger’s backswing. Incredibly, Tiger was able to pull up in mid-swing and back off the shot. But that wasn’t the amazing part. After looking daggers at the malefactor, Tiger recomposed himself, stepped back to the ball, and striped it 310 down the middle.
That’s a professional. It is tough-mindedness at a level most of us can’t comprehend, let alone emulate. But let’s look more closely at what Tiger did, or rather what he didn’t do.
First, he didn’t react reflexively. He didn’t allow an act that by all rights should have provoked an automatic response of rage to actually produce that rage. He controlled his reaction. He governed his emotion.
Second, he didn’t take it personally. He could have perceived this shutterbug’s act as a deliberate blow aimed at him individually, with the intention of throwing him off his shot. He could have reacted with outrage or indignation or cast himself as a victim. He didn’t.
Third, he didn’t take it as a sign of heaven’s malevolence. He could have experienced this bolt as the malice of the golfing gods, like a bad hop in baseball or a linesman’s miscall in tennis. He could have groaned or sulked or surrendered mentally to this injustice, this interference, and used it as an excuse to fail. He didn’t.
What he did do was maintain his sovereignty over the moment. He understood that, no matter what blow had befallen him from an outside agency, he himself still had his job to do, the shot he needed to hit right here, right now. And he knew that it remained within his power to produce that shot. Nothing stood in his way except whatever emotional upset he himself chose to hold on to.
This month we have two excellent long-term visitors in the group.
Visiting all month is Ivan Brugere, a graduate student from Tanya Berger-Wolff’s group at University of Illinois, Chicago. Ivan is interested in spatiotemporal network mining, network inference and prediction, and social network privacy modeling.
Stopping by between April 12th and April 18th is Laura Allesandretti, who’s a graduate student with Andrea Baronchelli at City University London. Laura, Andrea and I are studying the long-term changes in individual and collective mobility patterns. In the literature, human mobility is typically described on a meta-stable time-scale, where mobility is characterized by regular patterns. We are interested in how this meta-stable regime evolves over long stretches of time (years).
Ivan & Laura will both be giving talks during their visits, so stay tuned for more info.